Fast little-endian to big-endian conversion in ASM

I have an array of uint values in C#. After checking whether the program is running on a little-endian machine, I want to convert the data to big-endian. Since the amount of data can become very large, but the element count is always even, I thought of treating two uint values as one ulong for better performance and writing the routine in ASM. So I am looking for a very fast (the fastest, if possible) assembler algorithm for converting little-endian to big-endian.

+5
3 answers

For a large amount of data, the bswap instruction (available in Visual C++ through the _byteswap_ushort, _byteswap_ulong, and _byteswap_uint64 intrinsics) is the way to go. It will even outperform handwritten assembly. These intrinsics are not available in pure C# without P/Invoke, so:

  • Use this only if you have a lot of data to byte-swap.
  • You should seriously consider writing the lowest-level I/O of your application in managed C++, so that you can do the swapping before the data ever reaches a managed array. You already have to write a C++ library, so there is not much to lose, and you sidestep the P/Invoke performance problems that affect low-complexity algorithms operating on large data sets.
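As a rough illustration of the native side described above, here is a minimal sketch of an in-place swap routine. The names `swap32` and `swap_buffer32` are hypothetical, and the GCC/Clang branch and the shift-based fallback are assumptions added so the sketch builds outside Visual C++; only `_byteswap_ulong` comes from the answer itself.

```cpp
#include <cstddef>
#include <cstdint>
#if defined(_MSC_VER)
#include <stdlib.h>  // declares _byteswap_ulong
#endif

// Hypothetical helper: reverse the byte order of one 32-bit value.
inline std::uint32_t swap32(std::uint32_t v) {
#if defined(_MSC_VER)
    // Visual C++ intrinsic; the compiler normally emits a single bswap.
    return static_cast<std::uint32_t>(_byteswap_ulong(v));
#elif defined(__GNUC__) || defined(__clang__)
    // Assumption: GCC/Clang builtin used as the equivalent on other compilers.
    return __builtin_bswap32(v);
#else
    // Portable shift-and-mask fallback.
    return (v << 24) | ((v & 0x0000ff00u) << 8) |
           ((v & 0x00ff0000u) >> 8) | (v >> 24);
#endif
}

// Hypothetical entry point: swap every element of a buffer in place,
// so the data is already big-endian before it reaches managed code.
void swap_buffer32(std::uint32_t* data, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        data[i] = swap32(data[i]);
    }
}
```

Doing the whole buffer in one native call means the managed/native transition cost is paid once per buffer rather than once per element.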


+6

In CLI assembly, with the value assumed to be in local 0:

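ldloc.0                  // x
ldc.i4.s 24
shl                      // x << 24
ldloc.0
ldc.i4 0x0000ff00
and                      // isolate byte 1
ldc.i4.8
shl                      // move it up to byte 2
or
ldloc.0
ldc.i4 0x00ff0000
and                      // isolate byte 2
ldc.i4.8
shr.un                   // move it down to byte 1
or
ldloc.0
ldc.i4.s 24
shr.un                   // x >> 24 (unsigned)
or                       // byte-swapped value is on the stack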

That is on the order of 13 (x86) instructions per value.


+2

I thought of treating two uint values as one ulong

Well, that would also swap the two uint values with each other, which may not be desirable...
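To make that concern concrete, here is a small sketch in C++ (the same arithmetic applies to a C# ulong). The helper names `swap32_pairs` and `swap64` are hypothetical. A full 64-bit byte swap reverses the bytes of each half and also exchanges the halves; stopping one step earlier reverses the bytes of each 32-bit half while leaving the halves in place.

```cpp
#include <cstdint>

// Hypothetical helper: byte-swap each 32-bit half of a 64-bit word
// without exchanging the two halves.
inline std::uint64_t swap32_pairs(std::uint64_t v) {
    // Swap adjacent bytes.
    v = ((v & 0x00ff00ff00ff00ffULL) << 8) | ((v >> 8) & 0x00ff00ff00ff00ffULL);
    // Swap adjacent 16-bit groups.
    v = ((v & 0x0000ffff0000ffffULL) << 16) | ((v >> 16) & 0x0000ffff0000ffffULL);
    return v;
}

// Hypothetical helper: full 64-bit byte swap. Compared with swap32_pairs,
// it additionally exchanges the two 32-bit halves, which is the side
// effect described above.
inline std::uint64_t swap64(std::uint64_t v) {
    v = swap32_pairs(v);
    return (v << 32) | (v >> 32);
}
```

So if the array is processed as 64-bit words, either use something like `swap32_pairs`, or apply a full 64-bit swap followed by a 32-bit rotate to put the halves back in order.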

You could try some C# code in unsafe mode, which may actually perform well enough. For example:

public static unsafe void SwapInts(uint[] data) {
   int cnt = data.Length;
   fixed (uint* d = data) {
      byte* p = (byte*)d;
      while (cnt-- > 0) {
         byte a = *p;
         p++;
         byte b = *p;
         *p = *(p + 1);
         p++;
         *p = b;
         p++;
         *(p - 3) = *p;
         *p = a;
         p++;
      }
   }
}

On my computer, the throughput is about 2 GB per second.

+1