AIX 128-bit comparison (64 bits) for sorting hash values

I need to sort an array of structures like this on AIX (64 bits) using the xlC_r compiler:

struct digest_line { uint64_t first; uint64_t second; }; 

Now I am making a long way (comparing the first element, and if they are equal, compare the second element.) Is there a faster way to compare these values?

Edit: I forgot to mention that I am using the AIX qsort() function. According to the qsort man page, the comparison function is defined as

 int (*ComparisonPointer)(const void*, const void*); 

which (for me) means that I cannot just return the value int64_t , but something like this:

 int compare_digests(const void *a, const void *b) { struct digest_line *aa = (struct digest_line *) a; struct digest_line *bb = (struct digest_line *) b; int64_t ret = aa->first - bb->first; if (!ret) { ret = aa->second - bb->second; } return (ret == 0) ? 0 : (ret > 0) ? 1 : -1; } 

It does not look ... right. I keep thinking that there should be a better way.

+4
source share
2 answers

There are problems in your code because you are doing signed comparisons of unsigned data. Use one of the following options:

More Orthodox

It is much faster.

 int compare_digests(const void *a, const void *b) { const struct digest_line *aa = (const struct digest_line *) a; const struct digest_line *bb = (const struct digest_line *) b; if (aa->first > bb->first) return +1; else if (aa->first < bb->first) return -1; else if (aa->second > bb->second) return +1; else if (aa->second < bb->second) return -1; else return 0; } 

Less orthodox

This is noticeably slower; do not use it.

 int compare_digests(const void *a, const void *b) { struct digest_line aa = *(struct digest_line *) a; struct digest_line bb = *(struct digest_line *) b; if (aa.first > bb.first) return +1; else if (aa.first < bb.first) return -1; else if (aa.second > bb.second) return +1; else if (aa.second < bb.second) return -1; else return 0; } 

Timing

After performing some measurements, it is clear that the “less orthodox” method is also slower. More than 20 runs (each of which performs 100,000,000 iterations with a different pair of values ​​compared at each iteration), I got the average time and standard deviations (in seconds):

  Mean Standard Deviation Value 0.732914 0.005000 Pointer 0.655853 0.003895 Null 0.353649 0.003448 

The difference between the versions of values ​​and pointers is significant (0.077s is many times greater than the standard deviation), and the version of the pointer is faster. Therefore, use the standard pointer-based comparator. "Null" uses a comparison function that simply returns 0 without any comparisons.

Typical output lines:

 Value: 0.730634 (less = 51517909, more = 48482090, equl = 1) Pointer: 0.684107 (less = 51517909, more = 48482090, equl = 1) Null: 0.351807 (less = 0, more = 0, equl = 100000000) 

Test code

Two comparators were renamed compare_digests_val() to compare values ​​and compare_digests_ptr() to compare pointers. Functions like Clock and clk_* are a high-resolution timer package using gettimeofday() on the platform where I tested. It’s clear that there are significant incremental overheads and statistics accumulation in the cycle, but this simply means that the difference in the comparators is more significant.

 static int compare_digests_nul(const void *a, const void *b) { return 0; } static void time_comparisons(const char *tag, int (*compare)(const void *, const void *)) { struct digest_line a = { 0, 0 }; struct digest_line b = { 0, 0 }; int less = 0; int more = 0; int equl = 0; Clock clk; char buffer[32]; clk_init(&clk); clk_start(&clk); for (int i = 0; i < 100000000; i++) { int j = (*compare)(&a, &b); if (j < 0) less++; else if (j > 0) more++; else equl++; a.first += 1234567890123ULL; a.second += 2345678901234ULL; b.first += 7654321098765ULL; b.second += 8765432109876ULL; } clk_stop(&clk); printf("%-8s %s (less = %9d, more = %9d, equl = %9d)\n", tag, clk_elapsed_us(&clk, buffer, sizeof(buffer)), less, more, equl); } int main(void) { for (int i = 0; i < 20; i++) { time_comparisons("Value:", compare_digests_val); time_comparisons("Pointer:", compare_digests_ptr); time_comparisons("Null:", compare_digests_nul); } return 0; } 
+4
source

Probably the best choice on any platform is to simply use memcmp . This should be highly optimized (and included) for any decent architecture. A look at assembler should tell you if the compiler does some clever optimization. And then benchmarking can tell you which of your versions is better, because, for example, problems with alignment can also play a role and depend on the type of data that you have.

I don't have your architecture at hand, so I quickly checked out my old i686 with gcc. Assembler next function

 int compare(struct digest* a, struct digest* b) { return memcmp(a, b, sizeof *a); } 

Looks pretty well optimized.

Edit: Jonathan is right in his remark that this does not necessarily give numerical ordering to a 128-bit pattern. But as long as you are interested in a standing order to bring order :) to your digest, this should work fine on all platforms. AFAIR AIX platforms have a large endian, so they should work well there.

+1
source

All Articles