One solution that needs no runtime check is what I have seen in memcpy implementations. Basically, you copy the data one byte at a time until you reach an address that is a multiple of the desired alignment.
After that, you can copy the data in word-sized chunks, with all the advantages an aligned address gives you (loop unrolling, vectorization, etc.).
This pays off most with large blocks of data.
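To make the head/body/tail idea concrete, here is a minimal sketch. It is not taken from any real libc; `copy_aligned` and `WORD` are names I made up, and I am assuming the alignment you care about is that of the destination and equals `sizeof(uintptr_t)`. The word-sized step uses a fixed-size `memcpy` so the read from a possibly unaligned source stays well defined; compilers typically turn that into a plain load/store.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed alignment target: the size of one machine word. */
enum { WORD = sizeof(uintptr_t) };

/* Hypothetical helper (not a libc function). The regions must not overlap. */
static void *copy_aligned(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Head: single bytes until the destination address is a multiple of WORD. */
    while (n > 0 && (uintptr_t)d % WORD != 0) {
        *d++ = *s++;
        n--;
    }

    /* Body: one word per iteration. The destination is aligned from here on;
       the source may still be unaligned, hence the fixed-size memcpy. */
    while (n >= WORD) {
        memcpy(d, s, WORD);
        d += WORD;
        s += WORD;
        n -= WORD;
    }

    /* Tail: whatever is left is shorter than one word. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}
```

For small `n` almost all the work happens in the head and tail loops, which is why the approach only starts to win on larger blocks.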
Apparently, neither clang nor gcc defines a macro that reports support for unaligned access (checked with gcc/clang -E -dM - < /dev/null -march=native).
Some ideas you might consider:
- Reduce the need first: the problems arise when you work through pointers. Try to avoid them by refactoring how you process your data.
- asm: write platform-specific asm to load/store through unaligned addresses, although this depends heavily on the platforms you target.
- SSE allows unaligned access.
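As an illustration of the SSE point, here is a small sketch using the unaligned load/store intrinsics from `<emmintrin.h>` (`_mm_loadu_si128`, `_mm_storeu_si128`). The function name `xor16_unaligned` is made up for this example, and I am assuming gcc or clang, which define `__SSE2__` when SSE2 is enabled; on other targets the sketch falls back to a plain byte loop.

```c
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: out[i] = a[i] ^ b[i] for 16 bytes.
   None of the three pointers has to be 16-byte aligned. */
static void xor16_unaligned(uint8_t *out, const uint8_t *a, const uint8_t *b)
{
#if defined(__SSE2__)
    /* The "u" variants accept any address, unlike _mm_load_si128. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_xor_si128(va, vb));
#else
    /* Portable fallback for targets without SSE2. */
    for (int i = 0; i < 16; i++) {
        out[i] = (uint8_t)(a[i] ^ b[i]);
    }
#endif
}
```

The aligned variants (`_mm_load_si128`, `_mm_store_si128`) fault on a misaligned address, so the unaligned ones are the ones to reach for when you cannot guarantee alignment.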