Some definitions for a clear answer ...
NEON has 32 registers, each 64 bits wide (they can also be viewed as 16 registers, each 128 bits wide).
The NEON module can view the same register bank as:
- sixteen 128-bit quad registers, Q0-Q15
- thirty-two 64-bit double word registers, D0-D31.
uint16x8_t is a type that requires 128 bits of storage, so it has to live in a quadword register.
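To make the D/Q relationship concrete, here is a minimal sketch that loads one uint16x8_t and stores its two 64-bit halves separately. The helper name split_halves is made up for illustration, and the non-NEON branch is only an assumption so the snippet also builds on other targets.

```c
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: split eight 16-bit lanes into low and high halves. */
void split_halves(const uint16_t in[8], uint16_t low[4], uint16_t high[4])
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    uint16x8_t q = vld1q_u16(in);         /* one 128-bit quadword value   */
    vst1_u16(low,  vget_low_u16(q));      /* lower 64-bit doubleword half */
    vst1_u16(high, vget_high_u16(q));     /* upper 64-bit doubleword half */
#else
    /* Portable fallback (assumption: present only so this builds off ARM). */
    memcpy(low,  in,     4 * sizeof(uint16_t));
    memcpy(high, in + 4, 4 * sizeof(uint16_t));
#endif
}
```

vget_low_u16 and vget_high_u16 each return a uint16x4_t, that is, one doubleword-sized half of the quadword value.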
The ARM® C Language Extensions define vector array data types for the NEON intrinsics:
... for use in load and store operations, in table lookups, and as the result type of operations that return a pair of vectors.
vzip instruction
... interleaves the elements of two vectors.
vzip Dd, Dm
and it has corresponding intrinsics, for example
uint8x8x2_t vzip_u8 (uint8x8_t, uint8x8_t)
From these we can conclude that uint8x8x2_t is really just a list of two arbitrarily numbered doubleword registers, because the vzip instruction places no requirement on the order of its input registers.
Now the answer is ...
uint8x8x2_t can be held in any two doubleword registers, not necessarily adjacent ones, whereas uint16x8_t is held in two consecutive doubleword registers, the first of which has an even index (D0-D31 → Q0-Q15, i.e. D(2n) and D(2n+1) together form Qn).
Because of this, you cannot cast a vector array data type made of two doubleword registers to a quadword register ... at least not easily.
The compiler may be smart enough to help you, or you can just force the conversion, but I would check the resulting assembly for correctness as well as performance.
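One way to force the conversion explicitly is to join the two halves with vcombine_u8 and then reinterpret the result with vreinterpretq_u16_u8. The following is only a sketch: the helper names pair_to_u16x8 and interleave_to_u16 are made up, a little-endian target is assumed, and the scalar branch is an assumption added so the snippet also builds without NEON.

```c
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>

/* Hypothetical helper: join both doubleword halves of a vzip result into
   one quadword, then view the 16 bytes as eight 16-bit lanes. */
static inline uint16x8_t pair_to_u16x8(uint8x8x2_t pair)
{
    uint8x16_t joined = vcombine_u8(pair.val[0], pair.val[1]);
    return vreinterpretq_u16_u8(joined);
}
#endif

/* Hypothetical helper: out[i] = lo[i] | (hi[i] << 8) for i = 0..7. */
void interleave_to_u16(const uint8_t lo[8], const uint8_t hi[8],
                       uint16_t out[8])
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* vzip_u8 returns two D-sized vectors: lanes 0-3 and lanes 4-7
       of the interleaved sequence lo0, hi0, lo1, hi1, ... */
    uint8x8x2_t zipped = vzip_u8(vld1_u8(lo), vld1_u8(hi));
    vst1q_u16(out, pair_to_u16x8(zipped));
#else
    /* Scalar fallback (assumption: mirrors the little-endian lane layout). */
    for (int i = 0; i < 8; ++i)
        out[i] = (uint16_t)(lo[i] | (hi[i] << 8));
#endif
}
```

Whether vcombine_u8 costs an extra register move or is folded away depends on how the compiler allocates the two halves, which is why looking at the generated assembly is worthwhile.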