I have been experimenting with D's built-in inline assembler and SSE, and ran into something I do not understand. When I add two float4 vectors right after their declaration, the result is correct. If I move the same calculation into a separate function, I get a series of NaNs.
    //function contents identical to code section in unittest
    float4 add(float4 lhs, float4 rhs)
    {
        float4 res;
        auto lhs_addr = &lhs;
        auto rhs_addr = &rhs;
        asm
        {
            mov RAX, lhs_addr;
            mov RBX, rhs_addr;
            movups XMM0, [RAX];
            movups XMM1, [RBX];
            addps XMM0, XMM1;
            movups res, XMM0;
        }
        return res;
    }

    unittest
    {
        float4 lhs = {1, 2, 3, 4};
        float4 rhs = {4, 3, 2, 1};

        println(add(lhs, rhs)); //float4(nan, nan, nan, nan)

        //identical code starts here
        float4 res;
        auto lhs_addr = &lhs;
        auto rhs_addr = &rhs;
        asm
        {
            mov RAX, lhs_addr;
            mov RBX, rhs_addr;
            movups XMM0, [RAX];
            movups XMM1, [RBX];
            addps XMM0, XMM1;
            movups res, XMM0;
        }
        //end identical code

        println(res); //float4(5, 5, 5, 5)
    }
As far as I can tell, the generated assembly is functionally identical in both cases; see this link.
Edit: I am using a custom float4 structure (for now it's just an array) because I want to have an add function like float4 add(float4 lhs, float rhs). At the moment, this leads to a compiler error:
Error: expected floating point constant expression instead of rhs
Note: I am using DMD 2.071.0.
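For illustration, the intended behaviour of the scalar overload can be sketched in plain D, without inline assembly or SSE. This is only a sketch under assumptions: float4 is taken to be a simple struct wrapping a float[4], and the field name data is a placeholder that does not come from the code above.

```d
import std.stdio : writeln;

// Hypothetical stand-in for the custom float4 type: a wrapper around four floats.
// The field name `data` is an assumption made for this sketch.
struct float4
{
    float[4] data;
}

// Scalar overload in plain D: adds rhs to every component of lhs.
float4 add(float4 lhs, float rhs)
{
    float4 res;
    // D array operation: element-wise addition of a scalar to each element.
    res.data[] = lhs.data[] + rhs;
    return res;
}

void main()
{
    float4 v;
    v.data = [1f, 2f, 3f, 4f];

    // Each component of v is increased by 0.5.
    writeln(add(v, 0.5f).data);
}
```

This version only pins down the semantics an SSE implementation would need to match; it says nothing about how the compiler vectorizes the array operation.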