I need a faster quaternion-vector multiplication routine for my math library. Right now I use the canonical v' = q v q^-1, which gives the same result as multiplying the vector by a matrix built from the quaternion, so I'm confident it is correct.
So far I have implemented 3 alternative "faster" methods:
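For reference, a minimal scalar sketch of the canonical v' = q v q^-1 form is shown here. It assumes a unit quaternion (so the inverse equals the conjugate), the Hamilton product, and an x, y, z, w component order; the names `Vec3`, `Quat`, `mul` and `rotateReference` are hypothetical and not taken from the library in question.

```cpp
#include <cmath>

// Hypothetical minimal types, for illustration only.
struct Vec3 { float x, y, z; };
struct Quat { float x, y, z, w; };  // x, y, z = vector part, w = scalar part

// Hamilton product a * b.
Quat mul(const Quat& a, const Quat& b) {
    return {
        a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y,
        a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x,
        a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w,
        a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z
    };
}

// v' = q * (v, 0) * conj(q). Assumes q is normalized, so conj(q) == q^-1.
Vec3 rotateReference(const Quat& q, const Vec3& v) {
    const Quat p{v.x, v.y, v.z, 0.0f};
    const Quat qc{-q.x, -q.y, -q.z, q.w};
    const Quat r = mul(mul(q, p), qc);
    return {r.x, r.y, r.z};
}
```

With these assumptions, a quaternion for a 90 degree rotation about z, (0, 0, sin 45°, cos 45°), maps (1, 0, 0) to approximately (0, 1, 0), which makes a convenient test vector for the faster variants.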
# 1, I have no idea where I got this from:
v' = (q.xyz * 2 * dot(q.xyz, v)) + (v * (qw*qw - dot(q.xyz, q.xyz))) + (cross(q.xyz, v) * qw * 2)
Implemented as:
    vec3 rotateVector(const quat& q, const vec3& v)
    {
        vec3 u(q.x, q.y, q.z);  // vector part of the quaternion
        float s = q.w;          // scalar part of the quaternion

        return vec3(u * 2.0f * vec3::dot(u, v))
             + (v * (s*s - vec3::dot(u, u)))
             + (vec3::cross(u, v) * s * 2.0f);
    }
# 2, courtesy of this beautiful blog
t = 2 * cross(q.xyz, v);
v' = v + qw * t + cross(q.xyz, t);
Implemented as:
    __m128 rotateVector(__m128 q, __m128 v)
    {
        __m128 temp = _mm_mul_ps(vec4::cross(q, v), _mm_set1_ps(2.0f));
        return _mm_add_ps(
            _mm_add_ps(v, _mm_mul_ps(_mm_shuffle_ps(q, q, _MM_SHUFFLE(3, 3, 3, 3)), temp)),
            vec4::cross(q, temp));
    }
# 3, from numerous sources:
v' = v + 2.0 * cross(cross(v, q.xyz) + qw * v, q.xyz);
Implemented as:
    __m128 rotateVector(__m128 q, __m128 v)
    {
        // return v + 2.0 * cross(cross(v, q.xyz) + qw * v, q.xyz);
        return _mm_add_ps(v,
            _mm_mul_ps(_mm_set1_ps(2.0f),
                vec4::cross(
                    _mm_add_ps(
                        _mm_mul_ps(_mm_shuffle_ps(q, q, _MM_SHUFFLE(3, 3, 3, 3)), v),
                        vec4::cross(v, q)),
                    q)));
    }
All 3 of them give incorrect results. However, I noticed some interesting patterns. First of all, # 1 and # 2 give the same result as each other. # 3 gives the same result I get from multiplying the vector by the quaternion-derived matrix when that matrix is transposed (I discovered this by accident: my matrix code used to assume row-major matrices, which was incorrect).
My quaternion data storage is defined as:
    union
    {
        __m128 data;
        struct { float x, y, z, w; };
        float f[4];
    };
Are my implementations incorrect, or am I missing something here?