I recently read this post: Floating point and integer computing on modern hardware, and I was curious how my own processor would perform on this quasi-benchmark, so I put together two versions of the code, one in C# and one in C++ (Visual Studio 2010 Express), and compiled both with optimizations enabled to see what happens. The results of my C# version seem pretty reasonable:
int add/sub:    350ms
int div/mul:    3469ms
float add/sub:  1007ms
float div/mul:  67493ms
double add/sub: 1914ms
double div/mul: 2766ms
But when I compiled and ran the C++ version, something completely different came out:
int add/sub:    210.653ms
int div/mul:    2946.58ms
float add/sub:  3022.58ms
float div/mul:  172931ms
double add/sub: 1007.63ms
double div/mul: 74171.9ms
I expected some performance differences, but nothing this big! I don't understand why division/multiplication in C++ is so much slower than addition/subtraction, while the managed C# version behaves much closer to what I expected. The C++ test function looks like this:
template<typename T>
void GenericTest(const char *typestring)
{
    T v = 0;
    // Ten random operands in [1, 16]; the +1 guarantees they are
    // non-zero, so they are safe to divide by in the div/mul loop.
    T v0 = (T)((rand() % 256) / 16) + 1;
    T v1 = (T)((rand() % 256) / 16) + 1;
    T v2 = (T)((rand() % 256) / 16) + 1;
    T v3 = (T)((rand() % 256) / 16) + 1;
    T v4 = (T)((rand() % 256) / 16) + 1;
    T v5 = (T)((rand() % 256) / 16) + 1;
    T v6 = (T)((rand() % 256) / 16) + 1;
    T v7 = (T)((rand() % 256) / 16) + 1;
    T v8 = (T)((rand() % 256) / 16) + 1;
    T v9 = (T)((rand() % 256) / 16) + 1;

    HTimer tmr = HTimer(); // my own timer class (not shown)
    tmr.Start();
    for (int i = 0; i < 100000000; ++i)
    {
        v += v0; v -= v1; v += v2; v -= v3; v += v4;
        v -= v5; v += v6; v -= v7; v += v8; v -= v9;
    }
    tmr.Stop();
    // The add/sub time is printed here with typestring; a second loop of
    // the same shape (v /= v0; v *= v1; ...) times div/mul, exactly as in
    // the C# version below.
}
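For completeness, the template is driven from a small main() along these lines. This is only a minimal sketch of the harness, since the actual driver isn't part of the snippet above; the assumption here is that typestring is just the label printed next to each timing:

#include <cstdlib>
#include <ctime>

template<typename T> void GenericTest(const char *typestring); // defined above

int main()
{
    srand((unsigned)time(NULL)); // vary the operands from run to run

    GenericTest<int>("int");
    GenericTest<float>("float");
    GenericTest<double>("double");
    return 0;
}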
The C# tests are not generic; there is a separate function per type, implemented like this:
static double DoubleTest()
{
    Random rnd = new Random();
    Stopwatch sw = new Stopwatch();
    double v = 0;
    double v0 = (double)rnd.Next(1, int.MaxValue);
    double v1 = (double)rnd.Next(1, int.MaxValue);
    double v2 = (double)rnd.Next(1, int.MaxValue);
    double v3 = (double)rnd.Next(1, int.MaxValue);
    double v4 = (double)rnd.Next(1, int.MaxValue);
    double v5 = (double)rnd.Next(1, int.MaxValue);
    double v6 = (double)rnd.Next(1, int.MaxValue);
    double v7 = (double)rnd.Next(1, int.MaxValue);
    double v8 = (double)rnd.Next(1, int.MaxValue);
    double v9 = (double)rnd.Next(1, int.MaxValue);

    sw.Start();
    for (int i = 0; i < 100000000; i++)
    {
        v += v0; v -= v1; v += v2; v -= v3; v += v4;
        v -= v5; v += v6; v -= v7; v += v8; v -= v9;
    }
    sw.Stop();
    Console.WriteLine("double add/sub: {0}", sw.ElapsedMilliseconds);
    sw.Reset();

    sw.Start();
    for (int i = 0; i < 100000000; i++)
    {
        v /= v0; v *= v1; v /= v2; v *= v3; v /= v4;
        v *= v5; v /= v6; v *= v7; v /= v8; v *= v9;
    }
    sw.Stop();
    Console.WriteLine("double div/mul: {0}", sw.ElapsedMilliseconds);
    sw.Reset();

    return v;
}
Any ideas here?