Worse multithreading performance on a better system (possibly due to Deedle)

We are dealing with a multithreaded C# service that uses Deedle. Tests comparing the current quad-core system with the octa-core target system show that the service runs twice as slow on the target system instead of twice as fast. Even with the number of threads limited to two, the target system is still almost 40% slower.

Analysis shows a lot of waiting in Deedle (/F#), which effectively makes the target system run on only two cores. Test programs that do not use Deedle show normal behavior and high memory bandwidth on the target system.
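A Deedle-free scaling test of this kind could look roughly like the sketch below. It is not the actual test program: the `Work` method, the item count, and the list of thread counts are hypothetical stand-ins. It only times the same CPU-bound workload at different degrees of parallelism, so the numbers from both machines can be compared.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

static class ScalingCheck
{
    // Hypothetical CPU-bound work item standing in for the real workload.
    static double Work(int seed)
    {
        double x = seed;
        for (int i = 0; i < 5000000; i++)
        {
            x = Math.Sqrt(x + i);
        }
        return x;
    }

    static void Main()
    {
        const int items = 64; // assumed number of independent work items

        foreach (int threads in new[] { 1, 2, 4, 8 })
        {
            var options = new ParallelOptions { MaxDegreeOfParallelism = threads };
            var results = new double[items];

            var stopwatch = Stopwatch.StartNew();
            Parallel.For(0, items, options, i => results[i] = Work(i));
            stopwatch.Stop();

            Console.WriteLine("{0} thread(s): {1} ms", threads, stopwatch.ElapsedMilliseconds);
        }
    }
}
```

If the elapsed time drops roughly in proportion to the thread count here but not in the real service, the bottleneck is more likely contention inside the service's code path than the hardware.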

Any ideas on what could lead to this and how best to approach this situation?

EDIT: Most of the waiting time seems to be spent within Invoke calls.

multithreading c# deedle
1 answer

The problem turned out to be a combination of Windows 7, .NET 4.5 (or actually version 4.0), and the heavy use of tail recursion in F#/Deedle.

Using the Visual Studio Concurrency Visualizer, I had already found that most of the time was spent in Invoke calls. On closer inspection, these led to the following call trace:

    ntdll.dll:RtlEnterCriticalSection
    ntdll.dll:RtlpLookupDynamicFunctionEntry
    ntdll.dll:RtlLookupFunctionEntry
    clr.dll:JIT_TailCall
    <some Deedle/F# thing>.Invoke

A search for these functions turned up several articles and forum threads indicating that using F# can result in many calls to JIT_TailCall, and that .NET 4.6 ships a new JIT compiler that seems to address some of the problems associated with these calls. Although I did not find anything mentioning the locking/synchronization issue, it gave me the idea that updating to .NET 4.6 might be a solution.

However, on my own Windows 8.1 system, which also uses .NET 4.5, the problem does not occur. After searching a bit for similar Invoke calls, I found that the call trace on this system looked like this:

    ntdll.dll:RtlAcquireSRWLockShared
    ntdll.dll:RtlpLookupDynamicFunctionEntry
    ntdll.dll:RtlLookupFunctionEntry
    clr.dll:JIT_TailCall
    <some Deedle/F# thing>.Invoke

Apparently, in Windows 8(.1) the locking mechanism was changed to something less strict (a slim reader/writer lock acquired in shared mode instead of a critical section), which results in far less waiting on the lock.

So the heavy use of tail recursion in F# only causes problems on the target system, which combines the strict locking of Windows 7 with the less efficient JIT compiler of .NET 4.5. After upgrading to .NET 4.6, the problem disappeared and our service works as expected.
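To confirm that a given machine actually has .NET Framework 4.6 or later installed, one option is to read the `Release` value that the framework installer writes to the registry. The sketch below is my own illustration, not part of the original fix. It assumes the documented registry location and the documented minimum `Release` value of 393295 for .NET Framework 4.6; the class and constant names are hypothetical.

```csharp
using System;
using Microsoft.Win32;

static class FrameworkCheck
{
    // Documented minimum "Release" value for .NET Framework 4.6.
    const int Net46MinimumRelease = 393295;

    static void Main()
    {
        const string subkey = @"SOFTWARE\Microsoft\NET Framework Setup\NDP\v4\Full";

        using (RegistryKey baseKey = RegistryKey.OpenBaseKey(RegistryHive.LocalMachine, RegistryView.Registry32))
        using (RegistryKey ndpKey = baseKey.OpenSubKey(subkey))
        {
            // The key or the value is missing when only .NET 4.0 (or nothing) is installed.
            object release = ndpKey != null ? ndpKey.GetValue("Release") : null;

            if (release == null)
            {
                Console.WriteLine(".NET Framework 4.5 or later was not detected.");
            }
            else if ((int)release >= Net46MinimumRelease)
            {
                Console.WriteLine(".NET Framework 4.6 or later is installed (Release = {0}).", release);
            }
            else
            {
                Console.WriteLine("A pre-4.6 version is installed (Release = {0}).", release);
            }
        }
    }
}
```

Because .NET 4.5 and 4.6 are in-place updates of the same runtime, checking `Environment.Version` alone is not a reliable way to tell them apart, which is why this sketch reads the registry instead.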
