Is my 32-bit headache now a 64-bit migraine?!? (or 64-bit CLR runtime issues)

What unusual, unexpected consequences have people run into, in terms of performance, memory, and so on, when switching .NET applications from the 32-bit JIT to the 64-bit JIT? I'm interested in the good, but more interested in the surprisingly bad problems people have hit.

I am in the process of writing a new .NET application that will be deployed on both 32-bit and 64-bit. There have been many questions about application porting issues; I'm not interested in the "gotchas" from a programming/porting standpoint (i.e. proper handling of native/COM interop, reference types embedded in structs changing the struct size, etc.).

However, this question and its answer got me thinking - what other issues am I overlooking?

There have been many questions and blog posts that skirt this issue or hit one aspect of it, but I haven't seen anything that compiles a decent list of the problems.

In particular - my application is heavily CPU-bound and has huge memory usage patterns (hence the need for 64-bit in the first place), and it is also graphical. I'm worried about what other hidden problems may exist in the CLR or JIT running on 64-bit Windows (with .NET 3.5 SP1).

Here are a few issues I already know about:

  • (I now know that) properties, even automatic properties, are not inlined on x64 (see the sketch after this list).
  • The application's memory profile changes, both because of the size of references and because the memory allocator has different performance characteristics.
  • Startup times can suffer on x64.
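
A minimal, purely illustrative sketch of the kind of workaround the first bullet pushed people toward (all names here are invented): caching a property value in a local before a hot loop, so the loop body does not depend on the JIT inlining the accessor. Whether this helps at all depends on the exact JIT version, so it is only worth doing after measuring.

    // Hypothetical example: hoisting a property read out of a hot loop
    // in case the x64 JIT declines to inline the accessor.
    public sealed class Grid
    {
        private readonly double[] _cells;

        // Automatic property of the kind the first bullet refers to.
        public int Count { get; private set; }

        public Grid(int size)
        {
            _cells = new double[size];
            Count = size;
        }

        public double Sum()
        {
            double total = 0.0;
            int count = Count;              // one property call instead of one per iteration
            for (int i = 0; i < count; i++)
            {
                total += _cells[i];
            }
            return total;
        }
    }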

I would like to know what other specific issues people have discovered with the JIT on 64-bit Windows, and whether there are any workarounds for performance.

Thanks everyone!

---- EDIT -----

Just to clarify -

I know that trying to optimize early is often bad. I know that second-guessing the system is often bad. I also know that portability to 64-bit has its own issues; we run and test on 64-bit systems daily to help with this. Etc.

However, my application is not a typical business application; it is a scientific computing application. We have many processes that sit at 100% CPU on all cores (it is heavily multithreaded) for hours at a time.

I spend a lot of time profiling the application, and that matters a great deal. However, most profilers disable many features of the JIT, so the fine details of things like memory allocation and JIT inlining can be very hard to pin down when you are running under a profiler. Hence my need for this question.
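
One common way around that distortion (not from the question itself, just standard practice) is to time a representative workload with Stopwatch in a Release build run outside the debugger and profiler, so the JIT optimizations stay enabled. A minimal sketch; RunWorkload is a stand-in for the real computation:

    using System;
    using System.Diagnostics;

    static class TimingHarness
    {
        static void Main()
        {
            RunWorkload();                        // warm-up run so JIT compilation is not timed
            Stopwatch sw = Stopwatch.StartNew();
            const int iterations = 5;
            for (int i = 0; i < iterations; i++)  // several runs to smooth out noise
            {
                RunWorkload();
            }
            sw.Stop();
            Console.WriteLine("Average: {0:F1} ms", sw.ElapsedMilliseconds / (double)iterations);
        }

        // Placeholder for the real CPU-bound work being measured.
        static void RunWorkload()
        {
            double acc = 0.0;
            for (int i = 1; i < 10000000; i++)
            {
                acc += Math.Sqrt(i);
            }
            if (acc < 0.0) Console.WriteLine(acc); // keep the result observable
        }
    }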

+8
c# clr jit
Mar 11 '09 at 15:22
8 answers

I remember hearing about an issue on an IRC channel I frequent: the x64 JIT optimizes away the temporary copy in this instance:

    EventHandler temp = SomeEvent;        // copy the delegate to a local first,
    if (temp != null)                     // so the null check and the invocation
    {                                     // see the same instance
        temp(this, EventArgs.Empty);
    }

...bringing back the race condition and the possibility of null reference exceptions.

+3
Mar 11 '09 at 15:49

A particularly nasty performance issue in .NET relates to poor JIT code generation:

https://connect.microsoft.com/VisualStudio/feedback/details/93858/struct-methods-should-be-inlined?wa=wsignin1.0

Basically, inlining and structs do not work well together on x64 (although that page suggests inlining now works but subsequent redundant copies are not eliminated; that sounds suspect given the tiny performance difference).
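
For illustration only, this is the shape of code the linked report is about: a small struct with tiny arithmetic methods that the x64 JIT of that generation tended not to inline cleanly (the type here is made up):

    // Hypothetical small struct; calls like a.Dot(b) in a hot loop are the
    // pattern the Connect issue says the x64 JIT handled poorly.
    public struct Vector3
    {
        public double X, Y, Z;

        public Vector3(double x, double y, double z)
        {
            X = x; Y = y; Z = z;
        }

        public double Dot(Vector3 other)
        {
            return X * other.X + Y * other.Y + Z * other.Z;
        }
    }

The workaround of the day was to write the arithmetic out against the fields by hand in the hot path, which is exactly the "manually inline things" readability cost complained about below.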

In any case, after wrestling with .NET long enough over this, my solution is to use C++ for anything numerically intensive. Even in the "good" cases for .NET, where you are not dealing with structs and use arrays where the bounds checking is optimized out, C++ beats .NET hands down.

If you do anything more complex than dot products, the picture gets worse very quickly; the .NET code is both longer and less readable (because you need to manually inline things and/or avoid generics) and much slower.

I switched to using Eigen in C++: it is absolutely great, resulting in readable code and high performance; a thin C++/CLI wrapper then provides the glue between the compute engine and the .NET world.

Eigen works by template metaprogramming: it compiles vector expressions into SSE intrinsic instructions and does a lot of the nastiest cache-aware loop unrolling and rearranging for you; and although it focuses on linear algebra, it will also work with integers and non-matrix array expressions.

So, for example, if P is a matrix, stuff like this Just Works:

 1.0 / (P.transpose() * P).diagonal().sum(); 

... which does not actually allocate a temporary transposed copy of P, and does not compute the whole matrix product, only the entries it needs.

So, if you can work in Full Trust - just use C++ via C++/CLI; it works much better.

+4
Feb 25

For the most part, Visual Studio and the compiler do a pretty good job of hiding the issues from you. However, I am aware of one major problem that can arise if you set your application to auto-detect the platform (x86 vs. x64) and also depend on any 32-bit third-party DLLs. In that case, on 64-bit platforms it will try to call the DLLs using 64-bit conventions and structures, and it simply will not work.
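
A hedged sketch of what that looks like in practice (the DLL name and export are invented): an AnyCPU executable that P/Invokes a 32-bit-only native DLL works on 32-bit Windows, but on 64-bit Windows it runs as a 64-bit process and the load fails, typically with a BadImageFormatException.

    using System;
    using System.Runtime.InteropServices;

    class NativeInterop
    {
        // Hypothetical 32-bit-only third-party DLL.
        [DllImport("LegacyVendor32.dll", CallingConvention = CallingConvention.Cdecl)]
        private static extern int lv_compute(int input);

        static void Main()
        {
            Console.WriteLine("Running as 64-bit process: {0}", IntPtr.Size == 8);

            // In a 64-bit process this call fails, because a 64-bit process
            // cannot load a 32-bit native DLL. Building the managed EXE with
            // Platform Target = x86 avoids it, at the cost of staying 32-bit.
            int result = lv_compute(42);
            Console.WriteLine(result);
        }
    }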

+1
Mar 11 '09 at 15:32

You mention porting issues; those are the ones to worry about. I (obviously) don't know your application, but trying to second-guess the JIT is often a waste of time. The people who write the JIT have an intimate understanding of the x86/x64 chip architecture and, in all likelihood, know what performs better and what performs worse than just about anyone else on the planet.

Yes, it's possible you have a corner case that is different and unique, but if you are "in the process of writing a new application", then I wouldn't worry about the JIT compiler. There is probably a silly loop to be avoided somewhere that will buy you 100x the performance improvement you would ever get from trying to second-guess the JIT. It reminds me of issues we ran into writing our ORM: we would look at the code and think we could shave a couple of machine instructions off it... of course, the code then went off and connected to a database server over the network, so we were trimming microseconds off a process that was bounded by milliseconds somewhere else.

The universal rule of performance tuning... If you haven't measured your performance, you don't know where your bottlenecks are; you just think you know... and you are probably wrong.

+1
Mar 11 '09 at 15:38

Regarding Quibblesome's answer:

I tried to run the following code on my Windows 7 x64 machine in Release mode without the debugger attached, and a NullReferenceException was never thrown.

    using System;
    using System.Threading;

    namespace EventsMultithreadingTest
    {
        public class Program
        {
            private static Action<object> _delegate = new Action<object>(Program_Event);
            public static event Action<object> Event;

            public static void Main(string[] args)
            {
                Thread thread = new Thread(delegate()
                {
                    while (true)
                    {
                        Action<object> ev = Event;
                        if (ev != null)
                        {
                            ev.Invoke(null);
                        }
                    }
                });
                thread.Start();

                while (true)
                {
                    Event += _delegate;
                    Event -= _delegate;
                }
            }

            static void Program_Event(object obj)
            {
                object.Equals(null, null);
            }
        }
    }
+1
Dec 11 '09 at 21:48

I believe the 64-bit JITs are not as fully developed/ported to take advantage of 64-bit processor architectures, so they have issues; you might be getting "emulated" behaviour from your assemblies, which can cause problems and unexpected behaviour. I would look into cases where this can be avoided, and/or see whether there is a good, fast 64-bit C++ compiler for writing the time-critical computations and algorithms. Even if you have difficulty finding information or don't have time to read through disassembled code, I'm quite sure that pulling heavy computation out of managed code will reduce any issues you might have and boost performance. [I'm somewhat sure you are already doing this, but just mentioning it :)]

0
06 Feb '10 at 7:57

The profiler should not significantly affect your timing results. If the profiler's overhead really is "significant", then you probably can't squeeze much more speed out of your code and should be thinking about finding your hardware bottlenecks (disk, RAM, or CPU?) and upgrading. (It sounds like you are CPU-bound, so that's where to start.)

In general, .NET and the JIT free you from most of the porting problems of 64-bit. As you know, there are effects related to register size (changes in memory usage, marshalling to native code, the need for all parts of the program to be native 64-bit builds) and some performance differences (larger memory map, more registers, wider buses, etc.), so I can't tell you anything more than you already know on that front. The other issues I have seen are OS issues rather than C# ones - there are now different registry hives for 64-bit and WOW64 applications, so some registry accesses have to be written carefully.
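
To illustrate the registry point: on 64-bit Windows, 32-bit (WOW64) processes are redirected to a separate registry view, so code that needs the 64-bit hive has to ask for it explicitly. The sketch below uses the RegistryView overloads introduced in .NET 4, so it does not apply as-is to the .NET 3.5 SP1 setup in the question (there you would P/Invoke RegOpenKeyEx with KEY_WOW64_64KEY); the key path is just an example.

    using System;
    using Microsoft.Win32;

    class RegistryViews
    {
        static void Main()
        {
            // .NET 4+: open the 64-bit view of HKLM explicitly, even when the
            // calling process is 32-bit and would otherwise be redirected to
            // the WOW6432Node view.
            using (RegistryKey hklm64 = RegistryKey.OpenBaseKey(
                       RegistryHive.LocalMachine, RegistryView.Registry64))
            using (RegistryKey key = hklm64.OpenSubKey(@"SOFTWARE\ExampleVendor\ExampleApp"))
            {
                object value = (key != null) ? key.GetValue("InstallDir") : null;
                Console.WriteLine(value ?? "(not found)");
            }
        }
    }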

It's generally counterproductive to second-guess what the JIT will do with your code and try to adjust your code to suit it, because the JIT is likely to change with .NET 4 or 5 or 6, and your "optimizations" may turn into inefficiencies, or worse, bugs. Also keep in mind that the JIT compiles the code specifically for the CPU it is running on, so an improvement on your development PC may not be an improvement on a different PC. What you get away with by using today's JIT on today's CPU may bite you in a few years' time when you upgrade something.

Specifically, you cite "properties are not inlined on x64". By the time you have run through your entire codebase turning all your properties into fields, there may well be a new 64-bit JIT that does inline properties. Indeed, it may well run better than your "workaround" code. Let Microsoft optimize that for you.

You rightly point out that your memory profile can change. So you may need more RAM, faster disks for virtual memory, and bigger CPU caches. All hardware issues. You may be able to reduce the effect by using (e.g.) Int32 rather than int, but that may not make much difference and could potentially harm performance (as your CPU may process native 64-bit values more efficiently than half-sized 32-bit values).

You say that startup times may be longer, but that seems rather irrelevant in an application that, as you say, runs for hours at 100% CPU.

So what are you really worried about? Maybe time your code on a 32-bit PC and then time it doing the same task on a 64-bit PC. Is there half an hour of difference over a 4-hour run? Or is the difference only 3 seconds? Or is the 64-bit PC actually faster? Maybe you are looking for solutions to problems that don't exist.

So, back to the usual, more generic, advice. Profile and time things to identify bottlenecks. Look at the algorithms and mathematical processes you are applying and try to improve or replace them with more efficient ones. Check that your multithreading approach is helping rather than hurting your performance (i.e. that waits and locks are avoided). Try to reduce memory allocation/deallocation - for example, reuse objects rather than replacing them with new ones (see the buffer-reuse sketch below). Try to reduce the use of frequent function calls and virtual functions. Switch to C++ and get rid of the inherent overhead of garbage collection, bounds checking, etc. that .NET imposes. Hmmm. None of that has anything to do with 64-bit, does it?
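
On the "reuse objects rather than replacing them with new ones" point, a minimal, illustrative buffer-reuse sketch might look like the following (not a production pool; the multithreaded scenario in the question would need locking or a thread-safe collection):

    using System;
    using System.Collections.Generic;

    // Minimal sketch: rent large buffers from a pool instead of allocating a
    // fresh array on every pass through a hot loop.
    sealed class BufferPool
    {
        private readonly Stack<double[]> _free = new Stack<double[]>();
        private readonly int _bufferSize;

        public BufferPool(int bufferSize)
        {
            _bufferSize = bufferSize;
        }

        public double[] Rent()
        {
            return _free.Count > 0 ? _free.Pop() : new double[_bufferSize];
        }

        public void Return(double[] buffer)
        {
            Array.Clear(buffer, 0, buffer.Length);   // hand back a clean buffer
            _free.Push(buffer);
        }
    }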

0
Feb 06

I am not familiar with 64-bit issues, but I have one comment:

We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. - Donald Knuth

-1
Mar 11 '09 at 15:26


