Windows service fails sporadically under .NET 4, then the port is blocked when trying to restart

About once a day, I get the following error in our mission-critical trading service.

Source: .NET Runtime, Type: Error, Application: Application.exe, Framework version: v4.0.30319, Description: The process was terminated due to an internal error in the .NET Runtime at IP 000006447F281DBD (000006447F100000) with exit code 80131506.

After this error, when we try to restart the application, it looks like the sockets we were connected to were not cleaned up after the previous (failed) run, because we get a System.ServiceModel.AddressAlreadyInUseException when trying to bind a socket at startup.

I have two questions.

  • We need to understand why the first error occurs. Can anything be learned from the error codes, etc.?
  • We need a way to bind successfully after such a failure. Any suggestions for cleaning up the ports on the next run?

The environment the application runs in:

  • Microsoft Windows Server 2003 R2
  • Standard x64 Edition
  • Service Pack 2
  • 2x 4Core Intel CPU X5365 @ 3.00 GHz
  • 16.0 GB of RAM.
4 answers

This would have been an ExecutionEngineException in earlier days of .NET. You can no longer catch it in .NET 4.0; the AppDomain.UnhandledException handler will not run.

The general diagnosis for this exception is that the integrity of the garbage-collected heap has been compromised. A typical trigger is unmanaged code writing past the end of a buffer. Or it may be environmental: virus scanners have a knack for causing this problem, Symantec security products especially. That is the most likely cause in your case, given that the ports do not close automatically when your service terminates. It is also technically possible that a bug in the CLR causes this.

I would therefore recommend:

  • Inspect the code base and carefully review any unmanaged code it uses.
  • Contact the vendors of your third-party components and ask about known heap corruption issues.
  • Review the configuration of the machine that runs this code. If possible, disable add-ons, and temporarily disable anything that is not strictly necessary to run your service.
  • Revert the project to .NET 3.5 SP1.

To get additional error information, add a global, last-chance exception handler. This will catch any exception that is not otherwise handled. It should log at least the exception type, message and stack trace (ideally also a minidump and a list of loaded assemblies with versions and code base).

This will give you a much better chance of fixing (or at least mitigating) the original problem.
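
A minimal sketch of such a handler, assuming plain C# on .NET 4. The class name CrashLog, the Install and Describe methods, and the crash.log path are hypothetical names chosen for illustration, and no minidump is written here. As the first answer notes, a runtime crash with exit code 80131506 may bypass this handler entirely, so treat it as a diagnostic aid for ordinary unhandled exceptions rather than a guaranteed fix.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Text;

static class CrashLog
{
    // Hypothetical log location; adjust for your service.
    const string LogPath = "crash.log";

    // Call once at startup, before any other work begins.
    public static void Install()
    {
        AppDomain.CurrentDomain.UnhandledException += OnUnhandledException;
    }

    static void OnUnhandledException(object sender, UnhandledExceptionEventArgs e)
    {
        try
        {
            File.AppendAllText(LogPath, Describe(e.ExceptionObject));
        }
        catch
        {
            // The process is already going down; a logging failure is ignored.
        }
    }

    // Builds the report: exception type, message, stack trace,
    // then every loaded assembly with its full name (incl. version) and code base.
    public static string Describe(object exceptionObject)
    {
        var sb = new StringBuilder();
        sb.AppendLine("Unhandled exception at " + DateTime.UtcNow.ToString("o"));

        var ex = exceptionObject as Exception;
        if (ex != null)
        {
            sb.AppendLine("Type: " + ex.GetType().FullName);
            sb.AppendLine("Message: " + ex.Message);
            sb.AppendLine("Stack trace: " + ex.StackTrace);
        }
        else
        {
            // Thrown objects that do not derive from Exception.
            sb.AppendLine("Non-Exception object: " + exceptionObject);
        }

        sb.AppendLine("Loaded assemblies:");
        foreach (Assembly asm in AppDomain.CurrentDomain.GetAssemblies())
        {
            // CodeBase is not supported on dynamic assemblies.
            string location = asm.IsDynamic ? "(dynamic)" : asm.CodeBase;
            sb.AppendLine("  " + asm.FullName + " @ " + location);
        }
        return sb.ToString();
    }
}
```

Calling CrashLog.Install() as the first statement of the service's entry point registers the handler before any worker threads start.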


The socket problem is that sockets hang around for a while after the process exits, so that any remaining data can be flushed before they are finally closed. Watch TCP View for a while and you will see this: the system inherits the sockets after the application has finished with them.


After years of dealing with this problem in a number of applications, it seems Microsoft has finally accepted that a bug in the .NET 4 CLR causes it: http://support.microsoft.com/kb/2640103 .

For many years I "fixed" it by forcing the garbage collector to run in server mode (gcServer enabled="true" in app.config), which essentially causes all application threads to pause during a collection, removing the possibility of other threads accessing the memory being processed by the GC.
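
For reference, the server GC setting lives under the runtime element of app.config. A minimal fragment, assuming the file has no other runtime settings that need to be merged in:

```xml
<configuration>
  <runtime>
    <!-- Run the garbage collector in server mode -->
    <gcServer enabled="true"/>
  </runtime>
</configuration>
```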


Adding to what @Richard pointed out: your exception is an unhandled exception, so you can subscribe to the following event to find out why it occurred. You can also use the handler to release any unmanaged resources.

    AppDomain.CurrentDomain.UnhandledException +=
        new UnhandledExceptionEventHandler(CurrentDomain_UnhandledException);

    static void CurrentDomain_UnhandledException(object sender, UnhandledExceptionEventArgs e)
    {
        // Log the reason.
        // Also clean up open sockets if possible.
    }
