NetTcpActivator Service (Net.Tcp Listener Adapter) stops responding

In my current project, our team uses WCF services hosted in IIS.

Here are some technical details that may be important:

  • We use .NET 3.5 for the WCF services.
  • We use the net.tcp communication protocol.
  • We use IIS 7 and IIS 7.5 to host these services.
  • We use several IIS worker processes on each server.

The problem is that the WCF services sometimes become unavailable. When we try to reach them, we get a timeout error, and the only way to restore them is to restart the Windows NetTcpActivator service (Net.Tcp Listener Adapter).

According to my colleagues' theory, this error may be related to the problem described in this KB article:

FIX: Smsvchost.exe for WCF service stops responding when starting WCF service based on .NET Framework 4 http://support.microsoft.com/kb/2536618

According to this article, SMSvcHost (the process that hosts NetTcpActivator and the port sharing service) freezes if it cannot dispatch a request to w3wp (the IIS worker process) within 60 seconds (a non-configurable timeout). Unfortunately, we cannot find a way to reproduce this error. For example, we limited SMSvcHost to 1 processor core and 1 thread, raised the pending connections limit to 1M, and applied 100% CPU load in user mode. It still did not freeze!

Sometimes our load tests produce strange errors, but when we stop them, all services automatically return to their normal state. Yet sometimes even a light load can hang NetTcpActivator!

I should add that this is not a new problem. My colleagues already ran into it 3 years ago (see this thread for more information: http://forums.iis.net/t/1167668.aspx/1/10 ). Unfortunately, they did not receive an answer. The problem disappeared after some configuration changes, and now it has come back on a new server.

I will be very grateful to all of you for your thoughts and ideas!

1 answer

Well, after a lot of investigation, I found the cause of our problem. There may be other scenarios where this happens, but hopefully this will help some people. Microsoft is in the process of reproducing it in its labs, and it will ultimately need to be fixed on their side.

In our case, all the planets had to align. We had a single integrated-mode .NET 4 application pool for both the client and the server (on a developer's machine). The service used an external configuration file for bindings ( <bindings configSource="serviceModel.bindings.config" /> ), which was linked from another project and copied during the build by a custom build task added to the service's .csproj.

To reproduce the problem:

  • Stop all running SMSvcHost services (Net.Tcp *, Net.Pipe, Net.Msmq). A restart is not enough, because the SMSvcHost process will not go away.
  • In Visual Studio, run Clean on WcfService.
  • In Windows Explorer, delete serviceModel.bindings.config from WcfService.
  • Run iisreset (this gets rid of w3wp and starts the SMSvcHost services; press F5 in the Services list to see this).
  • Build WcfService (this copies the linked configuration file).
  • Go to the WcfClient page and submit twice. If you get an error message every time, you probably have the problem. In our main application it showed up as a timeout; in the test application it was a CommunicationObjectFaultedException instead of a timeout, but it is the same issue.
  • Stop the SMSvcHost services. If the error occurred, event 8 for SMSvcHost is logged in the System event log.

I still don't know whether w3wp or SMSvcHost is the culprit. Step 3 (deleting the file) is crucial, although I still can't explain why. If you do not delete the file, everything is fine. If you modify the file in place (the creation date remains unchanged), everything is fine. If you move the XML configuration into the main Web.config file, everything is fine. When the build task copies the file, its date is updated, so I assume the file is cached somehow and one of the processes detects the date change.

If you restart the SMSvcHost services (full stop, then full start) once or twice, the client request goes through, and from then on you are all set.
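The "full stop, then full start" sequence can be sketched as a small script. This is only an illustration, not something from the original ticket: it assumes the standard Windows service names for the listener adapters and the port sharing service, uses `net stop` / `net start`, and defaults to a dry run that just prints the commands. The helper names `build_restart_commands` and `run_all` are made up for this example.

```python
import subprocess

# Windows service names of the listener adapters hosted in SMSvcHost.exe.
# NetMsmqActivator may be absent or disabled if MSMQ is not installed.
ACTIVATORS = ["NetTcpActivator", "NetPipeActivator", "NetMsmqActivator"]
# The Net.Tcp Port Sharing Service also runs inside SMSvcHost.exe.
PORT_SHARING = "NetTcpPortSharing"


def build_restart_commands(activators=ACTIVATORS, port_sharing=PORT_SHARING):
    """Return the commands for a full stop of every service, then a full start."""
    stops = [["net", "stop", name] for name in activators]
    # Stop port sharing last, because NetTcpActivator depends on it.
    stops.append(["net", "stop", port_sharing])
    # Start in the reverse order: port sharing first, then the adapters.
    starts = [["net", "start", port_sharing]]
    starts += [["net", "start", name] for name in activators]
    return stops + starts


def run_all(commands, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=False: a command may fail if the service is already
            # stopped or not installed, which should not abort the sequence.
            subprocess.run(cmd, check=False)


if __name__ == "__main__":
    run_all(build_restart_commands())  # dry run: only prints the commands
```

Stopping everything before starting anything matters here because, as noted in the reproduction steps, the SMSvcHost process does not go away while any of its services is still running. Run with `dry_run=False` from an elevated prompt to actually execute the commands.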

So my current assumption is that this can be a problem right after deployment, but if you verify that everything works (and restart the services if necessary), you should be fine. You could also simply avoid external / linked configuration files.

Once Microsoft gets to the bottom of this, I hope to have more insight to share.

Final update: I forgot to come back to this earlier. Microsoft essentially admitted that they probably have a bug, but since there was a workaround and they had already spent enough time on the ticket, they closed it without investigating further. It looks like some kind of race condition when SMSvcHost starts up with the following setup (similar to what I posted earlier):

  • Host WCF in IIS.
  • Use a non-HTTP binding, so that SMSvcHost is involved.
  • Use an external configuration file for bindings via configSource.

Linking the external configuration file had nothing to do with it. The workaround was to stop using configSource, which is what we do now.
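For anyone who wants to apply the same workaround mechanically, here is a minimal sketch of inlining the external bindings section into Web.config. It is an illustration only: the two XML fragments are hypothetical (the binding name `tcpDefault` is invented), and `inline_config_source` is a made-up helper, not part of any Microsoft tooling.

```python
import xml.etree.ElementTree as ET

# Hypothetical Web.config fragment that pulls its bindings from an external file.
WEB_CONFIG = """\
<configuration>
  <system.serviceModel>
    <bindings configSource="serviceModel.bindings.config" />
  </system.serviceModel>
</configuration>
"""

# Hypothetical contents of serviceModel.bindings.config.
BINDINGS_CONFIG = """\
<bindings>
  <netTcpBinding>
    <binding name="tcpDefault" />
  </netTcpBinding>
</bindings>
"""


def inline_config_source(web_config_xml, external_xml, section="bindings"):
    """Replace <section configSource="..."/> with the external file's section."""
    root = ET.fromstring(web_config_xml)
    external = ET.fromstring(external_xml)
    if external.tag != section:
        raise ValueError(
            f"external file root is <{external.tag}>, expected <{section}>"
        )
    for parent in root.iter():
        for index, child in enumerate(list(parent)):
            if child.tag == section and "configSource" in child.attrib:
                external.tail = child.tail  # keep the surrounding whitespace
                parent.remove(child)
                parent.insert(index, external)
                return ET.tostring(root, encoding="unicode")
    raise ValueError(f"no <{section} configSource=...> element found")


if __name__ == "__main__":
    print(inline_config_source(WEB_CONFIG, BINDINGS_CONFIG))
```

The result is a Web.config in which the `<bindings>` element carries its content directly and no longer has a configSource attribute, which is the shape we ended up with. Note that ElementTree drops XML comments and may reformat the document slightly, so for a real Web.config it is safer to review the output than to overwrite the file blindly.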
