Tracking down database connectivity issues

Background

We have several web applications on different web servers that all connect to the same database server. Over the past couple of months, we have noticed that every so often our web servers are unable to connect to the database server.

Our environment

We have a couple of different web environments: some run ColdFusion and others run .NET. The .NET applications are a mix of Web Forms and MVC and span framework versions from 2.0 to 4.5. Both the ColdFusion and the .NET web servers are Windows machines, and both environments are clustered, with some servers physical and others virtual.

Our database server is SQL Server 2008 R2 and hosts several databases. Each application connects to the server with its own database user, which only has access to that application's database.

Other facts

  • When we notice problems, they occur in short bursts that last from a couple of seconds to a couple of minutes.
  • When we notice problems, each burst contains errors from several different applications, not just from one application at a time.
  • When we notice problems, each burst contains errors from applications in different web environments. (This makes us think we can rule out the applications themselves as the problem.)
  • The bursts of connectivity issues occur at different times of the day and night, not only during periods of high use.
  • We monitored the number of user connections, memory, I/O, CPU usage, and so on, and did not see any spikes or anything else that would indicate a problem.
  • We installed Wireshark on the web servers and the database server in the hope of catching the problem, without any success.
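
One way to check whether the bursts really line up across servers is to merge the error timestamps from every web server and group them by time. The following is only a minimal sketch in Python: the `group_bursts` name, the `(timestamp, source)` event shape, and the 60-second gap are assumptions made for illustration and are not taken from any of the logs described here.

```python
from datetime import datetime, timedelta


def group_bursts(events, max_gap=timedelta(seconds=60)):
    """Group (timestamp, source) error events into bursts.

    An event joins the current burst when it occurs within max_gap of the
    previous event; otherwise it starts a new burst.
    Returns a list of dicts with the keys start, end, sources and count.
    """
    bursts = []
    for ts, source in sorted(events):
        if bursts and ts - bursts[-1]["end"] <= max_gap:
            # Close enough to the previous error: extend the current burst.
            bursts[-1]["end"] = ts
            bursts[-1]["sources"].add(source)
            bursts[-1]["count"] += 1
        else:
            # Too far from the previous error: start a new burst.
            bursts.append(
                {"start": ts, "end": ts, "sources": {source}, "count": 1}
            )
    return bursts
```

A burst whose `sources` set contains both ColdFusion and .NET servers would support the idea that the cause sits below the application layer, and the `start` and `end` times give a window to look for in packet captures or switch logs.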

Questions

  • Does anyone have any suggestions on where I should look further?
  • Are there database properties that can cause this?
  • Is there a way to "better control" the connection between the database and the web server?
  • Is there anything that can be done on the application side to better understand what is going on?

Errors Detected by Applications

  • .NET Errors
    • A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server)
    • Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
    • A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - The semaphore timeout period has expired.)
    • Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached.
  • ColdFusion Errors
    • Error Executing Database Query. The TCP/IP connection to the host has failed. java.net.ConnectException: Connection timed out: connect
      The error occurred on line 38.
    • Error Executing Database Query. Connection reset by peer: socket write error
      The error occurred on line 91.
    • Error Executing Database Query. Timed out trying to establish connection. The error occurred on line 38.
1 answer

I once had a problem in CF like the one you are seeing. I had CF on one server and SQL Server 2008 R2 on another, and I would see CF errors like the ones you posted. To trace this to a network error, I did something like this:

1) I created down.bat:

tracert serverip 

2) Then I put <cftry><cfcatch> around the query.

When the query threw an error, the <cfcatch> block would run:

 <!--- Run the traceroute batch file and capture its output in "log" --->
 <cfexecute name="C:\path\to\down.bat" variable="log" timeout="60" />
 <!--- Mail the error details together with the traceroute output --->
 <cfmail to="ME" from="Server" subject="SQL DOWN">
 Server Debugging Info:
 ------------------------------------------------------------
 #now()#
 #cfcatch.Detail#
 #cfcatch.Message#
 #log#
 </cfmail>

This helped me track down the problem, which turned out to be a hardware issue in the data center.
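
The same idea can also be run as a standalone probe on each web server, so it covers the .NET machines as well and does not depend on a user request hitting the error. The following is a rough sketch in Python, not what I actually ran: the function names, the log path, and the 10-second interval are made up for illustration, and port 1433 is only the default for a default SQL Server instance, so check what your instance actually listens on.

```python
import socket
import subprocess
import sys
import time
from datetime import datetime


def check_tcp(host, port, timeout=5.0):
    """Try to open a TCP connection. Returns (ok, detail)."""
    started = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            elapsed_ms = (time.monotonic() - started) * 1000
            return True, f"connected in {elapsed_ms:.0f} ms"
    except OSError as exc:
        # Covers refused connections, timeouts and name resolution errors.
        return False, f"{type(exc).__name__}: {exc}"


def trace_route(host, timeout=120):
    """Run the platform's traceroute command and return its text output."""
    if sys.platform == "win32":
        cmd = ["tracert", "-d", host]      # -d: do not resolve host names
    else:
        cmd = ["traceroute", "-n", host]   # -n: do not resolve host names
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
        return result.stdout + result.stderr
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"trace failed: {exc}"


def probe_once(host, port, log_path):
    """Check the port once; on failure, append details and a trace to the log."""
    ok, detail = check_tcp(host, port)
    if not ok:
        with open(log_path, "a", encoding="utf-8") as log:
            log.write(f"{datetime.now().isoformat()} {host}:{port} DOWN {detail}\n")
            log.write(trace_route(host) + "\n")
    return ok


def monitor(host, port=1433, log_path="db_probe.log", interval=10):
    """Probe forever, sleeping `interval` seconds between checks."""
    while True:
        probe_once(host, port, log_path)
        time.sleep(interval)
```

Calling `monitor("serverip")` on a web server would leave a timestamped record of every failed connection attempt along with the route at that moment, which can then be compared with the errors the applications report.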
