I think you may be going too fast.
Most operating systems have a limit on the number of sockets they can have open at any given time, but it's actually worse than that.
When a socket is closed, it is placed in a special time-wait state (TIME_WAIT) for a certain period. This is usually twice the maximum lifetime of a packet, and it ensures that no packets still in the network are on their way to your socket.
Once this time is up, you can be sure that all packets in the network have already died. The socket is placed in this special state so that packets that were in flight when you closed it can be captured and thrown away if they arrive before they die.
I think what is happening in your case is that sockets are not being released as fast as you think.
We had a similar problem with code that opened a lot of short-lived sessions. It ran fine for a while, but then the hardware got faster, allowing many more sessions to be opened in a given period. This manifested itself as an inability to open more sessions.
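To see why faster hardware can hit this limit, here is a rough back-of-the-envelope model. It assumes every closed connection keeps its local port reserved for the whole wait period, and that all connections go to the same destination address and port. The function name and the numbers are purely illustrative, not measurements from any real system.

```python
def max_sustained_rate(available_ports: int, wait_seconds: float) -> float:
    """Upper bound on new connections per second to one destination.

    Each closed connection keeps its local port reserved for wait_seconds,
    so at most available_ports connections can be opened per wait period.
    """
    if wait_seconds <= 0:
        raise ValueError("wait_seconds must be positive")
    return available_ports / wait_seconds


# Illustrative numbers only (assumptions, not measurements):
# 16,000 usable local ports and a 60-second wait period.
print(max_sustained_rate(16_000, 60))
```

With those assumed figures, the ceiling is somewhere around 267 new connections per second. Code that opens sessions more slowly than that never notices the problem, and code running on faster hardware eventually does.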
One way to check this is to run netstat -a from the command line and see how many sessions are actually in the wait state.
If that turns out to be the case, there are several ways to deal with it.
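If you would rather count them than eyeball the output, something along these lines might help. It is only a sketch: it assumes the wait state shows up as the text TIME_WAIT in the netstat output (the exact layout varies by platform), it adds the -n flag to skip name lookups, and the function names are my own.

```python
import subprocess


def count_time_wait(netstat_text: str) -> int:
    """Count lines of netstat output that mention the TIME_WAIT state."""
    return sum(
        1 for line in netstat_text.splitlines() if "TIME_WAIT" in line.upper()
    )


def run_netstat() -> str:
    """Run `netstat -an` and return its text output (assumes netstat is installed)."""
    result = subprocess.run(
        ["netstat", "-an"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling count_time_wait(run_netstat()) a few times while your program is running should show whether the number keeps climbing.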
- Reuse your sessions, either manually or by maintaining a connection pool.
- Introduce a delay in each connection to try to avoid reaching the saturation point.
- Go full speed until you reach saturation, and then change your behavior. For example, run your connection logic inside a while statement that retries for up to 60 seconds, with a two-second delay each time, before giving up completely. This allows you to work at full speed, slowing down only if there is a problem.
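As a rough illustration of the simple fixed-delay retry from the last bullet, here is a minimal sketch. It assumes a failed attempt raises OSError (as Python's socket calls do), and the function and parameter names are hypothetical. The connect callable and the sleep function are passed in so the loop can be tried out without a real network.

```python
import time


def connect_with_fixed_delay(connect, total_seconds=60, delay_seconds=2,
                             sleep=time.sleep):
    """Call connect() until it succeeds, pausing delay_seconds after each failure.

    Re-raises the last OSError once another pause would exceed total_seconds.
    """
    waited = 0
    while True:
        try:
            return connect()
        except OSError:
            if waited + delay_seconds > total_seconds:
                raise  # out of time: report the failure to the caller
            sleep(delay_seconds)
            waited += delay_seconds
```

You could use it as connect_with_fixed_delay(lambda: socket.create_connection((host, port))), where host and port are whatever you are connecting to. With the defaults it pauses at most 30 times before giving up.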
This last bullet point deserves some expansion. In our aforementioned application we actually used a back-off strategy, which gradually reduced the load on the resource provider if it complained. So, instead of 30 two-second delays, we chose a one-second delay, then two seconds, then four, and so on.
The general process for a back-off strategy is as follows, and it can be used in any case where there may be a temporary shortage of a resource. The action mentioned in the pseudo code below would be opening the socket in your case.
    set maxdelay to 16    # maximum time period between attempts
    set maxtries to 10    # maximum attempts

    set delay to 0
    set tries to 0
    while more actions needed:
        if delay is not 0:
            sleep delay
        attempt action
        if action failed:
            add 1 to tries
            if tries is greater than maxtries:
                exit with permanent error
            if delay is 0:
                set delay to 1
            else:
                double delay
                if delay is greater than maxdelay:
                    set delay to maxdelay
        else:
            set delay to 0
            set tries to 0
This allows the process to run at full speed in the vast majority of cases, but it backs off when errors start occurring, hopefully giving the resource provider time to recover. The gradually increasing delays give more serious resource shortages time to clear, and the maximum number of attempts catches what you would call permanent errors (or errors that take too long to recover from).
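For what it's worth, the pseudo code above could be translated into Python roughly as follows. This is a sketch under a few assumptions: each action is a callable that raises OSError on failure, a failed action is retried rather than skipped, and the names (run_with_backoff and its parameters) are mine. The sleep function is injectable so the logic can be checked without actually waiting.

```python
import time


def run_with_backoff(actions, max_delay=16, max_tries=10, sleep=time.sleep):
    """Run each callable in actions in order, backing off on OSError.

    Delays go 1, 2, 4, ... seconds, capped at max_delay. Both the delay and
    the failure count reset to zero after any successful action.
    """
    delay = 0
    tries = 0
    results = []
    pending = list(actions)
    while pending:
        if delay:
            sleep(delay)
        try:
            results.append(pending[0]())
        except OSError:
            tries += 1
            if tries > max_tries:
                raise  # too many consecutive failures: permanent error
            delay = 1 if delay == 0 else min(delay * 2, max_delay)
        else:
            pending.pop(0)  # action succeeded, move on at full speed
            delay = 0
            tries = 0
    return results
```

In your case each action would be something that opens a socket and does its work. As long as nothing fails, no sleep is ever called, so the normal path runs at full speed.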