AppFabric cache seems unstable

We are trying to use the AppFabric distributed cache. After a lot of back and forth with non-domain servers, we finally joined them to the domain, which made installation and configuration a bit easier. We got it up and running after a lot of errors, most of which AppFabric could trivially have tested for or reported with a more descriptive error message. "Temporary failure" does not explain much...

But there are still problems.

We installed 3 servers, one of which is the "master". We finally got the cache working and confirmed this by pointing the network load balancer at one server at a time, verifying that we could set a cache item through one server and get it through another.

Then I restarted the AppFabric Caching Service on all servers, and suddenly the cache stopped working. Get-CacheHost says the hosts are UP, but we get exceptions such as:

  ErrorCode <ERRCA0018>: SubStatus <ES0001>: The request timed out
  ErrorCode <ERRCA0017>: SubStatus <ES0001>: There is a temporary failure.  Please retry later.

Why did this error happen just by restarting the services?

Is AppFabric Cache ready for production use?

What happens if a server goes down? Long timeouts?

Are we dependent on a "lead" server?

I suspect the cluster recovers after 5-10 minutes of rest. It seems to come back on its own.

Update: It did come back after a few minutes. We have now tested removing one server from the cluster, which led to a long timeout and, finally, an exception.

caching appfabric
1 answer

We have been debugging this for some time, and here is what we have found so far.

  • UAC in Windows Server 2008 blocks access to the local machine, so commands that target the local machine fail. As a workaround, launch PowerShell as administrator or disable UAC completely.
  • Simply editing the configuration file by hand does not work. You need to use the export and import commands.
  • Firewalls are a serious problem: the installer opens the 222* range of ports, but the PowerShell tools rely on other Windows services. Turning off the firewall on all servers (not recommended) solved the problem.
  • If a server is removed from the cluster, there is an initial timeout before the cluster works again.
  • After a restart, the cluster takes 2-5 minutes to come back up.
  • If one server is unavailable during a restart, the startup time increases.
  • If the server that hosts the file share for the configuration is unavailable, the services do not start. We tried to solve this by giving each server its own private share.
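
For the firewall point, a quick reachability check can help narrow things down before switching the firewall off entirely. The following is only a minimal sketch in Python, not part of AppFabric. The port numbers 22233-22236 are what I understand to be the default cache, cluster, arbitration and replication ports in the 222* range, so treat them as an assumption and compare them with your own cluster configuration. The function names are made up for this example. Note that it only tests the cache ports, not the other Windows services the PowerShell tools depend on.

```python
import socket

# Assumed AppFabric default ports (an assumption; verify against your cluster configuration).
DEFAULT_PORTS = {
    "cache": 22233,
    "cluster": 22234,
    "arbitration": 22235,
    "replication": 22236,
}


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable host names.
        return False


def check_hosts(hosts, ports=DEFAULT_PORTS, timeout=3.0):
    """Map each host to a {port_name: reachable} dictionary."""
    return {
        host: {name: port_is_open(host, port, timeout) for name, port in ports.items()}
        for host in hosts
    }
```

Running `check_hosts(["cache01", "cache02", "cache03"])` from each server in turn (the host names are placeholders) shows which of the cache ports are reachable between which machines. A port reported as unreachable may be blocked by the firewall, or the service may simply not be listening yet, for example during the 2-5 minute startup window mentioned above.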

Source: https://habr.com/ru/post/651292/