Graceful degradation in Java to avoid memory errors

What tools or best practices are available for gracefully degrading a Java service under memory-intensive batch requests? The application is multithreaded. The amount of work required to process each request varies greatly and is not easy to split and parallelize.

I am reluctant to write application-level code that deals with heap usage and GC, but we find that the application can get into trouble, meaning out-of-memory errors or repeated full GCs, when it takes on more than one intensive request at a time. Often, a full GC cannot free any memory.

In short: I'm thinking of adding some throttling capabilities or queues to forestall this problem.

Any ideas or tips appreciated.

+4
source share
5 answers

As joeslice said, implement throttling through a simple resource pool. At the most basic level, this is a semaphore: your worker threads must acquire a permit before processing a request. Since you say you have heterogeneous tasks, you probably want the permits to be a bit more sophisticated, e.g. acquire a number of permits proportional to the size of the job.

In the past, I have found that this does not always work. Let's say your heuristics are off and your application throws an OutOfMemoryError anyway. It is important to keep the process from hanging around in a bad state, so kill and restart the process immediately. There are several ways to notice when an OOM occurs, for example, see "java out of memory then exit".
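One way to sketch the "notice the OOM, then exit" idea in code is a default uncaught-exception handler. The class name `OomGuard`, the exit status 3, and the injectable exit action are all made up for this illustration. Note the limitation: the handler only fires when the error propagates out of a thread uncaught, so an OutOfMemoryError captured inside a `Future` will not trigger it. On HotSpot JVMs that support it (JDK 8u92 and later, as far as I know), the `-XX:+ExitOnOutOfMemoryError` flag does the job without any code.

```java
import java.util.function.IntConsumer;

/** Hypothetical helper: terminates the JVM when an OutOfMemoryError escapes a thread. */
final class OomGuard {
    /** Exit status is an arbitrary choice for this sketch. */
    static final int OOM_EXIT_CODE = 3;

    /** Builds a handler that calls exitAction only for OutOfMemoryError. */
    static Thread.UncaughtExceptionHandler handler(IntConsumer exitAction) {
        return (thread, error) -> {
            if (error instanceof OutOfMemoryError) {
                exitAction.accept(OOM_EXIT_CODE);
            }
        };
    }

    /** Installs the handler for all threads that have no handler of their own. */
    static void install() {
        // halt() skips shutdown hooks, which may themselves need memory to run.
        Thread.setDefaultUncaughtExceptionHandler(
                handler(code -> Runtime.getRuntime().halt(code)));
    }
}
```

Call `OomGuard.install()` once at startup and let an external supervisor (init script, container runtime, etc.) restart the process.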

+1
source

Here is an example implementation by the Netty authors ( link ). They basically track memory usage and throttle directly based on that statistic.
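To give a feel for throttling on a memory statistic, here is a rough sketch that reads heap usage from `Runtime`. This is not the Netty code, and `HeapGauge` and its methods are names invented for the example. The reading counts garbage that has not been collected yet, so it is pessimistic and noisy; treat it as a hint for delaying new work, not as an exact measurement.

```java
/** Hypothetical helper: coarse view of how full the heap currently is. */
final class HeapGauge {
    /** Fraction of the maximum heap currently occupied (includes uncollected garbage). */
    static double usedFraction() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        // maxMemory() returns Long.MAX_VALUE when there is no inherent limit.
        return (double) used / rt.maxMemory();
    }

    /** True if heap occupancy is below the given threshold, e.g. 0.8 for 80%. */
    static boolean hasHeadroom(double maxUsedFraction) {
        return usedFraction() < maxUsedFraction;
    }
}
```

A request handler could check `HeapGauge.hasHeadroom(0.8)` before accepting a heavy batch job and queue or reject the job otherwise. The 0.8 threshold is only a placeholder to tune.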

Another, cruder way to do this is to limit concurrent execution using a fixed thread pool and a bounded queue. The usual approach is to let the thread that calls queue.put() run the task itself once the queue is full. That way, the load will (well, supposedly) propagate all the way back to the client until new requests are created more slowly. Consequently, the application's behavior becomes more "graceful."

In practice, I use the "crude" method described above almost exclusively. It works very well. Basically, it is a combination of a fixed thread pool, a bounded queue, and a caller-runs rejection policy. I keep the parameters (queue size, thread pool size) configurable and tune them once the design is complete. Sometimes it becomes apparent that the thread pool can be shared between services, etc., so it is very convenient to use the ThreadPoolExecutor class to get the pooling / bounded queue / caller-runs policy all wrapped in one.
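A minimal sketch of that combination using the standard `ThreadPoolExecutor`, `ArrayBlockingQueue`, and `ThreadPoolExecutor.CallerRunsPolicy` classes. The factory class `BoundedExecutors` and its parameter names are invented for the example, and the right sizes depend on your workload.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical factory for a throttling executor. */
final class BoundedExecutors {
    /**
     * Fixed-size pool with a bounded queue. When all threads are busy and the
     * queue is full, the submitting thread runs the task itself, which slows
     * down whoever is producing the work.
     */
    static ThreadPoolExecutor create(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads,              // core == max: a fixed pool
                0L, TimeUnit.MILLISECONDS,     // keep-alive is irrelevant for a fixed pool
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }
}
```

With `BoundedExecutors.create(4, 16)`, at most 4 tasks run on pool threads and 16 wait. Task number 21 runs on the thread that submitted it, so that thread cannot accept further requests until it finishes.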

+1
source

I wonder if there is a way to estimate in advance roughly how much memory a given task will use... If there were some way to tell that a particular input could lead to explosive memory consumption, maybe you could avoid running it at the same moment as another heavyweight task.

If you can determine the relative size from task to task (this is a big assumption), you can allow (say) 100 units of work to be in progress at once using a counting Semaphore. A typical job might count as just one unit (and need only one permit), while a larger job might be required to acquire 10 or 20 permits before starting...
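Roughly, the permit scheme could look like the sketch below. `WeightedThrottle` and its methods are names made up for this example, and the weight of each job is something your own heuristic has to supply. It uses `acquireUninterruptibly` to keep the sketch free of checked exceptions; `Semaphore.acquire(int)` is the interruptible variant.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical helper: admits jobs only while their combined weight fits a fixed budget. */
final class WeightedThrottle {
    private final int totalPermits;
    private final Semaphore permits;

    WeightedThrottle(int totalPermits) {
        this.totalPermits = totalPermits;
        // Fair mode hands out permits in arrival order, so a large job that is
        // waiting is not overtaken forever by a stream of small ones.
        this.permits = new Semaphore(totalPermits, true);
    }

    /** Blocks until `weight` permits are free, runs the job, then returns the permits. */
    <T> T run(int weight, Supplier<T> job) {
        // Clamp so an oversized job can still run (alone) instead of blocking forever.
        int needed = Math.max(1, Math.min(weight, totalPermits));
        permits.acquireUninterruptibly(needed);
        try {
            return job.get();
        } finally {
            permits.release(needed);
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```

With `new WeightedThrottle(100)`, a typical job would call `run(1, ...)` and a heavy one `run(20, ...)`, so at most five heavy jobs can overlap.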

Of course, if you cannot determine anything about the amount of memory consumed, you could still look into subdividing your problem further so that you run many small jobs instead of a small number of large ones.

0
source

Application servers typically have settings for a worker thread pool. The maximum number of threads in this pool roughly determines how much memory you will consume. This is a simple and, importantly, a working concept.

I would not call it "graceful degradation." This is throttling. Graceful degradation involves lowering the level of service (for example, the amount of detail provided to the user) in order to maintain at least the basic essential functions for every current user. With throttling, additional users are simply out of luck.

Graceful degradation by this definition requires knowledge of the nature of the application, so your code has to know about it.

The obvious approach is to divide all possible operations into classes according to how essential they are to the user. Class 1 operations should always be served. Class 2 (3, 4, ...) operations should be served only if the server is below a certain load level; otherwise a "temporarily unavailable" error is returned.
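As a rough illustration of the class scheme, the sketch below uses the number of in-flight operations as the "load level". That metric, the `LoadShedder` name, and the per-class limits are all assumptions made for the example; a real service might use queue depth or latency instead.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: sheds non-essential operations when the server is busy. */
final class LoadShedder {
    // limits[0] is the maximum in-flight count at which class 2 is still served,
    // limits[1] applies to class 3, and so on. Class 1 has no limit.
    private final int[] limits;
    private final AtomicInteger inFlight = new AtomicInteger();

    LoadShedder(int... limits) {
        if (limits.length == 0) {
            throw new IllegalArgumentException("at least one limit is required");
        }
        this.limits = limits.clone();
    }

    /**
     * Runs the operation unless its class is shed at the current load.
     * Returns false when the caller should answer "temporarily unavailable".
     */
    boolean tryHandle(int serviceClass, Runnable operation) {
        int current = inFlight.incrementAndGet(); // count includes this request
        try {
            if (serviceClass > 1 && current > limitFor(serviceClass)) {
                return false;
            }
            operation.run();
            return true;
        } finally {
            inFlight.decrementAndGet();
        }
    }

    private int limitFor(int serviceClass) {
        // Classes beyond the configured ones reuse the last (strictest) limit.
        int index = Math.min(serviceClass - 2, limits.length - 1);
        return limits[index];
    }
}
```

With `new LoadShedder(50, 20)`, class 1 is always served, class 2 is served while at most 50 operations are in flight, and class 3 and above only while at most 20 are.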

0
source

Are you using J2EE? Load balancing is the application server's job, and I am sure most major app servers support it. Your application should not have to worry about this.

0
source
