I have a producer-consumer scenario where producers produce much faster than consumers can consume. Typically, the solution is to block the producers, since a producer/consumer pipeline only runs as fast as its slowest component. Throttling or blocking the producers is not a good fit here, though, because our application provides enough idle time later for the consumers to catch up.
Here is a diagram depicting the full "phase" in our application compared to the more common scenario:
```
        Our Application                  Common Scenario

2N +--------+--------+
   |PPPPPPPP|oooooooo|                                          P = Producer
   |PPPPPPPP|oooooooo|                                          C = Consumer
 N +--------+--------+      N +--------+--------+--------+      o = Other Work
   |CPCPCPCP|CCCCCCCC|        |CPCPCPCP|CPCPCPCP|oooooooo|      N = number of tasks
   |CPCPCPCP|CCCCCCCC|        |CPCPCPCP|CPCPCPCP|oooooooo|
   -------------------        ----------------------------
   0       T/2      T         0       T/2      T      3T/2
```
The idea is to maximize throughput without slowing down the producers.
The data our tasks operate on is easily serialized, so I plan to implement a filesystem-based solution to buffer any tasks that cannot be executed right away.
I am using Java's ThreadPoolExecutor with a bounded BlockingQueue (i.e. one with a maximum capacity) to ensure we do not run out of memory. The problem is implementing such a "multi-tiered" queue, where tasks that fit in the in-memory queue are executed right away, while overflow is queued on disk instead.
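For concreteness, here is a minimal sketch of that bounded setup (the pool size and queue capacity are placeholder values I chose for illustration). When the ArrayBlockingQueue fills, the executor invokes its RejectedExecutionHandler, which is one natural hook for spilling overflow to disk:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Minimal sketch: a fixed-size pool over a bounded in-memory queue.
// The pool size (4) and capacity (1000) are illustrative placeholders.
ThreadPoolExecutor executor = new ThreadPoolExecutor(
        4, 4,                              // core and max pool size
        0L, TimeUnit.MILLISECONDS,         // no keep-alive for core threads
        new ArrayBlockingQueue<>(1000),    // bounded queue caps memory use
        (task, exec) -> {
            // RejectedExecutionHandler: fires when the queue is full.
            // One natural place to serialize the task's data to disk.
        });
```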
I came up with two possible solutions:
- Implement a custom BlockingQueue from scratch, using LinkedBlockingQueue or ArrayBlockingQueue as a reference. This could be as simple as copying the implementation from the standard library and adding filesystem reads and writes.
- Keep a standard BlockingQueue implementation, implement a separate FilesystemQueue for storing the overflow data, and use one or more threads to dequeue files, create Runnables from them, and resubmit those to the ThreadPoolExecutor (see the sketch after this list).
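To make the second option concrete, here is a rough, hypothetical sketch. The class and method names (FilesystemQueue, enqueue, drainOne) are my own, and it assumes the task data implements Serializable:

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of option 2; names are illustrative, not an existing API.
public class FilesystemQueue {
    private final Path spoolDir;
    private final AtomicLong seq = new AtomicLong();

    public FilesystemQueue(Path spoolDir) throws IOException {
        this.spoolDir = Files.createDirectories(spoolDir);
    }

    // Serialize one task's data to its own file; called when the
    // in-memory queue rejects a submission.
    public void enqueue(Serializable taskData) throws IOException {
        Path file = spoolDir.resolve(seq.getAndIncrement() + ".task");
        try (ObjectOutputStream out =
                 new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(taskData);
        }
    }

    // Deserialize one spooled task and resubmit it; returns false when the
    // spool is empty. DirectoryStream gives no ordering guarantee, so the
    // numeric file names would need to be sorted if FIFO order matters.
    public boolean drainOne(ThreadPoolExecutor executor)
            throws IOException, ClassNotFoundException {
        try (DirectoryStream<Path> files =
                 Files.newDirectoryStream(spoolDir, "*.task")) {
            for (Path file : files) {
                Serializable data;
                try (ObjectInputStream in =
                         new ObjectInputStream(Files.newInputStream(file))) {
                    data = (Serializable) in.readObject();
                }
                Files.delete(file);
                // May be rejected again if the in-memory queue has refilled.
                executor.execute(toRunnable(data));
                return true;
            }
        }
        return false;
    }

    // Application-specific: rebuild the Runnable from its serialized data.
    private Runnable toRunnable(Serializable data) {
        return () -> { /* process data */ };
    }
}
```

A dedicated drainer thread could then loop, calling drainOne whenever executor.getQueue().remainingCapacity() > 0 and sleeping briefly whenever the spool directory is empty.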
Are either of these reasonable, and is there perhaps a better approach?