I have a producer-consumer scenario where producers produce much faster than consumers can consume. Typically, the solution is to block the producers, since a producer/consumer pipeline only runs as fast as its slowest component. Throttling or blocking the producers is not a good fit here, though, because our application provides enough idle time later for the consumers to catch up.
Here is a diagram depicting the full "phase" in our application compared to the more common scenario:
```
        Our Application                  Common Scenario

2N +--------+--------+
   |PPPPPPPP|oooooooo|                                          P = Producer
   |PPPPPPPP|oooooooo|                                          C = Consumer
 N +--------+--------+      N +--------+--------+--------+      o = Other Work
   |CPCPCPCP|CCCCCCCC|        |CPCPCPCP|CPCPCPCP|oooooooo|      N = number of tasks
   |CPCPCPCP|CCCCCCCC|        |CPCPCPCP|CPCPCPCP|oooooooo|
   -------------------        ----------------------------
   0       T/2      T         0       T/2      T      3T/2
```
The idea is to maximize throughput without slowing down the producers.
The data our tasks operate on is easily serialized, so I plan to implement a filesystem-based solution to buffer any tasks that cannot be executed right away.
I am using Java's ThreadPoolExecutor with a bounded BlockingQueue (i.e. one with a maximum capacity) to ensure we do not run out of memory. The problem is implementing such a "multi-tiered" queue, where tasks that fit in the in-memory queue are executed right away, while overflow is queued on disk instead.
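For concreteness, here is a minimal sketch of that bounded setup (the pool size and queue capacity are placeholder values I chose for illustration). When the ArrayBlockingQueue fills, the executor invokes its RejectedExecutionHandler, which is one natural hook for spilling overflow to disk:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Minimal sketch: a fixed-size pool over a bounded in-memory queue.
// The pool size (4) and capacity (1000) are illustrative placeholders.
ThreadPoolExecutor executor = new ThreadPoolExecutor(
        4, 4,                              // core and max pool size
        0L, TimeUnit.MILLISECONDS,         // no keep-alive for core threads
        new ArrayBlockingQueue<>(1000),    // bounded queue caps memory use
        (task, exec) -> {
            // RejectedExecutionHandler: fires when the queue is full.
            // One natural place to serialize the task's data to disk.
        });
```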
I came up with two possible solutions:
- Implement a custom BlockingQueue from scratch, using LinkedBlockingQueue or ArrayBlockingQueue as a reference. This could be as simple as copying the implementation from the standard library and adding filesystem reads and writes.
- Keep a standard BlockingQueue implementation, implement a separate FilesystemQueue for storing the overflow data, and use one or more threads to dequeue files, create Runnables from them, and resubmit those to the ThreadPoolExecutor (see the sketch after this list).
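To make the second option concrete, here is a rough, hypothetical sketch. The class and method names (FilesystemQueue, enqueue, drainOne) are my own, and it assumes the task data implements Serializable:

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of option 2; names are illustrative, not an existing API.
public class FilesystemQueue {
    private final Path spoolDir;
    private final AtomicLong seq = new AtomicLong();

    public FilesystemQueue(Path spoolDir) throws IOException {
        this.spoolDir = Files.createDirectories(spoolDir);
    }

    // Serialize one task's data to its own file; called when the
    // in-memory queue rejects a submission.
    public void enqueue(Serializable taskData) throws IOException {
        Path file = spoolDir.resolve(seq.getAndIncrement() + ".task");
        try (ObjectOutputStream out =
                 new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(taskData);
        }
    }

    // Deserialize one spooled task and resubmit it; returns false when the
    // spool is empty. DirectoryStream gives no ordering guarantee, so the
    // numeric file names would need to be sorted if FIFO order matters.
    public boolean drainOne(ThreadPoolExecutor executor)
            throws IOException, ClassNotFoundException {
        try (DirectoryStream<Path> files =
                 Files.newDirectoryStream(spoolDir, "*.task")) {
            for (Path file : files) {
                Serializable data;
                try (ObjectInputStream in =
                         new ObjectInputStream(Files.newInputStream(file))) {
                    data = (Serializable) in.readObject();
                }
                Files.delete(file);
                // May be rejected again if the in-memory queue has refilled.
                executor.execute(toRunnable(data));
                return true;
            }
        }
        return false;
    }

    // Application-specific: rebuild the Runnable from its serialized data.
    private Runnable toRunnable(Serializable data) {
        return () -> { /* process data */ };
    }
}
```

A dedicated drainer thread could then loop, calling drainOne whenever executor.getQueue().remainingCapacity() > 0 and sleeping briefly whenever the spool directory is empty.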
Are either of these reasonable, and is there perhaps a better approach?