Retrieving collection contents in a thread-safe manner

I would like to read the contents of a Java collection from multiple threads. There are many questions in this area, but none of them deals specifically with reading.

I have a collection of integers. I just want to run multiple threads over it, with each thread pulling one integer at a time. I want the whole collection to be iterated, and no integer to be pulled twice by two different threads.

Honestly, I do not know what works. I know that Iterators are not thread safe, but when it comes to read-only access, I am not sure. I ran some tests to try to provoke threading errors, but could not reach 100% certainty:

    int imax = 500;
    Collection<Integer> li = new ArrayList<Integer>(imax);
    for (int i = 0; i < imax; i++) {
        li.add(i);
    }
    final Iterator<Integer> it = li.iterator();

    Thread[] threads = new Thread[20];
    for (int i = 0; i < threads.length; i++) {
        threads[i] = new Thread("Thread " + i) {
            @Override
            public void run() {
                while (it.hasNext()) {
                    System.out.println(it.next());
                }
            }
        };
    }

    for (int ithread = 0; ithread < threads.length; ++ithread) {
        threads[ithread].setPriority(Thread.NORM_PRIORITY);
        threads[ithread].start();
    }
    try {
        for (int ithread = 0; ithread < threads.length; ++ithread)
            threads[ithread].join();
    } catch (InterruptedException ie) {
        throw new RuntimeException(ie);
    }

EDIT: In the actual use case, each of these integers is used to start intensive work, for example to determine whether it is prime.

In the above example, the list of integers is pulled out without duplicates or misses, but I do not know whether this is by chance.

Using a HashSet instead of an ArrayList also works, but again, this may be luck.

How do you do this in practice, if you have a shared collection (not necessarily a list) and need to pull its contents from multiple threads?

+4
4 answers

Your use case will benefit from using a queue. There are several thread-safe implementations, such as ArrayBlockingQueue.

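    Collection<Integer> li = new ArrayList<Integer>(imax);
    // ... fill li as in the question; ArrayBlockingQueue rejects a capacity of 0
    final BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(li.size(), false, li);

    Thread[] threads = new Thread[20];
    for (int i = 0; i < threads.length; i++) {
        threads[i] = new Thread("Thread " + i) {
            @Override
            public void run() {
                Integer value; // renamed so it does not shadow the loop variable i
                // poll() removes and returns the head, or returns null once the queue is empty
                while ((value = queue.poll()) != null) {
                    System.out.println(value);
                }
            }
        };
    }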

It is thread safe, and each thread can work independently of the others on part of the original collection.

+2

It depends on the collection. If there are no structural changes while reading, concurrent reads are fine. Most collections do NOT change structurally under read-only access or iteration, so this works, but be sure to read the documentation for the collection you are using before relying on it.

For example, the HashSet javadocs say:

Note that this implementation is not synchronized. If multiple threads access a hash set concurrently, and at least one of the threads modifies the set, it must be synchronized externally.

This means that reading from two threads at the same time is fine as long as there is no write.
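As a minimal sketch of that read-only case (the class and method names are made up for illustration), the code below lets several threads iterate over the same HashSet at once. Each thread gets its own iterator from the for-each loop, and the set is fully built before any thread starts. Note that this only shows that concurrent reads are safe; every thread sees every element, so it does not by itself divide the work.

```java
import java.util.HashSet;
import java.util.Set;

class ConcurrentReadDemo {

    // Every thread sums the whole set using its own iterator; nobody writes to the set.
    static long[] sumInParallel(Set<Integer> shared, int numThreads) {
        long[] sums = new long[numThreads];
        Thread[] threads = new Thread[numThreads];
        for (int t = 0; t < numThreads; t++) {
            final int slot = t;
            threads[t] = new Thread(() -> {
                long sum = 0;
                for (int value : shared) { // for-each creates a fresh iterator per thread
                    sum += value;
                }
                sums[slot] = sum; // each thread writes only its own slot
            });
            threads[t].start();
        }
        joinAll(threads); // join() makes the workers' writes visible to the caller
        return sums;
    }

    private static void joinAll(Thread[] threads) {
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        Set<Integer> set = new HashSet<>();
        for (int i = 0; i < 500; i++) {
            set.add(i);
        }
        for (long sum : sumInParallel(set, 4)) {
            System.out.println(sum);
        }
    }
}
```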


One way to do this is to split the data and let each thread read collection.size()/numThreads elements: thread #i reads from collection.size()/numThreads * i to collection.size()/numThreads * (i+1).

(Take special care that the last elements are not skipped. This can be done by having the last thread read from collection.size()/numThreads * i to collection.size(), but this may give the last thread more work than the others and leave you waiting for the straggling thread.)
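The range split could be sketched as follows. This is an illustration under a few assumptions, not the answerer's code: the names are hypothetical, the collection is first copied into an ArrayList so it can be indexed, and the bounds are computed as size * i / numThreads rather than (size / numThreads) * i. That small variation spreads the remainder across the threads instead of leaving it all to the last one, and the last slice still ends exactly at size.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

class RangeSplitDemo {

    // Start (inclusive) of slice i, which is also the end (exclusive) of slice i - 1.
    static int bound(int size, int numThreads, int i) {
        return (int) ((long) size * i / numThreads);
    }

    // Each thread sums its own index range; returns one partial sum per thread.
    static long[] sumSlices(Collection<Integer> source, int numThreads) {
        List<Integer> snapshot = new ArrayList<>(source); // indexable copy, never modified
        long[] partial = new long[numThreads];
        Thread[] threads = new Thread[numThreads];
        for (int t = 0; t < numThreads; t++) {
            final int slot = t;
            final int from = bound(snapshot.size(), numThreads, t);
            final int to = bound(snapshot.size(), numThreads, t + 1);
            threads[t] = new Thread(() -> {
                long sum = 0;
                for (int k = from; k < to; k++) {
                    sum += snapshot.get(k);
                }
                partial[slot] = sum; // each thread writes only its own slot
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return partial;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 500; i++) {
            data.add(i);
        }
        long total = 0;
        for (long p : sumSlices(data, 3)) {
            total += p;
        }
        System.out.println(total);
    }
}
```

With 500 elements and 3 threads, the slices are [0, 166), [166, 333) and [333, 500), so no element is skipped or read twice.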

Another option is to use a task queue of intervals: each thread takes intervals from the queue until it is empty and reads the elements in each interval it takes. The queue must be synchronized because it is modified concurrently by multiple threads.
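A rough sketch of the interval queue idea, with hypothetical names and a chunk size chosen arbitrarily. It uses ConcurrentLinkedQueue as the thread-safe queue (one possible choice, not necessarily what the answerer had in mind) and represents each interval as an int pair {from, to}. Threads that finish early simply take another interval, which evens out the load.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicLong;

class IntervalQueueDemo {

    // Sums data by letting threads pull [from, to) intervals from a shared queue.
    // data must not be modified while the threads run.
    static long sumWithIntervalQueue(List<Integer> data, int chunkSize, int numThreads) {
        Queue<int[]> intervals = new ConcurrentLinkedQueue<>();
        for (int from = 0; from < data.size(); from += chunkSize) {
            intervals.add(new int[] {from, Math.min(from + chunkSize, data.size())});
        }
        AtomicLong total = new AtomicLong();
        Thread[] threads = new Thread[numThreads];
        for (int t = 0; t < numThreads; t++) {
            threads[t] = new Thread(() -> {
                int[] interval;
                // poll() hands each interval to exactly one thread; null means the queue is empty
                while ((interval = intervals.poll()) != null) {
                    long sum = 0;
                    for (int k = interval[0]; k < interval[1]; k++) {
                        sum += data.get(k);
                    }
                    total.addAndGet(sum);
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return total.get();
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 500; i++) {
            data.add(i);
        }
        System.out.println(sumWithIntervalQueue(data, 64, 4));
    }
}
```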

+2

In general, iterating over the contents is not expensive enough to be worth multithreading. What is worth parallelizing is the work you do on each element after retrieving it. So what you have to do is the following:

  • Use a single thread to read the contents and split the workload.
  • Run multiple threads / jobs for the processing, giving each of them a share of the workload. Make sure the threads are not using the source list.
  • Use one thread to combine the results.
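The three steps above can be sketched with an ExecutorService. The names here are hypothetical, and a trial-division primality test stands in for the intensive per-element work. The calling thread reads the source once and copies it into private chunks, the workers only touch their own chunk, and the calling thread combines the counts.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SplitProcessCombineDemo {

    // Stand-in for the expensive per-element work.
    static boolean isPrime(int n) {
        if (n < 2) {
            return false;
        }
        for (int d = 2; (long) d * d <= n; d++) {
            if (n % d == 0) {
                return false;
            }
        }
        return true;
    }

    static int countPrimes(Collection<Integer> source, int numWorkers) {
        // Step 1: one thread reads the source and deals elements into private chunks.
        List<List<Integer>> chunks = new ArrayList<>();
        for (int w = 0; w < numWorkers; w++) {
            chunks.add(new ArrayList<>());
        }
        int next = 0;
        for (Integer value : source) {
            chunks.get(next++ % numWorkers).add(value);
        }

        // Step 2: each worker processes only its own chunk, never the source.
        ExecutorService pool = Executors.newFixedThreadPool(numWorkers);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (List<Integer> chunk : chunks) {
                futures.add(pool.submit(() -> {
                    int count = 0;
                    for (int v : chunk) {
                        if (isPrime(v)) {
                            count++;
                        }
                    }
                    return count;
                }));
            }
            // Step 3: the calling thread combines the partial results.
            int total = 0;
            for (Future<Integer> future : futures) {
                total += future.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Because no collection is shared between the workers, nothing in the processing step needs a lock.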

If you need to share a collection, use a thread-safe collection. These can be created using the Collections.synchronized... methods. However, keep in mind that this means threads must wait for each other, and if you do not have a significant amount of work, this will make your program slower than a single-threaded version.

Please note that all objects that you share between threads must be thread safe (for example, by wrapping all access in synchronized blocks). The best source of information on this is Java Concurrency in Practice.
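For illustration, here is a small sketch of a list shared through Collections.synchronizedList (class and method names are hypothetical). Individual calls such as add are synchronized by the wrapper, but the javadoc for synchronizedList requires the caller to synchronize on the returned list manually while iterating over it, which the sum method does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SynchronizedListDemo {

    // Several threads append squares to one shared, synchronized list.
    static List<Integer> collectSquares(int count, int numThreads) {
        List<Integer> results = Collections.synchronizedList(new ArrayList<>());
        Thread[] threads = new Thread[numThreads];
        for (int t = 0; t < numThreads; t++) {
            final int offset = t;
            threads[t] = new Thread(() -> {
                // thread t handles offset, offset + numThreads, offset + 2 * numThreads, ...
                for (int i = offset; i < count; i += numThreads) {
                    results.add(i * i); // add() is synchronized by the wrapper
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return results;
    }

    // Iteration is a compound operation, so the lock must be held for the whole loop.
    static long sum(List<Integer> synchronizedList) {
        synchronized (synchronizedList) {
            long total = 0;
            for (int value : synchronizedList) {
                total += value;
            }
            return total;
        }
    }

    public static void main(String[] args) {
        System.out.println(sum(collectSquares(100, 4)));
    }
}
```

The order of the elements in the list depends on thread scheduling; only the contents are deterministic.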

+2

You can use the synchronized wrappers available from java.util.Collections. Or you can try the special data structures in java.util.concurrent (e.g. ConcurrentHashMap).

I would prefer either of those to rolling your own.

Another thought is to synchronize the whole method, if necessary, and not just access to the collection.

And remember that immutable objects are always thread safe. You only need to synchronize shared, mutable state.
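As one possible use of java.util.concurrent for the question's scenario (a sketch with made-up names, not a definitive recipe), the code below copies the collection into a ConcurrentLinkedQueue so that each element is handed to exactly one thread, and records in a ConcurrentHashMap how many times each value was pulled. If the hand-off works as intended, every count is 1.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

class ConcurrentStructuresDemo {

    // Returns, for each value in source, how many times some thread pulled it.
    static Map<Integer, Integer> drain(Collection<Integer> source, int numThreads) {
        Queue<Integer> work = new ConcurrentLinkedQueue<>(source); // thread-safe copy
        Map<Integer, Integer> pulls = new ConcurrentHashMap<>();
        Thread[] threads = new Thread[numThreads];
        for (int t = 0; t < numThreads; t++) {
            threads[t] = new Thread(() -> {
                Integer value;
                // poll() removes the head atomically, so no two threads get the same element
                while ((value = work.poll()) != null) {
                    pulls.merge(value, 1, Integer::sum); // atomic update per key
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return pulls;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 500; i++) {
            data.add(i);
        }
        Map<Integer, Integer> pulls = drain(data, 20);
        System.out.println(pulls.size() + " distinct values pulled");
    }
}
```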

+1