Regarding the inner workings of ConcurrentHashMap

I went through ConcurrentHashMap and this related tutorial and had some questions.

  • The article mentions that ConcurrentHashMap allows multiple readers to read at the same time without any locks. This is achieved by partitioning the map based on the concurrency level and locking only the affected part of the map during updates. The default concurrency level is 16, so the map is divided into 16 parts, each guarded by a different lock. This means 16 threads can operate on the map at the same time, as long as they are working on different parts of it. This makes ConcurrentHashMap highly performant while still being thread-safe. Nevertheless, it comes with a caveat: since update operations such as put(), remove(), putAll() or clear() are not synchronized, a concurrent retrieval may not reflect the most recent changes on the map. (See the sketch after this question.)

  • Another point mentioned in the article: another important thing to remember is iteration over a CHM; the Iterator returned by keySet is weakly consistent, reflects only the state of the ConcurrentHashMap at a certain point, and may not reflect any recent changes.

I do not understand the points quoted above. Could you provide more information or show me in a simple program?
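
A minimal, hypothetical sketch of the first point (the class name, keys, and thread count are my own choices, and note that since Java 8 the third constructor argument, concurrencyLevel, is treated only as a sizing hint rather than a hard segment count): sixteen threads update sixteen different keys at the same time, which is exactly the situation the segmented locking is designed to make cheap.

    import java.util.concurrent.ConcurrentHashMap;

    public class SegmentedUpdates {
        public static void main(String[] args) throws InterruptedException {
            // Initial capacity 16, load factor 0.75, concurrency level 16:
            // roughly 16 threads can update different parts of the map at once.
            ConcurrentHashMap<String, Integer> map =
                    new ConcurrentHashMap<>(16, 0.75f, 16);

            Thread[] writers = new Thread[16];
            for (int i = 0; i < writers.length; i++) {
                final int id = i;
                // Each thread works on its own key, so the threads rarely
                // contend for the same lock (segment/bin).
                writers[i] = new Thread(() -> map.put("key-" + id, id));
                writers[i].start();
            }
            for (Thread t : writers) {
                t.join();
            }
            System.out.println(map.size()); // prints 16
        }
    }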

2 answers
  • Since update operations such as put(), remove(), putAll() or clear() are not synchronized, a concurrent retrieval may not reflect the most recent changes on the map

    As I understand it, this means that a modification of the map made in one thread is not necessarily visible to a lookup happening at the same time in another thread. Consider the following example:

           Thread 1 starts a     Thread 1's call to get("a")
           call to get("a")      completes, returning null
                   |                        |
      Thread 1 ----+------------------------+-------
                                                      time ----->
      Thread 2 -+--------------------+----------------
                |                    |
        Thread 2 starts a    Thread 2's call to
        call to put("a", 1)  put("a", 1) completes

    Even though Thread 2's put completed before Thread 1's get did, Thread 1 did not see the map modification and returned null.
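
    A minimal sketch of that timeline (the class name and the key "a" are just illustrative): the two calls overlap, so on any given run the get may return either null or 1, depending on how the threads interleave.

      import java.util.concurrent.ConcurrentHashMap;

      public class OverlappingGetPut {
          public static void main(String[] args) throws InterruptedException {
              ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();

              // Thread 2 from the timeline: puts the entry.
              Thread writer = new Thread(() -> map.put("a", 1));

              // Thread 1 from the timeline: reads the same key concurrently.
              Thread reader = new Thread(() -> {
                  // May print null (the put was not yet visible) or 1
                  // (the put had already taken effect), depending on timing.
                  System.out.println("get(\"a\") returned " + map.get("a"));
              });

              writer.start();
              reader.start();
              writer.join();
              reader.join();
          }
      }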

  • Another important point to remember is iteration over a CHM: the Iterator returned by the keySet of a ConcurrentHashMap is weakly consistent, reflects only the state of the ConcurrentHashMap at some point in time, and may not reflect any recent changes.

    This is a similar situation. If Thread 1 obtains an Iterator from the ConcurrentHashMap's keySet, and Thread 2 later puts a new entry into the map, Thread 1's Iterator is not guaranteed to see that entry. (It may or may not see it.)
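
    A minimal sketch of the weakly consistent iterator (names are illustrative). Here the modification is made on the same thread after the iterator is obtained; unlike a plain HashMap iterator, a ConcurrentHashMap iterator never throws ConcurrentModificationException, and the new entry may or may not appear during traversal.

      import java.util.Iterator;
      import java.util.concurrent.ConcurrentHashMap;

      public class WeaklyConsistentIteration {
          public static void main(String[] args) {
              ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
              map.put("a", 1);
              map.put("b", 2);

              // The iterator reflects the state of the map at (roughly)
              // the point it was created.
              Iterator<String> keys = map.keySet().iterator();

              // Modify the map after obtaining the iterator. With a HashMap,
              // the next call to keys.next() would throw
              // ConcurrentModificationException; ConcurrentHashMap's
              // iterator never does.
              map.put("c", 3);

              while (keys.hasNext()) {
                  // Prints "a" and "b"; "c" may or may not be printed.
                  System.out.println(keys.next());
              }
          }
      }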


The real problem here is that when multiple threads operate on the data structure, the threads will not necessarily march in lock step.

One thread reads on behalf of user1. Another thread writes on behalf of user2. Neither thread can predict where the other is in its own work, and we cannot predict the order in which the two operations will complete for the users. If the write updates the data first, the read will show the updated state, even though user1 may have requested the read slightly earlier.

Reading or modifying during iteration is the same, with the additional consideration that advancing to the next element (during iteration) is essentially a "read" operation on the state of the map, if not of the contents of any particular entry in it.
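
A small sketch of that point, under the same assumptions as the other examples (names are illustrative): one thread keeps rewriting the map while the main thread iterates, so each step of the loop is effectively a read of whatever the map looks like at that instant. The traversal completes without exceptions, but the number of entries it observes can vary from run to run.

    import java.util.concurrent.ConcurrentHashMap;

    public class IterateWhileWriting {
        public static void main(String[] args) throws InterruptedException {
            ConcurrentHashMap<Integer, String> map = new ConcurrentHashMap<>();
            for (int i = 0; i < 1000; i++) {
                map.put(i, "user" + i);
            }

            // Writer thread keeps mutating the map during the iteration below.
            Thread writer = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    map.remove(i);
                    map.put(1000 + i, "user" + (1000 + i));
                }
            });
            writer.start();

            // Each iteration step is effectively a "read" of the map's state
            // at that moment; entries added or removed mid-iteration may or
            // may not be observed, and no exception is thrown.
            int observed = 0;
            for (Integer ignored : map.keySet()) {
                observed++;
            }
            writer.join();
            System.out.println("entries observed while iterating: " + observed);
        }
    }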

So, when you allow concurrency on these data structures, what you get is a "close enough" answer in terms of time. (This is very similar to the same consideration with databases, except that we are used to thinking about databases this way, and the time scales involved differ by several orders of magnitude.)

NOTE: to comment on the wonderful little timeline shown by @Matts in the other answer ...

The timeline shows two threads and the start and end of each thread's operation. The starts of the two threads can occur in either order, (a, b) or (b, a). The ends can also occur in either order, because you cannot determine how long each operation will take. That gives four ways for the two threads to start and end: a starts and ends first; a starts first and b ends first; b starts first and a ends first; b starts and ends first. Now... imagine 20 threads doing the same thing in response to, say, 20 end users sending requests for this and that, and think of how many possible ways that can play out.

