C: How does pthread thread-specific data work?

I am not sure how pthread thread-specific data works. Given the following code (found on the Internet), does this mean that I can create, say, 5 threads and have func called in only some of them (say 2), so that those threads have a value stored under the key (ptr = malloc(OBJECT_SIZE)), while the other threads have the same key but with a NULL value?

    static pthread_key_t key;
    static pthread_once_t key_once = PTHREAD_ONCE_INIT;

    static void make_key()
    {
        (void) pthread_key_create(&key, NULL);
    }

    func()
    {
        void *ptr;

        (void) pthread_once(&key_once, make_key);
        if ((ptr = pthread_getspecific(key)) == NULL) {
            ptr = malloc(OBJECT_SIZE);
            ...
            (void) pthread_setspecific(key, ptr);
        }
        ...
    }

An explanation of how thread-specific data works, and how it might be implemented in pthreads (in a simple way), would be appreciated!

2 answers

Your reasoning is correct. These calls are for thread-specific data. They give each thread a “global” area where it can store what it needs, but only if it needs it.

The key is shared among all threads, since pthread_once() creates it exactly once, the first time it is needed. The value stored under that key is different for each thread, and is NULL in any thread that has not set it. Because the value is a void* that can point to a block of memory, a thread that needs thread-specific data can allocate it and save the address for later use. Threads that never call a routine needing thread-specific data never waste memory, because nothing is ever allocated for them.
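The scenario from the question can be sketched as follows: five threads, only two of which call func. This is a minimal illustration, not part of the original code. OBJECT_SIZE, worker_main, struct worker and count_threads_with_value are made-up names, and passing free as the destructor is an addition (the original passes NULL).

```c
#include <pthread.h>
#include <stdlib.h>

#define OBJECT_SIZE 64      /* hypothetical size, not from the original */
#define NUM_THREADS 5

static pthread_key_t key;
static pthread_once_t key_once = PTHREAD_ONCE_INIT;

static void make_key(void)
{
    /* free() is an added destructor: it runs at thread exit for
       every thread whose value under this key is non-NULL. */
    (void) pthread_key_create(&key, free);
}

static void func(void)
{
    void *ptr;

    (void) pthread_once(&key_once, make_key);
    if ((ptr = pthread_getspecific(key)) == NULL) {
        ptr = malloc(OBJECT_SIZE);
        if (ptr != NULL)
            (void) pthread_setspecific(key, ptr);
    }
}

struct worker {
    int call_func;   /* input: should this thread call func()? */
    int has_value;   /* output: was the thread's value non-NULL? */
};

static void *worker_main(void *arg)
{
    struct worker *w = arg;

    /* Make sure the key exists before reading it, even in threads
       that never call func(). */
    (void) pthread_once(&key_once, make_key);
    if (w->call_func)
        func();
    w->has_value = (pthread_getspecific(key) != NULL);
    return NULL;
}

/* Starts NUM_THREADS threads, lets only the first two call func(),
   and returns how many threads saw a non-NULL value (-1 on error). */
int count_threads_with_value(void)
{
    pthread_t tid[NUM_THREADS];
    struct worker w[NUM_THREADS];
    int i, count = 0;

    for (i = 0; i < NUM_THREADS; i++) {
        w[i].call_func = (i < 2);
        w[i].has_value = 0;
        if (pthread_create(&tid[i], NULL, worker_main, &w[i]) != 0)
            return -1;
    }
    for (i = 0; i < NUM_THREADS; i++) {
        (void) pthread_join(tid[i], NULL);
        count += w[i].has_value;
    }
    return count;
}
```

Only the two threads that called func end up with a value. The other three share the same key but read NULL from it.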

One area where I have used them was in making a C standard library thread-safe. In an implementation I was involved in, strtok() (as opposed to the thread-safe strtok_r(), which was considered an abomination when we did this) used almost exactly this code on its first call to allocate some memory in which strtok() stored its state for subsequent calls. Those subsequent calls retrieved the thread-specific data to continue tokenizing the string without interfering with other threads doing the same.

This meant that users of the library did not need to worry about crosstalk between threads. A single thread still had to make sure it did not start a new tokenization until its previous one had finished, but that is the same as for single-threaded code.
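A strtok()-style function built this way might look roughly like the sketch below. It is not the library code described above, only an illustration of the idea. my_strtok, struct tok_state, tokenize and run_two_tokenizers are hypothetical names.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Per-thread tokenizer state (hypothetical layout). */
struct tok_state {
    char *next;   /* where the next call should resume, or NULL */
};

static pthread_key_t tok_key;
static pthread_once_t tok_once = PTHREAD_ONCE_INIT;

static void tok_make_key(void)
{
    /* free() releases each thread's state when the thread exits. */
    (void) pthread_key_create(&tok_key, free);
}

/* Same calling convention as strtok(), but the saved position lives
   in thread-specific data instead of a single static variable. */
char *my_strtok(char *str, const char *delim)
{
    struct tok_state *st;
    char *tok;

    (void) pthread_once(&tok_once, tok_make_key);
    st = pthread_getspecific(tok_key);
    if (st == NULL) {                     /* first call in this thread */
        st = calloc(1, sizeof *st);
        if (st == NULL)
            return NULL;
        st->next = NULL;
        if (pthread_setspecific(tok_key, st) != 0) {
            free(st);
            return NULL;
        }
    }

    if (str == NULL)
        str = st->next;                   /* continue previous string */
    if (str == NULL)
        return NULL;

    str += strspn(str, delim);            /* skip leading delimiters */
    if (*str == '\0') {
        st->next = NULL;
        return NULL;
    }
    tok = str;
    str += strcspn(str, delim);           /* find end of the token */
    if (*str != '\0') {
        *str = '\0';
        st->next = str + 1;
    } else {
        st->next = NULL;
    }
    return tok;
}

struct job {
    char text[32];
    const char *delim;
    int ntokens;
};

static void *tokenize(void *arg)
{
    struct job *j = arg;
    char *t;

    for (t = my_strtok(j->text, j->delim); t != NULL;
         t = my_strtok(NULL, j->delim))
        j->ntokens++;
    return NULL;
}

/* Tokenizes two different strings in two threads at the same time
   and reports how many tokens each thread found. Returns 0 on success. */
int run_two_tokenizers(int *count_a, int *count_b)
{
    struct job a = { "a,b,c", ",", 0 };
    struct job b = { "w x y z", " ", 0 };
    pthread_t ta, tb;

    if (pthread_create(&ta, NULL, tokenize, &a) != 0)
        return -1;
    if (pthread_create(&tb, NULL, tokenize, &b) != 0) {
        (void) pthread_join(ta, NULL);
        return -1;
    }
    (void) pthread_join(ta, NULL);
    (void) pthread_join(tb, NULL);
    *count_a = a.ntokens;
    *count_b = b.ntokens;
    return 0;
}
```

Each thread keeps its own resume position, so the two tokenizations cannot disturb each other however the calls interleave. Within one thread the usual strtok() restriction still applies: starting a second string before finishing the first loses the first position.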

This allowed us to give a “proper” C environment to each thread running on our system, without the usual “you must call these special non-standard re-entrant routines” limitation that other vendors imposed on their users.

As for the implementation, from what I remember of DCE user-mode threads (which, I think, were the predecessor of current pthreads), each thread had a single structure that stored things like the instruction pointer, stack pointer, register contents and so on. It was very simple to add one pointer to this structure to get very powerful functionality at minimal cost. The pointer pointed to an array (a linked list in some implementations) of key/pointer pairs, so that each thread could have several keys (for example, one for strtok(), one for rand()).
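To make that layout concrete, here is a toy model of it. It is not DCE or pthreads source code, and every name in it (struct toy_thread, toy_key_create and so on) is invented for illustration. A real library finds the current thread's structure implicitly; this sketch passes it in explicitly and ignores locking and destructors.

```c
#include <stdlib.h>

#define TOY_KEYS_MAX 8          /* arbitrary limit for the sketch */

typedef unsigned toy_key_t;

/* Stand-in for the per-thread control structure. */
struct toy_thread {
    void *instr_ptr;            /* saved instruction pointer (unused here) */
    void *stack_ptr;            /* saved stack pointer (unused here) */
    void **specific;            /* the one extra pointer: values by key */
};

static unsigned toy_next_key;   /* keys are global, shared by all threads */

/* Hands out the next free key index. Not thread-safe in this sketch. */
int toy_key_create(toy_key_t *key)
{
    if (toy_next_key >= TOY_KEYS_MAX)
        return -1;
    *key = toy_next_key++;
    return 0;
}

/* A thread that never stored anything has no array at all, so it
   costs no memory and every key reads as NULL. */
void *toy_getspecific(const struct toy_thread *t, toy_key_t key)
{
    if (t->specific == NULL || key >= TOY_KEYS_MAX)
        return NULL;
    return t->specific[key];
}

/* The array is allocated lazily, on the thread's first store.
   calloc's zero fill is assumed to yield NULL pointers. */
int toy_setspecific(struct toy_thread *t, toy_key_t key, void *value)
{
    if (key >= toy_next_key)
        return -1;              /* key was never created */
    if (t->specific == NULL) {
        t->specific = calloc(TOY_KEYS_MAX, sizeof *t->specific);
        if (t->specific == NULL)
            return -1;
    }
    t->specific[key] = value;
    return 0;
}

/* Two "threads", two keys: only t1 stores a value, under rand_key.
   Returns 1 if the lookups behave as described. */
int toy_demo(void)
{
    struct toy_thread t1 = { NULL, NULL, NULL };
    struct toy_thread t2 = { NULL, NULL, NULL };
    toy_key_t strtok_key, rand_key;
    static int seed = 42;
    int ok;

    if (toy_key_create(&strtok_key) != 0 || toy_key_create(&rand_key) != 0)
        return 0;
    if (toy_setspecific(&t1, rand_key, &seed) != 0)
        return 0;

    ok = toy_getspecific(&t1, rand_key) == &seed
      && toy_getspecific(&t1, strtok_key) == NULL
      && toy_getspecific(&t2, rand_key) == NULL;

    free(t1.specific);
    free(t2.specific);
    return ok;
}
```

The key is just an index that every thread agrees on. The value comes from whichever thread structure is used for the lookup.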


The answer to your first question is yes. Simply put, it allows each thread to allocate and save its own data. This is roughly equivalent to each thread allocating its own data structure and passing it around. The API saves you from having to pass that thread-local structure down through every subfunction, and lets you look it up on demand instead.

The implementation does not matter much (it may vary from OS to OS), as long as the results are the same.

You can think of it as a two-level hashmap. The key determines which thread-local "variable" you want to access, and the second level is a lookup by thread that finds the value for the calling thread.
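That mental model can be written down literally, as in the sketch below: a table indexed first by key and then by a slot belonging to the calling thread. This is only a model for reasoning about the API, not how any particular pthreads library works. All names (map_set, map_get, map_release, map_demo) and the fixed table sizes are assumptions.

```c
#include <pthread.h>
#include <stddef.h>

#define MAP_KEYS    4
#define MAP_THREADS 8

static pthread_mutex_t map_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_t map_owner[MAP_THREADS];        /* which thread owns a slot */
static int map_used[MAP_THREADS];
static void *map_value[MAP_KEYS][MAP_THREADS];  /* level 1: key, level 2: thread */

/* Finds (or optionally claims) the calling thread's slot.
   The caller must hold map_lock. */
static int map_slot(int create)
{
    pthread_t self = pthread_self();
    int i;

    for (i = 0; i < MAP_THREADS; i++)
        if (map_used[i] && pthread_equal(map_owner[i], self))
            return i;
    if (!create)
        return -1;
    for (i = 0; i < MAP_THREADS; i++)
        if (!map_used[i]) {
            map_used[i] = 1;
            map_owner[i] = self;
            return i;
        }
    return -1;
}

int map_set(int key, void *value)
{
    int slot, rc = -1;

    if (key < 0 || key >= MAP_KEYS)
        return -1;
    pthread_mutex_lock(&map_lock);
    slot = map_slot(1);
    if (slot >= 0) {
        map_value[key][slot] = value;
        rc = 0;
    }
    pthread_mutex_unlock(&map_lock);
    return rc;
}

void *map_get(int key)
{
    void *value = NULL;
    int slot;

    if (key < 0 || key >= MAP_KEYS)
        return NULL;
    pthread_mutex_lock(&map_lock);
    slot = map_slot(0);
    if (slot >= 0)
        value = map_value[key][slot];
    pthread_mutex_unlock(&map_lock);
    return value;
}

/* Gives up the calling thread's slot, so a later thread that happens
   to reuse the same thread ID starts with NULL values. */
void map_release(void)
{
    int slot, k;

    pthread_mutex_lock(&map_lock);
    slot = map_slot(0);
    if (slot >= 0) {
        for (k = 0; k < MAP_KEYS; k++)
            map_value[k][slot] = NULL;
        map_used[slot] = 0;
    }
    pthread_mutex_unlock(&map_lock);
}

static int main_val = 1, child_val = 2;

struct result { void *before; void *after; };

static void *child(void *arg)
{
    struct result *r = arg;

    r->before = map_get(0);          /* nothing stored by this thread yet */
    (void) map_set(0, &child_val);
    r->after = map_get(0);
    map_release();
    return NULL;
}

/* Returns 1 if each thread sees only its own value under key 0. */
int map_demo(void)
{
    struct result r = { NULL, NULL };
    pthread_t t;
    int ok;

    if (map_set(0, &main_val) != 0)
        return 0;
    if (pthread_create(&t, NULL, child, &r) != 0)
        return 0;
    (void) pthread_join(t, NULL);
    ok = r.before == NULL && r.after == &child_val
      && map_get(0) == &main_val;
    map_release();
    return ok;
}
```

A global table like this needs a lock and a search on every access. That is one reason a real implementation is more likely to hang the values off the thread's own structure, but the observable behaviour is the same.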
