Why are malloc() and printf() called non-reentrant?

On UNIX systems, we know that malloc() is a non-reentrant function (system call). Why is this?

Similarly, printf() is also said to be non-reentrant; why?

I know the definition of re-entrancy, but I wanted to know why it applies to these functions. What prevents them from being guaranteed reentrant?

+37
c unix operating-system reentrancy
Oct. 15 '10 at 10:12
6 answers

malloc and printf usually use global structures and employ lock-based synchronization internally. That is why they are not reentrant.

The malloc function may be either thread-safe or thread-unsafe. Neither kind is reentrant:

  • malloc operates on a global heap, and it is possible for two different malloc calls that happen at the same time to return the same memory block. (The second call would have to occur after the first has fetched the address of a chunk but before that chunk has been marked as unavailable.) This violates the postcondition of malloc, so such an implementation is not reentrant.

  • To prevent this effect, a thread-safe malloc implementation uses lock-based synchronization. However, if malloc is called from a signal handler, the following situation may occur:

     malloc();               // initial call
       lock(memory_lock);    // acquire lock inside malloc implementation
     signal_handler();       // interrupt and process signal
     malloc();               // call malloc() inside signal handler
       lock(memory_lock);    // try to acquire lock in malloc implementation
     // DEADLOCK! We wait for release of memory_lock, but
     // it won't be released because the original malloc call is interrupted

    This situation does not arise when malloc is simply called from different threads. Indeed, the concept of reentrancy goes beyond thread safety: it also requires that a function work properly even if one of its invocations never terminates. That is basically why any function that takes locks is not reentrant.
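The deadlock described above can be reproduced safely with a toy model. The following is only a sketch, not how any real allocator is written: `toy_lock`, `toy_trylock`, `on_signal`, and `reentry_would_deadlock` are hypothetical names, and a plain flag with a try-lock stands in for the allocator's mutex so that the program can report the problem instead of hanging.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>

/* Toy stand-in for the allocator's internal lock (hypothetical). */
static volatile sig_atomic_t toy_lock = 0;
static volatile sig_atomic_t handler_blocked = 0;

/* Non-blocking acquire: returns 1 on success, 0 if the lock is already held. */
static int toy_trylock(void) {
    if (toy_lock)
        return 0;
    toy_lock = 1;
    return 1;
}

/* Signal handler that "calls the allocator" again. */
static void on_signal(int signo) {
    (void)signo;
    if (!toy_trylock())
        handler_blocked = 1;   /* a blocking lock() would wait forever here */
    else
        toy_lock = 0;          /* lock was free, so release it again */
}

/* Returns 1 if a re-entrant call from the handler found the lock held. */
int reentry_would_deadlock(void) {
    handler_blocked = 0;
    signal(SIGUSR1, on_signal);
    toy_lock = 1;              /* the outer call acquires the lock */
    raise(SIGUSR1);            /* the signal arrives in the middle of that call */
    toy_lock = 0;              /* the outer call releases the lock */
    signal(SIGUSR1, SIG_DFL);
    return handler_blocked;
}
```

With a real blocking lock, the handler would never return, and so the interrupted outer call would never reach the line that releases the lock.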

The printf function also works with global data. An output stream usually uses a global buffer attached to the resource the data is sent to (a buffer for the terminal, or for a file). Printing usually consists of copying data into the buffer and then flushing the buffer. This buffer has to be protected by locks in the same way as malloc's data structures. Consequently, printf is not reentrant either.

+48
Oct. 15 '10 at

Let's understand what we mean by re-entrant. A re-entrant function can be invoked before a previous invocation has finished. This can happen if

  • the function is called in a signal handler (or, more generally than Unix, in some interrupt handler) for a signal that was raised during the execution of the function
  • the function is called recursively

malloc is not re-entrant because it manages several global data structures that track free blocks of memory.

printf is not re-entrant because it modifies a global variable, i.e. the contents of the FILE * stream (stdout).
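The effect of hidden global state can be seen on a much smaller scale than malloc or printf. The sketch below is an illustration only; `label_static` and `label_reentrant` are hypothetical names invented for this example. The first function keeps its result in a static buffer, so a second invocation overwrites whatever the first one produced; the second function keeps all of its state in caller-supplied storage.

```c
#include <stdio.h>

/* Not re-entrant: every call shares (and overwrites) one static buffer. */
const char *label_static(int id) {
    static char buf[32];
    snprintf(buf, sizeof buf, "item-%d", id);
    return buf;
}

/* Re-entrant: all state lives in storage owned by the caller. */
void label_reentrant(int id, char *out, size_t out_size) {
    snprintf(out, out_size, "item-%d", id);
}
```

If a second call to `label_static` happens before the first caller has finished using the result (from a signal handler, say), the first caller sees the second call's text.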

+10
Oct. 15 2018-10-10

There are at least three concepts here, all of which get conflated in colloquial language, which may be why you were confused.

  • thread-safe
  • critical section
  • reentrant

First, take the simplest one: both malloc and printf are thread-safe. They have been guaranteed to be thread-safe in standard C since 2011, in POSIX since 2001, and in practice since long before that. This means that the following program is guaranteed not to crash or misbehave:

    #include <pthread.h>
    #include <stdio.h>

    void *printme(void *msg) {
        while (1)
            printf("%s\r", (char *)msg);
    }

    int main() {
        pthread_t thr;
        pthread_create(&thr, NULL, printme, "hello");
        pthread_create(&thr, NULL, printme, "goodbye");
        pthread_join(thr, NULL);
    }

An example of a thread-unsafe function is strtok. If you call strtok from two different threads at the same time, the result is undefined behavior, because strtok internally uses a static buffer to keep track of its state. glibc adds strtok_r to fix this problem, and C11 added the same thing (but optionally and under a different name, because Not Invented Here) as strtok_s.
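As a small sketch of what strtok_r buys you: because the tokenizer state lives in a caller-supplied pointer, two tokenizations can be in progress at once. The helper name `count_fields` and the `key=value;key=value` input format are assumptions made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>

/* Counts the fields in text such as "a=1;b=2". The string is modified in place. */
int count_fields(char *text) {
    int fields = 0;
    char *outer_state = NULL;   /* state of the ';' tokenization */

    for (char *pair = strtok_r(text, ";", &outer_state);
         pair != NULL;
         pair = strtok_r(NULL, ";", &outer_state)) {
        char *inner_state = NULL;   /* independent state of the '=' tokenization */

        for (char *field = strtok_r(pair, "=", &inner_state);
             field != NULL;
             field = strtok_r(NULL, "=", &inner_state)) {
            fields++;
        }
    }
    return fields;
}
```

The same nesting written with plain strtok would break, since the inner loop would overwrite the single hidden state that the outer loop depends on.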

Okay, but doesn't printf use global resources to produce its output, too? In fact, what would it even mean to print to stdout from two threads at the same time? This brings us to the next topic. Obviously, printf is going to be a critical section in any program that uses it. Only one thread of execution is allowed to be inside the critical section at once.

At least on POSIX-compatible systems, this is achieved by having printf begin with a call to flockfile(stdout) and end with a call to funlockfile(stdout), which is basically like taking a global mutex associated with stdout.

However, each individual FILE in the program is allowed to have its own mutex. This means that one thread can call fprintf(f1,...) at the same time that a second thread is in the middle of a call to fprintf(f2,...). There is no race. (Whether your libc actually runs those two calls in parallel is a quality-of-implementation (QoI) issue. I really don't know what glibc does.)
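The same POSIX functions are available to application code, for instance to keep several calls on one stream from being interleaved with output from other threads. This is a minimal sketch; the helper name `print_record` and its `key=value` output format are made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>

/* Writes "key=value\n" to fp as one unit; returns characters written, or -1. */
int print_record(FILE *fp, const char *key, int value) {
    flockfile(fp);                         /* take the lock associated with fp */
    int a = fprintf(fp, "%s=", key);       /* fprintf takes the same lock again, */
    int b = fprintf(fp, "%d\n", value);    /* which is fine: the lock is recursive */
    funlockfile(fp);                       /* release it */

    if (a < 0 || b < 0)
        return -1;
    return a + b;
}
```

Note that this only serializes threads. It does nothing for a signal handler that interrupts the thread while it holds the lock.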

Similarly, malloc is unlikely to be a critical section in any modern system, since modern systems are smart enough to keep one memory pool for each thread in the system, instead of having all N threads fight over one pool. (The sbrk system call will probably still be a critical section, but malloc spends very little of its time in sbrk. Or mmap, or whatever the cool kids are using these days.)

Okay, so what does re-entrancy actually mean? Basically, it means that the function can safely be called recursively: the current invocation is "put on hold" while a second invocation runs, and then the first invocation is still able to "pick up where it left off". (Technically, this need not be due to a recursive call: the first invocation might be in Thread A, which gets interrupted in the middle by Thread B, which makes the second invocation. But that scenario is just a special case of thread safety, so we can forget about it in this paragraph.)

Neither printf nor malloc can be called recursively by a single thread, because they are leaf functions (they do not call themselves, nor do they call out to any user-controlled code that could make a recursive call). And, as we saw above, they have been thread-safe against multi-threaded re-entrant calls since 2001 (by using locks).

So, whoever told you that printf and malloc are non-reentrant was wrong; what they meant to say was probably that both of them can be critical sections in your program, that is, bottlenecks where only one thread can get through at a time.




A pedantic note: glibc provides an extension by which printf can be made to call arbitrary user code, including re-calling itself. This is perfectly safe in all its permutations, at least as far as thread safety is concerned. (Obviously, it opens the door to absolutely insane format-string vulnerabilities.) There are two variants: register_printf_function (which is documented and reasonably sane, but officially "deprecated") and register_printf_specifier (which is almost identical, except for one extra undocumented parameter and a complete lack of user-facing documentation). I would not recommend either of them, and I mention them here merely as an interesting aside.

    #include <stdio.h>
    #include <printf.h>  // glibc extension

    int widget(FILE *fp, const struct printf_info *info, const void *const *args) {
        static int count = 5;
        int w = *((const int *) args[0]);
        printf("boo!");  // direct recursive call
        return fprintf(fp, --count ? "<%W>" : "<%d>", w);  // indirect recursive call
    }

    int widget_arginfo(const struct printf_info *info, size_t n, int *argtypes) {
        argtypes[0] = PA_INT;
        return 1;
    }

    int main() {
        register_printf_function('W', widget, widget_arginfo);
        printf("|%W|\n", 42);
    }

+3
Nov 11 '14 at 20:10

Most likely because you cannot start writing output while another call to printf is still printing its own. The same goes for allocating and freeing memory.

+1
Oct 15 '10 at 10:16

This is because both work with global resources: heap memory structures and the console.

EDIT: The heap is nothing more than a kind of linked-list structure. Each malloc or free modifies it, so several threads writing to it at the same time can damage its consistency.

EDIT2: another detail: they could be made reentrant by default by using mutexes. But this approach is expensive, and there is no guarantee that they will always be used in a multithreaded environment.

So, there are two solutions: provide two library functions, one reentrant and one not, or leave the mutex part to the user. They chose the second.

In addition, it may be because the original versions of these functions were non-reentrant, so they were declared as such for compatibility.

-2
Oct 15 '10 at 10:17

If you try to call malloc from two separate threads (unless you have a thread-safe version, which is not guaranteed by the C standard), bad things happen, because there is only one heap for the two threads. The same goes for printf: the behavior is undefined. This is what actually makes them non-reentrant.

-4
Oct. 15 '10 at 10:20