Why does a library loaded through LD_PRELOAD work before initialization?

In the following minimal example, a library loaded through LD_PRELOAD that intercepts fopen and openat appears to be called before it has been initialized (Linux, CentOS 7.3). Why?

The library source, comm.c:

    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <stdarg.h>
    #include <stdio.h>
    #include <fcntl.h>

    typedef FILE *(*fopen_type)(const char *, const char *);
    // initialize to invalid value (non-NULL)
    // init() should initialize this correctly
    fopen_type g_orig_fopen = (fopen_type) 1;

    typedef int (*openat_type)(int, const char *, int, ...);
    openat_type g_orig_openat;

    void init() {
        g_orig_fopen = (fopen_type)dlsym(RTLD_NEXT,"fopen");
        g_orig_openat = (openat_type)dlsym(RTLD_NEXT,"openat");
    }

    FILE *fopen(const char *filename, const char *mode) {
        // have to do this here because init is not called yet???
        FILE * const ret = ((fopen_type)dlsym(RTLD_NEXT,"fopen"))(filename, mode);
        printf("g_orig_fopen %p fopen file %s\n", g_orig_fopen, filename);
        return ret;
    }

    int openat(int dirfd, const char* pathname, int flags, ...) {
        int fd;
        va_list ap;
        printf("g_orig_fopen %p openat file %s\n", g_orig_fopen, pathname);
        if (flags & (O_CREAT)) {
            va_start(ap, flags);
            fd = g_orig_openat(dirfd, pathname, flags, va_arg(ap, mode_t));
        }
        else
            fd = g_orig_openat(dirfd, pathname, flags);
        return fd;
    }

compiled with:

 gcc -shared -fPIC -Wl,-init,init -ldl comm.c -o comm.so 

I have an empty subdirectory named subdir. When I run find on it, the fopen wrapper is called before init has run:

    # LD_PRELOAD=./comm.so find subdir
    g_orig_fopen 0x1 fopen file /proc/filesystems
    g_orig_fopen 0x1 fopen file /proc/mounts
    subdir
    g_orig_fopen 0x7f7b2e574620 openat file subdir
1 answer

fopen is indeed called before comm.so is initialized. Placing a breakpoint in fopen() shows why (this requires the debugging symbols of the relevant packages to be installed). I get this backtrace:

    (gdb) bt
    #0  fopen (filename=0x7ffff79cd2e7 "/proc/filesystems", mode=0x7ffff79cd159 "r") at comm.c:28
    #1  0x00007ffff79bdb0e in selinuxfs_exists_internal () at init.c:64
    #2  0x00007ffff79b5d98 in init_selinuxmnt () at init.c:99
    #3  init_lib () at init.c:154
    #4  0x00007ffff7de88aa in call_init (l=<optimized out>, argc=argc@entry=1, argv=argv@entry=0x7fffffffdf58, env=env@entry=0x7fffffffdf68) at dl-init.c:72
    #5  0x00007ffff7de89bb in call_init (env=0x7fffffffdf68, argv=0x7fffffffdf58, argc=1, l=<optimized out>) at dl-init.c:30
    #6  _dl_init (main_map=0x7ffff7ffe170, argc=1, argv=0x7fffffffdf58, env=0x7fffffffdf68) at dl-init.c:120
    #7  0x00007ffff7dd9c5a in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
    #8  0x0000000000000001 in ?? ()
    #9  0x00007fffffffe337 in ?? ()
    #10 0x0000000000000000 in ?? ()

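comm.so is not the only library in the process that declares an init function. libselinux.so declares one as well (init_lib in the backtrace), and it is that function which ends up calling fopen() on /proc/filesystems.

comm.so comes first in the symbol lookup order because it is named in LD_PRELOAD, so its fopen is the definition that gets bound. Initialization order is a different matter: the dynamic loader runs the init functions of a library's dependencies before that library's own, and a library that nothing else depends on, such as a preloaded one, is typically initialized late. comm.so depends on libdl.so (because of -ldl at link time), and libselinux.so is loaded into the same process. In this run the init function of libselinux.so is called before the one in comm.so, and it calls fopen(), which resolves to the wrapper in comm.so.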

Personally, I usually resolve the real symbol lazily, the first time the wrapper is called. Like this:

    FILE *fopen(const char *filename, const char *mode)
    {
        static FILE *(*real_fopen)(const char *filename, const char *mode) = NULL;

        if (!real_fopen)
            real_fopen = dlsym(RTLD_NEXT, "fopen");

        return real_fopen(filename, mode);
    }
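The openat wrapper in the question has the same exposure: g_orig_openat stays NULL until init has run, so an early call would crash. Below is a minimal sketch of the same lazy lookup applied to openat. The helper resolve_next is a hypothetical name introduced here, not part of the original code, and the sketch assumes a Linux/glibc build without fortified headers. The lookup is not synchronized, so it relies on racing threads storing the same pointer.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stdlib.h>
#include <sys/types.h>

typedef int (*openat_type)(int, const char *, int, ...);

/* Hypothetical helper (not in the original code): find the next
 * definition of `name` after this object, aborting if there is none. */
static void *resolve_next(const char *name)
{
    void *sym = dlsym(RTLD_NEXT, name);
    if (sym == NULL) {
        /* No stdio here: the wrapped functions may be stdio's own. */
        abort();
    }
    return sym;
}

int openat(int dirfd, const char *pathname, int flags, ...)
{
    /* Resolved on first use, so it does not matter whether any
     * init function has run yet. */
    static openat_type real_openat = NULL;
    mode_t mode = 0;

    if (real_openat == NULL)
        real_openat = (openat_type)resolve_next("openat");

    /* The mode argument is only present when a file may be created. */
    if ((flags & O_CREAT) || (flags & O_TMPFILE) == O_TMPFILE) {
        va_list ap;
        va_start(ap, flags);
        mode = va_arg(ap, mode_t);
        va_end(ap);
    }

    /* Passing a mode that the callee ignores is harmless. */
    return real_openat(dirfd, pathname, flags, mode);
}
```

With both wrappers resolving their targets on demand, the -Wl,-init,init option is no longer needed for correctness. Older glibc versions still need -ldl for dlsym.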