Opening /proc/net/tcp in C++ from a POSIX thread fails most of the time

When I try to open /proc/net/tcp from a child POSIX thread in C++, it fails with a "No such file or directory" error. If I open it from the parent thread, it succeeds every time, and opening and closing it in the parent thread first makes the open in the child thread succeed about a third of the time. I can open /proc/uptime in a child thread 100% of the time without a problem. Here is example code that can be compiled with "g++ -Wall test.cc -o test -pthread":

    #include <iostream>
    #include <fstream>
    #include <cstring>
    #include <cerrno>
    #include <pthread.h>

    using namespace std;

    void * open_test (void *)
    {
        ifstream in;
        in.open("/proc/net/tcp");
        if (in.fail())
            cout << "Failed - " << strerror(errno) << endl;
        else
            cout << "Succeeded" << endl;
        in.close();
        return 0;
    }

    int main (int argc, char * argv[])
    {
        open_test(NULL);
        pthread_t thread;
        pthread_create(&thread, NULL, open_test, NULL);
        pthread_exit(0);
    }

I am running this on an Ubuntu 12.04 box with an Intel i5-2520M (2 cores * 2 virtual cores) and a Linux 3.2.0 kernel. Here is the output from running the above code 6 times in a row:

    mike@ung:/tmp$ ./test
    Succeeded
    Failed - No such file or directory
    mike@ung:/tmp$ ./test
    Succeeded
    Succeeded
    mike@ung:/tmp$ ./test
    Succeeded
    Failed - No such file or directory
    mike@ung:/tmp$ ./test
    Succeeded
    Failed - No such file or directory
    mike@ung:/tmp$ ./test
    Succeeded
    Succeeded
    mike@ung:/tmp$ ./test
    Succeeded
    Failed - No such file or directory
    mike@ung:/tmp$

It may be worth noting that I do not have this problem if I use fork instead of POSIX threads. If I use fork, the child process has no problem reading /proc/net/tcp.
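For comparison, the fork-based variant can be sketched roughly as follows. This is only an illustration of the idea, not the code I actually ran; the helper names can_open_proc_net_tcp and open_in_forked_child are made up for this example.

```cpp
#include <fstream>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: true if /proc/net/tcp can be opened by the caller.
static bool can_open_proc_net_tcp() {
    std::ifstream in("/proc/net/tcp");
    return in.is_open();
}

// Hypothetical helper: forks a child process that attempts the open.
// Returns 1 if the child opened the file, 0 if it could not,
// and -1 if fork() or waitpid() failed or the child died abnormally.
int open_in_forked_child() {
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        // Child process: report the result through the exit status.
        _exit(can_open_proc_net_tcp() ? 0 : 1);
    }
    int status = 0;
    if (waitpid(child, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0 ? 1 : 0;
}
```

The forked child is a full process with its own /proc/[pid] entry, so its /proc/self does not depend on any other thread staying alive.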

Just a couple of data points to throw in: this looks like a regression in Linux, since 2.6.35 seems to work 100% of the time. 3.2.0 fails most of the time, even on my slow old Pentium M based laptop.

3 answers

This behavior seems to be a kind of bug in the /proc virtual file system. If you add this code immediately before opening the file:

  system("ls -l /proc/net /proc/self/net/tcp"); 

You will see that /proc/net is a symbolic link to /proc/self/net, and that /proc/self/net/tcp is listed correctly for both open_test calls, even when the call from the spawned thread fails.

Edit: I just realized that the above test is bogus, since /proc/self will refer to the shell process spawned by system(), not to this process. Using the following function instead also reveals the error:

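    // Needs these headers in addition to the ones in the question:
    #include <sstream>
    #include <cstdlib>
    #include <unistd.h>
    #include <sys/syscall.h>

    void ls_command ()
    {
        // List /proc/net plus the net/tcp entries for the process ID and the thread ID.
        ostringstream cmd;
        cmd << "ls -l /proc/net "
            << "/proc/" << getpid() << "/net/tcp "
            << "/proc/" << syscall(SYS_gettid) << "/net/tcp";
        system(cmd.str().c_str());
    }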

You will see that the spawned thread sometimes cannot see the parent's /net/tcp file. In fact, it has disappeared, since it is the spawned shell process that runs the ls command.

The workaround below allows the child thread to reliably access what would be its /proc/net/tcp .

My theory is that this is some kind of race condition bug in setting up the /proc/self entry for the thread as the proper combination of parent state and thread-specific state. As a test and workaround, I modified the open_test code to use the "process ID" associated with the thread, instead of trying to access the parent process's entry (since /proc/self refers to the parent's process ID, not the thread's).

Edit: As the evidence shows, the error occurs because the parent process has cleaned up the /proc/self/... state before the child thread got the opportunity to read it. I still maintain that this is a bug, since the child thread is still technically part of the process. Its getpid() remains the same before and after the main thread calls pthread_exit(). The /proc entry for the parent process should remain valid until all child threads have completed.

Edit2: Jonas argues that this may not be a bug. As evidence of that, there is this from man proc:

  /proc/[pid]/fd
               ...
               In a multithreaded process, the contents of this directory are
               not available if the main thread has already terminated
               (typically by calling pthread_exit(3)).

But then consider this entry for /proc/self in the same man page:

  /proc/self
               This directory refers to the process accessing the /proc file
               system, and is identical to the /proc directory named by the
               process ID of the same process.

If you believe that this is not a bug because threads and processes are treated the same way on Linux, then threads should be able to expect /proc/self to work. The bug could be fixed by making /proc/self resolve to /proc/[gettid] when /proc/[getpid] is no longer available, just as the workaround below does.

    // Also needs <sstream>, <string>, <unistd.h> and <sys/syscall.h>.
    void * open_test (void *)
    {
        ifstream in;
        string file = "/proc/net/tcp";
        in.open(file.c_str());
        if (in.fail()) {
            // Fall back to the /proc entry named by this thread's own ID.
            ostringstream ss;
            ss << "/proc/" << syscall(SYS_gettid) << "/net/tcp";
            cout << "Can't access " << file << ", using " << ss.str() << " instead" << endl;
            file = ss.str();
            in.clear();  // reset the failbit before retrying
            in.open(file.c_str());
        }
        if (in.fail())
            cout << "Failed - " << strerror(errno) << endl;
        else
            cout << "Succeeded" << endl;
        in.close();
        return 0;
    }

As Scott points out in his answer, adding pthread_join(thread, NULL) eliminates the symptoms. But why?

Run the program under gdb and set a breakpoint at the line where the failure is reported:

    (gdb) break test.cc:14
    Breakpoint 1 at 0x400c98: file test.cc, line 14.

Then we can observe two different types of behavior:

  • (gdb) run
    […]
    Succeeded
    [New Thread 0x7ffff7fd1700 (LWP 18937)]      // <- child thread
    [Thread 0x7ffff7fd3740 (LWP 18934) exited]   // <- parent thread
    [Switching to Thread 0x7ffff7fd1700 (LWP 18937)]
    Breakpoint 1, open_test () at test.cc:14

  • (gdb) run
    Succeeded
    [New Thread 0x7ffff7fd1700 (LWP 19427)]      // <- child thread
    Succeeded
    [Thread 0x7ffff7fd1700 (LWP 19427) exited]
    [Inferior 1 (process 19424) exited normally]

The first shows that the parent thread terminates before the child. Since processes and threads are almost the same thing on Linux, this means that the PID associated with the main thread is cleaned up. However, nothing stops the child thread from continuing to run. It and its PID are still perfectly valid. It is just that /proc/self points to the PID of the main thread, which has already been removed at this point.
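To make the difference between the process ID and the thread ID concrete, here is a minimal Linux-only sketch. The Ids struct and the helper names are made up for illustration; the raw syscall(SYS_gettid) is the same call used elsewhere on this page.

```cpp
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical holder for the two IDs of the calling thread.
struct Ids {
    pid_t pid;  // process ID, shared by every thread in the process
    pid_t tid;  // kernel thread ID (the LWP number that gdb prints)
};

// Reads both IDs for whichever thread calls this function.
static Ids current_ids() {
    Ids ids;
    ids.pid = getpid();
    ids.tid = static_cast<pid_t>(syscall(SYS_gettid));
    return ids;
}

// Thread entry point: stores the thread's IDs in the Ids object passed in.
static void* record_ids(void* out) {
    *static_cast<Ids*>(out) = current_ids();
    return nullptr;
}

// Starts a child thread, waits for it, and returns the IDs it observed.
// Returns {0, 0} if the thread could not be created.
Ids ids_in_child_thread() {
    Ids ids = {0, 0};
    pthread_t thread;
    if (pthread_create(&thread, nullptr, record_ids, &ids) == 0)
        pthread_join(thread, nullptr);
    return ids;
}
```

In the child thread, pid equals the PID of the whole process while tid differs from it, which is why /proc/[tid] can still be valid after the entry named by the PID has been torn down.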


If you add a call to pthread_join(thread, NULL) before calling pthread_exit(), your program will work correctly.
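A minimal sketch of that change, restructured slightly so the result can be checked. The function name run_joined_test and the use of the thread's return value to signal failure are my own additions, not part of the question's code.

```cpp
#include <cerrno>
#include <cstring>
#include <fstream>
#include <iostream>
#include <pthread.h>

// Tries to open /proc/net/tcp and prints the outcome.
// Returns a null pointer on success and a non-null pointer on failure.
static void* open_test(void*) {
    std::ifstream in("/proc/net/tcp");
    if (!in) {
        std::cout << "Failed - " << std::strerror(errno) << std::endl;
        return reinterpret_cast<void*>(1);
    }
    std::cout << "Succeeded" << std::endl;
    return nullptr;
}

// Runs the open once in the calling thread and once in a child thread.
// The pthread_join() call keeps the calling thread alive until the child
// has finished, so the child never runs after the main thread has exited.
// Returns the number of failed opens, or -1 if the thread could not be started.
int run_joined_test() {
    int failures = 0;
    if (open_test(nullptr) != nullptr)
        ++failures;

    pthread_t thread;
    if (pthread_create(&thread, nullptr, open_test, nullptr) != 0)
        return -1;

    void* result = nullptr;
    pthread_join(thread, &result);
    if (result != nullptr)
        ++failures;
    return failures;
}
```

Calling run_joined_test() from main() and then returning normally (or calling pthread_exit() after the join) should print the same result twice, since both opens happen while the main thread still exists.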

