Why can this flow control scheme cause deadlock?

Question

Why can this flow control scheme cause deadlock?

I use the common has_threads base class to control any type that should be allowed to instantiate boost::thread .

The has_threads instances have their own set of thread (to support the waitAll and interruptAll functions, which I do not include below) and should automatically call removeThread when the thread completes, maintain this set integrity.

In my program, I have only one of them. Themes are created every 10 seconds, and each one searches the database. When the search is complete, the thread ends and removeThread should be called; with a set of mutexes, the stream object is removed from internal tracking. I see that this works fine with ABC output.

From time to time, however, mechanisms collide. removeThread is executed, possibly twice at the same time. I cannot understand why this leads to a deadlock. All calls to the thread from this point never output anything except A ^{[It is worth noting that I use threaddable stdlib and that the problem remains when IOStreams are not used.]} Stack ^traces indicate that the mutex blocks these threads, but why the lock will not eventually be released by the first thread for the second, then the second for the third and etc.?

scoped_lock I missing something fundamental in how scoped_lock works? Is there something obvious here that I skipped this could lead to a deadlock despite (or even because of?) Using a mutex lock?

Sorry for the bad question, but I am sure you know that it is almost impossible to provide real test indices for such errors.

 class has_threads { protected: template <typename Callable> void createThread(Callable f, bool allowSignals) { boost::mutex::scoped_lock l(threads_lock); // Create and run thread boost::shared_ptr<boost::thread> t(new boost::thread()); // Track thread threads.insert(t); // Run thread (do this after inserting the thread for tracking so that we're ready for the on-exit handler) *t = boost::thread(&has_threads::runThread<Callable>, this, f, allowSignals); } private: /** * Entrypoint function for a thread. * Sets up the on-end handler then invokes the user-provided worker function. */ template <typename Callable> void runThread(Callable f, bool allowSignals) { boost::this_thread::at_thread_exit( boost::bind( &has_threads::releaseThread, this, boost::this_thread::get_id() ) ); if (!allowSignals) blockSignalsInThisThread(); try { f(); } catch (boost::thread_interrupted& e) { // Yes, we should catch this exception! // Letting it bubble over is _potentially_ dangerous: // http://stackoverflow.com/questions/6375121 std::cout << "Thread " << boost::this_thread::get_id() << " interrupted (and ended)." << std::endl; } catch (std::exception& e) { std::cout << "Exception caught from thread " << boost::this_thread::get_id() << ": " << e.what() << std::endl; } catch (...) { std::cout << "Unknown exception caught from thread " << boost::this_thread::get_id() << std::endl; } } void has_threads::releaseThread(boost::thread::id thread_id) { std::cout << "A"; boost::mutex::scoped_lock l(threads_lock); std::cout << "B"; for (threads_t::iterator it = threads.begin(), end = threads.end(); it != end; ++it) { if ((*it)->get_id() != thread_id) continue; threads.erase(it); break; } std::cout << "C"; } void blockSignalsInThisThread() { sigset_t signal_set; sigemptyset(&signal_set); sigaddset(&signal_set, SIGINT); sigaddset(&signal_set, SIGTERM); sigaddset(&signal_set, SIGHUP); sigaddset(&signal_set, SIGPIPE); // http://www.unixguide.net/network/socketfaq/2.19.shtml pthread_sigmask(SIG_BLOCK, &signal_set, NULL); } typedef std::set<boost::shared_ptr<boost::thread> > threads_t; threads_t threads; boost::mutex threads_lock; }; struct some_component : has_threads { some_component() { // set a scheduler to invoke createThread(bind(&some_work, this)) every 10s } void some_work() { // usually pretty quick, but I guess sometimes it could take >= 10s } };

+4

c ++ deadlock boost-thread

Lightness races in orbit Dec 08 '11 at 17:05

source share

3 answers

neuro · Answer 1 · 2011-12-08T17:24:44+0000

Well, a deadlock may occur if the same thread blocks an existing mutex (unless you use a recursive mutex).

If a part of the release is called a second time by the same thread as with your code, you have a dead end.

I have not studied your code in detail, but you will probably have to redesign your code (simplify?) To make sure that the lock cannot be obtained twice by the same thread. You can probably use a security check to own a lock ...

EDIT: As said in my comment and in IronMensan's answer, one of the possible cases is that the thread stops during creation, with at_exit being called until the mutex is blocked, which is blocked in the part of creating your code.

EDIT2:

Well, with a mutex and a fixed lock, I can only imagine a recursive lock or lock that will not be released. This can happen if the loop goes to infinite due to memory corruption, for example.

I suggest adding more logs with a thread id to check if there is a recursive lock or something strange. Then I will check that my cycle is correct. I will also check that at_exit is called only once per thread ...

One more thing, check the effect of erasing (thereby calling the destructor) the stream, while in the at_exit function ...

my 2 cents

IronMensan · Answer 2 · 2011-12-08T17:28:25+0000

You may need to do something like this:

  void createThread(Callable f, bool allowSignals) { // Create and run thread boost::shared_ptr<boost::thread> t(new boost::thread()); { boost::mutex::scoped_lock l(threads_lock); // Track thread threads.insert(t); } //Do not hold threads_lock while starting the new thread in case //it completes immediately // Run thread (do this after inserting the thread for tracking so that we're ready for the on-exit handler) *t = boost::thread(&has_threads::runThread<Callable>, this, f, allowSignals); }

In other words, use thread_lock exclusively to protect threads .

Update:

To expand something in the comments with assumptions about how boost :: thread works, lock patterns might look something like this:

createThread :

( createThread ) get threads_lock
( boost::thread::opeator = ) get boost::thread internal lock
( boost::thread::opeator = ) release boost::thread internal lock
( createThread ) release threads_lock

stream end handler:

( at_thread_exit ) get internal lock boost::thread
( releaseThread ) get threads_lock
( releaseThread ) release threads_lock
( at_thread_exit ) release boost:thread internal lock

If these two boost::thread locks are the same lock, the probability of a deadlock is obvious. But this is an assumption, because most of the formatting code scares me, and I try not to look at it.

createThread can / should be redesigned to move step 4 up between steps one and two and eliminate a potential deadlock.

IronMensan · Answer 3 · 2011-12-08T21:10:35+0000

It is possible that the created thread ends before or during the assignment operation in createThread . You may need an event queue or some other structure. Although a simpler, albeit hacky, solution may work. Do not change createThread , since you must use threads_lock to protect the threads themselves and the thread objects that it points to. Instead, change runThread to this:

  template <typename Callable> void runThread(Callable f, bool allowSignals) { //SNIP setup try { f(); } //SNIP catch blocks //ensure that createThread is complete before this thread terminates boost::mutex::scoped_lock l(threads_lock); }

Why can this flow control scheme cause deadlock?

More articles: