Our project uses several Boost 1.48 libraries on several platforms, including Windows, Mac, Android and iOS. We can reliably (though not trivially) get the iOS build of the project to crash, and from our investigation we see that ~thread_data_base is being called on a thread's thread_info while that thread is still running.
This seems to happen because the smart pointer's count reaches zero, even though the pointer is obviously still in scope in the thread_proxy function, which creates it and runs the requested function in the thread. It happens in various cases: the call stack is not identical between crashes, although a few variations are common.
Just to be clear: reproducing this often requires running code that creates hundreds of threads, although there are never more than about 30 running at a time. I have been "lucky" and had it happen very early in the run as well, but that is rare. I created a version of the destructor that actually catches the code in the act:
In libs/thread/src/pthread/thread.cpp:
    thread_data_base::~thread_data_base()
    {
        boost::detail::thread_data_base* const thread_info=detail::get_current_thread_data();
        void *void_thread_info = (void *) thread_info;
        void *void_this = (void *) this;
I should note that (as the commented-out code shows) I previously checked whether void_thread_info == void_this, because I was only looking for the case where a thread's own thread_info was being destroyed. I have also seen cases where the value returned by get_current_thread_data() is non-zero and differs from "this", which is really strange.
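For anyone without the Boost sources at hand, the idea behind the trap can be modelled in a few lines of standard C++. This is only a sketch under assumptions: `ThreadData`, `g_current` and `destroy_and_report` are hypothetical names standing in for thread_data_base and get_current_thread_data(), and the violation is recorded in a flag instead of trapping.

```cpp
// Hypothetical stand-in for boost::detail::thread_data_base.
struct ThreadData;

// Hypothetical stand-in for detail::get_current_thread_data():
// each thread records which ThreadData it is currently running.
thread_local ThreadData* g_current = nullptr;

// Set when a ThreadData is destroyed while the calling thread still
// has a non-null "current thread data" pointer.
bool g_destroyed_while_current = false;

struct ThreadData {
    ~ThreadData() {
        // Convert to void* on separate lines, as in the post, so that any
        // comparison is a plain address comparison.
        void* void_current = static_cast<void*>(g_current);
        void* void_this = static_cast<void*>(this);
        (void)void_this;  // kept only for inspection in a debugger
        if (void_current != nullptr) {
            g_destroyed_while_current = true;  // the real check would trap here
        }
    }
};

// Destroys a ThreadData either while it is still registered as current
// (the suspicious case) or after the registration was cleared (the normal
// case), and reports whether the destructor flagged it.
bool destroy_and_report(bool still_current) {
    g_destroyed_while_current = false;
    ThreadData* data = new ThreadData;
    g_current = still_current ? data : nullptr;
    delete data;
    g_current = nullptr;
    return g_destroyed_while_current;
}
```

In this model, `destroy_and_report(true)` flags the destruction and `destroy_and_report(false)` does not, which is the distinction the modified destructor is meant to expose.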
Also, when I first wrote this version of the code, I wrote:
if (((void*)thread_info) == ((void*)this))
and at runtime I got a very strange exception that said something about a virtual function table or similar; I don't remember exactly. I decided that it was trying to call "==" for this object type and was unhappy with that, so I rewrote it as above, with the conversions to void* done on separate lines of code. That in itself is rather suspicious to me. I'm not one to rush to blame the compiler, but ...
I should also note that when we hit this trap, we saw the destructor ~shared_count appear twice in a row on the stack in Xcode. Very strange. We tried to make sense of the disassembly, but could not make anything of it.
Again, it looks like this is always the result of the shared_count, which seems to belong to the shared_ptr that owns thread_info, reaching zero too soon.
Update: it seems that it is possible to hit the trap above without any harm being done. After fixing the problem (see Answer), I saw this happen, but always after thread_info->run() had completed. I don't understand how ... but it works.
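To check the use count in isolation from the rest of the project, a small stress test along the following lines might help. It is a sketch only: it uses std::shared_ptr and std::thread so that it is self-contained, `Payload` and `premature_destructions` are hypothetical names, and it would have to be rebuilt against boost::shared_ptr and boost::thread to exercise the Boost 1.48 code path.

```cpp
#include <atomic>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical payload that counts how often it is destroyed.
struct Payload {
    explicit Payload(std::atomic<int>& counter) : destroyed(counter) {}
    ~Payload() { destroyed.fetch_add(1); }
    std::atomic<int>& destroyed;
};

// Spawns `threads` workers that each copy and drop the shared pointer
// `iterations` times. Returns how many times the payload was destroyed
// while the owning pointer was still in scope (expected: 0).
int premature_destructions(int threads, int iterations) {
    std::atomic<int> destroyed(0);
    const std::shared_ptr<Payload> owner = std::make_shared<Payload>(destroyed);

    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&owner, iterations] {
            for (int i = 0; i < iterations; ++i) {
                std::shared_ptr<Payload> copy = owner;  // increments the use count
                (void)copy;                             // decremented at scope exit
            }
        });
    }
    for (std::thread& worker : workers) {
        worker.join();
    }
    // `owner` is still alive here, so a correct use count never reached zero.
    return destroyed.load();
}
```

With a thread-safe use count, `premature_destructions(30, 1000)` returns 0. A non-zero result or a crash would point at the reference counting itself and not at the thread library.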
Additional Information:
I should note that boost.sh from Pete Goodliffe (and modified by others), which is commonly used to compile boost for iOS, has the following note in the header:
    : ${EXTRA_CPPFLAGS:="-DBOOST_AC_USE_PTHREADS -DBOOST_SP_USE_PTHREADS"}

    # The EXTRA_CPPFLAGS definition works around a thread race issue in
    # shared_ptr. I encountered this historically and have not verified that
    # the fix is no longer required. Without using the posix thread primitives
    # an invalid compare-and-swap ARM instruction (non-thread-safe) was used for the
    # shared_ptr use count causing nasty and subtle bugs.
    #
    # Should perhaps also consider/use instead: -BOOST_SP_USE_PTHREADS
I use these flags, but to no avail.
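One thing that may be worth ruling out is whether the flags reach every translation unit. shared_ptr is implemented in headers, so the macros matter when compiling the application code as well as when building Boost itself. The following sketch (the function name `pthread_flag_report` is hypothetical) reports what a given translation unit sees when compiled with the project's own flags:

```cpp
#include <string>

// Reports which of the two macros from the boost.sh note are defined in
// this translation unit. Compile it with exactly the flags used for the app.
std::string pthread_flag_report() {
    std::string report;
#if defined(BOOST_AC_USE_PTHREADS)
    report += "BOOST_AC_USE_PTHREADS=on ";
#else
    report += "BOOST_AC_USE_PTHREADS=off ";
#endif
#if defined(BOOST_SP_USE_PTHREADS)
    report += "BOOST_SP_USE_PTHREADS=on";
#else
    report += "BOOST_SP_USE_PTHREADS=off";
#endif
    return report;
}
```

If the library build reports "on" and the application build reports "off" (or the other way round), the two sides would be using different use-count implementations for the same shared_ptr objects.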
I found the following, which looks very relevant: it seems they had the same problem in std::thread:
http://llvm.org/bugs/show_bug.cgi?format=multiple&id=12730
This suggests using an alternative implementation inside Boost for ARM processors, which also seems to address this problem directly: spinlock_gcc_arm.hpp
The version included in Boost 1.48 uses outdated ARM assembly. I took the updated version from Boost 1.52, but I am having trouble compiling it. I get the following error: predicated instructions must be in IT block
I found a link to what looks like a similar use of this instruction here: https://zeromq.jira.com/browse/LIBZMQ-414
I was able to use the same idea to get the 1.52 code to compile by changing it as follows (I inserted the appropriate IT instruction):
    __asm__ __volatile__(
        "ldrex %0, [%2]; \n"
        "cmp %0, %1; \n"
        "it ne; \n"
        "strexne %0, %1, [%2]; \n"
        BOOST_SP_ARM_BARRIER :
        "=&r"( r ): // outputs
        "r"( 1 ), "r"( &v_ ): // inputs
        "memory", "cc" );
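For reference, this is how I read the intended semantics of that sequence, written as portable C++11. It is a sketch and not the Boost implementation: `SpinlockModel` is a hypothetical name, and it assumes the flag only ever holds 0 (free) or 1 (held).

```cpp
#include <atomic>

// Hypothetical portable model of the try_lock that the ldrex/strexne
// sequence appears to implement: atomically set the flag to 1 if it is
// not already 1, and report whether this caller acquired the lock.
class SpinlockModel {
public:
    SpinlockModel() : v_(0) {}

    // Returns true if the lock was free and is now held by the caller.
    // Note: the exclusive store in the assembly version may also fail
    // spuriously; the strong compare-exchange used here does not.
    bool try_lock() {
        int expected = 0;
        // Acquire ordering plays the role of the trailing barrier.
        return v_.compare_exchange_strong(expected, 1, std::memory_order_acquire);
    }

    // Releases the lock so that another caller can acquire it.
    void unlock() {
        v_.store(0, std::memory_order_release);
    }

private:
    std::atomic<int> v_;
};
```

A first `try_lock()` succeeds, a second one fails until `unlock()` is called, and any replacement assembly should behave the same way.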
But in any case, there are ifdefs in this file that look for ARM architecture macros which are not defined that way in my environment. After I simply edited the file so that only the ARMv7 code is left, the compiler complains about the definition of BOOST_SP_ARM_BARRIER:
    In file included from ./boost/smart_ptr/detail/spinlock.hpp:35:
    ./boost/smart_ptr/detail/spinlock_gcc_arm.hpp:39:13: error: instruction requires a CPU feature not currently enabled
        BOOST_SP_ARM_BARRIER :
        ^
    ./boost/smart_ptr/detail/spinlock_gcc_arm.hpp:13:32: note: expanded from macro 'BOOST_SP_ARM_BARRIER'
    # define BOOST_SP_ARM_BARRIER "dmb"
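To see which architecture macros the compiler actually defines for each build slice, a small diagnostic like the one below could be compiled with the same flags as the project. It is a sketch: `arm_arch_report` is a hypothetical name and the list of macros is hand-picked and not exhaustive. The error would be consistent with at least one slice targeting an architecture older than ARMv7, where "dmb" is not available, but that is an assumption I have not verified.

```cpp
#include <string>

// Lists the ARM-related macros the compiler predefines for the current
// target. Only a hand-picked subset of macros is checked.
std::string arm_arch_report() {
    std::string report;
#if defined(__arm__)
    report += "__arm__ ";
#endif
#if defined(__thumb__)
    report += "__thumb__ ";
#endif
#if defined(__ARM_ARCH_6__)
    report += "__ARM_ARCH_6__ ";
#endif
#if defined(__ARM_ARCH_7__)
    report += "__ARM_ARCH_7__ ";
#endif
#if defined(__ARM_ARCH_7A__)
    report += "__ARM_ARCH_7A__ ";
#endif
#if defined(__ARM_ARCH_7S__)
    report += "__ARM_ARCH_7S__ ";
#endif
#if defined(__ARM_ARCH)
    report += "__ARM_ARCH ";
#endif
    if (report.empty()) {
        report = "no ARM macros defined";
    }
    return report;
}
```

On a non-ARM host this returns "no ARM macros defined". On a device build, the output shows which of the ifdef branches in spinlock_gcc_arm.hpp can match.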
Any ideas?