Multithreaded C Lua module leading to segfault in Lua script


I wrote a very simple C library for Lua. It consists of a single function that starts a thread, and that thread does nothing but loop:

    #include "lua.h"
    #include "lauxlib.h"
    #include <pthread.h>
    #include <stdio.h>

    pthread_t handle;

    void* mythread(void* args)
    {
        printf("In the thread !\n");
        while(1);
        pthread_exit(NULL);
    }

    int start_mythread()
    {
        return pthread_create(&handle, NULL, mythread, NULL);
    }

    int start_mythread_lua(lua_State* L)
    {
        lua_pushnumber(L, start_mythread());
        return 1;
    }

    static const luaL_Reg testlib[] = {
        {"start_mythread", start_mythread_lua},
        {NULL, NULL}
    };

    int luaopen_test(lua_State* L)
    {
        /*
        //for lua 5.2
        luaL_newlib(L, testlib);
        lua_setglobal(L, "test");
        */
        luaL_register(L, "test", testlib);
        return 1;
    }


Now, if I write a very simple Lua script that just does:

    require("test")
    test.start_mythread()

Running the script with lua myscript.lua sometimes causes a segfault. Here is what GDB has to say about the core dump:

    Program terminated with signal 11, Segmentation fault.
    #0  0xb778b75c in ?? ()
    (gdb) thread apply all bt

    Thread 2 (Thread 0xb751c940 (LWP 29078)):
    #0  0xb75b3715 in _int_free () at malloc.c:4087
    #1  0x08058ab9 in l_alloc ()
    #2  0x080513a2 in luaM_realloc_ ()
    #3  0x0805047b in sweeplist ()
    #4  0x080510ef in luaC_freeall ()
    #5  0x080545db in close_state ()
    #6  0x0804acba in main () at lua.c:389

    Thread 1 (Thread 0xb74efb40 (LWP 29080)):
    #0  0xb778b75c in ?? ()
    #1  0xb74f6efb in start_thread () from /lib/i386-linux-gnu/i686/cmov/libpthread.so.0
    #2  0xb7629dfe in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:129

The main thread's stack varies slightly from one run to the next.
It seems that start_thread tries to jump to the given address (in this case, b778b75c), which sometimes turns out to be in unmapped memory.
Edit: I also have the Valgrind output:

    ==642== Memcheck, a memory error detector
    ==642== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
    ==642== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info
    ==642== Command: lua5.1 go.lua
    ==642==
    In the thread !
    In the thread !
    ==642== Thread 2:
    ==642== Jump to the invalid address stated on the next line
    ==642==    at 0x403677C: ???
    ==642==    by 0x46BEEFA: start_thread (pthread_create.c:309)
    ==642==    by 0x41C1DFD: clone (clone.S:129)
    ==642==  Address 0x403677c is not stack'd, malloc'd or (recently) free'd
    ==642==
    ==642==
    ==642== Process terminating with default action of signal 11 (SIGSEGV): dumping core
    ==642==  Access not within mapped region at address 0x403677C
    ==642==    at 0x403677C: ???
    ==642==    by 0x46BEEFA: start_thread (pthread_create.c:309)
    ==642==    by 0x41C1DFD: clone (clone.S:129)
    ==642==  If you believe this happened as a result of a stack
    ==642==  overflow in your program main thread (unlikely but
    ==642==  possible), you can try to increase the size of the
    ==642==  main thread stack using the --main-stacksize= flag.
    ==642==  The main thread stack size used in this run was 8388608.
    ==642==
    ==642== HEAP SUMMARY:
    ==642==     in use at exit: 1,296 bytes in 6 blocks
    ==642==   total heap usage: 515 allocs, 509 frees, 31,750 bytes allocated
    ==642==
    ==642== LEAK SUMMARY:
    ==642==    definitely lost: 0 bytes in 0 blocks
    ==642==    indirectly lost: 0 bytes in 0 blocks
    ==642==      possibly lost: 136 bytes in 1 blocks
    ==642==    still reachable: 1,160 bytes in 5 blocks
    ==642==         suppressed: 0 bytes in 0 blocks
    ==642== Rerun with --leak-check=full to see details of leaked memory
    ==642==
    ==642== For counts of detected and suppressed errors, rerun with: -v
    ==642== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
    Killed


However, everything works fine when I open the Lua interpreter and enter the same instructions manually, one by one.
Also, a C program doing the same thing with the same lib:

    int start_mythread();

    int main()
    {
        int ret = start_mythread();
        return ret;
    }

worked as expected and never failed during my tests.
I tried with both Lua 5.1 and 5.2, to no avail.
Edit: I must point out that I tested this on a single-core eeePC running on 32-bit Debian Wheezy (Linux 3.2).
I just tested on my main machine (4-core 64-bit Arch Linux), and running the script with lua myscript.lua segfaults every time there... Entering the commands at the interpreter prompt works fine, though, as does the C program above.
The reason I wrote this small library in the first place is that I am writing a bigger library, which is where I first ran into this problem. After several hours of fruitless debugging, including removing every shared structure/variable one by one (yes, I was that desperate), I boiled it down to this piece of code.
So, I think there is something that I am doing wrong with Lua, but what could it be? I searched for this problem as much as I could, but what I found was mainly people who have problems using the Lua API from multiple threads (this is not what I'm trying to do here).
If you have an idea, any help would be greatly appreciated.

Edit: To be more precise, I would like to know whether I should take extra precautions regarding threads when writing a C lib for use in Lua scripts. Does Lua need threads created from a dynamically loaded library to have terminated by the time it "unloads" the library?

c multithreading linux pthreads lua
2 answers

Why does the segfault happen in the Lua module?

Your Lua script exits before the thread has finished, and that is what causes the segfault. The Lua module is unloaded with dlclose() during normal interpreter shutdown, so the thread's instructions are removed from memory, and the thread segfaults when it tries to fetch its next instruction.

What are the options?

Any solution that stops the threads before the module is unloaded will work. Calling pthread_join() in the main thread waits for the threads to finish (you can kill long-running threads with pthread_cancel()). Calling pthread_exit() in the main thread before the module is unloaded also prevents the crash (because it prevents the dlclose()), but it skips the normal cleanup/shutdown procedure of the Lua interpreter as well.
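One caveat about pthread_cancel(): as far as I know, with the default (deferred) cancellation type a request only takes effect at a cancellation point, so a bare while(1); loop like the one in the question would never notice it. Below is a minimal sketch, not the asker's actual code, that reuses the names handle, mythread and start_mythread from the question and adds a hypothetical helper, cancel_and_join(), to show the cancel-then-join sequence with an explicit cancellation point.

```c
#include <pthread.h>

static pthread_t handle;

/* Same idea as the question's thread, but with an explicit cancellation
 * point so that a pending pthread_cancel() can actually take effect. */
static void *mythread(void *args)
{
    (void)args;
    for (;;) {
        pthread_testcancel();
    }
    return NULL; /* not reached */
}

int start_mythread(void)
{
    return pthread_create(&handle, NULL, mythread, NULL);
}

/* Hypothetical helper (not in the original module): request cancellation,
 * then wait until the thread is really gone.
 * Returns 0 if the thread was cancelled and joined. */
int cancel_and_join(void)
{
    void *result = NULL;
    int rc = pthread_cancel(handle);
    if (rc != 0)
        return rc;
    rc = pthread_join(handle, &result);
    if (rc != 0)
        return rc;
    return result == PTHREAD_CANCELED ? 0 : -1;
}
```

Only once pthread_join() has returned is it safe to let the module be unloaded. Sending the cancel request alone does not guarantee that the thread has stopped executing code from the shared object.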

Here are some examples that work:

    #include <math.h>   /* for pow() */

    int pexit(lua_State* L)
    {
        pthread_exit(NULL);
        return 0;
    }

    int join(lua_State* L)
    {
        pthread_join(handle, NULL);
        return 0;
    }

    static const luaL_Reg testlib[] = {
        {"start_mythread", start_mythread_lua},
        {"join", join},
        {"exit", pexit},
        {NULL, NULL}
    };

    /* Finite busy work instead of while(1), so the thread can be joined */
    void* mythread(void* args)
    {
        int i, j, k;
        printf("In the thread !\n");
        for (i = 0; i < 10000; ++i) {
            for (j = 0; j < 10000; ++j) {
                for (k = 0; k < 10; ++k) {
                    pow(1, i);
                }
            }
        }
        pthread_exit(NULL);
    }

Now the script exits cleanly:

    require('test')
    test.start_mythread()
    print("launched thread")
    test.join() -- or test.exit()
    print("thread joined")

To automate this, you can hook into the garbage collector, since all objects in the module are freed before the shared object is unloaded (as greatwolf suggested).

Discussion on calling pthread_exit() from main(): There is a definite problem if main() finishes before the threads it spawned, if you don't call pthread_exit() explicitly. All of the threads it created will terminate because main() is done and no longer exists to support the threads. By having main() explicitly call pthread_exit() as the last thing it does, main() will block and be kept alive to support the threads it created until they are done.

(This quote is a bit misleading: returning from main() is roughly equivalent to calling exit(), which terminates the process, including all running threads. This may or may not be exactly the behavior you want. Calling pthread_exit() in the main thread, on the other hand, exits the main thread but keeps all other threads running until they stop on their own or someone else kills them. Again, this may or may not be the behavior you want. There is no problem unless you choose the wrong option for your use case.)
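The difference between the two ways of ending the main thread can be observed with a small experiment. The sketch below is only an illustration under my own assumptions: worker_survives() and slow_worker() are hypothetical names, and a forked child stands in for the program so that the parent can see through a pipe whether the worker got to finish.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static int pipe_fd[2];

/* Sleeps for a while, then reports through the pipe that it completed. */
static void *slow_worker(void *arg)
{
    struct timespec delay = {0, 200 * 1000 * 1000}; /* 200 ms */
    (void)arg;
    nanosleep(&delay, NULL);
    if (write(pipe_fd[1], "x", 1) != 1)
        return NULL;
    return NULL;
}

/* Returns 1 if the worker ran to completion, 0 if it was killed with the
 * process, -1 on a setup error. */
int worker_survives(int use_pthread_exit)
{
    char c;
    ssize_t n;
    pid_t pid;

    if (pipe(pipe_fd) != 0)
        return -1;
    fflush(NULL);                 /* avoid duplicated stdio buffers in the child */
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {               /* the child plays the role of main() */
        pthread_t t;
        close(pipe_fd[0]);
        if (pthread_create(&t, NULL, slow_worker, NULL) != 0)
            _exit(2);
        if (use_pthread_exit)
            pthread_exit(NULL);   /* only this thread ends; the worker keeps running */
        exit(0);                  /* what returning from main() amounts to */
    }
    close(pipe_fd[1]);
    n = read(pipe_fd[0], &c, 1);  /* EOF (0) if the child died before writing */
    close(pipe_fd[0]);
    waitpid(pid, NULL, 0);
    return n == 1 ? 1 : 0;
}
```

With pthread_exit() the worker gets to write its byte before the process ends. With exit() the whole process, worker included, is gone before the 200 ms delay elapses.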


So it seems that I need to make sure that all my threads are finished by the time Lua unloads my library.

Solution

I can set up a cleanup function that will be called when the library is unloaded.
Inside this function, I can make sure that all threads launched by my library have finished. Calling pthread_exit from it could be an easy way out if I have detached threads that are still running, but I'm not sure how safe/clean that is, since it abruptly interrupts Lua...
Anyway, I can achieve this by creating a metatable with its __gc field set to my cleanup function and then applying this metatable to my lib table, in Lua 5.2.

    int cleanup(lua_State* L)
    {
        /*Do the cleaning*/
        return 0;
    }

    int luaopen_test(lua_State* L)
    {
        //for lua 5.2
        //metatable with cleanup method for the lib
        luaL_newmetatable(L, "test.cleanup");
        //set our cleanup method as the __gc callback
        lua_pushstring(L, "__gc");
        lua_pushcfunction(L, cleanup);
        lua_settable(L, -3);
        //open our test lib
        luaL_newlib(L, testlib);
        //associate it with our metatable
        luaL_setmetatable(L, "test.cleanup");
        return 1;
    }
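What goes in place of the /*Do the cleaning*/ comment depends on the library. One possible shape, sketched here with pthreads only and leaving the Lua glue out, is to record every thread the library starts and have the cleanup function ask them all to stop and then join them. All names below (spawn_worker, cleanup_threads, MAX_THREADS, shutting_down) are hypothetical, and the sketch assumes that threads are only started and cleaned up from the thread running the Lua state, so the registry itself needs no locking.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define MAX_THREADS 16            /* arbitrary bound for this sketch */

static pthread_t threads[MAX_THREADS];
static int thread_count;
static atomic_int shutting_down;  /* set once by cleanup_threads() */

/* Workers poll the flag so that they can leave on request. */
static void *worker(void *arg)
{
    (void)arg;
    while (!atomic_load(&shutting_down))
        sched_yield();
    return NULL;
}

/* Start a worker and remember its handle. Returns 0 on success. */
int spawn_worker(void)
{
    int rc;
    if (thread_count == MAX_THREADS)
        return -1;
    rc = pthread_create(&threads[thread_count], NULL, worker, NULL);
    if (rc == 0)
        thread_count++;
    return rc;
}

/* Meant to be called from the __gc cleanup function: tell every worker to
 * stop, then wait for each one. Returns the number of threads joined. */
int cleanup_threads(void)
{
    int joined = 0;
    atomic_store(&shutting_down, 1);
    while (thread_count > 0) {
        if (pthread_join(threads[--thread_count], NULL) == 0)
            joined++;
    }
    return joined;
}
```

The cleanup function would then just call cleanup_threads() and return 0. Joining keeps the normal Lua shutdown intact, unlike calling pthread_exit from the finalizer.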

In Lua 5.1, the __gc metamethod works only for userdata. There are several ways to make it work in my case:
- Lua shutdown / end-of-program-execution callback
- http://lua-users.org/wiki/LuaFaq (see "Why don't the __gc and __len metamethods work on tables?")
- greatwolf's solution of keeping a global object that has the said metatable.
