I've been having a problem with the Linux futex syscall (FUTEX_WAIT operation) sometimes returning early, seemingly for no reason. The documentation lists certain conditions that can cause an early return (without a FUTEX_WAKE), but all of them involve non-zero return values: EAGAIN if the value at the futex address does not match, ETIMEDOUT for timed waits that time out, EINTR when interrupted by a (non-restarting) signal, etc. But I'm seeing a return value of 0. What, other than FUTEX_WAKE or the termination of a thread whose set_tid_address pointer points to the futex, could cause FUTEX_WAIT to return with a return value of 0?
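To make the documented outcomes concrete, here is a rough sketch of how they can be told apart from userspace. The helper name `futex_wait_result` is made up for illustration; it relies on the glibc `syscall()` wrapper, which returns -1 and sets `errno` on failure.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper (not part of any library): wait on *addr while it
 * still holds `expected`, for at most `timeout` (NULL means no timeout).
 * Returns 0 if the kernel reported a wake-up, otherwise the errno value
 * (EAGAIN on value mismatch, ETIMEDOUT, EINTR, ...). */
static int futex_wait_result(int *addr, int expected,
                             const struct timespec *timeout)
{
    long r = syscall(SYS_futex, addr, FUTEX_WAIT, expected, timeout, NULL, 0);
    return r == 0 ? 0 : errno;
}
```

A return of 0 from this helper is the case the question is about: it only tells you the kernel claims a wake-up happened, not why.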
In case it's useful, the particular futex I was waiting on is the thread's tid address (set by the clone syscall with CLONE_CHILD_CLEARTID), and the thread had not terminated. My (apparently incorrect) assumption that FUTEX_WAIT returning 0 could only happen when the thread terminated led to serious errors in program logic, which I have since fixed by looping and retrying even if it returns 0, but now I'm curious why this happened.
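The looping workaround mentioned above might look roughly like the following. This is only a sketch, not my actual code: `wait_for_tid_clear` is a made-up name, and it uses the GCC/Clang `__atomic_load_n` builtin to re-read the tid word on every pass.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: block until the word at tid_addr becomes 0, which
 * is what the kernel writes there when a CLONE_CHILD_CLEARTID thread exits.
 * The return value of FUTEX_WAIT is deliberately ignored: whether it is 0,
 * EAGAIN or EINTR, the only thing trusted is the value of the word itself. */
static void wait_for_tid_clear(int *tid_addr)
{
    int v;
    while ((v = __atomic_load_n(tid_addr, __ATOMIC_ACQUIRE)) != 0) {
        /* Sleep only while the word still holds the value just read. */
        syscall(SYS_futex, tid_addr, FUTEX_WAIT, v, NULL, NULL, 0);
    }
}
```

The point of the design is that a return of 0 is treated as a hint to recheck, never as proof that the thread is gone.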
Here is a minimal test case:
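    #define _GNU_SOURCE
    #include <sched.h>
    #include <sys/syscall.h>
    #include <unistd.h>
    #include <linux/futex.h>
    #include <signal.h>

    static char stack[32768];
    static int tid;   /* set by CLONE_PARENT_SETTID, cleared by CLONE_CHILD_CLEARTID */

    static int foo(void *p)
    {
        /* Do a little work, then exit only this thread via the raw syscall. */
        syscall(SYS_getpid);
        syscall(SYS_getpid);
        syscall(SYS_exit, 0);
        return 0;   /* not reached */
    }

    int main()
    {
        int pid = getpid();
        for (;;) {
            /* Arguments after the flags: arg, parent tid ptr, tls, child tid ptr. */
            int x = clone(foo, stack+sizeof stack,
                          CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND
                          |CLONE_THREAD|CLONE_SYSVSEM
                          |CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID
                          |CLONE_DETACHED,
                          0, &tid, 0, &tid);
            /* Wait for the kernel to clear tid when the thread exits. */
            syscall(SYS_futex, &tid, FUTEX_WAIT, x, 0);
            /* If tid is still non-zero here, the thread is alive and this
               SIGKILL takes down the whole process. */
            syscall(SYS_tgkill, pid, tid, SIGKILL);
        }
    }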
Let it run for a while; it should eventually terminate with Killed (SIGKILL), which is only possible if the thread still exists when FUTEX_WAIT returns.