HANG in native state after detach on api.detach_spawn test
Running my new test for #2601, api.detach_spawn, in a loop after fixing #2601 and #2688, I saw one run hang after detach finished.
The process looks completely native, except that the dynamo_exited* flags aren't set. That's expected: dynamo_exit_post_detach clears them all for a possible re-attach:
(gdb) p dynamo_initialized
$1 = false
(gdb) p dynamo_exited
$2 = false
(gdb) p doing_detach
$3 = false
(gdb) p dynamo_detaching_flag
$4 = -1
(gdb) p dynamo_exited_all_other_threads
$5 = false
(gdb) p dynamo_exited_all_and_cleaned
No symbol "dynamo_exited_all_and_cleaned" in current context.
(gdb) p dynamo_exited_and_cleaned
$6 = false
So we're completely native. It's not clear why it's stuck: there are 10 threads and they all seem to be running the 'print(".")' call, but the callstacks are hard to decipher. Are they blocked on some fprintf lock?
From the base of the stack up I get this far:
0xed2fbde0 0xed2fc300 No symbol matches (void *)$retaddr.
0xed2fbde4 0xf753c9e5 vfprintf + 469 in section .text of /lib/i386-linux-gnu/libc.so.6
0xed2fc300 0xed2fc31c No symbol matches (void *)$retaddr.
0xed2fc304 0x08049d9c print + 54 in section .text of /home/bruening/dr/git/build_x86_dbg_tests/suite/tests/bin/api.detach_spawn
0xed2fc31c 0xed2fc358 No symbol matches (void *)$retaddr.
0xed2fc320 0x080495ba parent_func + 42 in section .text of /home/bruening/dr/git/build_x86_dbg_tests/suite/tests/bin/api.detach_spawn
0xed2fc358 0xed2fc428 No symbol matches (void *)$retaddr.
0xed2fc35c 0xf76b0f72 start_thread + 210 in section .text of /lib/i386-linux-gnu/libpthread.so.0
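The pairs in the dump above look like a standard i386 frame-pointer chain, where each frame stores the caller's ebp followed by the return address. As a minimal sketch of that reading (the helper name `walk_frame_chain` and the `STACK` dictionary are made up for illustration, and the frame layout is an assumption rather than something gdb confirmed), the same walk can be reproduced offline from the copied values:

```python
# Stack words copied from the dump above: address -> 32-bit value stored there.
STACK = {
    0xed2fbde0: 0xed2fc300, 0xed2fbde4: 0xf753c9e5,
    0xed2fc300: 0xed2fc31c, 0xed2fc304: 0x08049d9c,
    0xed2fc31c: 0xed2fc358, 0xed2fc320: 0x080495ba,
    0xed2fc358: 0xed2fc428, 0xed2fc35c: 0xf76b0f72,
}


def walk_frame_chain(memory, frame_ptr, max_frames=64):
    """Follow saved-ebp links, collecting (frame address, return address) pairs.

    Assumes the conventional i386 layout: [ebp] holds the caller's ebp and
    [ebp+4] holds the return address. Hypothetical helper, not a gdb API.
    """
    frames = []
    while len(frames) < max_frames:
        saved_ebp = memory.get(frame_ptr)
        retaddr = memory.get(frame_ptr + 4)
        if saved_ebp is None or retaddr is None:
            break  # ran off the part of the stack that was captured
        frames.append((frame_ptr, retaddr))
        if saved_ebp <= frame_ptr:
            break  # a valid chain only moves toward higher addresses
        frame_ptr = saved_ebp
    return frames


chain = walk_frame_chain(STACK, 0xed2fbde0)
for frame, retaddr in chain:
    print(f"frame {frame:#010x} -> return address {retaddr:#010x}")
```

The walk stops once it reaches a frame address that was not captured in the dump, which is why it ends at the start_thread return address.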
Continuing past the vfprintf frame, the next thing on the stack is somewhere around a funlockfile function pointer, but there's no return address near it:
(gdb) x/2i 0xf753c9e5-5
0xf753c9e0 <vfprintf+464>: call 0xf75415f0
0xf753c9e5 <vfprintf+469>: lea -0xc(%ebp),%esp
(gdb) x/23i 0xf75415f0
0xf75415f0: push %ebp
0xf75415f1: push %edi
0xf75415f2: push %esi
0xf75415f3: mov %eax,%esi
0xf75415f5: push %ebx
0xf75415f6: sub $0x20ec,%esp
0xed2f9ce8 0xed2f9d10 No symbol matches (void *)$retaddr.
0xed2f9cec 0xf76b9170 funlockfile in section .text of /lib/i386-linux-gnu/libpthread.so.0
Going from the top down we're in libc but gdb won't name the routine:
(gdb) x/5i $pc-2
0xf76eac8e: int $0x80
=> 0xf76eac90: pop %ebp
0xf76eac91: pop %edx
0xf76eac92: pop %ecx
0xf76eac93: ret
(gdb) dps $esp $esp+64
0xed2f9cc8 0x00000001 No symbol matches (void *)$retaddr.
0xed2f9ccc 0x00000002 No symbol matches (void *)$retaddr.
0xed2f9cd0 0x00000080 No symbol matches (void *)$retaddr.
0xed2f9cd4 0xf75f48b1 No symbol matches (void *)$retaddr.
(gdb) x/10i 0xf75f48b1-12
0xf75f48a5: mov $0xf0,%eax
0xf75f48aa: call *%gs:0x10
0xf75f48b1: mov %edx,%eax
0xf75f48b3: xchg %eax,(%ebx)
f74f9000-f76a4000 r-xp 00000000 fc:01 12321400 /lib/i386-linux-gnu/libc-2.19.so
f76a4000-f76a6000 r--p 001aa000 fc:01 12321400 /lib/i386-linux-gnu/libc-2.19.so
f76a6000-f76a7000 rw-p 001ac000 fc:01 12321400 /lib/i386-linux-gnu/libc-2.19.so
0xf0 == 240 == the futex syscall number on 32-bit x86
(gdb) info reg
eax 0xfffffe00 -512
ecx 0x80 128 == op (FUTEX_WAIT_PRIVATE)
edx 0x2 2 == val (value the futex must hold)
ebx 0xf76a788c -144017268 == int*futex
esp 0xed2f9cc8 0xed2f9cc8
ebp 0x1 0x1 == val3
esi 0x0 0 == timeout
edi 0x1 1 == int*uaddr2
#define FUTEX_WAIT 0
#define FUTEX_PRIVATE_FLAG 128
#define FUTEX_WAIT_PRIVATE (FUTEX_WAIT | FUTEX_PRIVATE_FLAG)
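To double-check the register annotations, here is a small sketch that maps the i386 syscall argument registers onto futex(uaddr, op, val, timeout, uaddr2, val3) and splits the op into command and private flag. The function and table names are made up for illustration. It uses only the two constants quoted above, so any other op flag bits are ignored. Reading eax as -512 is plain arithmetic; interpreting that as the kernel's internal "restart this syscall" code (seen when a debugger interrupts a blocked call) is my assumption.

```python
FUTEX_WAIT = 0
FUTEX_PRIVATE_FLAG = 128

# i386 syscall argument registers in order, paired with the futex parameters.
ARG_REGS = ("ebx", "ecx", "edx", "esi", "edi", "ebp")
ARG_NAMES = ("uaddr", "op", "val", "timeout", "uaddr2", "val3")


def to_signed32(value):
    """Reinterpret an unsigned 32-bit register value as signed."""
    return value - (1 << 32) if value & 0x80000000 else value


def decode_futex_call(regs):
    """Hypothetical helper: name the futex arguments and split the op field."""
    args = {name: regs[reg] for name, reg in zip(ARG_NAMES, ARG_REGS)}
    op = args["op"]
    return {
        "args": args,
        "command": op & ~FUTEX_PRIVATE_FLAG,  # only strips the private bit
        "private": bool(op & FUTEX_PRIVATE_FLAG),
        "eax_signed": to_signed32(regs["eax"]),
    }


# Register values copied from the "info reg" output above.
REGS = {
    "eax": 0xfffffe00, "ecx": 0x80, "edx": 0x2, "ebx": 0xf76a788c,
    "ebp": 0x1, "esi": 0x0, "edi": 0x1,
}
decoded = decode_futex_call(REGS)
print(decoded)
```

With these values the call decodes as a private FUTEX_WAIT on the word at 0xf76a788c with val 2 and a NULL timeout, so the thread waits indefinitely for a wake-up as long as the word still holds 2.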
Looks like the routine starts here:
0xf75f4890: push %edx
So its caller is:
(gdb) x/4i 0xf754182e-5
0xf7541829: call 0xf75f4890
0xf754182e: jmp 0xf75416ee
Anyway, it's a futex wait inside vfprintf. It's hard to learn more when gdb apparently has no names for these libc routines.
Did we mess up something about stderr? Running 300x w/o DR I don't see a hang.
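Since gdb has no names for these libc routines, one possible next step is to turn the raw addresses into libc-relative offsets using the maps lines quoted earlier, then look them up offline in libc-2.19.so. The sketch below does only the arithmetic, and `parse_maps` and `file_offset` are hypothetical helper names. Whether the resulting offset equals the library's link-time address depends on how the file was linked; that is commonly true for a text segment mapped at file offset 0, but it is an assumption here.

```python
# The three libc mappings copied from the transcript above.
MAPS = """\
f74f9000-f76a4000 r-xp 00000000 fc:01 12321400 /lib/i386-linux-gnu/libc-2.19.so
f76a4000-f76a6000 r--p 001aa000 fc:01 12321400 /lib/i386-linux-gnu/libc-2.19.so
f76a6000-f76a7000 rw-p 001ac000 fc:01 12321400 /lib/i386-linux-gnu/libc-2.19.so
"""


def parse_maps(text):
    """Parse /proc/<pid>/maps-style lines into (start, end, offset, path)."""
    regions = []
    for line in text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping with no backing path
        start, end = (int(part, 16) for part in fields[0].split("-"))
        regions.append((start, end, int(fields[2], 16), fields[5]))
    return regions


def file_offset(regions, addr):
    """Return (path, offset within the file) for addr, or None if unmapped."""
    for start, end, offset, path in regions:
        if start <= addr < end:
            return path, addr - start + offset
    return None


regions = parse_maps(MAPS)
for addr in (0xf75f48b1, 0xf75f4890, 0xf754182e, 0xf76a788c):
    print(f"{addr:#010x} -> {file_offset(regions, addr)}")
```

The return address after the futex syscall, the start of the unnamed routine, and its caller all resolve into the libc text mapping. The futex word in ebx (0xf76a788c) lies just past the last mapping listed, so these three lines alone cannot say which object it belongs to.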