DynamoRIO issue #2690 (Closed)
Issue created Nov 06, 2017 by Derek Bruening (@derekbruening), Contributor

HANG in native state after detach on api.detach_spawn test

Running my new test for #2601 (closed), api.detach_spawn, in a loop after fixing #2601 and #2688 (closed), I saw a hang in one run after detach finished:

The process looks completely native except that the dynamo_exited* flags aren't set. That's normal: dynamo_exit_post_detach clears them all for a possible re-attach:

(gdb) p dynamo_initialized
$1 = false
(gdb) p dynamo_exited
$2 = false
(gdb) p doing_detach
$3 = false
(gdb) p dynamo_detaching_flag
$4 = -1
(gdb) p dynamo_exited_all_other_threads
$5 = false
(gdb) p dynamo_exited_all_and_cleaned
No symbol "dynamo_exited_all_and_cleaned" in current context.
(gdb) p dynamo_exited_and_cleaned
$6 = false

So we're completely native. It's not clear why it's stuck: there are 10 threads and they all seem to be running the 'print(".")' call, but the callstacks are hard to figure out. Are they in some fprintf lock or what?

From base of stack up I get this far:

0xed2fbde0  0xed2fc300  No symbol matches (void *)$retaddr.
0xed2fbde4  0xf753c9e5  vfprintf + 469 in section .text of /lib/i386-linux-gnu/libc.so.6

0xed2fc300  0xed2fc31c  No symbol matches (void *)$retaddr.
0xed2fc304  0x08049d9c  print + 54 in section .text of /home/bruening/dr/git/build_x86_dbg_tests/suite/tests/bin/api.detach_spawn

0xed2fc31c  0xed2fc358  No symbol matches (void *)$retaddr.
0xed2fc320  0x080495ba  parent_func + 42 in section .text of /home/bruening/dr/git/build_x86_dbg_tests/suite/tests/bin/api.detach_spawn

0xed2fc358  0xed2fc428  No symbol matches (void *)$retaddr.
0xed2fc35c  0xf76b0f72  start_thread + 210 in section .text of /lib/i386-linux-gnu/libpthread.so.0
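
The hand walk above follows saved-ebp links: each frame holds the caller's ebp at [ebp] and the return address at [ebp+4]. A minimal sketch of that walk, assuming frame pointers are preserved in every frame (as the pairs in the dump suggest). The helper name walk_frame_chain and the STACK dict are hypothetical, and the values are copied from the dump:

```python
# Saved-ebp / return-address pairs copied from the stack dump above.
STACK = {
    0xed2fbde0: 0xed2fc300, 0xed2fbde4: 0xf753c9e5,  # vfprintf + 469
    0xed2fc300: 0xed2fc31c, 0xed2fc304: 0x08049d9c,  # print + 54
    0xed2fc31c: 0xed2fc358, 0xed2fc320: 0x080495ba,  # parent_func + 42
    0xed2fc358: 0xed2fc428, 0xed2fc35c: 0xf76b0f72,  # start_thread + 210
}


def walk_frame_chain(memory, ebp, max_frames=64):
    """Follow saved-ebp links and collect each frame's return address."""
    retaddrs = []
    while ebp in memory and (ebp + 4) in memory and len(retaddrs) < max_frames:
        retaddrs.append(memory[ebp + 4])
        next_ebp = memory[ebp]
        if next_ebp <= ebp:
            # The stack grows down, so a caller's frame must sit at a
            # higher address; anything else means the chain is broken.
            break
        ebp = next_ebp
    return retaddrs


if __name__ == "__main__":
    for addr in walk_frame_chain(STACK, 0xed2fbde0):
        print(hex(addr))
```

The walk stops at 0xed2fc428 because that address is not in the captured dump, which matches how far the manual walk got.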

Past the vfprintf frame, the stack next hits something around a funlockfile function pointer, but there's no retaddr near there:

(gdb) x/2i 0xf753c9e5-5
   0xf753c9e0 <vfprintf+464>:	call   0xf75415f0
   0xf753c9e5 <vfprintf+469>:	lea    -0xc(%ebp),%esp
(gdb) x/23i 0xf75415f0
   0xf75415f0:	push   %ebp
   0xf75415f1:	push   %edi
   0xf75415f2:	push   %esi
   0xf75415f3:	mov    %eax,%esi
   0xf75415f5:	push   %ebx
   0xf75415f6:	sub    $0x20ec,%esp

0xed2f9ce8  0xed2f9d10  No symbol matches (void *)$retaddr.
0xed2f9cec  0xf76b9170  funlockfile in section .text of /lib/i386-linux-gnu/libpthread.so.0

Going from the top down we're in libc but gdb won't name the routine:
(gdb) x/5i $pc-2
   0xf76eac8e:	int    $0x80
=> 0xf76eac90:	pop    %ebp
   0xf76eac91:	pop    %edx
   0xf76eac92:	pop    %ecx
   0xf76eac93:	ret    
(gdb) dps $esp $esp+64
0xed2f9cc8  0x00000001  No symbol matches (void *)$retaddr.
0xed2f9ccc  0x00000002  No symbol matches (void *)$retaddr.
0xed2f9cd0  0x00000080  No symbol matches (void *)$retaddr.
0xed2f9cd4  0xf75f48b1  No symbol matches (void *)$retaddr.
(gdb) x/10i 0xf75f48b1-12
   0xf75f48a5:	mov    $0xf0,%eax
   0xf75f48aa:	call   *%gs:0x10
   0xf75f48b1:	mov    %edx,%eax
   0xf75f48b3:	xchg   %eax,(%ebx)

f74f9000-f76a4000 r-xp 00000000 fc:01 12321400                           /lib/i386-linux-gnu/libc-2.19.so
f76a4000-f76a6000 r--p 001aa000 fc:01 12321400                           /lib/i386-linux-gnu/libc-2.19.so
f76a6000-f76a7000 rw-p 001ac000 fc:01 12321400                           /lib/i386-linux-gnu/libc-2.19.so
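
Since gdb won't name the routine, the return address can at least be turned into a module plus file offset from the maps lines above, for use with an offline disassembler. A rough sketch, assuming the usual /proc/<pid>/maps column layout; find_mapping is a hypothetical helper, not part of gdb or DR:

```python
# The three libc mappings copied from the maps output above.
MAPS = """\
f74f9000-f76a4000 r-xp 00000000 fc:01 12321400                           /lib/i386-linux-gnu/libc-2.19.so
f76a4000-f76a6000 r--p 001aa000 fc:01 12321400                           /lib/i386-linux-gnu/libc-2.19.so
f76a6000-f76a7000 rw-p 001ac000 fc:01 12321400                           /lib/i386-linux-gnu/libc-2.19.so
"""


def find_mapping(maps_text, addr):
    """Return (path, perms, file_offset) for the mapping holding addr, else None."""
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # not a maps line
        start, end = (int(x, 16) for x in fields[0].split("-"))
        if start <= addr < end:
            path = fields[5].strip() if len(fields) > 5 else ""
            # File offset = mapping's own offset + distance into the mapping.
            return path, fields[1], int(fields[2], 16) + (addr - start)
    return None


if __name__ == "__main__":
    path, perms, offset = find_mapping(MAPS, 0xf75f48b1)
    print(path, perms, hex(offset))
```

For the return address 0xf75f48b1 this lands in the r-xp mapping of libc-2.19.so at file offset 0xfb8b1.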

0xf0 == 240 == the futex syscall number on 32-bit x86

(gdb) info reg
eax            0xfffffe00	-512
ecx            0x80	128                  == op (flag: FUTEX_WAIT | FUTEX_PRIVATE_FLAG)
edx            0x2	2                    == val (mustbe: the value the futex is expected to hold)
ebx            0xf76a788c	-144017268   == int*futex
esp            0xed2f9cc8	0xed2f9cc8
ebp            0x1	0x1                  == val3
esi            0x0	0                    == timeout
edi            0x1	1                    == int*uaddr2

#define FUTEX_WAIT		0
#define FUTEX_PRIVATE_FLAG	128
#define FUTEX_WAIT_PRIVATE	(FUTEX_WAIT | FUTEX_PRIVATE_FLAG)
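
The register annotations above can be checked mechanically. This is a small sketch that decodes the i386 futex arguments from the register values in the dump, assuming the standard 32-bit int $0x80 argument order (ebx, ecx, edx, esi, edi, ebp). The function names are hypothetical, and only the two constants from the #defines above are handled:

```python
FUTEX_WAIT = 0
FUTEX_PRIVATE_FLAG = 128

# Register values copied from the "info reg" output above.
REGS = {
    "eax": 0xfffffe00, "ecx": 0x80, "edx": 0x2, "ebx": 0xf76a788c,
    "ebp": 0x1, "esi": 0x0, "edi": 0x1,
}


def to_signed32(value):
    """Interpret a 32-bit register value as a signed integer, as gdb does."""
    return value - (1 << 32) if value & 0x80000000 else value


def decode_futex_regs(regs):
    """Map i386 syscall registers onto futex(uaddr, op, val, timeout, uaddr2, val3)."""
    op = regs["ecx"]
    return {
        "uaddr": regs["ebx"],
        "cmd": op & ~FUTEX_PRIVATE_FLAG,  # simplification: other op flags are ignored
        "private": bool(op & FUTEX_PRIVATE_FLAG),
        "val": regs["edx"],
        "timeout": regs["esi"],
        "uaddr2": regs["edi"],
        "val3": regs["ebp"],
        "eax_signed": to_signed32(regs["eax"]),
    }


if __name__ == "__main__":
    for name, value in decode_futex_regs(REGS).items():
        print(name, value)
```

With these values the command decodes to FUTEX_WAIT with the private flag set, val is 2, and the timeout pointer is NULL, i.e. an unbounded wait. The -512 in eax equals the kernel-internal ERESTARTSYS value, which would be consistent with a thread that was blocked in the syscall when gdb stopped it.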

Looks like the routine starts here:

   0xf75f4890:	push   %edx

So its caller is:

(gdb) x/4i 0xf754182e-5
   0xf7541829:	call   0xf75f4890
   0xf754182e:	jmp    0xf75416ee

Anyway it's a futex inside vfprintf: hard to get more when these libc routines apparently have no names known to gdb.

Did we mess up something about stderr? Running the test 300 times without DR, I don't see a hang.
