currently-in-app-state thread now can't handle a signal
The latest DR TLS changes from #2089 (closed) break the DrM nudge test:
Trying to update the DR used by DrM results in the nudge test failing (I had to add custom output to see the nudge-out file contents):
62: <received nudge mask=0x40000 id=0x00000000 arg=0x0000000000000000>
62: <ERROR: master_signal_handler with no siginfo (i#26?): tid=16408, sig=4>
62: |
62:
62: CMake Error at runtest.cmake:249 (message):
62: Timed out waiting for summary output
62:
master_signal_handler: sig=4, retaddr=0xf776bbd0
handle_nudge_signal: sig=4 code=-1 errno=268697600
received nudge version=1 flags=0x0 mask=0x40000 id=0x00000000 arg=0x0000000000000000
SYSLOG_INFORMATION: received nudge mask=0x40000 id=0x00000000 arg=0x0000000000000000
master_signal_handler 4 returning now
Exit from F1647(0x08048901).0xe79cbcc0 (shared)
(UNKNOWN DIRECT EXIT F1647.0xe79cbcc0->F1647)
dispatch: target = 0x08048901
synch with all threads my id = 17188 Giving 4 permission and seeking 4 state
Skipping synch with thread 17188
Finished synch with all threads: result=1
returning holding initexit_lock and all_threads_synch_lock
os_switch_seg_to_base: switching to app, setting gs to 0x63
os_switch_seg_to_base to app: set_thread_area successful for thread 17188 base 0x0804cc80
TLS copy is 0xe77516e4
os_switch_seg_to_base: switching to dr, setting fs to 0x73
os_switch_seg_to_base to DR: set_thread_area successful for thread 17188 base 0xe77516e4
=> <failure here>
os_switch_seg_to_base: switching to dr, setting gs to 0x63
os_switch_seg_to_base to DR: set_thread_area successful for thread 17188 base 0xe77c2b70
os_switch_seg_to_base: switching to dr, setting fs to 0x73
os_switch_seg_to_base to DR: set_thread_area successful for thread 17188 base 0xe77bd000
os_file_exists failed: 0xfffffffe
os_file_exists failed: 0xfffffffe
os_file_exists failed: 0xfffffffe
Just flushed targetf, next_tag is 0x08048901
Entry into F1647(0x08048901).0xe79cbc77 (shared)
I can repro like this:
# tests/run_app_in_bg -out ./nudge-out /work/drmemory/git/build_x86_dbg/bin/drmemory -debug -dr_debug -dr /work/dr/git/exports -batch -dr_ops -loglevel -dr_ops 2 -callstack_style 0x27 -no_results_to_stderr -- tests/infloop
17294
# for ((i=0; i<2; i++)); do bin/drmemory -debug -dr_debug -dr /work/dr/git/exports -batch -nudge 17294; done
The failure happens when the 2nd nudge arrives while the thread is in app state. With the #2089 (closed) changes, a nudge signal arriving in a native thread will always cause this assert. Previously, dr_switch_to_app_state() did nothing on Linux, which is why nothing fired.
The DrM failure comes from a leak scan nudge, which calls dr_switch_to_app_state() for the Windows TEB. For now I will have DrM swap to app state only on Windows. Longer term, we need a better strategy for handling a signal that arrives in a thread currently in app state.
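A rough sketch of what the short-term workaround could look like, not the actual DrM change. To keep it compilable on its own, switch_to_app_state() and switch_to_dr_state() are hypothetical stand-ins for dr_switch_to_app_state() and dr_switch_to_dr_state(), leak_scan_nudge() is a made-up handler name, and the WINDOWS macro is assumed to be the build's platform define.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the DR state-swap calls. A real client would
 * include dr_api.h and call dr_switch_to_app_state(drcontext) and
 * dr_switch_to_dr_state(drcontext) instead. */
static bool in_app_state = false;

void switch_to_app_state(void) { in_app_state = true; }
void switch_to_dr_state(void) { in_app_state = false; }

/* Hypothetical leak-scan nudge handler.
 * Returns whether the scan ran while the thread was in app state. */
bool leak_scan_nudge(void)
{
    bool scanned_in_app_state;

    /* The handler is expected to start out in DR state. */
    assert(!in_app_state);

#ifdef WINDOWS /* assumption: platform macro defined by the build */
    /* Only Windows needs the app's TEB values during the scan. */
    switch_to_app_state();
#endif

    /* ... the leak scan itself would run here ... */
    scanned_in_app_state = in_app_state;

#ifdef WINDOWS
    switch_to_dr_state();
#endif
    return scanned_in_app_state;
}
```

On a non-Windows build the swap is compiled out, so the thread stays in DR state for the whole scan and a second nudge signal never finds it in app state. This only sidesteps the problem for this one caller; it does not address signals arriving in threads that are in app state for other reasons.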
Xref the related #1921 (closed).