[AArch64][jdk8] Incorrect handling of a synchro signal when the pc is in a mangling epilogue
We received a synchro signal on a thread:
main_signal_handler: thread=1588266, sig=12, xsp=0x0000fff923c94da0, retaddr=0x000000000000000c
siginfo: sig = 12, pid = 1587929, status = 0, errno = 0, si_code = -6
x0 = 0x0000000000000000
x1 = 0x0000fff923c6e000
x2 = 0x000000000000000c
x3 = 0x0000000000000030
x4 = 0x000000000000005c
x5 = 0x0000000000003c05
x6 = 0x0000fff09015a9b8
x7 = 0xfefeff6f6071735e
x8 = 0x7f7f7f7f7f7f7f7f
x9 = 0x0000000000000000
x10 = 0x0101010101010101
x11 = 0x0000000000000028
x12 = 0x0000a701409d1276
x13 = 0x0000000000000040
x14 = 0x000000000000003f
x15 = 0x0000000000000000
x16 = 0x0000ffffa651dc00
x17 = 0x0000ffffa6bc4080
x18 = 0x0000000000000000
x19 = 0x0000000000000030
x20 = 0x0000fff090484208
x21 = 0x0000fff0901b94f8
x22 = 0x0000ffffa6600340
x23 = 0x0000000000000001
x24 = 0x0000000000000021
x25 = 0x0000fff09045a8f8
x26 = 0x0000000000000021
x27 = 0x0000000000000108
x28 = 0x0000fff923c6e000
x29 = 0x0000fff106572880
x30 = 0x0000ffffa635068c
sp = 0x0000fff106572880
pc = 0x0000ffff238c68c8
pstate = 0x0000000020000000
pc is 0x0000ffff238c68c8
The code cache for the bb looks like this:
(gdb) x /16i (0x0000ffff238c68c8-48)
0xffff238c6898: ldr x0, [x25, #8]
0xffff238c689c: str x0, [x28]
0xffff238c68a0: mov x0, x28
0xffff238c68a4: ldr x28, [x28, #48]
0xffff238c68a8: lsl x27, x28, #3
0xffff238c68ac: mov x28, x0
0xffff238c68b0: ldr x0, [x28]
0xffff238c68b4: str x1, [x28, #8]
0xffff238c68b8: mov x1, x28
0xffff238c68bc: ldr x28, [x28, #48]
0xffff238c68c0: ldr x0, [x0, x28, lsl #3]
0xffff238c68c4: mov x28, x1
==> 0xffff238c68c8: ldr x1, [x28, #8] <==
0xffff238c68cc: cmp x20, x0
0xffff238c68d0: b.eq 0xffff238c6de8 // b.none
0xffff238c68d4: b 0xffff238c6a68
The clean bb and the bb after mangling:
interp: start_pc = 0x0000ffffa6350424
check_thread_vm_area: pc = 0x0000ffffa6350424
check_thread_vm_area: check_stop = 0x0000ffffa6b02888
0x0000ffffa6350424 f9400720 ldr +0x08(%x25)[8byte] -> %x0
0x0000ffffa6350428 d37df39b ubfm %x28 $0x3d $0x3c -> %x27
0x0000ffffa635042c f87c7800 ldr (%x0,%x28,lsl #3)[8byte] -> %x0
0x0000ffffa6350430 eb00029f subs %x20 %x0 lsl $0x00 -> %xzr
0x0000ffffa6350434 54000340 b.eq $0x0000ffffa635049c
end_pc = 0x0000ffffa6350438
bb ilist after mangling:
TAG 0x0000ffffa6350424
+0 L3 @0x0000fff923eafda0 f9400720 ldr +0x08(%x25)[8byte] -> %x0
+4 m4 @0x0000fff923eb1110 f9000380 str %x0 -> (%x28)[8byte]
+8 m4 @0x0000fff923eb1df0 aa1c03e0 orr %xzr %x28 lsl $0x0000000000000000 -> %x0
+12 m4 @0x0000fff923eb1358 f9401b9c ldr +0x30(%x28)[8byte] -> %x28
+16 L3 @0x0000fff923eb0430 d37df39b ubfm %x28 $0x3d $0x3c -> %x27
+20 m4 @0x0000fff923eb1090 aa0003fc orr %xzr %x0 lsl $0x0000000000000000 -> %x28
+24 m4 @0x0000fff923eae950 f9400380 ldr (%x28)[8byte] -> %x0
+28 m4 @0x0000fff923eaf438 f9000781 str %x1 -> +0x08(%x28)[8byte]
+32 m4 @0x0000fff923eae9d0 aa1c03e1 orr %xzr %x28 lsl $0x0000000000000000 -> %x1
+36 m4 @0x0000fff923eaeb18 f9401b9c ldr +0x30(%x28)[8byte] -> %x28
+40 L3 @0x0000fff923eaf0a8 f87c7800 ldr (%x0,%x28,lsl #3)[8byte] -> %x0
+44 m4 @0x0000fff923eafd20 aa0103fc orr %xzr %x1 lsl $0x0000000000000000 -> %x28
==> +48 m4 @0x0000fff923eb00e8 f9400781 ldr +0x08(%x28)[8byte] -> %x1 <==
+52 L3 @0x0000fff923eb12d8 eb00029f subs %x20 %x0 lsl $0x00 -> %xzr
+56 L3 @0x0000fff923eb20b8 54000340 b.eq $0x0000ffffa635049c
+60 L4 @0x0000fff923eb1e70 14000000 b $0x0000ffffa6350438
END 0x0000ffffa6350424
So pc 0x0000ffff238c68c8 is the mangling (m4) instruction
ldr +0x08(%x28)[8byte] -> %x1
When the thread woke up, the dispatcher set the target to 0x0000ffffa635042c:
handle_suspend_signal: awake now
main_signal_handler 12 returning now to 0x0000ffff22d11454
Exit due to proactive reset
d_r_dispatch: target = 0x0000ffffa635042c
Building new bb
interp: start_pc = 0x0000ffffa635042c
check_thread_vm_area: pc = 0x0000ffffa635042c
check_thread_vm_area: check_stop = 0x0000ffffa6b02888
==> 0x0000ffffa635042c f87c7800 ldr (%x0,%x28,lsl #3)[8byte] -> %x0 <==
0x0000ffffa6350430 eb00029f subs %x20 %x0 lsl $0x00 -> %xzr
0x0000ffffa6350434 54000340 b.eq $0x0000ffffa635049c
end_pc = 0x0000ffffa6350438
bb ilist after mangling:
TAG 0x0000ffffa635042c
+0 m4 @0x0000fff923eb1110 f9000781 str %x1 -> +0x08(%x28)[8byte]
+4 m4 @0x0000fff923eb12d8 aa1c03e1 orr %xzr %x28 lsl $0x0000000000000000 -> %x1
+8 m4 @0x0000fff923eb1df0 f9401b9c ldr +0x30(%x28)[8byte] -> %x28
==> +12 L3 @0x0000fff923eaf0a8 f87c7800 ldr (%x0,%x28,lsl #3)[8byte] -> %x0 <==
+16 m4 @0x0000fff923eb1358 aa0103fc orr %xzr %x1 lsl $0x0000000000000000 -> %x28
*** +20 m4 @0x0000fff923eb0430 f9400781 ldr +0x08(%x28)[8byte] -> %x1 ***
+24 L3 @0x0000fff923eafd20 eb00029f subs %x20 %x0 lsl $0x00 -> %xzr
+28 L3 @0x0000fff923eb00e8 54000340 b.eq $0x0000ffffa635049c
+32 L4 @0x0000fff923eafda0 14000000 b $0x0000ffffa6350438
END 0x0000ffffa635042c
It looks like we went back to the original app instruction
ldr (%x0,%x28,lsl #3)[8byte] -> %x0
which precedes our mangling instruction
ldr +0x08(%x28)[8byte] -> %x1
but the register context was probably not restored, so x0 holds an incorrect value.
Crash signal context:
main_signal_handler: thread=1588266, sig=11, xsp=0x0000fff923c94da0, retaddr=0x000000000000000b
siginfo: sig = 11, pid = 264, status = 0, errno = 0, si_code = 1
x0 = 0x0000000000000000
x1 = 0x0000fff923c6e000
x2 = 0x000000000000000c
x3 = 0x0000000000000030
x4 = 0x000000000000005c
x5 = 0x0000000000003c05
x6 = 0x0000fff09015a9b8
x7 = 0xfefeff6f6071735e
x8 = 0x7f7f7f7f7f7f7f7f
x9 = 0x0000000000000000
x10 = 0x0101010101010101
x11 = 0x0000000000000028
x12 = 0x0000a701409d1276
x13 = 0x0000000000000040
x14 = 0x000000000000003f
x15 = 0x0000000000000000
x16 = 0x0000ffffa651dc00
x17 = 0x0000ffffa6bc4080
x18 = 0x0000000000000000
x19 = 0x0000000000000030
x20 = 0x0000fff090484208
x21 = 0x0000fff0901b94f8
x22 = 0x0000ffffa6600340
x23 = 0x0000000000000001
x24 = 0x0000000000000021
x25 = 0x0000fff09045a8f8
x26 = 0x0000000000000021
x27 = 0x0000000000000108
x28 = 0x0000000000000021
x29 = 0x0000fff106572880
x30 = 0x0000ffffa635068c
sp = 0x0000fff106572880
pc = 0x0000ffff2417046c
pstate = 0x0000000020000000
computing memory target for 0x0000ffff2417046c causing SIGSEGV, kernel claims it is 0x0000000000000108
compute_memory_target: falling back to racy protection checks
opnd_compute_address for: (%x0,%x28,lsl #3)
base => 0x0000000000000000
index,scale => 0x0000000000000108
disp => 0x0000000000000108
For SIGSEGV at cache pc 0x0000ffff2417046c, computed target read 0x0000000000000108
faulting instr: ldr (%x0,%x28,lsl #3)[8byte] -> %x0
** Received SIGSEGV at cache pc 0x0000ffff2417046c in thread 1588266
record_pending_signal(11) from cache pc 0x0000ffff2417046c
not certain can delay so handling now
action is not SIG_IGN
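The fault address in the log can be checked by hand from the crash context. A minimal Python sketch, using the x0 and x28 values reported at the SIGSEGV and the lsl #3 scaling of the faulting ldr (the helper name is made up for illustration):

```python
def ldr_reg_offset_address(base, index, shift):
    """Effective address of `ldr xt, [base, index, lsl #shift]`, wrapped to 64 bits."""
    return (base + (index << shift)) & 0xFFFFFFFFFFFFFFFF

# Values from the SIGSEGV context: x0 = 0 (base), x28 = 0x21 (index).
fault_addr = ldr_reg_offset_address(base=0x0, index=0x21, shift=3)
print(hex(fault_addr))  # 0x108, the address the kernel reports
```

With a zero base the address is just the scaled index, which is consistent with x0 having already been overwritten by an earlier execution of the same load.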
(gdb) x /9i (0x0000ffff2417046c-12)
0xffff24170460: str x1, [x28, #8]
0xffff24170464: mov x1, x28
0xffff24170468: ldr x28, [x28, #48]
0xffff2417046c: ldr x0, [x0, x28, lsl #3]
0xffff24170470: mov x28, x1
0xffff24170474: ldr x1, [x28, #8]
0xffff24170478: cmp x20, x0
0xffff2417047c: b.eq 0xffff24170484 // b.none
0xffff24170480: b 0xffff238c6a68
This looks like the following case:
- before the synchro signal we executed
ldr x0, [x0, x28, lsl #3]
which changed x0
- after the signal, we went back from
ldr x1, [x28, #8]
to
ldr x0, [x0, x28, lsl #3]
but did not restore the context
- we then executed
ldr x0, [x0, x28, lsl #3]
a second time, with an incorrect register context
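The sequence above can be replayed with a toy model. This is only a sketch: the table base address is hypothetical (the original x0 is not in the logs), memory is a plain dict, and the loaded entry is assumed to be 0 because x0 is 0 in both signal contexts.

```python
class Fault(Exception):
    """Raised when the modeled load touches unmapped memory."""

def ldr_x0(regs, mem):
    # Model of `ldr x0, [x0, x28, lsl #3]`: the load overwrites its own base register.
    addr = regs["x0"] + (regs["x28"] << 3)
    if addr not in mem:
        raise Fault(addr)
    regs["x0"] = mem[addr]

# HYPOTHETICAL table base: the value of x0 before the first load is not shown in the logs.
TABLE_BASE = 0x0000FFF090000000
# x28 = 0x21 is the app value seen in the crash context.
mem = {TABLE_BASE + (0x21 << 3): 0x0}
regs = {"x0": TABLE_BASE, "x28": 0x21}

ldr_x0(regs, mem)        # first execution, before the synchro signal
first_x0 = regs["x0"]

try:
    ldr_x0(regs, mem)    # restart at the same app pc without restoring x0
    fault = None
except Fault as e:
    fault = e.args[0]

print(hex(first_x0), hex(fault))  # 0x0 0x108
```

Because the instruction uses x0 as both base and destination, it cannot simply be re-executed once it has completed.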
Derek's comment: "You would expect this to be marked as a mangling epilogue. Translation in a mangling epilogue is supposed to target the next instruction and "emulate" the rest of the epilogue, as it is sometimes impossible to undo the app instr and thus returning the being-mangled instr PC for restart is not feasible. This makes it look like that is not done correctly for stolen register mangling on AArch64."
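The translation rule described in that comment can be illustrated on the bb from the logs. The following is a toy Python model, not DynamoRIO's translation code: the Instr type, the kind labels, and the translate function are hypothetical, the prologue/epilogue classification is done by hand from the listing, and it assumes each mangling instruction carries the app pc of the instruction it mangles. The exit branch at +60 is omitted.

```python
from typing import NamedTuple

class Instr(NamedTuple):
    offset: int   # offset inside the code cache fragment
    xl8: int      # app pc of the app instr this instr is or belongs to
    kind: str     # "app", "prologue" or "epilogue" (hand-labeled assumption)
    text: str

A = 0xFFFFA6350424  # tag of the bb in the log
BB = [
    Instr(0,  A,      "app",      "ldr x0, [x25, #8]"),
    Instr(4,  A + 4,  "prologue", "str x0, [x28]"),
    Instr(8,  A + 4,  "prologue", "mov x0, x28"),
    Instr(12, A + 4,  "prologue", "ldr x28, [x28, #48]"),
    Instr(16, A + 4,  "app",      "lsl x27, x28, #3"),
    Instr(20, A + 4,  "epilogue", "mov x28, x0"),
    Instr(24, A + 4,  "epilogue", "ldr x0, [x28]"),
    Instr(28, A + 8,  "prologue", "str x1, [x28, #8]"),
    Instr(32, A + 8,  "prologue", "mov x1, x28"),
    Instr(36, A + 8,  "prologue", "ldr x28, [x28, #48]"),
    Instr(40, A + 8,  "app",      "ldr x0, [x0, x28, lsl #3]"),
    Instr(44, A + 8,  "epilogue", "mov x28, x1"),
    Instr(48, A + 8,  "epilogue", "ldr x1, [x28, #8]"),
    Instr(52, A + 12, "app",      "cmp x20, x0"),
    Instr(56, A + 16, "app",      "b.eq 0xffffa635049c"),
]

def translate(bb, offset, honor_epilogue=True):
    """Return (app pc to resume at, epilogue instrs that still need emulating)."""
    i = next(k for k, ins in enumerate(bb) if ins.offset == offset)
    cur = bb[i]
    if cur.kind != "epilogue" or not honor_epilogue:
        # App instr or prologue: the app instr has not completed, restart it.
        return cur.xl8, []
    # Epilogue: the app instr already ran, so resume at the next app instr
    # and report the remaining epilogue instrs to be emulated.
    pending = [ins.text for ins in bb[i:]
               if ins.kind == "epilogue" and ins.xl8 == cur.xl8]
    next_app = next(ins for ins in bb[i:] if ins.kind == "app")
    return next_app.xl8, pending

pc, pending = translate(BB, 48)
naive_pc, _ = translate(BB, 48, honor_epilogue=False)
print(hex(pc), pending, hex(naive_pc))
```

In this model, honoring the epilogue at offset +48 resumes at 0xffffa6350430 with ldr x1, [x28, #8] left to emulate, while ignoring it yields 0xffffa635042c, the same target d_r_dispatch shows in the log. That is consistent with the epilogue not being recognized, though the model cannot confirm the actual cause.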