Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle signals properly #529

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ChinYikMing
Copy link
Collaborator

@ChinYikMing ChinYikMing commented Dec 26, 2024

flowchart TB
    X[page faults] --> trap --> do_page_fault --> lock_mm_and_find_vma --> find_vma
    find_vma -- valid VMA --> A[Map the VMA to page table and resume the userspace]
    find_vma -- NULL --> bad_area_nosemaphore
    bad_area_nosemaphore -- from kernel space --> die_kernel_fault
    bad_area_nosemaphore -- from user space --> do_trap --> B[Checks whether the signal should be ignored or blocked, and if not, adds it to the task's pending signal list.]
                         
    ret_from_exception --> resume_userspace_slow --> do_work_pending -- determined by _TIF_WORK_MASK --> do_signal --> handle_signal --> setup_rt_frame --> C[Update SEPC CSR if needed] --> resume_userspace
Loading

The above flowchart is the simplified version of handling page faults.

Page faults trigger a trap, which is handled by do_page_fault(). This function calls lock_mm_and_find_vma() to locate and validate the virtual memory area (VMA), returning the VMA if valid, or NULL otherwise. Typically, attempts to read or write to a NULL VMA result in a NULL return. If the VMA is invalid, bad_area_nosemaphore() is invoked, which checks whether the fault originated in kernel or user space.

For user-space faults, a SIGSEGV signal is sent to the user process via do_trap(), which determines if the signal should be ignored or blocked, and if not, adds it to the task's pending signal list. Kernel-space faults cause the kernel to crash via die_kernel_fault().

Before returning to user space (via the resume_userspace label), pending work (indicated by the _TIF_WORK_MASK mask) is processed by do_work_pending(). Signals are handled by do_signal(), which in turn calls handle_signal(). handle_signal() creates a signal handler frame that will be jumped to upon returning to user space. This frame creation process might modifies the Control and Status Register (CSR) SEPC. If there are a signal pending, the SEPC CSR overwritten the original trap/fault PC. This caused an assertion failure in get_ppn_and_offset() when running the vi program, reported in [1].

To address this, a variable last_csr_sepc was introduced to store the original SEPC CSR value before entering the trap path. After returning to user space, last_csr_sepc is compared with the current SEPC CSR value. If they differ, the fault ld/st instruction returns early and jumps to the signal handler frame.

This commit prevents emulator crashes when the guest OS accesses invalid memory. Consequently, reads or writes to a NULL value now correctly result in a segmentation fault. In addition, two user-space programs: mem_null_read and mem_null_write are bundled into the rootfs for verification.

mem_null_read and mem_null_write source codes:

mem_null_read

#include <stdio.h>

int main(){
	char *ptr = NULL;
	printf("*ptr: %c\n", *ptr);
	return 0;
}

mem_null_write

#include <stdio.h>

int main(){
        char *ptr = NULL;
        *ptr = 'a';

        return 0;
}

Original behaviour

$ make system ENABLE_SYSTEM=1 -j$(nproc)
  1. Emulator crashes
$ mem_null_read
  1. Emulator crashes
$ mem_null_write
  1. Emulator crashes
$ vi

Patch Reproduce / Testing procedure:

$ make system ENABLE_SYSTEM=1 -j$(nproc)
  1. NULL read causes SIGSEGV without crashing emulator
$ mem_null_read
  1. NULL write causes SIGSEGV without crashing emulator
$ mem_null_write
  1. w/o filename causes SIGSEGV without crashing emulator
$ vi

[1] #508

vi Crach Discussion

Running vi without a filename results in a NULL pointer access (behavior that is implementation-defined), correctly triggering a segmentation fault. Dmesg confirms that the faulting address is indeed NULL. However, running vi without a filename in RVVM and semu does not produce a SIGSEGV fault. My investigation revealed that semu simply increments the program counter (PC) by 4, effectively skipping the faulting instruction. I'm unsure if this is a valid approach. What I know is that read/write access to NULL should generate a non-maskable SIGSEGV signal, consistent with standard Unix signal.

Dmesg after running vi (getting SIGSEGV instead of crashing w/ the patch):

[   10.374478] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[   10.374812] Oops [#2]
[   10.374962] Modules linked in:
[   10.375152] CPU: 0 PID: 61 Comm: vi Tainted: G      D            6.1.119 #1
[   10.375504] Hardware name: rv32emu (DT)
[   10.375696] epc : strncpy_from_user+0x6c/0x190
[   10.375986]  ra : getname_flags+0x74/0x194
[   10.376269] epc : c01bd3a4 ra : c00ba71c sp : c0989ea0
[   10.376553]  gp : c0476320 tp : c0aa0580 t0 : 00000ff0
[   10.376831]  t1 : fefefeff t2 : 691a2420 s0 : c0989eb0
[   10.377115]  s1 : c0882000 a0 : 00000000 a1 : 00000000
[   10.377377]  a2 : 00000ff0 a3 : 00000000 a4 : 00000000
[   10.377632]  a5 : 00000ff0 a6 : 0000146d a7 : c0882010
[   10.377904]  s2 : c0989f38 s3 : 00000000 s4 : 00000000
[   10.378167]  s5 : c047752c s6 : 00000000 s7 : 00000000
[   10.378429]  s8 : 00001000 s9 : 00000002 s10: 00000014
[   10.378688]  s11: ffffffff t3 : 80808080 t4 : 00040000
[   10.378968]  t5 : 00000005 t6 : 00000ff0
[   10.379174] status: 00040120 badaddr: 00000000 cause: 0000000d
[   10.379455] [<c01bd3a4>] strncpy_from_user+0x6c/0x190
[   10.379790] [<c00ba71c>] getname_flags+0x74/0x194
[   10.380117] [<c00ba88c>] getname+0x1c/0x2c
[   10.380421] [<c00a9f7c>] do_sys_openat2+0x4c/0xf0
[   10.380776] [<c00aa11c>] do_sys_open+0x40/0x58
[   10.381122] [<c00aa194>] sys_openat+0x24/0x34
[   10.381467] [<c00023c0>] ret_from_syscall+0x0/0x4
[   10.381788] ---[ end trace 0000000000000000 ]---
Segmentation fault
#

Copy link
Contributor

@jserv jserv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Benchmarks

Benchmark suite Current: 6542e92 Previous: c9510ce Ratio
Dhrystone 1342 Average DMIPS over 10 runs 1296 Average DMIPS over 10 runs 0.97
Coremark 970.051 Average iterations/sec over 10 runs 971.107 Average iterations/sec over 10 runs 1.00

This comment was automatically generated by workflow using github-action-benchmark.

src/emulate.c Show resolved Hide resolved
@ChinYikMing ChinYikMing force-pushed the handle-signal branch 2 times, most recently from 6a5733b to b4c467f Compare December 27, 2024 03:39
src/system.c Outdated Show resolved Hide resolved
@ChinYikMing ChinYikMing force-pushed the handle-signal branch 2 times, most recently from 27a2c7e to c9510ce Compare December 27, 2024 04:00
Page faults trigger a trap, which is handled by do_page_fault(). This
function calls lock_mm_and_find_vma() to locate and validate the virtual
memory area (VMA), returning the VMA if valid, or NULL otherwise.
Typically, attempts to read or write to a NULL VMA result in a NULL
return. If the VMA is invalid, bad_area_nosemaphore() is invoked, which
checks whether the fault originated in kernel or user space.

For user-space faults, a SIGSEGV signal is sent to the user process via
do_trap(), which determines if the signal should be ignored or blocked,
and if not, adds it to the task's pending signal list. Kernel-space
faults cause the kernel to crash via die_kernel_fault().

Before returning to user space (via the resume_userspace label), pending
work (indicated by the _TIF_WORK_MASK mask) is processed by
do_work_pending(). Signals are handled by do_signal(), which in turn
calls handle_signal(). handle_signal() creates a signal handler frame
that will be jumped to upon returning to user space. This frame creation
process might modifies the Control and Status Register (CSR) SEPC.
If there are a signal pending, the SEPC CSR overwritten the original
trap/fault PC. This caused an assertion failure in get_ppn_and_offset()
when running the vi program, reported in [1].

To address this, a variable last_csr_sepc was introduced to store the
original SEPC CSR value before entering the trap path. After returning
to user space, last_csr_sepc is compared with the current SEPC CSR
value. If they differ, the fault ld/st instruction returns early and
jumps to the signal handler frame.

This commit prevents emulator crashes when the guest OS accesses invalid
memory. Consequently, reads or writes to a NULL value now correctly
result in a segmentation fault. In addition, two user-space programs:
mem_null_read and mem_null_write are bundled into the rootfs for
verification.

Original behaviour
1. $ make system ENABLE_SYSTEM=1 -j$(nproc)
2. $ mem_null_read                            # Emulator crashes
3. $ mem_null_write                           # Emulator crashes
4. $ vi                                       # Emulator crashes

Patch Reproduce / Testing procedure:
1. $ make system ENABLE_SYSTEM=1 -j$(nproc)
2. $ mem_null_read       # NULL read causes SIGSEGV without crashing
3. $ mem_null_write      # NULL write causes SIGSEGV without crashing
4. $ vi                  # w/o filename causes SIGSEGV without crashing

[1] sysprog21#508
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants