Commit 8289b691 authored by Jeremy Soller

Merge branch 'rsoc_2022' into 'master'

Drivers and kernel part 7.

See merge request !263
parents 84edf939 6703e6d9
Pipeline #10473 passed
@@ -6,8 +6,8 @@ date = "2022-07-08T13:37:00+02:00"
 # Introduction
-So far it's been two weeks since my third RSoC, where so far I have mainly
-worked on moving large parts of kernel code, which deal with process
+So far it's been two weeks since my third RSoC began, where so far I have
+mainly worked on moving large parts of kernel code, which deal with process
 management, to userspace. In this blog post I will try summarizing what has
 been accomplished so far, and exciting things I have started but not finished.
+++
title = "RSoC: improving drivers and kernel - part 7"
author = "4lDO2"
date = "2022-08-16T12:00:00+02:00"
+++
# Introduction
In my last blog post, I introduced the `userspace_fexec`/`userspace_clone`
features. As the names suggest, they move the inherently complex
implementations of `fork(3)` and `execve(2)` from the kernel into relibc,
giving userspace much more freedom while simplifying the kernel. There has been
considerable progress since the last post; __the features
`userspace_fexec`/`userspace_clone`, `userspace_initfs`, and
`userspace_cwd` have now all been merged!__
## RMM
After having thoroughly debugged the orbital/orblogin memory corruption bug
with little success, I decided to go as far as phasing out the old paging code
(`ActivePageTable`/`InactivePageTable`/`Mapper` etc.) in favor of RMM (Redox
Memory Manager). Surprisingly, this fixed the bug entirely in the process. It
turns out the issue was simply that parent page tables were not properly
unmapped (causing use-after-free), most likely due to the coexistence of RMM
and the old paging code, which did not agree on how the number of page table
entries was counted.
## Userspace initfs
I mentioned moving initfs to userspace as a TODO in the last post. The changes
required were very simple: rather than having the bootloader pass a physical
address range containing the initfs image, and then letting the kernel load
bootstrap from within the filesystem, the kernel now simply loads a
"bootstrap/initfs blob" into (virtual) memory at 0x0, and jumps to an address
provided by the bootloader. The bootloader loads all of `/kernel`,
`/bootstrap`, and `/initfs`, the latter two of which are placed adjacently in
the physical address space.
This also means `bootstrap` will now fork, with one side becoming a scheme
handler serving `initfs:` from the initfs memory and the other executing init.
## Userspace cwd and userspace path canonicalization
Redox used to expose two system calls, `chdir` and `getcwd` (also a TODO from
the last post), which set and get the current working directory, respectively
(identical to POSIX). `chdir` would modify an internal `cwd` string in each
kernel context, used for canonicalizing paths while handling path-based
syscalls (`open`, `chmod` (now removed in favor of `fchmod`), `unlink`, and
`rmdir`), allowing e.g.
`open("./foo", O_RDONLY) => open("file:/path/to/foo", O_RDONLY)`. Now that
`userspace_cwd` is merged, however, the kernel will only allow
already-canonicalized paths, i.e. enforce that both the scheme name and path
are present. Hence, relibc will instead canonicalize the paths itself, and
`chdir`/`getcwd` are implemented simply by accessing a global variable
(although `sigprocmask` is run before and after). This global variable is
passed in execve using auxiliary vectors.
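
To make the `./foo` example concrete, here is a minimal sketch of what userspace canonicalization against a cwd string could look like. This is not relibc's actual code: the function name `canonicalize_using_cwd` is hypothetical, and the rules it applies (any path containing `:` is treated as already scheme-qualified, and `..` at the root is silently dropped) are simplifying assumptions.

```rust
/// Hypothetical helper, not relibc's real implementation.
/// `cwd` must already be canonical, i.e. look like "scheme:/some/dir".
/// Returns `None` if `cwd` has no scheme part.
pub fn canonicalize_using_cwd(cwd: &str, path: &str) -> Option<String> {
    let (cwd_scheme, cwd_path) = cwd.split_once(':')?;

    // Assumption: a ':' in `path` means the caller already named a scheme.
    let (scheme, joined) = match path.split_once(':') {
        Some((scheme, rest)) => (scheme, rest.to_string()),
        // Absolute path: keep the scheme of the cwd, ignore its directory.
        None if path.starts_with('/') => (cwd_scheme, path.to_string()),
        // Relative path: append to the cwd's directory.
        None => (cwd_scheme, format!("{}/{}", cwd_path, path)),
    };

    // Collapse empty components, "." and "..".
    let mut parts: Vec<&str> = Vec::new();
    for component in joined.split('/') {
        match component {
            "" | "." => {}
            ".." => {
                parts.pop();
            }
            other => parts.push(other),
        }
    }

    Some(format!("{}:/{}", scheme, parts.join("/")))
}
```

With a cwd of `file:/path/to`, calling this sketch on `./foo` yields `file:/path/to/foo`, which is the form the kernel now requires.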
But most importantly, the `SYS_OPEN` handler in the kernel no longer resolves
cross-scheme symlinks (i.e. it no longer handles EXDEV internally); that logic
has also been moved to relibc. Besides reducing the number of file operations
initiated from the kernel, this reduces the amount of state needed for syscall
handlers, which will be very helpful for a possible syscall multiplexing API
(userspace-to-kernel io_uring).
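
The userspace side of this can be pictured as a retry loop around the raw open call. The sketch below is an illustration only, under several assumptions: `OpenError`, `open_resolving_links`, and `MAX_LINK_DEPTH` are hypothetical names, the `CrossScheme` variant stands in for an EXDEV result that already carries the link target (real code would have to read the target separately), and the depth limit is arbitrary.

```rust
/// Hypothetical error type for this sketch only.
#[derive(Debug, PartialEq)]
pub enum OpenError {
    /// Stand-in for EXDEV: the path is a symlink into another scheme.
    CrossScheme { target: String },
    NotFound,
    TooManyLinks,
}

/// Arbitrary illustrative limit, so that symlink cycles terminate.
const MAX_LINK_DEPTH: usize = 8;

/// Calls `raw_open` (a stand-in for the raw open syscall) and follows
/// cross-scheme links in userspace until a file descriptor or a real
/// error is returned.
pub fn open_resolving_links<F>(path: &str, mut raw_open: F) -> Result<usize, OpenError>
where
    F: FnMut(&str) -> Result<usize, OpenError>,
{
    let mut current = path.to_string();
    for _ in 0..MAX_LINK_DEPTH {
        match raw_open(&current) {
            // The scheme reported a link into another scheme: retry there.
            Err(OpenError::CrossScheme { target }) => current = target,
            // Success or any other error is passed straight to the caller.
            other => return other,
        }
    }
    Err(OpenError::TooManyLinks)
}
```

The point of the shape is that all per-call state (`current`, the iteration count) lives on the caller's stack in userspace, so the kernel handler does not need to keep any of it.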
Hopefully at some point, most if not all syscalls on Redox will be fully
completion-based, i.e. the caller sends a request, waits (if blocking), and
then asynchronously runs completion code (either returning from a blocking
syscall, or in the future pushing an io_uring CQE). In the process, the kernel
may become "stackless", i.e. use the same kernel stack for all processes, and
thereby reduce the memory footprint of contexts (threads) by an order of
magnitude.
# TODO
Luckily, the userspace initfs TODO, and fixing the orbital/orblogin bug, have
both been finished!
On-demand paging is not yet implemented, even though I have written a
large part of it. This would allow optimizing ld.so and relibc's execve, by
calling mmap with CoW in order to load ELF segments, which is especially
important when running rustc on Redox.
The syscall interface has not been reworked either, but there is a clear
need for overcoming the limitations of `Packet` (such as being limited to 4
args per scheme op), so the io_uring SQE format will very likely be used by
scheme handlers soon, with or without using fancy ring buffers for passing
them.
`PTRACE_EVENT_CLONE` is now sent to tracers, although some work is still needed
there, and the `acid ptrace` test should be re-introduced (this was an issue
before userspace_fexec too).
I also tried implementing (basic) KASLR as a side project, and succeeded,
although with a visible performance cost (not sure why). It would also be
interesting to implement regular userspace ASLR in the relibc fexec handler and
possibly `ld.so`.