(optional) Dynamic CPU feature-based optimizations

Jacob Lorentzon requested to merge 4lDO2/kernel:alternative2 into master

Adds basic support for dynamic hot spot code optimization, implemented as an assembly macro (similar to Linux's ALTERNATIVE).
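As a rough illustration of the idea (not the macro from this merge request), the patching step can be modelled in plain Rust as a table of patch sites applied to a byte buffer. Everything named here is hypothetical: the `Features` bit assignments, the `Alternative` layout, and `apply_alternatives`. The sketch also assumes that a replacement never exceeds the space reserved at its site and that leftover bytes are padded with single-byte NOPs.

```rust
/// Set of CPU features, as a bitmask (hypothetical bit assignments).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Features(pub u32);

impl Features {
    pub const SMAP: Features = Features(1 << 0);
    pub const FSGSBASE: Features = Features(1 << 1);
    pub const XSAVEOPT: Features = Features(1 << 2);

    /// True if every bit in `other` is also set in `self`.
    pub fn contains(self, other: Features) -> bool {
        self.0 & other.0 == other.0
    }
}

/// One patch site: if `feature` is available, the `len` bytes at `offset`
/// are overwritten with `replacement`.
pub struct Alternative<'a> {
    pub feature: Features,
    pub offset: usize,
    pub len: usize,
    pub replacement: &'a [u8],
}

/// Single-byte x86 NOP, used to pad a site when the replacement is shorter.
const NOP: u8 = 0x90;

/// Applies every alternative whose feature is present in `cpu`.
/// Sites that do not fit in `code`, or whose replacement is longer than
/// the reserved space, are skipped. Returns the number of patched sites.
pub fn apply_alternatives(code: &mut [u8], cpu: Features, alts: &[Alternative<'_>]) -> usize {
    let mut patched = 0;
    for alt in alts {
        if !cpu.contains(alt.feature) || alt.replacement.len() > alt.len {
            continue;
        }
        let end = match alt.offset.checked_add(alt.len) {
            Some(end) if end <= code.len() => end,
            _ => continue,
        };
        let site = &mut code[alt.offset..end];
        site[..alt.replacement.len()].copy_from_slice(alt.replacement);
        site[alt.replacement.len()..].fill(NOP);
        patched += 1;
    }
    patched
}
```

In a real kernel the buffer would be the kernel's own text, so the pages have to be made writable before the overwrite and restored afterwards; the sketch leaves that out entirely.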

As a result, fsgsbase, XSAVE+XSAVEOPT, and thus AVX+AVX2 become enabled by default if the CPU supports them. Old CPUs lacking SMAP are also supported by default again.

Each feature can be set in config.toml to hard-enabled (the kernel will panic if the CPU lacks that feature), hard-disabled, or auto. The code that performs the runtime patching (it remaps, overwrites, and remaps again) can be disabled at compile time.
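The three settings could be resolved against the detected CPU roughly as follows. This is a minimal sketch: the `Policy` enum, the strings `"always"`, `"never"` and `"auto"`, and the `resolve` function are assumptions made for illustration and are not taken from the actual config.toml schema. The sketch returns an error where the kernel would panic, so the caller decides how to fail.

```rust
/// Per-feature setting (hypothetical names, not the real config schema).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Policy {
    /// Hard-enabled: the feature must be present.
    Always,
    /// Hard-disabled: never use the feature, even if present.
    Never,
    /// Use the feature if and only if the CPU reports it.
    Auto,
}

impl Policy {
    /// Parses a setting string; the accepted spellings are an assumption.
    pub fn parse(s: &str) -> Option<Policy> {
        match s {
            "always" => Some(Policy::Always),
            "never" => Some(Policy::Never),
            "auto" => Some(Policy::Auto),
            _ => None,
        }
    }
}

/// Error for a hard-enabled feature that the CPU lacks.
#[derive(Debug, PartialEq, Eq)]
pub struct MissingFeature(pub &'static str);

/// Decides whether a feature is used, given its policy and whether the
/// CPU supports it. A kernel would panic on the `Err` case.
pub fn resolve(name: &'static str, policy: Policy, cpu_has: bool) -> Result<bool, MissingFeature> {
    match policy {
        Policy::Always if cpu_has => Ok(true),
        Policy::Always => Err(MissingFeature(name)),
        Policy::Never => Ok(false),
        Policy::Auto => Ok(cpu_has),
    }
}
```

With `Policy::Auto`, a CPU without SMAP simply runs the unpatched code path, which matches the behaviour described above for old CPUs.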

There are probably many further optimizations left to do, including --emit-relocs based dynamic re-relocation, e.g. for selecting memcpy/memset implementations. That needs benchmarks first, of course.

This doesn't add support for accessing the upper half of the 256-bit YMM registers from a ptracer, but as the offset is already obtained from CPUID, doing so would be relatively straightforward.
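To give an idea of what that could look like, here is a hedged sketch of reading the upper 128 bits of one YMM register out of a saved XSAVE area. It assumes the standard (non-compacted) XSAVE format, where the AVX state component stores the sixteen upper halves back to back, 16 bytes each, and it takes the component offset as a parameter (on x86 this is what CPUID leaf 0xD, sub-leaf 2 reports in EBX). The function name and constants are hypothetical and not part of this merge request.

```rust
/// Size in bytes of the upper half of one YMM register.
pub const YMM_HI128_LEN: usize = 16;
/// Number of YMM registers in 64-bit mode (YMM0 to YMM15).
pub const YMM_COUNT: usize = 16;

/// Returns the upper 128 bits of `YMM<index>` from a saved XSAVE area.
///
/// `avx_offset` is the byte offset of the AVX state component within the
/// area, assumed to come from CPUID. Returns `None` if the index is out
/// of range or the area is too short.
pub fn ymm_upper_half(xsave_area: &[u8], avx_offset: usize, index: usize) -> Option<[u8; YMM_HI128_LEN]> {
    if index >= YMM_COUNT {
        return None;
    }
    let start = avx_offset.checked_add(index * YMM_HI128_LEN)?;
    let end = start.checked_add(YMM_HI128_LEN)?;
    let bytes = xsave_area.get(start..end)?;
    let mut out = [0u8; YMM_HI128_LEN];
    out.copy_from_slice(bytes);
    Some(out)
}
```

A ptracer-facing interface would combine this with the lower 128 bits, which live in the legacy XMM part of the same area.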

Edited by Jacob Lorentzon
