(optional) Dynamic CPU feature-based optimizations
Adds basic support for dynamic hot spot code optimization, implemented as an assembly macro (similar to Linux's ALTERNATIVE).
As a result, fsgsbase, XSAVE+XSAVEOPT, and thus AVX+AVX2 are now enabled by default if the CPU supports them. Old CPUs lacking SMAP are also supported by default again.
Each feature can be set to hard-enabled (the kernel will panic if the CPU lacks that feature), hard-disabled, or auto, as specified in config.toml. The patching code itself, which remaps, overwrites, and remaps again, can be disabled at compile time.
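To illustrate the idea, here is a minimal sketch of the three feature modes and of a single patch being applied to a byte buffer. It is not the code from this change: the names FeatureMode, resolve, Patch and apply_patch, as well as the patch layout and the NOP padding, are hypothetical, and the remap steps around the overwrite are left out entirely.

```rust
/// How a feature is configured at build time (hypothetical names).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum FeatureMode {
    /// Hard-enabled: panic if the CPU lacks the feature.
    Always,
    /// Hard-disabled: never use the feature, even if present.
    Never,
    /// Use the feature only if the CPU reports support.
    Auto,
}

/// Decide whether the feature-specific code path should be used.
pub fn resolve(mode: FeatureMode, cpu_has: bool, name: &str) -> bool {
    match mode {
        FeatureMode::Always => {
            assert!(cpu_has, "CPU lacks required feature: {}", name);
            true
        }
        FeatureMode::Never => false,
        FeatureMode::Auto => cpu_has,
    }
}

/// One patch site: a byte range in the code image plus the bytes that
/// should replace it (hypothetical layout, for illustration only).
pub struct Patch<'a> {
    pub offset: usize,
    pub len: usize,
    pub alternative: &'a [u8],
}

/// Single-byte x86 NOP, used here to pad a shorter alternative.
const NOP: u8 = 0x90;

/// Overwrite the patch site with the alternative, padding with NOPs.
/// Returns false and leaves `code` untouched if the patch does not fit.
pub fn apply_patch(code: &mut [u8], patch: &Patch) -> bool {
    let end = match patch.offset.checked_add(patch.len) {
        Some(end) => end,
        None => return false,
    };
    if end > code.len() || patch.alternative.len() > patch.len {
        return false;
    }
    let site = &mut code[patch.offset..end];
    site.fill(NOP);
    site[..patch.alternative.len()].copy_from_slice(patch.alternative);
    true
}
```

In this sketch, a patch would only be applied when resolve returns true for the feature it depends on; with patching disabled at compile time, only the default code would ever run.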
Many follow-up optimizations are probably still possible, including --emit-relocs based dynamic re-relocation, e.g. for selecting memcpy/memset implementations. That needs benchmarks first, of course.
This doesn't add support for accessing the upper half of the 256-bit YMM registers from a ptracer, but as the offset is already obtained from CPUID, doing so would be relatively straightforward.
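As a rough sketch of what that could look like, the function below reads the upper 128 bits of one YMM register out of a saved XSAVE area in the standard (non-compacted) format. The function name and parameters are hypothetical; avx_offset stands for the offset the kernel already obtains from CPUID (leaf 0xD, sub-leaf 2, EBX). The sketch assumes the AVX component was actually saved; a real implementation would also need to check the XSTATE_BV bit for that component and return zeros when it is clear.

```rust
/// Number of YMM registers in 64-bit mode.
const YMM_COUNT: usize = 16;
/// Bytes in the upper half (bits 255:128) of one YMM register.
const YMM_HI_BYTES: usize = 16;

/// Return bits 255:128 of register `reg` from a standard-format XSAVE area.
///
/// `avx_offset` is the byte offset of the AVX state component within the
/// area (hypothetical parameter; assumed to come from CPUID).
/// Returns None if `reg` is out of range or the area is too short.
pub fn ymm_upper_half(
    xsave_area: &[u8],
    avx_offset: usize,
    reg: usize,
) -> Option<[u8; YMM_HI_BYTES]> {
    if reg >= YMM_COUNT {
        return None;
    }
    // The upper halves are stored back to back, 16 bytes per register.
    let start = avx_offset.checked_add(reg * YMM_HI_BYTES)?;
    let end = start.checked_add(YMM_HI_BYTES)?;
    let bytes = xsave_area.get(start..end)?;
    let mut out = [0u8; YMM_HI_BYTES];
    out.copy_from_slice(bytes);
    Some(out)
}
```

The lower 128 bits of each register are the corresponding XMM register in the legacy region of the same area, so a ptracer-facing interface would have to combine both halves.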