Simplify x86_64 percpu and GSBASE calculation
Among other things (such as reducing the x86/x86_64 kernel stack alignment to 8 bytes), this reduces percpu overhead by merging .tdata and .tbss into the same page. The GDT is now stored at a fixed offset in each PCR (processor control region, i.e. a "kernel thread control block"), allowing the expected kernel GSBASE value to be calculated cheaply. Paranoid ISRs conditionally run SWAPGS again, but safely, which allows context switches inside them (needed by e.g. #DB).
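
The GSBASE calculation described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the code from this change: the `Pcr` layout, the `GDT_OFFSET` value, the `GsState` type, and the function names are all hypothetical, and the privileged operations (SGDT, SWAPGS, reading GSBASE) are simulated with plain fields so the logic can be run anywhere.

```rust
/// Hypothetical PCR layout; the real field set and order are assumptions.
/// `repr(C)` keeps the GDT at a predictable, fixed offset from the PCR base.
#[repr(C)]
struct Pcr {
    self_ptr: u64,         // assumed: pointer back to this PCR
    user_rsp_scratch: u64, // assumed: scratch slot used on syscall entry
    gdt: [u64; 8],         // the per-CPU GDT embedded in the PCR
}

/// Byte offset of `gdt` inside `Pcr` (two u64 fields precede it).
const GDT_OFFSET: u64 = 16;

/// Given the GDT base address (what SGDT would report on real hardware),
/// recover the PCR base, which is the GSBASE value the kernel expects.
fn expected_kernel_gsbase(gdt_base: u64) -> u64 {
    gdt_base - GDT_OFFSET
}

/// Simulated GS state: the active GSBASE and the value held in the
/// IA32_KERNEL_GS_BASE MSR. SWAPGS exchanges the two.
struct GsState {
    gsbase: u64,
    kernel_gsbase_msr: u64,
}

impl GsState {
    fn swapgs(&mut self) {
        core::mem::swap(&mut self.gsbase, &mut self.kernel_gsbase_msr);
    }
}

/// Model of a paranoid ISR entry: swap only if the active GSBASE is not
/// already the kernel's. Returns whether a swap happened, so the exit
/// path would know whether to swap back.
fn paranoid_entry(gs: &mut GsState, gdt_base: u64) -> bool {
    if gs.gsbase != expected_kernel_gsbase(gdt_base) {
        gs.swapgs();
        true
    } else {
        false
    }
}
```

In this model, calling `paranoid_entry` while a user GSBASE is active swaps once and reports `true`, while calling it again with the kernel GSBASE already active leaves the state untouched. Comparing against a value derived from the GDT base, rather than from saved state, is what makes the check independent of which context was interrupted.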
Edited by Jacob Lorentzon