Implement TLB shootdown
This implements basic TLB shootdown, fixing acid tlb and hopefully some other difficult-to-explain bugs where certain programs were executing zeroes. Every place where memory is deallocated now properly awaits TLB shootdown on the other affected CPUs (although I'm less sure that everything is currently valid when moving pages).
It's not optimized yet, and may result in some initial slowdown. It only interrupts and waits for CPUs that were using the affected address space, but it doesn't send the individual pages to those CPUs. Instead, it merely signals that there are invalid pages, which forces them to reload CR3 (slower than per-page INVLPG for up to 32 pages, according to Linux).
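To illustrate the approach described above, here is a minimal userspace sketch, not the kernel code from this change. It assumes a per-address-space bitmask of the CPUs currently using it; `AddrSpace`, `used_by`, `shootdown`, and `handle_shootdown_ipi` are hypothetical names, and the IPI and the CR3 reload are simulated with a callback and a counter.

```rust
use std::sync::atomic::{AtomicU64, AtomicUsize, Ordering};

/// Hypothetical bookkeeping: bit N is set while CPU N has this
/// address space loaded.
pub struct AddrSpace {
    used_by: AtomicU64,
}

impl AddrSpace {
    pub fn new() -> Self {
        Self { used_by: AtomicU64::new(0) }
    }

    /// Called when `cpu` switches to this address space.
    pub fn activate(&self, cpu: u32) {
        self.used_by.fetch_or(1u64 << cpu, Ordering::SeqCst);
    }

    /// Called when `cpu` switches away from this address space.
    pub fn deactivate(&self, cpu: u32) {
        self.used_by.fetch_and(!(1u64 << cpu), Ordering::SeqCst);
    }

    /// CPUs that must be interrupted: every user of the address space
    /// except the CPU that initiated the unmap.
    pub fn shootdown_targets(&self, initiator: u32) -> Vec<u32> {
        let mask = self.used_by.load(Ordering::SeqCst) & !(1u64 << initiator);
        (0..64u32).filter(|&cpu| (mask & (1u64 << cpu)) != 0).collect()
    }
}

/// Simulated shootdown: notify each target, then wait until all of them
/// have acknowledged. `send_ipi` stands in for a real inter-processor
/// interrupt. Returns the number of CPUs that were interrupted.
pub fn shootdown(
    space: &AddrSpace,
    initiator: u32,
    send_ipi: impl Fn(u32, &AtomicUsize),
) -> usize {
    let targets = space.shootdown_targets(initiator);
    let pending = AtomicUsize::new(targets.len());
    for &cpu in &targets {
        send_ipi(cpu, &pending);
    }
    // The freed memory must not be reused until every target has flushed.
    while pending.load(Ordering::SeqCst) != 0 {
        std::hint::spin_loop();
    }
    targets.len()
}

/// Simulated remote handler: no page list is passed, so the target does
/// a full flush (a CR3 reload in the real kernel), then acknowledges.
pub fn handle_shootdown_ipi(full_flushes: &AtomicUsize, pending: &AtomicUsize) {
    full_flushes.fetch_add(1, Ordering::SeqCst); // stands in for reloading CR3
    pending.fetch_sub(1, Ordering::SeqCst);
}
```

In this sketch the handler runs synchronously inside the `send_ipi` callback, so the wait loop exits immediately; in a real kernel the acknowledgement would arrive asynchronously from the other CPUs' interrupt handlers. Passing a page list instead of forcing a full flush would be the natural place for the per-page INVLPG optimization mentioned above.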
There are two places where ignoring an INVLPG (per page) is unnecessary, but where doing so amplifies a userspace race condition (which is likely an invalid-state-caused race condition, as gdb reports all CPUs as running the idle interrupt::halt loop).
Apart from possible performance issues, this fixes #136.