Skip to content

Four distinct fixes for significant incorrectness bugs, as three small commits

Derick Eddington requested to merge kcired/relibc:derick/fixes into master

Four distinct fixes for significant incorrectness bugs, as three small commits:

  1. Rwlock::acquire_write_lock would block indefinitely, incorrectly, in futex_wait(), when all other threads already did their futex_wake() beforehand. This could cause a thread to become stuck forever, incorrectly, which would break whatever applications use Rwlock (which is many, via your pthreads implementation which uses it).

    I encountered this bug with tests/bins_static/pthread/rwlock_randtest sometimes becoming stuck blocking forever. My investigation in GDB lead me to conclude the following (but I'm not 100% certain this is what was actually happening): With a subset of the threads (sometimes only one) doing pthread_rwlock_wrlock() near the same time as the other threads finishing, this caused the soon-to-be-stuck thread(s) to initially see the rwlock as having the "writer-waiting" bit as either 0 or 1, but then another thread attempts to acquire a write lock and then maybe releases it, which changes the "writer-waiting" bit to the opposite value, which causes the soon-to-be-stuck thread(s)' CAS to fail (and also see the rwlock state as having zero current lockers) which causes them to go on to wait incorrectly assuming they'll be woken, but the other threads won't be indicating any waking anymore because they already did that and finished before the now-stuck thread(s) wait.

    The fix is simply to not wait when the rwlock state indicated zero lockers (when the CAS failed, which can only happen when the "writer-waiting" bit changed). In this case, it's possible that a wake from another thread is never done again, because with zero lockers it's possible the rwlock is never unlocked by another thread again. When there were zero lockers but the "writer-waiting" bit changed, it makes sense to immediately retry the CAS, which is another consequence of not waiting in this case.

    Apparently, this race condition does not happen in your CI pipeline, which I'm guessing is why you hadn't noticed it before. I encountered the bug in my NixOS host and with my build of relibc there as "debug" (in contrast to your usual "release" build). I suspect that differences in your CI's kernel version, environment, and "release" build, versus my NixOS kernel version, environment, and "debug" build, are why the race condition plays out differently for me. I was encountering this bug with rwlock_randtest in something like 25% of my local runs in my NixOS host. With my fix, the bug never happened again in many thousands of runs over hours.

  2. header::grp::parse_grp, which is responsible for parsing /etc/group:

    1. parse_grp would cause memory corruption, and so allocator corruption, and so often seg-fault. This was due to the use of unsafe code incorrectly using the old backing memory of a Vec after it had been reallocated and moved to a different allocation thus invalidating the old allocation. The fix is simply to use safe code that leads to an approach that doesn't have this problem.

    2. parse_grp also would incorrectly compare buf.len() < buf.len() which caused it to fail to detect the ERANGE error case when it should have, which would cause a panic in the subsequent buf[..strings.len()] slice indexing. The fix is simply to do the intended buf.len() < strings.len().

    I encountered these two bugs because NixOS has an unusually long entry in /etc/group for its nixbld group. This caused the Vec to need to be grown and so reallocated, which caused the memory corruption and seg-fault crashes, which I'm guessing is not happening in your CI environment. This also caused tests/bins_static/grp/getgrgid_r to panic with its char buf[100] being too small for the nixbld group (but ERANGE should've been returned instead of crashing).

  3. getgrgid_r would return ENOENT and not assign *result when the requested entry was not found, but POSIX and Linux say it's supposed to return 0 and assign *result = NULL in that case. Or, on error, it would return an error code without assigning *result = NULL, but it's supposed to also assign NULL on error. The fix is simply to make it follow POSIX.

    Further, getgrgid_r was not propagating the ERANGE error case even though parse_grp already detects and indicates it. This ERANGE case is probably the most useful to be able to distinguish, so I think it should be propagated as POSIX suggests it be. While the previous behavior of returning EINVAL for any failure of parse_grp might conform to a loose interpretation of POSIX, it's not as useful as following what POSIX suggests for using different error codes for different error cases. Further, I added handling of the Error::Misc variant, even though it's currently unused, because that seems more robust for the future in case it becomes used, and because it was easy to add handling of this now which is complete and should be valid into the future.

Edited by Derick Eddington

Merge request reports