Optimise `memcmp` for speed
I saw that other parts of the string module iterate over `usize`-sized words
to increase iteration speed. In this patch I apply the same logic to `memcmp`.
With this change I measured a 7x speedup for `memcmp` on a ~1MB buffer
(comparing two buffers with the same content) on my machine (i7-7500U), but I
did not do any real-world benchmarking for the change. The increase in speed
comes with the tradeoff of both increased complexity and larger generated
assembly code for the function.
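
For readers unfamiliar with the technique, here is a minimal sketch of a word-at-a-time comparison. It is not the code from this patch: it works on safe slices instead of raw pointers, and the names `memcmp_bytes`, `memcmp_words` and `load_word` are made up for illustration.

```rust
use std::mem::size_of;

const WORD: usize = size_of::<usize>();

/// Byte-by-byte comparison (hypothetical stand-in for the old implementation).
/// Compares up to the shorter of the two slices.
fn memcmp_bytes(a: &[u8], b: &[u8]) -> i32 {
    for (x, y) in a.iter().zip(b.iter()) {
        if x != y {
            return *x as i32 - *y as i32;
        }
    }
    0
}

/// Reads one native-endian `usize` from the first `WORD` bytes of `bytes`.
fn load_word(bytes: &[u8]) -> usize {
    let mut buf = [0u8; WORD];
    buf.copy_from_slice(&bytes[..WORD]);
    usize::from_ne_bytes(buf)
}

/// Word-at-a-time comparison (illustrative only, not the patch itself).
fn memcmp_words(a: &[u8], b: &[u8]) -> i32 {
    let n = a.len().min(b.len());
    let mut i = 0;
    // Fast path: compare whole words while at least one full word remains.
    while i + WORD <= n {
        if load_word(&a[i..]) != load_word(&b[i..]) {
            // The words differ, so find the first differing byte inside them.
            return memcmp_bytes(&a[i..i + WORD], &b[i..i + WORD]);
        }
        i += WORD;
    }
    // Slow path: compare the remaining tail (fewer than WORD bytes).
    memcmp_bytes(&a[i..n], &b[i..n])
}
```

The word comparison only tests for equality, so byte order does not matter there; the ordering result always comes from the byte-wise fallback.
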
I tested the correctness of the implementation by generating two randomly filled
buffers and comparing the `memcmp` result of the old implementation against this
new one.
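
A differential test of this kind could look roughly like the sketch below. It is not the harness I used: the xorshift generator, the `Cmp` alias and `implementations_agree` are hypothetical names chosen to keep the example free of external crates, and it compares only the sign of the results.

```rust
/// Tiny xorshift64 generator so the sketch needs no external crates.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Fills `buf` with pseudo-random bytes.
    fn fill(&mut self, buf: &mut [u8]) {
        for byte in buf.iter_mut() {
            *byte = (self.next_u64() >> 24) as u8;
        }
    }
}

/// Signature shared by the two implementations under test (an assumption).
type Cmp = fn(&[u8], &[u8]) -> i32;

/// Returns true if `old` and `new` agree in sign on every generated pair.
fn implementations_agree(old: Cmp, new: Cmp, rounds: usize, len: usize, seed: u64) -> bool {
    let mut rng = XorShift64(seed | 1); // xorshift state must be non-zero
    let mut a = vec![0u8; len];
    let mut b = vec![0u8; len];
    for _ in 0..rounds {
        rng.fill(&mut a);
        rng.fill(&mut b);
        if old(&a, &b).signum() != new(&a, &b).signum() {
            return false;
        }
    }
    true
}
```

Fully random buffers usually differ within the first few bytes, so a variant that copies one buffer and flips a single byte at a random offset would exercise the word loop more.
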
I ran the tests and currently three of them fail:
- netdb (fails to run)
- stdio/rename (fails to verify)
- unistd/pipe (fails to verify)
They fail regardless of this change, so I don't think the failures are related.