From e10a34635628fab3bd383e2d17c89dcd6be75b15 Mon Sep 17 00:00:00 2001
From: Benedikt Rascher-Friesenhausen
 <benediktrascherfriesenhausen+git@gmail.com>
Date: Sun, 7 Oct 2018 10:25:19 +0200
Subject: [PATCH] Optimise `memcmp` for speed

Other parts of the `string` module iterate over `usize`-sized words instead of
single bytes to speed up their loops.  This patch applies the same approach to
`memcmp`.  On my machine (i7-7500U) I measured a 7x speedup when comparing two
~1MB buffers with identical content, but I did not do any real-world
benchmarking.  The speedup comes at the cost of a more complex implementation
and larger generated assembly code for the function.
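
As a rough illustration of the word-wise idea, here is a minimal sketch on safe
slices.  It is not the code in this patch: `memcmp_words` is a hypothetical
name, it compares only the common prefix of the two slices, and it assembles
each word with `usize::from_ne_bytes` so that alignment does not matter.

```rust
use std::convert::TryInto;
use std::mem::size_of;

/// Hypothetical helper (not part of the patch): compares the common prefix of
/// two byte slices one machine word at a time, then falls back to bytes.
pub fn memcmp_words(s1: &[u8], s2: &[u8]) -> i32 {
    const WORD: usize = size_of::<usize>();
    let n = s1.len().min(s2.len());
    let (s1, s2) = (&s1[..n], &s2[..n]);

    // Fast path: skip whole words for as long as they are equal.
    let mut off = 0;
    while n - off >= WORD {
        let a = usize::from_ne_bytes(s1[off..off + WORD].try_into().unwrap());
        let b = usize::from_ne_bytes(s2[off..off + WORD].try_into().unwrap());
        if a != b {
            break; // the first differing byte lies inside this word
        }
        off += WORD;
    }

    // Slow path: scan the differing word (or the tail) byte by byte.
    for i in off..n {
        if s1[i] != s2[i] {
            return s1[i] as i32 - s2[i] as i32;
        }
    }
    0
}
```

Whole words are only tested for equality.  The sign of the result always comes
from a byte-wise scan, so it does not depend on the machine's endianness.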

I tested the correctness of the implementation by generating two randomly filled
buffers and comparing the `memcmp` result of the old implementation against this
new one.
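
Such a comparison can be sketched roughly as below.  This is an illustration
under assumptions, not the harness I actually used: `memcmp_bytes`,
`XorShift64` and `first_disagreement` are hypothetical names, the generator is
a plain xorshift64 so that no external crate is needed, and the second buffer
is a copy of the first with one byte overwritten so that the first difference
lands at a random offset.

```rust
/// Byte-by-byte reference, mirroring the loop that the patch removes.
pub fn memcmp_bytes(s1: &[u8], s2: &[u8]) -> i32 {
    for (a, b) in s1.iter().zip(s2.iter()) {
        if a != b {
            return *a as i32 - *b as i32;
        }
    }
    0
}

/// Hypothetical xorshift64 generator; the state must never be zero.
pub struct XorShift64(pub u64);

impl XorShift64 {
    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Runs both implementations on `rounds` buffer pairs of `len` bytes and
/// returns the first pair on which their results differ, if any.
pub fn first_disagreement(
    reference: fn(&[u8], &[u8]) -> i32,
    candidate: fn(&[u8], &[u8]) -> i32,
    rounds: usize,
    len: usize,
    seed: u64,
) -> Option<(Vec<u8>, Vec<u8>)> {
    let mut rng = XorShift64(seed | 1); // force a non-zero state
    for _ in 0..rounds {
        let a: Vec<u8> = (0..len).map(|_| rng.next_u64() as u8).collect();
        let mut b = a.clone();
        if len > 0 {
            // Overwrite one byte; if the new value happens to equal the old
            // one, the buffers stay identical, which covers the equal case.
            let pos = (rng.next_u64() % len as u64) as usize;
            b[pos] = rng.next_u64() as u8;
        }
        if reference(&a, &b) != candidate(&a, &b) {
            return Some((a, b));
        }
    }
    None
}
```

Comparing the exact return values is stricter than C requires, since `memcmp`
only specifies the sign of the result.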

I ran the tests and currently three of them fail:
  - netdb (fails to run)
  - stdio/rename (fails to verify)
  - unistd/pipe (fails to verify)

They fail regardless of this change, so I don't think the failures are related.
---
 src/header/string/mod.rs | 30 ++++++++++++++++++++++++------
 1 file changed, 24 insertions(+), 6 deletions(-)

diff --git a/src/header/string/mod.rs b/src/header/string/mod.rs
index 9a6e4ddd..0efe60f8 100644
--- a/src/header/string/mod.rs
+++ b/src/header/string/mod.rs
@@ -73,14 +73,32 @@ pub unsafe extern "C" fn memchr(s: *const c_void, c: c_int, n: usize) -> *mut c_
 
 #[no_mangle]
 pub unsafe extern "C" fn memcmp(s1: *const c_void, s2: *const c_void, n: usize) -> c_int {
-    let mut i = 0;
-    while i < n {
-        let a = *(s1 as *const u8).offset(i as isize);
-        let b = *(s2 as *const u8).offset(i as isize);
-        if a != b {
+    let (div, rem) = (n / mem::size_of::<usize>(), n % mem::size_of::<usize>());
+    let mut a = s1 as *const usize;
+    let mut b = s2 as *const usize;
+    for _ in 0..div {
+        if *a != *b {
+            for i in 0..mem::size_of::<usize>() {
+                let c = *(a as *const u8).offset(i as isize);
+                let d = *(b as *const u8).offset(i as isize);
+                if c != d {
+                    return c as i32 - d as i32;
+                }
+            }
+            unreachable!()
+        }
+        a = a.offset(1);
+        b = b.offset(1);
+    }
+
+    let mut a = a as *const u8;
+    let mut b = b as *const u8;
+    for _ in 0..rem {
+        if *a != *b {
-            return a as i32 - b as i32;
+            return *a as i32 - *b as i32;
         }
-        i += 1;
+        a = a.offset(1);
+        b = b.offset(1);
     }
     0
 }
-- 
GitLab