A faster implementation of the memcpy family

Jeremy Soller requested to merge pi-pi3:faster-externs into master

Created by: pi-pi3

The default implementations of the memcpy, memmove, memset and memcmp functions in the kernel file are naive: they copy, assign or compare bytes one by one, which can be slow. This commit proposes reimplementing those functions to copy, assign or compare in groups of 8 bytes, using the u64 type and its respective pointers instead of u8. Alternative versions for 32-bit architectures are also supplied for future compatibility with x86. Both versions first copy whatever they can with the wide word type. The tail, i.e. the final few bytes that do not fit in a dword or qword, is then copied byte by byte.
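To illustrate the bulk-then-tail approach described above, here is a minimal sketch of a memcpy-style copy. It is not the code from this merge request: the name `copy_wide` is hypothetical, and the use of `read_unaligned`/`write_unaligned` is an assumption made so the sketch stays sound for arbitrarily aligned pointers.

```rust
use core::mem::size_of;

/// Hypothetical helper for illustration only, not the kernel's memcpy.
/// Copies `n` bytes from `src` to `dest`, eight bytes at a time where possible.
///
/// # Safety
/// `src` must be valid for `n` bytes of reads, `dest` must be valid for
/// `n` bytes of writes, and the two regions must not overlap.
pub unsafe fn copy_wide(dest: *mut u8, src: *const u8, n: usize) -> *mut u8 {
    const WORD: usize = size_of::<u64>();
    // Number of bytes that can be moved as whole 8-byte words.
    let bulk = (n / WORD) * WORD;

    // Bulk: move one u64 per iteration.
    // Assumption: unaligned accesses are used because the caller's
    // pointers are not guaranteed to be 8-byte aligned.
    let mut i = 0;
    while i < bulk {
        let word = (src.add(i) as *const u64).read_unaligned();
        (dest.add(i) as *mut u64).write_unaligned(word);
        i += WORD;
    }

    // Tail: the remaining `n % 8` bytes, one at a time.
    while i < n {
        *dest.add(i) = *src.add(i);
        i += 1;
    }

    dest
}
```

A 32-bit variant along the same lines would presumably swap `u64` for `u32`, so that the bulk loop moves 4 bytes per iteration and the tail covers at most 3 bytes.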

A performance comparison of all functions, in both the 32-bit and 64-bit versions, can be seen in commit c4fc76f.
