Improve deduplication
Created by: ticki
- Remove use of `AtomicOption`: This type is relatively expensive because it adds an allocation indirection. Every entry holds an extra pointer, which effectively doubles the size of the table. Instead, there should be multiple arrays of type `AtomicU64`. However, there are complications with respect to atomicity.
- Halve the fingerprint size through seeding: 128-bit fingerprints are enough if no outside party is able to test for collisions, that is, if the fingerprinter is seeded. This means that to come up with a collision, you must do so naturally or by repeatedly making requests. The UID from the disk header can be used as the seed, or, perhaps better, a new random one can be picked.
Note that it is important not to rely on the fingerprint to look up entries in the table. The candidate should be found through the cheap checksum alone; the fingerprint is only for confirming it.
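As a rough illustration of both points, here is a minimal sketch of what a table built from parallel `AtomicU64` arrays might look like. Everything in it is an assumption made for the example rather than the existing code: the `DedupTable` name, the three-array layout, the use of cluster number 0 as the "empty" marker, the `u32` checksum, and the restriction to a single writer that never overwrites an occupied slot. The fingerprint is taken as an already computed `u128`, so the seeded fingerprinter itself is not shown.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical table layout: three parallel arrays of `AtomicU64`
/// instead of one array of `AtomicOption<Entry>`, so no per-entry allocation.
pub struct DedupTable {
    fp_lo: Vec<AtomicU64>,   // low 64 bits of the 128-bit fingerprint
    fp_hi: Vec<AtomicU64>,   // high 64 bits of the 128-bit fingerprint
    cluster: Vec<AtomicU64>, // cluster number; 0 marks an empty slot (assumption)
}

impl DedupTable {
    pub fn new(slots: usize) -> Self {
        assert!(slots > 0);
        let zeros = |n: usize| (0..n).map(|_| AtomicU64::new(0)).collect::<Vec<_>>();
        DedupTable {
            fp_lo: zeros(slots),
            fp_hi: zeros(slots),
            cluster: zeros(slots),
        }
    }

    /// The slot is derived from the cheap checksum only, never from the fingerprint.
    fn slot(&self, checksum: u32) -> usize {
        checksum as usize % self.cluster.len()
    }

    /// Records `cluster` for this block if the slot is still empty.
    /// Returns `false` if the slot is already occupied.
    /// Assumes a single writer; concurrent writers would need extra coordination.
    pub fn insert(&self, checksum: u32, fingerprint: u128, cluster: u64) -> bool {
        assert!(cluster != 0, "0 is reserved for empty slots in this sketch");
        let i = self.slot(checksum);
        if self.cluster[i].load(Ordering::Acquire) != 0 {
            return false;
        }
        self.fp_lo[i].store(fingerprint as u64, Ordering::Relaxed);
        self.fp_hi[i].store((fingerprint >> 64) as u64, Ordering::Relaxed);
        // Publish last: a reader that observes the cluster number
        // also observes both fingerprint halves written above.
        self.cluster[i].store(cluster, Ordering::Release);
        true
    }

    /// Finds the candidate through the checksum, then confirms it with the fingerprint.
    pub fn lookup(&self, checksum: u32, fingerprint: u128) -> Option<u64> {
        let i = self.slot(checksum);
        let cluster = self.cluster[i].load(Ordering::Acquire);
        if cluster == 0 {
            return None;
        }
        let lo = self.fp_lo[i].load(Ordering::Relaxed) as u128;
        let hi = self.fp_hi[i].load(Ordering::Relaxed) as u128;
        let stored = (hi << 64) | lo;
        if stored == fingerprint {
            Some(cluster)
        } else {
            None
        }
    }
}
```

The sketch sidesteps the atomicity complication rather than solving it. An entry spans three words that cannot be updated as one unit, so the ordering above is only sound while slots are written once and by one thread. Supporting eviction or several writers would need something more, for instance a per-slot sequence counter that readers re-check after loading the fingerprint halves.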