hash + LRC better than CRC?

Started by John Naylor · 8 months ago · 2 messages · hackers
#1 John Naylor
john.naylor@enterprisedb.com

On Mon, Dec 16, 2024 I wrote in [0]:

> So as I understand it the trade-off
> for WAL error detection is:
>
> CRC
> odd: 100%
> even: the collision-avoidance probability of a mediocre hash function
>
> good hash function:
> odd: the collision-avoidance probability of a good hash function
> even: the collision-avoidance probability of a good hash function

Stated this way, it's possible we don't have the best solution, but
it's also not immediately obvious to me that the second way is so much
better that it's worth the effort to change it.

If we did go to a hash function, it'd be ideal to have the collision
guarantees of an "almost universal" hash function. For any two
messages of length at most 'n', the claimed probability of collision
is at most, for example:

VHASH [1]: n * 2**-61
CLHASH [1]: 2.0004 * 2**-64 (for same length strings)
umash [2]: ceil(n / 4096) * 2**-55
polymur hash [3]: n * 2**-60.2

...but these are all 64-bit hashes, and some have further traits that
make them impractical for us. I'm not aware of any 32-bit universal
hashes.

Paul Khuong, the author of UMASH, mentioned two interesting facts to me:

1) It's trivial to reduce a 64-bit universal hash to a 32 (or
whatever)-bit universal hash by composing it with another universal
hash function -- a convenient example is Dietzfelbinger's
multiplicative hash (see section 2.3 in [1]), which is just a multiply
and a shift.

2) An XOR-based longitudinal redundancy check (LRC) [2] also detects
errors of any odd number of bitflips, and is cheap enough to compute
along with the hashing loop such that it's likely free on superscalar
CPUs.
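The two points above can be sketched in C roughly as follows. This is only an illustration, not a proposed implementation: the hash step in the loop is an arbitrary FNV-style placeholder rather than a universal hash, and the odd constant in the Dietzfelbinger reduction is an illustrative choice (it would need to be drawn at random for the universality guarantee to hold):

```c
#include <stdint.h>
#include <stddef.h>

/*
 * Point 1: Dietzfelbinger's multiplicative hash reduces a 64-bit hash
 * to 32 bits with one multiply and one shift.  "a" must be a randomly
 * chosen odd 64-bit constant; the one below is only illustrative.
 */
static inline uint32_t
reduce64to32(uint64_t h64)
{
	const uint64_t a = 0x9e3779b97f4a7c15ULL;	/* arbitrary odd constant */

	return (uint32_t) ((a * h64) >> 32);
}

/*
 * Point 2: an XOR-based LRC accumulated in the same loop as the hash.
 * The hash step here is a placeholder, not a universal hash; the point
 * is that the LRC costs one extra XOR per input unit and runs on an
 * independent dependency chain, so it can overlap with the multiply.
 */
static void
hash_and_lrc(const uint8_t *buf, size_t len, uint32_t *hash, uint8_t *lrc)
{
	uint64_t	h = 0;
	uint8_t		x = 0;

	for (size_t i = 0; i < len; i++)
	{
		h = h * 0x100000001b3ULL + buf[i];	/* placeholder hash step */
		x ^= buf[i];						/* longitudinal parity byte */
	}
	*hash = reduce64to32(h);
	*lrc = x;
}
```

Since the LRC is a plain XOR of all input bytes, flipping any odd number of bits in the input is guaranteed to change it.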

For WAL we could use e.g. a 31-bit hash plus an LRC reduced to a
single parity bit, or keep 32 bits of hash and steal a WAL header
padding byte to store a 1-byte LRC (which would additionally detect
most 2-bit errors and some burst errors (see section 3.1 in [3]) if we
wanted to keep some weaker CRC-type guarantees).
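As a rough illustration of the first option, the 31-bit hash and a parity bit derived from a 1-byte LRC could be packed into one 32-bit word like this (the layout and the function name are hypothetical):

```c
#include <stdint.h>

/*
 * Hypothetical packing: fold the 1-byte LRC down to a single parity
 * bit and store it in the top bit of the 32-bit checksum word, next
 * to 31 bits of hash.
 */
static inline uint32_t
pack_checksum(uint32_t hash32, uint8_t lrc)
{
	uint8_t		p = lrc;

	/* fold the LRC byte down to one parity bit */
	p ^= p >> 4;
	p ^= p >> 2;
	p ^= p >> 1;

	return (hash32 & 0x7fffffffU) | ((uint32_t) (p & 1) << 31);
}
```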

128-bit arithmetic has easier portability than CRC instructions (if we
avoid designs that require carryless multiplication), and it'd be nice
to get consistent performance across platforms, possibly with better
error detection for sector-sized errors. Assuming a suitable function
(and licence) can be found.
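For reference, the building block such hashes lean on is a 64x64 -> 128-bit multiply, which GCC and Clang expose through unsigned __int128 on 64-bit targets (MSVC instead provides the _umul128() intrinsic). A simple multiply-and-fold in that style, purely as a sketch:

```c
#include <stdint.h>

/*
 * 64x64 -> 128-bit multiply, folded back to 64 bits by XOR-ing the
 * high and low halves.  "unsigned __int128" is a GCC/Clang extension;
 * a compiler without it would need an intrinsic such as _umul128()
 * or a fallback built from 32-bit halves.
 */
static inline uint64_t
mul128_fold(uint64_t a, uint64_t b)
{
	unsigned __int128 p = (unsigned __int128) a * b;

	return (uint64_t) p ^ (uint64_t) (p >> 64);
}
```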

[0]: /messages/by-id/CANWCAZYQnppe=XHxXGwYEvuaqx7_v91sHk54kqWYRyinzvhbVA@mail.gmail.com
[1]: https://arxiv.org/abs/1504.06804
[2]: https://en.wikipedia.org/wiki/Longitudinal_redundancy_check
[3]: https://users.ece.cmu.edu/~koopman/pubs/maxino09_checksums.pdf

--
John Naylor
Amazon Web Services

#2 Michael Paquier
michael@paquier.xyz
In reply to: John Naylor (#1)
Re: hash + LRC better than CRC?

On Sun, Aug 31, 2025 at 08:43:20PM +0700, John Naylor wrote:

> 1) It's trivial to reduce a 64-bit universal hash to a 32 (or
> whatever)-bit universal hash by composing it with another universal
> hash function -- a convenient example is Dietzfelbinger's
> multiplicative hash (see section 2.3 in [1]), which is just a multiply
> and a shift.
>
> 2) An XOR-based longitudinal redundancy check (LRC) [2] also detects
> errors of any odd number of bitflips, and is cheap enough to compute
> along with the hashing loop such that it's likely free on superscalar
> CPUs.
>
> For WAL we could use e.g. a 31-bit hash plus an LRC reduced to a
> single parity bit, or keep 32 bits of hash and steal a WAL header
> padding byte to store a 1-byte LRC (which would additionally detect
> most 2-bit errors and some burst errors (see section 3.1 in [3]) if we
> wanted to keep some weaker CRC-type guarantees).
>
> 128-bit arithmetic has easier portability than CRC instructions (if we
> avoid designs that require carryless multiplication),

Hmm. Okay. If this is more efficient and more reliable, why not.
There is no free lunch in this area, so my understanding is that it is
usually a trade-off between efficiency and reliability, and that we
cannot have both of them. Of course, the same applies if one method is
more efficient at the same reliability as another, or more reliable at
the same efficiency.

> and it'd be nice
> to get consistent performance across platforms, possibly with better
> error detection for sector-sized errors. Assuming a suitable function
> (and licence) can be found.

With the number of platforms that Postgres needs to support, this
feels like potentially a lot of work in terms of benchmarking.
I am wondering about some actual numbers, in terms of
micro-benchmarking the computations, of course. If this provides, for
example, a very large performance improvement on Linux and the most
popular BSDs (say a factor of 2x or 3x), that may be worth
considering to me anyway, even if, for example, WIN32 shows no
performance difference for reasons I am not aware of, which would
likely come down to the instructions we have through the MSVC
compilers.

That said, the argument for reducing a 64-bit hash to 32 bits sounds
kind of appealing; still, I would worry about the cost of the extra
computation.
--
Michael