Need clarification about hash_bytes() non-deterministic behaviour between Little Endian and Big Endian
Hello everyone!
Recently I've been looking into hashfn.c and ran into differing output, both when
inspecting local variables in GDB and when running a custom example that calls hash_bytes().
The example is attached; it was compiled and run on a Big Endian machine as follows:
gcc -Wall -Wextra -DWORDS_BIGENDIAN -g -O0 test_hash.c -o test_hash
./test_hash test
before final a=316197320 b=2658358868 c=2658358868
3358672099
./test_hash testtesttest
before final a=3395140076 b=3735912863 c=4252500541
2256767852
Little Endian:
gcc -Wall -Wextra -g -O0 test_hash.c -o test_hash
./test_hash test
before final a=317111240 b=2658358868 c=2658358868
1771415073
./test_hash testtesttest
before final a=572913213 b=3185033534 c=3535459743
547154463
However, the output will be the same if the input bytes
are a palindrome.
Big Endian:
./test_hash deed
before final a=47758264 b=2658358868 c=2658358868
1406051429
Little Endian:
./test_hash deed
before final a=47758264 b=2658358868 c=2658358868
1406051429
After looking inside hash_bytes() I found the reason for this.
When the function takes the word-aligned branch, both the big-endian and the
little-endian code paths perform the same += operation on the variable 'a':
    /* Code path for aligned source data */
    const uint32 *ka = (const uint32 *) k;
    ...
#ifdef WORDS_BIGENDIAN
    ...
        case 4:
            a += ka[0];
            break;
#else /* !WORDS_BIGENDIAN */
    ...
        case 4:
            a += ka[0];
            break;
    ...
#endif /* WORDS_BIGENDIAN */
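In my example ka[0] holds the four bytes of 'test', which fit exactly into 32 bits.
Because the byte order differs, the same four bytes are read as a different integer,
so 'a' gets a different value after this operation.
This is also why palindromes make hash_bytes() return the same value: their bytes
read the same in either order.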
However, if the provided example is compiled without -DWORDS_BIGENDIAN,
hash_bytes() returns the same value on Big Endian as it does on Little Endian.
So my question is: is it necessary for hash_bytes() to return the same result on any
endianness, or am I missing some logic under #ifdef WORDS_BIGENDIAN?
Kind regards,
Ian Ilyasov.
Attachments:
test_hash.c (text/x-csrc)
Ilyasov Ian <ianilyasov@outlook.com> writes:
> So my question is it necessary for hash_bytes() to return the same result on any endianness
No, I don't believe we expect hash functions to produce
machine-independent results.
regards, tom lane