64-bit hash function for hstore and citext data type
Hi all,
Commit[1] has added 64-bit hash functions for core data types and in the same
discussion thread[2] Robert Haas suggested to have the similar extended hash
function for hstore and citext data type. Attaching patch proposes the same.
Regards,
Amul
1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=81c5e46c490e2426db243eada186995da5bb0ba7
2] /messages/by-id/CA+Tgmoafx2yoJuhCQQOL5CocEi-w_uG4S2xT0EtgiJnPGcHW3g@mail.gmail.com
On 2018/09/26 11:20, amul sul wrote:
Hi all,
Commit[1] has added 64-bit hash functions for core data types and in the same
discussion thread[2] Robert Haas suggested to have the similar extended hash
function for hstore and citext data type. Attaching patch proposes the same.Regards,
Amul1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=81c5e46c490e2426db243eada186995da5bb0ba7
2] /messages/by-id/CA+Tgmoafx2yoJuhCQQOL5CocEi-w_uG4S2xT0EtgiJnPGcHW3g@mail.gmail.com
I reviewed citext-add-extended-hash-function-v1.patch and
hstore-add-extended-hash-function-v1.patch.
I could patch and test them without trouble and could not find any issues.
I think both patches are well.
Best regards,
On Wed, Nov 21, 2018 at 10:34 AM Hironobu SUZUKI <hironobu@interdb.jp> wrote:
On 2018/09/26 11:20, amul sul wrote:
Hi all,
Commit[1] has added 64-bit hash functions for core data types and in the same
discussion thread[2] Robert Haas suggested to have the similar extended hash
function for hstore and citext data type. Attaching patch proposes the same.Regards,
Amul1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=81c5e46c490e2426db243eada186995da5bb0ba7
2] /messages/by-id/CA+Tgmoafx2yoJuhCQQOL5CocEi-w_uG4S2xT0EtgiJnPGcHW3g@mail.gmail.comI reviewed citext-add-extended-hash-function-v1.patch and
hstore-add-extended-hash-function-v1.patch.I could patch and test them without trouble and could not find any issues.
Thanks to looking at the patch.
Regards,
Amul
On 9/26/18 12:20 PM, amul sul wrote:
Hi all,
Commit[1] has added 64-bit hash functions for core data types and in the same
discussion thread[2] Robert Haas suggested to have the similar extended hash
function for hstore and citext data type. Attaching patch proposes the same.
I wonder if the hstore hash function is actually correct. I see it
pretty much just computes hash on the varlena representation. The
important question is - can there be two different encodings for the
same hstore value? If that's possible, those two versions would end up
with a different hash value, breaking the hashing scheme.
I'm not very familiar with hstore internals so I don't know if that's
actually possible, but if you look at hstore_cmp, that seems to be far
more complex than just comparing the varlena values directly.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
"Tomas" == Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
Tomas> I wonder if the hstore hash function is actually correct. I see
Tomas> it pretty much just computes hash on the varlena representation.
Tomas> The important question is - can there be two different encodings
Tomas> for the same hstore value?
I was going to say "no", but in fact on closer examination there is an
edge case caused by the fact that hstoreUpgrade allows an _empty_ hstore
from pg_upgraded 8.4 data through without modifying it. (There's also a
vanishingly unlikely case involving the pgfoundry release of hstore-new.)
I'm inclined to fix this in hstoreUpgrade rather than complicate
hstore_hash with historical trivia. Also there have been no field
complaints - I guess it's unlikely that there is much pg 8.4 hstore data
in the wild that anyone wants to hash.
--
Andrew (irc:RhodiumToad)
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
"Tomas" == Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
Tomas> The important question is - can there be two different encodings
Tomas> for the same hstore value?
I was going to say "no", but in fact on closer examination there is an
edge case caused by the fact that hstoreUpgrade allows an _empty_ hstore
from pg_upgraded 8.4 data through without modifying it. (There's also a
vanishingly unlikely case involving the pgfoundry release of hstore-new.)
Ugh. Still, that's a pre-existing problem in hstore_hash, and so I don't
think it's a blocker for this patch.
I'm inclined to fix this in hstoreUpgrade rather than complicate
hstore_hash with historical trivia. Also there have been no field
complaints - I guess it's unlikely that there is much pg 8.4 hstore data
in the wild that anyone wants to hash.
Changing hstoreUpgrade at this point seems like wasted/misguided effort.
I don't doubt that there was a lot of 8.4 hstore data out there, but how
much remains unmigrated? If we're going to take this seriously at all,
my inclination would be to change hstore_hash[_extended] to test for
the empty-hstore case and force the same value it gets for such an
hstore made today.
In the meantime, I went ahead and pushed these patches. The only
non-cosmetic changes I made were to remove the changes in
citext--unpackaged--1.0.sql/hstore--unpackaged--1.0.sql; those
were wrong, because the point of those files is to migrate pre-9.1
databases into the extension system. Such a database would not
contain an extended hash function, and so adding an ALTER EXTENSION
command for that function would cause the script to fail.
regards, tom lane
I wrote:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
I'm inclined to fix this in hstoreUpgrade rather than complicate
hstore_hash with historical trivia. Also there have been no field
complaints - I guess it's unlikely that there is much pg 8.4 hstore data
in the wild that anyone wants to hash.
Changing hstoreUpgrade at this point seems like wasted/misguided effort.
Oh, cancel that --- I was having a momentary brain fade about how that
function is used. I agree your proposal is sensible.
regards, tom lane
"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
I'm inclined to fix this in hstoreUpgrade rather than complicate
hstore_hash with historical trivia. Also there have been no field
complaints - I guess it's unlikely that there is much pg 8.4 hstore
data in the wild that anyone wants to hash.
Changing hstoreUpgrade at this point seems like wasted/misguided effort.
Tom> Oh, cancel that --- I was having a momentary brain fade about how
Tom> that function is used. I agree your proposal is sensible.
Here's what I have queued up to push:
--
Andrew (irc:RhodiumToad)
Attachments:
0001-Fix-hstore-hash-function-for-empty-hstores-upgraded-.patchtext/x-patchDownload+19-29
Thanks Tom for enhancing & committing these patches.
Regards,
Amul
Show quoted text
On Sat, Nov 24, 2018 at 12:15 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
"Tomas" == Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
Tomas> The important question is - can there be two different encodings
Tomas> for the same hstore value?I was going to say "no", but in fact on closer examination there is an
edge case caused by the fact that hstoreUpgrade allows an _empty_ hstore
from pg_upgraded 8.4 data through without modifying it. (There's also a
vanishingly unlikely case involving the pgfoundry release of hstore-new.)Ugh. Still, that's a pre-existing problem in hstore_hash, and so I don't
think it's a blocker for this patch.I'm inclined to fix this in hstoreUpgrade rather than complicate
hstore_hash with historical trivia. Also there have been no field
complaints - I guess it's unlikely that there is much pg 8.4 hstore data
in the wild that anyone wants to hash.Changing hstoreUpgrade at this point seems like wasted/misguided effort.
I don't doubt that there was a lot of 8.4 hstore data out there, but how
much remains unmigrated? If we're going to take this seriously at all,
my inclination would be to change hstore_hash[_extended] to test for
the empty-hstore case and force the same value it gets for such an
hstore made today.In the meantime, I went ahead and pushed these patches. The only
non-cosmetic changes I made were to remove the changes in
citext--unpackaged--1.0.sql/hstore--unpackaged--1.0.sql; those
were wrong, because the point of those files is to migrate pre-9.1
databases into the extension system. Such a database would not
contain an extended hash function, and so adding an ALTER EXTENSION
command for that function would cause the script to fail.regards, tom lane