murmur3 hash binary data migration from Oracle to PostgreSQL
Hi Team,
We are doing Migration from Oracle to PostgreSQL. In SOurce database we
have Binary data stored using murmur3 hashing function. In Oracle this data
is being generated from the Java code and inserted into the Oracle
database.
As part of Migration processes the reference data on which this murmur3 is
generated is also getting changed while migrating to PostgreSQL.
In PostgreSQL do we have any mechanism for fetching this murmur3 hash
function for any UUID.
Please let me know, what ever solution is available for the implementation.
--
*Best Regards,*
Jagmohan
On 25/04/2023 03:21 CEST Jagmohan Kaintura <jagmohan@tecorelabs.com> wrote:
We are doing Migration from Oracle to PostgreSQL. In SOurce database we have
Binary data stored using murmur3 hashing function. In Oracle this data is
being generated from the Java code and inserted into the Oracle database.
Do you store the hash and the binary data? The hash is a key to the binary
data?
As part of Migration processes the reference data on which this murmur3 is
generated is also getting changed while migrating to PostgreSQL.
Why is the data changing during migration? Shouldn't a migration preserve
the data and only adapt it if the database model needs to change?
In PostgreSQL do we have any mechanism for fetching this murmur3 hash
function for any UUID.
I don't understand what you mean by that. What does it have to do with UUID?
Do you want to generate the MurmurHash in Postgres? Postgres has no builtin
support for that hash function and I can't find any extension in a quick
online search.
Or do you want to just look up rows by the MurmurHash? That's a trivial
SELECT statement. Store the hash in an indexed column of type bytea to have
performant lookups.
--
Erik
On Mon, Apr 24, 2023 at 9:22 PM Jagmohan Kaintura
<jagmohan@tecorelabs.com> wrote:
We are doing Migration from Oracle to PostgreSQL. In SOurce database we have Binary data stored using murmur3 hashing function. In Oracle this data is being generated from the Java code and inserted into the Oracle database.
As part of Migration processes the reference data on which this murmur3 is generated is also getting changed while migrating to PostgreSQL.
In PostgreSQL do we have any mechanism for fetching this murmur3 hash function for any UUID.
Please let me know, what ever solution is available for the implementation.
My apologies if I misparsed a couple of statements. I am having
trouble determining if you are migrating away from Murmur3.
If you are selecting a new digest, then SipHash would be an excellent
choice. It was designed to avoid collisions and be fast. Plus it was
designed by Jean-Philippe Aumasson and Daniel J. Bernstein. It doesn't
get much better than those two fellows.
Jeff
Hi ,
No we want to generate murmur3 format only.
BR,
Jagmohan
Show quoted text
On 25-Apr-2023, at 8:52 AM, Jeffrey Walton <noloader@gmail.com> wrote:
On Mon, Apr 24, 2023 at 9:22 PM Jagmohan Kaintura
<jagmohan@tecorelabs.com> wrote:We are doing Migration from Oracle to PostgreSQL. In SOurce database we have Binary data stored using murmur3 hashing function. In Oracle this data is being generated from the Java code and inserted into the Oracle database.
As part of Migration processes the reference data on which this murmur3 is generated is also getting changed while migrating to PostgreSQL.
In PostgreSQL do we have any mechanism for fetching this murmur3 hash function for any UUID.
Please let me know, what ever solution is available for the implementation.
My apologies if I misparsed a couple of statements. I am having
trouble determining if you are migrating away from Murmur3.If you are selecting a new digest, then SipHash would be an excellent
choice. It was designed to avoid collisions and be fast. Plus it was
designed by Jean-Philippe Aumasson and Daniel J. Bernstein. It doesn't
get much better than those two fellows.Jeff
On 25/04/2023 12:44 CEST Jagmohan Kaintura <jagmohan@tecorelabs.com> wrote:
No we want to generate murmur3 format only.
On 25-Apr-2023, at 8:52 AM, Jeffrey Walton <noloader@gmail.com> wrote:
My apologies if I misparsed a couple of statements. I am having
trouble determining if you are migrating away from Murmur3.If you are selecting a new digest, then SipHash would be an excellent
choice. It was designed to avoid collisions and be fast. Plus it was
designed by Jean-Philippe Aumasson and Daniel J. Bernstein. It doesn't
get much better than those two fellows.
I don't understand. In your original message you write:
On 25/04/2023 03:21 CEST Jagmohan Kaintura <jagmohan@tecorelabs.com> wrote:
In Oracle this data is being generated from the Java code and inserted into
the Oracle database.
Do you generate the hash in Java or in Oracle? I don't know if Oracle
Database provides a MurmurHash function but in Java you could use
apache-commons[0]https://commons.apache.org/proper/commons-codec/apidocs/org/apache/commons/codec/digest/MurmurHash3.html and change your Java code to also insert the hash (if you
don't already do so).
PS: Please don't top post. https://wiki.postgresql.org/wiki/Mailing_Lists#Email_etiquette_mechanics
--
Erik
No we want to generate murmur3 format only.
If you need a server-side murmur3 function
- one alternative is the https://github.com/markokr/pghashlib extension.
psql (15.2 (Debian 15.2-2))
Type "help" for help.
db=# create extension hashlib;
CREATE EXTENSION
db=#
select hash_string('', 'murmur3');
select hash_string('a', 'murmur3');
select hash_string('abcdefg', 'murmur3');
select encode(hash128_string('abcdefg', 'murmur3'), 'hex');
+-------------+
| hash_string |
+-------------+
| 0 |
+-------------+
(1 row)
+-------------+
| hash_string |
+-------------+
| 1009084850 |
+-------------+
(1 row)
+-------------+
| hash_string |
+-------------+
| -2009294074 |
+-------------+
(1 row)
+----------------------------------+
| encode |
+----------------------------------+
| 069b3c88000000000000000000000000 |
+----------------------------------+
(1 row)
In my test, I have used this fork:
https://github.com/bgdevlab/pghashlib/tree/bgdevlab/builds
RUN mkdir -p /pghashlib \
&& git clone --depth 1 --branch bgdevlab/builds
https://github.com/bgdevlab/pghashlib.git /pghashlib \
&& cd /pghashlib \
&& make USE_PGXS=1 \
&& make USE_PGXS=1 install \
&& rm -rf /pghashlib
Regards,
Imre