Merging two database dumps
I have a situation with multiple postgres servers running all with the same
databases and table structure. I need to periodically export the data from
each of there then merge them all into a single server. On occasion, it's
feasible for the same record (primary key) to be stored in two or more
servers
I was using pgdump without the --insert option however I just noticed that
pgrestore will stop inserting into a table when the conflict occurs,
leaving me with an incomplete set.
Question is what are my other options to skip over the conflicting record
when merging?
From the docs, it appears that making dumps with the --insert option may be
the only way to go however performance is an issue. In this case would
dropping all indexes help?
Hi Alex,
For storing duplicate rows, dropping primary and unique indexes is the only
way.
One alternative is create a column with timestamp which updates on every
insert/update so that timestamp will be primary. Hope it helps.
Regards,
Pavan
On Wed, Jun 13, 2018, 4:47 PM Alex O'Ree <spyhunter99@gmail.com> wrote:
Show quoted text
I have a situation with multiple postgres servers running all with the
same databases and table structure. I need to periodically export the data
from each of there then merge them all into a single server. On occasion,
it's feasible for the same record (primary key) to be stored in two or more
serversI was using pgdump without the --insert option however I just noticed that
pgrestore will stop inserting into a table when the conflict occurs,
leaving me with an incomplete set.Question is what are my other options to skip over the conflicting record
when merging?From the docs, it appears that making dumps with the --insert option may
be the only way to go however performance is an issue. In this case would
dropping all indexes help?
You can try one option, although just a thought in the air 😊
Use postgres FDW ex. https://robots.thoughtbot.com/postgres-foreign-data-wrapper
Create foreign tables in the relevant server schema
And then union/union all 😊 or your custom constraint on the destination table where you dump the rows.
For ex.
You have server1, server2, server3
And you have server4 as your new single server.
You create FDW of server1, server2, server3 on server 4 and then import table into respective server schema.
server1.table1, server2.table1, server3.table1
and then
insert into server4.table1 select * from( select * from server1.table1 union select * from server2.table1 union select * from server3.table1) a;
something 😊
Thanks,
Vijay
From: Alex O'Ree <spyhunter99@gmail.com>
Date: Wednesday, June 13, 2018 at 4:47 PM
To: "pgsql-general@lists.postgresql.org" <pgsql-general@lists.postgresql.org>
Subject: [External] Merging two database dumps
I have a situation with multiple postgres servers running all with the same databases and table structure. I need to periodically export the data from each of there then merge them all into a single server. On occasion, it's feasible for the same record (primary key) to be stored in two or more servers
I was using pgdump without the --insert option however I just noticed that pgrestore will stop inserting into a table when the conflict occurs, leaving me with an incomplete set.
Question is what are my other options to skip over the conflicting record when merging?
From the docs, it appears that making dumps with the --insert option may be the only way to go however performance is an issue. In this case would dropping all indexes help?
Am 13.06.2018 um 13:17 schrieb Alex O'Ree:
I have a situation with multiple postgres servers running all with the
same databases and table structure. I need to periodically export the
data from each of there then merge them all into a single server. On
occasion, it's feasible for the same record (primary key) to be stored
in two or more servers
what should happen in this case?
I was using pgdump without the --insert option however I just noticed
that pgrestore will stop inserting into a table when the conflict
occurs, leaving me with an incomplete set.
Other solution:
* create the tables on the destination server without the PK or with an
other, new PK (maybe SERIAL)
* use logical replication to replicate the table from all your
source-db's to the destination table, see more here:
https://www.2ndquadrant.com/en/resources/pglogical/
your problem seems as a typical task for logical replication to me. You
needs 9.4 at least.
Regards, Andreas
--
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com
Yes Vijay,
It might work, but I'm thinking it will be a performance overhead in case of
complex data.
Regards,
Pavan
--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html
Desired behavior is to just log the error and continue the import using
pgdump based copy commands
The servers are not on the same network. Sneaker net is the only way
On Wed, Jun 13, 2018, 7:42 AM Andreas Kretschmer <andreas@a-kretschmer.de>
wrote:
Show quoted text
Am 13.06.2018 um 13:17 schrieb Alex O'Ree:
I have a situation with multiple postgres servers running all with the
same databases and table structure. I need to periodically export the
data from each of there then merge them all into a single server. On
occasion, it's feasible for the same record (primary key) to be stored
in two or more serverswhat should happen in this case?
I was using pgdump without the --insert option however I just noticed
that pgrestore will stop inserting into a table when the conflict
occurs, leaving me with an incomplete set.Other solution:
* create the tables on the destination server without the PK or with an
other, new PK (maybe SERIAL)
* use logical replication to replicate the table from all your
source-db's to the destination table, see more here:
https://www.2ndquadrant.com/en/resources/pglogical/your problem seems as a typical task for logical replication to me. You
needs 9.4 at least.Regards, Andreas
--
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com
On 06/13/2018 06:21 AM, Alex O'Ree wrote:
Desired behavior is to just log the error and continue the import using
pgdump based copy commands
Each COPY is atomic so if any part of it fails the whole thing fails, so
you will not be able to achieve what you want that way.
The servers are not on the same network. Sneaker net is the only way
On Wed, Jun 13, 2018, 7:42 AM Andreas Kretschmer
<andreas@a-kretschmer.de <mailto:andreas@a-kretschmer.de>> wrote:Am 13.06.2018 um 13:17 schrieb Alex O'Ree:
I have a situation with multiple postgres servers running all
with the
same databases and table structure. I need to periodically export
the
data from each of there then merge them all into a single server. On
occasion, it's feasible for the same record (primary key) to bestored
in two or more servers
what should happen in this case?
I was using pgdump without the --insert option however I just
noticed
that pgrestore will stop inserting into a table when the conflict
occurs, leaving me with an incomplete set.Other solution:
* create the tables on the destination server without the PK or with an
other, new PK (maybe SERIAL)
* use logical replication to replicate the table from all your
source-db's to the destination table, see more here:
https://www.2ndquadrant.com/en/resources/pglogical/your problem seems as a typical task for logical replication to me. You
needs 9.4 at least.Regards, Andreas
--
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com <http://www.2ndQuadrant.com>
--
Adrian Klaver
adrian.klaver@aklaver.com
Thanks for the clarification
On Wed, Jun 13, 2018 at 9:32 AM, Adrian Klaver <adrian.klaver@aklaver.com>
wrote:
Show quoted text
On 06/13/2018 06:21 AM, Alex O'Ree wrote:
Desired behavior is to just log the error and continue the import using
pgdump based copy commandsEach COPY is atomic so if any part of it fails the whole thing fails, so
you will not be able to achieve what you want that way.The servers are not on the same network. Sneaker net is the only way
On Wed, Jun 13, 2018, 7:42 AM Andreas Kretschmer <andreas@a-kretschmer.de
<mailto:andreas@a-kretschmer.de>> wrote:Am 13.06.2018 um 13:17 schrieb Alex O'Ree:
I have a situation with multiple postgres servers running all
with the
same databases and table structure. I need to periodically export
the
data from each of there then merge them all into a single server.
On
occasion, it's feasible for the same record (primary key) to be
stored
in two or more servers
what should happen in this case?
I was using pgdump without the --insert option however I just
noticed
that pgrestore will stop inserting into a table when the conflict
occurs, leaving me with an incomplete set.Other solution:
* create the tables on the destination server without the PK or with
an
other, new PK (maybe SERIAL)
* use logical replication to replicate the table from all your
source-db's to the destination table, see more here:
https://www.2ndquadrant.com/en/resources/pglogical/your problem seems as a typical task for logical replication to me.
You
needs 9.4 at least.Regards, Andreas
-- 2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com <http://www.2ndQuadrant.com>--
Adrian Klaver
adrian.klaver@aklaver.com