Move rows from one database to other

Started by Thomas Güttlerabout 9 years ago25 messagesgeneral
Jump to latest
#1Thomas Güttler
guettliml@thomas-guettler.de

I want to move table rows from one database to an central database.

Both run PostgreSQL.

My use case looks like this:

There are N satellite databases in different data centers. N is about 100 at the moment.

There is one central database.

I need a way to reliably move rows from the satellite databases to the central one

Example

The rows of host1 look like this:

host1, 2017-02-21, abc
host1, 2017-02-20, def
host1, 2017-02-19, ghi

The rows of host2 look like this:

host2, 2017-02-21, foo
host2, 2017-02-20, bar
host2, 2017-02-19, blu

After syncing, all lines which were transferred should be deleted on the satellite databases.

The central table should look like this (it has the same schema)

host1, 2017-02-21, abc
host1, 2017-02-20, def
host1, 2017-02-19, ghi
host2, 2017-02-21, foo
host2, 2017-02-20, bar
host2, 2017-02-19, blu

I don't want to code this myself, since there a tons of possible race conditions:

- inserts can happen during syncing.
- Network can break during syncing.
- inserts into the central table can break (e.g. disk full): No loss at the satellite database must happen.
- ...

How to solve this with PostgreSQL?

Regards,
Thomas Güttler

--
Thomas Guettler http://www.thomas-guettler.de/

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#2Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Thomas Güttler (#1)
Re: Move rows from one database to other

On 02/21/2017 12:53 AM, Thomas Güttler wrote:

I want to move table rows from one database to an central database.

You actually talking about moving from ~100 databases to the central
database, correct?

Both run PostgreSQL.

Are all the Postgres instances the same version and what is the version
or versions?

My use case looks like this:

There are N satellite databases in different data centers. N is about
100 at the moment.

There is one central database.

I need a way to reliably move rows from the satellite databases to the
central one

Two ways I can think of:

https://www.postgresql.org/docs/9.6/static/dblink.html

https://www.postgresql.org/docs/9.6/static/postgres-fdw.html

Example

The rows of host1 look like this:

host1, 2017-02-21, abc
host1, 2017-02-20, def
host1, 2017-02-19, ghi

The rows of host2 look like this:

host2, 2017-02-21, foo
host2, 2017-02-20, bar
host2, 2017-02-19, blu

After syncing, all lines which were transferred should be deleted on the
satellite databases.

The central table should look like this (it has the same schema)

host1, 2017-02-21, abc
host1, 2017-02-20, def
host1, 2017-02-19, ghi
host2, 2017-02-21, foo
host2, 2017-02-20, bar
host2, 2017-02-19, blu

Is there a Primary Key on the satellite tables or some way of
determining unique rows?

Is there any existing overlap between the data in the central database
and the satellite databases?

I don't want to code this myself, since there a tons of possible race
conditions:

How much data are you talking about moving from each database?

How active are the satellite databases?

- inserts can happen during syncing.

Can UPDATEs happen?

- Network can break during syncing.
- inserts into the central table can break (e.g. disk full): No loss at
the satellite database must happen.
- ...

How to solve this with PostgreSQL?

Regards,
Thomas Güttler

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3William Ivanski
william.ivanski@gmail.com
In reply to: Adrian Klaver (#2)
Re: Move rows from one database to other

You can try OmniDB: http://www.omnidb.com.br/en_index.aspx

OmniDB has a Convert feature, where you can set a data transfer, even if
the target table exists.

Em ter, 21 de fev de 2017 às 11:18, Adrian Klaver <adrian.klaver@aklaver.com>
escreveu:

On 02/21/2017 12:53 AM, Thomas Güttler wrote:

I want to move table rows from one database to an central database.

You actually talking about moving from ~100 databases to the central
database, correct?

Both run PostgreSQL.

Are all the Postgres instances the same version and what is the version
or versions?

My use case looks like this:

There are N satellite databases in different data centers. N is about
100 at the moment.

There is one central database.

I need a way to reliably move rows from the satellite databases to the
central one

Two ways I can think of:

https://www.postgresql.org/docs/9.6/static/dblink.html

https://www.postgresql.org/docs/9.6/static/postgres-fdw.html

Example

The rows of host1 look like this:

host1, 2017-02-21, abc
host1, 2017-02-20, def
host1, 2017-02-19, ghi

The rows of host2 look like this:

host2, 2017-02-21, foo
host2, 2017-02-20, bar
host2, 2017-02-19, blu

After syncing, all lines which were transferred should be deleted on the
satellite databases.

The central table should look like this (it has the same schema)

host1, 2017-02-21, abc
host1, 2017-02-20, def
host1, 2017-02-19, ghi
host2, 2017-02-21, foo
host2, 2017-02-20, bar
host2, 2017-02-19, blu

Is there a Primary Key on the satellite tables or some way of
determining unique rows?

Is there any existing overlap between the data in the central database
and the satellite databases?

I don't want to code this myself, since there a tons of possible race
conditions:

How much data are you talking about moving from each database?

How active are the satellite databases?

- inserts can happen during syncing.

Can UPDATEs happen?

- Network can break during syncing.
- inserts into the central table can break (e.g. disk full): No loss at
the satellite database must happen.
- ...

How to solve this with PostgreSQL?

Regards,
Thomas Güttler

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--

William Ivanski - Microsoft MVP

#4Melvin Davidson
melvin6925@gmail.com
In reply to: William Ivanski (#3)
Re: Move rows from one database to other

On Tue, Feb 21, 2017 at 9:27 AM, William Ivanski <william.ivanski@gmail.com>
wrote:

You can try OmniDB: http://www.omnidb.com.br/en_index.aspx

OmniDB has a Convert feature, where you can set a data transfer, even if
the target table exists.

Em ter, 21 de fev de 2017 às 11:18, Adrian Klaver <
adrian.klaver@aklaver.com> escreveu:

On 02/21/2017 12:53 AM, Thomas Güttler wrote:

I want to move table rows from one database to an central database.

You actually talking about moving from ~100 databases to the central
database, correct?

Both run PostgreSQL.

Are all the Postgres instances the same version and what is the version
or versions?

My use case looks like this:

There are N satellite databases in different data centers. N is about
100 at the moment.

There is one central database.

I need a way to reliably move rows from the satellite databases to the
central one

Two ways I can think of:

https://www.postgresql.org/docs/9.6/static/dblink.html

https://www.postgresql.org/docs/9.6/static/postgres-fdw.html

Example

The rows of host1 look like this:

host1, 2017-02-21, abc
host1, 2017-02-20, def
host1, 2017-02-19, ghi

The rows of host2 look like this:

host2, 2017-02-21, foo
host2, 2017-02-20, bar
host2, 2017-02-19, blu

After syncing, all lines which were transferred should be deleted on the
satellite databases.

The central table should look like this (it has the same schema)

host1, 2017-02-21, abc
host1, 2017-02-20, def
host1, 2017-02-19, ghi
host2, 2017-02-21, foo
host2, 2017-02-20, bar
host2, 2017-02-19, blu

Is there a Primary Key on the satellite tables or some way of
determining unique rows?

Is there any existing overlap between the data in the central database
and the satellite databases?

I don't want to code this myself, since there a tons of possible race
conditions:

How much data are you talking about moving from each database?

How active are the satellite databases?

- inserts can happen during syncing.

Can UPDATEs happen?

- Network can break during syncing.
- inserts into the central table can break (e.g. disk full): No loss at
the satellite database must happen.
- ...

How to solve this with PostgreSQL?

Regards,
Thomas Güttler

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--

William Ivanski - Microsoft MVP

Depending on how much data you want to move, and if the tables have the
same structure, you might also want to consider using
pg_dump -a

OR
multiple instances of

on satellite
COPY { table_name [ ( column_name [, ...] ) ] | ( query ) }
TO { 'filename' | PROGRAM 'command' | STDOUT }
[ [ WITH ] ( option [, ...] ) ]

on central
COPY table_name [ ( column_name [, ...] ) ]
FROM { 'filename' | PROGRAM 'command' | STDIN }
[ [ WITH ] ( option [, ...] ) ]

A BIG consideration is:
Does the Central DB have the same table structures as all satellite DB's?
Does the Central DB already have records in the tables.
Do all Satellite tables have unique records for each other?

As Adrian stated, it would be very helpful if you provided us with all
O/S and PostgreSQL vesions involved.

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

#5Moreno Andreo
moreno.andreo@evolu-s.it
In reply to: Melvin Davidson (#4)
Re: Move rows from one database to other

<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">Il 21/02/2017 15:38, Melvin Davidson ha
scritto:</div>
<blockquote
cite="mid:CANu8FixqdzCZP4h3fPTfWxuhmXCmyfpcV3YtRDre=OoqPE8KkA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex"><span
class="gmail-HOEnZb"><font color="#888888">
<div>
<div dir="ltr">
</div>
</div>
</font></span></blockquote>
</div>
Depending on how much data you want to move, and if the tables
have the same structure, you might also want to consider using<br>
</div>
<div class="gmail_extra">pg_dump -a<br>
<br>
</div>
<div class="gmail_extra">OR<br>
</div>
<div class="gmail_extra">multiple instances of <br>
<pre class="gmail-SYNOPSIS">on satellite
COPY { <tt class="gmail-REPLACEABLE gmail-c2">table_name</tt> [ ( <tt class="gmail-REPLACEABLE gmail-c2">column_name</tt> [, ...] ) ] | ( <tt class="gmail-REPLACEABLE gmail-c2">query</tt> ) }
TO { '<tt class="gmail-REPLACEABLE gmail-c2">filename</tt>' | PROGRAM '<tt class="gmail-REPLACEABLE gmail-c2">command</tt>' | STDOUT }
[ [ WITH ] ( <tt class="gmail-REPLACEABLE gmail-c2">option</tt> [, ...] ) ]
</pre>
<pre class="gmail-SYNOPSIS">on central
COPY <tt class="gmail-REPLACEABLE gmail-c2">table_name</tt> [ ( <tt class="gmail-REPLACEABLE gmail-c2">column_name</tt> [, ...] ) ]
FROM { '<tt class="gmail-REPLACEABLE gmail-c2">filename</tt>' | PROGRAM '<tt class="gmail-REPLACEABLE gmail-c2">command</tt>' | STDIN }
[ [ WITH ] ( <tt class="gmail-REPLACEABLE gmail-c2">option</tt> [, ...] ) ]

A BIG consideration is:
Does the Central DB have the same table structures as all satellite DB's?
Does the Central DB already have records in the tables.
Do all Satellite tables have unique records for each other?

</pre>
<pre class="gmail-SYNOPSIS">As Adrian stated, it would be very helpful if you provided us with all O/S and PostgreSQL vesions involved.
</pre>
</div>
</div>
</blockquote>
<br>
AFAIK pg_dump is version-independent (I use it to upgrade some small
clusters), given that you are using the one from the newest version
and you are dumping from old and restoring from new. <br>
Another thing that I saw is that pg_dump is less I/O stressful (I
can almost safely dump databases from my production server in
traffic hours, even if I never tried doing it in rush hours :-) )
and therefore slower than COPY, but on the other side, using COPY,
the risk is to have all I/O bandwidth eaten by data transfer,
especially if the amount of data to be transferred is considerable.
I tried some times to COPY across different clusters with different
versions, but I always had issues.<br>
<br>
My 2 cents.<br>
<br>
Cheers<br>
Moreno.<br>
<br>
<blockquote
cite="mid:CANu8FixqdzCZP4h3fPTfWxuhmXCmyfpcV3YtRDre=OoqPE8KkA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra"><br>
-- <br>
<div class="gmail_signature">
<div dir="ltr"><font size="4"><b><span
style="font-family:courier new,monospace">Melvin
Davidson</span></b></font><br>
<font style="font-weight:bold" size="3"><span
style="color:rgb(128,0,255)">I reserve the right to
fantasize.  Whether or not you </span><br
style="color:rgb(128,0,255)">
<span style="color:rgb(128,0,255)">wish to share my
fantasy is entirely up to you. </span><img
moz-do-not-send="true" style="color: rgb(128, 0,
255);"
src="http://us.i1.yimg.com/us.yimg.com/i/mesg/tsmileys2/01.gif&quot;&gt;&lt;/font&gt;&lt;br&gt;
</div>
</div>
</div>
</div>
</blockquote>
<p><br>
</p>
</body>
</html>

#6Thomas Güttler
guettliml@thomas-guettler.de
In reply to: Adrian Klaver (#2)
Re: Move rows from one database to other

Am 21.02.2017 um 15:12 schrieb Adrian Klaver:

On 02/21/2017 12:53 AM, Thomas Güttler wrote:

I want to move table rows from one database to an central database.

You actually talking about moving from ~100 databases to the central database, correct?

Both run PostgreSQL.

Are all the Postgres instances the same version and what is the version or versions?

Yes, all run postgres, but the version can be different (but not much).
Satellite-DB 9.5 and 9.6 and central 9.6.

My use case looks like this:

There are N satellite databases in different data centers. N is about
100 at the moment.

There is one central database.

I need a way to reliably move rows from the satellite databases to the
central one

Two ways I can think of:

https://www.postgresql.org/docs/9.6/static/dblink.html

https://www.postgresql.org/docs/9.6/static/postgres-fdw.html

Since postgres_fdw is newer, I would focus on it, right?

If I understood it correctly, then there would be N (about 100) tables in the central DB.

What happens if there is a network outage (for about 3 minutes) during accessing a remote table?

Is there a Primary Key on the satellite tables or some way of determining unique rows?

The concrete schema is not specified up to now. But I guess UUID as primary key would be the best fit.
Or am I wrong?

Is there any existing overlap between the data in the central database and the satellite databases?

No, there won't be overlaps. Every satellite system creates its own rows.

How much data are you talking about moving from each database?
How active are the satellite databases?

100k rows per day per satellite. Each row has only few bytes.

Moving of rows should happen every ten minutes.

- inserts can happen during syncing.

Can UPDATEs happen?

No, rows get created and moved and later deleted.

Thank you Adrian for your questions. It helped me to narrow down my problem.

Regards,
Thomas

--
Thomas Guettler http://www.thomas-guettler.de/

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#7Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Thomas Güttler (#1)
Re: Move rows from one database to other

On 02/21/2017 12:53 AM, Thomas Güttler wrote:

I want to move table rows from one database to an central database.

Both run PostgreSQL.

How to solve this with PostgreSQL?

Should have added earlier. This is a specific case of the more general
case of ETL(Extract/Transform/Load). There are a host of tools out there
that do this. For instance I use Python and the following is available:

http://petl.readthedocs.io/en/latest/index.html

What tool you choose comes down to what you are comfortable with:

1) Writing your own programs/scripts and in what language?

2) Using a GUI that sets things up for you.

Regards,
Thomas Güttler

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#8Thomas Güttler
guettliml@thomas-guettler.de
In reply to: William Ivanski (#3)
Re: Move rows from one database to other

Am 21.02.2017 um 15:27 schrieb William Ivanski:

You can try OmniDB: http://www.omnidb.com.br/en_index.aspx

OmniDB has a Convert feature, where you can set a data transfer, even if the target table exists.

I am unsure if omnidb is the right tool here.

I don't need a GUI. The movement of the rows should happen in background.

Regards,
Thomas Güttler

--
Thomas Guettler http://www.thomas-guettler.de/

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#9Thomas Güttler
guettliml@thomas-guettler.de
In reply to: Melvin Davidson (#4)
Re: Move rows from one database to other

Depending on how much data you want to move, and if the tables have the same structure, you might also want to consider
using
pg_dump -a

OR
multiple instances of

on satellite
COPY { table_name [ ( column_name [, ...] ) ] | ( query ) }
TO { 'filename' | PROGRAM 'command' | STDOUT }
[ [ WITH ] ( option [, ...] ) ]

on central
COPY table_name [ ( column_name [, ...] ) ]
FROM { 'filename' | PROGRAM 'command' | STDIN }
[ [ WITH ] ( option [, ...] ) ]

A BIG consideration is:
Does the Central DB have the same table structures as all satellite DB's?

yes, same schema

Does the Central DB already have records in the tables.

yes, movement of rows should happen every ten minutes.

Do all Satellite tables have unique records for each other?

Yes, UUID primary key.

As Adrian stated, it would be very helpful if you provided us with all O/S and PostgreSQL vesions involved.

Versions are 9.5 and 9.6

I have other concerns: atomar transaction. Movement should happen completely or not all.

I don't think you can do this reliable (atomic transaction) with "copy table_name".

Regards,
Thomas

--
Thomas Guettler http://www.thomas-guettler.de/

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#10Melvin Davidson
melvin6925@gmail.com
In reply to: Thomas Güttler (#9)
Re: Move rows from one database to other

On Tue, Feb 21, 2017 at 11:10 AM, Thomas Güttler <
guettliml@thomas-guettler.de> wrote:

Depending on how much data you want to move, and if the tables have the
same structure, you might also want to consider
using
pg_dump -a

OR
multiple instances of

on satellite
COPY { table_name [ ( column_name [, ...] ) ] | ( query ) }
TO { 'filename' | PROGRAM 'command' | STDOUT }
[ [ WITH ] ( option [, ...] ) ]

on central
COPY table_name [ ( column_name [, ...] ) ]
FROM { 'filename' | PROGRAM 'command' | STDIN }
[ [ WITH ] ( option [, ...] ) ]

A BIG consideration is:
Does the Central DB have the same table structures as all satellite DB's?

yes, same schema

Does the Central DB already have records in the tables.

yes, movement of rows should happen every ten minutes.

Do all Satellite tables have unique records for each other?

Yes, UUID primary key.

As Adrian stated, it would be very helpful if you provided us with all O/S

and PostgreSQL vesions involved.

Versions are 9.5 and 9.6

I have other concerns: atomar transaction. Movement should happen
completely or not all.

I don't think you can do this reliable (atomic transaction) with "copy
table_name".

Regards,
Thomas

--
Thomas Guettler http://www.thomas-guettler.de/

I have other concerns: atomar transaction. Movement should happen

completely or not all.

I don't think you can do this reliable (atomic transaction) with "copy

table_name".

You can if you wrap it in a transaction:
EG:
BEGIN;
COPY { table_name [ ( column_name [, ...] ) ] | ( query ) }
TO { 'filename' | PROGRAM 'command' | STDOUT }
[ [ WITH ] ( option [, ...] ) ]
COMMIT;

BEGIN;
COPY table_name [ ( column_name [, ...] ) ]
FROM { 'filename' | PROGRAM 'command' | STDIN }
[ [ WITH ] ( option [, ...] ) ]
COMMIT;

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

#11Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Thomas Güttler (#6)
Re: Move rows from one database to other

On 02/21/2017 07:53 AM, Thomas Güttler wrote:

Am 21.02.2017 um 15:12 schrieb Adrian Klaver:

On 02/21/2017 12:53 AM, Thomas Güttler wrote:

I want to move table rows from one database to an central database.

You actually talking about moving from ~100 databases to the central
database, correct?

Both run PostgreSQL.

Are all the Postgres instances the same version and what is the
version or versions?

Yes, all run postgres, but the version can be different (but not much).
Satellite-DB 9.5 and 9.6 and central 9.6.

My use case looks like this:

There are N satellite databases in different data centers. N is about
100 at the moment.

There is one central database.

I need a way to reliably move rows from the satellite databases to the
central one

Two ways I can think of:

https://www.postgresql.org/docs/9.6/static/dblink.html

https://www.postgresql.org/docs/9.6/static/postgres-fdw.html

Since postgres_fdw is newer, I would focus on it, right?

If I understood it correctly, then there would be N (about 100) tables
in the central DB.

What happens if there is a network outage (for about 3 minutes) during
accessing a remote table?

I misunderstood your original intent, I thought this was a one time
process to move data to the central database. Given that it is to be a
continuous process a FDW may not be the answer, one of the reasons being
the above question. You will be denied the data in the remote table
during the outage. Also not sure what you will be doing with the data in
the central database and how often? In any case it will involve reaching
out to all the satellites each time you want to query the latest data.
Looks more like some kind of push mechanism from the satellites to the
central database is in order. Then once the data is on the central
database it is 'captured'. A question that comes to mind is if there is
a problem should the data transfer from one or more satellites lag that
of the others?

Is there a Primary Key on the satellite tables or some way of
determining unique rows?

The concrete schema is not specified up to now. But I guess UUID as
primary key would be the best fit.
Or am I wrong?

Is there any existing overlap between the data in the central database
and the satellite databases?

No, there won't be overlaps. Every satellite system creates its own rows.

How much data are you talking about moving from each database?
How active are the satellite databases?

100k rows per day per satellite. Each row has only few bytes.

Moving of rows should happen every ten minutes.

- inserts can happen during syncing.

Can UPDATEs happen?

No, rows get created and moved and later deleted.

Thank you Adrian for your questions. It helped me to narrow down my
problem.

Regards,
Thomas

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#12Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Thomas Güttler (#8)
Re: Move rows from one database to other

On 02/21/2017 08:06 AM, Thomas Güttler wrote:

Am 21.02.2017 um 15:27 schrieb William Ivanski:

You can try OmniDB: http://www.omnidb.com.br/en_index.aspx

OmniDB has a Convert feature, where you can set a data transfer, even
if the target table exists.

I am unsure if omnidb is the right tool here.

I don't need a GUI. The movement of the rows should happen in background.

Given the versions of Postgres you are using there is logical replication:

https://www.postgresql.org/docs/9.5/static/protocol-replication.html

https://2ndquadrant.com/en/resources/pglogical/
"AGGREGATE - Accumulate changes from sharded database servers into a
Data Warehouse"

I have not used this capability yet, so others would have to comment on
its applicability.

Regards,
Thomas Güttler

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#13Thomas Güttler
guettliml@thomas-guettler.de
In reply to: Adrian Klaver (#11)
Re: Move rows from one database to other

I misunderstood your original intent, I thought this was a one time process to move data to the central database. Given
that it is to be a continuous process a FDW may not be the answer, one of the reasons being the above question. You will
be denied the data in the remote table during the outage. Also not sure what you will be doing with the data in the
central database and how often? In any case it will involve reaching out to all the satellites each time you want to
query the latest data. Looks more like some kind of push mechanism from the satellites to the central database is in
order. Then once the data is on the central database it is 'captured'. A question that comes to mind is if there is a
problem should the data transfer from one or more satellites lag that of the others?

I am sorry. Yes, I forgot to mention that the process should be a continous push.

If not all data is in the central, that's ok.

The central should only see the data which is already in the central place.

Up to now it is not important who triggers this process. It could be a push (from satellite to central) or a pull (from
central to satellite)

If there is a lag ... then there is a lag. That can happen (network outage), this is not very important.
More important is that no single row gets lost or duplicated.

Regards,
Thomas

--
Thomas Guettler http://www.thomas-guettler.de/

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#14Thomas Güttler
guettliml@thomas-guettler.de
In reply to: Melvin Davidson (#10)
Re: Move rows from one database to other

I have other concerns: atomar transaction. Movement should happen completely or not all.
I don't think you can do this reliable (atomic transaction) with "copy table_name".

You can if you wrap it in a transaction:

I want to **move** the data. The data should get deleted on the satellite after transfer.

I don't know how to delete the data which was copied, since inserts can happen during the copy statement.

Regards,
Thomas Güttler

--
Thomas Guettler http://www.thomas-guettler.de/

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#15Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Thomas Güttler (#14)
Re: Move rows from one database to other

On 02/22/2017 04:51 AM, Thomas Güttler wrote:

I have other concerns: atomar transaction. Movement should happen
completely or not all.
I don't think you can do this reliable (atomic transaction) with
"copy table_name".

You can if you wrap it in a transaction:

I want to **move** the data. The data should get deleted on the
satellite after transfer.

Well the replication suggestion is out.

I don't know how to delete the data which was copied, since inserts can
happen during the copy statement.

However you end up doing this I think you will probably need some sort
of flag on the rows on the satellites. It could be a timestamp field of
when the rows where inserted on the satellite or a boolean
field(copied). First instinct is to use an insert timestamp and a
tracking table that stores the last timestamp used to move rows, where
the timestamp is only written on a successful transfer. To improve the
chances of successful transfer more smaller transfer batches rather then
larger transfers.

Regards,
Thomas Güttler

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#16Francisco Olarte
folarte@peoplecall.com
In reply to: Thomas Güttler (#14)
Re: Move rows from one database to other

Thomas:

On Wed, Feb 22, 2017 at 1:51 PM, Thomas Güttler
<guettliml@thomas-guettler.de> wrote:

I want to **move** the data. The data should get deleted on the satellite
after transfer.
I don't know how to delete the data which was copied, since inserts can
happen during the copy statement.

Depending on the structure / control / kind of ops you have on the
satellite you can try a two-table trick, similar to a common one used
to process files in spool directories.

1st, you move everything to a holding table transactionally in the
satellite ( insert into holding table delete from main table returning
* )

Then you have the holding table to transfer and clear, not touched by
anyone ( as transfer process is the only one touching it )

You can do some tricks. You could inherit holding from main if you
want to be able to see data while it is been transferred. Or play
renaming tricks. It all depends in the concrete app, but you can try
to fit the pattern in it, I've done it several times and its a useful
one.

Francisco Olarte.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#17Jan Michálek
godzilalalala@gmail.com
In reply to: Francisco Olarte (#16)
Re: Move rows from one database to other

Maybe you can do something like

WITH cte AS (
DELETE FROM t1 WHERE cond
RETURNINIG *
)
INSERT into t2
SELECT * FROM cte;

To move rows between tables with combination with fdw_postgres and
foreign table.

Je;

On Thu, Feb 23, 2017 at 10:33:27AM +0100, Francisco Olarte wrote:

Thomas:

On Wed, Feb 22, 2017 at 1:51 PM, Thomas G�ttler
<guettliml@thomas-guettler.de> wrote:

I want to **move** the data. The data should get deleted on the satellite
after transfer.
I don't know how to delete the data which was copied, since inserts can
happen during the copy statement.

Depending on the structure / control / kind of ops you have on the
satellite you can try a two-table trick, similar to a common one used
to process files in spool directories.

1st, you move everything to a holding table transactionally in the
satellite ( insert into holding table delete from main table returning
* )

Then you have the holding table to transfer and clear, not touched by
anyone ( as transfer process is the only one touching it )

You can do some tricks. You could inherit holding from main if you
want to be able to see data while it is been transferred. Or play
renaming tricks. It all depends in the concrete app, but you can try
to fit the pattern in it, I've done it several times and its a useful
one.

Francisco Olarte.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#18Thomas Güttler
guettliml@thomas-guettler.de
In reply to: Adrian Klaver (#15)
Re: Move rows from one database to other

Am 22.02.2017 um 16:00 schrieb Adrian Klaver:

On 02/22/2017 04:51 AM, Thomas Güttler wrote:

I have other concerns: atomar transaction. Movement should happen
completely or not all.
I don't think you can do this reliable (atomic transaction) with
"copy table_name".

You can if you wrap it in a transaction:

I want to **move** the data. The data should get deleted on the
satellite after transfer.

Well the replication suggestion is out.

I don't know how to delete the data which was copied, since inserts can
happen during the copy statement.

However you end up doing this I think you will probably need some sort of flag on the rows on the satellites. It could
be a timestamp field of when the rows where inserted on the satellite or a boolean field(copied). First instinct is to
use an insert timestamp and a tracking table that stores the last timestamp used to move rows, where the timestamp is
only written on a successful transfer. To improve the chances of successful transfer more smaller transfer batches
rather then larger transfers.

I really need a solid solution.

You said "... improve the chances of successful transfer ...". This makes me nervous.

Delays are no problems, but data loss or duplication is.

Regards,
Thomas

--
Thomas Guettler http://www.thomas-guettler.de/

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#19Thomas Güttler
guettliml@thomas-guettler.de
In reply to: Francisco Olarte (#16)
Re: Move rows from one database to other

Am 23.02.2017 um 10:33 schrieb Francisco Olarte:

Thomas:

On Wed, Feb 22, 2017 at 1:51 PM, Thomas Güttler
<guettliml@thomas-guettler.de> wrote:

I want to **move** the data. The data should get deleted on the satellite
after transfer.
I don't know how to delete the data which was copied, since inserts can
happen during the copy statement.

Depending on the structure / control / kind of ops you have on the
satellite you can try a two-table trick, similar to a common one used
to process files in spool directories.
[cut]

This sound good. Is there a name for this trick, to find more details?

Regards,
Thomas

--
Thomas Guettler http://www.thomas-guettler.de/

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#20Thomas Güttler
guettliml@thomas-guettler.de
In reply to: Jan Michálek (#17)
Re: Move rows from one database to other

Am 23.02.2017 um 13:44 schrieb Leknín Řepánek:

Maybe you can do something like

WITH cte AS (
DELETE FROM t1 WHERE cond
RETURNINIG *
)
INSERT into t2
SELECT * FROM cte;

To move rows between tables with combination with fdw_postgres and
foreign table.

... this way you don't need the second table ... nice.

--
Thomas Guettler http://www.thomas-guettler.de/

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#21Francisco Olarte
folarte@peoplecall.com
In reply to: Thomas Güttler (#18)
#22Francisco Olarte
folarte@peoplecall.com
In reply to: Thomas Güttler (#20)
#23Francisco Olarte
folarte@peoplecall.com
In reply to: Thomas Güttler (#19)
#24Thomas Güttler
guettliml@thomas-guettler.de
In reply to: Francisco Olarte (#21)
#25Francisco Olarte
folarte@peoplecall.com
In reply to: Thomas Güttler (#24)