Logical subscription / publication lifetimes

Started by andrew cooke | about 4 years ago | 3 messages | general
#1 andrew cooke
andrew@acooke.org

If I define a publication at time Tp, then load some data on the
publisher, then start a subscription at time Ts, then load some more
data on the publisher, does the subscriber get data from Tp or Ts
onwards?

Also, if a subscription is disabled and then re-enabled, does it lose
the data in between, or is it back-filled?

I am not finding the answers to these questions in the docs at
https://www.postgresql.org/docs/current/logical-replication.html but
maybe I am overlooking something. The link above does mention copying
an existing table which may imply Ts?

Thanks,
Andrew

#2 David G. Johnston
david.g.johnston@gmail.com
In reply to: andrew cooke (#1)
Re: Logical subscription / publication lifetimes

On Fri, Apr 22, 2022 at 5:00 AM andrew cooke <andrew@acooke.org> wrote:

> If I define a publication at time Tp, then load some data on the
> publisher, then start a subscription at time Ts, then load some more
> data on the publisher, does the subscriber get data from Tp or Ts
> onwards?

It depends. By default, neither: the publisher publishes the entire
contents of the table, and the subscriber will do everything necessary to
replicate those contents in their entirety.

If you specify copy_data = false, I'm not sure what you end up with
initially or after a disable. My guess is that the subscription defines the
first transaction it cares about when it connects to the publisher,
defaulting to the most recent publisher transaction (all older transactions
would be handled via copy_data = true). After that, so long as the slot
remains in place, the publisher will place the data into the slot even
while the subscriber is not active, and the subscriber will receive all of
it the next time it comes online or is re-enabled.
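
The guess above can be sketched as a toy model. This is only an
illustration of the behaviour as described in this message, not PostgreSQL
code: the classes Slot, Publisher and Subscriber and the function demo are
hypothetical names, and the slot is reduced to a plain list of pending rows.

```python
# Toy model only: all names are hypothetical and this is not PostgreSQL code.
class Slot:
    """Stands in for a replication slot: keeps changes until consumed."""
    def __init__(self):
        self.pending = []


class Publisher:
    def __init__(self):
        self.table = []
        self.slot = None

    def create_slot(self):
        # Only changes made after this point are recorded in the slot.
        self.slot = Slot()
        return self.slot

    def insert(self, row):
        self.table.append(row)
        if self.slot is not None:
            self.slot.pending.append(row)


class Subscriber:
    def __init__(self, publisher, copy_data=True):
        self.slot = publisher.create_slot()
        # Phase 1: optional snapshot copy of everything already in the table.
        self.table = list(publisher.table) if copy_data else []
        self.enabled = True

    def poll(self):
        # Phase 2: drain the retained changes, but only while enabled.
        if self.enabled:
            self.table.extend(self.slot.pending)
            self.slot.pending.clear()


def demo(copy_data):
    pub = Publisher()
    pub.insert("before-Ts")          # loaded before the subscription exists
    sub = Subscriber(pub, copy_data=copy_data)
    pub.insert("after-Ts")
    sub.poll()
    sub.enabled = False              # subscription disabled
    pub.insert("while-disabled")     # retained in the slot meanwhile
    sub.poll()                       # no effect while disabled
    sub.enabled = True               # subscription re-enabled
    sub.poll()
    return sub.table
```

In this model demo(True) ends up with all three rows, while demo(False)
misses only the row loaded before Ts. In both cases the row written while
the subscription was disabled is delivered after re-enabling.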

David J.

#3 andrew cooke
andrew@acooke.org
In reply to: David G. Johnston (#2)
Re: Logical subscription / publication lifetimes

Ah, thanks! I should have read the documentation of all the
parameters!

So the portion of data that is covered by "copy_data" is going to
reflect updates and deletes prior to the creation of the slot even if
"publish=insert" (only)?

This makes sense because I can't see how else it could be practically
implemented, but just want to be sure I am understanding. The idea
that there are two phases (copy existing data then replicate
operations) is a big help.
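
A small sketch of that two-phase reading, assuming it is correct: the copy
phase takes whatever rows exist at subscription time (so it already
reflects earlier updates and deletes), and the publish setting only filters
the operations replayed afterwards. The functions apply and replicate and
the sample changes are hypothetical, and a dict stands in for the table.

```python
# Toy model only: hypothetical names, not PostgreSQL's implementation.
def apply(table, op, key, value=None):
    """Apply one change to a dict that stands in for a table."""
    if op == "delete":
        table.pop(key, None)
    else:
        # "insert" and "update" both set the row in this simplified model.
        table[key] = value


def replicate(before, after, publish=("insert",), copy_data=True):
    publisher = {}
    for change in before:
        apply(publisher, *change)

    # Phase 1: snapshot copy of the rows as they are right now.
    subscriber = dict(publisher) if copy_data else {}

    # Phase 2: replay later changes, filtered by the published operations.
    for change in after:
        apply(publisher, *change)
        if change[0] in publish:
            apply(subscriber, *change)
    return publisher, subscriber


before = [("insert", 1, "a"), ("insert", 2, "b"),
          ("update", 1, "A"), ("delete", 2)]
after = [("insert", 3, "c"), ("update", 1, "AA"), ("delete", 3)]
pub, sub = replicate(before, after)
```

Here the subscriber starts from {1: "A"}, which includes the earlier update
and delete, and then sees only the later insert. It ends with
{1: "A", 3: "c"} while the publisher ends with {1: "AA"}, so the two tables
diverge once unpublished operations occur.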

Thanks again,
Andrew

