pg_stat_replication view
I'm not sure if this is a documentation issue, or something else.
The description of the pg_stat_replication.state column gives:
* catchup: This WAL sender's connected standby is catching up with the
primary.
* streaming: This WAL sender is streaming changes after its connected
standby server has caught up with the primary.
What does this mean? Is the standby "caught up" when it replays the LSN
which was current on the master as-of the time that the standby initiated
this connection? Or is it caught up when the master receives at least one
notification that a certain LSN was replayed on the replica, and verifies
that no new WAL has been generated after that certain LSN was generated?
Neither of those things?
If a replica has caught up and then fallen behind again, is that different
from a user/dba perspective than if it never caught up in the first place?
Also, the docs say "Lag times work automatically for physical replication.
Logical decoding plugins may optionally emit tracking messages; if they do
not, the tracking mechanism will simply display NULL lag." Does the
logical decoding plugin associated with built-in PUBLICATION/SUBSCRIPTION
mechanism introduced in v10 emit tracking messages?
Cheers,
Jeff
On Mon, Dec 10, 2018 at 02:24:43PM -0500, Jeff Janes wrote:
> What does this mean? Is the standby "caught up" when it replays the LSN
> which was current on the master as-of the time that the standby initiated
> this connection? Or is it caught up when the master receives at least one
> notification that a certain LSN was replayed on the replica, and verifies
> that no new WAL has been generated after that certain LSN was generated?
> Neither of those things?
The WAL sender would switch from catchup to streaming mode when it sees
that there is no more data to send to the standby. Please look for the
call of WalSndSetState(WALSNDSTATE_STREAMING) in walsender.c.
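
To make the rule above concrete, here is a minimal toy model of that
transition. It is not the code in walsender.c: the class ToyWalSender, the
method send_available, and the integer LSNs are hypothetical names invented
for illustration. The only behaviour it sketches is the one described above:
the sender leaves catchup once it has nothing more to send, which depends on
what has been sent and not on replay feedback from the standby.

```python
from enum import Enum


class WalSenderState(Enum):
    CATCHUP = "catchup"
    STREAMING = "streaming"


class ToyWalSender:
    """Hypothetical model of a WAL sender; LSNs are plain integers here."""

    def __init__(self, sent_lsn):
        self.sent_lsn = sent_lsn
        # A sender that starts behind the primary begins in catchup.
        self.state = WalSenderState.CATCHUP

    def send_available(self, flush_lsn, chunk):
        """Send at most `chunk` bytes of WAL up to the primary's flush_lsn."""
        if self.sent_lsn < flush_lsn:
            self.sent_lsn = min(self.sent_lsn + chunk, flush_lsn)
        # Assumption of this sketch: only the catchup -> streaming switch is
        # modelled, and it happens once there is no more data left to send.
        if self.sent_lsn >= flush_lsn and self.state is WalSenderState.CATCHUP:
            self.state = WalSenderState.STREAMING
        return self.state
```

With flush_lsn=100 and chunk=60, the first call leaves the toy sender in
catchup (60 of 100 sent) and the second call moves it to streaming. What
happens if the standby later falls behind again is deliberately not modelled.
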
> If a replica has caught up and then fallen behind again, is that different
> from a user/dba perspective than if it never caught up in the first
> place?
Not really, because it means that it has been able to catch up with the
latest LSN of the primary at least once. Perhaps you have suggestions
to improve the documentation?
--
Michael