Monitoring logical replication

Started by Shaheed Haqueover 2 years ago6 messagesgeneral
Jump to latest
#1Shaheed Haque
shaheedhaque@gmail.com

Hi,

I've been playing with logical replication (currently on PG14),
specifically in an AWS RDS Postgres context, but NOT using AWS' own
replication tooling. I'm generally familiar with the challenges of
distributed systems (such causality, time synchronisation etc), but not
especially familiar with PG.

In looking at how to tell how a given subscriber has caught up with its
publisher, there is plenty of advice around the Web, for example
https://dba.stackexchange.com/questions/314324/monitor-logical-replication-using-lsn.
Like this example, much advice ends up talking about using separate queries
on the publisher and the subscriber to compare LSNs. First, (I think) I
understand the core difficulty that comparing LSNs is inherently racy, but
given that, I'm a bit unclear as to why a single query on the publisher is
not enough...IIUC:

- Changes sent from the publisher to the subscriber are identified by
LSN.
- The publisher knows it's own current latest LSN (pg_current_wal_lsn()),
but this seems not to be exposed at the subscriber.
- The subscriber knows what it has applied locally and even tells the
publisher (pg_stat_subscription.latest_end_lsn), but it does not seem to
be exposed at the publisher.

Have I missed something? Is there a way to track the LSN delta (given that
this is known to be racy) just by querying one end?

Second, how do folk "know" when replication is "done". For example, if the
two LSNs continued to match for 1 * replication lag? Or N * replication
lag? What would be a plausible N?

Third, as we know when logical replication is started, the initial table
state is captured in a snapshot, and sent across using COPY TABLE under the
covers. Now, let's say that the publisher is idle (i.e. no SQL writes to
the user's schema...obviously pg_catalog might change as replication is
configured and enabled) and that the replication starts with the publisher
as LSN_start. How could one know when the copying is done:

- I initially assumed that the publisher's LSN would not change from
LSN_start, but as the copying proceeds, I see that it DOES change
(presumably because there are updates happening to pg_catalog, such as the
temporary slots coming and going).
- Is there some kind of singleton state on either publisher or
subscriber that could be checked to know? (At the moment, I am counting the
records in all copied tables).

I realise that the knowledge that the publisher is "idle" is a special
case, but right now, my test for being "done" is:

- Number of records in copied tables matches AND the publisher's
pg_stat_subscription matches the subscriber's pg_stat_subscription.
latest_end_lsn.

Plus or minus the bit about replication lag, is there a better way?

Thanks, Shaheed

#2Klaus Darilion
klaus.mailinglists@pernau.at
In reply to: Shaheed Haque (#1)
Re: Monitoring logical replication

Hi Saheed!

I monitor our replication this way:

1. Every 10 seconds i fetch the current LSN and write it into a table,
next with the current timestamp. Further I fetch confirmend LSNs from
the replication slots and delete old entries in lsn2data table.

calculate_logical_replication_lag.php:

<?php

$path = realpath(dirname(__FILE__) . "/../inc");
set_include_path($path . PATH_SEPARATOR . get_include_path());

require_once('config.php');
$config_int['syslogprefix'] = basename(__FILE__);
require_once('logging.php');

$dbuser="replication_lag_user";
$dbpass="XXXXXXXXXXXXXXXXXXXX";
if (!$dbconn = pg_pconnect('host='.$config_int['dbhost'].'
dbname='.$config_int['dbname'].' user='.$dbuser.' password='.$dbpass)) {
print "Sorry, database connection failed";
exit;
}

$accuracy = 10; // in seconds

//
// Preparations:
//
// CREATE TABLE lsn2data(
// lsn pg_lsn PRIMARY KEY,
// seen timestamp NOT NULL DEFAULT NOW()
// );
// CREATE ROLE replication_lag_user WITH LOGIN PASSWORD
'XXXXXXXXXXXXXXXXXXX';
// GRANT ALL ON TABLE lsn2data TO replication_lag_user;
//
// CREATE OR REPLACE FUNCTION get_replication_lag() RETURNS TABLE
(subscriber name, lag bigint) AS
// $BODY$
// DECLARE
// subscriber name;
// BEGIN
// FOR subscriber IN
// SELECT slot_name FROM pg_replication_slots
// LOOP
// RETURN QUERY SELECT slot_name, EXTRACT(EPOCH FROM
NOW()-seen)::bigint lag from lsn2data,pg_replication_slots WHERE
slot_name=subscriber AND lsn < confirmed_flush_lsn ORDER BY lsn DESC
LIMIT 1;
// END LOOP;
// RETURN;
// END
// $BODY$
// LANGUAGE plpgsql;
//
while (1) {
$dbq = pg_query("INSERT INTO lsn2data (lsn) VALUES
(pg_current_wal_lsn())");
if ($dbq === FALSE) {
mylog(LOG_ERROR, "SQL query error:
".pg_last_error()."\n");
exit(1);
}

$dbq = pg_query("DELETE FROM lsn2data WHERE lsn < (".
"SELECT lsn FROM lsn2data WHERE lsn < (".
"SELECT confirmed_flush_lsn FROM
pg_replication_slots ORDER BY confirmed_flush_lsn ASC LIMIT 1".
") ORDER BY lsn DESC LIMIT 1".
")"
);
if ($dbq === FALSE) {
mylog(LOG_ERROR, "SQL query error:
".pg_last_error()."\n");
exit(1);
}
sleep($accuracy);
}

2. I graph the replications lags (converted from LSN to seconds) in my
check_mk monitoring:

#!/bin/bash

#
# Managed by Puppet:
modules/base/files/monitoring/check_logical_replication_lag.sh
#
# Check the logical replication lag and export performance data for each
subscriber
#

# exit on error
#set -e

#Make sure this script only runs one at a time
(

ME=$0
MEBASE=`basename $0`

mylog () {
echo "$MEBASE: $1"
logger -t "$MEBASE" "$1"
}

flock -x -w 1 200
if [ $? != "0" ]; then
#echo "ERROR: $0 is already running ... exit"
logger -t "$MEBASE" "ERROR: $0 is already running ... exit"
exit 1
fi

# Do stuff

# Variablen fuer Monitoring
CMK_SPOOLDIR=/var/lib/check_mk_agent/spool
CMK_NAME=$MEBASE
CMK_SPOOLFILE=600_`basename ${CMK_NAME}`.txt
CMK_HEADER="<<<local>>>"
TMP_FILE="/tmp/logical_replication_lag.csv"

# Schwellwerte
warn=300
crit=600

final_output="$CMK_HEADER\nP $CMK_NAME "

# move to a directory where user postgresl may reside (sudo)
cd /tmp

# Lag auslesen. Waehrend dem initialen aufsynchen eines Subscribers gibt
es temporaere Subscriptions, mit dem Namen reg_xxx1-pid-sync-pid.
# Damit diese nicht getrackt werden gibt es die huebsche LIKE Clause.
rm -f "$TMP_FILE"
sudo -u postgres psql regdns -c "COPY (SELECT subscriber,lag FROM
get_replication_lag() WHERE subscriber LIKE '%\_____' ORDER BY 2 DESC)
TO '$TMP_FILE' With CSV" 2>&1> /dev/null
LC=$(sudo -u postgres psql -t regdns -c "SELECT count(*) FROM
get_replication_lag();" | tr -d ' ')

if [ $LC == "0" ]; then
echo -e "$CMK_HEADER\n0 $CMK_NAME - No Slaves with Replication
found - maybe we are a slave?" > $CMK_SPOOLDIR/$CMK_SPOOLFILE
exit 0;
fi

grep $(hostname | cut -d '-' -f2) "$TMP_FILE" > /dev/null
if [ $? != "0" ]; then
echo -e "$CMK_HEADER\n2 $CMK_NAME - Postgres Output does not
seem valid. Please check script $ME and output in $TMP_FILE" >
$CMK_SPOOLDIR/$CMK_SPOOLFILE
exit 1;
fi

# CSV in Array einlesen
IFS=$'\n' read -d '' -r -a input_file < "$TMP_FILE"

# Auswerten
maxlag=0
for i in "${input_file[@]}"; do
node=`echo $i | awk -F "," '{print $1}' | tr -- _ -`
lag=`echo $i | awk -F "," '{print $2}'`
final_output="$final_output$node=$lag;$warn;$crit|"
#
https://unix.stackexchange.com/questions/186663/is-there-a-unix-command-that-gives-the-minimum-maximum-of-two-numbers
maxlag=$(( maxlag > lag ? maxlag : lag ))
done
final_output="${final_output}max-lag=$maxlag;$warn;$crit"

# Letztes Pipe Zeichen rausschneiden
#final_output=`echo $final_output | rev | cut -c 2- | rev`

# Spool File schreiben
echo -e $final_output > $CMK_SPOOLDIR/$CMK_SPOOLFILE
logger -t "$MEBASE" "$final_output"

) 200>/tmp/`basename $0`.exclusivelock

3. During initial sync I check the status on the subscriber. Once it has
synced all tables of the publication, it will send me an email.
#
# Managed by Puppet:
modules/pdns/templates/check_pglogical_subscription.sh.erb
#

#
# This script checks and eventually creates the subscription, and wait
until the initial sync is finished
#

PUB=regdns2020_pub
SLEEP=5
PREFIX=check_pglogical_subscription.sh
NUMTAB=175

SECONDS=0
date
while true; do
echo "SELECT * from pg_subscription;" | sudo -u postgres psql -t
regdns | grep -q $PUB
if [ $? -eq 0 ]; then
echo "OK: Host is subscribed to '$PUB'. Checking for
table count ..."
break
fi
echo "ERROR: Host is not subscribed to '$PUB'. Subscribing to
master ..."
logger -t $PREFIX "ERROR: Host is not subscribed to '$PUB'.
Subscribing to master ..."
echo "CREATE SUBSCRIPTION `hostname -s|tr -- - _` CONNECTION
'host=XXXXX dbname=XXXX user=XXXXX password=XXXXXX PUBLICATION
regdns2020_pub;" | sudo -u postgres psql regdns && touch
/etc/regdns.schema_subscription.created
echo "Re-Checking in $SLEEP seconds ..."
logger -t $PREFIX "Re-Checking in $SLEEP seconds ..."
sleep $SLEEP
done

while true; do
COUNT=$(echo "SELECT count(*) from pg_subscription_rel;" | sudo
-u postgres psql -t regdns | head -1 | xargs)
if [ $COUNT -eq $NUMTAB ]; then
echo "OK: Subscription '$PUB' contains $NUMTAB tables -
that is OK. Checking for initial-sync status ..."
logger -t $PREFIX "OK: Subscription '$PUB' contains
$NUMTAB tables - that is OK. Checking for initial-sync status ..."
break
fi
echo "ERROR: Subscription '$PUB' contains $COUNT tables, but
should contain $NUMTAB table. Re-Checking in $SLEEP seconds ..."
logger -t $PREFIX "ERROR: Subscription '$PUB' contains $COUNT
tables, but should contain $NUMTAB table. Re-Checking in $SLEEP seconds
..."
sleep $SLEEP
done

while true; do
COUNTFIN=$(echo "SELECT count(*) from pg_subscription_rel WHERE
srsubstate='r';" | sudo -u postgres psql -t regdns | head -1 | xargs)
if [ $COUNTFIN -eq $NUMTAB ]; then
echo "OK: Initial sync of $COUNTFIN/$NUMTAB tables
finished in $SECONDS seconds."
logger -t $PREFIX "OK: Initial sync of $COUNTFIN/$NUMTAB
tables finished in $SECONDS seconds."
echo "OK: Initial sync of $COUNTFIN/$NUMTAB tables
finished in $SECONDS seconds." | mailx -s "$HOST $SECONDS seconds to
subscribe" -- root
break
fi
echo "PROGRESS: Initial sync of $COUNTFIN/$NUMTAB tables
finished. Re-Checking in $SLEEP seconds ..."
logger -t $PREFIX "PROGRESS: Initial sync of $COUNTFIN/$NUMTAB
tables finished. Re-Checking in $SLEEP seconds ..."
sleep $SLEEP
done

regards
Klaus

Am 2023-10-07 17:31, schrieb Shaheed Haque:

Show quoted text

Hi,

I've been playing with logical replication (currently on PG14),
specifically in an AWS RDS Postgres context, but NOT using AWS' own
replication tooling. I'm generally familiar with the challenges of
distributed systems (such causality, time synchronisation etc), but
not especially familiar with PG.

In looking at how to tell how a given subscriber has caught up with
its publisher, there is plenty of advice around the Web, for example
https://dba.stackexchange.com/questions/314324/monitor-logical-replication-using-lsn.
Like this example, much advice ends up talking about using separate
queries on the publisher and the subscriber to compare LSNs. First, (I
think) I understand the core difficulty that comparing LSNs is
inherently racy, but given that, I'm a bit unclear as to why a single
query on the publisher is not enough...IIUC:

* Changes sent from the publisher to the subscriber are identified by
LSN.
* The publisher knows it's own current latest LSN
(pg_current_wal_lsn()), but this seems not to be exposed at the
subscriber.
* The subscriber knows what it has applied locally and even tells the
publisher (pg_stat_subscription.latest_end_lsn), but it does not seem
to be exposed at the publisher.

Have I missed something? Is there a way to track the LSN delta (given
that this is known to be racy) just by querying one end?

Second, how do folk "know" when replication is "done". For example, if
the two LSNs continued to match for 1 * replication lag? Or N *
replication lag? What would be a plausible N?

Third, as we know when logical replication is started, the initial
table state is captured in a snapshot, and sent across using COPY
TABLE under the covers. Now, let's say that the publisher is idle
(i.e. no SQL writes to the user's schema...obviously pg_catalog might
change as replication is configured and enabled) and that the
replication starts with the publisher as LSN_start. How could one know
when the copying is done:

* I initially assumed that the publisher's LSN would not change from
LSN_start, but as the copying proceeds, I see that it DOES change
(presumably because there are updates happening to pg_catalog, such as
the temporary slots coming and going).
* Is there some kind of singleton state on either publisher or
subscriber that could be checked to know? (At the moment, I am
counting the records in all copied tables).

I realise that the knowledge that the publisher is "idle" is a special
case, but right now, my test for being "done" is:

* Number of records in copied tables matches AND the publisher's
pg_stat_subscription matches the subscriber's
pg_stat_subscription.latest_end_lsn.

Plus or minus the bit about replication lag, is there a better way?

Thanks, Shaheed

#3Shaheed Haque
shaheedhaque@gmail.com
In reply to: Klaus Darilion (#2)
Re: Monitoring logical replication

This is great, thank you for posting. I'm currently a subcontinent or two
away from my dev env, but will compare your approach with mine (you are
using some facilities of psql I'm not familiar with). At least you have
confirmed that LSNs are the place to start.

Thanks again, Shaheed

On Tue, 30 Jan 2024, 05:15 Klaus Darilion, <klaus.mailinglists@pernau.at>
wrote:

Show quoted text

Hi Saheed!

I monitor our replication this way:

1. Every 10 seconds i fetch the current LSN and write it into a table,
next with the current timestamp. Further I fetch confirmend LSNs from
the replication slots and delete old entries in lsn2data table.

calculate_logical_replication_lag.php:

<?php

$path = realpath(dirname(__FILE__) . "/../inc");
set_include_path($path . PATH_SEPARATOR . get_include_path());

require_once('config.php');
$config_int['syslogprefix'] = basename(__FILE__);
require_once('logging.php');

$dbuser="replication_lag_user";
$dbpass="XXXXXXXXXXXXXXXXXXXX";
if (!$dbconn = pg_pconnect('host='.$config_int['dbhost'].'
dbname='.$config_int['dbname'].' user='.$dbuser.' password='.$dbpass)) {
print "Sorry, database connection failed";
exit;
}

$accuracy = 10; // in seconds

//
// Preparations:
//
// CREATE TABLE lsn2data(
// lsn pg_lsn PRIMARY KEY,
// seen timestamp NOT NULL DEFAULT NOW()
// );
// CREATE ROLE replication_lag_user WITH LOGIN PASSWORD
'XXXXXXXXXXXXXXXXXXX';
// GRANT ALL ON TABLE lsn2data TO replication_lag_user;
//
// CREATE OR REPLACE FUNCTION get_replication_lag() RETURNS TABLE
(subscriber name, lag bigint) AS
// $BODY$
// DECLARE
// subscriber name;
// BEGIN
// FOR subscriber IN
// SELECT slot_name FROM pg_replication_slots
// LOOP
// RETURN QUERY SELECT slot_name, EXTRACT(EPOCH FROM
NOW()-seen)::bigint lag from lsn2data,pg_replication_slots WHERE
slot_name=subscriber AND lsn < confirmed_flush_lsn ORDER BY lsn DESC
LIMIT 1;
// END LOOP;
// RETURN;
// END
// $BODY$
// LANGUAGE plpgsql;
//
while (1) {
$dbq = pg_query("INSERT INTO lsn2data (lsn) VALUES
(pg_current_wal_lsn())");
if ($dbq === FALSE) {
mylog(LOG_ERROR, "SQL query error:
".pg_last_error()."\n");
exit(1);
}

$dbq = pg_query("DELETE FROM lsn2data WHERE lsn < (".
"SELECT lsn FROM lsn2data WHERE lsn < (".
"SELECT confirmed_flush_lsn FROM
pg_replication_slots ORDER BY confirmed_flush_lsn ASC LIMIT 1".
") ORDER BY lsn DESC LIMIT 1".
")"
);
if ($dbq === FALSE) {
mylog(LOG_ERROR, "SQL query error:
".pg_last_error()."\n");
exit(1);
}
sleep($accuracy);
}

2. I graph the replications lags (converted from LSN to seconds) in my
check_mk monitoring:

#!/bin/bash

#
# Managed by Puppet:
modules/base/files/monitoring/check_logical_replication_lag.sh
#
# Check the logical replication lag and export performance data for each
subscriber
#

# exit on error
#set -e

#Make sure this script only runs one at a time
(

ME=$0
MEBASE=`basename $0`

mylog () {
echo "$MEBASE: $1"
logger -t "$MEBASE" "$1"
}

flock -x -w 1 200
if [ $? != "0" ]; then
#echo "ERROR: $0 is already running ... exit"
logger -t "$MEBASE" "ERROR: $0 is already running ... exit"
exit 1
fi

# Do stuff

# Variablen fuer Monitoring
CMK_SPOOLDIR=/var/lib/check_mk_agent/spool
CMK_NAME=$MEBASE
CMK_SPOOLFILE=600_`basename ${CMK_NAME}`.txt
CMK_HEADER="<<<local>>>"
TMP_FILE="/tmp/logical_replication_lag.csv"

# Schwellwerte
warn=300
crit=600

final_output="$CMK_HEADER\nP $CMK_NAME "

# move to a directory where user postgresl may reside (sudo)
cd /tmp

# Lag auslesen. Waehrend dem initialen aufsynchen eines Subscribers gibt
es temporaere Subscriptions, mit dem Namen reg_xxx1-pid-sync-pid.
# Damit diese nicht getrackt werden gibt es die huebsche LIKE Clause.
rm -f "$TMP_FILE"
sudo -u postgres psql regdns -c "COPY (SELECT subscriber,lag FROM
get_replication_lag() WHERE subscriber LIKE '%\_____' ORDER BY 2 DESC)
TO '$TMP_FILE' With CSV" 2>&1> /dev/null
LC=$(sudo -u postgres psql -t regdns -c "SELECT count(*) FROM
get_replication_lag();" | tr -d ' ')

if [ $LC == "0" ]; then
echo -e "$CMK_HEADER\n0 $CMK_NAME - No Slaves with Replication
found - maybe we are a slave?" > $CMK_SPOOLDIR/$CMK_SPOOLFILE
exit 0;
fi

grep $(hostname | cut -d '-' -f2) "$TMP_FILE" > /dev/null
if [ $? != "0" ]; then
echo -e "$CMK_HEADER\n2 $CMK_NAME - Postgres Output does not
seem valid. Please check script $ME and output in $TMP_FILE" >
$CMK_SPOOLDIR/$CMK_SPOOLFILE
exit 1;
fi

# CSV in Array einlesen
IFS=$'\n' read -d '' -r -a input_file < "$TMP_FILE"

# Auswerten
maxlag=0
for i in "${input_file[@]}"; do
node=`echo $i | awk -F "," '{print $1}' | tr -- _ -`
lag=`echo $i | awk -F "," '{print $2}'`
final_output="$final_output$node=$lag;$warn;$crit|"
#

https://unix.stackexchange.com/questions/186663/is-there-a-unix-command-that-gives-the-minimum-maximum-of-two-numbers
maxlag=$(( maxlag > lag ? maxlag : lag ))
done
final_output="${final_output}max-lag=$maxlag;$warn;$crit"

# Letztes Pipe Zeichen rausschneiden
#final_output=`echo $final_output | rev | cut -c 2- | rev`

# Spool File schreiben
echo -e $final_output > $CMK_SPOOLDIR/$CMK_SPOOLFILE
logger -t "$MEBASE" "$final_output"

) 200>/tmp/`basename $0`.exclusivelock

3. During initial sync I check the status on the subscriber. Once it has
synced all tables of the publication, it will send me an email.
#
# Managed by Puppet:
modules/pdns/templates/check_pglogical_subscription.sh.erb
#

#
# This script checks and eventually creates the subscription, and wait
until the initial sync is finished
#

PUB=regdns2020_pub
SLEEP=5
PREFIX=check_pglogical_subscription.sh
NUMTAB=175

SECONDS=0
date
while true; do
echo "SELECT * from pg_subscription;" | sudo -u postgres psql -t
regdns | grep -q $PUB
if [ $? -eq 0 ]; then
echo "OK: Host is subscribed to '$PUB'. Checking for
table count ..."
break
fi
echo "ERROR: Host is not subscribed to '$PUB'. Subscribing to
master ..."
logger -t $PREFIX "ERROR: Host is not subscribed to '$PUB'.
Subscribing to master ..."
echo "CREATE SUBSCRIPTION `hostname -s|tr -- - _` CONNECTION
'host=XXXXX dbname=XXXX user=XXXXX password=XXXXXX PUBLICATION
regdns2020_pub;" | sudo -u postgres psql regdns && touch
/etc/regdns.schema_subscription.created
echo "Re-Checking in $SLEEP seconds ..."
logger -t $PREFIX "Re-Checking in $SLEEP seconds ..."
sleep $SLEEP
done

while true; do
COUNT=$(echo "SELECT count(*) from pg_subscription_rel;" | sudo
-u postgres psql -t regdns | head -1 | xargs)
if [ $COUNT -eq $NUMTAB ]; then
echo "OK: Subscription '$PUB' contains $NUMTAB tables -
that is OK. Checking for initial-sync status ..."
logger -t $PREFIX "OK: Subscription '$PUB' contains
$NUMTAB tables - that is OK. Checking for initial-sync status ..."
break
fi
echo "ERROR: Subscription '$PUB' contains $COUNT tables, but
should contain $NUMTAB table. Re-Checking in $SLEEP seconds ..."
logger -t $PREFIX "ERROR: Subscription '$PUB' contains $COUNT
tables, but should contain $NUMTAB table. Re-Checking in $SLEEP seconds
..."
sleep $SLEEP
done

while true; do
COUNTFIN=$(echo "SELECT count(*) from pg_subscription_rel WHERE
srsubstate='r';" | sudo -u postgres psql -t regdns | head -1 | xargs)
if [ $COUNTFIN -eq $NUMTAB ]; then
echo "OK: Initial sync of $COUNTFIN/$NUMTAB tables
finished in $SECONDS seconds."
logger -t $PREFIX "OK: Initial sync of $COUNTFIN/$NUMTAB
tables finished in $SECONDS seconds."
echo "OK: Initial sync of $COUNTFIN/$NUMTAB tables
finished in $SECONDS seconds." | mailx -s "$HOST $SECONDS seconds to
subscribe" -- root
break
fi
echo "PROGRESS: Initial sync of $COUNTFIN/$NUMTAB tables
finished. Re-Checking in $SLEEP seconds ..."
logger -t $PREFIX "PROGRESS: Initial sync of $COUNTFIN/$NUMTAB
tables finished. Re-Checking in $SLEEP seconds ..."
sleep $SLEEP
done

regards
Klaus

Am 2023-10-07 17:31, schrieb Shaheed Haque:

Hi,

I've been playing with logical replication (currently on PG14),
specifically in an AWS RDS Postgres context, but NOT using AWS' own
replication tooling. I'm generally familiar with the challenges of
distributed systems (such causality, time synchronisation etc), but
not especially familiar with PG.

In looking at how to tell how a given subscriber has caught up with
its publisher, there is plenty of advice around the Web, for example

https://dba.stackexchange.com/questions/314324/monitor-logical-replication-using-lsn
.

Like this example, much advice ends up talking about using separate
queries on the publisher and the subscriber to compare LSNs. First, (I
think) I understand the core difficulty that comparing LSNs is
inherently racy, but given that, I'm a bit unclear as to why a single
query on the publisher is not enough...IIUC:

* Changes sent from the publisher to the subscriber are identified

by

LSN.
* The publisher knows it's own current latest LSN
(pg_current_wal_lsn()), but this seems not to be exposed at the
subscriber.
* The subscriber knows what it has applied locally and even tells

the

publisher (pg_stat_subscription.latest_end_lsn), but it does not seem
to be exposed at the publisher.

Have I missed something? Is there a way to track the LSN delta (given
that this is known to be racy) just by querying one end?

Second, how do folk "know" when replication is "done". For example, if
the two LSNs continued to match for 1 * replication lag? Or N *
replication lag? What would be a plausible N?

Third, as we know when logical replication is started, the initial
table state is captured in a snapshot, and sent across using COPY
TABLE under the covers. Now, let's say that the publisher is idle
(i.e. no SQL writes to the user's schema...obviously pg_catalog might
change as replication is configured and enabled) and that the
replication starts with the publisher as LSN_start. How could one know
when the copying is done:

* I initially assumed that the publisher's LSN would not change

from

LSN_start, but as the copying proceeds, I see that it DOES change
(presumably because there are updates happening to pg_catalog, such as
the temporary slots coming and going).
* Is there some kind of singleton state on either publisher or
subscriber that could be checked to know? (At the moment, I am
counting the records in all copied tables).

I realise that the knowledge that the publisher is "idle" is a special
case, but right now, my test for being "done" is:

* Number of records in copied tables matches AND the publisher's
pg_stat_subscription matches the subscriber's
pg_stat_subscription.latest_end_lsn.

Plus or minus the bit about replication lag, is there a better way?

Thanks, Shaheed

#4Shaheed Haque
shaheedhaque@gmail.com
In reply to: Shaheed Haque (#3)
Re: Monitoring logical replication

Hi all,

Is there an "official" pairing of LSN values on the publication and
subscription sides that should be used to track the delta between the two
systems? I ask because Google is full of different pairs being used. I
tried to identify the highest level interface points exposed, i.e. what is
documented on
https://www.postgresql.org/docs/current/replication-origins.html, the
pg_stat_subscription table, the pg_stat_publication table and the
pg_current_wal_lsn() function on the publisher, but these seem to be barely
used.

Am I missing something?

Thanks, Shaheed

P.S. On a related note, I see a (stalled?) discussion on providing LSN ->
timestamp conversion
</messages/by-id/CAAKRu_bw7Pgw8Mi9LJrBkFvPPHgvVjPphrT8ugbzs-2V0f+1Rw@mail.gmail.com
I'd just like to say that something like that would be very useful.

On Tue, 30 Jan 2024 at 11:27, Shaheed Haque <shaheedhaque@gmail.com> wrote:

Show quoted text

This is great, thank you for posting. I'm currently a subcontinent or two
away from my dev env, but will compare your approach with mine (you are
using some facilities of psql I'm not familiar with). At least you have
confirmed that LSNs are the place to start.

Thanks again, Shaheed

On Tue, 30 Jan 2024, 05:15 Klaus Darilion, <klaus.mailinglists@pernau.at>
wrote:

Hi Saheed!

I monitor our replication this way:

1. Every 10 seconds i fetch the current LSN and write it into a table,
next with the current timestamp. Further I fetch confirmend LSNs from
the replication slots and delete old entries in lsn2data table.

calculate_logical_replication_lag.php:

<?php

$path = realpath(dirname(__FILE__) . "/../inc");
set_include_path($path . PATH_SEPARATOR . get_include_path());

require_once('config.php');
$config_int['syslogprefix'] = basename(__FILE__);
require_once('logging.php');

$dbuser="replication_lag_user";
$dbpass="XXXXXXXXXXXXXXXXXXXX";
if (!$dbconn = pg_pconnect('host='.$config_int['dbhost'].'
dbname='.$config_int['dbname'].' user='.$dbuser.' password='.$dbpass)) {
print "Sorry, database connection failed";
exit;
}

$accuracy = 10; // in seconds

//
// Preparations:
//
// CREATE TABLE lsn2data(
// lsn pg_lsn PRIMARY KEY,
// seen timestamp NOT NULL DEFAULT NOW()
// );
// CREATE ROLE replication_lag_user WITH LOGIN PASSWORD
'XXXXXXXXXXXXXXXXXXX';
// GRANT ALL ON TABLE lsn2data TO replication_lag_user;
//
// CREATE OR REPLACE FUNCTION get_replication_lag() RETURNS TABLE
(subscriber name, lag bigint) AS
// $BODY$
// DECLARE
// subscriber name;
// BEGIN
// FOR subscriber IN
// SELECT slot_name FROM pg_replication_slots
// LOOP
// RETURN QUERY SELECT slot_name, EXTRACT(EPOCH FROM
NOW()-seen)::bigint lag from lsn2data,pg_replication_slots WHERE
slot_name=subscriber AND lsn < confirmed_flush_lsn ORDER BY lsn DESC
LIMIT 1;
// END LOOP;
// RETURN;
// END
// $BODY$
// LANGUAGE plpgsql;
//
while (1) {
$dbq = pg_query("INSERT INTO lsn2data (lsn) VALUES
(pg_current_wal_lsn())");
if ($dbq === FALSE) {
mylog(LOG_ERROR, "SQL query error:
".pg_last_error()."\n");
exit(1);
}

$dbq = pg_query("DELETE FROM lsn2data WHERE lsn < (".
"SELECT lsn FROM lsn2data WHERE lsn < (".
"SELECT confirmed_flush_lsn FROM
pg_replication_slots ORDER BY confirmed_flush_lsn ASC LIMIT 1".
") ORDER BY lsn DESC LIMIT 1".
")"
);
if ($dbq === FALSE) {
mylog(LOG_ERROR, "SQL query error:
".pg_last_error()."\n");
exit(1);
}
sleep($accuracy);
}

2. I graph the replications lags (converted from LSN to seconds) in my
check_mk monitoring:

#!/bin/bash

#
# Managed by Puppet:
modules/base/files/monitoring/check_logical_replication_lag.sh
#
# Check the logical replication lag and export performance data for each
subscriber
#

# exit on error
#set -e

#Make sure this script only runs one at a time
(

ME=$0
MEBASE=`basename $0`

mylog () {
echo "$MEBASE: $1"
logger -t "$MEBASE" "$1"
}

flock -x -w 1 200
if [ $? != "0" ]; then
#echo "ERROR: $0 is already running ... exit"
logger -t "$MEBASE" "ERROR: $0 is already running ... exit"
exit 1
fi

# Do stuff

# Variablen fuer Monitoring
CMK_SPOOLDIR=/var/lib/check_mk_agent/spool
CMK_NAME=$MEBASE
CMK_SPOOLFILE=600_`basename ${CMK_NAME}`.txt
CMK_HEADER="<<<local>>>"
TMP_FILE="/tmp/logical_replication_lag.csv"

# Schwellwerte
warn=300
crit=600

final_output="$CMK_HEADER\nP $CMK_NAME "

# move to a directory where user postgresl may reside (sudo)
cd /tmp

# Lag auslesen. Waehrend dem initialen aufsynchen eines Subscribers gibt
es temporaere Subscriptions, mit dem Namen reg_xxx1-pid-sync-pid.
# Damit diese nicht getrackt werden gibt es die huebsche LIKE Clause.
rm -f "$TMP_FILE"
sudo -u postgres psql regdns -c "COPY (SELECT subscriber,lag FROM
get_replication_lag() WHERE subscriber LIKE '%\_____' ORDER BY 2 DESC)
TO '$TMP_FILE' With CSV" 2>&1> /dev/null
LC=$(sudo -u postgres psql -t regdns -c "SELECT count(*) FROM
get_replication_lag();" | tr -d ' ')

if [ $LC == "0" ]; then
echo -e "$CMK_HEADER\n0 $CMK_NAME - No Slaves with Replication
found - maybe we are a slave?" > $CMK_SPOOLDIR/$CMK_SPOOLFILE
exit 0;
fi

grep $(hostname | cut -d '-' -f2) "$TMP_FILE" > /dev/null
if [ $? != "0" ]; then
echo -e "$CMK_HEADER\n2 $CMK_NAME - Postgres Output does not
seem valid. Please check script $ME and output in $TMP_FILE" >
$CMK_SPOOLDIR/$CMK_SPOOLFILE
exit 1;
fi

# CSV in Array einlesen
IFS=$'\n' read -d '' -r -a input_file < "$TMP_FILE"

# Auswerten
maxlag=0
for i in "${input_file[@]}"; do
node=`echo $i | awk -F "," '{print $1}' | tr -- _ -`
lag=`echo $i | awk -F "," '{print $2}'`
final_output="$final_output$node=$lag;$warn;$crit|"
#

https://unix.stackexchange.com/questions/186663/is-there-a-unix-command-that-gives-the-minimum-maximum-of-two-numbers
maxlag=$(( maxlag > lag ? maxlag : lag ))
done
final_output="${final_output}max-lag=$maxlag;$warn;$crit"

# Letztes Pipe Zeichen rausschneiden
#final_output=`echo $final_output | rev | cut -c 2- | rev`

# Spool File schreiben
echo -e $final_output > $CMK_SPOOLDIR/$CMK_SPOOLFILE
logger -t "$MEBASE" "$final_output"

) 200>/tmp/`basename $0`.exclusivelock

3. During initial sync I check the status on the subscriber. Once it has
synced all tables of the publication, it will send me an email.
#
# Managed by Puppet:
modules/pdns/templates/check_pglogical_subscription.sh.erb
#

#
# This script checks and eventually creates the subscription, and wait
until the initial sync is finished
#

PUB=regdns2020_pub
SLEEP=5
PREFIX=check_pglogical_subscription.sh
NUMTAB=175

SECONDS=0
date
while true; do
echo "SELECT * from pg_subscription;" | sudo -u postgres psql -t
regdns | grep -q $PUB
if [ $? -eq 0 ]; then
echo "OK: Host is subscribed to '$PUB'. Checking for
table count ..."
break
fi
echo "ERROR: Host is not subscribed to '$PUB'. Subscribing to
master ..."
logger -t $PREFIX "ERROR: Host is not subscribed to '$PUB'.
Subscribing to master ..."
echo "CREATE SUBSCRIPTION `hostname -s|tr -- - _` CONNECTION
'host=XXXXX dbname=XXXX user=XXXXX password=XXXXXX PUBLICATION
regdns2020_pub;" | sudo -u postgres psql regdns && touch
/etc/regdns.schema_subscription.created
echo "Re-Checking in $SLEEP seconds ..."
logger -t $PREFIX "Re-Checking in $SLEEP seconds ..."
sleep $SLEEP
done

while true; do
COUNT=$(echo "SELECT count(*) from pg_subscription_rel;" | sudo
-u postgres psql -t regdns | head -1 | xargs)
if [ $COUNT -eq $NUMTAB ]; then
echo "OK: Subscription '$PUB' contains $NUMTAB tables -
that is OK. Checking for initial-sync status ..."
logger -t $PREFIX "OK: Subscription '$PUB' contains
$NUMTAB tables - that is OK. Checking for initial-sync status ..."
break
fi
echo "ERROR: Subscription '$PUB' contains $COUNT tables, but
should contain $NUMTAB table. Re-Checking in $SLEEP seconds ..."
logger -t $PREFIX "ERROR: Subscription '$PUB' contains $COUNT
tables, but should contain $NUMTAB table. Re-Checking in $SLEEP seconds
..."
sleep $SLEEP
done

while true; do
COUNTFIN=$(echo "SELECT count(*) from pg_subscription_rel WHERE
srsubstate='r';" | sudo -u postgres psql -t regdns | head -1 | xargs)
if [ $COUNTFIN -eq $NUMTAB ]; then
echo "OK: Initial sync of $COUNTFIN/$NUMTAB tables
finished in $SECONDS seconds."
logger -t $PREFIX "OK: Initial sync of $COUNTFIN/$NUMTAB
tables finished in $SECONDS seconds."
echo "OK: Initial sync of $COUNTFIN/$NUMTAB tables
finished in $SECONDS seconds." | mailx -s "$HOST $SECONDS seconds to
subscribe" -- root
break
fi
echo "PROGRESS: Initial sync of $COUNTFIN/$NUMTAB tables
finished. Re-Checking in $SLEEP seconds ..."
logger -t $PREFIX "PROGRESS: Initial sync of $COUNTFIN/$NUMTAB
tables finished. Re-Checking in $SLEEP seconds ..."
sleep $SLEEP
done

regards
Klaus

Am 2023-10-07 17:31, schrieb Shaheed Haque:

Hi,

I've been playing with logical replication (currently on PG14),
specifically in an AWS RDS Postgres context, but NOT using AWS' own
replication tooling. I'm generally familiar with the challenges of
distributed systems (such causality, time synchronisation etc), but
not especially familiar with PG.

In looking at how to tell how a given subscriber has caught up with
its publisher, there is plenty of advice around the Web, for example

https://dba.stackexchange.com/questions/314324/monitor-logical-replication-using-lsn
.

Like this example, much advice ends up talking about using separate
queries on the publisher and the subscriber to compare LSNs. First, (I
think) I understand the core difficulty that comparing LSNs is
inherently racy, but given that, I'm a bit unclear as to why a single
query on the publisher is not enough...IIUC:

* Changes sent from the publisher to the subscriber are

identified by

LSN.
* The publisher knows it's own current latest LSN
(pg_current_wal_lsn()), but this seems not to be exposed at the
subscriber.
* The subscriber knows what it has applied locally and even tells

the

publisher (pg_stat_subscription.latest_end_lsn), but it does not seem
to be exposed at the publisher.

Have I missed something? Is there a way to track the LSN delta (given
that this is known to be racy) just by querying one end?

Second, how do folk "know" when replication is "done". For example, if
the two LSNs continued to match for 1 * replication lag? Or N *
replication lag? What would be a plausible N?

Third, as we know when logical replication is started, the initial
table state is captured in a snapshot, and sent across using COPY
TABLE under the covers. Now, let's say that the publisher is idle
(i.e. no SQL writes to the user's schema...obviously pg_catalog might
change as replication is configured and enabled) and that the
replication starts with the publisher as LSN_start. How could one know
when the copying is done:

* I initially assumed that the publisher's LSN would not change

from

LSN_start, but as the copying proceeds, I see that it DOES change
(presumably because there are updates happening to pg_catalog, such as
the temporary slots coming and going).
* Is there some kind of singleton state on either publisher or
subscriber that could be checked to know? (At the moment, I am
counting the records in all copied tables).

I realise that the knowledge that the publisher is "idle" is a special
case, but right now, my test for being "done" is:

* Number of records in copied tables matches AND the publisher's
pg_stat_subscription matches the subscriber's
pg_stat_subscription.latest_end_lsn.

Plus or minus the bit about replication lag, is there a better way?

Thanks, Shaheed

#5Ron
ronljohnsonjr@gmail.com
In reply to: Shaheed Haque (#4)
Re: Monitoring logical replication

On Tue, Jun 18, 2024 at 5:03 AM Shaheed Haque <shaheedhaque@gmail.com>
wrote:

Hi all,

Is there an "official" pairing of LSN values on the publication and
subscription sides that should be used to track the delta between the two
systems? I ask because Google is full of different pairs being used. I
tried to identify the highest level interface points exposed, i.e. what is
documented on
https://www.postgresql.org/docs/current/replication-origins.html, the
pg_stat_subscription table, the pg_stat_publication table and the
pg_current_wal_lsn() function on the publisher, but these seem to be barely
used.

The attached scripts (whose guts I took from a Stack Exchange post) might
be a good starting point. It certainly works for physical replication!

P.S. On a related note, I see a (stalled?) discussion on providing LSN ->
timestamp conversion
</messages/by-id/CAAKRu_bw7Pgw8Mi9LJrBkFvPPHgvVjPphrT8ugbzs-2V0f+1Rw@mail.gmail.com
I'd just like to say that something like that would be very useful.

Out of curiosity, how does that work? Is an instance's initial LSN really
based on Epoch?

Attachments:

physical_backlog.shtext/x-sh; charset=US-ASCII; name=physical_backlog.shDownload
#6Shaheed Haque
shaheedhaque@gmail.com
In reply to: Ron (#5)
Re: Monitoring logical replication

Mostly to close the loop on this, now that I have things going seemingly
reliably...

On Tue, 18 Jun 2024 at 14:33, Ron Johnson <ronljohnsonjr@gmail.com> wrote:

On Tue, Jun 18, 2024 at 5:03 AM Shaheed Haque <shaheedhaque@gmail.com>
wrote:

Hi all,

Is there an "official" pairing of LSN values on the publication and
subscription sides that should be used to track the delta between the two
systems? I ask because Google is full of different pairs being used. I
tried to identify the highest level interface points exposed, i.e. what is
documented on
https://www.postgresql.org/docs/current/replication-origins.html, the
pg_stat_subscription table, the pg_stat_publication table and the
pg_current_wal_lsn() function on the publisher, but these seem to be barely
used.

The attached scripts (whose guts I took from a Stack Exchange post) might
be a good starting point. It certainly works for physical replication!

P.S. On a related note, I see a (stalled?) discussion on providing LSN
-> timestamp conversion
</messages/by-id/CAAKRu_bw7Pgw8Mi9LJrBkFvPPHgvVjPphrT8ugbzs-2V0f+1Rw@mail.gmail.com
I'd just like to say that something like that would be very useful.

Out of curiosity, how does that work? Is an instance's initial LSN really
based on Epoch?

According to the docs at
https://www.postgresql.org/docs/current/datatype-pg-lsn.html, the LSN is "a
64-bit integer, representing a byte position in the write-ahead log
stream", so I guess some black magic is required to turn that into a
timestamp.

My use case might not be all that common, as I am NOT interested in a
long-term replica. What I am doing is making a copy of an "old" Django
deployment to a "new" deployment such that when the two ends are in close
sync, I can freeze traffic to the old deployment, pause for any final
catchup, and then run a Django migration on the new, before switching on
the new (thereby minimising the down time for the app). That being said:

- It turns out that one cannot use the LSN alone to check for sync
since, during the initial "full table copying" phase of the replication,
the LSNs at the two ends seem to be the same.
- Instead, I track three metrics on each end:
- On the publication end:
- "count(*)" for each table
- pg_current_wal_lsn()
- tuples_processed from pg_stat_progress_copy for each table
- On the subscription end:
- "count(*)" for each table
- latest_end_lsn from pg_stat_subscription
- tuples_processed from pg_stat_progress_copy for each table
- The sync is considered complete when all three metrics are aligned.
- I then freeze activity on the "old" deployment, wait for any in-flight
metric changes to come through, and then run the migration on the "new"
deployment.

As always perhaps that will be of use to somebody, but YMMV.

Thanks all,

Shaheed