Correct query to check streaming replication lag
Hello All,
Can anyone please tell me which of the following is the correct replication
lag query to find streaming replication lag in seconds?
1. SELECT extract(seconds from (now() - pg_last_xact_replay_timestamp()))
AS time_lag;
2. SELECT CASE WHEN pg_last_xlog_receive_location() =
pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() -
pg_last_xact_replay_timestamp()) END AS log_delay;
Regards,
Granthana
On Tue, Jan 14, 2014 at 2:31 PM, Granthana Biswas <granthana@zedo.com>wrote:
Can anyone please tell me which of the following is the correct
replication lag query to find streaming replication lag in seconds?IMHO none is 'correct'. :-)
1. SELECT extract(seconds from (now() - pg_last_xact_replay_timestamp()))
AS time_lag;
This is the difference between now and last replayed log. What if the last
transaction on master has happened a few minutes ago?
2. SELECT CASE WHEN pg_last_xlog_receive_location() =
pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() -
pg_last_xact_replay_timestamp()) END AS log_delay;
For reason same as above, this won't be exact but the most accurate you can
get.
Best Regards,
*Sameer Kumar | Database Consultant*
*ASHNIK PTE. LTD.*101 Cecil Street, #11-11 Tong Eng Building, Singapore
069533
M : *+65 8110 0350 <%2B65%208110%200350>* T: +65 6438 3504 | www.ashnik.com
www.facebook.com/ashnikbiz | www.twitter.com/ashnikbiz
[image: email patch]
This email may contain confidential, privileged or copyright material and
is solely for the use of the intended recipient(s).
Attachments:
image002.jpgimage/jpeg; name=image002.jpgDownload+1-0
Thank you Sameer for your reply. Is there any other query that would help
get exact replication lag?
Regards,
Granthana
Regards,
Granthana
On Fri, Jan 17, 2014 at 2:46 PM, Sameer Kumar <sameer.kumar@ashnik.com>wrote:
Show quoted text
On Tue, Jan 14, 2014 at 2:31 PM, Granthana Biswas <granthana@zedo.com>wrote:
Can anyone please tell me which of the following is the correct
replication lag query to find streaming replication lag in seconds?IMHO none is 'correct'. :-)
1. SELECT extract(seconds from (now() - pg_last_xact_replay_timestamp()))
AS time_lag;
This is the difference between now and last replayed log. What if the
last transaction on master has happened a few minutes ago?
2. SELECT CASE WHEN pg_last_xlog_receive_location() =
pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() -
pg_last_xact_replay_timestamp()) END AS log_delay;For reason same as above, this won't be exact but the most accurate you
can get.Best Regards,
*Sameer Kumar | Database Consultant**ASHNIK PTE. LTD.*101 Cecil Street, #11-11 Tong Eng Building, Singapore
069533
M : *+65 8110 0350 <%2B65%208110%200350>* T: +65 6438 3504 |
www.ashnik.com
www.facebook.com/ashnikbiz | www.twitter.com/ashnikbiz[image: email patch]
This email may contain confidential, privileged or copyright material and
is solely for the use of the intended recipient(s).
Attachments:
image002.jpgimage/jpeg; name=image002.jpgDownload+1-0
On Fri, Jan 17, 2014 at 5:31 PM, Granthana Biswas <granthana@zedo.com>wrote:
Thank you Sameer for your reply. Is there any other query that would help
get exact replication lag?
You 2nd Query is the most accurate you can get.
2. SELECT CASE WHEN pg_last_xlog_receive_location() =
pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() -
pg_last_xact_replay_timestamp()) END AS log_delay;For reason same as above, this won't be exact but the most accurate you
can get.You can get lags in terms of bytes (and I have always believed that
makes more sense).
May I ask, why are you looking for this info? Is it purely monitoring?
Regards
Sameer
Ashnik Pte Ltd.
Yes it's purely for monitoring purpose.
Regards,
Granthana
On Fri, Jan 17, 2014 at 3:29 PM, Sameer Kumar <sameer.kumar@ashnik.com>wrote:
Show quoted text
On Fri, Jan 17, 2014 at 5:31 PM, Granthana Biswas <granthana@zedo.com>wrote:
Thank you Sameer for your reply. Is there any other query that would help
get exact replication lag?You 2nd Query is the most accurate you can get.
2. SELECT CASE WHEN pg_last_xlog_receive_location() =
pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() -
pg_last_xact_replay_timestamp()) END AS log_delay;For reason same as above, this won't be exact but the most accurate you
can get.You can get lags in terms of bytes (and I have always believed that
makes more sense).
May I ask, why are you looking for this info? Is it purely monitoring?
Regards
Sameer
Ashnik Pte Ltd.
On Jan 17, 2014, at 5:07 AM, Granthana Biswas <granthana@zedo.com> wrote:
Yes it's purely for monitoring purpose.
I use the pg_controldata cmd locally and via bash/ssh shared keys and compare various values that seem interesting such as "Time of latest checkpoint, Latest checkpoint location." My interest is recoverability and checkpoints seemed relevant at the time.
I found a comment in the docs:
http://www.postgresql.org/docs/9.2/static/functions-admin.html
"pg_xlog_location_diff calculates the difference in bytes between two transaction log locations. It can be used with pg_stat_replication or some functions shown in Table 9-59 to get the replication lag."
and
"The functions shown in Table 9-60 provide information about the current status of the standby. These functions may be executed both during recovery and in normal running."
These look interesting wrt lag studies and seem to work on the stby:
template1=# select pg_last_xlog_receive_location();
pg_last_xlog_receive_location
-------------------------------
18/9E000000
(1 row)
template1=# select pg_last_xlog_replay_location();
pg_last_xlog_replay_location
------------------------------
18/9E000000
(1 row)
Well in that case monitoring pending wal bytes would make more sense.
Regards
Sameer
PS: Sent from my Mobile device. Pls ignore typo n abb
Yes we already do that. Count the number of ready wal files.
Regards,
Granthana
On Sat, Jan 18, 2014 at 9:39 PM, Sameer Kumar <sameer.kumar@ashnik.com>wrote:
Show quoted text
Well in that case monitoring pending wal bytes would make more sense.
Regards
SameerPS: Sent from my Mobile device. Pls ignore typo n abb
On Mon, Jan 20, 2014 at 1:53 PM, Granthana Biswas <granthana@zedo.com>wrote:
Yes we already do that. Count the number of ready wal files.
I guess a better place to check would be pg_stat_replication
Check this discussion:
/messages/by-id/4F13ED11.6080001@gmail.com
Another way is explained in wiki:
http://wiki.postgresql.org/wiki/Streaming_Replication
$ psql -c "SELECT pg_current_xlog_location()" -h192.168.0.10 (primary host)
pg_current_xlog_location
--------------------------
0/2000000
(1 row)
$ psql -c "select pg_last_xlog_receive_location()" -h192.168.0.20 (standby host)
pg_last_xlog_receive_location
-------------------------------
0/2000000
(1 row)
$ psql -c "select pg_last_xlog_replay_location()" -h192.168.0.20 (standby host)
pg_last_xlog_replay_location
------------------------------
0/2000000
(1 row)
Best Regards,
*Sameer Kumar | Database Consultant*
*ASHNIK PTE. LTD.*101 Cecil Street, #11-11 Tong Eng Building, Singapore
069533
M : *+65 8110 0350* T: +65 6438 3504 | www.ashnik.com
www.facebook.com/ashnikbiz | www.twitter.com/ashnikbiz
[image: email patch]
This email may contain confidential, privileged or copyright material and
is solely for the use of the intended recipient(s).
Attachments:
image002.jpgimage/jpeg; name=image002.jpgDownload+1-0
On Tue, Jan 21, 2014 at 12:41 PM, Sameer Kumar <sameer.kumar@ashnik.com>wrote:
On Mon, Jan 20, 2014 at 1:53 PM, Granthana Biswas <granthana@zedo.com>wrote:
Yes we already do that. Count the number of ready wal files.
I guess a better place to check would be pg_stat_replication
Check this discussion:
/messages/by-id/4F13ED11.6080001@gmail.comAnother way is explained in wiki:
http://wiki.postgresql.org/wiki/Streaming_Replication$ psql -c "SELECT pg_current_xlog_location()" -h192.168.0.10 (primary host)
pg_current_xlog_location
--------------------------
0/2000000
(1 row)$ psql -c "select pg_last_xlog_receive_location()" -h192.168.0.20 (standby host)
pg_last_xlog_receive_location
-------------------------------
0/2000000
(1 row)$ psql -c "select pg_last_xlog_replay_location()" -h192.168.0.20 (standby host)
pg_last_xlog_replay_location
------------------------------
0/2000000
(1 row)Mind you, here is a simple suggestion:
SELECT application_name, pg_xlog_location_diff(sent_location,
flush_location) AS replay_delta, sync_priority, sync_state FROM
pg_stat_replication ORDER BY replay_delta ASC, application_name;
Regards
--
Michael
On Tue, Jan 21, 2014 at 12:12 PM, Michael Paquier <michael.paquier@gmail.com
wrote:
Mind you, here is a simple suggestion:
SELECT application_name, pg_xlog_location_diff(sent_location,
flush_location) AS replay_delta, sync_priority, sync_state FROM
pg_stat_replication ORDER BY replay_delta ASC, application_name;
This is much more simple, but I guess would work only with v9.2 and higher
(?).
Best Regards,
*Sameer Kumar | Database Consultant*
*ASHNIK PTE. LTD. *101 Cecil Street, #11-11 Tong Eng Building, Singapore
069533
M : *+65 8110 0350* T: +65 6438 3504 | www.ashnik.com
www.facebook.com/ashnikbiz | www.twitter.com/ashnikbiz
[image: email patch]
This email may contain confidential, privileged or copyright material and
is solely for the use of the intended recipient(s).
Attachments:
image002.jpgimage/jpeg; name=image002.jpgDownload+1-0
On Tue, Jan 21, 2014 at 1:30 PM, Sameer Kumar <sameer.kumar@ashnik.com>wrote:
On Tue, Jan 21, 2014 at 12:12 PM, Michael Paquier <
michael.paquier@gmail.com> wrote:Mind you, here is a simple suggestion:
SELECT application_name, pg_xlog_location_diff(sent_location,
flush_location) AS replay_delta, sync_priority, sync_state FROM
pg_stat_replication ORDER BY replay_delta ASC, application_name;This is much more simple, but I guess would work only with v9.2 and higher
(?).
Yep, pg_xlog_location_diff is in core since 9.2.
--
Michael
Hi Ray,
We are already using the following query:
SELECT CASE WHEN pg_last_xlog_receive_location(
) = pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() -
pg_last_xact_replay_timestamp()) END AS log_delay;
We cannot use pg_xlog_location_diff as we use postgresql 9.1.
Regards,
Granthana
On Fri, Jan 17, 2014 at 8:24 PM, Ray Stell <stellr@vt.edu> wrote:
Show quoted text
On Jan 17, 2014, at 5:07 AM, Granthana Biswas <granthana@zedo.com> wrote:
Yes it's purely for monitoring purpose.
I use the pg_controldata cmd locally and via bash/ssh shared keys and
compare various values that seem interesting such as "Time of latest
checkpoint, Latest checkpoint location." My interest is recoverability and
checkpoints seemed relevant at the time.I found a comment in the docs:
http://www.postgresql.org/docs/9.2/static/functions-admin.html
"pg_xlog_location_diff calculates the difference in bytes between two
transaction log locations. It can be used with pg_stat_replication or
some functions shown in Table 9-59<http://www.postgresql.org/docs/9.2/static/functions-admin.html#FUNCTIONS-ADMIN-BACKUP-TABLE>to get the replication lag."and
"The functions shown in Table 9-60<http://www.postgresql.org/docs/9.2/static/functions-admin.html#FUNCTIONS-RECOVERY-INFO-TABLE>provide information about the current status of the standby. These
functions may be executed both during recovery and in normal running."
These look interesting wrt lag studies and seem to work on the stby:template1=# select pg_last_xlog_receive_location();
pg_last_xlog_receive_location
-------------------------------
18/9E000000
(1 row)template1=# select pg_last_xlog_replay_location();
pg_last_xlog_replay_location
------------------------------
18/9E000000
(1 row)
We are already using the following query:
SELECT CASE WHEN pg_last_xlog_receive_location(
) = pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() -
pg_last_xact_replay_timestamp()) END AS log_delay;This is (delay) not the correct thing to monitor.
We cannot use pg_xlog_location_diff as we use postgresql 9.1.
You can still use the other two methods I mentioned.
Regards
Sameer
Ashnik Pte Ltd, Singapore
Yes byte lag as well as it makes more sense.
Regards,
Granthana
Regards,
Granthana
On Tue, Jan 21, 2014 at 11:03 AM, Sameer Kumar <sameer.kumar@ashnik.com>wrote:
Show quoted text
We are already using the following query:
SELECT CASE WHEN pg_last_xlog_receive_location(
) = pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now()
- pg_last_xact_replay_timestamp()) END AS log_delay;This is (delay) not the correct thing to monitor.
We cannot use pg_xlog_location_diff as we use postgresql 9.1.
You can still use the other two methods I mentioned.
Regards
Sameer
Ashnik Pte Ltd, Singapore
On Tue, Jan 21, 2014 at 2:33 PM, Sameer Kumar <sameer.kumar@ashnik.com>
wrote:
We are already using the following query:
SELECT CASE WHEN pg_last_xlog_receive_location(
) = pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now()
-
pg_last_xact_replay_timestamp()) END AS log_delay;
This is (delay) not the correct thing to monitor.
We cannot use pg_xlog_location_diff as we use postgresql 9.1.
You can still use the other two methods I mentioned.
FYI, here is an equivalent written in plpgsql easily findable by googling a
bit, making a pg_xlog_location_diff-like function usable even in 9.1 and
9.0 servers:
CREATE OR REPLACE FUNCTION pg_xlog_location_diff_sql( text, text)
RETURNS numeric
LANGUAGE plpgsql
AS
$function$
DECLARE
offset1 text;
offset2 text;
xlog1 text;
xlog2 text;
SQL text;
diff text;
BEGIN
/* Extract the Offset and xlog from input in
offset and xlog variables */
offset1=split_part($1,'/',2);
xlog1=split_part($1,'/',1);
offset2=split_part($2,'/',2);
xlog2=split_part($2,'/',1);
/* Prepare SQL query for calculation based on following formula
(FF000000 * xlog + offset) - (FF000000 * xlog + offset)
which gives value in hexadecimal. Since, hexadecimal calculation
is cumbersome
so convert into decimal and then calculate the difference */
SQL='SELECT (x'''||'FF000000'||'''::bigint * x'''||xlog1||'''::bigint
+ x'''||offset1||'''::bigint)'||'
-
(x'''||'FF000000'||'''::bigint * x'''||xlog2||'''::bigint
+ x'''||offset2||'''::bigint)';
EXECUTE SQL into diff;
/* Return the value in numeric by explicit casting */
RETURN diff::numeric;
END;
$function$;
Source:
http://vibhorkumar.wordpress.com/2013/02/18/pg_xlog_location_diff-function-for-postgreqsqlppas/
--
Michael
Thanks a load Michael. This is really helpful.
Regards,
Granthana
On Tue, Jan 21, 2014 at 12:19 PM, Michael Paquier <michael.paquier@gmail.com
Show quoted text
wrote:
On Tue, Jan 21, 2014 at 2:33 PM, Sameer Kumar <sameer.kumar@ashnik.com>
wrote:We are already using the following query:
SELECT CASE WHEN pg_last_xlog_receive_location(
) = pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROMnow() -
pg_last_xact_replay_timestamp()) END AS log_delay;
This is (delay) not the correct thing to monitor.
We cannot use pg_xlog_location_diff as we use postgresql 9.1.
You can still use the other two methods I mentioned.
FYI, here is an equivalent written in plpgsql easily findable by googling
a bit, making a pg_xlog_location_diff-like function usable even in 9.1 and
9.0 servers:
CREATE OR REPLACE FUNCTION pg_xlog_location_diff_sql( text, text)
RETURNS numeric
LANGUAGE plpgsql
AS
$function$
DECLARE
offset1 text;
offset2 text;
xlog1 text;
xlog2 text;
SQL text;
diff text;
BEGIN
/* Extract the Offset and xlog from input in
offset and xlog variables */offset1=split_part($1,'/',2);
xlog1=split_part($1,'/',1);
offset2=split_part($2,'/',2);
xlog2=split_part($2,'/',1);/* Prepare SQL query for calculation based on following formula
(FF000000 * xlog + offset) - (FF000000 * xlog + offset)
which gives value in hexadecimal. Since, hexadecimal calculation
is cumbersome
so convert into decimal and then calculate the difference */SQL='SELECT (x'''||'FF000000'||'''::bigint *
x'''||xlog1||'''::bigint
+ x'''||offset1||'''::bigint)'||'
-
(x'''||'FF000000'||'''::bigint *
x'''||xlog2||'''::bigint
+ x'''||offset2||'''::bigint)';
EXECUTE SQL into diff;/* Return the value in numeric by explicit casting */
RETURN diff::numeric;
END;
$function$;Source:
http://vibhorkumar.wordpress.com/2013/02/18/pg_xlog_location_diff-function-for-postgreqsqlppas/
--
Michael