Correct query to check streaming replication lag

Started by Granthana Biswasabout 12 years ago17 messagesgeneral
Jump to latest
#1Granthana Biswas
granthana@zedo.com

Hello All,

Can anyone please tell me which of the following is the correct replication
lag query to find streaming replication lag in seconds?

1. SELECT extract(seconds from (now() - pg_last_xact_replay_timestamp()))
AS time_lag;

2. SELECT CASE WHEN pg_last_xlog_receive_location() =
pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() -
pg_last_xact_replay_timestamp()) END AS log_delay;

Regards,
Granthana

#2Sameer Kumar
sameer.kumar@ashnik.com
In reply to: Granthana Biswas (#1)
Re: Correct query to check streaming replication lag

On Tue, Jan 14, 2014 at 2:31 PM, Granthana Biswas <granthana@zedo.com>wrote:

Can anyone please tell me which of the following is the correct
replication lag query to find streaming replication lag in seconds?

IMHO none is 'correct'. :-)

1. SELECT extract(seconds from (now() - pg_last_xact_replay_timestamp()))

AS time_lag;

This is the difference between now and last replayed log. What if the last

transaction on master has happened a few minutes ago?

2. SELECT CASE WHEN pg_last_xlog_receive_location() =
pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() -
pg_last_xact_replay_timestamp()) END AS log_delay;

For reason same as above, this won't be exact but the most accurate you can
get.

Best Regards,
*Sameer Kumar | Database Consultant*

*ASHNIK PTE. LTD.*101 Cecil Street, #11-11 Tong Eng Building, Singapore
069533
M : *+65 8110 0350 <%2B65%208110%200350>* T: +65 6438 3504 | www.ashnik.com
www.facebook.com/ashnikbiz | www.twitter.com/ashnikbiz

[image: email patch]

This email may contain confidential, privileged or copyright material and
is solely for the use of the intended recipient(s).

Attachments:

image002.jpgimage/jpeg; name=image002.jpgDownload+1-0
#3Granthana Biswas
granthana@zedo.com
In reply to: Sameer Kumar (#2)
Re: Correct query to check streaming replication lag

Thank you Sameer for your reply. Is there any other query that would help
get exact replication lag?

Regards,
Granthana

Regards,
Granthana

On Fri, Jan 17, 2014 at 2:46 PM, Sameer Kumar <sameer.kumar@ashnik.com>wrote:

Show quoted text

On Tue, Jan 14, 2014 at 2:31 PM, Granthana Biswas <granthana@zedo.com>wrote:

Can anyone please tell me which of the following is the correct
replication lag query to find streaming replication lag in seconds?

IMHO none is 'correct'. :-)

1. SELECT extract(seconds from (now() - pg_last_xact_replay_timestamp()))

AS time_lag;

This is the difference between now and last replayed log. What if the

last transaction on master has happened a few minutes ago?

2. SELECT CASE WHEN pg_last_xlog_receive_location() =
pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() -
pg_last_xact_replay_timestamp()) END AS log_delay;

For reason same as above, this won't be exact but the most accurate you
can get.

Best Regards,
*Sameer Kumar | Database Consultant*

*ASHNIK PTE. LTD.*101 Cecil Street, #11-11 Tong Eng Building, Singapore
069533
M : *+65 8110 0350 <%2B65%208110%200350>* T: +65 6438 3504 |
www.ashnik.com
www.facebook.com/ashnikbiz | www.twitter.com/ashnikbiz

[image: email patch]

This email may contain confidential, privileged or copyright material and
is solely for the use of the intended recipient(s).

Attachments:

image002.jpgimage/jpeg; name=image002.jpgDownload+1-0
#4Sameer Kumar
sameer.kumar@ashnik.com
In reply to: Granthana Biswas (#3)
Re: Correct query to check streaming replication lag

On Fri, Jan 17, 2014 at 5:31 PM, Granthana Biswas <granthana@zedo.com>wrote:

Thank you Sameer for your reply. Is there any other query that would help
get exact replication lag?

You 2nd Query is the most accurate you can get.

2. SELECT CASE WHEN pg_last_xlog_receive_location() =

pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() -
pg_last_xact_replay_timestamp()) END AS log_delay;

For reason same as above, this won't be exact but the most accurate you
can get.

You can get lags in terms of bytes (and I have always believed that

makes more sense).

May I ask, why are you looking for this info? Is it purely monitoring?

Regards
Sameer
Ashnik Pte Ltd.

#5Granthana Biswas
granthana@zedo.com
In reply to: Sameer Kumar (#4)
Re: Correct query to check streaming replication lag

Yes it's purely for monitoring purpose.

Regards,
Granthana

On Fri, Jan 17, 2014 at 3:29 PM, Sameer Kumar <sameer.kumar@ashnik.com>wrote:

Show quoted text

On Fri, Jan 17, 2014 at 5:31 PM, Granthana Biswas <granthana@zedo.com>wrote:

Thank you Sameer for your reply. Is there any other query that would help
get exact replication lag?

You 2nd Query is the most accurate you can get.

2. SELECT CASE WHEN pg_last_xlog_receive_location() =

pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() -
pg_last_xact_replay_timestamp()) END AS log_delay;

For reason same as above, this won't be exact but the most accurate you
can get.

You can get lags in terms of bytes (and I have always believed that

makes more sense).

May I ask, why are you looking for this info? Is it purely monitoring?

Regards
Sameer
Ashnik Pte Ltd.

#6Ray Stell
stellr@vt.edu
In reply to: Granthana Biswas (#5)
Re: Correct query to check streaming replication lag

On Jan 17, 2014, at 5:07 AM, Granthana Biswas <granthana@zedo.com> wrote:

Yes it's purely for monitoring purpose.

I use the pg_controldata cmd locally and via bash/ssh shared keys and compare various values that seem interesting such as "Time of latest checkpoint, Latest checkpoint location." My interest is recoverability and checkpoints seemed relevant at the time.

I found a comment in the docs:

http://www.postgresql.org/docs/9.2/static/functions-admin.html

"pg_xlog_location_diff calculates the difference in bytes between two transaction log locations. It can be used with pg_stat_replication or some functions shown in Table 9-59 to get the replication lag."

and
"The functions shown in Table 9-60 provide information about the current status of the standby. These functions may be executed both during recovery and in normal running."

These look interesting wrt lag studies and seem to work on the stby:

template1=# select pg_last_xlog_receive_location();
pg_last_xlog_receive_location
-------------------------------
18/9E000000
(1 row)

template1=# select pg_last_xlog_replay_location();
pg_last_xlog_replay_location
------------------------------
18/9E000000
(1 row)

#7Sameer Kumar
sameer.kumar@ashnik.com
In reply to: Granthana Biswas (#5)
Re: Correct query to check streaming replication lag

Well in that case monitoring pending wal bytes would make more sense.

Regards
Sameer

PS: Sent from my Mobile device. Pls ignore typo n abb

#8Granthana Biswas
granthana@zedo.com
In reply to: Sameer Kumar (#7)
Re: Correct query to check streaming replication lag

Yes we already do that. Count the number of ready wal files.

Regards,
Granthana

On Sat, Jan 18, 2014 at 9:39 PM, Sameer Kumar <sameer.kumar@ashnik.com>wrote:

Show quoted text

Well in that case monitoring pending wal bytes would make more sense.

Regards
Sameer

PS: Sent from my Mobile device. Pls ignore typo n abb

#9Sameer Kumar
sameer.kumar@ashnik.com
In reply to: Granthana Biswas (#8)
Re: Correct query to check streaming replication lag

On Mon, Jan 20, 2014 at 1:53 PM, Granthana Biswas <granthana@zedo.com>wrote:

Yes we already do that. Count the number of ready wal files.

I guess a better place to check would be pg_stat_replication

Check this discussion:
/messages/by-id/4F13ED11.6080001@gmail.com

Another way is explained in wiki:
http://wiki.postgresql.org/wiki/Streaming_Replication

$ psql -c "SELECT pg_current_xlog_location()" -h192.168.0.10 (primary host)
pg_current_xlog_location
--------------------------
0/2000000
(1 row)

$ psql -c "select pg_last_xlog_receive_location()" -h192.168.0.20 (standby host)
pg_last_xlog_receive_location
-------------------------------
0/2000000
(1 row)

$ psql -c "select pg_last_xlog_replay_location()" -h192.168.0.20 (standby host)
pg_last_xlog_replay_location
------------------------------
0/2000000
(1 row)

Best Regards,
*Sameer Kumar | Database Consultant*

*ASHNIK PTE. LTD.*101 Cecil Street, #11-11 Tong Eng Building, Singapore
069533
M : *+65 8110 0350* T: +65 6438 3504 | www.ashnik.com
www.facebook.com/ashnikbiz | www.twitter.com/ashnikbiz

[image: email patch]

This email may contain confidential, privileged or copyright material and
is solely for the use of the intended recipient(s).

Attachments:

image002.jpgimage/jpeg; name=image002.jpgDownload+1-0
#10Michael Paquier
michael@paquier.xyz
In reply to: Sameer Kumar (#9)
Re: Correct query to check streaming replication lag

On Tue, Jan 21, 2014 at 12:41 PM, Sameer Kumar <sameer.kumar@ashnik.com>wrote:

On Mon, Jan 20, 2014 at 1:53 PM, Granthana Biswas <granthana@zedo.com>wrote:

Yes we already do that. Count the number of ready wal files.

I guess a better place to check would be pg_stat_replication

Check this discussion:
/messages/by-id/4F13ED11.6080001@gmail.com

Another way is explained in wiki:
http://wiki.postgresql.org/wiki/Streaming_Replication

$ psql -c "SELECT pg_current_xlog_location()" -h192.168.0.10 (primary host)
pg_current_xlog_location
--------------------------
0/2000000
(1 row)

$ psql -c "select pg_last_xlog_receive_location()" -h192.168.0.20 (standby host)
pg_last_xlog_receive_location
-------------------------------
0/2000000
(1 row)

$ psql -c "select pg_last_xlog_replay_location()" -h192.168.0.20 (standby host)
pg_last_xlog_replay_location
------------------------------
0/2000000
(1 row)

Mind you, here is a simple suggestion:

SELECT application_name, pg_xlog_location_diff(sent_location,
flush_location) AS replay_delta, sync_priority, sync_state FROM
pg_stat_replication ORDER BY replay_delta ASC, application_name;
Regards
--
Michael

#11Sameer Kumar
sameer.kumar@ashnik.com
In reply to: Michael Paquier (#10)
Re: Correct query to check streaming replication lag

On Tue, Jan 21, 2014 at 12:12 PM, Michael Paquier <michael.paquier@gmail.com

wrote:

Mind you, here is a simple suggestion:
SELECT application_name, pg_xlog_location_diff(sent_location,
flush_location) AS replay_delta, sync_priority, sync_state FROM
pg_stat_replication ORDER BY replay_delta ASC, application_name;

This is much more simple, but I guess would work only with v9.2 and higher
(?).

Best Regards,
*Sameer Kumar | Database Consultant*

*ASHNIK PTE. LTD. *101 Cecil Street, #11-11 Tong Eng Building, Singapore
069533
M : *+65 8110 0350* T: +65 6438 3504 | www.ashnik.com
www.facebook.com/ashnikbiz | www.twitter.com/ashnikbiz

[image: email patch]

This email may contain confidential, privileged or copyright material and
is solely for the use of the intended recipient(s).

Attachments:

image002.jpgimage/jpeg; name=image002.jpgDownload+1-0
#12Michael Paquier
michael@paquier.xyz
In reply to: Sameer Kumar (#11)
Re: Correct query to check streaming replication lag

On Tue, Jan 21, 2014 at 1:30 PM, Sameer Kumar <sameer.kumar@ashnik.com>wrote:

On Tue, Jan 21, 2014 at 12:12 PM, Michael Paquier <
michael.paquier@gmail.com> wrote:

Mind you, here is a simple suggestion:
SELECT application_name, pg_xlog_location_diff(sent_location,
flush_location) AS replay_delta, sync_priority, sync_state FROM
pg_stat_replication ORDER BY replay_delta ASC, application_name;

This is much more simple, but I guess would work only with v9.2 and higher
(?).

Yep, pg_xlog_location_diff is in core since 9.2.
--
Michael

#13Granthana Biswas
granthana@zedo.com
In reply to: Ray Stell (#6)
Re: Correct query to check streaming replication lag

Hi Ray,

We are already using the following query:

SELECT CASE WHEN pg_last_xlog_receive_location(
) = pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() -
pg_last_xact_replay_timestamp()) END AS log_delay;

We cannot use pg_xlog_location_diff as we use postgresql 9.1.

Regards,
Granthana

On Fri, Jan 17, 2014 at 8:24 PM, Ray Stell <stellr@vt.edu> wrote:

Show quoted text

On Jan 17, 2014, at 5:07 AM, Granthana Biswas <granthana@zedo.com> wrote:

Yes it's purely for monitoring purpose.

I use the pg_controldata cmd locally and via bash/ssh shared keys and
compare various values that seem interesting such as "Time of latest
checkpoint, Latest checkpoint location." My interest is recoverability and
checkpoints seemed relevant at the time.

I found a comment in the docs:

http://www.postgresql.org/docs/9.2/static/functions-admin.html

"pg_xlog_location_diff calculates the difference in bytes between two
transaction log locations. It can be used with pg_stat_replication or
some functions shown in Table 9-59<http://www.postgresql.org/docs/9.2/static/functions-admin.html#FUNCTIONS-ADMIN-BACKUP-TABLE&gt;to get the replication lag."

and
"The functions shown in Table 9-60<http://www.postgresql.org/docs/9.2/static/functions-admin.html#FUNCTIONS-RECOVERY-INFO-TABLE&gt;provide information about the current status of the standby. These
functions may be executed both during recovery and in normal running."
These look interesting wrt lag studies and seem to work on the stby:

template1=# select pg_last_xlog_receive_location();
pg_last_xlog_receive_location
-------------------------------
18/9E000000
(1 row)

template1=# select pg_last_xlog_replay_location();
pg_last_xlog_replay_location
------------------------------
18/9E000000
(1 row)

#14Sameer Kumar
sameer.kumar@ashnik.com
In reply to: Granthana Biswas (#13)
Re: Correct query to check streaming replication lag

We are already using the following query:

SELECT CASE WHEN pg_last_xlog_receive_location(
) = pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() -
pg_last_xact_replay_timestamp()) END AS log_delay;

This is (delay) not the correct thing to monitor.

We cannot use pg_xlog_location_diff as we use postgresql 9.1.

You can still use the other two methods I mentioned.

Regards
Sameer
Ashnik Pte Ltd, Singapore

#15Granthana Biswas
granthana@zedo.com
In reply to: Sameer Kumar (#14)
Re: Correct query to check streaming replication lag

Yes byte lag as well as it makes more sense.

Regards,
Granthana

Regards,
Granthana

On Tue, Jan 21, 2014 at 11:03 AM, Sameer Kumar <sameer.kumar@ashnik.com>wrote:

Show quoted text

We are already using the following query:

SELECT CASE WHEN pg_last_xlog_receive_location(
) = pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now()
- pg_last_xact_replay_timestamp()) END AS log_delay;

This is (delay) not the correct thing to monitor.

We cannot use pg_xlog_location_diff as we use postgresql 9.1.

You can still use the other two methods I mentioned.

Regards
Sameer
Ashnik Pte Ltd, Singapore

#16Michael Paquier
michael@paquier.xyz
In reply to: Sameer Kumar (#14)
Re: Correct query to check streaming replication lag

On Tue, Jan 21, 2014 at 2:33 PM, Sameer Kumar <sameer.kumar@ashnik.com>
wrote:

We are already using the following query:

SELECT CASE WHEN pg_last_xlog_receive_location(
) = pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now()

-

pg_last_xact_replay_timestamp()) END AS log_delay;

This is (delay) not the correct thing to monitor.

We cannot use pg_xlog_location_diff as we use postgresql 9.1.

You can still use the other two methods I mentioned.

FYI, here is an equivalent written in plpgsql easily findable by googling a
bit, making a pg_xlog_location_diff-like function usable even in 9.1 and
9.0 servers:
CREATE OR REPLACE FUNCTION pg_xlog_location_diff_sql( text, text)
RETURNS numeric
LANGUAGE plpgsql
AS
$function$
DECLARE
offset1 text;
offset2 text;
xlog1 text;
xlog2 text;
SQL text;
diff text;
BEGIN
/* Extract the Offset and xlog from input in
offset and xlog variables */

offset1=split_part($1,'/',2);
xlog1=split_part($1,'/',1);
offset2=split_part($2,'/',2);
xlog2=split_part($2,'/',1);

/* Prepare SQL query for calculation based on following formula
(FF000000 * xlog + offset) - (FF000000 * xlog + offset)
which gives value in hexadecimal. Since, hexadecimal calculation
is cumbersome
so convert into decimal and then calculate the difference */

       SQL='SELECT (x'''||'FF000000'||'''::bigint * x'''||xlog1||'''::bigint
                                +  x'''||offset1||'''::bigint)'||'
                -
                   (x'''||'FF000000'||'''::bigint * x'''||xlog2||'''::bigint
                                +  x'''||offset2||'''::bigint)';
       EXECUTE SQL into diff;

/* Return the value in numeric by explicit casting */

RETURN diff::numeric;
END;
$function$;

Source:
http://vibhorkumar.wordpress.com/2013/02/18/pg_xlog_location_diff-function-for-postgreqsqlppas/
--
Michael

#17Granthana Biswas
granthana@zedo.com
In reply to: Michael Paquier (#16)
Re: Correct query to check streaming replication lag

Thanks a load Michael. This is really helpful.

Regards,
Granthana

On Tue, Jan 21, 2014 at 12:19 PM, Michael Paquier <michael.paquier@gmail.com

Show quoted text

wrote:

On Tue, Jan 21, 2014 at 2:33 PM, Sameer Kumar <sameer.kumar@ashnik.com>
wrote:

We are already using the following query:

SELECT CASE WHEN pg_last_xlog_receive_location(
) = pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM

now() -

pg_last_xact_replay_timestamp()) END AS log_delay;

This is (delay) not the correct thing to monitor.

We cannot use pg_xlog_location_diff as we use postgresql 9.1.

You can still use the other two methods I mentioned.

FYI, here is an equivalent written in plpgsql easily findable by googling
a bit, making a pg_xlog_location_diff-like function usable even in 9.1 and
9.0 servers:
CREATE OR REPLACE FUNCTION pg_xlog_location_diff_sql( text, text)
RETURNS numeric
LANGUAGE plpgsql
AS
$function$
DECLARE
offset1 text;
offset2 text;
xlog1 text;
xlog2 text;
SQL text;
diff text;
BEGIN
/* Extract the Offset and xlog from input in
offset and xlog variables */

offset1=split_part($1,'/',2);
xlog1=split_part($1,'/',1);
offset2=split_part($2,'/',2);
xlog2=split_part($2,'/',1);

/* Prepare SQL query for calculation based on following formula
(FF000000 * xlog + offset) - (FF000000 * xlog + offset)
which gives value in hexadecimal. Since, hexadecimal calculation
is cumbersome
so convert into decimal and then calculate the difference */

SQL='SELECT (x'''||'FF000000'||'''::bigint *
x'''||xlog1||'''::bigint
+ x'''||offset1||'''::bigint)'||'
-
(x'''||'FF000000'||'''::bigint *
x'''||xlog2||'''::bigint
+ x'''||offset2||'''::bigint)';
EXECUTE SQL into diff;

/* Return the value in numeric by explicit casting */

RETURN diff::numeric;
END;
$function$;

Source:
http://vibhorkumar.wordpress.com/2013/02/18/pg_xlog_location_diff-function-for-postgreqsqlppas/
--
Michael