Thread: Correct query to check streaming replication lag

Correct query to check streaming replication lag

From
Granthana Biswas
Date:
Hello All,

Can anyone please tell me which of the following is the correct replication lag query to find streaming replication lag in seconds?

1. SELECT extract(seconds from (now() - pg_last_xact_replay_timestamp())) AS time_lag;

2. SELECT CASE WHEN pg_last_xlog_receive_location() = pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp()) END AS log_delay;


Regards,
Granthana

Re: Correct query to check streaming replication lag

From
Sameer Kumar
Date:

On Tue, Jan 14, 2014 at 2:31 PM, Granthana Biswas <granthana@zedo.com> wrote:
Can anyone please tell me which of the following is the correct replication lag query to find streaming replication lag in seconds?

IMHO none is 'correct'. :-)

1. SELECT extract(seconds from (now() - pg_last_xact_replay_timestamp())) AS time_lag;

This is the difference between now and last replayed log. What if the last transaction on master has happened a few minutes ago?
 
2. SELECT CASE WHEN pg_last_xlog_receive_location() = pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp()) END AS log_delay;

For reason same as above, this won't be exact but the most accurate you can get.

Best Regards,
Sameer Kumar | Database Consultant
ASHNIK PTE. LTD.
101 Cecil Street, #11-11 Tong Eng Building, Singapore 069533
M : +65 8110 0350 T: +65 6438 3504 | www.ashnik.com
www.facebook.com/ashnikbiz | www.twitter.com/ashnikbiz

email patch

This email may contain confidential, privileged or copyright material and is solely for the use of the intended recipient(s).

Attachment

Re: Correct query to check streaming replication lag

From
Sameer Kumar
Date:


On Fri, Jan 17, 2014 at 5:31 PM, Granthana Biswas <granthana@zedo.com> wrote:


Thank you Sameer for your reply. Is there any other query that would help get exact replication lag?


You 2nd Query is the most accurate you can get. 
 
2. SELECT CASE WHEN pg_last_xlog_receive_location() = pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp()) END AS log_delay;
For reason same as above, this won't be exact but the most accurate you can get.
You can get lags in terms of bytes (and I have always believed that makes more sense).


May I ask, why are you looking for this info? Is it purely monitoring?

Regards
Sameer
Ashnik Pte Ltd.

Re: Correct query to check streaming replication lag

From
Granthana Biswas
Date:


Thank you Sameer for your reply. Is there any other query that would help get exact replication lag?

Regards,
Granthana

Regards,
Granthana


On Fri, Jan 17, 2014 at 2:46 PM, Sameer Kumar <sameer.kumar@ashnik.com> wrote:

On Tue, Jan 14, 2014 at 2:31 PM, Granthana Biswas <granthana@zedo.com> wrote:
Can anyone please tell me which of the following is the correct replication lag query to find streaming replication lag in seconds?

IMHO none is 'correct'. :-)

1. SELECT extract(seconds from (now() - pg_last_xact_replay_timestamp())) AS time_lag;

This is the difference between now and last replayed log. What if the last transaction on master has happened a few minutes ago?
 
2. SELECT CASE WHEN pg_last_xlog_receive_location() = pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp()) END AS log_delay;

For reason same as above, this won't be exact but the most accurate you can get.

Best Regards,
Sameer Kumar | Database Consultant
ASHNIK PTE. LTD.
101 Cecil Street, #11-11 Tong Eng Building, Singapore 069533
M : +65 8110 0350 T: +65 6438 3504 | www.ashnik.com
www.facebook.com/ashnikbiz | www.twitter.com/ashnikbiz

email patch

This email may contain confidential, privileged or copyright material and is solely for the use of the intended recipient(s).


Attachment

Re: Correct query to check streaming replication lag

From
Granthana Biswas
Date:
Yes it's purely for monitoring purpose.


Regards,
Granthana


On Fri, Jan 17, 2014 at 3:29 PM, Sameer Kumar <sameer.kumar@ashnik.com> wrote:


On Fri, Jan 17, 2014 at 5:31 PM, Granthana Biswas <granthana@zedo.com> wrote:


Thank you Sameer for your reply. Is there any other query that would help get exact replication lag?


You 2nd Query is the most accurate you can get. 
 
2. SELECT CASE WHEN pg_last_xlog_receive_location() = pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp()) END AS log_delay;
For reason same as above, this won't be exact but the most accurate you can get.
You can get lags in terms of bytes (and I have always believed that makes more sense).


May I ask, why are you looking for this info? Is it purely monitoring?

Regards
Sameer
Ashnik Pte Ltd.


Re: Correct query to check streaming replication lag

From
Ray Stell
Date:

On Jan 17, 2014, at 5:07 AM, Granthana Biswas <granthana@zedo.com> wrote:

Yes it's purely for monitoring purpose.


I use the pg_controldata cmd locally and via bash/ssh shared keys and compare various values that seem interesting such as "Time of latest checkpoint, Latest checkpoint location."  My interest is recoverability and checkpoints seemed relevant at the time.  

I found a comment in the docs:


"pg_xlog_location_diff calculates the difference in bytes between two transaction log locations. It can be used with pg_stat_replication or some functions shown in Table 9-59 to get the replication lag."

and 

"The functions shown in Table 9-60 provide information about the current status of the standby. These functions may be executed both during recovery and in normal running."  

These look interesting wrt lag studies and seem to work on the stby:

template1=# select pg_last_xlog_receive_location();
 pg_last_xlog_receive_location 
-------------------------------
 18/9E000000
(1 row)

template1=# select pg_last_xlog_replay_location();
 pg_last_xlog_replay_location 
------------------------------
 18/9E000000
(1 row)



Attachment

Re: Correct query to check streaming replication lag

From
Sameer Kumar
Date:

Well in that case monitoring pending wal bytes would make more sense.

Regards
Sameer

PS: Sent from my Mobile device. Pls ignore typo n abb

Re: Correct query to check streaming replication lag

From
Granthana Biswas
Date:
Yes we already do that. Count the number of ready wal files.

Regards,
Granthana


On Sat, Jan 18, 2014 at 9:39 PM, Sameer Kumar <sameer.kumar@ashnik.com> wrote:

Well in that case monitoring pending wal bytes would make more sense.

Regards
Sameer

PS: Sent from my Mobile device. Pls ignore typo n abb


Re: Correct query to check streaming replication lag

From
Sameer Kumar
Date:

On Mon, Jan 20, 2014 at 1:53 PM, Granthana Biswas <granthana@zedo.com> wrote:
Yes we already do that. Count the number of ready wal files.

I guess a better place to check would be pg_stat_replication


Check this discussion:


Another way is explained in wiki:
$ psql -c "SELECT pg_current_xlog_location()" -h192.168.0.10 (primary host) pg_current_xlog_location 
-------------------------- 0/2000000
(1 row)

$ psql -c "select pg_last_xlog_receive_location()" -h192.168.0.20 (standby host) pg_last_xlog_receive_location 
------------------------------- 0/2000000
(1 row)

$ psql -c "select pg_last_xlog_replay_location()" -h192.168.0.20 (standby host) pg_last_xlog_replay_location 
------------------------------ 0/2000000
(1 row)

Best Regards,
Sameer Kumar | Database Consultant
ASHNIK PTE. LTD.
101 Cecil Street, #11-11 Tong Eng Building, Singapore 069533
M : +65 8110 0350 T: +65 6438 3504 | www.ashnik.com
www.facebook.com/ashnikbiz | www.twitter.com/ashnikbiz

email patch

This email may contain confidential, privileged or copyright material and is solely for the use of the intended recipient(s).

Attachment

Re: Correct query to check streaming replication lag

From
Michael Paquier
Date:



On Tue, Jan 21, 2014 at 12:41 PM, Sameer Kumar <sameer.kumar@ashnik.com> wrote:

On Mon, Jan 20, 2014 at 1:53 PM, Granthana Biswas <granthana@zedo.com> wrote:
Yes we already do that. Count the number of ready wal files.

I guess a better place to check would be pg_stat_replication


Check this discussion:


Another way is explained in wiki:
$ psql -c "SELECT pg_current_xlog_location()" -h192.168.0.10 (primary host)pg_current_xlog_location 
--------------------------0/2000000
(1 row)

$ psql -c "select pg_last_xlog_receive_location()" -h192.168.0.20 (standby host)pg_last_xlog_receive_location 
-------------------------------0/2000000
(1 row)

$ psql -c "select pg_last_xlog_replay_location()" -h192.168.0.20 (standby host)pg_last_xlog_replay_location 
------------------------------0/2000000
(1 row)

Mind you, here is a simple suggestion:
SELECT application_name, pg_xlog_location_diff(sent_location, flush_location) AS replay_delta, sync_priority, sync_state FROM pg_stat_replication ORDER BY replay_delta ASC, application_name;
Regards
--
Michael

Re: Correct query to check streaming replication lag

From
Sameer Kumar
Date:

On Tue, Jan 21, 2014 at 12:12 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
Mind you, here is a simple suggestion:
SELECT application_name, pg_xlog_location_diff(sent_location, flush_location) AS replay_delta, sync_priority, sync_state FROM pg_stat_replication ORDER BY replay_delta ASC, application_name;

This is much more simple, but I guess would work only with v9.2 and higher (?).


Best Regards,
Sameer Kumar | Database Consultant
ASHNIK PTE. LTD.
101 Cecil Street, #11-11 Tong Eng Building, Singapore 069533
M : +65 8110 0350 T: +65 6438 3504 | www.ashnik.com
www.facebook.com/ashnikbiz | www.twitter.com/ashnikbiz

email patch

This email may contain confidential, privileged or copyright material and is solely for the use of the intended recipient(s).

Attachment

Re: Correct query to check streaming replication lag

From
Michael Paquier
Date:



On Tue, Jan 21, 2014 at 1:30 PM, Sameer Kumar <sameer.kumar@ashnik.com> wrote:

On Tue, Jan 21, 2014 at 12:12 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
Mind you, here is a simple suggestion:
SELECT application_name, pg_xlog_location_diff(sent_location, flush_location) AS replay_delta, sync_priority, sync_state FROM pg_stat_replication ORDER BY replay_delta ASC, application_name;

This is much more simple, but I guess would work only with v9.2 and higher (?).
Yep, pg_xlog_location_diff is in core since 9.2.
--
Michael

Re: Correct query to check streaming replication lag

From
Sameer Kumar
Date:


We are already using the following query:

SELECT CASE WHEN pg_last_xlog_receive_location(
) = pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp()) END AS log_delay;

This is (delay) not the correct thing to monitor.

We cannot use pg_xlog_location_diff as we use postgresql 9.1.

You can still use the other two methods I mentioned.


Regards
Sameer
Ashnik Pte Ltd, Singapore

Re: Correct query to check streaming replication lag

From
Michael Paquier
Date:


On Tue, Jan 21, 2014 at 2:33 PM, Sameer Kumar <sameer.kumar@ashnik.com> wrote:
>>
>>
>> We are already using the following query:
>>
>> SELECT CASE WHEN pg_last_xlog_receive_location(
>> ) = pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() -
>> pg_last_xact_replay_timestamp()) END AS log_delay;
>>
> This is (delay) not the correct thing to monitor.
>
>> We cannot use pg_xlog_location_diff as we use postgresql 9.1.
>>
> You can still use the other two methods I mentioned.

FYI, here is an equivalent written in plpgsql easily findable by googling a bit, making a pg_xlog_location_diff-like function usable even in 9.1 and 9.0 servers:
CREATE OR REPLACE FUNCTION pg_xlog_location_diff_sql( text, text)
 RETURNS numeric
 LANGUAGE plpgsql
AS
 $function$
    DECLARE
       offset1 text;
       offset2 text;
       xlog1 text;
       xlog2 text;
       SQL text;
       diff text;
    BEGIN
       /* Extract the Offset and xlog from input in
          offset and xlog variables */
           
       offset1=split_part($1,'/',2);
         xlog1=split_part($1,'/',1);
       offset2=split_part($2,'/',2);
         xlog2=split_part($2,'/',1);
       
       /* Prepare SQL query for calculation based on following formula
         (FF000000 * xlog + offset) - (FF000000 * xlog + offset)
         which gives value in hexadecimal. Since, hexadecimal calculation is cumbersome
         so convert into decimal and then calculate the difference */
       
       SQL='SELECT (x'''||'FF000000'||'''::bigint * x'''||xlog1||'''::bigint
                                +  x'''||offset1||'''::bigint)'||'
                -
                   (x'''||'FF000000'||'''::bigint * x'''||xlog2||'''::bigint
                                +  x'''||offset2||'''::bigint)';
       EXECUTE SQL into diff;
       
       /* Return the value in numeric by explicit casting  */
       
       RETURN diff::numeric;
    END;
 $function$;

Source: http://vibhorkumar.wordpress.com/2013/02/18/pg_xlog_location_diff-function-for-postgreqsqlppas/
--
Michael

Re: Correct query to check streaming replication lag

From
Granthana Biswas
Date:
Hi Ray,

We are already using the following query:

SELECT CASE WHEN pg_last_xlog_receive_location(
) = pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp()) END AS log_delay;

We cannot use pg_xlog_location_diff as we use postgresql 9.1.


Regards,
Granthana


On Fri, Jan 17, 2014 at 8:24 PM, Ray Stell <stellr@vt.edu> wrote:

On Jan 17, 2014, at 5:07 AM, Granthana Biswas <granthana@zedo.com> wrote:

Yes it's purely for monitoring purpose.


I use the pg_controldata cmd locally and via bash/ssh shared keys and compare various values that seem interesting such as "Time of latest checkpoint, Latest checkpoint location."  My interest is recoverability and checkpoints seemed relevant at the time.  

I found a comment in the docs:


"pg_xlog_location_diff calculates the difference in bytes between two transaction log locations. It can be used with pg_stat_replication or some functions shown in Table 9-59 to get the replication lag."

and 

"The functions shown in Table 9-60 provide information about the current status of the standby. These functions may be executed both during recovery and in normal running."  

These look interesting wrt lag studies and seem to work on the stby:

template1=# select pg_last_xlog_receive_location();
 pg_last_xlog_receive_location 
-------------------------------
 18/9E000000
(1 row)

template1=# select pg_last_xlog_replay_location();
 pg_last_xlog_replay_location 
------------------------------
 18/9E000000
(1 row)




Re: Correct query to check streaming replication lag

From
Granthana Biswas
Date:
Yes byte lag as well as it makes more sense.

Regards,
Granthana

Regards,
Granthana


On Tue, Jan 21, 2014 at 11:03 AM, Sameer Kumar <sameer.kumar@ashnik.com> wrote:


We are already using the following query:

SELECT CASE WHEN pg_last_xlog_receive_location(
) = pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp()) END AS log_delay;

This is (delay) not the correct thing to monitor.

We cannot use pg_xlog_location_diff as we use postgresql 9.1.

You can still use the other two methods I mentioned.


Regards
Sameer
Ashnik Pte Ltd, Singapore

Re: Correct query to check streaming replication lag

From
Granthana Biswas
Date:

Thanks a load Michael. This is really helpful.

Regards,
Granthana



On Tue, Jan 21, 2014 at 12:19 PM, Michael Paquier <michael.paquier@gmail.com> wrote:


On Tue, Jan 21, 2014 at 2:33 PM, Sameer Kumar <sameer.kumar@ashnik.com> wrote:
>>
>>
>> We are already using the following query:
>>
>> SELECT CASE WHEN pg_last_xlog_receive_location(
>> ) = pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() -
>> pg_last_xact_replay_timestamp()) END AS log_delay;
>>
> This is (delay) not the correct thing to monitor.
>
>> We cannot use pg_xlog_location_diff as we use postgresql 9.1.
>>
> You can still use the other two methods I mentioned.

FYI, here is an equivalent written in plpgsql easily findable by googling a bit, making a pg_xlog_location_diff-like function usable even in 9.1 and 9.0 servers:
CREATE OR REPLACE FUNCTION pg_xlog_location_diff_sql( text, text)
 RETURNS numeric
 LANGUAGE plpgsql
AS
 $function$
    DECLARE
       offset1 text;
       offset2 text;
       xlog1 text;
       xlog2 text;
       SQL text;
       diff text;
    BEGIN
       /* Extract the Offset and xlog from input in
          offset and xlog variables */
           
       offset1=split_part($1,'/',2);
         xlog1=split_part($1,'/',1);
       offset2=split_part($2,'/',2);
         xlog2=split_part($2,'/',1);
       
       /* Prepare SQL query for calculation based on following formula
         (FF000000 * xlog + offset) - (FF000000 * xlog + offset)
         which gives value in hexadecimal. Since, hexadecimal calculation is cumbersome
         so convert into decimal and then calculate the difference */
       
       SQL='SELECT (x'''||'FF000000'||'''::bigint * x'''||xlog1||'''::bigint
                                +  x'''||offset1||'''::bigint)'||'
                -
                   (x'''||'FF000000'||'''::bigint * x'''||xlog2||'''::bigint
                                +  x'''||offset2||'''::bigint)';
       EXECUTE SQL into diff;
       
       /* Return the value in numeric by explicit casting  */
       
       RETURN diff::numeric;
    END;
 $function$;

Source: http://vibhorkumar.wordpress.com/2013/02/18/pg_xlog_location_diff-function-for-postgreqsqlppas/
--
Michael