Thread: Proper use of pg_xlog_location_diff()

Proper use of pg_xlog_location_diff()

From

Fabio Ugo Venchiarutti

Date:

16 January 2015, 01:12:26

Greetings


Our company is writing a small ad-hoc implementation of a load balancer
for Postgres (`version()` = PostgreSQL 9.2.9 on
x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat
4.4.7-4), 64-bit).

We're using both streaming and WAL shipping based replication.


Most mainstream solutions seem to implement load balancing with plain
round robin over a connection pool. Given that our cloud nodes are
diversely capable and subject to noisy neighborhood conditions, we need
to factor in instantaneous load profiles (We achieved this by exporting
some /sys and /proc paths through custom views and everything works as
expected).


We're now adding functionality to temporarily blacklist hot standby
clusters based on their WAL records lag and pg_xlog_location_diff()
seems to be the key tool for this, but we're perhaps misusing it.


The current draft implementation uses the following queries and compares
the output to determine how many bytes a given slave is lagging.
Is there any shortcoming to such approach?


--------------------------------
-- ON MASTER:
--------------------------------
SELECT
    pg_xlog_location_diff(pg_current_xlog_location(), '000/00000000')
;
--------------------------------

--------------------------------
-- ON STANDBY:
--------------------------------
SELECT
    pg_xlog_location_diff(
        COALESCE(
            pg_last_xlog_receive_location(),
            pg_last_xlog_replay_location()
        ),
        '000/00000000'
    )
;
--------------------------------



Thanks in advance


Fabio Ugo Venchiarutti

Re: Proper use of pg_xlog_location_diff()

From

Jim Nasby

Date:

16 January 2015, 01:37:41

On 1/15/15 7:12 PM, Fabio Ugo Venchiarutti wrote:
> Greetings
>
>
> Our company is writing a small ad-hoc implementation of a load balancer for Postgres (`version()` = PostgreSQL 9.2.9
onx86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4), 64-bit). 
>
> We're using both streaming and WAL shipping based replication.
>
>
> Most mainstream solutions seem to implement load balancing with plain round robin over a connection pool. Given that
ourcloud nodes are diversely capable and subject to noisy neighborhood conditions, we need to factor in instantaneous
loadprofiles (We achieved this by exporting some /sys and /proc paths through custom views and everything works as
expected).
>
>
> We're now adding functionality to temporarily blacklist hot standby clusters based on their WAL records lag and
pg_xlog_location_diff()seems to be the key tool for this, but we're perhaps misusing it. 
>
>
> The current draft implementation uses the following queries and compares the output to determine how many bytes a
givenslave is lagging. 
> Is there any shortcoming to such approach?
>
>
> --------------------------------
> -- ON MASTER:
> --------------------------------
> SELECT
>      pg_xlog_location_diff(pg_current_xlog_location(), '000/00000000')
> ;
> --------------------------------

That's very nonsensical; it will always return the same thing as pg_current_xlog_location.

> --------------------------------
> -- ON STANDBY:
> --------------------------------
> SELECT
>      pg_xlog_location_diff(
>          COALESCE(
>              pg_last_xlog_receive_location(),

Note that that is the xlog location that has been *sync'd to disk*. That could potentially lag significantly behind the
master'sLSN. I think your safest bet would be getting pg_current_xlog_location from the master and subtracting
pg_last_xlog_replay_location()from it (but note you could get a negative result). 

BTW, http://www.postgresql.org/docs/devel/static/warm-standby.html#STREAMING-REPLICATION says to use
pg_last_xlog_receive_location()instead of pg_last_xlog_replay_location() because it tells you what's committed to disk
ona standby vs what's visible. But for what you're doing I think you want pg_last_xlog_replay_location(). 

Also, I don't think you should coalesce. If you get a NULL for any of this then something's almost certainly wrong
(likea server is misconfigured). If you were going to coalesce I'd say you should coalesce to 2^63-1. 
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

Re: Proper use of pg_xlog_location_diff()

From

Fabio Ugo Venchiarutti

Date:

16 January 2015, 02:41:24


On 16/01/15 14:37, Jim Nasby wrote:
> On 1/15/15 7:12 PM, Fabio Ugo Venchiarutti wrote:
>> Greetings
>>
>>
>> Our company is writing a small ad-hoc implementation of a load
>> balancer for Postgres (`version()` = PostgreSQL 9.2.9 on
>> x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red
>> Hat 4.4.7-4), 64-bit).
>>
>> We're using both streaming and WAL shipping based replication.
>>
>>
>> Most mainstream solutions seem to implement load balancing with plain
>> round robin over a connection pool. Given that our cloud nodes are
>> diversely capable and subject to noisy neighborhood conditions, we
>> need to factor in instantaneous load profiles (We achieved this by
>> exporting some /sys and /proc paths through custom views and
>> everything works as expected).
>>
>>
>> We're now adding functionality to temporarily blacklist hot standby
>> clusters based on their WAL records lag and pg_xlog_location_diff()
>> seems to be the key tool for this, but we're perhaps misusing it.
>>
>>
>> The current draft implementation uses the following queries and
>> compares the output to determine how many bytes a given slave is lagging.
>> Is there any shortcoming to such approach?
>>
>>
>> --------------------------------
>> -- ON MASTER:
>> --------------------------------
>> SELECT
>>      pg_xlog_location_diff(pg_current_xlog_location(), '000/00000000')
>> ;
>> --------------------------------
>
> That's very nonsensical; it will always return the same thing as
> pg_current_xlog_location.
>
>> --------------------------------
>> -- ON STANDBY:
>> --------------------------------
>> SELECT
>>      pg_xlog_location_diff(
>>          COALESCE(
>>              pg_last_xlog_receive_location(),
>
> Note that that is the xlog location that has been *sync'd to disk*. That
> could potentially lag significantly behind the master's LSN. I think
> your safest bet would be getting pg_current_xlog_location from the
> master and subtracting pg_last_xlog_replay_location() from it (but note
> you could get a negative result).
>
> BTW,
> http://www.postgresql.org/docs/devel/static/warm-standby.html#STREAMING-REPLICATION
> says to use pg_last_xlog_receive_location() instead of
> pg_last_xlog_replay_location() because it tells you what's committed to
> disk on a standby vs what's visible. But for what you're doing I think
> you want pg_last_xlog_replay_location().
>
> Also, I don't think you should coalesce. If you get a NULL for any of
> this then something's almost certainly wrong (like a server is
> misconfigured). If you were going to coalesce I'd say you should
> coalesce to 2^63-1.


Thank you

 >> --------------------------------
 >> -- ON MASTER:
 >> --------------------------------
 >> SELECT
 >>      pg_xlog_location_diff(pg_current_xlog_location(), '000/00000000')
 >> ;
 >> --------------------------------
 >
 > That's very nonsensical; it will always return the same thing as
 > pg_current_xlog_location.

I know it seems odd but I'm just using the query itself to keep
everything in decimals as my safety threshold is expressed in bytes.

The question this query asks is "how many WAL bytes does this cluster
know about since the hypothetical record 000/00000000"?

Then I do the math.

The actual full form is the same for both master and standbys (already
changed based on your input):

--------------------------------
SELECT
    pg_is_in_recovery() AS in_recovery,
    pg_xlog_location_diff(
        (CASE
            WHEN (pg_is_in_recovery()) THEN
                pg_last_xlog_replay_location()
            ELSE
                pg_current_xlog_location()
        END),
        '000/00000000'
    ) AS total_wal_offset
--------------------------------


 > says to use pg_last_xlog_receive_location() instead of
 > pg_last_xlog_replay_location() because it tells you what's committed to
 > disk on a standby vs what's visible. But for what you're doing I think
 > you want pg_last_xlog_replay_location().

You basically framed what my question boils down to.

I'm trying to get my head around
http://www.postgresql.org/docs/9.2/static/functions-admin.html#FUNCTIONS-RECOVERY-INFO-TABLE

Does it mean that pg_last_xlog_receive_location() returns the last WAL
record that has been successfully "staged for replay" by the stream
replication whereas pg_last_xlog_replay_location() returns the last
successful WAL replay regardless of it coming from streaming or archive
shipping?
As in 2 different stages the first of which is basically irrelevant to
record visibility?

Many thanks


Fabio

Re: Proper use of pg_xlog_location_diff()

From

Jim Nasby

Date:

16 January 2015, 07:39:34

On 1/15/15 8:41 PM, Fabio Ugo Venchiarutti wrote:
> Does it mean that pg_last_xlog_receive_location() returns the last WAL record that has been successfully "staged for
replay"by the stream replication whereas pg_last_xlog_replay_location() returns the last successful WAL replay
regardlessof it coming from streaming or archive shipping? 
> As in 2 different stages the first of which is basically irrelevant to record visibility?

(At least in -HEAD) receive is what's been committed to disk; replay is what's actually been replayed and should be
visible.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com