Re: Negative replication lag? - Mailing list pgsql-general

From Quentin Hartman
Subject Re: Negative replication lag?
Date
Msg-id CAJ48qNb3=GDJy_+k-aPXYUT4Ect9ZZOVh7+qCm30_RVHappfRA@mail.gmail.com
In response to Re: Negative replication lag?  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-general
Ah, that makes sense. I think I'll add some logic to the script so that it takes fresh samples whenever it comes up with a negative value.
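
A minimal sketch of that retry logic in Python, not taken from the actual script. The function name `sample_lag`, the two callables, and the retry parameters are all hypothetical. The callables are assumed to return each server's current WAL position as an integer number of bytes.

```python
import time


def sample_lag(get_primary_bytes, get_standby_bytes, max_attempts=3, pause=0.5):
    """Return a non-negative lag in bytes, re-sampling on negative results.

    get_primary_bytes and get_standby_bytes are hypothetical callables that
    return the current WAL position of each server as an integer.
    """
    for attempt in range(max_attempts):
        # Same order as described in the thread: primary first, then standby.
        lag = get_primary_bytes() - get_standby_bytes()
        if lag >= 0:
            return lag
        # A negative value means the standby replayed WAL that was generated
        # after the primary was sampled, so take a fresh pair of samples.
        if attempt < max_attempts - 1:
            time.sleep(pause)
    # Every attempt came out negative; treat the standby as caught up.
    return 0
```

Falling back to zero after the last attempt is a design choice for this sketch: a consistently negative reading means the standby is keeping pace, so raising an alarm would be misleading.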

Thanks for the insight.

QH


On Mon, Apr 22, 2013 at 5:11 PM, Andres Freund <andres@2ndquadrant.com> wrote:
On 2013-04-22 16:36:38 -0600, Quentin Hartman wrote:
> I'm using this script to check my replication lag on my streaming
> replication pairs with Nagios:
>
> https://gist.github.com/jacobian/743942
>
> It generally works fine, but will occasionally return a negative lag value
> (-37kb for example) which of course causes it to throw an alarm, but is
> total nonsense. I've been working on the assumption that it is some sort of
> bug in the script, but in taking a quick look at it nothing jumps out at me.
>
> Is there something in Postgres itself that could cause this to happen once
> in a while? Is it something to be concerned about? Is there a better way to
> monitor this state?

Well, some time passes between running pg_current_xlog_location() on the
primary and pg_last_xlog_replay_location() on the standby, so it's not
all that unlikely that WAL has been generated, streamed *and* applied in
that time. Given the short timeframe, it only happens every now and
then.
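
To illustrate the race, here is a minimal Python sketch. The helper names and the sample locations are made up for illustration. It assumes that the two hex halves of an xlog location are the high and low 32 bits of a 64-bit byte position, which may be slightly off on older releases.

```python
def xlog_to_bytes(location):
    """Convert an xlog location such as '0/3000060' to a byte position.

    Assumption: the two hex halves are the high and low 32 bits of a
    64-bit position.
    """
    high, low = location.split("/")
    return (int(high, 16) << 32) + int(low, 16)


def lag_bytes(primary_location, standby_location):
    """Lag as a monitoring script would compute it: primary minus standby."""
    return xlog_to_bytes(primary_location) - xlog_to_bytes(standby_location)


# Hypothetical samples: the primary is queried first, then more WAL is
# generated, streamed and replayed before the standby is queried.
primary_sample = "0/3000000"  # pg_current_xlog_location() at time t0
standby_sample = "0/3009400"  # pg_last_xlog_replay_location() at time t1 > t0

print(lag_bytes(primary_sample, standby_sample))  # -37888, i.e. about -37kB
```

The standby is not ahead of the primary here. The two positions were simply read at different moments, and the later reading happened to be larger.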

Did you check the pg_stat_replication view on the primary?

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
