Thread: replay doesn't catch up with receive on standby

replay doesn't catch up with receive on standby

From
Steven Parkes
Date:
This is on 9.0.3: I've got two dbs running as standby to a main db. They start up fine and seem to think they're all
caught up (by /var/log logs), but

SELECT pg_last_xlog_receive_location() AS receive, pg_last_xlog_replay_location() AS replay;

reports replay behind receive and it doesn't change. This is on both dbs.

Notably the main db isn't (wasn't) doing anything, so no new commits were causing things to move forward. I did a write
to it and both slaves moved both their received and replay serial numbers up.

Is there a valid situation where an idle master/standby setup could remain with replay behind received indefinitely?
(My Nagios monitor isn't very happy with that at present, and before changing it I'd like to understand better
what's going on.)
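
For a monitor such as the Nagios check mentioned above, the two locations returned by the query can be compared numerically. The following is a minimal Python sketch, not part of the original thread. It assumes the locations arrive as text in the `hi/lo` hexadecimal form (for example `0/3000158`), and the helper names `parse_xlog_location` and `replay_lag_bytes` are hypothetical. The byte arithmetic assumes the pre-9.3 layout, in which each xlogid is taken to cover 0xFF000000 bytes, so treat the result as an approximate gap and not an exact figure.

```python
# Assumption: bytes addressable per xlogid in PostgreSQL releases before 9.3
# (255 segments of 16 MB). Later releases use a plain 64-bit position.
BYTES_PER_XLOGID = 0xFF000000


def parse_xlog_location(text):
    """Split a location such as '0/3000158' into (xlogid, offset) integers."""
    hi, lo = text.strip().split("/")
    return int(hi, 16), int(lo, 16)


def replay_lag_bytes(receive, replay, bytes_per_xlogid=BYTES_PER_XLOGID):
    """Approximate number of bytes received but not yet replayed."""
    r_hi, r_lo = parse_xlog_location(receive)
    p_hi, p_lo = parse_xlog_location(replay)
    return (r_hi - p_hi) * bytes_per_xlogid + (r_lo - p_lo)


if __name__ == "__main__":
    # Made-up sample values, not taken from the thread.
    print(replay_lag_bytes("0/3000158", "0/3000020"))
```

A check built on this could alert only when the gap stays above a chosen threshold, instead of on any non-zero difference.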

Re: replay doesn't catch up with receive on standby

From
Fujii Masao
Date:
On Tue, Apr 19, 2011 at 9:00 AM, Steven Parkes <smparkes@smparkes.net> wrote:
> This is on 9.0.3: I've got two dbs running as standby to a main db. They start up fine and seem to think they're all
> caught up (by /var/log logs), but
>
> SELECT pg_last_xlog_receive_location() AS receive, pg_last_xlog_replay_location() AS replay;
>
> reports replay behind receive and it doesn't change. This is on both dbs.
>
> Notably the main db isn't (wasn't) doing anything, so no new commits were causing things to move forward. I did a
> write to it and both slaves moved both their received and replay serial numbers up.
>
> Is there a valid situation where an idle master/standby setup could remain with replay behind received indefinitely?
> (My Nagios monitor isn't very happy with that at present, and before changing it I'd like to understand better
> what's going on.)

Did you run a query on the standby? If yes, I guess that a query conflict prevented
the replay location from advancing.
http://www.postgresql.org/docs/9.0/static/hot-standby.html#HOT-STANDBY-CONFLICT

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Re: replay doesn't catch up with receive on standby

From
Steven Parkes
Date:
> Did you run a query on the standby?

Yup. Both standbys. They both responded the same way.

> If yes, I guess that a query conflict prevented
> the replay location from advancing.
> http://www.postgresql.org/docs/9.0/static/hot-standby.html#HOT-STANDBY-CONFLICT

The standbys were idle and this was a persistent state. I restarted the standbys and they stayed in this state. Am I
missing something? I thought these conflicts were related to queries against the standbys but there shouldn't have been
any that I'm aware of. Certainly none should survive a restart ...

Am I missing something about the conflict?

It also seems notable that a new commit on the master cleared the issue ... Does that seem like the hot standby
conflict case?


Re: replay doesn't catch up with receive on standby

From
Fujii Masao
Date:
On Tue, Apr 19, 2011 at 10:28 AM, Steven Parkes <smparkes@smparkes.net> wrote:
>> Did you run a query on the standby?
>
> Yup. Both standbys. They both responded the same way.
>
>> If yes, I guess that a query conflict prevented
>> the replay location from advancing.
>> http://www.postgresql.org/docs/9.0/static/hot-standby.html#HOT-STANDBY-CONFLICT
>
> The standbys were idle and this was a persistent state. I restarted the standbys and they stayed in this state. Am I
> missing something? I thought these conflicts were related to queries against the standbys but there shouldn't have been
> any that I'm aware of. Certainly none should survive a restart ...
>
> Am I missing something about the conflict?
>
> It also seems notable that a new commit on the master cleared the issue ... Does that seem like the hot standby
> conflict case?

Probably not.

Was there an idle-in-transaction session on the master when the problem happened?
If yes, this can happen. In that case, only part of a WAL record may be written
to disk by walwriter and sent to the standby by walsender. The rest
will be written and sent after you have finished the transaction. In this
case, the receive location obviously indicates the end of that partial WAL record.
OTOH, since that half-baked WAL record cannot be replayed, the replay
location cannot advance and still has to indicate the end of the previous WAL
record.

If you issue a new commit, the whole WAL record is flushed to the standby.
That WAL record is then replayed and the replay location advances. I guess
you observed the above situation.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Re: replay doesn't catch up with receive on standby

From
Steven Parkes
Date:
> Was there an idle-in-transaction session on the master when the problem happened?

Shouldn't have been, but that's what I was wondering, too. I didn't check. Not sure I know how to check.
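
One way to check is to look at `pg_stat_activity` on the master. The following is a minimal Python sketch, not from the thread. It assumes a 9.0-era catalog, where an idle transaction shows up with `current_query = '<IDLE> in transaction'` and the backend PID column is named `procpid` (later releases renamed these columns). The function name `find_idle_in_transaction` is hypothetical, and `cursor` is assumed to be any DB-API 2.0 cursor connected to the master.

```python
# Query for a 9.0-era pg_stat_activity; column names differ in later releases.
IDLE_IN_XACT_SQL = """
SELECT procpid, usename, xact_start
FROM pg_stat_activity
WHERE current_query = '<IDLE> in transaction'
ORDER BY xact_start
"""


def find_idle_in_transaction(cursor):
    """Return (procpid, usename, xact_start) rows for idle-in-transaction sessions."""
    cursor.execute(IDLE_IN_XACT_SQL)
    return cursor.fetchall()
```

An empty result at the moment replay is stuck behind receive would suggest that an open transaction on the master is not the cause.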

That was my guess and I mostly wanted to confirm that that could happen. Does seem like an edge case. I don't expect
uncommitted transactions to be hanging around in general, or even long periods between some kind of write.

Thanks for the help.