Re: BUG #7710: Xid epoch is not updated properly during checkpoint - Mailing list pgsql-bugs

From Andres Freund
Subject Re: BUG #7710: Xid epoch is not updated properly during checkpoint
Date
Msg-id 20121201231022.GB25134@awork2.anarazel.de
In response to Re: BUG #7710: Xid epoch is not updated properly during checkpoint  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #7710: Xid epoch is not updated properly during checkpoint
List pgsql-bugs
On 2012-12-01 17:56:33 -0500, Tom Lane wrote:
> tarvip@gmail.com writes:
> > [ txid_current can show a bogus value near XID wraparound ]
> > This happens only if wal_level=hot_standby.
>
> I believe what is happening here is

Hrmpf. You had to report the fix for that three minutes before
me. ;)

> (1) CreateCheckPoint sets up checkPoint.nextXid and
> checkPoint.nextXidEpoch, near xlog.c line 7070 in HEAD.  At this point,
> nextXid is still a bit less than the wrap point.
>
> (2) After performing the checkpoint, at line 7113, CreateCheckPoint
> calls LogStandbySnapshot() which "helpfully" updates checkPoint.nextXid
> to the latest value.  Which by now has wrapped around.  But it doesn't
> fix checkPoint.nextXidEpoch, so the checkpoint that gets written out has
> effectively lost the epoch bump that should have happened.

Same conclusion here.

> While we could add some more logic to try to correct the epoch value
> in this scenario, I think it's a much better idea to just stop having
> LogStandbySnapshot update the nextXid.  That seems to me to be useless
> complication.  I also quite dislike the fact that we're effectively
> redefining the checkpoint nextXid from being taken before the main
> body of the checkpoint to being taken afterwards, but *only* in
> XLogStandbyInfoActive mode.  If that inconsistency isn't already causing
> bugs (besides this one) today, it'll probably cause them in the future.
>
> So barring objections, I'm going to remove LogStandbySnapshot's behavior
> of returning the updated nextXid.

I don't see any reason why it would be bad to remove this. I think the
current behaviour could actually even delay getting to an active state
slightly in the presence of prepared transactions, because it's used to
initialize the KnownAssignedXids machinery in xlog_redo. If the
prepared xacts are suboverflown, it's a *good* thing to have an old
->nextXid.

Greetings,

Andres Freund
