Thread: Standardizing how pg_waldump presents recovery conflict XID cutoffs

Standardizing how pg_waldump presents recovery conflict XID cutoffs

From
Peter Geoghegan
Date:
Most recovery conflicts are generated in REDO routines using a
standard approach these days: they all call
ResolveRecoveryConflictWithSnapshot() with a latestRemovedXid argument
taken directly from the WAL record. Right now we don't quite present
this information in a uniform way, even though REDO routines apply the
cutoffs in a uniform way.

ISTM that there is value consistently using the same symbol names for
these cutoffs in every WAL record that has such a cutoff. The REDO
routine doesn't care about how the cutoff was generated during
original execution anyway -- it is always the responsibility of code
that runs during original execution (details of which will vary by
record type). Consistency makes all of this fairly explicit, and makes
it easier to use tools like pg_waldump to debug recovery conflicts --
the user can grep for the same generic symbol name and see everything.

Attached WIP patch brings heapam's VISIBLE record type and SP-GiST's
VACUUM_REDIRECT record type in line with this convention. It also
changes the symbol name from latestRemovedXid to something more
general: latestCommittedXid (since many of these WAL records don't
actually remove anything).

The patch also documents how these cutoffs are supposed to work at a
high level. We got the details slightly wrong (resulting in false
conflicts) for several years with FREEZE_PAGE records (see bugfix
commit 66fbcb0d2e for details), which seems like a relatively easy
mistake to make -- so we should try to avoid similar mistakes in the
future.

I'm not necessarily that attached to the name latestCommittedXid. It
is more accurate, but it's also a little bit too similar to another
common XID symbol name, latestCompletedXid. Can anyone suggest an
alternative?

-- 
Peter Geoghegan

Attachment

Re: Standardizing how pg_waldump presents recovery conflict XID cutoffs

From
Andres Freund
Date:
Hi,

I like the idea of this, but:

On 2022-11-15 10:24:05 -0800, Peter Geoghegan wrote:
> I'm not necessarily that attached to the name latestCommittedXid. It
> is more accurate, but it's also a little bit too similar to another
> common XID symbol name, latestCompletedXid. Can anyone suggest an
> alternative?

... I strongly dislike latestCommittedXid. That seems at least as misleading
as latestRemovedXid and has the danger of confusion with latestCompletedXid
as you mention.

How about latestAffectedXid? Based on a quick scroll through the changed
structures it seems like it'd be reasonably discriptive for most?

Greetings,

Andres Freund



Re: Standardizing how pg_waldump presents recovery conflict XID cutoffs

From
Peter Geoghegan
Date:
On Tue, Nov 15, 2022 at 12:29 PM Andres Freund <andres@anarazel.de> wrote:
> ... I strongly dislike latestCommittedXid. That seems at least as misleading
> as latestRemovedXid and has the danger of confusion with latestCompletedXid
> as you mention.

> How about latestAffectedXid?

I get why you don't care for latestCommittedXid, of course, but the
name does have some advantages. Namely:

1. Most conflicts come from PRUNE records (less often index deletion
records) where the XID is some heap tuple's xmax, a
committed-to-everybody XID on the primary (at the point of the
original execution of the prune). It makes sense to emphasize the idea
that snapshots running on a replica need to agree that this XID is
definitely committed -- we need to kill any snapshots that don't
definitely agree that this one particular XID is committed by now.

2. It hints at the idea that we don't need to set any XID to do
cleanup for aborted transactions, per the optimization in
HeapTupleHeaderAdvanceLatestRemovedXid().

Perhaps something like "mustBeCommittedCutoff" would work better? What
do you think of that? The emphasis on how things need to work on the
REDO side seems useful.

-- 
Peter Geoghegan



Re: Standardizing how pg_waldump presents recovery conflict XID cutoffs

From
Andres Freund
Date:
Hi,

On 2022-11-15 13:54:24 -0800, Peter Geoghegan wrote:
> On Tue, Nov 15, 2022 at 12:29 PM Andres Freund <andres@anarazel.de> wrote:
> > ... I strongly dislike latestCommittedXid. That seems at least as misleading
> > as latestRemovedXid and has the danger of confusion with latestCompletedXid
> > as you mention.
> 
> > How about latestAffectedXid?
> 
> I get why you don't care for latestCommittedXid, of course, but the
> name does have some advantages. Namely:
> 
> 1. Most conflicts come from PRUNE records (less often index deletion
> records) where the XID is some heap tuple's xmax, a
> committed-to-everybody XID on the primary (at the point of the
> original execution of the prune). It makes sense to emphasize the idea
> that snapshots running on a replica need to agree that this XID is
> definitely committed -- we need to kill any snapshots that don't
> definitely agree that this one particular XID is committed by now.

I don't agree that it makes sense there - to me it sounds like the record is
just carrying the globally latest committed xid rather than something just
describing the record.

I also just don't think "agreeing that a particular XID is committed" is a
good description of latestRemovedXID, there's just too many ways that
"agreeing xid has committed" can be understood. To me it's not obvious that
it's about mvcc snapshots.

If we want to focus on the mvcc affects we could just go for something like
snapshotConflictHorizon or such.



> Perhaps something like "mustBeCommittedCutoff" would work better? What
> do you think of that? The emphasis on how things need to work on the
> REDO side seems useful.

I don't think "committed" should be part of the name.

Greetings,

Andres Freund



Re: Standardizing how pg_waldump presents recovery conflict XID cutoffs

From
Peter Geoghegan
Date:
On Tue, Nov 15, 2022 at 5:29 PM Andres Freund <andres@anarazel.de> wrote:
> If we want to focus on the mvcc affects we could just go for something like
> snapshotConflictHorizon or such.

Okay, let's go with snapshotConflictHorizon. I'll use that name in the
next revision, which I should be able to post tomorrow.

-- 
Peter Geoghegan



Re: Standardizing how pg_waldump presents recovery conflict XID cutoffs

From
Andres Freund
Date:
On 2022-11-15 20:48:56 -0800, Peter Geoghegan wrote:
> On Tue, Nov 15, 2022 at 5:29 PM Andres Freund <andres@anarazel.de> wrote:
> > If we want to focus on the mvcc affects we could just go for something like
> > snapshotConflictHorizon or such.
> 
> Okay, let's go with snapshotConflictHorizon. I'll use that name in the
> next revision, which I should be able to post tomorrow.

Cool!



Re: Standardizing how pg_waldump presents recovery conflict XID cutoffs

From
Peter Geoghegan
Date:
On Tue, Nov 15, 2022 at 8:48 PM Peter Geoghegan <pg@bowt.ie> wrote:
> Okay, let's go with snapshotConflictHorizon. I'll use that name in the
> next revision, which I should be able to post tomorrow.

Attached is a somewhat cleaned up version that uses that symbol name
for everything.

--
Peter Geoghegan

Attachment

Re: Standardizing how pg_waldump presents recovery conflict XID cutoffs

From
Andres Freund
Date:
Hi,

On 2022-11-16 14:14:30 -0800, Peter Geoghegan wrote:
>  /*
> - * If 'tuple' contains any visible XID greater than latestRemovedXid,
> - * ratchet forwards latestRemovedXid to the greatest one found.
> - * This is used as the basis for generating Hot Standby conflicts, so
> - * if a tuple was never visible then removing it should not conflict
> - * with queries.
> + * Maintain snapshotConflictHorizon for caller by ratcheting forward its value
> + * using any committed XIDs contained in 'tuple', an obsolescent heap tuple
> + * that caller is in the process of physically removing via pruning.
> + * (Also supports generating index deletion snapshotConflictHorizon values.)

The "(also...) formulation seems a bit odd. How about "an obsolescent heap
tuple that the caller is physically removing, e.g. via HOT pruning or index
deletion." or such?


> + * snapshotConflictHorizon format values are how all hot Standby conflicts are
> + * generated by REDO routines (at least wherever a granular cutoff is used).

Not quite parsing for me.

> + * Caller must initialize its value to InvalidTransactionId, which is generally
> + * interpreted as "definitely no need for a recovery conflict".
> + *
> + * Final value must reflect all heap tuples that caller will physically remove
> + * via the ongoing pruning operation.  ResolveRecoveryConflictWithSnapshot() is
> + * passed the final value (taken from caller's WAL record) by a REDO routine.

> +    /*
> +     * It's quite possible that final snapshotConflictHorizon value will be
> +     * invalid in final WAL record, indicating that we definitely don't need to
> +     * generate a conflict
> +     */

*the final

Isn't this already described in the header?


> @@ -3337,12 +3337,17 @@ GetCurrentVirtualXIDs(TransactionId limitXmin, bool excludeXmin0,
>   * GetConflictingVirtualXIDs -- returns an array of currently active VXIDs.
>   *
>   * Usage is limited to conflict resolution during recovery on standby servers.
> - * limitXmin is supplied as either latestRemovedXid, or InvalidTransactionId
> - * in cases where we cannot accurately determine a value for latestRemovedXid.
> + * limitXmin is supplied as either a snapshotConflictHorizon format XID, or as
> + * InvalidTransactionId in cases where caller cannot accurately determine a
> + * safe snapshotConflictHorizon value.
>   *
>   * If limitXmin is InvalidTransactionId then we want to kill everybody,
>   * so we're not worried if they have a snapshot or not, nor does it really
> - * matter what type of lock we hold.
> + * matter what type of lock we hold.  Caller must avoid calling here with
> + * snapshotConflictHorizon format XIDs that were set to InvalidTransactionId

What are "snapshotConflictHorizon format XIDs"? I guess you mean format in the
sense of having the semantics of snapshotConflictHorizon?



Greetings,

Andres Freund



Re: Standardizing how pg_waldump presents recovery conflict XID cutoffs

From
Peter Geoghegan
Date:
On Wed, Nov 16, 2022 at 3:27 PM Andres Freund <andres@anarazel.de> wrote:
> The "(also...) formulation seems a bit odd. How about "an obsolescent heap
> tuple that the caller is physically removing, e.g. via HOT pruning or index
> deletion." or such?

Okay, WFM.

> > + * snapshotConflictHorizon format values are how all hot Standby conflicts are
> > + * generated by REDO routines (at least wherever a granular cutoff is used).
>
> Not quite parsing for me.

I meant something like: this is a cutoff that works in the same way as
any other cutoff involved with recovery conflicts, in general, with
the exception of those cases that have very coarse grained conflicts,
such as DROP TABLESPACE.

I suppose it would be better to just say the first part. Will fix.

> > +     /*
> > +      * It's quite possible that final snapshotConflictHorizon value will be
> > +      * invalid in final WAL record, indicating that we definitely don't need to
> > +      * generate a conflict
> > +      */
>
> *the final
>
> Isn't this already described in the header?

Sort of, but arguably it makes sense to call it out specifically.
Though on second thought, yeah, lets just get rid of it.

> What are "snapshotConflictHorizon format XIDs"? I guess you mean format in the
> sense of having the semantics of snapshotConflictHorizon?

Yes. That is the only possible way that any recovery conflict ever
works on the REDO side, with the exception of a few
not-very-interesting cases such as DROP TABLESPACE.

GetConflictingVirtualXIDs() assigns a special meaning to
InvalidTransactionId which is the *opposite* of the special meaning
that snapshotConflictHorizon-based values assign to
InvalidTransactionId. At one point they actually did the same
definition for InvalidTransactionId, but that was changed soon after
hot standby first went in (when we taught btree delete records to not
use ludicrously conservative cutoffs that caused needless conflicts).

Anyway, worth calling this out directly in these comments IMV. We're
addressing two closely related things that assign opposite meanings to
InvalidTransactionId, which is rather confusing.

-- 
Peter Geoghegan



Re: Standardizing how pg_waldump presents recovery conflict XID cutoffs

From
Andres Freund
Date:
Hi,

On 2022-11-16 15:37:40 -0800, Peter Geoghegan wrote:
> On Wed, Nov 16, 2022 at 3:27 PM Andres Freund <andres@anarazel.de> wrote:
> > What are "snapshotConflictHorizon format XIDs"? I guess you mean format in the
> > sense of having the semantics of snapshotConflictHorizon?
> 
> Yes. That is the only possible way that any recovery conflict ever
> works on the REDO side, with the exception of a few
> not-very-interesting cases such as DROP TABLESPACE.
> 
> GetConflictingVirtualXIDs() assigns a special meaning to
> InvalidTransactionId which is the *opposite* of the special meaning
> that snapshotConflictHorizon-based values assign to
> InvalidTransactionId. At one point they actually did the same
> definition for InvalidTransactionId, but that was changed soon after
> hot standby first went in (when we taught btree delete records to not
> use ludicrously conservative cutoffs that caused needless conflicts).
> 
> Anyway, worth calling this out directly in these comments IMV. We're
> addressing two closely related things that assign opposite meanings to
> InvalidTransactionId, which is rather confusing.

It makes sense to call this out, but I'd
s/snapshotConflictHorizon format XIDs/cutoff with snapshotConflictHorizon semantics/

or such?

Greetings,

Andres Freund



Re: Standardizing how pg_waldump presents recovery conflict XID cutoffs

From
Peter Geoghegan
Date:
On Wed, Nov 16, 2022 at 4:25 PM Andres Freund <andres@anarazel.de> wrote:
> > Anyway, worth calling this out directly in these comments IMV. We're
> > addressing two closely related things that assign opposite meanings to
> > InvalidTransactionId, which is rather confusing.
>
> It makes sense to call this out, but I'd
> s/snapshotConflictHorizon format XIDs/cutoff with snapshotConflictHorizon semantics/
>
> or such?

WFM.

-- 
Peter Geoghegan



Re: Standardizing how pg_waldump presents recovery conflict XID cutoffs

From
Peter Geoghegan
Date:
On Wed, Nov 16, 2022 at 4:34 PM Peter Geoghegan <pg@bowt.ie> wrote:
> WFM.

Attached is v3.

Plan is to commit this later on today, barring objections.

Thanks
-- 
Peter Geoghegan

Attachment

Re: Standardizing how pg_waldump presents recovery conflict XID cutoffs

From
Peter Geoghegan
Date:
On Thu, Nov 17, 2022 at 9:02 AM Peter Geoghegan <pg@bowt.ie> wrote:
> Plan is to commit this later on today, barring objections.

Pushed, thanks.

-- 
Peter Geoghegan