Re: New strategies for freezing, advancing relfrozenxid early - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: New strategies for freezing, advancing relfrozenxid early
Date
Msg-id 68df596638f60989cc72680d7029f0d51ca8cc19.camel@j-davis.com
Whole thread Raw
In response to Re: New strategies for freezing, advancing relfrozenxid early  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: New strategies for freezing, advancing relfrozenxid early  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Sun, 2022-12-18 at 14:20 -0800, Peter Geoghegan wrote:
> On Thu, Dec 15, 2022 at 10:53 AM Peter Geoghegan <pg@bowt.ie> wrote:
> > I agree that the burden of catch-up freezing is excessive here (in
> > fact I already wrote something to that effect on the wiki page).
> > The
> > likely solution can be simple enough.
>
> Attached is v10, which fixes this issue, but using a different
> approach to the one I sketched here.

Comments on 0002:

Can you explain the following portion of the diff:


  - else if (MultiXactIdPrecedes(multi, cutoffs->MultiXactCutoff))
  + else if (MultiXactIdPrecedes(multi, cutoffs->OldestMxact))

  ...

  + /* Can't violate the MultiXactCutoff invariant, either */
  + if (!need_replace)
  +     need_replace = MultiXactIdPrecedes(
  +        multi, cutoffs->MultiXactCutoff);

Regarding correctness, it seems like the basic structure and invariants
are the same, and it builds on the changes already in 9e5405993c. Patch
0002 seems *mostly* about making choices within the existing framework.
That gives me more confidence.

That being said, it does push harder against the limits on both sides.
If I understand correctly, that means pages with wider distributions of
xids are going to persist longer, which could expose pre-existing bugs
in new and interesting ways.

Next, the 'freeze_required' field suggests that it's more involved in
the control flow that causes freezing than it actually is. All it does
is communicate how the trackers need to be adjusted. The return value
of heap_prepare_freeze_tuple() (and underneath, the flags set by
FreezeMultiXactId()) are what actually control what happens. It would
be nice to make this more clear somehow.

The comment:

  /*
   * If we freeze xmax, make absolutely sure that it's not an XID that
   * is important.  (Note, a lock-only xmax can be removed independent
   * of committedness, since a committed lock holder has released the
   * lock).
   */

caused me to go down a rabbit hole looking for edge cases where we
might want to freeze an xmax but not an xmin; e.g. tup.xmax <
OldestXmin < tup.xmin or the related case where tup.xmax < RecentXmin <
tup.xmin. I didn't find a problem, so that's good news.

I also tried some pgbench activity along with concurrent vacuums (and
vacuum freezes) along with periodic verify_heapam(). No problems there.
 
Did you already describe the testing you've done for 0001+0002
specfiically? It's not radically new logic, but it would be good to try
to catch minor state-handling errors.


--
Jeff Davis
PostgreSQL Contributor Team - AWS





pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Simplifications for error messages related to compression
Next
From: David Rowley
Date:
Subject: Re: Avoid lost result of recursion (src/backend/optimizer/util/inherit.c)