Re: Faulty HEAP_XMAX_LOCK_ONLY & HEAP_KEYS_UPDATED hintbit combination - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: Faulty HEAP_XMAX_LOCK_ONLY & HEAP_KEYS_UPDATED hintbit combination
Date
Msg-id 20210202042133.GA72239@nol
Whole thread Raw
In response to Re: Faulty HEAP_XMAX_LOCK_ONLY & HEAP_KEYS_UPDATED hintbit combination  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Responses Re: Faulty HEAP_XMAX_LOCK_ONLY & HEAP_KEYS_UPDATED hintbit combination
List pgsql-hackers
On Mon, Feb 01, 2021 at 02:00:48PM -0300, Alvaro Herrera wrote:
> On 2021-Jan-24, Julien Rouhaud wrote:
> 
> > While working on pg14 compatibility for an extension relying on an apparently
> > uncommon combination of FOR UPDATE and stored function calls, I hit some new
> > Asserts introduced in 866e24d47db (Extend amcheck to check heap pages):
> > 
> > +    /*
> > +     * Do not allow tuples with invalid combinations of hint bits to be placed
> > +     * on a page.  These combinations are detected as corruption by the
> > +     * contrib/amcheck logic, so if you disable one or both of these
> > +     * assertions, make corresponding changes there.
> > +     */
> > +    Assert(!((tuple->t_data->t_infomask & HEAP_XMAX_LOCK_ONLY) &&
> > +             (tuple->t_data->t_infomask2 & HEAP_KEYS_UPDATED)));
> > 
> > 
> > I attach a simple self contained script to reproduce the problem, the last
> > UPDATE triggering the Assert.
> 
> Maybe we should contest the idea that this is a sensible thing to Assert
> against.  AFAICS this was originally suggested here:
>
https://www.postgresql.org/message-id/flat/CAFiTN-syyHc3jZoou51v0SR8z0POoNfktqEO6MaGig4YS8mosA%40mail.gmail.com#ad215d0ee0606b5f67bbc57d011c96b8
> and it appears now to have been a bad idea.  If I recall correctly,
> HEAP_KEYS_UPDATED is supposed to distinguish locks/updates that don't
> modify the key columns from those that do.  Since SELECT FOR UPDATE
> stands for a future update that may modify arbitrary portions of the
> tuple (including "key" columns), then it produces that bit, just as said
> UPDATE or a DELETE; as opposed to SELECT FOR NO KEY UPDATE which stands
> for a future UPDATE that will only change columns that aren't part of
> any keys.

Thanks for the clarification, that makes sense.

> So I think that I misspoke earlier in this thread when I said this is a
> bug, and that the right fix here is to remove the Assert() and change
> amcheck to match.

I'm attaching a patch to do so.

> Separately, maybe it'd also be good to have a test case based on
> Julien's SQL snippet that produces this particular infomask combination
> (and other interesting combinations) and passes them through VACUUM etc
> to see that everything behaves correctly.

I also updated amcheck perl regression tests to generate such a combination,
add added an additional pass of verify_heapam() just after the VACUUM.

> 
> You could also argue the HEAP_KEYS_UPDATED is a misnomer and that we'd
> do well to change its name, and update README.tuplock.

Changing the name may be overkill, but claryfing the hint bit usage in
README.tuplock would definitely be useful, especially since the combination
isn't always produced.  How about adding something like:

 HEAP_KEYS_UPDATED
  This bit lives in t_infomask2.  If set, indicates that the XMAX updated
  this tuple and changed the key values, or it deleted the tuple.
+  It can also be set in combination of HEAP_XMAX_LOCK_ONLY.
  It's set regardless of whether the XMAX is a TransactionId or a MultiXactId.


Attachment

pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit
Next
From: Peter Geoghegan
Date:
Subject: Re: New IndexAM API controlling index vacuum strategies