Re: Eliminate redundant tuple visibility check in vacuum - Mailing list pgsql-hackers

From Melanie Plageman
Subject Re: Eliminate redundant tuple visibility check in vacuum
Date
Msg-id CAAKRu_aUQcRvrt=ka4qCMJVRjGUvLCJjiiu_6iRQAq2mb9LOHA@mail.gmail.com
Whole thread Raw
In response to Re: Eliminate redundant tuple visibility check in vacuum  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Eliminate redundant tuple visibility check in vacuum
List pgsql-hackers
On Thu, Sep 21, 2023 at 3:53 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Wed, Sep 13, 2023 at 1:29 PM Andres Freund <andres@anarazel.de> wrote:
>
> > >               /*
> > >                * The criteria for counting a tuple as live in this block need to
> > > @@ -1682,7 +1664,7 @@ retry:
> > >                * (Cases where we bypass index vacuuming will violate this optimistic
> > >                * assumption, but the overall impact of that should be negligible.)
> > >                */
> > > -             switch (res)
> > > +             switch ((HTSV_Result) presult.htsv[offnum])
> > >               {
> > >                       case HEAPTUPLE_LIVE:
> >
> > I think we should assert that we have a valid HTSV_Result here, i.e. not
> > -1. You could wrap the cast and Assert into an inline funciton as well.
>
> This isn't a bad idea, although I don't find it completely necessary either.

Attached v5 does this. Even though a value of -1 would hit the default
switch case and error out, I can see the argument for this validation
-- as all other places switching on an HTSV_Result are doing so on a
value which was always an HTSV_Result.

Once I started writing the function comment, however, I felt a bit
awkward. In order to make the function available to both pruneheap.c
and vacuumlazy.c, I had to put it in a header file. Writing a
function, available to anyone including heapam.h, which takes an int
and returns an HTSV_Result feels a bit odd. Do we want it to be common
practice to use an int value outside the valid enum range to store
"status not yet computed" for HTSV_Results?

Anyway, on a tactical note, I added the inline function to heapam.h
below the PruneResult definition since it is fairly tightly coupled to
the htsv array in PruneResult. All of the function prototypes are
under a comment that says "function prototypes for heap access method"
-- which didn't feel like an accurate description of this function. I
wonder if it makes sense to have pruneheap.c include vacuum.h and move
pruning specific stuff like this helper and PruneResult over there? I
can't remember why I didn't do this before, but maybe there is a
reason not to? I also wasn't sure if I needed to forward declare the
inline function or not.

Oh, and, one more note. I've dropped the former patch 0001 which
changed the function comment about off_loc above heap_page_prune(). I
have plans to write a separate patch adding an error context callback
for HOT pruning with the offset number and would include such a change
in that patch.

- Melanie

Attachment

pgsql-hackers by date:

Previous
From: "Zhijie Hou (Fujitsu)"
Date:
Subject: RE: [PoC] pg_upgrade: allow to upgrade publisher node
Next
From: "Drouvot, Bertrand"
Date:
Subject: Re: Synchronizing slots from primary to standby