Thread: heapgettup refactoring

heapgettup refactoring

From
Melanie Plageman
Date:
Hi,

Attached is a patchset to refactor heapgettup(), heapgettup_pagemode(),
and heapgetpage(). heapgettup() and heapgettup_pagemode() have a lot of
duplicated code, confusingly nested if statements, and unnecessary local
variables. While working on a feature for the AIO/DIO patchset, I
noticed that it was difficult to add new code to heapgettup() and
heapgettup_pagemode() because of how the functions are written.

I've taken a stab at refactoring them -- without generating less
efficient code or causing regressions. I'm interested if people find it
more readable and if those with more assembly expertise see issues (new
branches added which are not highly predictable, etc). I took a look at
the assembly for those symbols compiled at O2 but am not experienced
enough at analysis to come to any conclusions.

- Melanie

Attachment

Re: heapgettup refactoring

From
Justin Pryzby
Date:
FYI:

[18:51:54.707] ../src/backend/access/heap/heapam.c(720): warning C4098: 'heapgettup': 'void' function returning a
value
[18:51:54.707] ../src/backend/access/heap/heapam.c(850): warning C4098: 'heapgettup_pagemode': 'void' function
returninga value
 

For some reason, MSVC is the only one to complain, and cfbot doesn't
currently tell you about it.  I have a patch to show that, which I'll
send $later.



Re: heapgettup refactoring

From
Andres Freund
Date:
Hi,

On 2022-10-31 14:37:39 -0400, Melanie Plageman wrote:
> and heapgetpage(). heapgettup() and heapgettup_pagemode() have a lot of
> duplicated code, confusingly nested if statements, and unnecessary local
> variables. While working on a feature for the AIO/DIO patchset, I
> noticed that it was difficult to add new code to heapgettup() and
> heapgettup_pagemode() because of how the functions are written.

Thanks for working on this - the current state is quite painful.


> From cde2d6720f4f5ab2531c22ad4a5f0d9e6ec1039d Mon Sep 17 00:00:00 2001
> From: Melanie Plageman <melanieplageman@gmail.com>
> Date: Wed, 26 Oct 2022 20:00:34 -0400
> Subject: [PATCH v1 1/3] Remove breaks in HeapTupleSatisfiesVisibility
>
> breaks in HeapTupleSatisfiesVisibility were superfluous
> ---
>  src/backend/access/heap/heapam_visibility.c | 7 -------
>  1 file changed, 7 deletions(-)
>
> diff --git a/src/backend/access/heap/heapam_visibility.c b/src/backend/access/heap/heapam_visibility.c
> index 6e33d1c881..dd5d5da190 100644
> --- a/src/backend/access/heap/heapam_visibility.c
> +++ b/src/backend/access/heap/heapam_visibility.c
> @@ -1769,25 +1769,18 @@ HeapTupleSatisfiesVisibility(HeapTuple htup, Snapshot snapshot, Buffer buffer)
>      {
>          case SNAPSHOT_MVCC:
>              return HeapTupleSatisfiesMVCC(htup, snapshot, buffer);
> -            break;
>          case SNAPSHOT_SELF:
>              return HeapTupleSatisfiesSelf(htup, snapshot, buffer);
> -            break;
>          case SNAPSHOT_ANY:
>              return HeapTupleSatisfiesAny(htup, snapshot, buffer);
> -            break;
>          case SNAPSHOT_TOAST:
>              return HeapTupleSatisfiesToast(htup, snapshot, buffer);
> -            break;
>          case SNAPSHOT_DIRTY:
>              return HeapTupleSatisfiesDirty(htup, snapshot, buffer);
> -            break;
>          case SNAPSHOT_HISTORIC_MVCC:
>              return HeapTupleSatisfiesHistoricMVCC(htup, snapshot, buffer);
> -            break;
>          case SNAPSHOT_NON_VACUUMABLE:
>              return HeapTupleSatisfiesNonVacuumable(htup, snapshot, buffer);
> -            break;
>      }

Not sure what the author of this code, a certain Mr Freund, was thinking when
he added those returns...


> From 9d8b01960463dc64ff5b111d523ff80fce3017af Mon Sep 17 00:00:00 2001
> From: Melanie Plageman <melanieplageman@gmail.com>
> Date: Mon, 31 Oct 2022 13:40:06 -0400
> Subject: [PATCH v1 2/3] Turn HeapKeyTest macro into function
>
> This should always be inlined appropriately now. It is easier to read as
> a function. Also, remove unused include in catcache.c.
> ---
>  src/backend/access/heap/heapam.c   | 10 ++--
>  src/backend/utils/cache/catcache.c |  1 -
>  src/include/access/valid.h         | 76 ++++++++++++------------------
>  3 files changed, 36 insertions(+), 51 deletions(-)
>
> diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
> index 12be87efed..1c995faa12 100644
> --- a/src/backend/access/heap/heapam.c
> +++ b/src/backend/access/heap/heapam.c
> @@ -716,8 +716,10 @@ heapgettup(HeapScanDesc scan,
>                                                      snapshot);
>
>                  if (valid && key != NULL)
> -                    HeapKeyTest(tuple, RelationGetDescr(scan->rs_base.rs_rd),
> -                                nkeys, key, valid);
> +                {
> +                    valid = HeapKeyTest(tuple, RelationGetDescr(scan->rs_base.rs_rd),
> +                                nkeys, key);
> +                }
>
>                  if (valid)
>                  {

superfluous parens.



> --- a/src/include/access/valid.h
> +++ b/src/include/access/valid.h
> @@ -19,51 +19,35 @@
>   *
>   *        Test a heap tuple to see if it satisfies a scan key.
>   */
> -#define HeapKeyTest(tuple, \
> -                    tupdesc, \
> -                    nkeys, \
> -                    keys, \
> -                    result) \
> -do \
> -{ \
> -    /* Use underscores to protect the variables passed in as parameters */ \
> -    int            __cur_nkeys = (nkeys); \
> -    ScanKey        __cur_keys = (keys); \
> - \
> -    (result) = true; /* may change */ \
> -    for (; __cur_nkeys--; __cur_keys++) \
> -    { \
> -        Datum    __atp; \
> -        bool    __isnull; \
> -        Datum    __test; \
> - \
> -        if (__cur_keys->sk_flags & SK_ISNULL) \
> -        { \
> -            (result) = false; \
> -            break; \
> -        } \
> - \
> -        __atp = heap_getattr((tuple), \
> -                             __cur_keys->sk_attno, \
> -                             (tupdesc), \
> -                             &__isnull); \
> - \
> -        if (__isnull) \
> -        { \
> -            (result) = false; \
> -            break; \
> -        } \
> - \
> -        __test = FunctionCall2Coll(&__cur_keys->sk_func, \
> -                                   __cur_keys->sk_collation, \
> -                                   __atp, __cur_keys->sk_argument); \
> - \
> -        if (!DatumGetBool(__test)) \
> -        { \
> -            (result) = false; \
> -            break; \
> -        } \
> -    } \
> -} while (0)
> +static inline bool
> +HeapKeyTest(HeapTuple tuple, TupleDesc tupdesc, int nkeys, ScanKey keys)
> +{
> +    int cur_nkeys = nkeys;
> +    ScanKey cur_key = keys;
> +
> +    for (; cur_nkeys--; cur_key++)
> +    {
> +        Datum atp;
> +        bool isnull;
> +        Datum test;
> +
> +        if (cur_key->sk_flags & SK_ISNULL)
> +            return false;
> +
> +        atp = heap_getattr(tuple, cur_key->sk_attno, tupdesc, &isnull);
> +
> +        if (isnull)
> +            return false;
> +
> +        test = FunctionCall2Coll(&cur_key->sk_func,
> +                                cur_key->sk_collation,
> +                                atp, cur_key->sk_argument);
> +
> +        if (!DatumGetBool(test))
> +            return false;
> +    }
> +
> +    return true;
> +}

Seems like a simple and nice win in readability.

I recall looking at this in the past and thinking that there was some
additional subtlety here, but I can't see what that'd be.



> From a894ce38c488df6546392b9f3bd894b67edf951e Mon Sep 17 00:00:00 2001
> From: Melanie Plageman <melanieplageman@gmail.com>
> Date: Mon, 31 Oct 2022 13:40:29 -0400
> Subject: [PATCH v1 3/3] Refactor heapgettup* and heapgetpage
>
> Simplify heapgettup(), heapgettup_pagemode(), and heapgetpage(). All
> three contained several unnecessary local variables, duplicate code, and
> nested if statements. Streamlining these improves readability and
> extensibility.

It'd be nice to break this into slightly smaller chunks.


> +
> +static inline void heapgettup_no_movement(HeapScanDesc scan)
> +{

FWIW, for function definitions we keep the return type (and with that also the
the "static inline") on a separate line.


> +    ItemId        lpp;
> +    OffsetNumber lineoff;
> +    BlockNumber page;
> +    Page dp;
> +    HeapTuple    tuple = &(scan->rs_ctup);
> +    /*
> +    * ``no movement'' scan direction: refetch prior tuple
> +    */
> +
> +    /* Since the tuple was previously fetched, needn't lock page here */
> +    if (!scan->rs_inited)
> +    {
> +        Assert(!BufferIsValid(scan->rs_cbuf));
> +        tuple->t_data = NULL;
> +        return;

Is it possible to have a no-movement scan with an uninitialized scan? That
doesn't really seem to make sense. At least that's how I understand the
explanation for NoMovement nearby:
 * dir == NoMovementScanDirection means "re-fetch the tuple indicated
 * by scan->rs_ctup".

We can't have a rs_ctup without an already started scan, no?

Looks like this is pre-existing code that you just moved, but it still seems
wrong.


> +    }
> +    page = ItemPointerGetBlockNumber(&(tuple->t_self));
> +    if (page != scan->rs_cblock)
> +        heapgetpage((TableScanDesc) scan, page);


We have a
    BlockNumber page;
and
    Page        dp;
in this code which seems almost intentionally confusing. This again is a
pre-existing issue but perhaps we could clean it up first?



> +static inline Page heapgettup_continue_page(HeapScanDesc scan, BlockNumber page, ScanDirection dir,
> +        int *linesleft, OffsetNumber *lineoff)
> +{
> +    HeapTuple    tuple = &(scan->rs_ctup);

Hm. Finding the next offset via rs_ctup doesn't seem quite right. For one,
it's not actually that cheap to extract the offset from an ItemPointer because
of the the way we pack it into ItemPointerData.


> +    Page dp = BufferGetPage(scan->rs_cbuf);
> +    TestForOldSnapshot(scan->rs_base.rs_snapshot, scan->rs_base.rs_rd, dp);

Newlines between definitions and code :)

Perhaps worth asserting that the scan is initialized and that rs_cbuf is valid?


> +    if (ScanDirectionIsForward(dir))
> +    {
> +        *lineoff = OffsetNumberNext(ItemPointerGetOffsetNumber(&(tuple->t_self)));
> +        *linesleft = PageGetMaxOffsetNumber(dp) - (*lineoff) + 1;

We can't access PageGetMaxOffsetNumber etc without holding a lock on the
page. It's not immediately obvious that that is held in all paths.


> +static inline BlockNumber heapgettup_initial_page(HeapScanDesc scan, ScanDirection dir)
> +{
> +    Assert(!ScanDirectionIsNoMovement(dir));
> +    Assert(!scan->rs_inited);

Is there a reason we couldn't set rs_inited in here, rather than reapeating
that in all callers?


ISTM that this function should deal with the
            /*
             * return null immediately if relation is empty
             */

logic, I think you now are repeating that check on every call to heapgettup().


> @@ -511,182 +711,55 @@ heapgettup(HeapScanDesc scan,
>             ScanKey key)
>  {
>      HeapTuple    tuple = &(scan->rs_ctup);
> -    Snapshot    snapshot = scan->rs_base.rs_snapshot;
> -    bool        backward = ScanDirectionIsBackward(dir);
>      BlockNumber page;
> -    bool        finished;
>      Page        dp;
> -    int            lines;
>      OffsetNumber lineoff;
>      int            linesleft;
> -    ItemId        lpp;
> +
> +    if (ScanDirectionIsNoMovement(dir))
> +        return heapgettup_no_movement(scan);

Maybe add an unlikely() - this path is barely ever used...


>      /*
> -     * calculate next starting lineoff, given scan direction
> +     * return null immediately if relation is empty
>       */
> -    if (ScanDirectionIsForward(dir))
> +    if (scan->rs_nblocks == 0 || scan->rs_numblocks == 0)
>      {

As mentioned above, I don't think we should repeat the nblocks check on every
call.


> +        page = scan->rs_cblock;
> +        LockBuffer(scan->rs_cbuf, BUFFER_LOCK_SHARE);
> +        dp = heapgettup_continue_page(scan, page, dir, &linesleft, &lineoff);
> +        goto continue_page;
>      }
>
>      /*
>       * advance the scan until we find a qualifying tuple or run out of stuff
>       * to scan
>       */
> -    lpp = PageGetItemId(dp, lineoff);
> -    for (;;)
> +    while (page != InvalidBlockNumber)
>      {
> +        heapgetpage((TableScanDesc) scan, page);
> +        LockBuffer(scan->rs_cbuf, BUFFER_LOCK_SHARE);
> +        dp = heapgettup_start_page(scan, page, dir, &linesleft, &lineoff);
> +    continue_page:


I don't like the goto continue_page at all.  Seems that the paths leading here
should call LockBuffer(), heapgettup_start_page() etc?  Possibly a do {} while
() loop could do the trick as well.



Greetings,

Andres Freund



Re: heapgettup refactoring

From
Melanie Plageman
Date:
Thanks for the review!
Attached is v2 with feedback addressed.

On Mon, Oct 31, 2022 at 9:09 PM Andres Freund <andres@anarazel.de> wrote:
> > From 9d8b01960463dc64ff5b111d523ff80fce3017af Mon Sep 17 00:00:00 2001
> > From: Melanie Plageman <melanieplageman@gmail.com>
> > Date: Mon, 31 Oct 2022 13:40:06 -0400
> > Subject: [PATCH v1 2/3] Turn HeapKeyTest macro into function
> >
> > This should always be inlined appropriately now. It is easier to read as
> > a function. Also, remove unused include in catcache.c.
> > ---
> >  src/backend/access/heap/heapam.c   | 10 ++--
> >  src/backend/utils/cache/catcache.c |  1 -
> >  src/include/access/valid.h         | 76 ++++++++++++------------------
> >  3 files changed, 36 insertions(+), 51 deletions(-)
> >
> > diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
> > index 12be87efed..1c995faa12 100644
> > --- a/src/backend/access/heap/heapam.c
> > +++ b/src/backend/access/heap/heapam.c
> > @@ -716,8 +716,10 @@ heapgettup(HeapScanDesc scan,
> >                                                                                                       snapshot);
> >
> >                               if (valid && key != NULL)
> > -                                     HeapKeyTest(tuple, RelationGetDescr(scan->rs_base.rs_rd),
> > -                                                             nkeys, key, valid);
> > +                             {
> > +                                     valid = HeapKeyTest(tuple, RelationGetDescr(scan->rs_base.rs_rd),
> > +                                                             nkeys, key);
> > +                             }
> >
> >                               if (valid)
> >                               {
>
> superfluous parens.

fixed.

> > From a894ce38c488df6546392b9f3bd894b67edf951e Mon Sep 17 00:00:00 2001
> > From: Melanie Plageman <melanieplageman@gmail.com>
> > Date: Mon, 31 Oct 2022 13:40:29 -0400
> > Subject: [PATCH v1 3/3] Refactor heapgettup* and heapgetpage
> >
> > Simplify heapgettup(), heapgettup_pagemode(), and heapgetpage(). All
> > three contained several unnecessary local variables, duplicate code, and
> > nested if statements. Streamlining these improves readability and
> > extensibility.
>
> It'd be nice to break this into slightly smaller chunks.

I can do that. Since incorporating feedback will be harder once I break
it up into smaller chunks, I'm inclined to wait to do so until I know
that the structure I have now is the one we will go with. (I know smaller
chunks will make it more reviewable.)

> > +
> > +static inline void heapgettup_no_movement(HeapScanDesc scan)
> > +{
>
> FWIW, for function definitions we keep the return type (and with that also the
> the "static inline") on a separate line.

Fixed

>
> > +     ItemId          lpp;
> > +     OffsetNumber lineoff;
> > +     BlockNumber page;
> > +     Page dp;
> > +     HeapTuple       tuple = &(scan->rs_ctup);
> > +     /*
> > +     * ``no movement'' scan direction: refetch prior tuple
> > +     */
> > +
> > +     /* Since the tuple was previously fetched, needn't lock page here */
> > +     if (!scan->rs_inited)
> > +     {
> > +             Assert(!BufferIsValid(scan->rs_cbuf));
> > +             tuple->t_data = NULL;
> > +             return;
>
> Is it possible to have a no-movement scan with an uninitialized scan? That
> doesn't really seem to make sense. At least that's how I understand the
> explanation for NoMovement nearby:
>  * dir == NoMovementScanDirection means "re-fetch the tuple indicated
>  * by scan->rs_ctup".
>
> We can't have a rs_ctup without an already started scan, no?
>
> Looks like this is pre-existing code that you just moved, but it still seems
> wrong.

Changed to an assert

>
> > +     }
> > +     page = ItemPointerGetBlockNumber(&(tuple->t_self));
> > +     if (page != scan->rs_cblock)
> > +             heapgetpage((TableScanDesc) scan, page);
>
>
> We have a
>         BlockNumber page;
> and
>         Page            dp;
> in this code which seems almost intentionally confusing. This again is a
> pre-existing issue but perhaps we could clean it up first?

in attached
page -> block
dp -> page
in basically all locations in heapam.c (should that be its own commit?)

> > +static inline Page heapgettup_continue_page(HeapScanDesc scan, BlockNumber page, ScanDirection dir,
> > +             int *linesleft, OffsetNumber *lineoff)
> > +{
> > +     HeapTuple       tuple = &(scan->rs_ctup);
>
> Hm. Finding the next offset via rs_ctup doesn't seem quite right. For one,
> it's not actually that cheap to extract the offset from an ItemPointer because
> of the the way we pack it into ItemPointerData.

So, it was like this before [1].
What about saving the lineoff in rs_cindex.

It is smaller, but that seems okay, right?

> > +     Page dp = BufferGetPage(scan->rs_cbuf);
> > +     TestForOldSnapshot(scan->rs_base.rs_snapshot, scan->rs_base.rs_rd, dp);
>
> Newlines between definitions and code :)

k

> Perhaps worth asserting that the scan is initialized and that rs_cbuf is valid?

indeed.

>
> > +     if (ScanDirectionIsForward(dir))
> > +     {
> > +             *lineoff = OffsetNumberNext(ItemPointerGetOffsetNumber(&(tuple->t_self)));
> > +             *linesleft = PageGetMaxOffsetNumber(dp) - (*lineoff) + 1;
>
> We can't access PageGetMaxOffsetNumber etc without holding a lock on the
> page. It's not immediately obvious that that is held in all paths.

In heapgettup() I call LockBuffer() before invoking
heapgettup_continue_page() and heapgettup_start_page() which are the
ones doing this.

I did have big plans for using the continue_page and start_page
functions in heapgettup_pagemode() as well, but since I'm not doing that
now, I can add in an expectation that the lock is held.

I added a comment saying the caller is responsible for acquiring the
lock if needed. I thought of adding an assert, but I don't see that
being done outside of bufmgr.c

    BufferDesc *bufHdr = GetBufferDescriptor(buffer - 1);
    Assert(LWLockHeldByMe(BufferDescriptorGetContentLock(bufHdr)));

> > +static inline BlockNumber heapgettup_initial_page(HeapScanDesc scan, ScanDirection dir)
> > +{
> > +     Assert(!ScanDirectionIsNoMovement(dir));
> > +     Assert(!scan->rs_inited);
>
> Is there a reason we couldn't set rs_inited in here, rather than reapeating
> that in all callers?

I wasn't sure if future callers or existing callers in the future may
need to do steps other than what is in heapgettup_initial_page() before
setting rs_inited. I thought of the responsibility of
heapgettup_initial_page() as returning the initial page to start the
scan. If it is going to do all initialization steps, perhaps the name
should change? I thought having a function that says it does
initialization of the scan might be confusing since initscan() also
exists.

> ISTM that this function should deal with the
>                         /*
>                          * return null immediately if relation is empty
>                          */
>
> logic, I think you now are repeating that check on every call to heapgettup().

So, that's a good point. If I move setting rs_inited inside of
heapgettup_initial_page(), then I can also easily move the empty table
check inside there too.

I don't want to set rs_inited before every return in
heapgettup_initial_page(). Do you think there are any issues with
setting it at the top of the function?

I thought about setting it at the very top (even before checking if the
relation is empty) Is it okay to set it before the empty table check?
rs_inited will be set to false at the bottom before returning. But,
maybe this will be an issue in other callers of
heapgettup_initial_page()?

Anyway, I have changed it in attached v2.

> > @@ -511,182 +711,55 @@ heapgettup(HeapScanDesc scan,
> >                  ScanKey key)
> >  {
> >       HeapTuple       tuple = &(scan->rs_ctup);
> > -     Snapshot        snapshot = scan->rs_base.rs_snapshot;
> > -     bool            backward = ScanDirectionIsBackward(dir);
> >       BlockNumber page;
> > -     bool            finished;
> >       Page            dp;
> > -     int                     lines;
> >       OffsetNumber lineoff;
> >       int                     linesleft;
> > -     ItemId          lpp;
> > +
> > +     if (ScanDirectionIsNoMovement(dir))
> > +             return heapgettup_no_movement(scan);
>
> Maybe add an unlikely() - this path is barely ever used...

done.

> > +             page = scan->rs_cblock;
> > +             LockBuffer(scan->rs_cbuf, BUFFER_LOCK_SHARE);
> > +             dp = heapgettup_continue_page(scan, page, dir, &linesleft, &lineoff);
> > +             goto continue_page;
> >       }
> >
> >       /*
> >        * advance the scan until we find a qualifying tuple or run out of stuff
> >        * to scan
> >        */
> > -     lpp = PageGetItemId(dp, lineoff);
> > -     for (;;)
> > +     while (page != InvalidBlockNumber)
> >       {
> > +             heapgetpage((TableScanDesc) scan, page);
> > +             LockBuffer(scan->rs_cbuf, BUFFER_LOCK_SHARE);
> > +             dp = heapgettup_start_page(scan, page, dir, &linesleft, &lineoff);
> > +     continue_page:
>
>
> I don't like the goto continue_page at all.  Seems that the paths leading here
> should call LockBuffer(), heapgettup_start_page() etc?  Possibly a do {} while
> () loop could do the trick as well.

I don't see how a  do while loop would solve help with the problem.
We need to check if the block number is valid after getting a block
assignment before doing heapgetpage() (e.g. after
heapgettup_initial_page() and after heapgettup_advance_page()).

Removing the goto continue_page means adding the heapgettpage(),
heapgettup_start_page(), etc code block in two places now (both after
heapgettup_initial_page() and after heapgettup_advance_page()) and, in
both locations we have to add an if statement to check if the block is
valid. I feel like this makes the function longer and harder to
understand. Keeping the loop as short as possible makes it clear what it
is doing. I think that with an appropriate warning comment, the goto
continue_page is clearer and easier to understand. To me, starting a
page at the top of the outer loop, then looping through the lines in the
page and is the structure that makes the most sense.

- Melanie

[1] https://github.com/postgres/postgres/blob/master/src/backend/access/heap/heapam.c#L572

Attachment

Re: heapgettup refactoring

From
Peter Eisentraut
Date:
On 04.11.22 16:51, Melanie Plageman wrote:
> Thanks for the review!
> Attached is v2 with feedback addressed.

Your 0001 had already been pushed.

I have pushed your 0002.

I have also pushed the renaming of page -> block, dp -> page separately. 
  This should reduce the size of your 0003 a bit.

Please produce an updated version of the 0003 page for further review.




Re: heapgettup refactoring

From
Melanie Plageman
Date:
On Wed, Nov 16, 2022 at 10:49 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
>
> On 04.11.22 16:51, Melanie Plageman wrote:
> > Thanks for the review!
> > Attached is v2 with feedback addressed.
>
> Your 0001 had already been pushed.
>
> I have pushed your 0002.
>
> I have also pushed the renaming of page -> block, dp -> page separately.
>   This should reduce the size of your 0003 a bit.
>
> Please produce an updated version of the 0003 page for further review.

Thanks for looking at this!
I have attached a patchset with only the code changes contained in the
previous patch 0003. I have broken the refactoring down into many
smaller pieces for ease of review.

- Melanie

Attachment

Re: heapgettup refactoring

From
Peter Eisentraut
Date:
On 30.11.22 23:34, Melanie Plageman wrote:
> I have attached a patchset with only the code changes contained in the
> previous patch 0003. I have broken the refactoring down into many
> smaller pieces for ease of review.

To keep this moving along a bit, I have committed your 0002, which I 
think is a nice little improvement on its own.




Re: heapgettup refactoring

From
Melanie Plageman
Date:
On Mon, Jan 2, 2023 at 5:23 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
>
> On 30.11.22 23:34, Melanie Plageman wrote:
> > I have attached a patchset with only the code changes contained in the
> > previous patch 0003. I have broken the refactoring down into many
> > smaller pieces for ease of review.
>
> To keep this moving along a bit, I have committed your 0002, which I
> think is a nice little improvement on its own.

Thanks!
I've attached a rebased patchset - v4.

I also changed heapgettup_no_movement() to noinline (instead of inline).
David Rowley pointed out that this might make more sense given how
comparatively rare no movement scans are.

- Melanie

Attachment

Re: heapgettup refactoring

From
Peter Eisentraut
Date:
On 03.01.23 21:39, Melanie Plageman wrote:
>> On 30.11.22 23:34, Melanie Plageman wrote:
>>> I have attached a patchset with only the code changes contained in the
>>> previous patch 0003. I have broken the refactoring down into many
>>> smaller pieces for ease of review.
>>
>> To keep this moving along a bit, I have committed your 0002, which I
>> think is a nice little improvement on its own.
> 
> Thanks!
> I've attached a rebased patchset - v4.
> 
> I also changed heapgettup_no_movement() to noinline (instead of inline).
> David Rowley pointed out that this might make more sense given how
> comparatively rare no movement scans are.

Ok, let's look through these patches starting from the top then.

v4-0001-Add-no-movement-scan-helper.patch

This makes sense overall; there is clearly some duplicate code that can 
be unified.

It appears that during your rebasing you have effectively reverted your 
earlier changes that have been committed as 
8e1db29cdbbd218ab6ba53eea56624553c3bef8c.  You should undo that.

I don't understand the purpose of the noinline maker.  If it's not 
necessary to inline, we can just leave it off, but there is no need to 
outright prevent inlining AFAICT.

I don't know why you changed the if/else sequences.  Before, the 
sequence was effectively

if (forward)
{
     ...
}
else if (backward)
{
     ...
}
else
{
     /* it's no movement */
}

Now it's changed to

if (no movement)
{
     ...
     return;
}

if (forward)
{
     ...
}
else
{
     Assert(backward);
     ...
}

Sure, that's the same thing, but it looks less elegant to me.




Re: heapgettup refactoring

From
Melanie Plageman
Date:
On Thu, Jan 5, 2023 at 8:52 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
>
> Ok, let's look through these patches starting from the top then.
>
> v4-0001-Add-no-movement-scan-helper.patch
>
> This makes sense overall; there is clearly some duplicate code that can
> be unified.
>
> It appears that during your rebasing you have effectively reverted your
> earlier changes that have been committed as
> 8e1db29cdbbd218ab6ba53eea56624553c3bef8c.  You should undo that.

Thanks. I think I have addressed this.
I've attached a rebased v5.

> I don't understand the purpose of the noinline maker.  If it's not
> necessary to inline, we can just leave it off, but there is no need to
> outright prevent inlining AFAICT.
>

I have removed it.

> I don't know why you changed the if/else sequences.  Before, the
> sequence was effectively
>
> if (forward)
> {
>      ...
> }
> else if (backward)
> {
>      ...
> }
> else
> {
>      /* it's no movement */
> }
>
> Now it's changed to
>
> if (no movement)
> {
>      ...
>      return;
> }
>
> if (forward)
> {
>      ...
> }
> else
> {
>      Assert(backward);
>      ...
> }
>
> Sure, that's the same thing, but it looks less elegant to me.

In this commit, you could keep the original ordering of if statements. I
preferred no movement scan first because then backwards and forwards
scans' code is physically closer to the rest of the code without the
intrusion of the no movement scan code.

Ultimately, the refactor (in later patches) flips the ordering of if
statements at the top from
        if (scan direction)
        to
        if (initial or continue)
and this isn't a very interesting distinction for no movement scans. By
dealing with no movement scan at the top, I didn't have to handle no
movement scans in the initial and continue branches in the new structure.

Also, I will note that patches 4-6 at least and perhaps 4-7 do not make
sense as separate commits. I separated them for ease of review. Each is
correct and passes tests but is not really an improvement without the
others.

- Melanie

Attachment

Re: heapgettup refactoring

From
Peter Eisentraut
Date:
On 10.01.23 21:31, Melanie Plageman wrote:
> On Thu, Jan 5, 2023 at 8:52 AM Peter Eisentraut
> <peter.eisentraut@enterprisedb.com> wrote:
>>
>> Ok, let's look through these patches starting from the top then.
>>
>> v4-0001-Add-no-movement-scan-helper.patch
>>
>> This makes sense overall; there is clearly some duplicate code that can
>> be unified.
>>
>> It appears that during your rebasing you have effectively reverted your
>> earlier changes that have been committed as
>> 8e1db29cdbbd218ab6ba53eea56624553c3bef8c.  You should undo that.
> 
> Thanks. I think I have addressed this.
> I've attached a rebased v5.

In your v2 patch, you remove these assertions:

-       /* check that rs_cindex is in sync */
-       Assert(scan->rs_cindex < scan->rs_ntuples);
-       Assert(lineoff == scan->rs_vistuples[scan->rs_cindex]);

Is that intentional?

I don't see any explanation, or some other equivalent code appearing 
elsewhere to replace this.




Re: heapgettup refactoring

From
David Rowley
Date:
On Thu, 19 Jan 2023 at 00:04, Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
> In your v2 patch, you remove these assertions:
>
> -       /* check that rs_cindex is in sync */
> -       Assert(scan->rs_cindex < scan->rs_ntuples);
> -       Assert(lineoff == scan->rs_vistuples[scan->rs_cindex]);
>
> Is that intentional?
>
> I don't see any explanation, or some other equivalent code appearing
> elsewhere to replace this.

I guess it's because those asserts are not relevant unless
heapgettup_no_movement() is being called from heapgettup_pagemode().
Maybe they can be put back along the lines of:

Assert((scan->rs_base.rs_flags & SO_ALLOW_PAGEMODE) == 0 ||
scan->rs_cindex < scan->rs_ntuples);
Assert((scan->rs_base.rs_flags & SO_ALLOW_PAGEMODE) == 0 || lineoff ==
scan->rs_vistuples[scan->rs_cindex]);

but it probably would be cleaner to just do an: if
(scan->rs_base.rs_flags & SO_ALLOW_PAGEMODE) { Assert(...);
Assert(...}; }

The only issue I see with that is that we don't seem to have anywhere
in the regression tests that call heapgettup_no_movement() when
rs_flags have SO_ALLOW_PAGEMODE. At least, adding an elog(NOTICE) to
heapgettup() just before calling heapgettup_no_movement() does not
seem to cause make check to fail.  I wonder if any series of SQL
commands would allow us to call heapgettup_no_movement() from
heapgettup()?

I think heapgettup_no_movement() also needs a header comment more
along the lines of:

/*
 * heapgettup_no_movement
 *        Helper function for NoMovementScanDirection direction for
heapgettup() and
 *        heapgettup_pagemode.
 */

I pushed the pgindent stuff that v5-0001 did along with some additions
to typedefs.list so that further runs could be done more easily as
changes are made to these patches.

David



Re: heapgettup refactoring

From
Melanie Plageman
Date:
Thanks for taking a look!

On Mon, Jan 23, 2023 at 6:08 AM David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Thu, 19 Jan 2023 at 00:04, Peter Eisentraut
> <peter.eisentraut@enterprisedb.com> wrote:
> > In your v2 patch, you remove these assertions:
> >
> > -       /* check that rs_cindex is in sync */
> > -       Assert(scan->rs_cindex < scan->rs_ntuples);
> > -       Assert(lineoff == scan->rs_vistuples[scan->rs_cindex]);
> >
> > Is that intentional?
> >
> > I don't see any explanation, or some other equivalent code appearing
> > elsewhere to replace this.
>
> I guess it's because those asserts are not relevant unless
> heapgettup_no_movement() is being called from heapgettup_pagemode().
> Maybe they can be put back along the lines of:
>
> Assert((scan->rs_base.rs_flags & SO_ALLOW_PAGEMODE) == 0 ||
> scan->rs_cindex < scan->rs_ntuples);
> Assert((scan->rs_base.rs_flags & SO_ALLOW_PAGEMODE) == 0 || lineoff ==
> scan->rs_vistuples[scan->rs_cindex]);
>
> but it probably would be cleaner to just do an: if
> (scan->rs_base.rs_flags & SO_ALLOW_PAGEMODE) { Assert(...);
> Assert(...}; }

I prefer the first method and have implemented that in attached v6.

> The only issue I see with that is that we don't seem to have anywhere
> in the regression tests that call heapgettup_no_movement() when
> rs_flags have SO_ALLOW_PAGEMODE. At least, adding an elog(NOTICE) to
> heapgettup() just before calling heapgettup_no_movement() does not
> seem to cause make check to fail.  I wonder if any series of SQL
> commands would allow us to call heapgettup_no_movement() from
> heapgettup()?

So, the places in which we set scan direction to no movement include:
- explain analyze on a ctas with no data
  EXPLAIN ANALYZE CREATE TABLE foo AS SELECT 1 WITH NO DATA;
  However, in standard_ExecutorRun() we only call ExecutePlan() if the
  ScanDirection is not no movement, so this wouldn't hit our code
- PortalRunSelect
- PersistHoldablePortal()

I can't say I know enough about portals currently to design a test that
will hit this code, but I will poke around some more.

> I think heapgettup_no_movement() also needs a header comment more
> along the lines of:
>
> /*
>  * heapgettup_no_movement
>  *        Helper function for NoMovementScanDirection direction for
> heapgettup() and
>  *        heapgettup_pagemode.
>  */

I've added a comment but I didn't include the function name in it -- I
find it repetitive when the comments above functions do that -- however,
I'm not strongly attached to that.

> I pushed the pgindent stuff that v5-0001 did along with some additions
> to typedefs.list so that further runs could be done more easily as
> changes are made to these patches.

Cool!

- Melanie

Attachment

Re: heapgettup refactoring

From
Melanie Plageman
Date:
On Tue, Jan 24, 2023 at 04:17:23PM -0500, Melanie Plageman wrote:
> Thanks for taking a look!
> 
> On Mon, Jan 23, 2023 at 6:08 AM David Rowley <dgrowleyml@gmail.com> wrote:
> >
> > On Thu, 19 Jan 2023 at 00:04, Peter Eisentraut
> > <peter.eisentraut@enterprisedb.com> wrote:
> > > In your v2 patch, you remove these assertions:
> > >
> > > -       /* check that rs_cindex is in sync */
> > > -       Assert(scan->rs_cindex < scan->rs_ntuples);
> > > -       Assert(lineoff == scan->rs_vistuples[scan->rs_cindex]);
> > >
> > > Is that intentional?
> > >
> > > I don't see any explanation, or some other equivalent code appearing
> > > elsewhere to replace this.
> >
> > I guess it's because those asserts are not relevant unless
> > heapgettup_no_movement() is being called from heapgettup_pagemode().
> > Maybe they can be put back along the lines of:
> >
> > Assert((scan->rs_base.rs_flags & SO_ALLOW_PAGEMODE) == 0 ||
> > scan->rs_cindex < scan->rs_ntuples);
> > Assert((scan->rs_base.rs_flags & SO_ALLOW_PAGEMODE) == 0 || lineoff ==
> > scan->rs_vistuples[scan->rs_cindex]);
> >
> > but it probably would be cleaner to just do an: if
> > (scan->rs_base.rs_flags & SO_ALLOW_PAGEMODE) { Assert(...);
> > Assert(...}; }
> 
> I prefer the first method and have implemented that in attached v6.
> 
> > The only issue I see with that is that we don't seem to have anywhere
> > in the regression tests that call heapgettup_no_movement() when
> > rs_flags have SO_ALLOW_PAGEMODE. At least, adding an elog(NOTICE) to
> > heapgettup() just before calling heapgettup_no_movement() does not
> > seem to cause make check to fail.  I wonder if any series of SQL
> > commands would allow us to call heapgettup_no_movement() from
> > heapgettup()?
> 
> So, the places in which we set scan direction to no movement include:
> - explain analyze on a ctas with no data
>   EXPLAIN ANALYZE CREATE TABLE foo AS SELECT 1 WITH NO DATA;
>   However, in standard_ExecutorRun() we only call ExecutePlan() if the
>   ScanDirection is not no movement, so this wouldn't hit our code
> - PortalRunSelect
> - PersistHoldablePortal()
> 
> I can't say I know enough about portals currently to design a test that
> will hit this code, but I will poke around some more.
 
I don't think we can write a test for this afterall. I've started
another thread on the topic over here:

https://www.postgresql.org/message-id/CAAKRu_bvkhka0CZQun28KTqhuUh5ZqY%3D_T8QEqZqOL02rpi2bw%40mail.gmail.com



Re: heapgettup refactoring

From
David Rowley
Date:
"On Wed, 25 Jan 2023 at 10:17, Melanie Plageman
<melanieplageman@gmail.com> wrote:
> I've added a comment but I didn't include the function name in it -- I
> find it repetitive when the comments above functions do that -- however,
> I'm not strongly attached to that.

I think the general format for header comments is:

/*
 * <function name>
 *\t\t<brief summary of what function does>
 *
 * [Further details]
 */

We've certainly got places that don't follow that, but I don't think
that's any reason to have no comment or invent some new format.

heapam.c seems to have some other format where we do: "<function name>
- <brief summary of what function does>". I generally just try to copy
the style from the surrounding code. I think generally, people won't
argue if you follow the style from the surrounding code, but there'd
be exceptions to that, I'm sure.

I'll skip further review of 0001 here as the whole
ScanDirectionNoMovement case is being discussed on the other thread.

v6-0002:

1. heapgettup_initial_block() needs a header comment to mention what
it does and what it returns. It would be good to make it obvious that
it returns InvalidBlockNumber when there are no blocks to scan.

2. After heapgettup_initial_block(), you're checking "if (block ==
InvalidBlockNumber). It might be worth a mention something like

/*
 * Check if we got to the end of the scan already.  This could happen for
 * an empty relation or if parallel workers scanned everything before we
 * got a chance.
 */

the backward scan comment wouldn't mention parallel workers.

v6-0003:

3. Can you explain why you removed the snapshot local variable in heapgettup()?

4. I think it might be a good idea to use unlikely() in if
(!scan->rs_inited). The idea is to help coax the compiler into moving
that code off to a cold path. That's likely especially important if
heapgettup_initial_block is inlined, which I see it is marked as.

v6-0004:

5. heapgettup_start_page() and heapgettup_continue_page() both need a
header comment to explain what they do and what the inputs and output
arguments are.

6. I'm not too sure what the following comment means:

/* block and lineoff now reference the physically next tid */

"block" is just a parameter to the function and its value is not
adjusted. The comment makes it sound like something was changed.

(I think these might be just not well updated from having split this
out from the 0006 patch as the same comment makes more sense in 0006)

v6-0005:

7. heapgettup_advance_block() needs a header comment.

8. Is there a reason why heapgettup_advance_block() handle backward
scans first? I'd expect you should just follow the lead of the other
functions and do ScanDirectionIsForward first.

v6-0006

David



Re: heapgettup refactoring

From
Melanie Plageman
Date:
v7 attached

On Fri, Jan 27, 2023 at 10:34 PM David Rowley <dgrowleyml@gmail.com> wrote:
>
> "On Wed, 25 Jan 2023 at 10:17, Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> > I've added a comment but I didn't include the function name in it -- I
> > find it repetitive when the comments above functions do that -- however,
> > I'm not strongly attached to that.
>
> I think the general format for header comments is:
>
> /*
>  * <function name>
>  *\t\t<brief summary of what function does>
>  *
>  * [Further details]
>  */
>
> We've certainly got places that don't follow that, but I don't think
> that's any reason to have no comment or invent some new format.
>
> heapam.c seems to have some other format where we do: "<function name>
> - <brief summary of what function does>". I generally just try to copy
> the style from the surrounding code. I think generally, people won't
> argue if you follow the style from the surrounding code, but there'd
> be exceptions to that, I'm sure.

I have followed the same convention as the other functions in heapam.c
in the various helper functions comments I've added in this version.

> v6-0002:
>
> 1. heapgettup_initial_block() needs a header comment to mention what
> it does and what it returns. It would be good to make it obvious that
> it returns InvalidBlockNumber when there are no blocks to scan.

I've done this.

>
> 2. After heapgettup_initial_block(), you're checking "if (block ==
> InvalidBlockNumber). It might be worth a mention something like
>
> /*
>  * Check if we got to the end of the scan already.  This could happen for
>  * an empty relation or if parallel workers scanned everything before we
>  * got a chance.
>  */
>
> the backward scan comment wouldn't mention parallel workers.

I've done this as well.

>
> v6-0003:
>
> 3. Can you explain why you removed the snapshot local variable in heapgettup()?

In the subsequent commit, the helpers I add call TestForOldSnapshot(),
and I didn't want to pass in the snapshot as a separate parameter since
I already need to pass the scan descriptor. I thought it was confusing
to have a local variable (snapshot) used in some places and the one in
the scan used in others. This "streamlining" commit also reduces the
number of times the snapshot variable is used, making it less necessary
to have a local variable.

I didn't remove the snapshot local variable in the same commit as adding
the helpers because I thought it made the diff harder to understand (for
review, the final commit should likely not be separate patches).

> 4. I think it might be a good idea to use unlikely() in if
> (!scan->rs_inited). The idea is to help coax the compiler into moving
> that code off to a cold path. That's likely especially important if
> heapgettup_initial_block is inlined, which I see it is marked as.

I've gone ahead and added unlikely. However, should I perhaps skip
inlining the heapgettup_initial_block() function?

> v6-0004:
>
> 5. heapgettup_start_page() and heapgettup_continue_page() both need a
> header comment to explain what they do and what the inputs and output
> arguments are.

I've added these. I've also removed an unused parameter to both, block.

>
> 6. I'm not too sure what the following comment means:
>
> /* block and lineoff now reference the physically next tid */
>
> "block" is just a parameter to the function and its value is not
> adjusted. The comment makes it sound like something was changed.
>
> (I think these might be just not well updated from having split this
> out from the 0006 patch as the same comment makes more sense in 0006)

Yes, that is true. I've updated it to just mention lineoff.

> v6-0005:
>
> 7. heapgettup_advance_block() needs a header comment.
>
> 8. Is there a reason why heapgettup_advance_block() handle backward
> scans first? I'd expect you should just follow the lead of the other
> functions and do ScanDirectionIsForward first.

The reason I do this is that backwards scans cannot be parallel, so
handling backwards scans first let me return, then handle parallel
scans, then forward scans. This reduced the level of nesting/if
statements for all of the code in this function.

- Melanie

Attachment

Re: heapgettup refactoring

From
David Rowley
Date:
On Tue, 31 Jan 2023 at 12:18, Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Fri, Jan 27, 2023 at 10:34 PM David Rowley <dgrowleyml@gmail.com> wrote:
> > 4. I think it might be a good idea to use unlikely() in if
> > (!scan->rs_inited). The idea is to help coax the compiler into moving
> > that code off to a cold path. That's likely especially important if
> > heapgettup_initial_block is inlined, which I see it is marked as.
>
> I've gone ahead and added unlikely. However, should I perhaps skip
> inlining the heapgettup_initial_block() function?

I'm not sure of the exact best combination of functions to mark as
inline. I did try the v7 patchset from 0002 to 0006 on top of c2891175
and I found that the performance is slightly better after removing
inline from all 4 of the helper functions. However, I think if we do
unlikely() and the function is moved into the cold path then it
matters less if it's inlined.

create table a (a int);
insert into a select x from generate_series(1,1000000)x;
vacuum freeze a;

$ cat seqscan.sql
select * from a where a = 0;
$ cat countall.sql
select count(*) from a;

seqscan.sql filters out all rows and countall.sql returns all rows and
does an aggregate so we don't have to return all those in the query.

max_parallel_workers_per_gather=0;

master
$ psql -c "select pg_prewarm('a')" postgres > /dev/null && for i in
{1..3}; do pgbench -n -f seqscan.sql -M prepared -T 10 postgres | grep
tps; done
tps = 25.464091 (without initial connection time)
tps = 25.117001 (without initial connection time)
tps = 25.141646 (without initial connection time)

$ psql -c "select pg_prewarm('a')" postgres > /dev/null && for i in
{1..3}; do pgbench -n -f countall.sql -M prepared -T 10 postgres |
grep tps; done
tps = 27.906307 (without initial connection time)
tps = 27.527580 (without initial connection time)
tps = 27.563035 (without initial connection time)

master + v7
$ psql -c "select pg_prewarm('a')" postgres > /dev/null && for i in
{1..3}; do pgbench -n -f seqscan.sql -M prepared -T 10 postgres | grep
tps; done
tps = 25.920370 (without initial connection time)
tps = 25.680052 (without initial connection time)
tps = 24.988895 (without initial connection time)

$ psql -c "select pg_prewarm('a')" postgres > /dev/null && for i in
{1..3}; do pgbench -n -f countall.sql -M prepared -T 10 postgres |
grep tps; done
tps = 33.783122 (without initial connection time)
tps = 33.248571 (without initial connection time)
tps = 33.512984 (without initial connection time)

master + v7 + inline removed
$ psql -c "select pg_prewarm('a')" postgres > /dev/null && for i in
{1..3}; do pgbench -n -f seqscan.sql -M prepared -T 10 postgres | grep
tps; done
tps = 27.680115 (without initial connection time)
tps = 26.418562 (without initial connection time)
tps = 26.166800 (without initial connection time)

$ psql -c "select pg_prewarm('a')" postgres > /dev/null && for i in
{1..3}; do pgbench -n -f countall.sql -M prepared -T 10 postgres |
grep tps; done
tps = 33.948588 (without initial connection time)
tps = 33.684966 (without initial connection time)
tps = 33.946700 (without initial connection time)

You can see that v7 helps countall.sql quite a bit. It seems to also
help a little bit with seqscan.sql. v7 + inline removed makes
seqscan.sql a decent amount faster than both master and master + v7.

David



Re: heapgettup refactoring

From
David Rowley
Date:
On Tue, 31 Jan 2023 at 12:18, Melanie Plageman
<melanieplageman@gmail.com> wrote:
> v7 attached

I've been looking over the v7-0002 patch today and I did make a few
adjustments to heapgettup_initial_block() as I would prefer to see the
branching of each of all these helper functions follow the pattern of:

if (<forward scan>)
{
    if (<parallel scan>)
        <parallel stuff>
    else
        <serial stuff>
}
else
{
    <backwards serial stuff>
}

which wasn't quite what the function was doing.

Along the way, I noticed that 0002 has a subtle bug that does not seem
to be present once the remaining patches are applied.  I think I'm
happier to push these along the lines of how you have them in the
patches, so I've held off pushing for now due to the bug and the
change I had to make to fix it.

The problem is around the setting of scan->rs_inited = true; you've
moved that into heapgettup_initial_block() and you've correctly not
initialised the scan for empty tables when you return
InvalidBlockNumber, however, you've not correctly considered the fact
that table_block_parallelscan_nextpage() could also return
InvalidBlockNumber if the parallel workers manage to grab all of the
blocks before the current process gets the first block. I don't know
for sure, but it looks like this could cause problems when
heapgettup() or heapgettup_pagemode() got called again for a rescan.
We'd have returned the NULL tuple to indicate that no further tuples
exist, but we'll have left rs_inited set to true which looks like
it'll cause issues.

I wondered if it might be better to do the scan->rs_inited = true; in
heapgettup() and heapgettup_pagemode() instead. The attached v8 patch
does it this way. Despite this fixing that bug, I think this might be
a slightly better division of duties.

If you're ok with the attached (and everyone else is too), then I can
push it in the (NZ) morning.  The remaining patches would need to be
rebased due to my changes.

David

Attachment

Re: heapgettup refactoring

From
Melanie Plageman
Date:
On Thu, Feb 02, 2023 at 12:21:20AM +1300, David Rowley wrote:
> On Tue, 31 Jan 2023 at 12:18, Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> > v7 attached
> 
> I've been looking over the v7-0002 patch today and I did make a few
> adjustments to heapgettup_initial_block() as I would prefer to see the
> branching of each of all these helper functions follow the pattern of:
> 
> if (<forward scan>)
> {
>     if (<parallel scan>)
>         <parallel stuff>
>     else
>         <serial stuff>
> }
> else
> {
>     <backwards serial stuff>
> }
> 
> which wasn't quite what the function was doing.

I'm fine with this. One code comment about the new version inline.

> Along the way, I noticed that 0002 has a subtle bug that does not seem
> to be present once the remaining patches are applied.  I think I'm
> happier to push these along the lines of how you have them in the
> patches, so I've held off pushing for now due to the bug and the
> change I had to make to fix it.
> 
> The problem is around the setting of scan->rs_inited = true; you've
> moved that into heapgettup_initial_block() and you've correctly not
> initialised the scan for empty tables when you return
> InvalidBlockNumber, however, you've not correctly considered the fact
> that table_block_parallelscan_nextpage() could also return
> InvalidBlockNumber if the parallel workers manage to grab all of the
> blocks before the current process gets the first block. I don't know
> for sure, but it looks like this could cause problems when
> heapgettup() or heapgettup_pagemode() got called again for a rescan.
> We'd have returned the NULL tuple to indicate that no further tuples
> exist, but we'll have left rs_inited set to true which looks like
> it'll cause issues.

Ah, yes. In the later patches in the series, I handle all end of scan
cases (regardless of whether or not there was a beginning) in a single
place at the end of the function. There I release the buffer and reset
all state -- including setting rs_inited to false. So, that made it okay
to set rs_inited to true in heapgettup_initial_block().

When splitting it up, I made a mistake and missed the case you
mentioned. Thanks for catching that!

FWIW, I like setting rs_inited in heapgettup_initial_block() better in
the final refactor, but I agree with you that in this patch on its own
it is better in the body of heapgettup() and heapgettup_pagemode().
 
> I wondered if it might be better to do the scan->rs_inited = true; in
> heapgettup() and heapgettup_pagemode() instead. The attached v8 patch
> does it this way. Despite this fixing that bug, I think this might be
> a slightly better division of duties.

LGTM.

> From cbd37463bdaa96afed4c7c739c8e91b770a9f8a7 Mon Sep 17 00:00:00 2001
> From: David Rowley <dgrowley@gmail.com>
> Date: Wed, 1 Feb 2023 19:35:16 +1300
> Subject: [PATCH v8] Refactor heapam.c adding heapgettup_initial_block function
> 
> Here we adjust heapgettup() and heapgettup_pagemode() to move the code
> that fetches the first block out into a helper function.  This removes
> some code duplication.
> 
> Author: Melanie Plageman
> Reviewed-by: David Rowley
> Discussion: https://postgr.es/m/CAAKRu_bvkhka0CZQun28KTqhuUh5ZqY=_T8QEqZqOL02rpi2bw@mail.gmail.com
> ---
>  src/backend/access/heap/heapam.c | 225 ++++++++++++++-----------------
>  1 file changed, 103 insertions(+), 122 deletions(-)
> 
> diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
> index 0a8bac25f5..40168cc9ca 100644
> --- a/src/backend/access/heap/heapam.c
> +++ b/src/backend/access/heap/heapam.c
> @@ -483,6 +483,67 @@ heapgetpage(TableScanDesc sscan, BlockNumber block)
>      scan->rs_ntuples = ntup;
>  }
>  
> +/*
> + * heapgettup_initial_block - return the first BlockNumber to scan
> + *
> + * Returns InvalidBlockNumber when there are no blocks to scan.  This can
> + * occur with empty tables and in parallel scans when parallel workers get all
> + * of the pages before we can get a chance to get our first page.
> + */
> +static BlockNumber
> +heapgettup_initial_block(HeapScanDesc scan, ScanDirection dir)
> +{
> +    Assert(!scan->rs_inited);
> +
> +    /* When there are no pages to scan, return InvalidBlockNumber */
> +    if (scan->rs_nblocks == 0 || scan->rs_numblocks == 0)
> +        return InvalidBlockNumber;
> +
> +    if (ScanDirectionIsForward(dir))
> +    {
> +        /* serial scan */
> +        if (scan->rs_base.rs_parallel == NULL)
> +            return scan->rs_startblock;

I believe this else is superfluous since we returned above.

> +        else
> +        {
> +            /* parallel scan */
> +            table_block_parallelscan_startblock_init(scan->rs_base.rs_rd,
> +                                                     scan->rs_parallelworkerdata,
> +                                                     (ParallelBlockTableScanDesc) scan->rs_base.rs_parallel);
> +
> +            /* may return InvalidBlockNumber if there are no more blocks */
> +            return table_block_parallelscan_nextpage(scan->rs_base.rs_rd,
> +                                                     scan->rs_parallelworkerdata,
> +                                                     (ParallelBlockTableScanDesc) scan->rs_base.rs_parallel);
> +        }
> +    }
...
> @@ -889,62 +892,40 @@ heapgettup_pagemode(HeapScanDesc scan,
> -        if (!scan->rs_inited)
> -        {
> -            lineindex = lines - 1;
> -            scan->rs_inited = true;
> -        }
> -        else
> -        {
> +            page = BufferGetPage(scan->rs_cbuf);
> +            TestForOldSnapshot(scan->rs_base.rs_snapshot, scan->rs_base.rs_rd, page);
> +            lines = scan->rs_ntuples;
>              lineindex = scan->rs_cindex - 1;
>          }
> -        /* block and lineindex now reference the previous visible tid */

I think this is an unintentional diff.

>  
> +        /* block and lineindex now reference the previous visible tid */
>          linesleft = lineindex + 1;
>      }

- Melanie



Re: heapgettup refactoring

From
David Rowley
Date:
On Thu, 2 Feb 2023 at 10:12, Melanie Plageman <melanieplageman@gmail.com> wrote:
> FWIW, I like setting rs_inited in heapgettup_initial_block() better in
> the final refactor, but I agree with you that in this patch on its own
> it is better in the body of heapgettup() and heapgettup_pagemode().

We can reconsider that when we get to that patch. It just felt a bit
ugly to add an InvalidBlockNumber check after calling
table_block_parallelscan_nextpage()

> I believe this else is superfluous since we returned above.

TBH, that's on purpose. I felt that it just looked better that way as
the code all fitted onto my screen.  I think if the function was
longer and people had to scroll down to read it, it can often be
better to return and reduce the nesting. This allows you to mentally
not that a certain case is handled above.  However, since all these
helper functions seem to fit onto a screen without too much trouble,
it just seems better (to me) if they all follow the format that I
mentioned earlier. I might live to regret that as we often see
get-rid-of-useless-else-clause patches coming up. I'm starting to
wonder if someone's got some alarm that goes off every time one gets
committed, but we'll see. I'd much rather have consistency between the
helper functions than save a few bytes on tab characters. It would be
different if the indentation were shifting things too far right, but
that's not going to happen in a function that all fits onto a screen
at once.

I've attached a version of the next patch in the series. I admit to
making a couple of adjustments. I couldn't bring myself to remove the
snapshot local variable in this commit. We can deal with that when we
come to it in whichever patch that needs to be changed in.  The only
other thing I really did was question your use of rs_cindex to store
the last OffsetNumber.  I ended up adding a new field which slots into
the padding between a bool and BlockNumber field named rs_coffset for
this purpose. I noticed what Andres wrote [1] earlier in this thread
about that, so thought we should move away from looking at the last
tuple to get this number

I've attached the rebased and updated patch.

David

[1] https://postgr.es/m/20221101010948.hsf33emgnwzvil4a@awork3.anarazel.de

Attachment

Re: heapgettup refactoring

From
Melanie Plageman
Date:
On Thu, Feb 02, 2023 at 07:00:37PM +1300, David Rowley wrote:
> I've attached a version of the next patch in the series. I admit to
> making a couple of adjustments. I couldn't bring myself to remove the
> snapshot local variable in this commit. We can deal with that when we
> come to it in whichever patch that needs to be changed in.

That seems fine to keep the diff easy to understand. Also,
heapgettup_pagemode() didn't have a snapshot local variable either.

> The only other thing I really did was question your use of rs_cindex
> to store the last OffsetNumber.  I ended up adding a new field which
> slots into the padding between a bool and BlockNumber field named
> rs_coffset for this purpose. I noticed what Andres wrote [1] earlier
> in this thread about that, so thought we should move away from looking
> at the last tuple to get this number
> 
> [1] https://postgr.es/m/20221101010948.hsf33emgnwzvil4a@awork3.anarazel.de

So, what Andres had said was: 

> Hm. Finding the next offset via rs_ctup doesn't seem quite right. For one,
> it's not actually that cheap to extract the offset from an ItemPointer because
> of the the way we pack it into ItemPointerData.

Because I was doing this in an earlier version:

> > +     HeapTuple       tuple = &(scan->rs_ctup);

And then in the later part of the code got tuple->t_self.

I did this because the code in master does this:

  lineoff =            /* previous offnum */
    Min(lines,
      OffsetNumberPrev(ItemPointerGetOffsetNumber(&(tuple->t_self))));

So I figured it was the same. Based on Andres' feedback, I switched to
saving the offset number in the scan descriptor and figured I could
reuse rs_cindex since it is larger than an OffsetNumber.

Your code also switches to saving the OffsetNumber -- just in a separate
variable of OffsetNumber type. I am fine with this if it your rationale
is that it is not a good idea to store a smaller number in a larger
datatype. However, the benefit I saw in reusing rs_cindex is that we
could someday converge the code for heapgettup() and
heapgettup_pagemode() even more. Even in my final refactor, there is a
lot of duplicate code between the two.

- Melanie



Re: heapgettup refactoring

From
David Rowley
Date:
On Fri, 3 Feb 2023 at 06:23, Melanie Plageman <melanieplageman@gmail.com> wrote:
> Your code also switches to saving the OffsetNumber -- just in a separate
> variable of OffsetNumber type. I am fine with this if it your rationale
> is that it is not a good idea to store a smaller number in a larger
> datatype. However, the benefit I saw in reusing rs_cindex is that we
> could someday converge the code for heapgettup() and
> heapgettup_pagemode() even more. Even in my final refactor, there is a
> lot of duplicate code between the two.

I was more concerned about the reuse of an unrelated field.  I'm
struggling to imagine why using the separate field would cause any
issues around not being able to reduce the code duplication any more
than we otherwise would. Surely in one case you need to get the offset
by indexing the rs_vistuples[] array and the other is the offset
directly. The only thing I can think of that would allow us not to
have a condition there would be if we populated the rs_vistuples[]
array with 1..n.  I doubt should do that and if we did, we could just
use the rs_cindex to index that without having to worry that we're
using an unrelated field for something.

I've pushed all but the final 2 patches now.

David



Re: heapgettup refactoring

From
David Rowley
Date:
On Fri, 3 Feb 2023 at 15:26, David Rowley <dgrowleyml@gmail.com> wrote:
> I've pushed all but the final 2 patches now.

I just pushed the final patch in the series.  I held back on moving
the setting of rs_inited back into the heapgettup_initial_block()
helper function as I wondered if we should even keep that field.

It seems that rs_cblock == InvalidBlockNumber in all cases where
rs_inited == false, so maybe it's better just to use that as a
condition to check if the scan has started or not. I've attached a
patch which does that.

I ended up adjusting HeapScanDescData more than what is minimally
required to remove rs_inited. I wondered if rs_cindex should be closer
to rs_cblock in the struct so that we're more likely to be adjusting
the same cache line than if that field were closer to the end of the
struct.  We don't need rs_coffset and rs_cindex at the same time, so I
made it a union. I see that the struct is still 712 bytes before and
after this change. I've not yet tested to see if there are any
performance gains to this change.

David

Attachment

Re: heapgettup refactoring

From
Melanie Plageman
Date:
On Tue, Feb 07, 2023 at 05:40:13PM +1300, David Rowley wrote:
> On Fri, 3 Feb 2023 at 15:26, David Rowley <dgrowleyml@gmail.com> wrote:
> > I've pushed all but the final 2 patches now.
> 
> I just pushed the final patch in the series.

Cool!

> I held back on moving the setting of rs_inited back into the
> heapgettup_initial_block() helper function as I wondered if we should
> even keep that field.
> 
> It seems that rs_cblock == InvalidBlockNumber in all cases where
> rs_inited == false, so maybe it's better just to use that as a
> condition to check if the scan has started or not. I've attached a
> patch which does that.
> 
> I ended up adjusting HeapScanDescData more than what is minimally
> required to remove rs_inited. I wondered if rs_cindex should be closer
> to rs_cblock in the struct so that we're more likely to be adjusting
> the same cache line than if that field were closer to the end of the
> struct.  We don't need rs_coffset and rs_cindex at the same time, so I
> made it a union. I see that the struct is still 712 bytes before and
> after this change. I've not yet tested to see if there are any
> performance gains to this change.
> 

I like the idea of using a union.

> diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
> index 7eb79cee58..e171d6e38b 100644
> --- a/src/backend/access/heap/heapam.c
> +++ b/src/backend/access/heap/heapam.c
> @@ -321,13 +321,15 @@ initscan(HeapScanDesc scan, ScanKey key, bool keep_startblock)
>      }
>  
>      scan->rs_numblocks = InvalidBlockNumber;
> -    scan->rs_inited = false;
>      scan->rs_ctup.t_data = NULL;
>      ItemPointerSetInvalid(&scan->rs_ctup.t_self);
>      scan->rs_cbuf = InvalidBuffer;
>      scan->rs_cblock = InvalidBlockNumber;
>  
> -    /* page-at-a-time fields are always invalid when not rs_inited */
> +    /*
> +     * page-at-a-time fields are always invalid when
> +     * rs_cblock == InvalidBlockNumber
> +     */

So, I was wondering what we should do about initializing rs_coffset here
since it doesn't fall under "don't initialize it because it is only used
for page-at-a-time mode". It might not be required for us to initialize
it in initscan, but we do bother to initialize other "current scan
state" fields. We could check if pagemode is enabled and initialize
rs_coffset or rs_cindex depending on that.

Then maybe the comment should call out the specific page-at-a-time
fields that are automatically invalid? (e.g. rs_ntuples, rs_vistuples)

I presume the point of the comment is to explain why those fields are
not being initialized here, which was a question I had when I looked at
initscan(), so it seems like we should make sure it explains that.

> @@ -717,9 +720,9 @@ heapgettup_advance_block(HeapScanDesc scan, BlockNumber block, ScanDirection dir
>   * the scankeys.
>   *
>   * Note: when we fall off the end of the scan in either direction, we
> - * reset rs_inited.  This means that a further request with the same
> - * scan direction will restart the scan, which is a bit odd, but a
> - * request with the opposite scan direction will start a fresh scan
> + * reset rs_cblock to InvalidBlockNumber.  This means that a further request
> + * with the same scan direction will restart the scan, which is a bit odd, but
> + * a request with the opposite scan direction will start a fresh scan
>   * in the proper direction.  The latter is required behavior for cursors,
>   * while the former case is generally undefined behavior in Postgres
>   * so we don't care too much.

Not the fault of this patch, but I am having trouble parsing this
comment. What does restart the scan mean? I get that it is undefined
behavior, but it is confusing because it kind of sounds like a rescan
which is not what it is (right?). And what exactly is a fresh scan? It
is probably correct, I just don't really understand what it is saying.
Feel free to ignore this aside, as I think your change is correctly
updating the comment.

> @@ -2321,13 +2316,12 @@ heapam_scan_sample_next_block(TableScanDesc scan, SampleScanState *scanstate)
>              ReleaseBuffer(hscan->rs_cbuf);
>          hscan->rs_cbuf = InvalidBuffer;
>          hscan->rs_cblock = InvalidBlockNumber;
> -        hscan->rs_inited = false;
>  
>          return false;
>      }
>  
>      heapgetpage(scan, blockno);
> -    hscan->rs_inited = true;
> +    Assert(hscan->rs_cblock != InvalidBlockNumber);

Quite nice to have this assert.

> diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
> index 8d28bc93ef..c6efd59eb5 100644
> --- a/src/include/access/heapam.h
> +++ b/src/include/access/heapam.h
> @@ -56,9 +56,18 @@ typedef struct HeapScanDescData
>      /* rs_numblocks is usually InvalidBlockNumber, meaning "scan whole rel" */
>  
>      /* scan current state */
> -    bool        rs_inited;        /* false = scan not init'd yet */
> -    OffsetNumber rs_coffset;    /* current offset # in non-page-at-a-time mode */
> -    BlockNumber rs_cblock;        /* current block # in scan, if any */
> +    union
> +    {
> +        /* current offset in non-page-at-a-time mode */
> +        OffsetNumber rs_coffset;
> +
> +        /* current tuple's index in vistuples for page-at-a-time mode */
> +        int            rs_cindex;
> +    };

With the union up here, the comment about page-at-a-time mode members
near the bottom of the struct is now a bit misleading.

The rest of my thoughts about that are with that comment.

> +
> +    BlockNumber rs_cblock;        /* current block # in scan, or
> +                                 * InvalidBlockNumber when the scan is not yet
> +                                 * initialized */

The formatting of this comment is a bit difficult to read. Perhaps it
could go above the member?

Also, a few random other complaints about the comments in
HeapScanDescData that are also not the fault of this patch:

- why is this comment there twice?
    /* rs_numblocks is usually InvalidBlockNumber, meaning "scan whole rel" */

- above the union it said (prior to this patch also)
    /* scan current state */
    which is in contrast to 
    /* state set up at initscan time */
    from the top, but, arguably, rs_strategy is set up at initscan time

- who or what is NB?
    /* NB: if rs_cbuf is not InvalidBuffer, we hold a pin on that buffer */

>      Buffer        rs_cbuf;        /* current buffer in scan, if any */
>      /* NB: if rs_cbuf is not InvalidBuffer, we hold a pin on that buffer */
>  
> @@ -74,7 +83,6 @@ typedef struct HeapScanDescData
>      ParallelBlockTableScanWorkerData *rs_parallelworkerdata;
>  
>      /* these fields only used in page-at-a-time mode and for bitmap scans */
> -    int            rs_cindex;        /* current tuple's index in vistuples */
>      int            rs_ntuples;        /* number of visible tuples on page */
>      OffsetNumber rs_vistuples[MaxHeapTuplesPerPage];    /* their offsets */

I would personally be okay with either a block comment somewhere nearby
describing which members are used in page-at-a-time mode or individual
comments for each field that is used only in page-at-a-time mode (or
something else, I just think the situation with the patch applied
currently is confusing).

- Melanie



Re: heapgettup refactoring

From
David Rowley
Date:
On Wed, 8 Feb 2023 at 09:41, Melanie Plageman <melanieplageman@gmail.com> wrote:
>
> On Tue, Feb 07, 2023 at 05:40:13PM +1300, David Rowley wrote:
> > I ended up adjusting HeapScanDescData more than what is minimally
> > required to remove rs_inited. I wondered if rs_cindex should be closer
> > to rs_cblock in the struct so that we're more likely to be adjusting
> > the same cache line than if that field were closer to the end of the
> > struct.  We don't need rs_coffset and rs_cindex at the same time, so I
> > made it a union. I see that the struct is still 712 bytes before and
> > after this change. I've not yet tested to see if there are any
> > performance gains to this change.
> >
>
> I like the idea of using a union.

Using the tests mentioned in [1], I tested out
remove_HeapScanDescData_rs_inited_field.patch. It's not looking very
promising at all.

seqscan.sql test:

master (e2c78e7ab)
tps = 27.769076 (without initial connection time)
tps = 28.155233 (without initial connection time)
tps = 26.990389 (without initial connection time)

master + remove_HeapScanDescData_rs_inited_field.patch
tps = 23.990490 (without initial connection time)
tps = 23.450662 (without initial connection time)
tps = 23.600194 (without initial connection time)

master + remove_HeapScanDescData_rs_inited_field.patch without union
HeapScanDescData change (just remove rs_inited field)
tps = 24.419007 (without initial connection time)
tps = 24.221389 (without initial connection time)
tps = 24.187756 (without initial connection time)


countall.sql test:

master (e2c78e7ab)

tps = 33.999408 (without initial connection time)
tps = 33.664292 (without initial connection time)
tps = 33.869115 (without initial connection time)

master + remove_HeapScanDescData_rs_inited_field.patch
tps = 31.194316 (without initial connection time)
tps = 30.804987 (without initial connection time)
tps = 30.770236 (without initial connection time)

master + remove_HeapScanDescData_rs_inited_field.patch without union
HeapScanDescData change (just remove rs_inited field)
tps = 32.626187 (without initial connection time)
tps = 32.876362 (without initial connection time)
tps = 32.481729 (without initial connection time)

I don't really have any explanation for why this slows performance so
much. My thoughts are that if the performance of scans is this
sensitive to the order of the fields in the struct then it's an
independent project to learn out why and what we can realistically
change to get the best performance here.

> So, I was wondering what we should do about initializing rs_coffset here
> since it doesn't fall under "don't initialize it because it is only used
> for page-at-a-time mode". It might not be required for us to initialize
> it in initscan, but we do bother to initialize other "current scan
> state" fields. We could check if pagemode is enabled and initialize
> rs_coffset or rs_cindex depending on that.

Maybe master should be initialising this field already. I didn't quite
see it as important as it's never used before rs_inited is set to
true. Maybe setting it to InvalidOffsetNumber is a good idea just in
case something tries to use it before it gets set.

> >   * Note: when we fall off the end of the scan in either direction, we
> > - * reset rs_inited.  This means that a further request with the same
> > - * scan direction will restart the scan, which is a bit odd, but a
> > - * request with the opposite scan direction will start a fresh scan
> > + * reset rs_cblock to InvalidBlockNumber.  This means that a further request
> > + * with the same scan direction will restart the scan, which is a bit odd, but
> > + * a request with the opposite scan direction will start a fresh scan
> >   * in the proper direction.  The latter is required behavior for cursors,
> >   * while the former case is generally undefined behavior in Postgres
> >   * so we don't care too much.
>
> Not the fault of this patch, but I am having trouble parsing this
> comment. What does restart the scan mean? I get that it is undefined
> behavior, but it is confusing because it kind of sounds like a rescan
> which is not what it is (right?). And what exactly is a fresh scan? It
> is probably correct, I just don't really understand what it is saying.
> Feel free to ignore this aside, as I think your change is correctly
> updating the comment.

I struggled with this too. It just looks incorrect. As far as I see
it, once the scan ends we do the same thing regardless of what the
scan direction is. Maybe it's worth looking back at when that comment
was added and seeing if it was true when it was written and then see
what changed. I think it's worth improving that independently.

I think I'd like to focus on the cleanup stuff before looking into
getting rid of rs_inited. I'm not sure I'm going to get time to do the
required performance tests to look into why removing rs_inited slows
things down so much.

> Also, a few random other complaints about the comments in
> HeapScanDescData that are also not the fault of this patch:
>
> - why is this comment there twice?
>         /* rs_numblocks is usually InvalidBlockNumber, meaning "scan whole rel" */

Seems to date back to 2019. I'll push something shortly to remove the
additional one.

> - above the union it said (prior to this patch also)
>         /* scan current state */
>         which is in contrast to
>         /* state set up at initscan time */
>         from the top, but, arguably, rs_strategy is set up at initscan time

I'm not sure what the best thing to do about that is. Given the
performance numbers I showed above after removing the rs_inited field,
I'd rather not move that field in the struct up to the other fields
that are set in initscan(). Perhaps you can suggest a patch which
improves the comments?

> - who or what is NB?
>         /* NB: if rs_cbuf is not InvalidBuffer, we hold a pin on that buffer */

Latin for "note". It's used quite commonly in our code.

David

[1] https://postgr.es/m/CAApHDvpnA9SGp3OeXr4cYqX_w=NYN2YMzf2zfrABPNDsUqoNqw@mail.gmail.com



Re: heapgettup refactoring

From
David Rowley
Date:
On Wed, 8 Feb 2023 at 15:09, David Rowley <dgrowleyml@gmail.com> wrote:
> Using the tests mentioned in [1], I tested out
> remove_HeapScanDescData_rs_inited_field.patch. It's not looking very
> promising at all.

In light of the performance regression from removing the rs_inited
field, let's just forget doing that for now.  It does not really seem
that important compared to the other work that's already been done.

If one of us gets time during the v17 cycle, then maybe we can revisit it then.

I'll mark the patch as committed in the CF app.

David