Re: heapgettup refactoring - Mailing list pgsql-hackers

From Melanie Plageman
Subject Re: heapgettup refactoring
Date
Msg-id 20230201211254.fn3xw334ye5gp6tr@liskov
Whole thread Raw
In response to Re: heapgettup refactoring  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: heapgettup refactoring
List pgsql-hackers
On Thu, Feb 02, 2023 at 12:21:20AM +1300, David Rowley wrote:
> On Tue, 31 Jan 2023 at 12:18, Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> > v7 attached
> 
> I've been looking over the v7-0002 patch today and I did make a few
> adjustments to heapgettup_initial_block() as I would prefer to see the
> branching of each of all these helper functions follow the pattern of:
> 
> if (<forward scan>)
> {
>     if (<parallel scan>)
>         <parallel stuff>
>     else
>         <serial stuff>
> }
> else
> {
>     <backwards serial stuff>
> }
> 
> which wasn't quite what the function was doing.

I'm fine with this. One code comment about the new version inline.

> Along the way, I noticed that 0002 has a subtle bug that does not seem
> to be present once the remaining patches are applied.  I think I'm
> happier to push these along the lines of how you have them in the
> patches, so I've held off pushing for now due to the bug and the
> change I had to make to fix it.
> 
> The problem is around the setting of scan->rs_inited = true; you've
> moved that into heapgettup_initial_block() and you've correctly not
> initialised the scan for empty tables when you return
> InvalidBlockNumber, however, you've not correctly considered the fact
> that table_block_parallelscan_nextpage() could also return
> InvalidBlockNumber if the parallel workers manage to grab all of the
> blocks before the current process gets the first block. I don't know
> for sure, but it looks like this could cause problems when
> heapgettup() or heapgettup_pagemode() got called again for a rescan.
> We'd have returned the NULL tuple to indicate that no further tuples
> exist, but we'll have left rs_inited set to true which looks like
> it'll cause issues.

Ah, yes. In the later patches in the series, I handle all end of scan
cases (regardless of whether or not there was a beginning) in a single
place at the end of the function. There I release the buffer and reset
all state -- including setting rs_inited to false. So, that made it okay
to set rs_inited to true in heapgettup_initial_block().

When splitting it up, I made a mistake and missed the case you
mentioned. Thanks for catching that!

FWIW, I like setting rs_inited in heapgettup_initial_block() better in
the final refactor, but I agree with you that in this patch on its own
it is better in the body of heapgettup() and heapgettup_pagemode().
 
> I wondered if it might be better to do the scan->rs_inited = true; in
> heapgettup() and heapgettup_pagemode() instead. The attached v8 patch
> does it this way. Despite this fixing that bug, I think this might be
> a slightly better division of duties.

LGTM.

> From cbd37463bdaa96afed4c7c739c8e91b770a9f8a7 Mon Sep 17 00:00:00 2001
> From: David Rowley <dgrowley@gmail.com>
> Date: Wed, 1 Feb 2023 19:35:16 +1300
> Subject: [PATCH v8] Refactor heapam.c adding heapgettup_initial_block function
> 
> Here we adjust heapgettup() and heapgettup_pagemode() to move the code
> that fetches the first block out into a helper function.  This removes
> some code duplication.
> 
> Author: Melanie Plageman
> Reviewed-by: David Rowley
> Discussion: https://postgr.es/m/CAAKRu_bvkhka0CZQun28KTqhuUh5ZqY=_T8QEqZqOL02rpi2bw@mail.gmail.com
> ---
>  src/backend/access/heap/heapam.c | 225 ++++++++++++++-----------------
>  1 file changed, 103 insertions(+), 122 deletions(-)
> 
> diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
> index 0a8bac25f5..40168cc9ca 100644
> --- a/src/backend/access/heap/heapam.c
> +++ b/src/backend/access/heap/heapam.c
> @@ -483,6 +483,67 @@ heapgetpage(TableScanDesc sscan, BlockNumber block)
>      scan->rs_ntuples = ntup;
>  }
>  
> +/*
> + * heapgettup_initial_block - return the first BlockNumber to scan
> + *
> + * Returns InvalidBlockNumber when there are no blocks to scan.  This can
> + * occur with empty tables and in parallel scans when parallel workers get all
> + * of the pages before we can get a chance to get our first page.
> + */
> +static BlockNumber
> +heapgettup_initial_block(HeapScanDesc scan, ScanDirection dir)
> +{
> +    Assert(!scan->rs_inited);
> +
> +    /* When there are no pages to scan, return InvalidBlockNumber */
> +    if (scan->rs_nblocks == 0 || scan->rs_numblocks == 0)
> +        return InvalidBlockNumber;
> +
> +    if (ScanDirectionIsForward(dir))
> +    {
> +        /* serial scan */
> +        if (scan->rs_base.rs_parallel == NULL)
> +            return scan->rs_startblock;

I believe this else is superfluous since we returned above.

> +        else
> +        {
> +            /* parallel scan */
> +            table_block_parallelscan_startblock_init(scan->rs_base.rs_rd,
> +                                                     scan->rs_parallelworkerdata,
> +                                                     (ParallelBlockTableScanDesc) scan->rs_base.rs_parallel);
> +
> +            /* may return InvalidBlockNumber if there are no more blocks */
> +            return table_block_parallelscan_nextpage(scan->rs_base.rs_rd,
> +                                                     scan->rs_parallelworkerdata,
> +                                                     (ParallelBlockTableScanDesc) scan->rs_base.rs_parallel);
> +        }
> +    }
...
> @@ -889,62 +892,40 @@ heapgettup_pagemode(HeapScanDesc scan,
> -        if (!scan->rs_inited)
> -        {
> -            lineindex = lines - 1;
> -            scan->rs_inited = true;
> -        }
> -        else
> -        {
> +            page = BufferGetPage(scan->rs_cbuf);
> +            TestForOldSnapshot(scan->rs_base.rs_snapshot, scan->rs_base.rs_rd, page);
> +            lines = scan->rs_ntuples;
>              lineindex = scan->rs_cindex - 1;
>          }
> -        /* block and lineindex now reference the previous visible tid */

I think this is an unintentional diff.

>  
> +        /* block and lineindex now reference the previous visible tid */
>          linesleft = lineindex + 1;
>      }

- Melanie



pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: pg_dump versus hash partitioning
Next
From: Robert Haas
Date:
Subject: Re: pg_dump versus hash partitioning