Re: BitmapHeapScan streaming read user and prelim refactoring - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: BitmapHeapScan streaming read user and prelim refactoring
Date
Msg-id 063e4eb4-32d9-439e-a0b1-75565a9835a8@iki.fi
Whole thread Raw
In response to Re: BitmapHeapScan streaming read user and prelim refactoring  (Melanie Plageman <melanieplageman@gmail.com>)
List pgsql-hackers
On 19/06/2024 18:55, Melanie Plageman wrote:
> On Tue, Jun 18, 2024 at 6:02 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>>
>> I went through v22 to remind myself of what the patches do and do some
>> basic review - I have some simple questions / comments for now, nothing
>> major. I've kept the comments in separate 'review' patches, it does not
>> seem worth copying here.
> 
> Thanks so much for the review!
> 
> I've implemented your feedback in attached v23 except for what I
> mention below. I have not gone through each patch in the new set very
> carefully after making the changes because I think we should resolve
> the question of adding the new table scan descriptor before I do that.
> A change there will require a big rebase. Then I can go through each
> patch very carefully.
Had a quick look at this after a long pause. I only looked at the first 
few, hoping that reviewing them would allow you to commit at least some 
of them, making the rest easier.

v23-0001-table_scan_bitmap_next_block-counts-lossy-and-ex.patch

Looks good to me. (I'm not sure if this would be a net positive change 
on its own, but it's needed by the later patch so OK)

v23-0002-Remove-table_scan_bitmap_next_tuple-parameter-tb.patch

LGTM

v23-0003-Make-table_scan_bitmap_next_block-async-friendly.patch

> @@ -1955,19 +1954,26 @@ table_relation_estimate_size(Relation rel, int32 *attr_widths,
>   */
>  
>  /*
> - * Prepare to fetch / check / return tuples from `tbmres->blockno` as part of a
> - * bitmap table scan. `scan` needs to have been started via
> - * table_beginscan_bm(). Returns false if there are no tuples to be found on
> - * the page, true otherwise. `lossy_pages` is incremented if the block's
> - * representation in the bitmap is lossy; otherwise, `exact_pages` is
> - * incremented.
> + * Prepare to fetch / check / return tuples as part of a bitmap table scan.
> + * `scan` needs to have been started via table_beginscan_bm(). Returns false if
> + * there are no more blocks in the bitmap, true otherwise. `lossy_pages` is
> + * incremented if the block's representation in the bitmap is lossy; otherwise,
> + * `exact_pages` is incremented.
> + *
> + * `recheck` is set by the table AM to indicate whether or not the tuples
> + * from this block should be rechecked. Tuples from lossy pages will always
> + * need to be rechecked, but some non-lossy pages' tuples may also require
> + * recheck.
> + *
> + * `blockno` is only used in bitmap table scan code to validate that the
> + * prefetch block is staying ahead of the current block.
>   *
>   * Note, this is an optionally implemented function, therefore should only be
>   * used after verifying the presence (at plan time or such).
>   */
>  static inline bool
>  table_scan_bitmap_next_block(TableScanDesc scan,
> -                             struct TBMIterateResult *tbmres,
> +                             BlockNumber *blockno, bool *recheck,
>                               long *lossy_pages,
>                               long *exact_pages)
>  {

The new comment doesn't really explain what *blockno means. Is it an 
input or output parameter? How is it used with the prefetching?

v23-0004-Add-common-interface-for-TBMIterators.patch

> +/*
> + * Start iteration on a shared or private bitmap iterator. Note that tbm will
> + * only be provided by private BitmapHeapScan callers. dsa and dsp will only be
> + * provided by parallel BitmapHeapScan callers. For shared callers, one process
> + * must already have called tbm_prepare_shared_iterate() to create and set up
> + * the TBMSharedIteratorState. The TBMIterator is passed by reference to
> + * accommodate callers who would like to allocate it inside an existing struct.
> + */
> +void
> +tbm_begin_iterate(TBMIterator *iterator, TIDBitmap *tbm,
> +                  dsa_area *dsa, dsa_pointer dsp)
> +{
> +    Assert(iterator);
> +
> +    iterator->private_iterator = NULL;
> +    iterator->shared_iterator = NULL;
> +    iterator->exhausted = false;
> +
> +    /* Allocate a private iterator and attach the shared state to it */
> +    if (DsaPointerIsValid(dsp))
> +        iterator->shared_iterator = tbm_attach_shared_iterate(dsa, dsp);
> +    else
> +        iterator->private_iterator = tbm_begin_private_iterate(tbm);
> +}

Hmm, I haven't looked at how this is used the later patches, but a 
function signature where some parameters are used or not depending on 
the situation seems a bit awkward. Perhaps it would be better to let the 
caller call tbm_attach_shared_iterate(dsa, dsp) or 
tbm_begin_private_iterate(tbm), and provide a function to turn that into 
a TBMIterator? Something like:

TBMIterator *tbm_iterator_from_shared_iterator(TBMSharedIterator *);
TBMIterator *tbm_iterator_from_private_iterator(TBMPrivateIterator *);

Does tbm_iterator() really need the 'exhausted' flag? The private and 
shared variants work without one.

+1 on this patch in general, and I have no objections to its current 
form either, the above are just suggestions to consider.

v23-0006-BitmapHeapScan-uses-unified-iterator.patch

Makes sense. (Might be better to squash this with the previous patch)

v23-0007-BitmapHeapScan-Make-prefetch-sync-error-more-det.patch

LGTM

v23-0008-Push-current-scan-descriptor-into-specialized-sc.patch
v23-0009-Remove-ss_current-prefix-from-ss_currentScanDesc.patch

LGTM. I would squash these together.

v23-0010-Add-scan_in_progress-to-BitmapHeapScanState.patch

> --- a/src/include/nodes/execnodes.h
> +++ b/src/include/nodes/execnodes.h
> @@ -1804,6 +1804,7 @@ typedef struct ParallelBitmapHeapState
>   *        prefetch_target    current target prefetch distance
>   *        prefetch_maximum   maximum value for prefetch_target
>   *        initialized           is node is ready to iterate
> + *        scan_in_progress   is this a rescan
>   *        pstate               shared state for parallel bitmap scan
>   *        recheck               do current page's tuples need recheck
>   *        blockno               used to validate pf and current block in sync
> @@ -1824,6 +1825,7 @@ typedef struct BitmapHeapScanState
>      int            prefetch_target;
>      int            prefetch_maximum;
>      bool        initialized;
> +    bool        scan_in_progress;
>      ParallelBitmapHeapState *pstate;
>      bool        recheck;
>      BlockNumber blockno;

Hmm, the "is this a rescan" comment sounds inaccurate, because it is set 
as soon as the scan is started, not only when rescanning. Other than 
that, LGTM.

-- 
Heikki Linnakangas
Neon (https://neon.tech)




pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Add contrib/pg_logicalsnapinspect
Next
From: Alexander Lakhin
Date:
Subject: Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan