Re: BitmapHeapScan streaming read user and prelim refactoring - Mailing list pgsql-hackers

From Melanie Plageman
Subject Re: BitmapHeapScan streaming read user and prelim refactoring
Msg-id CAAKRu_a_p8ZsAq=eKSDErYuR17ZH7uv4UnQ4_NBe5NmtMpHaMA@mail.gmail.com
In response to Re: BitmapHeapScan streaming read user and prelim refactoring  (Melanie Plageman <melanieplageman@gmail.com>)
Responses Re: BitmapHeapScan streaming read user and prelim refactoring
List pgsql-hackers
On Fri, Jun 14, 2024 at 7:56 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> Attached v21 is the rest of the patches to make bitmap heap scan use
> the read stream API.
---snip---
> The one relic of iterator ownership is that for parallel bitmap heap
> scan, a single process must scan the index, construct the bitmap, and
> call tbm_prepare_shared_iterate() to set up the iterator for all the
> processes. I didn't see a way to push this down (because to build the
> bitmap we have to scan the index and call ExecProcNode()). I wonder if
> this creates an odd split of responsibilities. I could use some other
> synchronization mechanism to communicate which process built the
> bitmap in the generic bitmap table scan code and thus should set up
> the iterator in the heap implementation, but that sounds like a pretty
> bad idea. Also, I'm not sure how a non-block-based table AM would use
> TBMIterators (or even TIDBitmap) anyway.

I tinkered around with this some more and actually came up with a
solution to my primary concern with the code structure. Attached is
v22. It still needs the performance regression investigation mentioned
in my previous email, but I feel more confident in the layering of the
iterator ownership I ended up with.

Because the patch numbers have changed, below is a summary of the
contents with the new patch numbers:

Patches 0001-0003 implement the async-friendly behavior needed both to
push down the VM lookups for prefetching and eventually to use the
read stream API.

Patches 0004-0006 add and make use of a common interface for the
shared and private bitmap iterators per Heikki's suggestion in [1].
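
To illustrate what a common interface over shared and private iterators
could look like, here is a minimal standalone sketch. It is not taken from
the patches: every type and function name below is hypothetical, the bitmap
is reduced to a plain array of block numbers, and the shared cursor is a
plain pointer with no locking, so it only models the dispatch, not the
synchronization a real parallel scan needs.

```c
#include <stdbool.h>
#include <stddef.h>

/* All names here are hypothetical stand-ins, not PostgreSQL's types. */

typedef struct PrivateIter
{
    const unsigned *blocks;
    size_t      nblocks;
    size_t      next;       /* cursor owned by this process alone */
} PrivateIter;

typedef struct SharedIter
{
    const unsigned *blocks;
    size_t      nblocks;
    size_t     *next;       /* cursor shared by all participants */
} SharedIter;

/* One wrapper type, so callers never care which flavor they hold. */
typedef struct UnifiedIter
{
    bool        is_shared;
    union
    {
        PrivateIter priv;
        SharedIter  shared;
    }           u;
} UnifiedIter;

UnifiedIter
unified_begin_private(const unsigned *blocks, size_t nblocks)
{
    UnifiedIter it;

    it.is_shared = false;
    it.u.priv.blocks = blocks;
    it.u.priv.nblocks = nblocks;
    it.u.priv.next = 0;
    return it;
}

UnifiedIter
unified_begin_shared(const unsigned *blocks, size_t nblocks, size_t *cursor)
{
    UnifiedIter it;

    it.is_shared = true;
    it.u.shared.blocks = blocks;
    it.u.shared.nblocks = nblocks;
    it.u.shared.next = cursor;
    return it;
}

/* Returns true and sets *blockno, or returns false when exhausted. */
bool
unified_next(UnifiedIter *it, unsigned *blockno)
{
    if (it->is_shared)
    {
        SharedIter *s = &it->u.shared;

        if (*s->next >= s->nblocks)
            return false;
        *blockno = s->blocks[(*s->next)++];
        return true;
    }
    else
    {
        PrivateIter *p = &it->u.priv;

        if (p->next >= p->nblocks)
            return false;
        *blockno = p->blocks[p->next++];
        return true;
    }
}
```

In this sketch, two wrappers created with unified_begin_shared() over the
same cursor split the blocks between them, while a private wrapper sees
every block; the calling code is identical in both cases.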

Patches 0008-0012 make new scan descriptors for bitmap table scans
and the heap AM implementation.

Patches 0013 and 0014 push all of the prefetch code down into heap AM
code as suggested by Heikki in [1].

Patch 0016 removes scan_bitmap_next_block() per Heikki's suggestion in [1].

Patches 0017 and 0018 make some changes to the TIDBitmap API to
support having multiple TBMIterateResults at the same time instead of
reusing the same one when iterating.
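
The difference between reusing one result and allowing several live results
can be shown with a small standalone sketch. This is only one possible shape,
assuming the caller supplies the storage; the message does not say how the
patches do it, and all names below (IterResult, Iter, iter_next_reused,
iter_next_into) are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a per-block iteration result. */
typedef struct IterResult
{
    unsigned    blockno;
} IterResult;

typedef struct Iter
{
    const unsigned *blocks;
    size_t      nblocks;
    size_t      next;
    IterResult  internal;   /* storage used only by the "reused" style */
} Iter;

/*
 * Reused style: returns a pointer into the iterator itself, so each call
 * overwrites whatever the previous call returned. Returns NULL at the end.
 */
const IterResult *
iter_next_reused(Iter *it)
{
    if (it->next >= it->nblocks)
        return NULL;
    it->internal.blockno = it->blocks[it->next++];
    return &it->internal;
}

/*
 * Caller-owned style: fills storage supplied by the caller, so any number
 * of results can be alive at once. Returns false at the end.
 */
bool
iter_next_into(Iter *it, IterResult *out)
{
    if (it->next >= it->nblocks)
        return false;
    out->blockno = it->blocks[it->next++];
    return true;
}
```

With the reused style, a consumer that looks ahead by one block loses the
earlier result; with caller-owned storage it can keep both.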

Patch 0019 uses the read stream API and removes all the bespoke
prefetching code from bitmap heap scan.
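
As a rough model of why a callback-driven stream can replace hand-written
prefetching, the toy below lets a consumer pull blocks while the stream
itself asks a callback for the next block numbers ahead of time. It is a
standalone sketch, not PostgreSQL's read stream API: ToyStream, NextBlockCB,
INVALID_BLOCK, and the fixed queue size are all assumptions made for
illustration, and no I/O is performed.

```c
#include <stdbool.h>
#include <stddef.h>

#define INVALID_BLOCK   ((unsigned) 0xFFFFFFFF)
#define TOY_QUEUE_SIZE  8

/* Callback type: returns the next block number, or INVALID_BLOCK at end. */
typedef unsigned (*NextBlockCB) (void *private_data);

/* Hypothetical callback state: a bitmap reduced to an array of blocks. */
typedef struct BitmapCursor
{
    const unsigned *blocks;
    size_t      nblocks;
    size_t      next;
} BitmapCursor;

unsigned
bitmap_next_block(void *private_data)
{
    BitmapCursor *c = private_data;

    if (c->next >= c->nblocks)
        return INVALID_BLOCK;
    return c->blocks[c->next++];
}

/* Toy stream that keeps up to "distance" block numbers queued ahead. */
typedef struct ToyStream
{
    NextBlockCB cb;
    void       *private_data;
    unsigned    queue[TOY_QUEUE_SIZE];
    size_t      head;
    size_t      count;
    size_t      distance;
    bool        exhausted;
} ToyStream;

void
toy_stream_init(ToyStream *s, NextBlockCB cb, void *private_data,
                size_t distance)
{
    s->cb = cb;
    s->private_data = private_data;
    s->head = 0;
    s->count = 0;
    s->distance = distance > TOY_QUEUE_SIZE ? TOY_QUEUE_SIZE : distance;
    s->exhausted = false;
}

/* Returns the next block for the consumer, or INVALID_BLOCK at end. */
unsigned
toy_stream_next(ToyStream *s)
{
    unsigned    blockno;

    /* Top up the look-ahead queue; a real stream would start I/O here. */
    while (!s->exhausted && s->count < s->distance)
    {
        unsigned    b = s->cb(s->private_data);

        if (b == INVALID_BLOCK)
        {
            s->exhausted = true;
            break;
        }
        s->queue[(s->head + s->count) % TOY_QUEUE_SIZE] = b;
        s->count++;
    }

    if (s->count == 0)
        return INVALID_BLOCK;

    blockno = s->queue[s->head];
    s->head = (s->head + 1) % TOY_QUEUE_SIZE;
    s->count--;
    return blockno;
}
```

The scan code in this model only supplies bitmap_next_block(); how far
ahead to look is decided entirely inside the stream.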

- Melanie

[1] https://www.postgresql.org/message-id/5a172d1e-d69c-409a-b1fa-6521214c81c2%40iki.fi
