Re: BitmapHeapScan streaming read user and prelim refactoring - Mailing list pgsql-hackers

From Mark Dilger
Subject Re: BitmapHeapScan streaming read user and prelim refactoring
Date
Msg-id 8DE07ADE-1D86-4680-8793-057FAAE35B5C@enterprisedb.com
Whole thread Raw
In response to BitmapHeapScan streaming read user and prelim refactoring  (Melanie Plageman <melanieplageman@gmail.com>)
Responses Re: BitmapHeapScan streaming read user and prelim refactoring
List pgsql-hackers

> On Feb 13, 2024, at 3:11 PM, Melanie Plageman <melanieplageman@gmail.com> wrote:

Thanks for the patch...

> Attached is a patch set which refactors BitmapHeapScan such that it
> can use the streaming read API [1]. It also resolves the long-standing
> FIXME in the BitmapHeapScan code suggesting that the skip fetch
> optimization should be pushed into the table AMs. Additionally, it
> moves table scan initialization to after the index scan and bitmap
> initialization.
>
> patches 0001-0002 are assorted cleanup needed later in the set.
> patches 0003 moves the table scan initialization to after bitmap creation
> patch 0004 is, I think, a bug fix. see [2].
> patches 0005-0006 push the skip fetch optimization into the table AMs
> patches 0007-0009 change the control flow of BitmapHeapNext() to match
> that required by the streaming read API
> patch 0010 is the streaming read code not yet in master
> patch 0011 is the actual bitmapheapscan streaming read user.
>
> patches 0001-0009 apply on top of master but 0010 and 0011 must be
> applied on top of a commit before a 21d9c3ee4ef74e2 (until a rebased
> version of the streaming read API is on the mailing list).

I followed your lead and applied them to 6a8ffe812d194ba6f4f26791b6388a4837d17d6c.  `make check` worked fine, though I
expectyou know that already. 

> The caveat is that these patches introduce breaking changes to two
> table AM functions for bitmapheapscan: table_scan_bitmap_next_block()
> and table_scan_bitmap_next_tuple().

You might want an independent perspective on how much of a hassle those breaking changes are, so I took a stab at that.
Having written a custom proprietary TAM for postgresql 15 here at EDB, and having ported it and released it for
postgresql16, I thought I'd try porting it to the the above commit with your patches.  Even without your patches, I
alreadysee breaking changes coming from commit f691f5b80a85c66d715b4340ffabb503eb19393e, which creates a similar amount
ofbreakage for me as does your patches.  Dealing with the combined breakage might amount to a day of work, including
testing,half of which I think I've already finished.  In other words, it doesn't seem like a big deal. 

Were postgresql 17 shaping up to be compatible with TAMs written for 16, your patch would change that qualitatively,
butsince things are already incompatible, I think you're in the clear. 

> A TBMIterateResult used to be threaded through both of these functions
> and used in BitmapHeapNext(). This patch set removes all references to
> TBMIterateResults from BitmapHeapNext. Because the streaming read API
> requires the callback to specify the next block, BitmapHeapNext() can
> no longer pass a TBMIterateResult to table_scan_bitmap_next_block().
>
> More subtly, table_scan_bitmap_next_block() used to return false if
> there were no more visible tuples on the page or if the block that was
> requested was not valid. With these changes,
> table_scan_bitmap_next_block() will only return false when the bitmap
> has been exhausted and the scan can end. In order to use the streaming
> read API, the user must be able to request the blocks it needs without
> requiring synchronous feedback per block. Thus, this table AM function
> must change its meaning.
>
> I think the way the patches are split up could be improved. I will
> think more about this. There are also probably a few mistakes with
> which comments are updated in which patches in the set.

I look forward to the next version of the patch set.  Thanks again for working on this.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






pgsql-hackers by date:

Previous
From: Andrei Lepikhov
Date:
Subject: Re: Memory consumed by child SpecialJoinInfo in partitionwise join planning
Next
From: jian he
Date:
Subject: Re: SQL:2011 application time