Re: BitmapHeapScan streaming read user and prelim refactoring - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: BitmapHeapScan streaming read user and prelim refactoring
Date
Msg-id CA+hUKGLHbKP3jwJ6_+hnGi37Pw3BD5j2amjV3oSk7j-KyCnY7Q@mail.gmail.com
In response to Re: BitmapHeapScan streaming read user and prelim refactoring  (Melanie Plageman <melanieplageman@gmail.com>)
List pgsql-hackers
On Fri, Feb 14, 2025 at 5:52 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> On Thu, Feb 13, 2025 at 11:28 AM Tomas Vondra <tomas@vondra.me> wrote:
> > On 2/13/25 17:01, Melanie Plageman wrote:
> > I know it's not changing how much memory we allocate (compared to
> > master). I haven't thought about the GinScanEntry - yes, flexible array
> > member would make this a bit more complex.
>
> Oh, I see. I didn't understand Thomas' proposal. I don't know how hard
> it would be to make tidbitmap allocate the offsets on-demand. I'd need
> to investigate more. But probably not worth it for this patch.

No, what I meant was:  It would be nice to try to hold only one
uncompressed result set in memory at a time, like we achieved in the
vacuum patches.  The consumer expands them from a tiny object when the
associated buffer pops out the other end.  That should be possible
here too, right, because the bitmap is immutable and long lived, so
you should be able to stream (essentially) pointers into its internal
guts.  The current patch streams the uncompressed data itself, and
thus has to reserve space for the maximum possible amount of it, and
also forces you to think about fixed sizes.

I think if you want "consumer does the expanding" and also "dynamic
size" and also "consumer provides memory to avoid palloc churn", then
you might need two new functions: "how much memory would I need to
expand this thing?" and a "please expand it right here, it has the
amount of space you told me!".  Then I guess the consumer could keep
recycling the same piece of memory, and repalloc() if it's not big
enough.  Or something like that.
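
To make the two-function idea concrete, here is a rough standalone sketch.
Everything in it is hypothetical: CompressedPage, expanded_size(),
expand_into(), Scratch and consume_page() are names invented for
illustration, the bitmask layout is not the real tidbitmap representation,
and plain realloc() stands in for repalloc() so that it compiles outside
the tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical compressed per-page result: bit i set means offset i + 1
 * matches.  This is NOT the real tidbitmap layout, just a stand-in.
 */
typedef struct CompressedPage
{
	uint32_t	blockno;
	uint64_t	words[2];
} CompressedPage;

/* "How much memory would I need to expand this thing?" (in bytes) */
static size_t
expanded_size(const CompressedPage *page)
{
	size_t		n = 0;

	for (int w = 0; w < 2; w++)
		for (uint64_t bits = page->words[w]; bits != 0; bits &= bits - 1)
			n++;				/* one iteration per set bit */
	return n * sizeof(uint16_t);
}

/*
 * "Please expand it right here, it has the amount of space you told me!"
 * Returns the number of offsets written to out.
 */
static size_t
expand_into(const CompressedPage *page, uint16_t *out, size_t out_size)
{
	size_t		n = 0;

	assert(out_size >= expanded_size(page));
	(void) out_size;			/* quiet unused warning when NDEBUG is set */

	for (int w = 0; w < 2; w++)
		for (int b = 0; b < 64; b++)
			if (page->words[w] & (UINT64_C(1) << b))
				out[n++] = (uint16_t) (w * 64 + b + 1);
	return n;
}

/* Consumer-owned scratch space, recycled from one page to the next. */
typedef struct Scratch
{
	uint16_t   *offsets;
	size_t		capacity;		/* in bytes */
} Scratch;

/* Expand one page into the scratch space, growing it only if too small. */
static size_t
consume_page(Scratch *scratch, const CompressedPage *page)
{
	size_t		need = expanded_size(page);

	if (need > scratch->capacity)
	{
		uint16_t   *grown = realloc(scratch->offsets, need);

		if (grown == NULL)
			abort();
		scratch->offsets = grown;
		scratch->capacity = need;
	}
	return expand_into(page, scratch->offsets, scratch->capacity);
}
```

The consumer would start with an empty Scratch ({NULL, 0}), call
consume_page() each time a buffer pops out of the stream, and free the
scratch space once at the end.  Only one expanded result exists at a
time, and the allocator is touched only when a page needs more room than
any page seen before it.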

Yeah, I guess you could in theory also stream pointers to individual
uncompressed result objects allocated with palloc(), that is, put a
pointer in the per-buffer-data and make the consumer free it, but that
has other problems (less locality, allocator churn, need for a
cleanup/destructor mechanism for when the stream is reset or
destroyed early, still has lots of uncompressed copies of data in
memory *sometimes*) and is not what I was imagining.


