Re: BitmapHeapScan streaming read user and prelim refactoring - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: BitmapHeapScan streaming read user and prelim refactoring
Date
Msg-id CA+hUKGJfdSgdJoPO07yMGDcM6yBZ5RS0TuqnWaD6Fi8zUYUkFQ@mail.gmail.com
In response to Re: BitmapHeapScan streaming read user and prelim refactoring  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Tue, Apr 15, 2025 at 5:44 AM Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Apr 10, 2025 at 11:15 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> > The new streaming BHS isn't just issuing probabilistic hints about
> > future access obtained from a second iterator.  It has just one shared
> > iterator connected up to the workers' ReadStreams.  Each worker pulls
> > a disjoint set of blocks out of its stream, possibly running a bunch
> > of IOs in the background as required.
>
> It feels to me like the problem here is that the shared iterator is
> connected to unshared read-streams. If you make a shared read-stream
> object and connect the shared iterator to that instead, does that
> solve this whole problem, or is there more to it?

More or less, yeah, just put the whole ReadStream object in shared
memory, pin an LWLock on it and call it a parallel-aware or shared
ReadStream.  But how do you make the locking not terrible?
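For illustration only, a minimal sketch of that naive shape, assuming a hypothetical SharedStreamSketch struct in place of a real ReadStream and a pthread mutex standing in for an LWLock. None of these names exist in PostgreSQL. The point is just that every single block pulled goes through the lock:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a ReadStream living in shared memory. */
typedef struct SharedStreamSketch
{
    pthread_mutex_t lock;       /* stands in for an LWLock */
    uint32_t        next_block; /* next block the shared iterator hands out */
    uint32_t        end_block;  /* one past the last block */
} SharedStreamSketch;

static void
shared_stream_init(SharedStreamSketch *s, uint32_t nblocks)
{
    pthread_mutex_init(&s->lock, NULL);
    s->next_block = 0;
    s->end_block = nblocks;
}

/*
 * Pull one block number.  Returns true and sets *block, or returns false
 * once the stream is exhausted.  The lock is taken on every call, which is
 * the per-buffer cost being worried about above.
 */
static bool
shared_stream_next(SharedStreamSketch *s, uint32_t *block)
{
    bool found = false;

    pthread_mutex_lock(&s->lock);
    assert(s->next_block <= s->end_block);
    if (s->next_block < s->end_block)
    {
        *block = s->next_block++;
        found = true;
    }
    pthread_mutex_unlock(&s->lock);

    return found;
}
```

Each worker would call shared_stream_next() in a loop; the blocks come out disjoint, but all workers serialize on the one lock.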

My "work stealing" brain dump was imagining a way to achieve the same
net effect, except NOT have to acquire an exclusive lock for every
buffer you pull out of the stream.  I was speculating that we could
achieve zero locking for most of the stream without any cache line
ping pong, but a cunning read barrier scheme could detect when you've
been flipped into a slower coordination mode by another backend and
need to turn on some locking and fight over the last handful of
buffers.  And I was also observing that if you can figure out how to
make it general and reusable enough, we have more unsolved problems
like this in unrelated parallel query code not even involving streams.
It's a tiny, more approachable subset of the old "data buffered in
other workers" problem, as I think you called it once.
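To make the trade-off concrete, here is a rough sketch of a much simpler technique than the work stealing idea: plain chunked claiming with a C11 atomic counter.  It is NOT the read barrier scheme described above, and every name in it (SharedCursorSketch, WorkerRangeSketch, CHUNK_BLOCKS) is made up for illustration.  A worker touches shared state once per chunk and then consumes its private range with no locking at all:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define CHUNK_BLOCKS 8          /* hypothetical batch size */

/* Shared state: one cursor that all workers advance. */
typedef struct SharedCursorSketch
{
    _Atomic uint64_t next_block; /* 64-bit so overshoot cannot wrap around */
    uint32_t         end_block;  /* one past the last block */
} SharedCursorSketch;

/* Private state: the range this worker has already claimed. */
typedef struct WorkerRangeSketch
{
    uint32_t next;
    uint32_t end;
} WorkerRangeSketch;

static void
shared_cursor_init(SharedCursorSketch *shared, uint32_t nblocks)
{
    atomic_init(&shared->next_block, 0);
    shared->end_block = nblocks;
}

/*
 * Return the next block for this worker.  The fast path reads only private
 * memory; shared memory is touched once per CHUNK_BLOCKS blocks.
 */
static bool
worker_next_block(SharedCursorSketch *shared, WorkerRangeSketch *mine,
                  uint32_t *block)
{
    assert(mine->next <= mine->end);

    if (mine->next == mine->end)
    {
        /* Private range is empty: claim a fresh chunk. */
        uint64_t start = atomic_fetch_add(&shared->next_block, CHUNK_BLOCKS);
        uint64_t end;

        if (start >= shared->end_block)
            return false;       /* nothing left to claim */

        end = start + CHUNK_BLOCKS;
        if (end > shared->end_block)
            end = shared->end_block;

        mine->next = (uint32_t) start;
        mine->end = (uint32_t) end;
    }

    *block = mine->next++;
    return true;
}
```

The weakness shows up at the tail: once the shared cursor is exhausted, a worker that runs dry gets false back even if another worker still holds most of a chunk, and nothing here lets it take those blocks.  That leftover handful is the part a work stealing design would have to coordinate over.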


