Re: BitmapHeapScan streaming read user and prelim refactoring - Mailing list pgsql-hackers

From Melanie Plageman
Subject Re: BitmapHeapScan streaming read user and prelim refactoring
Date
Msg-id 20240403225759.sq5hmxdmbalroosy@liskov
Whole thread Raw
In response to Re: BitmapHeapScan streaming read user and prelim refactoring  (Melanie Plageman <melanieplageman@gmail.com>)
Responses Re: BitmapHeapScan streaming read user and prelim refactoring
List pgsql-hackers
On Sun, Mar 31, 2024 at 11:45:51AM -0400, Melanie Plageman wrote:
> On Fri, Mar 29, 2024 at 12:05:15PM +0100, Tomas Vondra wrote:
> > 
> > 
> > On 3/29/24 02:12, Thomas Munro wrote:
> > > On Fri, Mar 29, 2024 at 10:43 AM Tomas Vondra
> > > <tomas.vondra@enterprisedb.com> wrote:
> > >> I think there's some sort of bug, triggering this assert in heapam
> > >>
> > >>   Assert(BufferGetBlockNumber(hscan->rs_cbuf) == tbmres->blockno);
> > > 
> > > Thanks for the repro.  I can't seem to reproduce it (still trying) but
> > > I assume this is with Melanie's v11 patch set which had
> > > v11-0016-v10-Read-Stream-API.patch.
> > > 
> > > Would you mind removing that commit and instead applying the v13
> > > stream_read.c patches[1]?  v10 stream_read.c was a little confused
> > > about random I/O combining, which I fixed with a small adjustment to
> > > the conditions for the "if" statement right at the end of
> > > read_stream_look_ahead().  Sorry about that.  The fixed version, with
> > > eic=4, with your test query using WHERE a < a, ends its scan with:
> > > 
> > 
> > I'll give that a try. Unfortunately unfortunately the v11 still has the
> > problem I reported about a week ago:
> > 
> >   ERROR:  prefetch and main iterators are out of sync
> > 
> > So I can't run the full benchmarks :-( but master vs. streaming read API
> > should work, I think.
> 
> Odd, I didn't notice you reporting this ERROR popping up. Now that I
> take a look, v11 (at least, maybe also v10) had this very sill mistake:
> 
>   if (scan->bm_parallel == NULL &&
>       scan->rs_pf_bhs_iterator &&
>       hscan->pfblockno > hscan->rs_base.blockno)
>        elog(ERROR, "prefetch and main iterators are out of sync");
> 
> It errors out if the prefetch block is ahead of the current block --
> which is the opposite of what we want. I've fixed this in attached v12.
> 
> This version also has v13 of the streaming read API. I noticed one
> mistake in my bitmapheap scan streaming read user -- it freed the
> streaming read object at the wrong time. I don't know if this was
> causing any other issues, but it at least is fixed in this version.

Attached v13 is rebased over master (which includes the streaming read
API now). I also reset the streaming read object on rescan instead of
creating a new one each time.

I don't know how much chance any of this has of going in to 17 now, but
I thought I would start looking into the regression repro Tomas provided
in [1].

I'm also not sure if I should try and group the commits into fewer
commits now or wait until I have some idea of whether or not the
approach in 0013 and 0014 is worth pursuing.

- Melanie

[1] https://www.postgresql.org/message-id/5d5954ed-6f43-4f1a-8e19-ece75b2b7362%40enterprisedb.com

Attachment

pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: pg_combinebackup --copy-file-range
Next
From: Peter Eisentraut
Date:
Subject: Re: Security lessons from liblzma - libsystemd