Re: BitmapHeapScan streaming read user and prelim refactoring - Mailing list pgsql-hackers

From Melanie Plageman
Subject Re: BitmapHeapScan streaming read user and prelim refactoring
Date
Msg-id 20240406134011.hjo3e2336sgcghfp@liskov
Whole thread Raw
In response to Re: BitmapHeapScan streaming read user and prelim refactoring  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: BitmapHeapScan streaming read user and prelim refactoring
Re: BitmapHeapScan streaming read user and prelim refactoring
List pgsql-hackers
On Sat, Apr 06, 2024 at 02:51:45AM +0200, Tomas Vondra wrote:
> 
> 
> On 4/6/24 01:53, Melanie Plageman wrote:
> > On Fri, Apr 05, 2024 at 04:06:34AM -0400, Melanie Plageman wrote:
> >> On Thu, Apr 04, 2024 at 04:35:45PM +0200, Tomas Vondra wrote:
> >>> On 4/4/24 00:57, Melanie Plageman wrote:
> >>>> On Sun, Mar 31, 2024 at 11:45:51AM -0400, Melanie Plageman wrote:
> >>> I'd focus on the first ~8-9 commits or so for now, we can commit more if
> >>> things go reasonably well.
> >>
> >> Sounds good. I will spend cleanup time on 0010-0013 tomorrow but would
> >> love to know if you agree with the direction before I spend more time.
> > 
> > In attached v16, I've split out 0010-0013 into 0011-0017. I think it is
> > much easier to understand.
> > 
> 
> Anyway, I've attached it as .tgz in order to not confuse cfbot. All the
> review comments are marked with XXX, so grep for that in the patches.
> There's two separate patches - the first one suggests a code change, so
> it was better to not merge that with your code. The second has just a
> couple XXX comments, I'm not sure why I kept it separate.
> 
> A couple review comments:
> 
> * I think 0001-0009 are 99% ready to. I reworded some of the commit
> messages a bit - I realize it's a bit bold, considering you're native
> speaker and I'm not. If you could check I didn't make it worse, that
> would be great.

Attached v17 has *only* patches 0001-0009 with these changes. I will
work on applying the remaining patches, addressing feedback, and adding
comments next.

I have reviewed and incorporated all of your feedback on these patches.
Attached v17 is your exact patches with 1 or 2 *very* slight tweaks to
commit messages (comma splice removal and single word adjustments) as
well as the changes listed below:

I have changed the following:

- 0003 added an assert that rs_empty_tuples_pending is 0 on rescan and
    endscan

- 0004 (your 0005)-- I followed up with Tom, but for now I have just
    removed the XXX and also reworded the message a bit

- 0006 (your 0007) fixed up the variable name (you changed valid ->
    valid_block but it had gotten changed back)

I have open questions on the following:

- 0003: should it be SO_NEED_TUPLES and need_tuples (instead of
    SO_NEED_TUPLE and need_tuple)?

- 0009 (your 0010)
    - Should I mention in the commit message that we added blockno and
        pfblockno in the BitmapHeapScanState only for validation or is that
        too specific?

    - Should I mention that a future (imminent) commit will remove the
        iterators from TableScanDescData and put them in HeapScanDescData? I
        imagine folks don't want those there, but it is easier for the
        progression of commits to put them there first and then move them

    - I'm worried this comment is vague and or potentially not totally
        correct. Should we remove it? I don't think we have conclusive proof
        that this is true.
  /*
   * Adjusting the prefetch iterator before invoking
   * table_scan_bitmap_next_block() keeps prefetch distance higher across
   * the parallel workers.
   */


> * I'm not sure extra_flags is the right way to pass the flag in 0003.
> The "extra_" name is a bit weird, and no other table AM functions do it
> this way and pass explicit bool flags instead. So my first "review"
> commit does it like that. Do you agree it's better that way?

Yes.

> * The one question I'm somewhat unsure about is why Tom chose to use the
> "wrong" recheck flag in the 2017 commit, when the correct recheck flag
> is readily available. Surely that had a reason, right? But I can't think
> of one ...

See above.

> > While I was doing that, I realized that I should remove the call to
> > table_rescan() from ExecReScanBitmapHeapScan() and just rely on the new
> > table_rescan_bm() invoked from BitmapHeapNext(). That is done in the
> > attached.
> > 
> > 0010-0018 still need comments updated but I focused on getting the split
> > out, reviewable version of them ready. I'll add comments (especially to
> > 0011 table AM functions) tomorrow. I also have to double-check if I
> > should add any asserts for table AMs about having implemented all of the
> > new begin/re/endscan() functions.
> > 
> 
> I added a couple more comments for those patches (10-12). Chances are
> the split in v16 clarifies some of my questions, but it'll have to wait
> till the morning ...

Will address this in next mail.

- Melanie

Attachment

pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: Flushing large data immediately in pqcomm
Next
From: Tom Lane
Date:
Subject: Re: Cutting support for OpenSSL 1.0.1 and 1.0.2 in 17~?