Re: BitmapHeapScan streaming read user and prelim refactoring - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: BitmapHeapScan streaming read user and prelim refactoring |
Date | |
Msg-id | 4636ce5a-082b-447d-9b62-4953fc18f1af@enterprisedb.com Whole thread Raw |
In response to | Re: BitmapHeapScan streaming read user and prelim refactoring (Melanie Plageman <melanieplageman@gmail.com>) |
Responses |
Re: BitmapHeapScan streaming read user and prelim refactoring
|
List | pgsql-hackers |
On 4/6/24 15:40, Melanie Plageman wrote: > On Sat, Apr 06, 2024 at 02:51:45AM +0200, Tomas Vondra wrote: >> >> >> On 4/6/24 01:53, Melanie Plageman wrote: >>> On Fri, Apr 05, 2024 at 04:06:34AM -0400, Melanie Plageman wrote: >>>> On Thu, Apr 04, 2024 at 04:35:45PM +0200, Tomas Vondra wrote: >>>>> On 4/4/24 00:57, Melanie Plageman wrote: >>>>>> On Sun, Mar 31, 2024 at 11:45:51AM -0400, Melanie Plageman wrote: >>>>> I'd focus on the first ~8-9 commits or so for now, we can commit more if >>>>> things go reasonably well. >>>> >>>> Sounds good. I will spend cleanup time on 0010-0013 tomorrow but would >>>> love to know if you agree with the direction before I spend more time. >>> >>> In attached v16, I've split out 0010-0013 into 0011-0017. I think it is >>> much easier to understand. >>> >> >> Anyway, I've attached it as .tgz in order to not confuse cfbot. All the >> review comments are marked with XXX, so grep for that in the patches. >> There's two separate patches - the first one suggests a code change, so >> it was better to not merge that with your code. The second has just a >> couple XXX comments, I'm not sure why I kept it separate. >> >> A couple review comments: >> >> * I think 0001-0009 are 99% ready to. I reworded some of the commit >> messages a bit - I realize it's a bit bold, considering you're native >> speaker and I'm not. If you could check I didn't make it worse, that >> would be great. > > Attached v17 has *only* patches 0001-0009 with these changes. I will > work on applying the remaining patches, addressing feedback, and adding > comments next. > > I have reviewed and incorporated all of your feedback on these patches. > Attached v17 is your exact patches with 1 or 2 *very* slight tweaks to > commit messages (comma splice removal and single word adjustments) as > well as the changes listed below: > > I have changed the following: > > - 0003 added an assert that rs_empty_tuples_pending is 0 on rescan and > endscan > OK > - 0004 (your 0005)-- I followed up with Tom, but for now I have just > removed the XXX and also reworded the message a bit > After the exercise I described a couple minutes ago, I think I'm convinced the assumption is unnecessary and we should use the correct recheck. Not that it'd make any difference in practice, considering none of the opclasses ever changes the recheck. Maybe the most prudent thing would be to skip this commit and maybe leave this for later, but I'm not forcing you to do that if it would mean a lot of disruption for the following patches. > - 0006 (your 0007) fixed up the variable name (you changed valid -> > valid_block but it had gotten changed back) > OK > I have open questions on the following: > > - 0003: should it be SO_NEED_TUPLES and need_tuples (instead of > SO_NEED_TUPLE and need_tuple)? > I think SO_NEED_TUPLES is more accurate, as we need all tuples from the block. But either would work. > - 0009 (your 0010) > - Should I mention in the commit message that we added blockno and > pfblockno in the BitmapHeapScanState only for validation or is that > too specific? > For the commit message I'd say it's too specific, I'd put it in the comment before the struct. > - Should I mention that a future (imminent) commit will remove the > iterators from TableScanDescData and put them in HeapScanDescData? I > imagine folks don't want those there, but it is easier for the > progression of commits to put them there first and then move them > I'd try not to mention future commits as justification too often, if we don't know that the future commit lands shortly after. > - I'm worried this comment is vague and or potentially not totally > correct. Should we remove it? I don't think we have conclusive proof > that this is true. > /* > * Adjusting the prefetch iterator before invoking > * table_scan_bitmap_next_block() keeps prefetch distance higher across > * the parallel workers. > */ > TBH it's not clear to me what "higher across parallel workers" means. But it sure shouldn't claim things that we think may not be correct. I don't have a good idea how to reword it, though. > >> * I'm not sure extra_flags is the right way to pass the flag in 0003. >> The "extra_" name is a bit weird, and no other table AM functions do it >> this way and pass explicit bool flags instead. So my first "review" >> commit does it like that. Do you agree it's better that way? > > Yes. > Cool >> * The one question I'm somewhat unsure about is why Tom chose to use the >> "wrong" recheck flag in the 2017 commit, when the correct recheck flag >> is readily available. Surely that had a reason, right? But I can't think >> of one ... > > See above. > >>> While I was doing that, I realized that I should remove the call to >>> table_rescan() from ExecReScanBitmapHeapScan() and just rely on the new >>> table_rescan_bm() invoked from BitmapHeapNext(). That is done in the >>> attached. >>> >>> 0010-0018 still need comments updated but I focused on getting the split >>> out, reviewable version of them ready. I'll add comments (especially to >>> 0011 table AM functions) tomorrow. I also have to double-check if I >>> should add any asserts for table AMs about having implemented all of the >>> new begin/re/endscan() functions. >>> >> >> I added a couple more comments for those patches (10-12). Chances are >> the split in v16 clarifies some of my questions, but it'll have to wait >> till the morning ... > > Will address this in next mail. > OK, thanks. If think 0001-0008 are ready to go, with some minor tweaks per above (tuple vs. tuples etc.), and the question about the recheck flag. If you can do these tweaks, I'll get that committed today and we can try to get a couple more patches in tomorrow. Sounds reasonable? regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: