Thread: heapgettup refactoring
Hi, Attached is a patchset to refactor heapgettup(), heapgettup_pagemode(), and heapgetpage(). heapgettup() and heapgettup_pagemode() have a lot of duplicated code, confusingly nested if statements, and unnecessary local variables. While working on a feature for the AIO/DIO patchset, I noticed that it was difficult to add new code to heapgettup() and heapgettup_pagemode() because of how the functions are written. I've taken a stab at refactoring them -- without generating less efficient code or causing regressions. I'm interested if people find it more readable and if those with more assembly expertise see issues (new branches added which are not highly predictable, etc). I took a look at the assembly for those symbols compiled at O2 but am not experienced enough at analysis to come to any conclusions. - Melanie
Attachment
FYI: [18:51:54.707] ../src/backend/access/heap/heapam.c(720): warning C4098: 'heapgettup': 'void' function returning a value [18:51:54.707] ../src/backend/access/heap/heapam.c(850): warning C4098: 'heapgettup_pagemode': 'void' function returninga value For some reason, MSVC is the only one to complain, and cfbot doesn't currently tell you about it. I have a patch to show that, which I'll send $later.
Hi, On 2022-10-31 14:37:39 -0400, Melanie Plageman wrote: > and heapgetpage(). heapgettup() and heapgettup_pagemode() have a lot of > duplicated code, confusingly nested if statements, and unnecessary local > variables. While working on a feature for the AIO/DIO patchset, I > noticed that it was difficult to add new code to heapgettup() and > heapgettup_pagemode() because of how the functions are written. Thanks for working on this - the current state is quite painful. > From cde2d6720f4f5ab2531c22ad4a5f0d9e6ec1039d Mon Sep 17 00:00:00 2001 > From: Melanie Plageman <melanieplageman@gmail.com> > Date: Wed, 26 Oct 2022 20:00:34 -0400 > Subject: [PATCH v1 1/3] Remove breaks in HeapTupleSatisfiesVisibility > > breaks in HeapTupleSatisfiesVisibility were superfluous > --- > src/backend/access/heap/heapam_visibility.c | 7 ------- > 1 file changed, 7 deletions(-) > > diff --git a/src/backend/access/heap/heapam_visibility.c b/src/backend/access/heap/heapam_visibility.c > index 6e33d1c881..dd5d5da190 100644 > --- a/src/backend/access/heap/heapam_visibility.c > +++ b/src/backend/access/heap/heapam_visibility.c > @@ -1769,25 +1769,18 @@ HeapTupleSatisfiesVisibility(HeapTuple htup, Snapshot snapshot, Buffer buffer) > { > case SNAPSHOT_MVCC: > return HeapTupleSatisfiesMVCC(htup, snapshot, buffer); > - break; > case SNAPSHOT_SELF: > return HeapTupleSatisfiesSelf(htup, snapshot, buffer); > - break; > case SNAPSHOT_ANY: > return HeapTupleSatisfiesAny(htup, snapshot, buffer); > - break; > case SNAPSHOT_TOAST: > return HeapTupleSatisfiesToast(htup, snapshot, buffer); > - break; > case SNAPSHOT_DIRTY: > return HeapTupleSatisfiesDirty(htup, snapshot, buffer); > - break; > case SNAPSHOT_HISTORIC_MVCC: > return HeapTupleSatisfiesHistoricMVCC(htup, snapshot, buffer); > - break; > case SNAPSHOT_NON_VACUUMABLE: > return HeapTupleSatisfiesNonVacuumable(htup, snapshot, buffer); > - break; > } Not sure what the author of this code, a certain Mr Freund, was thinking when he added those returns... > From 9d8b01960463dc64ff5b111d523ff80fce3017af Mon Sep 17 00:00:00 2001 > From: Melanie Plageman <melanieplageman@gmail.com> > Date: Mon, 31 Oct 2022 13:40:06 -0400 > Subject: [PATCH v1 2/3] Turn HeapKeyTest macro into function > > This should always be inlined appropriately now. It is easier to read as > a function. Also, remove unused include in catcache.c. > --- > src/backend/access/heap/heapam.c | 10 ++-- > src/backend/utils/cache/catcache.c | 1 - > src/include/access/valid.h | 76 ++++++++++++------------------ > 3 files changed, 36 insertions(+), 51 deletions(-) > > diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c > index 12be87efed..1c995faa12 100644 > --- a/src/backend/access/heap/heapam.c > +++ b/src/backend/access/heap/heapam.c > @@ -716,8 +716,10 @@ heapgettup(HeapScanDesc scan, > snapshot); > > if (valid && key != NULL) > - HeapKeyTest(tuple, RelationGetDescr(scan->rs_base.rs_rd), > - nkeys, key, valid); > + { > + valid = HeapKeyTest(tuple, RelationGetDescr(scan->rs_base.rs_rd), > + nkeys, key); > + } > > if (valid) > { superfluous parens. > --- a/src/include/access/valid.h > +++ b/src/include/access/valid.h > @@ -19,51 +19,35 @@ > * > * Test a heap tuple to see if it satisfies a scan key. > */ > -#define HeapKeyTest(tuple, \ > - tupdesc, \ > - nkeys, \ > - keys, \ > - result) \ > -do \ > -{ \ > - /* Use underscores to protect the variables passed in as parameters */ \ > - int __cur_nkeys = (nkeys); \ > - ScanKey __cur_keys = (keys); \ > - \ > - (result) = true; /* may change */ \ > - for (; __cur_nkeys--; __cur_keys++) \ > - { \ > - Datum __atp; \ > - bool __isnull; \ > - Datum __test; \ > - \ > - if (__cur_keys->sk_flags & SK_ISNULL) \ > - { \ > - (result) = false; \ > - break; \ > - } \ > - \ > - __atp = heap_getattr((tuple), \ > - __cur_keys->sk_attno, \ > - (tupdesc), \ > - &__isnull); \ > - \ > - if (__isnull) \ > - { \ > - (result) = false; \ > - break; \ > - } \ > - \ > - __test = FunctionCall2Coll(&__cur_keys->sk_func, \ > - __cur_keys->sk_collation, \ > - __atp, __cur_keys->sk_argument); \ > - \ > - if (!DatumGetBool(__test)) \ > - { \ > - (result) = false; \ > - break; \ > - } \ > - } \ > -} while (0) > +static inline bool > +HeapKeyTest(HeapTuple tuple, TupleDesc tupdesc, int nkeys, ScanKey keys) > +{ > + int cur_nkeys = nkeys; > + ScanKey cur_key = keys; > + > + for (; cur_nkeys--; cur_key++) > + { > + Datum atp; > + bool isnull; > + Datum test; > + > + if (cur_key->sk_flags & SK_ISNULL) > + return false; > + > + atp = heap_getattr(tuple, cur_key->sk_attno, tupdesc, &isnull); > + > + if (isnull) > + return false; > + > + test = FunctionCall2Coll(&cur_key->sk_func, > + cur_key->sk_collation, > + atp, cur_key->sk_argument); > + > + if (!DatumGetBool(test)) > + return false; > + } > + > + return true; > +} Seems like a simple and nice win in readability. I recall looking at this in the past and thinking that there was some additional subtlety here, but I can't see what that'd be. > From a894ce38c488df6546392b9f3bd894b67edf951e Mon Sep 17 00:00:00 2001 > From: Melanie Plageman <melanieplageman@gmail.com> > Date: Mon, 31 Oct 2022 13:40:29 -0400 > Subject: [PATCH v1 3/3] Refactor heapgettup* and heapgetpage > > Simplify heapgettup(), heapgettup_pagemode(), and heapgetpage(). All > three contained several unnecessary local variables, duplicate code, and > nested if statements. Streamlining these improves readability and > extensibility. It'd be nice to break this into slightly smaller chunks. > + > +static inline void heapgettup_no_movement(HeapScanDesc scan) > +{ FWIW, for function definitions we keep the return type (and with that also the the "static inline") on a separate line. > + ItemId lpp; > + OffsetNumber lineoff; > + BlockNumber page; > + Page dp; > + HeapTuple tuple = &(scan->rs_ctup); > + /* > + * ``no movement'' scan direction: refetch prior tuple > + */ > + > + /* Since the tuple was previously fetched, needn't lock page here */ > + if (!scan->rs_inited) > + { > + Assert(!BufferIsValid(scan->rs_cbuf)); > + tuple->t_data = NULL; > + return; Is it possible to have a no-movement scan with an uninitialized scan? That doesn't really seem to make sense. At least that's how I understand the explanation for NoMovement nearby: * dir == NoMovementScanDirection means "re-fetch the tuple indicated * by scan->rs_ctup". We can't have a rs_ctup without an already started scan, no? Looks like this is pre-existing code that you just moved, but it still seems wrong. > + } > + page = ItemPointerGetBlockNumber(&(tuple->t_self)); > + if (page != scan->rs_cblock) > + heapgetpage((TableScanDesc) scan, page); We have a BlockNumber page; and Page dp; in this code which seems almost intentionally confusing. This again is a pre-existing issue but perhaps we could clean it up first? > +static inline Page heapgettup_continue_page(HeapScanDesc scan, BlockNumber page, ScanDirection dir, > + int *linesleft, OffsetNumber *lineoff) > +{ > + HeapTuple tuple = &(scan->rs_ctup); Hm. Finding the next offset via rs_ctup doesn't seem quite right. For one, it's not actually that cheap to extract the offset from an ItemPointer because of the the way we pack it into ItemPointerData. > + Page dp = BufferGetPage(scan->rs_cbuf); > + TestForOldSnapshot(scan->rs_base.rs_snapshot, scan->rs_base.rs_rd, dp); Newlines between definitions and code :) Perhaps worth asserting that the scan is initialized and that rs_cbuf is valid? > + if (ScanDirectionIsForward(dir)) > + { > + *lineoff = OffsetNumberNext(ItemPointerGetOffsetNumber(&(tuple->t_self))); > + *linesleft = PageGetMaxOffsetNumber(dp) - (*lineoff) + 1; We can't access PageGetMaxOffsetNumber etc without holding a lock on the page. It's not immediately obvious that that is held in all paths. > +static inline BlockNumber heapgettup_initial_page(HeapScanDesc scan, ScanDirection dir) > +{ > + Assert(!ScanDirectionIsNoMovement(dir)); > + Assert(!scan->rs_inited); Is there a reason we couldn't set rs_inited in here, rather than reapeating that in all callers? ISTM that this function should deal with the /* * return null immediately if relation is empty */ logic, I think you now are repeating that check on every call to heapgettup(). > @@ -511,182 +711,55 @@ heapgettup(HeapScanDesc scan, > ScanKey key) > { > HeapTuple tuple = &(scan->rs_ctup); > - Snapshot snapshot = scan->rs_base.rs_snapshot; > - bool backward = ScanDirectionIsBackward(dir); > BlockNumber page; > - bool finished; > Page dp; > - int lines; > OffsetNumber lineoff; > int linesleft; > - ItemId lpp; > + > + if (ScanDirectionIsNoMovement(dir)) > + return heapgettup_no_movement(scan); Maybe add an unlikely() - this path is barely ever used... > /* > - * calculate next starting lineoff, given scan direction > + * return null immediately if relation is empty > */ > - if (ScanDirectionIsForward(dir)) > + if (scan->rs_nblocks == 0 || scan->rs_numblocks == 0) > { As mentioned above, I don't think we should repeat the nblocks check on every call. > + page = scan->rs_cblock; > + LockBuffer(scan->rs_cbuf, BUFFER_LOCK_SHARE); > + dp = heapgettup_continue_page(scan, page, dir, &linesleft, &lineoff); > + goto continue_page; > } > > /* > * advance the scan until we find a qualifying tuple or run out of stuff > * to scan > */ > - lpp = PageGetItemId(dp, lineoff); > - for (;;) > + while (page != InvalidBlockNumber) > { > + heapgetpage((TableScanDesc) scan, page); > + LockBuffer(scan->rs_cbuf, BUFFER_LOCK_SHARE); > + dp = heapgettup_start_page(scan, page, dir, &linesleft, &lineoff); > + continue_page: I don't like the goto continue_page at all. Seems that the paths leading here should call LockBuffer(), heapgettup_start_page() etc? Possibly a do {} while () loop could do the trick as well. Greetings, Andres Freund
Thanks for the review! Attached is v2 with feedback addressed. On Mon, Oct 31, 2022 at 9:09 PM Andres Freund <andres@anarazel.de> wrote: > > From 9d8b01960463dc64ff5b111d523ff80fce3017af Mon Sep 17 00:00:00 2001 > > From: Melanie Plageman <melanieplageman@gmail.com> > > Date: Mon, 31 Oct 2022 13:40:06 -0400 > > Subject: [PATCH v1 2/3] Turn HeapKeyTest macro into function > > > > This should always be inlined appropriately now. It is easier to read as > > a function. Also, remove unused include in catcache.c. > > --- > > src/backend/access/heap/heapam.c | 10 ++-- > > src/backend/utils/cache/catcache.c | 1 - > > src/include/access/valid.h | 76 ++++++++++++------------------ > > 3 files changed, 36 insertions(+), 51 deletions(-) > > > > diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c > > index 12be87efed..1c995faa12 100644 > > --- a/src/backend/access/heap/heapam.c > > +++ b/src/backend/access/heap/heapam.c > > @@ -716,8 +716,10 @@ heapgettup(HeapScanDesc scan, > > snapshot); > > > > if (valid && key != NULL) > > - HeapKeyTest(tuple, RelationGetDescr(scan->rs_base.rs_rd), > > - nkeys, key, valid); > > + { > > + valid = HeapKeyTest(tuple, RelationGetDescr(scan->rs_base.rs_rd), > > + nkeys, key); > > + } > > > > if (valid) > > { > > superfluous parens. fixed. > > From a894ce38c488df6546392b9f3bd894b67edf951e Mon Sep 17 00:00:00 2001 > > From: Melanie Plageman <melanieplageman@gmail.com> > > Date: Mon, 31 Oct 2022 13:40:29 -0400 > > Subject: [PATCH v1 3/3] Refactor heapgettup* and heapgetpage > > > > Simplify heapgettup(), heapgettup_pagemode(), and heapgetpage(). All > > three contained several unnecessary local variables, duplicate code, and > > nested if statements. Streamlining these improves readability and > > extensibility. > > It'd be nice to break this into slightly smaller chunks. I can do that. Since incorporating feedback will be harder once I break it up into smaller chunks, I'm inclined to wait to do so until I know that the structure I have now is the one we will go with. (I know smaller chunks will make it more reviewable.) > > + > > +static inline void heapgettup_no_movement(HeapScanDesc scan) > > +{ > > FWIW, for function definitions we keep the return type (and with that also the > the "static inline") on a separate line. Fixed > > > + ItemId lpp; > > + OffsetNumber lineoff; > > + BlockNumber page; > > + Page dp; > > + HeapTuple tuple = &(scan->rs_ctup); > > + /* > > + * ``no movement'' scan direction: refetch prior tuple > > + */ > > + > > + /* Since the tuple was previously fetched, needn't lock page here */ > > + if (!scan->rs_inited) > > + { > > + Assert(!BufferIsValid(scan->rs_cbuf)); > > + tuple->t_data = NULL; > > + return; > > Is it possible to have a no-movement scan with an uninitialized scan? That > doesn't really seem to make sense. At least that's how I understand the > explanation for NoMovement nearby: > * dir == NoMovementScanDirection means "re-fetch the tuple indicated > * by scan->rs_ctup". > > We can't have a rs_ctup without an already started scan, no? > > Looks like this is pre-existing code that you just moved, but it still seems > wrong. Changed to an assert > > > + } > > + page = ItemPointerGetBlockNumber(&(tuple->t_self)); > > + if (page != scan->rs_cblock) > > + heapgetpage((TableScanDesc) scan, page); > > > We have a > BlockNumber page; > and > Page dp; > in this code which seems almost intentionally confusing. This again is a > pre-existing issue but perhaps we could clean it up first? in attached page -> block dp -> page in basically all locations in heapam.c (should that be its own commit?) > > +static inline Page heapgettup_continue_page(HeapScanDesc scan, BlockNumber page, ScanDirection dir, > > + int *linesleft, OffsetNumber *lineoff) > > +{ > > + HeapTuple tuple = &(scan->rs_ctup); > > Hm. Finding the next offset via rs_ctup doesn't seem quite right. For one, > it's not actually that cheap to extract the offset from an ItemPointer because > of the the way we pack it into ItemPointerData. So, it was like this before [1]. What about saving the lineoff in rs_cindex. It is smaller, but that seems okay, right? > > + Page dp = BufferGetPage(scan->rs_cbuf); > > + TestForOldSnapshot(scan->rs_base.rs_snapshot, scan->rs_base.rs_rd, dp); > > Newlines between definitions and code :) k > Perhaps worth asserting that the scan is initialized and that rs_cbuf is valid? indeed. > > > + if (ScanDirectionIsForward(dir)) > > + { > > + *lineoff = OffsetNumberNext(ItemPointerGetOffsetNumber(&(tuple->t_self))); > > + *linesleft = PageGetMaxOffsetNumber(dp) - (*lineoff) + 1; > > We can't access PageGetMaxOffsetNumber etc without holding a lock on the > page. It's not immediately obvious that that is held in all paths. In heapgettup() I call LockBuffer() before invoking heapgettup_continue_page() and heapgettup_start_page() which are the ones doing this. I did have big plans for using the continue_page and start_page functions in heapgettup_pagemode() as well, but since I'm not doing that now, I can add in an expectation that the lock is held. I added a comment saying the caller is responsible for acquiring the lock if needed. I thought of adding an assert, but I don't see that being done outside of bufmgr.c BufferDesc *bufHdr = GetBufferDescriptor(buffer - 1); Assert(LWLockHeldByMe(BufferDescriptorGetContentLock(bufHdr))); > > +static inline BlockNumber heapgettup_initial_page(HeapScanDesc scan, ScanDirection dir) > > +{ > > + Assert(!ScanDirectionIsNoMovement(dir)); > > + Assert(!scan->rs_inited); > > Is there a reason we couldn't set rs_inited in here, rather than reapeating > that in all callers? I wasn't sure if future callers or existing callers in the future may need to do steps other than what is in heapgettup_initial_page() before setting rs_inited. I thought of the responsibility of heapgettup_initial_page() as returning the initial page to start the scan. If it is going to do all initialization steps, perhaps the name should change? I thought having a function that says it does initialization of the scan might be confusing since initscan() also exists. > ISTM that this function should deal with the > /* > * return null immediately if relation is empty > */ > > logic, I think you now are repeating that check on every call to heapgettup(). So, that's a good point. If I move setting rs_inited inside of heapgettup_initial_page(), then I can also easily move the empty table check inside there too. I don't want to set rs_inited before every return in heapgettup_initial_page(). Do you think there are any issues with setting it at the top of the function? I thought about setting it at the very top (even before checking if the relation is empty) Is it okay to set it before the empty table check? rs_inited will be set to false at the bottom before returning. But, maybe this will be an issue in other callers of heapgettup_initial_page()? Anyway, I have changed it in attached v2. > > @@ -511,182 +711,55 @@ heapgettup(HeapScanDesc scan, > > ScanKey key) > > { > > HeapTuple tuple = &(scan->rs_ctup); > > - Snapshot snapshot = scan->rs_base.rs_snapshot; > > - bool backward = ScanDirectionIsBackward(dir); > > BlockNumber page; > > - bool finished; > > Page dp; > > - int lines; > > OffsetNumber lineoff; > > int linesleft; > > - ItemId lpp; > > + > > + if (ScanDirectionIsNoMovement(dir)) > > + return heapgettup_no_movement(scan); > > Maybe add an unlikely() - this path is barely ever used... done. > > + page = scan->rs_cblock; > > + LockBuffer(scan->rs_cbuf, BUFFER_LOCK_SHARE); > > + dp = heapgettup_continue_page(scan, page, dir, &linesleft, &lineoff); > > + goto continue_page; > > } > > > > /* > > * advance the scan until we find a qualifying tuple or run out of stuff > > * to scan > > */ > > - lpp = PageGetItemId(dp, lineoff); > > - for (;;) > > + while (page != InvalidBlockNumber) > > { > > + heapgetpage((TableScanDesc) scan, page); > > + LockBuffer(scan->rs_cbuf, BUFFER_LOCK_SHARE); > > + dp = heapgettup_start_page(scan, page, dir, &linesleft, &lineoff); > > + continue_page: > > > I don't like the goto continue_page at all. Seems that the paths leading here > should call LockBuffer(), heapgettup_start_page() etc? Possibly a do {} while > () loop could do the trick as well. I don't see how a do while loop would solve help with the problem. We need to check if the block number is valid after getting a block assignment before doing heapgetpage() (e.g. after heapgettup_initial_page() and after heapgettup_advance_page()). Removing the goto continue_page means adding the heapgettpage(), heapgettup_start_page(), etc code block in two places now (both after heapgettup_initial_page() and after heapgettup_advance_page()) and, in both locations we have to add an if statement to check if the block is valid. I feel like this makes the function longer and harder to understand. Keeping the loop as short as possible makes it clear what it is doing. I think that with an appropriate warning comment, the goto continue_page is clearer and easier to understand. To me, starting a page at the top of the outer loop, then looping through the lines in the page and is the structure that makes the most sense. - Melanie [1] https://github.com/postgres/postgres/blob/master/src/backend/access/heap/heapam.c#L572
Attachment
On 04.11.22 16:51, Melanie Plageman wrote: > Thanks for the review! > Attached is v2 with feedback addressed. Your 0001 had already been pushed. I have pushed your 0002. I have also pushed the renaming of page -> block, dp -> page separately. This should reduce the size of your 0003 a bit. Please produce an updated version of the 0003 page for further review.
On Wed, Nov 16, 2022 at 10:49 AM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote: > > On 04.11.22 16:51, Melanie Plageman wrote: > > Thanks for the review! > > Attached is v2 with feedback addressed. > > Your 0001 had already been pushed. > > I have pushed your 0002. > > I have also pushed the renaming of page -> block, dp -> page separately. > This should reduce the size of your 0003 a bit. > > Please produce an updated version of the 0003 page for further review. Thanks for looking at this! I have attached a patchset with only the code changes contained in the previous patch 0003. I have broken the refactoring down into many smaller pieces for ease of review. - Melanie
Attachment
- v3-0005-Add-heapgettup-helpers.patch
- v3-0002-Push-lpp-variable-closer-to-usage-in-heapgetpage.patch
- v3-0003-Add-heapgettup_initial_page-helper.patch
- v3-0001-Add-no-movement-scan-helper.patch
- v3-0004-Streamline-heapgettup-for-refactor.patch
- v3-0007-Refactor-heapgettup-and-heapgettup_pagemode.patch
- v3-0006-Add-heapgettup_advance_page.patch
On 30.11.22 23:34, Melanie Plageman wrote: > I have attached a patchset with only the code changes contained in the > previous patch 0003. I have broken the refactoring down into many > smaller pieces for ease of review. To keep this moving along a bit, I have committed your 0002, which I think is a nice little improvement on its own.
On Mon, Jan 2, 2023 at 5:23 AM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote: > > On 30.11.22 23:34, Melanie Plageman wrote: > > I have attached a patchset with only the code changes contained in the > > previous patch 0003. I have broken the refactoring down into many > > smaller pieces for ease of review. > > To keep this moving along a bit, I have committed your 0002, which I > think is a nice little improvement on its own. Thanks! I've attached a rebased patchset - v4. I also changed heapgettup_no_movement() to noinline (instead of inline). David Rowley pointed out that this might make more sense given how comparatively rare no movement scans are. - Melanie
Attachment
On 03.01.23 21:39, Melanie Plageman wrote: >> On 30.11.22 23:34, Melanie Plageman wrote: >>> I have attached a patchset with only the code changes contained in the >>> previous patch 0003. I have broken the refactoring down into many >>> smaller pieces for ease of review. >> >> To keep this moving along a bit, I have committed your 0002, which I >> think is a nice little improvement on its own. > > Thanks! > I've attached a rebased patchset - v4. > > I also changed heapgettup_no_movement() to noinline (instead of inline). > David Rowley pointed out that this might make more sense given how > comparatively rare no movement scans are. Ok, let's look through these patches starting from the top then. v4-0001-Add-no-movement-scan-helper.patch This makes sense overall; there is clearly some duplicate code that can be unified. It appears that during your rebasing you have effectively reverted your earlier changes that have been committed as 8e1db29cdbbd218ab6ba53eea56624553c3bef8c. You should undo that. I don't understand the purpose of the noinline maker. If it's not necessary to inline, we can just leave it off, but there is no need to outright prevent inlining AFAICT. I don't know why you changed the if/else sequences. Before, the sequence was effectively if (forward) { ... } else if (backward) { ... } else { /* it's no movement */ } Now it's changed to if (no movement) { ... return; } if (forward) { ... } else { Assert(backward); ... } Sure, that's the same thing, but it looks less elegant to me.
On Thu, Jan 5, 2023 at 8:52 AM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote: > > Ok, let's look through these patches starting from the top then. > > v4-0001-Add-no-movement-scan-helper.patch > > This makes sense overall; there is clearly some duplicate code that can > be unified. > > It appears that during your rebasing you have effectively reverted your > earlier changes that have been committed as > 8e1db29cdbbd218ab6ba53eea56624553c3bef8c. You should undo that. Thanks. I think I have addressed this. I've attached a rebased v5. > I don't understand the purpose of the noinline maker. If it's not > necessary to inline, we can just leave it off, but there is no need to > outright prevent inlining AFAICT. > I have removed it. > I don't know why you changed the if/else sequences. Before, the > sequence was effectively > > if (forward) > { > ... > } > else if (backward) > { > ... > } > else > { > /* it's no movement */ > } > > Now it's changed to > > if (no movement) > { > ... > return; > } > > if (forward) > { > ... > } > else > { > Assert(backward); > ... > } > > Sure, that's the same thing, but it looks less elegant to me. In this commit, you could keep the original ordering of if statements. I preferred no movement scan first because then backwards and forwards scans' code is physically closer to the rest of the code without the intrusion of the no movement scan code. Ultimately, the refactor (in later patches) flips the ordering of if statements at the top from if (scan direction) to if (initial or continue) and this isn't a very interesting distinction for no movement scans. By dealing with no movement scan at the top, I didn't have to handle no movement scans in the initial and continue branches in the new structure. Also, I will note that patches 4-6 at least and perhaps 4-7 do not make sense as separate commits. I separated them for ease of review. Each is correct and passes tests but is not really an improvement without the others. - Melanie
Attachment
- v5-0003-Add-heapgettup_initial_block-helper.patch
- v5-0001-pgindent-heapgettup-functions.patch
- v5-0004-Streamline-heapgettup-for-refactor.patch
- v5-0002-Add-no-movement-scan-helper.patch
- v5-0005-Add-heapgettup-helpers.patch
- v5-0007-Refactor-heapgettup-and-heapgettup_pagemode.patch
- v5-0006-Add-heapgettup_advance_block-helper.patch
On 10.01.23 21:31, Melanie Plageman wrote: > On Thu, Jan 5, 2023 at 8:52 AM Peter Eisentraut > <peter.eisentraut@enterprisedb.com> wrote: >> >> Ok, let's look through these patches starting from the top then. >> >> v4-0001-Add-no-movement-scan-helper.patch >> >> This makes sense overall; there is clearly some duplicate code that can >> be unified. >> >> It appears that during your rebasing you have effectively reverted your >> earlier changes that have been committed as >> 8e1db29cdbbd218ab6ba53eea56624553c3bef8c. You should undo that. > > Thanks. I think I have addressed this. > I've attached a rebased v5. In your v2 patch, you remove these assertions: - /* check that rs_cindex is in sync */ - Assert(scan->rs_cindex < scan->rs_ntuples); - Assert(lineoff == scan->rs_vistuples[scan->rs_cindex]); Is that intentional? I don't see any explanation, or some other equivalent code appearing elsewhere to replace this.
On Thu, 19 Jan 2023 at 00:04, Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote: > In your v2 patch, you remove these assertions: > > - /* check that rs_cindex is in sync */ > - Assert(scan->rs_cindex < scan->rs_ntuples); > - Assert(lineoff == scan->rs_vistuples[scan->rs_cindex]); > > Is that intentional? > > I don't see any explanation, or some other equivalent code appearing > elsewhere to replace this. I guess it's because those asserts are not relevant unless heapgettup_no_movement() is being called from heapgettup_pagemode(). Maybe they can be put back along the lines of: Assert((scan->rs_base.rs_flags & SO_ALLOW_PAGEMODE) == 0 || scan->rs_cindex < scan->rs_ntuples); Assert((scan->rs_base.rs_flags & SO_ALLOW_PAGEMODE) == 0 || lineoff == scan->rs_vistuples[scan->rs_cindex]); but it probably would be cleaner to just do an: if (scan->rs_base.rs_flags & SO_ALLOW_PAGEMODE) { Assert(...); Assert(...}; } The only issue I see with that is that we don't seem to have anywhere in the regression tests that call heapgettup_no_movement() when rs_flags have SO_ALLOW_PAGEMODE. At least, adding an elog(NOTICE) to heapgettup() just before calling heapgettup_no_movement() does not seem to cause make check to fail. I wonder if any series of SQL commands would allow us to call heapgettup_no_movement() from heapgettup()? I think heapgettup_no_movement() also needs a header comment more along the lines of: /* * heapgettup_no_movement * Helper function for NoMovementScanDirection direction for heapgettup() and * heapgettup_pagemode. */ I pushed the pgindent stuff that v5-0001 did along with some additions to typedefs.list so that further runs could be done more easily as changes are made to these patches. David
Thanks for taking a look! On Mon, Jan 23, 2023 at 6:08 AM David Rowley <dgrowleyml@gmail.com> wrote: > > On Thu, 19 Jan 2023 at 00:04, Peter Eisentraut > <peter.eisentraut@enterprisedb.com> wrote: > > In your v2 patch, you remove these assertions: > > > > - /* check that rs_cindex is in sync */ > > - Assert(scan->rs_cindex < scan->rs_ntuples); > > - Assert(lineoff == scan->rs_vistuples[scan->rs_cindex]); > > > > Is that intentional? > > > > I don't see any explanation, or some other equivalent code appearing > > elsewhere to replace this. > > I guess it's because those asserts are not relevant unless > heapgettup_no_movement() is being called from heapgettup_pagemode(). > Maybe they can be put back along the lines of: > > Assert((scan->rs_base.rs_flags & SO_ALLOW_PAGEMODE) == 0 || > scan->rs_cindex < scan->rs_ntuples); > Assert((scan->rs_base.rs_flags & SO_ALLOW_PAGEMODE) == 0 || lineoff == > scan->rs_vistuples[scan->rs_cindex]); > > but it probably would be cleaner to just do an: if > (scan->rs_base.rs_flags & SO_ALLOW_PAGEMODE) { Assert(...); > Assert(...}; } I prefer the first method and have implemented that in attached v6. > The only issue I see with that is that we don't seem to have anywhere > in the regression tests that call heapgettup_no_movement() when > rs_flags have SO_ALLOW_PAGEMODE. At least, adding an elog(NOTICE) to > heapgettup() just before calling heapgettup_no_movement() does not > seem to cause make check to fail. I wonder if any series of SQL > commands would allow us to call heapgettup_no_movement() from > heapgettup()? So, the places in which we set scan direction to no movement include: - explain analyze on a ctas with no data EXPLAIN ANALYZE CREATE TABLE foo AS SELECT 1 WITH NO DATA; However, in standard_ExecutorRun() we only call ExecutePlan() if the ScanDirection is not no movement, so this wouldn't hit our code - PortalRunSelect - PersistHoldablePortal() I can't say I know enough about portals currently to design a test that will hit this code, but I will poke around some more. > I think heapgettup_no_movement() also needs a header comment more > along the lines of: > > /* > * heapgettup_no_movement > * Helper function for NoMovementScanDirection direction for > heapgettup() and > * heapgettup_pagemode. > */ I've added a comment but I didn't include the function name in it -- I find it repetitive when the comments above functions do that -- however, I'm not strongly attached to that. > I pushed the pgindent stuff that v5-0001 did along with some additions > to typedefs.list so that further runs could be done more easily as > changes are made to these patches. Cool! - Melanie
Attachment
On Tue, Jan 24, 2023 at 04:17:23PM -0500, Melanie Plageman wrote: > Thanks for taking a look! > > On Mon, Jan 23, 2023 at 6:08 AM David Rowley <dgrowleyml@gmail.com> wrote: > > > > On Thu, 19 Jan 2023 at 00:04, Peter Eisentraut > > <peter.eisentraut@enterprisedb.com> wrote: > > > In your v2 patch, you remove these assertions: > > > > > > - /* check that rs_cindex is in sync */ > > > - Assert(scan->rs_cindex < scan->rs_ntuples); > > > - Assert(lineoff == scan->rs_vistuples[scan->rs_cindex]); > > > > > > Is that intentional? > > > > > > I don't see any explanation, or some other equivalent code appearing > > > elsewhere to replace this. > > > > I guess it's because those asserts are not relevant unless > > heapgettup_no_movement() is being called from heapgettup_pagemode(). > > Maybe they can be put back along the lines of: > > > > Assert((scan->rs_base.rs_flags & SO_ALLOW_PAGEMODE) == 0 || > > scan->rs_cindex < scan->rs_ntuples); > > Assert((scan->rs_base.rs_flags & SO_ALLOW_PAGEMODE) == 0 || lineoff == > > scan->rs_vistuples[scan->rs_cindex]); > > > > but it probably would be cleaner to just do an: if > > (scan->rs_base.rs_flags & SO_ALLOW_PAGEMODE) { Assert(...); > > Assert(...}; } > > I prefer the first method and have implemented that in attached v6. > > > The only issue I see with that is that we don't seem to have anywhere > > in the regression tests that call heapgettup_no_movement() when > > rs_flags have SO_ALLOW_PAGEMODE. At least, adding an elog(NOTICE) to > > heapgettup() just before calling heapgettup_no_movement() does not > > seem to cause make check to fail. I wonder if any series of SQL > > commands would allow us to call heapgettup_no_movement() from > > heapgettup()? > > So, the places in which we set scan direction to no movement include: > - explain analyze on a ctas with no data > EXPLAIN ANALYZE CREATE TABLE foo AS SELECT 1 WITH NO DATA; > However, in standard_ExecutorRun() we only call ExecutePlan() if the > ScanDirection is not no movement, so this wouldn't hit our code > - PortalRunSelect > - PersistHoldablePortal() > > I can't say I know enough about portals currently to design a test that > will hit this code, but I will poke around some more. I don't think we can write a test for this afterall. I've started another thread on the topic over here: https://www.postgresql.org/message-id/CAAKRu_bvkhka0CZQun28KTqhuUh5ZqY%3D_T8QEqZqOL02rpi2bw%40mail.gmail.com
"On Wed, 25 Jan 2023 at 10:17, Melanie Plageman <melanieplageman@gmail.com> wrote: > I've added a comment but I didn't include the function name in it -- I > find it repetitive when the comments above functions do that -- however, > I'm not strongly attached to that. I think the general format for header comments is: /* * <function name> *\t\t<brief summary of what function does> * * [Further details] */ We've certainly got places that don't follow that, but I don't think that's any reason to have no comment or invent some new format. heapam.c seems to have some other format where we do: "<function name> - <brief summary of what function does>". I generally just try to copy the style from the surrounding code. I think generally, people won't argue if you follow the style from the surrounding code, but there'd be exceptions to that, I'm sure. I'll skip further review of 0001 here as the whole ScanDirectionNoMovement case is being discussed on the other thread. v6-0002: 1. heapgettup_initial_block() needs a header comment to mention what it does and what it returns. It would be good to make it obvious that it returns InvalidBlockNumber when there are no blocks to scan. 2. After heapgettup_initial_block(), you're checking "if (block == InvalidBlockNumber). It might be worth a mention something like /* * Check if we got to the end of the scan already. This could happen for * an empty relation or if parallel workers scanned everything before we * got a chance. */ the backward scan comment wouldn't mention parallel workers. v6-0003: 3. Can you explain why you removed the snapshot local variable in heapgettup()? 4. I think it might be a good idea to use unlikely() in if (!scan->rs_inited). The idea is to help coax the compiler into moving that code off to a cold path. That's likely especially important if heapgettup_initial_block is inlined, which I see it is marked as. v6-0004: 5. heapgettup_start_page() and heapgettup_continue_page() both need a header comment to explain what they do and what the inputs and output arguments are. 6. I'm not too sure what the following comment means: /* block and lineoff now reference the physically next tid */ "block" is just a parameter to the function and its value is not adjusted. The comment makes it sound like something was changed. (I think these might be just not well updated from having split this out from the 0006 patch as the same comment makes more sense in 0006) v6-0005: 7. heapgettup_advance_block() needs a header comment. 8. Is there a reason why heapgettup_advance_block() handle backward scans first? I'd expect you should just follow the lead of the other functions and do ScanDirectionIsForward first. v6-0006 David
v7 attached On Fri, Jan 27, 2023 at 10:34 PM David Rowley <dgrowleyml@gmail.com> wrote: > > "On Wed, 25 Jan 2023 at 10:17, Melanie Plageman > <melanieplageman@gmail.com> wrote: > > I've added a comment but I didn't include the function name in it -- I > > find it repetitive when the comments above functions do that -- however, > > I'm not strongly attached to that. > > I think the general format for header comments is: > > /* > * <function name> > *\t\t<brief summary of what function does> > * > * [Further details] > */ > > We've certainly got places that don't follow that, but I don't think > that's any reason to have no comment or invent some new format. > > heapam.c seems to have some other format where we do: "<function name> > - <brief summary of what function does>". I generally just try to copy > the style from the surrounding code. I think generally, people won't > argue if you follow the style from the surrounding code, but there'd > be exceptions to that, I'm sure. I have followed the same convention as the other functions in heapam.c in the various helper functions comments I've added in this version. > v6-0002: > > 1. heapgettup_initial_block() needs a header comment to mention what > it does and what it returns. It would be good to make it obvious that > it returns InvalidBlockNumber when there are no blocks to scan. I've done this. > > 2. After heapgettup_initial_block(), you're checking "if (block == > InvalidBlockNumber). It might be worth a mention something like > > /* > * Check if we got to the end of the scan already. This could happen for > * an empty relation or if parallel workers scanned everything before we > * got a chance. > */ > > the backward scan comment wouldn't mention parallel workers. I've done this as well. > > v6-0003: > > 3. Can you explain why you removed the snapshot local variable in heapgettup()? In the subsequent commit, the helpers I add call TestForOldSnapshot(), and I didn't want to pass in the snapshot as a separate parameter since I already need to pass the scan descriptor. I thought it was confusing to have a local variable (snapshot) used in some places and the one in the scan used in others. This "streamlining" commit also reduces the number of times the snapshot variable is used, making it less necessary to have a local variable. I didn't remove the snapshot local variable in the same commit as adding the helpers because I thought it made the diff harder to understand (for review, the final commit should likely not be separate patches). > 4. I think it might be a good idea to use unlikely() in if > (!scan->rs_inited). The idea is to help coax the compiler into moving > that code off to a cold path. That's likely especially important if > heapgettup_initial_block is inlined, which I see it is marked as. I've gone ahead and added unlikely. However, should I perhaps skip inlining the heapgettup_initial_block() function? > v6-0004: > > 5. heapgettup_start_page() and heapgettup_continue_page() both need a > header comment to explain what they do and what the inputs and output > arguments are. I've added these. I've also removed an unused parameter to both, block. > > 6. I'm not too sure what the following comment means: > > /* block and lineoff now reference the physically next tid */ > > "block" is just a parameter to the function and its value is not > adjusted. The comment makes it sound like something was changed. > > (I think these might be just not well updated from having split this > out from the 0006 patch as the same comment makes more sense in 0006) Yes, that is true. I've updated it to just mention lineoff. > v6-0005: > > 7. heapgettup_advance_block() needs a header comment. > > 8. Is there a reason why heapgettup_advance_block() handle backward > scans first? I'd expect you should just follow the lead of the other > functions and do ScanDirectionIsForward first. The reason I do this is that backwards scans cannot be parallel, so handling backwards scans first let me return, then handle parallel scans, then forward scans. This reduced the level of nesting/if statements for all of the code in this function. - Melanie
Attachment
- v7-0002-Add-heapgettup_initial_block-helper.patch
- v7-0001-Remove-NoMovementScanDirection-inconsistencies.patch
- v7-0005-Add-heapgettup_advance_block-helper.patch
- v7-0004-Add-heapgettup-helpers.patch
- v7-0003-Streamline-heapgettup-for-refactor.patch
- v7-0006-Refactor-heapgettup-and-heapgettup_pagemode.patch
On Tue, 31 Jan 2023 at 12:18, Melanie Plageman <melanieplageman@gmail.com> wrote: > > On Fri, Jan 27, 2023 at 10:34 PM David Rowley <dgrowleyml@gmail.com> wrote: > > 4. I think it might be a good idea to use unlikely() in if > > (!scan->rs_inited). The idea is to help coax the compiler into moving > > that code off to a cold path. That's likely especially important if > > heapgettup_initial_block is inlined, which I see it is marked as. > > I've gone ahead and added unlikely. However, should I perhaps skip > inlining the heapgettup_initial_block() function? I'm not sure of the exact best combination of functions to mark as inline. I did try the v7 patchset from 0002 to 0006 on top of c2891175 and I found that the performance is slightly better after removing inline from all 4 of the helper functions. However, I think if we do unlikely() and the function is moved into the cold path then it matters less if it's inlined. create table a (a int); insert into a select x from generate_series(1,1000000)x; vacuum freeze a; $ cat seqscan.sql select * from a where a = 0; $ cat countall.sql select count(*) from a; seqscan.sql filters out all rows and countall.sql returns all rows and does an aggregate so we don't have to return all those in the query. max_parallel_workers_per_gather=0; master $ psql -c "select pg_prewarm('a')" postgres > /dev/null && for i in {1..3}; do pgbench -n -f seqscan.sql -M prepared -T 10 postgres | grep tps; done tps = 25.464091 (without initial connection time) tps = 25.117001 (without initial connection time) tps = 25.141646 (without initial connection time) $ psql -c "select pg_prewarm('a')" postgres > /dev/null && for i in {1..3}; do pgbench -n -f countall.sql -M prepared -T 10 postgres | grep tps; done tps = 27.906307 (without initial connection time) tps = 27.527580 (without initial connection time) tps = 27.563035 (without initial connection time) master + v7 $ psql -c "select pg_prewarm('a')" postgres > /dev/null && for i in {1..3}; do pgbench -n -f seqscan.sql -M prepared -T 10 postgres | grep tps; done tps = 25.920370 (without initial connection time) tps = 25.680052 (without initial connection time) tps = 24.988895 (without initial connection time) $ psql -c "select pg_prewarm('a')" postgres > /dev/null && for i in {1..3}; do pgbench -n -f countall.sql -M prepared -T 10 postgres | grep tps; done tps = 33.783122 (without initial connection time) tps = 33.248571 (without initial connection time) tps = 33.512984 (without initial connection time) master + v7 + inline removed $ psql -c "select pg_prewarm('a')" postgres > /dev/null && for i in {1..3}; do pgbench -n -f seqscan.sql -M prepared -T 10 postgres | grep tps; done tps = 27.680115 (without initial connection time) tps = 26.418562 (without initial connection time) tps = 26.166800 (without initial connection time) $ psql -c "select pg_prewarm('a')" postgres > /dev/null && for i in {1..3}; do pgbench -n -f countall.sql -M prepared -T 10 postgres | grep tps; done tps = 33.948588 (without initial connection time) tps = 33.684966 (without initial connection time) tps = 33.946700 (without initial connection time) You can see that v7 helps countall.sql quite a bit. It seems to also help a little bit with seqscan.sql. v7 + inline removed makes seqscan.sql a decent amount faster than both master and master + v7. David
On Tue, 31 Jan 2023 at 12:18, Melanie Plageman <melanieplageman@gmail.com> wrote: > v7 attached I've been looking over the v7-0002 patch today and I did make a few adjustments to heapgettup_initial_block() as I would prefer to see the branching of each of all these helper functions follow the pattern of: if (<forward scan>) { if (<parallel scan>) <parallel stuff> else <serial stuff> } else { <backwards serial stuff> } which wasn't quite what the function was doing. Along the way, I noticed that 0002 has a subtle bug that does not seem to be present once the remaining patches are applied. I think I'm happier to push these along the lines of how you have them in the patches, so I've held off pushing for now due to the bug and the change I had to make to fix it. The problem is around the setting of scan->rs_inited = true; you've moved that into heapgettup_initial_block() and you've correctly not initialised the scan for empty tables when you return InvalidBlockNumber, however, you've not correctly considered the fact that table_block_parallelscan_nextpage() could also return InvalidBlockNumber if the parallel workers manage to grab all of the blocks before the current process gets the first block. I don't know for sure, but it looks like this could cause problems when heapgettup() or heapgettup_pagemode() got called again for a rescan. We'd have returned the NULL tuple to indicate that no further tuples exist, but we'll have left rs_inited set to true which looks like it'll cause issues. I wondered if it might be better to do the scan->rs_inited = true; in heapgettup() and heapgettup_pagemode() instead. The attached v8 patch does it this way. Despite this fixing that bug, I think this might be a slightly better division of duties. If you're ok with the attached (and everyone else is too), then I can push it in the (NZ) morning. The remaining patches would need to be rebased due to my changes. David
Attachment
On Thu, Feb 02, 2023 at 12:21:20AM +1300, David Rowley wrote: > On Tue, 31 Jan 2023 at 12:18, Melanie Plageman > <melanieplageman@gmail.com> wrote: > > v7 attached > > I've been looking over the v7-0002 patch today and I did make a few > adjustments to heapgettup_initial_block() as I would prefer to see the > branching of each of all these helper functions follow the pattern of: > > if (<forward scan>) > { > if (<parallel scan>) > <parallel stuff> > else > <serial stuff> > } > else > { > <backwards serial stuff> > } > > which wasn't quite what the function was doing. I'm fine with this. One code comment about the new version inline. > Along the way, I noticed that 0002 has a subtle bug that does not seem > to be present once the remaining patches are applied. I think I'm > happier to push these along the lines of how you have them in the > patches, so I've held off pushing for now due to the bug and the > change I had to make to fix it. > > The problem is around the setting of scan->rs_inited = true; you've > moved that into heapgettup_initial_block() and you've correctly not > initialised the scan for empty tables when you return > InvalidBlockNumber, however, you've not correctly considered the fact > that table_block_parallelscan_nextpage() could also return > InvalidBlockNumber if the parallel workers manage to grab all of the > blocks before the current process gets the first block. I don't know > for sure, but it looks like this could cause problems when > heapgettup() or heapgettup_pagemode() got called again for a rescan. > We'd have returned the NULL tuple to indicate that no further tuples > exist, but we'll have left rs_inited set to true which looks like > it'll cause issues. Ah, yes. In the later patches in the series, I handle all end of scan cases (regardless of whether or not there was a beginning) in a single place at the end of the function. There I release the buffer and reset all state -- including setting rs_inited to false. So, that made it okay to set rs_inited to true in heapgettup_initial_block(). When splitting it up, I made a mistake and missed the case you mentioned. Thanks for catching that! FWIW, I like setting rs_inited in heapgettup_initial_block() better in the final refactor, but I agree with you that in this patch on its own it is better in the body of heapgettup() and heapgettup_pagemode(). > I wondered if it might be better to do the scan->rs_inited = true; in > heapgettup() and heapgettup_pagemode() instead. The attached v8 patch > does it this way. Despite this fixing that bug, I think this might be > a slightly better division of duties. LGTM. > From cbd37463bdaa96afed4c7c739c8e91b770a9f8a7 Mon Sep 17 00:00:00 2001 > From: David Rowley <dgrowley@gmail.com> > Date: Wed, 1 Feb 2023 19:35:16 +1300 > Subject: [PATCH v8] Refactor heapam.c adding heapgettup_initial_block function > > Here we adjust heapgettup() and heapgettup_pagemode() to move the code > that fetches the first block out into a helper function. This removes > some code duplication. > > Author: Melanie Plageman > Reviewed-by: David Rowley > Discussion: https://postgr.es/m/CAAKRu_bvkhka0CZQun28KTqhuUh5ZqY=_T8QEqZqOL02rpi2bw@mail.gmail.com > --- > src/backend/access/heap/heapam.c | 225 ++++++++++++++----------------- > 1 file changed, 103 insertions(+), 122 deletions(-) > > diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c > index 0a8bac25f5..40168cc9ca 100644 > --- a/src/backend/access/heap/heapam.c > +++ b/src/backend/access/heap/heapam.c > @@ -483,6 +483,67 @@ heapgetpage(TableScanDesc sscan, BlockNumber block) > scan->rs_ntuples = ntup; > } > > +/* > + * heapgettup_initial_block - return the first BlockNumber to scan > + * > + * Returns InvalidBlockNumber when there are no blocks to scan. This can > + * occur with empty tables and in parallel scans when parallel workers get all > + * of the pages before we can get a chance to get our first page. > + */ > +static BlockNumber > +heapgettup_initial_block(HeapScanDesc scan, ScanDirection dir) > +{ > + Assert(!scan->rs_inited); > + > + /* When there are no pages to scan, return InvalidBlockNumber */ > + if (scan->rs_nblocks == 0 || scan->rs_numblocks == 0) > + return InvalidBlockNumber; > + > + if (ScanDirectionIsForward(dir)) > + { > + /* serial scan */ > + if (scan->rs_base.rs_parallel == NULL) > + return scan->rs_startblock; I believe this else is superfluous since we returned above. > + else > + { > + /* parallel scan */ > + table_block_parallelscan_startblock_init(scan->rs_base.rs_rd, > + scan->rs_parallelworkerdata, > + (ParallelBlockTableScanDesc) scan->rs_base.rs_parallel); > + > + /* may return InvalidBlockNumber if there are no more blocks */ > + return table_block_parallelscan_nextpage(scan->rs_base.rs_rd, > + scan->rs_parallelworkerdata, > + (ParallelBlockTableScanDesc) scan->rs_base.rs_parallel); > + } > + } ... > @@ -889,62 +892,40 @@ heapgettup_pagemode(HeapScanDesc scan, > - if (!scan->rs_inited) > - { > - lineindex = lines - 1; > - scan->rs_inited = true; > - } > - else > - { > + page = BufferGetPage(scan->rs_cbuf); > + TestForOldSnapshot(scan->rs_base.rs_snapshot, scan->rs_base.rs_rd, page); > + lines = scan->rs_ntuples; > lineindex = scan->rs_cindex - 1; > } > - /* block and lineindex now reference the previous visible tid */ I think this is an unintentional diff. > > + /* block and lineindex now reference the previous visible tid */ > linesleft = lineindex + 1; > } - Melanie
On Thu, 2 Feb 2023 at 10:12, Melanie Plageman <melanieplageman@gmail.com> wrote: > FWIW, I like setting rs_inited in heapgettup_initial_block() better in > the final refactor, but I agree with you that in this patch on its own > it is better in the body of heapgettup() and heapgettup_pagemode(). We can reconsider that when we get to that patch. It just felt a bit ugly to add an InvalidBlockNumber check after calling table_block_parallelscan_nextpage() > I believe this else is superfluous since we returned above. TBH, that's on purpose. I felt that it just looked better that way as the code all fitted onto my screen. I think if the function was longer and people had to scroll down to read it, it can often be better to return and reduce the nesting. This allows you to mentally not that a certain case is handled above. However, since all these helper functions seem to fit onto a screen without too much trouble, it just seems better (to me) if they all follow the format that I mentioned earlier. I might live to regret that as we often see get-rid-of-useless-else-clause patches coming up. I'm starting to wonder if someone's got some alarm that goes off every time one gets committed, but we'll see. I'd much rather have consistency between the helper functions than save a few bytes on tab characters. It would be different if the indentation were shifting things too far right, but that's not going to happen in a function that all fits onto a screen at once. I've attached a version of the next patch in the series. I admit to making a couple of adjustments. I couldn't bring myself to remove the snapshot local variable in this commit. We can deal with that when we come to it in whichever patch that needs to be changed in. The only other thing I really did was question your use of rs_cindex to store the last OffsetNumber. I ended up adding a new field which slots into the padding between a bool and BlockNumber field named rs_coffset for this purpose. I noticed what Andres wrote [1] earlier in this thread about that, so thought we should move away from looking at the last tuple to get this number I've attached the rebased and updated patch. David [1] https://postgr.es/m/20221101010948.hsf33emgnwzvil4a@awork3.anarazel.de
Attachment
On Thu, Feb 02, 2023 at 07:00:37PM +1300, David Rowley wrote: > I've attached a version of the next patch in the series. I admit to > making a couple of adjustments. I couldn't bring myself to remove the > snapshot local variable in this commit. We can deal with that when we > come to it in whichever patch that needs to be changed in. That seems fine to keep the diff easy to understand. Also, heapgettup_pagemode() didn't have a snapshot local variable either. > The only other thing I really did was question your use of rs_cindex > to store the last OffsetNumber. I ended up adding a new field which > slots into the padding between a bool and BlockNumber field named > rs_coffset for this purpose. I noticed what Andres wrote [1] earlier > in this thread about that, so thought we should move away from looking > at the last tuple to get this number > > [1] https://postgr.es/m/20221101010948.hsf33emgnwzvil4a@awork3.anarazel.de So, what Andres had said was: > Hm. Finding the next offset via rs_ctup doesn't seem quite right. For one, > it's not actually that cheap to extract the offset from an ItemPointer because > of the the way we pack it into ItemPointerData. Because I was doing this in an earlier version: > > + HeapTuple tuple = &(scan->rs_ctup); And then in the later part of the code got tuple->t_self. I did this because the code in master does this: lineoff = /* previous offnum */ Min(lines, OffsetNumberPrev(ItemPointerGetOffsetNumber(&(tuple->t_self)))); So I figured it was the same. Based on Andres' feedback, I switched to saving the offset number in the scan descriptor and figured I could reuse rs_cindex since it is larger than an OffsetNumber. Your code also switches to saving the OffsetNumber -- just in a separate variable of OffsetNumber type. I am fine with this if it your rationale is that it is not a good idea to store a smaller number in a larger datatype. However, the benefit I saw in reusing rs_cindex is that we could someday converge the code for heapgettup() and heapgettup_pagemode() even more. Even in my final refactor, there is a lot of duplicate code between the two. - Melanie
On Fri, 3 Feb 2023 at 06:23, Melanie Plageman <melanieplageman@gmail.com> wrote: > Your code also switches to saving the OffsetNumber -- just in a separate > variable of OffsetNumber type. I am fine with this if it your rationale > is that it is not a good idea to store a smaller number in a larger > datatype. However, the benefit I saw in reusing rs_cindex is that we > could someday converge the code for heapgettup() and > heapgettup_pagemode() even more. Even in my final refactor, there is a > lot of duplicate code between the two. I was more concerned about the reuse of an unrelated field. I'm struggling to imagine why using the separate field would cause any issues around not being able to reduce the code duplication any more than we otherwise would. Surely in one case you need to get the offset by indexing the rs_vistuples[] array and the other is the offset directly. The only thing I can think of that would allow us not to have a condition there would be if we populated the rs_vistuples[] array with 1..n. I doubt should do that and if we did, we could just use the rs_cindex to index that without having to worry that we're using an unrelated field for something. I've pushed all but the final 2 patches now. David
On Fri, 3 Feb 2023 at 15:26, David Rowley <dgrowleyml@gmail.com> wrote: > I've pushed all but the final 2 patches now. I just pushed the final patch in the series. I held back on moving the setting of rs_inited back into the heapgettup_initial_block() helper function as I wondered if we should even keep that field. It seems that rs_cblock == InvalidBlockNumber in all cases where rs_inited == false, so maybe it's better just to use that as a condition to check if the scan has started or not. I've attached a patch which does that. I ended up adjusting HeapScanDescData more than what is minimally required to remove rs_inited. I wondered if rs_cindex should be closer to rs_cblock in the struct so that we're more likely to be adjusting the same cache line than if that field were closer to the end of the struct. We don't need rs_coffset and rs_cindex at the same time, so I made it a union. I see that the struct is still 712 bytes before and after this change. I've not yet tested to see if there are any performance gains to this change. David
Attachment
On Tue, Feb 07, 2023 at 05:40:13PM +1300, David Rowley wrote: > On Fri, 3 Feb 2023 at 15:26, David Rowley <dgrowleyml@gmail.com> wrote: > > I've pushed all but the final 2 patches now. > > I just pushed the final patch in the series. Cool! > I held back on moving the setting of rs_inited back into the > heapgettup_initial_block() helper function as I wondered if we should > even keep that field. > > It seems that rs_cblock == InvalidBlockNumber in all cases where > rs_inited == false, so maybe it's better just to use that as a > condition to check if the scan has started or not. I've attached a > patch which does that. > > I ended up adjusting HeapScanDescData more than what is minimally > required to remove rs_inited. I wondered if rs_cindex should be closer > to rs_cblock in the struct so that we're more likely to be adjusting > the same cache line than if that field were closer to the end of the > struct. We don't need rs_coffset and rs_cindex at the same time, so I > made it a union. I see that the struct is still 712 bytes before and > after this change. I've not yet tested to see if there are any > performance gains to this change. > I like the idea of using a union. > diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c > index 7eb79cee58..e171d6e38b 100644 > --- a/src/backend/access/heap/heapam.c > +++ b/src/backend/access/heap/heapam.c > @@ -321,13 +321,15 @@ initscan(HeapScanDesc scan, ScanKey key, bool keep_startblock) > } > > scan->rs_numblocks = InvalidBlockNumber; > - scan->rs_inited = false; > scan->rs_ctup.t_data = NULL; > ItemPointerSetInvalid(&scan->rs_ctup.t_self); > scan->rs_cbuf = InvalidBuffer; > scan->rs_cblock = InvalidBlockNumber; > > - /* page-at-a-time fields are always invalid when not rs_inited */ > + /* > + * page-at-a-time fields are always invalid when > + * rs_cblock == InvalidBlockNumber > + */ So, I was wondering what we should do about initializing rs_coffset here since it doesn't fall under "don't initialize it because it is only used for page-at-a-time mode". It might not be required for us to initialize it in initscan, but we do bother to initialize other "current scan state" fields. We could check if pagemode is enabled and initialize rs_coffset or rs_cindex depending on that. Then maybe the comment should call out the specific page-at-a-time fields that are automatically invalid? (e.g. rs_ntuples, rs_vistuples) I presume the point of the comment is to explain why those fields are not being initialized here, which was a question I had when I looked at initscan(), so it seems like we should make sure it explains that. > @@ -717,9 +720,9 @@ heapgettup_advance_block(HeapScanDesc scan, BlockNumber block, ScanDirection dir > * the scankeys. > * > * Note: when we fall off the end of the scan in either direction, we > - * reset rs_inited. This means that a further request with the same > - * scan direction will restart the scan, which is a bit odd, but a > - * request with the opposite scan direction will start a fresh scan > + * reset rs_cblock to InvalidBlockNumber. This means that a further request > + * with the same scan direction will restart the scan, which is a bit odd, but > + * a request with the opposite scan direction will start a fresh scan > * in the proper direction. The latter is required behavior for cursors, > * while the former case is generally undefined behavior in Postgres > * so we don't care too much. Not the fault of this patch, but I am having trouble parsing this comment. What does restart the scan mean? I get that it is undefined behavior, but it is confusing because it kind of sounds like a rescan which is not what it is (right?). And what exactly is a fresh scan? It is probably correct, I just don't really understand what it is saying. Feel free to ignore this aside, as I think your change is correctly updating the comment. > @@ -2321,13 +2316,12 @@ heapam_scan_sample_next_block(TableScanDesc scan, SampleScanState *scanstate) > ReleaseBuffer(hscan->rs_cbuf); > hscan->rs_cbuf = InvalidBuffer; > hscan->rs_cblock = InvalidBlockNumber; > - hscan->rs_inited = false; > > return false; > } > > heapgetpage(scan, blockno); > - hscan->rs_inited = true; > + Assert(hscan->rs_cblock != InvalidBlockNumber); Quite nice to have this assert. > diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h > index 8d28bc93ef..c6efd59eb5 100644 > --- a/src/include/access/heapam.h > +++ b/src/include/access/heapam.h > @@ -56,9 +56,18 @@ typedef struct HeapScanDescData > /* rs_numblocks is usually InvalidBlockNumber, meaning "scan whole rel" */ > > /* scan current state */ > - bool rs_inited; /* false = scan not init'd yet */ > - OffsetNumber rs_coffset; /* current offset # in non-page-at-a-time mode */ > - BlockNumber rs_cblock; /* current block # in scan, if any */ > + union > + { > + /* current offset in non-page-at-a-time mode */ > + OffsetNumber rs_coffset; > + > + /* current tuple's index in vistuples for page-at-a-time mode */ > + int rs_cindex; > + }; With the union up here, the comment about page-at-a-time mode members near the bottom of the struct is now a bit misleading. The rest of my thoughts about that are with that comment. > + > + BlockNumber rs_cblock; /* current block # in scan, or > + * InvalidBlockNumber when the scan is not yet > + * initialized */ The formatting of this comment is a bit difficult to read. Perhaps it could go above the member? Also, a few random other complaints about the comments in HeapScanDescData that are also not the fault of this patch: - why is this comment there twice? /* rs_numblocks is usually InvalidBlockNumber, meaning "scan whole rel" */ - above the union it said (prior to this patch also) /* scan current state */ which is in contrast to /* state set up at initscan time */ from the top, but, arguably, rs_strategy is set up at initscan time - who or what is NB? /* NB: if rs_cbuf is not InvalidBuffer, we hold a pin on that buffer */ > Buffer rs_cbuf; /* current buffer in scan, if any */ > /* NB: if rs_cbuf is not InvalidBuffer, we hold a pin on that buffer */ > > @@ -74,7 +83,6 @@ typedef struct HeapScanDescData > ParallelBlockTableScanWorkerData *rs_parallelworkerdata; > > /* these fields only used in page-at-a-time mode and for bitmap scans */ > - int rs_cindex; /* current tuple's index in vistuples */ > int rs_ntuples; /* number of visible tuples on page */ > OffsetNumber rs_vistuples[MaxHeapTuplesPerPage]; /* their offsets */ I would personally be okay with either a block comment somewhere nearby describing which members are used in page-at-a-time mode or individual comments for each field that is used only in page-at-a-time mode (or something else, I just think the situation with the patch applied currently is confusing). - Melanie
On Wed, 8 Feb 2023 at 09:41, Melanie Plageman <melanieplageman@gmail.com> wrote: > > On Tue, Feb 07, 2023 at 05:40:13PM +1300, David Rowley wrote: > > I ended up adjusting HeapScanDescData more than what is minimally > > required to remove rs_inited. I wondered if rs_cindex should be closer > > to rs_cblock in the struct so that we're more likely to be adjusting > > the same cache line than if that field were closer to the end of the > > struct. We don't need rs_coffset and rs_cindex at the same time, so I > > made it a union. I see that the struct is still 712 bytes before and > > after this change. I've not yet tested to see if there are any > > performance gains to this change. > > > > I like the idea of using a union. Using the tests mentioned in [1], I tested out remove_HeapScanDescData_rs_inited_field.patch. It's not looking very promising at all. seqscan.sql test: master (e2c78e7ab) tps = 27.769076 (without initial connection time) tps = 28.155233 (without initial connection time) tps = 26.990389 (without initial connection time) master + remove_HeapScanDescData_rs_inited_field.patch tps = 23.990490 (without initial connection time) tps = 23.450662 (without initial connection time) tps = 23.600194 (without initial connection time) master + remove_HeapScanDescData_rs_inited_field.patch without union HeapScanDescData change (just remove rs_inited field) tps = 24.419007 (without initial connection time) tps = 24.221389 (without initial connection time) tps = 24.187756 (without initial connection time) countall.sql test: master (e2c78e7ab) tps = 33.999408 (without initial connection time) tps = 33.664292 (without initial connection time) tps = 33.869115 (without initial connection time) master + remove_HeapScanDescData_rs_inited_field.patch tps = 31.194316 (without initial connection time) tps = 30.804987 (without initial connection time) tps = 30.770236 (without initial connection time) master + remove_HeapScanDescData_rs_inited_field.patch without union HeapScanDescData change (just remove rs_inited field) tps = 32.626187 (without initial connection time) tps = 32.876362 (without initial connection time) tps = 32.481729 (without initial connection time) I don't really have any explanation for why this slows performance so much. My thoughts are that if the performance of scans is this sensitive to the order of the fields in the struct then it's an independent project to learn out why and what we can realistically change to get the best performance here. > So, I was wondering what we should do about initializing rs_coffset here > since it doesn't fall under "don't initialize it because it is only used > for page-at-a-time mode". It might not be required for us to initialize > it in initscan, but we do bother to initialize other "current scan > state" fields. We could check if pagemode is enabled and initialize > rs_coffset or rs_cindex depending on that. Maybe master should be initialising this field already. I didn't quite see it as important as it's never used before rs_inited is set to true. Maybe setting it to InvalidOffsetNumber is a good idea just in case something tries to use it before it gets set. > > * Note: when we fall off the end of the scan in either direction, we > > - * reset rs_inited. This means that a further request with the same > > - * scan direction will restart the scan, which is a bit odd, but a > > - * request with the opposite scan direction will start a fresh scan > > + * reset rs_cblock to InvalidBlockNumber. This means that a further request > > + * with the same scan direction will restart the scan, which is a bit odd, but > > + * a request with the opposite scan direction will start a fresh scan > > * in the proper direction. The latter is required behavior for cursors, > > * while the former case is generally undefined behavior in Postgres > > * so we don't care too much. > > Not the fault of this patch, but I am having trouble parsing this > comment. What does restart the scan mean? I get that it is undefined > behavior, but it is confusing because it kind of sounds like a rescan > which is not what it is (right?). And what exactly is a fresh scan? It > is probably correct, I just don't really understand what it is saying. > Feel free to ignore this aside, as I think your change is correctly > updating the comment. I struggled with this too. It just looks incorrect. As far as I see it, once the scan ends we do the same thing regardless of what the scan direction is. Maybe it's worth looking back at when that comment was added and seeing if it was true when it was written and then see what changed. I think it's worth improving that independently. I think I'd like to focus on the cleanup stuff before looking into getting rid of rs_inited. I'm not sure I'm going to get time to do the required performance tests to look into why removing rs_inited slows things down so much. > Also, a few random other complaints about the comments in > HeapScanDescData that are also not the fault of this patch: > > - why is this comment there twice? > /* rs_numblocks is usually InvalidBlockNumber, meaning "scan whole rel" */ Seems to date back to 2019. I'll push something shortly to remove the additional one. > - above the union it said (prior to this patch also) > /* scan current state */ > which is in contrast to > /* state set up at initscan time */ > from the top, but, arguably, rs_strategy is set up at initscan time I'm not sure what the best thing to do about that is. Given the performance numbers I showed above after removing the rs_inited field, I'd rather not move that field in the struct up to the other fields that are set in initscan(). Perhaps you can suggest a patch which improves the comments? > - who or what is NB? > /* NB: if rs_cbuf is not InvalidBuffer, we hold a pin on that buffer */ Latin for "note". It's used quite commonly in our code. David [1] https://postgr.es/m/CAApHDvpnA9SGp3OeXr4cYqX_w=NYN2YMzf2zfrABPNDsUqoNqw@mail.gmail.com
On Wed, 8 Feb 2023 at 15:09, David Rowley <dgrowleyml@gmail.com> wrote: > Using the tests mentioned in [1], I tested out > remove_HeapScanDescData_rs_inited_field.patch. It's not looking very > promising at all. In light of the performance regression from removing the rs_inited field, let's just forget doing that for now. It does not really seem that important compared to the other work that's already been done. If one of us gets time during the v17 cycle, then maybe we can revisit it then. I'll mark the patch as committed in the CF app. David