Thread: heapgettup() with NoMovementScanDirection unused in core?
Hi, David Rowley and I were discussing how to test the NoMovementScanDirection case for heapgettup() and heapgettup_pagemode() in [1] (since there is not currently coverage). We are actually wondering if it is dead code (in core). This is a link to the code in question on github in [2] (side note: is there a more evergreen way to do this that doesn't involve pasting a hundred lines of code into this email? You need quite a few lines of context for it to be clear what code I am talking about.) standard_ExecutorRun() doesn't run ExecutePlan() if scan direction is no movement. if (!ScanDirectionIsNoMovement(direction)) { ... ExecutePlan(estate, queryDesc->planstate, } And other users of heapgettup() through table_scan_getnextslot() and the like all seem to pass ForwardScanDirection as the direction. A skilled code archaeologist brought our attention to adbfab119b308a which appears to remove the only users in the core codebase calling heapgettup() and heapgettup_pagemode() with NoMovementScanDirection (and those users were not themselves used). Perhaps we can remove support for NoMovementScanDirection with heapgettup()? Unless someone knows of a good use case for a table AM to do this? - Melanie [1] https://www.postgresql.org/message-id/CAAKRu_ZyiXwWS1WXSZneoy%2BsjoH_%2BF5UhO-1uFhyi-u0d6z%2BfA%40mail.gmail.com [2] https://github.com/postgres/postgres/blob/master/src/backend/access/heap/heapam.c#L656
On Wed, 25 Jan 2023 at 13:55, Melanie Plageman <melanieplageman@gmail.com> wrote: > David Rowley and I were discussing how to test the > NoMovementScanDirection case for heapgettup() and heapgettup_pagemode() > in [1] (since there is not currently coverage). We are actually > wondering if it is dead code (in core). Yeah, so I see nothing in core that can cause heapgettup() or heapgettup_pagemode() to be called with NoMovementScanDirection. I imagine one possible way to hit it might be in an extension where someone's written their own ExecutorRun_hook that does not have the same NoMovementScanDirection check that standard_ExecutorRun() has. So far my thoughts are that we should just rip it out and see if anyone complains. If they complain loudly enough, then it's easy enough to put it back without any compatibility issues. However, if it's for the ExecutorRun_hook reason, then they'll likely be better to add the same NoMovementScanDirection as we have in standard_ExecutorRun(). I'm just not keen on refactoring the code without the means to test that the new code actually works. Does anyone know of any reason why we shouldn't ditch the nomovement code in heapgettup/heapgettup_pagemode? Maybe we'd also want to Assert that the direction is either forwards or backwards in table_scan_getnextslot and table_scan_getnextslot_tidrange. (I see heap_getnextslot_tidrange() does not have any handling for NoMovementScanDirection, so if this is not dead code, that probably needs to be fixed) David
David Rowley <dgrowleyml@gmail.com> writes: > Does anyone know of any reason why we shouldn't ditch the nomovement > code in heapgettup/heapgettup_pagemode? AFAICS, the remaining actual use-case for NoMovementScanDirection is that defined by ExecutorRun: * If direction is NoMovementScanDirection then nothing is done * except to start up/shut down the destination. Otherwise, * we retrieve up to 'count' tuples in the specified direction. * * Note: count = 0 is interpreted as no portal limit, i.e., run to * completion. We must have the NoMovementScanDirection option because count = 0 does not mean "do nothing", and I noted at least two call sites that require it. The heapgettup definition is thus not only unreachable, but confusingly inconsistent with this meaning. I wonder if we couldn't also get rid of this confusingly-inconsistent alternative usage in the planner: * 'indexscandir' is one of: * ForwardScanDirection: forward scan of an ordered index * BackwardScanDirection: backward scan of an ordered index * NoMovementScanDirection: scan of an unordered index, or don't care * (The executor doesn't care whether it gets ForwardScanDirection or * NoMovementScanDirection for an indexscan, but the planner wants to * distinguish ordered from unordered indexes for building pathkeys.) While that comment's claim is plausible, I think it's been wrong for years. AFAICS indxpath.c makes pathkeys before it ever does this: index_is_ordered ? ForwardScanDirection : NoMovementScanDirection, and nothing depends on that later either. So I think we could simplify this to something like "indexscandir is either ForwardScanDirection or BackwardScanDirection. (Unordered index types need not support BackwardScanDirection.)" regards, tom lane
Hi, On 2023-01-25 10:02:28 -0500, Tom Lane wrote: > David Rowley <dgrowleyml@gmail.com> writes: > > Does anyone know of any reason why we shouldn't ditch the nomovement > > code in heapgettup/heapgettup_pagemode? +1 Because I dug it up yesterday. There used to be callers of heap* with NoMovement. But they were unused themselves: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=adbfab119b308a7e0e6b1305de9be222cfd5c85b > * If direction is NoMovementScanDirection then nothing is done > * except to start up/shut down the destination. Otherwise, > * we retrieve up to 'count' tuples in the specified direction. > * > * Note: count = 0 is interpreted as no portal limit, i.e., run to > * completion. > > We must have the NoMovementScanDirection option because count = 0 > does not mean "do nothing", and I noted at least two call sites > that require it. I wonder if we'd be better off removing NoMovementScanDirection, and using count == (uint64)-1 for what NoMovementScanDirection is currently used for as an ExecutorRun parameter. Seems less confusing to me - right now we have two parameters with non-obbvious meanings and interactions. > I wonder if we couldn't also get rid of this confusingly-inconsistent > alternative usage in the planner: > > * 'indexscandir' is one of: > * ForwardScanDirection: forward scan of an ordered index > * BackwardScanDirection: backward scan of an ordered index > * NoMovementScanDirection: scan of an unordered index, or don't care > * (The executor doesn't care whether it gets ForwardScanDirection or > * NoMovementScanDirection for an indexscan, but the planner wants to > * distinguish ordered from unordered indexes for building pathkeys.) +1 Certainly seems confusing to me. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2023-01-25 10:02:28 -0500, Tom Lane wrote: >> We must have the NoMovementScanDirection option because count = 0 >> does not mean "do nothing", and I noted at least two call sites >> that require it. > I wonder if we'd be better off removing NoMovementScanDirection, and using > count == (uint64)-1 for what NoMovementScanDirection is currently used for as > an ExecutorRun parameter. Seems less confusing to me - right now we have two > parameters with non-obbvious meanings and interactions. I'm down on that because it seems just about certain to break extensions that call the executor, and it isn't adding enough clarity (IMHO) to justify that. regards, tom lane
Hi, I have written the patch to remove the unreachable code in heapgettup_pagemode](). On Wed, Jan 25, 2023 at 10:02 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I wonder if we couldn't also get rid of this confusingly-inconsistent > alternative usage in the planner: > > * 'indexscandir' is one of: > * ForwardScanDirection: forward scan of an ordered index > * BackwardScanDirection: backward scan of an ordered index > * NoMovementScanDirection: scan of an unordered index, or don't care > * (The executor doesn't care whether it gets ForwardScanDirection or > * NoMovementScanDirection for an indexscan, but the planner wants to > * distinguish ordered from unordered indexes for building pathkeys.) > > While that comment's claim is plausible, I think it's been wrong for > years. AFAICS indxpath.c makes pathkeys before it ever does this: > > index_is_ordered ? > ForwardScanDirection : > NoMovementScanDirection, > > and nothing depends on that later either. So I think we could > simplify this to something like "indexscandir is either > ForwardScanDirection or BackwardScanDirection. (Unordered > index types need not support BackwardScanDirection.)" > I also did what I *think* Tom is suggesting here -- make index scan's scan direction always forward or backward... Maybe the set should be two patches...dunno. - Melanie
Attachment
Melanie Plageman <melanieplageman@gmail.com> writes: > I have written the patch to remove the unreachable code in > heapgettup_pagemode](). A few thoughts ... 1. Do we really need quite so many Asserts? I'd kind of lean to just having one, at some higher level of the executor. 2. I'm not sure if we want to do this: - direction = estate->es_direction; - /* flip direction if this is an overall backward scan */ - if (ScanDirectionIsBackward(((IndexScan *) node->ss.ps.plan)->indexorderdir)) - { - if (ScanDirectionIsForward(direction)) - direction = BackwardScanDirection; - else if (ScanDirectionIsBackward(direction)) - direction = ForwardScanDirection; - } + direction = estate->es_direction * ((IndexScan *) node->ss.ps.plan)->indexorderdir; AFAIR, there is noplace today that depends on the exact encoding of ForwardScanDirection and BackwardScanDirection, and I'm not sure that we want to introduce such a dependency. If we do it at least deserves a comment here, and you probably ought to adjust the wishy-washy comment in sdir.h as well. Taking out the existing comment explaining what this code is doing is not an improvement either. 3. You didn't update the header comment for heapgettup, nor the one in pathnodes.h for IndexPath.indexscandir. 4. I don't think the proposed test case is worth the cycles. regards, tom lane
On Fri, Jan 27, 2023 at 05:05:16PM -0500, Tom Lane wrote: > Melanie Plageman <melanieplageman@gmail.com> writes: > > I have written the patch to remove the unreachable code in > > heapgettup_pagemode](). > > A few thoughts ... > > 1. Do we really need quite so many Asserts? I'd kind of lean > to just having one, at some higher level of the executor. Yes, perhaps I was a bit overzealous putting them in functions called for every tuple. I'm not sure where in the executor would make the most sense. ExecInitSeqScan() comes to mind, but I'm not sure that covers all of the desired cases. > > 2. I'm not sure if we want to do this: > > - direction = estate->es_direction; > - /* flip direction if this is an overall backward scan */ > - if (ScanDirectionIsBackward(((IndexScan *) node->ss.ps.plan)->indexorderdir)) > - { > - if (ScanDirectionIsForward(direction)) > - direction = BackwardScanDirection; > - else if (ScanDirectionIsBackward(direction)) > - direction = ForwardScanDirection; > - } > + direction = estate->es_direction * ((IndexScan *) node->ss.ps.plan)->indexorderdir; > > AFAIR, there is noplace today that depends on the exact encoding > of ForwardScanDirection and BackwardScanDirection, and I'm not > sure that we want to introduce such a dependency. If we do it > at least deserves a comment here, and you probably ought to adjust > the wishy-washy comment in sdir.h as well. Taking out the existing > comment explaining what this code is doing is not an improvement > either. I think you mean the enum value when you say encoding? I actually started using the ScanDirection value in the refactor of heapgettup() and heapgettup_pagemode() which I proposed here [1]. Why wouldn't we want to introduce such a dependency? > > 3. You didn't update the header comment for heapgettup, nor the > one in pathnodes.h for IndexPath.indexscandir. Oops -- thanks for catching those! > > 4. I don't think the proposed test case is worth the cycles. Just the one I wrote or any test case? - Melanie [1] https://www.postgresql.org/message-id/flat/CAAKRu_ZJg_N7zHtWP%2BJoSY_hrce4%2BGKioL137Y2c2En-kuXQ7g%40mail.gmail.com#8a106c6625bc069cf439230cd9fa1000
Melanie Plageman <melanieplageman@gmail.com> writes: > On Fri, Jan 27, 2023 at 05:05:16PM -0500, Tom Lane wrote: >> AFAIR, there is noplace today that depends on the exact encoding >> of ForwardScanDirection and BackwardScanDirection, and I'm not >> sure that we want to introduce such a dependency. > I think you mean the enum value when you say encoding? I actually > started using the ScanDirection value in the refactor of heapgettup() > and heapgettup_pagemode() which I proposed here [1]. Why wouldn't we > want to introduce such a dependency? It's just that in general, depending on the numeric values of an enum isn't a great coding practice. After thinking about it for awhile, I'd be happier if we added something like this to sdir.h, and then used it rather than directly depending on multiplication: /* * Determine the net effect of two direction specifications. * This relies on having ForwardScanDirection = +1, BackwardScanDirection = -1, * and will probably not do what you want if applied to any other values. */ #define CombineScanDirections(a, b) ((a) * (b)) The main thing this'd buy us is being able to grep for uses of the trick. If it's written as just multiplication, good luck being able to find what's depending on that, should you ever need to. >> 4. I don't think the proposed test case is worth the cycles. > Just the one I wrote or any test case? I think that all this code is quite well-tested already, so I'm not sure what's the point of adding another test for it. regards, tom lane
On Sat, 28 Jan 2023 at 12:15, Tom Lane <tgl@sss.pgh.pa.us> wrote: > /* > * Determine the net effect of two direction specifications. > * This relies on having ForwardScanDirection = +1, BackwardScanDirection = -1, > * and will probably not do what you want if applied to any other values. > */ > #define CombineScanDirections(a, b) ((a) * (b)) > > The main thing this'd buy us is being able to grep for uses of the > trick. If it's written as just multiplication, good luck being > able to find what's depending on that, should you ever need to. Yeah, I think the multiplication macro is a good way of doing it. Having the definition of it close to the ScanDirection enum's definition is likely a very good idea so that anyone adjusting the enum values is more likely to notice that it'll cause an issue. A small note on the enum declaration about the -1, +1 values being exploited in various places might be a good idea too. I see v6-0006 in [1] further exploits this, so that's further reason to document that. My personal preference would have been to call it ScanDirectionCombine, so the naming is more aligned to the 4 other macro names that start with ScanDirection in sdir.h, but I'm not going to fuss over it. David [1] https://postgr.es/m/CAAKRu_ZyiXwWS1WXSZneoy+sjoH_+F5UhO-1uFhyi-u0d6z+fA@mail.gmail.com David
David Rowley <dgrowleyml@gmail.com> writes: > My personal preference would have been to call it > ScanDirectionCombine, so the naming is more aligned to the 4 other > macro names that start with ScanDirection in sdir.h, but I'm not going > to fuss over it. No objection to that. regards, tom lane
v2 attached On Fri, Jan 27, 2023 at 6:28 PM David Rowley <dgrowleyml@gmail.com> wrote: > > On Sat, 28 Jan 2023 at 12:15, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > /* > > * Determine the net effect of two direction specifications. > > * This relies on having ForwardScanDirection = +1, BackwardScanDirection = -1, > > * and will probably not do what you want if applied to any other values. > > */ > > #define CombineScanDirections(a, b) ((a) * (b)) > > > > The main thing this'd buy us is being able to grep for uses of the > > trick. If it's written as just multiplication, good luck being > > able to find what's depending on that, should you ever need to. > > Yeah, I think the multiplication macro is a good way of doing it. > Having the definition of it close to the ScanDirection enum's > definition is likely a very good idea so that anyone adjusting the > enum values is more likely to notice that it'll cause an issue. A > small note on the enum declaration about the -1, +1 values being > exploited in various places might be a good idea too. I see v6-0006 in > [1] further exploits this, so that's further reason to document that. > > My personal preference would have been to call it > ScanDirectionCombine, so the naming is more aligned to the 4 other > macro names that start with ScanDirection in sdir.h, but I'm not going > to fuss over it. I've gone with this macro name. I've also updated comments Tom mentioned and removed the test. As for the asserts, I was at a bit of a loss as to where to put an assert which will make it clear that heapgettup() and heapgettup_pagemode() do not handle NoMovementScanDirection but was at a higher level of the executor. Do we not have to accommodate the direction changing from tuple to tuple? If we don't expect the plan node direction to change during execution, then why recalculate estate->es_direction for each invocation of Index/SeqNext()? As such, in this version I've put the asserts in heapgettup() and heapgettup_pagemode(). I also realized that it doesn't really make sense to assert about the index scan direction in ExecInitIndexOnlyScan() and ExecInitIndexScan() -- so I've moved the assertion to planner when we make the index plan from the path. I'm not sure if it is needed. - Melanie
Attachment
On Tue, 31 Jan 2023 at 09:57, Melanie Plageman <melanieplageman@gmail.com> wrote: > As for the asserts, I was at a bit of a loss as to where to put an > assert which will make it clear that heapgettup() and > heapgettup_pagemode() do not handle NoMovementScanDirection but was > at a higher level of the executor. My thoughts were that we might want to put them table_scan_getnextslot() and table_scan_getnextslot_tidrange(). My rationale for that was that it makes it more clear to table AM devs that they don't need to handle NoMovementScanDirection. > Do we not have to accommodate the > direction changing from tuple to tuple? If we don't expect the plan node > direction to change during execution, then why recalculate > estate->es_direction for each invocation of Index/SeqNext()? Yeah, this needs to be handled. FETCH can fetch forwards or backwards from a cursor. The code you have looks fine to me. > As such, in this version I've put the asserts in heapgettup() and > heapgettup_pagemode(). > > I also realized that it doesn't really make sense to assert about the > index scan direction in ExecInitIndexOnlyScan() and ExecInitIndexScan() > -- so I've moved the assertion to planner when we make the index plan > from the path. I'm not sure if it is needed. That's probably slightly better. The only thing I really have on this is my thoughts on the Asserts going in tableam.h plus the following comment: /* * These enum values were originally int8 values. Using -1, 0, and 1 as their * values conveniently mirrors their semantic value when used during execution. */ I don't really see any reason to keep the historical note here. I think something like the following might be better: /* * Defines the direction for scanning a table or an index. Scans are never * invoked using NoMovementScanDirectionScans. For convenience, we use the * values -1 and 1 for backward and forward scans. This allows us to perform * a few mathematical tricks such as what is done in ScanDirectionCombine. */ Also, a nitpick around the inconsistency with the Asserts. In make_indexscan() and make_indexonlyscan() you're checking you're getting a forward and backward value, but in heapgettup() and heapgettup_pagemode() you're checking you don't get NoMovementScanDirection. I think the != NoMovementScanDirection is fine for both cases. Both can be easily fixed, so no need to submit another patch as far as I'm concerned. I'll leave this til tomorrow in case Tom wants to have another look too. David
On Tue, Jan 31, 2023 at 11:46:05PM +1300, David Rowley wrote: > On Tue, 31 Jan 2023 at 09:57, Melanie Plageman > <melanieplageman@gmail.com> wrote: > > As for the asserts, I was at a bit of a loss as to where to put an > > assert which will make it clear that heapgettup() and > > heapgettup_pagemode() do not handle NoMovementScanDirection but was > > at a higher level of the executor. > > My thoughts were that we might want to put them > table_scan_getnextslot() and table_scan_getnextslot_tidrange(). My > rationale for that was that it makes it more clear to table AM devs > that they don't need to handle NoMovementScanDirection. I previously had the asserts here, but I thought perhaps we shouldn't restrict table AMs from using NoMovementScanDirection in whatever way they'd like. We care about protecting heapgettup() and heapgettup_pagemode(). We could put a comment in the table AM API about NoMovementScanDirection not necessarily making sense for a next() type function and informing table AMs that they need not support it. > > > Do we not have to accommodate the > > direction changing from tuple to tuple? If we don't expect the plan node > > direction to change during execution, then why recalculate > > estate->es_direction for each invocation of Index/SeqNext()? > > Yeah, this needs to be handled. FETCH can fetch forwards or backwards > from a cursor. The code you have looks fine to me. > > > As such, in this version I've put the asserts in heapgettup() and > > heapgettup_pagemode(). > > > > I also realized that it doesn't really make sense to assert about the > > index scan direction in ExecInitIndexOnlyScan() and ExecInitIndexScan() > > -- so I've moved the assertion to planner when we make the index plan > > from the path. I'm not sure if it is needed. > > That's probably slightly better. > > The only thing I really have on this is my thoughts on the Asserts > going in tableam.h plus the following comment: > > /* > * These enum values were originally int8 values. Using -1, 0, and 1 as their > * values conveniently mirrors their semantic value when used during execution. > */ > > I don't really see any reason to keep the historical note here. I > think something like the following might be better: > > /* > * Defines the direction for scanning a table or an index. Scans are never > * invoked using NoMovementScanDirectionScans. For convenience, we use the > * values -1 and 1 for backward and forward scans. This allows us to perform > * a few mathematical tricks such as what is done in ScanDirectionCombine. > */ This comment looks good to me. > Also, a nitpick around the inconsistency with the Asserts. In > make_indexscan() and make_indexonlyscan() you're checking you're > getting a forward and backward value, but in heapgettup() and > heapgettup_pagemode() you're checking you don't get > NoMovementScanDirection. I think the != NoMovementScanDirection is > fine for both cases. Yes, I thought about it being weird that they are different. Perhaps we should check in both places that it is forward or backward. In heapgettup[_pagemode()] there is if/else -- so if the assert is only for NoMovementScanDirection and a new scan direction is added, it would fall through to the else. In planner, it is not that we are not "handling" NoMovementScanDirection (like in heapgettup) but rather that we are only passing Forward and Backward scan directions when creating the path nodes, so the Assert would be mainly to remind the developer that if they are creating a plan with a different scan direction that they should be intentional about it. So, I would favor having both asserts check that the direction is one of forward or backward. > Both can be easily fixed, so no need to submit another patch as far as > I'm concerned. I realized I forgot a commit message in the second version. Patch v1 has one. - Melanie
On Wed, 1 Feb 2023 at 03:02, Melanie Plageman <melanieplageman@gmail.com> wrote: > > On Tue, Jan 31, 2023 at 11:46:05PM +1300, David Rowley wrote: > > My thoughts were that we might want to put them > > table_scan_getnextslot() and table_scan_getnextslot_tidrange(). My > > rationale for that was that it makes it more clear to table AM devs > > that they don't need to handle NoMovementScanDirection. > > I previously had the asserts here, but I thought perhaps we shouldn't > restrict table AMs from using NoMovementScanDirection in whatever way > they'd like. We care about protecting heapgettup() and > heapgettup_pagemode(). We could put a comment in the table AM API about > NoMovementScanDirection not necessarily making sense for a next() type > function and informing table AMs that they need not support it. hmm, but the recent discovery is that we'll never call ExecutePlan() with NoMovementScanDirection, so what exactly is going to call table_scan_getnextslot() and table_scan_getnextslot_tidrange() with NoMovementScanDirection? > So, I would favor having both asserts check that the direction is one of > forward or backward. That sounds fine to me. David
David Rowley <dgrowleyml@gmail.com> writes: > On Wed, 1 Feb 2023 at 03:02, Melanie Plageman <melanieplageman@gmail.com> wrote: >> I previously had the asserts here, but I thought perhaps we shouldn't >> restrict table AMs from using NoMovementScanDirection in whatever way >> they'd like. We care about protecting heapgettup() and >> heapgettup_pagemode(). We could put a comment in the table AM API about >> NoMovementScanDirection not necessarily making sense for a next() type >> function and informing table AMs that they need not support it. > hmm, but the recent discovery is that we'll never call ExecutePlan() > with NoMovementScanDirection, so what exactly is going to call > table_scan_getnextslot() and table_scan_getnextslot_tidrange() with > NoMovementScanDirection? Yeah. This is not an AM-local API. regards, tom lane
On Wed, 1 Feb 2023 at 03:02, Melanie Plageman <melanieplageman@gmail.com> wrote: > > On Tue, Jan 31, 2023 at 11:46:05PM +1300, David Rowley wrote: > > Both can be easily fixed, so no need to submit another patch as far as > > I'm concerned. > > I realized I forgot a commit message in the second version. Patch v1 has > one. I made a couple of other adjustments to the Asserts and comments and pushed the result. On further looking, I felt the Assert was better off done in create_indexscan_plan() rather than make_index[only]scan(). I also put the asserts in tableam.h and removed the heapam.c ones. The rest was just comment adjustments. David