Re: Incremental sorts and EXEC_FLAG_REWIND - Mailing list pgsql-hackers
From | James Coleman |
---|---|
Subject | Re: Incremental sorts and EXEC_FLAG_REWIND |
Date | |
Msg-id | CAAaqYe8V4r+AWzH=Rnwp5ByCJA4=vVN_xVmdbFHNQ-xEBKUp-A@mail.gmail.com Whole thread Raw |
In response to | Re: Incremental sorts and EXEC_FLAG_REWIND ("Jonathan S. Katz" <jkatz@postgresql.org>) |
Responses |
Re: Incremental sorts and EXEC_FLAG_REWIND
|
List | pgsql-hackers |
On Thu, May 7, 2020 at 2:57 PM Jonathan S. Katz <jkatz@postgresql.org> wrote: > > On 4/24/20 6:57 PM, Tomas Vondra wrote: > > On Fri, Apr 24, 2020 at 04:35:02PM -0400, James Coleman wrote: > >> On Sun, Apr 19, 2020 at 12:14 PM James Coleman <jtc331@gmail.com> wrote: > >>> > >>> On Wed, Apr 15, 2020 at 2:04 PM James Coleman <jtc331@gmail.com> wrote: > >>> > > >>> > On Wed, Apr 15, 2020 at 11:02 AM James Coleman <jtc331@gmail.com> > >>> wrote: > >>> > > > >>> > > On Tue, Apr 14, 2020 at 2:53 AM Michael Paquier > >>> <michael@paquier.xyz> wrote: > >>> > > > > >>> > > > Hi, > >>> > > > > >>> > > > When initializing an incremental sort node, we have the > >>> following as > >>> > > > of ExecInitIncrementalSort(): > >>> > > > /* > >>> > > > * Incremental sort can't be used with either > >>> EXEC_FLAG_REWIND, > >>> > > > * EXEC_FLAG_BACKWARD or EXEC_FLAG_MARK, because we only > >>> one of many sort > >>> > > > * batches in the current sort state. > >>> > > > */ > >>> > > > Assert((eflags & (EXEC_FLAG_BACKWARD | > >>> > > > EXEC_FLAG_MARK)) == 0); > >>> > > > While I don't quite follow why EXEC_FLAG_REWIND should be > >>> allowed here > >>> > > > to begin with (because incremental sorts don't support rescans > >>> without > >>> > > > parameter changes, right?), the comment and the assertion are > >>> telling > >>> > > > a different story. > >>> > > > >>> > > I remember changing this assertion in response to an issue I'd found > >>> > > which led to rewriting the rescan implementation, but I must have > >>> > > missed updating the comment. > >>> > > >>> > All right, here are the most relevant messages: > >>> > > >>> > [1]: Here I'd said: > >>> > ---------- > >>> > While working on finding a test case to show rescan isn't implemented > >>> > properly yet, I came across a bug. At the top of > >>> > ExecInitIncrementalSort, we assert that eflags does not contain > >>> > EXEC_FLAG_REWIND. But the following query (with merge and hash joins > >>> > disabled) breaks that assertion: > >>> > > >>> > select * from t join (select * from t order by a, b) s on s.a = t.a > >>> > where t.a in (1,2); > >>> > > >>> > The comments about this flag in src/include/executor/executor.h say: > >>> > > >>> > * REWIND indicates that the plan node should try to efficiently > >>> support > >>> > * rescans without parameter changes. (Nodes must support ExecReScan > >>> calls > >>> > * in any case, but if this flag was not given, they are at liberty > >>> to do it > >>> > * through complete recalculation. Note that a parameter change > >>> forces a > >>> > * full recalculation in any case.) > >>> > > >>> > Now we know that except in rare cases (as just discussed recently up > >>> > thread) we can't implement rescan efficiently. > >>> > > >>> > So is this a planner bug (i.e., should we try not to generate > >>> > incremental sort plans that require efficient rewind)? Or can we just > >>> > remove that part of the assertion and know that we'll implement the > >>> > rescan, albeit inefficiently? We already explicitly declare that we > >>> > don't support backwards scanning, but I don't see a way to declare the > >>> > same for rewind. > >>> > ---------- > >>> > > >>> > So it seems to me that we can't disallow REWIND, and we have to > >>> > support rescan, but, we could try to mitigate the effects (without a > >>> > param change) with a materialize node, as noted below. > >>> > > >>> > [2]: Here, in response to my questioning above if this was a planner > >>> > bug, I'd said: > >>> > ---------- > >>> > Other nodes seem to get a materialization node placed above them to > >>> > support this case "better". Is that something we should be doing? > >>> > ---------- > >>> > > >>> > I never got any reply on this point; if we _did_ introduce a > >>> > materialize node here, then it would mean we could start disallowing > >>> > REWIND again. See the email for full details of a specific plan that I > >>> > encountered that reproduced this. > >>> > > >>> > Thoughts? > >>> > > >>> > > In the meantime, your question is primarily about making sure the > >>> > > code/comments/etc. are consistent and not a behavioral problem or > >>> > > failure you've seen in testing? > >>> > > >>> > Still want to confirm this is the case. > >>> > > >>> > James > >>> > > >>> > [1]: > >>> https://www.postgresql.org/message-id/CAAaqYe9%2Bap2SbU_E2WaC4F9ZMF4oa%3DpJZ1NBwaKDMP6GFUA77g%40mail.gmail.com > >>> > >>> > [2]: > >>> https://www.postgresql.org/message-id/CAAaqYe-sOp2o%3DL7nvGZDJ6GsL9%3Db_ztrGE1rhyi%2BF82p3my2bQ%40mail.gmail.com > >>> > >>> > >>> Looking at this more, I think this is definitely suspect. The current > >>> code shields lower nodes from EXEC_FLAG_BACKWARD and EXEC_FLAG_MARK -- > >>> the former is definitely fine because we declare that we don't support > >>> backwards scans. The latter seems like the same reasoning would apply, > >>> but unfortunately we didn't add it to ExecSupportsMarkRestore, so I've > >>> attached a patch to do that. > >>> > >>> The EXEC_FLAG_REWIND situation though I'm still not clear on -- given > >>> the comments/docs seem to suggest it's a hint for efficiency rather > >>> than something that has to work or be declared as not implemented, so > >>> it seems like one of the following should be the outcome: > >>> > >>> 1. "Disallow" it by only generating materialize nodes above the > >>> incremental sort node if REWIND will be required. I'm not sure if this > >>> would mean that incremental sort just wouldn't be useful in that case? > >>> 2. Keep the existing implementation where we basically ignore REWIND > >>> and use our more inefficient implementation. In this case, I believe > >>> we need to stop shielding child nodes from REWIND, though, since we we > >>> aren't actually storing the full result set and will instead be > >>> re-executing the child nodes. > >>> > >>> I've attached a patch to take course (2), since it's the easiest to > >>> implement. But I'd still like feedback on what we should do here, > >>> because I don't feel like I actually know what the semantics expected > >>> of the executor/planner are on this point. If we do go with this > >>> approach, someone should verify my comments additions about > >>> materialize nodes is correct. > >> > >> I also happened to noticed that in rescan we are always setting > >> node->bounded = false. I was under the impression that > >> ExecSetTupleBound would be called *after* ExecReScanIncrementalSort, > >> but looking at both ExecSetTupleBound and ExecReScanSort, but it seems > >> that the inverse is true. Therefore if we set this to false each time, > >> then we lose any possibility of using the bounded optimization for all > >> rescans. > >> > >> I've added a tiny patch (minus one line) to the earlier patch series > >> to fix that. > >> > > > > Thanks. I'll take a look at the issue and fixes. > > With Beta 1 just around the corner[1], I wanted to check in to see if > this was closer to being committed so we could close off the open > item[2] prior the beta release. > > Thanks! > > Jonathan > > [1] > https://www.postgresql.org/message-id/b782a4ec-5e8e-21a7-f628-624be683e6d6@postgresql.org > [2] https://wiki.postgresql.org/wiki/PostgreSQL_13_Open_Items Tangential, I know, but I think we should consider [1] (includes a very minor patch) part of the open items on incremental sort also: James [1]: https://www.postgresql.org/message-id/CAEudQAqxf+mbirkO7pAdL61Qw8U8cF_QnEaL101L0tbBUocoQg@mail.gmail.com
pgsql-hackers by date: