Re: Incremental sorts and EXEC_FLAG_REWIND - Mailing list pgsql-hackers

From James Coleman
Subject Re: Incremental sorts and EXEC_FLAG_REWIND
Date
Msg-id CAAaqYe8V4r+AWzH=Rnwp5ByCJA4=vVN_xVmdbFHNQ-xEBKUp-A@mail.gmail.com
Whole thread Raw
In response to Re: Incremental sorts and EXEC_FLAG_REWIND  ("Jonathan S. Katz" <jkatz@postgresql.org>)
Responses Re: Incremental sorts and EXEC_FLAG_REWIND
List pgsql-hackers
On Thu, May 7, 2020 at 2:57 PM Jonathan S. Katz <jkatz@postgresql.org> wrote:
>
> On 4/24/20 6:57 PM, Tomas Vondra wrote:
> > On Fri, Apr 24, 2020 at 04:35:02PM -0400, James Coleman wrote:
> >> On Sun, Apr 19, 2020 at 12:14 PM James Coleman <jtc331@gmail.com> wrote:
> >>>
> >>> On Wed, Apr 15, 2020 at 2:04 PM James Coleman <jtc331@gmail.com> wrote:
> >>> >
> >>> > On Wed, Apr 15, 2020 at 11:02 AM James Coleman <jtc331@gmail.com>
> >>> wrote:
> >>> > >
> >>> > > On Tue, Apr 14, 2020 at 2:53 AM Michael Paquier
> >>> <michael@paquier.xyz> wrote:
> >>> > > >
> >>> > > > Hi,
> >>> > > >
> >>> > > > When initializing an incremental sort node, we have the
> >>> following as
> >>> > > > of ExecInitIncrementalSort():
> >>> > > >     /*
> >>> > > >      * Incremental sort can't be used with either
> >>> EXEC_FLAG_REWIND,
> >>> > > >      * EXEC_FLAG_BACKWARD or EXEC_FLAG_MARK, because we only
> >>> one of many sort
> >>> > > >      * batches in the current sort state.
> >>> > > >      */
> >>> > > >      Assert((eflags & (EXEC_FLAG_BACKWARD |
> >>> > > >                        EXEC_FLAG_MARK)) == 0);
> >>> > > > While I don't quite follow why EXEC_FLAG_REWIND should be
> >>> allowed here
> >>> > > > to begin with (because incremental sorts don't support rescans
> >>> without
> >>> > > > parameter changes, right?), the comment and the assertion are
> >>> telling
> >>> > > > a different story.
> >>> > >
> >>> > > I remember changing this assertion in response to an issue I'd found
> >>> > > which led to rewriting the rescan implementation, but I must have
> >>> > > missed updating the comment.
> >>> >
> >>> > All right, here are the most relevant messages:
> >>> >
> >>> > [1]: Here I'd said:
> >>> > ----------
> >>> > While working on finding a test case to show rescan isn't implemented
> >>> > properly yet, I came across a bug. At the top of
> >>> > ExecInitIncrementalSort, we assert that eflags does not contain
> >>> > EXEC_FLAG_REWIND. But the following query (with merge and hash joins
> >>> > disabled) breaks that assertion:
> >>> >
> >>> > select * from t join (select * from t order by a, b) s on s.a = t.a
> >>> > where t.a in (1,2);
> >>> >
> >>> > The comments about this flag in src/include/executor/executor.h say:
> >>> >
> >>> > * REWIND indicates that the plan node should try to efficiently
> >>> support
> >>> > * rescans without parameter changes. (Nodes must support ExecReScan
> >>> calls
> >>> > * in any case, but if this flag was not given, they are at liberty
> >>> to do it
> >>> > * through complete recalculation. Note that a parameter change
> >>> forces a
> >>> > * full recalculation in any case.)
> >>> >
> >>> > Now we know that except in rare cases (as just discussed recently up
> >>> > thread) we can't implement rescan efficiently.
> >>> >
> >>> > So is this a planner bug (i.e., should we try not to generate
> >>> > incremental sort plans that require efficient rewind)? Or can we just
> >>> > remove that part of the assertion and know that we'll implement the
> >>> > rescan, albeit inefficiently? We already explicitly declare that we
> >>> > don't support backwards scanning, but I don't see a way to declare the
> >>> > same for rewind.
> >>> > ----------
> >>> >
> >>> > So it seems to me that we can't disallow REWIND, and we have to
> >>> > support rescan, but, we could try to mitigate the effects (without a
> >>> > param change) with a materialize node, as noted below.
> >>> >
> >>> > [2]: Here, in response to my questioning above if this was a planner
> >>> > bug, I'd said:
> >>> > ----------
> >>> > Other nodes seem to get a materialization node placed above them to
> >>> > support this case "better". Is that something we should be doing?
> >>> > ----------
> >>> >
> >>> > I never got any reply on this point; if we _did_ introduce a
> >>> > materialize node here, then it would mean we could start disallowing
> >>> > REWIND again. See the email for full details of a specific plan that I
> >>> > encountered that reproduced this.
> >>> >
> >>> > Thoughts?
> >>> >
> >>> > > In the meantime, your question is primarily about making sure the
> >>> > > code/comments/etc. are consistent and not a behavioral problem or
> >>> > > failure you've seen in testing?
> >>> >
> >>> > Still want to confirm this is the case.
> >>> >
> >>> > James
> >>> >
> >>> > [1]:
> >>> https://www.postgresql.org/message-id/CAAaqYe9%2Bap2SbU_E2WaC4F9ZMF4oa%3DpJZ1NBwaKDMP6GFUA77g%40mail.gmail.com
> >>>
> >>> > [2]:
> >>> https://www.postgresql.org/message-id/CAAaqYe-sOp2o%3DL7nvGZDJ6GsL9%3Db_ztrGE1rhyi%2BF82p3my2bQ%40mail.gmail.com
> >>>
> >>>
> >>> Looking at this more, I think this is definitely suspect. The current
> >>> code shields lower nodes from EXEC_FLAG_BACKWARD and EXEC_FLAG_MARK --
> >>> the former is definitely fine because we declare that we don't support
> >>> backwards scans. The latter seems like the same reasoning would apply,
> >>> but unfortunately we didn't add it to ExecSupportsMarkRestore, so I've
> >>> attached a patch to do that.
> >>>
> >>> The EXEC_FLAG_REWIND situation though I'm still not clear on -- given
> >>> the comments/docs seem to suggest it's a hint for efficiency rather
> >>> than something that has to work or be declared as not implemented, so
> >>> it seems like one of the following should be the outcome:
> >>>
> >>> 1. "Disallow" it by only generating materialize nodes above the
> >>> incremental sort node if REWIND will be required. I'm not sure if this
> >>> would mean that incremental sort just wouldn't be useful in that case?
> >>> 2. Keep the existing implementation where we basically ignore REWIND
> >>> and use our more inefficient implementation. In this case, I believe
> >>> we need to stop shielding child nodes from REWIND, though, since we we
> >>> aren't actually storing the full result set and will instead be
> >>> re-executing the child nodes.
> >>>
> >>> I've attached a patch to take course (2), since it's the easiest to
> >>> implement. But I'd still like feedback on what we should do here,
> >>> because I don't feel like I actually know what the semantics expected
> >>> of the executor/planner are on this point. If we do go with this
> >>> approach, someone should verify my comments additions about
> >>> materialize nodes is correct.
> >>
> >> I also happened to noticed that in rescan we are always setting
> >> node->bounded = false. I was under the impression that
> >> ExecSetTupleBound would be called *after* ExecReScanIncrementalSort,
> >> but looking at both ExecSetTupleBound and ExecReScanSort, but it seems
> >> that the inverse is true. Therefore if we set this to false each time,
> >> then we lose any possibility of using the bounded optimization for all
> >> rescans.
> >>
> >> I've added a tiny patch (minus one line) to the earlier patch series
> >> to fix that.
> >>
> >
> > Thanks. I'll take a look at the issue and fixes.
>
> With Beta 1 just around the corner[1], I wanted to check in to see if
> this was closer to being committed so we could close off the open
> item[2] prior the beta release.
>
> Thanks!
>
> Jonathan
>
> [1]
> https://www.postgresql.org/message-id/b782a4ec-5e8e-21a7-f628-624be683e6d6@postgresql.org
> [2] https://wiki.postgresql.org/wiki/PostgreSQL_13_Open_Items


Tangential, I know, but I think we should consider [1] (includes a
very minor patch) part of the open items on incremental sort also:

James

[1]: https://www.postgresql.org/message-id/CAEudQAqxf+mbirkO7pAdL61Qw8U8cF_QnEaL101L0tbBUocoQg@mail.gmail.com



pgsql-hackers by date:

Previous
From: "Jonathan S. Katz"
Date:
Subject: Re: Incremental sorts and EXEC_FLAG_REWIND
Next
From: Tomas Vondra
Date:
Subject: Re: Incremental sorts and EXEC_FLAG_REWIND