Home > mailing lists

Re: [PATCH] Incremental sort (was: PoC: Partial sort) - Mailing list pgsql-hackers

From	James Coleman
Subject	Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Date	March 15, 2020 17:35:32
Msg-id	CAAaqYe9BsrW=DRBOd9yW0s2djofXTM9mRpO=LhHrCu4qdGgrVg@mail.gmail.com Whole thread Raw
In response to	Re: [PATCH] Incremental sort (was: PoC: Partial sort) (James Coleman <jtc331@gmail.com>)
List	pgsql-hackers

Tree view

On Sat, Mar 14, 2020 at 10:55 PM James Coleman <jtc331@gmail.com> wrote:
>
> On Fri, Mar 13, 2020 at 1:06 PM James Coleman <jtc331@gmail.com> wrote:
> >
> > On Thu, Mar 12, 2020 at 5:53 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> > >
> > > I gave this a very quick look; I don't claim to understand it or
> > > anything, but I thought these trivial cleanups worthwhile.  The only
> > > non-cosmetic thing is changing order of arguments to the SOn_printf()
> > > calls in 0008; I think they are contrary to what the comment says.
> >
> > Yes, I think you're correct (re: 0008).
> >
> > They all look generally good to me, and are included in the attached
> > patch series.
>
> I just realized something about this (unsure if in Alvaro's or in my
> applying that) broke make check pretty decently (3 test files broken,
> also much slower, and the incremental sort test returns a lot of
> obviously broken results).
>
> I'll take a look tomorrow and hopefully get a fix (probably will reply
> to the more recent subthread's though).

This took a bit of manually just excluding changes until I got a
red/green set because nothing in the patch set looked all incorrect.
But it turns out this change breaks things:

- if (tuplesort_gettupleslot(read_sortstate, ScanDirectionIsForward(dir),
-                                                     false, slot,
NULL) || node->finished)
+ if (node->finished ||
+          tuplesort_gettupleslot(read_sortstate, ScanDirectionIsForward(dir),
+                                                     false, slot, NULL))

I believe what's happening here is that we need the
tuplesort_gettupleslot to set the slot to a NULL tuple (if there
aren't any left) before we return the slot, but that returns false, so
the node->finished check is to ensure that the first time that method
nulls out the slot and returns false we still return the value in
slot.

Since this isn't obvious, I'll add a comment. I think I'm also going
to rename node->finished to be more clear.

In this debugging I also noticed that we don't set node->finished back
to false in rescan, which I assume is a bug, but I don't really
understand a whole lot about rescan. IIRC from some previous
discussions rescan exists for things like cursors, where you can move
back and forth over the result set. Assuming that's the case, do we
need explicit tests for cursors using incremental sort? Is there a
good strategy for how much to do there (since I don't want to
duplicate every non-cursor functional test).

Thanks,
James

pgsql-hackers by date:

From: Fabien COELHO
Date: 15 March 2020, 17:15:02
Subject: Re: pg_ls_tmpdir to show directories and shared filesets (andpg_ls_*)

From: Pavel Stehule
Date: 15 March 2020, 18:23:36
Subject: Re: proposal: new polymorphic types - commontype and commontypearray

Re: [PATCH] Incremental sort (was: PoC: Partial sort) - Mailing list pgsql-hackers

Previous

Next