Re: [PATCH] Incremental sort (was: PoC: Partial sort) - Mailing list pgsql-hackers

From James Coleman
Subject Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Date
Msg-id CAAaqYe829tk-mQ_ePn1zZqKTcocd9LVDnv8mxZ1UufNq034qhA@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Incremental sort (was: PoC: Partial sort)  (Justin Pryzby <pryzby@telsasoft.com>)
Responses Re: [PATCH] Incremental sort (was: PoC: Partial sort)  (Justin Pryzby <pryzby@telsasoft.com>)
List pgsql-hackers
On Sat, Apr 18, 2020 at 10:36 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> On Tue, Apr 07, 2020 at 10:53:05AM -0500, Justin Pryzby wrote:
> > On Tue, Apr 07, 2020 at 08:40:30AM -0400, James Coleman wrote:
> > > > And, should it use two spaces before "Sort Method", "Memory" and "Pre-sorted
> > ...
> > > I read through that subthread, and the ending seemed to be Peter
> > > wanting things to be unified. Was there a conclusion beyond that?
> >
> > This discussion is ongoing.  I think let's wait until that's settled before
> > addressing this more complex and even newer case.  We can add "explain, two
> > spaces and equals vs colon" to the "Open items" list if need be - I hope the
> > discussion will not delay the release.
>
> The change proposed on the WAL thread is minimal, and makes new explain(WAL)
> output consistent with the that of explain(BUFFERS).
>
> That uses a different format from "Sort", which is what incremental sort should
> follow.  (Hashjoin also uses the Sort's format of two-spaces and colons rather
> than equals).

I think it's not great that buffers/sort are different, but I agree
that we should follow sort.

> So the attached 0001 makes explain output for incremental sort more consistent
> with sort:
>
>      - Two spaces;
>      - colons rather than equals;
>      - Don't use semicolon, which isn't in use anywhere else;
>
> I tested with this:
> template1=# DROP TABLE t; CREATE TABLE t(i int, j int); INSERT INTO t SELECT a-(a%100), a%1000 FROM
generate_series(1,99999)a;CREATE INDEX ON t(i); VACUUM VERBOSE ANALYZE t;
 
> template1=# explain analyze SELECT * FROM t a ORDER BY i,j;
> ...
>    Full-sort Groups: 1000  Sort Method: quicksort  Average Memory: 28kB  Peak Memory: 28kB  Pre-sorted Groups: 1000
SortMethod: quicksort  Average Memory: 30kB  Peak Memory: 30kB
 
>
> On Tue, Apr 07, 2020 at 05:34:15PM +0200, Tomas Vondra wrote:
> > On Tue, Apr 07, 2020 at 08:40:30AM -0400, James Coleman wrote:
> > > On Tue, Apr 7, 2020 at 12:25 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
> > > > Should "Pre-sorted Groups:" be on a separate line ?
> > > > | Full-sort Groups: 1 Sort Method: quicksort Memory: avg=28kB peak=28kB Pre-sorted Groups: 1 Sort Method:
quicksortMemory: avg=30kB peak=30kB
 
> > >
> > > I'd originally had that, but Tomas wanted it to be more compact. It's
> > > easy to adjust though if the consensus changes on that.
> >
> > I'm OK with changing the format if there's a consensus. The current
> > format seemed better to me, but I'm not particularly attached to it.
>
> I still think Pre-sorted groups should be on a separate line, as in 0002.
> In addition to looking better (to me), and being easier to read, another reason
> is that there are essentially key=>values here, but the keys are repeated (Sort
> Method, etc).

I collapsed this into 0001 because I think that if we're going to do
away with the key=value style then we effectively to have to do this
to avoid the repeated values being confusing (with key=value it kinda
worked, because that made it seem like the avg/peak were clearly a
subset of the Sort Groups info).

I also cleaned up the newline patch a bit in the process (we already
have a way to indent with a flag so don't need to do it directly).

> I also suggested to rename: s/Presorted/Pre-sorted/, but I didn't do that here.

I went ahead and did that too; we already use "Full-sort", so the
proposed change makes both parallel.

> Michael already patched most of the comment typos, the remainder I'm including
> here as a "nearby patch"..

Modified slightly.

James

Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: HEAPDEBUGALL is broken
Next
From: Julien Rouhaud
Date:
Subject: Re: WAL usage calculation patch