Re: sqlsmith crash incremental sort - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: sqlsmith crash incremental sort |
Date | |
Msg-id | 20200416184416.vcmmy22zfmov54kl@development Whole thread Raw |
In response to | Re: sqlsmith crash incremental sort (James Coleman <jtc331@gmail.com>) |
Responses |
Re: sqlsmith crash incremental sort
Re: sqlsmith crash incremental sort |
List | pgsql-hackers |
On Thu, Apr 16, 2020 at 12:04:03PM -0400, James Coleman wrote: >On Thu, Apr 16, 2020 at 8:22 AM Richard Guo <guofenglinux@gmail.com> wrote: >> >> >> On Thu, Apr 16, 2020 at 6:35 PM Richard Guo <guofenglinux@gmail.com> wrote: >>> >>> >>> On Wed, Apr 15, 2020 at 10:47 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: >>>> >>>> >>>> Well, yeah. The problem is the Unique simply compares the columns in the >>>> order it sees them, and it does not match the column order desired by >>>> incremental sort. But we don't push down this information at all :-( >>> >>> >>> This is a nice optimization better to have. Since the 'Sort and Unique' >>> would unique-ify the result of a UNION by sorting on all columns, why >>> not we adjust the sort order trying to match parse->sortClause so that >>> we can avoid the final sort node? >>> >>> Doing that we can transform plan from: >>> >>> # explain (costs off) select * from foo union select * from foo order by 1,3; >>> QUERY PLAN >>> ----------------------------------------------- >>> Incremental Sort >>> Sort Key: foo.a, foo.c >>> Presorted Key: foo.a >>> -> Unique >>> -> Sort >>> Sort Key: foo.a, foo.b, foo.c >>> -> Append >>> -> Seq Scan on foo >>> -> Seq Scan on foo foo_1 >>> (9 rows) >>> >>> To: >>> >>> # explain (costs off) select * from foo union select * from foo order by 1,3; >>> QUERY PLAN >>> ----------------------------------------- >>> Unique >>> -> Sort >>> Sort Key: foo.a, foo.c, foo.b >>> -> Append >>> -> Seq Scan on foo >>> -> Seq Scan on foo foo_1 >>> (6 rows) >>> >> >> Attached is what I'm thinking about this optimization. Does it make any >> sense? > >Shouldn't this go one either a new thread or on the thread for the >patch Tomas was referencing (by Teodor I believe)? > FWIW the optimization I had in mind is this: https://commitfest.postgresql.org/21/1651/ I now realize that was about GROUP BY, but it's not all that different and the concerns will / should be fairly similar, I think. IMO simply tweaking the sort keys to match the upper parts of the plan is probably way too simplistic, I'm afraid. For example, if the Unique significantly reduces cardinality, then the cost of the additional sort is much less important. It may be much better to optimize the "large" sort of the whole data set, either by reordering the columns as proposed by Teodor in his patch (by number of distinct values and/or cost of the comparison function function). Furthermore, this is one of the places that is not using incremental sort yet - I can easily imagine doing something like this: Sort -> Unique -> Incremenal Sort -> ... could be a massive win. So I think we can't just rejigger the sort keys abitrarily, we should / need to consider those alternatives. >Or are you saying you believe this patch guarantees we never see this >problem in incremental sort costing? > Yeah, that's not entirely close to me. But maybe it shows us where we to get the unprocessed target list? regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: