Re: sqlsmith crash incremental sort - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: sqlsmith crash incremental sort |
Date | |
Msg-id | 20200416231300.5mkhjzs6ki4yntto@development Whole thread Raw |
In response to | Re: sqlsmith crash incremental sort (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Responses |
Re: sqlsmith crash incremental sort
|
List | pgsql-hackers |
On Thu, Apr 16, 2020 at 08:44:16PM +0200, Tomas Vondra wrote: >On Thu, Apr 16, 2020 at 12:04:03PM -0400, James Coleman wrote: >>On Thu, Apr 16, 2020 at 8:22 AM Richard Guo <guofenglinux@gmail.com> wrote: >>> >>> >>>On Thu, Apr 16, 2020 at 6:35 PM Richard Guo <guofenglinux@gmail.com> wrote: >>>> >>>> >>>>On Wed, Apr 15, 2020 at 10:47 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: >>>>> >>>>> >>>>>Well, yeah. The problem is the Unique simply compares the columns in the >>>>>order it sees them, and it does not match the column order desired by >>>>>incremental sort. But we don't push down this information at all :-( >>>> >>>> >>>>This is a nice optimization better to have. Since the 'Sort and Unique' >>>>would unique-ify the result of a UNION by sorting on all columns, why >>>>not we adjust the sort order trying to match parse->sortClause so that >>>>we can avoid the final sort node? >>>> >>>>Doing that we can transform plan from: >>>> >>>># explain (costs off) select * from foo union select * from foo order by 1,3; >>>> QUERY PLAN >>>>----------------------------------------------- >>>> Incremental Sort >>>> Sort Key: foo.a, foo.c >>>> Presorted Key: foo.a >>>> -> Unique >>>> -> Sort >>>> Sort Key: foo.a, foo.b, foo.c >>>> -> Append >>>> -> Seq Scan on foo >>>> -> Seq Scan on foo foo_1 >>>>(9 rows) >>>> >>>>To: >>>> >>>># explain (costs off) select * from foo union select * from foo order by 1,3; >>>> QUERY PLAN >>>>----------------------------------------- >>>> Unique >>>> -> Sort >>>> Sort Key: foo.a, foo.c, foo.b >>>> -> Append >>>> -> Seq Scan on foo >>>> -> Seq Scan on foo foo_1 >>>>(6 rows) >>>> >>> >>>Attached is what I'm thinking about this optimization. Does it make any >>>sense? >> >>Shouldn't this go one either a new thread or on the thread for the >>patch Tomas was referencing (by Teodor I believe)? >> > >FWIW the optimization I had in mind is this: > > https://commitfest.postgresql.org/21/1651/ > >I now realize that was about GROUP BY, but it's not all that different >and the concerns will / should be fairly similar, I think. > >IMO simply tweaking the sort keys to match the upper parts of the plan >is probably way too simplistic, I'm afraid. For example, if the Unique >significantly reduces cardinality, then the cost of the additional sort >is much less important. It may be much better to optimize the "large" >sort of the whole data set, either by reordering the columns as proposed >by Teodor in his patch (by number of distinct values and/or cost of the >comparison function function). > >Furthermore, this is one of the places that is not using incremental >sort yet - I can easily imagine doing something like this: > > > Sort > -> Unique > -> Incremenal Sort > -> ... > >could be a massive win. So I think we can't just rejigger the sort keys >abitrarily, we should / need to consider those alternatives. > >>Or are you saying you believe this patch guarantees we never see this >>problem in incremental sort costing? >> > >Yeah, that's not entirely close to me. But maybe it shows us where we to >get the unprocessed target list? > I think at the very least this needs to apply the same change also to generate_nonunion_paths, because otherwise this fails because of the same issue: set enable_hashagg = off; explain select * from foo except select * from foo order by 1, 3; I'm still of the opinion that this is really an optimization and behavior change, and I feel rather uneasy about pushing it post feature freeze without appropriate review. I also think we really ought to consider how would this work with the other optimizations I outlined elsewhere in this thread (comparison costs, ...). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: