Home > mailing lists

Re: query_planner() API change - Mailing list pgsql-hackers

From	Ashutosh Bapat
Subject	Re: query_planner() API change
Date	August 5, 2013 10:07:57
Msg-id	CAFjFpRdwoJwZPMm1g1Mw=0A1i8Ec2=TK8jsohMQycXU2_X_5Tw@mail.gmail.com Whole thread Raw
In response to	query_planner() API change (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: query_planner() API change ("Etsuro Fujita" <fujita.etsuro@lab.ntt.co.jp>) Re: query_planner() API change (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

Tree view

On Mon, Aug 5, 2013 at 3:50 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I've been looking at what it would take to do proper cost estimation
for the recently-discussed patch to suppress calculation of unnecessary
ORDER BY expressions.

Can you please mention the subject of the thread? I tried to locate the thread based on this description, but couldn't locate it. Are you referring to the discussion related to aggregation with specified ordering?

A doubt at the end ...

It turns out that knowledge of that would have
to propagate into query_planner(), because the place where we do the cost
comparison between unsorted and presorted paths is in there (planmain.c
lines 390ff in HEAD). As it stands, query_planner() will actually refuse
to return the presorted path to grouping_planner() at all if it thinks
it's a loser cost-wise, meaning grouping_planner() would have no
opportunity to override the decision. So there's no way to fix this
without some API change for query_planner().

While we could complicate query_planner()'s API even more to add some
understanding of unnecessary resjunk items, I think this is probably
the straw that breaks the camel's back for the current approach here.
There is already a comment like this in query_planner():

* This introduces some undesirable coupling between this code and
* grouping_planner, but the alternatives seem even uglier; we couldn't
* pass back completed paths without making these decisions here.

I think it's time to bite the bullet and *not* pass back completed paths.
What's looking more attractive now is to just pass back the top-level
RelOptInfo ("final_rel" in query_planner()). We could remove all three
output parameters of query_planner(), and essentially just move lines
265-420 (pretty much everything after the make_one_rel() call) into
planner.c. Since that code is almost all about grouping-related choices,
this seems like it'll be a net improvement modularity-wise, even though
it'll make grouping_planner() even bigger. We could probably ameliorate
the latter problem by putting the calculation of num_groups and adjustment
of tuple_fraction into a subroutine.

Can we change the query_planner() to return both the paths (presorted and unsorted) irrespective of the cost of presorted path, and let grouping_planner() (or any caller of query_planner()) handle which of them to pick up?

Objections, better ideas?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Best Wishes,
Ashutosh Bapat
EntepriseDB Corporation
The Postgres Database Company

pgsql-hackers by date:

From: Atri Sharma
Date: 05 August 2013, 09:46:28
Subject: Re: query_planner() API change

From: Michael Paquier
Date: 05 August 2013, 10:20:45
Subject: FOR UPDATE/SHARE incompatibility with GROUP BY, DISTINCT, HAVING and window functions

Re: query_planner() API change - Mailing list pgsql-hackers

Previous

Next