Re: MAX/MIN optimization via rewrite (plus query rewrites - Mailing list pgsql-hackers

From Mark Kirkwood
Subject Re: MAX/MIN optimization via rewrite (plus query rewrites
Msg-id 41A4EE64.10601@coretech.co.nz
In response to MAX/MIN optimization via rewrite (plus query rewrites generally)  (Mark Kirkwood <markir@coretech.co.nz>)
List pgsql-hackers
I think a summary of where the discussion went might be helpful 
(especially for me after a week or so away doing Perl).

There were a number of approaches suggested, which I will attempt to 
summarize in a hand-wavy fashion (apologies for any misrepresentation 
caused):

i)   Rewrite max/min queries using ORDER BY in the presence of a suitable
     index.

ii)  Provide alternate (i.e. rewritten) queries for consideration along
     with the original, letting the planner use its costing methods to
     choose as usual.

iii) Provide alternate plans based on presence of certain aggregate types
     in the query, letting the planner use its costing methods to choose
     as usual.

iv)  Create short-cut evaluations for certain aggregates that don't
     actually need to see all the (to-be aggregated) data.

v)   Create a mechanism for defining per-aggregate optimization operators.

Note that some of these ideas may overlap one another to some extent.
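
To make approach i) concrete, here is a minimal sketch of the rewrite,
using Python's sqlite3 module only as a convenient stand-in for a real
server. The table t, column x and index t_x_idx are hypothetical names
chosen for illustration, and nothing here reflects how the planner would
actually do it. It shows the two details the rewrite has to preserve:
max/min ignore NULLs, and they return a single NULL row on empty input.

```python
import sqlite3


def make_demo_db(values):
    """Build an in-memory table t(x) with an index on x (hypothetical names)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.execute("CREATE INDEX t_x_idx ON t (x)")
    conn.executemany("INSERT INTO t (x) VALUES (?)", [(v,) for v in values])
    return conn


def max_original(conn):
    # Original form: the aggregate conceptually looks at every row.
    return conn.execute("SELECT max(x) FROM t").fetchone()[0]


def max_rewritten(conn):
    # Rewritten form: with an index on x this can stop after one row.
    # The IS NOT NULL filter mirrors max(), which ignores NULLs.
    row = conn.execute(
        "SELECT x FROM t WHERE x IS NOT NULL ORDER BY x DESC LIMIT 1"
    ).fetchone()
    # The aggregate yields one NULL row on empty input; the rewrite yields
    # no row at all, so that case is mapped back to None here.
    return None if row is None else row[0]


def min_original(conn):
    return conn.execute("SELECT min(x) FROM t").fetchone()[0]


def min_rewritten(conn):
    row = conn.execute(
        "SELECT x FROM t WHERE x IS NOT NULL ORDER BY x ASC LIMIT 1"
    ).fetchone()
    return None if row is None else row[0]
```

Both forms give the same answer only because of the NULL filter and the
empty-result handling, which hints at why even this simple case needs
some care.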

Some critiques of the various approaches are:

i)   Too simple: the rewrite may not be better than the original, and only
     simple queries can be handled this way. Probably reasonably easy to
     implement.

ii)  Simple queries will be well handled, but very complex transformations
     needed to handle even slightly more complex ones. Probably medium ->
     difficult to implement.

iii) Rules for creating alternate plans will mimic the issues with ii).
     Probably medium -> difficult to implement.

iv)  May need different short cuts for each aggregate -> datatype
     combination. Implies conventional > and < operators, or the existence
     of similar user-definable ones (or a way of finding suitable ones).
     Guessing medium to implement.

v)   Is kind of a generalization of iv). The key areas of difficulty are
     the specification of said optimization operators and the definition
     of an API for constructing/calling them. Guessing difficult to
     implement.
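
As a rough illustration of what a short cut in the sense of iv) could
look like, here is a small Python sketch. It is not executor code; the
function names are hypothetical, and it assumes the input already arrives
in descending order (as a backward scan of a suitable index might deliver
it). Each function also returns how many rows it had to look at.

```python
from typing import Iterable, Optional, Tuple


def full_max(rows: Iterable[Optional[int]]) -> Tuple[Optional[int], int]:
    """Conventional evaluation: every row is fed to the aggregate."""
    best = None
    seen = 0
    for value in rows:
        seen += 1
        # NULLs (None) are ignored, as max() does.
        if value is not None and (best is None or value > best):
            best = value
    return best, seen


def shortcut_max(rows_desc: Iterable[Optional[int]]) -> Tuple[Optional[int], int]:
    """Short-cut evaluation, ASSUMING rows_desc is ordered descending.

    The first non-NULL value is the maximum, so evaluation can stop there.
    """
    seen = 0
    for value in rows_desc:
        seen += 1
        if value is not None:
            return value, seen
    # No non-NULL rows at all: same result as the aggregate on empty input.
    return None, seen
```

The short cut depends on an ordering defined by > and <, which is why a
different one may be needed per aggregate and datatype combination.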

I am leaning towards ii) or iv) as the most promising approaches - what 
do people think?

regards

Mark

