Re: MAX/MIN optimization via rewrite (plus query rewrites generally) - Mailing list pgsql-hackers

From	Greg Stark
Subject	Re: MAX/MIN optimization via rewrite (plus query rewrites generally)
Date	November 12, 2004 01:34:52
Msg-id	87r7n0c8vw.fsf@stark.xeocode.com
In response to	Re: MAX/MIN optimization via rewrite (plus query rewrites generally) (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: MAX/MIN optimization via rewrite (plus query rewrites generally) (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

Tom Lane <tgl@sss.pgh.pa.us> writes:

> Greg Stark <gsstark@mit.edu> writes:
> > It would also make it possible to deprecate DISTINCT ON in favour of GROUP BY
> > with first() calls.
> 
> Oh?  How is a first() aggregate going to know what sort order you want
> within the group?  AFAICS first() is only useful when you honestly do
> not care which group member you get ... which is certainly not the case
> for applications of DISTINCT ON.

It would look something like

select x,first(a),first(b) from (select x,a,b from table order by x,y) as s group by x

which is equivalent to

select DISTINCT ON (x) x,a,b from table ORDER BY x,y

The group by can see that the subquery is already sorted by x and doesn't need
to be resorted. In fact I believe you added the smarts to detect that
condition in response to a user asking about precisely this type of scenario.

This is actually more general than DISTINCT ON since DISTINCT ON is basically
a degenerate case of the above where the _only_ aggregate allowed is first().
The more general case could have first() as well as other aggregates, though
obviously they would make it unlikely that any optimizations would be
applicable.
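
To make the comparison concrete, here is a minimal sketch of how such a
first() aggregate could be defined and used in PostgreSQL. Nothing here is
built in: the names first_sfunc, first, and the sample table t are
assumptions made up for illustration. The sketch also relies on the aggregate
seeing rows in the subquery's order, which the planner tends to preserve but
SQL does not guarantee.

```sql
-- Hypothetical transition function: keep the state, ignore each new value.
-- Because it is STRICT and the aggregate has no initial condition, the first
-- non-null input becomes the state, so this first() skips NULLs.
CREATE FUNCTION first_sfunc(anyelement, anyelement)
RETURNS anyelement
LANGUAGE sql IMMUTABLE STRICT
AS 'SELECT $1';

-- Hypothetical user-defined aggregate (not a built-in).
CREATE AGGREGATE first(anyelement) (
    SFUNC = first_sfunc,
    STYPE = anyelement
);

-- Hypothetical sample data: two groups of x, each with two rows.
CREATE TABLE t (x int, y int, a text, b text);
INSERT INTO t VALUES
    (1, 2, 'a12', 'b12'),
    (1, 1, 'a11', 'b11'),
    (2, 5, 'a25', 'b25'),
    (2, 3, 'a23', 'b23');

-- GROUP BY form: first() picks from the row with the lowest y in each group,
-- and other aggregates can sit alongside it.
SELECT x, first(a), first(b), count(*), max(y)
FROM (SELECT x, y, a, b FROM t ORDER BY x, y) AS s
GROUP BY x;

-- DISTINCT ON form: same choice of row, but no other aggregates allowed.
SELECT DISTINCT ON (x) x, a, b
FROM t
ORDER BY x, y;
```

On PostgreSQL 9.0 and later, the ordering can be stated inside the aggregate
call instead, as in first(a ORDER BY y), which removes the dependence on the
subquery's row order. Also note that, unlike DISTINCT ON, this first() would
skip a NULL in the leading row and return the next non-null value.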

I do kind of like the DISTINCT ON syntax, but because it can't be combined
with any other aggregate functions, I often end up converting queries I
originally wrote with it to the more general GROUP BY and first() form.

-- 
greg
