Re: weird behaviour on DISTINCT ON - Mailing list pgsql-hackers

From Gaetano Mendola
Subject Re: weird behaviour on DISTINCT ON
Date
Msg-id ctlup3$hd8$1@floppy.pyrenet.fr
Whole thread Raw
In response to Re: weird behaviour on DISTINCT ON  (Greg Stark <gsstark@mit.edu>)
Responses Re: weird behaviour on DISTINCT ON
List pgsql-hackers
Greg Stark wrote:
> Gaetano Mendola <mendola@bigfoot.com> writes:
> 
> 
>>now what do I see is that for each different x value
>>the foo is executed more than once, I guess this is because
>>the distinct filter out the rows after executing the query.
>>
>>Is this behaviour the normal one? Shall be not documented ?
> 
> 
> Usually DISTINCT ON is only really useful if you're sorting on something.
> Otherwise the choice of which record is output is completely arbitrary.
> 
> So the above would typically be something like:
> 
> SELECT DISTINCT ON (x), y, foo(x)
>  ...
>  ORDER BY x, y
> 
> Now you can see why every record does need to be looked at to handle that.
> In fact the ORDER BY kicks in before output columns are generated so you can
> do things like:
> 
> SELECT DISTINCT ON (x), y, foo(x)
>  ...
>  ORDER BY x, y, foo(x)
> 
> And of course obviously foo() has to be executed for every record to do this.
> 
> Postgres doesn't try to detect cases where it's safe to change the regular
> order in which things are done and delay execution of functions whose results
> aren't needed right away. 
> 
> You could just use
> 
> SELECT x, foo(x) from (SELECT x ... GROUP BY x)
> 

I totally agree and this was clear after having seen what was happening,
my warning was due the fact that in the docs is written nowhere this
drawback. A novice, like I was 4 years ago ( I discover it in the code 
only today ), can burn it self.


Regards
Gaetano Mendola















pgsql-hackers by date:

Previous
From: Greg Stark
Date:
Subject: Re: weird behaviour on DISTINCT ON
Next
From: "Merlin Moncure"
Date:
Subject: Re: Allow GRANT/REVOKE permissions to be applied to all schema objects with one command