Greg Stark wrote:
> Gaetano Mendola <mendola@bigfoot.com> writes:
>
>
>>now what do I see is that for each different x value
>>the foo is executed more than once, I guess this is because
>>the distinct filter out the rows after executing the query.
>>
>>Is this behaviour the normal one? Shall be not documented ?
>
>
> Usually DISTINCT ON is only really useful if you're sorting on something.
> Otherwise the choice of which record is output is completely arbitrary.
>
> So the above would typically be something like:
>
> SELECT DISTINCT ON (x), y, foo(x)
> ...
> ORDER BY x, y
>
> Now you can see why every record does need to be looked at to handle that.
> In fact the ORDER BY kicks in before output columns are generated so you can
> do things like:
>
> SELECT DISTINCT ON (x), y, foo(x)
> ...
> ORDER BY x, y, foo(x)
>
> And of course obviously foo() has to be executed for every record to do this.
>
> Postgres doesn't try to detect cases where it's safe to change the regular
> order in which things are done and delay execution of functions whose results
> aren't needed right away.
>
> You could just use
>
> SELECT x, foo(x) from (SELECT x ... GROUP BY x)
>
I totally agree and this was clear after having seen what was happening,
my warning was due the fact that in the docs is written nowhere this
drawback. A novice, like I was 4 years ago ( I discover it in the code
only today ), can burn it self.
Regards
Gaetano Mendola