> The really tricky part is that a DISTINCT ON needs to know about a
> first()
> aggregate. And to make optimal use of indexes, a last() aggregate as
> well. And
> ideally the planner/executor needs to know something is magic about
> first()/last() (and potentially min()/max() at some point) and that they
> don't
> need the complete set of tuples to calculate their results.
I'm going to be accused of hand-waving again, but please pardon me, I'm
enthusiastic, and I like to propose new idead, you can kick me if you
don't like them or if I put out too much uninformed bull !
Idea :
The aggregate accumulation function could have a way to say :
"stop ! I've had enough of these values ! Get on with the next item in the
GROUP BY clause !"
I don't know how, or if, the planner could use this (guess: no) or the
index scan use this (guess: no) but it would at least save the function
calls. I'd guess this idea is quite useless.
Aggregates could have an additional attribute saying how much values it
will need ('max_rows' maybe). This would prevent the creation of "magic"
aggregates for max() (which is a kind of special-casing), keep it generic
(so users can create magic aggregates like this).
Aggregates already consist of a bunch of functions (start, accumulate,
return retuls) so this could be just another element in this set.
This information would be known ahead of time and could influence the
query plans too. I'm going to wave my hand and say "not too much planning
cost" because I guess the aggregate details are fetched during planning so
fetching one more attribute would not be that long...
For instance first() would have max_rows=1, and users could code a "first
N accumulator-in-array" which would have max_rows=N...
This does not solve the problem of min() and max() which need max_rows=1
only if the result is sorted... hum... maybe another attribute like
max_rows_sorted = 1 for max() and -1 for min() meaning 'first 1' or 'last
1' (or first N or last N)... according to the "order by" clause it would
be known that the 'first N' of an 'order by ... asc' is the same as the
'last N' from an 'order by ... desc'
???