Re: Aggregate ORDER BY patch - Mailing list pgsql-hackers

From Andrew Gierth
Subject Re: Aggregate ORDER BY patch
Date
Msg-id 87r5qwrood.fsf@news-spur.riddles.org.uk
Whole thread Raw
In response to Re: Aggregate ORDER BY patch  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Aggregate ORDER BY patch  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
Tom> Andrew Gierth <andrew@tao11.riddles.org.uk> writes:>> Updated version of the aggregate order by patch.
Tom> Applied with some editorialization.  The main change I made wasTom> to get rid of all the ad-hoc DISTINCT handling
inparse_agg.cTom> and use transformDistinctClause() instead.
 

I'll review that; I avoided that code intentionally because the
semantics of query-level DISTINCT are different enough.
Tom> This exposed what I believe to be a bug in the submitted patch:Tom> it accepted cases like
Tom>     agg(DISTINCT x ORDER BY x,y)

This is not a bug, it was done intentionally (as you might have
guessed from the fact that there was a regression test for it). The
additional ORDER BY column in this case is always safe (since DISTINCT
adopts the equality operator from the sort, it's not possible for
additional sort columns to break the DISTINCT). I allowed the case
since there was therefore no good reason to forbid it.

There is at least one case where this makes a visible difference in
query output: if the aggregate can distinguish values of x which are
considered equal by the sort operator used, then the value of y
affects which value of x is seen. It is probably relatively easy to
generate examples of this using the box type and array_agg.
Tom> We do not allow that in ordinary query-level DISTINCT

Note that ordinary query-level DISTINCT has the reverse semantics; the
DISTINCT operation is (per spec) logically prior to the order by, the
fact that they are planned in the reverse order is an implementation
detail.

Query-level DISTINCT shouldn't allow columns in the order by that
aren't in the select list because those columns _do not exist_ at the
point that ordering logically takes place (even though in the
implementation, they might).

This isn't the case for aggregate order by.

-- 
Andrew (irc:RhodiumToad)


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Closing out CommitFest 2009-11
Next
From: Tom Lane
Date:
Subject: Re: tsearch parser inefficiency if text includes urls or emails - new version