Home > mailing lists

Re: Parsing of aggregate ORDER BY clauses - Mailing list pgsql-hackers

From	Daniel Grace
Subject	Re: Parsing of aggregate ORDER BY clauses
Date	July 20, 2010 18:00:37
Msg-id	AANLkTikYAD8995rqcr349IyN9KUpw4gXnG1UYQY3b5Ay@mail.gmail.com Whole thread Raw
In response to	Re: Parsing of aggregate ORDER BY clauses (Hitoshi Harada <umi.tanuki@gmail.com>)
Responses	Re: Parsing of aggregate ORDER BY clauses
List	pgsql-hackers

Tree view

On Mon, Jul 19, 2010 at 4:08 PM, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
>
> 2010/7/19 Tom Lane <tgl@sss.pgh.pa.us>:
> > I looked into the problem reported here:
> > http://archives.postgresql.org/pgsql-bugs/2010-07/msg00119.php
> >

[...]
>
> >
> > 2. Split the processing of aggregates with ORDER BY/DISTINCT so that the
> > sorting/uniqueifying is done in a separate expression node that can work
> > with the "native" types of the given columns, and only after that do we
> > perform coercion to the aggregate function's input types.  This would be
> > logically the cleanest thing, perhaps, but it'd represent a very major
> > rework of the patch, with really no hope of getting it done for 9.0.

[...]
>
> > #3 seems the sanest fix, but I wonder if anyone has an objection or
> > better idea.
>
> I didn't look at the code yet, #2 sounds like the way to go. But I see
> the breakage is unacceptable for 9.0, so #3 is the choice for 9.0 and
> will we fix it as #2 for 9.1 or later?

I'm the original reporter of the mentioned bug.

One possible concern might be typecasts that aren't a 1:1
representation.  While no two VARCHARs are going to produce the same
TEXT, this is not true in other cases (1.1::float::integer and
1.2::float::integer both produce 1, for instance).

Off the top of my head, I can't think of a good example where this
would cause a problem -- it'd be easy enough to manufacture a possible
test case, but it'd be so contrived and I don't know if it's something
that would be seen in production code.  But if we SELECT
SOME_INTEGER_AGGREGATE(DISTINCT floatcol ORDER BY floatcol), should
the DISTINCT operate on floatcol (i.e. 1.1 and 1.2 are distinct, even
if it means the function is called with '1' twice) or
floatcol::integer (1.1 and 1.2 are not distinct)?

I'm guessing the former, even if it means the function is called
multiple times with the same final (after typecasting) input value.
The latter would only be correct if the user specifically wrote it as
DISTINCT floatval::INTEGER.

--
Daniel Grace

pgsql-hackers by date:

From: "Kevin Grittner"
Date: 20 July 2010, 17:51:52
Subject: Re: managing git disk space usage

From: Simon Riggs
Date: 20 July 2010, 18:12:01
Subject: SAVEPOINTs and COMMIT performance

Re: Parsing of aggregate ORDER BY clauses - Mailing list pgsql-hackers

Previous

Next