Re: Using ORDER BY with AGGREGATE/GROUP BY in a SELECT statement - Mailing list pgsql-sql

From David D. Kilzer
Subject Re: Using ORDER BY with AGGREGATE/GROUP BY in a SELECT statement
Date
Msg-id 20010507131227.D12724@elbonia.lubricants-oil.com
Whole thread Raw
In response to Using ORDER BY with AGGREGATE/GROUP BY in a SELECT statement  ("David D. Kilzer" <ddkilzer@lubricants-oil.com>)
List pgsql-sql
Okay, I found a workaround for PostgreSQL 7.0.3 for the specific query I
was working on below, but I'd still like to know if an ORDER BY clause
will ever be allowed with an aggregated SELECT statement, and whether it
would generally be more efficient to use an ORDER BY in the situation
described below instead of a subselect query.

Also, I realized after I sent the original message below that I could 
work around the "no ORDER BY, LIMIT in subselect" limitation in 
PostgreSQL 7.0.x using a subselect within a subselect, but then I'd
just be abusing the database engine, not finding a real-world 
solution.  :^)

Thanks!

Dave


On Mon, May 07, 2001 at 12:27:59PM -0500, David D. Kilzer wrote:

> NOTE: I did a moderate search through the PgSQL mail list archives, 
> but couldn't find an example of a question where both ORDER BY and
> aggregation were used.  In all examples it was possible to use either
> one or the other with a possible subselect.  [Keep reading...I
> discovered how to do this with subselects, but I'm wondering if (1) what
> I'm doing with ORDER BY and aggregation will ever be allowed and (2)
> whether the subselect solution is much more inefficient than being able
> to order a pre-aggregate set.]
> 
> I need to query a table and return aggregated results.  The aggregation
> occurs many ways (I'm joining 7 tables and returning 19 columns
> currently):
> 
>   o Columns in a GROUP BY clause
>   o SUM() functions
>   o COUNT() functions
> 
> However, I wrote a new FUNCTION and a new AGGREGATE to return the *last*
> value of a field during the aggregation process (see below).  This means
> that I would like to use an ORDER BY clause with the SELECT ... GROUP BY
> statement before the results are aggregated.
> 
> DROP FUNCTION "lastint4" (int4, int4);
> CREATE FUNCTION "lastint4" (int4, int4)
>         RETURNS int4
>              AS 'BEGIN RETURN $2; END;'
>        LANGUAGE 'PLPGSQL';
> 
> DROP AGGREGATE lastitem int4;
> CREATE AGGREGATE lastitem (
>           SFUNC1 = lastint4,
>         BASETYPE = int4,
>           STYPE1 = int4);
> 
> A simple (made-up) example:
> 
>   SELECT r.personid           AS personid
>         ,SUM(r.laps)          AS laps
>         ,COUNT(DISTINCT r.id) AS nightsraced
>         ,lastitem(r.carid)    AS carid
>     FROM race r
>    WHERE r.personid = 14
> GROUP BY r.personid
> ORDER BY r.date DESC;
> 
> Does PostgreSQL (or ANSI SQL) allow this?  In the simple tests I ran, I
> would get errors similar to the following:
> 
> ERROR:  Attribute r.carid must be GROUPed or used in an aggregate function
> 
> This seems a bit...obvious because r.carid is already being used in an
> aggregate function.  I'm guessing that I'm running into the internals
> of how PgSQL processes the query.
> 
> Hmm...well I just figured out how I could do this as a subselect, but it
> seems hugely inefficient and would require PostgreSQL 7.1 or later
> (ORDER BY, LIMIT used in a subselect):
> 
>   SELECT r.personid               AS personid
>         ,SUM(r.laps)              AS laps
>         ,COUNT(DISTINCT r.id)     AS nightsraced
>         ,(SELECT r.carid
>             FROM race r
>            WHERE r.personid = 14 
>         ORDER BY r.date DESC
>            LIMIT 1)               AS carid
>     FROM race r
>    WHERE r.personid = 14
> GROUP BY r.personid
> ORDER BY r.date;
> 
> Is a subselect like this really that inefficient (assuming appropriate 
> indexes on r.date and r.personid)?  I would think doing this during
> aggregation would be much more efficient.
> 
> I'm using PostgreSQL 7.0.3 on Debian GNU/Linux 2.2r3
> (postgresql-7.0.3-4) on a Linux 2.2.1x kernel.
> 
> Thanks for any insight you can provide!
> 
> Dave


pgsql-sql by date:

Previous
From: "Sylte"
Date:
Subject: Re: How to store gif or jpeg? Thanks!
Next
From: "Wei Weng"
Date:
Subject: Is this possible?