Thread: Difficult query

Difficult query

From
"Peter Alberer"
Date:
Hi all,

i have been thinking about the following query for some time but cannot
find a good answer.

I have a table called lr_area_stats__user_day that holds info about the
number of exercises a student has worked on in a certain area on a
certain day. The "area" is a short name for a course for example
"business admin" -> "ba". The columns total and succ tell about the
numbers of exercises solved and the number of successful solutions.

create table lr_area_stats__user_day (
    area            varchar(5),
    user_id         integer
                    constraint area_stats_user_fk
                    references persons(person_id)
                    on delete cascade,
    day             date,
    total           integer,
    succ            integer,
    distinct_excs   integer
);

I would like to get the "best" student for the last 14 days per area in
one query, but all I can get is all of the entries per area and student.
My query looks like that:

    select area, user_id, sum(total) as total, sum(succ) as succ
    from lr_area_stats__user_day
    where day >= (now() - '14 day'::interval)::date
    group by area, user_id
    order by area asc, succ desc, total desc

The output has all the info I need but tons of entries I want to filter
out:

vw1   |  258864 |     1 |    0
 vw1   |  258925 |     1 |    0
 wigeo |   15840 |     6 |    0 <--
 wigeo |  251229 |     4 |    0
 wipr  |  147405 |   818 |  776 <-- want to get the top entries per area
 wipr  |  140616 |  1174 |  734
 wipr  |  150895 |   808 |  591
 wipr  |  136125 |   621 |  542
 wipr  |  149796 |   627 |  526

How can I filter the output to only return the top values per area? I
would prefer not to use stored-procedures or client-side code if that is
not necessary.

Many TIA, peter



Re: Difficult query

From
Manfred Koizar
Date:
On Wed, 5 Mar 2003 17:02:42 +0100, "Peter Alberer"
<h9351252@obelix.wu-wien.ac.at> wrote:
>How can I filter the output to only return the top values per area? I
>would prefer not to use stored-procedures or client-side code if that is
>not necessary.

Peter,

DISTINCT ON is your friend.  I never tried to use it together with
GROUP BY;  at least you can use your original query as a sub-select
and wrap the DISTINCT ON query around it.

    SELECT DISTINCT ON (area) area, user_id, total, succ
      FROM (SELECT ...) AS t
     ORDER BY area, succ DESC, total DESC;

Servus
 Manfred

Re: Difficult query

From
"Peter Alberer"
Date:
Thanks a lot Manfred that is exactly what is was searching for!

I tried a few things with distinct and max in a similar select -
sub-select combination but did not know about "distinct on".

peter

>-----Ursprüngliche Nachricht-----
>Von: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
>owner@postgresql.org] Im Auftrag von Manfred Koizar
>Gesendet: Donnerstag, 06. März 2003 11:39
>An: Peter Alberer
>Cc: pgsql-general@postgresql.org
>Betreff: Re: [GENERAL] Difficult query
>
>On Wed, 5 Mar 2003 17:02:42 +0100, "Peter Alberer"
><h9351252@obelix.wu-wien.ac.at> wrote:
>>How can I filter the output to only return the top values per area? I
>>would prefer not to use stored-procedures or client-side code if that
is
>>not necessary.
>
>Peter,
>
>DISTINCT ON is your friend.  I never tried to use it together with
>GROUP BY;  at least you can use your original query as a sub-select
>and wrap the DISTINCT ON query around it.
>
>    SELECT DISTINCT ON (area) area, user_id, total, succ
>      FROM (SELECT ...) AS t
>     ORDER BY area, succ DESC, total DESC;
>
>Servus
> Manfred
>
>---------------------------(end of
broadcast)---------------------------
>TIP 5: Have you checked our extensive FAQ?
>
>http://www.postgresql.org/docs/faqs/FAQ.html