Re: Single Table Select With Aggregate Function - Mailing list pgsql-general

From David Johnston
Subject Re: Single Table Select With Aggregate Function
Date
Msg-id 018201ccca73$e9cb8220$bd628660$@yahoo.com
Whole thread Raw
In response to Single Table Select With Aggregate Function  (Rich Shepard <rshepard@appl-ecosys.com>)
List pgsql-general
-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Rich Shepard
Sent: Tuesday, January 03, 2012 6:27 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] Single Table Select With Aggregate Function

   I'm probably not seeing the obvious so I keep making the same mistake.
The table holds water chemistry data from multiple streams, sites within
each stream, sampling dates, and many chemical constituents.

   What I need to do are three things:

   1.) Find the date and site for the maximum value of a specified
constituent on a named stream.

   2.) Find the values of that same constituent at other sites on the named
stream on that same date.

   3.) Find the date of the maximum value of a constituent for all sites on
the named stream.

   Thinking the last the easiest to do, I submitted this query:

select max(quant), site, sampdate from chemistry where stream = 'SheepCrk'
and param = 'TDS' group by site, sampdate;

but this gives me the value of each site and date, not the maximum for all
dates at a specific site. Postgres tells me that both site and sampdate must
be in the 'group by' clause.

   Obviously the syntax is more complex than I thought it would be and I
would like a pointer to the correct way to write these queries.

Rich


---------------------------------------------------------------

-- Base Query; most recent date for the maximum quant on a given site/stream
(not tested)
WITH
get_max_quant_for_each_site AS (
    SELCET max(quant) AS site_maximum, site, stream, param FROM
chemistry WHERE stream = 'SheepCrk' AND param = 'TDS' GROUP BY site, stream,
param
)
, locate_all_entries_with_the_max AS (
    SELECT site, sampledate, quant, stream
        , ROW_NUMBER() OVER (PARTITION BY site, stream, param ORDER
BY sampledate DESC) AS sample_row_number
    FROM (SELECT site, quant, sampledate, stream FROM chemistry)
    JOIN get_max_quant_for_each_site USING (site, stream, param)
    WHERE quant = site_maximum
)
, limit_to_one_per_site AS (
    SELECT site, sampledate, quant, stream, param
    FROM locate_all_entries_with_the_max
    WHERE sample_row_number = 1
)
SELECT * FROM limit_to_one_per_site;

-- The other queries should be able to manipulate the results of this query
to directly answer your question or use these results JOINed against the
chemistry table to get at what you need
-- My naming was done before deciding that you want more than just "site" in
joining/output but also "stream" and "param"
-- You can convert the "WITH" clauses into sub-queries to possibly improve
performance...though WITH is probably easier to follow along with during
initial development

David J.


pgsql-general by date:

Previous
From: Ondrej Ivanič
Date:
Subject: Re: Single Table Select With Aggregate Function
Next
From: Rich Shepard
Date:
Subject: Re: Single Table Select With Aggregate Function