Thread: Non-aggregate values attached to aggregates?

Non-aggregate values attached to aggregates?

From
Benjamin Smith
Date:
I have a list of students, and a list of enrollment records, and I'm trying to
get a list of students and their most recent enrollment/disenrollment dates.

create table students (id serial primary key, name varchar);
create table enrollments (
    students_id integer not null references students(id),
    start integer not null,
    finish integer not null default 0);
insert into students (name) VALUES ('johnny');
insert into enrollments (students_id, start, finish) VALUES
    (1, 20030901, 20040530);
insert into enrollments (students_id, start, finish) VALUES
    (1, 20040901, 0);

Student enrolled last year, and is currently enrolled. If students are
currently enrolled, the finish date is "0". Dates are kept as ]YYYYMMDD', eg
2004114 for Nov 14, 2004.

I want to be able to export the student name, most recent enrollment date, and
disenrollment date. I've successfully gotten the student name and most recent
enrollment date, but never the associated exit date.

This returns most recent enrollment date:
select students.name, max(enrollments.start) as start from students,
enrollments where enrollments.students_id=students.id group by students.name;

Now, to get the exit date, I've tried

select students.name,
    max(enrollments.start) as start,
    finish
    from students, enrollments
    where enrollments.students_id=students.id
    AND max(enrollments.start)=enrollments.start
    group by students.name, enrollments.finish

which results in "ERROR:  Aggregates not allowed in WHERE clause" and also:

select students.name,
    max(enrollments.start) as start,
    finish
    from students, enrollments
    where enrollments.students_id=students.id
    group by students.name, enrollments.finish
    having enrollments.start=max(enrollments.start);

which returns "ERROR: Attribute enrollments.start must be GROUPed or used in
an aggregate function"

How can this be done? Can it be done?

-Ben
--
"The best way to predict the future is to invent it."
- XEROX PARC slogan, circa 1978


Re: Non-aggregate values attached to aggregates?

From
Tom Lane
Date:
Benjamin Smith <lists@benjamindsmith.com> writes:
> I have a list of students, and a list of enrollment records, and I'm trying to
> get a list of students and their most recent enrollment/disenrollment dates.

I don't know any nice way to solve this type of problem in bog-standard SQL.
But it's pretty easy to solve using Postgres' DISTINCT ON extension.
Look at the "weather reports" example in the SELECT reference page
for inspiration.

            regards, tom lane

Re: Non-aggregate values attached to aggregates?

From
Michael Fuhr
Date:
On Thu, Dec 16, 2004 at 01:38:19PM -0800, Benjamin Smith wrote:

> I have a list of students, and a list of enrollment records, and I'm trying to
> get a list of students and their most recent enrollment/disenrollment dates.

If you don't mind using a non-standard construct, see the documentation
for SELECT DISTINCT ON.

http://www.postgresql.org/docs/7.4/static/queries-select-lists.html#QUERIES-DISTINCT
http://www.postgresql.org/docs/7.4/static/sql-select.html#SQL-DISTINCT

> Dates are kept as ]YYYYMMDD', eg 2004114 for Nov 14, 2004.

Why not use a DATE type?  You can reformat it with to_char() if
need be.

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

Re: Non-aggregate values attached to aggregates?

From
Klint Gore
Date:
On Thu, 16 Dec 2004 13:38:19 -0800, Benjamin Smith <lists@benjamindsmith.com> wrote:
> I have a list of students, and a list of enrollment records, and I'm trying to
> get a list of students and their most recent enrollment/disenrollment dates.

just subselect the max date for the student_id in the where clause.

select students.name,
       enrollments.start,
       enrollments.finish
>from students, enrollments
where enrollments.students_id = students.id
and enrollments.start =
   (select max(e1.start)
    from enrollments e1
    where e1.student_id = students.id)

klint.

+---------------------------------------+-----------------+
: Klint Gore                            : "Non rhyming    :
: EMail   : kg@kgb.une.edu.au           :  slang - the    :
: Snail   : A.B.R.I.                    :  possibilities  :
: Mail      University of New England   :  are useless"   :
:           Armidale NSW 2351 Australia :     L.J.J.      :
: Fax     : +61 2 6772 5376             :                 :
+---------------------------------------+-----------------+

Re: Non-aggregate values attached to aggregates?

From
Benjamin Smith
Date:
Thanks much for your help!

It took a few tries to get what it was all about, but I got it.

On Thursday 16 December 2004 14:09, Michael Fuhr wrote:
> > Dates are kept as ]YYYYMMDD', eg 2004114 for Nov 14, 2004.
>
> Why not use a DATE type?  You can reformat it with to_char() if
> need be.

Not a bad idea. I used int because it was simple and "good enough". My app is
written in PHP, and by using strtotime() I can trap typo errors before they
get near the database (which frequently are difficult to display in a
layperson readable format)

EG:

$sql="INSERT INTO enrollments (date) values ('[date]')";
$pass=array('date'=>$_ENROLL['date'];
if (!$DB->SafeQuery($sql, $pass))
 return error("Database Error: ".$DB->Error());

If there's an error here, it'd say something like: "ERROR: Bad date external
representation 'mya 11 2004'. Compare with:

$sql="INSERT INTO enrollments (date) VALUES [date]";
IF (($date=strtotime($_REQUEST['date'])) ==-1)
    return Error("I'm sorry, but ".$_REQUEST['date']." does not appear to be a
valid date. Please fix and try again");
if (!$DB->SafeQuery($sql, array('date'=>$date)))
    return error("Database Error: ".$DB->Error());

which is much more "friendly" to the end user.

BTW: What is to_char, and why doesn't this work in PG 7.3.x?

select to_char(datefield) from TableX;

-Ben
--
"The best way to predict the future is to invent it."
- XEROX PARC slogan, circa 1978


Re: Non-aggregate values attached to aggregates?

From
Michael Fuhr
Date:
On Thu, Dec 16, 2004 at 03:21:35PM -0800, Benjamin Smith wrote:
> On Thursday 16 December 2004 14:09, Michael Fuhr wrote:
> > > Dates are kept as ]YYYYMMDD', eg 2004114 for Nov 14, 2004.
> >
> > Why not use a DATE type?  You can reformat it with to_char() if
> > need be.
>
> Not a bad idea. I used int because it was simple and "good enough". My app is
> written in PHP, and by using strtotime() I can trap typo errors before they
> get near the database (which frequently are difficult to display in a
> layperson readable format)

Using a DATE type in the database wouldn't prevent you from validating
the date before inserting it.  It *would* allow you to use various
date/time functions and operators in SQL queries.

> BTW: What is to_char, and why doesn't this work in PG 7.3.x?
>
> select to_char(datefield) from TableX;

When you say it doesn't work, do you mean it fails with a message
like "ERROR:  function to_char(date) does not exist"?  That's because
to_char() requires two arguments -- for details see the "Data Type
Formatting Functions" section of the "Functions and Operators"
chapter in the documentation.

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/