Thread: Getting the latest unique items

Getting the latest unique items

From

"A.M."

Date:

11 December 2002, 13:16:54

I have a table as follows:
CREATE TABLE student_gradedmaterial(id SERIAL,studentid INT8 REFERENCES student,gradedmaterialid INT8 REFERENCES
gradedmaterial,caidINT8 REFERENCES ca,...submittime TIMESTAMP,           gradedtime TIMESTAMP,score INT4
 
);

Every time a student submits a homework, one new entry in the table is 
created. I know how to grab the latest version based on the submittime 
but naturally, I'd like to be able to count how many homeworks are 
graded and ungraded (ungraded means score is NULL). This smells of a 
subselect:

graded (grab row count):
SELECT UNIQUE id FROM student_gradedmaterial WHERE EXISTS (SELECT the 
latest unique submissions);
or:
SELECT COUNT(score) FROM student_gradedmaterial WHERE 
gradedmaterialid=X AND submittime = MAX(SELECT submittime FROM 
student_gradedmaterial WHERE gradedmaterialid=X);

(Sub-selects just make my head explode.) Any hints for me? Thanks.><><><><><><><><><
AgentM
agentm@cmu.edu

Re: Getting the latest unique items

From

Tomasz Myrta

Date:

11 December 2002, 13:35:20

I'm not sure if I understood your problem,
but did you try with "distinct on"?
select distinct on (id)
from
...
order by submittime desc

Regards,
Tomasz Myrta


A.M. wrote:
> I have a table as follows:> CREATE TABLE student_gradedmaterial(>     id SERIAL,>     studentid INT8 REFERENCES
student,>    gradedmaterialid INT8 REFERENCES gradedmaterial,>     caid INT8 REFERENCES ca,>     ...>     submittime
TIMESTAMP,>           gradedtime TIMESTAMP,>     score INT4> );>> Every time a student submits a homework, one new
entryin the table is> created. I know how to grab the latest version based on the submittime> but naturally, I'd like
tobe able to count how many homeworks are> graded and ungraded (ungraded means score is NULL). This smells of a>
subselect:>>graded (grab row count):> SELECT UNIQUE id FROM student_gradedmaterial WHERE EXISTS (SELECT the> latest
uniquesubmissions);> or:> SELECT COUNT(score) FROM student_gradedmaterial WHERE gradedmaterialid=X> AND submittime =
MAX(SELECTsubmittime FROM student_gradedmaterial WHERE> gradedmaterialid=X);>> (Sub-selects just make my head explode.)
Anyhints for me? Thanks.>  ><><><><><><><><><> AgentM> agentm@cmu.edu>>>> ---------------------------(end of
broadcast)--------------------------->TIP 2: you can get off all lists at once with the unregister command>    (send
"unregisterYourEmailAddressHere" to majordomo@postgresql.org)>

Re: Getting the latest unique items

From

"A.M."

Date:

12 December 2002, 00:16:16

When I try to run the following query:

select distinct on(student_gradedmaterial.id) student_gradedmaterial.id 
from coursesection_student,student_gradedmaterial WHERE 
gradedmaterialid=1 AND 
coursesection_student.studentid=student_gradedmaterial.studentid AND 
coursesectionid=1 and score is not null order by submittime desc;

I get the following error:

ERROR:  SELECT DISTINCT ON expressions must match initial ORDER BY 
expressions

Should I use GROUP BY somehow?
SELECT DISTINCT ON (student_gradedmaterial.id) 
student_gradedmaterial.id from 
coursesection_student,student_gradedmaterial where gradedmaterialid=1 
and coursesection_student.studentid=student_gradedmaterial.studentid 
and coursectionid=1 and score is not null having max(big subselect of 
max times);

?

The relevant schema follows:
CREATE TABLE coursesection_student (       coursesectionid INT8 REFERENCES coursesection,       studentid INT8
REFERENCESstudent,       status INT4 DEFAULT 0 NOT NULL, --2 switched, 1 dropped, 0 
 
enrolled       touch BOOL DEFAULT 'true',       UNIQUE(coursesectionid,studentid)
);
CREATE TABLE coursesection_ca(       coursesectionid INT8 REFERENCES coursesection,       caid INT8 REFERENCES ca
);
CREATE TABLE gradedmaterial (       id SERIAL PRIMARY KEY,       name TEXT,       visible BOOLEAN DEFAULT 'f',
openforsubmissionBOOLEAN DEFAULT 'f',       description TEXT,       webpage TEXT,       predefcomments TEXT,
weightINT4,       restrictedfiletypes TEXT,       duetime TIMESTAMP
 
);

CREATE TABLE coursesection_gradedmaterial(       gradedmaterialid INT8 REFERENCES gradedmaterial,       coursesectionid
INT8REFERENCES coursesection
 
);

CREATE TABLE student_gradedmaterial(       id SERIAL,       studentid INT8 REFERENCES student,       gradedmaterialid
INT8REFERENCES gradedmaterial,       caid INT8 REFERENCES ca,       score INT4,       comments TEXT,       submittime
TIMESTAMP,      gradedtime TIMESTAMP,       file OID,       emailtostudent BOOLEAN DEFAULT 'f',       suffix VARCHAR(6)
DEFAULT'.zip'
 
);
On Wednesday, December 11, 2002, at 01:36  PM, Tomasz Myrta wrote:

> I'm not sure if I understood your problem,
> but did you try with "distinct on"?
> select distinct on (id)
> from
> ...
> order by submittime desc
>
> Regards,
> Tomasz Myrta
>
>
> A.M. wrote:
>
> > I have a table as follows:
> > CREATE TABLE student_gradedmaterial(
> >     id SERIAL,
> >     studentid INT8 REFERENCES student,
> >     gradedmaterialid INT8 REFERENCES gradedmaterial,
> >     caid INT8 REFERENCES ca,
> >     ...
> >     submittime TIMESTAMP,
> >            gradedtime TIMESTAMP,
> >     score INT4
> > );
> >
> > Every time a student submits a homework, one new entry in the table 
> is
> > created. I know how to grab the latest version based on the 
> submittime
> > but naturally, I'd like to be able to count how many homeworks are
> > graded and ungraded (ungraded means score is NULL). This smells of a
> > subselect:
> >
> > graded (grab row count):
> > SELECT UNIQUE id FROM student_gradedmaterial WHERE EXISTS (SELECT the
> > latest unique submissions);
> > or:
> > SELECT COUNT(score) FROM student_gradedmaterial WHERE 
> gradedmaterialid=X
> > AND submittime = MAX(SELECT submittime FROM student_gradedmaterial 
> WHERE
> > gradedmaterialid=X);
> >
> > (Sub-selects just make my head explode.) Any hints for me? Thanks.
> >  ><><><><><><><><><
> > AgentM
> > agentm@cmu.edu
> >
> >
> >
> > ---------------------------(end of 
> broadcast)---------------------------
> > TIP 2: you can get off all lists at once with the unregister command
> >    (send "unregister YourEmailAddressHere" to 
> majordomo@postgresql.org)
> >
>
>
>
>
>><><><><><><><><><
AgentM
agentm@cmu.edu

Re: Getting the latest unique items

From

Tom Lane

Date:

12 December 2002, 02:13:31

"A.M." <agentm@cmu.edu> writes:
> When I try to run the following query:
> select distinct on(student_gradedmaterial.id) ...        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> ... order by submittime desc;     ^^^^^^^^^^^^^^^^^^^

> I get the following error:

> ERROR:  SELECT DISTINCT ON expressions must match initial ORDER BY 
> expressions

Indeed.  You might benefit from reading the DISTINCT ON usage example
given in the SELECT reference page,
http://www.ca.postgresql.org/users-lounge/docs/7.3/postgres/sql-select.html
(see under "Description").  Basically, DISTINCT ON is *only* useful when
designed hand-in-hand with an ORDER BY ...
        regards, tom lane

Re: Getting the latest unique items

From

Tomasz Myrta

Date:

12 December 2002, 03:32:33

A.M. wrote:

> When I try to run the following query:
>
> select distinct on(student_gradedmaterial.id) student_gradedmaterial.id
> from coursesection_student,student_gradedmaterial WHERE
> gradedmaterialid=1 AND
> coursesection_student.studentid=student_gradedmaterial.studentid AND
> coursesectionid=1 and score is not null order by submittime desc;
>
> I get the following error:
>
> ERROR:  SELECT DISTINCT ON expressions must match initial ORDER BY
> expressions

No
You have to change your sort order (as in error above). It should be:
order by student_gradematerial.id asc, submittime desc;
Tomasz Myrta