Thread: Help with query involving aggregation and joining.

Help with query involving aggregation and joining.

From
Eddie Cheung
Date:
Hi,

I have not been able to construct an SQL statement for
my task. What seemed like a simple query turned out to
be beyond my capability. Any help will be very much
appreciated.

My requirements involve several large tables. I have
simplied scenario into the follow two fictional tables
which describes the same requirements.


HISTORY
=======
id | courseId  | submission
---+-----------+-------------1 |  101      | 2002-01-20 2 |  102      | 2002-02-173 |  104      | 2002-04-304 |  102
 | 2002-02-225 |  104      | 2002-03-156 |  104      | 2002-01-21
 
COURSE
======  id  | name
------+----------- 101 | Physics 102 | Chemistry 103 | Biology 104 | Maths 105 | English


Basically I would like to display the latest
submission for each course in a table as shown below,
order by name of the courses.

Query Results:
==============id | courseId |  name    | submission
---------------------------------------4  |  102     | Chemisty | 2002-02-223  |  104     | Maths    | 2002-04-301  |
101    | Physics  | 2002-01-20
 

I have tried many different approach but all failed...
Thanks a lot for your help. 

Regards,
Eddie



__________________________________________________
Do you Yahoo!?
Yahoo! Mail Plus - Powerful. Affordable. Sign up now.
http://mailplus.yahoo.com


Re: Help with query involving aggregation and joining.

From
Josh Berkus
Date:
Eddie,

> My requirements involve several large tables. I have
> simplied scenario into the follow two fictional tables
> which describes the same requirements.

Keep in mind that a simplified example may not solve your real problem ...

> Basically I would like to display the latest
> submission for each course in a table as shown below,
> order by name of the courses.
>
> Query Results:
> ==============
>  id | courseId |  name    | submission
> ---------------------------------------
>  4  |  102     | Chemisty | 2002-02-22
>  3  |  104     | Maths    | 2002-04-30
>  1  |  101     | Physics  | 2002-01-20

Easy:

SELECT id, courseid, name, max(submission) as submission
FROM history JOIN courses ON history.courseid = course.id
GROUP BY id, courseid, name
ORDER BY name

And as such, I suspect that your real case is more complicated than the above
...

--
Josh Berkus
Aglio Database Solutions
San Francisco


Re: Help with query involving aggregation and joining.

From
Bruno Wolff III
Date:
On Sun, Feb 23, 2003 at 11:02:27 -0800, Eddie Cheung <vampyre5@yahoo.com> wrote:
> 
> HISTORY
> =======
> id | courseId  | submission
> ---+-----------+-------------
>  1 |  101      | 2002-01-20 
>  2 |  102      | 2002-02-17
>  3 |  104      | 2002-04-30
>  4 |  102      | 2002-02-22
>  5 |  104      | 2002-03-15
>  6 |  104      | 2002-01-21
>  
>  
> COURSE
> ====== 
>   id  | name
> ------+-----------
>   101 | Physics
>   102 | Chemistry
>   103 | Biology
>   104 | Maths
>   105 | English
> 
> 
> Basically I would like to display the latest
> submission for each course in a table as shown below,
> order by name of the courses.
> 
> Query Results:
> ==============
>  id | courseId |  name    | submission
> ---------------------------------------
>  4  |  102     | Chemisty | 2002-02-22
>  3  |  104     | Maths    | 2002-04-30
>  1  |  101     | Physics  | 2002-01-20

I think you want to do something like:

select distinct on (course.courseid) history.id, course.courseid, course.name, history.submission from course natural
joinhistory order by course.courseid, history.submission desc;
 


Re: Help with query involving aggregation and joining.

From
Eddie Cheung
Date:
Hi,

I was very glad to see the replies from you guys this
morning.  The two suggested SQL queries did not return
the expected results, but will help me to explain the
problem I am facing further.

1) Josh suggested the following query. (I have made
minor changes by adding the table name to each field)
  SELECT history.id, history.courseId, course.name,
MAX(history.submission) AS submission  FROM history JOIN course ON history.courseId =
course.Id  GROUP BY history.id, history.courseId, course.name  ORDER BY course.name;

The results returned are:id | courseid |   name    | submission
----+----------+-----------+------------ 2 |      102 | Chemistry | 2002-02-17 4 |      102 | Chemistry | 2002-02-22 3
|     104 | Maths     | 2002-04-30 5 |      104 | Maths     | 2002-03-15 6 |      104 | Maths     | 2002-01-21 1 |
101| Physics   | 2002-01-20
 

There are duplicate courses because the history.id
column has different values. The history.id cannot be
use in the GROUP BY clause. But it is one of the
displaying field, so I could not remove it from the
GROUP BY clause either.

2) Bruno suggested the following query:
 select distinct on (course.courseid) history.id, course.courseid, course.name,
history.submission from course natural join history order by course.courseid, history.submission desc; 
I have not used NATURAL JOIN before, but from what I
know, it joins the columns with the same name. Since
the joining columns of History and Course have
different names, I have replace JOIN clause. Please
let me know if I have made a mistake.

The modified query is:  SELECT DISTINCT ON (course.id) course.id,
history.id, course.name, history.submission   FROM history JOIN course ON history.courseId =
course.id   ORDER BY course.id, history.submission desc;

The results returned are :id  | id |   name    | submission
-----+----+-----------+------------101 |  1 | Physics   | 2002-01-20102 |  4 | Chemistry | 2002-02-22104 |  3 | Maths
 | 2002-04-30
 
The problem here is that the results are not ordered
by the submission date. If I sort by
"history.submission" first, I get  ERROR: SELECT DISTINCT ON expressions must match
initial ORDER BY expressions.
Please note that I cannot select distinct on the
course.name either because it is NOT unique. The
original tables are much larger, and the only unique
column is the id.

I have included the queries to create the tables here.

------------------------

CREATE TABLE course (id integer,name varchar(32),  Constraint course_pkey Primary Key (id)
);

CREATE TABLE history (id integer NOT NULL,courseid integer REFERENCES course(id),submission date,Constraint
history_pkeyPrimary Key (id)
 
);

INSERT INTO course (id,name) VALUES (101,'Physics');
INSERT INTO course (id,name) VALUES (102,'Chemistry');
INSERT INTO course (id,name) VALUES (103,'Biology');
INSERT INTO course (id,name) VALUES (104,'Maths');
INSERT INTO course (id,name) VALUES (105,'English');

INSERT INTO history (id,courseid,submission) VALUES
(1,101,'2002-01-20');
INSERT INTO history (id,courseid,submission) VALUES
(2,102,'2002-02-17');
INSERT INTO history (id,courseid,submission) VALUES
(3,104,'2002-04-30');
INSERT INTO history (id,courseid,submission) VALUES
(4,102,'2002-02-22');
INSERT INTO history (id,courseid,submission) VALUES
(5,104,'2002-03-15');
INSERT INTO history (id,courseid,submission) VALUES
(6,104,'2002-01-21');

--------------------------------


Thanks for all your help.


Regards,
Eddie



--- Bruno Wolff III <bruno@wolff.to> wrote:
> On Sun, Feb 23, 2003 at 11:02:27 -0800,
>   Eddie Cheung <vampyre5@yahoo.com> wrote:
> > 
> > HISTORY
> > =======
> > id | courseId  | submission
> > ---+-----------+-------------
> >  1 |  101      | 2002-01-20 
> >  2 |  102      | 2002-02-17
> >  3 |  104      | 2002-04-30
> >  4 |  102      | 2002-02-22
> >  5 |  104      | 2002-03-15
> >  6 |  104      | 2002-01-21
> >  
> >  
> > COURSE
> > ====== 
> >   id  | name
> > ------+-----------
> >   101 | Physics
> >   102 | Chemistry
> >   103 | Biology
> >   104 | Maths
> >   105 | English
> > 
> > 
> > Basically I would like to display the latest
> > submission for each course in a table as shown
> below,
> > order by name of the courses.
> > 
> > Query Results:
> > ==============
> >  id | courseId |  name    | submission
> > ---------------------------------------
> >  4  |  102     | Chemisty | 2002-02-22
> >  3  |  104     | Maths    | 2002-04-30
> >  1  |  101     | Physics  | 2002-01-20
> 
> I think you want to do something like:
> 
> select distinct on (course.courseid)
>   history.id, course.courseid, course.name,
> history.submission
>   from course natural join history
>   order by course.courseid, history.submission desc;
> 
> ---------------------------(end of
> broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to
majordomo@postgresql.org


__________________________________________________
Do you Yahoo!?
Yahoo! Mail Plus - Powerful. Affordable. Sign up now.
http://mailplus.yahoo.com


Re: Help with query involving aggregation and joining.

From
Eddie Cheung
Date:
Hi,

I was very glad to see the replies from you guys this
morning.  The two suggested SQL queries did not return
the expected results, but will help me to explain the
problem I am facing further.

1) Josh suggested the following query. (I have made
minor changes by adding the table name to each field)
  SELECT history.id, history.courseId, course.name,
MAX(history.submission) AS submission  FROM history JOIN course ON history.courseId =
course.Id  GROUP BY history.id, history.courseId, course.name  ORDER BY course.name;

The results returned are:id | courseid |   name    | submission
----+----------+-----------+------------ 2 |      102 | Chemistry | 2002-02-17 4 |      102 | Chemistry | 2002-02-22 3
|     104 | Maths     | 2002-04-30 5 |      104 | Maths     | 2002-03-15 6 |      104 | Maths     | 2002-01-21 1 |
101| Physics   | 2002-01-20
 

There are duplicate courses because the history.id
column has different values. The history.id cannot be
use in the GROUP BY clause. But it is one of the
displaying field, so I could not remove it from the
GROUP BY clause either.

2) Bruno suggested the following query:
 select distinct on (course.courseid) history.id, course.courseid, course.name,
history.submission from course natural join history order by course.courseid, history.submission desc; 
I have not used NATURAL JOIN before, but from what I
know, it joins the columns with the same name. Since
the joining columns of History and Course have
different names, I have replace JOIN clause. Please
let me know if I have made a mistake.

The modified query is:  SELECT DISTINCT ON (course.id) course.id,
history.id, course.name, history.submission   FROM history JOIN course ON history.courseId =
course.id   ORDER BY course.id, history.submission desc;

The results returned are :id  | id |   name    | submission
-----+----+-----------+------------101 |  1 | Physics   | 2002-01-20102 |  4 | Chemistry | 2002-02-22104 |  3 | Maths
 | 2002-04-30
 
The problem here is that the results are not ordered
by the submission date. If I sort by
"history.submission" first, I get  ERROR: SELECT DISTINCT ON expressions must match
initial ORDER BY expressions.
Please note that I cannot select distinct on the
course.name either because it is NOT unique. The
original tables are much larger, and the only unique
column is the id.

I have included the queries to create the tables here.

------------------------

CREATE TABLE course (id integer,name varchar(32),  Constraint course_pkey Primary Key (id)
);

CREATE TABLE history (id integer NOT NULL,courseid integer REFERENCES course(id),submission date,Constraint
history_pkeyPrimary Key (id)
 
);

INSERT INTO course (id,name) VALUES (101,'Physics');
INSERT INTO course (id,name) VALUES (102,'Chemistry');
INSERT INTO course (id,name) VALUES (103,'Biology');
INSERT INTO course (id,name) VALUES (104,'Maths');
INSERT INTO course (id,name) VALUES (105,'English');

INSERT INTO history (id,courseid,submission) VALUES
(1,101,'2002-01-20');
INSERT INTO history (id,courseid,submission) VALUES
(2,102,'2002-02-17');
INSERT INTO history (id,courseid,submission) VALUES
(3,104,'2002-04-30');
INSERT INTO history (id,courseid,submission) VALUES
(4,102,'2002-02-22');
INSERT INTO history (id,courseid,submission) VALUES
(5,104,'2002-03-15');
INSERT INTO history (id,courseid,submission) VALUES
(6,104,'2002-01-21');

--------------------------------


Thanks for all your help.


Regards,
Eddie



--- Bruno Wolff III <bruno@wolff.to> wrote:
> On Sun, Feb 23, 2003 at 11:02:27 -0800,
>   Eddie Cheung <vampyre5@yahoo.com> wrote:
> > 
> > HISTORY
> > =======
> > id | courseId  | submission
> > ---+-----------+-------------
> >  1 |  101      | 2002-01-20 
> >  2 |  102      | 2002-02-17
> >  3 |  104      | 2002-04-30
> >  4 |  102      | 2002-02-22
> >  5 |  104      | 2002-03-15
> >  6 |  104      | 2002-01-21
> >  
> >  
> > COURSE
> > ====== 
> >   id  | name
> > ------+-----------
> >   101 | Physics
> >   102 | Chemistry
> >   103 | Biology
> >   104 | Maths
> >   105 | English
> > 
> > 
> > Basically I would like to display the latest
> > submission for each course in a table as shown
> below,
> > order by name of the courses.
> > 
> > Query Results:
> > ==============
> >  id | courseId |  name    | submission
> > ---------------------------------------
> >  4  |  102     | Chemisty | 2002-02-22
> >  3  |  104     | Maths    | 2002-04-30
> >  1  |  101     | Physics  | 2002-01-20
> 
> I think you want to do something like:
> 
> select distinct on (course.courseid)
>   history.id, course.courseid, course.name,
> history.submission
>   from course natural join history
>   order by course.courseid, history.submission desc;
> 
> ---------------------------(end of
> broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to
majordomo@postgresql.org


__________________________________________________
Do you Yahoo!?
Yahoo! Mail Plus - Powerful. Affordable. Sign up now.
http://mailplus.yahoo.com


Re: Help with query involving aggregation and joining.

From
Josh Berkus
Date:
Eddie,

> 1) Josh suggested the following query. (I have made
> minor changes by adding the table name to each field)
>
>    SELECT history.id, history.courseId, course.name,
> MAX(history.submission) AS submission
>    FROM history JOIN course ON history.courseId =
> course.Id
>    GROUP BY history.id, history.courseId, course.name
>    ORDER BY course.name;
>
> The results returned are:
>  id | courseid |   name    | submission
> ----+----------+-----------+------------
>   2 |      102 | Chemistry | 2002-02-17
>   4 |      102 | Chemistry | 2002-02-22
>   3 |      104 | Maths     | 2002-04-30
>   5 |      104 | Maths     | 2002-03-15
>   6 |      104 | Maths     | 2002-01-21
>   1 |      101 | Physics   | 2002-01-20

Sorry, knew I was making it too simple.  Try:
  SELECT history.id, history.courseId, course.name, submission  FROM history JOIN course ON history.courseId =
course.IdJOIN (select course_id, max(submission) as sub_max    FROM history GROUP BY course_id) hmaxON
(history.course_id= hmax.course_id AND history.submission = hmax.sub_max)  GROUP BY history.id, history.courseId,
course.name ORDER BY course.name; 


--
Josh Berkus
Aglio Database Solutions
San Francisco


Re: Help with query involving aggregation and joining.

From
Rajesh Kumar Mallah
Date:
ERROR: 
SELECT DISTINCT ON expressions must match 
initial ORDER BY expressions may be gotten over by
wrapping the first query result in a subselect.

not sure though if its proper.
regds
mallah.

test=# SELECT * from ( SELECT  distinct on (a.id) b.id 
,courseid,name,submission   from course a join  history b on 
(a.id=b.courseid)  )  as results order by results.submission desc;

+----+----------+-----------+------------+
| id | courseid |   name    | submission |
+----+----------+-----------+------------+
|  3 |      104 | Maths     | 2002-04-30 |
|  2 |      102 | Chemistry | 2002-02-17 |
|  1 |      101 | Physics   | 2002-01-20 |
+----+----------+-----------+------------+
(3 rows)






On Monday 24 February 2003 10:48 am, Eddie Cheung wrote:
> Hi,
>
> I was very glad to see the replies from you guys this
> morning.  The two suggested SQL queries did not return
> the expected results, but will help me to explain the
> problem I am facing further.
>
> 1) Josh suggested the following query. (I have made
> minor changes by adding the table name to each field)
>
>    SELECT history.id, history.courseId, course.name,
> MAX(history.submission) AS submission
>    FROM history JOIN course ON history.courseId =
> course.Id
>    GROUP BY history.id, history.courseId, course.name
>    ORDER BY course.name;
>
> The results returned are:
>  id | courseid |   name    | submission
> ----+----------+-----------+------------
>   2 |      102 | Chemistry | 2002-02-17
>   4 |      102 | Chemistry | 2002-02-22
>   3 |      104 | Maths     | 2002-04-30
>   5 |      104 | Maths     | 2002-03-15
>   6 |      104 | Maths     | 2002-01-21
>   1 |      101 | Physics   | 2002-01-20
>
> There are duplicate courses because the history.id
> column has different values. The history.id cannot be
> use in the GROUP BY clause. But it is one of the
> displaying field, so I could not remove it from the
> GROUP BY clause either.
>
> 2) Bruno suggested the following query:
>
>   select distinct on (course.courseid)
>   history.id, course.courseid, course.name,
> history.submission
>   from course natural join history
>   order by course.courseid, history.submission desc;
>
> I have not used NATURAL JOIN before, but from what I
> know, it joins the columns with the same name. Since
> the joining columns of History and Course have
> different names, I have replace JOIN clause. Please
> let me know if I have made a mistake.
>
> The modified query is:
>    SELECT DISTINCT ON (course.id) course.id,
> history.id, course.name, history.submission
>    FROM history JOIN course ON history.courseId =
> course.id
>    ORDER BY course.id, history.submission desc;
>
> The results returned are :
>  id  | id |   name    | submission
> -----+----+-----------+------------
>  101 |  1 | Physics   | 2002-01-20
>  102 |  4 | Chemistry | 2002-02-22
>  104 |  3 | Maths     | 2002-04-30
>
> The problem here is that the results are not ordered
> by the submission date. If I sort by
> "history.submission" first, I get
>    ERROR: SELECT DISTINCT ON expressions must match
> initial ORDER BY expressions.
> Please note that I cannot select distinct on the
> course.name either because it is NOT unique. The
> original tables are much larger, and the only unique
> column is the id.
>
> I have included the queries to create the tables here.
>
> ------------------------
>
> CREATE TABLE course (
>     id integer,
>     name varchar(32),
>    Constraint course_pkey Primary Key (id)
> );
>
> CREATE TABLE history (
>     id integer NOT NULL,
>     courseid integer REFERENCES course(id),
>     submission date,
>     Constraint history_pkey Primary Key (id)
> );
>
> INSERT INTO course (id,name) VALUES (101,'Physics');
> INSERT INTO course (id,name) VALUES (102,'Chemistry');
> INSERT INTO course (id,name) VALUES (103,'Biology');
> INSERT INTO course (id,name) VALUES (104,'Maths');
> INSERT INTO course (id,name) VALUES (105,'English');
>
> INSERT INTO history (id,courseid,submission) VALUES
> (1,101,'2002-01-20');
> INSERT INTO history (id,courseid,submission) VALUES
> (2,102,'2002-02-17');
> INSERT INTO history (id,courseid,submission) VALUES
> (3,104,'2002-04-30');
> INSERT INTO history (id,courseid,submission) VALUES
> (4,102,'2002-02-22');
> INSERT INTO history (id,courseid,submission) VALUES
> (5,104,'2002-03-15');
> INSERT INTO history (id,courseid,submission) VALUES
> (6,104,'2002-01-21');
>
> --------------------------------
>
>
> Thanks for all your help.
>
>
> Regards,
> Eddie
>
> --- Bruno Wolff III <bruno@wolff.to> wrote:
> > On Sun, Feb 23, 2003 at 11:02:27 -0800,
> >
> >   Eddie Cheung <vampyre5@yahoo.com> wrote:
> > > HISTORY
> > > =======
> > > id | courseId  | submission
> > > ---+-----------+-------------
> > >  1 |  101      | 2002-01-20
> > >  2 |  102      | 2002-02-17
> > >  3 |  104      | 2002-04-30
> > >  4 |  102      | 2002-02-22
> > >  5 |  104      | 2002-03-15
> > >  6 |  104      | 2002-01-21
> > >
> > >
> > > COURSE
> > > ======
> > >   id  | name
> > > ------+-----------
> > >   101 | Physics
> > >   102 | Chemistry
> > >   103 | Biology
> > >   104 | Maths
> > >   105 | English
> > >
> > >
> > > Basically I would like to display the latest
> > > submission for each course in a table as shown
> >
> > below,
> >
> > > order by name of the courses.
> > >
> > > Query Results:
> > > ==============
> > >  id | courseId |  name    | submission
> > > ---------------------------------------
> > >  4  |  102     | Chemisty | 2002-02-22
> > >  3  |  104     | Maths    | 2002-04-30
> > >  1  |  101     | Physics  | 2002-01-20
> >
> > I think you want to do something like:
> >
> > select distinct on (course.courseid)
> >   history.id, course.courseid, course.name,
> > history.submission
> >   from course natural join history
> >   order by course.courseid, history.submission desc;
> >
> > ---------------------------(end of
> > broadcast)---------------------------
> > TIP 1: subscribe and unsubscribe commands go to
>
> majordomo@postgresql.org
>
>
> __________________________________________________
> Do you Yahoo!?
> Yahoo! Mail Plus - Powerful. Affordable. Sign up now.
> http://mailplus.yahoo.com
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo@postgresql.org so that your
> message can get through to the mailing list cleanly

-- 

Regds
Mallah

----------------------------------------
Rajesh Kumar Mallah,
Project Manager (Development)
Infocom Network Limited, New Delhi
phone: +91(11)6152172 (221) (L) ,9811255597 (M)

Visit http://www.trade-india.com ,
India's Leading B2B eMarketplace.


Re: Help with query involving aggregation and joining.

From
Christoph Haller
Date:
>
> ERROR:
> SELECT DISTINCT ON expressions must match
> initial ORDER BY expressions may be gotten over by
> wrapping the first query result in a subselect.
>
> not sure though if its proper.
> regds
> mallah.
>
> test=# SELECT * from ( SELECT  distinct on (a.id) b.id
> ,courseid,name,submission   from course a join  history b on
> (a.id=b.courseid)  )  as results order by results.submission desc;
>
I'm currently working with
select version();                           version
---------------------------------------------------------------PostgreSQL 7.2.3 on hppa-hp-hpux10.20, compiled by GCC
2.95.2
(1 row)
I've used your table definitions and sample data you've sent earlier
and do not receive an error message on the above SELECT statement.
Did you?

Regards, Christoph




Re: Help with query involving aggregation and joining.

From
Date:

sorry i did not post any sample data or definations.
i was replying to someone else's posting

i do get such an error on pgsql 7.3.2

regds
mallah.


>>
>> ERROR:
>> SELECT DISTINCT ON expressions must match
>> initial ORDER BY expressions may be gotten over by
>> wrapping the first query result in a subselect.
>>
>> not sure though if its proper.
>> regds
>> mallah.
>>
>> test=# SELECT * from ( SELECT  distinct on (a.id) b.id
>> ,courseid,name,submission   from course a join  history b on
>> (a.id=b.courseid)  )  as results order by results.submission desc;
>>
> I'm currently working with
> select version();
>                            version
> ---------------------------------------------------------------
> PostgreSQL 7.2.3 on hppa-hp-hpux10.20, compiled by GCC 2.95.2
> (1 row)
> I've used your table definitions and sample data you've sent earlier and do not receive an
> error message on the above SELECT statement. Did you?
>
> Regards, Christoph



-----------------------------------------
Get your free web based email at trade-india.com.  "India's Leading B2B eMarketplace.!"
http://www.trade-india.com/




Re: Help with query involving aggregation and joining.

From
Bruno Wolff III
Date:
On Sun, Feb 23, 2003 at 21:17:38 -0800, Eddie Cheung <vampyre5@yahoo.com> wrote:
> 
> 2) Bruno suggested the following query:
> 
>   select distinct on (course.courseid)
>   history.id, course.courseid, course.name,
> history.submission
>   from course natural join history
>   order by course.courseid, history.submission desc;
>   
> I have not used NATURAL JOIN before, but from what I
> know, it joins the columns with the same name. Since
> the joining columns of History and Course have
> different names, I have replace JOIN clause. Please
> let me know if I have made a mistake.

No I made a mistake and thought that courseId was used in both places.

> The modified query is:
>    SELECT DISTINCT ON (course.id) course.id,
> history.id, course.name, history.submission 
>    FROM history JOIN course ON history.courseId =
> course.id 
>    ORDER BY course.id, history.submission desc;
> 
> The results returned are :
>  id  | id |   name    | submission
> -----+----+-----------+------------
>  101 |  1 | Physics   | 2002-01-20
>  102 |  4 | Chemistry | 2002-02-22
>  104 |  3 | Maths     | 2002-04-30
>  
> The problem here is that the results are not ordered
> by the submission date. If I sort by
> "history.submission" first, I get
>    ERROR: SELECT DISTINCT ON expressions must match
> initial ORDER BY expressions.
> Please note that I cannot select distinct on the
> course.name either because it is NOT unique. The
> original tables are much larger, and the only unique
> column is the id.

My suggestion could be modified by making it a subselect and adding another
order by clause. You might get different performance (worse or better)
than the updated version of Josh's solution.


Re: Help with query involving aggregation and joining.

From
Eddie Cheung
Date:
Hi,

I was very amused by Mallah's suggestion because it is so simple and yet manage to
workaround the problem I am facing. So I will be using the SQL suggested by Bruno and
wrap the query with this workaround.

Christoph, I am using version:
PostgreSQL 7.2.3 on i686-pc-cygwin, compiled by GCC 2.95.3-5
The error message appeared when the SQL statement was not wrapped by   "SELECT * FROM (...) AS results ORDER..."
and when the sorting is swapped to sort by history.submission first. Details can be found
in my previous email.

I have not been able to get Josh suggestion to work. While tidying the SQL statement,
there were some error messages that I didn't manage to solve. Josh's suggestion is
probably more proper, but I will use the other alternative because the syntax is easier
for me to understand. Performance is not an issue in this case because the results came
back almost immediately in my environment.

For the record, I will be using something like this:

SELECT * FROM (SELECT DISTINCT ON (course.id) course.id, history.id, course.name, history.submission FROM history JOIN
courseON history.courseId =course.id ORDER BY course.id, history.submission DESC
 
) as results ORDER BY results.submission DESC;


Thank you for all your help. I really appreciate it.

Cheers,
Eddie



--- mallah@trade-india.com wrote:
> 
> 
> sorry i did not post any sample data or definations.
> i was replying to someone else's posting
> 
> i do get such an error on pgsql 7.3.2
> 
> regds
> mallah.
> 
> 
> >>
> >> ERROR:
> >> SELECT DISTINCT ON expressions must match
> >> initial ORDER BY expressions may be gotten over by
> >> wrapping the first query result in a subselect.
> >>
> >> not sure though if its proper.
> >> regds
> >> mallah.
> >>
> >> test=# SELECT * from ( SELECT  distinct on (a.id) b.id
> >> ,courseid,name,submission   from course a join  history b on
> >> (a.id=b.courseid)  )  as results order by results.submission desc;
> >>
> > I'm currently working with
> > select version();
> >                            version
> > ---------------------------------------------------------------
> > PostgreSQL 7.2.3 on hppa-hp-hpux10.20, compiled by GCC 2.95.2
> > (1 row)
> > I've used your table definitions and sample data you've sent earlier and do not
> receive an
> > error message on the above SELECT statement. Did you?
> >
> > Regards, Christoph
> 
> 
> 
> -----------------------------------------
> Get your free web based email at trade-india.com.
>    "India's Leading B2B eMarketplace.!"
> http://www.trade-india.com/
> 
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org


__________________________________________________
Do you Yahoo!?
Yahoo! Mail Plus - Powerful. Affordable. Sign up now.
http://mailplus.yahoo.com


Re: Help with query involving aggregation and joining.

From
"Josh Berkus"
Date:
Eddie,

> I have not been able to get Josh suggestion to work. While tidying
> the SQL statement,
> there were some error messages that I didn't manage to solve. Josh's
> suggestion is
> probably more proper, but I will use the other alternative because
> the syntax is easier
> for me to understand.

Huh?  What sort of error messages?  Can you post the query you rans and
output?

-Josh