Thread: Join query help
Hi, We have the following three tables. safety=> SELECT record_id, record_date FROM record; record_id | record_date -----------+------------------------ 1 | 2007-07-23 11:30:37+10 2 | 2007-07-27 11:30:14+10 3 | 2007-07-17 13:15:03+10 (3 rows) safety=> SELECT observation_id, record_id, score_id FROM observation; observation_id | record_id | score_id ----------------+-----------+---------- 3240 | 1 | 1 3239 | 1 | 1 3238 | 1 | 2 3237 | 1 | 1 2872 | 2 | 1 2869 | 2 | 2 2870 | 2 | 1 2871 | 2 | 1 3218 | 3 | 2 3217 | 3 | 1 (10 rows) safety=> SELECT * FROM SCORE; score_id | score_description ----------+------------------- 0 | NA 1 | SAFE 2 | AT RISK (3 rows) What query do I write to generate the following? week_no | count(record_id | count(observation_id) | sum(score_id) where = '1' 2007, 30 | 2 | 8 | 6 2007, 29 | 1 | 2 | 1
Try 2: Here are my DDLs & DMLs -- Start CREATE TABLE record ( record_id integer PRIMARY KEY, record_date timestamp with time zone NOT NULL ); INSERT INTO record(record_id, record_date) VALUES ('1', '2007-07-23 11:30:37'); INSERT INTO record(record_id, record_date) VALUES ('2', '2007-07-27 11:30:14'); INSERT INTO record(record_id, record_date) VALUES ('3', '2007-07-17 13:15:03'); CREATE TABLE score ( score_id integer PRIMARY KEY, score_description character(7) NOT NULL ); INSERT INTO score(score_id, score_description) VALUES ('0', 'NA'); INSERT INTO score(score_id, score_description) VALUES ('1', 'SAFE'); INSERT INTO score(score_id, score_description) VALUES ('2', 'AT RISK'); CREATE TABLE observation ( observation_id integer PRIMARY KEY, record_id integer REFERENCES record (record_id), score_id integer REFERENCES score (score_id) ); INSERT INTO observation(observation_id, record_id, score_id) VALUES ('3240', '1', '1'); INSERT INTO observation(observation_id, record_id, score_id) VALUES ('3239', '1', '1'); INSERT INTO observation(observation_id, record_id, score_id) VALUES ('3238', '1', '2'); INSERT INTO observation(observation_id, record_id, score_id) VALUES ('3237', '1', '1'); INSERT INTO observation(observation_id, record_id, score_id) VALUES ('2872', '2', '1'); INSERT INTO observation(observation_id, record_id, score_id) VALUES ('2869', '2', '2'); INSERT INTO observation(observation_id, record_id, score_id) VALUES ('2870', '2', '1'); INSERT INTO observation(observation_id, record_id, score_id) VALUES ('2871', '2', '1'); INSERT INTO observation(observation_id, record_id, score_id) VALUES ('3218', '3', '2'); INSERT INTO observation(observation_id, record_id, score_id) VALUES ('3217', '3', '1'); -- End By executing SELECT week, COUNT(record) as records, SUM(inspection) as inspections FROM ( SELECT TO_CHAR(record.record_date, 'YYYY, WW') as week, RECORD.RECORD_ID AS RECORD, COUNT(OBSERVATION_ID) AS INSPECTION FROM OBSERVATION LEFT JOIN record ON record.record_id = observation.record_id GROUP BY RECORD, WEEK ) A GROUP BY WEEK; I get week | records | inspections ----------+---------+-------------- 2007, 30 | 2 | 8 2007, 29 | 1 | 2 (2 rows) I like to query for a result set that will also have the sum(score_id) where score_id = '1' like the following week | records | inspections | score ----------+---------+-------------+------ 2007, 30 | 2 | 8 | 6 2007, 29 | 1 | 2 | 1 (2 rows) This will help identify that there were 6 SAFE observations found from the 8 inspections on week 30. I hope this is not too confusing that I have to re-design the whole schema
On Aug 20, 2007, at 19:52 , novice wrote: > Try 2: Here are my DDLs & DMLs Thanks for the data. It looks like the data you provided in the first set was a little different, and the queries I supplied in my previous message give you the results you want. > CREATE TABLE record > ( > record_id integer PRIMARY KEY, > record_date timestamp with time zone NOT NULL > ); > > INSERT INTO record(record_id, record_date) VALUES ('1', '2007-07-23 > 11:30:37'); Note: record_id is in integer, yet you're quoting the value ('1'). This causes the server to cast the text value to an integer. Here it's not going to cause much of a problem, just a couple CPU cycles. In table definitions (and possibly queries?) it can cause the server to ignore otherwise usable indexes when planning queries. (I see below your other inserts also quote integer values: you should drop the quotes there as well.) > CREATE TABLE score > ( > score_id integer PRIMARY KEY, > score_description character(7) NOT NULL > > ); Unless you have a strict business rule that score_description can have no more than 7 characters, you should just use text instead of character(7). It provides you more freedom (for example, you don't have to change the column type if you ever want to use longer descriptions). character(7) does not gain you anything in terms of performance. I'd also recommend adding a UNIQUE constraint to score_description. It appears your score_id has no meaning other than use as a surrogate key. There's nothing to prevent INSERT INTO score (score_id, score_description) VALUES (5, 'SAFE') which would be quite confusing, I should think. And if you don't have any particular reason to use a surrogate key, you could just use score_description as the primary key of the table, dropping score_id altogether. it would reduce the number of joins you needed to do to have easily interpretable query results. (This all goes for your record table as well.) > I like to query for a result set that will also have the sum(score_id) > where score_id = '1' like the following > > week | records | inspections | score > ----------+---------+-------------+------ > 2007, 30 | 2 | 8 | 6 > 2007, 29 | 1 | 2 | 1 > (2 rows) > (repeating from earlier post) SELECT to_char(record_week, 'YYYY, IW') as formatted_record_week , count(DISTINCT record_id) AS record_count , count(DISTINCT observation_id) AS observation_count , safe_score_sum FROM record_with_week NATURAL JOIN observation NATURAL JOIN ( SELECT record_week, sum(score_id) as safe_score_sum FROM record_with_week NATURAL JOIN observation NATURAL JOIN score WHERE score_description = 'SAFE' GROUP BY record_week ) safe_observation GROUP BY record_week, safe_score_sum; formatted_record_week | record_count | observation_count | safe_score_sum -----------------------+--------------+------------------- +---------------- 2007, 29 | 1 | 2 | 1 2007, 30 | 2 | 8 | 6 (2 rows) > This will help identify that there were 6 SAFE observations found from > the 8 inspections on week 30. Yeah, I thought so: you're actually looking for the *count* of SAFE observations, not the sum of the score_id for 'SAFE'. So what you really want is: SELECT TO_CHAR(record_week, 'YYYY, IW') AS formatted_record_week , count(DISTINCT record_id) AS record_count , count(DISTINCT observation_id) AS observation_count , count(DISTINCT safe_observation_id) as safe_observation_count FROM record_with_week NATURAL JOIN observation NATURAL JOIN ( SELECT record_week , observation_id as safe_observation_id FROM record_with_week NATURAL JOIN observation NATURAL JOIN score WHERE score_description = 'SAFE' ) safe_observation GROUP BY record_week; formatted_record_week | record_count | observation_count | safe_observation_count -----------------------+--------------+------------------- +------------------------ 2007, 29 | 1 | 2 | 1 2007, 30 | 2 | 8 | 6 (2 rows) Michael Glaesemann grzm seespotcode net
Many many thanks for all the advice =) On 21/08/07, Michael Glaesemann <grzm@seespotcode.net> wrote: > > On Aug 20, 2007, at 19:52 , novice wrote: > > > Try 2: Here are my DDLs & DMLs > > Thanks for the data. It looks like the data you provided in the first > set was a little different, and the queries I supplied in my previous > message give you the results you want. > > > CREATE TABLE record > > ( > > record_id integer PRIMARY KEY, > > record_date timestamp with time zone NOT NULL > > ); > > > > INSERT INTO record(record_id, record_date) VALUES ('1', '2007-07-23 > > 11:30:37'); > > Note: record_id is in integer, yet you're quoting the value ('1'). > This causes the server to cast the text value to an integer. Here > it's not going to cause much of a problem, just a couple CPU cycles. > In table definitions (and possibly queries?) it can cause the server > to ignore otherwise usable indexes when planning queries. (I see > below your other inserts also quote integer values: you should drop > the quotes there as well.) > > > CREATE TABLE score > > ( > > score_id integer PRIMARY KEY, > > score_description character(7) NOT NULL > > > > ); > > Unless you have a strict business rule that score_description can > have no more than 7 characters, you should just use text instead of > character(7). It provides you more freedom (for example, you don't > have to change the column type if you ever want to use longer > descriptions). character(7) does not gain you anything in terms of > performance. > > I'd also recommend adding a UNIQUE constraint to score_description. > It appears your score_id has no meaning other than use as a surrogate > key. There's nothing to prevent INSERT INTO score (score_id, > score_description) VALUES (5, 'SAFE') which would be quite confusing, > I should think. And if you don't have any particular reason to use a > surrogate key, you could just use score_description as the primary > key of the table, dropping score_id altogether. it would reduce the > number of joins you needed to do to have easily interpretable query > results. (This all goes for your record table as well.) > > > I like to query for a result set that will also have the sum(score_id) > > where score_id = '1' like the following > > > > week | records | inspections | score > > ----------+---------+-------------+------ > > 2007, 30 | 2 | 8 | 6 > > 2007, 29 | 1 | 2 | 1 > > (2 rows) > > > > (repeating from earlier post) > > SELECT to_char(record_week, 'YYYY, IW') as formatted_record_week > , count(DISTINCT record_id) AS record_count > , count(DISTINCT observation_id) AS observation_count > , safe_score_sum > FROM record_with_week > NATURAL JOIN observation > NATURAL JOIN ( > SELECT record_week, sum(score_id) as safe_score_sum > FROM record_with_week > NATURAL JOIN observation > NATURAL JOIN score > WHERE score_description = 'SAFE' > GROUP BY record_week > ) safe_observation > GROUP BY record_week, safe_score_sum; > formatted_record_week | record_count | observation_count | > safe_score_sum > -----------------------+--------------+------------------- > +---------------- > 2007, 29 | 1 | 2 > | 1 > 2007, 30 | 2 | 8 > | 6 > (2 rows) > > > > This will help identify that there were 6 SAFE observations found from > > the 8 inspections on week 30. > > Yeah, I thought so: you're actually looking for the *count* of SAFE > observations, not the sum of the score_id for 'SAFE'. So what you > really want is: > > SELECT TO_CHAR(record_week, 'YYYY, IW') AS formatted_record_week > , count(DISTINCT record_id) AS record_count > , count(DISTINCT observation_id) AS observation_count > , count(DISTINCT safe_observation_id) as safe_observation_count > FROM record_with_week > NATURAL JOIN observation > NATURAL JOIN ( > SELECT record_week > , observation_id as safe_observation_id > FROM record_with_week > NATURAL JOIN observation > NATURAL JOIN score > WHERE score_description = 'SAFE' > ) safe_observation > GROUP BY record_week; > formatted_record_week | record_count | observation_count | > safe_observation_count > -----------------------+--------------+------------------- > +------------------------ > 2007, 29 | 1 | 2 > | 1 > 2007, 30 | 2 | 8 > | 6 > (2 rows) > > Michael Glaesemann > grzm seespotcode net > > > >
On Aug 20, 2007, at 20:27 , Michael Glaesemann wrote: > Note: record_id is in integer, yet you're quoting the value ('1'). > This causes the server to cast the text value to an integer. Here > it's not going to cause much of a problem, just a couple CPU > cycles. In table definitions (and possibly queries?) it can cause > the server to ignore otherwise usable indexes when planning > queries. (I see below your other inserts also quote integer values: > you should drop the quotes there as well.) Some clarification: what I meant by "table definitions" is if you're actually defining text columns that reference integer columns. For example: CREATE TABLE foos (foo_id INTEGER PRIMARY KEY, foo text NOT NULL UNIQUE); CREATE TABLE bars (bar text PRIMARY KEY, foo_id text NOT NULL REFERENCES foos (foo_id)); Michael Glaesemann grzm seespotcode net
On Aug 20, 2007, at 20:33 , novice wrote: > Many many thanks for all the advice =) Glad to help. Good luck! Michael Glaesemann grzm seespotcode net