Thread: Crosstab-style query in pure SQL
TIA all I have a table containing milestones achieved for projects, marked as being in particular quarters. CREATE TABLE milestones ( proj_id int4, sortorder SERIAL, qtr int4, description varchar(200) ); Now I need the milestone descriptions output across the page like: proj_id | q1 | q2 | q3 | q4 and sorted according to "sortorder". Judicious use of CASE can get me the quarterly columns but I need to eliminate holes. Any clever ideas without resorting to procedural solutions (either table function or application code)? TIA - Richard Huxton
Richard, > I have a table containing milestones achieved for projects, marked as > being in particular quarters. > > CREATE TABLE milestones ( > proj_id int4, > sortorder SERIAL, > qtr int4, > description varchar(200) > ); > > Now I need the milestone descriptions output across the page like: > proj_id | q1 | q2 | q3 | q4 > and sorted according to "sortorder". Ah! A classic SQL problem. Take a look at Joe Celko's "SQL for Smarties": he does a good job of defining and discussing the three different solutions to the "Crosstab Query" problem. A second method you can use is the subselect method: SELECT proj_id, sortorder, qart1.q1, quart2.q2 .... FROM milestones LEFT OUTER JOIN ( SELECT proj_id, description as q1 FROM milestones WHERE qtr = 1) quart1 ON quart1.proj_id= milestones.proj_id LEFT OUTER JOIN ( SELECT proj_id, description q2 ... However, the above is generally chosen over the CASE statement method when the crosstab involves multiple tables; in your case, it is not a performance or similicity gain. There is a third method which involves building a grid of values: ref qtr1 qtr2 qtr3 qtr4 qtr1 1 0 0 0 qtr2 0 1 0 0 qtr3 0 0 1 0 qtr4 0 0 0 1 This method is the only one of the three that prevents you from having to build your query using procedural logic for a dynamic set of parameters. However, the above is set up for numeric values (i.e. one multiplies the grid number by the value to add into calculations) and will not work for your varchar-returning query. You could, however, constuct the grid with NULLs and '' strings and use that to modify the value; I leave that to your ingenuity. -Josh Berkus
> Richard, > >> I have a table containing milestones achieved for projects, marked as >> being in particular quarters. >> >> CREATE TABLE milestones ( >> proj_id int4, >> sortorder SERIAL, >> qtr int4, >> description varchar(200) >> ); >> >> Now I need the milestone descriptions output across the page like: >> proj_id | q1 | q2 | q3 | q4 >> and sorted according to "sortorder". > > Ah! A classic SQL problem. > > Take a look at Joe Celko's "SQL for Smarties": he does a good job of > defining and discussing the three different solutions to the "Crosstab > Query" problem. > > A second method you can use is the subselect method: > > SELECT proj_id, sortorder, qart1.q1, quart2.q2 .... > FROM milestones > LEFT OUTER JOIN ( SELECT proj_id, description as q1 > FROM milestones WHERE qtr = 1) quart1 > ON quart1.proj_id = milestones.proj_id > LEFT OUTER JOIN ( SELECT proj_id, description q2 ... > > However, the above is generally chosen over the CASE statement method > when the crosstab involves multiple tables; in your case, it is not a > performance or similicity gain. Thanks Josh, but that still leaves me with nulls if I join on sortorder too, and duplicates if not (as you imply, since it's equivalent to the CASE option). The problem is that since I defined "sortorder" as a serial, equivalent rows of a specific project don't match across the key. Without calculating a "row_index" based on (proj_id,sortorder) it doesn't look like there's anything to be done without procedural help. Didn't think there was anything simple - my fault for not having common keys to match up output rows - oversimplified the input stage and I'm paying for it on output. - Richard Huxton