Thread: Crosstab-style query in pure SQL

Crosstab-style query in pure SQL

From
dev@archonet.com
Date:
TIA all

I have a table containing milestones achieved for projects, marked as
being in particular quarters.

CREATE TABLE milestones (   proj_id     int4,   sortorder   SERIAL,   qtr         int4,   description varchar(200)
);

Now I need the milestone descriptions output across the page like: proj_id | q1 | q2 | q3 | q4
and sorted according to "sortorder".

Judicious use of CASE can get me the quarterly columns but I need to
eliminate holes. Any clever ideas without resorting to procedural
solutions (either table function or application code)?

TIA

- Richard Huxton


Re: Crosstab-style query in pure SQL

From
"Josh Berkus"
Date:
Richard,

> I have a table containing milestones achieved for projects, marked as
> being in particular quarters.
> 
> CREATE TABLE milestones (
>     proj_id     int4,
>     sortorder   SERIAL,
>     qtr         int4,
>     description varchar(200)
> );
> 
> Now I need the milestone descriptions output across the page like:
>   proj_id | q1 | q2 | q3 | q4
> and sorted according to "sortorder".

Ah!   A classic SQL problem.

Take a look at Joe Celko's "SQL for Smarties": he does a good job of
defining and discussing the three different solutions to the "Crosstab
Query" problem.

A second method you can use is the subselect method:

SELECT  proj_id, sortorder, qart1.q1, quart2.q2 ....
FROM milestones
LEFT OUTER JOIN ( SELECT proj_id, description as q1          FROM milestones WHERE qtr = 1) quart1        ON
quart1.proj_id= milestones.proj_id
 
LEFT OUTER JOIN ( SELECT proj_id, description q2 ... 

However, the above is generally chosen over the CASE statement method
when the crosstab involves multiple tables; in your case, it is not a
performance or similicity gain.

There is a third method which involves building a grid of values:

ref    qtr1    qtr2    qtr3    qtr4
qtr1    1    0    0    0
qtr2    0    1    0    0
qtr3    0    0    1    0
qtr4    0    0    0    1

This method is the only one of the three that prevents you from having
to build your query using procedural logic for a dynamic set of
parameters.   However, the above is set up for numeric values (i.e. one
multiplies the grid number by the value to add into calculations) and
will not work for your varchar-returning query.

You could, however, constuct the grid with NULLs and '' strings and use
that to modify the value; I leave that to your ingenuity.

-Josh Berkus








Re: Crosstab-style query in pure SQL

From
dev@archonet.com
Date:
> Richard,
>
>> I have a table containing milestones achieved for projects, marked as
>> being in particular quarters.
>>
>> CREATE TABLE milestones (
>>     proj_id     int4,
>>     sortorder   SERIAL,
>>     qtr         int4,
>>     description varchar(200)
>> );
>>
>> Now I need the milestone descriptions output across the page like:
>>   proj_id | q1 | q2 | q3 | q4
>> and sorted according to "sortorder".
>
> Ah!   A classic SQL problem.
>
> Take a look at Joe Celko's "SQL for Smarties": he does a good job of
> defining and discussing the three different solutions to the "Crosstab
> Query" problem.
>
> A second method you can use is the subselect method:
>
> SELECT  proj_id, sortorder, qart1.q1, quart2.q2 ....
> FROM milestones
> LEFT OUTER JOIN ( SELECT proj_id, description as q1
>            FROM milestones WHERE qtr = 1) quart1
>          ON quart1.proj_id = milestones.proj_id
> LEFT OUTER JOIN ( SELECT proj_id, description q2 ...
>
> However, the above is generally chosen over the CASE statement method
> when the crosstab involves multiple tables; in your case, it is not a
> performance or similicity gain.

Thanks Josh, but that still leaves me with nulls if I join on sortorder
too, and duplicates if not (as you imply, since it's equivalent to the
CASE option). The problem is that since I defined "sortorder" as a serial,
equivalent rows of a specific project don't match across the key.

Without calculating a "row_index" based on (proj_id,sortorder) it doesn't
look like there's anything to be done without procedural help. Didn't
think there was anything simple - my fault for not having common keys to
match up output rows - oversimplified the input stage and I'm paying for
it on output.

- Richard Huxton