Thread: Generating a cross tab (pivot table)

Generating a cross tab (pivot table)

From

Christoph Haller

Date:

07 November 2002, 06:47:48

Regarding to Terry's request on multiple aggregates and
Shahbaz's request for generating a cross tab ( pivot table )
in September, I've found an excellent example on a german inet page
http://www.itrain.de/
I've translated it and think it's useful for many who subscribed
(I hope so, maybe it was told before, but I couldn't find
anything about this topic in the techdecs).

Objective:
There is a relation "sales",
holding the sales of different products of different vendors.
The task is to generate a report which shows the sales
of every vendor and every product.

Consider the following table populated with some data:
CREATE TABLE sales (product TEXT,vendor  TEXT,sales   INTEGER
);

INSERT INTO sales VALUES ( 'milk'  , 'mr. pink'  , 12 ) ;
INSERT INTO sales VALUES ( 'milk'  , 'mr. brown' ,  8 ) ;
INSERT INTO sales VALUES ( 'honey' , 'mr. green' ,  2 ) ;
INSERT INTO sales VALUES ( 'milk'  , 'mr. green' , 34 ) ;
INSERT INTO sales VALUES ( 'butter', 'mr. pink'  , 17 ) ;
INSERT INTO sales VALUES ( 'butter', 'mr. brown' ,  2 ) ;
INSERT INTO sales VALUES ( 'honey' , 'mr. pink'  , 19 ) ;

The following query generates the report:
SELECT product,      SUM(CASE vendor WHEN 'mr. pink'  THEN sales ELSE 0 END) AS "mr.
pink ",      SUM(CASE vendor WHEN 'mr. brown' THEN sales ELSE 0 END) AS "mr.
brown",      SUM(CASE vendor WHEN 'mr. green' THEN sales ELSE 0 END) AS "mr.
green",      SUM(sales) AS "sum of sales"
FROM sales GROUP BY product ;
product | mr. pink  | mr. brown | mr. green | sum of sales
---------+-----------+-----------+-----------+--------------butter  |        17 |         2 |         0 |
19honey  |        19 |         0 |         2 |           21milk    |        12 |         8 |        34 |           54
 
(3 rows)

The example is based on MS SQL Server 7.0 and it appears to be
there is a valuable feature called CUBE which completes the report.

SELECT CASE WHEN GROUPING(product) = 1 THEN 'sum of sales' ELSE product
END,      SUM(CASE vendor WHEN 'mr. pink'  THEN sales ELSE 0 END) AS "mr.
pink ",      SUM(CASE vendor WHEN 'mr. brown' THEN sales ELSE 0 END) AS "mr.
brown",      SUM(CASE vendor WHEN 'mr. green' THEN sales ELSE 0 END) AS "mr.
green",      SUM(sales) AS "sum of sales"
FROM sales GROUP BY product WITH CUBE ;
product      | mr. pink  | mr. brown | mr. green | sum of sales
--------------+-----------+-----------+-----------+--------------butter       |        17 |     2 |  0 |       19honey
     |        19 |     0 |  2 |       21milk         |        12 |     8 |        34 |       54sum of sales |        48
|       10 |        36 |           94
 
(4 rows)

I would like to hear from the core team whether they think this feature
is worthy to be implemented, or even better, is there a similar one or
an easy workaround already.

It's obvious this approach is most inflexible.
As soon as there is a new vendor, one has to re-write the query and add
SUM(CASE vendor WHEN 'mr. new' THEN ... ,

In an advanced example it is shown how to deal with cross tabs in
general
using a stored procedure. I am going to translate this and re-write it
for postgres, too (ok, I will try).

Regards, Christoph

Re: Generating a cross tab (pivot table)

From

Richard Huxton

Date:

07 November 2002, 07:30:25

On Thursday 07 Nov 2002 11:47 am, Christoph Haller wrote:
> Regarding to Terry's request on multiple aggregates and
> Shahbaz's request for generating a cross tab ( pivot table )
> in September, I've found an excellent example on a german inet page
> http://www.itrain.de/
> I've translated it and think it's useful for many who subscribed
> (I hope so, maybe it was told before, but I couldn't find
> anything about this topic in the techdecs).

Very useful. Also note there are some examples of how to produce crosstab
results in the table-functions contrib directory in 7.3

--  Richard Huxton

Re: Generating a cross tab (pivot table)

From

Joe Conway

Date:

07 November 2002, 12:42:12

Richard Huxton wrote:
> On Thursday 07 Nov 2002 11:47 am, Christoph Haller wrote:
> 
>>Regarding to Terry's request on multiple aggregates and
>>Shahbaz's request for generating a cross tab ( pivot table )
>>in September, I've found an excellent example on a german inet page
>>http://www.itrain.de/
>>I've translated it and think it's useful for many who subscribed
>>(I hope so, maybe it was told before, but I couldn't find
>>anything about this topic in the techdecs).
> 
> Very useful. Also note there are some examples of how to produce crosstab 
> results in the table-functions contrib directory in 7.3
> 

Just to amplify a bit, in contrib/tablefunc there is a family of functions 
called crosstabN(), where N is 2, 3 and 4. These are meant as examples -- you 
could, for example, create a function crosstab5() if you need it. There is 
also a function called crosstab(), which returns type RECORD and thus requires 
the column definition to be specified in the query. See 
contrib/tablefunc/README.tablefunc for more details and examples.

These were done as relatively crude examples and therefore have some 
limitations which may or may not be a problem for you. If people find the 
functions useful and provide suggestions for improvement in functionality I'll 
try to upgrade them for 7.4.

Thanks,

Joe

Re: Generating a cross tab (pivot table)

From

Ludwig Lim

Date:

08 November 2002, 04:39:35

--- Christoph Haller <ch@rodos.fzk.de> wrote:
> It's obvious this approach is most inflexible.
> As soon as there is a new vendor, one has to
> re-write the query and add
> SUM(CASE vendor WHEN 'mr. new' THEN ... ,
> 

> In an advanced example it is shown how to deal with
> cross tabs in
> general
> using a stored procedure. I am going to translate
> this and re-write it
> for postgres, too (ok, I will try).
I saw something that might somewhat a bit more
flexible solution using SQL. I don't know it it work
in PostgreSQL. I saw it at the MySQL site.   The following is the URL: http://www.mysql.com/articles/wizard/index.html
 Has anyone tried this on a PostgreSQL database ?

regards,
ludwig


__________________________________________________
Do you Yahoo!?
New DSL Internet Access from SBC & Yahoo!
http://sbc.yahoo.com

Re: Generating a cross tab (pivot table)

From

Christoph Haller

Date:

11 November 2002, 10:42:37

> I saw something that might somewhat a bit more
> flexible solution using SQL. I don't know if it works
> in PostgreSQL. I saw it at the MySQL site.
>
>   The following is the URL:
>   http://www.mysql.com/articles/wizard/index.html
>
>   Has anyone tried this on a PostgreSQL database ?

No, not me.
But as far as I can tell the SQL statements can quite easily
be re-written in PostgreSQL:
e. g.
mysql> SELECT location, SUM(IF(gender='M',1,0)) AS M,
SUM(IF(gender='F',1,0)) AS F    -> FROM locations INNER JOIN employees USING (loc_code) GROUP BY
location;
becomes
SELECT location,
SUM(CASE WHEN gender='M' THEN 1 ELSE 0 END) AS "M",
SUM(CASE WHEN gender='F' THEN 1 ELSE 0 END) AS "F",
FROM locations LEFT JOIN employees ON
(locations.loc_code=employees.loc_code)
GROUP BY location;

And this goes for the perl script as well.

Regards, Christoph

Re: Generating a cross tab (pivot table)

From

"Josh Berkus"

Date:

11 November 2002, 13:24:45

Christoph, Lud,

> > I saw something that might somewhat a bit more
> > flexible solution using SQL. I don't know if it works
> > in PostgreSQL. I saw it at the MySQL site.
> >
> >   The following is the URL:
> >   http://www.mysql.com/articles/wizard/index.html
> >
> >   Has anyone tried this on a PostgreSQL database ?

Actually, I'm rather annoyed with the article author.   He first claims
that Joe Celko reccommends expensive add-on software for crosstabs (Joe
does not) and then goes on to use one of Joe's own solutions.  However,
given the author's thouroughness otherwise, I'm sure the innaccuracy is
due to some kind of misunderstanding.

There are, in fact, 3 simple SQL-based solutions to the crosstab
problem.  Which one you use depends on the shape of your data.  I am
not going to cover them in detail here (I'll save that for an article)
but to sum up:

1) The SUM(CASE()) statement method, as outlined in the article, which
is good for crosstabs expecting small numbers of columns.  Or, in the
case of this article, good for RDBMS which do not support subselects.

2) The LEFT JOIN + Sub-Select method, which is good for crosstabs with
lots of columns but not that many rows in the crosstabbed table.

3) The "crosstab reference grid" method, which is good for large tables
and crosstabs with lots of columns, but requires setup and maintainence
by trigger.

Joe Celko covers these 3 types in "SQL for Smarties".     I will write
a PostgreSQL implementation in a later article.

-Josh Berkus