Generating a cross tab II (pivot table) - Mailing list pgsql-sql

From Christoph Haller
Subject Generating a cross tab II (pivot table)
Date
Msg-id 3DCFD301.58756E39@rodos.fzk.de
Whole thread Raw
List pgsql-sql

Based on the e-mails on "Generating a cross tab (pivot table)",
I can give you a PLpgSQL procedure to automatically generate a
cross tab from any relation now.
It's my first steps in PLpgSQL. I am pretty sure this is not the
best way to implement, but I wanted to get some experience, so I
did it this way.

For all, who missed it last week, again the objective:
There is a relation "sales",
holding the sales of different products of different vendors.
The task is to generate a report which shows the sales
of every vendor and every product.

Consider the following table populated with some data:
CREATE TABLE sales (
 product TEXT,
 vendor  TEXT,
 sales   INTEGER
);

INSERT INTO sales VALUES ( 'milk'  , 'mr. pink'  , 12 ) ;
INSERT INTO sales VALUES ( 'milk'  , 'mr. brown' ,  8 ) ;
INSERT INTO sales VALUES ( 'honey' , 'mr. green' ,  2 ) ;
INSERT INTO sales VALUES ( 'milk'  , 'mr. green' , 34 ) ;
INSERT INTO sales VALUES ( 'butter', 'mr. pink'  , 17 ) ;
INSERT INTO sales VALUES ( 'butter', 'mr. brown' ,  2 ) ;
INSERT INTO sales VALUES ( 'honey' , 'mr. pink'  , 19 ) ;
The following query generates the report:
CREATE VIEW sales_report AS
SELECT product,
       SUM(CASE vendor WHEN 'mr. pink'  THEN sales ELSE 0 END) AS "mr. pink ",
       SUM(CASE vendor WHEN 'mr. brown' THEN sales ELSE 0 END) AS "mr. brown",
       SUM(CASE vendor WHEN 'mr. green' THEN sales ELSE 0 END) AS "mr. green",
       SUM(sales) AS "sum of sales"
FROM sales GROUP BY product ;
SELECT * FROM sales_report ;

 product | mr. pink  | mr. brown | mr. green | sum of sales
---------+-----------+-----------+-----------+--------------
 butter  |        17 |         2 |         0 |           19
 honey   |        19 |         0 |         2 |           21
 milk    |        12 |         8 |        34 |           54
(3 rows)
It's obvious this approach is most inflexible.
As soon as there is a new vendor, one has to re-write the query and add
SUM(CASE vendor WHEN 'mr. new' THEN ... ,

So what we need is a tool to automatically adapt the view to new vendors
resp. new products.
Here it is (choosing good mnemonics is not my favourite discipline):

CREATE OR REPLACE FUNCTION
create_pivot_report(TEXT,TEXT,TEXT,TEXT,TEXT,TEXT) RETURNS INTEGER AS '
DECLARE
 pg_views_rtype pg_views%ROWTYPE;
 vname_param    ALIAS FOR $1;
 pivot_column   ALIAS FOR $2;
 select_column  ALIAS FOR $3;
 pivot_table    ALIAS FOR $4;
 aggregate_func ALIAS FOR $5;
 aggr_column    ALIAS FOR $6;
 pivot_record   RECORD;
 create_view    TEXT;
BEGIN

SELECT INTO pg_views_rtype * FROM pg_views WHERE viewname = vname_param;
IF FOUND THEN
  EXECUTE ''DROP VIEW '' || quote_ident(vname_param) ;
END IF;
create_view :=
 ''CREATE VIEW '' || quote_ident(vname_param) ||
 '' AS SELECT '' || quote_ident(select_column) ;
FOR pivot_record IN
EXECUTE ''SELECT DISTINCT CAST(''
        || quote_ident(pivot_column)
        || '' AS TEXT) AS col1 FROM ''
        || quote_ident(pivot_table)
    || '' order by '' || quote_ident(pivot_column)
LOOP
  create_view :=
   create_view || '','' || aggregate_func ||
   ''(CASE '' || quote_ident(pivot_column) ||
   '' WHEN '' || quote_literal(pivot_record.col1) ||
   '' THEN '' || quote_ident(aggr_column) ||
   '' ELSE 0 END) AS "'' || pivot_record.col1 || ''"'' ;
END LOOP;
create_view :=
 create_view || '','' || aggregate_func ||
 ''('' || quote_ident(aggr_column) || '') AS "'' || aggregate_func ||
 '' of '' || aggr_column || ''" FROM '' || quote_ident(pivot_table) ||
 '' GROUP BY '' || quote_ident(select_column);
EXECUTE create_view ;

RETURN 0;
END;
' LANGUAGE 'plpgsql' ;

 -- where
 -- vname_param    ALIAS FOR $1; -- the view's name to create
 -- pivot_column   ALIAS FOR $2; -- the pivot column (entries to be CASEd)
 -- select_column  ALIAS FOR $3; -- the select column (entries to be grouped)
 -- pivot_table    ALIAS FOR $4; -- the name of the table to work on
 -- aggregate_func ALIAS FOR $5; -- the name of the aggregate function
 -- aggr_column    ALIAS FOR $6; -- the aggregate column (entries to be aggregated)

First try:
SELECT create_pivot_report
('sales_report2','vendor','product','sales','sum','sales');
SELECT * FROM sales_report2 ;
gives you 'sales_report2' as a copy of 'sales_report'.

Now add another data set:
INSERT INTO sales VALUES ( 'butter', 'mr. blue'  , 11 ) ;
Re-write the view by:
SELECT create_pivot_report
('sales_report2','vendor','product','sales','sum','sales');
And here we go
SELECT * FROM sales_report2 ;
 product | mr. blue | mr. brown | mr. green | mr. pink | sum of sales
---------+----------+-----------+-----------+----------+--------------
 butter  |       11 |         2 |         0 |       17 |           30
 honey   |        0 |         0 |         2 |       19 |           21
 milk    |        0 |         8 |        34 |       12 |           54
(3 rows)

More examples:
SELECT create_pivot_report
('sales_report3','vendor','product','sales','avg','sales');
SELECT create_pivot_report
('sales_report4','vendor','product','sales','stddev','sales');
SELECT create_pivot_report
('sales_report5','product','vendor','sales','sum','sales');
SELECT create_pivot_report
('sales_report6','product','vendor','sales','max','sales');
SELECT create_pivot_report
('sales_report7','vendor','product','sales','max','sales');

As you can see even interchanging the pivot column and the select column
works. Feel free to use the code.

Regards, Christoph

PS
I'm using PostgreSQL 7.2.1 on hppa-hp-hpux10.20, compiled by GCC 2.95.2
and I've noticed an unpleasant behaviour of the PLpgSQL parser.
Double dash -- comments before the first statement in the statement section
lead to strange parser errors. Is this intended?



pgsql-sql by date:

Previous
From: Christoph Haller
Date:
Subject: Re: Generating a cross tab (pivot table)
Next
From: Robert Treat
Date:
Subject: Re: Permission on insert rules