Thread: cross tab confusion

cross tab confusion

From
Matt Johnson
Date:
Folks,

Newbie... So far, I've been building a database with
MySQL and I'm finding it seems to have a few
limitations. (I should say it's probably my lack of
knowledge rather than a failing of MySQl).

I'm now looking at switching to Postgresql simply
because it seems more feature-rich, specifically in
the problem I'm faced with.

I'm building a pupil assessment record. I have these
two tables:

table `pupil`

pupil_id   name
1          jeff
2          fred
3          rita


table `reading_level`

record_id   pupil_id   week   level
1           1          1      6.1
2           2          1      4.3
3           1          2      6.2
4           2          2      4.4


I'd really need to present this data thus:

name   week1   week2   week3
jeff   6.1     6.2
fred   4.3     4.4

I believe it will involve a cross tab query, but I'm
completely lost. The real glory seems to be in writing
a query that will 'automatically' add new weeks as new
columns as more data is added to table
`reading_level`. In other words, the query doesn't
need rewriting after adding records for week 3, 4 etc.
Some query cleverness gets the weeks for itself.

Thanks for any help.

--
Matt Johnson



__________________________________________________
Do You Yahoo!?
Everything you'll ever need on one web page
from News and Sport to Email and Music Charts
http://uk.my.yahoo.com

Re: cross tab confusion

From
Oliver Elphick
Date:
On Sat, 2003-03-08 at 19:51, Matt Johnson wrote:
> Folks,
>
> Newbie... So far, I've been building a database with
> MySQL and I'm finding it seems to have a few
> limitations. (I should say it's probably my lack of
> knowledge rather than a failing of MySQl).

Many have found limitations in MySQL.  I have taken over a project from
someone who didn't discover them in time...

> I'm now looking at switching to Postgresql simply
> because it seems more feature-rich, specifically in
> the problem I'm faced with.
>
> I'm building a pupil assessment record. I have these
> two tables:
>
> table `pupil`
>
> pupil_id   name
> 1          jeff
> 2          fred
> 3          rita
>
>
> table `reading_level`
>
> record_id   pupil_id   week   level
> 1           1          1      6.1
> 2           2          1      4.3
> 3           1          2      6.2
> 4           2          2      4.4


Does that record_id column do anything except give you a unique key?
Unless it relates to something in the real world, it is redundant,
because (pupil_id, week) is the obvious primary key -- assuming you have
a new table for each academic year...

> I'd really need to present this data thus:
>
> name   week1   week2   week3
> jeff   6.1     6.2
> fred   4.3     4.4
>
> I believe it will involve a cross tab query, but I'm
> completely lost. The real glory seems to be in writing
> a query that will 'automatically' add new weeks as new
> columns as more data is added to table
> `reading_level`. In other words, the query doesn't
> need rewriting after adding records for week 3, 4 etc.
> Some query cleverness gets the weeks for itself.

You could create a view like this:

CREATE OR REPLACE VIEW pupil_reading_levels
  (name, week1, week2, week3, week4) AS
   SELECT p.name,
          (SELECT r1.level
             FROM pupil AS p1, reading_level AS r1
            WHERE p1.pupil_id = p.pupil_id AND
                  p1.pupil_id = r1.pupil_id AND
                  r1.week = 1) AS week1,
          (SELECT r2.level
             FROM pupil AS p2, reading_level AS r2
            WHERE p2.pupil_id = p.pupil_id AND
                  p2.pupil_id = r2.pupil_id AND
                  r2.week = 2) AS week2,
          (SELECT r3.level
             FROM pupil AS p3, reading_level AS r3
            WHERE p3.pupil_id = p.pupil_id AND
                  r3.week = 3) AS week3
          (SELECT r4.level
             FROM pupil AS p4, reading_level AS r4
            WHERE p4.pupil_id = p.pupil_id AND
                  r4.week = 4) AS week4
     FROM pupil AS p;

This will need all columns defined in advance, which might be
cumbersome, and will certainly give a lot of empty columns early in the
year.  I can't think of a way to define a view to create columns
dynamically

Alternatively, use this query:

 SELECT p.name, r.week, r.level
   FROM pupil AS p, reading_level AS r
  WHERE p.pupil_id = r.pupil_id
  ORDER BY p.name, r.week;

to put out data into a Perl script.  The script would construct and
display the report, either directly or by loading a temporary table and
running queries on that..

Perl::DBI would be an appropriate tool to use to handle the SQL
connection and queries within your Perl program.

--
Oliver Elphick                                Oliver.Elphick@lfix.co.uk
Isle of Wight, UK                             http://www.lfix.co.uk/oliver
GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839  932A 614D 4C34 3E1D 0C1C
                 ========================================
     "And God is able to make all grace abound to you, so
      that in all things at all times, having all that you
      need, you will abound in every good work."
                                  II Corinthians 9:8


Re: cross tab confusion

From
Matt Johnson
Date:
Thanks for your response.

> >
> > I'm building a pupil assessment record. I have
> these
> > two tables:
> >
> > table `pupil`
> >
> > pupil_id   name
> > 1          jeff
> > 2          fred
> > 3          rita
> >
> >
> > table `reading_level`
> >
> > record_id   pupil_id   week   level
> > 1           1          1      6.1
> > 2           2          1      4.3
> > 3           1          2      6.2
> > 4           2          2      4.4
>
>
> Does that record_id column do anything except give
> you a unique key?

Yes, I see what you're saying. It is, as you suggest,
redundant.

> Unless it relates to something in the real world, it
> is redundant,
> because (pupil_id, week) is the obvious primary key
> -- assuming you have
> a new table for each academic year...

I'd probably not want to have a new table for each
year would I? That would mean this database could only
continue to run while this school employed me! Not
many primary school teachers are database admins. I'd
want this to stand the test of time? Is that possible?
Is it too ambitious to expect this to run once it's
set up? Run and run?

> > I'd really need to present this data thus:
> >
> > name   week1   week2   week3
> > jeff   6.1     6.2
> > fred   4.3     4.4
> >

I came up with this (and I was proud! I only installed
this last week)...

The static query:

SELECT distinct test.name,a1.level AS l1,a2.level AS
l2,a3.level AS l3, a4.level AS l4
FROM `test`
LEFT JOIN `test` AS `a1` ON a1.date = 1 AND a1.name =
test.name
LEFT JOIN `test` AS `a2` ON a2.date = 2 AND a2.name =
test.name
LEFT JOIN `test` AS `a3` ON a3.date = 3 AND a3.name =
test.name
LEFT JOIN `test` AS `a4` ON a4.date = 4 AND a4.name =
test.name

It did the job. But would need altering when new data
was added.

Then, I used Zope to make it dynamic, so it
incorporates new weeks' data:

SELECT distinct `test`.`name`,<dtml-in
test_get_date>`week<dtml-var date>`.`level` AS
`week<dtml-var date>`,</dtml-in>
`test`.`name` AS `lose_comma` FROM `test`

<dtml-in test_get_date>
LEFT JOIN `test` AS `week<dtml-var date>`
ON `week<dtml-var date>`.`date` = <dtml-var date>
AND `week<dtml-var date>`.`name` = test.name
</dtml-in>

Please point out any stark errors in my thinking. This
won't be a huge app (but then these things grow I
guess so it's best to get these things right).

--
Matt Johnson

__________________________________________________
Do You Yahoo!?
Everything you'll ever need on one web page
from News and Sport to Email and Music Charts
http://uk.my.yahoo.com

Re: cross tab confusion

From
Joe Conway
Date:
Matt Johnson wrote:
> Please point out any stark errors in my thinking. This
> won't be a huge app (but then these things grow I
> guess so it's best to get these things right).
>

There is no way to completely automate this just using SQL, but I think
you're on track with using an external language like python to automate
the writing of your query.

As an alternative, you could try the contrib/tablefunc function
crosstab. I've posted a new version that includes an improved crosstab
function (see http://www.joeconway.com/ and look for the link
"contrib/tablefunc with hashed crosstab"). Here's what the solution
might look like for a 20 week semester (the example is long, but only
the last sql statement needs to be run each week):

create table pupil(pupil_id integer, name text);
insert into pupil values(1,'jeff');
insert into pupil values(2,'fred');
insert into pupil values(3,'rita');

create table reading_level(record_id integer, pupil_id integer, week
integer, level float8);
insert into reading_level values(1,1,1,6.1);
insert into reading_level values(2,2,1,4.3);
insert into reading_level values(3,1,2,6.2);
insert into reading_level values(4,2,2,4.4);

create or replace function weeks() returns setof integer as'
declare
   weeknum integer;
begin
   for weeknum in 1..20 loop
     return next weeknum;
   end loop;
   return;
end
' language 'plpgsql';

CREATE TYPE weekly_report_tup AS (
   name text,
   w1 float8,
   w2 float8,
   w3 float8,
   w4 float8,
   w5 float8,
   w6 float8,
   w7 float8,
   w8 float8,
   w9 float8,
   w10 float8,
   w11 float8,
   w12 float8,
   w13 float8,
   w14 float8,
   w15 float8,
   w16 float8,
   w17 float8,
   w18 float8,
   w19 float8,
   w20 float8
);

CREATE OR REPLACE FUNCTION weekly_report(text,text)
RETURNS setof weekly_report_tup
AS '$libdir/tablefunc','crosstab_hash'
LANGUAGE 'C' STABLE STRICT;

select * from weekly_report(
   'select p.name, r.week, r.level from pupil p, reading_level r
    where r.pupil_id = p.pupil_id',
   'select * from weeks()'
);
  name | w1  | w2  | w3 | w4 | w5 | w6 | w7 | w8 | w9 | w10 | w11 | w12 |
------+-----+-----+----+----+----+----+----+----+----+-----+-----+-----+
  jeff | 6.1 | 6.2 |    |    |    |    |    |    |    |     |     |
|
  fred | 4.3 | 4.4 |    |    |    |    |    |    |    |     |     |     |
(2 rows)

I've trimmed the output at w12 for clarity in the email, but 13-20 would
be there as well. You could easily extend this to 26 or 52 weeks if you
wanted. Now as more data is added...

insert into reading_level values(5,1,3,6.3);
insert into reading_level values(6,2,3,4.5);
insert into reading_level values(7,3,3,7.2);
select * from weekly_report(
   'select p.name, r.week, r.level from pupil p, reading_level r
    where r.pupil_id = p.pupil_id',
   'select * from weeks()'
);
  name | w1  | w2  | w3  | w4 | w5 | w6 | w7 | w8 | w9 | w10 | w11 | w12
------+-----+-----+-----+----+----+----+----+----+----+-----+-----+-----
  jeff | 6.1 | 6.2 | 6.3 |    |    |    |    |    |    |     |     |
  fred | 4.3 | 4.4 | 4.5 |    |    |    |    |    |    |     |     |
  rita |     |     | 7.2 |    |    |    |    |    |    |     |     |
(3 rows)

...you just run the query.

HTH,

Joe