Thread: cross tab confusion
Folks, Newbie... So far, I've been building a database with MySQL and I'm finding it seems to have a few limitations. (I should say it's probably my lack of knowledge rather than a failing of MySQl). I'm now looking at switching to Postgresql simply because it seems more feature-rich, specifically in the problem I'm faced with. I'm building a pupil assessment record. I have these two tables: table `pupil` pupil_id name 1 jeff 2 fred 3 rita table `reading_level` record_id pupil_id week level 1 1 1 6.1 2 2 1 4.3 3 1 2 6.2 4 2 2 4.4 I'd really need to present this data thus: name week1 week2 week3 jeff 6.1 6.2 fred 4.3 4.4 I believe it will involve a cross tab query, but I'm completely lost. The real glory seems to be in writing a query that will 'automatically' add new weeks as new columns as more data is added to table `reading_level`. In other words, the query doesn't need rewriting after adding records for week 3, 4 etc. Some query cleverness gets the weeks for itself. Thanks for any help. -- Matt Johnson __________________________________________________ Do You Yahoo!? Everything you'll ever need on one web page from News and Sport to Email and Music Charts http://uk.my.yahoo.com
On Sat, 2003-03-08 at 19:51, Matt Johnson wrote: > Folks, > > Newbie... So far, I've been building a database with > MySQL and I'm finding it seems to have a few > limitations. (I should say it's probably my lack of > knowledge rather than a failing of MySQl). Many have found limitations in MySQL. I have taken over a project from someone who didn't discover them in time... > I'm now looking at switching to Postgresql simply > because it seems more feature-rich, specifically in > the problem I'm faced with. > > I'm building a pupil assessment record. I have these > two tables: > > table `pupil` > > pupil_id name > 1 jeff > 2 fred > 3 rita > > > table `reading_level` > > record_id pupil_id week level > 1 1 1 6.1 > 2 2 1 4.3 > 3 1 2 6.2 > 4 2 2 4.4 Does that record_id column do anything except give you a unique key? Unless it relates to something in the real world, it is redundant, because (pupil_id, week) is the obvious primary key -- assuming you have a new table for each academic year... > I'd really need to present this data thus: > > name week1 week2 week3 > jeff 6.1 6.2 > fred 4.3 4.4 > > I believe it will involve a cross tab query, but I'm > completely lost. The real glory seems to be in writing > a query that will 'automatically' add new weeks as new > columns as more data is added to table > `reading_level`. In other words, the query doesn't > need rewriting after adding records for week 3, 4 etc. > Some query cleverness gets the weeks for itself. You could create a view like this: CREATE OR REPLACE VIEW pupil_reading_levels (name, week1, week2, week3, week4) AS SELECT p.name, (SELECT r1.level FROM pupil AS p1, reading_level AS r1 WHERE p1.pupil_id = p.pupil_id AND p1.pupil_id = r1.pupil_id AND r1.week = 1) AS week1, (SELECT r2.level FROM pupil AS p2, reading_level AS r2 WHERE p2.pupil_id = p.pupil_id AND p2.pupil_id = r2.pupil_id AND r2.week = 2) AS week2, (SELECT r3.level FROM pupil AS p3, reading_level AS r3 WHERE p3.pupil_id = p.pupil_id AND r3.week = 3) AS week3 (SELECT r4.level FROM pupil AS p4, reading_level AS r4 WHERE p4.pupil_id = p.pupil_id AND r4.week = 4) AS week4 FROM pupil AS p; This will need all columns defined in advance, which might be cumbersome, and will certainly give a lot of empty columns early in the year. I can't think of a way to define a view to create columns dynamically Alternatively, use this query: SELECT p.name, r.week, r.level FROM pupil AS p, reading_level AS r WHERE p.pupil_id = r.pupil_id ORDER BY p.name, r.week; to put out data into a Perl script. The script would construct and display the report, either directly or by loading a temporary table and running queries on that.. Perl::DBI would be an appropriate tool to use to handle the SQL connection and queries within your Perl program. -- Oliver Elphick Oliver.Elphick@lfix.co.uk Isle of Wight, UK http://www.lfix.co.uk/oliver GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839 932A 614D 4C34 3E1D 0C1C ======================================== "And God is able to make all grace abound to you, so that in all things at all times, having all that you need, you will abound in every good work." II Corinthians 9:8
Thanks for your response. > > > > I'm building a pupil assessment record. I have > these > > two tables: > > > > table `pupil` > > > > pupil_id name > > 1 jeff > > 2 fred > > 3 rita > > > > > > table `reading_level` > > > > record_id pupil_id week level > > 1 1 1 6.1 > > 2 2 1 4.3 > > 3 1 2 6.2 > > 4 2 2 4.4 > > > Does that record_id column do anything except give > you a unique key? Yes, I see what you're saying. It is, as you suggest, redundant. > Unless it relates to something in the real world, it > is redundant, > because (pupil_id, week) is the obvious primary key > -- assuming you have > a new table for each academic year... I'd probably not want to have a new table for each year would I? That would mean this database could only continue to run while this school employed me! Not many primary school teachers are database admins. I'd want this to stand the test of time? Is that possible? Is it too ambitious to expect this to run once it's set up? Run and run? > > I'd really need to present this data thus: > > > > name week1 week2 week3 > > jeff 6.1 6.2 > > fred 4.3 4.4 > > I came up with this (and I was proud! I only installed this last week)... The static query: SELECT distinct test.name,a1.level AS l1,a2.level AS l2,a3.level AS l3, a4.level AS l4 FROM `test` LEFT JOIN `test` AS `a1` ON a1.date = 1 AND a1.name = test.name LEFT JOIN `test` AS `a2` ON a2.date = 2 AND a2.name = test.name LEFT JOIN `test` AS `a3` ON a3.date = 3 AND a3.name = test.name LEFT JOIN `test` AS `a4` ON a4.date = 4 AND a4.name = test.name It did the job. But would need altering when new data was added. Then, I used Zope to make it dynamic, so it incorporates new weeks' data: SELECT distinct `test`.`name`,<dtml-in test_get_date>`week<dtml-var date>`.`level` AS `week<dtml-var date>`,</dtml-in> `test`.`name` AS `lose_comma` FROM `test` <dtml-in test_get_date> LEFT JOIN `test` AS `week<dtml-var date>` ON `week<dtml-var date>`.`date` = <dtml-var date> AND `week<dtml-var date>`.`name` = test.name </dtml-in> Please point out any stark errors in my thinking. This won't be a huge app (but then these things grow I guess so it's best to get these things right). -- Matt Johnson __________________________________________________ Do You Yahoo!? Everything you'll ever need on one web page from News and Sport to Email and Music Charts http://uk.my.yahoo.com
Matt Johnson wrote: > Please point out any stark errors in my thinking. This > won't be a huge app (but then these things grow I > guess so it's best to get these things right). > There is no way to completely automate this just using SQL, but I think you're on track with using an external language like python to automate the writing of your query. As an alternative, you could try the contrib/tablefunc function crosstab. I've posted a new version that includes an improved crosstab function (see http://www.joeconway.com/ and look for the link "contrib/tablefunc with hashed crosstab"). Here's what the solution might look like for a 20 week semester (the example is long, but only the last sql statement needs to be run each week): create table pupil(pupil_id integer, name text); insert into pupil values(1,'jeff'); insert into pupil values(2,'fred'); insert into pupil values(3,'rita'); create table reading_level(record_id integer, pupil_id integer, week integer, level float8); insert into reading_level values(1,1,1,6.1); insert into reading_level values(2,2,1,4.3); insert into reading_level values(3,1,2,6.2); insert into reading_level values(4,2,2,4.4); create or replace function weeks() returns setof integer as' declare weeknum integer; begin for weeknum in 1..20 loop return next weeknum; end loop; return; end ' language 'plpgsql'; CREATE TYPE weekly_report_tup AS ( name text, w1 float8, w2 float8, w3 float8, w4 float8, w5 float8, w6 float8, w7 float8, w8 float8, w9 float8, w10 float8, w11 float8, w12 float8, w13 float8, w14 float8, w15 float8, w16 float8, w17 float8, w18 float8, w19 float8, w20 float8 ); CREATE OR REPLACE FUNCTION weekly_report(text,text) RETURNS setof weekly_report_tup AS '$libdir/tablefunc','crosstab_hash' LANGUAGE 'C' STABLE STRICT; select * from weekly_report( 'select p.name, r.week, r.level from pupil p, reading_level r where r.pupil_id = p.pupil_id', 'select * from weeks()' ); name | w1 | w2 | w3 | w4 | w5 | w6 | w7 | w8 | w9 | w10 | w11 | w12 | ------+-----+-----+----+----+----+----+----+----+----+-----+-----+-----+ jeff | 6.1 | 6.2 | | | | | | | | | | | fred | 4.3 | 4.4 | | | | | | | | | | | (2 rows) I've trimmed the output at w12 for clarity in the email, but 13-20 would be there as well. You could easily extend this to 26 or 52 weeks if you wanted. Now as more data is added... insert into reading_level values(5,1,3,6.3); insert into reading_level values(6,2,3,4.5); insert into reading_level values(7,3,3,7.2); select * from weekly_report( 'select p.name, r.week, r.level from pupil p, reading_level r where r.pupil_id = p.pupil_id', 'select * from weeks()' ); name | w1 | w2 | w3 | w4 | w5 | w6 | w7 | w8 | w9 | w10 | w11 | w12 ------+-----+-----+-----+----+----+----+----+----+----+-----+-----+----- jeff | 6.1 | 6.2 | 6.3 | | | | | | | | | fred | 4.3 | 4.4 | 4.5 | | | | | | | | | rita | | | 7.2 | | | | | | | | | (3 rows) ...you just run the query. HTH, Joe