Re: cross tables, SELECT expressions, and GROUP BY problem - Mailing list pgsql-sql

From Ross Johnson
Subject Re: cross tables, SELECT expressions, and GROUP BY problem
Date
Msg-id 1143980127.8841.254.camel@desk.home
Whole thread Raw
In response to Re: cross tables, SELECT expressions, and GROUP BY problem  (Ross Johnson <ross.johnson@homemail.com.au>)
List pgsql-sql
Following up my own question again, I've realised my error and solved my
problem - in the interests of completing this thread, the working query
(two versions) can be found below...

On Sun, 2006-04-02 at 18:00 +1000, Ross Johnson wrote:
> On Sun, 2006-04-02 at 16:43 +1000, Ross Johnson wrote:
> > Hi,
> > 
> > I'm relatively stretched when it comes to SQL but I'm getting there. I
> > have a question though:
> > 
> > Thanks to various web sites I've succeeded in creating simple cross
> > tables in PostgreSQL. I'm now trying more complicated examples
> > (migrating queries from MS Access to PostgreSQL). I'm stuck on getting
> > grouping to work where the selection element isn't a real table field,
> > where it is generated by a CASE statement. Here's a full actual SELECT
> > statement, that works but isn't quite what I need, followed by the first
> > few rows of output:
> > 
> > SELECT DISTINCT
> >     CASE
> >         WHEN lower(c."Order") = 'coleoptera' THEN 5
> >         WHEN lower(c."Order") = 'trichoptera' THEN 8
> >         WHEN lower(c."Order") = 'ephemeroptera' THEN 6
> >         WHEN lower(c."Order") = 'plecoptera' THEN 7
> >         WHEN lower(c."Class") = 'oligochaeta' THEN 1
> >         WHEN lower(c."Family") LIKE 'chiron%' THEN 2
> >         WHEN lower(c."Order") = 'diptera' AND lower(c."Family") NOT LIKE 'chiron%' THEN 3
> >         ELSE 4
> >     END AS "Ranking",
> >     CASE
> >         WHEN lower(c."Order") = 'coleoptera' THEN 'Coleoptera'
> >         WHEN lower(c."Order") = 'trichoptera' THEN 'Trichoptera'
> >         WHEN lower(c."Order") = 'ephemeroptera' THEN 'Ephemeroptera'
> >         WHEN lower(c."Order") = 'plecoptera' THEN 'Plecoptera'
> >         WHEN lower(c."Class") = 'oligochaeta' THEN 'Oligochaeta'
> >         WHEN lower(c."Family") LIKE 'chiron%' THEN 'Chironomidae'
> >         WHEN lower(c."Order") = 'diptera' AND lower(c."Family") NOT LIKE 'chiron%' THEN 'Diptera (Other)'
> >         ELSE 'Other'
> >     END AS "Taxa",
> >     SUM(CASE WHEN b."LocationCode" = '2222011' THEN c."Count" END) AS "2222011",
> >     SUM(CASE WHEN b."LocationCode" = '2222012' THEN c."Count" END) AS "2222012",
> >     SUM(CASE WHEN b."LocationCode" = '2222013' THEN c."Count" END) AS "2222013",
> >     SUM(CASE WHEN b."LocationCode" = '2222014' THEN c."Count" END) AS "2222014"
> > FROM "tblBugIDSheetInfo" b
> >     INNER JOIN "tblBugCount" c USING ("BugSheetID")
> > GROUP BY c."Order", c."Class", c."Family"
> > ORDER BY "Ranking"
> > 
> > Generates the following output:
> > 
> > Ranking    Taxa            2222011    2222012    2222013    2222014
> > ---------------------------------------------------------------
> > 1    "Oligochaeta"        487    1711    1759    1078
> > 1    "Oligochaeta"        7            
> > 1    "Oligochaeta"                    
> > 2    "Chironomidae"        1385    2335    1500    1513
> > 2    "Chironomidae"                    
> > 3    "Diptera (Other)"                5
> > 3    "Diptera (Other)"        1    1    3
> > 3    "Diptera (Other)"    199    19    40    37
> > 3    "Diptera (Other)"                
> > ...
> 
> I should add that I've just tried the following query, which is just a
> rearrangement of the above query using a sub-SELECT, and the result is
> the same as above apart from a slightly different ordering of the rows.
> That is, GROUP BY "Ranking" still doesn't appear to do anything. I was
> under the impression that a sub-SELECT creates a temporary, or at least
> a pseudo-temporary table (a "virtual" table as it's called in the
> documentation), so I would have expected this query to work even if the
> above one doesn't.
> 
> 
> SELECT DISTINCT
>     "Ranking", "Taxa",
>     SUM(CASE WHEN b."LocationCode" = '2222011' THEN c."Count" END) AS "2222011",
>     SUM(CASE WHEN b."LocationCode" = '2222012' THEN c."Count" END) AS "2222012",
>     SUM(CASE WHEN b."LocationCode" = '2222013' THEN c."Count" END) AS "2222013",
>     SUM(CASE WHEN b."LocationCode" = '2222014' THEN c."Count" END) AS "2222014"
> FROM "tblBugIDSheetInfo" b
>     INNER JOIN (
>         SELECT "BugSheetID",
>         CASE
>             WHEN lower("Order") = 'coleoptera' THEN 5
>             WHEN lower("Order") = 'trichoptera' THEN 8
>             WHEN lower("Order") = 'ephemeroptera' THEN 6
>             WHEN lower("Order") = 'plecoptera' THEN 7
>             WHEN lower("Class") = 'oligochaeta' THEN 1
>             WHEN lower("Family") LIKE 'chiron%' THEN 2
>             WHEN lower("Order") = 'diptera' AND lower("Family") NOT LIKE 'chiron%' THEN 3
>             ELSE 4
>         END AS "Ranking",
>         CASE
>             WHEN lower("Order") = 'coleoptera' THEN 'Coleoptera'
>             WHEN lower("Order") = 'trichoptera' THEN 'Trichoptera'
>             WHEN lower("Order") = 'ephemeroptera' THEN 'Ephemeroptera'
>             WHEN lower("Order") = 'plecoptera' THEN 'Plecoptera'
>             WHEN lower("Class") = 'oligochaeta' THEN 'Oligochaeta'
>             WHEN lower("Family") LIKE 'chiron%' THEN 'Chironomidae'
>             WHEN lower("Order") = 'diptera' AND lower("Family") NOT LIKE 'chiron%' THEN 'Diptera (Other)'
>             ELSE 'Other'
>         END AS "Taxa",
>         "Order", "Class", "Family", "Count"
>         FROM "tblBugCount") c USING ("BugSheetID")
> GROUP BY c."Ranking", c."Taxa", c."Order", c."Class", c."Family"
> ORDER BY "Ranking"
> 

The problem was I was trying to do too much in one GROUP BY clause.
After getting the right result using a temporary table I backtracked and
found that the following version gives me the result I'm looking for.

SELECT DISTINCT"Ranking", "Taxa",SUM(CASE WHEN b."LocationCode" = '2222011' THEN c."Count" END) AS "2222011",SUM(CASE
WHENb."LocationCode" = '2222012' THEN c."Count" END) AS "2222012",SUM(CASE WHEN b."LocationCode" = '2222013' THEN
c."Count"END) AS "2222013",SUM(CASE WHEN b."LocationCode" = '2222014' THEN c."Count" END) AS "2222014"
 
FROM "tblBugIDSheetInfo" bINNER JOIN (    SELECT "BugSheetID",    CASE        WHEN lower("Order") = 'coleoptera' THEN 5
      WHEN lower("Order") = 'trichoptera' THEN 8        WHEN lower("Order") = 'ephemeroptera' THEN 6        WHEN
lower("Order")= 'plecoptera' THEN 7        WHEN lower("Class") = 'oligochaeta' THEN 1        WHEN lower("Family") LIKE
'chiron%'THEN 2        WHEN lower("Order") = 'diptera' AND lower("Family") NOT LIKE 'chiron%' THEN 3        ELSE 4
ENDAS "Ranking",    CASE        WHEN lower("Order") = 'coleoptera' THEN 'Coleoptera'        WHEN lower("Order") =
'trichoptera'THEN 'Trichoptera'        WHEN lower("Order") = 'ephemeroptera' THEN 'Ephemeroptera'        WHEN
lower("Order")= 'plecoptera' THEN 'Plecoptera'        WHEN lower("Class") = 'oligochaeta' THEN 'Oligochaeta'
WHENlower("Family") LIKE 'chiron%' THEN 'Chironomidae'        WHEN lower("Order") = 'diptera' AND lower("Family") NOT
LIKE'chiron%' THEN 'Diptera (Other)'        ELSE 'Other'    END AS "Taxa",    "Order", "Class", "Family", SUM("Count")
AS"Count"    FROM "tblBugCount"    GROUP BY "BugSheetID", "Order", "Class", "Family") c USING ("BugSheetID")
 
GROUP BY "Ranking", "Taxa"
ORDER BY "Ranking"

And modifying the original query gives the faster performance:

SELECT    "Ranking","Taxa",SUM("2222011") AS "2222011",SUM("2222012") AS "2222012",SUM("2222013") AS
"2222013",SUM("2222014")AS "2222014"
 
FROM (SELECT DISTINCTCASE    WHEN lower(c."Order") = 'coleoptera' THEN 5    WHEN lower(c."Order") = 'trichoptera' THEN
8   WHEN lower(c."Order") = 'ephemeroptera' THEN 6    WHEN lower(c."Order") = 'plecoptera' THEN 7    WHEN
lower(c."Class")= 'oligochaeta' THEN 1    WHEN lower(c."Family") LIKE 'chiron%' THEN 2    WHEN lower(c."Order") =
'diptera'AND lower(c."Family") NOT LIKE 'chiron%' THEN 3    ELSE 4END AS "Ranking",CASE    WHEN lower(c."Order") =
'coleoptera'THEN 'Coleoptera'    WHEN lower(c."Order") = 'trichoptera' THEN 'Trichoptera'    WHEN lower(c."Order") =
'ephemeroptera'THEN 'Ephemeroptera'    WHEN lower(c."Order") = 'plecoptera' THEN 'Plecoptera'    WHEN lower(c."Class")
='oligochaeta' THEN 'Oligochaeta'    WHEN lower(c."Family") LIKE 'chiron%' THEN 'Chironomidae'    WHEN lower(c."Order")
='diptera' AND lower(c."Family") NOT LIKE 'chiron%' THEN 'Diptera (Other)'    ELSE 'Other'END AS "Taxa",SUM(CASE WHEN
b."LocationCode"= '2222011' THEN c."Count" END) AS "2222011",SUM(CASE WHEN b."LocationCode" = '2222012' THEN c."Count"
END)AS "2222012",SUM(CASE WHEN b."LocationCode" = '2222013' THEN c."Count" END) AS "2222013",SUM(CASE WHEN
b."LocationCode"= '2222014' THEN c."Count" END) AS "2222014"FROM "tblBugIDSheetInfo" b    INNER JOIN "tblBugCount" c
USING("BugSheetID")WHERE b."LocationCode" = '2222011'   OR b."LocationCode" = '2222012'   OR b."LocationCode" =
'2222013'  OR b."LocationCode" = '2222014'GROUP BY c."Order", c."Class", c."Family") d
 
GROUP BY "Ranking", "Taxa"
ORDER BY "Ranking"


Both these queries give me the result I was after:

> > I want to GROUP on the "Ranking" field as well so that all rows with the
> > same "Ranking" value are SUMmed. That is, I'm trying to achieve the
> > following:
> > 
> > Ranking    Taxa            2222011    2222012    2222013    2222014
> > ---------------------------------------------------------------
> > 1    "Oligochaeta"        494    1711    1759    1078
> > 2    "Chironomidae"        1385    2335    1500    1513
> > 3    "Diptera (Other)"    199    20    41    45
> > ...

Applying the grouping at the appropriate points helps :)

Regards.
Ross Johnson




pgsql-sql by date:

Previous
From: Ross Johnson
Date:
Subject: Re: cross tables, SELECT expressions, and GROUP BY problem
Next
From: Paul M Foster
Date:
Subject: References NULL field