Re: [SQL] Oddities with NULL and GROUP BY - Mailing list pgsql-sql

From Bruce Momjian
Subject Re: [SQL] Oddities with NULL and GROUP BY
Date
Msg-id 199907071823.OAA05656@candle.pha.pa.us
Whole thread Raw
In response to Re: [SQL] Oddities with NULL and GROUP BY  (secret <secret@kearneydev.com>)
List pgsql-sql

Looks like this is fixed in 6.5 too.a|b|sum-+-+---1|1|  32| |  3(2 rows)

[Charset iso-8859-1 unsupported, filtering to ASCII...]
> Jos_ Soares wrote:
> 
> > secret ha scritto:
> >
> >> "Jackson, DeJuan" wrote:
> >>
> >> > The behavior is valid, if you define NULL as meaning undefined.
> >> > In other words when you define something as NULL you're saying, "I
> >> don't
> >> > know what it is. It could be equal or not."
> >> >         -DEJ
> >> >
> >> > > -----Original Message-----
> >> > > From: secret [SMTP:secret@kearneydev.com]
> >> > > Sent: Friday, May 14, 1999 11:58 AM
> >> > > To:   PG-SQL
> >> > > Subject:      [SQL] Oddities with NULL and GROUP BY
> >> > >
> >> > >     Maybe there is something I don't know about how GROUP BY
> >> should
> >> > > work, but if I have a table like:
> >> > > a,b,c
> >> > > 1,1,1
> >> > > 1,1,2
> >> > > 1,1,3
> >> > > 1,2,1
> >> > > 1,3,1
> >> > >
> >> > > And I say SELECT a,b,sum(c) FROm .. GROUP BY a,b I get
> >> > > 1,1,6
> >> > > 1,2,1
> >> > > 1,3,1
> >> > >
> >> > > So whenever a or b changes we get a new summed row, well if I
> >> have rows
> >> > > where a or b are null, this doesn't happen, infact I seem to get
> >> all
> >> > > those rows individually... Like if:
> >> > > 1,1,1
> >> > > 1,1,3
> >> > > 1,NULL,10
> >> > > 1,NULL,20
> >> > > 1,2,3
> >> > >
> >> > > I get:
> >> > > 1,1,4
> >> > > 1,NULL,10
> >> > > 1,NULL,20
> >> > > 1,2,3
> >> > >
> >> > > Shouldn't I get 1,NULL,30?  Ie shouldn't NULL be treated like
> >> any other
> >> > > value?  Or is there some bit of information I'm missing?  I can
> >> set
> >> > > everything from NULL to 0 if need be, but I'd rather not...
> >> > >
> >> > > David Secret
> >> > > MIS Director
> >> > > Kearney Development Co., Inc.
> >> > >
> >>
> >>     IBM's DB/2 Disagrees, so does Oracle8!
> >>
> >> Here is a cut & paste from Oracle SQL+:
> >>
> >> SQL> select * from z;
> >>
> >>         A         B
> >> --------- ---------
> >>         1         1
> >>         1         2
> >>                   5
> >>                  10
> >>
> >> SQL> select a,sum(b) from z group by a;
> >>
> >>         A    SUM(B)
> >> --------- ---------
> >>         1         3
> >>                  15
> >>
> >> SQL>
> >>
> >>     I'm going to report this as a bug now that I've verified 2 major
> >> database
> >> vendors perform the task as I would expect them to, and PostgreSQL
> >> does it
> >> very differently.  The question is really is NULL=NULL, which I
> >> would say it
> >> should be.
> >
> >
> > I tried it in PostgreSQL 6.5beta1 with the same result:
> >
> > select * from z;
> > a| b
> > -+--
> > 1| 1
> > 1| 2
> >  | 5
> >  |10
> > (4 rows)
> >
> > select a,sum(b) from z group by a;
> > a|sum
> > -+---
> > 1|  3
> >  | 15
> > (2 rows)
> >
> > The Pratical SQL Handbook at page 171 says:
> > Since nulls represent "the great unknown", there is no way to know
> > whether one null is equal to any other null. Each unknown value
> > may or may not be different from another.
> > However, if the grouping column contains more than one null,
> > all of them are put into a single group.
> >
> > Thus: NULL!=NULL but on GROUP BY it is considered as NULL=NULL.
> >
> > Jos_
> >
> >
> >
> >
> > --
> > ______________________________________________________________
> > PostgreSQL 6.5.0 on i586-pc-linux-gnu, compiled by gcc 2.7.2.3
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > Jose'
> >
> 
>     Wonderful, that's as I expected.  However please try this in 6.5
> Beta1,
> CREATE TABLE z(a int4,b int4, c int4);
> INSERT INTO z VALUES (1,1,1);
> INSERT INTO z VALUES (1,1,2);
> INSERT INTO z(a,c) VALUES (2,1);
> INSERT INTO z(a,c) VALUES (2,2);
> 
> SELECT a,b,sum(c) FROM z GROUP BY a,b
> 
> GROUPing in PostgreSQL w/NULLs works just fine when there is only 1
> column, however when one throws 2 in, the 2nd one having NULLs it starts
> failing.  Your example demonstrates the right answer for 1 group by
> column, try it with 2 and I expect 6.5beta1 will fail as 6.4.2 does.
> 
>     As to NULL=NULL or NULL!=NULL, evadentally my estimation of why the
> problem is occuring was wrong. :)  But from the SQL handbook we
> definately have a bug here.
> 
> David Secret
> MIS Director
> Kearney Development Co., Inc.
> 
> 
> 


--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


pgsql-sql by date:

Previous
From: Tom Lane
Date:
Subject: Re: [SQL] Good Optimization
Next
From: Bruce Momjian
Date:
Subject: Re: [SQL] Good Optimization