Re: A brief guide to nulls - Mailing list pgsql-sql
From | Otto Hirr |
---|---|
Subject | Re: A brief guide to nulls |
Date | |
Msg-id | 002d01c2bcc4$3abb23b0$fc2b5d42@frodo Whole thread Raw |
In response to | RFC: A brief guide to nulls (dev@archonet.com) |
List | pgsql-sql |
Another viewpoint, is to go a bit further into implementation... i.e. for every field, there are two items... *one item being the null/notnull info *other item being the value info of the field per the type at def time This faciliatates the concepts of where <fieldname> is null ... testing the null/notnull info where <fieldname> is not null ... testing the null/notnull info where <fieldname> = "somevalue" ... testing the value info etc. Trying to "hide" the concept that every field has TWO items, the value and whether it is null/notnull, seems to confuse people. By SIMPLY stating that every SQL field has two items, then causes a scripter to think a bit more deeply about a where clause... Hence making your example ... > SELECT * FROM customer WHERE sex='M'; > SELECT * FROM customer WHERE sex<>'M'; > Now you might think this returns all customers, but it will miss those > where sex is null. You've asked for all rows where the value of sex is 'M' > and all those with values not equal to 'M' but not rows with *no value at all* all that less mysterious... i.e. the above two selects only tested one of the two items for the SQL field, the other item to test is the null/notnull item. .. Otto > -----Original Message----- > From: pgsql-sql-owner@postgresql.org > [mailto:pgsql-sql-owner@postgresql.org]On Behalf Of dev@archonet.com > Sent: Wednesday, January 15, 2003 9:23 AM > To: pgsql-sql@postgresql.org > Subject: RFC: A brief guide to nulls > > > There have been a few posts recently where people have had > problems with > nulls. Anyone got comments on the below before I submit it to > techdocs? > > TIA > > - Richard Huxton > > A Brief Guide to NULLs > ====================== > > What is a null? > =============== > A null is *not* an empty string. > A null is *not* a value. > A null is *not* a "special" value. > A null is the absence of a value. > > > What do nulls mean? > =================== > Well, they *should* mean one of two things: > 1. There is no applicable value > 2. There is a value but it is unknown > > Example 1: Imagine you have a customer table with name and sex fields. > If you get a new customer "ACME Widgets Ltd", the sex field > is meaningless > since your customer is a company (case 1). > If you get a new customer "Jackie Smith" they might be male > or female, but > you might not know (case 2). > > Example 2: You have an address table with > (street,city,county,postalcode) > fields. > You might insert an address ("10 Downing > Street","London",Null,"WC1 1AA") > since you don't have a valid county. > You might also insert an address ("1 Any > Street","Maidstone","Kent",Null) > where there *must be* a valid postalcode, but you don't know > what it is. > > It might be useful to be able to distinguish between these > two cases - not > applicable and unknown, but there is only one option "Null" > available to > us, so we can't. > > > How do nulls work? > ================== > There is one very important rule when dealing with nulls. The > result of > any operation or comparison, when applied to a null is null. The only > exception is testing if a value is null. > > Example: with the customer table above you could run the > following queries: > SELECT * FROM customer WHERE sex='M'; > SELECT * FROM customer WHERE sex<>'M'; > Now you might think this returns all customers, but it will miss those > where sex is null. You've asked for all rows where the value > of sex is 'M' > and all those with values not equal to 'M' but not rows with > *no value at > all* > > It might help to think of a database as a set of statements > you *know* to > be true. A null indicates that you *cannot say anything at > all* about that > field. You can't say what it is, you can't say what it isn't, > you can only > say there is some information missing. > > So, to see all the customers with unknown or inapplicable sex > you would need: > SELECT * FROM customer WHERE sex IS NULL; > > There are actually three possible results for a test in SQL - > True (the > test passed), False (the test failed) and Null (you tested against a > null). A result of null usually gets treated as False, so > testing against > nulls always fails. > > If you try to perform an operation on nulls, again the result > is always > null. So the results of all of the following are null: > SELECT 'abc' || null; > SELECT 1 + null; > SELECT sqrt(null::numeric); > The first case can be especially confusing. Concatenating a > null string to > a string value will return null, not the original value. > > > Uniqueness and nulls > ==================== > If you define a unique index on a column it prevents you inserting two > values that are the same. It does not prevent you inserting > as many nulls > as you like. How could it, you don't have a value so it can't > be the same > as any other. > > Example: We create a table "ta" with a unique constraint on column "b" > CREATE TABLE ta ( > a int4, > b varchar(3), > PRIMARY KEY (a) > ); > CREATE UNIQUE INDEX ta_b_idx ON ta (b); > INSERT INTO ta VALUES (1,'aaa'); -- succeeds > INSERT INTO ta VALUES (2,'bbb'); -- succeeds > INSERT INTO ta VALUES (3,null); -- succeeds > INSERT INTO ta VALUES (4,'bbb'); -- fails > INSERT INTO ta VALUES (5,null); -- succeeds! > > Given the definition of what a null is, you only have two > choices: allow > multiple nulls or allow no nulls. If you want no nulls, > define the column > as NOT NULL when creating the table. > > > Keys and nulls > ============== > No column that is part of a primary key can be null. When you define a > PRIMARY KEY, none of the columns mentioned can take a null value. > Postgresql makes sure of this by defining the columns as NOT > NULL for you. > > Example: With table "ta" we just created, \d ta will show column a as > being not null. Otherwise, we could insert rows with a set to null and > have no way to tell them apart. > > > Subqueries and nulls > ==================== > Since tests always fail when testing against nulls you can > have unexpected > results with sub-queries. > > Example: Assume we have a companies table and a diary table. > Diary entries > are usually related to a particular company but not always. > SELECT co_name FROM companies WHERE co_id NOT IN (SELECT > dy_company FROM > diary); > If any row in diary contains a null dy_company then you will get *no > results*. > We can expand the query like so: > WHERE co_id NOT IN (SELECT dy_company FROM diary) > WHERE co_id NOT IN (1, 2, null, 3...) > WHERE NOT (co_id=1 OR co_id=2 OR co_id=null OR co_id=3...) > WHERE NOT (... OR null OR ...) > WHERE NOT (null) > WHERE null > > You either need to explicitly check for null values, or > define the column > in question as NOT NULL. > > > Further Information > =================== > Any good relational database book (try something written by Date) > Bruce's book (link LHS at http://techdocs.postgresql.org) > My Postgresql Notes (link at http://techdocs.postgresql.org) > > > ---------------------------(end of > broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org > >