Thread: Table Constraints with NULL values

Table Constraints with NULL values

From
"David Allardyce"
Date:
It appears that Postgres will allow any INSERT, despite a multiple-column
constraint, if any of the values INSERTed are NULL.  If I read the included
excerpt correctly (there are like three negatives in the second sentence,
sheesh :) ), multiple NULL values for a column are acceptable or, in other
words, are not a violation of UNIQUEness.

However, shouldn't any values that are not NULL violate the constraint if
the same values exist already?

As an example, into the table definition at the bottom of this message...  This should be acceptable.     INSERT INTO
ao_functions(name, skill, arg1, arg2, arg3) VALUES (NULL,
 
NULL, NULL, NULL, NULL);     INSERT INTO ao_functions (name, skill, arg1, arg2, arg3) VALUES (NULL,
NULL, NULL, NULL, NULL);
 But this should not...     INSERT INTO ao_functions (name, skill, arg1, arg2, arg3) VALUES
('Hit', 1, -1, -1, 91);     INSERT INTO ao_functions (name, skill, arg1, arg2, arg3) VALUES
('Hit', 1, -1, -1, 91); ERROR:  Cannot insert a duplicate key into unique index unique_aofunction
 Why does this succeed?     INSERT INTO ao_functions (name, skill, arg1, arg2, arg3) VALUES
('TauntNPC', 1, NULL, NULL, NULL);     INSERT INTO ao_functions (name, skill, arg1, arg2, arg3) VALUES
('TauntNPC', 1, NULL, NULL, NULL);

-- Excerpt from the Postgres CREATE TABLE documentation ---
...
UNIQUE Constraint
...
The column definitions of the specified columns do not have to include a NOT
NULL constraint to be included in a UNIQUE constraint. Having more than one
null value in a column without a NOT NULL constraint, does not violate a
UNIQUE constraint. (This deviates from the SQL92 definition, but is a more
sensible convention. See the section on compatibility for more details.)
...
--- End of Excerpt --

CREATE TABLE ao_functions (       id               SERIAL CONSTRAINT funckey PRIMARY KEY,       name
CHARACTER(25),      skill             INTEGER NULL,       arg1            CHARACTER VARYING(100) NULL DEFAULT NULL,
 arg2            CHARACTER VARYING(100) NULL DEFAULT NULL,       arg3            CHARACTER VARYING(100) NULL DEFAULT
NULL,      CONSTRAINT unique_aofunction UNIQUE (name, skill, arg1, arg2, arg3)
 
);

David Allardyce



Re: Table Constraints with NULL values

From
Tom Lane
Date:
"David Allardyce" <dave@pod13.net> writes:
> However, shouldn't any values that are not NULL violate the constraint if
> the same values exist already?

No.  Postgres is conforming to the SQL standard in this.  SQL92 saith
in section 4.10:
        A unique constraint is satisfied if and only if no two rows in        a table have the same non-null values in
theunique columns. In        addition, if the unique constraint was defined with PRIMARY KEY,        then it requires
thatnone of the values in the specified column or        columns be the null value.
 

There's a slightly different statement in the definition of the UNIQUE
predicate, section 8.9:
        2) If there are no two rows in T such that the value of each column           in one row is non-null and is
equalto the value of the cor-           responding column in the other row according to Subclause 8.2,
"<comparisonpredicate>", then the result of the <unique predi-           cate> is true; otherwise, the result of the
<uniquepredicate>           is false.
 

AFAICT the intent is that all the comparison columns must be non-null
(and equal) for two rows to be considered to violate the uniqueness
constraint.

The standard *does* use the concept that you are after: section
3.1 Definitions saith
        h) distinct: Two values are said to be not distinct if either:           both are the null value, or they
compareequal according to           Subclause 8.2, "<comparison predicate>". Otherwise they are           distinct. Two
rows(or partial rows) are distinct if at least           one of their pairs of respective values is distinct. Otherwise
         they are not distinct. The result of evaluating whether or not           two values or two rows are distinct
isnever unknown.
 
        i) duplicate: Two or more values or rows are said to be duplicates           (of each other) if and only if
theyare not distinct.
 

which is terminology that they carefully refrain from using in defining
uniqueness constraints.  Had they meant what you want the behavior to
be, ISTM they'd have defined uniqueness constraints by saying "all the
rows must be distinct".


> -- Excerpt from the Postgres CREATE TABLE documentation ---
> ...
> UNIQUE Constraint
> ...
> The column definitions of the specified columns do not have to include a NOT
> NULL constraint to be included in a UNIQUE constraint. Having more than one
> null value in a column without a NOT NULL constraint, does not violate a
> UNIQUE constraint. (This deviates from the SQL92 definition, but is a more
> sensible convention. See the section on compatibility for more details.)

I believe that this statement is wrong, and that we do follow the spec.
There have been a number of arguments about this in the past though...
evidently whoever touched this doc page last had the opposite opinion.
        regards, tom lane


Re: Table Constraints with NULL values

From
"David Allardyce"
Date:
I don't mean to re-hash an argument that has been debated-to-death before,
but I can't help myself...

> > However, shouldn't any values that are not NULL violate the constraint
if
> > the same values exist already?
>
> No.  Postgres is conforming to the SQL standard in this.  SQL92 saith
> in section 4.10:
>
>          A unique constraint is satisfied if and only if no two rows in
>          a table have the same non-null values in the unique columns. In
>          addition, if the unique constraint was defined with PRIMARY KEY,
>          then it requires that none of the values in the specified column
or
>          columns be the null value.

I am not sure the standard supports your statement.  Apparently only the
PRIMARY KEY constraint may not contain null values, otherwise its a
violation if "..two rows... have the same non-null values..."

> There's a slightly different statement in the definition of the UNIQUE
> predicate, section 8.9:
>
>          2) If there are no two rows in T such that the value of each
column
>             in one row is non-null and is equal to the value of the cor-
>             responding column in the other row according to Subclause 8.2,
>             "<comparison predicate>", then the result of the <unique
predi-
>             cate> is true; otherwise, the result of the <unique predicate>
>             is false.
>
> AFAICT the intent is that all the comparison columns must be non-null
> (and equal) for two rows to be considered to violate the uniqueness
> constraint.

If 'each' means 'every' then I can see how you come to that conclusion.
However, AFAICT the intent is to say that all rows in a table can be said to
be unique if none of the non-null columns contains equal values in the
corresponding column of another row.  If I use my definition then two all
null rows will not violate the constraint.  If section 8.9 says that every
column must be NON-NULL, then any row that contains a NULL column will
always be in violation of uniquness.

> The standard *does* use the concept that you are after: section
> 3.1 Definitions saith
>
>          h) distinct: Two values are said to be not distinct if either:
>             both are the null value, or they compare equal according to
>             Subclause 8.2, "<comparison predicate>". Otherwise they are
>             distinct. Two rows (or partial rows) are distinct if at least
>             one of their pairs of respective values is distinct. Otherwise
>             they are not distinct. The result of evaluating whether or not
>             two values or two rows are distinct is never unknown.
>
>          i) duplicate: Two or more values or rows are said to be
duplicates
>             (of each other) if and only if they are not distinct.
>
> which is terminology that they carefully refrain from using in defining
> uniqueness constraints.  Had they meant what you want the behavior to
> be, ISTM they'd have defined uniqueness constraints by saying "all the
> rows must be distinct".
>
Actually, I don't think the definition of distinct is a contradiction at
all.  Instead, I think it explains why they specifically exclude NULL values
from the uniqueness comparison.  If they had defined uniqueness as "all rows
must be distinct" then two all null rows would violate the uniqueness
constraint.  Not the behavior I want at all.

David Allardyce
----------------- ISTM is the only one I couldn't figure out.





Re: Table Constraints with NULL values

From
Tom Lane
Date:
"David Allardyce" <dave@pod13.net> writes:
> ... If they had defined uniqueness as "all rows
> must be distinct" then two all null rows would violate the uniqueness
> constraint.  Not the behavior I want at all.

Er, why not?  You're essentially arguing that the UNIQUE constraint
should treat nulls as equal, so I'd think two all-null rows should be
considered equal too.  I do not see how you can consistently maintain
that (x,NULL) equals (x,NULL) but (NULL,NULL) doesn't equal (NULL,NULL).

But we have been around the maypole on this question several times
before, with no permanent resolution --- the plain fact is that the
spec isn't very clearly written.  Useful data would be tests
demonstrating how other systems (Oracle, DB2, etc) interpret the issue.
        regards, tom lane