Thread: PRIMARY KEY on a *group* of columns imply that each column is NOT NULL?
PRIMARY KEY on a *group* of columns imply that each column is NOT NULL?
From
Stephane Bortzmeyer
Date:
If I define a primary key: name TEXT NOT NULL, address INET, PRIMARY KEY(name, address) the definition (seen by \d) becomes: name | text | not null address | inet | not null "address" is now not null, which I do not want. It seems unnecessary: I just want the tuple (name, address) to be unique, which seems possible even if some 'address' values are NULL. It does not appear to be documented in http://www.postgresql.org/docs/7.4/interactive/ddl-constraints.html#AEN1975. Is there a workaround?
Re: PRIMARY KEY on a *group* of columns imply that each column is NOT NULL?
From
"Guy Rouillier"
Date:
Stephane Bortzmeyer wrote: > If I define a primary key: > > name TEXT NOT NULL, > address INET, > PRIMARY KEY(name, address) > > the definition (seen by \d) becomes: > > name | text | not null > address | inet | not null > > "address" is now not null, which I do not want. It seems unnecessary: > I just want the tuple (name, address) to be unique, which seems > possible even if some 'address' values are NULL. > > It does not appear to be documented in > http://www.postgresql.org/docs/7.4/interactive/ddl-constraints.html#AEN1 975. > Is there a workaround? Per the SQL Commands Reference, under CREATE TABLE: "The primary key constraint specifies that a column or columns of a table may contain only unique (non-duplicate), nonnull values. Technically, PRIMARY KEY is merely a combination of UNIQUE and NOT NULL" Primary key columns cannot contain null values. -- Guy Rouillier
Re: PRIMARY KEY on a *group* of columns imply that each column is NOT
From
Stephane Bortzmeyer
Date:
On Tue, Apr 26, 2005 at 03:22:40PM -0500, Guy Rouillier <guyr@masergy.com> wrote a message of 37 lines which said: > "The primary key constraint specifies that a column or columns of a > table may contain only unique (non-duplicate), nonnull values. > Technically, PRIMARY KEY is merely a combination of UNIQUE and NOT > NULL" > > Primary key columns cannot contain null values. I read the above also. It is perfectly clear for primary key on one column. But it does not apply to primary keys containing a group of columns. In that case (my case), columns do not have to be UNIQUE. But they have to be NOT NULL, which puzzles me.
On Tue, 2005-04-26 at 15:39, Stephane Bortzmeyer wrote: > On Tue, Apr 26, 2005 at 03:22:40PM -0500, > Guy Rouillier <guyr@masergy.com> wrote > a message of 37 lines which said: > > > "The primary key constraint specifies that a column or columns of a > > table may contain only unique (non-duplicate), nonnull values. > > Technically, PRIMARY KEY is merely a combination of UNIQUE and NOT > > NULL" > > > > Primary key columns cannot contain null values. > > I read the above also. It is perfectly clear for primary key on one > column. > > But it does not apply to primary keys containing a group of > columns. In that case (my case), columns do not have to be UNIQUE. But > they have to be NOT NULL, which puzzles me. Here's a quote from the SQL1992 spec that's VERY clear: A unique constraint is satisfied if and only if no two rows in a table have the same non-null values in the unique columns. In addition, if the unique constraint was defined with PRIMARY KEY, then it requires that none of the values in the specified column or columns be the null value.
Re: PRIMARY KEY on a *group* of columns imply that each column is NOT
From
Stephane Bortzmeyer
Date:
On Tue, Apr 26, 2005 at 03:48:44PM -0500, Scott Marlowe <smarlowe@g2switchworks.com> wrote a message of 26 lines which said: > Here's a quote from the SQL1992 spec that's VERY clear: Yes, PostgreSQL is right and implement the standard. Now, what's the rationale for the standard? I understand it for a single column but, for several columns, it should be still possible to have different tuples, such as (3, NULL) and (5, NULL) for instance.
Stephane Bortzmeyer wrote: > On Tue, Apr 26, 2005 at 03:48:44PM -0500, > Scott Marlowe <smarlowe@g2switchworks.com> wrote > a message of 26 lines which said: > > >>Here's a quote from the SQL1992 spec that's VERY clear: > > > Yes, PostgreSQL is right and implement the standard. Now, what's the > rationale for the standard? I understand it for a single column but, > for several columns, it should be still possible to have different > tuples, such as (3, NULL) and (5, NULL) for instance. The value of (3,NULL) isn't well-defined. In particular, you can't say that (3,NULL) = (3,NULL) since NULL means not-known. The fact that part of the value is not known means the value as a whole is not known. -- Richard Huxton Archonet Ltd
Re: PRIMARY KEY on a *group* of columns imply that each column is NOT
From
ptjm@interlog.com (Patrick TJ McPhee)
Date:
In article <20050426203938.GA18628@nic.fr>, Stephane Bortzmeyer <bortzmeyer@nic.fr> wrote: % But it does not apply to primary keys containing a group of % columns. In that case (my case), columns do not have to be UNIQUE. But % they have to be NOT NULL, which puzzles me. It does apply to primary keys containing groups of columns. You can get the table definition you want by using a unique constraint, but you should know that in SQL, unique constraints don't apply to rows containing null values in the constrained columns. If you do this: create table x ( name TEXT NOT NULL, address INET, CONSTRAINT na UNIQUE (name, address) ); your table definition will be as you want it, but the constraint you want won't be there. $ INSERT INTO x VALUES ('alpha'); INSERT 194224 1 $ INSERT INTO x VALUES ('alpha'); INSERT 194225 1 $ INSERT INTO x VALUES ('alpha'); INSERT 194226 1 $ INSERT INTO x VALUES ('alpha'); INSERT 194227 1 -- Patrick TJ McPhee North York Canada ptjm@interlog.com
On Wed, 2005-04-27 at 02:12, Stephane Bortzmeyer wrote: > On Tue, Apr 26, 2005 at 03:48:44PM -0500, > Scott Marlowe <smarlowe@g2switchworks.com> wrote > a message of 26 lines which said: > > > Here's a quote from the SQL1992 spec that's VERY clear: > > Yes, PostgreSQL is right and implement the standard. Now, what's the > rationale for the standard? I understand it for a single column but, > for several columns, it should be still possible to have different > tuples, such as (3, NULL) and (5, NULL) for instance. Since NULL <> NULL, that means you could then have (5,NULL) and (5,NULL) since the two NULLS aren't equal.
On Wed, 27 Apr 2005, Stephane Bortzmeyer wrote: > On Tue, Apr 26, 2005 at 03:48:44PM -0500, > Scott Marlowe <smarlowe@g2switchworks.com> wrote > a message of 26 lines which said: > > > Here's a quote from the SQL1992 spec that's VERY clear: > > Yes, PostgreSQL is right and implement the standard. Now, what's the > rationale for the standard? I understand it for a single column but, > for several columns, it should be still possible to have different > tuples, such as (3, NULL) and (5, NULL) for instance. The case that they're trying to prevent is two tuples like (3, NULL) and (3,NULL) since uniqueness alone doesn't prevent them both from being inserted.
Re: PRIMARY KEY on a *group* of columns imply that each column is NOT
From
Stephane Bortzmeyer
Date:
On Wed, Apr 27, 2005 at 05:19:32AM +0000, Patrick TJ McPhee <ptjm@interlog.com> wrote a message of 37 lines which said: > but you should know that in SQL, unique constraints don't apply to > rows containing null values May be I should but I didn't. > your table definition will be as you want it, but the constraint you > want won't be there. OK, I will try to write a custom trigger, then.
Stephane Bortzmeyer <bortzmeyer@nic.fr> writes: > Yes, PostgreSQL is right and implement the standard. Now, what's the > rationale for the standard? I understand it for a single column but, > for several columns, it should be still possible to have different > tuples, such as (3, NULL) and (5, NULL) for instance. If that's what you want, declare it as UNIQUE not PRIMARY KEY. regards, tom lane
On Wed, 2005-04-27 at 09:06, Stephane Bortzmeyer wrote: > On Wed, Apr 27, 2005 at 05:19:32AM +0000, > Patrick TJ McPhee <ptjm@interlog.com> wrote > a message of 37 lines which said: > > > but you should know that in SQL, unique constraints don't apply to > > rows containing null values > > May be I should but I didn't. Actually, considering that many databases (at least in the past) have ignored this and treated nulls as unique things, it's quite understandable. MSSQL, for instance, used to definitely allow only one null in a unique field. So, for that database, not null wasn't really necessary for a primary key column. I believe this problem exist(s)(ed) in several other supposedly "enterprise" class databases as well.
Re: PRIMARY KEY on a *group* of columns imply that each column is NOT
From
Stephane Bortzmeyer
Date:
On Wed, Apr 27, 2005 at 10:26:30AM -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote a message of 9 lines which said: > If that's what you want, declare it as UNIQUE not PRIMARY KEY. As shown by Patrick TJ McPhee, it does not work: tests=> create table x ( tests(> name TEXT NOT NULL, tests(> address INET, tests(> CONSTRAINT na UNIQUE (name, address) tests(> ); NOTICE: CREATE TABLE / UNIQUE will create implicit index "na" for table "x" CREATE TABLE tests=> INSERT INTO x (name) values ('foobar'); INSERT 45380 1 tests=> INSERT INTO x (name) values ('foobar'); INSERT 45381 1 tests=> INSERT INTO x (name) values ('foobar'); INSERT 45382 1 tests=> INSERT INTO x (name) values ('foobar'); INSERT 45383 1 tests=> select * from x; name | address --------+--------- foobar | foobar | foobar | foobar | (4 rows)
On Wed, 2005-04-27 at 09:06, Stephane Bortzmeyer wrote: > On Wed, Apr 27, 2005 at 05:19:32AM +0000, > Patrick TJ McPhee <ptjm@interlog.com> wrote > a message of 37 lines which said: > > > but you should know that in SQL, unique constraints don't apply to > > rows containing null values > > May be I should but I didn't. > > > your table definition will be as you want it, but the constraint you > > want won't be there. > > OK, I will try to write a custom trigger, then. Often the best bet here, btw, is to declare it not null then use something other than null to represent null, like the text characters NA or something.
Stephane Bortzmeyer wrote: > On Wed, Apr 27, 2005 at 10:26:30AM -0400, > Tom Lane <tgl@sss.pgh.pa.us> wrote > a message of 9 lines which said: > > >>If that's what you want, declare it as UNIQUE not PRIMARY KEY. > > > As shown by Patrick TJ McPhee, it does not work: > > tests=> create table x ( > tests(> name TEXT NOT NULL, > tests(> address INET, > tests(> CONSTRAINT na UNIQUE (name, address) > tests(> ); > NOTICE: CREATE TABLE / UNIQUE will create implicit index "na" for table "x" > CREATE TABLE > tests=> INSERT INTO x (name) values ('foobar'); > INSERT 45380 1 > tests=> INSERT INTO x (name) values ('foobar'); > INSERT 45381 1 > tests=> INSERT INTO x (name) values ('foobar'); > INSERT 45382 1 > tests=> INSERT INTO x (name) values ('foobar'); > INSERT 45383 1 > tests=> select * from x; > name | address > --------+--------- > foobar | > foobar | > foobar | > foobar | > (4 rows) > If i understand correctly, you want something like: create table x ( name TEXT NOT NULL PRIMARY KEY, address INET ); CREATE UNIQUE INDEX na ON x (name, address) WHERE address IS NULL; HTH Sebastian
Re: PRIMARY KEY on a *group* of columns imply that each column is NOT
From
Stephane Bortzmeyer
Date:
On Wed, Apr 27, 2005 at 04:50:23PM +0200, Sebastian Böck <sebastianboeck@freenet.de> wrote a message of 48 lines which said: > CREATE UNIQUE INDEX na ON x (name, address) WHERE address IS NULL; No, because it prevents two tuples with the same value of "name".
Re: PRIMARY KEY on a *group* of columns imply that each column is NOT
From
Stephane Bortzmeyer
Date:
On Wed, Apr 27, 2005 at 09:36:57AM -0500, Scott Marlowe <smarlowe@g2switchworks.com> wrote a message of 18 lines which said: > Often the best bet here, btw, is to declare it not null then use > something other than null to represent null, like the text > characters NA or something. Yes, but it defeats the purpose of NULL. And what should I use as a "pseudo-NULL" value for INET? 127.0.0.1? 0.0.0.0? Special values are well-known for the problems they raise. That's why many languages have NULL-like solutions (None in Python, undef in Perl, Maybe types in Haskell, etc).
Stephane Bortzmeyer wrote: > On Wed, Apr 27, 2005 at 04:50:23PM +0200, > Sebastian Böck <sebastianboeck@freenet.de> wrote > a message of 48 lines which said: > > >>CREATE UNIQUE INDEX na ON x (name, address) WHERE address IS NULL; > > > No, because it prevents two tuples with the same value of "name". Ahh, sorry! Ment something more like: CREATE TABLE table x ( name TEXT NOT NULL, address INET ); CREATE UNIQUE INDEX na ON x (name, address); CREATE UNIQUE INDEX n ON x (name) WHERE address IS NULL; HTH Sebastian
Re: PRIMARY KEY on a *group* of columns imply that each column is NOT
From
Stephane Bortzmeyer
Date:
On Wed, Apr 27, 2005 at 05:04:07PM +0200, Sebastian Böck <sebastianboeck@freenet.de> wrote a message of 24 lines which said: One is enough :-) vvvvv > CREATE TABLE table x ( > name TEXT NOT NULL, > address INET > ); > > CREATE UNIQUE INDEX na ON x (name, address); > CREATE UNIQUE INDEX n ON x (name) WHERE address IS NULL; Great! It works fine. Many thanks. tests=> select * from x; name | address ------+--------- foo | foo | 1.2.3.4 foo | ::1 bar | ::1 bar | (5 rows) tests=> insert into x (name) values ('bar'); ERROR: duplicate key violates unique constraint "n" tests=> insert into x (name, address) values ('bar', '::1'); ERROR: duplicate key violates unique constraint "na" tests=> insert into x (name) values ('new'); INSERT 21128 1
On Wed, 2005-04-27 at 17:00 +0200, Stephane Bortzmeyer wrote: > On Wed, Apr 27, 2005 at 09:36:57AM -0500, > Scott Marlowe <smarlowe@g2switchworks.com> wrote > a message of 18 lines which said: > > > Often the best bet here, btw, is to declare it not null then use > > something other than null to represent null, like the text > > characters NA or something. > > Yes, but it defeats the purpose of NULL. And what should I use as a > "pseudo-NULL" value for INET? 127.0.0.1? 0.0.0.0? Special values are > well-known for the problems they raise. That's why many languages have > NULL-like solutions (None in Python, undef in Perl, Maybe types in > Haskell, etc). No. NULL is NOT 'None', nor 'undef', and definitely not NULL as in C. Those are perfectly defined values, although special ones. Only 'undef' is quite misleading, but nevertheless it is just _one_ value. That is, given a variable A, you can always write a boolean expression that evaluates True or False to test if A is _equal_ to None/undef/NULL (in C): $ python -c "a = None; print a == None" True $ perl -e 'print a == undef, "\n"' 1 $ cat p.c #include <stdio.h> int main(int argc, char *argv[]) { char *a = NULL; printf ("%d\n", a == NULL); } $ cc p.c $ ./a.out 1 About Haskell, I don't know. For what I understand from quick reading the manual, it'd say that Nothing is similar. You _can_ tell if something is equal to Nothing. In databases, NULL has a completely different meaning. It doesn't mean _no value_, which is just a special (single) value, but it means _unknown value_. You simply can't compare it with any single value (even special ones) and expect a boolean answer. The only possible answer is 'I don't know', which is NULL in boolean. Notice that the boolean NULL is _not_ the same of False. So you get: marco=# select 2 = 2; ?column? ---------- t (1 row) marco=# select 2 = 3; ?column? ---------- f (1 row) marco=# select 2 = NULL; ?column? ---------- (1 row) that is, neither true nor false. Back to your example, you can compare ('a', 2) with ('a', 2), the result is 't' and thus you've managed to identify the right row (it works as a primary key). Also, ('a', 3') is different from ('a', '2'), so you can tell the two rows are different. But what if you allow ('a', NULL)? ('a', NULL) is neither the same _nor different_ from ('a', 2). The result of comparison is NULL, no matter how you're testing it: marco=# select ('a', 2) = ('a', NULL); ?column? ---------- (1 row) marco=# select ('a', 2) <> ('a', NULL); ?column? ---------- (1 row) see? _Neither_ one is true. This would completely defeat the purpose of the primary key. And of course, comparing ('a', NULL) with ('a', NULL) results in exactly the same: marco=# select ('a', NULL) = ('a', NULL); ?column? ---------- (1 row) marco=# select ('a', NULL) <> ('a', NULL); ?column? ---------- (1 row) That's why NULLs are not allowed in primary keys. The key simply won't work. NULL in databases is not _one_ special value. It's _any_ value, since it's unknown. The boolean expression: 2 = NULL might be true or might be false, since NULL could be _any_ integer in this expression. This is completely different from the semantic of None/undef/NULL in most programming languages. You wrote: "Special values are well-known for the problems they raise." then NULL is definitely _not_ the value you're looking for. Everything can be said of NULL, but that it is "well-known". In your case, by choosing (name, address) as the primary key, you're saying 'I need to know both the name and the address to be able to retrieve a datum in the table'. This implies that if you have partial knowledge (you don't know the address), you can't "naturally" retrieve a single datum (or insert it). Depending on what you're trying to achieve, you may need to split the table (normalization the theorists call it). I don't like theory much, but its conclusions sometimes just make a lot of sense. :-) Review your design, maybe either the table schema or the choice of the primary key is not natural for your database. .TM. -- ____/ ____/ / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _____/ _____/ _/ Colombo@ESI.it
Re: PRIMARY KEY on a *group* of columns imply that each column is NOT
From
Stephane Bortzmeyer
Date:
On Thu, Apr 28, 2005 at 02:04:29PM +0200, Marco Colombo <pgsql@esiway.net> wrote a message of 146 lines which said: > No. NULL is NOT 'None', nor 'undef', and definitely not NULL as in > C. Thanks for the very good and detailed explanation of NULL in SQL. Curious people may note that the strange semantics of NULL are heavily criticized in C. J. Date and Hugh Darwen "A Guide to the SQL Standard" (Addison-Wesley) [Side note: I was perfectly aware for NULL in C, where it is just an ordinary zero, that's why I only mentioned Perl and Python.] > marco=# select 2 = NULL; > ?column? > ---------- > > (1 row) Even better, you can write; registry=> select NULL = NULL; ?column? ---------- (1 row) > Depending on what you're trying to achieve, you may need to split > the table (normalization the theorists call it). Yes, I noticed in similar organizations that the Hosts table was split in one table for names and one for addresses, may be for exactly that reason. > Review your design, maybe either the table schema or the choice of > the primary key is not natural for your database. At the present time, it seems that, practically speaking, the technique proposed by Sebastian Böck (two index) is perfectly fine. This technique can be summarized as "PostgreSQL, I tell you to treat all NULL addresses as being the same value".
Marco Colombo wrote: > That is, given a variable A, you can always write a boolean expression > that evaluates True or False to test if A is _equal_ to None/undef/NULL > (in C): And of course you can do this in SQL as well. > ('a', NULL) is neither the same _nor different_ from ('a', 2). Uh, I'm not sure what you mean by "the same"; but certainly ('a',NULL) is distinct from ('a',2). >The result of comparison is NULL, no matter how you're testing it: Unless you're testing with the IS DISTINCT FROM operator. fli=# select ('a',NULL) IS DISTINCT FROM ('a',2); ?column? ---------- t (1 row) > marco=# select ('a', 2) = ('a', NULL); > ?column? > ---------- > > (1 row) > > marco=# select ('a', 2) <> ('a', NULL); > ?column? > ---------- > > (1 row) fli=# select ('a',NULL) IS DISTINCT FROM ('a',NULL); ?column? ---------- f (1 row) > That's why NULLs are not allowed in primary keys... But in general I agree with most of the rest of what you said.