Thread: Unique Index
Hi, i have a unique index on a table over multiple columns. If now one of the records has a null value in one of the indexed columns i can insert the same record multiple times. Is this a problem within postgres or expected? Example: index unique, btree (colA, colB, colC); would still allow me to insert AAAA, ,CCCC AAAA, ,CCCC Thanks for any help on that. A
On Thu, 20 Jan 2005, Alex wrote: > i have a unique index on a table over multiple columns. If now one of > the records has a null value in one of the indexed columns i can insert > the same record multiple times. > > Is this a problem within postgres or expected? Expected. NULLs are effectively not considered as the same for the purposes of UNIQUE. The predicate basically functions as: 2) If there are no two rows in T such that the value of each column in one row is non-null and is equal to the value of the cor- responding column in the other row according to Subclause 8.2, "<comparison predicate>", then the result of the <unique predi- cate> is true; otherwise, the result of the <unique predicate> is false.
# alex@meerkatsoft.com / 2005-01-20 01:35:32 +1100: > i have a unique index on a table over multiple columns. If now one of > the records has a null value in one of the indexed columns i can insert > the same record multiple times. > > Is this a problem within postgres or expected? In SQL, NULL means "unknown value". How could you assert that two NULLs are equal? -- If you cc me or remove the list(s) completely I'll most likely ignore your message. see http://www.eyrie.org./~eagle/faqs/questions.html
Roman Neuhauser wrote: > # alex@meerkatsoft.com / 2005-01-20 01:35:32 +1100: > >>i have a unique index on a table over multiple columns. If now one of >>the records has a null value in one of the indexed columns i can insert >>the same record multiple times. >> >>Is this a problem within postgres or expected? > > > In SQL, NULL means "unknown value". How could you assert that two > NULLs are equal? > which doesn't make mathematical sense. mathwise null is an empty result. so setting the logic up using the math logic, null values are always equal. -- ======================================== only plain text format email accepted. smaller file size, no virus transfer no proprietary file formats. ========================================
Attachment
According to Date you should never use NULLs. This is because a NULL can mean many different things. It can mean not known (e.g. I know he has an age but I don't know what it is), It can be not applicable (e.g. in a Party table of organizations and people, people would be of a certain sex but an organization would not), It can mean a number divided by zero, It can also mean I don't know if the value is unknown or if the value is not applicable etc. etc. In Date's view there is an infinite number of meanings for null and for a database to handle it correctly would require an equivalent set of operators for each type of null. I think though that Nulls are a very useful feature of SQL databases. It's just that when I write a program, I have to know in a particular context what a null means and handle it appropriately (of course controlling nulls in an index is a different matter). I don't think there is only one accepted way to use nulls. > Roman Neuhauser wrote: >> # alex@meerkatsoft.com / 2005-01-20 01:35:32 +1100: >> >>>i have a unique index on a table over multiple columns. If now one of >>>the records has a null value in one of the indexed columns i can insert >>>the same record multiple times. >>> >>>Is this a problem within postgres or expected? >> >> >> In SQL, NULL means "unknown value". How could you assert that two >> NULLs are equal? >> > which doesn't make mathematical sense. > mathwise null is an empty result. > so setting the logic up using the math logic, null values are always > equal. > > -- > ======================================== > > only plain text format email accepted. > > smaller file size, no virus transfer > no proprietary file formats. > > ======================================== >
Null values are a big surprise to almost every end-user (though the programmers are OK with them). Look at the astonishment on the face of your end user when you tell them that: SELECT COUNT(*) FROM clothing WHERE clothing_color = 'green' + SELECT COUNT(*) FROM clothing WHERE NOT clothing_color = 'green' Is not the count of all clothing because clothing without a color recorded will not be counted. Or (perhaps better yet, violating trichotomy) ... If <Some_column> has a null numeric value, then ALL of the following are FALSE for that case: Some_column < 0 Some_column > 0 Some_column = 0 Some_column <> 0 // This is the one that many find surprising Some_column <= 0 Some_column >= 0 You can probably see why Null values can do strange things in (for instance) an index. Even at that, I think that being able to insert more than one null value into a unique index should be considered as a bug (or diagnosed as an error). -----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of vhikida@inreach.com Sent: Wednesday, January 19, 2005 3:30 PM To: J. Greenlees Cc: pgsql-general@postgresql.org Subject: Re: [GENERAL] Unique Index According to Date you should never use NULLs. This is because a NULL can mean many different things. It can mean not known (e.g. I know he has an age but I don't know what it is), It can be not applicable (e.g. in a Party table of organizations and people, people would be of a certain sex but an organization would not), It can mean a number divided by zero, It can also mean I don't know if the value is unknown or if the value is not applicable etc. etc. In Date's view there is an infinite number of meanings for null and for a database to handle it correctly would require an equivalent set of operators for each type of null. I think though that Nulls are a very useful feature of SQL databases. It's just that when I write a program, I have to know in a particular context what a null means and handle it appropriately (of course controlling nulls in an index is a different matter). I don't think there is only one accepted way to use nulls.
"Dann Corbit" <DCorbit@connx.com> writes: > Or (perhaps better yet, violating trichotomy) ... > If <Some_column> has a null numeric value, then ALL of the following are > FALSE for that case: > Some_column < 0 > Some_column > 0 > Some_column = 0 > Some_column <> 0 // This is the one that many find surprising > Some_column <= 0 > Some_column >= 0 It's worse than that: the above do *not* yield FALSE, they yield NULL. Which does act like FALSE in a simple WHERE clause, but there are other cases (like CHECK clauses) where it doesn't. "x NOT IN (SELECT ...)" is a case that newbies routinely get bitten by. > Even at that, I think that being able to insert more than one null value > into a unique index should be considered as a bug (or diagnosed as an > error). Direct your complaints to the ISO SQL standards committee. regards, tom lane
The ISO SQL Standard does not even define an index, and so any index is an extension to the standard (though primary keys and foreign keys imply them). At least in the SQL Standard that I have (ANSI/ISO/IEC 9075-1-1999 and related documents) has no definition of an index. Perhaps the newer version contains such a definition. So, in the creation of an extension, I think it is up to the programmer to do whatever is best. -----Original Message----- From: Tom Lane [mailto:tgl@sss.pgh.pa.us] Sent: Wednesday, January 19, 2005 4:18 PM To: Dann Corbit Cc: vhikida@inreach.com; J. Greenlees; pgsql-general@postgresql.org Subject: Re: [GENERAL] Unique Index "Dann Corbit" <DCorbit@connx.com> writes: > Or (perhaps better yet, violating trichotomy) ... > If <Some_column> has a null numeric value, then ALL of the following are > FALSE for that case: > Some_column < 0 > Some_column > 0 > Some_column = 0 > Some_column <> 0 // This is the one that many find surprising > Some_column <= 0 > Some_column >= 0 It's worse than that: the above do *not* yield FALSE, they yield NULL. Which does act like FALSE in a simple WHERE clause, but there are other cases (like CHECK clauses) where it doesn't. "x NOT IN (SELECT ...)" is a case that newbies routinely get bitten by. > Even at that, I think that being able to insert more than one null value > into a unique index should be considered as a bug (or diagnosed as an > error). Direct your complaints to the ISO SQL standards committee. regards, tom lane
On Wed, 19 Jan 2005, Dann Corbit wrote: > Even at that, I think that being able to insert more than one null value > into a unique index should be considered as a bug (or diagnosed as an > error). AFAICT the UNIQUE constraint that it's used to model explicitly allows multiple NULLs in the spec so I don't see making it error as being terribly workable.
True, but the standard says nothing about the creation of an index, so you can make it behave in any way that you see fit. -----Original Message----- From: Stephan Szabo [mailto:sszabo@megazone.bigpanda.com] Sent: Wednesday, January 19, 2005 4:27 PM To: Dann Corbit Cc: vhikida@inreach.com; J. Greenlees; pgsql-general@postgresql.org Subject: Re: [GENERAL] Unique Index On Wed, 19 Jan 2005, Dann Corbit wrote: > Even at that, I think that being able to insert more than one null value > into a unique index should be considered as a bug (or diagnosed as an > error). AFAICT the UNIQUE constraint that it's used to model explicitly allows multiple NULLs in the spec so I don't see making it error as being terribly workable.
On Wed, 19 Jan 2005, Dann Corbit wrote: > True, but the standard says nothing about the creation of an index, so > you can make it behave in any way that you see fit. The unique index is however used to model the unique constraint in PostgreSQL which I had thought was clear from my statement so giving the unique index behavior which makes it unable to model the constraint wouldn't be terribly workable (without rewriting the constraint to be modeled in a separate fashion). > -----Original Message----- > From: Stephan Szabo [mailto:sszabo@megazone.bigpanda.com] > Sent: Wednesday, January 19, 2005 4:27 PM > To: Dann Corbit > Cc: vhikida@inreach.com; J. Greenlees; pgsql-general@postgresql.org > Subject: Re: [GENERAL] Unique Index > > > On Wed, 19 Jan 2005, Dann Corbit wrote: > > > Even at that, I think that being able to insert more than one null > value > > into a unique index should be considered as a bug (or diagnosed as an > > error). > > AFAICT the UNIQUE constraint that it's used to model explicitly allows > multiple NULLs in the spec so I don't see making it error as being > terribly workable. >
On Wed, 19 Jan 2005, Stephan Szabo wrote: > > On Wed, 19 Jan 2005, Dann Corbit wrote: > > > True, but the standard says nothing about the creation of an index, so > > you can make it behave in any way that you see fit. > > The unique index is however used to model the unique constraint in > PostgreSQL which I had thought was clear from my statement so giving the > unique index behavior which makes it unable to model the constraint > wouldn't be terribly workable (without rewriting the constraint to be > modeled in a separate fashion). Actually, thinking about it, I think I generally disagree with the idea that unique is the correct word to use in any case. It's used in both the constraint and in the predicate to have its current meaning (as described in my earlier message) and I think the word distinct already the semantics of treating two NULLs in the fashion such that an index wouldn't allow two.
I actually just wanted to know if there is a way around this problem. Obviously it is implemented that way for whatever reason. I still though think some arguments given in some of the replies, while probably correct, are besides the point. I use a unique index that may contain null values. On an insert or update I can now not rely on the exception thrown but actually have to write a select statement to check if the same row exists, which I believe defies ONE purpose of having unique indices. Whether Null is associated with "unknown value", "divided by zero"... or however one wants to interpret it is not the issue here, in my view NULL in the same column have the same value or at least should be treated the same. (If I want to differentiate the state, I would use a code instead of NULL as a NULL does not give any indication of its meaning, thus we could safely assume they are treated as equal). Maybe there could be an option in the creation of the index to indicate on how to use NULL values. How do other DBMS handle this? A Tom Lane wrote: >"Dann Corbit" <DCorbit@connx.com> writes: > > >>Or (perhaps better yet, violating trichotomy) ... >>If <Some_column> has a null numeric value, then ALL of the following are >>FALSE for that case: >> >> > > > >>Some_column < 0 >>Some_column > 0 >>Some_column = 0 >>Some_column <> 0 // This is the one that many find surprising >>Some_column <= 0 >>Some_column >= 0 >> >> > >It's worse than that: the above do *not* yield FALSE, they yield NULL. >Which does act like FALSE in a simple WHERE clause, but there are other >cases (like CHECK clauses) where it doesn't. "x NOT IN (SELECT ...)" >is a case that newbies routinely get bitten by. > > > >>Even at that, I think that being able to insert more than one null value >>into a unique index should be considered as a bug (or diagnosed as an >>error). >> >> > >Direct your complaints to the ISO SQL standards committee. > > regards, tom lane > >---------------------------(end of broadcast)--------------------------- >TIP 8: explain analyze is your friend > > > >
Tom Lane <tgl@sss.pgh.pa.us> writes: > "Dann Corbit" <DCorbit@connx.com> writes: > > > Even at that, I think that being able to insert more than one null value > > into a unique index should be considered as a bug (or diagnosed as an > > error). > > Direct your complaints to the ISO SQL standards committee. The SQL standard generally treats NULLs as a escape hatch for constraints. That's true for CHECK constraints as well: if you have a "CHECK a>0" constraint but a is nullable then a NULL value is allowed even though the check constraint can't be verified. This isn't an unreasonable approach. Once you have NULLs it makes more sense to make them useful than to treat them as just another value. You can always make the column NOT NULL if you don't want any exceptions to your constraints. -- greg
Alex <alex@meerkatsoft.com> writes: > I actually just wanted to know if there is a way around this problem. Obviously > it is implemented that way for whatever reason. The way around is to make all the columns NOT NULL. For most applications unique indexes don't make much sense on nullable columns. > (If I want to differentiate the state, I would use a code instead of NULL as > a NULL does not give any indication of its meaning, thus we could safely > assume they are treated as equal). I think you have that backwards. You use NULL in the case where you want all cases to compare as unknown values. If you want them to compare as known values then you should use a special value. In other words, NULL has special properties. You should use it if those special properties are what you want. If you want the system to enforce a unique constraint on the special value then you probably don't want to be using NULL for that special state. Specifically if you find yourself saying "we could safely assume they are treated equal" then NULL is almost certainly not what you want to represent that. NULL never compares equal to anything. -- greg
Dann Corbit wrote: > True, but the standard says nothing about the creation of an index, so > you can make it behave in any way that you see fit. But I thought we are talking about unique _constraint_ here (which is certainly regulated by the standard). -- dave
On Jan 20, 2005, at 16:03, David Garamond wrote: > Dann Corbit wrote: >> True, but the standard says nothing about the creation of an index, so >> you can make it behave in any way that you see fit. > > But I thought we are talking about unique _constraint_ here (which is > certainly regulated by the standard). They could conceivably be separated. The standard likewise doesn't say anything about how the unique constraint is enforced. In PostgreSQL, a "unique" index is used to enforce the constraint, but the unique index is not intrinsically tied to the unique constraint. Michael Glaesemann grzm myrealbox com
>I actually just wanted to know if there is a way around this problem. >Obviously it is implemented that way for whatever reason. > > I still though think some arguments given in some of the replies, while > probably correct, are besides the point. Sorry. I was hoping someone else would answer. > > I use a unique index that may contain null values. On an insert or update > I can now not rely on the exception thrown but actually have to write a > select statement to check if the same row exists, which I believe defies > ONE purpose of having unique indices. Whether Null is associated with > "unknown value", "divided by zero"... or however one wants to interpret it > is not the issue here, in my view NULL in the same column have the same > value or at least should be treated the same. (If I want to differentiate > the state, I would use a code instead of NULL as a NULL does not give any > indication of its meaning, thus we could safely assume they are treated as > equal). > > Maybe there could be an option in the creation of the index to indicate on > how to use NULL values. I can think of two options. One was mentioned already. If only one row can have a null value then it seems to me that you should make it a non null and null would have a special code. If it really needs to be null. Then a rather messy solution would be to have a second column (I'll call it a null indicator) which can only be 1 or null and have a unique index on it. colA ind ------ ----- 1 null 2 null 3 null null 1 > > How do other DBMS handle this? Oracle is the same. > > A > > > > > > Tom Lane wrote: > >>"Dann Corbit" <DCorbit@connx.com> writes: >> >>>Or (perhaps better yet, violating trichotomy) ... >>>If <Some_column> has a null numeric value, then ALL of the following are >>>FALSE for that case: >>> >> >> >>>Some_column < 0 >>>Some_column > 0 >>>Some_column = 0 Some_column <> 0 // This is the one that many find >>>surprising >>>Some_column <= 0 >>>Some_column >= 0 >>> >> >>It's worse than that: the above do *not* yield FALSE, they yield NULL. >>Which does act like FALSE in a simple WHERE clause, but there are other >>cases (like CHECK clauses) where it doesn't. "x NOT IN (SELECT ...)" >>is a case that newbies routinely get bitten by. >> >> >>>Even at that, I think that being able to insert more than one null value >>>into a unique index should be considered as a bug (or diagnosed as an >>>error). >>> >> >>Direct your complaints to the ISO SQL standards committee. >> >> regards, tom lane >> >>---------------------------(end of broadcast)--------------------------- >>TIP 8: explain analyze is your friend >> >> >> > > >
Hi, Am Mittwoch, den 19.01.2005, 15:02 -0800 schrieb J. Greenlees: > Roman Neuhauser wrote: > > # alex@meerkatsoft.com / 2005-01-20 01:35:32 +1100: > > > >>i have a unique index on a table over multiple columns. If now one of > >>the records has a null value in one of the indexed columns i can insert > >>the same record multiple times. > >> > >>Is this a problem within postgres or expected? > > > > > > In SQL, NULL means "unknown value". How could you assert that two > > NULLs are equal? > > > which doesn't make mathematical sense. > mathwise null is an empty result. > so setting the logic up using the math logic, null values are always equal. What kind of mathematics you are speaking? For example you have "infinity" where infinity is never equal to infinity. Same with null. Which is "unknown" or "undefined" So if x is undefined and y is undefined you cannot assume x=y - because if you assume this, then they would not be undefined anymore. q.e.d. Regards Tino
On Thu, 20 Jan 2005 15:20:26 +1100, Alex <alex@meerkatsoft.com> wrote: > I actually just wanted to know if there is a way around this problem. > Obviously it is implemented that way for whatever reason. Well, if you really need it, partial indexes are your friends! :) For clarity, let's say you have: CREATE TABLE foo ( a int, b int, c int, ); And an INDEX: CREATE UNIQUE INDEX foo_abc_index ON foo (a,b,c); Now, you want to make sure a and b are UNIQUE, when c is null; just do: CREATE UNIQUE INDEX foo_abN_index ON foo (a,b) WHERE c IS NULL; Or even, to make b UNIQUE when a and c are null: CREATE UNIQUE INDEX foo_NbN_index ON foo (b) WHERE a IS NULL AND c IS NULL; You need to create such partial indexes for each set of columns you want to be unique-with-null. Don't worry about "index bloat". These additional indexes will be used only when your main (foo_abc_index) is not used, so there won't be any duplicate data in them. Isn't PostgreSQL great? :) Regards, Dawid
On Thu, Jan 20, 2005 at 04:32:37PM +0900, Michael Glaesemann wrote: > > On Jan 20, 2005, at 16:03, David Garamond wrote: > > >Dann Corbit wrote: > >>True, but the standard says nothing about the creation of an index, so > >>you can make it behave in any way that you see fit. > > > >But I thought we are talking about unique _constraint_ here (which is > >certainly regulated by the standard). > > They could conceivably be separated. The standard likewise doesn't say > anything about how the unique constraint is enforced. In PostgreSQL, a > "unique" index is used to enforce the constraint, but the unique index > is not intrinsically tied to the unique constraint. Ofcourse, but then you'd have to create another UNIQUE index type just to handle the constraint, so you could define two types of unique: CREATE UNIQUE INDEX blah on (...) CREATE POSTGRESQL_STRANGE_UNIQUE INDEX blah on (...) the former being used by CREATE CONSTRAINT. After all, we'd just be incompatable with everyone else, no point using a standard keyword. This is just more confusing when there are already perfectly workable solutions to the problem as given. Don't use NULL when it's not appropriate. Hope this helps, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Attachment
Greg Stark <gsstark@mit.edu> writes: > Tom Lane <tgl@sss.pgh.pa.us> writes: >> Direct your complaints to the ISO SQL standards committee. > The SQL standard generally treats NULLs as a escape hatch for constraints. Not for UNIQUE constraints. SQL92 section 4.10 "Integrity constraints": A unique constraint is satisfied if and only if no two rows in a table have the same non-null values in the unique columns. In addition, if the unique constraint was defined with PRIMARY KEY, then it requires that none of the values in the specified column or columns be the null value. The short answer to this thread is that the OP is misusing nulls, and should pick some non-null value to be his "placeholder". regards, tom lane
Tino Wildenhain wrote: > Hi, > > Am Mittwoch, den 19.01.2005, 15:02 -0800 schrieb J. Greenlees: > >>Roman Neuhauser wrote: >> >>># alex@meerkatsoft.com / 2005-01-20 01:35:32 +1100: >>> >>> >>>>i have a unique index on a table over multiple columns. If now one of >>>>the records has a null value in one of the indexed columns i can insert >>>>the same record multiple times. >>>> >>>>Is this a problem within postgres or expected? >>> >>> >>> In SQL, NULL means "unknown value". How could you assert that two >>> NULLs are equal? >>> >> >>which doesn't make mathematical sense. >>mathwise null is an empty result. >>so setting the logic up using the math logic, null values are always equal. > > > What kind of mathematics you are speaking? > For example you have "infinity" where infinity is never > equal to infinity. > Same with null. Which is "unknown" or "undefined" > So if x is undefined and y is undefined you cannot > assume x=y - because if you assume this, then > they would not be undefined anymore. > > q.e.d. > > Regards > Tino > > or null as in empty. an empty result set is a null set, zero results. declare a variable, but never assign a value, it has a default value of null from the declaration. ( basically any content of memory space allocated that was not actually empty is the content, but it's a null value to the app. ) -- ======================================== only plain text format email accepted. smaller file size, no virus transfer no proprietary file formats. ========================================
Attachment
Tom Lane <tgl@sss.pgh.pa.us> writes: > Greg Stark <gsstark@mit.edu> writes: > > Tom Lane <tgl@sss.pgh.pa.us> writes: > >> Direct your complaints to the ISO SQL standards committee. > > > The SQL standard generally treats NULLs as a escape hatch for constraints. Huh? I thought I was agreeing with you. By "escape hatch" I meant that having a NULL value in the indexed column allowed multiple records with otherwise identical values because the NULL compared unequal. Ie, that the unique index doesn't apply to the record with NULL columns. > Not for UNIQUE constraints. SQL92 section 4.10 "Integrity constraints": > > A unique constraint is satisfied if and only if no two rows in > a table have the same non-null values in the unique columns. That's ambiguous. Does it mean no two rows have all non-null columns that are all identical? Or does it mean no two rows have columns that excluding any null columns are identical. It actually sounds more like the latter to me which would mean Postgres's interpretation is wrong. > The short answer to this thread is that the OP is misusing nulls, > and should pick some non-null value to be his "placeholder". We're in "violent agreement". -- greg
Dawid Kuroczko <qnex42@gmail.com> writes: > Don't worry about "index bloat". These additional indexes will be used > only when your main (foo_abc_index) is not used, so there won't be > any duplicate data in them. The main index will have _all_ the tuples in them, even where some of the columns are NULL, so this will in fact use extra space. It will also cause extra i/o on every update of a record with NULL in one of the columns. To minimize the extra space you could make it Dawid Kuroczko <qnex42@gmail.com> writes: CREATE UNIQUE INDEX foo_ab_index ON foo (a,b) WHERE c IS NULL AND a is NOT NULL and b IS NOT NULL; CREATE UNIQUE INDEX foo_ac_index ON foo (a,c) WHERE b IS NULL AND a is NOT NULL and c IS NOT NULL; CREATE UNIQUE INDEX foo_bc_index ON foo (b,c) WHERE a IS NULL AND b is NOT NULL and c IS NOT NULL; CREATE UNIQUE INDEX foo_a_index ON foo (a) WHERE a IS NOT NULL AND b IS NULL and c is NULL; CREATE UNIQUE INDEX foo_b_index ON foo (b) WHERE b IS NOT NULL AND a IS NULL and c is NULL; CREATE UNIQUE INDEX foo_c_index ON foo (c) WHERE c IS NOT NULL AND a IS NULL and b is NULL; To avoid indexing the same tuples in multiple indexes. None of this will prevent you from inserting multiple <null,null,null> records though. -- greg
Greg Stark <gsstark@mit.edu> writes: > Tom Lane <tgl@sss.pgh.pa.us> writes: >> Not for UNIQUE constraints. SQL92 section 4.10 "Integrity constraints": >> >> A unique constraint is satisfied if and only if no two rows in >> a table have the same non-null values in the unique columns. > That's ambiguous. Does it mean no two rows have all non-null columns that are > all identical? Or does it mean no two rows have columns that excluding any > null columns are identical. OK, try the more formal definition in 8.9 <unique predicate> 2) If there are no two rows in T such that the value of each column in one row is non-null and is equal to the value of the cor- responding column in the other row according to Subclause 8.2, "<comparison predicate>", then the result of the <unique predi- cate> is true; otherwise, the result of the <unique predicate> is false. (11.7 defines the UNIQUE constraint in terms of the unique predicate) regards, tom lane
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 I'm sure this won't work for some reason, but something similar might; why not create a unique index on a constant where all three are null; something along these lines (in addition to the others): CREATE UNIQUE INDEX foo_trio_index ON foo (1) WHERE c IS NULL AND a IS NULL and b IS NULL; On Jan 20, 2005, at 10:57 AM, Greg Stark wrote: > Dawid Kuroczko <qnex42@gmail.com> writes: > >> Don't worry about "index bloat". These additional indexes will be >> used >> only when your main (foo_abc_index) is not used, so there won't be >> any duplicate data in them. > > The main index will have _all_ the tuples in them, even where some of > the > columns are NULL, so this will in fact use extra space. It will also > cause > extra i/o on every update of a record with NULL in one of the columns. > > To minimize the extra space you could make it > > Dawid Kuroczko <qnex42@gmail.com> writes: > > CREATE UNIQUE INDEX foo_ab_index ON foo (a,b) WHERE c IS NULL AND a > is NOT NULL and b IS NOT NULL; > CREATE UNIQUE INDEX foo_ac_index ON foo (a,c) WHERE b IS NULL AND a > is NOT NULL and c IS NOT NULL; > CREATE UNIQUE INDEX foo_bc_index ON foo (b,c) WHERE a IS NULL AND b > is NOT NULL and c IS NOT NULL; > CREATE UNIQUE INDEX foo_a_index ON foo (a) WHERE a IS NOT NULL AND > b IS NULL and c is NULL; > CREATE UNIQUE INDEX foo_b_index ON foo (b) WHERE b IS NOT NULL AND > a IS NULL and c is NULL; > CREATE UNIQUE INDEX foo_c_index ON foo (c) WHERE c IS NOT NULL AND > a IS NULL and b is NULL; > > To avoid indexing the same tuples in multiple indexes. > > None of this will prevent you from inserting multiple <null,null,null> > records > though. > > > -- > greg > > > ---------------------------(end of > broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster > > - ----------------------------------------------------------- Frank D. Engel, Jr. <fde101@fjrhome.net> $ ln -s /usr/share/kjvbible /usr/manual $ true | cat /usr/manual | grep "John 3:16" John 3:16 For God so loved the world, that he gave his only begotten Son, that whosoever believeth in him should not perish, but have everlasting life. $ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (Darwin) iD8DBQFB79hL7aqtWrR9cZoRAglUAJ9sT3SypLYDZhx6Dkysfr7aLHQttwCeNLs8 /J4jFlWMLcMMxbQ3/nj55eA= =4Bbe -----END PGP SIGNATURE----- ___________________________________________________________ $0 Web Hosting with up to 120MB web space, 1000 MB Transfer 10 Personalized POP and Web E-mail Accounts, and much more. Signup at www.doteasy.com
On Wed, 2005-01-19 at 22:20, Alex wrote: > > Maybe there could be an option in the creation of the index to indicate > on how to use NULL values. > > How do other DBMS handle this? http://troels.arvin.dk/db/rdbms/
"Frank D. Engel, Jr." <fde101@fjrhome.net> writes: > I'm sure this won't work for some reason, but something similar might; why not > create a unique index on a constant where all three are null; something along > these lines (in addition to the others): > > CREATE UNIQUE INDEX foo_trio_index ON foo (1) WHERE c IS NULL AND a IS NULL and > b IS NULL; > Huh. Hadn't thought of creating an index on a constant. It works if you put an extra set of parentheses in: CREATE UNIQUE INDEX foo_abc_index ON foo ((1)) WHERE ... -- greg
It is clear to me that only allowing a single null value will not violate the explanation below. It would be equally true that allowing multiple null values would not violate it. -----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Tom Lane Sent: Thursday, January 20, 2005 8:03 AM To: Greg Stark Cc: pgsql-general@postgresql.org Subject: Re: [GENERAL] Unique Index Greg Stark <gsstark@mit.edu> writes: > Tom Lane <tgl@sss.pgh.pa.us> writes: >> Not for UNIQUE constraints. SQL92 section 4.10 "Integrity constraints": >> >> A unique constraint is satisfied if and only if no two rows in >> a table have the same non-null values in the unique columns. > That's ambiguous. Does it mean no two rows have all non-null columns that are > all identical? Or does it mean no two rows have columns that excluding any > null columns are identical. OK, try the more formal definition in 8.9 <unique predicate> 2) If there are no two rows in T such that the value of each column in one row is non-null and is equal to the value of the cor- responding column in the other row according to Subclause 8.2, "<comparison predicate>", then the result of the <unique predi- cate> is true; otherwise, the result of the <unique predicate> is false. (11.7 defines the UNIQUE constraint in terms of the unique predicate) regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
On Thu, 20 Jan 2005, Dann Corbit wrote: > It is clear to me that only allowing a single null value will not > violate the explanation below. Given two rows in T with one column each (NULL), (NULL) Find two rows such that the value of each column in one row is non-null and equal to the value of the corresponding column in the other row according to 8.2. If there are no such rows the unique predicate returns true (ie the constraint is satisfied).
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Huh? ... the value of EACH COLUMN in one row is NOT NULL and IS EQUAL to ... In order for values to be equal in SQL, neither one can be null. For this condition to hold, it is more than "clear" that at least one row must contain *NO* *NULL* *VALUES* (that means zero columns in that row may contain null values). Since *ALL* columns in the other row must be EQUAL to the corresponding column in that row, none of them can be null either. Therefore, the uniqueness predicate evaluates to false, and each of the two rows is considered unique compared to the other as soon as any null value shows up in either row. There is *no* ambiguity here! On Jan 20, 2005, at 1:55 PM, Dann Corbit wrote: > It is clear to me that only allowing a single null value will not > violate the explanation below. > > It would be equally true that allowing multiple null values would not > violate it. > > -----Original Message----- > From: pgsql-general-owner@postgresql.org > [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Tom Lane > Sent: Thursday, January 20, 2005 8:03 AM > To: Greg Stark > Cc: pgsql-general@postgresql.org > Subject: Re: [GENERAL] Unique Index > > Greg Stark <gsstark@mit.edu> writes: >> Tom Lane <tgl@sss.pgh.pa.us> writes: >>> Not for UNIQUE constraints. SQL92 section 4.10 "Integrity > constraints": >>> >>> A unique constraint is satisfied if and only if no two rows in >>> a table have the same non-null values in the unique columns. > >> That's ambiguous. Does it mean no two rows have all non-null columns > that are >> all identical? Or does it mean no two rows have columns that excluding > any >> null columns are identical. > > OK, try the more formal definition in 8.9 <unique predicate> > > 2) If there are no two rows in T such that the value of each > column > in one row is non-null and is equal to the value of the > cor- > responding column in the other row according to Subclause > 8.2, > "<comparison predicate>", then the result of the <unique > predi- > cate> is true; otherwise, the result of the <unique > predicate> > is false. > > (11.7 defines the UNIQUE constraint in terms of the unique predicate) > > regards, tom lane > > ---------------------------(end of > broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to > majordomo@postgresql.org > > ---------------------------(end of > broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faqs/FAQ.html > > - ----------------------------------------------------------- Frank D. Engel, Jr. <fde101@fjrhome.net> $ ln -s /usr/share/kjvbible /usr/manual $ true | cat /usr/manual | grep "John 3:16" John 3:16 For God so loved the world, that he gave his only begotten Son, that whosoever believeth in him should not perish, but have everlasting life. $ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (Darwin) iD8DBQFB8AX07aqtWrR9cZoRAlu+AJ0YCv6VEID7MR5BHOf6rRl+gyPeLgCghImV Um7r0pyp2vfsCVArSOkswMM= =kzpI -----END PGP SIGNATURE----- ___________________________________________________________ $0 Web Hosting with up to 120MB web space, 1000 MB Transfer 10 Personalized POP and Web E-mail Accounts, and much more. Signup at www.doteasy.com
Would the constraint not be satisfied if each combination (including NULL) were not also forced to be unique? I maintain that the constraint is still satisfied. So, it is satisfied if I stuff thousands of NULL values in there. And it is satisfied if I only allow a single NULL value. With multiple columns, it might be set such that each combination must be unique, like binary counting. Let me also state that I agree: allowing null values in a unique index is ludicrous. But if it is allowed, I think forcing the combinations to be single valued makes more sense than allowing any number of them. -----Original Message----- From: Stephan Szabo [mailto:sszabo@megazone.bigpanda.com] Sent: Thursday, January 20, 2005 11:14 AM To: Dann Corbit Cc: Tom Lane; Greg Stark; pgsql-general@postgresql.org Subject: Re: [GENERAL] Unique Index On Thu, 20 Jan 2005, Dann Corbit wrote: > It is clear to me that only allowing a single null value will not > violate the explanation below. Given two rows in T with one column each (NULL), (NULL) Find two rows such that the value of each column in one row is non-null and equal to the value of the corresponding column in the other row according to 8.2. If there are no such rows the unique predicate returns true (ie the constraint is satisfied).
"Dann Corbit" <DCorbit@connx.com> writes: > Would the constraint not be satisfied if each combination (including > NULL) were not also forced to be unique? > > I maintain that the constraint is still satisfied. > > So, it is satisfied if I stuff thousands of NULL values in there. > > And it is satisfied if I only allow a single NULL value. You're misreading it. It's not a question of what you allow, it's a question of what's in the table. The database *must* allow anything in the table that would satisfy the constraint. Of course the constraint is satisfied if there's only one NULL value. But the constraint is also satisfied if there are more than one. So the database has to allow you to do either of these since there's no constraint that they violate. > Let me also state that I agree: allowing null values in a unique index > is ludicrous. But if it is allowed, I think forcing the combinations to > be single valued makes more sense than allowing any number of them. In fact allowing NULLs in columns involved in a unique constraint is an optional feature of the SQL standard. But if it's allowed it is required to be done the way Postgres does it. Perhaps you're just thinking of primary keys (which, btw, are *not* allowed to be nullable according to the spec). NULLs are much more likely in a situation where it's not a key field, just a bit of incidental data that we know should be unique. Consider for example a user table where there's a phone number field. Now not all users enter a phone number, but when they do we want to make sure it's unique. Does it make sense to restrict the database to a single user with an unknown phone number? Or consider a product database. Some products have ISBNs, namely books, but not all do. If the product has an ISBN then it really ought to be unique, no other product should have the same ISBN. But any number of products can be non-books and not have an ISBN. -- greg
On Thu, 20 Jan 2005, Dann Corbit wrote: > Would the constraint not be satisfied if each combination (including > NULL) were not also forced to be unique? The constraint would be satisfied, however cases that the constraint is satisfied for would not be allowed. The case I gave below is one for which I argue the constraint is satisfied because the search condition is true. The definition above would appear to not allow that case and as such appears to be contrary to the definition of the constraint. > Let me also state that I agree: allowing null values in a unique index > is ludicrous. But if it is allowed, I think forcing the combinations to > be single valued makes more sense than allowing any number of them. I think that'd be better termed a DISTINCT index to use SQL terminology. > On Thu, 20 Jan 2005, Dann Corbit wrote: > > > It is clear to me that only allowing a single null value will not > > violate the explanation below. > > Given two rows in T with one column each > (NULL), (NULL) > > Find two rows such that the value of each column in one row is non-null > and equal to the value of the corresponding column in the other row > according to 8.2. If there are no such rows the unique predicate returns > true (ie the constraint is satisfied). >
Yes. I was wrong. Sorry about the noise. -----Original Message----- From: Stephan Szabo [mailto:sszabo@megazone.bigpanda.com] Sent: Thursday, January 20, 2005 12:01 PM To: Dann Corbit Cc: Tom Lane; Greg Stark; pgsql-general@postgresql.org Subject: RE: [GENERAL] Unique Index On Thu, 20 Jan 2005, Dann Corbit wrote: > Would the constraint not be satisfied if each combination (including > NULL) were not also forced to be unique? The constraint would be satisfied, however cases that the constraint is satisfied for would not be allowed. The case I gave below is one for which I argue the constraint is satisfied because the search condition is true. The definition above would appear to not allow that case and as such appears to be contrary to the definition of the constraint. > Let me also state that I agree: allowing null values in a unique index > is ludicrous. But if it is allowed, I think forcing the combinations to > be single valued makes more sense than allowing any number of them. I think that'd be better termed a DISTINCT index to use SQL terminology. > On Thu, 20 Jan 2005, Dann Corbit wrote: > > > It is clear to me that only allowing a single null value will not > > violate the explanation below. > > Given two rows in T with one column each > (NULL), (NULL) > > Find two rows such that the value of each column in one row is non-null > and equal to the value of the corresponding column in the other row > according to 8.2. If there are no such rows the unique predicate returns > true (ie the constraint is satisfied). >
Am Donnerstag, den 20.01.2005, 06:09 -0800 schrieb J. Greenlees: > Tino Wildenhain wrote: > > Hi, > > > > Am Mittwoch, den 19.01.2005, 15:02 -0800 schrieb J. Greenlees: > > > >>Roman Neuhauser wrote: > >> > >>># alex@meerkatsoft.com / 2005-01-20 01:35:32 +1100: > >>> > >>> > >>>>i have a unique index on a table over multiple columns. If now one of > >>>>the records has a null value in one of the indexed columns i can insert > >>>>the same record multiple times. > >>>> > >>>>Is this a problem within postgres or expected? > >>> > >>> > >>> In SQL, NULL means "unknown value". How could you assert that two > >>> NULLs are equal? > >>> > >> > >>which doesn't make mathematical sense. > >>mathwise null is an empty result. > >>so setting the logic up using the math logic, null values are always equal. > > > > > > What kind of mathematics you are speaking? > > For example you have "infinity" where infinity is never > > equal to infinity. > > Same with null. Which is "unknown" or "undefined" > > So if x is undefined and y is undefined you cannot > > assume x=y - because if you assume this, then > > they would not be undefined anymore. > > > > q.e.d. > > > > Regards > > Tino > > > > > or null as in empty. > an empty result set is a null set, zero results. > declare a variable, but never assign a value, it has a default value of > null from the declaration. > ( basically any content of memory space allocated that was not actually > empty is the content, but it's a null value to the app. ) No. Empty result set is just a set without elements. zero-length list or tuple or whatever your programming language uses :-) [] != null/undefined/None