Thread: Unique Index

Unique Index

From

Alex

Date:

19 January 2005, 15:03:27

Hi,

i have a unique index on a table over multiple columns. If now one of
the records has a null value in one of the indexed columns i can insert
the same record multiple times.

Is this a problem within postgres or expected?

Example:

index unique, btree (colA, colB, colC);

would still allow me to insert

AAAA, ,CCCC
AAAA, ,CCCC


Thanks for any help on that.
A

Re: Unique Index

From

Stephan Szabo

Date:

19 January 2005, 15:25:21

On Thu, 20 Jan 2005, Alex wrote:

> i have a unique index on a table over multiple columns. If now one of
> the records has a null value in one of the indexed columns i can insert
> the same record multiple times.
>
> Is this a problem within postgres or expected?

Expected. NULLs are effectively not considered as the same for the
purposes of UNIQUE.

The predicate basically functions as:

2) If there are no two rows in T such that the value of each column
in one row is non-null and is equal to the value of the cor-
responding column in the other row according to Subclause 8.2,
"<comparison predicate>", then the result of the <unique predi-
cate> is true; otherwise, the result of the <unique predicate>
is false.

Re: Unique Index

From

Roman Neuhauser

Date:

19 January 2005, 16:58:37

# alex@meerkatsoft.com / 2005-01-20 01:35:32 +1100:
> i have a unique index on a table over multiple columns. If now one of
> the records has a null value in one of the indexed columns i can insert
> the same record multiple times.
>
> Is this a problem within postgres or expected?

    In SQL, NULL means "unknown value". How could you assert that two
    NULLs are equal?

--
If you cc me or remove the list(s) completely I'll most likely ignore
your message.    see http://www.eyrie.org./~eagle/faqs/questions.html

Re: Unique Index

From

"J. Greenlees"

Date:

19 January 2005, 23:04:17

Roman Neuhauser wrote:
> # alex@meerkatsoft.com / 2005-01-20 01:35:32 +1100:
>
>>i have a unique index on a table over multiple columns. If now one of
>>the records has a null value in one of the indexed columns i can insert
>>the same record multiple times.
>>
>>Is this a problem within postgres or expected?
>
>
>     In SQL, NULL means "unknown value". How could you assert that two
>     NULLs are equal?
>
which doesn't make mathematical sense.
mathwise null is an empty result.
so setting the logic up using the math logic, null values are always equal.

--
========================================

only plain text format email accepted.

smaller file size, no virus transfer
no proprietary file formats.

========================================

Attachment

smime.p7s

Re: Unique Index

From

vhikida@inreach.com

Date:

19 January 2005, 23:29:58

According to Date you should never use NULLs. This is because a NULL can
mean many different things. It can mean not known (e.g. I know he has an
age but I don't know what it is), It can be not applicable (e.g. in a
Party table of organizations and people, people would be of a certain sex
but an organization would not), It can mean a number divided by zero, It
can also mean I don't know if the value is unknown or if the value is not
applicable etc. etc. In Date's view there is an infinite number of
meanings for null and for a database to handle it correctly would require
an equivalent set of operators for each type of null.

I think though that Nulls are a very useful feature of SQL databases. It's
just that when I write a program, I have to know in a particular context
what a null means and handle it appropriately (of course controlling nulls
in an index is a different matter). I don't think there is only one
accepted way to use nulls.

> Roman Neuhauser wrote:
>> # alex@meerkatsoft.com / 2005-01-20 01:35:32 +1100:
>>
>>>i have a unique index on a table over multiple columns. If now one of
>>>the records has a null value in one of the indexed columns i can insert
>>>the same record multiple times.
>>>
>>>Is this a problem within postgres or expected?
>>
>>
>>     In SQL, NULL means "unknown value". How could you assert that two
>>     NULLs are equal?
>>
> which doesn't make mathematical sense.
> mathwise null is an empty result.
> so setting the logic up using the math logic, null values are always
> equal.
>
> --
> ========================================
>
> only plain text format email accepted.
>
> smaller file size, no virus transfer
> no proprietary file formats.
>
> ========================================
>

Re: Unique Index

From

"Dann Corbit"

Date:

19 January 2005, 23:55:49

Null values are a big surprise to almost every end-user (though the
programmers are OK with them).

Look at the astonishment on the face of your end user when you tell them
that:
SELECT COUNT(*) FROM clothing WHERE clothing_color = 'green'
+
SELECT COUNT(*) FROM clothing WHERE NOT clothing_color = 'green'

Is not the count of all clothing because clothing without a color
recorded will not be counted.

Or (perhaps better yet, violating trichotomy) ...
If <Some_column> has a null numeric value, then ALL of the following are
FALSE for that case:

Some_column < 0
Some_column > 0
Some_column = 0
Some_column <> 0 // This is the one that many find surprising
Some_column <= 0
Some_column >= 0

You can probably see why Null values can do strange things in (for
instance) an index.

Even at that, I think that being able to insert more than one null value
into a unique index should be considered as a bug (or diagnosed as an
error).

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of
vhikida@inreach.com
Sent: Wednesday, January 19, 2005 3:30 PM
To: J. Greenlees
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Unique Index

According to Date you should never use NULLs. This is because a NULL can
mean many different things. It can mean not known (e.g. I know he has an
age but I don't know what it is), It can be not applicable (e.g. in a
Party table of organizations and people, people would be of a certain
sex
but an organization would not), It can mean a number divided by zero, It
can also mean I don't know if the value is unknown or if the value is
not
applicable etc. etc. In Date's view there is an infinite number of
meanings for null and for a database to handle it correctly would
require
an equivalent set of operators for each type of null.

I think though that Nulls are a very useful feature of SQL databases.
It's
just that when I write a program, I have to know in a particular context
what a null means and handle it appropriately (of course controlling
nulls
in an index is a different matter). I don't think there is only one
accepted way to use nulls.

Re: Unique Index

From

Tom Lane

Date:

20 January 2005, 00:18:51

"Dann Corbit" <DCorbit@connx.com> writes:
> Or (perhaps better yet, violating trichotomy) ...
> If <Some_column> has a null numeric value, then ALL of the following are
> FALSE for that case:

> Some_column < 0
> Some_column > 0
> Some_column = 0
> Some_column <> 0 // This is the one that many find surprising
> Some_column <= 0
> Some_column >= 0

It's worse than that: the above do *not* yield FALSE, they yield NULL.
Which does act like FALSE in a simple WHERE clause, but there are other
cases (like CHECK clauses) where it doesn't.  "x NOT IN (SELECT ...)"
is a case that newbies routinely get bitten by.

> Even at that, I think that being able to insert more than one null value
> into a unique index should be considered as a bug (or diagnosed as an
> error).

Direct your complaints to the ISO SQL standards committee.

            regards, tom lane

Re: Unique Index

From

"Dann Corbit"

Date:

20 January 2005, 00:25:51

The ISO SQL Standard does not even define an index, and so any index is
an extension to the standard (though primary keys and foreign keys imply
them).

At least in the SQL Standard that I have (ANSI/ISO/IEC 9075-1-1999 and
related documents) has no definition of an index.  Perhaps the newer
version contains such a definition.

So, in the creation of an extension, I think it is up to the programmer
to do whatever is best.

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Wednesday, January 19, 2005 4:18 PM
To: Dann Corbit
Cc: vhikida@inreach.com; J. Greenlees; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Unique Index

"Dann Corbit" <DCorbit@connx.com> writes:
> Or (perhaps better yet, violating trichotomy) ...
> If <Some_column> has a null numeric value, then ALL of the following
are
> FALSE for that case:

> Some_column < 0
> Some_column > 0
> Some_column = 0
> Some_column <> 0 // This is the one that many find surprising
> Some_column <= 0
> Some_column >= 0

It's worse than that: the above do *not* yield FALSE, they yield NULL.
Which does act like FALSE in a simple WHERE clause, but there are other
cases (like CHECK clauses) where it doesn't.  "x NOT IN (SELECT ...)"
is a case that newbies routinely get bitten by.

> Even at that, I think that being able to insert more than one null
value
> into a unique index should be considered as a bug (or diagnosed as an
> error).

Direct your complaints to the ISO SQL standards committee.

            regards, tom lane

Re: Unique Index

From

Stephan Szabo

Date:

20 January 2005, 00:28:57

On Wed, 19 Jan 2005, Dann Corbit wrote:

> Even at that, I think that being able to insert more than one null value
> into a unique index should be considered as a bug (or diagnosed as an
> error).

AFAICT the UNIQUE constraint that it's used to model explicitly allows
multiple NULLs in the spec so I don't see making it error as being
terribly workable.

Re: Unique Index

From

"Dann Corbit"

Date:

20 January 2005, 00:29:04

True, but the standard says nothing about the creation of an index, so
you can make it behave in any way that you see fit.

-----Original Message-----
From: Stephan Szabo [mailto:sszabo@megazone.bigpanda.com]
Sent: Wednesday, January 19, 2005 4:27 PM
To: Dann Corbit
Cc: vhikida@inreach.com; J. Greenlees; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Unique Index

On Wed, 19 Jan 2005, Dann Corbit wrote:

> Even at that, I think that being able to insert more than one null
value
> into a unique index should be considered as a bug (or diagnosed as an
> error).

AFAICT the UNIQUE constraint that it's used to model explicitly allows
multiple NULLs in the spec so I don't see making it error as being
terribly workable.

Re: Unique Index

From

Stephan Szabo

Date:

20 January 2005, 01:16:52

On Wed, 19 Jan 2005, Dann Corbit wrote:

> True, but the standard says nothing about the creation of an index, so
> you can make it behave in any way that you see fit.

The unique index is however used to model the unique constraint in
PostgreSQL which I had thought was clear from my statement so giving the
unique index behavior which makes it unable to model the constraint
wouldn't be terribly workable (without rewriting the constraint to be
modeled in a separate fashion).

> -----Original Message-----
> From: Stephan Szabo [mailto:sszabo@megazone.bigpanda.com]
> Sent: Wednesday, January 19, 2005 4:27 PM
> To: Dann Corbit
> Cc: vhikida@inreach.com; J. Greenlees; pgsql-general@postgresql.org
> Subject: Re: [GENERAL] Unique Index
>
>
> On Wed, 19 Jan 2005, Dann Corbit wrote:
>
> > Even at that, I think that being able to insert more than one null
> value
> > into a unique index should be considered as a bug (or diagnosed as an
> > error).
>
> AFAICT the UNIQUE constraint that it's used to model explicitly allows
> multiple NULLs in the spec so I don't see making it error as being
> terribly workable.
>

Re: Unique Index

From

Stephan Szabo

Date:

20 January 2005, 02:50:14

On Wed, 19 Jan 2005, Stephan Szabo wrote:

>
> On Wed, 19 Jan 2005, Dann Corbit wrote:
>
> > True, but the standard says nothing about the creation of an index, so
> > you can make it behave in any way that you see fit.
>
> The unique index is however used to model the unique constraint in
> PostgreSQL which I had thought was clear from my statement so giving the
> unique index behavior which makes it unable to model the constraint
> wouldn't be terribly workable (without rewriting the constraint to be
> modeled in a separate fashion).

Actually, thinking about it, I think I generally disagree with the idea
that unique is the correct word to use in any case.  It's used in both the
constraint and in the predicate to have its current meaning (as described
in my earlier message) and I think the word distinct already the semantics
of treating two NULLs in the fashion such that an index wouldn't allow
two.

Re: Unique Index

From

Alex

Date:

20 January 2005, 04:23:07

I actually just wanted to know if there is a way around this problem.
Obviously it is implemented that way for whatever reason.

I still though think some arguments given in some of the replies, while
probably correct, are besides the point.

I use a unique index that may contain null values. On an insert or
update I can now not rely on the exception thrown but actually have to
write a select statement to check if the same row exists, which I
believe defies ONE purpose of having unique indices. Whether Null is
associated with "unknown value", "divided by zero"... or however one
wants to interpret it is not the issue here, in my view NULL in the same
column have the same value or at least should be treated the same. (If I
want to differentiate the state, I would use a code instead of NULL as a
NULL does not give any indication of its meaning, thus we could safely
assume they are treated as equal).

Maybe there could be an option in the creation of the index to indicate
on how to use NULL values.

How do other DBMS handle this?

A

Tom Lane wrote:

>"Dann Corbit" <DCorbit@connx.com> writes:
>
>
>>Or (perhaps better yet, violating trichotomy) ...
>>If <Some_column> has a null numeric value, then ALL of the following are
>>FALSE for that case:
>>
>>
>
>
>
>>Some_column < 0
>>Some_column > 0
>>Some_column = 0
>>Some_column <> 0 // This is the one that many find surprising
>>Some_column <= 0
>>Some_column >= 0
>>
>>
>
>It's worse than that: the above do *not* yield FALSE, they yield NULL.
>Which does act like FALSE in a simple WHERE clause, but there are other
>cases (like CHECK clauses) where it doesn't.  "x NOT IN (SELECT ...)"
>is a case that newbies routinely get bitten by.
>
>
>
>>Even at that, I think that being able to insert more than one null value
>>into a unique index should be considered as a bug (or diagnosed as an
>>error).
>>
>>
>
>Direct your complaints to the ISO SQL standards committee.
>
>            regards, tom lane
>
>---------------------------(end of broadcast)---------------------------
>TIP 8: explain analyze is your friend
>
>
>
>

Re: Unique Index

From

Greg Stark

Date:

20 January 2005, 04:53:29

Tom Lane <tgl@sss.pgh.pa.us> writes:

> "Dann Corbit" <DCorbit@connx.com> writes:
>
> > Even at that, I think that being able to insert more than one null value
> > into a unique index should be considered as a bug (or diagnosed as an
> > error).
>
> Direct your complaints to the ISO SQL standards committee.

The SQL standard generally treats NULLs as a escape hatch for constraints.

That's true for CHECK constraints as well: if you have a "CHECK a>0"
constraint but a is nullable then a NULL value is allowed even though the
check constraint can't be verified.

This isn't an unreasonable approach. Once you have NULLs it makes more sense
to make them useful than to treat them as just another value. You can always
make the column NOT NULL if you don't want any exceptions to your constraints.

--
greg

Re: Unique Index

From

Greg Stark

Date:

20 January 2005, 06:32:29

Alex <alex@meerkatsoft.com> writes:

> I actually just wanted to know if there is a way around this problem. Obviously
> it is implemented that way for whatever reason.

The way around is to make all the columns NOT NULL. For most applications
unique indexes don't make much sense on nullable columns.

> (If I want to differentiate the state, I would use a code instead of NULL as
> a NULL does not give any indication of its meaning, thus we could safely
> assume they are treated as equal).

I think you have that backwards. You use NULL in the case where you want all
cases to compare as unknown values. If you want them to compare as known
values then you should use a special value.

In other words, NULL has special properties. You should use it if those
special properties are what you want. If you want the system to enforce a
unique constraint on the special value then you probably don't want to be
using NULL for that special state.

Specifically if you find yourself saying "we could safely assume they are
treated equal" then NULL is almost certainly not what you want to represent
that. NULL never compares equal to anything.

--
greg

Re: Unique Index

From

David Garamond

Date:

20 January 2005, 07:03:25

Dann Corbit wrote:
> True, but the standard says nothing about the creation of an index, so
> you can make it behave in any way that you see fit.

But I thought we are talking about unique _constraint_ here (which is
certainly regulated by the standard).

--
dave

Re: Unique Index

From

Michael Glaesemann

Date:

20 January 2005, 07:32:46

On Jan 20, 2005, at 16:03, David Garamond wrote:

> Dann Corbit wrote:
>> True, but the standard says nothing about the creation of an index, so
>> you can make it behave in any way that you see fit.
>
> But I thought we are talking about unique _constraint_ here (which is
> certainly regulated by the standard).

They could conceivably be separated. The standard likewise doesn't say
anything about how the unique constraint is enforced. In PostgreSQL, a
"unique" index is used to enforce the constraint, but the unique index
is not intrinsically tied to the unique constraint.

Michael Glaesemann
grzm myrealbox com

Re: Unique Index

From

"Vincent Hikida"

Date:

20 January 2005, 07:33:24


>I actually just wanted to know if there is a way around this problem.
>Obviously it is implemented that way for whatever reason.
>
> I still though think some arguments given in some of the replies, while
> probably correct, are besides the point.

Sorry. I was hoping someone else would answer.
>
> I use a unique index that may contain null values. On an insert or update
> I can now not rely on the exception thrown but actually have to write a
> select statement to check if the same row exists, which I believe defies
> ONE purpose of having unique indices. Whether Null is associated with
> "unknown value", "divided by zero"... or however one wants to interpret it
> is not the issue here, in my view NULL in the same column have the same
> value or at least should be treated the same. (If I want to differentiate
> the state, I would use a code instead of NULL as a NULL does not give any
> indication of its meaning, thus we could safely assume they are treated as
> equal).
>
> Maybe there could be an option in the creation of the index to indicate on
> how to use NULL values.

I can think of two options.

One was mentioned already. If only one row can have a null value then it
seems to me that you should make it a non null and null would have a special
code.

If it really needs to be null. Then a rather messy solution would be to have
a second column (I'll call it a null indicator) which can only be 1 or null
and have a unique index on it.

colA        ind
------       -----
1             null
2             null
3             null
null         1

>
> How do other DBMS handle this?

Oracle is the same.
>
> A
>
>
>
>
>
> Tom Lane wrote:
>
>>"Dann Corbit" <DCorbit@connx.com> writes:
>>
>>>Or (perhaps better yet, violating trichotomy) ...
>>>If <Some_column> has a null numeric value, then ALL of the following are
>>>FALSE for that case:
>>>
>>
>>
>>>Some_column < 0
>>>Some_column > 0
>>>Some_column = 0 Some_column <> 0 // This is the one that many find
>>>surprising
>>>Some_column <= 0
>>>Some_column >= 0
>>>
>>
>>It's worse than that: the above do *not* yield FALSE, they yield NULL.
>>Which does act like FALSE in a simple WHERE clause, but there are other
>>cases (like CHECK clauses) where it doesn't.  "x NOT IN (SELECT ...)"
>>is a case that newbies routinely get bitten by.
>>
>>
>>>Even at that, I think that being able to insert more than one null value
>>>into a unique index should be considered as a bug (or diagnosed as an
>>>error).
>>>
>>
>>Direct your complaints to the ISO SQL standards committee.
>>
>> regards, tom lane
>>
>>---------------------------(end of broadcast)---------------------------
>>TIP 8: explain analyze is your friend
>>
>>
>>
>
>
>

Re: Unique Index

From

Tino Wildenhain

Date:

20 January 2005, 07:34:42

Hi,

Am Mittwoch, den 19.01.2005, 15:02 -0800 schrieb J. Greenlees:
> Roman Neuhauser wrote:
> > # alex@meerkatsoft.com / 2005-01-20 01:35:32 +1100:
> >
> >>i have a unique index on a table over multiple columns. If now one of
> >>the records has a null value in one of the indexed columns i can insert
> >>the same record multiple times.
> >>
> >>Is this a problem within postgres or expected?
> >
> >
> >     In SQL, NULL means "unknown value". How could you assert that two
> >     NULLs are equal?
> >
> which doesn't make mathematical sense.
> mathwise null is an empty result.
> so setting the logic up using the math logic, null values are always equal.

What kind of mathematics you are speaking?
For example you have "infinity" where infinity is never
equal to infinity.
Same with null. Which is "unknown" or "undefined"
So if x is undefined and y is undefined you cannot
assume x=y  - because if you assume this, then
they would not be undefined anymore.

q.e.d.

Regards
Tino

Re: Unique Index

From

Dawid Kuroczko

Date:

20 January 2005, 08:27:50

On Thu, 20 Jan 2005 15:20:26 +1100, Alex <alex@meerkatsoft.com> wrote:
> I actually just wanted to know if there is a way around this problem.
> Obviously it is implemented that way for whatever reason.

Well, if you really need it, partial indexes are your friends! :)

For clarity, let's say you have:
CREATE TABLE foo (
   a int,
   b int,
   c int,
);
And an INDEX:
CREATE UNIQUE INDEX foo_abc_index ON foo (a,b,c);

Now, you want to make sure a and b are UNIQUE, when c is null; just do:
CREATE UNIQUE INDEX foo_abN_index ON foo (a,b) WHERE c IS NULL;

Or even, to make b UNIQUE when a and c are null:
CREATE UNIQUE INDEX foo_NbN_index ON foo (b) WHERE a IS NULL AND c IS NULL;

You need to create such partial indexes for each set of columns
you want to be unique-with-null.

Don't worry about "index bloat".  These additional indexes will be used
only when your main (foo_abc_index) is not used, so there won't be
any duplicate data in them.

Isn't PostgreSQL great? :)

   Regards,
      Dawid

Re: Unique Index

From

Martijn van Oosterhout

Date:

20 January 2005, 12:56:28

On Thu, Jan 20, 2005 at 04:32:37PM +0900, Michael Glaesemann wrote:
>
> On Jan 20, 2005, at 16:03, David Garamond wrote:
>
> >Dann Corbit wrote:
> >>True, but the standard says nothing about the creation of an index, so
> >>you can make it behave in any way that you see fit.
> >
> >But I thought we are talking about unique _constraint_ here (which is
> >certainly regulated by the standard).
>
> They could conceivably be separated. The standard likewise doesn't say
> anything about how the unique constraint is enforced. In PostgreSQL, a
> "unique" index is used to enforce the constraint, but the unique index
> is not intrinsically tied to the unique constraint.

Ofcourse, but then you'd have to create another UNIQUE index type just
to handle the constraint, so you could define two types of unique:

CREATE UNIQUE INDEX blah on (...)
CREATE POSTGRESQL_STRANGE_UNIQUE INDEX blah on  (...)

the former being used by CREATE CONSTRAINT. After all, we'd just be
incompatable with everyone else, no point using a standard keyword.

This is just more confusing when there are already perfectly workable
solutions to the problem as given. Don't use NULL when it's not
appropriate.

Hope this helps,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment

msg-32091-91241.dat

Re: Unique Index

From

Tom Lane

Date:

20 January 2005, 13:14:31

Greg Stark <gsstark@mit.edu> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> Direct your complaints to the ISO SQL standards committee.

> The SQL standard generally treats NULLs as a escape hatch for constraints.

Not for UNIQUE constraints.  SQL92 section 4.10 "Integrity constraints":

         A unique constraint is satisfied if and only if no two rows in
         a table have the same non-null values in the unique columns. In
         addition, if the unique constraint was defined with PRIMARY KEY,
         then it requires that none of the values in the specified column or
         columns be the null value.

The short answer to this thread is that the OP is misusing nulls,
and should pick some non-null value to be his "placeholder".

            regards, tom lane

Re: Unique Index

From

"J. Greenlees"

Date:

20 January 2005, 14:09:44

Tino Wildenhain wrote:
> Hi,
>
> Am Mittwoch, den 19.01.2005, 15:02 -0800 schrieb J. Greenlees:
>
>>Roman Neuhauser wrote:
>>
>>># alex@meerkatsoft.com / 2005-01-20 01:35:32 +1100:
>>>
>>>
>>>>i have a unique index on a table over multiple columns. If now one of
>>>>the records has a null value in one of the indexed columns i can insert
>>>>the same record multiple times.
>>>>
>>>>Is this a problem within postgres or expected?
>>>
>>>
>>>    In SQL, NULL means "unknown value". How could you assert that two
>>>    NULLs are equal?
>>>
>>
>>which doesn't make mathematical sense.
>>mathwise null is an empty result.
>>so setting the logic up using the math logic, null values are always equal.
>
>
> What kind of mathematics you are speaking?
> For example you have "infinity" where infinity is never
> equal to infinity.
> Same with null. Which is "unknown" or "undefined"
> So if x is undefined and y is undefined you cannot
> assume x=y  - because if you assume this, then
> they would not be undefined anymore.
>
> q.e.d.
>
> Regards
> Tino
>
>
or null as in empty.
an empty result set is a null set, zero results.
declare a variable, but never assign a value, it has a default value of
null from the declaration.
( basically any content of memory space allocated that was not actually
empty is the content, but it's a null value to the app. )

--
========================================

only plain text format email accepted.

smaller file size, no virus transfer
no proprietary file formats.

========================================

Attachment

smime.p7s

Re: Unique Index

From

Greg Stark

Date:

20 January 2005, 15:51:28

Tom Lane <tgl@sss.pgh.pa.us> writes:

> Greg Stark <gsstark@mit.edu> writes:
> > Tom Lane <tgl@sss.pgh.pa.us> writes:
> >> Direct your complaints to the ISO SQL standards committee.
>
> > The SQL standard generally treats NULLs as a escape hatch for constraints.

Huh? I thought I was agreeing with you. By "escape hatch" I meant that having
a NULL value in the indexed column allowed multiple records with otherwise
identical values because the NULL compared unequal. Ie, that the unique index
doesn't apply to the record with NULL columns.

> Not for UNIQUE constraints.  SQL92 section 4.10 "Integrity constraints":
>
>          A unique constraint is satisfied if and only if no two rows in
>          a table have the same non-null values in the unique columns.

That's ambiguous. Does it mean no two rows have all non-null columns that are
all identical? Or does it mean no two rows have columns that excluding any
null columns are identical.

It actually sounds more like the latter to me which would mean Postgres's
interpretation is wrong.

> The short answer to this thread is that the OP is misusing nulls,
> and should pick some non-null value to be his "placeholder".

We're in "violent agreement".

--
greg

Re: Unique Index

From

Greg Stark

Date:

20 January 2005, 15:57:52

Dawid Kuroczko <qnex42@gmail.com> writes:

> Don't worry about "index bloat".  These additional indexes will be used
> only when your main (foo_abc_index) is not used, so there won't be
> any duplicate data in them.

The main index will have _all_ the tuples in them, even where some of the
columns are NULL, so this will in fact use extra space. It will also cause
extra i/o on every update of a record with NULL in one of the columns.

To minimize the extra space you could make it

Dawid Kuroczko <qnex42@gmail.com> writes:

 CREATE UNIQUE INDEX foo_ab_index ON foo (a,b) WHERE c IS NULL AND a is NOT NULL and b IS NOT NULL;
 CREATE UNIQUE INDEX foo_ac_index ON foo (a,c) WHERE b IS NULL AND a is NOT NULL and c IS NOT NULL;
 CREATE UNIQUE INDEX foo_bc_index ON foo (b,c) WHERE a IS NULL AND b is NOT NULL and c IS NOT NULL;
 CREATE UNIQUE INDEX foo_a_index ON foo (a)   WHERE a IS NOT NULL AND b IS NULL and c is NULL;
 CREATE UNIQUE INDEX foo_b_index ON foo (b)   WHERE b IS NOT NULL AND a IS NULL and c is NULL;
 CREATE UNIQUE INDEX foo_c_index ON foo (c)   WHERE c IS NOT NULL AND a IS NULL and b is NULL;

To avoid indexing the same tuples in multiple indexes.

None of this will prevent you from inserting multiple <null,null,null> records
though.

--
greg

Re: Unique Index

From

Tom Lane

Date:

20 January 2005, 16:02:49

Greg Stark <gsstark@mit.edu> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> Not for UNIQUE constraints.  SQL92 section 4.10 "Integrity constraints":
>>
>> A unique constraint is satisfied if and only if no two rows in
>> a table have the same non-null values in the unique columns.

> That's ambiguous. Does it mean no two rows have all non-null columns that are
> all identical? Or does it mean no two rows have columns that excluding any
> null columns are identical.

OK, try the more formal definition in 8.9 <unique predicate>

         2) If there are no two rows in T such that the value of each column
            in one row is non-null and is equal to the value of the cor-
            responding column in the other row according to Subclause 8.2,
            "<comparison predicate>", then the result of the <unique predi-
            cate> is true; otherwise, the result of the <unique predicate>
            is false.

(11.7 defines the UNIQUE constraint in terms of the unique predicate)

            regards, tom lane

Re: Unique Index

From

"Frank D. Engel, Jr."

Date:

20 January 2005, 16:12:08

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I'm sure this won't work for some reason, but something similar might;
why not create a unique index on a constant where all three are null;
something along these lines (in addition to the others):

CREATE UNIQUE INDEX foo_trio_index ON foo (1) WHERE c IS NULL AND a IS
NULL and b IS NULL;

On Jan 20, 2005, at 10:57 AM, Greg Stark wrote:

> Dawid Kuroczko <qnex42@gmail.com> writes:
>
>> Don't worry about "index bloat".  These additional indexes will be
>> used
>> only when your main (foo_abc_index) is not used, so there won't be
>> any duplicate data in them.
>
> The main index will have _all_ the tuples in them, even where some of
> the
> columns are NULL, so this will in fact use extra space. It will also
> cause
> extra i/o on every update of a record with NULL in one of the columns.
>
> To minimize the extra space you could make it
>
> Dawid Kuroczko <qnex42@gmail.com> writes:
>
>  CREATE UNIQUE INDEX foo_ab_index ON foo (a,b) WHERE c IS NULL AND a
> is NOT NULL and b IS NOT NULL;
>  CREATE UNIQUE INDEX foo_ac_index ON foo (a,c) WHERE b IS NULL AND a
> is NOT NULL and c IS NOT NULL;
>  CREATE UNIQUE INDEX foo_bc_index ON foo (b,c) WHERE a IS NULL AND b
> is NOT NULL and c IS NOT NULL;
>  CREATE UNIQUE INDEX foo_a_index ON foo (a)   WHERE a IS NOT NULL AND
> b IS NULL and c is NULL;
>  CREATE UNIQUE INDEX foo_b_index ON foo (b)   WHERE b IS NOT NULL AND
> a IS NULL and c is NULL;
>  CREATE UNIQUE INDEX foo_c_index ON foo (c)   WHERE c IS NOT NULL AND
> a IS NULL and b is NULL;
>
> To avoid indexing the same tuples in multiple indexes.
>
> None of this will prevent you from inserting multiple <null,null,null>
> records
> though.
>
>
> --
> greg
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
>
>
- -----------------------------------------------------------
Frank D. Engel, Jr.  <fde101@fjrhome.net>

$ ln -s /usr/share/kjvbible /usr/manual
$ true | cat /usr/manual | grep "John 3:16"
John 3:16 For God so loved the world, that he gave his only begotten
Son, that whosoever believeth in him should not perish, but have
everlasting life.
$
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (Darwin)

iD8DBQFB79hL7aqtWrR9cZoRAglUAJ9sT3SypLYDZhx6Dkysfr7aLHQttwCeNLs8
/J4jFlWMLcMMxbQ3/nj55eA=
=4Bbe
-----END PGP SIGNATURE-----



___________________________________________________________
$0 Web Hosting with up to 120MB web space, 1000 MB Transfer
10 Personalized POP and Web E-mail Accounts, and much more.
Signup at www.doteasy.com

Re: Unique Index

From

Scott Marlowe

Date:

20 January 2005, 16:27:42

On Wed, 2005-01-19 at 22:20, Alex wrote:

>
> Maybe there could be an option in the creation of the index to indicate
> on how to use NULL values.
>
> How do other DBMS handle this?

http://troels.arvin.dk/db/rdbms/

Re: Unique Index

From

Greg Stark

Date:

20 January 2005, 18:02:43

"Frank D. Engel, Jr." <fde101@fjrhome.net> writes:

> I'm sure this won't work for some reason, but something similar might; why not
> create a unique index on a constant where all three are null; something along
> these lines (in addition to the others):
>
> CREATE UNIQUE INDEX foo_trio_index ON foo (1) WHERE c IS NULL AND a IS NULL and
> b IS NULL;
>

Huh. Hadn't thought of creating an index on a constant. It works if you put an
extra set of parentheses in:

CREATE UNIQUE INDEX foo_abc_index ON foo ((1)) WHERE ...

--
greg

Re: Unique Index

From

"Dann Corbit"

Date:

20 January 2005, 18:58:01

It is clear to me that only allowing a single null value will not
violate the explanation below.

It would be equally true that allowing multiple null values would not
violate it.

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Tom Lane
Sent: Thursday, January 20, 2005 8:03 AM
To: Greg Stark
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Unique Index

Greg Stark <gsstark@mit.edu> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> Not for UNIQUE constraints.  SQL92 section 4.10 "Integrity
constraints":
>>
>> A unique constraint is satisfied if and only if no two rows in
>> a table have the same non-null values in the unique columns.

> That's ambiguous. Does it mean no two rows have all non-null columns
that are
> all identical? Or does it mean no two rows have columns that excluding
any
> null columns are identical.

OK, try the more formal definition in 8.9 <unique predicate>

         2) If there are no two rows in T such that the value of each
column
            in one row is non-null and is equal to the value of the cor-
            responding column in the other row according to Subclause
8.2,
            "<comparison predicate>", then the result of the <unique
predi-
            cate> is true; otherwise, the result of the <unique
predicate>
            is false.

(11.7 defines the UNIQUE constraint in terms of the unique predicate)

            regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

Re: Unique Index

From

Stephan Szabo

Date:

20 January 2005, 19:27:20

On Thu, 20 Jan 2005, Dann Corbit wrote:

> It is clear to me that only allowing a single null value will not
> violate the explanation below.

Given two rows in T with one column each
(NULL), (NULL)

Find two rows such that the value of each column in one row is non-null
and equal to the value of the corresponding column in the other row
according to 8.2. If there are no such rows the unique predicate returns
true (ie the constraint is satisfied).

Re: Unique Index

From

"Frank D. Engel, Jr."

Date:

20 January 2005, 19:37:41

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Huh?

... the value of EACH COLUMN in one row is NOT NULL and IS EQUAL to ...

In order for values to be equal in SQL, neither one can be null.  For
this condition to hold, it is more than "clear" that at least one row
must contain *NO* *NULL* *VALUES* (that means zero columns in that row
may contain null values).  Since *ALL* columns in the other row must be
EQUAL to the corresponding column in that row, none of them can be null
either.  Therefore, the uniqueness predicate evaluates to false, and
each of the two rows is considered unique compared to the other as soon
as any null value shows up in either row.

There is *no* ambiguity here!

On Jan 20, 2005, at 1:55 PM, Dann Corbit wrote:

> It is clear to me that only allowing a single null value will not
> violate the explanation below.
>
> It would be equally true that allowing multiple null values would not
> violate it.
>
> -----Original Message-----
> From: pgsql-general-owner@postgresql.org
> [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Tom Lane
> Sent: Thursday, January 20, 2005 8:03 AM
> To: Greg Stark
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] Unique Index
>
> Greg Stark <gsstark@mit.edu> writes:
>> Tom Lane <tgl@sss.pgh.pa.us> writes:
>>> Not for UNIQUE constraints.  SQL92 section 4.10 "Integrity
> constraints":
>>>
>>> A unique constraint is satisfied if and only if no two rows in
>>> a table have the same non-null values in the unique columns.
>
>> That's ambiguous. Does it mean no two rows have all non-null columns
> that are
>> all identical? Or does it mean no two rows have columns that excluding
> any
>> null columns are identical.
>
> OK, try the more formal definition in 8.9 <unique predicate>
>
>          2) If there are no two rows in T such that the value of each
> column
>             in one row is non-null and is equal to the value of the
> cor-
>             responding column in the other row according to Subclause
> 8.2,
>             "<comparison predicate>", then the result of the <unique
> predi-
>             cate> is true; otherwise, the result of the <unique
> predicate>
>             is false.
>
> (11.7 defines the UNIQUE constraint in terms of the unique predicate)
>
>             regards, tom lane
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to
> majordomo@postgresql.org
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faqs/FAQ.html
>
>
- -----------------------------------------------------------
Frank D. Engel, Jr.  <fde101@fjrhome.net>

$ ln -s /usr/share/kjvbible /usr/manual
$ true | cat /usr/manual | grep "John 3:16"
John 3:16 For God so loved the world, that he gave his only begotten
Son, that whosoever believeth in him should not perish, but have
everlasting life.
$
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (Darwin)

iD8DBQFB8AX07aqtWrR9cZoRAlu+AJ0YCv6VEID7MR5BHOf6rRl+gyPeLgCghImV
Um7r0pyp2vfsCVArSOkswMM=
=kzpI
-----END PGP SIGNATURE-----



___________________________________________________________
$0 Web Hosting with up to 120MB web space, 1000 MB Transfer
10 Personalized POP and Web E-mail Accounts, and much more.
Signup at www.doteasy.com

Re: Unique Index

From

"Dann Corbit"

Date:

20 January 2005, 19:41:36

Would the constraint not be satisfied if each combination (including
NULL) were not also forced to be unique?

I maintain that the constraint is still satisfied.

So, it is satisfied if I stuff thousands of NULL values in there.

And it is satisfied if I only allow a single NULL value.

With multiple columns, it might be set such that each combination must
be unique, like binary counting.

Let me also state that I agree: allowing null values in a unique index
is ludicrous.  But if it is allowed, I think forcing the combinations to
be single valued makes more sense than allowing any number of them.

-----Original Message-----
From: Stephan Szabo [mailto:sszabo@megazone.bigpanda.com]
Sent: Thursday, January 20, 2005 11:14 AM
To: Dann Corbit
Cc: Tom Lane; Greg Stark; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Unique Index

On Thu, 20 Jan 2005, Dann Corbit wrote:

> It is clear to me that only allowing a single null value will not
> violate the explanation below.

Given two rows in T with one column each
(NULL), (NULL)

Find two rows such that the value of each column in one row is non-null
and equal to the value of the corresponding column in the other row
according to 8.2. If there are no such rows the unique predicate returns
true (ie the constraint is satisfied).

Re: Unique Index

From

Greg Stark

Date:

20 January 2005, 19:54:58

"Dann Corbit" <DCorbit@connx.com> writes:

> Would the constraint not be satisfied if each combination (including
> NULL) were not also forced to be unique?
>
> I maintain that the constraint is still satisfied.
>
> So, it is satisfied if I stuff thousands of NULL values in there.
>
> And it is satisfied if I only allow a single NULL value.

You're misreading it. It's not a question of what you allow, it's a question
of what's in the table. The database *must* allow anything in the table that
would satisfy the constraint.

Of course the constraint is satisfied if there's only one NULL value. But the
constraint is also satisfied if there are more than one. So the database has
to allow you to do either of these since there's no constraint that they
violate.

> Let me also state that I agree: allowing null values in a unique index
> is ludicrous.  But if it is allowed, I think forcing the combinations to
> be single valued makes more sense than allowing any number of them.

In fact allowing NULLs in columns involved in a unique constraint is an
optional feature of the SQL standard. But if it's allowed it is required to be
done the way Postgres does it.

Perhaps you're just thinking of primary keys (which, btw, are *not* allowed to
be nullable according to the spec). NULLs are much more likely in a situation
where it's not a key field, just a bit of incidental data that we know should
be unique.

Consider for example a user table where there's a phone number field. Now not
all users enter a phone number, but when they do we want to make sure it's
unique. Does it make sense to restrict the database to a single user with an
unknown phone number?

Or consider a product database. Some products have ISBNs, namely books, but
not all do. If the product has an ISBN then it really ought to be unique, no
other product should have the same ISBN. But any number of products can be
non-books and not have an ISBN.

--
greg

Re: Unique Index

From

Stephan Szabo

Date:

20 January 2005, 20:01:03

On Thu, 20 Jan 2005, Dann Corbit wrote:

> Would the constraint not be satisfied if each combination (including
> NULL) were not also forced to be unique?

The constraint would be satisfied, however cases that the constraint is
satisfied for would not be allowed.  The case I gave below is one for
which I argue the constraint is satisfied because the search condition is
true.  The definition above would appear to not allow that case and as
such appears to be contrary to the definition of the constraint.

> Let me also state that I agree: allowing null values in a unique index
> is ludicrous.  But if it is allowed, I think forcing the combinations to
> be single valued makes more sense than allowing any number of them.

I think that'd be better termed a DISTINCT index to use SQL terminology.

> On Thu, 20 Jan 2005, Dann Corbit wrote:
>
> > It is clear to me that only allowing a single null value will not
> > violate the explanation below.
>
> Given two rows in T with one column each
> (NULL), (NULL)
>
> Find two rows such that the value of each column in one row is non-null
> and equal to the value of the corresponding column in the other row
> according to 8.2. If there are no such rows the unique predicate returns
> true (ie the constraint is satisfied).
>

Re: Unique Index

From

"Dann Corbit"

Date:

20 January 2005, 20:03:40

Yes.
I was wrong.

Sorry about the noise.

-----Original Message-----
From: Stephan Szabo [mailto:sszabo@megazone.bigpanda.com]
Sent: Thursday, January 20, 2005 12:01 PM
To: Dann Corbit
Cc: Tom Lane; Greg Stark; pgsql-general@postgresql.org
Subject: RE: [GENERAL] Unique Index

On Thu, 20 Jan 2005, Dann Corbit wrote:

> Would the constraint not be satisfied if each combination (including
> NULL) were not also forced to be unique?

The constraint would be satisfied, however cases that the constraint is
satisfied for would not be allowed.  The case I gave below is one for
which I argue the constraint is satisfied because the search condition
is
true.  The definition above would appear to not allow that case and as
such appears to be contrary to the definition of the constraint.

> Let me also state that I agree: allowing null values in a unique index
> is ludicrous.  But if it is allowed, I think forcing the combinations
to
> be single valued makes more sense than allowing any number of them.

I think that'd be better termed a DISTINCT index to use SQL terminology.

> On Thu, 20 Jan 2005, Dann Corbit wrote:
>
> > It is clear to me that only allowing a single null value will not
> > violate the explanation below.
>
> Given two rows in T with one column each
> (NULL), (NULL)
>
> Find two rows such that the value of each column in one row is
non-null
> and equal to the value of the corresponding column in the other row
> according to 8.2. If there are no such rows the unique predicate
returns
> true (ie the constraint is satisfied).
>

Re: Unique Index

From

Tino Wildenhain

Date:

20 January 2005, 21:00:40

Am Donnerstag, den 20.01.2005, 06:09 -0800 schrieb J. Greenlees:
> Tino Wildenhain wrote:
> > Hi,
> >
> > Am Mittwoch, den 19.01.2005, 15:02 -0800 schrieb J. Greenlees:
> >
> >>Roman Neuhauser wrote:
> >>
> >>># alex@meerkatsoft.com / 2005-01-20 01:35:32 +1100:
> >>>
> >>>
> >>>>i have a unique index on a table over multiple columns. If now one of
> >>>>the records has a null value in one of the indexed columns i can insert
> >>>>the same record multiple times.
> >>>>
> >>>>Is this a problem within postgres or expected?
> >>>
> >>>
> >>>    In SQL, NULL means "unknown value". How could you assert that two
> >>>    NULLs are equal?
> >>>
> >>
> >>which doesn't make mathematical sense.
> >>mathwise null is an empty result.
> >>so setting the logic up using the math logic, null values are always equal.
> >
> >
> > What kind of mathematics you are speaking?
> > For example you have "infinity" where infinity is never
> > equal to infinity.
> > Same with null. Which is "unknown" or "undefined"
> > So if x is undefined and y is undefined you cannot
> > assume x=y  - because if you assume this, then
> > they would not be undefined anymore.
> >
> > q.e.d.
> >
> > Regards
> > Tino
> >
> >
> or null as in empty.
> an empty result set is a null set, zero results.
> declare a variable, but never assign a value, it has a default value of
> null from the declaration.
> ( basically any content of memory space allocated that was not actually
> empty is the content, but it's a null value to the app. )

No. Empty result set is just a set without elements. zero-length
list or tuple or whatever your programming language uses :-)

[] != null/undefined/None