Thread: Unique indices and nulls

Unique indices and nulls

From
Roland Roberts
Date:
I see in the documentation that a unique index does not place any
constraints on the number of nulls in a table.  In my case, I have a
table with has four columns (plus other data) where that combination
of 4 columns should be unique.

    (catalog, entry, suffix, component)

forms a unique tuple.  In most (but not all) cases, (catalog, entry)
is unique and suffix and component will both be null.  In those cases,
it is common to have an entry with (catalog, entry, null, null), as
well as multiple entries with (catalog, entry, suffix, component).

But there should never be more than one entry with (catalog, entry,
null, null).

Is there any way I can enforce this?  Am I going to have to write a
trigger to check for duplicates?

roland
--
               PGP Key ID: 66 BC 3B CD
Roland B. Roberts, PhD                             RL Enterprises
roland@rlenter.com                     76-15 113th Street, Apt 3B
roland@astrofoto.org                       Forest Hills, NY 11375

Re: Unique indices and nulls

From
Tom Lane
Date:
Roland Roberts <roland@astrofoto.org> writes:
> ... In most (but not all) cases, (catalog, entry)
> is unique and suffix and component will both be null.  In those cases,
> it is common to have an entry with (catalog, entry, null, null), as
> well as multiple entries with (catalog, entry, suffix, component).
> But there should never be more than one entry with (catalog, entry,
> null, null).
> Is there any way I can enforce this?  Am I going to have to write a
> trigger to check for duplicates?

AFAICT an SQL unique constraint will not do this for you.  The spec
defines <unique constraint> in terms of the <unique predicate>,
which is defined as

         2) If there are no two rows in T such that the value of each column
            in one row is non-null and is equal to the value of the cor-
            responding column in the other row according to Subclause 8.2,
            "<comparison predicate>", then the result of the <unique predi-
            cate> is true; otherwise, the result of the <unique predicate>
            is false.

so the constraint cannot be violated by rows that contain any nulls
(in the columns checked by the constraint).

You could possibly do something with a partial UNIQUE index on (catalog,
entry) where the index's WHERE condition selects only rows with suffix
and component both NULL.  But this doesn't scale well if you also want
to forbid other cases that are equal-up-to-nulls; you'd end up with
a large number of partial indexes on different subsets of the columns.

Tell you the truth, my advice is to reconsider the way you're using
NULLs.  C.J. Date thinks NULLs are evil and best avoided ... your
case may be an example of what he's driving at.

            regards, tom lane

Re: Unique indices and nulls

From
Roland Roberts
Date:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

    Tom> Tell you the truth, my advice is to reconsider the way you're
    Tom> using NULLs.  C.J. Date thinks NULLs are evil and best
    Tom> avoided ... your case may be an example of what he's driving
    Tom> at.

I have reconsidered...the NULLs have to stay, but the index is no
longer unique, it is only on the non-null columns, and is only being
used for select purposes.  Instead, I am enforcing the constraint via
a "before insert" trigger.  Except for loading the table, this table
will have virtually no inserts, so the trigger seems a reasonable way
to go.

roland
--
               PGP Key ID: 66 BC 3B CD
Roland B. Roberts, PhD                             RL Enterprises
roland@rlenter.com                     76-15 113th Street, Apt 3B
roland@astrofoto.org                       Forest Hills, NY 11375

Re: Unique indices and nulls

From
"Josh Berkus"
Date:
Roland,

> > ... In most (but not all) cases, (catalog, entry)
> > is unique and suffix and component will both be null.  In those
> cases,
> > it is common to have an entry with (catalog, entry, null, null), as
> > well as multiple entries with (catalog, entry, suffix, component).
> > But there should never be more than one entry with (catalog, entry,
> > null, null).
> > Is there any way I can enforce this?  Am I going to have to write a
> > trigger to check for duplicates?
<snip>
> Tell you the truth, my advice is to reconsider the way you're using
> NULLs.  C.J. Date thinks NULLs are evil and best avoided ... your
> case may be an example of what he's driving at.

What Tom's getting at, here, is that you may wish to consider using
"zero values" (e.g. 0, or empty string, or 'None', or 'N/A', or
'1900-01-01', or similar)
instead of Nulls to hold "blank" fields.   This would allow you to use
a standard UNIQUE constraint instead of a custom trigger, and be better
normal form to boot.

All DBAs are guilty of, to a lesser or greater degree, using NULL to
represent "None", even though NULL actually means "unknown" according
to the SQL spec.  The rest of the SQL spec is designed to support NULLs
as "unknown", so this can lead to problems in application.  Like yours.

-Josh Berkus



Re: Unique indices and nulls

From
Roland Roberts
Date:
>>>>> "Josh" == Josh Berkus <josh@agliodbs.com> writes:

    Josh> All DBAs are guilty of, to a lesser or greater degree, using
    Josh> NULL to represent "None", even though NULL actually means
    Josh> "unknown" according to the SQL spec.  The rest of the SQL
    Josh> spec is designed to support NULLs as "unknown", so this can
    Josh> lead to problems in application.  Like yours.

This is a good point, and since I know the range of values which can
go into those columns, it is easy to pick something to use as
representing "none."

roland
--
               PGP Key ID: 66 BC 3B CD
Roland B. Roberts, PhD                             RL Enterprises
roland@rlenter.com                     76-15 113th Street, Apt 3B
roland@astrofoto.org                       Forest Hills, NY 11375