Thread: [GENERAL][SQL] Getting multiple field unique index to distinguish NULLs.

[GENERAL][SQL] Getting multiple field unique index to distinguish NULLs.

From
Stuart Rison
Date:
Dear All,

Consider the following table:

myscheme_id|level1|level2|level3|level4|function
-----------+------+------+------+------+-------------------------------
         11|     4|     5|     1|     3|Long John Silver
         12|  1242|     3|    44|     5|Metabolism
         13|     1|     2|     3|     4|Transport
          1|     1|     4|     3|      |Energy
          9|     1|     2|     3|     1|Signaling
          3|     1|     2|     3|     2|test1
         18|     1|     2|      |      |test2
         19|     1|     2|      |      |test3
         21|     1|     2|      |      |test4

This is essentially a hierarchical key set-up; each function can be
identified by a key (which can be thought of as the concatenation of
level1, level2, level3 and level4).

But you can add a function at any level (i.e. only level1 must be given)
so:

INSERT INTO myscheme_funcat (level1,function) VALUES (1,'Top level
function');

is OK.

I am trying to get the database to reject INSERTS for keys already
occupied

CREATE UNIQUE INDEX myscheme_idx ON myscheme_funcat
(level1,level2,level3,level4);

The problem is that the indexing considers all NULLs to be distinct thus:

INSERT INTO myscheme_funcat (level1,level2,level3,level4,function) VALUES
(4,5,76,NULL,'OK');
INSERT 1044737 1

but so does
INSERT INTO myscheme_funcat (level1,level2,level3,level4,function) VALUES
(4,5,76,NULL,'Should fail because position 4.5.76 already occupied');
INSERT 1044738 1

Works because 4,5,76,NULL is considered DISTINCT from 4,5,76,NULL (all
NULLs are different).

So, any ideas, workarounds etc.??

cheers,

S.

### Please Note New Details ###
Stuart C. G. Rison
Department of Biochemistry and Molecular Biology
Gower Street, London, WC1E 6BT, United Kingdom
Tel. 0207 504 2303, Fax. 0207 380 7193
e-mail: s.rison@biochem.ucl.ac.uk



Re: [GENERAL][SQL] Getting multiple field unique index to distinguish NULLs.

From
Stuart Rison
Date:
Hi Tom,

On Fri, 10 Sep 1999, Tom Lane wrote:

> > The problem is that the indexing considers all NULLs to be distinct thus:
>
> Don't use NULLs ... declare the columns as NOT NULL, and select some
> special value for the fields you want to consider unused.  Zero maybe,
> or an empty string if the columns are text.
>

Fair enough, and indeed it's a solution I've been using but, for the sake
of argument, what if I couldn't do it?

I though of using a index and functions such as:

CREATE UNIQUE INDEX myscheme_idx ON myscheme_funcat
(level1, null_to_zero(level2), null_to_zero(level3),
null_to_zero(level4));

but saddly, you can't use functions in that way to define indices or at
least not in this way...

I tried

CREATE UNIQUE INDEX myscheme_idx ON myscheme_funcat
(make_key(level1,level2,level3,level4) text_ops);

Where make_key essentially concatenates the levels into a '.' separated
key (e.g. 1 or 1.2.1 or 2.4.1.6).

Now this would work except that there's a problem with pl/pgSQL such that
when you pass several values to a plpgsql function and one of them is
NULL, then all values passed to the function become NULL.

The solution would be something like:

CREATE UNIQUE INDEX myscheme_idx ON myscheme_funcat
(make_key(level1,null_to_zero(level2),null_to_zero(level3),
null_to_zero(level4));

But saddly you can't have "sub-function" in the CREATE INDEX statement.

So:

1) yes, the solution would be to use 0 or somesuch token value for 'NULL'
2) but I wondered if anybody had ventured into the kind of stuff mentioned
above
3) and if anybody have solved the pl/pgSQL 'issue' that turns all values
to NULL?

Regards,

Stuart.


Re: [GENERAL][SQL] Getting multiple field unique index to distinguish NULLs.

From
Beth Strohmayer
Date:
Is this problem with NULLS being treated as distinct values only apparent
in regards to Indices?  If you do a Select Distinct or a Group By using
these fields does it display the duplicate rows with nulls? I know this
doesn't offer any help but I was just curious...

Beth  :-)
    _______________________________________________
   / Beth L Strohmayer / Software Engineer    _____)
  /  ITT Industries, Systems Division        (_____|______________________
/   @ Naval Research Laboratory, Code 5542  |                            \
\   4555 Overlook Ave. SW                   |  Phone: (202) 404-3798      \
  \  Washington, DC  20375                   |    Fax: (202) 404-7942       \
   \_________________________________________|                              /
                                             | strohmayer@itd.nrl.navy.mil /
                                             |____________________________/



At 11:37 AM 09/10/1999 , Stuart Rison wrote:
>Hi Tom,
>
>On Fri, 10 Sep 1999, Tom Lane wrote:
>
> > > The problem is that the indexing considers all NULLs to be distinct thus:
> >
> > Don't use NULLs ... declare the columns as NOT NULL, and select some
> > special value for the fields you want to consider unused.  Zero maybe,
> > or an empty string if the columns are text.
> >
>
>Fair enough, and indeed it's a solution I've been using but, for the sake
>of argument, what if I couldn't do it?
>
>I though of using a index and functions such as:
>
>CREATE UNIQUE INDEX myscheme_idx ON myscheme_funcat
>(level1, null_to_zero(level2), null_to_zero(level3),
>null_to_zero(level4));
>
>but saddly, you can't use functions in that way to define indices or at
>least not in this way...
>
>I tried
>
>CREATE UNIQUE INDEX myscheme_idx ON myscheme_funcat
>(make_key(level1,level2,level3,level4) text_ops);
>
>Where make_key essentially concatenates the levels into a '.' separated
>key (e.g. 1 or 1.2.1 or 2.4.1.6).
>
>Now this would work except that there's a problem with pl/pgSQL such that
>when you pass several values to a plpgsql function and one of them is
>NULL, then all values passed to the function become NULL.
>
>The solution would be something like:
>
>CREATE UNIQUE INDEX myscheme_idx ON myscheme_funcat
>(make_key(level1,null_to_zero(level2),null_to_zero(level3),
>null_to_zero(level4));
>
>But saddly you can't have "sub-function" in the CREATE INDEX statement.
>
>So:
>
>1) yes, the solution would be to use 0 or somesuch token value for 'NULL'
>2) but I wondered if anybody had ventured into the kind of stuff mentioned
>above
>3) and if anybody have solved the pl/pgSQL 'issue' that turns all values
>to NULL?
>
>Regards,
>
>Stuart.
>
>
>************


Re: [GENERAL][SQL] Getting multiple field unique index to distinguish NULLs.

From
Tom Lane
Date:
Stuart Rison <rison@biochemistry.ucl.ac.uk> writes:
> Now this would work except that there's a problem with pl/pgSQL such that
> when you pass several values to a plpgsql function and one of them is
> NULL, then all values passed to the function become NULL.

Yes.  That's not plpgsql's fault; there's a basic design limitation in
the function-call interface (only one isnull flag gets passed).  There
has been some talk on the pghackers list of fixing this, but I do not
know if it will happen any time soon or not.  Fixing that interface
would mean revising *all* the code that implements any SQL function or
operator, so it'd be pretty tedious --- and it'd break any C-coded
user functions, too.  It'll probably get done eventually, maybe even
for 6.6, but don't hold your breath...

            regards, tom lane

Re: [GENERAL] Getting multiple field unique index to distinguish NULLs.

From
Stuart Rison
Date:
On Fri, 10 Sep 1999, Tom Lane wrote:

> Stuart Rison <rison@biochemistry.ucl.ac.uk> writes:
> > Now this would work except that there's a problem with pl/pgSQL such that
> > when you pass several values to a plpgsql function and one of them is
> > NULL, then all values passed to the function become NULL.
>
> Yes.  That's not plpgsql's fault; there's a basic design limitation in
> the function-call interface (only one isnull flag gets passed).  There
> has been some talk on the pghackers list of fixing this, but I do not
> know if it will happen any time soon or not.  Fixing that interface
> would mean revising *all* the code that implements any SQL function or
> operator, so it'd be pretty tedious --- and it'd break any C-coded
> user functions, too.  It'll probably get done eventually, maybe even
> for 6.6, but don't hold your breath...
>
>             regards, tom lane
>

Aha!  Thanks a lot for that Tom, I've been wondering for ages if this NULL
behaviour was just a version things (I've got 6.4.2) or a more profound
'problem'.

Hum... how unfortunate that it's the latter.  Personally, and this is
completely sellfish -I use functions quite a lot- I'd love to see this
fixed asap!  Alas, can't do it myself, I barely managed to get 'hello
world' to work in C (he weeps)...

Whilst I'm on a whimsical trip, my other top wishes remain:

1) outer joins (not 'workarounds')
2) referential integrity (not refint.c)

Still PG is top-tastic all the same.  Thanks to all Hackers out there!

Cheers,

S.

Stuart C. G. Rison
Department of Biochemistry and Molecular Biology
Gower Street, London, WC1E 6BT, United Kingdom
Tel. 0207 504 2303, Fax. 0207 380 7193
e-mail: s.rison@biochem.ucl.ac.uk