Thread: Multicolumn foreign keys need useless unique indices?

Multicolumn foreign keys need useless unique indices?

From
Antti Haapala
Date:
There is a misfeature in 7.2.2 that appears when I have a foreign key that
references two columns of a table. Consider following simplified example:

CREATE TABLE a (   a int PRIMARY KEY,   b int
);

CREATE TABLE b (   aref int,   bref int,   FOREIGN KEY (aref, bref) REFERENCES a(a, b)       MATCH FULL ON DELETE
CASCADEON UPDATE CASCADE
 
);

I get an error

"UNIQUE constraint matching given keys for referenced table "a" not
found."

because I have unique constraint only on the first field (which is still
enough to make the whole combination unique. (b is not even unique))...

So I need to add an useless(?) UNIQUE constraint to "(a, b)" for table "a"
just to allow creation of multicol FOREIGN KEYs for table "b".

And I get NOTICE:  CREATE TABLE / UNIQUE will create implicit index
'a_a_key' for table.

AFAIK, the extra index only slows down my inserts - it basically contains
no usable information... shouldn't the presence of _primary_key_ in
multicol foreign key be enough to decide whether the whole key is unique
or not? And shouldn't it be enough to find out the tuple in table 'a'
corresponding newly inserted tuple in b?

Or should I just write my own triggers for checking the integrity of
"b"/"bref" column pair to avoid needless index creation?

-- 
Antti Haapala






Re: Multicolumn foreign keys need useless unique indices?

From
"Christopher Kings-Lynne"
Date:
> AFAIK, the extra index only slows down my inserts - it basically contains
> no usable information...

Not 100% true.  It will speed up cascade delete and update...

> shouldn't the presence of _primary_key_ in
> multicol foreign key be enough to decide whether the whole key is unique
> or not?

Hmmm - thinking about it, I don't see why postgres would need the entire
thing to be unique...can't think of a reason at the moment.  Stephen?

Chris



Re: Multicolumn foreign keys need useless unique indices?

From
Antti Haapala
Date:
> > AFAIK, the extra index only slows down my inserts - it basically contains
> > no usable information...
>
> Not 100% true.  It will speed up cascade delete and update...

To clarify things:

CREATE TABLE original (  a int PRIMARY KEY,  b int
);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
'original_pkey' for table 'original'
CREATE

CREATE TABLE referencer (   aref int,   bref int,   FOREIGN KEY (aref, bref) REFERENCES original(a, b)       MATCH FULL
ONDELETE CASCADE ON UPDATE CASCADE
 
);
NOTICE:  CREATE TABLE will create implicit trigger(s) for FOREIGN
KEY check(s)
ERROR:  UNIQUE constraint matching given keys for referenced table
"original" not found

CREATE TABLE original (   a int PRIMARY KEY,   b int,   UNIQUE (a,b)
);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
'original_pkey' for table 'original'
NOTICE:  CREATE TABLE / UNIQUE will create implicit index 'original_a_key'
for table 'original'
CREATE

CREATE TABLE referencer (   aref int,   bref int,   FOREIGN KEY (aref, bref) REFERENCES original(a, b)       MATCH FULL
ONDELETE CASCADE ON UPDATE CASCADE
 
);
NOTICE:  CREATE TABLE will create implicit trigger(s) for FOREIGN
KEY check(s)
CREATE

ilmo=# \d original         Table "original"Column |  Type   | Modifiers
--------+---------+-----------a      | integer | not nullb      | integer |
Primary key: a_pkey
Unique keys: a_a_key
Triggers: RI_ConstraintTrigger_41250,         RI_ConstraintTrigger_41252

ilmo=# \d referencer         Table "referencer"Column |  Type   | Modifiers
--------+---------+-----------aref   | integer |bref   | integer |
Triggers: RI_ConstraintTrigger_41248

Actually nothing changes. The unique constraint doesn't add anything new -
it allows NULLs in column b and requires that combination (a, b) is
unique... and it definitely is because column 'a' is unique (primary key).
It just creates a multicol index and adds an useless extra constraint
check, while almost the same data is available in index "original_a_pkey".

-- 
Antti Haapala



Re: Multicolumn foreign keys need useless unique indices?

From
Rod Taylor
Date:
On Fri, 2002-09-13 at 04:27, Christopher Kings-Lynne wrote:
> > AFAIK, the extra index only slows down my inserts - it basically contains
> > no usable information...
> 
> Not 100% true.  It will speed up cascade delete and update...
> 
> > shouldn't the presence of _primary_key_ in
> > multicol foreign key be enough to decide whether the whole key is unique
> > or not?
> 
> Hmmm - thinking about it, I don't see why postgres would need the entire
> thing to be unique...can't think of a reason at the moment.  Stephen?

If it's not all unique, you cannot be guaranteed there is a single row
with those values in the referenced table.

--  Rod Taylor



Re: Multicolumn foreign keys need useless unique indices?

From
Tom Lane
Date:
Rod Taylor <rbt@rbt.ca> writes:
> On Fri, 2002-09-13 at 04:27, Christopher Kings-Lynne wrote:
>> Hmmm - thinking about it, I don't see why postgres would need the entire
>> thing to be unique...can't think of a reason at the moment.  Stephen?

> If it's not all unique, you cannot be guaranteed there is a single row
> with those values in the referenced table.

Right.  The single-column unique constraint guarantees at most one
match, but it isn't helpful for checking if there's at least one match.
The spec obviously intends that the index supporting the unique
constraint be useful for verifying the existence of a match.

I read this in SQL92:
           a) If the <referenced table and columns> specifies a <reference             column list>, then the set of
columnnames of that <refer-             ence column list> shall be equal to the set of column names             in the
uniquecolumns of a unique constraint of the refer-             enced table.
 

It says "equal to", not "superset of".  So we are behaving per spec.
        regards, tom lane


Re: Multicolumn foreign keys need useless unique indices?

From
Stephan Szabo
Date:
> Rod Taylor <rbt@rbt.ca> writes:
> > On Fri, 2002-09-13 at 04:27, Christopher Kings-Lynne wrote:
> >> Hmmm - thinking about it, I don't see why postgres would need the entire
> >> thing to be unique...can't think of a reason at the moment.  Stephen?
>
> > If it's not all unique, you cannot be guaranteed there is a single row
> > with those values in the referenced table.
>
> Right.  The single-column unique constraint guarantees at most one
> match, but it isn't helpful for checking if there's at least one match.
> The spec obviously intends that the index supporting the unique
> constraint be useful for verifying the existence of a match.
>
> I read this in SQL92:
>
>             a) If the <referenced table and columns> specifies a <reference
>               column list>, then the set of column names of that <refer-
>               ence column list> shall be equal to the set of column names
>               in the unique columns of a unique constraint of the refer-
>               enced table.
>
> It says "equal to", not "superset of".  So we are behaving per spec.

That's what I used when doing it.  It possibly is a stronger than
necessary statement but I assumed at the time they had some reason for
wanting to define it that way.




Re: Multicolumn foreign keys need useless unique indices?

From
Stephan Szabo
Date:
On Fri, 13 Sep 2002, Antti Haapala wrote:

> > > AFAIK, the extra index only slows down my inserts - it basically contains
> > > no usable information...
> >
> > Not 100% true.  It will speed up cascade delete and update...
>
> To clarify things:
>
> CREATE TABLE original (
>    a int PRIMARY KEY,
>    b int
> );
> NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
> 'original_pkey' for table 'original'
> CREATE
>
> CREATE TABLE referencer (
>     aref int,
>     bref int,
>     FOREIGN KEY (aref, bref) REFERENCES original(a, b)
>         MATCH FULL ON DELETE CASCADE ON UPDATE CASCADE
> );
> NOTICE:  CREATE TABLE will create implicit trigger(s) for FOREIGN
> KEY check(s)
> ERROR:  UNIQUE constraint matching given keys for referenced table
> "original" not found

SQL 92 would want you to normalize and remove bref from referencer
since it's redundant.  You're storing a reference to a table and
some of the dependent values to that reference in another table.
That's probably the best workaround, although I assume your real
case is more complicated.




Re: Multicolumn foreign keys need useless unique indices?

From
Hannu Krosing
Date:
On Fri, 2002-09-13 at 16:00, Tom Lane wrote:
> Rod Taylor <rbt@rbt.ca> writes:
> > On Fri, 2002-09-13 at 04:27, Christopher Kings-Lynne wrote:
> >> Hmmm - thinking about it, I don't see why postgres would need the entire
> >> thing to be unique...can't think of a reason at the moment.  Stephen?
> 
> > If it's not all unique, you cannot be guaranteed there is a single row
> > with those values in the referenced table.
> 
> Right.  The single-column unique constraint guarantees at most one
> match, but it isn't helpful for checking if there's at least one match.

Due to postgres's implementation we can't do the 'at least' part using
only index anyway - we must check the actual table.

> The spec obviously intends that the index supporting the unique
> constraint be useful for verifying the existence of a match.

Does the spec say _anything_ about implementing unique contraint using
an unique index ?

> I read this in SQL92:
> 
>             a) If the <referenced table and columns> specifies a <reference
>               column list>, then the set of column names of that <refer-
>               ence column list> shall be equal to the set of column names
>               in the unique columns of a unique constraint of the refer-
>               enced table.
> 
> It says "equal to", not "superset of".  So we are behaving per spec.

But we are doing it in a suboptimal way.

If we have unique index on t.i and we define additional unique
constraint on (t.i, t.j), then we don't need the extra unique index to
be created - the index on t.i is enough to quarantee the uniqueness of
(t.i,t.j) or any set of columns that includes t.i.

---------------
Hannu

PS. IMHO our unique is still broken as shown by the following:

hannu=# create table t(i int unique);
NOTICE:  CREATE TABLE / UNIQUE will create implicit index 't_i_key' for
table 't'
CREATE TABLE
hannu=# insert into t values(1);
INSERT 41555 1
hannu=# insert into t values(2);
INSERT 41556 1
hannu=# update t set i=i-1;
UPDATE 2
hannu=# update t set i=i+1;
ERROR:  Cannot insert a duplicate key into unique index t_i_key
hannu=#                              

DB2 has no problems doing it:

db2 => create table t(i int not null unique)
DB20000I  The SQL command completed successfully.
db2 => insert into t values(1)
DB20000I  The SQL command completed successfully.
db2 => insert into t values(2)
DB20000I  The SQL command completed successfully.
db2 => update t set i=i+1
DB20000I  The SQL command completed successfully.
db2 => update t set i=i-1
DB20000I  The SQL command completed successfully.

neither has Oracle

SQL> create table t(i int not null unique);
Table created.
SQL> insert into t values(1);
1 row created.
SQL> insert into t values(2);
1 row created.
SQL> update t set i=i+1;
2 rows updated.
SQL> update t set i=i-1;
2 rows updated.
SQL> 

----------------
Hannu



Re: Multicolumn foreign keys need useless unique indices?

From
Rod Taylor
Date:
> hannu=# update t set i=i+1;
> ERROR:  Cannot insert a duplicate key into unique index t_i_key

A possibility may be to reverse the sequential scan order for the simple
cases, but anything any more complex and the check should be deferred
till end of statement, rather than checking immediately.


--  Rod Taylor



Re: Multicolumn foreign keys need useless unique indices?

From
Hannu Krosing
Date:
On Fri, 2002-09-13 at 17:42, Rod Taylor wrote:
> 
> > hannu=# update t set i=i+1;
> > ERROR:  Cannot insert a duplicate key into unique index t_i_key
> 
> A possibility may be to reverse the sequential scan order for the simple
> cases, but anything any more complex and the check should be deferred
> till end of statement, rather than checking immediately.

Or we could keep a 'conflict list' that would be dynamically added to
and deleted from during the statement and the statement would be aborted
if 

1) there were any entries in the list at the end of statement

or

2) if the list overflowed at some predefined limit (say 1000 or 100.000
conflicts) during the statement.

in our simple case we would have at most 1 conflict in the list at any
time.

--------------
Hannu


Re: Multicolumn foreign keys need useless unique indices?

From
Tom Lane
Date:
Hannu Krosing <hannu@tm.ee> writes:
> If we have unique index on t.i and we define additional unique
> constraint on (t.i, t.j), then we don't need the extra unique index to
> be created - the index on t.i is enough to quarantee the uniqueness of
> (t.i,t.j) or any set of columns that includes t.i.

You missed the point: we are concerned about existence of a row, not only
uniqueness.
        regards, tom lane


Re: Multicolumn foreign keys need useless unique indices?

From
Hannu Krosing
Date:
On Sat, 2002-09-14 at 20:14, Tom Lane wrote:
> Hannu Krosing <hannu@tm.ee> writes:
> > If we have unique index on t.i and we define additional unique
> > constraint on (t.i, t.j), then we don't need the extra unique index to
> > be created - the index on t.i is enough to quarantee the uniqueness of
> > (t.i,t.j) or any set of columns that includes t.i.
> 
> You missed the point: we are concerned about existence of a row, not only
> uniqueness.

Maybe I'm missing something, but I'll reiterate my two points

1) to check for existance of a referenced tuple for a foreigh key we
have to:

* lookup the row in index

and

* check if the row is live in the relation

so the index will help us equally for both cases, as it will point to N
entries of which only one can be alive at a time and which all have to
be checked.

It will be only marginally more work to check if the only live entry
does match the non-index columns.


And I think that my other point holds as well - there is no need for
extra unique index on (redundant) unique constraint that is put over a
superset of columns covered by _another_ unique constraint. 

There will probably be additional work if we want to drop the original
constraint, but this is a separate issue.

---------------
Hannu




Re: Multicolumn foreign keys need useless unique indices?

From
Tom Lane
Date:
Hannu Krosing <hannu@tm.ee> writes:
> so the index will help us equally for both cases, as it will point to N
> entries of which only one can be alive at a time and which all have to
> be checked.
> It will be only marginally more work to check if the only live entry
> does match the non-index columns.

But the "marginally more work" represents code that does not exist at
all right now, and which there's no really convenient place to add AFAIR.
This seems to me to be going rather out of our way to support a coding
practice that is specifically disallowed by the standard.

Something that no one has bothered to ask, but seems to me relevant,
is exactly why we should consider it important to support foreign keys
of this form?  Aren't we talking about a poor schema design in the first
place, if the referenced column set covers more than just the unique key
of the referenced table?  At the very least this is a violation of
normalization, and so it's inherently inefficient.

> There will probably be additional work if we want to drop the original
> constraint, but this is a separate issue.

It's not separate, because it's more work that we *will* have to do,
to support a feature that is nonstandard and of debatable usefulness.
        regards, tom lane