Re: Defining a foreign key with a duplicate column is broken - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Defining a foreign key with a duplicate column is broken
Date
Msg-id 29219.1407543213@sss.pgh.pa.us
Whole thread Raw
In response to Re: Defining a foreign key with a duplicate column is broken  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Defining a foreign key with a duplicate column is broken
List pgsql-hackers
I wrote:
> David Rowley <dgrowleyml@gmail.com> writes:
>> The attached seems to fix the problem, but the whole thing makes me wonder
>> if this is even meant to be allowed? I was thinking that this might be a
>> good time to disallow this altogether, since it's already broken and looks
>> like it has been for about 11 years

> We've gone out of our way in the past to allow duplicate index columns
> (eg commit cfc5008a51f4), so I'm not sure why we'd not allow such indexes
> to be used as foreign key references.

I poked at this some more and changed my mind.  Per SQL standard, the
referent of a foreign key has to match some unique or pkey constraint,
and *we do not allow duplicate columns in constraint syntax*:

regression=# create temp table pp (f1 int, unique(f1,f1));
ERROR:  column "f1" appears twice in unique constraint
LINE 1: create temp table pp (f1 int, unique(f1,f1));                                     ^

which restriction I believe is also per spec.  It's true that using
CREATE UNIQUE INDEX syntax, you can make an index with duplicate
columns, but there seems no a-priori reason why we have to allow such
an index to be the referent of a foreign key.  The main reason *not* to
allow it is that such an index might conceivably have different opclasses
for the index columns containing the same attnum, and in that case it
becomes ambiguous which opclass (ie, which definition of equality)
applies to each column of the foreign key.  We could make some arbitrary
definition of which one to select but I think that would be stretching
generality well past the point of usefulness.  Worse, if the different
opclasses do have different definitions of equality and we pick the wrong
one, it's not real clear that it mightn't be possible for a referencing
row to be considered equal to multiple rows in the referenced table (that
are distinct according to the rules used by the index).

So I'm thinking you're right: we should rewrite this code so that only
indexes having all columns distinct can match, thereby ensuring that there
is only one possible interpretation of the equality semantics for the
foreign key.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: B-Tree support function number 3 (strxfrm() optimization)
Next
From: Stephen Frost
Date:
Subject: Re: jsonb format is pessimal for toast compression