Thread: Foreign keys for non-default datatypes

Foreign keys for non-default datatypes

From
Tom Lane
Date:
I looked into the problem reported here:
http://archives.postgresql.org/pgsql-admin/2006-02/msg00261.php
To wit, a pg_restore of a foreign key constraint involving user-defined
types produces

pg_restore: WARNING:  foreign key constraint "luuid_fkey" will require costly sequential scans
DETAIL:  Key columns "luuid" and "luuid" are of different types: public.uniqueidentifier and public.uniqueidentifier.

A small variation of the example (I tried it with the contrib isbn type
instead, just to see if it was uniqueidentifier's fault) produces
complete failure:

isbn=# ALTER TABLE ONLY beta ADD CONSTRAINT beta_luuid_fkey FOREIGN KEY (luuid) REFERENCES alpha(luuid);
ERROR:  foreign key constraint "beta_luuid_fkey" cannot be implemented
DETAIL:  Key columns "luuid" and "luuid" are of incompatible types: public.isbn and public.isbn.

The problem is that pg_dump likes to set a restrictive search path:
the above follows
isbn# SET search_path = delta, pg_catalog;
while the data type and its operators all are defined in the public
schema.  So when ATAddForeignKeyConstraint checks to see if there's a
suitable "=" operator, it doesn't find the intended operator.  In the
isbn case it doesn't find anything at all; in the uniqueidentifier case
there's an implicit cast to text and so the texteq operator is found,
but it's not a member of the index's opclass and thus the warning
appears.

Even if ATAddForeignKeyConstraint weren't trying to be helpful by
checking the operator, we'd be facing the exact same risks at runtime
--- the RI triggers blithely assume that "foo = bar" will do the right
thing.

This has been a hazard in the RI code since day one, of course, but
I think it's time to face up to it and do something about it.  The
RI code ought not be assuming that "=" will find an appropriate
operator --- it should be doing something based on semantics, not a
pure name search, and definitely not a search-path-dependent search.

This ties into Stephan's nearby concerns about whether unique indexes
using nondefault opclasses make sense as infrastructure for foreign
key checks.  The answer of course is that they make sense if and only
if the "=" operator used for the RI check is a member of the index
opclass.

Any thoughts about details?  My feeling is that we should tie RI
semantics to btree opclasses, same as we have done for ORDER BY
and some other SQL constructs, but I don't have a concrete proposal
right offhand.  The btree idea may not cover cross-type FKs anyway.
        regards, tom lane


Re: Foreign keys for non-default datatypes

From
Martijn van Oosterhout
Date:
On Thu, Feb 23, 2006 at 01:10:07PM -0500, Tom Lane wrote:
> I looked into the problem reported here:
> http://archives.postgresql.org/pgsql-admin/2006-02/msg00261.php

<snip>

> This has been a hazard in the RI code since day one, of course, but
> I think it's time to face up to it and do something about it.  The
> RI code ought not be assuming that "=" will find an appropriate
> operator --- it should be doing something based on semantics, not a
> pure name search, and definitely not a search-path-dependent search.

<snip>

> Any thoughts about details?  My feeling is that we should tie RI
> semantics to btree opclasses, same as we have done for ORDER BY
> and some other SQL constructs, but I don't have a concrete proposal
> right offhand.  The btree idea may not cover cross-type FKs anyway.

Well, one solution is to simply use the '=' operator of the default
btree operator class. In all the cases we're talking about here they
have a btree operator class so this isn't an issue and (I think) gives
the behaviour people expect.

However, really they require less because we only care about equality,
not order, so a btree operator class is overkill. The case I'm thinking
of is "complex" which has equality but no order. Although people are
unlikely to do foreign keys on them, there are probably other types
where people might consider it.

Your point about cross-type foreign keys is good, but I think it could
be dealt with by explanding our use of cross-type operator classes,
like we've done for int2/4/8. For example, I was surprised the other
day by the fact that we didn't have cross-type operator classes for
text/varchar/char. If you want to do cross-type foreign keys, would it
be unreasonable to require a cross-type operator class to match?

This could all tie in with my collate stuff by requiring people to
declare the semantics of the type (equality, order, hash, etc) and
building the operator classes and such automatically from that. Some
types really only have equality but no order so you could have foreign
keys but couldn't create a btree index. But you could create a hash
index if a hash function was defined. The COLLATE clause gives people a
simple way to supplying the semantics they want from a given type,
which may deal with Stephan's concerns.

The cross-type stuff for collations is an issue, although I was
thinking about tackling it with cross-type equality, order and hash
functions, which pretty much amounts to what we need for cross-type
operator classes now.

But I think people are getting tired of me going on about collate and I
should just go implement it :)

Hope this helps,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Re: Foreign keys for non-default datatypes

From
Stephan Szabo
Date:
On Thu, 23 Feb 2006, Tom Lane wrote:

> I looked into the problem reported here:
> http://archives.postgresql.org/pgsql-admin/2006-02/msg00261.php
> To wit, a pg_restore of a foreign key constraint involving user-defined
> types produces
>
> pg_restore: WARNING:  foreign key constraint "luuid_fkey" will require costly sequential scans
> DETAIL:  Key columns "luuid" and "luuid" are of different types: public.uniqueidentifier and
public.uniqueidentifier.
>
> A small variation of the example (I tried it with the contrib isbn type
> instead, just to see if it was uniqueidentifier's fault) produces
> complete failure:
>
> isbn=# ALTER TABLE ONLY beta ADD CONSTRAINT beta_luuid_fkey FOREIGN KEY (luuid) REFERENCES alpha(luuid);
> ERROR:  foreign key constraint "beta_luuid_fkey" cannot be implemented
> DETAIL:  Key columns "luuid" and "luuid" are of incompatible types: public.isbn and public.isbn.
>
> The problem is that pg_dump likes to set a restrictive search path:
> the above follows
> isbn# SET search_path = delta, pg_catalog;
> while the data type and its operators all are defined in the public
> schema.  So when ATAddForeignKeyConstraint checks to see if there's a
> suitable "=" operator, it doesn't find the intended operator.  In the
> isbn case it doesn't find anything at all; in the uniqueidentifier case
> there's an implicit cast to text and so the texteq operator is found,
> but it's not a member of the index's opclass and thus the warning
> appears.
>
> Even if ATAddForeignKeyConstraint weren't trying to be helpful by
> checking the operator, we'd be facing the exact same risks at runtime
> --- the RI triggers blithely assume that "foo = bar" will do the right
> thing.
>
> This has been a hazard in the RI code since day one, of course, but
> I think it's time to face up to it and do something about it.  The
> RI code ought not be assuming that "=" will find an appropriate
> operator --- it should be doing something based on semantics, not a
> pure name search, and definitely not a search-path-dependent search.
>
> This ties into Stephan's nearby concerns about whether unique indexes
> using nondefault opclasses make sense as infrastructure for foreign
> key checks.  The answer of course is that they make sense if and only
> if the "=" operator used for the RI check is a member of the index
> opclass.

Right, when I'd said "or we allow you to specify a different equality
operator for that case which matches the one in the opclass" I sort of was
thinking that direction, although I was considering either matching the
index one or allowing you to specify the operator if we went that way.
Those still bug me a little (especially matching the index one) because
writing out the constraint's check by hand by looking at the spec would
give different results.

> Any thoughts about details?  My feeling is that we should tie RI
> semantics to btree opclasses, same as we have done for ORDER BY
> and some other SQL constructs, but I don't have a concrete proposal
> right offhand.  The btree idea may not cover cross-type FKs anyway.

ISTM that the btree opclass is too restrictive right now since I'm
guessing we'd want to allow say int4 <-> numeric which I don't think is in
either btree opclass, but I don't know if they're not there because it
wasn't worth putting in or if there's a more fundamental reason.


Re: Foreign keys for non-default datatypes

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> Any thoughts about details?  My feeling is that we should tie RI
>> semantics to btree opclasses, same as we have done for ORDER BY
>> and some other SQL constructs, but I don't have a concrete proposal
>> right offhand.  The btree idea may not cover cross-type FKs anyway.

> This means getting rid of SPI usage, right?

No, there's no need for that.  It means that the RI stuff would have to
take whatever steps we agree on to determine the exact comparison
operator to use, and then be sure to emit SQL that will select exactly
that operator --- this involves using the OPERATOR(foo.=) syntax to
remove schema-ambiguity and possibly adding explicit type coercions of
the operands.  This'll make the RI queries noticeably uglier, but
they're not meant to be read by humans anyway.  I think it wouldn't be
any slower, because OPERATOR() syntax will suppress a search-path
search that the parser would otherwise make for the operator --- but
in any case, since the plan result is cached, a few microseconds here or
there won't matter.

The hard part of this is just deciding what an RI constraint really
means semantically, in the sense of what equality operator are we going
to use to define the constraint.
        regards, tom lane


Re: Foreign keys for non-default datatypes

From
Alvaro Herrera
Date:
Tom Lane wrote:

> Any thoughts about details?  My feeling is that we should tie RI
> semantics to btree opclasses, same as we have done for ORDER BY
> and some other SQL constructs, but I don't have a concrete proposal
> right offhand.  The btree idea may not cover cross-type FKs anyway.

This means getting rid of SPI usage, right?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Foreign keys for non-default datatypes

From
Christopher Kings-Lynne
Date:
> No, there's no need for that.  It means that the RI stuff would have to
> take whatever steps we agree on to determine the exact comparison
> operator to use, and then be sure to emit SQL that will select exactly
> that operator --- this involves using the OPERATOR(foo.=) syntax to
> remove schema-ambiguity and possibly adding explicit type coercions of
> the operands.  This'll make the RI queries noticeably uglier, but
> they're not meant to be read by humans anyway.  I think it wouldn't be
> any slower, because OPERATOR() syntax will suppress a search-path
> search that the parser would otherwise make for the operator --- but
> in any case, since the plan result is cached, a few microseconds here or
> there won't matter.

Incidentally, shouldn't the existing RI queries (eg. SELECT ... FOR 
SHARE) explicity specify operator(pg_catalog.=)?  Or are they safe from 
that for some other reason?

Chris



Re: Foreign keys for non-default datatypes

From
Tom Lane
Date:
Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
> Incidentally, shouldn't the existing RI queries (eg. SELECT ... FOR 
> SHARE) explicity specify operator(pg_catalog.=)?

Then they'd be guaranteed to fail for datatypes/operators created in
other schemas, rather than only at risk of failing.  Don't see that
as an improvement really.  I think we should fix it properly for 8.2
but not institute any half-measures beforehand.
        regards, tom lane


Re: Foreign keys for non-default datatypes

From
Tom Lane
Date:
[ returning to a week-old thread... ]

Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> On Thu, 23 Feb 2006, Tom Lane wrote:
>> Any thoughts about details?  My feeling is that we should tie RI
>> semantics to btree opclasses, same as we have done for ORDER BY
>> and some other SQL constructs, but I don't have a concrete proposal
>> right offhand.  The btree idea may not cover cross-type FKs anyway.

> ISTM that the btree opclass is too restrictive right now since I'm
> guessing we'd want to allow say int4 <-> numeric which I don't think is in
> either btree opclass, but I don't know if they're not there because it
> wasn't worth putting in or if there's a more fundamental reason.

I thought a bit more about this and realized that there's really a
fairly natural way to associate an FK constraint with a btree index
opclass.  To wit, we already identify a unique index that the FK
constraint depends on to enforce uniqueness of the PK column --- and in
the current system, only btree indexes can be unique.  So we can just
use the opclass(es) of that index.  (If we ever add uniqueness support
to GiST, it would not be unreasonable to expect that the opclasses that
support uniqueness identify exactly which operator they think defines
equality, so we could still make it work for that.)

To handle the cross-type cases, I propose that we make two checks:

1. If the index opclass contains an exact operator for the case
"PKtype = FKtype", use that operator.

2. Otherwise, if there's an implicit coercion from the FK column
type to the PK column type, apply that coercion and use the opclass's
native equality operator.

If neither of these work, I think it'd be OK to fail (ie, reject
creation of the FK constraint).  This would have the interesting
property that the current warning condition "FK constraint will require
costly sequential scans" would become an error case.  I don't really
have a problem with that --- if there are any practically-useful cases
where people still get that warning, it means there are gaps we ought to
fill in the btree cross-type indexing support, not that we ought to go
out of our way to continue supporting a very inefficient mode of FK
operation.  In any case, the current code is willing to try to enforce
FKs that use an "equality" operator that we have no hard evidence works
like equality at all, and that surely isn't a good idea.

Comments?
        regards, tom lane


Re: Foreign keys for non-default datatypes

From
elein
Date:
On Thu, Mar 02, 2006 at 08:05:59PM -0500, Tom Lane wrote:
> [ returning to a week-old thread... ]
> 
> Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> > On Thu, 23 Feb 2006, Tom Lane wrote:
> >> Any thoughts about details?  My feeling is that we should tie RI
> >> semantics to btree opclasses, same as we have done for ORDER BY
> >> and some other SQL constructs, but I don't have a concrete proposal
> >> right offhand.  The btree idea may not cover cross-type FKs anyway.
> 
> > ISTM that the btree opclass is too restrictive right now since I'm
> > guessing we'd want to allow say int4 <-> numeric which I don't think is in
> > either btree opclass, but I don't know if they're not there because it
> > wasn't worth putting in or if there's a more fundamental reason.
> 
> I thought a bit more about this and realized that there's really a
> fairly natural way to associate an FK constraint with a btree index
> opclass.  To wit, we already identify a unique index that the FK
> constraint depends on to enforce uniqueness of the PK column --- and in
> the current system, only btree indexes can be unique.  So we can just
> use the opclass(es) of that index.  (If we ever add uniqueness support
> to GiST, it would not be unreasonable to expect that the opclasses that
> support uniqueness identify exactly which operator they think defines
> equality, so we could still make it work for that.)
> 
> To handle the cross-type cases, I propose that we make two checks:
> 
> 1. If the index opclass contains an exact operator for the case
> "PKtype = FKtype", use that operator.
> 
> 2. Otherwise, if there's an implicit coercion from the FK column
> type to the PK column type, apply that coercion and use the opclass's
> native equality operator.
> 
> If neither of these work, I think it'd be OK to fail (ie, reject
> creation of the FK constraint).  This would have the interesting
> property that the current warning condition "FK constraint will require
> costly sequential scans" would become an error case.  I don't really
> have a problem with that --- if there are any practically-useful cases
> where people still get that warning, it means there are gaps we ought to
> fill in the btree cross-type indexing support, not that we ought to go
> out of our way to continue supporting a very inefficient mode of FK
> operation.  In any case, the current code is willing to try to enforce
> FKs that use an "equality" operator that we have no hard evidence works
> like equality at all, and that surely isn't a good idea.
> 
> Comments?
> 
>             regards, tom lane


Somewhat related to this issue is that you cannot declare a primary key
on creation of a table on a column which should use a separate operator
class.

The case in point is a subtype (domain) with a BTREE operator class.  
I can of course create a separate unique index, however, if I use the 
PRIMARY KEY syntax the op class of the data type is not recognized.

I'm still thinking about the proper way this should be handled.  
If FKs will be able to recognize the opclass of the datatype, then the 
primary key should also using the same code.  But if you can only create 
PK constraints on default BTREE opclasses then you can only create FKs on 
default BTREE opclasses.  What I'm saying is that the opclass needs to be 
an option to PRIMARY KEY and FOREIGN KEY--whether implicitly in the code 
or explicitly in the grammar.  But as I said I'm thinking about it.  Perhaps
there are other alternatives.


--elein
elein@varlena.com



Re: Foreign keys for non-default datatypes

From
Tom Lane
Date:
elein <elein@varlena.com> writes:
> ... What I'm saying is that the opclass needs to be 
> an option to PRIMARY KEY and FOREIGN KEY--

PRIMARY KEY and UNIQUE, you mean.

This was brought up before, but I remain less than excited about it.
You can get essentially the same functionality by doing a CREATE UNIQUE
INDEX command, so allowing it as part of the PK/UNIQUE syntax is little
more than syntactic sugar.  I'm concerned that wedging opclass names
into that syntax could come back to haunt us some day --- eg, if SQL2009
decides to put their own kind of option into the same syntactic spot.

> The case in point is a subtype (domain) with a BTREE operator class.  
> I can of course create a separate unique index, however, if I use the 
> PRIMARY KEY syntax the op class of the data type is not recognized.

Hm, does CREATE INDEX work without explicitly specifying the opclass?
I suspect your complaint really stems from overeager getBaseType() calls
in the index definition code, which is maybe fixable without having to
get into syntactic extensions.
        regards, tom lane


Re: Foreign keys for non-default datatypes

From
Stephan Szabo
Date:
On Thu, 2 Mar 2006, Tom Lane wrote:

> [ returning to a week-old thread... ]
>
> Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> > On Thu, 23 Feb 2006, Tom Lane wrote:
> >> Any thoughts about details?  My feeling is that we should tie RI
> >> semantics to btree opclasses, same as we have done for ORDER BY
> >> and some other SQL constructs, but I don't have a concrete proposal
> >> right offhand.  The btree idea may not cover cross-type FKs anyway.
>
> > ISTM that the btree opclass is too restrictive right now since I'm
> > guessing we'd want to allow say int4 <-> numeric which I don't think is in
> > either btree opclass, but I don't know if they're not there because it
> > wasn't worth putting in or if there's a more fundamental reason.
>
> I thought a bit more about this and realized that there's really a
> fairly natural way to associate an FK constraint with a btree index
> opclass.  To wit, we already identify a unique index that the FK
> constraint depends on to enforce uniqueness of the PK column --- and in
> the current system, only btree indexes can be unique.  So we can just
> use the opclass(es) of that index.  (If we ever add uniqueness support
> to GiST, it would not be unreasonable to expect that the opclasses that
> support uniqueness identify exactly which operator they think defines
> equality, so we could still make it work for that.)
>
> To handle the cross-type cases, I propose that we make two checks:
>
> 1. If the index opclass contains an exact operator for the case
> "PKtype = FKtype", use that operator.

Is this rule to read explicitly naming '=' or just the item in that
position in the opclass?

> 2. Otherwise, if there's an implicit coercion from the FK column
> type to the PK column type, apply that coercion and use the opclass's
> native equality operator.
>
> If neither of these work, I think it'd be OK to fail (ie, reject
> creation of the FK constraint).  This would have the interesting
> property that the current warning condition "FK constraint will require
> costly sequential scans" would become an error case.  I don't really
> have a problem with that --- if there are any practically-useful cases
> where people still get that warning, it means there are gaps we ought to
> fill in the btree cross-type indexing support, not that we ought to go
> out of our way to continue supporting a very inefficient mode of FK
> operation.  In any case, the current code is willing to try to enforce
> FKs that use an "equality" operator that we have no hard evidence works
> like equality at all, and that surely isn't a good idea.

I think it's an acceptable idea to fail if we're going to extend the
cross-type indexing support, but AFAICS we have to at the very least allow
all of the "standard" numeric types in all combinations to work to meet
the spec, and I don't think the above rules and current opclasses will
give that to us (and I don't honestly understand some of the bits of this
to know if there's a problem with extending the opclasses to allow that).


Re: Foreign keys for non-default datatypes

From
Tom Lane
Date:
Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> On Thu, 2 Mar 2006, Tom Lane wrote:
>> 1. If the index opclass contains an exact operator for the case
>> "PKtype = FKtype", use that operator.

> Is this rule to read explicitly naming '=' or just the item in that
> position in the opclass?

The operator occupying the equality position in the opclass.

> I think it's an acceptable idea to fail if we're going to extend the
> cross-type indexing support, but AFAICS we have to at the very least allow
> all of the "standard" numeric types in all combinations to work to meet
> the spec, and I don't think the above rules and current opclasses will
> give that to us (and I don't honestly understand some of the bits of this
> to know if there's a problem with extending the opclasses to allow that).

The cases that are likely to be problematic are things like a FLOAT8
column referencing a NUMERIC primary key.  However, that sort of
mishmash is fraught with all kinds of risks anyway (think roundoff
error) so the fact that the spec nominally allows it doesn't tell me
that we ought to encourage it.
        regards, tom lane


Re: Foreign keys for non-default datatypes

From
Stephan Szabo
Date:
On Fri, 3 Mar 2006, Tom Lane wrote:

> Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> > On Thu, 2 Mar 2006, Tom Lane wrote:
> >> 1. If the index opclass contains an exact operator for the case
> >> "PKtype = FKtype", use that operator.
>
> > Is this rule to read explicitly naming '=' or just the item in that
> > position in the opclass?
>
> The operator occupying the equality position in the opclass.

Okay.

> > I think it's an acceptable idea to fail if we're going to extend the
> > cross-type indexing support, but AFAICS we have to at the very least allow
> > all of the "standard" numeric types in all combinations to work to meet
> > the spec, and I don't think the above rules and current opclasses will
> > give that to us (and I don't honestly understand some of the bits of this
> > to know if there's a problem with extending the opclasses to allow that).
>
> The cases that are likely to be problematic are things like a FLOAT8
> column referencing a NUMERIC primary key.  However, that sort of
> mishmash is fraught with all kinds of risks anyway (think roundoff
> error) so the fact that the spec nominally allows it doesn't tell me
> that we ought to encourage it.

There's a bit of difference between not encouraging it and disallowing it
entirely, but I'm willing to buy that argument.  I do think that numeric
reference int needs to be allowed though, and I thought that's also
currently not there (although int reference numeric should work I think).



Re: Foreign keys for non-default datatypes

From
Tom Lane
Date:
Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> There's a bit of difference between not encouraging it and disallowing it
> entirely, but I'm willing to buy that argument.  I do think that numeric
> reference int needs to be allowed though, and I thought that's also
> currently not there (although int reference numeric should work I think).

Um, not sure which way you mean there.  The case that would work in the
proposal as I gave it is where the referencing column's type is
implicitly promotable to the referenced column's type.  So for example
an FK int column could reference a PK numeric column but (without more
btre support) not an FK numeric column referencing a PK int.  Is that
what you meant?

If you try numeric-references-int today, you get the "costly sequential
scan" warning, because the selected "=" operator is numeric_eq and
that's not compatible with the PK's index.  Basically, if any implicit
promotion happens on the PK side, you're going to get the warning
because you're no longer using an "=" operator that matches the index.
We have not seen many complaints about getting that warning since 8.0,
so I think that in practice people aren't using these cases and it'd be
OK to make them a hard error instead.  I would also argue that if
implicit promotion does happen on the PK side, it's very questionable
what semantics the FK constraint has anyway --- you can no longer be
sure that the operator you are using has a notion of equality that's
compatible with the PK index's notion.
        regards, tom lane


Re: Foreign keys for non-default datatypes

From
Stephan Szabo
Date:
On Fri, 3 Mar 2006, Tom Lane wrote:

> Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> > There's a bit of difference between not encouraging it and disallowing it
> > entirely, but I'm willing to buy that argument.  I do think that numeric
> > reference int needs to be allowed though, and I thought that's also
> > currently not there (although int reference numeric should work I think).
>
> Um, not sure which way you mean there.  The case that would work in the
> proposal as I gave it is where the referencing column's type is
> implicitly promotable to the referenced column's type.  So for example
> an FK int column could reference a PK numeric column but (without more
> btre support) not an FK numeric column referencing a PK int.  Is that
> what you meant?

Yes.

> If you try numeric-references-int today, you get the "costly sequential
> scan" warning, because the selected "=" operator is numeric_eq and
> that's not compatible with the PK's index.  Basically, if any implicit
> promotion happens on the PK side, you're going to get the warning
> because you're no longer using an "=" operator that matches the index.
> We have not seen many complaints about getting that warning since 8.0,
> so I think that in practice people aren't using these cases and it'd be
> OK to make them a hard error instead.  I would also argue that if
> implicit promotion does happen on the PK side, it's very questionable
> what semantics the FK constraint has anyway --- you can no longer be
> sure that the operator you are using has a notion of equality that's
> compatible with the PK index's notion.

I'd argue that this case makes sense in a purely theoretical sense,
numeric(8,0) references int is a sensible operation (when you don't take
into account specific PostgreSQL limitations) and it's required by spec,
but I agree that promotion on the PK side is problematic because of the
issues about equality so we may not be able to do better. I'm just worried
that we're moving further from compliance with the spec.



Re: Foreign keys for non-default datatypes

From
Tom Lane
Date:
Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> I'd argue that this case makes sense in a purely theoretical sense,
> numeric(8,0) references int is a sensible operation (when you don't take
> into account specific PostgreSQL limitations) and it's required by spec,
> but I agree that promotion on the PK side is problematic because of the
> issues about equality so we may not be able to do better. I'm just worried
> that we're moving further from compliance with the spec.

Well, if anyone complains about it we can add an int-eq-numeric operator
to the btree opclass for int, but the case seems pretty pointless to me.
Why are you using a numeric if the FK constraint is going to keep you
from storing any values that don't fit in int?  (Cases where this
argument doesn't hold will fit under the its-ok-to-promote-the-FK-side
rule.)

The reason I'm hesitant to add a bunch more cross-type operators is
mainly that we have too darn many operators named "=" already.  I've
seen in recent profiling tests that it's taking the parser a noticeable
amount of time to decide which one is meant.  So I don't want to add a
lot more without a fairly credible use-case, and right now this doesn't
look like one to me.

BTW, in researching this I noted that SQL92 requires the referencing
column to be the *same* type as the referenced column (11.8 <referential
constraint definition> syntax rule 7).  I see that SQL99 relaxes this to
require only comparable types, but I'm not really going to feel too bad
if we restrict it to being just "sensible" combinations of comparable
types.
        regards, tom lane


Re: Foreign keys for non-default datatypes

From
Stephan Szabo
Date:
On Fri, 3 Mar 2006, Tom Lane wrote:

> The reason I'm hesitant to add a bunch more cross-type operators is
> mainly that we have too darn many operators named "=" already.  I've
> seen in recent profiling tests that it's taking the parser a noticeable
> amount of time to decide which one is meant.  So I don't want to add a
> lot more without a fairly credible use-case, and right now this doesn't
> look like one to me.

Honestly, as long as we *could* reasonably add one, I don't have a
complaint, because we aren't taking a step that involves a huge amount of
work if we get a complaint or separately decide to be more compliant on
this.


Re: Foreign keys for non-default datatypes

From
Andrew Dunstan
Date:
Stephan Szabo wrote:

>On Fri, 3 Mar 2006, Tom Lane wrote:
>
>  
>
>>The reason I'm hesitant to add a bunch more cross-type operators is
>>mainly that we have too darn many operators named "=" already.  I've
>>seen in recent profiling tests that it's taking the parser a noticeable
>>amount of time to decide which one is meant. 
>>

Speaking of parsers, did anyone else notice that gcc in its latest 
release has ripped out the bison based parser for C and Objective-C in 
favor of a hand cut RD parser?

cheers

andrew


Re: Foreign keys for non-default datatypes

From
Alvaro Herrera
Date:
Andrew Dunstan wrote:
> Stephan Szabo wrote:
> 
> >On Fri, 3 Mar 2006, Tom Lane wrote:
> >
> >>The reason I'm hesitant to add a bunch more cross-type operators is
> >>mainly that we have too darn many operators named "=" already.  I've
> >>seen in recent profiling tests that it's taking the parser a noticeable
> >>amount of time to decide which one is meant. 
> 
> Speaking of parsers, did anyone else notice that gcc in its latest 
> release has ripped out the bison based parser for C and Objective-C in 
> favor of a hand cut RD parser?

Yeah, I did.  I wonder what sort of effort they went to write the new
parser.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Foreign keys for non-default datatypes

From
Alvaro Herrera
Date:
Alvaro Herrera wrote:
> Andrew Dunstan wrote:

> > Speaking of parsers, did anyone else notice that gcc in its latest 
> > release has ripped out the bison based parser for C and Objective-C in 
> > favor of a hand cut RD parser?
> 
> Yeah, I did.  I wonder what sort of effort they went to write the new
> parser.

This is the new parser
http://gcc.gnu.org/viewcvs/tags/gcc_4_1_0_release/gcc/c-parser.c?view=markup&rev=111560

It's 6000 lines long.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: Foreign keys for non-default datatypes

From
elein
Date:
On Thu, Mar 02, 2006 at 08:41:20PM -0500, Tom Lane wrote:
> elein <elein@varlena.com> writes:
> > ... What I'm saying is that the opclass needs to be 
> > an option to PRIMARY KEY and FOREIGN KEY--
> 
> PRIMARY KEY and UNIQUE, you mean.
> 
> This was brought up before, but I remain less than excited about it.
> You can get essentially the same functionality by doing a CREATE UNIQUE
> INDEX command, so allowing it as part of the PK/UNIQUE syntax is little
> more than syntactic sugar.  I'm concerned that wedging opclass names
> into that syntax could come back to haunt us some day --- eg, if SQL2009
> decides to put their own kind of option into the same syntactic spot.
> 
> > The case in point is a subtype (domain) with a BTREE operator class.  
> > I can of course create a separate unique index, however, if I use the 
> > PRIMARY KEY syntax the op class of the data type is not recognized.
> 
> Hm, does CREATE INDEX work without explicitly specifying the opclass?
> I suspect your complaint really stems from overeager getBaseType() calls
> in the index definition code, which is maybe fixable without having to
> get into syntactic extensions.

I am also leary of syntactic extensions.  I've found at least one getbasetype
to be misplaced rather than over eager.  But I'm looking at these issues
as I have time available.

> 
>             regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
> 
>                http://archives.postgresql.org
> 


Re: Foreign keys for non-default datatypes

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Alvaro Herrera wrote:
>> Andrew Dunstan wrote:
>>> Speaking of parsers, did anyone else notice that gcc in its latest 
>>> release has ripped out the bison based parser for C and Objective-C in 
>>> favor of a hand cut RD parser?
>> 
>> Yeah, I did.  I wonder what sort of effort they went to write the new
>> parser.

> This is the new parser
> http://gcc.gnu.org/viewcvs/tags/gcc_4_1_0_release/gcc/c-parser.c?view=markup&rev=111560

> It's 6000 lines long.

Given that we whack the grammar around on a pretty regular basis,
I can't imagine that it'd be a smart idea to go to a handmade parser.
gcc is dealing with a very stable language definition so the tradeoffs
for them are a lot different.

Be nice if bison were a tad faster though :-(
        regards, tom lane


Re: Foreign keys for non-default datatypes

From
Tom Lane
Date:
BTW, I had another thought about this: if we go this way, then the plans
associated with RI check queries would essentially always be trivial
index lookups (for everything except RI_Initial_Check).  It'd be within
the realm of feasibility to throw away the current cached-plan RI
infrastructure and simply do direct indexscans, in the style that we
currently use for most system-catalog accesses.  I have never done any
performance testing or profiling of routine foreign-key check operations
but I should think that this would be really significantly faster ---
and it'd let us get rid of some ugly warts that we've had to plaster
onto SPI and the executor to support RI semantics, such as the
"crosscheck snapshot" cruft.

If we did this then RI checks would no longer be subvertible by rules or
user triggers.  Although I've been heard to argue that that's a feature,
I think the majority of people feel it's a bug, and wouldn't be sorry to
see it go.

Comments?
        regards, tom lane


Re: Foreign keys for non-default datatypes

From
Stephan Szabo
Date:
On Fri, 3 Mar 2006, Tom Lane wrote:

> BTW, I had another thought about this: if we go this way, then the plans
> associated with RI check queries would essentially always be trivial
> index lookups (for everything except RI_Initial_Check).  It'd be within
> the realm of feasibility to throw away the current cached-plan RI
> infrastructure and simply do direct indexscans, in the style that we
> currently use for most system-catalog accesses.

Would we have to do anything odd if we want to be testing only some of the
index columns and possibly not in the index order (like match partial
where some of the fk side is null)?  I don't honestly see us doing match
partial any time soon, but I'd like to have an idea of what'd be involved.

> If we did this then RI checks would no longer be subvertible by rules or
> user triggers.

I don't think that it'd really help because it's the actions that are
generally subvertible not the checks and since those are looking at the
potentially not indexed fk side, I don't think the above would apply.


Re: Foreign keys for non-default datatypes

From
Tom Lane
Date:
Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> On Fri, 3 Mar 2006, Tom Lane wrote:
>> BTW, I had another thought about this: if we go this way, then the plans
>> associated with RI check queries would essentially always be trivial
>> index lookups (for everything except RI_Initial_Check).

> Would we have to do anything odd if we want to be testing only some of the
> index columns and possibly not in the index order (like match partial
> where some of the fk side is null)?  I don't honestly see us doing match
> partial any time soon, but I'd like to have an idea of what'd be involved.

Match partial would be an index lookup with a subset of the keys, which
btree at least is fine with.  You could argue that a "sufficiently
partial" match would be better done as a seqscan, but I think we could
just bull ahead and do it as indexscans always ...

>> If we did this then RI checks would no longer be subvertible by rules or
>> user triggers.

> I don't think that it'd really help because it's the actions that are
> generally subvertible not the checks and since those are looking at the
> potentially not indexed fk side, I don't think the above would apply.

Oh, right, we'd probably still need to do planning in that case.  Unless
we wanted to insist on having an FK-side index too for every FK, which
is something I'm not for.

Do you think it's worth redoing the implementation of just the PK checks?
        regards, tom lane


Re: Foreign keys for non-default datatypes

From
Stephan Szabo
Date:
On Fri, 3 Mar 2006, Tom Lane wrote:

> Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> > On Fri, 3 Mar 2006, Tom Lane wrote:
> >> BTW, I had another thought about this: if we go this way, then the plans
> >> associated with RI check queries would essentially always be trivial
> >> index lookups (for everything except RI_Initial_Check).
>
> > Would we have to do anything odd if we want to be testing only some of the
> > index columns and possibly not in the index order (like match partial
> > where some of the fk side is null)?  I don't honestly see us doing match
> > partial any time soon, but I'd like to have an idea of what'd be involved.
>
> Match partial would be an index lookup with a subset of the keys, which
> btree at least is fine with.  You could argue that a "sufficiently
> partial" match would be better done as a seqscan, but I think we could
> just bull ahead and do it as indexscans always ...

So at that level, not having leading columns is okay, so we wouldn't have
an issue with this (as opposed to if we planned such a statement, where it
wouldn't choose to use the index)?

> >> If we did this then RI checks would no longer be subvertible by rules or
> >> user triggers.
>
> > I don't think that it'd really help because it's the actions that are
> > generally subvertible not the checks and since those are looking at the
> > potentially not indexed fk side, I don't think the above would apply.
>
> Oh, right, we'd probably still need to do planning in that case.  Unless
> we wanted to insist on having an FK-side index too for every FK, which
> is something I'm not for.
>
> Do you think it's worth redoing the implementation of just the PK checks?

I'd say it's worth trying.  If it's probably faster, and it actually
sounds like it might be cleaner code than the way the statements are
built, even if it's another, different way of doing it.


Re: Foreign keys for non-default datatypes

From
Michael Glaesemann
Date:
On Mar 4, 2006, at 9:22 , Tom Lane wrote:

> If we did this then RI checks would no longer be subvertible by  
> rules or
> user triggers.  Although I've been heard to argue that that's a  
> feature,
> I think the majority of people feel it's a bug, and wouldn't be  
> sorry to
> see it go.

Just to check, are you referring to constraint triggers here? (I  
don't believe so, as I've been using constraint triggers in cases  
other than referential integrity and as I read it, you're only  
referring to RI.) Currently constraint triggers are the only way to  
defer constraint checking to the end of a transaction. Until we have  
some kind of support for multiple updates in a single statement (or  
perhaps deferred constraints, but I'd prefer the former), I'd like  
constraint triggers to hang around.

Michael Glaesemann
grzm myrealbox com





Re: Foreign keys for non-default datatypes

From
"Michael Paesold"
Date:
Tom Lane wrote:
>>> If we did this then RI checks would no longer be subvertible by rules
>>> or user triggers.
>
> Stephan Szabo writes:
>> I don't think that it'd really help because it's the actions that are
>> generally subvertible not the checks and since those are looking at the
>> potentially not indexed fk side, I don't think the above would apply.
>
> Oh, right, we'd probably still need to do planning in that case.  Unless
> we wanted to insist on having an FK-side index too for every FK, which
> is something I'm not for.

I don't really understand the implications here, but I hope that the 
following usecase will still work afterwards:

Two tables A, B.

B (id) references A (id), with ON DELETE CASCADE

Usually deleting a row from A will cause all referencing rows in B to be 
deleted, too. Nevertheless B has a BEFORE DELETE trigger "check_delete" that 
checks if a row of B may be deleted or not. I.e. it contains a IF ... RAISE 
EXCEPTION...

Will this trigger still be called, so it can abort the delete?
If not, I am against that change because it will break the 
consistency-enforcements of one of our applications.

In other words, if you only change the checks of the FKs, I see no problem 
at all; but if you change the actions of FKs to not call user defined 
triggers, I have a problem.

Please correct any of my wrong assumptions. ;-)

Best Regards,
Michael 




Re: Foreign keys for non-default datatypes

From
Tom Lane
Date:
"Michael Paesold" <mpaesold@gmx.at> writes:
> B (id) references A (id), with ON DELETE CASCADE

> Usually deleting a row from A will cause all referencing rows in B to be 
> deleted, too. Nevertheless B has a BEFORE DELETE trigger "check_delete" that 
> checks if a row of B may be deleted or not. I.e. it contains a IF ... RAISE 
> EXCEPTION...

> Will this trigger still be called, so it can abort the delete?

We'd certainly still call triggers and check row-level constraints,
and any error would abort the whole statement (leaving A unmodified).

The case that I think we'd forbid if the implementation could support
doing so is where a BEFORE trigger cancels the B-update operation by
returning NULL.  This currently leaves you with a row in B that violates
the FK constraint (once the A row is gone).

Triggers that modify the row to be stored are not a problem, because
B will have an AFTER trigger that rechecks the row against A anyway.
AFAICS it's only the silent-cancellation case that subverts RI constraints.

Rules on B that rewrite the DELETE or UPDATE into something else are
also problematic.

This is all moot at the moment since Stephan pointed out that we still
need planning for the FK actions (ie the cascaded deletes/updates).
So I'm not currently thinking of redoing the implementation of actions.
        regards, tom lane


Re: Foreign keys for non-default datatypes

From
"Michael Paesold"
Date:
Tom Lane writes:


> "Michael Paesold" <mpaesold@gmx.at> writes:
>> Will this trigger still be called, so it can abort the delete?
>
> We'd certainly still call triggers and check row-level constraints,
> and any error would abort the whole statement (leaving A unmodified).
>
> The case that I think we'd forbid if the implementation could support
> doing so is where a BEFORE trigger cancels the B-update operation by
> returning NULL.  This currently leaves you with a row in B that violates
> the FK constraint (once the A row is gone).
>
> Triggers that modify the row to be stored are not a problem, because
> B will have an AFTER trigger that rechecks the row against A anyway.
> AFAICS it's only the silent-cancellation case that subverts RI
> constraints.
>
> Rules on B that rewrite the DELETE or UPDATE into something else are
> also problematic.
>
> This is all moot at the moment since Stephan pointed out that we still
> need planning for the FK actions (ie the cascaded deletes/updates).
> So I'm not currently thinking of redoing the implementation of actions.

Ok, thank you for the explanation. At least I am not worried about a future 
reimplementation of the RI triggers.

Best Regards,
Michael Paesold 




Re: Foreign keys for non-default datatypes

From
Tom Lane
Date:
Michael Glaesemann <grzm@myrealbox.com> writes:
> Just to check, are you referring to constraint triggers here?

No, I don't wish to remove constraint triggers.  I would like to see
them become a better-documented, better-supported feature, which might
require some changes ... I don't recall why we are keeping them obscure
at the moment.
        regards, tom lane