Thread: feature requests (possibly interested in working on this): functional foreign keys

feature requests (possibly interested in working on this): functional foreign keys

From
Chris Travers
Date:
Hi everyone;

I have a project right now where functional foreign keys would be very
handy.  Let me explain the specific use case I am looking at and I would
assume that there are many more.

I am looking at building a network configuration database for virtualized
machines in a hosting environment.  Now if I had functional foreign keys I
would probably build the db differently (I will probably simulate a partial
functional foreign key with a check constraint, a custom function, and a
custom trigger, but DRI would be helpful here for both the partial and
functional aspects).  The partial aspects can be done today using table
inheritance (with some caveats) but functional foreign keys are not
supported.

I have a table which tracks CIDR assignments to DHCP servers, and another
table which tracks IP address to MAC assignments.  There are some
complications here regarding the fact that not all IP addresses are managed
via DHCP, but this could be managed....

What would be nice to be able to do is to be able to do something like:

ALTER TABLE inet_assignment ADD FOREIGN KEY (network(inet_address))
REFERENCES cidr_block(block_def);

There are some obvious limitations here.  A foreign key could only use
immutable functions, for example.  Additionally if we need to handle
partial foreign keys we could do so with an immutable function returning
NULL from the table type based on relevant criteria.  Also there are a
couple possible dependency issues, such as:

1:  The foreign key depends on the function so the function cannot be
dropped first absent CASCADE

2: If the function is redefined, one would have to check all rows to verify
that they meet the new function's requirements.  This could pose a
performance issue with DDL.

There are obvious workarounds.  One could use a trigger and a foreign key.

But my questions are:

1.  Is there enough use in something like this to even try to tackle it?

2.  Are there any other major showstoppers I haven't thought of?

Best Wishes,
Chris Travers
On 7 February 2013 09:38, Chris Travers <chris.travers@gmail.com> wrote:

> 1:  The foreign key depends on the function so the function cannot be
> dropped first absent CASCADE
>
> 2: If the function is redefined, one would have to check all rows to
> verify that they meet the new function's requirements.  This could pose a
> performance issue with DDL.
>
> There are obvious workarounds.  One could use a trigger and a foreign key.
>
> But my questions are:
>
> 1.  Is there enough use in something like this to even try to tackle it?
>
> 2.  Are there any other major showstoppers I haven't thought of?
>
> Purely from a user perspective IMO it seems like a good idea and a logical
progression from index expressions. You could even make use of the
equivalent index expression if it existed, or (better) insist on it,
because the calculated value would have to be UNIQUE anyway (otherwise you
end up in all sorts of trouble).

Geoff
Geoff Winkless, 07.02.2013 11:46:
> On 7 February 2013 09:38, Chris Travers <chris.travers@gmail.com
> <mailto:chris.travers@gmail.com>> wrote:
>
> 1:  The foreign key depends on the function so the function cannot be
> dropped first absent CASCADE
>
> 2: If the function is redefined, one would have to check all rows to
> verify that they meet the new function's requirements.  This could
> pose a performance issue with DDL.
>
> There are obvious workarounds.  One could use a trigger and a foreign
> key.
>
> But my questions are:
>
> 1.  Is there enough use in something like this to even try to tackle
> it?
>
> 2.  Are there any other major showstoppers I haven't thought of?
>
> Purely from a user perspective IMO it seems like a good idea and a
> logical progression from index expressions. You could even make use
> of the equivalent index expression if it existed, or (better) insist
> on it, because the calculated value would have to be UNIQUE anyway
> (otherwise you end up in all sorts of trouble).
>

Wouldn't the ability to have virtual columns (aka computed or generated columns) inside a table be a generalization of
this?

The feature would need some kind of "virtual column" to support the FKs anyway, if I'm not mistaken (because the FK
valueneeds to be stored somewhere in order to be able to look it up). 

So I think exposing the ability to declare a virtual column would open up even more possibilities (and then in turn
allowthose virtual columns to be used in a FK constraint). 

Thomas
On 7 February 2013 11:18, Thomas Kellerer <spam_eater@gmx.net> wrote:

> Geoff Winkless, 07.02.2013 11:46:
>
>> On 7 February 2013 09:38, Chris Travers <chris.travers@gmail.com
>> <mailto:chris.travers@gmail.**com <chris.travers@gmail.com>>> wrote:
>>
>> 1:  The foreign key depends on the function so the function cannot be
>> dropped first absent CASCADE
>>
>
[snip]


> 2.  Are there any other major showstoppers I haven't thought of?
>
>
>> Purely from a user perspective IMO it seems like a good idea and a
>> logical progression from index expressions. You could even make use
>> of the equivalent index expression if it existed, or (better) insist
>> on it, because the calculated value would have to be UNIQUE anyway
>> (otherwise you end up in all sorts of trouble).
>>
>
Please note that the indenting is messed up here; I did not write the first
section of this, Chris did; only the last paragraph is mine.

Wouldn't the ability to have virtual columns (aka computed or generated
> columns) inside a table be a generalization of this?
>

Well it would be a much larger block of work with a much heavier impact and
(IMO) the value of taking up disk space with a column that is based on (and
entirely generate-able from) other columns is dubious. I imagine similar
arguments took place when expression indexes were mooted.

The feature would need some kind of "virtual column" to support the FKs
> anyway, if I'm not mistaken (because the FK value needs to be stored
> somewhere in order to be able to look it up).
>

Which is resolved by relying on the expression index.

Geoff
Chris Travers <chris.travers@gmail.com> writes:
> What would be nice to be able to do is to be able to do something like:
> ALTER TABLE inet_assignment ADD FOREIGN KEY (network(inet_address))
> REFERENCES cidr_block(block_def);

> 2.  Are there any other major showstoppers I haven't thought of?

The information_schema can't represent such a thing, and this is
unfixable without breaking the SQL standard.  I suppose we could omit
functional FK constraints from the information_schema views, but that's
not terribly palatable.

Have you considered just storing the network(inet_address) value in a
separate column (maintained by a BEFORE INSERT/UPDATE trigger) and then
using a regular FK with that?

            regards, tom lane
(Forgot to CC the list)


On Thu, Feb 7, 2013 at 7:04 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Chris Travers <chris.travers@gmail.com> writes:
> > What would be nice to be able to do is to be able to do something like:
> > ALTER TABLE inet_assignment ADD FOREIGN KEY (network(inet_address))
> > REFERENCES cidr_block(block_def);
>
> > 2.  Are there any other major showstoppers I haven't thought of?
>
> The information_schema can't represent such a thing, and this is
> unfixable without breaking the SQL standard.  I suppose we could omit
> functional FK constraints from the information_schema views, but that's
> not terribly palatable.
>

If this were to be limited to table methods (i.e. functions where
relation.function notation works), would that be sufficiently workable?

>
> Have you considered just storing the network(inet_address) value in a
> separate column (maintained by a BEFORE INSERT/UPDATE trigger) and then
> using a regular FK with that?
>

I have.  Honestly I think the UDF + check + trigger is cleaner.

Best Wishes,
Chris Travers

>
>                         regards, tom lane
>
Chris Travers <chris.travers@gmail.com> writes:
> On Thu, Feb 7, 2013 at 7:04 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Chris Travers <chris.travers@gmail.com> writes:
>>> What would be nice to be able to do is to be able to do something like:
>>> ALTER TABLE inet_assignment ADD FOREIGN KEY (network(inet_address))
>>> REFERENCES cidr_block(block_def);
>>>
>>> 2.  Are there any other major showstoppers I haven't thought of?

>> The information_schema can't represent such a thing, and this is
>> unfixable without breaking the SQL standard.  I suppose we could omit
>> functional FK constraints from the information_schema views, but that's
>> not terribly palatable.

> If this were to be limited to table methods (i.e. functions where
> relation.function notation works), would that be sufficiently workable?

Hmm, interesting hack.  I guess that would meet the part of the spec
that says, eg, information_schema.constraint_column_usage.column_name
must be an identifier --- at least if you also restricted which schema
the function could be in.  But it would sure violate some other parts.
For instance an app would expect to be able to join that column to
information_schema.columns.  On the whole I doubt that it's really a
good idea to break spec compatibility subtly rather than obviously.

            regards, tom lane
On 7 February 2013 16:26, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Hmm, interesting hack.  I guess that would meet the part of the spec
> that says, eg, information_schema.constraint_column_usage.column_name
> must be an identifier --- at least if you also restricted which schema
> the function could be in.


Apologies if I'm misunderstanding your point; couldn't you give the index
name (since we've suggested you would have to have a unique index on the
function in order to use it as an FK anyway) as the identifier?

Geoff
Geoff Winkless <pgsqlgeneral@geoff.dj> writes:
> On 7 February 2013 16:26, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Hmm, interesting hack.  I guess that would meet the part of the spec
>> that says, eg, information_schema.constraint_column_usage.column_name
>> must be an identifier --- at least if you also restricted which schema
>> the function could be in.

> Apologies if I'm misunderstanding your point; couldn't you give the index
> name (since we've suggested you would have to have a unique index on the
> function in order to use it as an FK anyway) as the identifier?

My point is that the spec expects that identifier to be the name of a
column in the table, and so will spec-compliant applications.  Inventing
different ways to provide an identifier that can be claimed to describe
the functional expression doesn't really do anything to get around that
problem.

I'm inclined to think that the way that the standards committee expects
people to get around this is to store the functional expression
explicitly as a separate column.  There's a feature called "generated
columns" in recent versions of the spec that automates that.  PG hasn't
implemented generated columns yet, but you can get the same effect with
a BEFORE trigger to calculate the separate column's value.

            regards, tom lane
On Thu, Feb 7, 2013 at 9:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Geoff Winkless <pgsqlgeneral@geoff.dj> writes:
> > On 7 February 2013 16:26, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Hmm, interesting hack.  I guess that would meet the part of the spec
> >> that says, eg, information_schema.constraint_column_usage.column_name
> >> must be an identifier --- at least if you also restricted which schema
> >> the function could be in.
>
> > Apologies if I'm misunderstanding your point; couldn't you give the index
> > name (since we've suggested you would have to have a unique index on the
> > function in order to use it as an FK anyway) as the identifier?
>
> My point is that the spec expects that identifier to be the name of a
> column in the table, and so will spec-compliant applications.  Inventing
> different ways to provide an identifier that can be claimed to describe
> the functional expression doesn't really do anything to get around that
> problem.
>
> I'm inclined to think that the way that the standards committee expects
> people to get around this is to store the functional expression
> explicitly as a separate column.  There's a feature called "generated
> columns" in recent versions of the spec that automates that.  PG hasn't
> implemented generated columns yet, but you can get the same effect with
> a BEFORE trigger to calculate the separate column's value.
>

Thanks.  That's the sort of show-stopper that was overlooking.

It seems that to make the table method approach work we'd have to be able
to have some other things in place first, perhaps being able to explicitly
define a table method as a "virtual column" that could be seen in the
information schema (and possibly making the args of the function entirely
implicit, allowing select method from relation.

That might be worthwhile too but it significantly expands the scope of what
I was looking at.

Best Wishes,
Chris Travers

                        regards, tom lane
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>