Thread: PostgreSQL 7.4.2 allows foreign key violation

PostgreSQL 7.4.2 allows foreign key violation

From
Markus Bertheau
Date:
Hi,

On PostgreSQL 7.4.2 I can create a situation in which a foreign key is
violated:

bug=# SELECT b.name, exists(SELECT 1 FROM a WHERE a.name = b.name) FROM b;
 name | ?column?
------+----------
 xxx  | f
(1 запись)

bug=# \d+ b
                       Таблица "public.b"
 Колонка | Тип | Модификаторы | Описание
----------------+--------+--------------------------+------------------
 name           | text   | not null                 |
Индексы:
    "b_pkey" ключевое поле, btree (name)
Ограничения по вторичному ключу:
    "$1" FOREIGN KEY (name) REFERENCES a(name) ON UPDATE CASCADE
Правила:
    b_rename AS ON UPDATE TO b DO INSTEAD UPDATE a SET name = new.name WHERE (a.name = old.name)

I create the situation as follows:

CREATE TABLE a (name TEXT PRIMARY KEY);
INSERT INTO a VALUES ('xxx');
CREATE TABLE b (name TEXT PRIMARY KEY REFERENCES a(name) ON UPDATE CASCADE);
INSERT INTO b VALUES ('xxx');
CREATE RULE b_rename AS ON UPDATE TO b DO INSTEAD UPDATE a SET name = NEW.name WHERE name = OLD.name;
UPDATE b SET name = 'yyy' WHERE name = 'xxx';
SELECT b.name, exists(SELECT 1 FROM a WHERE a.name = b.name) FROM b;
 name | ?column?
------+----------
 xxx  | f
(1 запись)

Up to here I thought that the following was going on: The UPDATE b
statement was rewritten into a UPDATE a statement by the rule system.
The update on a triggers the foreign key update on b. This UPDATE gets
rewritten again by the rule system to update a instead. The update to a
triggers the foreign key again, which recognizes that it is already
running and does nothing. The outer foreign key is done and the update
to a is realized. b stays unchanged.

But then I discovered that if I update the row in a prior to creating
the rule, the rule works as expected:

CREATE TABLE a (name TEXT PRIMARY KEY);
INSERT INTO a VALUES ('xxx');
CREATE TABLE b (name TEXT PRIMARY KEY REFERENCES a(name) ON UPDATE CASCADE);
INSERT INTO b VALUES ('xxx');
UPDATE a SET name = 'zzz' WHERE name = 'xxx';
CREATE RULE b_rename AS ON UPDATE TO b DO INSTEAD UPDATE a SET name = NEW.name WHERE name = OLD.name;
UPDATE b SET name = 'yyy' WHERE name = 'zzz';
SELECT b.name, exists(SELECT 1 FROM a WHERE a.name = b.name) FROM b;
 name | ?column?
------+----------
 yyy  | t
(1 запись)

This somehow renders my theory invalid. Can someone comment?

I also tried the same rule without INSTEAD. That does what I want and it
is what I'm using in the application now. I wonder if that is The Right
Way®.

And should PostgreSQL allow foreign key violations like in the example
above?

Thanks

--
Markus Bertheau <twanger@bluetwanger.de>


Re: PostgreSQL 7.4.2 allows foreign key violation

From
Stephan Szabo
Date:
On Fri, 6 Aug 2004, Markus Bertheau wrote:

> Up to here I thought that the following was going on: The UPDATE b
> statement was rewritten into a UPDATE a statement by the rule system.
> The update on a triggers the foreign key update on b. This UPDATE gets
> rewritten again by the rule system to update a instead. The update to a
> triggers the foreign key again, which recognizes that it is already
> running and does nothing. The outer foreign key is done and the update
> to a is realized. b stays unchanged.
>
> But then I discovered that if I update the row in a prior to creating
> the rule, the rule works as expected:
>
> CREATE TABLE a (name TEXT PRIMARY KEY);
> INSERT INTO a VALUES ('xxx');
> CREATE TABLE b (name TEXT PRIMARY KEY REFERENCES a(name) ON UPDATE CASCADE);
> INSERT INTO b VALUES ('xxx');
> UPDATE a SET name = 'zzz' WHERE name = 'xxx';
> CREATE RULE b_rename AS ON UPDATE TO b DO INSTEAD UPDATE a SET name = NEW.name WHERE name = OLD.name;
> UPDATE b SET name = 'yyy' WHERE name = 'zzz';
> SELECT b.name, exists(SELECT 1 FROM a WHERE a.name = b.name) FROM b;
>  name | ?column?
> ------+----------
>  yyy  | t
> (1 запись)
>
> This somehow renders my theory invalid. Can someone comment?

Only for that session.  The foreign key query is planned at the first
update so it doesn't see the rule until you get to a new session.


> I also tried the same rule without INSTEAD. That does what I want and it
> is what I'm using in the application now. I wonder if that is The Right
> Way®.

Probably.

> And should PostgreSQL allow foreign key violations like in the example
> above?

Probably not.

It also looks like before triggers returning NULL can break them. I think
we'd been worried about the added cost of doing the check when the average
case doesn't have this problem but we should probably just eat it.  In
practice I think it's one line of code per action function (on update set
default already does it).

Any opinions out there?

Re: PostgreSQL 7.4.2 allows foreign key violation

From
Tom Lane
Date:
Markus Bertheau <twanger@bluetwanger.de> writes:
> I create the situation as follows:

> CREATE TABLE a (name TEXT PRIMARY KEY);
> INSERT INTO a VALUES ('xxx');
> CREATE TABLE b (name TEXT PRIMARY KEY REFERENCES a(name) ON UPDATE CASCADE);
> INSERT INTO b VALUES ('xxx');
> CREATE RULE b_rename AS ON UPDATE TO b DO INSTEAD UPDATE a SET name = NEW.name WHERE name = OLD.name;
> UPDATE b SET name = 'yyy' WHERE name = 'xxx';
> SELECT b.name, exists(SELECT 1 FROM a WHERE a.name = b.name) FROM b;

The difficulty here is that the CASCADE is implemented by generating an
"UPDATE b" command ... which is rewritten by your rule and thus fails to
affect table b at all.

It would probably be better if the RI implementation acted at a lower
level and wasn't affected by rules, but for the foreseeable future the
answer is "don't do that".

> But then I discovered that if I update the row in a prior to creating
> the rule, the rule works as expected:

Only for the moment --- you're depending on a cached plan for the
foreign-key update.  Start a fresh backend and it's broken again.

            regards, tom lane

Re: PostgreSQL 7.4.2 allows foreign key violation

From
Tom Lane
Date:
Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> It also looks like before triggers returning NULL can break them.

Or a BEFORE trigger that overrides the attempted field update.

> I think we'd been worried about the added cost of doing the check when
> the average case doesn't have this problem but we should probably just
> eat it.  In practice I think it's one line of code per action function
> (on update set default already does it).

Already does what?  I see nothing in there that would override either
triggers or rules...

> Any opinions out there?

I seem to recall some discussions to the effect that having these
updates subject to rules/triggers is not necessarily bad.  For example,
if you were using a rule or trigger to log all updates of table B
someplace else, you'd probably be annoyed to find the RI updates
bypassing your logging mechanism.

There's no perfect solution ...

            regards, tom lane

Re: PostgreSQL 7.4.2 allows foreign key violation

From
Stephan Szabo
Date:
On Fri, 6 Aug 2004, Tom Lane wrote:

> Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> > It also looks like before triggers returning NULL can break them.
>
> Or a BEFORE trigger that overrides the attempted field update.
>
> > I think we'd been worried about the added cost of doing the check when
> > the average case doesn't have this problem but we should probably just
> > eat it.  In practice I think it's one line of code per action function
> > (on update set default already does it).
>
> Already does what?  I see nothing in there that would override either
> triggers or rules...

It's not for overriding the triggers or rules, but instead checking that
the post action state is valid (by running the no action code which
makes sure that either another row now has the pk value or that there are
no longer any matching rows).

Re: PostgreSQL 7.4.2 allows foreign key violation

From
Stephan Szabo
Date:
On Fri, 6 Aug 2004, Stephan Szabo wrote:

> On Fri, 6 Aug 2004, Tom Lane wrote:
>
> > Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> > > It also looks like before triggers returning NULL can break them.
> >
> > Or a BEFORE trigger that overrides the attempted field update.
> >
> > > I think we'd been worried about the added cost of doing the check when
> > > the average case doesn't have this problem but we should probably just
> > > eat it.  In practice I think it's one line of code per action function
> > > (on update set default already does it).
> >
> > Already does what?  I see nothing in there that would override either
> > triggers or rules...
>
> It's not for overriding the triggers or rules, but instead checking that
> the post action state is valid (by running the no action code which
> makes sure that either another row now has the pk value or that there are
> no longer any matching rows).

To make that clearer, that another row now has the old pk value, or that
there are no longer any matching rows to the old pk value.

Re: PostgreSQL 7.4.2 allows foreign key violation

From
Tom Lane
Date:
Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> On Fri, 6 Aug 2004, Tom Lane wrote:
>> Already does what?  I see nothing in there that would override either
>> triggers or rules...

> It's not for overriding the triggers or rules, but instead checking that
> the post action state is valid (by running the no action code which
> makes sure that either another row now has the pk value or that there are
> no longer any matching rows).

Oh, I see.  Seems an awfully expensive solution though :-(

            regards, tom lane

Re: PostgreSQL 7.4.2 allows foreign key violation

From
Stephan Szabo
Date:
On Fri, 6 Aug 2004, Tom Lane wrote:

> Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> > On Fri, 6 Aug 2004, Tom Lane wrote:
> >> Already does what?  I see nothing in there that would override either
> >> triggers or rules...
>
> > It's not for overriding the triggers or rules, but instead checking that
> > the post action state is valid (by running the no action code which
> > makes sure that either another row now has the pk value or that there are
> > no longer any matching rows).
>
> Oh, I see.  Seems an awfully expensive solution though :-(

Yeah.  That's the major downside. We could potentially do it only if there
are any instead rules or before triggers on the action we want to do
(either delete or update). I'm not sure if/how expensive that would be to
determine.  I think we could potentially check rules when the query is
planned (since presumably, those are the only rules that matter), but I
don't think that'd work for the triggers.


Re: PostgreSQL 7.4.2 allows foreign key violation

From
Jan Wieck
Date:
On 8/6/2004 1:23 PM, Tom Lane wrote:
> Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
>> On Fri, 6 Aug 2004, Tom Lane wrote:
>>> Already does what?  I see nothing in there that would override either
>>> triggers or rules...
>
>> It's not for overriding the triggers or rules, but instead checking that
>> the post action state is valid (by running the no action code which
>> makes sure that either another row now has the pk value or that there are
>> no longer any matching rows).
>
> Oh, I see.  Seems an awfully expensive solution though :-(

IMHO it is one of the cases that are on the line of "doctor, when I ...
then don't do it". As you said, there is no perfect solution. Triggers
and rules can conflict in several ways, but we don't want to sacrifice
one for making the other failsafe.


Jan

>
>             regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faqs/FAQ.html


--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #

Re: PostgreSQL 7.4.2 allows foreign key violation

From
Stephan Szabo
Date:
On Sun, 8 Aug 2004, Jan Wieck wrote:

> On 8/6/2004 1:23 PM, Tom Lane wrote:
> > Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> >> On Fri, 6 Aug 2004, Tom Lane wrote:
> >>> Already does what?  I see nothing in there that would override either
> >>> triggers or rules...
> >
> >> It's not for overriding the triggers or rules, but instead checking that
> >> the post action state is valid (by running the no action code which
> >> makes sure that either another row now has the pk value or that there are
> >> no longer any matching rows).
> >
> > Oh, I see.  Seems an awfully expensive solution though :-(
>
> IMHO it is one of the cases that are on the line of "doctor, when I ...
> then don't do it". As you said, there is no perfect solution. Triggers
> and rules can conflict in several ways, but we don't want to sacrifice
> one for making the other failsafe.

True, but I don't think we're sacrificing one for the other. We'd be
sacrificing speed AFAICS.

As I see it we've got the following:
 a) Do no code changes. The constraints can be fooled by some situations,
    make sure that it's documented and point people to the documentation.
     Upsides: No code changes, no further slowdown of functioning of
      constraint.
     Downside: Constraint can be violated

 b) Always run the no action check code. We always run the no action code
    after the action to check to make sure that the dependent rows are
    no longer there (or are supported by some other value). This is the
    theoretical model of the constraint in the spec, I believe.
     Upsides: Constraint should be theoretically difficult to break with
      this form of violation, barring bugs. Minor code change.
     Downsides: Everyone pays the (non-trivial) cost to do the check to
      fix this case. The extra check also potentially grabs yet more
      locks.

 c) Run the no action check code when we think there's some chance of this
    situation occuring.  The "some chance" could be always in which case
    this is the same as b, if there are any instead rules or before
    triggers on the acted upon table for the action being run (update or
    delete), or something more complicated.
     Upsides: Compared to a, we would get a constraint that's harder to
      break. Compared to b, we hopefully lessen the cost to people not
      using the combination.
     Downsides: More involved code changes and testing to make sure it's
      right. We still add a cost to everyone to check the state. The
      constraint now "acts differently" for people using instead rules
      or before triggers which means it's an additional variable to
      deal with when debugging problems.