Re: on update / on delete performance of foreign keys - Mailing list pgsql-general

From Stephan Szabo
Subject Re: on update / on delete performance of foreign keys
Date
Msg-id 20050124083442.F37929@megazone.bigpanda.com
Whole thread Raw
In response to Re: on update / on delete performance of foreign keys  ("Florian G. Pflug" <fgp@phlo.org>)
Responses Re: on update / on delete performance of foreign keys
List pgsql-general
On Mon, 24 Jan 2005, Florian G. Pflug wrote:

> Stephan Szabo wrote:
> > On Mon, 24 Jan 2005, Florian G. Pflug wrote:
> >
> >>Since postgres already incoporates code to check foreign keys more
> >>efficiently (when doing alter table ... add constraint .. foreign key,
> >>postgres seems to use a merge or a hash join, instead of a nested loop),
> >>I wondered how hard it would be to use this for the triggers too.
> >>
> >>I imagined creating a statement-level trigger in parallel to the
> >>row-level triggers, and defining some threshold (let's say, more than
> >>10% of the rows deleted). If the threshold is reached, the row-level
> >>trigger would just do nothing, and the statement-level trigger would
> >>delete the referencing records doing a join.
> >>
> >>Would this be feasable? And would it be something a newbie could tackle,
> >>or is it more involved than I think?
> >
> > It's a little more involved.  The first is that I think there's no good
> > way to tell the row trigger to do nothing (remember that the constraints
> > may be deferred so simple flags aren't sufficient).
> I would be content if my optimization works for the not-deferred case -
> I'd don't fully understand how deferred foreign keys are handled in
> postgres. (I guess I don't even fully understand their semantics - I
> use them only when doing bulk inserts, and there are either circular
> dependencies, or I don't feel like find the right table order ;-))

Actually, thinking about it, this might not be so bad, you'd need to
effectively have a stack of states (for events triggered by triggered
actions perhaps many levels deep) but I think that's already effectively
there.

> > > The second is that
> > these triggers will want to know which rows are deleted, but AFAIK
> > statement-level triggers don't currently give you that information and
> > deleting/changing any rows that aren't satisfied does not give the correct
> > behavior.
> This I do not understand. Isn't it sufficient to delete any rows whose
> reference does not exist (for the on-delete-cascade case), or complain
> if such rows exist (for the no-action/restrict case)? The
> on-update-cascade case is difficult I guess - I'm not sure if my idea
> even works for that case, now that I think about it...

It's not sufficient to do the delete for non existant pk rows in the
deferred case.  I also think we'd need to decide on the behavior for the
PostgreSQL case where a user trigger runs in between the delete and the
action (for example, if I delete where pk=1 and then in between the delete
and its action insert a row with pk=1 does the delete fire? The spec
doesn't say much because I don't think you can run anything between the
two.)

insert into pk values (1);
begin;
 insert into fk values (2);
 delete from pk;
commit;

AFAICT to follow the foreign key semantics if the foreign key check is
deferred an error occurs on commit.  Deleting the fk row on the delete
from pk is not allowed.

I think it may be valid for on delete no action even in the deferred
case(*) , but I haven't done alot of thinking about it, but I think it's
also invalid for deferred restrict since only the rows being deleted have
the restrict applied to them, so an insert into pk values (2) between the
delete and commit would allow the transaction to succeed AFAIK.

(*) - I'm not sure how you'd necessarily give a complete error message if
the error should really be that an insert was invalid but you noticed it
on a delete check.

I haven't thought about the update cases at all.

> > The no action case is actually a little more involved again as
> > it needs to remove rows from the set of changed pk rows if new pk rows
> > have come into existance for matching keys.
> Guess without understanding your previous comment I'm lost here too - I
> wouldn't care to check only changed rows - I would check them all - but
> only if some estimate shows that it will probably cheaper.

> At the moment I'm writing a few plpgsql functions that do what I want.
> They disable all constraint-related trigger, do a deleted, and then
> recursivly traverse all tables (following the foreign-keys), and do
> a "delete from .. where not exists (select 1 from ... where ...)".
> I'll if I stumble upon problems - maybe I'll suddenly understand your
> comments ;-)))

Note that this may fail for cases where the original delete or any of the
triggered deletes run statements that should cause triggered actions.

pgsql-general by date:

Previous
From: Jarkko Elfving
Date:
Subject: Re: Ident authentication SOLVED
Next
From: Christoffer Gurell
Date:
Subject: Re: number of rown in a cursor.