Re: on update / on delete performance of foreign keys - Mailing list pgsql-general
From | Stephan Szabo |
---|---|
Subject | Re: on update / on delete performance of foreign keys |
Date | |
Msg-id | 20050124083442.F37929@megazone.bigpanda.com Whole thread Raw |
In response to | Re: on update / on delete performance of foreign keys ("Florian G. Pflug" <fgp@phlo.org>) |
Responses |
Re: on update / on delete performance of foreign keys
|
List | pgsql-general |
On Mon, 24 Jan 2005, Florian G. Pflug wrote: > Stephan Szabo wrote: > > On Mon, 24 Jan 2005, Florian G. Pflug wrote: > > > >>Since postgres already incoporates code to check foreign keys more > >>efficiently (when doing alter table ... add constraint .. foreign key, > >>postgres seems to use a merge or a hash join, instead of a nested loop), > >>I wondered how hard it would be to use this for the triggers too. > >> > >>I imagined creating a statement-level trigger in parallel to the > >>row-level triggers, and defining some threshold (let's say, more than > >>10% of the rows deleted). If the threshold is reached, the row-level > >>trigger would just do nothing, and the statement-level trigger would > >>delete the referencing records doing a join. > >> > >>Would this be feasable? And would it be something a newbie could tackle, > >>or is it more involved than I think? > > > > It's a little more involved. The first is that I think there's no good > > way to tell the row trigger to do nothing (remember that the constraints > > may be deferred so simple flags aren't sufficient). > I would be content if my optimization works for the not-deferred case - > I'd don't fully understand how deferred foreign keys are handled in > postgres. (I guess I don't even fully understand their semantics - I > use them only when doing bulk inserts, and there are either circular > dependencies, or I don't feel like find the right table order ;-)) Actually, thinking about it, this might not be so bad, you'd need to effectively have a stack of states (for events triggered by triggered actions perhaps many levels deep) but I think that's already effectively there. > > > The second is that > > these triggers will want to know which rows are deleted, but AFAIK > > statement-level triggers don't currently give you that information and > > deleting/changing any rows that aren't satisfied does not give the correct > > behavior. > This I do not understand. Isn't it sufficient to delete any rows whose > reference does not exist (for the on-delete-cascade case), or complain > if such rows exist (for the no-action/restrict case)? The > on-update-cascade case is difficult I guess - I'm not sure if my idea > even works for that case, now that I think about it... It's not sufficient to do the delete for non existant pk rows in the deferred case. I also think we'd need to decide on the behavior for the PostgreSQL case where a user trigger runs in between the delete and the action (for example, if I delete where pk=1 and then in between the delete and its action insert a row with pk=1 does the delete fire? The spec doesn't say much because I don't think you can run anything between the two.) insert into pk values (1); begin; insert into fk values (2); delete from pk; commit; AFAICT to follow the foreign key semantics if the foreign key check is deferred an error occurs on commit. Deleting the fk row on the delete from pk is not allowed. I think it may be valid for on delete no action even in the deferred case(*) , but I haven't done alot of thinking about it, but I think it's also invalid for deferred restrict since only the rows being deleted have the restrict applied to them, so an insert into pk values (2) between the delete and commit would allow the transaction to succeed AFAIK. (*) - I'm not sure how you'd necessarily give a complete error message if the error should really be that an insert was invalid but you noticed it on a delete check. I haven't thought about the update cases at all. > > The no action case is actually a little more involved again as > > it needs to remove rows from the set of changed pk rows if new pk rows > > have come into existance for matching keys. > Guess without understanding your previous comment I'm lost here too - I > wouldn't care to check only changed rows - I would check them all - but > only if some estimate shows that it will probably cheaper. > At the moment I'm writing a few plpgsql functions that do what I want. > They disable all constraint-related trigger, do a deleted, and then > recursivly traverse all tables (following the foreign-keys), and do > a "delete from .. where not exists (select 1 from ... where ...)". > I'll if I stumble upon problems - maybe I'll suddenly understand your > comments ;-))) Note that this may fail for cases where the original delete or any of the triggered deletes run statements that should cause triggered actions.
pgsql-general by date: