Re: Triggers on foreign tables - Mailing list pgsql-hackers
From: Kohei KaiGai
Subject: Re: Triggers on foreign tables
Date:
Msg-id: CADyhKSXci64a3xupcOv35dMwjj5ZKbLqUDy7HJ-13MFYretxAw@mail.gmail.com
In response to: Re: Triggers on foreign tables (Ronan Dunklau <ronan.dunklau@dalibo.com>)
Responses: Re: Triggers on foreign tables
List: pgsql-hackers
I tried to check the latest (v8) patch again, and could not find any problem
with your design change from the v7. As Noah pointed out, it uses a
per-query-depth tuplestore that is released on AfterTriggerEndSubXact.
So, may I mark it as "ready for committer"?

2014-03-03 15:48 GMT+09:00 Ronan Dunklau <ronan.dunklau@dalibo.com>:
> Hello.
>
> Did you have time to review the latest version of this patch? Is there
> anything I can do to get this "ready for committer"?
>
> Thank you for all the work performed so far.
>
> On Tuesday 4 February 2014 13:16:22, Ronan Dunklau wrote:
>> On Monday 3 February 2014 23:28:45, Noah Misch wrote:
>> > On Sun, Feb 02, 2014 at 11:53:51AM +0100, Ronan Dunklau wrote:
>> > > On Thursday 30 January 2014 14:05:08, Noah Misch wrote:
>> > > > On Thu, Jan 23, 2014 at 03:17:35PM +0100, Ronan Dunklau wrote:
>> > > > > What do you think about this approach? Is there something I
>> > > > > missed which would make it not sustainable?
>> > > >
>> > > > Seems basically reasonable. I foresee multiple advantages from
>> > > > having one tuplestore per query level as opposed to one for the
>> > > > entire transaction. You would remove the performance trap of
>> > > > backing up the tuplestore by rescanning. It permits reclaiming
>> > > > memory and disk space in AfterTriggerEndQuery() rather than at end
>> > > > of transaction. You could remove ate_ptr1 and ate_ptr2 from
>> > > > AfterTriggerEventDataFDW and just store the flags word: depending
>> > > > on AFTER_TRIGGER_2CTIDS, grab either the next one or the next two
>> > > > tuples from the tuplestore. Using work_mem per
>> > > > AfterTriggerBeginQuery() instead of per transaction is no problem.
>> > > > What do you think of that design change?
>> > >
>> > > I agree that this design is better, but I have some objections.
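A minimal sketch of the sequential read path Noah proposes above, as a toy
Python model rather than PostgreSQL code (the constant name comes from the
thread; `replay_events` and its shapes are illustrative assumptions). Each
queued FDW trigger event keeps only a flags word, and the
AFTER_TRIGGER_2CTIDS bit decides whether one tuple (INSERT/DELETE) or two
(UPDATE's OLD and NEW) are pulled from the per-query-level tuplestore. A
strictly forward read like this suffices only when each stored tuple feeds
a single trigger; the rest of the thread discusses why seeks are still
needed otherwise.

```python
AFTER_TRIGGER_2CTIDS = 0x1  # hypothetical flag bit, per the thread's naming

def replay_events(event_flags, tuplestore):
    """Read the tuplestore strictly forward, one event at a time."""
    it = iter(tuplestore)
    out = []
    for flags in event_flags:
        if flags & AFTER_TRIGGER_2CTIDS:
            out.append((next(it), next(it)))  # UPDATE: OLD then NEW tuple
        else:
            out.append((next(it),))           # INSERT or DELETE: one tuple
    return out

# One INSERT, one UPDATE, one DELETE queued in enqueue order:
events = [0x0, AFTER_TRIGGER_2CTIDS, 0x0]
store = ["new1", "old2", "new2", "old3"]
print(replay_events(events, store))
# [('new1',), ('old2', 'new2'), ('old3',)]
```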
>> > >
>> > > We can remove ate_ptr2 and rely on the AFTER_TRIGGER_2CTIDS flag,
>> > > but the rescanning and ate_ptr1 (renamed ate_tupleindex in the
>> > > attached patch) can't go away.
>> > >
>> > > Consider for example the case of a foreign table with more than one
>> > > AFTER UPDATE trigger. Unless we store the tuples once for each
>> > > trigger, we will have to rescan the tuplestore.
>> >
>> > Will we? Within a given query level, when do (non-deferred) triggers
>> > execute in an order other than the enqueue order?
>>
>> Let me explain what I had in mind.
>>
>> Looking at the code in AfterTriggerSaveEvent:
>>
>> - we build a "template" AfterTriggerEvent, and store the tuple(s)
>> - for each suitable after trigger that matches the trigger type, as
>>   well as the WHEN condition if any, a copy of the previously built
>>   AfterTriggerEvent is queued
>>
>> Later, those events are fired in order.
>>
>> This means that more than one event can be fired for one tuple.
>>
>> Take this example:
>>
>> CREATE TRIGGER trig_row_after1
>> AFTER UPDATE ON rem2
>> FOR EACH ROW
>> WHEN (NEW.f1 % 5 < 3)
>> EXECUTE PROCEDURE trigger_func('TRIG1');
>>
>> CREATE TRIGGER trig_row_after2
>> AFTER UPDATE ON rem2
>> FOR EACH ROW
>> WHEN (NEW.f1 % 5 < 4)
>> EXECUTE PROCEDURE trigger_func('TRIG2');
>>
>> UPDATE rem2 set f2 = 'something';
>>
>> Assuming 5 rows with f1 as a serial, the fired AfterTriggerEvents'
>> ate_tupleindex will be, in that order:
>>
>> 0-0-2-2-4-8-8
>>
>> So, at least a backward seek is required for trig_row_after2 to be able
>> to retrieve a tuple that was already consumed when firing
>> trig_row_after1.
>>
>> On a side note, this made me realize that it is better to avoid storing
>> a tuple entirely if there is no enabled trigger (the f1 = 4 case above).
>> The attached patch does that, so the previous sequence becomes:
>>
>> 0-0-2-2-4-6-6
>>
>> It also prevents initializing a tuplestore at all if it's not needed.
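The two index sequences in the example above can be reproduced with a small
simulation, sketched here in Python under the thread's stated assumptions
(5 rows with f1 = 1..5, an UPDATE storing two tuples per affected row, the
two WHEN clauses shown; the function and parameter names are illustrative,
not PostgreSQL's):

```python
def fired_tuple_indexes(f1_values, when_clauses, skip_unmatched_rows):
    """Return the ate_tupleindex of each fired event, in enqueue order."""
    events = []
    next_index = 0
    for f1 in f1_values:
        matched = [w for w in when_clauses if w(f1)]
        if skip_unmatched_rows and not matched:
            continue  # patched behaviour: don't store tuples no trigger reads
        for _ in matched:
            events.append(next_index)  # one queued event per matching trigger
        next_index += 2  # an UPDATE stores OLD and NEW: two "ctids" per row
    return events

trig1 = lambda f1: f1 % 5 < 3   # trig_row_after1's WHEN clause
trig2 = lambda f1: f1 % 5 < 4   # trig_row_after2's WHEN clause

print(fired_tuple_indexes(range(1, 6), [trig1, trig2], False))
# [0, 0, 2, 2, 4, 8, 8] -- f1 = 4 matches nothing but is still stored
print(fired_tuple_indexes(range(1, 6), [trig1, trig2], True))
# [0, 0, 2, 2, 4, 6, 6] -- f1 = 4 not stored at all
```

The non-monotonic pairs (0-0, 2-2, ...) show why a backward seek is needed:
trig_row_after2's event re-reads an index trig_row_after1 already consumed.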
>>
>> > > To mitigate the effects of this behaviour, I added the option to
>> > > perform a reverse_seek when the looked-up tuple is nearer to the
>> > > current index than to the start.
>> >
>> > If there's still a need to seek within the tuplestore, that should
>> > get rid of the O(n^2) effect. I'm hoping that per-query-level
>> > tuplestores will eliminate the need to seek entirely.
>>
>> I think the only case where seeking is still needed is when there is
>> more than one after trigger to be fired, since the abovementioned
>> change prevents seeking to skip tuples.
>>
>> > > > If you do pursue that change, make sure the code still does the
>> > > > right thing when it drops queued entries during subxact abort.
>> > >
>> > > I don't really understand what should be done at that stage. Since
>> > > triggers on foreign tables are not allowed to be deferred,
>> > > everything should be cleaned up at the end of each query, right?
>> > > So, there shouldn't be any queued entries.
>> >
>> > I suspect that's right. If you haven't looked over
>> > AfterTriggerEndSubXact(), please do so and ensure all its actions
>> > still make sense in the context of this new kind of trigger storage.
>>
>> You're right, I missed something here. When aborting a subxact, the
>> tuplestores for queries below the subxact query depth should be
>> cleaned, if any, because AfterTriggerEndQuery has not been called for
>> the failing query.
>>
>> The attached patch fixes that.
>>
>> > > > > The attached patch checks this, and adds documentation for this
>> > > > > limitation. I'm not really sure about how to phrase that
>> > > > > correctly in the error message and the documentation. One can
>> > > > > store at most INT_MAX foreign tuples, which means that at most
>> > > > > INT_MAX insert or delete or "half-updates" can occur.
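The cleanup rule agreed on above can be modelled as a toy sketch (plain
Python, not PostgreSQL code; the class and method names are assumptions
made for illustration): one tuplestore per query level, released normally
at end of query, and additionally discarded for every deeper query level
when a subtransaction aborts, because AfterTriggerEndQuery never ran for
the failing queries.

```python
class FdwTriggerStorage:
    """Toy model of per-query-level tuplestores and their cleanup."""

    def __init__(self):
        self.stores = {}  # query_depth -> list of buffered foreign tuples

    def begin_query(self, depth):
        self.stores.setdefault(depth, [])

    def end_query(self, depth):
        # Normal release point: memory/disk reclaimed per query,
        # not at end of transaction.
        self.stores.pop(depth, None)

    def abort_subxact(self, depth_at_subxact_start):
        # A failing query never reaches AfterTriggerEndQuery, so the
        # abort path must discard every tuplestore deeper than the
        # query depth saved when the subtransaction started.
        for d in [d for d in self.stores if d > depth_at_subxact_start]:
            del self.stores[d]

s = FdwTriggerStorage()
s.begin_query(0); s.begin_query(1); s.begin_query(2)
s.abort_subxact(0)      # subxact entered at depth 0 fails
print(sorted(s.stores))  # only depth 0 survives
```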
>> > > > > By half-updates, I mean that for an update two tuples are
>> > > > > stored, in contrast to only one for insert and delete triggers.
>> > > > >
>> > > > > Besides, I don't know where this disclaimer should be in the
>> > > > > documentation. Any advice here?
>> > > >
>> > > > I wouldn't mention that limitation.
>> > >
>> > > Maybe it shouldn't be so prominent, but I still think a note
>> > > somewhere couldn't hurt.
>> >
>> > Perhaps. There's not much documentation of such implementation upper
>> > limits, and there's no usage of "INT_MAX" outside of parts that
>> > discuss writing C code. I'm not much of a visionary when it comes to
>> > the documentation; I try to document new things with an amount of
>> > detail similar to old features.
>>
>> Ok, I removed the paragraph documenting the limitation.
>>
>> > > Should the use of work_mem be documented somewhere, too?
>> >
>> > I wouldn't. Most uses of work_mem are undocumented, even relatively
>> > major ones like count(DISTINCT ...) and CTEs. So, while I'd generally
>> > favor a patch documenting all/most of the things that use work_mem,
>> > it would be odd to document one new consumer apart from the others.
>>
>> Ok.
>>
>> > > > This is the performance trap I mentioned above; it brings
>> > > > potential O(n^2) complexity to certain AFTER trigger execution
>> > > > scenarios.
>> > >
>> > > What scenarios do you have in mind? I "fixed" the problem when
>> > > there are multiple triggers on a foreign table, do you have any
>> > > other one?
>> >
>> > I'm not aware of such a performance trap in your latest design.
>>
>> Good!
>>
>> > Thanks,
>> > nm
>
> --
> Ronan Dunklau
> http://dalibo.com - http://dalibo.org

--
KaiGai Kohei <kaigai@kaigai.gr.jp>