Re: Triggers on foreign tables - Mailing list pgsql-hackers

From Ronan Dunklau
Subject Re: Triggers on foreign tables
Date
Msg-id 1452059.Z9cTgqlt2J@ronan_laptop
Whole thread Raw
In response to Re: Triggers on foreign tables  (Noah Misch <noah@leadboat.com>)
Responses Re: Triggers on foreign tables
List pgsql-hackers
Thank you very much for this review.

> The need here is awfully similar to a need of INSTEAD OF triggers on views.
> For them, we add a single "wholerow" resjunk TLE.  Is there a good reason to
> do it differently for triggers on foreign tables?

I wasn't aware of that, it makes for some much cleaner code IMO. Thanks.

> > - for after triggers, the whole queuing mechanism is bypassed for foreign
> > tables. This is IMO acceptable, since foreign tables cannot have
> > constraints or constraints triggers, and thus have not need for
> > deferrable execution. This design avoids the need for storing and
> > retrieving/identifying remote tuples until the query or transaction end.
>
> Whether an AFTER ROW trigger is deferred determines whether it runs at the
> end of the firing query or at the end of the firing query's transaction.
> In all cases, every BEFORE ROW trigger of a given query fires before any
> AFTER ROW trigger of the same query.  SQL requires that.  This proposal
> would give foreign table AFTER ROW triggers a novel firing time; let's not
> do that.
>
> I think the options going forward are either (a) design a way to queue
> foreign table AFTER ROW triggers such that we can get the old and/or new
> rows at the end of the query or (b) not support AFTER ROW triggers on
> foreign tables for the time being.
>

I did not know this was mandated by the standard.

The attached patch tries to solve this problem by allocating a tuplestore
in the global afterTriggers structure. This tuplestore is used for the whole
transaction, and uses only work_mem per transaction.

Both old and new tuples are stored in this tuplestore. Some additional
bookkeeping is done on the afterTriggers global structure, to keep track of
the number of inserted tuples, and the current read pointer position. The
tuples are identified by their order of insertion during the transaction.
I think this could benefit from some support in the tuplestore API, by
allowing arbitrary seek without the need to store more ReadPointers.

I initially tried to keep track of them by allocating read pointers on the
tuple store, but it turned out to be so expensive that I had to find another
way (24bytes per stored tuple, which are not reclaimable until the end of the
transaction).

What do you think about this approach ? Is there something I missed which
would make it not sustainable ?

If you prefer, I also have a patch implementing the rest of the changes and
keeping the previous behaviour for after triggers.

> It's not clear to me whether SQL/MED contemplates triggers on foreign
> tables. Its <drop basic column definition> General Rules do mention the
> possibility of a reference from a trigger column list.  On the other hand,
> I see nothing overriding the fact that CREATE TRIGGER only targets base
> tables.  Is this clearer to anyone else?  (This is a minor point, mainly
> bearing on the Compatibility section of the CREATE TRIGGER documentation.)

I do not have access to the standard specification, any advice regarding
specs compliance would be welcomed.

>
> > - the duplicated resjunk attributes are identified by being:
> >   - marked as resjunk in the targetlist
> >   - not being system or whole-row attributes (varno > 0)
> >
> > There is still one small issue with the attached patch: modifications to
> > the tuple performed by the foreign data wrapper (via the returned
> > TupleTableSlot in ExecForeignUpdate and ExecForeignInsert hooks) are not
> > visible to the AFTER trigger. This could be fixed by merging the planslot
> > containing the resjunk columns with the returned slot before calling the
> > trigger, but I'm not really sure how to safely perform that. Any advice ?
>
> Currently, FDWs are permitted to skip returning columns not actually
> referenced by any RETURNING clause.  I would change that part of the API
> contract to require returning all columns when an AFTER ROW trigger is
> involved.  You can't get around doing that by merging old column values,
> because, among other reasons, an INSERT does not have those values at all.

I'm not sure this should be part of the API contract: it would make
implementing a FDW more complicated than it is now. The attached patch hooks
on rewriteTargetListIU to add the missing targets to the returning clause,
when needed.

This also changes the way the query's hasReturning flag is set to exclude the
case when only resjunk entries are present in the returning list.

> > +NOTICE:  TG_NARGS: 2
> > +NOTICE:  TG_ARGV: [23, skidoo]
> > +NOTICE:  OLD: (11,"bye remote")
> > +insert into rem1 values(1,'insert');
>
> Would you trim the verbosity a bit?  Maybe merge several of the TG_ fields
> onto one line, and remove the low-importance ones.  Perhaps issue one line
> like this in place of all the current TG_ lines:
>
>   NOTICE:  trig_row_after(23, skidoo) AFTER ROW DELETE ON rem1
>

Fixed in the attached patch.


> > +CREATE TRIGGER trig_stmt_before BEFORE DELETE OR INSERT OR UPDATE ON ft1
> > FOR EACH STATEMENT EXECUTE PROCEDURE trigger_func(); +CREATE TRIGGER
> > trig_stmt_after AFTER DELETE OR INSERT OR UPDATE ON ft1 FOR EACH
> > STATEMENT EXECUTE PROCEDURE trigger_func();
> No test actually fires these triggers.

These should have been placed on the rem1 foreign table, not ft1. Fixed.

> > +    On foreign tables, triggers can not be defined at row level.
>
> This is obsolete.

I missed that from the earlier version of the patch, thank you.


> Why permit some variants, but not every variant, of ALTER TABLE t ENABLE
> TRIGGER <type> on foreign tables?

I overlooked that. Fixed.


> Keep the old variable name.  Exceptions can be made if the name was
> deceptive, but "tupleslot" communicates the same thing as "slot".

Fixed.

>
> > @@ -2146,7 +2183,8 @@ ExecASDeleteTriggers(EState *estate, ResultRelInfo
> > *relinfo)>
> >  bool
> >  ExecBRDeleteTriggers(EState *estate, EPQState *epqstate,
> >
> >                       ResultRelInfo *relinfo,
> >
> > -                     ItemPointer tupleid)
> > +                     ItemPointer tupleid,
> > +                     TupleTableSlot *tupleslot)
>
> The new argument is unused.

I added it here for coherence. Now removed in the attached patch.

>
> Please don't change unrelated whitespace.

> Please use pgindent to fix the formatting of your new code.  It's fine to
> introduce occasional whitespace errors, but they're unusually-plentiful
> here.

I think its done now. One problem I have with running pgindent is that I
accidentally add chunks that were modified only by pgindent.

> Obsolete comment.  That's done elsewhere, not here.

Ok

>
> For future reference, you mustn't just assume that a resjunk Var is the same
> resjunk Var you added for this purpose.  The target list has many
> consumers, present and future, so you need to find your resjunk entries
> more-reliably than this.  See other resjunk-adding code for examples.  This
> concern goes away if you borrow the "wholerow" approach from INSTEAD OF
> triggers.

Using the wholerow approach, the entry is identified by the junkfilter
jf_junkAttNo attribute. So this concern indeed goes away.


Again, thank you for this review.

Attachment

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: [bug fix] pg_ctl always uses the same event source
Next
From: Andrew Dunstan
Date:
Subject: Re: commit fest 2014-01 week 1 report