Re: delta relations in AFTER triggers - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: delta relations in AFTER triggers
Date
Msg-id 1403127006.98728.YahooMailNeo@web122301.mail.ne1.yahoo.com
Whole thread Raw
In response to Re: delta relations in AFTER triggers  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers
Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Kevin Grittner wrote:

> TRUNCATE triggers shouldn't have delta relations, that's for
> sure, or it'd completely negate the point of TRUNCATE triggers.
> I don't think we have any other event, do we?  People who get
> delta relations for deleting all rows should be using DELETE, I
> think.

That sounds reasonable.  The only other issue was INSTEAD OF
triggers, but I don't think we need them there, either.

> I am not sure about providing delta relations in the FOR EACH ROW
> case.  For what cases are them useful?

In an accounting application for courts I have seen a case where
each receivable (asset) account had a contra (liability) account of
"uncollected receivables", because until and unless the Clerk of
Courts collected the judgment there was no obligation to pay the
money back out.  Any financial transaction had to be in a database
transaction, and not only did all the transaction detail need to
balance to zero, but any receivable detail row needed to be
immediately followed by a balancing contra account row (and vice
versa).  A FOR EACH ROW trigger, on seeing one of these rows, could
check for the required balancing row.  Now this would only be
solved by the standard feature if both rows were always inserted by
a single statement, which might not always have been the case; so
even with this example I am stretching a bit.  But it's close
enough to suggest that there might be legitimate uses.

And there is the fact that the standard seems to want this to be
supported.

> In all cases I think NEW and OLD are sufficient.  I didn't read
> the standard, but if it's saying that in FOR EACH ROW the
> new/deleted/changed row should be accessible by way of a delta
> relation, [...]

No, it says that you can specify *both* the row variables for the
row under consideration and the tables for the full set of rows
affected by the statement *for the same FOR EACH ROW trigger*.

> Now, if you only have delta relations in FOR EACH STATEMENT, then
> it seems to me you can optimize obtaining them only when such
> triggers are defined; this lets you do away with the reloption
> entirely, doesn't it?

That was intended to minimize the situations under which there was
a performance hit where the new delta relations were not needed in
an AFTER trigger.  If we only generate these for FOR EACH STATEMENT
triggers, that certainly reduces the need for that, but I'm not
sure it eliminates it -- especially if we're generating tuplestores
for the full rows rather than their TIDs.

It is already generating the tuplestores only if there is an AFTER
trigger for the type of statement being executed, but I agree that
it would be a very rare FOR EACH ROW trigger that actually used it.
Of course, that is one argument for the standard syntax -- we
could test whether any of the trigger definitions for that
operation on that table specified each transition table, and only
generate them if needed based on that.

> I noticed that GetCurrentFDWTuplestore() changed its identity
> without getting its comment updated.  The new name seems awfully
> generic, and I don't think it really describes what the function
> does.  I think it needs more renaminguu

Good point.  Will fix that in the next version.

>> (1)  My first impulse was to capture this delta data in the form
>> of tuplestores of just TIDs, and fetching the tuples themselves
>> from the heap on reference.  In earlier discussions others have
>> argued for using tuplestores of the actual rows themselves.
>
> Can you please supply pointers to such discussion?

That was in a matview-related discussion, so perhaps we should
ignore that for now and discuss what's best for triggers here.  If
matviews need something different, the rows could always be
materialized from the TIDs at a later point.

> I don't see any point in not just storing TIDs, but perhaps I'm
> missing something.

I think there was some question about performance, but I really
have a hard time seeing the performance being significantly worse
with a TID tuplestore for most reasonable uses; and I think the TID
tuplestore could be a lot faster if (for example) you have a table
with a large, TOASTed text or bytea column which is not referenced
(in selection criteria or the SET clause) and is not used in the
FOR EACH STATEMENT trigger.

I think there might also have been some question about visibility.
A TID tuplestore would need to use a different snapshot (like maybe
SnapshotAny) in the same query where it joined to other tables with
a normal MVCC snapshot.

>> (2)  Do we want to just pick names for these in the PLs rather
>> than using the standard syntax?  Implementing the standard
>> syntax seemed to require three new (unreserved) keywords,
>> changes to the catalogs to store the chosen relations names, and
>> some way to tie the specified relation names in to the various
>> PLs.
>
> I think the only one for which we have a compulsion to follow
> someone in this area would be PL/pgSQL,

... which currently hard-codes the row variable names rather than
using standard syntax to specify them.

> which probably needs to follow PL/SQL's lead if there is one.

I don't believe we can create PL/SQL trigger functions, can we?

> Other than that I don't think we need to do anything standard.
> We don't (yet) have PL/PSM which would need to have the
> standard-mandated syntax.

The stated reason for not specifying the row variable names in the
CREATE TRIGGER statement is the difficulty of tying that in to all
the supported PLs.  I couldn't see any reason for that to be easier
for the tables than the row variables.  Of course, if we intend to
support PL/PSM and we want to use standard CREATE TRIGGER syntax
for that (rather than having trigger functions in that language act
as though some particular name was specified) it seems better to me
that we implement it now.  But I don't really see why we would need
to do that to have a conforming PL/PSM implementation.

>> The way I have gone here just adds two new fields to the
>> TriggerData structure and leaves it to each PL how to deal with
>> that.  Failure to do anything in a PL just leaves it at the
>> status quo with no real harm done -- it just won't have the new
>> delta relations available.
>
> It seems to me that plpgsql support is mandatory, but other PLs
> can add support as interested parties weigh in with patches.  As
> with event triggers, I don't feel the author of the main feature
> is responsible for patching all PLs.

I like that point of view!

> Are you intentionally omitting the corresponding
> sql-createtrigger patch?

Yes, because that depends on decisions made about whether to use
standard syntax and (if not) a round of bikeshedding about what
fixed names to use in plpgsql.

Thanks for the feedback!

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Atomics hardware support table & supported architectures
Next
From: Bruce Momjian
Date:
Subject: Re: idle_in_transaction_timeout