Re: delta relations in AFTER triggers - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: delta relations in AFTER triggers
Date
Msg-id 1403130634.33625.YahooMailNeo@web122305.mail.ne1.yahoo.com
In response to Re: delta relations in AFTER triggers  (David Fetter <david@fetter.org>)
Responses Re: delta relations in AFTER triggers  (David Fetter <david@fetter.org>)
List pgsql-hackers
David Fetter <david@fetter.org> wrote:
> Robert Haas wrote:
>> Kevin Grittner <kgrittn@ymail.com> wrote:

> The good:
>     - Generating the tuplestores.  Yay!

Thanks for that.  ;-)

> The bad:
>     - Generating them exactly and only for AFTER triggers

The standard only allows them for AFTER triggers, and I'm not sure
what the semantics would be for any others.

>     - Requiring that the tuplestores both be generated or not at
>       all.  There are real use cases described below where only
>       one would be relevant.

Yeah.

>     - Generating the tuplestores unconditionally.

Well, there are conditions.  Only when the reloption allows and
only if there is an AFTER trigger for the type of operation in
progress.
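
As a rough illustration of those two conditions, here is a toy
Python model. It is not the patch's C code, and `Relation`,
`generate_deltas`, `after_trigger_ops`, and `should_capture_deltas`
are hypothetical names invented for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Relation:
    # Hypothetical stand-in for the per-table reloption in the patch.
    generate_deltas: bool = False
    # Operations ("INSERT", "UPDATE", "DELETE") that have an AFTER trigger.
    after_trigger_ops: set = field(default_factory=set)


def should_capture_deltas(rel: Relation, operation: str) -> bool:
    """Capture only if the reloption allows it AND an AFTER trigger
    exists for the type of operation in progress."""
    return rel.generate_deltas and operation in rel.after_trigger_ops
```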

> The ugly:
>     - Attaching tuplestore generation to tables rather than
>       callers (triggers, DML, etc.)

I'm not sure what you're getting at here.  This patch is
specifically only concerned with generating delta relations for DML
AFTER triggers, although my hope is that it will be a basis for
delta relations used for other purposes.  This seems to me like the
right place to initially capture the data for incremental
maintenance of materialized views, and might be of value for other
purposes, too.

> [formal definition of standard CREATE TRIGGER statement]

> Sorry that was a little verbose, but what it does do is give us
> what we need at trigger definition time.  I'd say it's pilot
> error if a trigger definition says "make these tuplestores" and
> the trigger body then does nothing with them, which goes to
> Robert's point below re: unconditional overhead.

Yeah, the more I think about it (and discuss it) the more I'm
inclined to suffer the additional complexity of the standard syntax
for specifying transition relations in order to avoid unnecessary
overhead creating them when not needed.  I'm also leaning toward
just storing TIDs in the tuplestores, even though it requires mixed
snapshots in executing queries in the triggers.  It just seems like
there will otherwise be too much overhead in copying around big,
unreferenced columns for some situations.
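
To show the trade-off in miniature, here is a toy Python model of
storing TIDs instead of whole tuples. It is only a sketch under
simplifying assumptions: `ToyHeap` and everything in it are invented
for illustration, there is a single block, and visibility rules are
not modeled at all.

```python
from typing import Dict, List, Tuple

TID = Tuple[int, int]  # (block number, offset within block)


class ToyHeap:
    """Append-only store of row versions, addressed by TID."""

    def __init__(self) -> None:
        self.versions: Dict[TID, dict] = {}
        self._next_offset = 1

    def insert(self, row: dict) -> TID:
        tid = (0, self._next_offset)  # everything lives in block 0 here
        self._next_offset += 1
        self.versions[tid] = row
        return tid

    def update(self, old_tid: TID, changes: dict) -> TID:
        # An update writes a new version; the old version stays in place.
        return self.insert({**self.versions[old_tid], **changes})


heap = ToyHeap()
old_tid = heap.insert({"id": 1, "status": "new", "payload": "x" * 10_000})
new_tid = heap.update(old_tid, {"status": "done"})

# Each delta holds one small TID per row instead of a copy of the payload.
old_delta: List[TID] = [old_tid]
new_delta: List[TID] = [new_tid]

# A trigger fetches only the columns it references.  Reading the "old"
# side means reading a superseded row version, which is where the mixed
# snapshots come in.
old_status = heap.versions[old_delta[0]]["status"]
new_status = heap.versions[new_delta[0]]["status"]
```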

> Along that same line, we don't always need to capture both the
> before tuplestores and the after ones.  Two examples of this come
> to mind:
>
> - BEFORE STATEMENT triggers accessing rows, where there is no
> after part to use,

Are you talking about an UPDATE for which the AFTER trigger(s) only
reference the before transition table, and don't look at the after one?  If
so, using the standard syntax would cover that just fine.  If not,
can you elaborate?

> and
> - DML (RETURNING BEFORE, e.g.) which only touches one of them.
> This applies both to extant use cases of RETURNING and to planned
> ones.

I think that can be sorted out by a patch which implements that, if
these deltas even turn out to be the appropriate way to get that
data (which is not clear to me at this time).  Assuming standard
syntax, the first thing would be for the statement to somehow
communicate to the trigger layer the need to capture a tuplestore
it might otherwise not generate, and there would need to be a way
for the statement to access the needed tuplestore(s).  The
statement would also need to project the right set of columns.
None of that seems to me to be relevant to this patch.  If this
patch turns out to provide infrastructure that helps, great.  If
you have a specific suggestion about how to make the tuplestores
more accessible to other layers, I'm listening.

> In summary, I'd like to propose that the tuplestores be generated
> separately in general and attached to callers. We can optimize
> this by not generating redundant tuplestores.

Well, if we use the standard syntax for CREATE TRIGGER and store
the transition table names (if any) in pg_trigger, the code can
generate one relation if any AFTER triggers which are going to fire
need it.  I don't see any point in generating exactly the same
tuplestore contents for each trigger.  And I suspect that we can wire
in any other uses later when we have something to connect them to.
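
Roughly, the decision could look like the following Python sketch.
It is only a model of the idea, not backend code: `TriggerDef`, its
`old_table` / `new_table` fields (standing in for transition table
names stored in pg_trigger), and `tuplestores_needed` are all names
invented here.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class TriggerDef:
    name: str
    timing: str      # "BEFORE" or "AFTER"
    operation: str   # "INSERT", "UPDATE" or "DELETE"
    old_table: Optional[str] = None  # transition table name, if declared
    new_table: Optional[str] = None


def tuplestores_needed(triggers: Iterable[TriggerDef],
                       operation: str) -> Tuple[bool, bool]:
    """Return (need_old, need_new) for the statement being executed.

    Each tuplestore is requested at most once, however many of the
    AFTER triggers about to fire declare a name for it.
    """
    need_old = need_new = False
    for trig in triggers:
        if trig.timing != "AFTER" or trig.operation != operation:
            continue
        need_old = need_old or trig.old_table is not None
        need_new = need_new or trig.new_table is not None
    return need_old, need_new
```

With two AFTER UPDATE triggers that both declare only an old
transition table, this asks for the old tuplestore once and skips
the new one entirely.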

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


