Re: Command Triggers - Mailing list pgsql-hackers

From Dimitri Fontaine
Subject Re: Command Triggers
Date
Msg-id m2obszzvx7.fsf@2ndQuadrant.fr
Whole thread Raw
In response to Re: Command Triggers  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Command Triggers  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Hi,

Thanks for your time reviewing that patch!

Robert Haas <robertmhaas@gmail.com> writes:
> I took a brief look at this just now, and in general I'm pleasantly
> surprised.  But, as you might imagine, I have some reservations:

Good starting point, let's see about the details :)

> 1. I fear that the collection of commands to which this applies is
> going to end up being kind of a random selection.  I suggest that for
> the first version of the patch, just pick a couple of simple commands
> like VACUUM and ANALYZE, and do just those.  We can add on more later
> easily enough once the basic patch is committed.  Also, that way, if
> you don't get to absolutely everything, we'll have a coherent subset
> of the functionality, rather than a subset defined by "what Dimitri
> got to".

I share that feeling here. Now, given the current scope of the patch, I
think we can add in 9.2 all the commands that make sense to support
(yes, including alter operator family). FWIW, I've been adding support
for 16 forms of ALTER commands today, triple (implementation of alter
rename, owner and namespace are separated).

So while your reaction is perfectly understandable, I don't think that's
the main thing here, you've just happened to see an intermediate state
of things.

> 2. Currently, we have some objects (views) that support INSTEAD OF
> triggers and others (tables) that support BEFORE and AFTER triggers.
> I don't see any advantage in supporting both.

That's because you can cancel an INSERT from a BEFORE trigger, that's
how you can write an INSTEAD OF trigger on a TABLE.  As a VIEW does not
support INSERT (directly), an INSTEAD OF trigger is what makes sense
here.

I don't think it's possible to transpose that thinking to command
triggers, so I've been providing for either INSTEAD OF triggers or
BEFORE/AFTER triggers. Note that you can't have both with my current
patch. It's a convenience only thing, but I think it's worth having it:
you don't need to read the trigger procedure to decide if that BEFORE
command trigger is in fact an INSTEAD OF command trigger.

Also, the code footprint for this feature is very small.

> 3. I am not at all convinced that it's safe to allow command triggers
> to silently (i.e. without throwing an error) cancel the operation.  I
> don't have very much confidence that that is in general a safe thing

I was trying to offer a parallel with returning NULL from a BEFORE
INSERT trigger, which allows you to cancel it. If that turns out to be
so bad an idea that we need to withdraw it, so be it.

It's still possible to cancel a command by means of RAISE EXCEPTION in a
BEFORE command trigger, or by means of an INSTEAD OF trigger that does
nothing.

> to do; there may be code calling this code that expects that such
> things will not happen.  This diff hunk, for example, scares the crap
> out of me:
[...]
> I don't immediately understand why that's necessary, but I'm sure
> there's a good reason, and I bet a nickel that there are other places
> where similar adjustments are necessary but you haven't found them.  I

Might be.  That idea was sound to me when under the illusion that I
wouldn't have to edit each and every command implementation in order to
implement command triggers.  I'm ok to step back now, but read on.

> think we should rip out the option to silently cancel the command.  We
> can always add that later, but if we add it here and in the first
> commit I think we are going to be chasing bugs for months.

Ok, I'm going to remove that from the BEFORE command implementation.

What about having both INSTEAD OF triggers (the command is not executed)
and BEFORE trigger (you can cancel its execution only by raising an
exception, and that cancels the whole transaction).  I'd like that we'd
be able to keep that feature.

> 4. This API forces all the work of setting up the CommandContext to be
> done regardless of whether any command triggers are present:
>
> +       cmd->objectId = InvalidOid;
> +       cmd->objectname = (char *)aggName;
> +       cmd->schemaname = get_namespace_name(aggNamespace);

At some point while developing the patch I had something way smarter
than that, and centralized.  I've not revised the API to get back the
smartness when breaking out of the centralized implementation, because
those elements only can be set from the innards of each command.

The API to list which triggers to run already exists and only need the
command tag, which we might have earlier in the processing. Note that we
have to deal with commands acting on more than one object, as in
RemoveObjects() where we loop over a list and call triggers each time:
 https://github.com/dimitri/postgres/compare/master...command_triggers#diff-19

The best I can think of would be to build the list of triggers to call
as soon as we have the command tag, and skip preparing the rest of the
CommandContextData structure when that list comes empty.

Still no general stop-gap, but that would address your point I think.

Now, about INSTEAD OF command triggers, we still need to implement them
in exactly the same spot as BEFORE command triggers, and they're not
making things any more complex in the per-command support code.

> I'll grant you that most people probably do not execute enough DDL for
> the cost of those extra get_namespace_name() calls to add up, but I'm
> not completely sure it's a negligible overhead in general, and in any
> case I think it's a safe bet that there will be continuing demand to
> add more information to the set of things that are supplied to the
> trigger.  So I think that should be rethought.  It'd be nice to have a
> cheap way to say if (AnyCommandTriggersFor(command_tag)) { ... do
> stuff ... }, emphasis on cheap.  There's a definitional problem here,

It's as cheap as scanning a catalog, as of now. I didn't install a new
catalog cache for command triggers, as I don't think this code path is
performance critical enough to pay back for the maintenance cost. Also
consider that a big user of command triggers might have as much as a
couple of triggers (BEFORE/AFTER) per command, that's about 300 of them.

If they go crazy and have more than that with special conditions each
time, let's say that's 1000 rows in the catalog. I don't think that kind
of size and use cases (DDL) calls for a catalog cache here, nor for
micro optimizing things.

> too, which is that you're supplying to the trigger the aggregate name
> *as supplied by the user* and the schema name as it exists at the time
> we do the reverse lookup.  That reverse lookup isn't guaranteed to
> work at all, and it's definitely not guaranteed to produce an answer
> that's consistent with the aggName field.  Maybe there's no better
> way, but it would at least be better to pass the namespace OID rather
> than the name.  That way, the name lookup can be deferred until we are
> sure that we actually need to call something.

I'm confused here, because all error messages that needs to contain the
namespace are doing exactly the same thing as I'm doing in my patch.

> 5. I'm not entirely convinced that it makes sense to support command
> triggers on commands that affect shared objects.  It seems odd that
> such triggers will fire or not fire depending on which database is
> currently selected.  I think that could lead to user confusion, too.

You mean tablespace here, I guess, what else? I don't think I've added
other shared objects in there yet. I share your analyze btw, will
remove support.

> 6. Why do we need all this extra
> copyfuncs/equalfuncs/outfuncs/readfuncs support?  I thought we were
> dropping passing node trees for 9.2.

I didn't clean that out yet, that's the only reason why it's still
there. I would like that this work was not useless :)

> 7. I don't have a strong feeling on what the psql command should be
> called, but \dcT seems odd.  Why one lower-case letter and one
> upper-case letter?

Well \dt and \dT are already taken, so I got there. Will change to
something else, e.g. \dct would be your choice here?

> In general, I like the direction that this is going.  But I think it
> will greatly improve its chances of being successful and relatively
> non-buggy if we strip it down to something very simple for an initial
> commit, and then add more stuff piece by piece.

I had the feeling we did exactly that when reducing the API not to
provide the parse tree nor the (rewritten) command string. I can see
more features to drop off, like the ability to silently cancel a
statement in a BEFORE command trigger.

I'd like to keep the INSTEAD OF command trigger feature and I think we
can support lots of commands as soon as the integration becomes simple
enough, which is where we stand as soon as we remove the statement
cancelling.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


pgsql-hackers by date:

Previous
From: Gaetano Mendola
Date:
Subject: Re: CUDA Sorting
Next
From: Jay Levitt
Date:
Subject: Designing an extension for feature-space similarity search