Thread: Event Triggers reduced, v1

Event Triggers reduced, v1

From
Dimitri Fontaine
Date:
Hi,

Allow me to open the new season of the DML trigger series, named
pg_event_trigger. This first episode is all about setting up the drama,
so that next ones make perfect sense.

The attached patch contains all the infrastructure for event triggers
and also a first implementation of them for the event "command_start",
implemented in a single place in utility.c.

The infrastructure is about:

 - new catalog
 - grammar for new commands
 - documentation skeleton
 - pg_dump support
 - psql support
 - ability to actually run user triggers
 - written in "core languages"
    (pl/c, pl/pgsql, pl/python, pl/perl, pl/tcl)
 - limited subcommand handling

The goal for this first patch is to avoid semantics issues so that we
can get something technically clean in, and have more time to talk
semantics next times. The main discussion to avoid is deciding if we
want to fire event triggers for CREATE SEQUENCE and CREATE INDEX in a
command that just did implement a SERIAL PRIMARY KEY in a table.

This way of doing things is possible because we took time to set a road
map together with Robert when we were both in Ottawa, and because it's
early in the cycle. The complete feature still needs to happen before
9.3 is released, but any realistic progress has to be cut down.

Look, it's an easy little skinny patch to review, right:

git --no-pager diff --shortstat master
 62 files changed, 4546 insertions(+), 108 deletions(-)

This patch includes regression tests that we worked on with Thom last
rounds, remember that they only run in the serial schedule, that means
with `make installcheck` only. Adding noisy output at random while the
parallel schedule run is a good way to break all the regression testing,
so I've been avoiding that.

I don't think this patch is ready as it is, by the way, I couldn't
devote nearly enough time to have something that polished already. I
think even with setting the goal not to embrace semantics, reviewing
this patch will certainly bring some interesting design discussions.

Regards,
--
Dimitri Fontaine
PostgreSQL DBA, Architecte


Attachment

Re: Event Triggers reduced, v1

From
Simon Riggs
Date:
On 15 June 2012 21:27, Dimitri Fontaine <dfontaine@hi-media.com> wrote:

> The goal for this first patch is to avoid semantics issues so that we
> can get something technically clean in, and have more time to talk
> semantics next times. The main discussion to avoid is deciding if we
> want to fire event triggers for CREATE SEQUENCE and CREATE INDEX in a
> command that just did implement a SERIAL PRIMARY KEY in a table.

So this patch triggers once per top level command, just with
information about the set of nested events?

I'm happy if we make sweeping initial points like "don't generate
events for sequences and indexes" in the first version. We can always
add more later.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Event Triggers reduced, v1

From
Alvaro Herrera
Date:
Excerpts from Dimitri Fontaine's message of vie jun 15 16:27:50 -0400 2012:

> The attached patch contains all the infrastructure for event triggers
> and also a first implementation of them for the event "command_start",
> implemented in a single place in utility.c.
>
> The infrastructure is about:
>
>  - new catalog
>  - grammar for new commands
>  - documentation skeleton
>  - pg_dump support
>  - psql support
>  - ability to actually run user triggers
>  - written in "core languages"
>     (pl/c, pl/pgsql, pl/python, pl/perl, pl/tcl)
>  - limited subcommand handling

Did you try REASSIGN OWNED and DROP OWNED with a role that has defined
some event triggers?

> Look, it's an easy little skinny patch to review, right:
>
> git --no-pager diff --shortstat master
>  62 files changed, 4546 insertions(+), 108 deletions(-)

Skinny ... right.  I started to give it a look -- I may have something
useful to comment later.

> This patch includes regression tests that we worked on with Thom last
> rounds, remember that they only run in the serial schedule, that means
> with `make installcheck` only. Adding noisy output at random while the
> parallel schedule run is a good way to break all the regression testing,
> so I've been avoiding that.

Hmm, I don't like the idea of a test that only runs in serial mode.
Maybe we can find some way to make it work in parallel mode as well.
I don't have anything useful to comment right now.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Event Triggers reduced, v1

From
Robert Haas
Date:
On Fri, Jun 15, 2012 at 4:27 PM, Dimitri Fontaine
<dfontaine@hi-media.com> wrote:
> Allow me to open the new season of the DML trigger series, named
> pg_event_trigger. This first episode is all about setting up the drama,
> so that next ones make perfect sense.

Comments:

1. I still think we ought to get rid of the notion of BEFORE or AFTER
(i.e. pg_event_trigger.evttype) and just make that detail part of the
event name (e.g. pg_event_trigger.evtevent).  Many easily forseeable
event types will be more like "during" rather than "before" or
"after", and for those that do have a notion of before and after, we
can have two different event names and include the word "before" or
"after" there.  I am otherwise satisfied with the schema you've
chosen.

2. I think it's important to be able to add new types of event
triggers without creating excessive parser bloat.  I think it's
important to use some kind of generic syntax here which will be able
to apply to all types of triggers we may want to add, now or in the
future.  The easiest way to do that is to use literal syntax for the
list of command tags, rather than writing them out as key words: e.g.
'ALTER TABLE' rather than ALTER TABLE.  It's not quite as pretty, but
the savings in parser bloat and future code change seem well worth it.Or, alternatively, we could use identifier style,
e.g.alter_table,
 
as I previously suggested.

3. The event trigger cache seems to be a few bricks shy of a load.
First, event_trigger_cache_is_stalled is mis-named; I think you mean
"stale", not "stalled".  Second, instead of setting that flag and then
rebuilding the cache when you see the flag set, how about just blowing
away the cache contents whenever you would have set the flag?  That
seems a whole lot simpler and cleaner, and removes the need for a
force_rebuild flag on BuildEventTriggerCache().  Third, ISTM that this
isn't going to work correctly if backend A performs an event after
backend B has built its cache.  To fix this, I think you need to rip
out all the places where you force a rebuild locally and instead use
something like CacheRegisterSyscacheCallback() to blow away the cache
whenever something changes; you might find it helpful to look at
attoptcache.c.

4. The documentation doesn't build.

openjade:reference.sgml:44:4:W: cannot generate system identifier for
general entity "alterEventTrigger"
openjade:reference.sgml:44:4:E: general entity "alterEventTrigger" not
defined and no default entity
openjade:reference.sgml:44:21:E: reference to entity
"alterEventTrigger" for which no system identifier could be generated
openjade:reference.sgml:44:3: entity was defined here
openjade:reference.sgml:86:4:W: cannot generate system identifier for
general entity "createEventTrigger"
openjade:reference.sgml:86:4:E: general entity "createEventTrigger"
not defined and no default entity
openjade:reference.sgml:86:22:E: reference to entity
"createEventTrigger" for which no system identifier could be generated
openjade:reference.sgml:86:3: entity was defined here
openjade:reference.sgml:125:4:W: cannot generate system identifier for
general entity "dropEventTrigger"
openjade:reference.sgml:125:4:E: general entity "dropEventTrigger" not
defined and no default entity
openjade:reference.sgml:125:20:E: reference to entity
"dropEventTrigger" for which no system identifier could be generated
openjade:reference.sgml:125:3: entity was defined here
openjade:catalogs.sgml:1868:35:E: character "_" is not allowed in the
value of attribute "ZONE"
openjade:catalogs.sgml:1868:19:X: reference to non-existent ID
"CATALOG-PG-EVENT_TRIGGER"
openjade:trigger.sgml:43:47:X: reference to non-existent ID
"SQL-CREATECOMMANDTRIGGER"

5. In terms of a psql command, I think that \dev is both not very
mnemonic and, as you mentioned in the comment, possibly confusable
with SQL/MED.  If we're going to have a \d command for this, I suggest
something like \dy, which is not very mnemonic either but at least
seems unlikely to be confused with anything else.  Other things that
are unused include \dh, \dj, \dk, \dq, \dw, and \dz, if any of those
grab you, or a somewhat broader range of things (but still nothing
very memorable) if we include capital letters.  Or we could branch out
into punctuation, like \d& -- or things that don't begin with the
letter d, but that doesn't seem like a particularly good idea.

6. In objectaddress.c, I think that get_object_address_event_trigger
should be eliminated in favor of an additional case in
get_object_address_unqualified.

7. There are many references to command triggers that still need to be
cleaned up.  trigger.sgml still makes reference to the name command
triggers.  plpgsql.sgml also contains vestiges of the command trigger
notation, and references to some TG_* variables that I don't think
exist in this version of the patch.  event_trigger.c is identified
(twice) as cmdtrigger.c in the file header comment.  The header
comment for InsertEventTriggerTuple makes reference to pg_cmdtrigger,
as does a line comment in RemoveEventTriggerById.  The regression
output mentions cmdtrigger in a few places as well.  In the
documentation, event triggers are mentioned as having return type
command_trigger, but it's now called event_trigger.

8. The re-addition of node tags like T_RemoveFuncStmt seems to be a
merging error.  Changing \dc so that it rejects \dcrap appears to be a
leftover from when the command was \dcT. In one place in the docs you
have 'avent' for 'event'.  In event_trigger.c, you have #ifdef
UNDEFINED; project standard is #ifdef NOT_USED (or else you can remove
the code).

9. The regression tests seem to now be testing some features that
don't exist any more, and might need some rethinking to make what they
do match the scope of this patch.

10. I suggest separating out the support for other PLs (Python, Tcl)
and submitting that as a later patch, since I'm unqualified to commit
it (and I'm hoping to get the rest of this committed).  The PL/TCL
stuff also contains residual references to the command-trigger
notation which should be cleaned up before resubmitting.

There's probably more, but I'm all reviewed out for right now.
Hopefully that's enough to get you started.  I think this is heading
in a good direction, even though there's still a good bit of work left
to do.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Event Triggers reduced, v1

From
Dimitri Fontaine
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Did you try REASSIGN OWNED and DROP OWNED with a role that has defined
> some event triggers?

Not yet. Will add to regression tests, good catch.

> Hmm, I don't like the idea of a test that only runs in serial mode.
> Maybe we can find some way to make it work in parallel mode as well.

I don't see how we can manage to do that, as adding an event trigger to
some command will impact tests running in parallel.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Event Triggers reduced, v1

From
Andres Freund
Date:
On Wednesday, June 20, 2012 09:33:26 PM Dimitri Fontaine wrote:
> > Hmm, I don't like the idea of a test that only runs in serial mode.
> > Maybe we can find some way to make it work in parallel mode as well.
> 
> I don't see how we can manage to do that, as adding an event trigger to
> some command will impact tests running in parallel.
Cant you just put it alone in a test group in the parallel_schedule? Several 
other tests do that?

Andres
-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services


Re: Event Triggers reduced, v1

From
Robert Haas
Date:
On Wed, Jun 20, 2012 at 3:39 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> On Wednesday, June 20, 2012 09:33:26 PM Dimitri Fontaine wrote:
>> > Hmm, I don't like the idea of a test that only runs in serial mode.
>> > Maybe we can find some way to make it work in parallel mode as well.
>>
>> I don't see how we can manage to do that, as adding an event trigger to
>> some command will impact tests running in parallel.
> Cant you just put it alone in a test group in the parallel_schedule? Several
> other tests do that?

Yeah, that seems like it should work.  If not, I'd say the tests
themselves are broken.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Event Triggers reduced, v1

From
Dimitri Fontaine
Date:
Hi,

Robert Haas <robertmhaas@gmail.com> writes:
> 1. I still think we ought to get rid of the notion of BEFORE or AFTER
> (i.e. pg_event_trigger.evttype) and just make that detail part of the
> event name (e.g. pg_event_trigger.evtevent).  Many easily forseeable
> event types will be more like "during" rather than "before" or
> "after", and for those that do have a notion of before and after, we
> can have two different event names and include the word "before" or
> "after" there.  I am otherwise satisfied with the schema you've
> chosen.

It's not before/after anymore, but rather addon/replace if you will. I
kept the INSTEAD OF keyword for the replace semantics, that you've been
asking me to keep IIRC, with security policy plugins as a use case.

Now we can of course keep those semantics and embed them in the event
name we provide users, I though that maybe a documentation matrix of
which event support which "mode" would be cleaner to document. We might
as well find a clean way to implement both modes for most of the
commands, I don't know yet.

So, are you sure you want to embed that part of the event trigger
semantics in the event name itself?

> 2. I think it's important to be able to add new types of event
> triggers without creating excessive parser bloat.  I think it's

I've been trying to do that yes, as you can see with event_name and
event_trigger_variable rules. I've been re-using as much existing
keywords as I could because I believe that's not causing any measurable
bloat, I'll kindly reconsider if necessary, even if sadly.

> important to use some kind of generic syntax here which will be able
> to apply to all types of triggers we may want to add, now or in the
> future.  The easiest way to do that is to use literal syntax for the
> list of command tags, rather than writing them out as key words: e.g.
> 'ALTER TABLE' rather than ALTER TABLE.  It's not quite as pretty, but
> the savings in parser bloat and future code change seem well worth it.
>  Or, alternatively, we could use identifier style, e.g. alter_table,
> as I previously suggested.

Whatever the solution here, we still need to be able to ERROR out very
early if the user entered a non supported command here, such as GRANT
for the time being. I'm not sure a manual list of strcmp() would be more
effective than having bison basically produce the same thing for us. I
think using the grammar here makes for easier maintenance.

> 3. The event trigger cache seems to be a few bricks shy of a load.

I wouldn't be that surprised, mind you. I didn't have nearly as much
time I wanted to working on that project.

> First, event_trigger_cache_is_stalled is mis-named; I think you mean
> "stale", not "stalled".  Second, instead of setting that flag and then

Stale. Right. Edited.

> rebuilding the cache when you see the flag set, how about just blowing
> away the cache contents whenever you would have set the flag?  That

I've been doing that at first, but that meant several full rebuilds in a
row in the regression tests, which are adding new event triggers then
using them. I though lazily maintaining the cache would be better.

> seems a whole lot simpler and cleaner, and removes the need for a
> force_rebuild flag on BuildEventTriggerCache().  Third, ISTM that this
> isn't going to work correctly if backend A performs an event after
> backend B has built its cache.  To fix this, I think you need to rip
> out all the places where you force a rebuild locally and instead use
> something like CacheRegisterSyscacheCallback() to blow away the cache
> whenever something changes; you might find it helpful to look at
> attoptcache.c.

Ok, looking at that for next revision of the patch, which I should be
able to post early next week.

> 4. The documentation doesn't build.

Sorry about that, will get fixed too.

> 5. In terms of a psql command, I think that \dev is both not very
> mnemonic and, as you mentioned in the comment, possibly confusable
> with SQL/MED.  If we're going to have a \d command for this, I suggest
> something like \dy, which is not very mnemonic either but at least
> seems unlikely to be confused with anything else.  Other things that
> are unused include \dh, \dj, \dk, \dq, \dw, and \dz, if any of those
> grab you, or a somewhat broader range of things (but still nothing
> very memorable) if we include capital letters.  Or we could branch out
> into punctuation, like \d& -- or things that don't begin with the
> letter d, but that doesn't seem like a particularly good idea.

I'm not that fond of psql commands, but I don't think it's going to fly
not to have one for event triggers. I could buy \dy.

> 6. In objectaddress.c, I think that get_object_address_event_trigger
> should be eliminated in favor of an additional case in
> get_object_address_unqualified.

Sure. It used to be more complex than that when the identifier was a
composite with the command name, it makes no sense to separate it away
now. Done.

> 7. There are many references to command triggers that still need to be
> cleaned up.  trigger.sgml still makes reference to the name command
> triggers.  plpgsql.sgml also contains vestiges of the command trigger
> notation, and references to some TG_* variables that I don't think
> exist in this version of the patch.  event_trigger.c is identified
> (twice) as cmdtrigger.c in the file header comment.  The header
> comment for InsertEventTriggerTuple makes reference to pg_cmdtrigger,
> as does a line comment in RemoveEventTriggerById.  The regression
> output mentions cmdtrigger in a few places as well.  In the
> documentation, event triggers are mentioned as having return type
> command_trigger, but it's now called event_trigger.

All fixed, will grep for cmd and command in the patch and fix any other
one that's still there before sending v2 of the patch. Sorry about that.

> 8. The re-addition of node tags like T_RemoveFuncStmt seems to be a
> merging error.  Changing \dc so that it rejects \dcrap appears to be a
> leftover from when the command was \dcT. In one place in the docs you
> have 'avent' for 'event'.  In event_trigger.c, you have #ifdef
> UNDEFINED; project standard is #ifdef NOT_USED (or else you can remove
> the code).

Will see about node tags and psql clean merge. Docs fixed. I meant to
remove the code. Done now. Thanks.

> 9. The regression tests seem to now be testing some features that
> don't exist any more, and might need some rethinking to make what they
> do match the scope of this patch.

The current implementation must be kicking for all supported commands
and we have a authoritative list of them in the grammar, so I wanted to
maintain a regression test suite where they are all exercised, even if
we're exercising the same code path each time.

That's meant to change with later patch, I'm not sure how much gain we
have to remove test covering now knowing that we certainly won't release
with only that first patch.

> 10. I suggest separating out the support for other PLs (Python, Tcl)
> and submitting that as a later patch, since I'm unqualified to commit
> it (and I'm hoping to get the rest of this committed).  The PL/TCL
> stuff also contains residual references to the command-trigger
> notation which should be cleaned up before resubmitting.

Fixed pltcl internal references. Will produce separate patches for next
revision.

> There's probably more, but I'm all reviewed out for right now.
> Hopefully that's enough to get you started.  I think this is heading
> in a good direction, even though there's still a good bit of work left
> to do.

Thanks for your review, it's clearly enough to get started chewing on
the patch!

Early followers can see the progress, when it happens, in the github
repository, if waiting for the next patch is unbearably long :)
 https://github.com/dimitri/postgres https://github.com/dimitri/postgres/tree/evt_trig_v1

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Event Triggers reduced, v1

From
Dimitri Fontaine
Date:
Robert Haas <robertmhaas@gmail.com> writes:
>> Cant you just put it alone in a test group in the parallel_schedule? Several
>> other tests do that?
>
> Yeah, that seems like it should work.  If not, I'd say the tests
> themselves are broken.

I completely missed that we could do that. I don't feel bright. Of
course it just works.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Event Triggers reduced, v1

From
Robert Haas
Date:
On Wed, Jun 20, 2012 at 4:36 PM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> 1. I still think we ought to get rid of the notion of BEFORE or AFTER
>> (i.e. pg_event_trigger.evttype) and just make that detail part of the
>> event name (e.g. pg_event_trigger.evtevent).  Many easily forseeable
>> event types will be more like "during" rather than "before" or
>> "after", and for those that do have a notion of before and after, we
>> can have two different event names and include the word "before" or
>> "after" there.  I am otherwise satisfied with the schema you've
>> chosen.
>
> It's not before/after anymore, but rather addon/replace if you will. I
> kept the INSTEAD OF keyword for the replace semantics, that you've been
> asking me to keep IIRC, with security policy plugins as a use case.
>
> Now we can of course keep those semantics and embed them in the event
> name we provide users, I though that maybe a documentation matrix of
> which event support which "mode" would be cleaner to document. We might
> as well find a clean way to implement both modes for most of the
> commands, I don't know yet.
>
> So, are you sure you want to embed that part of the event trigger
> semantics in the event name itself?

Yeah, pretty sure.  I think that for regular triggers, BEFORE, AFTER,
and INSTEAD-OF are the firing-point specification.   But even triggers
will have more than three firing points, probably eventually quite a
lot more.  So we need something more flexible.  But we don't need that
more flexible thing AND ALSO the before/after/instead-of
specification, which I think in most cases won't be meaningful anyway.It happens to be somewhat sensible for this
initialfiring point, but 
I think for most of them there will be just one place, and in many
cases it will be neither before, nor after, nor instead-of.

>> 2. I think it's important to be able to add new types of event
>> triggers without creating excessive parser bloat.  I think it's
>
> I've been trying to do that yes, as you can see with event_name and
> event_trigger_variable rules. I've been re-using as much existing
> keywords as I could because I believe that's not causing any measurable
> bloat, I'll kindly reconsider if necessary, even if sadly.

The issue is that the size of the parser tables grow with the square
of the number of states.  This will introduce lots of new states that
we don't really need; and every new kind of event trigger that we want
to add will introduce more.

>> 3. The event trigger cache seems to be a few bricks shy of a load.
>
> I wouldn't be that surprised, mind you. I didn't have nearly as much
> time I wanted to working on that project.
>
>> First, event_trigger_cache_is_stalled is mis-named; I think you mean
>> "stale", not "stalled".  Second, instead of setting that flag and then
>
> Stale. Right. Edited.
>
>> rebuilding the cache when you see the flag set, how about just blowing
>> away the cache contents whenever you would have set the flag?  That
>
> I've been doing that at first, but that meant several full rebuilds in a
> row in the regression tests, which are adding new event triggers then
> using them. I though lazily maintaining the cache would be better.

Well, AFAICS, you're still doing full rebuilds whenever something
changes; you're just keeping the (useless, dead) cache around until
you decide to rebuild it.  Might as well free the memory once you know
that the next access will rebuild it anyway, and for a bonus it saves
you a flag.

> I'm not that fond of psql commands, but I don't think it's going to fly
> not to have one for event triggers. I could buy \dy.

Yeah, I think people are going to want to have one.  I really despise
the \d<whatever> syntax, but it's not 100% clear what a better one
would look like.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Event Triggers reduced, v1

From
Dimitri Fontaine
Date:
Robert Haas <robertmhaas@gmail.com> writes:
>> It's not before/after anymore, but rather addon/replace if you will. I
>> kept the INSTEAD OF keyword for the replace semantics, that you've been
>> asking me to keep IIRC, with security policy plugins as a use case.
>>
>> Now we can of course keep those semantics and embed them in the event
>> name we provide users, I though that maybe a documentation matrix of
>> which event support which "mode" would be cleaner to document. We might
>> as well find a clean way to implement both modes for most of the
>> commands, I don't know yet.
>>
>> So, are you sure you want to embed that part of the event trigger
>> semantics in the event name itself?
>
> Yeah, pretty sure.  I think that for regular triggers, BEFORE, AFTER,
> and INSTEAD-OF are the firing-point specification.   But even triggers
> will have more than three firing points, probably eventually quite a
> lot more.  So we need something more flexible.  But we don't need that
> more flexible thing AND ALSO the before/after/instead-of
> specification, which I think in most cases won't be meaningful anyway.
>  It happens to be somewhat sensible for this initial firing point, but
> I think for most of them there will be just one place, and in many
> cases it will be neither before, nor after, nor instead-of.

I agree with using the event name as a the specification for the firing
point, and that we should prefer documenting the ordering of those
rather than offering a fuzzy idea of BEFORE and AFTER steps in there.
The AFTER step is better expressed as BEFORE the next one.

Now, I still think there's an important discrepancy between adding a new
behaviour that adds-up to whatever the backend currently implements and
providing a replacement behaviour with a user defined function that gets
called instead of the backend code.

And I still don't think that the event name should be carrying alone
that semantic discrepancy. Now, I also want the patch to get in, so I
won't insist very much if I'm alone in that position. Anyone else
interested enough to chime in?

The user visible difference would be between those variants:
 create event trigger foo at 'before_security_check' ... create event trigger foo at 'replace_security_check' ...
 create event trigger foo before 'security_check' ... create event trigger foo instead of 'security_check' ...

Note that in this version the INSTEAD OF variant is not supported, we
only intend to offer it in some very narrow cases, or at least that is
my understanding.

> The issue is that the size of the parser tables grow with the square
> of the number of states.  This will introduce lots of new states that
> we don't really need; and every new kind of event trigger that we want
> to add will introduce more.

It's a little sad not being able to reuse command tag keywords, but it's
even more sad to impact the rest of the query parsing. IIRC you had some
performance test patch with a split of the main parser into queries and
dml on the one hand, and utility commands on the other hand. Would that
help here? (I mean more as a general solution against that bloat problem
than for this very patch here).

I prefer the solution of using 'ALTER TABLE' rather than ALTER TABLE,
even if code wise we're not gaining anything in complexity: the parser
bloat gets replaced by a big series of if branches. Of course you only
exercise it when you need to. I will change that for next patch.

>>> 3. The event trigger cache seems to be a few bricks shy of a load.
>
> Well, AFAICS, you're still doing full rebuilds whenever something
> changes; you're just keeping the (useless, dead) cache around until
> you decide to rebuild it.  Might as well free the memory once you know
> that the next access will rebuild it anyway, and for a bonus it saves
> you a flag.

I'm just done rewriting the cache management with a catalog cache for
event triggers and a Syscache Callback that calls into a new module
called src/backend/utils/cache/evtcache.c that mimics attoptcache.c. No
more cache stale variable. And a proper composite hash key.

I still have some more work here before being able to send a new release
of the patch, as I said I won't have enough time to make that happen
until within next week. The git repository is updated, though.
 https://github.com/dimitri/postgres/tree/evt_trig_v1
https://github.com/dimitri/postgres/compare/913091de51...861eb038d0

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Event Triggers reduced, v1

From
Dimitri Fontaine
Date:
Hi,

Here's an early revision 2 of the patch, not yet ready for commit, so
including the PL stuff still. What's missing is mainly a cache reference
leak to fix at DROP EVENT TRIGGER, but I failed to spot where it comes
from.

As I fixed about all the other comments I though I might as well send in
the new version of the patch to get to another round of review.

Robert Haas <robertmhaas@gmail.com> writes:
> 1. I still think we ought to get rid of the notion of BEFORE or AFTER
> (i.e. pg_event_trigger.evttype) and just make that detail part of the
> event name (e.g. pg_event_trigger.evtevent).  Many easily forseeable

So, agreed on before/after, not on INSTEAD OF. No change in the patch,
still discussing that point.

> 2. I think it's important to be able to add new types of event
> triggers without creating excessive parser bloat.  I think it's

Fixed in the attached, I believe.

> 3. The event trigger cache seems to be a few bricks shy of a load.

Fixed in the attached, including cache invalidation registered as a
system cache callback.

> 4. The documentation doesn't build.

Fixed, I mainly managed to forget adding the new files.

> 5. In terms of a psql command, I think that \dev is both not very

Switched to \dy and cleaned up.

> 6. In objectaddress.c, I think that get_object_address_event_trigger
> should be eliminated in favor of an additional case in
> get_object_address_unqualified.

Fixed in the attached.

> 7. There are many references to command triggers that still need to be
> cleaned up.

All fixed.

> 8. The re-addition of node tags like T_RemoveFuncStmt seems to be a
> merging error.  Changing \dc so that it rejects \dcrap appears to be a
> leftover from when the command was \dcT. In one place in the docs you
> have 'avent' for 'event'.  In event_trigger.c, you have #ifdef
> UNDEFINED; project standard is #ifdef NOT_USED (or else you can remove
> the code).

All fixed.

> 9. The regression tests seem to now be testing some features that
> don't exist any more, and might need some rethinking to make what they
> do match the scope of this patch.

Actually those tests helped me spot some strange things when cleaning up
the cache key, and only some commands would fail. So I'm in favor of
keeping it all for now.

> 10. I suggest separating out the support for other PLs (Python, Tcl)
> and submitting that as a later patch, since I'm unqualified to commit
> it (and I'm hoping to get the rest of this committed).  The PL/TCL
> stuff also contains residual references to the command-trigger
> notation which should be cleaned up before resubmitting.

That's for next turn.

> There's probably more, but I'm all reviewed out for right now.
> Hopefully that's enough to get you started.  I think this is heading
> in a good direction, even though there's still a good bit of work left
> to do.

So, let's see about that remaining bit of work :)

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Attachment

Re: Event Triggers reduced, v1

From
Robert Haas
Date:
On Sun, Jun 24, 2012 at 5:46 PM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:
> Here's an early revision 2 of the patch, not yet ready for commit, so
> including the PL stuff still. What's missing is mainly a cache reference
> leak to fix at DROP EVENT TRIGGER, but I failed to spot where it comes
> from.
>
> As I fixed about all the other comments I though I might as well send in
> the new version of the patch to get to another round of review.

Some of the pg_dump hunks fail to apply for me; I guess this needs to
be remerged.

Spurious hunk:

-    query_hosts
+    query_hosts

Spurious hunk:

- * need deep copies, so they should be copied via list_copy()
+ * need deep copies, so they should be copied via list_copy(const )

There are a few remaining references to EVTG_FIRED_BEFORE /
EVTG_FIRED_INSTEAD_OF which should be cleaned up.  I suggest writing
the grammar as CREATE EVENT TRIGGER name ON event_name EXECUTE...  On
a related note, evttype is still mentioned in the documentation, also.And CreateEventTrigStmt has a char timing field
thatcan go.
 

Incidentally, why do we not support an argument list here, as with
ordinary triggers?

I think the below hunk should get removed.  Or else it should be
surrounded by #ifdef rather than commented out.

+       /*
+        * Useful to raise WARNINGs for any DDL command not yet supported.
+        *
+       elog(WARNING, "Command Tag:    %s", tag);
+       elog(WARNING, "Note to String: %s", nodeToString(parsetree));
+        */

Typo:

+ * where we probide object name and namespace separately and still want nice

Typo:

+ * the easier code makes up fot it big time.

psql is now very confused about what the slash command is.  It's
actually implemented as \dy, but the comments say \dev in several
places, and slashUsage() documents it as \dct.

InitializeEvtTriggerCommandCache still needs work.  First, it ensures
that CacheMemoryContext is non-NULL... after it's already stored the
value of CacheMemoryContext into the HASHCTL.  Obviously, the order
there needs to be fixed.  Also, I think you need to split this into
two functions.  There should be one function that gets called just
once at startup time to CacheRegisterSyscacheCallback(), because we
don't want to redo that every time the cache gets blown away.  Then
there should be another function that gets called when, and only when,
someone needs to use the cache.  That should create and populate the
hash table.

I think that all event triggers should fire in exactly alphabetical
order by name.  Now that we have the concept of multiple firing
points, there's really no reason to distinguish between any triggers
and triggers on specific commands.  Eliminating that distinction will
eliminate a bunch of complexity while making things *more* flexible
for the user, who will be able to get a command trigger for a specific
command to fire either before or after an ANY command trigger he has
also defined rather than having the system enforce policy on him.

plperl_validator still makes reference to CMDTRIGGER.

Let's simplify this down to an enum with just one element, since
that's all we're going to support for starters, and remove the related
parser support for everything but command_start:

+typedef enum TrigEvent
+{
+       E_CommandStart       = 1,
+       E_SecurityCheck      = 10,
+       E_ConsistencyCheck   = 15,
+       E_NameLookup         = 20,
+       E_CommandEnd         = 51
+} TrigEvent;

I think we should similarly rip out the vestigial support for
supplying schema name, object name, and object ID.  That's not going
to be possible at command_start anyway; we can include that stuff in
the patch that adds a later firing point (command end, or consistency
check, perhaps).

I think standard_ProcessUtility should have a shortcut that bypasses
setting up the event context if there are no event triggers at all in
the entire system, so that we do no extra work unless there's some
potential benefit.

It seems to me that we probably need a CommandCounterIncrement() after
firing each event trigger, unless that's happening under the covers
somewhere and I'm missing it.  A good test case would be to have two
event triggers.  Have the first one insert a row into the table and
check that the second one can see the row there.  I suggest adding
something like this to the regression tests.

Instead of having giant switch statements match E_WhatEver tags to
strings and visca versa, I think it would be much better to create an
array someplace that contains all the mappings.  Then you can write a
convenience function that scans the array for a string and returns the
E_WhatEver tag, and another convenience function that scans the array
for an E_WhatEver tag and returns the corresponding string.  Then all
the knowledge of how those things map onto each other is in ONE place,
which should make things a whole lot easier in terms of future code
maintenance, not to mention minimizing the chances of bugs of
oversight in the patch as it stands.  :-)

The comment changes in type_sanity.sql seem unnecessary.  I think you
can revert that part.  There's also a one-line whitespace change in
triggers.sql which can also be removed.

With respect to this hunk, it seems to me that the verbiage is
inaccurate.  It will forbid the execution of any command for which
event triggers are supported, whether they happen to be DDL commands
or not.  Also, a hypothetical DDL command that can't tolerate event
triggers wouldn't be forbidden.  I'm not sure exactly what the wording
should look like here, but we should strive to be as exact as
possible.

+  <para>
+   Forbids the execution of any DDL command:
+
+<programlisting>
+CREATE OR REPLACE FUNCTION abort_any_command()
+  RETURNS event_trigger
+ LANGUAGE plpgsql
+  AS $$
+BEGIN
+  RAISE EXCEPTION 'command % is disabled', tg_tag;
+END;
+$$;

format_type_be_without_namespace is unused in this patch; please
remove it for now.

get_event_trigger_oid looks like it might be the source of your
syscache reference leak.  It would be a good idea to change the coding
pattern of this function to match, say, get_foreign_server_oid.  That
would fix the leak and be more consistent.

I'm all reviewed out; hope that's enough for now.  I hope you can get
this cleaned up some more soon, as we are starting to run out of
CommitFest and I would really like to get this in.  Of course if we
miss the CommitFest deadline I am happy to work on it in July and
August but the sooner we get it done the better.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Event Triggers reduced, v1

From
Robert Haas
Date:
On Thu, Jun 28, 2012 at 9:46 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> [ review ]

Also:

../../../src/include/catalog/pg_event_trigger.h:34: error: expected
specifier-qualifier-list before ‘int2’

This needs to be changed to int16 as a result of commit
b8b2e3b2deeaab19715af063fc009b7c230b2336.

alter.c:73: warning: implicit declaration of function ‘RenameEventTrigger’

That file needs to include commands/event_trigger.h.

Please spell out DO_EVTTRIGGER as DO_EVENT_TRIGGER.

Another random warning:

plpy_main.c:195: warning: ‘retval’ may be used uninitialized in this function

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Event Triggers reduced, v1

From
Dimitri Fontaine
Date:
Hi,

Here's an answer to your review (thanks!), with no patch attached yet
even if I've been cleanup up most of what you reported. Incremental diff
is available for browsing here:
 https://github.com/dimitri/postgres/compare/f99e8d93b7...8da156dc70

Robert Haas <robertmhaas@gmail.com> writes:
> Some of the pg_dump hunks fail to apply for me; I guess this needs to
> be remerged.

Done.

> There are a few remaining references to EVTG_FIRED_BEFORE /
> EVTG_FIRED_INSTEAD_OF which should be cleaned up.  I suggest writing
> the grammar as CREATE EVENT TRIGGER name ON event_name EXECUTE...  On
> a related note, evttype is still mentioned in the documentation, also.
>  And CreateEventTrigStmt has a char timing field that can go.

I didn't get the memo that we had made a decision here :)  That said it
will be possible to change our mind and introduce that instead of syntax
if that's necessary later in the cycle, so I'll go clean up for the
first commit.

> Incidentally, why do we not support an argument list here, as with
> ordinary triggers?

Left for a follow-up patch. Do you want it already in this one?

> I think the below hunk should get removed.  Or else it should be
> surrounded by #ifdef rather than commented out.

Removed.

> Typo:

Fixed.

> psql is now very confused about what the slash command is.  It's
> actually implemented as \dy, but the comments say \dev in several
> places, and slashUsage() documents it as \dct.

Fixed.

> InitializeEvtTriggerCommandCache still needs work.  First, it ensures
> that CacheMemoryContext is non-NULL... after it's already stored the
> value of CacheMemoryContext into the HASHCTL.  Obviously, the order
> there needs to be fixed.  Also, I think you need to split this into
> two functions.  There should be one function that gets called just
> once at startup time to CacheRegisterSyscacheCallback(), because we
> don't want to redo that every time the cache gets blown away.  Then
> there should be another function that gets called when, and only when,
> someone needs to use the cache.  That should create and populate the
> hash table.

CacheMemoryContext ordering fixed, I wrongly copied from attoptcache
here even when the memory management is not really done the same. The
only place I can see where to init the Event Trigger Cache is in
InitPostgres() itself (in src/backend/utils/init/postinit.c), done now.

> I think that all event triggers should fire in exactly alphabetical
> order by name.  Now that we have the concept of multiple firing
> points, there's really no reason to distinguish between any triggers
> and triggers on specific commands.  Eliminating that distinction will
> eliminate a bunch of complexity while making things *more* flexible
> for the user, who will be able to get a command trigger for a specific
> command to fire either before or after an ANY command trigger he has
> also defined rather than having the system enforce policy on him.

Internally we still need to keep the distinction to be able to fire ANY
triggers on otherwise non supported commands. I agree that we should not
concern our users with that, though. Done.

> plperl_validator still makes reference to CMDTRIGGER.

In a comment, fixed.

> Let's simplify this down to an enum with just one element, since
> that's all we're going to support for starters, and remove the related
> parser support for everything but command_start:
>
> +typedef enum TrigEvent
> +{
> +       E_CommandStart       = 1,
> +       E_SecurityCheck      = 10,
> +       E_ConsistencyCheck   = 15,
> +       E_NameLookup         = 20,
> +       E_CommandEnd         = 51
> +} TrigEvent;

I wanted to see where the current choice would lead us, I agree that we
can remove that level of details for now, that also allows not to pick
next event integration point names already :) Done.

> I think we should similarly rip out the vestigial support for
> supplying schema name, object name, and object ID.  That's not going
> to be possible at command_start anyway; we can include that stuff in
> the patch that adds a later firing point (command end, or consistency
> check, perhaps).

We got this part of the API fixed last round, so I would prefer not to
dumb it down in this first patch. We know that we want to add exactly
that specification later, don't we?

> I think standard_ProcessUtility should have a shortcut that bypasses
> setting up the event context if there are no event triggers at all in
> the entire system, so that we do no extra work unless there's some
> potential benefit.

Setting up the event context is a very lightweight operation, and
there's no way to know if the command is going to fire an event trigger
without having done exactly what the InitEventContext() is doing. Maybe
what we need to do here is rename it?

Another problem with short cutting it aggressively is what happens if a
new event trigger is created while the command is in flight. We have yet
to discuss about that (as we only support a single timing point), but
doing it the way you propose forecloses any other choice than
"repeatable read" equivalent where we might want to have some "read
commited" behaviour, that is fire the new triggers if they appear while
the command is being run.

> It seems to me that we probably need a CommandCounterIncrement() after
> firing each event trigger, unless that's happening under the covers
> somewhere and I'm missing it.  A good test case would be to have two
> event triggers.  Have the first one insert a row into the table and
> check that the second one can see the row there.  I suggest adding
> something like this to the regression tests.

Added CommandCounterIncrement() and a new regression test. That fails
for now, I'll have to get back to that later.

> Instead of having giant switch statements match E_WhatEver tags to
> strings and visca versa, I think it would be much better to create an
> array someplace that contains all the mappings.  Then you can write a
> convenience function that scans the array for a string and returns the
> E_WhatEver tag, and another convenience function that scans the array
> for an E_WhatEver tag and returns the corresponding string.  Then all
> the knowledge of how those things map onto each other is in ONE place,
> which should make things a whole lot easier in terms of future code
> maintenance, not to mention minimizing the chances of bugs of
> oversight in the patch as it stands.  :-)

That means that the Enum definition can not jump from a number to
another non consecutive one, or that we have a very sparse array and
some way to fill it unknown to me. As those numbers are going to end up
on disk, we can not ever change them. I though it would be better to
mimic what we do with the NodeTag definition here.

> The comment changes in type_sanity.sql seem unnecessary.  I think you
> can revert that part.  There's also a one-line whitespace change in
> triggers.sql which can also be removed.

Done.

> With respect to this hunk, it seems to me that the verbiage is
> inaccurate.  It will forbid the execution of any command for which
> event triggers are supported, whether they happen to be DDL commands
> or not.  Also, a hypothetical DDL command that can't tolerate event
> triggers wouldn't be forbidden.  I'm not sure exactly what the wording
> should look like here, but we should strive to be as exact as
> possible.
>
> +  <para>
> +   Forbids the execution of any DDL command:
> +
> +<programlisting>
> +CREATE OR REPLACE FUNCTION abort_any_command()
> +  RETURNS event_trigger
> + LANGUAGE plpgsql
> +  AS $$
> +BEGIN
> +  RAISE EXCEPTION 'command % is disabled', tg_tag;
> +END;
> +$$;

Yeah, will rework that text. Not this late though, needs more brainpower.

> format_type_be_without_namespace is unused in this patch; please
> remove it for now.

Done.

> get_event_trigger_oid looks like it might be the source of your
> syscache reference leak.  It would be a good idea to change the coding
> pattern of this function to match, say, get_foreign_server_oid.  That
> would fix the leak and be more consistent.

Fixed meanwhile, that was it, thanks.

> I'm all reviewed out; hope that's enough for now.  I hope you can get
> this cleaned up some more soon, as we are starting to run out of
> CommitFest and I would really like to get this in.  Of course if we
> miss the CommitFest deadline I am happy to work on it in July and
> August but the sooner we get it done the better.

So, I've begun working on this already, and I intend to spend more time
on PostgreSQL development from next week on.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Event Triggers reduced, v1

From
Dimitri Fontaine
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> ../../../src/include/catalog/pg_event_trigger.h:34: error: expected
> specifier-qualifier-list before ‘int2’
>
> This needs to be changed to int16 as a result of commit
> b8b2e3b2deeaab19715af063fc009b7c230b2336.

Done as part of the previous work.

> alter.c:73: warning: implicit declaration of function ‘RenameEventTrigger’

Done now.

> Please spell out DO_EVTTRIGGER as DO_EVENT_TRIGGER.

Also done as part of the previous email.

> Another random warning:
> plpy_main.c:195: warning: ‘retval’ may be used uninitialized in this function

Will do a whole warning check pass later. Can you give me your local
Makefile trick to turn them into hard errors again please? :)

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Event Triggers reduced, v1

From
Dimitri Fontaine
Date:
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
>   https://github.com/dimitri/postgres/compare/f99e8d93b7...8da156dc70

The revised incremental diff is here:

  https://github.com/dimitri/postgres/compare/f99e8d93b7...74bbeda8

And a new revision of the patch is attached to this email. We have some
pending questions, depending on the answers it could be ready for
commit.

> Robert Haas <robertmhaas@gmail.com> writes:
>> There are a few remaining references to EVTG_FIRED_BEFORE /
>> EVTG_FIRED_INSTEAD_OF which should be cleaned up.  I suggest writing
>> the grammar as CREATE EVENT TRIGGER name ON event_name EXECUTE...  On
>> a related note, evttype is still mentioned in the documentation, also.
>>  And CreateEventTrigStmt has a char timing field that can go.
>
> I didn't get the memo that we had made a decision here :)  That said it
> will be possible to change our mind and introduce that instead of syntax
> if that's necessary later in the cycle, so I'll go clean up for the
> first commit.

Done now.

>> Incidentally, why do we not support an argument list here, as with
>> ordinary triggers?
>
> Left for a follow-up patch. Do you want it already in this one?

Didn't do that, I though cleaning up all the points here would go first,
please tell me if you want that in the first commit.

>> I think we should similarly rip out the vestigial support for
>> supplying schema name, object name, and object ID.  That's not going
>> to be possible at command_start anyway; we can include that stuff in
>> the patch that adds a later firing point (command end, or consistency
>> check, perhaps).
>
> We got this part of the API fixed last round, so I would prefer not to
> dumb it down in this first patch. We know that we want to add exactly
> that specification later, don't we?

Didn't change anything here.

>> I think standard_ProcessUtility should have a shortcut that bypasses
>> setting up the event context if there are no event triggers at all in
>> the entire system, so that we do no extra work unless there's some
>> potential benefit.
>
> Setting up the event context is a very lightweight operation, and
> there's no way to know if the command is going to fire an event trigger
> without having done exactly what the InitEventContext() is doing. Maybe
> what we need to do here is rename it?
>
> Another problem with short cutting it aggressively is what happens if a
> new event trigger is created while the command is in flight. We have yet
> to discuss about that (as we only support a single timing point), but
> doing it the way you propose forecloses any other choice than
> "repeatable read" equivalent where we might want to have some "read
> commited" behaviour, that is fire the new triggers if they appear while
> the command is being run.

Same, don't see a way to shortcut.

> Added CommandCounterIncrement() and a new regression test. That fails
> for now, I'll have to get back to that later.

Of course I just needed to pay attention to the new ordering rules :)

>> Instead of having giant switch statements match E_WhatEver tags to
>> strings and visca versa, I think it would be much better to create an
>> array someplace that contains all the mappings.  Then you can write a
>> convenience function that scans the array for a string and returns the
>> E_WhatEver tag, and another convenience function that scans the array
>> for an E_WhatEver tag and returns the corresponding string.  Then all
>> the knowledge of how those things map onto each other is in ONE place,
>> which should make things a whole lot easier in terms of future code
>> maintenance, not to mention minimizing the chances of bugs of
>> oversight in the patch as it stands.  :-)
>
> That means that the Enum definition can not jump from a number to
> another non consecutive one, or that we have a very sparse array and
> some way to fill it unknown to me. As those numbers are going to end up
> on disk, we can not ever change them. I though it would be better to
> mimic what we do with the NodeTag definition here.

I'd like some more input here.

>> +   Forbids the execution of any DDL command:

Worked out something. Might need some more input.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Attachment

Re: Event Triggers reduced, v1

From
Thom Brown
Date:
On 2 July 2012 15:11, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
>>   https://github.com/dimitri/postgres/compare/f99e8d93b7...8da156dc70
>
> The revised incremental diff is here:
>
>   https://github.com/dimitri/postgres/compare/f99e8d93b7...74bbeda8
>
> And a new revision of the patch is attached to this email. We have some
> pending questions, depending on the answers it could be ready for
> commit.
>
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> There are a few remaining references to EVTG_FIRED_BEFORE /
>>> EVTG_FIRED_INSTEAD_OF which should be cleaned up.  I suggest writing
>>> the grammar as CREATE EVENT TRIGGER name ON event_name EXECUTE...  On
>>> a related note, evttype is still mentioned in the documentation, also.
>>>  And CreateEventTrigStmt has a char timing field that can go.
>>
>> I didn't get the memo that we had made a decision here :)  That said it
>> will be possible to change our mind and introduce that instead of syntax
>> if that's necessary later in the cycle, so I'll go clean up for the
>> first commit.
>
> Done now.
>
>>> Incidentally, why do we not support an argument list here, as with
>>> ordinary triggers?
>>
>> Left for a follow-up patch. Do you want it already in this one?
>
> Didn't do that, I though cleaning up all the points here would go first,
> please tell me if you want that in the first commit.
>
>>> I think we should similarly rip out the vestigial support for
>>> supplying schema name, object name, and object ID.  That's not going
>>> to be possible at command_start anyway; we can include that stuff in
>>> the patch that adds a later firing point (command end, or consistency
>>> check, perhaps).
>>
>> We got this part of the API fixed last round, so I would prefer not to
>> dumb it down in this first patch. We know that we want to add exactly
>> that specification later, don't we?
>
> Didn't change anything here.
>
>>> I think standard_ProcessUtility should have a shortcut that bypasses
>>> setting up the event context if there are no event triggers at all in
>>> the entire system, so that we do no extra work unless there's some
>>> potential benefit.
>>
>> Setting up the event context is a very lightweight operation, and
>> there's no way to know if the command is going to fire an event trigger
>> without having done exactly what the InitEventContext() is doing. Maybe
>> what we need to do here is rename it?
>>
>> Another problem with short cutting it aggressively is what happens if a
>> new event trigger is created while the command is in flight. We have yet
>> to discuss about that (as we only support a single timing point), but
>> doing it the way you propose forecloses any other choice than
>> "repeatable read" equivalent where we might want to have some "read
>> commited" behaviour, that is fire the new triggers if they appear while
>> the command is being run.
>
> Same, don't see a way to shortcut.
>
>> Added CommandCounterIncrement() and a new regression test. That fails
>> for now, I'll have to get back to that later.
>
> Of course I just needed to pay attention to the new ordering rules :)
>
>>> Instead of having giant switch statements match E_WhatEver tags to
>>> strings and visca versa, I think it would be much better to create an
>>> array someplace that contains all the mappings.  Then you can write a
>>> convenience function that scans the array for a string and returns the
>>> E_WhatEver tag, and another convenience function that scans the array
>>> for an E_WhatEver tag and returns the corresponding string.  Then all
>>> the knowledge of how those things map onto each other is in ONE place,
>>> which should make things a whole lot easier in terms of future code
>>> maintenance, not to mention minimizing the chances of bugs of
>>> oversight in the patch as it stands.  :-)
>>
>> That means that the Enum definition can not jump from a number to
>> another non consecutive one, or that we have a very sparse array and
>> some way to fill it unknown to me. As those numbers are going to end up
>> on disk, we can not ever change them. I though it would be better to
>> mimic what we do with the NodeTag definition here.
>
> I'd like some more input here.
>
>>> +   Forbids the execution of any DDL command:
>
> Worked out something. Might need some more input.

I'm not clear on this paragraph:

Triggers on <literal>ANY</literal> command support more commands than
just this list, and will only provide the <literal>command
tag</literal> argument as <literal>NOT NULL</literal>.

Do you actually mean it will return NULL?

I also attach various typo/grammar fixes.

--
Thom

Attachment

Re: Event Triggers reduced, v1

From
Robert Haas
Date:
On Fri, Jun 29, 2012 at 5:38 PM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:
> Will do a whole warning check pass later. Can you give me your local
> Makefile trick to turn them into hard errors again please? :)

echo COPT=-Werror > src/Makefile.custom

Your latest patch contains a warning about using a variable
uninitialized that seems to indicate that you didn't test this very
carefully: in get_event_triggers, current_any_name is set but not
used.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Event Triggers reduced, v1

From
Robert Haas
Date:
On Mon, Jul 2, 2012 at 1:56 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Jun 29, 2012 at 5:38 PM, Dimitri Fontaine
> <dimitri@2ndquadrant.fr> wrote:
>> Will do a whole warning check pass later. Can you give me your local
>> Makefile trick to turn them into hard errors again please? :)
>
> echo COPT=-Werror > src/Makefile.custom
>
> Your latest patch contains a warning about using a variable
> uninitialized that seems to indicate that you didn't test this very
> carefully: in get_event_triggers, current_any_name is set but not
> used.

Make that used but not set.  Anyhow, it's broken.  :-(

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Event Triggers reduced, v1

From
Robert Haas
Date:
On Mon, Jul 2, 2012 at 10:11 AM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:
> [ new patch ]

I would really like to start committing parts of this, but there are
still a lot of unfinished loose ends in this code.  The one that is
most immediately bothering me is related to the syntax you've
implemented, which is:

CREATE EVENT TRIGGER name ON event_name WHEN event_trigger_variable IN
(trigger_command_list) EXECUTE PROCEDURE func_name ()

I've been beating on the issue of generic syntax for a while now, and
that production looks sort of generic, but there are problems when you
dig into it.  trigger_command_list is defined in a way that means that
it can never be anything other than a list of tags, which means that
event_trigger_variable can never really be anything other than TAG.
Oops.   I think you need to remove the validation of
trigger_command_list from the parser and do that afterwards.  In other
words, the parser should be happy if event_trigger_variable is an
ColId and trigger_command_list is a bunch of comma-separated SConst
items.  Then, after parsing is complete, the code that actually
implements CREATE EVENT TRIGGER should look at event_trigger_variable
and decide whether it has a legal value and, if so, whether the
associated trigger_command_list is compatible with it.  IOW, the
validation should be happening in CreateEventTrigger rather than
gram.y.

There is a related problem in terms of the system catalog
representation which I should have noted previously, but did not.  The
system catalog representation has no place to store
event_trigger_variable, and the column that will store
trigger_command_list is called evttags, again implying rather strongly
that these are command tags we're dealing with rather than anything
else.  Obviously this could be revised later, but it will be much
easier to add new functionality in subsequent patches if we get the
catalog infrastructure right - or close to right - on the first try,
so it seems worth spending a little bit of time on it.  The two
questions that occur to me are:

1. Do we imagine a situation where a given event_name would allow more
than one choice of event_trigger_variable?  If so, then
pg_event_trigger needs a place to store the event_trigger_variable.
If not, then it's a noise word, and we should consider simplifying the
syntax to something like:

CREATE EVENT TRIGGER name ON event_name (trigger_command_list) EXECUTE
PROCEDURE func_name ()
or maybe
CREATE EVENT TRIGGER name ON event_name FOR (trigger_command_list)
EXECUTE PROCEDURE func_name ()
or perhaps
CREATE EVENT TRIGGER name ON event_name USING (trigger_command_list)
EXECUTE PROCEDURE func_name ()

2. On a related point, do we anticipate that we might eventually want
to allow filtering by more than one event_trigger_variable in the same
trigger?  That is, might we want to do something like this:

CREATE EVENT TRIGGER something ON somevent WHEN thingy IN ('item1',
'item2') AND otherthingy IN ('foo', 'bar') EXECUTE PROCEDURE func_name
()

If so, then how do we imagine that getting stored in the catalog?

Please let me know your thoughts.

Thanks,

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Event Triggers reduced, v1

From
Dimitri Fontaine
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> CREATE EVENT TRIGGER name ON event_name WHEN event_trigger_variable IN
> (trigger_command_list) EXECUTE PROCEDURE func_name ()
[...]
> 1. Do we imagine a situation where a given event_name would allow more
> than one choice of event_trigger_variable?  If so, then

The only use I think is really interesting here is matching command tags
for either the main event or a sub-event of some sort. We decided not to
include any sub-commands reasoning in that first patch, that's why the
syntax is way more generic than the implementation.

You could then install an event trigger for a CREATE SEQUENCE that
happens as part of a CREATE TABLE, or a DROP COLUMN that happens as part
of a DROP TYPE, for examples.

So yes the event_trigger_variable is made to only host a tag and the
associate catalog entry make that very clear.

> pg_event_trigger needs a place to store the event_trigger_variable.

My thinking was to add another hard-coded list of tags for the other
variable, that is have a flexible syntax (no WHEN clause, a WHEN clause
with only main tag matching, a WHEN clause with both parent/main tag
matching, or only parent) that only uses two hard coded variables both
being tags matched against a static list.

The reason why it's not all in the current patch is that we decided
together that we want to revisit the sub-command semantics once we have
a basic framework in place. It's already leaky though.

> If not, then it's a noise word, and we should consider simplifying the
> syntax to something like:

> CREATE EVENT TRIGGER name ON event_name FOR (trigger_command_list)
> EXECUTE PROCEDURE func_name ()

That would have my preference, in case you would rather have a
simplified first patch, that is without any provision to expand on the
tag matching facility. Of course we won't be able to keep that syntax as
soon as we decide how to handle sub commands, which makes that decision
a lot less useful that it seems to be, in my mind.

> 2. On a related point, do we anticipate that we might eventually want
> to allow filtering by more than one event_trigger_variable in the same
> trigger?  That is, might we want to do something like this:
>
> CREATE EVENT TRIGGER something ON somevent WHEN thingy IN ('item1',
> 'item2') AND otherthingy IN ('foo', 'bar') EXECUTE PROCEDURE func_name
> ()

That's what I want to be able to do yes, in another step. The only two
variables I think about would be named "tag" and "toplevel", or
something equivalent ("main", "host", "user", etc).

> If so, then how do we imagine that getting stored in the catalog?

We will have to extend the catalog when we attack sub command handling,
at least I believe we will. Hence the current proposed catalog is not
yet finished. We could also already decide about sub command handling if
you think that's the only remaining thing to do in this patch; so that
it's easier down the road. At the expense of taking some more time right
now. Your call, I have the round tuits :)

Regards,
--
Dimitri Fontaine                                        06 63 07 10 78
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Event Triggers reduced, v1

From
Robert Haas
Date:
On Mon, Jul 2, 2012 at 4:10 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
>> 2. On a related point, do we anticipate that we might eventually want
>> to allow filtering by more than one event_trigger_variable in the same
>> trigger?  That is, might we want to do something like this:
>>
>> CREATE EVENT TRIGGER something ON somevent WHEN thingy IN ('item1',
>> 'item2') AND otherthingy IN ('foo', 'bar') EXECUTE PROCEDURE func_name
>> ()
>
> That's what I want to be able to do yes, in another step. The only two
> variables I think about would be named "tag" and "toplevel", or
> something equivalent ("main", "host", "user", etc).
>
>> If so, then how do we imagine that getting stored in the catalog?
>
> We will have to extend the catalog when we attack sub command handling,
> at least I believe we will. Hence the current proposed catalog is not
> yet finished. We could also already decide about sub command handling if
> you think that's the only remaining thing to do in this patch; so that
> it's easier down the road. At the expense of taking some more time right
> now. Your call, I have the round tuits :)

I'd really like to not have to change the catalog again in every
patch, because if we do that then we are just saying we're going to
rewrite this patch completely every time we want to add a new feature,
which kind of defeats the purpose IMHO.

So let's try to hammer something out now.  The obvious thing that
occurs to me is to have a column in the catalog that is a 2-D array of
text, with the first element of each array being something like "tag"
or "subtag" (i.e. event_trigger_variable) and the remaining array
elements being a list of legal values.  That is:

WHEN thingy IN thingy IN ('item1', 'item2') AND otherthingy IN ('foo', 'bar')

would be represented as this array:

{{thingy,item1,item2},{otherthingy,foo,bar}}

This would allow us to support 0 or more WHERE clauses, each of the
form "thing IN (value1, value2, ...)".  Is that general enough for
every use case that you can foresee, or do we need more?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Event Triggers reduced, v1

From
Dimitri Fontaine
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> I'd really like to not have to change the catalog again in every
> patch, because if we do that then we are just saying we're going to
> rewrite this patch completely every time we want to add a new feature,
> which kind of defeats the purpose IMHO.

Fair enough.

> So let's try to hammer something out now.  The obvious thing that
> occurs to me is to have a column in the catalog that is a 2-D array of
> text, with the first element of each array being something like "tag"
> or "subtag" (i.e. event_trigger_variable) and the remaining array
> elements being a list of legal values.  That is:
>
> WHEN thingy IN thingy IN ('item1', 'item2') AND otherthingy IN ('foo', 'bar')
>
> would be represented as this array:
>
> {{thingy,item1,item2},{otherthingy,foo,bar}}
>
> This would allow us to support 0 or more WHERE clauses, each of the
> form "thing IN (value1, value2, ...)".  Is that general enough for
> every use case that you can foresee, or do we need more?

Given what I foresee, simply having another columns in there named
evtstags with the exact same content as evttags would be the simplest
and most natural implementation, really.

I don't foresee more generic needs here, unless you can convince me that
we need both a. a full stack of arbitrarily nested commands and b. a way
to match and target any level of that stack.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Event Triggers reduced, v1

From
Robert Haas
Date:
On Mon, Jul 2, 2012 at 4:53 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> So let's try to hammer something out now.  The obvious thing that
>> occurs to me is to have a column in the catalog that is a 2-D array of
>> text, with the first element of each array being something like "tag"
>> or "subtag" (i.e. event_trigger_variable) and the remaining array
>> elements being a list of legal values.  That is:
>>
>> WHEN thingy IN thingy IN ('item1', 'item2') AND otherthingy IN ('foo', 'bar')
>>
>> would be represented as this array:
>>
>> {{thingy,item1,item2},{otherthingy,foo,bar}}
>>
>> This would allow us to support 0 or more WHERE clauses, each of the
>> form "thing IN (value1, value2, ...)".  Is that general enough for
>> every use case that you can foresee, or do we need more?
>
> Given what I foresee, simply having another columns in there named
> evtstags with the exact same content as evttags would be the simplest
> and most natural implementation, really.

That seems a lot less general for no particular gain.

> I don't foresee more generic needs here, unless you can convince me that
> we need both a. a full stack of arbitrarily nested commands and b. a way
> to match and target any level of that stack.

Well, let's take the example of an ALTER TABLE command.  You could
want to match on:

- the type of object the user mentioned in the command (did he write
ALTER TABLE or ALTER VIEW?)
- the type of object actually being modified (could differ if he used
ALTER TABLE on a view, or visca versa)
- the particular ALTER TABLE subcommand in use (e.g. SET STATISTICS)

I suspect there are other examples as well.  If we pick the 2-D list
representation I suggested, or something like it, we can easily
accommodate these kinds of filters without having to whack the catalog
representation around any further.  That seems pretty appealing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Event Triggers reduced, v1

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> So let's try to hammer something out now.  The obvious thing that
> occurs to me is to have a column in the catalog that is a 2-D array of
> text, with the first element of each array being something like "tag"
> or "subtag" (i.e. event_trigger_variable) and the remaining array
> elements being a list of legal values.  That is:

> WHEN thingy IN thingy IN ('item1', 'item2') AND otherthingy IN ('foo', 'bar')

> would be represented as this array:

> {{thingy,item1,item2},{otherthingy,foo,bar}}

Um, doesn't that require nonrectangular arrays?  Or is there some
non-obvious reason why the lists of legal values will always be all the
same length?
        regards, tom lane


Re: Event Triggers reduced, v1

From
Tom Lane
Date:
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> I don't foresee more generic needs here, unless you can convince me that
> we need both a. a full stack of arbitrarily nested commands and b. a way
> to match and target any level of that stack.

Um ... isn't the burden of proof the other way around here?  That is,
what's the argument that we *don't* need that?
        regards, tom lane


Re: Event Triggers reduced, v1

From
Robert Haas
Date:
On Mon, Jul 2, 2012 at 6:59 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> So let's try to hammer something out now.  The obvious thing that
>> occurs to me is to have a column in the catalog that is a 2-D array of
>> text, with the first element of each array being something like "tag"
>> or "subtag" (i.e. event_trigger_variable) and the remaining array
>> elements being a list of legal values.  That is:
>
>> WHEN thingy IN thingy IN ('item1', 'item2') AND otherthingy IN ('foo', 'bar')
>
>> would be represented as this array:
>
>> {{thingy,item1,item2},{otherthingy,foo,bar}}
>
> Um, doesn't that require nonrectangular arrays?  Or is there some
> non-obvious reason why the lists of legal values will always be all the
> same length?

Doh.  You're right: I keep forgetting that arrays have to be rectangular.

Any suggestions on a sensible way to represent this?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Event Triggers reduced, v1

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Jul 2, 2012 at 6:59 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Um, doesn't that require nonrectangular arrays?

> Doh.  You're right: I keep forgetting that arrays have to be rectangular.

> Any suggestions on a sensible way to represent this?

Are there likely to be enough entries that storage efficiency actually
matters?  If not, we could use a 2xN array of {key,allowed_value} pairs,
that is

{{thingy,item1},{thingy,item2},{otherthingy,foo},{otherthingy,bar}}

Or perhaps push these out into a separate table, along the lines
ofoid    key    allowed_value
and use an oidvector to list the selected values in a trigger entry?
        regards, tom lane


Re: Event Triggers reduced, v1

From
Dimitri Fontaine
Date:
Robert Haas <robertmhaas@gmail.com> writes:
>> Given what I foresee, simply having another columns in there named
>> evtstags with the exact same content as evttags would be the simplest
>> and most natural implementation, really.
>
> That seems a lot less general for no particular gain.

The gain is code, docs and usage simplification. I think going general
here is going to confuse every one involved and that we should have two
targets in mind: classic use cases that we want to address easily enough
in SQL and with some PLpgSQL help, and advanced use cases that are
possible to implement in PL/C using the parse tree and the soon to come
back rewritten command string.

IOW, let's make the simple things simple and the complex one possible.


The following is quite long an email where I try to give plenty of
examples and to detail the logic I'm working with so that you can easily
stab at whichever part you're thinking is not going to fly.


>> I don't foresee more generic needs here, unless you can convince me that
>> we need both a. a full stack of arbitrarily nested commands and b. a way
>> to match and target any level of that stack.
>
> Well, let's take the example of an ALTER TABLE command.  You could
> want to match on:
>
> - the type of object the user mentioned in the command (did he write
> ALTER TABLE or ALTER VIEW?)
> - the type of object actually being modified (could differ if he used

I don't think it's possible to implement that without shaking all the
system, after having a look at the following lines from gram.y:
ALTER VIEW qualified_name alter_table_cmds{    AlterTableStmt *n = makeNode(AlterTableStmt);

So, the way to implement that need from an event trigger is to use the
parse tree, and hopefully soon enough the rewritten command string.

> ALTER TABLE on a view, or visca versa)
> - the particular ALTER TABLE subcommand in use (e.g. SET STATISTICS)

Now we can publish that, we would have some events with 
tag = 'ALTER TABLE'

then some others with
toplevel = 'ALTER TABLE'tag = 'SET STATISTICS'

The same idea would need to get implemented for serial, where the tag is
'CREATE SEQUENCE' and the toplevel tag is 'CREATE TABLE'. That allows to
easily install an event trigger that gets called every time a sequence
is created, you can then have a look at the toplevel command tag if you
need to.
 CREATE EVENT TRIGGER snitch_seqs                   ON command_start                 WHEN tag IN ('CREATE SEQUENCE')
EXECUTEPROCEDURE snitch_seqs();
 

The idea is that the function snitch_seqs has a "magic" variable
TG_TOPLEVEL that can be tested and will be set to 'CREATE TABLE' when
we're dealing with a SERIAL, in that example.

If you want your event trigger to only ever deal with SERIAL, you could
install it this way:
 CREATE EVENT TRIGGER my_serial_trigger                   ON command_start                 WHEN toplevel IN ('CREATE
TABLE')                 AND tag IN ('CREATE SEQUENCE')    EXECUTE PROCEDURE handle_serial_trigger();
 

Now let's see about getting more generic than that.

We also can get tag = 'CREATE INDEX' and toplevel = 'ALTER TABLE' when
adding a primary key for example. That's an example that can lead us to
more than 2 levels of nested tags, which I would want to avoid. The
stack here would look like:
1. ALTER TABLE2.   ADD PRIMARY KEY3.     CREATE INDEX

I think only having 1 and 3 is enough, for more details the command
string and the parse tree are available. In the main use case of
replication, you mostly just want to replicate the command string. You
might want to apply some transformation rules to it (table names, cope
with a different target schema, etc) but typically those rules are to be
run in the subscriber system, not on the provider (picture a federating
system where each provider uses the same schema, that gets mapped to a
schema per provider on the subscriber).

The other problem with the stack of tags is matching them. I don't see
that it helps writing event triggers much. In the previous example, if
you want an event trigger that fires on each ALTER TABLE, you don't know
which level of the stack to target. Either you have to target the
current tag or the toplevel tag or something in between. We could easily
have the following tag stack:
1. CREATE SCHEMA2.   ALTER TABLE3.     ADD PRIMARY KEY4.       CREATE INDEX

So now we need a way to target any entry in the stack and a way to
represent the stack in every PL language we have, and an easy way to
analyze the stack. For PLpgSQL I guess that means we want to expose this
tag stack as a TABLE, and the complexity just went off the table.

My view point is that for any practical case I can think about what we
care about is the current command being run, and given how PostgreSQL is
made today that means handling one level of sub commands. That addresses
ALTER TABLE and also DROP CASCADE.

I don't think adding-in an ALTER TABLE that never happened in the middle
of those two elements is going to make life easier for anybody involved,
quite the contrary:
1. DROP TYPE2.   DROP COLUMN

Users that need that level of detail for their processing are welcome to
code their Event Trigger in PL/C and analyze the parse tree. We can call
that advanced analysis.

> I suspect there are other examples as well.  If we pick the 2-D list
> representation I suggested, or something like it, we can easily
> accommodate these kinds of filters without having to whack the catalog
> representation around any further.  That seems pretty appealing.

The generic approach leads us to invent a stack of tags and (I suspect)
a DSL for tag matching where you can express at least those different
things:
- this tag is found in the stack (tag <@ stack)- this other tag is found higher in the stack- this other tag is found
justone level higher in the stack- this other tag is found at least 2 levels higher in the stack- this third tag is
foundlower in the stack- this third tag is found just one level lower in the stack- and maybe some more
 

Those semantics are going to be needed, either in the event trigger
definition itself, or in the event trigger code. We could expose a stack
of tags and ditch the WHEN clause here, but we still have to be able to
implement the filtering in PLpgSQL for simple cases.

If we're not able to express such detailed semantics I don't think we're
servicing users by making things way more complex for them to use. The
drawback is that we will have to make choices as to which tag we expose
exactly, remembering that all the details are to be found in the parse
tree and the rewritten command string.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Event Triggers reduced, v1

From
Robert Haas
Date:
On Mon, Jul 2, 2012 at 11:25 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Mon, Jul 2, 2012 at 6:59 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Um, doesn't that require nonrectangular arrays?
>
>> Doh.  You're right: I keep forgetting that arrays have to be rectangular.
>
>> Any suggestions on a sensible way to represent this?
>
> Are there likely to be enough entries that storage efficiency actually
> matters?  If not, we could use a 2xN array of {key,allowed_value} pairs,
> that is
>
> {{thingy,item1},{thingy,item2},{otherthingy,foo},{otherthingy,bar}}
>
> Or perhaps push these out into a separate table, along the lines
> of
>         oid     key     allowed_value
> and use an oidvector to list the selected values in a trigger entry?

It seems likely that there will fairly commonly be many allowed values
per key.  However, the universe of legal keys will be quite small,
probably a total of 2-4.  So maybe the best representation is to have
an a separate column for each key and store the array of legal values
in it.  That's more or less what Dimitri already has in his latest
patch, except that after looking it over I'm inclined to think that
we'd be better off storing the keys as text and translating to
internal ID numbers when we read and cache the table, rather than
storing the ID numbers in the table.  That will make the catalog
contents easier to read, and more importantly will mean that a change
to the internal ID numbers need not be initdb-forcing.

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Event Triggers reduced, v1

From
Dimitri Fontaine
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> in it.  That's more or less what Dimitri already has in his latest
> patch, except that after looking it over I'm inclined to think that
> we'd be better off storing the keys as text and translating to
> internal ID numbers when we read and cache the table, rather than
> storing the ID numbers in the table.  That will make the catalog
> contents easier to read, and more importantly will mean that a change
> to the internal ID numbers need not be initdb-forcing.

Well that's what I had in previous versions of the patch, where the code
was dealing directly with command tags. If we store text in catalogs and
given that we already have the command tag as text, I suppose we would
get back to only managing tags as text in the cache key and the rest of
the code.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Event Triggers reduced, v1

From
Robert Haas
Date:
On Tue, Jul 3, 2012 at 11:52 AM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> in it.  That's more or less what Dimitri already has in his latest
>> patch, except that after looking it over I'm inclined to think that
>> we'd be better off storing the keys as text and translating to
>> internal ID numbers when we read and cache the table, rather than
>> storing the ID numbers in the table.  That will make the catalog
>> contents easier to read, and more importantly will mean that a change
>> to the internal ID numbers need not be initdb-forcing.
>
> Well that's what I had in previous versions of the patch, where the code
> was dealing directly with command tags. If we store text in catalogs and
> given that we already have the command tag as text, I suppose we would
> get back to only managing tags as text in the cache key and the rest of
> the code.

Yeah, I'm of two minds on that.  I thought that it made sense to use
integer identifiers internally for speed, but now I'm worried that the
effort to translate back and forth between strings and integers is
going to end up being more than any speed we might save.   But I'm
still not certain which way is best: maybe your original idea of using
strings everywhere will prove to be the winner, but on the other hand
I'm not convinced that the code as you have it written today is as
efficient as it could be.

At any rate, whether or not we end up using strings or integers inside
the backend-local caches, I'm inclined to think that it's better to
store strings in the catalog, so we'd end up with something like this:

CATALOG(pg_event_trigger,3466)
{       NameData        evtname;                /* trigger's name */       int16       evtevent;           /* trigger's
event*/       Oid                     evtfoid;                /* OID of function to be       char
evtenabled;            /* trigger's firing configuratio
*session_repli
 
#ifdef CATALOG_VARLEN       text       evttags[1];         /* command TAGs this event trigger targe
#endif
}

If there's no filter on tags, then I think we should let evttags = NULL.

If in the future we have filtering that's not based on tags, we'd add,
e.g. "text evtshushiness[1]" if we were going to filter based on
smushiness level.  We'd set evtsmushiness = NULL if there's no
filtering by smushiness, or else an array of the smushiness levels
that fire that trigger.

Does that work for you?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Event Triggers reduced, v1

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> Yeah, I'm of two minds on that.  I thought that it made sense to use
> integer identifiers internally for speed, but now I'm worried that the
> effort to translate back and forth between strings and integers is
> going to end up being more than any speed we might save.

We do that for nodetags in stored query/expression trees, and I've never
seen any indication that it costs enough to notice.  It's definitely a
huge win for anything that changes regularly, which seems like it'll be
a pretty good description of event tags for at least the next few years.
        regards, tom lane


Re: Event Triggers reduced, v1

From
Robert Haas
Date:
On Tue, Jul 3, 2012 at 12:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Yeah, I'm of two minds on that.  I thought that it made sense to use
>> integer identifiers internally for speed, but now I'm worried that the
>> effort to translate back and forth between strings and integers is
>> going to end up being more than any speed we might save.
>
> We do that for nodetags in stored query/expression trees, and I've never
> seen any indication that it costs enough to notice.  It's definitely a
> huge win for anything that changes regularly, which seems like it'll be
> a pretty good description of event tags for at least the next few years.

Good analogy.  In that case, as in what I'm proposing here, we use
integers in memory and text on disk.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Event Triggers reduced, v1

From
Dimitri Fontaine
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Jul 3, 2012 at 12:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> Yeah, I'm of two minds on that.  I thought that it made sense to use
>>> integer identifiers internally for speed, but now I'm worried that the
>>> effort to translate back and forth between strings and integers is
>>> going to end up being more than any speed we might save.
>>
>> We do that for nodetags in stored query/expression trees, and I've never
>> seen any indication that it costs enough to notice.  It's definitely a
>> huge win for anything that changes regularly, which seems like it'll be
>> a pretty good description of event tags for at least the next few years.
>
> Good analogy.  In that case, as in what I'm proposing here, we use
> integers in memory and text on disk.

New patch for that tomorrow. I assume we agree on having a column per
extra variable, I'm not sure about already including full support for
the toplevel one. With some luck I'll have news from you about that
before sending the next revision of the patch, which then would include
the int16 evttoplevel[1] column.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Event Triggers reduced, v1

From
Robert Haas
Date:
On Tue, Jul 3, 2012 at 1:31 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Tue, Jul 3, 2012 at 12:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Robert Haas <robertmhaas@gmail.com> writes:
>>>> Yeah, I'm of two minds on that.  I thought that it made sense to use
>>>> integer identifiers internally for speed, but now I'm worried that the
>>>> effort to translate back and forth between strings and integers is
>>>> going to end up being more than any speed we might save.
>>>
>>> We do that for nodetags in stored query/expression trees, and I've never
>>> seen any indication that it costs enough to notice.  It's definitely a
>>> huge win for anything that changes regularly, which seems like it'll be
>>> a pretty good description of event tags for at least the next few years.
>>
>> Good analogy.  In that case, as in what I'm proposing here, we use
>> integers in memory and text on disk.
>
> New patch for that tomorrow. I assume we agree on having a column per
> extra variable,

Yes.

>  I'm not sure about already including full support for
> the toplevel one. With some luck I'll have news from you about that
> before sending the next revision of the patch, which then would include
> the int16 evttoplevel[1] column.

No, I'm not asking you to add any more columns right now (in fact,
please do not).  But the type of the existing column should change to
text[].

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Event Triggers reduced, v1

From
Dimitri Fontaine
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> No, I'm not asking you to add any more columns right now (in fact,
> please do not).  But the type of the existing column should change to
> text[].

Ok, done in the attached. We now read text from either the user input in
the grammar or from the catalogs (in a 1-D array there), and convert to
our enum structure for internal code use.

In fact I attached both the new patch version and the incremental patch
on top of the v1.3 you had before, I suspect that might make life easier
for you. It's of course browsable online too at this URL:

   https://github.com/dimitri/postgres/commit/a8cf89116ae7b6e51c8a580510612b85caae2d38

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Attachment

Re: Event Triggers reduced, v1

From
Dimitri Fontaine
Date:
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> Robert Haas <robertmhaas@gmail.com> writes:
>> No, I'm not asking you to add any more columns right now (in fact,
>> please do not).  But the type of the existing column should change to
>> text[].
>
> Ok, done in the attached. We now read text from either the user input in
> the grammar or from the catalogs (in a 1-D array there), and convert to
> our enum structure for internal code use.

In the previous one I missed adapting pg_dump and psql, here's the
updated set of patches. Sorry about that. This unexpected electrical
shut down in the middle of the day was not cool. Nor was the second one.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Attachment

Re: Event Triggers reduced, v1

From
Robert Haas
Date:
On Wed, Jul 4, 2012 at 1:56 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> No, I'm not asking you to add any more columns right now (in fact,
>>> please do not).  But the type of the existing column should change to
>>> text[].
>>
>> Ok, done in the attached. We now read text from either the user input in
>> the grammar or from the catalogs (in a 1-D array there), and convert to
>> our enum structure for internal code use.
>
> In the previous one I missed adapting pg_dump and psql, here's the
> updated set of patches. Sorry about that. This unexpected electrical
> shut down in the middle of the day was not cool. Nor was the second one.

Thanks.  There are some review comments from previous rounds that
don't seem to have been fully incorporated to this version:

- You only changed the definition of pg_event_triggers.evttags, not
the documentation.  I don't think we need pg_evttag_to_string aka
pg_event_trigger_command_to_string any more either.
- When you removed format_type_be_without_namespace, you didn't remove
either the function prototype or the related changes in format_type.c.
- I'm pretty sure that a previous review mentioned the compiler
warning in evtcache.c, which seems not to be fixed.  It doesn't look
harmless, either.
- The definitions of EVTG_FIRED_* are still present in
pg_event_trigger.h, even though they're longer used anywhere.
- The comments in parsenodes.h still refer to command triggers rather
than event triggers.  event_trigger.h contains a reference to
CommandTriggerData that should say EventTriggerData.  plperl.sgml has
vestiges of the old terminology as well.

In terms of the schema itself, I think we are almost there, but:

- I noticed while playing around with this that pg_dump emits a
completely empty owner field when dumping an event trigger.  At first
I thought that was just an oversight, but then I realized there's a
deeper reason for it: pg_event_trigger doesn't have an owner field.  I
think it should.  The only other objects in the system that don't have
owners are text search parsers and text search templates (and casts,
sort of).  It might seem redundant to have an owner even when
event-triggers are superuser-only, but we might want to try to relax
that restriction later.  Note that foreign data wrappers, which are
also superuser-create-only, do have an owner.  (Note that if we give
event triggers an owner, then we also need ALTER .. OWNER TO support
for them.)

- It seems pretty clear at this point that the cache you've
implemented in src/backend/utils/cache/evtcache.c is going to do all
the heavy lifting of converting the stored catalog representation to a
form that is suitable for quick access.  Given that, I wonder whether
we should go whole hog and change evtevent to a text field.  This
would simplify things for pg_dump and we could get rid of
pg_evtevent_to_string, at a cost of doing marginally more work when we
rebuild the event cache (but who really cares, given that you're
reading the entire table every time you rebuild the cache anyway?).

Thoughts?

Some other minor comments:

- In the \dy output, command_start is referred to as the "condition",
but the documentation calls it an "event" and the grammar calls it
"event_name".  "event" or "event_name" seems better, so I think this
is just a matter of changing psql to match.
- AlterEventTrigStmt defines tgenbled as char *; I think it should
just be char.  In gram.y, you can declare the enable_trigger
production as <chr> (or <ival>?) rather than <str>.  At any rate
pstrdup(stmt->tgenabled)[0] doesn't look right; you don't need to copy
a string to fetch the first byte.
- Why did you create a separate file pg_event_trigger_fn.h instead of
just including that single prototype in pg_event_trigger.h?
- The EVENTTRIGGEROID syscache is used only in RemoveEventTriggerById.I don't think that's a sufficient justification
foreating up more
 
memory for another system cache.  I think you should just remove that
cache and recode RemoveEventTriggerById to find the correct tuple via
an index scan.  See RemoveEventTriggerById for an example.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Event Triggers reduced, v1

From
Dimitri Fontaine
Date:
Hi,

Please find attached v6 of the patch, fixing all your comments here.
Sorry about missing some obvious things in the making of this patch.

Given that I had to merge master again, I can't easily produce an
incremental patch on top of last version, so you only have the full
patch attached.

Robert Haas <robertmhaas@gmail.com> writes:
> - You only changed the definition of pg_event_triggers.evttags, not
> the documentation.  I don't think we need pg_evttag_to_string aka
> pg_event_trigger_command_to_string any more either.

Done. Removed now useless exposed functions.

> - When you removed format_type_be_without_namespace, you didn't remove
> either the function prototype or the related changes in format_type.c.

Fixed.

> - I'm pretty sure that a previous review mentioned the compiler
> warning in evtcache.c, which seems not to be fixed.  It doesn't look
> harmless, either.

I'm failing to reproduce it here, in -O0. […Trying with the -O2
default…] I got the error in -O2, fixed in the attached.

> - The definitions of EVTG_FIRED_* are still present in
> pg_event_trigger.h, even though they're longer used anywhere.

Fixed.

> - The comments in parsenodes.h still refer to command triggers rather
> than event triggers.  event_trigger.h contains a reference to
> CommandTriggerData that should say EventTriggerData.  plperl.sgml has
> vestiges of the old terminology as well.

Fixed. I only found a single such vestige in plperl.sgml though.

> In terms of the schema itself, I think we are almost there, but:
>
> - I noticed while playing around with this that pg_dump emits a
> completely empty owner field when dumping an event trigger.  At first
> I thought that was just an oversight, but then I realized there's a
> deeper reason for it: pg_event_trigger doesn't have an owner field.  I
> think it should.  The only other objects in the system that don't have
> owners are text search parsers and text search templates (and casts,
> sort of).  It might seem redundant to have an owner even when
> event-triggers are superuser-only, but we might want to try to relax
> that restriction later.  Note that foreign data wrappers, which are
> also superuser-create-only, do have an owner.  (Note that if we give
> event triggers an owner, then we also need ALTER .. OWNER TO support
> for them.)

Damn, I had it on my TODO and Álvaro hinted me already, and I kept
forgetting about it nonetheless. Fixed now.

> - It seems pretty clear at this point that the cache you've
> implemented in src/backend/utils/cache/evtcache.c is going to do all
> the heavy lifting of converting the stored catalog representation to a
> form that is suitable for quick access.  Given that, I wonder whether
> we should go whole hog and change evtevent to a text field.  This
> would simplify things for pg_dump and we could get rid of
> pg_evtevent_to_string, at a cost of doing marginally more work when we
> rebuild the event cache (but who really cares, given that you're
> reading the entire table every time you rebuild the cache anyway?).

Agreed, done that way (using a NameData fixed width field).

> - In the \dy output, command_start is referred to as the "condition",
> but the documentation calls it an "event" and the grammar calls it
> "event_name".  "event" or "event_name" seems better, so I think this
> is just a matter of changing psql to match.

Fixed.

> - AlterEventTrigStmt defines tgenbled as char *; I think it should
> just be char.  In gram.y, you can declare the enable_trigger
> production as <chr> (or <ival>?) rather than <str>.  At any rate
> pstrdup(stmt->tgenabled)[0] doesn't look right; you don't need to copy
> a string to fetch the first byte.

D'oh. Sure. Done.

> - Why did you create a separate file pg_event_trigger_fn.h instead of
> just including that single prototype in pg_event_trigger.h?

To mimic some existing files, fixed.

> - The EVENTTRIGGEROID syscache is used only in RemoveEventTriggerById.
>  I don't think that's a sufficient justification for eating up more
> memory for another system cache.  I think you should just remove that
> cache and recode RemoveEventTriggerById to find the correct tuple via
> an index scan.  See RemoveEventTriggerById for an example.

It's now used in more places (dependencies, alter owner), so thinking
that it's pulling its weight now with 3 call sites I've left it alone.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Attachment

Re: Event Triggers reduced, v1

From
Robert Haas
Date:
On Thu, Jul 5, 2012 at 5:31 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> [ new patch ]

Attached is a incremental patch with a bunch of minor cleanups,
including reverts of a few spurious white space changes.  Could you
merge this into your version?

I have some concerns about pg_dump:

1. Can we spell out EvtTriggerInfo, getEvtTriggers, and dumpEvtTrigger
as EventTriggerInfo, getEventTriggers, and dumpEventTrigger?

2. I don't think that this code properly handles older server
versions.  First, the version test in getEvtTriggers checks against
90200 instead of 90300.  Second, when running against a too-old server
version, this is going to try to execute an empty query and then parse
the results.  I think you want to restructure this code so that it
just plain old returns if the server is too old.  See for example
getForeignDataWrappers.

3. The way you're generating evtfname is unsafe if either the schema
or the function name contains SQL characters.  I think this should
probably be casting the function OID to regproc in lieu of rolling its
own formatting code.  Also, the tags should probably be escaped using
quote_literal, just to be future-proof.

4. I think we should aim to generate all the SQL in upper case.  Right
now "CREATE EVENT TRIGGER" and "EXECUTE PROCEDURE" are in upper case
but "when tag in" is in lower case.

In psql, I think we should similarly have listEventTriggers() rather
than listEvtTriggers(); here as in pg_dump I think you should cast the
evtfoid to regproc to get schema-qualification and escaping, in lieu
of the explicit join.

>> In terms of the schema itself, I think we are almost there, but:
>>
>> - I noticed while playing around with this that pg_dump emits a
>> completely empty owner field when dumping an event trigger.  At first
>> I thought that was just an oversight, but then I realized there's a
>> deeper reason for it: pg_event_trigger doesn't have an owner field.  I
>> think it should.  The only other objects in the system that don't have
>> owners are text search parsers and text search templates (and casts,
>> sort of).  It might seem redundant to have an owner even when
>> event-triggers are superuser-only, but we might want to try to relax
>> that restriction later.  Note that foreign data wrappers, which are
>> also superuser-create-only, do have an owner.  (Note that if we give
>> event triggers an owner, then we also need ALTER .. OWNER TO support
>> for them.)
>
> Damn, I had it on my TODO and Álvaro hinted me already, and I kept
> forgetting about it nonetheless. Fixed now.

evtowner seems to have missed the documentation bus.

With respect to the documentation, keep in mind that the synopsis is
going to show up in the command line help for \h.  I'm thinking that
the full list of command tags is too long for that, and that we should
instead rearrange the page so that the list of supported commands is
outside the synopsis.  The synposis is also somewhat misleading, I
think, in that "variable in (tag, ...)" conveys the idea that no
matter what the variable is, the items in parentheses will surely be
tags.  I suggest that we say something like "filter_variable in
(filter_value, ...)" and then document in the text that
filter_variable must currently always be TAG, and that the legal
values for filter_value are dependent on the choice of
filter_variable, and the legal choices for TAG are those listed in the
following table: <splat>.

The documentation contains the following claim with which I'm
extremely uncomfortable:

+     <para>
+      Triggers on <literal>ANY</literal> command support more commands than
+      just this list, and will only provide the <literal>command
+      tag</literal> argument as <literal>NOT NULL</literal>. Supporting more
+      commands is made so that you can actually block <xref linkend="ddl">
+      commands in one go.
+     </para>

A minor issue is that there's no notion of ANY any more; it's just a
consequence of leaving out the WHEN clause.  The bigger issue is that
I can't see any reason to do it that way.  Surely if we're firing the
trigger at all, we can arrange to have the command tag properly filled
in so that we can filter by it.

This might be a crazy idea, but... it seems like it would be awfully
sweet if we could find a way to avoid having to translate between node
tags (i.e. constants beginning with T_* that are used to identify the
type of statement that is executing) and event tags (i.e. constants
beginning with E_* that are used to identify the type of statement
that is executing).  Now obviously this is not quite possible because
in some cases the choice of E_* constant depends on not only the node
tag but also the type of object being operated on.  However, it seems
like this is a surmountable obstacle: since we no longer need to store
the E_* constants in a system catalog, they don't really need to be
integers.  For example, we could define something like this:

typedef struct
{
    NodeTag nodetag;
    ObjectType objecttype;
} NodeTagWithObjectType;

...and set the objecttype to a new OBJECT_DUMMY value or somesuch when
the NodeTag is such that the ObjectType isn't relevant.  If we did
that, then all the E_* constants could go away, and InitEventContext()
would become a lot simpler and, presumably, faster.  It would just
need to check whether it's got one of the node tags that needs the
object-type filled in.  If so, it does that; if not, it just sets the
nodetag field to the statement's node-tag and the objecttype to
OBJECT_DUMMY, and it's done.  Well, OK, it's probably not quite that
simple... but it still seems like we'd need explicit handling of a
smaller number of cases than presently.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

Re: Event Triggers reduced, v1

From
Dimitri Fontaine
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> Attached is a incremental patch with a bunch of minor cleanups,
> including reverts of a few spurious white space changes.  Could you
> merge this into your version?

Thank you very much for that, yes it's included now. So you have 3
attachments here, the whole new patch revision (v1.7), the incremental
patch to go from 1.6 to 1.7 and the incremental patch that should apply
cleanly on top of your cleanups.

> 1. Can we spell out EvtTriggerInfo, getEvtTriggers, and dumpEvtTrigger
> as EventTriggerInfo, getEventTriggers, and dumpEventTrigger?

Done.

> 2. I don't think that this code properly handles older server
> versions.

Fixed.

> 3. The way you're generating evtfname is unsafe if either the schema

Fixed using regproc cast, both in pg_dump and in psql.

> 4. I think we should aim to generate all the SQL in upper case.  Right
> now "CREATE EVENT TRIGGER" and "EXECUTE PROCEDURE" are in upper case
> but "when tag in" is in lower case.

Oh, sure, done.

> In psql, I think we should similarly have listEventTriggers() rather
> than listEvtTriggers(); here as in pg_dump I think you should cast the
> evtfoid to regproc to get schema-qualification and escaping, in lieu
> of the explicit join.

Done.

> evtowner seems to have missed the documentation bus.

I'm told it finally did catch it.

> With respect to the documentation, keep in mind that the synopsis is
> going to show up in the command line help for \h.  I'm thinking that
> the full list of command tags is too long for that, and that we should
> instead rearrange the page so that the list of supported commands is
> outside the synopsis.  The synposis is also somewhat misleading, I
> think, in that "variable in (tag, ...)" conveys the idea that no
> matter what the variable is, the items in parentheses will surely be
> tags.  I suggest that we say something like "filter_variable in
> (filter_value, ...)" and then document in the text that
> filter_variable must currently always be TAG, and that the legal
> values for filter_value are dependent on the choice of
> filter_variable, and the legal choices for TAG are those listed in the
> following table: <splat>.

I tried to arrange something here, I'm not quite sure about the current
result but it's already much better than the previous version.
Articulating ideas that way also allows to begin that command / events
support matrix, as you can see in the attached.

> The documentation contains the following claim with which I'm
> extremely uncomfortable:

[…ANY command set…]

It's a vestige from a long time ago, I removed that and some other
verbiage now.

> I can't see any reason to do it that way.  Surely if we're firing the
> trigger at all, we can arrange to have the command tag properly filled
> in so that we can filter by it.

Indeed, command tag is always available. The magic trigger variables
might not all be, though, that was what this text was alluding to, but
that case is already covered in the specific PL docs.

The part that we will certainly have to re-install later is when we add
new event "integration points" that we can only implement in some of the
supported commands, not all of them. Think about command_end and CLUSTER
or other commands managing the transaction themselves (concurrently).

But that's not at all relevant to what's in this reduced v1 patch.

> This might be a crazy idea, but... it seems like it would be awfully
> sweet if we could find a way to avoid having to translate between node
> tags (i.e. constants beginning with T_* that are used to identify the
> type of statement that is executing) and event tags (i.e. constants
> beginning with E_* that are used to identify the type of statement
> that is executing).  Now obviously this is not quite possible because
> in some cases the choice of E_* constant depends on not only the node
> tag but also the type of object being operated on.  However, it seems
> like this is a surmountable obstacle: since we no longer need to store
> the E_* constants in a system catalog, they don't really need to be
> integers.  For example, we could define something like this:

All of that happens in InitEventContext(). We can see in there that we
wouldn't gain much, the great majority of this code is dealing with
cases where we have no symmetry at all. Also I don't think that the
nodeTag macro is expensive, nor is a 2-levels nested switch statement.

So while I would enjoy seeing that part simplified, I don't think your
angle of attack would provide us much progress here…

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Attachment

Re: Event Triggers reduced, v1

From
Robert Haas
Date:
On Fri, Jul 6, 2012 at 7:21 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Attached is a incremental patch with a bunch of minor cleanups,
>> including reverts of a few spurious white space changes.  Could you
>> merge this into your version?
>
> Thank you very much for that, yes it's included now. So you have 3
> attachments here, the whole new patch revision (v1.7), the incremental
> patch to go from 1.6 to 1.7 and the incremental patch that should apply
> cleanly on top of your cleanups.

Here is an incremental documentation patch which I hope you will like.
 I made the event triggers stuff its own chapter rather than trying to
fold it in under triggers, and added some more detail.  It's already
quite a bit of extra stuff, and it's only going to become more as we
expand this feature, so I think a separate chapter is appropriate.  I
moved a bunch of the details that were under CREATE EVENT TRIGGER into
this new chapter, which I think is a better location, and reformatted
the matrix somewhat.  I think as we add more firing points it will be
clearer and easier to read if we have all the commands arranged in
columns rather than listing a bunch of firing points on each line.  I
also made a bunch of minor edits to improve readability and improve
the English (which wasn't bad, but I touched it up a bit); and I tried
to add some extra detail here and there (some of it recycled from
previous patch versions).  Assuming this all seems reasonably
agreeable, can you merge it on your side?

This took the last several hours, so I haven't looked at your latest
code changes yet.   However, in the course of editing the
documentation, it occurred to me that we seem to be fairly arbitrarily
excluding a large number of commands from the event trigger mechanism.
 For example, GRANT and REVOKE.  In earlier patches, we needed
specific changes for every command, so there was some reason not to
try to support everything right out of the gate.  But ISTM that the
argument for this is much less now; presumably it's just a few extra
lines of code per command, so maybe we ought to go ahead and try to
make this as complete as possible.  I attempt to explain in the
attached patch the reasons why we don't support certain classes of
commands, but I can't come up with any explanation for supporting
GRANT and REVOKE that doesn't fall flat.  I can't even really see a
reason not to support things like LISTEN and NOTIFY, and it would
certainly be more consistent with the notion of a command_start
trigger to support as many commands as we can.

I had an interesting experience while testing this patch.  I
accidentally redefined my event trigger function to something which
errored out.  That of course precluded me from using CREATE OR REPLACE
FUNCTION to fix it.  This makes me feel rather glad that we decided to
exclude CREATE/ALTER/DROP EVENT TRIGGER from the event trigger
mechanism, else recovery would have had to involve system catalog
hackery.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

Re: Event Triggers reduced, v1

From
Dimitri Fontaine
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> Here is an incremental documentation patch which I hope you will like.

Definitely, it's better this way. I'm not thrilled with separating it
into its own top level chapter, but I can see how it makes sense indeed.

This part is strange though:

+     A trigger definition can also specify a <literal>WHEN</literal>
+     condition so that, for example, a <literal>command_start</literal>
+     tag can be fired only for particular commands which the user wishes
+     to intercept.  A common use of such triggers is to restrict the range of
+     DDL operations which users may perform.

I don't think of that as firing a command tag, so it's hard for me to
parse that sentence.

> the matrix somewhat.  I think as we add more firing points it will be
> clearer and easier to read if we have all the commands arranged in
> columns rather than listing a bunch of firing points on each line.  I

+1

> also made a bunch of minor edits to improve readability and improve
> the English (which wasn't bad, but I touched it up a bit); and I tried
> to add some extra detail here and there (some of it recycled from
> previous patch versions).  Assuming this all seems reasonably
> agreeable, can you merge it on your side?

Done, thanks !

> This took the last several hours, so I haven't looked at your latest
> code changes yet.   However, in the course of editing the
> documentation, it occurred to me that we seem to be fairly arbitrarily
> excluding a large number of commands from the event trigger mechanism.

As many as that? I'm surprised about the quantity. Yes I did not add all
and any command we have, on purpose, and I agree that the new turn of
things allow us to add a new set.

>  For example, GRANT and REVOKE.  In earlier patches, we needed
> specific changes for every command, so there was some reason not to
> try to support everything right out of the gate.  But ISTM that the
> argument for this is much less now; presumably it's just a few extra
> lines of code per command, so maybe we ought to go ahead and try to
> make this as complete as possible.  I attempt to explain in the

Will do soon™.

> attached patch the reasons why we don't support certain classes of
> commands, but I can't come up with any explanation for supporting
> GRANT and REVOKE that doesn't fall flat.  I can't even really see a
> reason not to support things like LISTEN and NOTIFY, and it would
> certainly be more consistent with the notion of a command_start
> trigger to support as many commands as we can.

I would think that NOTIFY is on a fast track not to be disturbed by
calling into used defined code, and that would explain why we don't
support event triggers here.

> I had an interesting experience while testing this patch.  I
> accidentally redefined my event trigger function to something which
> errored out.  That of course precluded me from using CREATE OR REPLACE
> FUNCTION to fix it.  This makes me feel rather glad that we decided to
> exclude CREATE/ALTER/DROP EVENT TRIGGER from the event trigger
> mechanism, else recovery would have had to involve system catalog
> hackery.

Yeah, we have some places were it's not very hard to shoot oneself in
the foot, here the resulting hole is a little too big and offers no real
benefits. Event triggers on create|alter|drop event triggers, really?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Event Triggers reduced, v1

From
Robert Haas
Date:
On Fri, Jul 6, 2012 at 12:00 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Jul 6, 2012 at 7:21 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> Attached is a incremental patch with a bunch of minor cleanups,
>>> including reverts of a few spurious white space changes.  Could you
>>> merge this into your version?
>>
>> Thank you very much for that, yes it's included now. So you have 3
>> attachments here, the whole new patch revision (v1.7), the incremental
>> patch to go from 1.6 to 1.7 and the incremental patch that should apply
>> cleanly on top of your cleanups.
>
> Here is an incremental documentation patch which I hope you will like.

And here is another incremental patch, this one doing some more
cleanup.  Some of this is cosmetic, but it also:

- Fixes the new event_trigger type so that it passes the type sanity
test, instead of adding the failure as expected output.
- Fixes DROP EVENT TRIGGER IF EXISTS on a non-existent trigger.
- Fleshes out the ownership handling so that it's more similar to what
we do for other types of objects.

I'm feeling pretty good about this at this point, although I think
there is still some more work to do before we call it done and go
home.

I have a large remaining maintainability concern about the way we're
mapping back and forth between node tags, event tags, and command
tags.  Right now we've got parse_event_tag, which parses something
like 'ALTER AGGREGATE' into E_AlterAggregate; and then we've got
command_to_string, which turns E_AlterAggregate back into 'ALTER
AGGREGATE', and then we've got InitEventContext(), which turns
T_RenameStmt or T_AlterObjectSchemaStmt with OBJECT_AGGREGATE into
E_AlterAggregate.  I can't easily verify that all three of these
things are consistent with each other, and even if they are right now
I estimate the chances of that remaining true as other people patch
the code as near-zero.  You didn't like my last proposal for dealing
with this, which is fine: it might not have been the best way of
dealing with it.  But I think we have to figure out something better
than what we've got now, or this is almost guaranteed to get broken.
If you don't have a brilliant idea I'll hack on it and see what I can
come up with.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

Re: Event Triggers reduced, v1

From
Dimitri Fontaine
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> And here is another incremental patch, this one doing some more
> cleanup.  Some of this is cosmetic, but it also:

Thanks, applied in my github repository!

> I'm feeling pretty good about this at this point, although I think
> there is still some more work to do before we call it done and go
> home.

Nice reading that :)

> I have a large remaining maintainability concern about the way we're
> mapping back and forth between node tags, event tags, and command
> tags.  Right now we've got parse_event_tag, which parses something
[…valid concern…]
> If you don't have a brilliant idea I'll hack on it and see what I can
> come up with.

I think we might be able to install a static array for the setup where
we would find the different elements, and then code up some procedures
doing different kind of look ups in that array.

> like 'ALTER AGGREGATE' into E_AlterAggregate; and then we've got
> command_to_string, which turns E_AlterAggregate back into 'ALTER
> AGGREGATE', and then we've got InitEventContext(), which turns
> T_RenameStmt or T_AlterObjectSchemaStmt with OBJECT_AGGREGATE into
> E_AlterAggregate.  I can't easily verify that all three of these

{ E_AlterAggregate,             // TrigEventCommand "ALTER AGGREGATE",            // command tag T_RenameStmt,
      // nodeTag -1                            // object type 
},
{ E_AlterAggregate, "ALTER AGGREGATE", T_AlterObjectSchemaStmt, OBJECT_AGGREGATE
}

The problem is coming up with a way of writing the code that does not
incur a full array scan for each step of parsing or rewriting. And I
don't see that it merits yet another cache. Given the existing event
trigger cache it might be that we don't care about having a full scan of
this table twice per event trigger related commands, as I don't think it
would happen when executing other DDLs.

Scratch that we need to parse command tags when we build the event
cache, so scanning the full array each time would make that O(n²) and we
want to avoid that. So we could install the contents of the array in
another hash table in BuildEventTriggerCache() then use that to lookup
the TrigEventCommand from the command tag…

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Event Triggers reduced, v1

From
Robert Haas
Date:
On Fri, Jul 6, 2012 at 3:29 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
>> Here is an incremental documentation patch which I hope you will like.
>
> Definitely, it's better this way. I'm not thrilled with separating it
> into its own top level chapter, but I can see how it makes sense indeed.

Oh, really?  I thought that was a huge readability improvement.  There
are some things that are the same between triggers and event triggers,
but there's an awful lotta stuff that is completely different.

> This part is strange though:
>
> +     A trigger definition can also specify a <literal>WHEN</literal>
> +     condition so that, for example, a <literal>command_start</literal>
> +     tag can be fired only for particular commands which the user wishes
> +     to intercept.  A common use of such triggers is to restrict the range of
> +     DDL operations which users may perform.
>
> I don't think of that as firing a command tag, so it's hard for me to
> parse that sentence.

Oh, that should say "a command_start trigger" rather than "a
command_start tag".  Good catch.

>> This took the last several hours, so I haven't looked at your latest
>> code changes yet.   However, in the course of editing the
>> documentation, it occurred to me that we seem to be fairly arbitrarily
>> excluding a large number of commands from the event trigger mechanism.
>
> As many as that? I'm surprised about the quantity. Yes I did not add all
> and any command we have, on purpose, and I agree that the new turn of
> things allow us to add a new set.

I admit I didn't count them up.  :-)  Maybe there aren't that many.

I think we might want to extend the support matrix to include every
command that appears in sql-commands.html and have either an X if it's
supported or blank if it's not.  That would make it easier to judge
how many commands are not supported, not just for us but for users who
may be trying to answer the same questions we are.

> I would think that NOTIFY is on a fast track not to be disturbed by
> calling into used defined code, and that would explain why we don't
> support event triggers here.

If the DBA is allowed to restrict CREATE FUNCTION, why not NOTIFY?  I
guess I don't see why that one's deserving of special treatment.

Mind you, if I ran the world, this would probably be broken up
differently: I'd have ddl_command_start covering all the
CREATE/ALTER/DROP commands and nothing else; and separate firing
points for anything else I wanted to support.  It's not too late to
make that change, hint, hint.  But if we're not gonna do that then I
think that we'd better try to cast the net as broadly as reasonably
possible.  It seems to me that our excuse for not including things
like UPDATE and DELETE is a bit thin; surely there are people who
would like a sort of universal trigger that applies to every relation
in the system.  Of course there are recursion problems there that need
to be thought long hard about, and no I don't really want to go there
right now, but I'll bet you a nickle that someone is going to ask why
it doesn't work that way.

Another advantage to recasting this as ddl_command_start is that we
quite easily pass the operation (CREATE, ALTER, DROP) and the named
object type (TABLE, FUNCTION, CAST) as separate arguments.  I think
that would be a usability improvement, since it would then be dead
easy to write an event trigger that prohibits DROP (and only DROP) of
any sort.  Of course it's not that hard to do it right now, but you
have to parse the command tag.  It would likely simplify the code for
mapping between node and command tags, too.

One other thought: if we're NOT going to do what I suggested above,
then how about renaming TG_WHEN to TG_EVENT?  Seems like that would
fit better.

Also: now that the E_WhatEver constants don't get stored on disk, I
don't think they should live in pg_event_trigger.h any more; can we
move them to someplace more appropriate?  Possibly make them private
to event_trigger.c?  And, on a related note, I don't think it's a good
idea to use E_ as a prefix for both the event types and the command
tags.  It's too short, and hard to grep for, and we don't want the
same one for both, I think.  How above things like EVT_CommandStart
for the events and ECT_CreateAggregate for the command tags?

>> I had an interesting experience while testing this patch.  I
>> accidentally redefined my event trigger function to something which
>> errored out.  That of course precluded me from using CREATE OR REPLACE
>> FUNCTION to fix it.  This makes me feel rather glad that we decided to
>> exclude CREATE/ALTER/DROP EVENT TRIGGER from the event trigger
>> mechanism, else recovery would have had to involve system catalog
>> hackery.
>
> Yeah, we have some places were it's not very hard to shoot oneself in
> the foot, here the resulting hole is a little too big and offers no real
> benefits. Event triggers on create|alter|drop event triggers, really?

Indeed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Event Triggers reduced, v1

From
Robert Haas
Date:
On Fri, Jul 6, 2012 at 4:00 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> And here is another incremental patch, this one doing some more
>> cleanup.  Some of this is cosmetic, but it also:
>
> Thanks, applied in my github repository!

Thanks.

>> I have a large remaining maintainability concern about the way we're
>> mapping back and forth between node tags, event tags, and command
>> tags.  Right now we've got parse_event_tag, which parses something
> […valid concern…]
>> If you don't have a brilliant idea I'll hack on it and see what I can
>> come up with.
>
> I think we might be able to install a static array for the setup where
> we would find the different elements, and then code up some procedures
> doing different kind of look ups in that array.

+1.

>> like 'ALTER AGGREGATE' into E_AlterAggregate; and then we've got
>> command_to_string, which turns E_AlterAggregate back into 'ALTER
>> AGGREGATE', and then we've got InitEventContext(), which turns
>> T_RenameStmt or T_AlterObjectSchemaStmt with OBJECT_AGGREGATE into
>> E_AlterAggregate.  I can't easily verify that all three of these
>
> {
>   E_AlterAggregate,             // TrigEventCommand
>   "ALTER AGGREGATE",            // command tag
>   T_RenameStmt,                 // nodeTag
>   -1                            // object type
> },
> {
>   E_AlterAggregate,
>   "ALTER AGGREGATE",
>   T_AlterObjectSchemaStmt,
>   OBJECT_AGGREGATE
> }
>
> The problem is coming up with a way of writing the code that does not
> incur a full array scan for each step of parsing or rewriting. And I
> don't see that it merits yet another cache. Given the existing event
> trigger cache it might be that we don't care about having a full scan of
> this table twice per event trigger related commands, as I don't think it
> would happen when executing other DDLs.
>
> Scratch that we need to parse command tags when we build the event
> cache, so scanning the full array each time would make that O(n²) and we
> want to avoid that. So we could install the contents of the array in
> another hash table in BuildEventTriggerCache() then use that to lookup
> the TrigEventCommand from the command tag…

Ugh.  Yeah, obviously the most important thing I think is that
InitEventContext() needs to be lightning-fast, but we don't want
BuildEventTriggerCache() to be pathologically slow either.

I think the best thing to do with InitEventContext() might be to get
rid of it.  It's just a big switch over node tags, and we've already
got one of those in standard_ProcessUtility.  Maybe every case that
already exists in that function should either (a) get a comment of the
form /* does not support event triggers */ or (b) get a call of the
form EventTriggerStartup(&evt, parsetree, E_WhateverCommandThisIs).
EventTriggerStartup() could call InitEventContext() and then if
CommandFiresTriggersForEvent(..., E_CommandStart) it could also call
ExecEventTriggers().  This might seem like it's just moving the wood
around, but if someone adds a new case in standard_ProcessUtility,
they're going to model it on one of the existing cases, which greatly
decreases the likelihood that they're going to screw it up.  And if
they do screw it up it will be obviously non-parallel to the rest of
what's there, so somebody can notice and fix it.  As a side benefit,
this would probably be faster than having two separate switches that
are executed more or less consecutively.

Now that leaves the question of how to translate between
E_AlterAggregate and "ALTER AGGREGATE"; I think your idea of a hash
table (or two?) might be the most practical option.  We'd only need to
build the hash table(s) if an index-scan of pg_event_trigger finds it
non-empty, and then only once per session.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Event Triggers reduced, v1

From
Dimitri Fontaine
Date:
Thom Brown <thom@linux.com> writes:
> I also attach various typo/grammar fixes.

In fact Robert's cleanup of the docs make that patch of yours not apply
anymore, and I think a part of it is maybe already fixed. Do you have
time to look at this with the new v1.8 patch that you will receive in a
minute, or with the github branch if you're tracking that?

Sorry about that.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Event Triggers reduced, v1

From
Dimitri Fontaine
Date:
Hi,

Please find attached a new revision of the patch, v1.8, and an
incremental diff from the previous one, that includes the patches you
sent me.

Robert Haas <robertmhaas@gmail.com> writes:
> Oh, really?  I thought that was a huge readability improvement.  There
> are some things that are the same between triggers and event triggers,
> but there's an awful lotta stuff that is completely different.

True.

> Oh, that should say "a command_start trigger" rather than "a
> command_start tag".  Good catch.

Fixed in the attached.

> I think we might want to extend the support matrix to include every
> command that appears in sql-commands.html and have either an X if it's
> supported or blank if it's not.  That would make it easier to judge
> how many commands are not supported, not just for us but for users who
> may be trying to answer the same questions we are.

Yes, not done yet, see below.

> Mind you, if I ran the world, this would probably be broken up
> differently: I'd have ddl_command_start covering all the
> CREATE/ALTER/DROP commands and nothing else; and separate firing
> points for anything else I wanted to support.  It's not too late to
> make that change, hint, hint.  But if we're not gonna do that then I

Let's see about doing that. I guess we would have ddl_command_start and
command_start, and I would think that the later is the most generic. So
we would certainly want DDL commands to run first the command_start
event triggers then the ddl_command_start event triggers, whereas a
NOTIFY would only run command_start triggers, and a GRANT command would
run maybe command_start then dcl_command_start triggers?

If that's where we're going to, we can commit as-is and expand later.

> think that we'd better try to cast the net as broadly as reasonably
> possible.  It seems to me that our excuse for not including things
> like UPDATE and DELETE is a bit thin; surely there are people who
> would like a sort of universal trigger that applies to every relation
> in the system.  Of course there are recursion problems there that need
> to be thought long hard about, and no I don't really want to go there
> right now, but I'll bet you a nickle that someone is going to ask why
> it doesn't work that way.

The current reason why we only support 149 SQL commands and variations
is because we want a patch that's easy enough to review and agree on. So
I think we will in the future be able to add new firing point at places
where maybe some discussion is needed.

Such places, in my mind, include the NOTIFY mechanism, DCLs, and global
objects such as databases and tablespaces and roles. I'd be happy to see
event triggers embrace support for those. Maybe in v2 though?

> Another advantage to recasting this as ddl_command_start is that we
> quite easily pass the operation (CREATE, ALTER, DROP) and the named
> object type (TABLE, FUNCTION, CAST) as separate arguments.  I think

That's a good idea. I don't think we should replace the current tag
support with that though, because some commands are harder to stow into
the operation and type model (in supported commands, mainly LOAD).

So I've included partial support for that in the attached patch, in the
simplest way possible, just so that we can see where it leads in term of
using the feature. The next step here is to actually go in each branch
of the process utility switch and manually decorate the command context
with the current operation and objecttype when relevant.

> One other thought: if we're NOT going to do what I suggested above,
> then how about renaming TG_WHEN to TG_EVENT?  Seems like that would
> fit better.

That seems like the right thing to do in either case, done.

> Also: now that the E_WhatEver constants don't get stored on disk, I
> don't think they should live in pg_event_trigger.h any more; can we
> move them to someplace more appropriate?  Possibly make them private
> to event_trigger.c?  And, on a related note, I don't think it's a good

Done, they now live in evtcache.h, where they belong.

> idea to use E_ as a prefix for both the event types and the command
> tags.  It's too short, and hard to grep for, and we don't want the
> same one for both, I think.  How above things like EVT_CommandStart
> for the events and ECT_CreateAggregate for the command tags?

Done this way.

Robert Haas <robertmhaas@gmail.com> writes:
>> I think we might be able to install a static array for the setup where
>> we would find the different elements, and then code up some procedures
>> doing different kind of look ups in that array.
>
> +1.

Done in the attached. Filling that array was… an interesting use case
for Emacs Keyboard Macros spanning 3 different buffers, maintaining it
should be easy enough now.

> Ugh.  Yeah, obviously the most important thing I think is that
> InitEventContext() needs to be lightning-fast, but we don't want
> BuildEventTriggerCache() to be pathologically slow either.

So I've built two new hash tables so that we can either search internal
command enum number by command tag or by parse tree nodeTag and object
type. The hash table is only built when first needed, I didn't add any
trick to only build it when the pg_event_trigger table is empty.

> I think the best thing to do with InitEventContext() might be to get
> rid of it.  It's just a big switch over node tags, and we've already
> got one of those in standard_ProcessUtility.  Maybe every case that

Given that second hash table (EventTriggerCommandNodeCache) we can now
move that look-up to later in the course of events. I kept
InitEventContext() as the place where to stuff default values in the
EventContext structure, though.

Also, as we're optimizing things, there's no longer any call sites to
the function CommandFiresTriggersForEvent() in the attached patch.
That's because this function main use case is allowing the specific
command support code to avoid some possibly costly setup when there's in
fact no trigger to fire. That's never going to be the case with
command_start triggers as they have no specific variable support.

A typical integration now looks like this:

    case T_CreateCastStmt:
        ExecEventTriggers(&evt, EVT_CommandStart);
        CreateCast((CreateCastStmt *) parsetree);
        break;

> Now that leaves the question of how to translate between
> E_AlterAggregate and "ALTER AGGREGATE"; I think your idea of a hash
> table (or two?) might be the most practical option.  We'd only need to
> build the hash table(s) if an index-scan of pg_event_trigger finds it
> non-empty, and then only once per session.

As said earlier, I implemented that without the non-empty trick.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Attachment

Re: Event Triggers reduced, v1

From
Robert Haas
Date:
On Tue, Jul 10, 2012 at 10:38 AM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:
>> Mind you, if I ran the world, this would probably be broken up
>> differently: I'd have ddl_command_start covering all the
>> CREATE/ALTER/DROP commands and nothing else; and separate firing
>> points for anything else I wanted to support.  It's not too late to
>> make that change, hint, hint.  But if we're not gonna do that then I
>
> Let's see about doing that. I guess we would have ddl_command_start and
> command_start, and I would think that the later is the most generic. So
> we would certainly want DDL commands to run first the command_start
> event triggers then the ddl_command_start event triggers, whereas a
> NOTIFY would only run command_start triggers, and a GRANT command would
> run maybe command_start then dcl_command_start triggers?
>
> If that's where we're going to, we can commit as-is and expand later.

That's not quite what I was thinking.  I actually can't imagine any
situation where you want an event trigger that gets fired on EVERY
command for which we can support command_start.  If you're trying to
prevent or replicate DDL, that's too much.  If you're trying to do
logging or auditing, it's not enough, since there will still be
commands that aren't supported, and it'll be grossly inefficient to
boot.  You really want something like log_min_duration_statement=0 for
those cases.  So it seems to me that the use case for a command_start
trigger, conceived in the broadest possible way so that every single
command we can support is included, is razor-thin.

So my proposal for the present patch would be:

1. Rename command_start to ddl_command_start.
2. Remove support for everything other than CREATE, ALTER, and DROP.
3. Pass the operation and the SQL object type as separate magic variables.

Then we can add dcl_command_start, etc. in follow-on patches.

>> think that we'd better try to cast the net as broadly as reasonably
>> possible.  It seems to me that our excuse for not including things
>> like UPDATE and DELETE is a bit thin; surely there are people who
>> would like a sort of universal trigger that applies to every relation
>> in the system.  Of course there are recursion problems there that need
>> to be thought long hard about, and no I don't really want to go there
>> right now, but I'll bet you a nickle that someone is going to ask why
>> it doesn't work that way.
>
> The current reason why we only support 149 SQL commands and variations
> is because we want a patch that's easy enough to review and agree on. So
> I think we will in the future be able to add new firing point at places
> where maybe some discussion is needed.

Agreed.

> Such places, in my mind, include the NOTIFY mechanism, DCLs, and global
> objects such as databases and tablespaces and roles. I'd be happy to see
> event triggers embrace support for those. Maybe in v2 though?

Yep, sure.  Note that the proposal above constrains the list of
commands we support in v1 in a very principled way: CREATE, ALTER,
DROP.  Everything else can be added later under a different (but
similarly situated) firing point name.  If we stick with command_start
then I think we're going to be forever justifying our decisions as to
what got included or excluded; which might be worth it if it seemed
likely that there'd be much use for such a command trigger, but it
doesn't (to me, anyway).

>> Another advantage to recasting this as ddl_command_start is that we
>> quite easily pass the operation (CREATE, ALTER, DROP) and the named
>> object type (TABLE, FUNCTION, CAST) as separate arguments.  I think
>
> That's a good idea. I don't think we should replace the current tag
> support with that though, because some commands are harder to stow into
> the operation and type model (in supported commands, mainly LOAD).

I'm imagining that ddl_command_start triggers would get the
information this way, but LOAD might be covered by something like
admin_command_start that just gets the command tag.

> So I've included partial support for that in the attached patch, in the
> simplest way possible, just so that we can see where it leads in term of
> using the feature. The next step here is to actually go in each branch
> of the process utility switch and manually decorate the command context
> with the current operation and objecttype when relevant.
>
[...]
> Done in the attached. Filling that array was… an interesting use case
> for Emacs Keyboard Macros spanning 3 different buffers, maintaining it
> should be easy enough now.

Yep, looks better.  It looks like you've got
EventTriggerCommandTagsEntry mapping the command tag to an ETC_*
constant; I think the need for that hash goes away entirely if you
just pass this information down from the ProcessUtility() switch.  At
any rate having NameData involved seems like it's probably not too
good an idea; if for some reason we need to keep that hash, use a
NUL-terminated string and initialize the hash table with string_hash
instead of tag_hash.  That'll be simpler and also allows the length of
the buffer to vary independently of NAMEDATALEN, which can only be to
the good.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Event Triggers reduced, v1

From
Dimitri Fontaine
Date:
Hi,

Robert Haas <robertmhaas@gmail.com> writes:
> That's not quite what I was thinking.  I actually can't imagine any
> situation where you want an event trigger that gets fired on EVERY
> command for which we can support command_start.  If you're trying to

I don't have any solid use case in mind, just though it would make the
feature rather complete. Withdrawn.

> So my proposal for the present patch would be:
>
> 1. Rename command_start to ddl_command_start.
> 2. Remove support for everything other than CREATE, ALTER, and DROP.
> 3. Pass the operation and the SQL object type as separate magic variables.

All done in the attached. As usual, you get an incremental patch and a
complete one. It's also published on github for easy browsing:

  https://github.com/dimitri/postgres/compare/33f285e67f...ba42279924

> Yep, sure.  Note that the proposal above constrains the list of
> commands we support in v1 in a very principled way: CREATE, ALTER,
> DROP.  Everything else can be added later under a different (but
> similarly situated) firing point name.  If we stick with command_start
> then I think we're going to be forever justifying our decisions as to
> what got included or excluded; which might be worth it if it seemed
> likely that there'd be much use for such a command trigger, but it
> doesn't (to me, anyway).

Yeah, done that way. I had to remove support for only 4 commands…

> I'm imagining that ddl_command_start triggers would get the
> information this way, but LOAD might be covered by something like
> admin_command_start that just gets the command tag.

My point was that I didn't want to replace the command tag by the object
type and operation combo, happy to see we're on the same line.

> EventTriggerCommandTagsEntry mapping the command tag to an ETC_*
> constant; I think the need for that hash goes away entirely if you
> just pass this information down from the ProcessUtility() switch.  At

Well, no, because we use that hash mapping in BuildEventTriggerCache(),
when reading from the catalogs. I don't see a way to cut down on the
number of per-session hash-table here without involving a penalty.

> any rate having NameData involved seems like it's probably not too
> good an idea; if for some reason we need to keep that hash, use a
> NUL-terminated string and initialize the hash table with string_hash
> instead of tag_hash.  That'll be simpler and also allows the length of
> the buffer to vary independently of NAMEDATALEN, which can only be to
> the good.

Oh, I just failed to realize that the hash table key mustn't be a static
length field. I'm done for tonight though, it's still something to fix.
Maybe you will beat me to it? :) (if not, I'll be happy to, with some
luck even tomorrow).

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Attachment

Re: Event Triggers reduced, v1

From
Dimitri Fontaine
Date:
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> All done in the attached. As usual, you get an incremental patch and a
> complete one. It's also published on github for easy browsing:
>
>   https://github.com/dimitri/postgres/compare/33f285e67f...ba42279924

Same as usual, even if that's a very little adjustment this time:

  https://github.com/dimitri/postgres/commit/5ec1ddb41489a89eea09e350f6ee39e726f9fb03

>> any rate having NameData involved seems like it's probably not too
>> good an idea; if for some reason we need to keep that hash, use a
>> NUL-terminated string and initialize the hash table with string_hash
>> instead of tag_hash.  That'll be simpler and also allows the length of
>> the buffer to vary independently of NAMEDATALEN, which can only be to
>> the good.

Fixed now, using a char[] as with PortalHashTable for example.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Attachment

Re: Event Triggers reduced, v1

From
Robert Haas
Date:
On Thu, Jul 12, 2012 at 8:50 AM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:
> [ new patch ]

Well, I think it's about time to start getting some of this code into
our tree.  However, I'm still not confident that the code that
actually runs the triggers is entirely solid, so I decided to rip that
stuff out and commit only the syntax support and documentation
portions of this patch.  We can add the actual trigger firing stuff
and PL language support in a subsequent commit or commits.  I made a
significant number of modifications.

First, I changed the representation of CreateEventTriggerStmt
considerably, so that we can more easily accommodate multiple filter
variables in future patches without having to whack the code around
too much.  I also disentangled the CreateEventTriggerStmt processing
from the event-cache machinery, because it doesn't seem like a good
idea  to have evtcache.c and event_trigger.c be deeply intertwined,
from a modularity point of view.  I think we might still need to make
a bit more adjustment to the organization of this code - perhaps the
code that is needed by both commands/event_triggers.c and
utils/cache/evtcache.c ought to be moved to something like
catalog/pg_event_trigger.c.  However, I haven't done that in this
commit.

Second, I made this work with COMMENT, SECURITY LABEL, and with the
extension mechanism, including updating the documentation.

Third, I also some other documentation updates to match recent changes
and also added the missing documentation for \dy.

Fourth, I rewrote the regression tests fairly extensively to make sure
we're adequately testing all of the syntax that this commit adds: not
only that it works when it's supposed to work, but also that it fails
when it's supposed to fail and emits a hopefully-good error message in
such cases.

And finally there were a bunch of miscellaneous code cleanups (some of
them fixing things that I muffed in earlier incremental patches that I
sent you to merge), and others that touch code you wrote.

I suspect there are probably still a few oversights here, but
hopefully not too many.

The next step here is obviously to complete the work necessary to
actually be able to fire these triggers, as opposed to just saying
that we fire them.  I'll write up my thoughts on that topic in a
separate email.  I don't think there's a ton of work left to be done
there, but more than zero.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Event Triggers reduced, v1

From
Robert Haas
Date:
On Wed, Jul 18, 2012 at 10:22 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> The next step here is obviously to complete the work necessary to
> actually be able to fire these triggers, as opposed to just saying
> that we fire them.  I'll write up my thoughts on that topic in a
> separate email.  I don't think there's a ton of work left to be done
> there, but more than zero.

I have committed code to do this.  It's basically similar to what you
had before, but I reworked the event cache machinery heavily.  I think
that the new organization will be easier to extent to meet future
needs, and it also gets rid of a lot of boilerplate code whose job was
to translate between different representations.  I also committed the
PL/pgsql support code, but not the code for the other PLs.  It should
be easy to rebase that work and resubmit it as a separate patch,
though, or maybe one patch per PL would be better.

Obviously, there's quite a bit more work that could be done here --
passing more context, add more firing points, etc. -- but now we've at
least got the basics.

As previously threatened, I amended this code so that triggers fire
once per DDL command.  So internally generated command nodes that are
used as an argument-passing mechanism do not fire triggers, for
example.  I believe this is the right decision because I think, as I
mentioned in another email to Tom yesterday, that generating and
passing around command tags is a really bad practice that we should be
looking to eliminate, not institutionalize.  It posed a major obstacle
to my 9.2-era efforts to clean up the behavior of our DDL under
concurrency, for example.

I think, however, that it would be useful to have an event trigger
that is defined to fire "every time a certain type of SQL object gets
created" rather than "every time a certain command gets executed".
That could complement, not replace, this mechanism.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Event Triggers reduced, v1

From
Thom Brown
Date:
On 20 July 2012 16:50, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Jul 18, 2012 at 10:22 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> The next step here is obviously to complete the work necessary to
>> actually be able to fire these triggers, as opposed to just saying
>> that we fire them.  I'll write up my thoughts on that topic in a
>> separate email.  I don't think there's a ton of work left to be done
>> there, but more than zero.
>
> I have committed code to do this.  It's basically similar to what you
> had before, but I reworked the event cache machinery heavily.  I think
> that the new organization will be easier to extent to meet future
> needs, and it also gets rid of a lot of boilerplate code whose job was
> to translate between different representations.  I also committed the
> PL/pgsql support code, but not the code for the other PLs.  It should
> be easy to rebase that work and resubmit it as a separate patch,
> though, or maybe one patch per PL would be better.
>
> Obviously, there's quite a bit more work that could be done here --
> passing more context, add more firing points, etc. -- but now we've at
> least got the basics.
>
> As previously threatened, I amended this code so that triggers fire
> once per DDL command.  So internally generated command nodes that are
> used as an argument-passing mechanism do not fire triggers, for
> example.  I believe this is the right decision because I think, as I
> mentioned in another email to Tom yesterday, that generating and
> passing around command tags is a really bad practice that we should be
> looking to eliminate, not institutionalize.  It posed a major obstacle
> to my 9.2-era efforts to clean up the behavior of our DDL under
> concurrency, for example.
>
> I think, however, that it would be useful to have an event trigger
> that is defined to fire "every time a certain type of SQL object gets
> created" rather than "every time a certain command gets executed".
> That could complement, not replace, this mechanism.

I've just run my own set of tests against these changes, which tests
every supported DDL command (with the exception of "CREATE USER
MAPPING", "ALTER USER MAPPING", "DROP USER MAPPING", "CREATE LANGUAGE"
and "DROP LANGUAGE"), and many variants of each command, and
everything behaves exactly as expected. :)

-- 
Thom


Re: Event Triggers reduced, v1

From
Tom Lane
Date:
The Windows buildfarm members don't seem too happy with the latest
patch.
        regards, tom lane


Re: Event Triggers reduced, v1

From
Robert Haas
Date:
On Fri, Jul 20, 2012 at 2:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The Windows buildfarm members don't seem too happy with the latest
> patch.

I'm looking at this now, but am so far mystified.  Something's
obviously broken as regards how the trigger flags get set up, but if
that were broken in a trivial way it would be broken everywhere, and
it's not.  Will keep searching...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Event Triggers reduced, v1

From
Robert Haas
Date:
On Fri, Jul 20, 2012 at 1:22 PM, Thom Brown <thom@linux.com> wrote:
> I've just run my own set of tests against these changes, which tests
> every supported DDL command (with the exception of "CREATE USER
> MAPPING", "ALTER USER MAPPING", "DROP USER MAPPING", "CREATE LANGUAGE"
> and "DROP LANGUAGE"), and many variants of each command, and
> everything behaves exactly as expected. :)

Nice!

But all of those commands you just mentioned ought to work, too.

The documentation probably needs some updating on that point, come to
think of it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Event Triggers reduced, v1

From
Robert Haas
Date:
On Fri, Jul 20, 2012 at 3:51 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Jul 20, 2012 at 2:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The Windows buildfarm members don't seem too happy with the latest
>> patch.
>
> I'm looking at this now, but am so far mystified.  Something's
> obviously broken as regards how the trigger flags get set up, but if
> that were broken in a trivial way it would be broken everywhere, and
> it's not.  Will keep searching...

I think this is fixed now.  Let me know if anyone sees evidence of
remaining breakage.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Event Triggers reduced, v1

From
Dimitri Fontaine
Date:
Hi,

I'm back to PostgreSQL development concerns after some distraction here.
First, thanks for pushing the patch to commit!

I've been reviewing your changes and here's a very small patch with some
details I would have spelled out differently. See what you think, I
mostly needed to edit some code to get back in shape :)

Coming next, catch-up with things I've missed and extending the included
support for event triggers in term of function parameters (rewritten
command string, object kind, etc), and maybe PL support too.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Attachment

Re: Event Triggers reduced, v1

From
Robert Haas
Date:
On Mon, Aug 27, 2012 at 7:52 AM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:
> I've been reviewing your changes and here's a very small patch with some
> details I would have spelled out differently. See what you think, I
> mostly needed to edit some code to get back in shape :)

I guess I don't particularly like either of these changes.  The first
one is mostly harmless, but I don't really see why it's any better,
and it does have the downside of traversing the string twice (once for
strlen and a second time in str_toupper) instead of just once.  It
also makes a line wider than 80 columns, which is a bit ugly.  In the
second hunk, the point is that we never have to do CreateCommandTag()
here at all unless either casserts are enabled or EventCacheLookup
returns a non-empty list.  That means that in a non-assert-enabled
build, we get to skip that work altogether in the presumably-common
case where there are no relevant event triggers.  Your proposed change
would avoid doing it twice when asserts are disabled, but the cost
would be that we'd have to do it once when asserts were disabled even
if no event triggers exist.  I don't think that's a good trade-off.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Event Triggers reduced, v1

From
Dimitri Fontaine
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> I guess I don't particularly like either of these changes.  The first

Fair enough.

> one is mostly harmless, but I don't really see why it's any better,
> and it does have the downside of traversing the string twice (once for
> strlen and a second time in str_toupper) instead of just once.  It
> also makes a line wider than 80 columns, which is a bit ugly.  In the

It's much easier to read, I think. The line is 79 columns here given the
project Emacs setup wrt tabs, see src/tools/editors/emacs.samples.

> second hunk, the point is that we never have to do CreateCommandTag()
> here at all unless either casserts are enabled or EventCacheLookup
> returns a non-empty list.  That means that in a non-assert-enabled
> build, we get to skip that work altogether in the presumably-common
> case where there are no relevant event triggers.  Your proposed change
> would avoid doing it twice when asserts are disabled, but the cost
> would be that we'd have to do it once when asserts were disabled even
> if no event triggers exist.  I don't think that's a good trade-off.

Well I needed more exercise before sending a patch then, I just missed
that.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support