Re: Event Triggers: adding information - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Event Triggers: adding information
Date
Msg-id CA+TgmoZjOACnHOKrz-eucj3FJXrYJzA-_yjSqajOQR8qkjCi=w@mail.gmail.com
Whole thread Raw
In response to Re: Event Triggers: adding information  (Dimitri Fontaine <dimitri@2ndQuadrant.fr>)
Responses Re: Event Triggers: adding information  (Dimitri Fontaine <dimitri@2ndQuadrant.fr>)
Re: Event Triggers: adding information  (Dimitri Fontaine <dimitri@2ndQuadrant.fr>)
List pgsql-hackers
On Fri, Dec 21, 2012 at 11:35 AM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:
> It's hard to devise exactly what kind of refactoring we need to do
> before trying to write a patch that benefits from it, in my experience.
> In some cases the refactoring will make things more complex, not less.

Perhaps.  I have a feeling there's some more simplification that can
be done here, somehow.

> That's a good idea. I only started quite late in that patch to work on
> the ObjectID piece of information, that's why I didn't split it before
> doing so.

That's fine.  I've committed that part.  I think the remainder of the
patch is really several different features that ought to be split
apart and considered independently.  We may want some but not others,
or some may be ready to go in but not others.

> Also, keep in mind we want the ObjectID in all CREATE, ALTER and DROP
> statements, so my current patch is still some bricks shy of a load… (I
> added ObjectID only in the commands I added rewrite support for, apart
> from DROP).

I shall rely on you to provide those bricks which are still missing.

>> change to gram.y that affects any of the supported commands will
>> require a compensating change to ddl_rewrite.c.  That is a whole lot
>> of work and it is almost guaranteed that future patch authors will
>> sometimes fail to do it correctly.   At an absolute bare minimum I
>> think we would need some kind of comprehensive test suite for this
>> functionality, as opposed to the less than 60 lines of new test cases
>> this patch adds which completely fail to test this code in any way at
>> all, but really I think we should just not have it.  It WILL get
>
> I had a more complete test suite last rounds and you did oppose to it
> for good reasons. I'm ok to revisit that now, and I think the test case
> should look like this:
>
>   - install an auditing command trigger that will insert any DDL command
>     string into a table with a sequence ordering
>
>   - select * from audit
>
>   - \d… of all objects created
>
>   - drop schema cascade
>
>   - DO $$ loop for sql in select command from audit do execute sql …
>
>   - \d… of all objects created again
>
> Then any whacking of the grammar should alter the output here and any
> case that's not supported will fail too.
>
> We might be able to have a better way of testing the feature here, but I
> tried to stay into the realms of what I know pg_regress able to do.

I was thinking that we might need to go beyond what pg_regress can do
in this case.  For example, one idea would be to install an audit
trigger that records all the DDL that happens during the regression
tests.  Then, afterwards, replay that DDL into a new database.  Then
do a schema-only dump of the old and new databases and diff the dump
files.  That's a little wacky by current community standards but FWIW
EDB has a bunch of internal tests that we run to check our proprietary
stuff; some of them work along these lines and it's pretty effective
at shaking out bugs.

In this particular case, it would significantly reduce the maintenance
burden of putting a proper test framework in place, because we can
hope that the regression tests create (and leave around) at least one
object of any given type, and that any new DDL features will be
accompanied by similar regression tests.  We already rely on the
regression database for pg_upgrade testing, so if it's not complete,
it impacts pg_upgrade test coverage as well.

>> broken (if it isn't already) and it WILL slow down future development
>> of SQL features.  It also WILL have edge cases where it doesn't work
>> properly, such as where the decision of the actual index name to use
>> is only decided at execution time and cannot be inferred from the
>> parse tree.  I know that you feel that all of these are tolerable
>
> The way to solve that, I think, as Tom said, is to only rewrite the
> command string when the objects exist in the catalogs. That's why we now
> have ddl_command_trace which is an alias to either start or end
> depending on whether we're doing a DROP or a CREATE or ALTER operation.

Hmm.  I have to study that more, maybe.  I certainly agree that if you
can look at the catalogs, you should be able to reliably reconstruct
what happened - which isn't possible just looking at the parse tree.
However, it feels weird to me to do something that's partly based on
the parse tree and partly based on the catalog contents.  Moreover,
the current pre-create hook isn't the right place to gather
information from the catalogs anyway as it happens before locks have
been taken.

Now, there's another thing that is bothering me here, too.  How
complete is the support you've implemented in this patch?  Does it
cover ALL CREATE/ALTER/DROP commands, including all options to and all
variants of those commands?  Because while I'm not very sanguine about
doing this at all, it somehow feels like doing it only halfway would
be even worse.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: Re: buffer assertion tripping under repeat pgbench load
Next
From: Jeff Janes
Date:
Subject: initdb and share/postgresql.conf.sample