Re: Add CREATE support to event triggers - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: Add CREATE support to event triggers
Date
Msg-id 20140204051136.GL10723@eldon.alvh.no-ip.org
Whole thread Raw
In response to Re: Add CREATE support to event triggers  (Dimitri Fontaine <dimitri@2ndQuadrant.fr>)
Responses Re: Add CREATE support to event triggers  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Hi,

Dimitri, thanks for the review.

Here's a new version of this patch.  I focused on a small set of
commands for now, with the intention of building something that could
serve as a useful foundation for an interesting percentage of use cases.
There are enough strange things going on in this code that I thought I
should go one step further in the foundations before I go much further
in supporting the rest of the commands, which will be mostly tedious
additions of extracting info from catalogs to immediately emit as JSON
blobs.

So this version supports the following commands:
   CREATE SCHEMA
   CREATE TABLE
   CREATE SEQUENCE
   CREATE INDEX
   CREATE VIEW

I have run into some issues, though:

1. certain types, particularly timestamp/timestamptz but really this
could happen for any type, have unusual typmod output behavior.  For
those one cannot just use the schema-qualified catalog names and then
append the typmod at the end; because what you end up is something like
   pg_catalog.timestamptz(4) with time zone
because, for whatever reason, the "with time zone" is part of typmod
output.  But this doesn't work at all for input.  I'm not sure how to
solve this.

2. I have been having object definitions be emitted complete; in
particular, sequences have OWNED BY clauses when they have an owning
column.  But this doesn't work with a SERIAL column, because we get
output like this:

alvherre=# CREATE  TABLE public.hijo  (b serial);
NOTICE:  expanded: CREATE  SEQUENCE public.hijo_b_seq INCREMENT BY 1 MINVALUE 1 MAXVALUE 9223372036854775807 START WITH
1CACHE 1 NO CYCLE OWNED BY public.hijo.b 
NOTICE:  expanded: CREATE  TABLE public.hijo  (b pg_catalog.int4 DEFAULT nextval('hijo_b_seq'::regclass) NOT NULL )

which is all nice, except that the sequence is using the column name as
owner before the column has been created in the first place.  Both these
command will, of course, fail, because both depend on the other to have
been executed first.  The tie is easy to break in this case: just don't
emit the OWNED BY clause .. but it makes me a bit nervous to be
hardcoding the decision of parts that might depend on others.  OTOH
pg_dump already knows how to split objects in constituent parts as
necessary; maybe it's not so bad.

3. It is possible to coerce ruleutils.c to emit always-qualified names
by using PushOverrideSearchPath() facility; but actually this doesn't
always work, because some places in namespace.c believe that
PG_CATALOG_NAMESPACE is always visible and so certain objects are not
qualified.  In particular, text columns using default collation will be
emitted as having collation "default" and not pg_catalog.default as I
would have initially expected.  Right now it doesn't seem like this is a
problem, but it is unusual.


Dimitri Fontaine wrote:

> After careful study and thinking, it appears to me that the proposed
> patch addresses the whole range of features we expect here, and is both
> flexible enough for the users and easy enough to maintain.

Excellent, thanks.

>   - preliminary commit
>
>     It might be a good idea to separate away some pre-requisites of this
>     patch and commit them separately: the OID publishing parts and
>     allowing an Event Trigger to get fired after CREATE but without the
>     extra detailed JSON formated information might be good commits
>     already, and later add the much needed details about what did
>     happen.

I agree that perhaps the OID publishing bits should be pushed prior to
the rest of the patch.  It's two minor changes anyway that only affect
CREATE TABLE AS and REFRESH MATVIEW as far as I remember; the rest of
the commands already return the OID of the affected object.

I'm not sure about pushing the bits to have a trigger to fire before
having the deparse utility stuff.  I guess it is useful because it will
provide classID/objectID/identity for all created objects, even if we
don't have the creation command.  But changing the API of
pg_get_creation_commands() in a later release might be problematic.

>   - document the JSON format
>
>     I vote for going with the proposed format, because it actually
>     allows to implement both the audit and replication features we want,
>     with the capability of hacking schema, data types, SQL
>     representation etc; and because I couldn't think of anything better
>     than what's proposed here ;-)

That's great to hear.  Since the previous patch I have added a %{}O
escape that is used to format operators.  I think that makes the set of
format specifiers mostly complete.

>   - review the JSON producing code
>
>     It might be possible to use more of the internal supports for JSON
>     now that the format is freezing.

Yeah.  I realized that the trick through row_to_json is likely to fail
when the objects have more than 1600 columns.  Not sure if this is a
problem in practice (haven't tried creating a 1600-column table yet, but
I think that wouldn't fail because table elements are handed as arrays).
Anyway if we want to use the new json_build facilities, we'd need to
serialize from the in-memory representation I'm using to the text
format, and from there to JSON.  Not real sure that that's an
improvement ...

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment

pgsql-hackers by date:

Previous
From: Noah Misch
Date:
Subject: Re: Viability of text HISTORY/INSTALL/regression README files (was Re: [COMMITTERS] pgsql: Document a few more regression test hazards.)
Next
From: Amit Kapila
Date:
Subject: Re: narwhal and PGDLLIMPORT