Re: deparsing utility commands - Mailing list pgsql-hackers

From Andres Freund
Subject Re: deparsing utility commands
Date
Msg-id 20150224021018.GH30784@awork2.anarazel.de
In response to Re: deparsing utility commands  (Stephen Frost <sfrost@snowman.net>)
Responses Re: deparsing utility commands  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers
Hi,

On 2015-02-23 19:48:43 -0500, Stephen Frost wrote:
> > Yes, it might be possible to use the same code for a bunch of minor
> > commands, but not for the interesting/complex stuff.
> 
> We can clearly rebuild at least CREATE commands for all objects without
> access to the parse tree, obviously pg_dump manages somehow.

That's not the same. pg_dump recreates a static state, not a running log
of changes that can be replayed.

> I didn't specify that a single command had to be used.

Well, if you want to get 'the create table statement' you'll need to
decide what you want. With ddl event triggers it's clear - something
that, when replayed, results in the same catalog contents. As far as I
know, no such easy definition exists for the set of functions you seem
to imagine.
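For illustration, the event-trigger-based definition looks roughly like
this sketch (the function and table names here are invented;
pg_event_trigger_ddl_commands() is the 9.5-era interface being discussed
in this thread):

```sql
-- Hypothetical sketch: capture the deparsed subcommands of each DDL
-- statement at ddl_command_end, so they can later be replayed to
-- produce the same catalog contents. Names (ddl_log, log_ddl,
-- capture_ddl) are made up for illustration.
CREATE TABLE ddl_log (command_tag text, object_identity text);

CREATE FUNCTION log_ddl() RETURNS event_trigger
LANGUAGE plpgsql AS $$
DECLARE
    r record;
BEGIN
    FOR r IN SELECT * FROM pg_event_trigger_ddl_commands()
    LOOP
        INSERT INTO ddl_log VALUES (r.command_tag, r.object_identity);
    END LOOP;
END;
$$;

CREATE EVENT TRIGGER capture_ddl ON ddl_command_end
    EXECUTE PROCEDURE log_ddl();
```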

> Further, the actual deparse_CreateStmt doesn't look like it'd have a
> terribly hard time producing something without access to the
> parsetree.

The interesting bit isn't deparse_CreateStmt itself, but all the
stuff that's also triggered when you do a CREATE TABLE
nontrivial-stuff;. utility.c will first transform the statement and then
run and stash every created subcommand. And the subcommands will
*also* get deparsed.

Just think about what to do with CREATE TABLE foo(id serial primary
key, data text, bar_id REFERENCES foo.bar); - there's no way you can
tell from the catalog alone which other objects need to be dumped. And
what about a schema, considering CREATE SCHEMA ... (schema_elements)+;?
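Concretely, the serial-plus-primary-key part alone expands into several
catalog-visible objects, roughly like this (exact names/form vary by
version; this is a sketch of the expansion, not verbatim internals):

```sql
-- Roughly what CREATE TABLE foo(id serial primary key, data text)
-- turns into internally: the sequence and the constraint are separate
-- subcommands, each of which gets deparsed on its own.
CREATE SEQUENCE foo_id_seq;

CREATE TABLE foo (
    id   integer NOT NULL DEFAULT nextval('foo_id_seq'),
    data text
);

ALTER SEQUENCE foo_id_seq OWNED BY foo.id;

ALTER TABLE foo ADD CONSTRAINT foo_pkey PRIMARY KEY (id);
```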

Sure, individual subcommands could refer to the catalog instead of the
parse tree. But to get the whole thing you can't easily just refer to
it.

> All I'm suggesting is that we focus on collecting the information from
> the catalog and avoid using the parsetree.  For at least the CREATE and
> DROP cases, that should be entirely possible.

DROP's already in 9.4; the additions in 9.5 were more or less usability
things.  The problem with generating DROPs right now is more about
identifying which object you want to drop and checking the dependencies
- the latter I'm not sure how to do without actually executing the DROP.
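For reference, the 9.4-era sql_drop side already reports every object
actually removed, dependents included, via
pg_event_trigger_dropped_objects() - which is exactly why the dependency
question only gets answered by executing the DROP. A sketch (trigger and
function names invented):

```sql
-- Sketch: log everything a DROP removes, cascaded dependents included.
-- Only fires once the DROP is actually being executed.
CREATE FUNCTION report_drops() RETURNS event_trigger
LANGUAGE plpgsql AS $$
DECLARE
    obj record;
BEGIN
    FOR obj IN SELECT * FROM pg_event_trigger_dropped_objects()
    LOOP
        RAISE NOTICE 'dropped % %', obj.object_type, obj.object_identity;
    END LOOP;
END;
$$;

CREATE EVENT TRIGGER on_drop ON sql_drop
    EXECUTE PROCEDURE report_drops();
```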

> > > Maybe that's possible and maybe it isn't, but at least for the CREATE
> > > cases we should be able to avoid forcing a user to provide a built
> > > parsetree, and that'd be *really* nice and open up this feature to
> > > quite a few other use-cases that I can think of.  I'd further suggest
> > > that we provide these command to the SQL level, and then have a
> > > wrapper which will take the name of an object, resolve it to Oid, and
> > > then pass back the CREATE command for it.
> > 
> > I think this is a different feature that might end up sharing some of
> > the infrastructure, but not more.
> 
> I know you've looked through this code also and I really don't get the
> comment that only "some" of this infrastructure would be shared.  As I
> tried to point out, for the 'CREATE POLICY' case, a few minor
> modifications would have it provide exactly what I'm suggesting and I'm
> sure that most of the cases would be similar.  Simply looking through
> the code with an eye towards avoiding the parsetree in favor of pulling
> information from the catalog would illustrate the point I'm making, I
> believe.

I've no problem at all changing the CREATE POLICY (and some other)
deparse functions to look more at the catalog than at the command. What
I don't see is changing all of the create deparse functions,
guaranteeing that they are usable for getting the DDL of catalog
objects, and providing SQL-accessible infrastructure for that. That's a
*massive* undertaking.

What I mean by 'sharing some of the infrastructure' is that I can see
a good portion of the deparse_* functions being reused for what you
desire.

But the decision about which of those to call will be an entirely
separate project. So is the whole new infrastructure it'll need to
handle locking and visibility (all the deparse stuff uses continually
evolving catalog snapshots!) - a problem the event trigger stuff has
much less to care about, because the objects are new rows and thus
can't be updated by other backends.

> > I don't think it's all that related, so I'm not surprised. If you
> > execute a CREATE TABLE(id serial primary key); you'll get a bunch of
> > commands with this facility - it'd be much less clear what exactly shall
> > be dumped in the case of some hypothetical GUI when you want to dump the
> > table.
> 
> I really don't think it's all that strange or complicated to consider
> and we've got a rather nice example of what a good approach would be.

Right. We've got a *massive* program that solves dependencies, and it
doesn't do much that's useful/correct if you only let it dump
individual objects. And that dumping of individual objects is what you
want... ;)

Greetings,

Andres Freund


