Thread: More extension issues: ownership and search_path

More extension issues: ownership and search_path

From
Tom Lane
Date:
There are some things that the current extensions patch leaves
indeterminate during a dump and reload cycle, which strikes me
as a bad thing.

One is ownership.  Since we don't record the identity of the user who
created an extension, there's no way for pg_dump to ensure that it's
recreated by the same user.  I think we'll regret that in future even
if you don't think it's problematic today.  In particular, I foresee
eventually allowing non-superusers to load extensions, so I think this
is going to follow pretty much the same trajectory as procedural
languages, which we originally did not track ownership for.  In short,
I think we'd better add an extowner column to pg_extension.

Another is the search_path setting.  Currently this is actually rather
broken since pg_dump makes no effort at all to ensure that search_path
is the same at reload time as it was when the extension was created,
and thus the contained objects could easily go into the wrong schema.
But even without thinking about dump/reload, it seems to me that it
would be a good thing for reproducibility if CREATE EXTENSION were to
forcibly set search_path before running the extension's SQL script.

The level-zero version of that would be to do the equivalent ofSET LOCAL search_path = @extschema@
such that the path only contains the target schema and nothing else.

The trouble with this simplistic approach is that it doesn't work nicely
if one extension depends on another --- you probably want the search
path to include the schema(s) the required extensions got installed
into.  Of course inter-extension dependencies aren't going to work
anyway unless pg_dump knows about them so it can make sure to load the
extensions in the right order.  So where I think we're going to end up
is adding a clause along the line of "USING list-of-extension-names"
to CREATE EXTENSION, storing those dependencies explicitly, and having
the CREATE EXTENSION code set search_path to the target schema followed
by the target schema(s) of the USING extensions.  Not sure if this is
worth doing immediately or can be left for 9.2.  At least in the contrib
modules, there are no interdependencies, and I don't know whether any
exist in the wider world of existing extensions.

Comments?
        regards, tom lane


Re: More extension issues: ownership and search_path

From
Dimitri Fontaine
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:
> One is ownership.  Since we don't record the identity of the user who
> created an extension, there's no way for pg_dump to ensure that it's
> recreated by the same user.  I think we'll regret that in future even
> if you don't think it's problematic today.  In particular, I foresee
> eventually allowing non-superusers to load extensions, so I think this
> is going to follow pretty much the same trajectory as procedural
> languages, which we originally did not track ownership for.  In short,
> I think we'd better add an extowner column to pg_extension.

Agreed.  There's no need to have it now but we will add it at some
point.  So if now is when that works the best for you, I'm happy to see
that happen :)

Would it help that I prepare some of those modifications, as patches
over the extension's patch that you started from?

> Another is the search_path setting.  Currently this is actually rather
> broken since pg_dump makes no effort at all to ensure that search_path
> is the same at reload time as it was when the extension was created,
> and thus the contained objects could easily go into the wrong schema.

Well there's some code to place the extension's schema at the first
place in the search_path before executing the script, already.

> But even without thinking about dump/reload, it seems to me that it
> would be a good thing for reproducibility if CREATE EXTENSION were to
> forcibly set search_path before running the extension's SQL script.
>
> The level-zero version of that would be to do the equivalent of
>     SET LOCAL search_path = @extschema@
> such that the path only contains the target schema and nothing else.

Spelled this way, I could see attaching SET to CREATE EXTENSION the same
way we did for CREATE FUNCTION.  I'm not too sure about what other set
of GUCs would be useful to support here, but that would be a good
mechanism to use I would say.

> The trouble with this simplistic approach is that it doesn't work nicely
> if one extension depends on another --- you probably want the search
> path to include the schema(s) the required extensions got installed
> into.  Of course inter-extension dependencies aren't going to work
> anyway unless pg_dump knows about them so it can make sure to load the
> extensions in the right order.  So where I think we're going to end up
> is adding a clause along the line of "USING list-of-extension-names"
> to CREATE EXTENSION, storing those dependencies explicitly, and having
> the CREATE EXTENSION code set search_path to the target schema followed
> by the target schema(s) of the USING extensions.  Not sure if this is
> worth doing immediately or can be left for 9.2.  At least in the contrib
> modules, there are no interdependencies, and I don't know whether any
> exist in the wider world of existing extensions.

We have a interdependency in contrib, earthdistance depends on cube
already.  In skytools, PGQ is composed of several parts that are
packaged as a single extension now, but whose sources are maintained in
separate parts.  Other than that, I think tricky scripts that depends on
some objects of the extension to already be usable will be simpler to
solve with splitting the extensions and adding some dependency.

So while we said this is 9.2 material, if you want to tackle the whole
search_path at restore time issue (I did only the creation namespace,
thinking it would be enough) fully, you need dependency too.

I think we should then register some core components as extensions for
the sake of interdependencies here, too.  \dx would then list PostgreSQL
itself and its (major) version, and the installed PL would need to be
there, and maybe some "features" too — but the way we handle bugfix only
minor upgrade makes that useless for us I think.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: More extension issues: ownership and search_path

From
Tom Lane
Date:
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> I think we'd better add an extowner column to pg_extension.

> Agreed.  There's no need to have it now but we will add it at some
> point.  So if now is when that works the best for you, I'm happy to see
> that happen :)

> Would it help that I prepare some of those modifications, as patches
> over the extension's patch that you started from?

No, I've hacked the code enough already that merging would be painful.
I'll keep working on it.

>> Another is the search_path setting.  Currently this is actually rather
>> broken since pg_dump makes no effort at all to ensure that search_path
>> is the same at reload time as it was when the extension was created,
>> and thus the contained objects could easily go into the wrong schema.

> Well there's some code to place the extension's schema at the first
> place in the search_path before executing the script, already.

Oh, duh, I'd forgotten about the OverrideSearchPath usage.  So never
mind the above claim.  But I still think it'd be a good idea to ensure
that the search path is the same during dump/reload as it was the first
time, and the current arrangement isn't going to ensure that.

> So while we said this is 9.2 material, if you want to tackle the whole
> search_path at restore time issue (I did only the creation namespace,
> thinking it would be enough) fully, you need dependency too.

Quite aside from search path, cross-extension dependencies simply aren't
going to work unless pg_dump knows about them so it can load the
extensions in the right order.  I had forgotten about the earthdistance
case, but given that I think we probably can't put this issue off.

> I think we should then register some core components as extensions for
> the sake of interdependencies here, too.

Maybe that sort of refactoring could be done in 9.2 or further out.
I don't see it happening for 9.1.  I'm not really sure what the value
is anyway.
        regards, tom lane


Re: More extension issues: ownership and search_path

From
Dimitri Fontaine
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:
> No, I've hacked the code enough already that merging would be painful.
> I'll keep working on it.

I supposed so much, but got to ask :)

> Oh, duh, I'd forgotten about the OverrideSearchPath usage.  So never
> mind the above claim.  But I still think it'd be a good idea to ensure
> that the search path is the same during dump/reload as it was the first
> time, and the current arrangement isn't going to ensure that.

Right.  Something automated would be best (no user intervention), it
would force extension scripts to be self-contained or to explicitly
declare all their external dependencies.  I'm in fact doping out
entirely my previous SET idea.

>> So while we said this is 9.2 material, if you want to tackle the whole
>> search_path at restore time issue (I did only the creation namespace,
>> thinking it would be enough) fully, you need dependency too.
>
> Quite aside from search path, cross-extension dependencies simply aren't
> going to work unless pg_dump knows about them so it can load the
> extensions in the right order.  I had forgotten about the earthdistance
> case, but given that I think we probably can't put this issue off.

Well in fact the ordering already works because some earthdistance
objects depend on some cube objects, and pg_dump sees that in pg_depend.

>> I think we should then register some core components as extensions for
>> the sake of interdependencies here, too.
>
> Maybe that sort of refactoring could be done in 9.2 or further out.
> I don't see it happening for 9.1.  I'm not really sure what the value
> is anyway.

Imagine that you wrote a set of plpgsql and plpythonu functions that you
want to maintain as an extension.  You certainly want to mark that this
extension depends on the procedural languages being installed, right?

Also, I didn't bite this bullet, but maybe we should provide core PLs as
extension.  Then CREATE LANGUAGE would maybe get deprecated and only
valid when used in an extension's script — or the next patch (UPGRADE)
will take care of create a plpythonu extension then attaching the PL
into it.

Again, pg_depend already allows pg_dump to create plpythonu before it
creates the extension that depends on it, though.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: More extension issues: ownership and search_path

From
Tom Lane
Date:
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> Quite aside from search path, cross-extension dependencies simply aren't
>> going to work unless pg_dump knows about them so it can load the
>> extensions in the right order.  I had forgotten about the earthdistance
>> case, but given that I think we probably can't put this issue off.

> Well in fact the ordering already works because some earthdistance
> objects depend on some cube objects, and pg_dump sees that in pg_depend.

Really?  I think it's more likely just luckily working because of the
alphabetical-ordering default.  To make it work reliably off of those
dependencies, we'd need some code to pull up the dependency links from
the individual objects to the module level, and we haven't got that.
I'd prefer to make the module dependencies explicit anyway.

> Imagine that you wrote a set of plpgsql and plpythonu functions that you
> want to maintain as an extension.  You certainly want to mark that this
> extension depends on the procedural languages being installed, right?

Interesting point.  It's all right at the moment because I tweaked
pg_dump_sort.c so that procedural languages are dumped before modules.
But maybe we should convert the PLs to modules.
        regards, tom lane


Re: More extension issues: ownership and search_path

From
"David E. Wheeler"
Date:
On Feb 7, 2011, at 9:20 AM, Dimitri Fontaine wrote:

> Also, I didn't bite this bullet, but maybe we should provide core PLs as
> extension.  Then CREATE LANGUAGE would maybe get deprecated and only
> valid when used in an extension's script — or the next patch (UPGRADE)
> will take care of create a plpythonu extension then attaching the PL
> into it.

I anticipate dependencies becoming a big deal. I already have ideas for extensions that depend on citext, for example
(domainsfor time zone, email address, etc.). And yeah, some of those might depend on procedural languages. FWIW, I've
beenputting PL prereqs in META.json files on PGXN. pgTAP, for example, requires PL/pgSQL: 
 http://master.pgxn.org/dist/pgTAP/pgTAP-0.25.0.json

Best,

David



Re: More extension issues: ownership and search_path

From
"David E. Wheeler"
Date:
On Feb 7, 2011, at 9:51 AM, Tom Lane wrote:

> Interesting point.  It's all right at the moment because I tweaked
> pg_dump_sort.c so that procedural languages are dumped before modules.
> But maybe we should convert the PLs to modules.

s/modules/extensions/?

David



Re: More extension issues: ownership and search_path

From
Tom Lane
Date:
"David E. Wheeler" <david@kineticode.com> writes:
> On Feb 7, 2011, at 9:51 AM, Tom Lane wrote:
>> Interesting point.  It's all right at the moment because I tweaked
>> pg_dump_sort.c so that procedural languages are dumped before modules.
>> But maybe we should convert the PLs to modules.

> s/modules/extensions/?

Yeah, I keep saying "module".  Memo to self: grep the whole patch for
"module" before committing.
        regards, tom lane


Re: More extension issues: ownership and search_path

From
Tom Lane
Date:
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> I think we'd better add an extowner column to pg_extension.

> Agreed.  There's no need to have it now but we will add it at some
> point.  So if now is when that works the best for you, I'm happy to see
> that happen :)

BTW, on trying this I notice that pg_dump's default approach to
ownership doesn't work because of the lack of an ALTER EXTENSION
OWNER TO command.  I'm going to go ahead and add extowner to the catalog
anyway, because it's easy and I'm convinced we're going to want it
later.  But I don't feel like writing ALTER EXTENSION OWNER TO right
now, so pg_dump will continue its current behavior of creating the
extension as the user running the script.
        regards, tom lane


Re: More extension issues: ownership and search_path

From
Tom Lane
Date:
I wrote:
> ... So where I think we're going to end up
> is adding a clause along the line of "USING list-of-extension-names"
> to CREATE EXTENSION, storing those dependencies explicitly, and having
> the CREATE EXTENSION code set search_path to the target schema followed
> by the target schema(s) of the USING extensions.

On reflection, the set of extensions that an extension depends on is
obviously a property of the extension, which means it ought to be
specified in the extension's control file, not in the CREATE EXTENSION
command.  So now I'm thinking something like
requires = 'foo, bar, baz'

in the file.  We could even imagine autoloading such dependencies if
they're not already installed, but that's a frammish for later.  (One
objection to autoloading is it's not clear which schema to drop the
other extensions into.)
        regards, tom lane


Re: More extension issues: ownership and search_path

From
Dimitri Fontaine
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:
> On reflection, the set of extensions that an extension depends on is
> obviously a property of the extension, which means it ought to be
> specified in the extension's control file, not in the CREATE EXTENSION
> command.  So now I'm thinking something like
>
>     requires = 'foo, bar, baz'

+1

And that can change at upgrade time, of course, but that's another
story.  Ditto for recommends and conflict dependency types, that's
material for 9.2 at best.

> in the file.  We could even imagine autoloading such dependencies if
> they're not already installed, but that's a frammish for later.  (One
> objection to autoloading is it's not clear which schema to drop the
> other extensions into.)

Well I don't see why it wouldn't be the same schema in case of auto
solving dependencies, but I don't see a pressing need to have automatic
dependency following at install time (we're still in the realm of dpkg,
we'll get into apt-get next) :)

That said, we should do something about ALTER EXTENSION SET SCHEMA and
the relocatable property.  I'm thinking that an extension stops being
relocatable as soon as any of its reverse dependencies (all the tree) is
not relocatable.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: More extension issues: ownership and search_path

From
Tom Lane
Date:
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> That said, we should do something about ALTER EXTENSION SET SCHEMA and
> the relocatable property.  I'm thinking that an extension stops being
> relocatable as soon as any of its reverse dependencies (all the tree) is
> not relocatable.

If you're worried about that, then it's questionable whether ALTER
EXTENSION SET SCHEMA makes sense at all, ever.  I don't see any reason
to think that an extension is more fragile for this purpose than any
other random SQL dependencies.  Also, an extension being relocatable
doesn't seem to me to guarantee that it can cope with its dependencies
moving around; they're really independent properties.
        regards, tom lane


Re: More extension issues: ownership and search_path

From
"David E. Wheeler"
Date:
On Feb 7, 2011, at 10:23 AM, Tom Lane wrote:

> On reflection, the set of extensions that an extension depends on is
> obviously a property of the extension, which means it ought to be
> specified in the extension's control file, not in the CREATE EXTENSION
> command.  So now I'm thinking something like
> 
>     requires = 'foo, bar, baz'
> 
> in the file. 

And that takes us one step closer to PGXN's META.json file. Here's the spec:
 http://pgxn.org/meta/spec.txt

It includes a "prereqs" section, which looks like this:
      "prereqs": {         "runtime": {            "requires": {               "citext": 0,               "plpgsql": 0,
             "PostgreSQL": "8.0.0"            },            "recommends": {               "PostgreSQL": "8.4.0"
  }         }      },
 


Best,

David


Re: More extension issues: ownership and search_path

From
Dimitri Fontaine
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:
> If you're worried about that, then it's questionable whether ALTER
> EXTENSION SET SCHEMA makes sense at all, ever.  I don't see any reason
> to think that an extension is more fragile for this purpose than any
> other random SQL dependencies.  Also, an extension being relocatable
> doesn't seem to me to guarantee that it can cope with its dependencies
> moving around; they're really independent properties.

Well a relocatable extension certainly supports SET SCHEMA just fine, or
it would not have the property.  Then your conclusion is right.  My
comment was about what happens when things are setup the other way.

We have earthdistance that depends on cube.  Let's pretend that
earthdistance is not relocatable.  I think we should then consider (when
both are installed) that cube is not relocatable, whatever its control
file says.  That's because not relocatable means that the install script
is using the @extschema@ place holder and the fragility there is known
quite high: the install script and some installed objects do depend on
@extschema@.  Moving the dependencies underneath it in this case looks
to me more than a risk: we know we're breaking things.

What you're saying (or what I'm reading at least) is that if
earthdistance is relocatable, you don't have faith that it means we can
actually move cube without collateral damages.  Well, the author said it
would cope fine, and in this case I see no reason not to believe him.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: More extension issues: ownership and search_path

From
Tom Lane
Date:
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> If you're worried about that, then it's questionable whether ALTER
>> EXTENSION SET SCHEMA makes sense at all, ever.  I don't see any reason
>> to think that an extension is more fragile for this purpose than any
>> other random SQL dependencies.  Also, an extension being relocatable
>> doesn't seem to me to guarantee that it can cope with its dependencies
>> moving around; they're really independent properties.

> Well a relocatable extension certainly supports SET SCHEMA just fine, or
> it would not have the property.  Then your conclusion is right.  My
> comment was about what happens when things are setup the other way.

Yes, I understood that.

> We have earthdistance that depends on cube.  Let's pretend that
> earthdistance is not relocatable.  I think we should then consider (when
> both are installed) that cube is not relocatable, whatever its control
> file says.  That's because not relocatable means that the install script
> is using the @extschema@ place holder and the fragility there is known
> quite high: the install script and some installed objects do depend on
> @extschema@.

But @extschema@ isn't affected by where the other modules are.

The basic issue here is that we might have wired something about the
referenced schemas into one of the objects in the dependent extension,
for example via
create function foo() ... SET search_path FROM CURRENT;

I agree that this is a risk.  My point is that you can do that without
any extension, and you're just as much at risk.  If you think that this
is something we must protect against, I think ripping out ALTER
EXTENSION SET SCHEMA altogether is the only real answer.  I see no
argument whatsoever why we should lock down extensions and only extensions
against this risk.
        regards, tom lane


Re: More extension issues: ownership and search_path

From
Dimitri Fontaine
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:
>  I see no
> argument whatsoever why we should lock down extensions and only extensions
> against this risk.

Spelled this way I can only agree :)

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support