Thread: Inline Extension
Hi,

The extension mechanism we added in 9.1 is aimed at allowing fully integrated contrib management, which was a big enough goal to preclude doing anything else in its first release. Now that we have it, we can think some more about what features we want covered, and a pretty obvious one that's been left out is the ability to define and update an extension without resorting to file system support, for those extensions that do not need a shared object library. We could have called those “SQL only” extensions, but to simplify the grammar support I used the “inline” keyword, so there we go.

Please find attached a WIP patch implementing that. Note that the main core benefit of integrating this feature is the ability to easily add regression tests for extension related features. That is not done yet in the attached patch.

I'm sending this quite soon because of the pg_dump support. When an extension is inline, we want to dump its content, as we currently do in the binary dump output. I had in mind that we could output a full CREATE EXTENSION INLINE script in between some dollar-quoting, rather than adding each extension's object with an ALTER EXTENSION ... ADD line the way the pg_upgrade compatibility code currently does. It seems like much more work though, and I'd appreciate input about how exactly to do that (it looks like making pg_dump reentrant, somehow). Or some reason not to bother, and to just rename and share the binary upgrade facility that Bruce already put into pg_dump.

Here's a usage example that will certainly end up in the docs:

  create extension pair inline version '1.0' schema pair not relocatable
  as $pair$
    CREATE TYPE pair AS ( k text, v text );

    CREATE OR REPLACE FUNCTION pair(anyelement, text)
    RETURNS pair LANGUAGE SQL AS 'SELECT ROW($1, $2)::pair';

    CREATE OR REPLACE FUNCTION pair(text, anyelement)
    RETURNS pair LANGUAGE SQL AS 'SELECT ROW($1, $2)::pair';

    CREATE OR REPLACE FUNCTION pair(anyelement, anyelement)
    RETURNS pair LANGUAGE SQL AS 'SELECT ROW($1, $2)::pair';

    CREATE OR REPLACE FUNCTION pair(text, text)
    RETURNS pair LANGUAGE SQL AS 'SELECT ROW($1, $2)::pair;';

    CREATE OPERATOR ~> (LEFTARG = text, RIGHTARG = anyelement, PROCEDURE = pair);
    CREATE OPERATOR ~> (LEFTARG = anyelement, RIGHTARG = text, PROCEDURE = pair);
    CREATE OPERATOR ~> (LEFTARG = anyelement, RIGHTARG = anyelement, PROCEDURE = pair);
    CREATE OPERATOR ~> (LEFTARG = text, RIGHTARG = text, PROCEDURE = pair);
  $pair$;

  alter extension pair update from '1.0' to '1.1'
  with $pair$
    CREATE OR REPLACE FUNCTION key(pair)
    RETURNS text LANGUAGE SQL AS 'SELECT ($1).k;';

    CREATE OR REPLACE FUNCTION value(pair)
    RETURNS text LANGUAGE SQL AS 'SELECT ($1).v;';

    CREATE OPERATOR %% (RIGHTARG = pair, PROCEDURE = key);
    CREATE OPERATOR %# (RIGHTARG = pair, PROCEDURE = value);
  $pair$;

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
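For concreteness, the dump output Dimitri describes might look roughly like the following sketch, which simply reuses the proposed CREATE EXTENSION INLINE syntax at the extension's current version (the exact dump format is precisely what's being asked about, so this is illustrative only):

  -- Hypothetical pg_dump output for an inline extension: one dollar-quoted
  -- script at the current version, rather than an ALTER EXTENSION ... ADD
  -- line per member object.
  CREATE EXTENSION pair INLINE VERSION '1.1' SCHEMA pair NOT RELOCATABLE
  AS $pair$
    CREATE TYPE pair AS ( k text, v text );
    -- ... remaining objects as of version 1.1 ...
  $pair$;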
On 08.01.2012 22:36, Dimitri Fontaine wrote:
> The extension mechanism we added in 9.1 is aimed at allowing fully integrated contrib management, which was a big enough goal to preclude doing anything else in its first release.

Hooray!

> Now that we have it, we can think some more about what features we want covered, and a pretty obvious one that's been left out is the ability to define and update an extension without resorting to file system support, for those extensions that do not need a shared object library. We could have called those “SQL only” extensions, but to simplify the grammar support I used the “inline” keyword, so there we go.

Frankly, I don't see the point of this. If the extension is an independent piece of (SQL) code, developed separately from an application, with its own lifecycle, a .sql file seems like the best way to distribute it. If it's not, i.e. if it's an integral part of the database schema, then why package it as an extension in the first place?

> Please find attached a WIP patch implementing that. Note that the main core benefit of integrating this feature is the ability to easily add regression tests for extension related features. That is not done yet in the attached patch.

I'm not sure I buy that argument. These inline extensions are sufficiently different from regular extensions that I think you'd need to have regression tests for both kinds anyway.

> I'm sending this quite soon because of the pg_dump support. When an extension is inline, we want to dump its content, as we currently do in the binary dump output. I had in mind that we could output a full CREATE EXTENSION INLINE script in between some dollar-quoting, rather than adding each extension's object with an ALTER EXTENSION ... ADD line the way the pg_upgrade compatibility code currently does.

I thought the main point of extensions is that they're not included in pg_dump. Again, if the extension is an integral part of the database, then it probably shouldn't be an extension in the first place.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> Frankly, I don't see the point of this. If the extension is an independent piece of (SQL) code, developed separately from an application, with its own lifecycle, a .sql file seems like the best way to distribute it. If it's not, i.e. if it's an integral part of the database schema, then why package it as an extension in the first place?

It makes it easy to deploy an extension to N databases (my current use case has 256 databases) and to know which version is installed on each server. It's easier to QA your procedures and upgrades when they are packaged as extensions, too.

Now, about the dependency on a SQL file hosting the content: it's easier to just connect to the databases and send them the script in the SQL command rather than deploying a set of files. Deploying files means OS level packaging, either RPM or debian or some other variant, or some other means of easily distributing the files. An SQL connection is all you need if you're not shipping a .so.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> Frankly, I don't see the point of this. If the extension is an independent piece of (SQL) code, developed separately from an application, with its own lifecycle, a .sql file seems like the best way to distribute it. If it's not, i.e. if it's an integral part of the database schema, then why package it as an extension in the first place?

> It makes it easy to deploy an extension to N databases (my current use case has 256 databases) and to know which version is installed on each server. It's easier to QA your procedures and upgrades when they are packaged as extensions, too.

> Now, about the dependency on a SQL file hosting the content: it's easier to just connect to the databases and send them the script in the SQL command rather than deploying a set of files. Deploying files means OS level packaging, either RPM or debian or some other variant, or some other means of easily distributing the files. An SQL connection is all you need if you're not shipping a .so.

I'm with Heikki on not believing that this is a good idea. If you are trying to do careful versioning of a set of object definitions, you want to stick the things in a file; you don't want them just flying by in submitted SQL. Also, a large part of the point of the extension facility is to be able to do uninstall/reinstall and version upgrades/downgrades, none of which are possible unless the extension scripts are stored somewhere.

ISTM your distribution concern would be much better addressed by installing contrib/adminpack and then just using pg_file_write() to put the new extension script into the remote server's library.

regards, tom lane
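A minimal sketch of the adminpack route Tom suggests, for a SQL-only extension (note the hand-waving: adminpack's pg_file_write() only accepts paths the server permits, normally under the data directory, so being able to reach the extension script directory at all is an assumption here, and the paths shown are illustrative):

  CREATE EXTENSION adminpack;

  -- Push the control file and the extension script over the wire.
  SELECT pg_file_write('extension/pair.control',
                       $c$ default_version = '1.0' $c$,
                       false);  -- false = overwrite rather than append
  SELECT pg_file_write('extension/pair--1.0.sql',
                       $s$ CREATE TYPE pair AS (k text, v text); $s$,
                       false);

After that, CREATE EXTENSION pair would work on that server with no ssh access, which is the distribution problem Dimitri is trying to solve.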
On Jan 19, 2012, at 7:21 AM, Dimitri Fontaine wrote:
> Now, about the dependency on a SQL file hosting the content: it's easier to just connect to the databases and send them the script in the SQL command rather than deploying a set of files. Deploying files means OS level packaging, either RPM or debian or some other variant, or some other means of easily distributing the files. An SQL connection is all you need if you're not shipping a .so.

ISTM that if you are managing 256 servers, you’re likely already using a packaging system for the deployment of application dependencies. In which case, to keep things consistent, you ought to distribute your extensions in exactly the same way.

Best,
David
Tom Lane <tgl@sss.pgh.pa.us> writes:
> I'm with Heikki on not believing that this is a good idea. If you are trying to do careful versioning of a set of object definitions, you want to stick the things in a file; you don't want them just flying by in submitted SQL.

I'm trying to open the extension facilities (versions being the first of them, think \dx) to application PL code, and to hosted environments where you're not granted access to the server's file system.

I would agree that the use case doesn't exist if the target is traditional in-house deployments where the sys admins are your colleagues. I've been told that's a smaller and smaller part of the database world, though.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Thu, Jan 19, 2012 at 3:42 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> I'm with Heikki on not believing that this is a good idea. If you are trying to do careful versioning of a set of object definitions, you want to stick the things in a file; you don't want them just flying by in submitted SQL.
>
> I'm trying to open the extension facilities (versions being the first of them, think \dx) to application PL code, and to hosted environments where you're not granted access to the server's file system.

I guess the question is: for what purpose?

As you recognized in your original email, if the extension is inline, then the objects will need to be dumped, because a simple CREATE EXTENSION command is bound to fail. But my understanding was that a major part of the reason - if not the entire reason - was to get pg_dump to emit CREATE EXTENSION bob instead of the component SQL commands. If we take that away, what's the remaining benefit of packaging those objects inside an extension instead of just dumping them "loose" into the database?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Jan 19, 2012 at 3:42 PM, Dimitri Fontaine
>> I'm trying to open the extension facilities (versions being the first of them, think \dx) to application PL code, and to hosted environments where you're not granted access to the server's file system.

> I guess the question is: for what purpose?

> As you recognized in your original email, if the extension is inline, then the objects will need to be dumped, because a simple CREATE EXTENSION command is bound to fail. But my understanding was that a major part of the reason - if not the entire reason - was to get pg_dump to emit CREATE EXTENSION bob instead of the component SQL commands. If we take that away, what's the remaining benefit of packaging those objects inside an extension instead of just dumping them "loose" into the database?

Indeed, it seems like such a thing is not an extension at all anymore, or at least it gives up many of the useful properties of extensions.

Given the entire lack of demand from the field for such a cut-down concept of extension, I think we should not be in a hurry to introduce it. Maybe in a year or two, when we have a clearer idea of how people are actually using extensions, there will be a better argument for it. Right now I'm afraid that we might foreclose our options for other future extension features because these things would be incompatible with such ideas.

regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I guess the question is: for what purpose?
>
> Indeed, it seems like such a thing is not an extension at all anymore, or at least it gives up many of the useful properties of extensions.

I'm thinking that a common name and version number, tracked in the database for a set of related functions (that usually form an API), is a useful enough property that extensions should support more use cases than contrib-like “module centric” extensions (meaning, C coded and shipped with a .so).

> Given the entire lack of demand from the field for such a cut-down concept of extension, I think we should not be in a hurry to introduce it. Maybe in a year or two, when we have a clearer idea of how people are actually using extensions, there will be a better argument for it.

Fair enough, I guess (or at least I'm beginning to understand how alone I am here); let's hear from the field first.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Fri, Jan 20, 2012 at 8:52 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> I guess the question is: for what purpose?
>>
>> Indeed, it seems like such a thing is not an extension at all anymore, or at least it gives up many of the useful properties of extensions.
>
> I'm thinking that a common name and version number, tracked in the database for a set of related functions (that usually form an API), is a useful enough property that extensions should support more use cases than contrib-like “module centric” extensions (meaning, C coded and shipped with a .so).

I see that there is some usefulness there, but I'm not sure that this is the best way to get our hands around it. For one thing, people can and do handle schema versioning and schema upgrade scripts entirely in userland. My last implementation worked by keeping a schema_versions table on the server with one column, a UUID. The deployment tarball contained a file with a list of UUIDs in it, each one associated with an SQL script. At install time, the install script ran through that file in order, ran any scripts whose UUID didn't yet appear in the table, and then added the UUIDs of the run scripts to the table. This might not be what any given person wants, but there's a lot of flexibility to do things like that without any particular support from the DBMS. (Incidentally, this experience is what convinced me that CREATE TABLE IF NOT EXISTS and ALTER TABLE IF EXISTS are good things to have; my system could have been a lot simpler if I'd had those.)

One random design idea I had, related to providing this functionality in the DB core, is to have a command that creates an empty extension, maybe just "CREATE EXTENSION foo EMPTY", and an ALTER command that forcibly changes the DB's notion of what version is installed, like "ALTER EXTENSION foo FORCE VERSION TO '1.1'". That would allow the same sort of thing you're after here by using those two features plus ALTER EXTENSION ADD/DROP, and could also be used to do other things. For example, suppose you have DB servers A and B. A is running an old version of some extension and is in a shared hosting environment where you can't get access to the box. B, on the other hand, is your brand-new, dedicated server. You could upgrade the extension on the old machine "manually", by issuing SQL commands to forcibly change its state, and then do a dump and reload onto B. This might be useful if, for example, B is also running a newer DB server version that won't support the very old version of the extension running on A. This is probably an unusual situation, but maybe there's some value in allowing users who want to do such things a cleaner way to do it than direct catalog hackery.

Anyway, I'm just thinking out loud here - I don't actually have a very strong feeling that I know what all of the solutions are in this area, or even all the problems. I'm interested in hearing about your experiences with the system, and other people's, because I certainly do agree that there's room for improvement. One of my personal pet peeves is that the system doesn't know how to do an install of v1.1 by running the v1.0 script followed by the 1.0-1.1 upgrade script, which I fear is going to lead to a rapid uptick in the number of copies of almost-identical scripts in our git repository.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
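A minimal sketch of the userland scheme Robert describes, assuming a one-column tracking table and a client-side installer loop (table and column names are illustrative, not from his actual system):

  -- Run once per database; IF NOT EXISTS keeps the installer idempotent.
  CREATE TABLE IF NOT EXISTS schema_versions (script_uuid uuid PRIMARY KEY);

  -- For each (uuid, script) pair listed in the deployment manifest, the
  -- installer checks whether the script already ran:
  --   SELECT 1 FROM schema_versions WHERE script_uuid = $1;
  -- and, if no row comes back, runs the script and then records it:
  --   INSERT INTO schema_versions (script_uuid) VALUES ($1);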
Robert Haas <robertmhaas@gmail.com> writes:
> peeves is that the system doesn't know how to do an install of v1.1 by running the v1.0 script followed by the 1.0-1.1 upgrade script, which

Did you try

  CREATE EXTENSION foo FROM 1.0;

Maybe you would want the system to be able to determine the oldest version to start from in order to reach the current default_version given in the control file, but I guess it would be better to add another property like default_full_version or such (last_stop?). It looks like a simple enough project, and the fact that it helps our own shipping and script maintenance could maybe allow us to work on it this late in the cycle, dunno.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
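In control-file terms, the property Dimitri floats might look like this sketch (default_full_version is hypothetical and under discussion here; only default_version and the other documented properties exist today):

  # foo.control -- with a hypothetical default_full_version property
  default_version = '1.1'       # version reported after installation
  default_full_version = '1.0'  # run foo--1.0.sql, then foo--1.0--1.1.sql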
On Fri, Jan 20, 2012 at 11:29 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> peeves is that the system doesn't know how to do an install of v1.1 by running the v1.0 script followed by the 1.0-1.1 upgrade script, which
>
> Did you try
>
>   CREATE EXTENSION foo FROM 1.0;

Well, yes, that works, but I'm going for what you wrote next:

> Maybe you would want the system to be able to determine the oldest version to start from in order to reach the current default_version given in the control file, but I guess it would be better to add another property like default_full_version or such (last_stop?).

Possibly that might be a better design, yes. Especially since we don't order version numbers intrinsically.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
>> Maybe you would want the system to be able to determine the oldest version to start from in order to reach the current default_version given in the control file, but I guess it would be better to add another property like default_full_version or such (last_stop?).
>
> Possibly that might be a better design, yes. Especially since we don't order version numbers intrinsically.

It's quite simple to implement too, see attached. As for the version number ordering, that's a giant rat hole I don't want to be digging in, and I think we can bypass that problem entirely.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Thu, Jan 19, 2012 at 8:15 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>>> Frankly, I don't see the point of this. If the extension is an independent piece of (SQL) code, developed separately from an application, with its own lifecycle, a .sql file seems like the best way to distribute it. If it's not, i.e. if it's an integral part of the database schema, then why package it as an extension in the first place?
> I'm with Heikki on not believing that this is a good idea. If you are trying to do careful versioning of a set of object definitions, you want to stick the things in a file; you don't want them just flying by in submitted SQL. Also, a large part of the point of the extension facility is to be able to do uninstall/reinstall and version upgrades/downgrades, none of which are possible unless the extension scripts are stored somewhere.
>
> ISTM your distribution concern would be much better addressed by installing contrib/adminpack and then just using pg_file_write() to put the new extension script into the remote server's library.

I think this is somewhat rube-goldberg-esque, and it denies non-superuser roles the ability to get more version management of schemas and operators. As-is, many organizations are submitting "migrations" via plain SQL that include committing to a version management table maintained by convention, and as-is that is considered a modern-day best practice.

Unless someone comes up with something fundamentally new in how to handle the problem being solved in user-land via 'migrations' and version tables in Postgres, I think tying completely safe, sandboxed, non-superuser operators to superuser-only (and awkward besides) file management via FEBE on the server side is not going to do anything but redelegate the problem of soundness to every migration/change-management regime, and decrease reuse of useful operators that do not require superuser.

The ship has sailed. Encouraging use of files and .sql buys no soundness, because everyone is moving towards overlaying version management via pure FEBE anyway. At best, this means to me that Postgres is completely neutral about what version management regime you want to use for operators and schemas. At worst, it means Postgres is frustratingly unhelpful in common use cases. And somewhere in the middle is "this may have value, but not enough to be worth maintaining." Given the lack of fast and widely deployed trusted PL integrations, my interpretation of the truth falls somewhere between the latter two interpretations.

--
fdr
On 21.01.2012 00:00, Daniel Farina wrote:
> I think this is somewhat rube-goldberg-esque, and it denies non-superuser roles the ability to get more version management of schemas and operators. As-is, many organizations are submitting "migrations" via plain SQL that include committing to a version management table maintained by convention, and as-is that is considered a modern-day best practice.

Even if you give the version number in the CREATE EXTENSION command, it's only by convention that people actually maintain a sane versioning policy. If people don't take version management seriously, you will quickly end up with five different versions of an extension, all with version number 0.1.

Another approach is to use comments on the objects saying "version 1.23". Those generally move together with the objects themselves; they are included in a pg_dump schema-only dump, for example, while the contents of a table are not.

> The ship has sailed. Encouraging use of files and .sql buys no soundness, because everyone is moving towards overlaying version management via pure FEBE anyway.

What is FEBE?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
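A minimal sketch of the comment convention Heikki mentions, borrowing object names from the pair example (the version string format is purely a convention):

  COMMENT ON TYPE pair IS 'pair extension, version 1.23';
  COMMENT ON FUNCTION pair(text, text) IS 'pair extension, version 1.23';

Because comments travel with the objects, pg_dump --schema-only carries them along, unlike the rows of a hand-rolled version table.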
On Fri, Jan 20, 2012 at 2:48 PM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
> Even if you give the version number in the CREATE EXTENSION command, it's only by convention that people actually maintain a sane versioning policy. If people don't take version management seriously, you will quickly end up with five different versions of an extension, all with version number 0.1.

Projects are taking it seriously, and invest a lot of effort in it. There is no shortage of schema versioning frameworks, of varying levels of maturity... but some are quite complete by the standards of their users. However, there is little knowledge shared between them, and no database gives them much support, so idiosyncrasy becomes inevitable.

Unless one makes loading regular, unpackaged operators via SQL impossible (which, on the face of it, doesn't sound like a good idea to me), the only people who will bother with CREATE EXTENSION with inline content will be those who care and want to maintain version control. I stick to my original assessment: the ship has sailed, or perhaps, more accurately, is sailing.

Perhaps one could gain wisdom by thinking of this problem differently: "I'm going to write a migration tool to help developers in my framework or workflow deal with change, and I really would like it if my database helped me by...". I don't think we should fool ourselves that we'll be letting people shoot themselves in the foot with regard to versioning if we give them versions. People have been painfully and assiduously trying to avoid version problems for some time now, with no help from the database, and they do it entirely via pure SQL statements from Postgres' point of view.

> Another approach is to use comments on the objects saying "version 1.23". Those generally move together with the objects themselves; they are included in a pg_dump schema-only dump, for example, while the contents of a table are not.

Idiosyncratic, but perhaps useful for some people. Again, workarounds are possible, and per my previous statements, even pervasive. That doesn't mean it's desirable to avoid assisting with the problem.

> What is FEBE?

FrontEnd/BackEnd Protocol, I thought? The one libpq speaks. Is there a better name?

Here are some reasons not to do this feature, in my mind:

* Is the world ready yet to think about versioning operators in a database the same way software is versioned?
* Related to that: if the feature is written, it will have to be supported. Does the feature have enough impact at this time?
* Do we know how to design the feature correctly so it will attract use, especially as a Postgres-ism? This is related in some part to the decaying importance (my perception) of being 'database agnostic'.
* Are the awkward user-land versioning strategies good enough? Would it be better to focus elsewhere, where things are even *more* awkward?

My assessment is: I know some people who would use this feature, but a broad swathe are not loading many operators into their databases. It's hard to know if that's because packaging operators in databases has been so abysmally bad in the industry at large, or because the available sandboxed languages are just not appealing for, say, a Python or Ruby developer to write things that are not a series of SQL statements, à la PL/pgSQL, and a lot of that (sans triggers) can be done over the wire anyway. Maybe embedded Javascript can help with this, but it's not the Here and Now.
If I had to invest in complexity with regard to versioning, I'd rather invest in Postgres being able to tell me a hashed version of all, or a controllable subset, of the tables, types, and attributes in the database, so as to quickly pick out drift between a freshly created database (produced by the migrations on disk against development) and any cruft that sneaks into a production server that is thought to be the same as the others. That would have huge impact on common use cases at this very moment in time, so it provides a reasonable backdrop against which to evaluate the cost/impact of the proposed feature.

--
fdr
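A rough sketch of that drift-detection idea using only what already exists: hash a canonical listing of user-visible columns and compare the result across servers (the scope and canonicalization here are illustrative choices, not a complete fingerprint):

  SELECT md5(string_agg(
             format('%s.%s.%s %s', table_schema, table_name,
                    column_name, data_type),
             E'\n' ORDER BY table_schema, table_name, ordinal_position))
         AS schema_fingerprint
  FROM information_schema.columns
  WHERE table_schema NOT IN ('pg_catalog', 'information_schema');

Two databases built from the same migrations should produce the same fingerprint; a mismatch flags the drift Daniel describes, though this particular query ignores functions, operators, and constraints.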
On Fri, Jan 20, 2012 at 3:33 PM, Daniel Farina <daniel@heroku.com> wrote:
> On Fri, Jan 20, 2012 at 2:48 PM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
>> Even if you give the version number in the CREATE EXTENSION command, it's only by convention that people actually maintain a sane versioning policy. If people don't take version management seriously, you will quickly end up with five different versions of an extension, all with version number 0.1.
>
> Projects are taking it seriously, and invest a lot of effort in it. There is no shortage of schema versioning frameworks, of varying levels of maturity... but some are quite complete by the standards of their users. However, there is little knowledge shared between them, and no database gives them much support, so idiosyncrasy becomes inevitable.

Speak of the devil: someone just posted about using extension versioning to manage schemas (using the existing plain-old-files mechanism):

http://philsorber.blogspot.com/2012/01/deploy-schemata-like-boss.html

He also links to a -hackers post Dimitri wrote last December. A few anecdotes do not constitute evidence, but it does look like some people pay attention to any additional versioning foothold they can get.

--
fdr
On Sun, Jan 22, 2012 at 3:20 PM, Daniel Farina <daniel@heroku.com> wrote:
> A few anecdotes do not constitute evidence, but it does look like some people pay attention to any additional versioning foothold they can get.

Sure, but just because some people do it doesn't make it a good idea. I can personally attest to having done many things over the years which initially appeared to be good ideas, but later turned out to be mind-bogglingly dumb. It's true that if we make it easy for people to use the extension mechanism as a way of storing versioning information for user code, then people will use it for that purpose, but the question is whether that's really what we want. I don't see what advantage it gives us.

Dimitri's proposal was to neuter the pg_dump support that is the raison d'être of the extension mechanism. That's clearly necessary if you don't want to end up with an unreloadable database, but it begs the question (which no one's really answered, AFAICT) of what good the extension mechanism is without that feature. There are certainly easier ways to remember a version number than building support for it into core.

If people create their own versioning mechanisms, they can create something which is tailor-made for their particular requirements, rather than relying on decisions which we made in core that may or may not be right for them (e.g. the lack of version ordering, or even that we have versions rather than some more general type of control table). I don't want to prejudge the outcome and say, oh, we should never have support for this concept in core - but neither do I want to embark on that project without a detailed understanding of where and how it is adding value. If people are doing management via "pure FEBE", good for them: but that doesn't explain why it shouldn't be done all in userspace, with all of the flexibility that gives.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> ... If people are doing management via "pure FEBE", good for them: but that doesn't explain why it shouldn't be done all in userspace, with all of the flexibility that gives.

On reflection it seems like this patch is simply offering the wrong solution for the problem. I agree that it could be useful to install extensions without having direct access to the server's filesystem, but it doesn't seem to follow that we must lobotomize existing extension features in order to have that. I pointed out earlier that you could get such functionality via contrib/adminpack, though people not unreasonably complained that that was pretty ugly and low-level. But couldn't we define some SQL-level operations to allow installing extension control and script files?

Probably the worst issue with that is that in typical installations, the share/extension/ directory would be read-only to the server, and a lot of people might be uncomfortable with making it writable. Not sure whether we should consider inventing another place to keep SQL-command-installed extensions, or just say "if you want this functionality you have to make share/extension/ writable".

regards, tom lane
On Sun, Jan 22, 2012 at 8:42 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sun, Jan 22, 2012 at 3:20 PM, Daniel Farina <daniel@heroku.com> wrote:
>> A few anecdotes do not constitute evidence, but it does look like some people pay attention to any additional versioning foothold they can get.
>
> Sure, but just because some people do it doesn't make it a good idea.

True, I did not mean to suggest that this is clearly the best mechanism, only to express support for the general idea that people are appreciative of, and willing to experiment with, any way to make version management better, and people who use those features are *probably* going to try to use them correctly. It's up to us to make the right thing easy, too; otherwise I fear we will see too many lousy version numbers creeping about in the wild.

> Dimitri's proposal was to neuter the pg_dump support that is the raison d'être of the extension mechanism. That's clearly necessary if you don't want to end up with an unreloadable database, but it begs the question (which no one's really answered, AFAICT) of what good the extension mechanism is without that feature.

Oh, no, non-reloadability is a really bad thing -- I'd say a pretty bad deal-breaker -- but as Tom wrote, it does seem like it should somehow be a tractable problem. Is it such a bad idea to store the literal text of the extension's pieces (control file and corresponding SQL program) in catalogs? I'm not sure I understand why everyone is so interested in a special interaction with the file system in some way. By the same token, extensions can be dumped in the literal syntax -- even the ones that were installed from a file.

> There are certainly easier ways to remember a version number than building support for it into core. If people create their own versioning mechanisms, they can create something which is tailor-made for their particular requirements, rather than relying on decisions which we made in core that may or may not be right for them (e.g. the lack of version ordering, or even that we have versions rather than some more general type of control table).

I understand the desire to avoid investing in something that is not what people want. However, in the interest of scoping the discussion to inline extension support, I can't understand the objection to supporting what is basically a different transport for precisely the same semantic operation as having to ssh into a machine and untar some files -- just without the bizarre side channel of ssh and file system mangling when one is loading trustable operators, itself a raft of usability issues if one wishes to enable more software reuse.

--
fdr
Daniel Farina wrote:
> Is it such a bad idea to store the literal text of the extension's pieces (control file and corresponding SQL program) in catalogs? I'm not sure I understand why everyone is so interested in a special interaction with the file system in some way. By the same token, extensions can be dumped in the literal syntax -- even the ones that were installed from a file.
>
>> There are certainly easier ways to remember a version number than building support for it into core. If people create their own versioning mechanisms, they can create something which is tailor-made for their particular requirements, rather than relying on decisions which we made in core that may or may not be right for them (e.g. the lack of version ordering, or even that we have versions rather than some more general type of control table).
>
> I understand the desire to avoid investing in something that is not what people want. However, in the interest of scoping the discussion to inline extension support, I can't understand the objection to supporting what is basically a different transport for precisely the same semantic operation as having to ssh into a machine and untar some files -- just without the bizarre side channel of ssh and file system mangling when one is loading trustable operators, itself a raft of usability issues if one wishes to enable more software reuse.

Or with adminpack: no ssh, but still interaction with the filesystem. Read-write filesystem access is a pain for the DBA. There are hacks that mitigate some of that, but not completely (mount -o ro partitions, for example...).

I am in favor of being able to create an extension directly in plain SQL, without file creation or access, and without system administrator privileges. Why wouldn't we want that?!

--
Cédric Villemain
+33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation
Tom Lane <tgl@sss.pgh.pa.us> writes:
> On reflection it seems like this patch is simply offering the wrong solution for the problem. I agree that it could be useful to install extensions without having direct access to the server's filesystem, but it doesn't seem to follow that we must lobotomize existing extension features in order to have that. I pointed out earlier that you could get such functionality via contrib/adminpack, though people not unreasonably complained that that was pretty ugly and low-level. But couldn't we define some SQL-level operations to allow installing extension control and script files?

Yeah, that's what I was trying to do…

> Probably the worst issue with that is that in typical installations, the share/extension/ directory would be read-only to the server, and a lot of people might be uncomfortable with making it writable. Not sure whether we should consider inventing another place to keep SQL-command-installed extensions, or just say "if you want this functionality you have to make share/extension/ writable".

So I've been wondering about storing the script content in the catalogs, but you would have to store all the update scripts too, and that's useless because you want to dump the current state of the system, which pg_dump is doing just fine.

Back to using the file system on the server when handed the script content over the protocol: we could get there with a new GUC telling the server where to find and store “inline” extensions, right?

  extension_inline_directory = '/path/to/some/writable/place'

Then creating an extension would look in both the system extension directory and the inline one, which covers dump and restore. Creating an inline extension means creating the .control and .sql files in the extension_inline_directory, then running the current code.

I can adapt the patch to this behavior this week.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On 23 January 2012 at 11:53, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> So I've been wondering about storing the script content in the catalogs, but you would have to store all the update scripts too, and that's useless because you want to dump the current state of the system, which pg_dump is doing just fine.
>
> Back to using the file system on the server when handed the script content over the protocol: we could get there with a new GUC telling the server where to find and store “inline” extensions, right?
>
>   extension_inline_directory = '/path/to/some/writable/place'
>
> Then creating an extension would look in both the system extension directory and the inline one, which covers dump and restore. Creating an inline extension means creating the .control and .sql files in the extension_inline_directory, then running the current code.
>
> I can adapt the patch to this behavior this week.

If we agree to have that per cluster, then it can be in $PGDATA/pg_extension or something similar...

--
Cédric Villemain
+33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation
On Mon, Jan 23, 2012 at 5:53 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
>> Probably the worst issue with that is that in typical installations, the share/extension/ directory would be read-only to the server, and a lot of people might be uncomfortable with making it writable. Not sure whether we should consider inventing another place to keep SQL-command-installed extensions, or just say "if you want this functionality you have to make share/extension/ writable".
>
> So I've been wondering about storing the script content in the catalogs, but you would have to store all the update scripts too, and that's useless because you want to dump the current state of the system, which pg_dump is doing just fine.

I'm not convinced that's useless. It would meet Dan's requirement to be able to manage the whole thing via the FE-BE protocol, and we could make the CREATE EXTENSION mechanism transparently search both the catalog and the filesystem when an extension is installed. I'm imagining that we'd create a catalog that would act as a sort of "virtual directory" - e.g.

  CREATE TABLE pg_extension_virtualdir (filename text, content text)

- which would be modifiable by the DBA and would be searched either before or after the filesystem itself. This catalog wouldn't be dumped by pg_dump, and there would be no changes to how extensions whose files are loaded from this catalog are dumped vs. those whose files are loaded from the filesystem. Rather, just as now, it would be the DBA's responsibility to make sure that the extensions needed to reload a given dump file are present on the new system - except now they'd have two choices of where to put the related files: on the file system, or in the per-database "virtual directory".

> Back to using the file system on the server when handed the script content over the protocol: we could get there with a new GUC telling the server where to find and store “inline” extensions, right?
>
>   extension_inline_directory = '/path/to/some/writable/place'
>
> Then creating an extension would look in both the system extension directory and the inline one, which covers dump and restore. Creating an inline extension means creating the .control and .sql files in the extension_inline_directory, then running the current code.

This is another possible approach, but it requires a bit more configuration, and we'd better think carefully about what a malicious non-superuser DBA can do by changing that GUC.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
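Hypothetically, using Robert's proposed "virtual directory" catalog might look like the following sketch (pg_extension_virtualdir and its lookup behavior are proposals, not existing features; the staged content is abbreviated from the pair example):

  -- The DBA stages the control and script files in the catalog...
  INSERT INTO pg_extension_virtualdir (filename, content) VALUES
      ('pair.control',  $$default_version = '1.0'$$),
      ('pair--1.0.sql', $$CREATE TYPE pair AS (k text, v text);$$);

  -- ...and CREATE EXTENSION resolves the files from the virtual
  -- directory just as it would from share/extension/ on disk.
  CREATE EXTENSION pair;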
Robert Haas <robertmhaas@gmail.com> writes:
> "virtual directory" - e.g.
>
>   CREATE TABLE pg_extension_virtualdir (filename text, content text)
>
> - which would be modifiable by the DBA and would be searched either before or after the filesystem itself. This catalog wouldn't be dumped by pg_dump, and there would be no changes to how extensions whose files are loaded from this catalog are dumped vs. those whose files are loaded from the filesystem. Rather, just as

That's the thing I don't like in this approach. Maybe it's just me, but the primary intention in working on extensions was to make dump and restore do the right thing all by themselves. Now, if we have “inline” (SQL only) extensions, the right thing happens to be very different from when you're dealing with contrib-like ones: namely, I would want the script to be dumped.

>> extension_inline_directory = '/path/to/some/writable/place'
>
> This is another possible approach, but it requires a bit more configuration, and we'd better think carefully about what a malicious non-superuser DBA can do by changing that GUC.

I think Cédric nailed it down upthread, proposing that we just use a PGDATA subdirectory called 'pg_extension'. In fact, that would need to be a per-database subdirectory. Then there's nothing to set up, nothing to abuse.

Also remember that we're limiting this feature to SQL-only extensions (because we don't want to be loading our .so from just anywhere in the system, and forcing them into a place owned by root is what gives confidence, IIUC). With SQL-only extensions, it's all non-superuser land anyway.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Mon, Jan 23, 2012 at 8:25 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> "virtual directory" - e.g. CREATE TABLE pg_extension_virtualdir (filename text, content text) - which would be modifiable by the DBA and would be searched either before or after the filesystem itself. This catalog wouldn't be dumped by pg_dump, and there would be no changes to how extensions whose files are loaded from this catalog are dumped vs. those whose files are loaded from the filesystem. Rather, just as
>
> That's the thing I don't like in this approach. Maybe it's just me, but the primary intention in working on extensions was to make dump and restore do the right thing all by themselves.

Well, fair enough: there's no accounting for taste. You could make my solution work with pg_dump in a fully automated fashion if you dumped out the "virtual directory" contents before dumping any CREATE EXTENSION statements, but I'm not going to get up on my soapbox and say that's the world's best design, so if you don't like it, fine! I am pretty concerned that we find a design that does not involve pg_dump needing to dump out the extension contents, though: that seems to me to be missing the point of having extensions in the first place.

> Now, if we have “inline” (SQL only) extensions, the right thing happens to be very different from when you're dealing with contrib-like ones: namely, I would want the script to be dumped.
>
>>> extension_inline_directory = '/path/to/some/writable/place'
>>
>> This is another possible approach, but it requires a bit more configuration, and we'd better think carefully about what a malicious non-superuser DBA can do by changing that GUC.
>
> I think Cédric nailed it down upthread, proposing that we just use a PGDATA subdirectory called 'pg_extension'. In fact, that would need to be a per-database subdirectory. Then there's nothing to set up, nothing to abuse.

Hmm, that might have something going for it. It seems comparatively easy to implement, and it also seems to do a pretty good job hiding the complexity under the hood where users don't have to worry about it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> I am pretty concerned that we find a design that does not involve pg_dump needing to dump out the extension contents, though: that seems to me to be missing the point of having extensions in the first place.

I was just trying to explain where I'm coming from; I'm not wedded to the idea, though, all the more so because I think we're reaching a much better one.

>> I think Cédric nailed it down upthread, proposing that we just use a PGDATA subdirectory called 'pg_extension'. In fact, that would need to be a per-database subdirectory. Then there's nothing to set up, nothing to abuse.
>
> Hmm, that might have something going for it. It seems comparatively easy to implement, and it also seems to do a pretty good job hiding the complexity under the hood where users don't have to worry about it.

And then basebackup and pg_upgrade would just work, and for dump and restore we still need to find something that doesn't violate the POLA.

I think that would mean offering a backend function that lists all files from a given extension and their content, including the control files, and a query that stores that output for “inline” extensions only. The content of the query result is formatted as a series of create extension and alter extension update statements (in the right order) in the dump file, so that it just transparently re-creates the files for you on the new database.

Or do you still want to insist that dump/restore shouldn't care about any extension, inline or not, so that you're given the responsibility to do the exact same thing yourself on the client side?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Mon, Jan 23, 2012 at 10:04 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> And then basebackup and pg_upgrade would just work, and for dump and restore we still need to find something that doesn't violate the POLA.
>
> I think that would mean offering a backend function that lists all files from a given extension and their content, including the control files, and a query that stores that output for “inline” extensions only. The content of the query result is formatted as a series of create extension and alter extension update statements (in the right order) in the dump file, so that it just transparently re-creates the files for you on the new database.

Hmm. But CREATE EXTENSION / ALTER EXTENSION doesn't seem right, because the files in the directory correspond to *available* extensions, not already-created ones. We need some way of dumping and restoring the files themselves, not the extension that can be created from them. I suspect internal functions (pg_whatever) make more sense than new SQL syntax, since this is really only to make pg_dump happy.

> Or do you still want to insist that dump/restore shouldn't care about any extension, inline or not, so that you're given the responsibility to do the exact same thing yourself on the client side?

How about adding a new pg_dump option to suppress this part of the dump? It seems to me that there will be people who want to do that; for example, it might be that all (or some, or one) of the extensions that were installed this way on the old server are installed globally on the new server. We need some way to cope with that. Having a new pg_dump option to suppress this output is not terribly granular, but maybe it'd be enough for round one.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> Hmm. But CREATE EXTENSION / ALTER EXTENSION doesn't seem right, because the files in the directory correspond to *available* extensions, not already-created ones. We need some way of dumping and

I would have limited the dump query to only known installed extensions, right. The update scripts are still needed because with inline extensions you typically never see a 1.2 script, but a 1.0 script, then 1.0--1.1, and then 1.1--1.2.

> restoring the files themselves, not the extension that can be created from them. I suspect internal functions (pg_whatever) make more sense than new SQL syntax, since this is really only to make pg_dump happy.

That could well be, yes, but what would such a function do that the commands are not doing? I'm ok not to invent specific syntax to solve that problem; I just think that we should already have all we need :)

>> Or do you still want to insist that dump/restore shouldn't care about any extension, inline or not, so that you're given the responsibility to do the exact same thing yourself on the client side?
>
> How about adding a new pg_dump option to suppress this part of the dump?

Makes sense, indeed. Well, one could manually filter the dump object list too, of course…

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Mon, Jan 23, 2012 at 10:26 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Hmm. But CREATE EXTENSION / ALTER EXTENSION doesn't seem right, because the files in the directory correspond to *available* extensions, not already-created ones. We need some way of dumping and
>
> I would have limited the dump query to only known installed extensions, right. The update scripts are still needed because with inline extensions you typically never see a 1.2 script, but a 1.0 script, then 1.0--1.1, and then 1.1--1.2.

Hmm, I don't think I like that design. I think we should view this as a way to embed the SQL and control files needed by the extension in the server, rather than as a separate thing called an inline extension. If pg_dump is going to dump those files, it ought to dump them all, not just some subset of them.

>> restoring the files themselves, not the extension that can be created from them. I suspect internal functions (pg_whatever) make more sense than new SQL syntax, since this is really only to make pg_dump happy.
>
> That could well be, yes, but what would such a function do that the commands are not doing? I'm ok not to invent specific syntax to solve that problem; I just think that we should already have all we need :)

I was thinking of something like

  pg_write_extension_file('foobar.control', '....content goes here....');

>>> Or do you still want to insist that dump/restore shouldn't care about any extension, inline or not, so that you're given the responsibility to do the exact same thing yourself on the client side?
>>
>> How about adding a new pg_dump option to suppress this part of the dump?
>
> Makes sense, indeed. Well, one could manually filter the dump object list too, of course…

True. I guess that's another reason why a global flag to shut it all off is probably sufficient, but I'm still in favor of having at least that much. I think the average user is much more likely to find pg_dump --skip-whatever than they are to understand how to do dump object filtering correctly, and even the advanced user may appreciate the shortcut on occasion.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> Hmm, I don't think I like that design. I think we should view this as a way to embed the SQL and control files needed by the extension in the server, rather than as a separate thing called an inline extension. If pg_dump is going to dump those files, it ought to dump them all, not just some subset of them.

Ok, but then, what about .so files? Wouldn't it make sense to be able to ship the executable modules needed as well, and if not, why not?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Mon, Jan 23, 2012 at 10:46 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Hmm, I don't think I like that design. I think we should view this as a way to embed the SQL and control files needed by the extension in the server, rather than as a separate thing called an inline extension. If pg_dump is going to dump those files, it ought to dump them all, not just some subset of them.
>
> Ok, but then, what about .so files? Wouldn't it make sense to be able to ship the executable modules needed as well, and if not, why not?

Sure, that would be as useful as any other part of this feature. We'd have to think carefully about how to make it secure, though: obviously it's no good to let a non-superuser database owner install compiled C code that gets loaded into the backend!

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
>> Ok, but then, what about .so files? Wouldn't it make sense to be able to ship the executable modules needed as well, and if not, why not?
>
> Sure, that would be as useful as any other part of this feature. We'd have to think carefully about how to make it secure, though: obviously it's no good to let a non-superuser database owner install compiled C code that gets loaded into the backend!

The big problem is that .so files are installed in directories where the system postgres user usually is not granted permission to write; that's root business. It has already been asked whether PostgreSQL could load a .so from a non-root location, and I think that to be consistent with your view of the world it would be good to add that feature. Then there's no such beast as an “inline” extension, so much as a way to install an extension over the protocol, without file system level access to the production server. This would need to be superuser only, of course.

That opens up the possibility of shipping the modules to any standby server too. The situation now is unfortunate: create extension hstore on the primary, then use it in some indexes, and the standby has no means to use those indexes until you fix things and install the same extension files there. The module itself could be accepted as a bytea value and written to the right place, where the server knows to load it from.

Now you can dump/restore any extension fully, and we can even ship any extension in the WAL stream (a new message is needed, though). The only remaining issue would be the default policy for including or excluding an extension in the dump, and I would be ok with a default of including none as of now, plus a pg_dump switch that you can repeat to include whichever extensions you wish. The effect is to transport the files and install them in the expected place on the target server where you restore. And now we really process all extensions the same way, and you can update inline an extension that you installed the 9.1 way, and vice versa.

Don't let me speak about extension distribution facilities just now :)

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Mon, Jan 23, 2012 at 8:17 AM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>>> Ok, but then, what about .so files?  Wouldn't it make sense to be able
>>> to also ship the executable modules needed, and if not, why not?
>
> Now you can dump/restore any extension fully, and we can even ship any
> extension in the WAL stream (a new message is needed though).

Things are still a bit ugly in the more complex cases: consider
PostGIS's linkage against libproj and other libraries.  In order to
cover all cases, I feel that what I need is an optional hook (for the
sake of argument, let's say it's another "command" type hook, e.g.
"archive_command") to be executed when extension (un)installation
occurs on a primary or is replayed on a standby, whereby I can
acquire the necessary dependencies for an extension or signal some
kind of error (exactly how that interfaces with the server is
delicate, should one want to supply good error messages to the user).

I think that hook could be useful for a number of reasons:

* Extension distribution (as long as we're downloading dependent
  libraries, why not get the extension too?)

* Extension distribution on standbys, too -- by much the same mechanism

* Extension whitelisting for non-superusers (are you allowed/can you
  even have that extension?)

And, more to the point, if one wants to make replication and
extensions work nicely together, I don't really see an option outside
such a hook, other than insisting on a form of packaging whereby all
dependencies are declared to Postgres and Postgres becomes the
dependency management system for all dependent binary assets.  That
could lead to a more cohesive system, but is also a pretty hefty
burden, both on this project and others.

But getting back to in-line extensions: I think it makes sense to
dump all extensions in their in-line representation, even if in some
situations carrying a copy of the extension in the backup is not
strictly necessary.  The only bloat is including the literal source
text of the extension in the dump.  With an extension-installation
hook, the literal version of the extension could be supplied but
ignored if a side-channel mechanism for getting the extension makes
sense.

--
fdr
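[If such a hook took the same shape as archive_command, the
configuration might read like the following.  This is an entirely
hypothetical GUC and placeholder syntax, sketched by analogy only;
nothing like it exists.]

    # postgresql.conf -- hypothetical setting, by analogy with
    # archive_command; imagine %e expanding to the extension name and
    # %v to the version, with a non-zero exit reported as an error.
    extension_install_command = '/usr/local/bin/fetch-ext-deps %e %v'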
On Mon, Jan 23, 2012 at 7:59 PM, Daniel Farina <daniel@heroku.com> wrote:
> Things are still a bit ugly in the more complex cases: consider
> PostGIS's linkage against libproj and other libraries.  In order to
> cover all cases, I feel that what I need is an optional hook (for the
> sake of argument, let's say it's another "command" type hook, e.g.
> "archive_command") to be executed when extension (un)installation
> occurs on a primary or is replayed on a standby, whereby I can
> acquire the necessary dependencies for an extension or signal some
> kind of error (exactly how that interfaces with the server is
> delicate, should one want to supply good error messages to the user).

There aren't a whole lot of good options for handling errors on the
standby, in general.  If the operation has already occurred on the
master, it's too late to decide that we're not going to have it happen
on the standby as well.

Of course, if we confine ourselves to control and SQL files, then it
doesn't really matter: the catalog entries from the master will make
it to the standby regardless of the absence of the SQL and control
files, and maybe we could simply decide that the operations which
write in pg_extension aren't WAL-logged and can be separately
performed on the standby if you want them there as well.

But, that's a bit unlike how we normally do things.  And if we're
going to WAL-log the writing of the extension files, then I start to
feel like we should go back to putting the data in a system catalog
rather than the filesystem, because inventing a new WAL record type
for this seems like it will make things quite a bit more complicated
for no particular benefit.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Jan 23, 2012 at 7:59 PM, Daniel Farina <daniel@heroku.com> wrote:
>> Things are still a bit ugly in the more complex cases: consider
>> PostGIS's linkage against libproj and other libraries.  In order to

After thinking about that for a while, I think we should not include
the shared module (.so) in the scope of this patch nor of 9.2.
Ensuring that the .so dependencies are met is a problem well outside
the reach of the PostgreSQL system.

>> cover all cases, I feel that what I need is an optional hook (for the
>> sake of argument, let's say it's another "command" type hook, e.g.
>> "archive_command") to be executed when extension (un)installation
>> occurs on a primary or is replayed on a standby, whereby I can
>> acquire the necessary dependencies for an extension or signal some
>> kind of error (exactly how that interfaces with the server is
>> delicate, should one want to supply good error messages to the user).

Supporting command triggers on the replica is on the TODO list.
Having that in 9.2 is far from granted though, but the catalog and
syntax already have support for the feature.

> But, that's a bit unlike how we normally do things.  And if we're
> going to WAL-log the writing of the extension files, then I start to
> feel like we should go back to putting the data in a system catalog
> rather than the filesystem, because inventing a new WAL record type
> for this seems like it will make things quite a bit more complicated
> for no particular benefit.

Exactly, and we should talk about concurrent accesses to the
filesystem too, and about how to have a clean transactional backup
including the scripts when that pg_dump option is used.

So I'm going to prepare the next version of the patch with this
design:

- in-catalog extension scripts for inline extensions:

    pg_extension_script(extoid, oldversion, version, script)

  oldversion is null when create extension is used, except when using
  the create extension from 'unpackaged' form

- see about adding more control properties in the catalog?

- current code that parses the filenames to determine the upgrade path
  will have to be able to take the version strings from the new
  catalog as an alternative, and getting to the script content must be
  able to select from the catalog or read a file on disk

- pg_dump defaults to not dumping extension content

- pg_dump --include-extension-scripts dumps the scripts found either
  in the filesystem or the catalog: a create script first, then any
  number of update scripts as needed to reach the current installed
  version

- same as we have -t, add -e --extension to pg_dump so that you can
  choose to dump only a given extension

The extension dumping will not include the shared modules, so if your
extension depends on them being installed on the server, you will be
much better served with some OS level packaging.

Not for 9.2, but I can't help thinking that if we could manage to host
the .so module itself in the catalogs, we could solve updating it in a
transactional way and, more importantly, host it per-database, rather
than having the modules work per major version (not even per cluster)
while the extension mechanism works per-database inside each cluster.
But that's work for another release.

Any comments before I spend time coding this?

--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
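[Spelled out as SQL for concreteness, the proposed catalog from the
design above could look roughly like this; the column types are
guesses from the description, not taken from any patch.]

    -- Sketch only: the shape of the proposed pg_extension_script
    -- catalog, with types guessed from the design notes above.
    CREATE TABLE pg_extension_script (
        extoid     oid,   -- the extension this script belongs to
        oldversion text,  -- null for the initial create script
        version    text,  -- version this script installs/upgrades to
        script     text   -- the SQL script itself
    );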
On Jan 26, 2012, at 9:40 AM, Dimitri Fontaine wrote:

> So I'm going to prepare the next version of the patch with this
> design:
>
> - in-catalog extension scripts for inline extensions:
>
>     pg_extension_script(extoid, oldversion, version, script)
>
>   oldversion is null when create extension is used, except when using
>   the create extension from 'unpackaged' form

Would you keep all the migration scripts used over time to upgrade
from one version to another?

> - see about adding more control properties in the catalog?
>
> - current code that parses the filenames to determine the upgrade path
>   will have to be able to take the version strings from the new
>   catalog as an alternative, and getting to the script content must be
>   able to select from the catalog or read a file on disk
>
> - pg_dump defaults to not dumping extension content
>
> - pg_dump --include-extension-scripts dumps the scripts found either
>   in the filesystem or the catalog: a create script first, then any
>   number of update scripts as needed to reach the current installed
>   version
>
> - same as we have -t, add -e --extension to pg_dump so that you can
>   choose to dump only a given extension

Also --exclude-extension?

> The extension dumping will not include the shared modules, so if your
> extension depends on them being installed on the server, you will be
> much better served with some OS level packaging.

Or you must make sure it’s installed on the system before you restore.

> Not for 9.2, but I can't help thinking that if we could manage to host
> the .so module itself in the catalogs, we could solve updating it in a
> transactional way and, more importantly, host it per-database, rather
> than having the modules work per major version (not even per cluster)
> while the extension mechanism works per-database inside each cluster.
> But that's work for another release.

+1 Cloud vendors will *love* this.

Best,

David
On Thu, Jan 26, 2012 at 3:48 PM, David E. Wheeler
<david@justatheory.com> wrote:
> On Jan 26, 2012, at 9:40 AM, Dimitri Fontaine wrote:
>
>> Not for 9.2, but I can't help thinking that if we could manage to host
>> the .so module itself in the catalogs, we could solve updating it in a
>> transactional way and, more importantly, host it per-database, rather
>> than having the modules work per major version (not even per cluster)
>> while the extension mechanism works per-database inside each cluster.
>> But that's work for another release.
>
> +1 Cloud vendors will *love* this.

Confirmed.  Let me share my perspective.  I'll begin by describing the
current state of runtime code dependency management for comparison.

In Ruby, any user can push an application to our platform which relies
on any/every ruby "gem" ever released (give or take).  These gems may
require exact releases of other gems, have elaborate system
dependencies, and/or natively compiled code components.  This is
thanks to the rubygems.org repository, the "gem" system, and recently
but crucially, the "bundler" system for resolving and isolating
dependencies.  Releasing a new gem takes moments, and I have
personally released a half dozen of no real consequence to the world
which I use from time to time.

In contrast, the idea that any person or team of people could possibly
review the literally hundreds of gems released each day is no longer
plausible.  The only feasible solution for providing a robust service
is to engineer a solution which can be operated from inside the
cluster to install any library whatsoever.

Our aim is, put simply, to be able to support every extension in the
world, at once, under cascading replication, across major catalogue
upgrades.  We hope this ideal is shared by the community at large,
since our problems are generally the same as other users', just writ
large.

-pvh

PS: As an aside, because of the many problems with in-cluster
multi-tenancy (to pick just one example, resource isolation between
users) I have no security concerns with giving users every ability to
execute code as the cluster owner's UNIX user.  On our service we do
not restrict our users' access to superuser out of spite, but to
reduce the available surface area for self-destruction.

--
Peter van Hardenberg
San Francisco, California
"Everything was beautiful, and nothing hurt." -- Kurt Vonnegut
>> So I'm going to prepare the next version of the patch with this
>> design:
>>
>> - in-catalog extension scripts for inline extensions:
>>
>>     pg_extension_script(extoid, oldversion, version, script)
>>
>>   oldversion is null when create extension is used, except when using
>>   the create extension from 'unpackaged' form
>
> Would you keep all the migration scripts used over time to upgrade
> from one version to another?

Yes, that is the idea.  You can then replay the whole stack of SQL
scripts needed to go from version "foo" to version "bar" (here again,
we don't care about the version format).

>> - same as we have -t, add -e --extension to pg_dump so that you can
>>   choose to dump only a given extension
>
> Also --exclude-extension?

It might be the default.  We need something to dump the content of
pg_catalog.pg_extension_script (or whatever table is going to contain
the SQL code), per extension or for all of them.

--
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/     PostgreSQL: Support 24x7 - Développement, Expertise et Formation
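[Under that scheme, fetching one step of an upgrade path would be a
plain catalog lookup.  The query below is illustrative only, written
against the sketched pg_extension_script shape rather than any real
catalog; pg_extension and its extname column do exist.]

    -- Illustrative: fetch the script taking extension 'pair' from
    -- version '1.0' to the next version, per the sketched catalog.
    SELECT s.version, s.script
      FROM pg_extension_script s
      JOIN pg_extension e ON e.oid = s.extoid
     WHERE e.extname = 'pair'
       AND s.oldversion = '1.0';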
On Jan 27, 2012, at 2:19 AM, Cédric Villemain wrote:

>> Also --exclude-extension?
>
> It might be the default.  We need something to dump the content of
> pg_catalog.pg_extension_script (or whatever table is going to contain
> the SQL code), per extension or for all of them.

I think dim said --no-extensions would be the default, but I’m
thinking it would be useful to have --with-extensions to include them
all, but then be able to --exclude-extension for a select few.

Best,

David
On 28 January 2012 at 21:46, David E. Wheeler <david@justatheory.com>
wrote:
> On Jan 27, 2012, at 2:19 AM, Cédric Villemain wrote:
>
>>> Also --exclude-extension?
>>
>> It might be the default.  We need something to dump the content of
>> pg_catalog.pg_extension_script (or whatever table is going to contain
>> the SQL code), per extension or for all of them.
>
> I think dim said --no-extensions would be the default, but I’m
> thinking it would be useful to have --with-extensions to include them
> all, but then be able to --exclude-extension for a select few.

So am I.

--
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/     PostgreSQL: Support 24x7 - Développement, Expertise et Formation
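[With the switches being floated here, none of which exist in any
released pg_dump, a backup embedding every extension's scripts except
one might be taken as follows; the flag names are proposals only.]

    # Proposed flags, for illustration only: embed all extension
    # scripts in the dump, except those of postgis.
    pg_dump --with-extensions --exclude-extension=postgis mydb > mydb.sql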
On Thu, Jan 26, 2012 at 12:40 PM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:
> So I'm going to prepare the next version of the patch with this
> design:

Considering that this constitutes a major redesign, and that the
updated patch hasn't been posted over a month later, it seems past
time to mark this Returned with Feedback.  So I've done that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> Considering that this constitutes a major redesign, and that the
> updated patch hasn't been posted over a month later, it seems past
> time to mark this Returned with Feedback.  So I've done that.

I was hoping to be able to code that while the CF is running, but
obviously that won't happen given the other patches I'm playing with.

This major redesign you're mentioning is very good feedback already,
and will allow me to implement a non-controversial patch for the next
release.  Thanks for dealing with the CF parts of that.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support