Thread: Extension Templates S03E11
Hi,

Please find attached to this email the latest and greatest version of
in-line SQL-only extensions support, known as "Extension Templates",
which could be renamed "In-Catalog Extension Templates".

I've included a high-level description of the patch, written in the
style of a detailed commit message for a feature of this source-code
impact level.

The attached patch is known to address all points raised in the previous
reviews and to implement the best design we could come up with, thanks
to immense help from Tom, Heikki and Markus. Of course, the bugs are all
my precious.

I'm going to register that patch in the next commitfest. It's not the
only patch I intend to register for September, though, as I want to get
to a usable situation with Event Triggers, so you can expect a series of
patches for that, covering what couldn't make it previously.

As I think this WIP is about as ready-for-committer as it will ever be,
it would be fantastic if we could do a single committer review before
CF2013-09 so that I know whether it's going to be accepted… or not. Well,
at least it's in the queue already; we'll see what can be done.

Regards,

---
Implement in-catalog Extension Template facility.

Previously, the only way to CREATE EXTENSION involved installing
file-system templates in a place generally owned by root: creation
scripts, upgrade scripts, the main control file and auxiliary control
files. This patch implements a way to upload all those resources into
the catalogs, so that a PostgreSQL connection is all you need to make
an extension available.

By design, and for security reasons, the current Extension Template
facility cannot deal with extensions that need to load a DSO module
into the backend. Using any other PL is supported, though.

An extension created from a template depends on it, and the templates
are part of any backup taken with pg_dump, so that at pg_restore time,
when CREATE EXTENSION is executed, the templates are already in place.

To be able to do that, though, we need a difference in behavior between
the classic file-system templates and the catalog templates: no
dependency tracking happens at all with file-system templates, which
can be changed at will, or even removed, after an extension has been
instantiated from them.

Apart from dependency tracking, the only other difference between
file-system templates and catalog templates is that the latter are
managed per-database. That file-system templates are managed per major
version of PostgreSQL is considered a drawback of that method, not to
be imitated by the in-catalog system, which is more flexible by design.

At CREATE EXTENSION time, file-system templates are always preferred to
catalog templates. Also, it's prohibited to make an extension available
in the catalogs if an extension of the same name is already available
from file-system templates. That said, some "race conditions" still
make it possible to have the same extension name available as both a
file-system template and a catalog template. Even though only the
former will ever get installed, it's been deemed prudent to restrict
in-catalog extension templates to superusers only.

--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Attachment
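[Editor's note] To make the proposal concrete, here is a minimal sketch
of the intended workflow, using the CREATE TEMPLATE FOR EXTENSION
spelling that comes up later in this thread; the exact option syntax
and the "pair" example are assumptions for illustration, not taken from
the patch itself:

    -- upload the install script over a regular libpq connection,
    -- with no file-system access on the server required
    CREATE TEMPLATE FOR EXTENSION pair VERSION '1.0' AS
    $$
      CREATE TYPE pair AS (k text, v text);
      CREATE FUNCTION pair(text, text) RETURNS pair
             LANGUAGE sql AS 'SELECT ROW($1, $2)::pair';
    $$;

    -- then instantiate the extension from the in-catalog template;
    -- the extension records a dependency on its template, and pg_dump
    -- emits the template before the CREATE EXTENSION statement
    CREATE EXTENSION pair;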
On 1 August 2013 18:01, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> Please find attached to this email the latest and greatest version of
> in-line SQL-only extensions support, known as "Extension Templates",
> which could be renamed "In-Catalog Extension Templates".
> [...]
Could you please resubmit this without using SnapshotNow as it's no longer supported?
Thanks
Thom
Thom Brown <thom@linux.com> writes:
> Could you please resubmit this without using SnapshotNow as it's no
> longer supported?

Sure, sorry that I missed that, please find v12 attached.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Attachment
Hi,

On 2013-08-04 15:20, Dimitri Fontaine wrote:
> Thom Brown <thom@linux.com> writes:
>> Could you please resubmit this without using SnapshotNow as it's no
>> longer supported?
> Sure, sorry that I missed that, please find v12 attached.

Here's a review for this patch.

* Is the patch in a patch format which has context? (eg: context diff format)

Yes.

* Does it apply cleanly to the current git master?

No, it has one reject in src/bin/pg_dump/pg_backup_archiver.c.
It was obvious to fix; version 12a is attached.

* Does it include reasonable tests, necessary doc patches, etc?

It has extended the SQL reference nicely, but the reference to
<xref linkend="extend-extensions"> was not obvious enough regarding the
list of control parameters. The SQL syntax has them in all caps, while
the referenced section has them in lower case. It's easy to miss them
while just browsing for e.g. RELOCATABLE; I had to go back twice to find
the proper part of the text.

I would like to see the control parameters documented in all caps
in CREATE EXTENSION TEMPLATE. Then ALTER EXTENSION TEMPLATE should
reference the CREATE page instead of the longer text in
<xref linkend="extend-extensions">. This xref can still be there for
reference in all three of CREATE/ALTER/DROP EXTENSION TEMPLATE.

* Does the patch actually implement what it's supposed to do?

Yes.

* Do we want that?

Yes.

* Do we already have it?

No.

* Does it follow SQL spec, or the community-agreed behavior?

There's no such provision in the spec. As far as I can tell, it follows
the community-agreed behavior.

* Does it include pg_dump support (if applicable)?

Yes. But the version check in src/bin/pg_dump/pg_dump.c is already
wrong, since this patch missed 9.3:

    + /*
    +  * Before 9.3, there are no extension templates.
    +  */
    + if (fout->remoteVersion < 90300)
    + {
    +     *numExtensionTemplates = 0;
    +     return NULL;
    + }

* Are there dangers?

I don't think so.

* Have all the bases been covered?

It seems so.

* Does the feature work as advertised?

Yes.

* Are there corner cases the author has failed to consider?

I don't know.

* Are there any assertion failures or crashes?

No.

* Does the patch slow down simple tests?

No.

* If it claims to improve performance, does it?

n/a

* Does it slow down other things?

No.

* Does it follow the project coding guidelines?

Yes. Nitpicking: this chunk has an extra unnecessary space:

    diff --git a/src/backend/catalog/Makefile b/src/backend/catalog/Makefile
    index c4d3f3c..689dc37 100644
    --- a/src/backend/catalog/Makefile
    +++ b/src/backend/catalog/Makefile
    @@ -38,6 +38,7 @@ POSTGRES_BKI_SRCS = $(addprefix $(top_srcdir)/src/include/catalog/,\
     	pg_authid.h pg_auth_members.h pg_shdepend.h pg_shdescription.h \
     	pg_ts_config.h pg_ts_config_map.h pg_ts_dict.h \
     	pg_ts_parser.h pg_ts_template.h pg_extension.h \
    +	pg_extension_control.h pg_extension_template.h pg_extension_uptmpl.h \
     	pg_foreign_data_wrapper.h pg_foreign_server.h pg_user_mapping.h \
     	pg_foreign_table.h \
     	pg_default_acl.h pg_seclabel.h pg_shseclabel.h pg_collation.h pg_range.h \

* Are there portability issues?

No.

* Will it work on Windows/BSD etc?

It should.

* Are the comments sufficient and accurate?

Yes.

* Does it do what it says, correctly?

According to the added regression tests, yes.

* Does it produce compiler warnings?

No.

* Can you make it crash?

No.

* Is everything done in a way that fits together coherently with other
  features/modules?

I think so.

* Are there interdependencies that can cause problems?

I don't know.
Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
Attachment
Hi,

Boszormenyi Zoltan <zb@cybertec.at> writes:
> Here's a review for this patch.

Thanks for that review Zoltan!

> No, it has one reject in src/bin/pg_dump/pg_backup_archiver.c.
> It was obvious to fix; version 12a is attached.

Included in the new version of the patch (v13), attached.

> It has extended the SQL reference nicely, but the reference to
> <xref linkend="extend-extensions"> was not obvious enough
> regarding the list of control parameters.

Fixed in the attached.

> I would like to see the control parameters documented in all caps
> in CREATE EXTENSION TEMPLATE. Then ALTER EXTENSION TEMPLATE should
> reference the CREATE page instead of the longer text in
> <xref linkend="extend-extensions">. This xref can still be there for
> reference in all three of CREATE/ALTER/DROP EXTENSION TEMPLATE.

I didn't follow your proposal exactly here, because I didn't want to
have to maintain the control parameter description list in two
different places. I've still added a detailed list with references and
details, and more importantly with coverage of e.g. "NORELOCATABLE",
which is not covered in the referenced material.

> But the version check in src/bin/pg_dump/pg_dump.c is already wrong,
> since this patch missed 9.3:
>
>     + if (fout->remoteVersion < 90300)

Fixed.

> Nitpicking: this chunk has an extra unnecessary space:

Fixed.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Attachment
On 2013-08-27 18:09, Dimitri Fontaine wrote:
> Hi,
>
> Boszormenyi Zoltan <zb@cybertec.at> writes:
>> Here's a review for this patch.
>
> Thanks for that review Zoltan!

You're welcome.

>> I would like to see the control parameters documented in all caps
>> in CREATE EXTENSION TEMPLATE. Then ALTER EXTENSION TEMPLATE should
>> reference the CREATE page instead of the longer text in
>> <xref linkend="extend-extensions">. This xref can still be there for
>> reference in all three of CREATE/ALTER/DROP EXTENSION TEMPLATE.
>
> I didn't follow your proposal exactly here, because I didn't want to
> have to maintain the control parameter description list in two
> different places. I've still added a detailed list with references and
> details, and more importantly with coverage of e.g. "NORELOCATABLE",
> which is not covered in the referenced material.

It looks better. Now that the lowercase keywords are expected by the
eye in the referenced text, they are much easier to find. Thanks.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
This doesn't build:

    make -C pg_upgrade_support all
    make[2]: Entering directory `/var/lib/jenkins/jobs/postgresql_commitfest_world/workspace/contrib/pg_upgrade_support'
    gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -fpic -I. -I. -I../../src/include -D_GNU_SOURCE -I/usr/include/libxml2 -c -o pg_upgrade_support.o pg_upgrade_support.c -MMD -MP -MF .deps/pg_upgrade_support.Po
    pg_upgrade_support.c: In function ‘create_empty_extension’:
    pg_upgrade_support.c:193:8: error: too few arguments to function ‘InsertExtensionTuple’
    In file included from pg_upgrade_support.c:16:0:
    ../../src/include/commands/extension.h:60:12: note: declared here
    make[2]: *** [pg_upgrade_support.o] Error 1
Peter Eisentraut <peter_e@gmx.net> writes:
> make -C pg_upgrade_support all

Do we have something automated to easily test pg_upgrade?

My memory of how pg_upgrade works with extensions makes me believe that
I don't have anything special to do when those extensions have been made
available through a template: the scripts are not used. That said, we
want to retain some new dependencies… The contrib/pg_upgrade/IMPLEMENTATION
file is silent about upgrading extensions…

I fixed the pg_upgrade_support compile in my branch here; please find
the patch-on-patch attached. I also checked that all of contrib now
builds fine and didn't find any other problem.

Regards,
--
Dimitri Fontaine
06 63 07 10 78
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Attachment
Dimitri Fontaine wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
> > make -C pg_upgrade_support all
>
> Do we have something automated to easily test pg_upgrade?

"make check" in contrib/pg_upgrade should do the trick.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> "make check" in contrib/pg_upgrade should do the trick.

PASSED

Even after I added extension to the serial_schedule. I don't know if I
need to do anything specific in that area; I'll wait for some feedback
before sending a new version of the patch. Meanwhile my branch is
updated, and I sent the patch-on-patch too.

Regards,
--
Dimitri Fontaine
06 63 07 10 78
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> "make check" in contrib/pg_upgrade should do the trick.
>
> PASSED
>
> Even after I added extension to the serial_schedule. I don't know if I
> need to do anything specific in that area; I'll wait for some feedback
> before sending a new version of the patch. Meanwhile my branch is
> updated, and I sent the patch-on-patch too.

Here's v14 of the patch with pg_upgrade support fixed, so that the
automated setup that Peter built is able to have at it!

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Attachment
On Thu, 2013-08-29 at 12:16 +0200, Dimitri Fontaine wrote:
> Here's v14 of the patch with pg_upgrade support fixed, so that the
> automated setup that Peter built is able to have at it!

Fails cpluspluscheck:

    In file included from /tmp/cpluspluscheck.5g2uWw/test.cpp:3:0:
    ./src/include/commands/template.h:47:8: error: ‘ExtensionControl’ does not name a type
    ./src/include/commands/template.h:51:8: error: ‘ExtensionControl’ does not name a type

I think this actually just means the header does not include all it
needs by itself.
Peter Eisentraut <peter_e@gmx.net> writes:
> Fails cpluspluscheck:

Turns out I'm discovering that particular check, thanks! I could
reproduce and fix the error locally after being led to the command
./src/tools/pginclude/cpluspluscheck.

So please find v15 of the patch attached to this email; it passes all
previously done checks and now this one too.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Attachment
On Mon, Sep 02, 2013 at 02:32:16AM -0400, Peter Eisentraut wrote:
> On Thu, 2013-08-29 at 12:16 +0200, Dimitri Fontaine wrote:
> > Here's v14 of the patch with pg_upgrade support fixed, so that the
> > automated setup that Peter built is able to have at it!
>
> Fails cpluspluscheck:
>
>     In file included from /tmp/cpluspluscheck.5g2uWw/test.cpp:3:0:
>     ./src/include/commands/template.h:47:8: error: ‘ExtensionControl’ does not name a type
>     ./src/include/commands/template.h:51:8: error: ‘ExtensionControl’ does not name a type
>
> I think this actually just means the header does not include all it
> needs by itself.

Is there some standard set of checks you run on new patches, and are
the results showing up on, say, the buildfarm or some other CI
dashboard?

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
On Tue, Sep 3, 2013 at 4:20 AM, David Fetter <david@fetter.org> wrote:
> [...]
> Is there some standard set of checks you run on new patches, and are
> the results showing up on, say, the buildfarm or some other CI
> dashboard?

I believe that Peter does all those checks using his own Jenkins
environment:

http://pgci.eisentraut.org/jenkins/

For the commitfest patches, here you go:

http://pgci.eisentraut.org/jenkins/view/All/job/postgresql_commitfest_world/

--
Michael
Dimitri,

* Dimitri Fontaine (dimitri@2ndQuadrant.fr) wrote:
> So please find v15 of the patch attached to this email; it passes all
> previously done checks and now this one too.

Looks like there's been a bit of unfortunate bitrot due to Tom's change
to disable fancy output:

    patching file src/test/regress/expected/sanity_check.out
    Hunk #1 FAILED at 104.
    Hunk #2 FAILED at 166.
    2 out of 2 hunks FAILED -- saving rejects to file src/test/regress/expected/sanity_check.out.rej

Are there any other changes you have pending for this..? Would be nice
to see the latest version which you've tested and which patches cleanly
against master... ;)

I'll still go ahead and start looking through this, per our discussion.

Thanks,
Stephen
Stephen Frost <sfrost@snowman.net> writes:
> Are there any other changes you have pending for this..? Would be nice
> to see the latest version which you've tested and which patches cleanly
> against master... ;)

I just rebased now, please see attached. I had to pick new OIDs in some
places too, but that's about it.

> I'll still go ahead and start looking through this, per our discussion.

Thanks!
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Attachment
Dimitri Fontaine wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > Are there any other changes you have pending for this..? Would be nice
> > to see the latest version which you've tested and which patches cleanly
> > against master... ;)
>
> I just rebased now, please see attached. I had to pick new OIDs in some
> places too, but that's about it.

I think you're missing support for getObjectIdentity and
getObjectTypeDescription for the new object classes.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Mon, 2013-11-04 at 08:43 -0500, Stephen Frost wrote:
> I'll still go ahead and start looking through this, per our discussion.

In the CF app, this is marked "Ready for Committer". That's a bit vague
here, considering Dimitri, you, Peter, and Alvaro are all committers.
Who is this patch waiting on? Is the discussion concluding, or does it
need another round of review?

Regards,
Jeff Davis
Hi,

Jeff Davis <pgsql@j-davis.com> writes:
> In the CF app, this is marked "Ready for Committer". That's a bit vague
> here, considering Dimitri, you, Peter, and Alvaro are all committers.
> Who is this patch waiting on? Is the discussion concluding, or does it
> need another round of review?

Thanks for the confusion I guess, but I'm no committer here ;-)

This patch received extensive review in July, and I think it now
properly implements the design proposed by Tom and Heikki in 9.3/CF4.
As the patch didn't make it then, yes, it needs another (final) round
of review. The main difficulty in reviewing is understanding the design
and the relation between our current model of extensions and what this
patch offers. You might find the discussions we had with Markus Wanner
quite useful in this light.

The current situation is that I believe the patch implements the same
"template" model as the on-disk extensions, down to dependency
tracking. IIRC I left only one differing behavior: you're not allowed
to DROP an Extension Template when it's needed for a dump-and-restore
cycle, whereas at the file-system level you could of course do that
(and pg_restore on a new system would depend on other files).

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Dimitri,

* Dimitri Fontaine (dimitri@2ndQuadrant.fr) wrote:
> As the patch didn't make it then, yes, it needs another (final) round
> of review. The main difficulty in reviewing is understanding the design
> and the relation between our current model of extensions and what this
> patch offers.

I'm afraid this really needs more work, at least of the more mundane
kind. I started working through this patch, but when I hit
"get_template_oid", I was reminded of the discussion we had back in
January around using just 'template' everywhere:

http://www.postgresql.org/message-id/20130118182156.GF16126@tamriel.snowman.net

We already have other 'template' objects in the system and I'm not
excited about the confusion. This also applies to 'CreateTemplate',
'CreateTemplateTupleDesc', right down to 'template.h' and 'template.c'.

Attached is a patch against v16 which fixes up a few documentation
issues (I'm pretty sure extension templates and aggregates are
unrelated..), and points out that there is zero documentation on these
new catalog tables (look for 'XXX' in the patch) along with a few other
areas which could use improvement.

Thanks,
Stephen
Attachment
On 24.11.2013 00:19, Dimitri Fontaine wrote:
> Jeff Davis <pgsql@j-davis.com> writes:
>> In the CF app, this is marked "Ready for Committer". That's a bit vague
>> here, considering Dimitri, you, Peter, and Alvaro are all committers.
>> Who is this patch waiting on? Is the discussion concluding, or does it
>> need another round of review?
>
> Thanks for the confusion I guess, but I'm no committer here ;-)
>
> This patch received extensive review in July, and I think it now
> properly implements the design proposed by Tom and Heikki in 9.3/CF4.

Ok, since my name has been mentioned, I'll bite..

I still don't like this. What I suggested back in December was to have
a simple mechanism to upload an extension zip file to the server via
libpq (http://www.postgresql.org/message-id/50BF80A6.20500@vmware.com).
The idea developed from that into the concept of extension templates,
but the original idea was lost somewhere along the way.

Back in December, when I agreed that "upload zip file via libpq" might
be useful, Tom suggested that we call the control+sql file a "template",
and the installed entity an "extension". So far so good. Now comes the
patch, and an extension template no longer means a control+sql file. It
means an entity installed in the database that contains the same
information as a control+sql file, but in a new format. In fact, what
*do* you call the control+sql file that lies on the filesystem? Not a
template, apparently.

I want to be able to download extension.zip from pgxn.org and then
install it on a server. I want to be able to install it the traditional
way, by unzipping it to the filesystem, or via libpq by using this new
feature. I do *not* want to rewrite the extension using a new CREATE
TEMPLATE FOR EXTENSION syntax to do the latter. I want to be able to
install the *same* zip file using either method.

- Heikki
* Heikki Linnakangas (hlinnakangas@vmware.com) wrote:
> On 24.11.2013 00:19, Dimitri Fontaine wrote:
> > This patch received extensive review in July, and I think it now
> > properly implements the design proposed by Tom and Heikki in 9.3/CF4.
>
> I still don't like this. What I suggested back in December was to have
> a simple mechanism to upload an extension zip file to the server via
> libpq (http://www.postgresql.org/message-id/50BF80A6.20500@vmware.com).
> The idea developed from that into the concept of extension templates,
> but the original idea was lost somewhere along the way.

I hate to admit it, but I kind of agree.. While reviewing the patch and
thinking about it, the whole thing really did start to strike me as a
little too 'meta'. We want a way to package and ship extensions which
can then be loaded via the libpq protocol. I'm not sure that there's
really any need to track all of this in catalog tables.

Also as part of the patch review, I went back and looked through some
of the older threads and noticed the understandable concern, given the
current patch, about non-superusers being able to create extension
templates. I'm not sure that an approach which allows a zip file to be
uploaded would be something we could let non-superusers do, but I really
feel like we need a way for non-superusers to create extensions (when
they don't have untrusted-language components).

Now, given these two trains of thought, I start to see that we may want
to avoid non-superusers being able to create arbitrary files on disk,
even in a controlled area. We've managed that for various other objects
(tables, et al); I'm sure we could work out a solution to that issue...

Thanks,
Stephen
On Tue, 2013-11-26 at 01:37 +0200, Heikki Linnakangas wrote:
> Back in December, when I agreed that "upload zip file via libpq" might
> be useful, Tom suggested that we call the control+sql file a "template",
> and the installed entity an "extension".

Simply uploading "safe" extension files (i.e. not native libraries)
using the protocol seems narrower in scope and less controversial.

However, I like the idea of putting extensions in the catalogs (aside
from DSOs, of course), too. Setting aside compatibility problems with
existing extensions, do you think that's a reasonable goal to work
toward, or do you think it's a dead end?

Regards,
Jeff Davis
Hi,

Heikki Linnakangas <hlinnakangas@vmware.com> writes:
> I still don't like this. What I suggested back in December was to have
> a simple mechanism to upload an extension zip file to the server via
> libpq (http://www.postgresql.org/message-id/50BF80A6.20500@vmware.com).
> The idea developed from that into the concept of extension templates,
> but the original idea was lost somewhere along the way.

And I will quote Andres' answer to that same proposal:

http://www.postgresql.org/message-id/20121205172747.GC27424@awork2.anarazel.de

    So having a mode for pg_dump that actually makes dumps that are
    usable for recovering after a disaster seems sensible to me.
    Otherwise you need to redeploy from the VCS or whatever, which isn't
    really what you want when restoring a database backup.

    Comparing the situation to the one where you have extensions
    provided by the packaging system or by /contrib or whatever doesn't
    seem to be all that valid to me.

If you continue reading the thread from back then, the conclusion was to
drop the patch I was then proposing and instead work on the one we are
currently reviewing.

> Back in December, when I agreed that "upload zip file via libpq" might
> be useful, Tom suggested that we call the control+sql file a "template",
> and the installed entity an "extension". So far so good. Now comes the
> patch, and an extension template no longer means a control+sql file.
> In fact, what *do* you call the control+sql file that lies on the
> filesystem? Not a template, apparently.

It's historical. To make it possible to start with some extension patch
in the 9.0 development cycle, it was decided to only target
contrib-style extensions. Thanks to that, we added something in 9.1. In
practice, the patch currently under review makes the file-system-based
model and the catalog-based model behave the same (as much as possible
and sensible, thanks to lots of reviewing efforts from Markus Wanner),
so we could be referring to the file-system-based model as a "template"
too.

> I want to be able to download extension.zip from pgxn.org and then
> install it on a server. I want to be able to install it the traditional
> way, by unzipping it to the filesystem, or via libpq by using this new
> feature. I do *not* want to rewrite the extension using a new CREATE
> TEMPLATE FOR EXTENSION syntax to do the latter. I want to be able to
> install the *same* zip file using either method.

I would like to be able to support that, and the theory is attractive.
In practice, it's not that simple. PGXN implements a source-based
distribution model, and most extensions over there follow the same
model, where the needed SQL files are derived from the sources at build
time, using make(1).
See the following examples; the first one includes a C source file and
the second one is all PL stuff:

http://pgxn.org/dist/first_last_agg/
http://api.pgxn.org/src/first_last_agg/first_last_agg-0.1.2/

    sql/$(EXTENSION)--$(EXTVERSION).sql: sql/$(EXTENSION).sql
    	cp $< $@

http://pgxn.org/dist/mimeo/1.0.1/
http://api.pgxn.org/src/mimeo/mimeo-1.0.1/

    sql/$(EXTENSION)--$(EXTVERSION).sql: sql/tables/*.sql sql/functions/*.sql
    	cat $^ > $@

So, to support uploading PGXN zip files directly within the backend, the
backend must now be in a position to unpack the archive and build the
extension; then it must know where the build artefacts are going to be
found, or it needs to `make install` into a known prefix and follow our
current conventions to find the files.

As I said to David Wheeler when he built PGXN, I don't think a
source-level distribution is going to help us deal with production
deployments.

So, while I understand where you're coming from, please tell me your
answers to those two design questions about the Extension Template idea:

- what should happen at pg_restore time?
- do you really want the extension build infrastructure in core?

My current thinking is to build the missing infrastructure as a contrib
module that will know how to divert CREATE EXTENSION with an Event
Trigger and apply the necessary magic at that time, filling in the
Extension Templates for you.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> So, to support uploading PGXN zip files directly within the backend,
> the backend must now be in a position to unpack the archive and build
> the extension; then it must know where the build artefacts are going to
> be found, or it needs to `make install` into a known prefix and follow
> our current conventions to find the files.

As I've said multiple times before, this is an absolute nonstarter.
It's insane from a security standpoint, and it requires a whole lot of
infrastructure that wouldn't be there on a production box (starting with
the Postgres header files, but even a C compiler wouldn't necessarily be
there).

What you are looking for is a software distribution channel. A postgres
daemon isn't that, and we shouldn't try to make it one.

			regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> As I've said multiple times before, this is an absolute nonstarter.
> It's insane from a security standpoint, and it requires a whole lot of
> infrastructure that wouldn't be there on a production box (starting
> with the Postgres header files, but even a C compiler wouldn't
> necessarily be there).
>
> What you are looking for is a software distribution channel. A postgres
> daemon isn't that, and we shouldn't try to make it one.

I tend to agree, and I've been trying to hash this out with Dimitri on
IRC this morning. One proposal I personally like is the notion of an
external-to-PG client which works with pgxn and simply installs the
various objects into a schema based on some naming scheme, with a table
in that schema holding the control information. This external utility
would handle upgrades by looking at the 'control' table and then
figuring out which script from the pgxn package needs to be run.

These wouldn't be PG "extensions" really though, which it seems folks
are pretty hung up on. There would also be no support for binary or
untrusted components, which is a bit frustrating, as you'd like to be
able to support those if you're a superuser. Trying to build both into
one "extension template" structure, or what-have-you, seems to be
causing us to fail to make any progress on either use-case.

Thanks,
Stephen
Tom Lane <tgl@sss.pgh.pa.us> writes:
> As I've said multiple times before, this is an absolute nonstarter.

FWIW, I was explaining the model that I didn't want to follow. Thanks
for approving, even if that's no surprise, as the model I did follow is
the one we agreed on a year ago.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Stephen Frost <sfrost@snowman.net> writes:
> These wouldn't be PG "extensions" really though, which it seems folks
> are pretty hung up on. There would also be no support for binary or
> untrusted components, which is a bit frustrating, as you'd like to be
> able to support those if you're a superuser. Trying to build both into
> one "extension template" structure, or what-have-you, seems to be
> causing us to fail to make any progress on either use-case.

So, plenty of points to address separately here:

- why do we want extensions to manage PL user code?
- what about managing extensions with modules through the protocol
  directly, if we have Extension Templates?
- is it possible to open the feature to non-superusers?

Let's have at it.

# Why do we want extensions to manage PL user code?

Extensions allow managing any number of database objects as a single
entity, with very clear and simple life-cycle management tooling:

    create extension …
    alter extension … update to …
    drop extension … [cascade]
    \dx
    \dx+
    pg_available_extensions()
    pg_available_extension_versions()

Plus, you can handle inter-extension dependencies, albeit currently in a
very limited way, but that's still something.

Now that we have those tools, I want to be able to use them when dealing
with a bunch of PL code, because I didn't find a better option in
PostgreSQL for dealing with such code. It could well be that the
"package" thing I couldn't find tonight in the SQL standard documents is
better suited to managing user code, but after having had to both write
management code and use extensions, I know I want to deal with PL user
code the same way I deal with extensions.

Of course, it's already possible to do so, packaging the user code into
a set of files that you then need to install in the right place on the
file system (a place owned by root in most cases), so I have already had
the pleasure of using extensions for user code. And I also had to write
all the packaging myself, then ship it to several nodes, etc.

I think that opening extensions up to easy use, without requiring
root-level access to the database server's file system, would be a great
idea, even if it took, I don't know, say, 3 years of development to get
there.

# Extension Templates and Binary Modules

As soon as we are able to CREATE EXTENSION mystuff; without ever
pre-installing files on the file system as root, we will want to be able
to do just that even with binary modules. The current documentation of
the dynamic_library_path GUC makes me believe it's possible already,
without patching the backend. We will need to patch the extension's
CREATE FUNCTION statements not to hardcode $libdir in there, of course.
That could be done automatically by some separate distribution packaging
software.

Some cases would still need more work before being supported within that
frame:

- Hot Standby
- modules that depend on other libs being installed

# Extension Templates and Superusers

The problem found here is that if a non-privileged user installs an
extension template named "pgcrypto", and a superuser then installs what
he believes is the extension "pgcrypto", the malicious unprivileged user
is now running his own code (the extension's install script) as a
superuser; see the sketch below. The current patch prioritizes
file-based templates when an extension is available both from the file
system and the catalogs, so for this to happen you need to have
forgotten to install the proper system package. Still.
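[Editor's note] A minimal sketch of that escalation scenario, assuming
the patch's CREATE TEMPLATE FOR EXTENSION syntax and a hypothetical role
name:

    SET ROLE mallory;                      -- unprivileged user
    CREATE TEMPLATE FOR EXTENSION pgcrypto VERSION '1.0' AS
    $$ ALTER ROLE mallory SUPERUSER; $$;   -- malicious install script
    RESET ROLE;

    -- later, on a machine where the contrib package was never
    -- installed, a superuser runs what he believes is the real thing:
    CREATE EXTENSION pgcrypto;             -- would run mallory's script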
I've been asked (if memory serves) to limit the Extension Templates
feature to superusers only, which is a very bad thing™.

What I now think we should do is only grant superusers the privilege to
install an extension from a template they own or that is owned by
another superuser. The general rule would be that only the role who
installed the template is allowed to install an extension from it, so
that you can't have a privilege escalation happening, IIUC. As a
superuser, use SET ROLE first.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
On 11/26/2013 10:07 PM, Dimitri Fontaine wrote:
> What I now think we should do is only grant superusers the privilege to
> install an extension from a template they own or that is owned by
> another superuser.

Say what? Superusers bypass all privileges, by definition.

--
Vik
Stephen Frost <sfrost@snowman.net> writes:
> We already have other 'template' objects in the system and I'm not
> excited about the confusion. This also applies to 'CreateTemplate',
> 'CreateTemplateTupleDesc', right down to 'template.h' and 'template.c'.

The current code exposes the notion of a template and only implements
one kind of template, for extensions. Still, the ProcessUtility hooking
has been done so as to make it easy to add new kinds of templates. Given
that I haven't seen any template idea other than the FTS template in the
years I've been contributing to PostgreSQL, and that I'm seeing no other
candidate, I took the arbitrary decision not to open the sources more
than that to the idea of other kinds of templates.

We could of course have a 50-line templates.c file that calls into an
extension_template.c file for the meat of the implementation, if that's
preferred by the project. Just tell me; it's not in the attached patch.

> Attached is a patch against v16 which fixes up a few documentation
> issues (I'm pretty sure extension templates and aggregates are
> unrelated..), and points out that there is zero documentation on these
> new catalog tables (look for 'XXX' in the patch) along with a few other
> areas which could use improvement.

I merged your patch in, rebased against master, fixed some more typos I
found, and filled in the gaps you found in the docs. Version 17 of the
patch is attached to this email and passes `make check`.

ENOTIME for building the docs, will do tomorrow; I thought you might
appreciate an update meanwhile (and with some luck the docs still build
fine).

The other main point that will change the current code is dealing with
superusers and security concerns. Barring objections, I'm going to
implement a variation of what I proposed in the thread:

When a superuser runs CREATE EXTENSION against a template that has been
provided by a non-privileged user, automatically SET ROLE to that user
before doing so, avoiding privilege escalation. If that's not what the
superuser intended, it is still possible to ALTER TEMPLATE FOR
EXTENSION … OWNER TO himself beforehand.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Attachment
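[Editor's note] Under that proposal, the interaction would look
something like this sketch (the exact syntax is assumed from the
discussion above; "alice" is a hypothetical non-privileged template
owner):

    -- as a superuser: the install script runs as if SET ROLE alice
    -- had been issued first, so it cannot escalate privileges
    CREATE EXTENSION foo;

    -- if the superuser really does want the script to run with full
    -- privileges, he takes ownership of the template beforehand:
    ALTER TEMPLATE FOR EXTENSION foo OWNER TO postgres;
    CREATE EXTENSION foo;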
On Wed, 2013-11-27 at 18:34 +0100, Dimitri Fontaine wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > We already have other 'template' objects in the system and I'm not
> > excited about the confusion. This also applies to 'CreateTemplate',
> > 'CreateTemplateTupleDesc', right down to 'template.h' and 'template.c'.
> ...
> We could of course have a 50-line templates.c file that calls into an
> extension_template.c file for the meat of the implementation, if that's
> preferred by the project.

I think that Stephen was just talking about the naming. I would have
expected the names to be something like "xtmpl" (which is the shortest
abbreviation that came to my mind) rather than "tpl", for instance. Use
of "template" is a bit ambiguous.

> I merged your patch in, rebased against master, fixed some more typos I
> found, and filled in the gaps you found in the docs. Version 17 of the
> patch is attached to this email and passes `make check`.
>
> ENOTIME for building the docs, will do tomorrow; I thought you might
> appreciate an update meanwhile (and with some luck the docs still build
> fine).

Yes, they build fine.

However, I find the "full version" quite awkward still. I don't think
it's purely a documentation issue -- the "default full version" concept
itself is a bit awkward. If I understand correctly, it goes something
like this. When a user does:

    CREATE EXTENSION foo VERSION '3.0';

and there are templates for 1.0, 2.0, and 2.1, and upgrades from
1.0->2.0, 2.0->2.1, and 2.1->3.0, then there are three possible actions
that might be taken:

1. Install 1.0 and run three upgrades
2. Install 2.0 and run two upgrades
3. Install 2.1 and run one upgrade

There are two ways to disambiguate:

1. Specify the initial version using "FROM 2.0" in the CREATE EXTENSION
   command (though the documentation seems to say that the FROM clause
   is only useful for unpackaged, for some reason)
2. If not specified there, it falls through to the DEFAULT FULL VERSION
   for the extension template

But that's a little strange. Many extensions are likely to have all
relevant versions available in full, and DEFAULT FULL won't kick in.
Those that do make use of upgrade-only paths are likely to have a few
full versions and a few upgrade scripts based off each. If you pick a
default full version of 2.0, then it won't be useful for installing
version 1.0, nor version 23.7. In each of those cases, you'll want to
specify the FROM clause in CREATE EXTENSION.

I suppose the idea is to avoid the need for specifying "FROM" in the
common case where you want to install the latest version. But it seems
like you could just as well keep a full copy of the extension's latest
version, and just prune them out when they become old. Unless I'm
missing something, I'd be inclined to just get rid of the concept of
DEFAULT FULL VERSION, to keep the documentation simpler without losing
any real functionality.

In passing, I'll also make a quick complaint about the code:

    /*
     * If we have a full script for the target version (or a create
     * template), we don't need to care about unpackaged or
     * default_major_version, nor about upgrade sequences.
     */
    ...
     * If the user did ask for a target version that happens to be the
     * same as the default_full_version, just install that one directly.

The second one (in an "else" branch) shouldn't happen, assuming
default_full_version points at a proper full version, right?

Regards,
Jeff Davis
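[Editor's note] Jeff's ambiguity, spelled out in SQL; the FROM clause is
the existing CREATE EXTENSION disambiguation mechanism he refers to, and
the version numbers are from his example:

    -- with full templates for 1.0, 2.0 and 2.1 plus the upgrade
    -- scripts, which starting point does this pick?
    CREATE EXTENSION foo VERSION '3.0';

    -- explicit disambiguation: install 2.1, then run one upgrade
    CREATE EXTENSION foo VERSION '3.0' FROM '2.1';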
On Sat, 2013-11-30 at 01:05 -0800, Jeff Davis wrote:
> Unless I'm missing something, I'd be inclined to just get rid of the
> concept of DEFAULT FULL VERSION, to keep the documentation simpler
> without losing any real functionality.

I found some explanation of the original reasoning in these threads:

http://www.postgresql.org/message-id/CA+Tgmoae3Qs4QbQfxOUzZFxRSxA0zy8ibSOYSuuTzDUMPeAkAg@mail.gmail.com
http://www.postgresql.org/message-id/m2622fy43s.fsf@2ndQuadrant.fr
http://www.postgresql.org/message-id/m2k44m9oyo.fsf@2ndQuadrant.fr

It seems like it's there for pg_dump, so it can avoid outputting the
extension templates and just say "VERSION 'x.y'" without worrying about
which version it needs to start from.

That seems like a legitimate purpose, but I think we can come up with
something that's a little easier on users and easier to document (and
name). Perhaps just find the shortest upgrade path to the version
requested (using some arbitrary but deterministic tiebreaker)?

Regards,
Jeff Davis
On Wed, 2013-11-27 at 18:34 +0100, Dimitri Fontaine wrote:
> The other main point that will change the current code is dealing with
> superusers and security concerns. Barring objections, I'm going to
> implement a variation of what I proposed in the thread:
>
> When a superuser runs CREATE EXTENSION against a template that has been
> provided by a non-privileged user, automatically SET ROLE to that user
> before doing so, avoiding privilege escalation.

That proposal is worded like a special case for superusers, and I don't
see why. If the security model is that an extension script runs as the
template owner, then we should just do that universally. If not, making
a special case for superusers undermines the security of
powerful-but-not-superuser roles.

I haven't looked in detail at the security issues here... is this the
result of a consensus, or are there still differing opinions?

> If that's not what the superuser intended, it is still possible to
> ALTER TEMPLATE FOR EXTENSION … OWNER TO himself beforehand.

That doesn't sound like a clean workaround.

We already have a model for executing functions, and those are black
boxes of code as well. If we deviate too much from that, I think we're
inviting problems. Heikki expressed some concerns here:

http://www.postgresql.org/message-id/5152FF3D.4000401@vmware.com

followed by Robert. The concerns seem mostly about the namespace,
because it's hard to be sure that "CREATE EXTENSION foo" installs the
"foo" you expect. Those are very valid practical concerns, but I agree
with Dimitri that this is not a fundamental problem, and we might find a
compromise here that can work. One idea: if a non-superuser role creates
an extension template, then it is only visible to that exact same role.
The first version might only allow superusers to create extension
templates, but it would be nice to have a plan for non-superusers soon.

Aside: why do file-based templates shadow catalog-based templates?
Shouldn't we just throw an error if both are available at CREATE
EXTENSION time? Also, I notice that the extension templates are not in
shared catalogs; was that discussed? (Some of these issues seem
underdocumented, as well.)

Regards,
Jeff Davis
On Tue, 2013-11-26 at 01:37 +0200, Heikki Linnakangas wrote:
> I want to be able to download extension.zip from pgxn.org and then
> install it on a server. I want to be able to install it the traditional
> way, by unzipping it to the filesystem, or via libpq by using this new
> feature.

I see where you're coming from, but after some thought, and looking at
the patch, I think we really do want a catalog representation for (at
least some) extensions. Dealing with files is a conceptual mismatch that
will never be as easy and coherent as something that the database
manages and understands. Replication, backup, and
Postgres-as-a-service providers are clear examples, and I just don't see
a file-based approach solving those problems.

One example is tablespaces, which are a constant source of
administrative pain. That's inherent to tablespaces, just as it's
inherent to native shared libraries, and we can't do much about them.
But bringing more of an extension into the catalog can be done, and I
think we'll see big benefits from that. Imagine something like (this
comes from an in-person conversation with Dimitri a while ago;
hopefully I'm not misrepresenting his vision):

    =# select pgxn_install_template('myextension');

or even:

    =# select pgxn_update_all_templates();

That is much closer to what modern language environments do -- ruby,
python, go, and haskell all have a language-managed extension service,
independent of the OS packaging system, that doesn't require more
privileges or access than running the language.

That being said, there are some things about in-catalog templates that
need more thought:

1. If someone does want their OS to install extensions for them (e.g.
   the contrib package), how should that be done? This usually works
   fine with the aforementioned languages, because installation is still
   just dropping files in the right place. Postgres is different,
   because to put something in the catalog, we need a running server,
   which is awkward for a packaging system to do.

2. When 9.4 gets released, we need some solid advice for extension
   authors. If they have a native shared library, I assume we just tell
   them to keep using the file-based templates. But if they have a
   SQL-only extension, do we tell them to port to the in-catalog
   templates? What if they port to in-catalog templates, and then decide
   they just want to optimize one function by writing it in native code?
   Do they have to port back? What should the authors of SQL-only
   extensions distribute on PGXN? Should there be a migration period
   where they offer both kinds of templates until they drop support for
   9.3?

   a. Some extensions have quite a few .sql files. It seems awkward to
      just cat them all into one giant SQL query. Not a rational
      problem, but it would bother me a little to tell people to squash
      their otherwise-organized functions into a giant blob.

3. What do we do about native shared libraries? Ultimately, I imagine
   that we should handle these similarly to tablespaces: have a real
   database object with an OID that extensions or functions can depend
   on, and create a symlink (with the OID as the link name) that points
   to the real file on disk. We could also export some new symbols, like
   the shared library name and version, for better error checking.

4. Do we live with both file-based and catalog-based templates forever?
   I guess probably so, because the file-based templates probably are a
   little better for contrib itself (the complaints about relying on OS
   packaging don't apply as strongly there, if at all).

Regards,
Jeff Davis
On Wed, 2013-11-27 at 18:34 +0100, Dimitri Fontaine wrote:
> I merged your patch in, rebased against master, fixed some more typos I
> found, and filled in the gaps you found in the docs. Version 17 of the
> patch is attached to this email and passes `make check`.

A couple more questions:

1. During the initial development of extensions, there was discussion
   about ordered version numbers and dependencies on the version (e.g.
   requires foo >= 2.1). Outside the scope of this patch, of course, but
   is that something we can still do later? Or are we building
   infrastructure that will make that nearly impossible in a release or
   two?

2. People will want to start using this feature to control and version
   their schema. Do you have comments about that? Should we eventually
   try to get extensions to support that use case (if they don't
   already), or should that be a completely different feature?

Regards,
Jeff Davis
Jeff Davis <pgsql@j-davis.com> writes:
> I think that Stephen was just talking about the naming. I would have
> expected the names to be something like "xtmpl" (which is the shortest
> abbreviation that came to my mind) rather than "tpl", for instance. Use
> of "template" is a bit ambiguous.

To be honest, I'm not following the complaint. What object would you
like to be named xtmpl rather than what it is called now? If you're
referring to the catalog column names such as tplname and tplowner and
the like, the tradition AFAICS is to find a trigram prefix for those
entries… and that's what I did.

> However, I find the "full version" quite awkward still. I don't think
> it's purely a documentation issue -- the "default full version" concept
> itself is a bit awkward.

Yes. I couldn't find a better name, and I would quite like us to. There
are two ideas behind that feature:

- When we released hstore 1.1 we had a choice of either still providing
  hstore--1.0.sql alongside hstore--1.1.sql, or only the latter,
  deprecating the 1.0 version for the next release. The
  default_major_version feature allows shipping only hstore--1.0.sql
  while still offering full support for both hstore 1.0 and 1.1.

  Note that it's problematic given the current implementation of
  pg_upgrade: when migrating from 9.1 to 9.2 (IIRC that's when hstore
  got to 1.1) you can end up either still having hstore 1.0 in your
  database if you used pg_upgrade, or having hstore 1.1 already if you
  used pg_dump and pg_restore.

  Note also that if you install 9.2, then by a technically constrained
  choice of policy in the project, you cannot continue using hstore 1.0.

  So the goal is to simplify the extension author's management of SQL
  install and upgrade scripts, while giving the extension's users more
  options as to the version they want to be using.

- With Extension Templates, the extension author can provide the scripts
  foo--1.0.sql and foo--1.0--1.1.sql and run the upgrade with
  ALTER EXTENSION foo UPDATE TO '1.1';

  Now what happens at pg_restore time? We only have the 1.0 and
  1.0--1.1 scripts, yet we want to be installing foo version 1.1. So we
  need the "default_major_version" capabilities, whatever name we
  choose. Hence my inclusion of that feature in the Extension Template
  patch.

> If I understand correctly, it goes something like this:
>
> When a user does:
>     CREATE EXTENSION foo VERSION '3.0';
>
> and there are templates for 1.0, 2.0, and 2.1, and upgrades from
> 1.0->2.0, 2.0->2.1, and 2.1->3.0, then there are three possible actions
> that might be taken:
>
> 1. Install 1.0 and run three upgrades
> 2. Install 2.0 and run two upgrades
> 3. Install 2.1 and run one upgrade

With PostgreSQL versions 9.1, 9.2 and 9.3, given the scripts you're
listing, the command you propose will just fail. Full stop. ERROR.

> The second one (in an "else" branch) shouldn't happen, assuming
> default_full_version points at a proper full version, right?

Will review, thanks.

> It seems like it's there for pg_dump, so it can avoid outputting the
> extension templates and just say "VERSION 'x.y'" without worrying about
> which version it needs to start from.

Exactly. We need pg_dump to be smart enough to handle that case as soon
as we have Extension Templates, and we already have said smarts in the
backend code. They were just not applied at CREATE EXTENSION time before
this patch.

> That seems like a legitimate purpose, but I think we can come up with
> something that's a little easier on users and easier to document (and
> name).
> Perhaps just find the shortest upgrade path to the version requested (using some arbitrary but deterministic tiebreaker)?

The danger is that the shortest path could well include a downgrade, and nobody tests downgrade paths IME. So I keep thinking we should ask the extension author to give us a starting point. Now, if you have a better name for it than “default_full_version”, I'm all ears!

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
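A minimal sketch of the behaviour under discussion, assuming a control file carrying the proposed default_full_version property (the property name comes from this thread; the extension name, versions and file names are invented):

    -- foo.control would carry, alongside the usual properties:
    --   default_version      = '1.1'
    --   default_full_version = '1.0'
    -- with only foo--1.0.sql and foo--1.0--1.1.sql shipped:
    CREATE EXTENSION foo;                -- runs foo--1.0.sql, then foo--1.0--1.1.sql
    CREATE EXTENSION foo VERSION '1.0';  -- runs foo--1.0.sql and stops there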
Jeff Davis <pgsql@j-davis.com> writes:
>> When a superuser runs CREATE EXTENSION against a template that has been provided by a non-privileged user, automatically SET ROLE to that user before doing so, avoiding privilege escalation.
>
> That proposal is worded like a special case for superusers, and I don't see why. If the security model is that an extension script is run as the template owner, then we should just do that universally. If not, making a special case for superusers undermines the security of powerful-but-not-superuser roles.

I like that idea, yes.

> I haven't looked in detail at the security issues here... is this the result of a consensus or are there still differing opinions?

AFAIK past reviewers came up with the privilege escalation use case and said we'd better make that feature superuser-only. It's playing it safe, but I wish we could find another solution.

> We already have a model for executing functions, and those are black boxes of code as well. If we deviate too much from that, I think we're inviting problems.

So maybe we should have “SECURITY DEFINER” and “SECURITY INVOKER” extension templates, the default being “SECURITY DEFINER”?

> Aside: why do file-based templates shadow catalog-based templates? Shouldn't we just throw an error if both are available at CREATE EXTENSION time?

That sounds good too. We need to ERROR out at UPDATE time too, of course.

> Also, I notice that the extension templates are not in shared catalogs; was that discussed?

Yes it was. The current model for extensions is to be per-database, but it's limited by the way we deal with modules (.so), for security reasons that encompass the per-database model.

Also consider multi-tenancy installations. Certainly, you don't want any database owner to be able to review PL code from any other database owner in the same cluster when each database owner is another customer.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
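For illustration, the automatic SET ROLE idea amounts to the backend implicitly wrapping the command like this, assuming a template for extension foo owned by non-privileged user alice (the wrapping is hypothetical; SET ROLE and RESET ROLE are existing SQL):

    -- issued by a superuser; the server would effectively run:
    SET ROLE alice;
    CREATE EXTENSION foo;  -- the install script runs with alice's privileges
    RESET ROLE;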
On Saturday, November 30, 2013, Dimitri Fontaine wrote:
Also consider multi-tenancy installations. Certainly, you don't want any
database owner to be able to review PL code from any other database
owner in the same cluster when each database owner is another customer.
I'm planning to do a more comprehensive response, but the only use case that I see for extension templates is for them to be in a shared catalog, and I don't see that as being a particularly compelling case.
Without that, all of the information about a given extension is already in the database in our dependency system. As you pointed out, there was previously a notion of "inline" templates. I'm not sure that particular patch is exactly where we want to go, but I absolutely do not like this idea that we have a "template" on a per-database level which does nothing but duplicate most of the information we *already have*, since you have to assume that if the extension template (which is per-database) exists then the extension has also been created in the database.
Thanks,
Stephen
Jeff Davis <pgsql@j-davis.com> writes:
> 1. During the initial development of extensions, there was discussion about ordered version numbers and dependencies on the version (e.g. requires foo >= 2.1). Outside the scope of this patch, of course, but is that something that we can still do later? Or are we building infrastructure that will make that nearly impossible in a release or two?

The best answer I can think of as of today has been proposed on-list already, in a patch named "finer extension dependencies", where I failed miserably to explain my ideas correctly to -hackers. Here are the archive links to the opening of the thread and the latest known version of the patch:

http://www.postgresql.org/message-id/m2hb0y2bh3.fsf@hi-media.com
http://www.postgresql.org/message-id/871uoa4dal.fsf@hi-media-techno.com

Meanwhile, at the pgcon 2013 unconference about pg_upgrade, Bruce, Peter and I had a chat about this problem, coming from a different parallel that might allow for a better explanation of what it is all about: distribution binary packages nowadays track symbol-level dependencies between binaries (shared objects and programs). To do so they use some dynamic capabilities (objdump and the like, IIUC).

In PostgreSQL we just can't do that automatically, mainly because of EXECUTE support in PL functions and DO blocks. We are not even able to track inter-function dependencies because of that, so that you're allowed to DROP a function that you actually depend on.

The “features” proposal, where an extension “provides” a set of at least one feature (its own name), is a way for the extension author to list a subset of the symbols provided in his extension so that other extensions are able to depend on some of them. Then this set of “features” can be edited from one version to the next, adding new “features” and removing existing ones.

With “feature” dependency tracking, we are able to prevent extension updates when the new version deprecates features that we know are still in the dependency graph:

    # ALTER EXTENSION foo UPDATE TO '2.0'
    ERROR: extension bar depends on feature "baz" provided by "foo" version '1.1'

My plan is to be able to someday explain what I was trying to do in a way that allows the patch to be considered again, and I have some faith in that plan because I remember Peter suddenly understanding what this was all about in Ottawa earlier this year.

It should be quite trivial to update the patch I had then to also support Extension Templates: basically adding a new column in the catalogs, grammar support in the control properties, and making sure that the value actually goes from the template into the extension catalogs at CREATE or UPDATE time.

Mind you, if the Extension Templates patch's fate is settled this Commit Fest, and provided that the “extension features” idea appears to be a good one to someone else than just me, I would happily propose an updated version of it for the next Commit Fest, in time for 9.4. It was a very simple patch. Of course, given Extension Templates, I would be adding some regression tests to said patch ;-)

> 2. People will want to start using this feature to control and version their schema. Do you have comments about that? Should we eventually try to get extensions to support that use case (if they don't already), or should that be a completely different feature?

The tension between extensions and database schema is pretty simple: an extension's contents are not part of pg_dump.
With the idea of Extension Templates, you now have two kinds of templates: file-system based templates, excluded from the dumps and managed separately (typically managed as an OS package), and catalog templates, part of the dumps, fully managed by PostgreSQL. Both ways, at pg_restore time the extension is built again from the templates, wherever we find those. The term “extension” covers a single set of behaviours.

So my current thinking is that Extension Templates as currently developed will not be a solution to version-controlling your database schema. I have not been thinking about how to make that happen other than the following two points:

- yes, I would like PostgreSQL to offer database versioning capabilities;
- no, I don't think Extensions can be made into providing support for that set of features.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Stephen Frost <sfrost@snowman.net> writes:
> Without that, all of the information about a given extension is already in the database in our dependency system. As you pointed out, there was

That's not entirely true. We would still be missing some information from the extension control file.

> previously a notion of "inline" templates. I'm not sure that particular patch is exactly where we want to go, but I absolutely do not like this idea that we have a "template" on a per-database level which does nothing but duplicate most of the information we *already have*, since you have to assume that if the extension template (which is per-database) exists then the extension has also been created in the database.

That's a classic bootstrap problem. If you consider that the extension is already installed, then you don't need to know how to install it. The main feature that the patch provides is an installation path for an extension that doesn't involve the server's file system.

> Having a versioning notion (and whatever other meta data we, or an extension author, feels is useful) for what are otherwise simple containers (aka the schematic we already have..) makes sense and it would be great to provide support around that, but not this duplication of object definitions.

I don't like duplication either; we've just been failing to find any alternative with pg_restore support for the last 3 years.

If you want the simplest possible patch that would enable you to bypass the file system, here's what I would be proposing: have a special flag allowing CREATE EXTENSION to just prepare pg_extension catalog entries. Then create your objects as usual, and use ALTER EXTENSION … ADD … to register them against the existing extension.

That would work beautifully, and of course you would have to do that again manually at pg_restore time after CREATE DATABASE and before pg_restore, or you would need to change the fact that extension objects are not part of your pg_dump scripts, or you would have to name your new thing something other than an extension.

Also, please note that I did propose that design when working on the first patch series for extensions (8.4 and 9.0 eras), or at least some variant where the control properties came in from some command rather than from a file, and it was rejected because the CREATE EXTENSION bootstrapping was then judged too complex, and it was not clear how extension authors were going to maintain their scripts.

The current extension model is simple enough to reason about. A script must be provided in a template and is executed at CREATE EXTENSION time or at ALTER EXTENSION UPDATE time, and pg_dump only contains the CREATE EXTENSION command, so that pg_restore has to find the template again.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
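A sketch of that simplest-possible alternative (ALTER EXTENSION … ADD is existing syntax; the catalog-entries-only flag and all object names are invented for illustration):

    CREATE EXTENSION myext;   -- imagine a special flag here that only creates
                              -- the pg_extension entries and runs no script
    CREATE FUNCTION myext_add(int, int) RETURNS int
      LANGUAGE sql AS 'SELECT $1 + $2';
    ALTER EXTENSION myext ADD FUNCTION myext_add(int, int);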
Dimitri,
On Saturday, November 30, 2013, Dimitri Fontaine wrote:
The current extension model is simple enough to reason about. A script
must be provided in a template and is executed at CREATE EXTENSION time
or at ALTER EXTENSION UPDATE time, and pg_dump only contains the CREATE
EXTENSION command, so that pg_restore has to find the template again.
I understand that folks have complained about pg_dump/restore knowing "too much" about extensions, but I do not find that a compelling argument- it's being made from the perspective that an extension is defined by files on the file system. Perhaps that means that what we're talking about aren't extensions, but in that case, such arguments do not make much sense.
Thanks,
Stephen
On Sat, 2013-11-30 at 22:55 +0100, Dimitri Fontaine wrote:
> So we need the "default_major_version" capabilities, whatever the name we choose. Hence my inclusion of that feature in the Extension Template patch.

What we need is a means to install versions for which we don't have full SQL, but for which there exists some upgrade path. "default_major_version" is one possible approach to that, but it seems pretty awkward to me.

> > That seems like a legitimate purpose, but I think we can come up with something that's a little easier on users and easier to document (and name). Perhaps just find the shortest upgrade path to the version requested (using some arbitrary but deterministic tiebreaker)?
>
> The danger is that the shortest path could well include a downgrade, and nobody tests downgrade paths IME. So I keep thinking we should ask the extension author to give us a starting point. Now, if you have a better name for it than “default_full_version”, I'm all ears!

The problem with asking for a starting point is that: (a) it's one more thing to get wrong; (b) what we really care about is the ending point; and (c) we have to find a path to the right endpoint, and that path might surprise the extension author and/or user anyway.

It seems like in any real use case, the author would use a convention that kept the upgrades/downgrades somewhat sane. A few examples might be:

* Offer only the earliest version and upgrades
* Offer only the latest version and downgrades
* Offer full SQL scripts on major versions and upgrades to the point releases

I don't see why we are trying to accommodate a case where the author doesn't offer enough full SQL scripts and offers broken downgrade scripts; or why that case is different from offering broken upgrade scripts.

Regards,
Jeff Davis
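Spelled out as shipped script sets, the three conventions above might look like this for a hypothetical extension foo whose latest version is 2.1 (all file names invented):

    -- earliest version plus upgrades:
    --   foo--1.0.sql, foo--1.0--2.0.sql, foo--2.0--2.1.sql
    -- latest version plus downgrades:
    --   foo--2.1.sql, foo--2.1--2.0.sql, foo--2.0--1.0.sql
    -- full scripts on major versions plus point-release upgrades:
    --   foo--1.0.sql, foo--2.0.sql, foo--2.0--2.1.sql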
On Sat, 2013-11-30 at 23:03 +0100, Dimitri Fontaine wrote:
> Jeff Davis <pgsql@j-davis.com> writes:
> >> When a superuser runs CREATE EXTENSION against a template that has been provided by a non-privileged user, automatically SET ROLE to that user before doing so, avoiding privilege escalation.
> >
> > That proposal is worded like a special case for superusers, and I don't see why. If the security model is that an extension script is run as the template owner, then we should just do that universally. If not, making a special case for superusers undermines the security of powerful-but-not-superuser roles.
>
> I like that idea, yes.

To clarify, I wasn't proposing that, I'd just like some consistent security model.

> So maybe we should have “SECURITY DEFINER” and “SECURITY INVOKER” extension templates, the default being “SECURITY DEFINER”?

That doesn't seem to answer Heikki's stated concern, because a malicious non-superuser would just declare the trojan extension to be SECURITY INVOKER.

As I see it, the problem is more about namespacing than anything else. It's analogous to a shell which includes the current directory in the $PATH -- a malicious user can just name an executable "ls" and trick root into executing it. The solution for a shell has nothing to do with setuid; so I'm reluctant to base our solution on SECURITY DEFINER.

I prefer a solution that prevents the kind of name collisions that would trick a privileged user. My strawman idea was to just say that an extension template created by a non-superuser could only be instantiated by that same user.

> Also consider multi-tenancy installations. Certainly, you don't want any database owner to be able to review PL code from any other database owner in the same cluster when each database owner is another customer.

That could be solved by permissions, as well, right?

Regards,
Jeff Davis
Jeff Davis <pgsql@j-davis.com> writes:
> I don't see why we are trying to accommodate a case where the author doesn't offer enough full SQL scripts and offers broken downgrade scripts; or why that case is different from offering broken upgrade scripts.

That's fair enough, I guess. I will work on automating the choice of the first full script to use, then, for the next patch version.

Thanks,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Jeff Davis <pgsql@j-davis.com> writes:
>> So maybe we should have “SECURITY DEFINER” and “SECURITY INVOKER” extension templates, the default being “SECURITY DEFINER”?
>
> That doesn't seem to answer Heikki's stated concern, because a malicious non-superuser would just declare the trojan extension to be SECURITY INVOKER.

It does answer the concern if only superusers are allowed to install SECURITY INVOKER templates, which I forgot to add in the previous email. Or at least my understanding is that it could work that way.

> As I see it, the problem is more about namespacing than anything else. It's analogous to a shell which includes the current directory in the $PATH -- a malicious user can just name an executable "ls" and trick root into executing it. The solution for a shell has nothing to do with setuid; so I'm reluctant to base our solution on SECURITY DEFINER.
>
> I prefer a solution that prevents the kind of name collisions that would trick a privileged user. My strawman idea was to just say that an extension template created by a non-superuser could only be instantiated by that same user.

Yes, that's a simpler model. And simpler is better when talking security.

The only drawback of that is to forbid the superuser from executing a command. That would be new in PostgreSQL, I think. We can work around that by automating the SET ROLE to the template owner when a superuser is creating the extension. That's what led me to the SECURITY DEFINER proposition.

Either of those solutions is fine by me, with or without the automated SET ROLE when a superuser is installing an extension from a template owned by a non-superuser. Tell me your preference, I'll work on the code.

>> Also consider multi-tenancy installations. Certainly, you don't want any database owner to be able to review PL code from any other database owner in the same cluster when each database owner is another customer.
>
> That could be solved by permissions, as well, right?

I still think about extensions as being a per-database thing, and that the current security policy makes it a per-major-version thing when the extension contains a module (.so).

Also, the dynamic_library_path already allows us to make binary extensions a per-database object again, barring incompatibilities that would manifest themselves as run-time errors…

So I strongly vote against making the Extension Templates a set of shared catalogs.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
On Sun, 2013-12-01 at 15:58 +0100, Dimitri Fontaine wrote:
> Jeff Davis <pgsql@j-davis.com> writes:
> Either of those solutions is fine by me, with or without the automated SET ROLE when a superuser is installing an extension from a template owned by a non-superuser.
>
> Tell me your preference, I'll work on the code.

This version (for this 'fest) should be superuser-only, because we don't have enough consensus about the security model. That being said, we don't want to prevent a change to allow non-superusers in the future. So let's collect a few ideas, and leave room to implement one of them later.

To throw another idea out, also based on the premise that it's a namespace problem: if a non-superuser creates an extension template, then we force a prefix of that user's username. So a superuser can create a "base" extension template with no prefix, but if I create an extension template it would be called something like "jdavis"."foo". To be more consistent, we could have a reserved prefix that's always assumed, similar to pg_catalog.

> I still think about extensions as being a per-database thing, and that the current security policy makes it a per-major-version thing when the extension contains a module (.so).
>
> Also, the dynamic_library_path already allows us to make binary extensions a per-database object again, barring incompatibilities that would manifest themselves as run-time errors…
>
> So I strongly vote against making the Extension Templates a set of shared catalogs.

I don't have much of an opinion on this point, but I also don't understand your point. Can you clarify? What's the use case for DB-specific extension templates?

I generally think of extension templates as universal, in that "myExtension version 1.2.3" is exactly the same everywhere, and immutable, so why not share it?

I understand why extensions (not templates) are per-DB, because you might want to control which objects are available, and also control which namespace they go in.

Regards,
Jeff Davis
* Jeff Davis (pgsql@j-davis.com) wrote:
> To throw another idea out, also based on the premise that it's a namespace problem: if a non-superuser creates an extension template, then we force a prefix of that user's username. So a superuser can create a "base" extension template with no prefix, but if I create an extension template it would be called something like "jdavis"."foo". To be more consistent, we could have a reserved prefix that's always assumed, similar to pg_catalog.

Ugh, no. We have schemas already (which, as we all know, are even in a 'pg_namespace' catalog..), with permissions associated with them to allow only certain users to create objects in them, etc.

> > I still think about extensions as being a per-database thing, and that the current security policy makes it a per-major-version thing when the extension contains a module (.so).
> >
> > Also, the dynamic_library_path already allows us to make binary extensions a per-database object again, barring incompatibilities that would manifest themselves as run-time errors…
> >
> > So I strongly vote against making the Extension Templates a set of shared catalogs.
>
> I don't have much of an opinion on this point, but I also don't understand your point. Can you clarify? What's the use case for DB-specific extension templates?

This is one of my major issues with this approach in general- we're building this very complicated structure for a use-case which doesn't exist. There isn't actually any need for per-DB extension templates, except as a way to create an extension that doesn't require sticking files on the filesystem, but we only have that requirement because it's what we've defined extensions *to be*, which is why these end up not really being extensions but another thing entirely.

I'm completely behind the idea of having packages which install functions/tables/whatever in some schema, can be installed by a regular user, and even that the DB could provide some help in tracking metadata (version information, etc) about those packages, but I don't see building that on top of extensions in this way.

> I generally think of extension templates as universal, in that "myExtension version 1.2.3" is exactly the same everywhere, and immutable, so why not share it?

The concern, at least as I've understood it from talking to Dimitri, is that you don't want everyone to see or have available every extension template which exists; perhaps there is sensitive code, blah, blah. Another requirement which doesn't really match up with how extensions exist today.

Thanks,
Stephen
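For comparison, the existing schema machinery referred to above (standard PostgreSQL; the role and schema names are only examples):

    CREATE SCHEMA jdavis AUTHORIZATION jdavis;
    -- by default, only jdavis (and superusers) can create objects in it;
    GRANT CREATE ON SCHEMA jdavis TO some_trusted_role;  -- opened up selectively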
* Dimitri Fontaine (dimitri@2ndQuadrant.fr) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > Without that, all of the information about a given extension is already in the database in our dependency system. As you pointed out, there was
>
> That's not entirely true. We would still be missing some information from the extension control file.

Fine, so we need an extra side-table, which needn't even be a catalog table; though, as I've mentioned before, having PG help with managing these extensions by tracking such information seems reasonable. What isn't reasonable is having two nearly complete copies of every extension installed into a given database.

> > previously a notion of "inline" templates. I'm not sure that particular patch is exactly where we want to go, but I absolutely do not like this idea that we have a "template" on a per-database level which does nothing but duplicate most of the information we *already have*, since you have to assume that if the extension template (which is per-database) exists then the extension has also been created in the database.
>
> That's a classic bootstrap problem. If you consider that the extension is already installed, then you don't need to know how to install it.

I don't see how it's a bootstrap problem at all- we just need a way to install the extension which doesn't involve the filesystem. That's completely possible to do *without* storing two complete copies of the extension in the PG catalogs.

> The main feature that the patch provides is an installation path for an extension that doesn't involve the server's file system.

No, that isn't what this patch does, which is why I'm so frustrated by it, because what you describe is *exactly what I want*. This patch doesn't do that though- it tries to provide a way to *mimic* the filesystem part of PG extensions through the catalog- on a per-database level. There are a number of problems with that approach:

- the filesystem *sucks* when it comes to storing the information about an extension- we do it for extensions today because we're working with the OS packaging systems;
- we have *way* better information and understanding about an extension once it's actually installed into the PG catalogs; why ignore that instead of using it?
- we're duplicating the filesystem-style (which is next to useless to us..) definition of an extension into every database we want to install it into, for no good purpose;
- by having it mimic the filesystem-style approach, it makes it very difficult to reason about security, whether non-superusers should be allowed to install these things (or the extensions themselves...), etc.

> > Having a versioning notion (and whatever other meta data we, or an extension author, feels is useful) for what are otherwise simple containers (aka the schematic we already have..) makes sense and it would be great to provide support around that, but not this duplication of object definitions.
>
> I don't like duplication either, we've just been failing to find any alternative with pg_restore support for the last 3 years.

*That doesn't make this approach the right one*. If anything, I'm afraid we've ended up building ourselves a rube goldberg machine because of this constant struggle to fit a square peg into a round hole.

> If you want the simplest possible patch that would enable you to bypass the file system, here's what I would be proposing: have a special flag allowing CREATE EXTENSION to just prepare pg_extension catalog entries.
> Then create your objects as usual, and use ALTER EXTENSION … ADD … to register them against the existing extension.

That's basically what we already do with schemas today and hence is pretty darn close to what I'm proposing. Perhaps it'd be a way to simply version schemas themselves- heck, with that, we could even provide that oft-asked-for schema delta tool in-core by being able to deduce the differences between schema at version X and schema at version Y.

> That would work beautifully, and of course you would have to do that again manually at pg_restore time after CREATE DATABASE and before pg_restore, or you would need to change the fact that extension objects are not part of your pg_dump scripts, or you would have to name your new thing something other than an extension.

We would need a way to dump and restore this, of course.

> Also, please note that I did propose that design when working on the first patch series for extensions (8.4 and 9.0 eras), or at least some variant where the control properties came in from some command rather than from a file, and it was rejected because the CREATE EXTENSION bootstrapping was then judged too complex, and it was not clear how extension authors were going to maintain their scripts.

This just makes me feel like the problem was trying to shoehorn this new concept into the existing extension system. I don't see how this would really change things for extension authors having to maintain their scripts one way or the other.

> The current extension model is simple enough to reason about. A script must be provided in a template and is executed at CREATE EXTENSION time or at ALTER EXTENSION UPDATE time, and pg_dump only contains the CREATE EXTENSION command, so that pg_restore has to find the template again.

The current extension system is simple. This addition of extension templates muddies things *significantly*.

Thanks,
Stephen
Jeff,

* Jeff Davis (pgsql@j-davis.com) wrote:
> I see where you're coming from, but after some thought, and looking at the patch, I think we really do want a catalog representation for (at least some) extensions.

Perhaps I'm missing something- but we already *have* a catalog representation for every extension that's ever installed into a given database. A representation that's a heck of a lot better than a big text blob.

> Dealing with files is a conceptual mismatch that will never be as easy and coherent as something that database manages and understands. Replication, backup, and Postgres-as-a-service providers are clear examples, and I just don't see a file-based approach solving those problems.

Agreed.

> But bringing more of an extension into the catalog can be done, and I think we'll see big benefits from that.

I'm not following here- what's 'missing'?

> Imagine something like (this comes from an in-person conversation with Dimitri a while ago; hopefully I'm not misrepresenting his vision):
>
>   =# select pgxn_install_template('myextension');
>
> or even:
>
>   =# select pgxn_update_all_templates();
>
> That is much closer to what modern language environments do -- ruby, python, go, and haskell all have a language-managed extension service independent of the OS packaging system and don't require more privileges or access than running the language.

I like the general idea, but I don't particularly see the need for the backend PG process to be making connections to these external repositories and pulling down files to execute. That could be done just as simply by another process which works with the PG backend- a la how dpkg and aptitude work together.

> That being said, there are some things about in-catalog templates that need some more thought:
>
> 1. If someone does want their OS to install extensions for them (e.g. the contrib package), how should that be done? This usually works fine with the aforementioned languages, because installation is still just dropping files in the right place. Postgres is different, because to put something in the catalog, we need a running server, which is awkward for a packaging system to do.

You need a running PG for the *extension* to be installed, but with the filesystem-based extension approach we have today, the "template" (the files on the filesystem) doesn't need PG running, and if we had an external tool which could work with the PG backend to install extensions via libpq, just like the backend works with the files on the filesystem, we wouldn't have this issue of bootstrapping the 'extension template' into the catalog.

> 2. When 9.4 gets released, we need some solid advice for extension authors. If they have a native shared library, I assume we just tell them to keep using the file-based templates. But if they have a SQL-only extension, do we tell them to port to the in-catalog templates? What if they port to in-catalog templates, and then decide they just want to optimize one function by writing it in native code? Do they have to port back? What should the authors of SQL-only extensions distribute on PGXN? Should there be a migration period where they offer both kinds of templates until they drop support for 9.3?

This is one of the main things that I think Heikki was trying to drive at with his comment- we really don't *want* to make extension authors have to do anything different than what they do today.
With an external tool, they wouldn't need to, and it would just be two different ways for an extension to be installed into a given database. In the end though, if we're telling people to 'port' their extensions, then I think we've already lost.

> a. Some extensions have quite a few .sql files. It seems awkward to just cat them all into one giant SQL query. Not a rational problem, but it would bother me a little to tell people to squash their otherwise-organized functions into a giant blob.

'awkward' isn't the word I'd use, it's downright horrible.

> 3. What do we do about native shared libraries? Ultimately, I imagine that we should handle these similarly to tablespaces: have a real database object with an OID that extensions or functions can depend on, and create a symlink (with the OID as the link name) that points to the real file on disk. We could also export some new symbols like the shared library name and version for better error checking.

I'm sorry, but I do not see shared libraries working through this system, at all. I know that goes against what Dimitri and some others want, but I've talked with a few folks (such as Paul Ramsey of PostGIS) about this notion and, from that perspective, it's almost laughable to think we could ship shared libraries in this way. Even if we could convince ourselves that there's some way for us to track the files on the filesystem and work out all the per-database and whatever issues are associated with that, it'd only work for the simplest shared libraries which don't have any dependencies on other libraries on the system (excepting, perhaps, libc6), and that narrows the use-case down significantly, to the point where I don't feel it's worth all that effort.

> 4. Do we live with both file-based and catalog-based templates forever? I guess probably so, because the file-based templates probably are a little better for contrib itself (because the complaints about relying on OS packaging don't apply as strongly, if at all).

We need the file-based extension "templates" because they deal with shared libraries and because we've got them already. I don't think we actually need to keep track of the specific commands used to define a catalog-based extension inside the catalog, so I'm advocating for there simply not being any "catalog-based template" system for extensions. Extensions themselves are already "catalog-based".

Thanks,
Stephen
* Dimitri Fontaine (dimitri@2ndQuadrant.fr) wrote:
> # Why do we want extensions to manage PL user code?

Having a management system for sets of objects is a *great* idea- and one which we already have through schemas. What we don't have is any kind of versioning system built-in or other metadata about it, nor do we have good tooling which leverages such a versioning or similar system. Extensions provide some of that metadata around schemas and object definitions, but they are also currently defined to be built from SQL scripts on the filesystem in a specific way and can "include" shared libraries (though the PG extension, which is in the catalog, doesn't really include the shared libraries, when you think about it...).

> # Extension Templates and Binary Modules
>
> Then as soon as we are able to CREATE EXTENSION mystuff; without ever pre-installing files on the file system as root, then we would like to be able to do just that even with binary modules.

I really just don't see this as being either particularly useful nor feasible within a reasonable amount of effort. Shared libraries are really the purview of the OS packaging system. If you want to build some tool which is external to PG but helps facilitate the building and installing of shared libraries, but doesn't use the OS packaging system (and, instead, attempts to duplicate it), then go for it, but don't expect to ship or install that through the PG backend.

> # Extension Templates and Superusers
>
> The problem found here is that if a non-privileged user installs an extension template named “pgcrypto”, then when the superuser installs what he believes is the extension “pgcrypto”, the malicious unprivileged user is now running his own code (extension install script) as a superuser.

For my part, the problem here is this notion of extension templates in the PG catalog and this is just one symptom of how that's really not a good approach.

Thanks,
Stephen
On 12/02/2013 05:34 AM, Stephen Frost wrote:
> * Jeff Davis (pgsql@j-davis.com) wrote:
>> I see where you're coming from, but after some thought, and looking at the patch, I think we really do want a catalog representation for (at least some) extensions.
>
> Perhaps I'm missing something- but we already *have* a catalog representation for every extension that's ever installed into a given database. A representation that's a heck of a lot better than a big text blob.

Right. I think Jeff was thinking of a catalog representation for extensions that haven't been installed yet, but are available in the system and could be installed with CREATE EXTENSION foo. I wouldn't mind having a catalog like that. Even without any of this extension template stuff, it would be handy to have a view that lists all the extensions available in the filesystem.

>> 2. When 9.4 gets released, we need some solid advice for extension authors. If they have a native shared library, I assume we just tell them to keep using the file-based templates. But if they have a SQL-only extension, do we tell them to port to the in-catalog templates? What if they port to in-catalog templates, and then decide they just want to optimize one function by writing it in native code? Do they have to port back? What should the authors of SQL-only extensions distribute on PGXN? Should there be a migration period where they offer both kinds of templates until they drop support for 9.3?
>
> This is one of the main things that I think Heikki was trying to drive at with his comment- we really don't *want* to make extension authors have to do anything different than what they do today. With an external tool, they wouldn't need to and it would just be two different ways for an extension to be installed into a given database. In the end though, if we're telling people to 'port' their extensions, then I think we've already lost.

Exactly. There should be no difference between file-based extensions and catalog-based extensions. It's just two different ways to install the same extension. The extension author doesn't need to care about that, it's the DBA that decides which method to use to install it.

I'm going to object loudly to any proposal that doesn't meet that criteria.

- Heikki
On 2013-12-02 11:07:28 +0200, Heikki Linnakangas wrote:
> > Perhaps I'm missing something- but we already *have* a catalog representation for every extension that's ever installed into a given database. A representation that's a heck of a lot better than a big text blob.
>
> Right. I think Jeff was thinking of a catalog representation for extensions that haven't been installed yet, but are available in the system and could be installed with CREATE EXTENSION foo. I wouldn't mind having a catalog like that. Even without any of this extension template stuff, it would be handy to have a view that lists all the extensions available in the filesystem.

Luckily that's already there:

    SELECT * FROM pg_available_extensions;

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
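For instance (pg_available_extensions is the existing view; hstore is just an example, and the companion pg_available_extension_versions view lists every installable version):

    SELECT name, default_version, installed_version
      FROM pg_available_extensions
     WHERE name = 'hstore';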
Heikki Linnakangas <hlinnakangas@vmware.com> writes:
> Right. I think Jeff was thinking of a catalog representation for extensions that haven't been installed yet, but are available in the system and could be installed with CREATE EXTENSION foo. I wouldn't mind having a catalog like that. Even without any of this extension template stuff, it would be handy to have a view that lists all the extensions available in the filesystem.

http://www.postgresql.org/docs/9.1/static/view-pg-available-extensions.html
http://www.postgresql.org/docs/9.1/static/view-pg-available-extension-versions.html

> There should be no difference between file-based extensions and catalog-based extensions. It's just two different ways to install the same extension. The extension author doesn't need to care about that, it's the DBA that decides which method to use to install it.

Agreed.

> I'm going to object loudly to any proposal that doesn't meet that criteria.

Please be kind enough to point me to where my current patch is drifting away from that criteria. What you're proposing here is what I think I have been implementing.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Stephen Frost <sfrost@snowman.net> writes:
>> > Having a versioning notion (and whatever other meta data we, or an extension author, feels is useful) for what are otherwise simple containers (aka the schematic we already have..) makes sense and it would be great to provide support around that, but not this duplication of object definitions.
>>
>> I don't like duplication either, we've just been failing to find any alternative with pg_restore support for the last 3 years.
>
> *That doesn't make this approach the right one*. If anything, I'm afraid we've ended up building ourselves a rube goldberg machine because of this constant struggle to fit a square peg into a round hole.

This duplication you're talking about only applies to CREATE EXTENSION.

I don't know of any way to implement ALTER EXTENSION … UPDATE … behaviour without a separate set of scripts to apply in a certain order depending on the current and target versions of the extension.

If you know how to enable a DBA to update a set of objects in a database only with information already found in the database, and in a way that this information is actually *not* an SQL script, I'm all ears.

> That's basically what we already do with schemas today and hence is pretty darn close to what I'm proposing. Perhaps it'd be a way to simply version schemas themselves- heck, with that, we could even provide that oft-asked-for schema delta tool in-core by being able to deduce the differences between schema at version X and schema at version Y.

Given that at any moment you have a single version of the schema installed, I don't know how you're supposed to be able to do that. Maybe you mean by tracking the changes at update time? Well, that at least would be a good incentive to have Command String access in event triggers, I guess.

>> That would work beautifully, and of course you would have to do that again manually at pg_restore time after CREATE DATABASE and before pg_restore, or you would need to change the fact that extension objects are not part of your pg_dump scripts, or you would have to name your new thing something other than an extension.
>
> We would need a way to dump and restore this, of course.

Which is available in the current patch, of course.

> Having a management system for sets of objects is a *great* idea- and one which we already have through schemas. What we don't have is any kind of versioning system built-in or other metadata about it, nor do we have good tooling which leverages such a versioning or similar system.

Exactly. How can we implement ALTER <OBJECT> … UPDATE TO <VERSION> without having access to some SQL scripts?

The current patch offers a way to manage those scripts and apply them, with the idea that the people managing the scripts (extension authors) and the people applying them (DBAs) are not going to be the same people, and that it's then possible to have to apply more than a single script for a single UPDATE command.

> I really just don't see this as being either particularly useful nor feasible within a reasonable amount of effort. Shared libraries are really the purview of the OS packaging system. If you want to build some tool which is external to PG but helps facilitate the building and installing of shared libraries, but doesn't use the OS packaging system (and, instead, attempts to duplicate it), then go for it, but don't expect to ship or install that through the PG backend.
I'll give you that implementing Event Triggers just to be able to build what you're talking about on top of it and out of core might not be called “a reasonable amount of effort.”

>> The problem found here is that if a non-privileged user installs an extension template named “pgcrypto”, then when the superuser installs what he believes is the extension “pgcrypto”, the malicious unprivileged user is now running his own code (extension install script) as a superuser.
>
> For my part, the problem here is this notion of extension templates in the PG catalog and this is just one symptom of how that's really not a good approach.

The only reason for that being the case is that you suppose that root on the file system is more trustworthy as an entity than postgres on the file system or any superuser in the PostgreSQL service.

As soon as you question that, then you might come to realise the only difference between file-system templates and catalog templates is our ability to deal with the problem, rather than the problem itself.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
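To make the update mechanics at stake here concrete (this is the existing file-based behaviour; the extension name and versions are invented): with foo 1.0 installed and upgrade scripts foo--1.0--1.1.sql and foo--1.1--2.0.sql available,

    ALTER EXTENSION foo UPDATE TO '2.0';
    -- applies foo--1.0--1.1.sql, then foo--1.1--2.0.sql, in that order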
* Heikki Linnakangas (hlinnakangas@vmware.com) wrote:
> Right. I think Jeff was thinking of a catalog representation for extensions that haven't been installed yet, but are available in the system and could be installed with CREATE EXTENSION foo.

I really don't particularly see value in this unless it hooks into PGXN or similar somehow (a la how an apt repository works). I just don't see the point if users have to install templates to then get a list of what extensions they have available to install. The whole 'extension template' piece of this just ends up being overhead and gets in the way.

> I wouldn't mind having a catalog like that. Even without any of this extension template stuff, it would be handy to have a view that lists all the extensions available in the filesystem.

As mentioned, that's available for the filesystem-based extensions.

> There should be no difference between file-based extensions and catalog-based extensions. It's just two different ways to install the same extension. The extension author doesn't need to care about that, it's the DBA that decides which method to use to install it.
>
> I'm going to object loudly to any proposal that doesn't meet that criteria.

Right, which is why I think this is going to *have* to exist outside of the backend as an independent tool which can simply install an extension through normal libpq/PG object creation methods- very similar to how extension creation already happens, except that we're being fed from a PG connection instead of reading in an SQL file from the filesystem.

Thanks,
Stephen
* Dimitri Fontaine (dimitri@2ndQuadrant.fr) wrote:
> > There should be no difference between file-based extensions and catalog-based extensions. It's just two different ways to install the same extension. The extension author doesn't need to care about that, it's the DBA that decides which method to use to install it.
>
> Agreed.
>
> > I'm going to object loudly to any proposal that doesn't meet that criteria.
>
> Please be kind enough to point me to where my current patch is drifting away from that criteria. What you're proposing here is what I think I have been implementing.

Perhaps you're seeing something down the road that I'm not, but I don't see how what you're proposing with extension templates actually moves us closer to this goal. What is the next step to allow an extension pulled down from pgxn to be installed, unchanged, into a given database?

Thanks,
Stephen
Stephen Frost <sfrost@snowman.net> writes:
> What is the next step to allow an extension pulled down from pgxn to be installed, unchanged, into a given database?

An extension packaging system. Unchanged is not a goal, and not possible even today.

PGXN is a *source based* packaging system. You can't just install what's in PGXN on the server's file system then CREATE EXTENSION, you have this extra step called the “build”. Whether you're targeting a file system template or a catalog template, PGXN is not a complete solution, you still need to build the extension.

As I already mentioned in this thread, that's even true for SQL-only extensions today; have a look at this example:

http://api.pgxn.org/src/mimeo/mimeo-1.0.1/
http://api.pgxn.org/src/mimeo/mimeo-1.0.1/Makefile

So even as of today, given file-based extension templates and PGXN, there's something missing. You can find different client tools to help you there, such as pgxn_client and pex:

http://pgxnclient.projects.pgfoundry.org/
https://github.com/petere/pex

What I want to build is “extension distribution” software that knows how to prepare anything from PGXN (and other places) so that it's fully ready for being used in the database. Then the main client would run as a CREATE EXTENSION "ddl_command_start" Event Trigger and would fetch the prepared extension for you and make it available, then let the main command operate as intended.

Which is what I think the pex extension is doing, and that's not coincidental, but it runs the build step on the PostgreSQL server itself and needs to have a non-trivial set of file-system privileges to be doing so, and even needs to get root privileges with sudo for some of its operations.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
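A minimal sketch of that client hook, using the event trigger syntax that exists as of 9.3 (the function name and body are invented; a real client would fetch and prepare the template before the command proceeds):

    CREATE FUNCTION fetch_prepared_extension() RETURNS event_trigger
      LANGUAGE plpgsql AS $$
    BEGIN
      -- a real implementation would fetch the prepared extension here
      -- and make it available, then let CREATE EXTENSION run as usual
      RAISE NOTICE 'preparing extension before CREATE EXTENSION runs';
    END;
    $$;

    CREATE EVENT TRIGGER extension_fetcher
      ON ddl_command_start
      WHEN TAG IN ('CREATE EXTENSION')
      EXECUTE PROCEDURE fetch_prepared_extension();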
* Dimitri Fontaine (dimitri@2ndQuadrant.fr) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > *That doesn't make this approach the right one*. If anything, I'm afraid we've ended up building ourselves a rube goldberg machine because of this constant struggle to fit a square peg into a round hole.
>
> This duplication you're talking about only applies to CREATE EXTENSION.
>
> I don't know of any way to implement ALTER EXTENSION … UPDATE … behaviour without a separate set of scripts to apply in a certain order depending on the current and target versions of the extension.

We've already got it in the form of how filesystem extensions work today..

> If you know how to enable a DBA to update a set of objects in a database only with information already found in the database, and in a way that this information is actually *not* an SQL script, I'm all ears.

Clearly we need the information from the extension package (the scripts which are on the PG server's filesystem today, but need not be in the future), but that doesn't mean we need to keep those text blobs in the catalog.

> > That's basically what we already do with schemas today and hence is pretty darn close to what I'm proposing. Perhaps it'd be a way to simply version schemas themselves- heck, with that, we could even provide that oft-asked-for schema delta tool in-core by being able to deduce the differences between schema at version X and schema at version Y.
>
> Given that at any moment you have a single version of the schema installed, I don't know how you're supposed to be able to do that.

*I am not trying to rebuild the entire extension package from the PG catalog*. I do not see the need to do so either. Perhaps that's short-sighted of me, but I don't think so; to go back to my dpkg example, we don't store the source package in dpkg's database, nor do people generally feel the need to rebuild .deb's from the files which are out on the filesystem (a non-trivial task though I suppose it might be possible to do- but not for *every version* of the package..).

> Maybe you mean by tracking the changes at update time? Well, that at least would be a good incentive to have Command String access in event triggers, I guess.

I don't see the need to track the changes at all. We don't actually track them in the database anywhere today... We happen to have scripts available on the filesystem which allow us to move between versions, but they're entirely outside the catalog and that's where they belong.

> > Having a management system for sets of objects is a *great* idea- and one which we already have through schemas. What we don't have is any kind of versioning system built-in or other metadata about it, nor do we have good tooling which leverages such a versioning or similar system.
>
> Exactly.
>
> How can we implement ALTER <OBJECT> … UPDATE TO <VERSION> without having access to some SQL scripts?
>
> The current patch offers a way to manage those scripts and apply them, with the idea that the people managing the scripts (extension authors) and the people applying them (DBAs) are not going to be the same people, and that it's then possible to have to apply more than a single script for a single UPDATE command.

Extension authors are not going to be issuing updates to everyone's catalogs directly to update their templates.. That's still going to be the job of the DBA, or of some tool the DBA runs.
I'm arguing that such a tool could actually do a lot more and work outside of the PG backend, but communicate through libpq. As I see it, you're trying to build that tool *into* the backend, and while 'extension templates' might end up there, I don't think you're going to get your wish when it comes to having a PG backend reach out over the internet at the request of a normal user, ever.

> As soon as you question that, then you might come to realise the only difference between file-system templates and catalog templates is our ability to deal with the problem, rather than the problem itself.

I really think there's a good deal more to my concerns than that. :)

Thanks,
Stephen
* Dimitri Fontaine (dimitri@2ndQuadrant.fr) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > What is the next step to allow an extension pulled down from pgxn to be installed, unchanged, into a given database?
>
> An extension packaging system.
>
> Unchanged is not a goal, and not possible even today.

I'm not convinced of that, actually, but you do raise a good point.

> PGXN is a *source based* packaging system. You can't just install what's in PGXN on the server's file system then CREATE EXTENSION, you have this extra step called the “build”.

Fine- so we need a step that goes from 'source' to 'built'. I don't see that step being done in or by a PG backend process. Adding a new option which can take a pgxn source and build a script from it which can be run against PG via libpq is what I'd be going for- but that script *just installs (or perhaps upgrades) the extension.* There's no need for that script, or various upgrade/downgrade/whatever scripts, to be sucked wholesale into the PG catalog.

> What I want to build is “extension distribution” software that knows how to prepare anything from PGXN (and other places) so that it's fully ready for being used in the database. Then the main client would run as a CREATE EXTENSION "ddl_command_start" Event Trigger and would fetch the prepared extension for you and make it available, then let the main command operate as intended.

I really don't think that's a good approach.

> Which is what I think the pex extension is doing, and that's not coincidental, but it runs the build step on the PostgreSQL server itself and needs to have a non-trivial set of file-system privileges to be doing so, and even needs to get root privileges with sudo for some of its operations.

pex is an interesting beginning to this, but we'd need *some* backend support for being able to install the extension via libpq (or pex would need to be modified to not actually use our extension framework at all for 'trusted' extensions...).

Thanks,
Stephen
Stephen Frost <sfrost@snowman.net> writes:
> Clearly we need the information from the extension package (the scripts which are on the PG server's filesystem today, but need not be in the future), but that doesn't mean we need to keep those text blobs in the catalog.

So, I guess it would have been good to hear about that about a year ago:

http://www.postgresql.org/message-id/13481.1354743758@sss.pgh.pa.us
http://www.postgresql.org/message-id/6466.1354817682@sss.pgh.pa.us

We could have CREATE TEMPLATE FOR EXTENSION store the scripts into some files in PGDATA instead of the catalogs, but really I don't see the point.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
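For reference, the command under discussion takes roughly this shape in the patch (the clause layout and options here are approximated from this thread, not authoritative):

    CREATE TEMPLATE FOR EXTENSION foo VERSION '1.0'
      WITH (relocatable = false)
      AS $$
        CREATE FUNCTION foo_version() RETURNS int LANGUAGE sql AS 'SELECT 1';
      $$;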
On 12/02/2013 04:14 PM, Dimitri Fontaine wrote:
> Stephen Frost <sfrost@snowman.net> writes:
>> What is the next step to allow an extension pulled down from pgxn to be installed, unchanged, into a given database?
>
> An extension packaging system.
>
> Unchanged is not a goal, and not possible even today.
>
> PGXN is a *source based* packaging system. You can't just install what's in PGXN on the server's file system then CREATE EXTENSION, you have this extra step called the “build”.
>
> Whether you're targeting a file system template or a catalog template, PGXN is not a complete solution, you still need to build the extension.

So? Just "make; make install" and you're done. Or "apt-get install foo".

> What I want to build is “extension distribution” software that knows how to prepare anything from PGXN (and other places) so that it's fully ready for being used in the database.

You mean, something to replace "make install" when that's not available on the server? Fair enough. You could probably write a little perl script to parse simple Makefiles that only copy a few static files in place. Or add a flag to the control file indicating that the extension follows a standard layout, and doesn't need a "make" step.

I fear we're wandering off the point again. So let me repeat: It must be possible to install the same extension the way you do today, and using the new mechanism.

- Heikki
Stephen Frost <sfrost@snowman.net> writes: > Fine- so we need a step that goes from 'source' to 'built'. I don't see > that step being done in or by a PG backend process. Adding a new option > which can take a pgxn source and build a script from it which can be run > against PG via libpq is what I'd be going for- but that script *just > installs (or perhaps upgrades) the extension.* There's no need for that > script, or various upgrade/downgrade/whatever scripts, to be sucked > wholesale into the PG catalog. As you said previously, we can't ask extension authors to control what version of their extension is installed on which database, so we need a way to cooperate with the backend in order to know how to operate the update. We can't just pull data out of the backend to do that until we've first pushed to it the list of available versions and the update scripts needed to run the update. That's where I thought about pushing the whole thing down to the catalogs and having the backend take control from there. >> What I want to build is an “extension distribution” software that knows >> how to prepare anything from PGXN (and other places) so that it's fully >> ready for being used in the database. Then the main client would run as >> a CREATE EXTENSION "ddl_command_start" Event Trigger and would fetch the >> prepared extension for you and make it available, then leaving the main >> command operate as intended. > > I really don't think that's a good approach. What's your alternative? Goals are:
- using the update abilities of the extension mechanism
- no access to the server's file system needed
- pg_restore does the right thing
I went for the whole set of extension abilities in my patch; you're pushing hard for me to reduce that goal, so I only included the ability to manage version upgrades here. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Heikki Linnakangas <hlinnakangas@vmware.com> writes: > I fear we're wandering off the point again. So let me repeat: It must be > possible to install the same extension the way you do today, and using the > new mechanism. The way you do it today is running make install or apt-get install or something else to write files in the right place on the file system, usually with root privileges. The new mechanism tries to avoid using the file system *completely*. Sorry, I don't understand what you mean, other than “I don't want this patch because I don't understand what it is about”. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Stephen Frost <sfrost@snowman.net> writes: > * Dimitri Fontaine (dimitri@2ndQuadrant.fr) wrote: >> Then as soon as we are able to CREATE EXTENSION mystuff; without ever >> pre-installing files on the file system as root, then we would like to >> be able to do just that even with binary modules. > I really just don't see this as being either particularly useful nor > feasible within a reasonable amount of effort. Shared libraries are > really the purview of the OS packaging system. Yes, exactly. What's more, you're going to face huge push-back from vendors who are concerned about security (which is most of them). If there were such a feature, it would end up disabled, one way or another, in a large fraction of installations. That would make it impractical to use anyway for most extension authors. I don't think it's good project policy to fragment the user base that way. I'm on board with the notion of an all-in-the-database extension mechanism for extensions that consist solely of SQL objects. But not for ones that need a .so somewhere. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > Stephen Frost <sfrost@snowman.net> writes: >> * Dimitri Fontaine (dimitri@2ndQuadrant.fr) wrote: >>> Then as soon as we are able to CREATE EXTENSION mystuff; without ever >>> pre-installing files on the file system as root, then we would like to >>> be able to do just that even with binary modules. > >> I really just don't see this as being either particularly useful nor >> feasible within a reasonable amount of effort. Shared libraries are >> really the purview of the OS packaging system. > > Yes, exactly. What's more, you're going to face huge push-back from > vendors who are concerned about security (which is most of them). Last time I talked with vendors, they were working in the OpenShift team at Red Hat, and they actually asked me to offer them the ability you're refusing, to let them enable a better security model. The way they use cgroups and SELinux means that they want to be able to load shared binaries from system user places. > If there were such a feature, it would end up disabled, one way or > another, in a large fraction of installations. That would make it > impractical to use anyway for most extension authors. I don't think > it's good project policy to fragment the user base that way. That point about fragmentation is a concern I share. > I'm on board with the notion of an all-in-the-database extension > mechanism for extensions that consist solely of SQL objects. But > not for ones that need a .so somewhere. Thanks for restating your position. The current patch offers a feature that only works with SQL objects; it's currently completely useless as soon as there's a .so involved. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
* Dimitri Fontaine (dimitri@2ndQuadrant.fr) wrote: > Stephen Frost <sfrost@snowman.net> writes: > > Clearly we need the information from the extension package (the scripts > > which are on the PG server's filesystem today, but need not be in the > > future) but that doesn't mean we need to keep those text blobs in the > > catalog. > > So, I guess it would have been good to hear about that about a year ago: > > http://www.postgresql.org/message-id/13481.1354743758@sss.pgh.pa.us > http://www.postgresql.org/message-id/6466.1354817682@sss.pgh.pa.us > > We could have CREATE TEMPLATE FOR EXTENSION store the scripts into some > files in PGDATA instead of the catalogs, but really I don't see the > point. Yeah, I don't particularly like that idea either, but especially if it's going to be per-database again. I can kinda, sorta see the point if this was done cluster-wide but you don't like that idea and I'm not a big fan of pushing these files out onto the filesystem anyway. What I don't entirely follow is the argument against having non-file-backed extensions be dump'd through pg_dump/restore. Even in that thread, Tom appears to agree that they'd have to be dumped out in some fashion, even if they're stored as files under PGDATA, because otherwise you're not going to be able to restore the DB.. On the other hand, I can appreciate the concern that we don't really want a dump/restore to include the extension definition when it's already on the filesystem. That said, it amazes me that we don't include the version # of the extension in pg_dump's 'CREATE EXTENSION' command.. How is that not a problem? Thanks, Stephen
* Dimitri Fontaine (dimitri@2ndQuadrant.fr) wrote: > Stephen Frost <sfrost@snowman.net> writes: > > Fine- so we need a step that goes from 'source' to 'built'. I don't see > > that step being done in or by a PG backend process. Adding a new option > > which can take a pgxn source and build a script from it which can be run > > against PG via libpq is what I'd be going for- but that script *just > > installs (or perhaps upgrades) the extension.* There's no need for that > > script, or various upgrade/downgrade/whatever scripts, to be sucked > > wholesale into the PG catalog. > > As you said previously, we can't ask extension authors to control what > version of their extension is installed on which database, so we need a > way to cooperate with the backend in order to know how to operate the > update. Sure, that sounds reasonable.. > We can't just pull data out of the backend to do that until we've first > pushed to it the list of available versions and the update scripts needed > to run the update. I'm not following this, nor why we need this master list of every extension which exists in the world to be in every PG catalog in every database out there. > That's where I thought about pushing the whole thing down to the catalogs > and having the backend take control from there. I can appreciate the desire to do that but this particular piece really feels like it could be done better external to the backend. To go back to my OS example, I feel Debian is better off with apt-get/aptitude being independent from dpkg itself. > What's your alternative? Goals are: > > - using the update abilities of the extension mechanism > - no access to the server's file system needed > - pg_restore does the right thing > > I went for the whole set of extension abilities in my patch; you're > pushing hard for me to reduce that goal, so I only included the ability > to manage version upgrades here. I'd like to see these goals met, I just don't see it all being done in C in the PG backend. I've tried to outline my thoughts about how we should keep the actual extension creation scripts, upgrade scripts, etc, out of the backend catalogs (and not on the filesystem either..) and let those be managed externally, but that does then require that we actually dump the extension's objects instead of just 'CREATE EXTENSION blah;'. I understand there have been objections raised to that, but I wonder if that isn't mainly because we're calling these new things "extensions", which have this built-in notion that they're coming from an OS packaging system which installs files somewhere.. We certainly have none of these qualms about dumping and restoring all the objects in a given schema. Thanks, Stephen
Stephen Frost <sfrost@snowman.net> writes: > On the other hand, I can appreciate the concern that we don't really > want a dump/restore to include the extension definition when it's > already on the filesystem. That said, it amazes me that we don't > include the version # of the extension in pg_dump's 'CREATE EXTENSION' > command.. How is that not a problem? Including the version number would be a problem. When you install PostgreSQL 9.1, you only have hstore 1.0. When you install PostgreSQL 9.2, you only have hstore 1.1. When you install PostgreSQL 9.3, you only have hstore 1.2.
http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=contrib/hstore/hstore.control;hb=refs/heads/REL9_1_STABLE
http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=contrib/hstore/hstore.control;hb=refs/heads/REL9_2_STABLE
http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=contrib/hstore/hstore.control;hb=refs/heads/REL9_3_STABLE
We should maybe add the extension's version number to our documentation pages, such as the following: http://www.postgresql.org/docs/9.3/interactive/hstore.html So when you pg_dump | pg_restore from 9.1 into 9.3, if pg_dump were to be nitpicky about the version of hstore and emit the command
    CREATE EXTENSION hstore VERSION '1.0';
then pg_restore would fail. That's just the way we maintain contribs. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
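To make the failure mode concrete, restoring such a version-pinned dump into 9.3, which ships only the 1.2 scripts for hstore, would go roughly like this:

    CREATE EXTENSION hstore VERSION '1.0';
    ERROR:  extension "hstore" has no installation script nor update path for version "1.0"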
* Dimitri Fontaine (dimitri@2ndQuadrant.fr) wrote: > Tom Lane <tgl@sss.pgh.pa.us> writes: > > Yes, exactly. What's more, you're going to face huge push-back from > > vendors who are concerned about security (which is most of them). > > Last time I talked with vendors, they were working in the Open Shift > team at Red Hat, and they actually asked me to offer them the ability > you're refusing, to let them enable a better security model. > > The way they use cgroups and SELinux means that they want to be able to > load shared binaries from system user places. As I've pointed out before, I'd really like to hear exactly how these individuals are using SELinux and why they feel this is an acceptable approach. The only use-case that this model fits is where you don't have *any* access control in the database itself and everyone might as well be a superuser. Then, sure, SELinux can prevent your personal PG environment from destroying the others on the system in much the same way that a chroot can help there, but most folks who are looking at MAC would view *any* database as an independent object system which needs to *hook into* an SELinux or similar. In other words, I really don't think we should be encouraging this approach and certainly not without more understanding of what they're doing here. Perhaps they have a use-case for it, but it might be better done through 'adminpack' or something similar than what we support in core. Thanks, Stephen
* Dimitri Fontaine (dimitri@2ndQuadrant.fr) wrote: > So when you pg_dump | pg_restore from 9.1 into 9.3, if pg_dump were to > be nitpicky about the version of hstore and emit the command > > CREATE EXTENSION hstore VERSION '1.0'; > > then pg_restore would fail. > > That's just the way we maintain contribs. I'd much rather get an error that says "that version of the extension is unavailable" than a runtime error when my plpgsql code tries to use a function whose definition changed between 1.0 and 1.1.. Perhaps we're not ready to go there because of how contrib is built and shipped, and I can understand that, but that doesn't make it a good solution. I'm not sure that such an issue should preclude in-catalog-only extensions from being dump'd out as a set of objects (ala a schema) and then restored that way (preserving the version of the extension it was installed at..). I don't like the idea of having a pg_dump/restore mechanism that intentionally tries to go out and install the latest version of whatever extension was installed in the old DB by downloading it from PGXN, building it, and then installing it... Is that what people are expecting here? Thanks, Stephen
Stephen Frost <sfrost@snowman.net> writes: > I don't like the idea of having a pg_dump/restore mechanism that > intentionally tries to go out and install the latest version of whatever > extension was installed in the old DB by downloading it from PGXN, > building it, and then installing it... Is that what people are > expecting here? The whole idea of having Extension Templates in catalogs is exactly to prevent what you're describing here from happening. Whatever templates you downloaded to get to the version you now have in your database for extension “foo” are going to be used again by pg_restore at CREATE EXTENSION time. The extension depending on its in-catalog templates ensures that model of operations. You can copy/paste some extension examples from the regression tests and run pg_dump -Fc | pg_restore -l to see the details. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Dec 2, 2013, at 6:14 AM, Dimitri Fontaine <dimitri@2ndQuadrant.fr> wrote: > Whether you're targeting a file system template or a catalog template, > PGXN is not a complete solution, you still need to build the extension. This is true today, but only because PostgreSQL provides the infrastructure for building and installing extensions that entails `make && make install`. If Postgres provided some other method of building and installing extensions, you could start using it right away on PGXN. The *only* requirement for PGXN distributions, really, is a META.json file describing the extension. Best, David
On Mon, Dec 2, 2013 at 10:13 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Heikki Linnakangas <hlinnakangas@vmware.com> writes: >> I fear we're wandering off the point again. So let me repeat: It must be >> possible to install the same extension the way you do today, and using the >> new mechanism. > > The way you do it today is running make install or apt-get install or > something else to write files in the right place on the file system, > usually with root privileges. > > The new mechanism tries to avoid using the file system *completely*. > > Sorry, I don't understand what you mean, other than “I don't want this > patch because I don't understand what it is about”. OK, I'll bite. I've been trying to stay out of this thread, but I really *don't* understand what this patch is about. Extensions, as they exist today, are installed from the filesystem and their contents are not dumped. You're trying to create a new kind of extension which is installed from the system catalogs (instead of the file system) and is dumped. Why should anyone want that? It seems that part of the answer is that people would like to be able to install extensions via libpq. You could almost write a client-side tool for that today just by using adminpack to write the files to the server, but you'd trip over the fact that files written by adminpack must be in either the data directory or the log directory. But we could fix that easily enough. Here's a design sketch:
(1) Add a new GUC that specifies an alternate location from which extensions can be installed.
(2) Allow adminpack to write to that location just as it writes to the existing locations.
(3) Write a tool that copies files from wherever to the appropriate server directory using adminpack.
Problem solved! The only downside is that the use of this facility would have to be restricted to superusers, but the current consensus on this thread is that we should restrict *this* facility to superusers also, so we're not really losing anything. And, for a further plus, it'd even work for extensions that contain shared libraries. Win. Now, if we could make this mechanism work for non-superusers, then I think it gets more interesting, because now you have a more significant gain in functionality: someone can potentially download an extension from PGXN and install it just for themselves without needing superuser access, provided the extension doesn't require a .so or any permissions that they don't have. That's kind of nice, but as Stephen said elsewhere on the thread, this seems like a lot of mechanism for that narrow goal. As you (I think) said somewhere on this thread, you could just create the extension with a bunch of CREATE and ALTER EXTENSION .. ADD statements and set a flag on it that causes it to be dumped the same way. (We might need to add a CREATE EXTENSION foo WITH NO CONTENTS statement to really make it work, so that the libpq connection can create it as completely empty and then add objects to it one at a time, but we shouldn't need too much more than that.) The whole idea of the extension template as such goes away. So I'm having a hard time understanding what this patch actually gains us that can't be done more simply by some other means. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
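To illustrate the sketch above, here is roughly what step (3) could send over libpq once steps (1) and (2) existed. pg_file_write() is adminpack's existing function; the 'staging' directory (and the GUC that would point at it) is hypothetical:

    -- push the control file and install script to the server
    SELECT pg_catalog.pg_file_write('staging/pair.control',
           E'default_version = ''1.0''\nrelocatable = true\n', false);
    SELECT pg_catalog.pg_file_write('staging/pair--1.0.sql',
           'CREATE TYPE pair AS (k text, v text);', false);

    -- then install it as usual
    CREATE EXTENSION pair;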
On Sun, 2013-12-01 at 22:34 -0500, Stephen Frost wrote: > Perhaps I'm missing something- but we already *have* a catalog > representation for every extension that's ever installed into a given > database. A representation that's a heck of a lot better than a big > text blob. I meant "extension template" represented in the catalog. > > But bringing more of an extension into the catalog can be done, and I > > think we'll see big benefits from that. > > I'm not following here- what's 'missing'? It seems that you are making the assumption that installing an extension template or creating an extension are major operations, and anyone encountering an error is a superuser with admin access to the server and can easily correct it. If the admin messes up and the extension template isn't there (e.g. after a failover), the person to encounter the error at CREATE EXTENSION time might not have admin access or there might be a process required to deploy the new files. But if the extension templates were carried along with replication and backup naturally, then they'd be there. And it would be nice if there was some hope for non-superusers to create extension templates, but that will never happen as long as they are files. > > That being said, there some things about in-catalog templates that need > > some more thought: > > > > 1. If someone does want their OS to install extensions for them (e.g. > > the contrib package), how should that be done? This usually works fine > > with the aforementioned languages, because installation is still just > > dropping files in the right place. Postgres is different, because to put > > something in the catalog, we need a running server, which is awkward for > > a packaging system to do. > > You need a running PG for the *extension* to be installed, but with the > filesystem-based extension approach we have today, the "template" (which > are the files on the filesystem) don't need PG running I think you misread -- this is a list of issues if we move templates into the catalog. File-based templates obviously don't have this problem. > > 3. What do we do about native shared libraries? Ultimately, I imagine > > that we should handle these similarly to tablespaces: have a real > > database object with an OID that extensions or functions can depend on, > > and create a symlink (with the OID as the link name) that points to the > > real file on disk. We could also export some new symbols like the shared > > library name and version for better error checking. > > I'm sorry, but I do not see shared libraries working through this > system, at all. I know that goes against what Dimitri and some others > want, but I've talked with a few folks (such as Paul Ramsey of PostGIS) > about this notion and, from that perspective, it's almost laughable to > think we could ship shared libraries in this way. Even if we could > convince ourselves that there's some way for us to track the files on > the filesystem and work out all the per-database and whatever issues are > associated with that, it'd only work for the simplest shared libraries > which don't have any dependencies on other libraries on the system > (excepting, perhaps, libc6) and that narrows the use-case down > significantly, to the point where I don't feel it's worth all that > effort. I was just suggesting that a little more information in the catalog could improve dependency tracking and error handling. I'm not suggesting we "ship" any shared libraries anywhere -- that's still up to extension authors and PGXN. 
I'm also not suggesting that the error handling will be perfect or catch subtle mismatches. Regards, Jeff Davis
On Mon, Dec 2, 2013 at 6:30 PM, Robert Haas <robertmhaas@gmail.com> wrote: > OK, I'll bite. I've been trying to stay out of this thread, but I > really *don't* understand what this patch is about. Extensions, as > they exist today, are installed from the filesystem and their contents > are not dumped. You're trying to create a new kind of extension which > is installed from the system catalogs (instead of the file system) and > is dumped. Why should anyone want that? > > It seems that part of the answer is that people would like to be able > to install extensions via libpq. You could almost write a client-side > tool for that today just by using adminpack to write the files to the > server, but you'd trip over the fact that files written by adminpack > must be in either the data directory or the log directory. But we > could fix that easily enough. Just tossing an idea out there. What if you could install an extension by specifying not a local file name but a URL. Obviously there's a security issue but for example we could allow only https URLs with verified domain names that are in a list of approved domain names specified by a GUC. -- greg
On Mon, Dec 2, 2013 at 2:33 PM, Greg Stark <stark@mit.edu> wrote: > On Mon, Dec 2, 2013 at 6:30 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> OK, I'll bite. I've been trying to stay out of this thread, but I >> really *don't* understand what this patch is about. Extensions, as >> they exist today, are installed from the filesystem and their contents >> are not dumped. You're trying to create a new kind of extension which >> is installed from the system catalogs (instead of the file system) and >> is dumped. Why should anyone want that? >> >> It seems that part of the answer is that people would like to be able >> to install extensions via libpq. You could almost write a client-side >> tool for that today just by using adminpack to write the files to the >> server, but you'd trip over the fact that files written by adminpack >> must be in either the data directory or the log directory. But we >> could fix that easily enough. > > Just tossing an idea out there. What if you could install an extension > by specifying not a local file name but a URL. Obviously there's a > security issue but for example we could allow only https URLs with > verified domain names that are in a list of approved domain names > specified by a GUC. That's a different feature, but I don't see anything preventing someone from implementing that as an extension, today, without any core support at all. It would only be usable in cases where the share directory is writable by the database server (i.e. low-security installations) and you'd have to make it a function call rather than piggybacking on CREATE EXTENSION, but neither of those things sound bad to me. (And if they are bad, they could be addressed by providing hooks or event triggers, leaving the rest of the functionality in the extension module.) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
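A sketch of that function-call shape, with every name invented, presuming a share directory writable by the server (the low-security case):

    CREATE EXTENSION fetch_ext;  -- hypothetical helper extension
    SELECT fetch_ext.install_from_url('https://example.org/dist/pair-1.0.zip');
    CREATE EXTENSION pair;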
Greg Stark <stark@mit.edu> writes: > On Mon, Dec 2, 2013 at 6:30 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> OK, I'll bite. I've been trying to stay out of this thread, but I >> really *don't* understand what this patch is about. Extensions, as Thanks! >> they exist today, are installed from the filesystem and their contents >> are not dumped. You're trying to create a new kind of extension which >> is installed from the system catalogs (instead of the file system) and >> is dumped. Why should anyone want that? To benefit from ALTER EXTENSION … UPDATE … and \dx. And technically the extension is not dumped, its templates are. >> It seems that part of the answer is that people would like to be able >> to install extensions via libpq. You could almost write a client-side >> tool for that today just by using adminpack to write the files to the >> server, but you'd trip over the fact that files written by adminpack >> must be in either the data directory or the log directory. But we >> could fix that easily enough. Trick question: when you've implemented said client and used it for a couple of (in-house) extensions, what do you think should happen at pg_restore time? Hint: in a properly designed ops model, pg_restore happens each and every day when the unattended cron job “rebases” the QA or testing environments from the production PITR backups, of course. > Just tossing an idea out there. What if you could install an extension > by specifying not a local file name but a URL. Obviously there's a > security issue but for example we could allow only https URLs with > verified domain names that are in a list of approved domain names > specified by a GUC. That's something I want to build. This time, not in core. The model I've been thinking about involves an EVENT TRIGGER that is fired at "ddl_command_start" for CREATE EXTENSION and prepares an EXTENSION TEMPLATE before the command has a chance to check what's available and install the current default version of it. It also runs at ALTER EXTENSION … UPDATE …, of course, providing the upgrade scripts on the fly. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
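In 9.3 event trigger syntax, the skeleton of that model might look like the following; the actual fetching logic is elided and the function name is invented:

    CREATE FUNCTION prepare_extension_template()
      RETURNS event_trigger LANGUAGE plpgsql AS $$
    BEGIN
      -- hypothetical: fetch and install the templates for the extension
      -- being created, before the command checks what's available
      RAISE NOTICE 'preparing templates, command is %', tg_tag;
    END;
    $$;

    CREATE EVENT TRIGGER extension_templates_on_demand
      ON ddl_command_start
      WHEN TAG IN ('CREATE EXTENSION')
      EXECUTE PROCEDURE prepare_extension_template();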
* Dimitri Fontaine (dimitri@2ndQuadrant.fr) wrote: > Stephen Frost <sfrost@snowman.net> writes: > > I don't like the idea of having a pg_dump/restore mechanism that > > intentionally tries to go out and install the latest version of whatever > > extension was installed in the old DB by downloading it from PGXN, > > building it, and then installing it... Is that what people are > > expecting here? > > The whole idea of having Extension Templates in catalogs is exactly to > prevent what you're describing here from happening. eh...? I'm not sure how that follows. > Whatever templates you downloaded to get to the version you now have in > your database for extension “foo” are going to be used again by > pg_restore at CREATE EXTENSION time. The extension depending on its > in-catalog templates ensures that model of operations. What I've been trying to point out is that there's absolutely zero need for the 'extension template' part of this to make a pg_restore work for an entirely-in-the-catalog extension. I realize that's how you've done it with this patch set but that doesn't make it necessary. Thanks, Stephen
David, * David E. Wheeler (david@justatheory.com) wrote: > On Dec 2, 2013, at 6:14 AM, Dimitri Fontaine <dimitri@2ndQuadrant.fr> wrote: > > > Whether you're targeting a file system template or a catalog template, > > PGXN is not a complete solution, you still need to build the extension. > > This is true today, but only because PostgreSQL provides the infrastructure for building and installing extensions that entails `make && make install`. If Postgres provided some other method of building and installing extensions, you could start using it right away on PGXN. The *only* requirement for PGXN distributions, really, is a META.json file describing the extension. Thanks, that's a pretty interesting point.. I like the idea that we could provide a new make target which could build an 'inline extension' (or what-have-you) which could then be distributed and used by users either directly or with some client-side tool. Thanks, Stephen
Stephen Frost <sfrost@snowman.net> writes: > * David E. Wheeler (david@justatheory.com) wrote: >> This is true today, but only because PostgreSQL provides the >> infrastructure for building and installing extensions that entails `make >> && make install`. If Postgres provided some other method of building and >> installing extensions, you could start using it right away on PGXN. The >> *only* requirement for PGXN distributions, really, is a META.json file >> describing the extension. > > Thanks, that's a pretty interesting point.. I like the idea that we > could provide a new make target which could build an 'inline extension' > (or what-have-you) which could then be distributed and used by users > either directly or with some client-side tool. +1 -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Robert, * Robert Haas (robertmhaas@gmail.com) wrote: > Problem solved! The only downside is that the use of this facility > would have to be restricted to superusers, but the current consensus > on this thread is that we should restrict *this* facility to > superusers also, so we're not really losing anything. And, for a > further plus, it'd even work for extensions that contain shared > libraries. Win. This is more-or-less what I've been arguing for when it comes to addressing this for people who are superusers and want to install .so's via libpq- do it outside of PG core, thank you very much. > Now, if we could make this mechanism work for non-superusers, then I > think it gets more interesting, because now you have a more > significant gain in functionality: someone can potentially download an > extension from PGXN and install it just for themselves without needing > superuser access, provided the extension doesn't require a .so or any > permissions that they don't have. This is exactly what I want to see happen, with perhaps some external tool which handles the "download from PGXN and install it" part- or maybe we add that to psql, but I feel like a dedicated tool would be better. > That's kind of nice, but as Stephen > said elsewhere on the thread, this seems like a lot of mechanism for > that narrow goal. As you (I think) said somewhere on this thread, you > could just create the extension with a bunch of CREATE and ALTER > EXTENSION .. ADD statements and set a flag on it that causes it to be > dumped the same way. (We might need to add a CREATE EXTENSION foo > WITH NO CONTENTS statement to really make it work, so that the libpq > connection can create it as completely empty and then add objects to > it one at a time, but we shouldn't need too much more than that.) The > whole idea of the extension template as such goes away. Exactly. Thanks, Stephen
Stephen Frost <sfrost@snowman.net> writes: > What I've been trying to point out is that there's absolutely zero need > for the 'extension template' part of this to make a pg_restore work for > an entirely-in-the-catalog extension. I realize that's how you've done > it with this patch set but that doesn't make it necessary. If it's an extension, it's filtered out of pg_dump, so it's not part of your pg_restore. Full stop. This point has been debated and there was a very clear conclusion a year ago. What am I missing here? -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
* Jeff Davis (pgsql@j-davis.com) wrote: > On Sun, 2013-12-01 at 22:34 -0500, Stephen Frost wrote: > > Perhaps I'm missing something- but we already *have* a catalog > > representation for every extension that's ever installed into a given > > database. A representation that's a heck of a lot better than a big > > text blob. > > I meant "extension template" represented in the catalog. Ok. > > > But bringing more of an extension into the catalog can be done, and I > > > think we'll see big benefits from that. > > > > I'm not following here- what's 'missing'? The above was in reference to an extension, not an extension template. > It seems that you are making the assumption that installing an extension > template or creating an extension are major operations, and anyone > encountering an error is a superuser with admin access to the server and > can easily correct it. I'm not following this train of thought at all.. Unless you're referring to the case where it involves a .so, in which case then I *would* expect a superuser with admin access to the server to be involved. For the non-.so-including case, I'm hopeful we can provide a mechanism which doesn't require a superuser or admin at all. > If the admin messes up and the extension template isn't there (e.g. > after a failover), the person to encounter the error at CREATE EXTENSION > time might not have admin access or there might be a process required to > deploy the new files. But if the extension templates were carried along > with replication and backup naturally, then they'd be there. The *extension* is already brought along with replication and binary backup naturally, for non-.so-based extensions. For logical backups, we depend on the filesystem to provide the extension definition- I don't particularly like that and it *isn't necessary* for non-.so-based extensions. In many ways, we do it the way we do specifically due to how we distribute -contrib, as Dimitri has pointed out. What I'd like to see is support for dumping the extension's objects during a logical dump, perhaps only for specific types of extensions (ones installed that way), and maybe with some additional caveats on it (should we have multiple extension namespaces? or come up with some other name for these objects?). For .so-based extensions, yes, an admin is going to need to be involved and I am not impressed by these notions that such a requirement is unreasonable or overly draconian. > And it would be nice if there was some hope for non-superusers to create > extension templates, but that will never happen as long as they are > files. Agreed and as a follow-on to that, I'm pretty sure .so's will always be files... > > You need a running PG for the *extension* to be installed, but with the > > filesystem-based extension approach we have today, the "template" (which > > are the files on the filesystem) don't need PG running > > I think you misread -- this is a list of issues if we move templates > into the catalog. File-based templates obviously don't have this > problem. I don't see us being able to completely get rid of the existing extension system which uses the file-based approach. Certainly not for a very long time, so I'm not sure that there's much discussion to be had along these lines. > I was just suggesting that a little more information in the catalog > could improve dependency tracking and error handling. I'm not suggesting > we "ship" any shared libraries anywhere -- that's still up to extension > authors and PGXN. 
> I'm also not suggesting that the error handling will > be perfect or catch subtle mismatches. Having more information about an extension in the catalog seems reasonable to me- along the same lines of how we have dependency information in dpkg for .deb's. What I don't like is the notion of upgrade scripts which are installed into the catalog as *scripts*. Ditto for initial extension creation. We already know how to drop and recreate every object in the database, let's make use of that mechanism instead. That doesn't allow pg_dump/restore to *invisibly upgrade an extension* (which is what happens now..), but I find that to be an entirely good thing... Thanks, Stephen
On Mon, Dec 2, 2013 at 3:02 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Greg Stark <stark@mit.edu> writes: >> On Mon, Dec 2, 2013 at 6:30 PM, Robert Haas <robertmhaas@gmail.com> wrote: >>> OK, I'll bite. I've been trying to stay out of this thread, but I >>> really *don't* understand what this patch is about. Extensions, as >>> they exist today, are installed from the filesystem and their contents >>> are not dumped. You're trying to create a new kind of extension which >>> is installed from the system catalogs (instead of the file system) and >>> is dumped. Why should anyone want that? > > To benefit from ALTER EXTENSION … UPDATE … and \dx. I don't see what benefit we get as far as ALTER EXTENSION .. UPDATE is concerned. You've got to do some kind of DDL to load the update script into the database, and then you've got to run the UPDATE itself. That's pretty complicated. You could have done the same thing by just ALTERing the extension, without actually needing the update script uploaded into the database. > And technically the extension is not dumped, its templates are. But you *could* dump the extension, and in fact it would be simpler than what you're actually doing. >>> It seems that part of the answer is that people would like to be able >>> to install extensions via libpq. You could almost write a client-side >>> tool for that today just by using adminpack to write the files to the >>> server, but you'd trip over the fact that files written by adminpack >>> must be in either the data directory or the log directory. But we >>> could fix that easily enough. > > Trick question: when you've implemented said client and used it for a > couple of (in-house) extensions, what do you think should happen at > pg_restore time? > > Hint: in a properly designed ops model, pg_restore happens each and > every day when the unattended cron job “rebases” the QA or testing > environments from the production PITR backups, of course. /me shrugs. You could just as well ask what happens when you've installed postgresql but not postgresql-contrib and try to restore a database that needs hstore. There's no getting around the fact that if the extension isn't available, installation will fail. Nor is that a bug. Still, I agree that there are use cases where that's not what you want, and discussed some of them in the rest of the email from which you took that quote. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
* Dimitri Fontaine (dimitri@2ndQuadrant.fr) wrote: > Stephen Frost <sfrost@snowman.net> writes: > > What I've been trying to point out is that there's absolutely zero need > > for the 'extension template' part of this to make a pg_restore work for > > an entirely-in-the-catalog extension. I realize that's how you've done > > it with this patch set but that doesn't make it necessary. > > If it's an extension, it's filtered out of pg_dump, so it's not part of > your pg_restore. Full stop. This point has been debated and there was > a very clear conclusion a year ago. > > What am I missing here? I don't buy it. I understand that you heard that from Tom with regard to extensions- but that was also when he was arguing that what you're building here shouldn't be called extensions. I have a huge problem with the idea that we can't dump out the definition of an in-catalog-only extension, but it's okay to dump out some big blob of text which is supposed to represent the same thing but in a much, much more kludgey and inexact way. How are we going to handle new keywords being added in new major versions? A pg_dump of the extension template script is then going to be loaded into the new major version but will not actually be able to be run because it'll error out... I'm 100% confident that's not the only problem with this approach which sticks script text blobs into the catalog as representations of database objects. Thanks, Stephen
On Mon, Dec 2, 2013 at 3:17 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Stephen Frost <sfrost@snowman.net> writes: >> What I've been trying to point out is that there's absolutely zero need >> for the 'extension template' part of this to make a pg_restore work for >> an entirely-in-the-catalog extension. I realize that's how you've done >> it with this patch set but that doesn't make it necessary. > > If it's an extension, it's filtered out of pg_dump, so it's not part of > your pg_restore. Full stop. This point has been debated and there was > a very clear conclusion a year ago. > > What am I missing here? This is a fair point. For those that may not remember the previous proposal, see here: http://www.postgresql.org/message-id/m2390abpc1.fsf@2ndQuadrant.fr See also ensuing discussion. Speaking only for myself, I think the thing I most disliked about that proposal was the syntax. I'd rather see each extension member dumped separately, and then later dump the extension itself as CREATE EXTENSION ... WITH NO CONTENTS or similar followed by ALTER EXTENSION ... ADD <item> for each member. That would provide a way of handling dependency loops, which Dimitri's proposed syntax did not, and just in general seems more elegant. But it's not perfect: for example, there's no clean way to handle the situation where the extension is present in the filesystem on the old database but not the new, or vice versa, and I don't think anyone's proposed *any* really clean way of handling that yet. Fundamentally, I think this is a pretty hard problem. The OS-level equivalent of extensions is something like RPMs or .deb files, and I can't help but observe that those are only used for system-wide installations, not per-user installs. I think the reason we're having a hard time coming up with a satisfactory way of making this work is that an extension as installed from SQL using libpq is a pretty different beast from an extension as installed via the filesystem, and bending the existing mechanism to make that work is somewhat painful no matter how you do it. The argument was made then, and with some validity, that we just shouldn't make the same mechanism serve both purposes. What I now understand (that I think I probably didn't fully understand back then) is that part of the point here is to enable installation of extensions without requiring local filesystem access; using a completely different mechanism would defeat that goal. But I'm still not altogether happy with where that's landed us. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
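Spelled out, a dump under that scheme might read as follows. The WITH NO CONTENTS clause is the hypothetical piece; ALTER EXTENSION ... ADD already exists:

    CREATE TYPE pair AS (k text, v text);
    CREATE FUNCTION pair(text, text) RETURNS pair
      LANGUAGE sql AS 'SELECT ROW($1, $2)::pair';

    CREATE EXTENSION pair WITH NO CONTENTS;   -- hypothetical syntax
    ALTER EXTENSION pair ADD TYPE pair;
    ALTER EXTENSION pair ADD FUNCTION pair(text, text);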
* Robert Haas (robertmhaas@gmail.com) wrote: > Speaking only for myself, I think the thing I most disliked about that > proposal was the syntax. I'd rather see each extension member dumped > separately, and then later dump the extension itself as CREATE > EXTENSION ... WITH NO CONTENTS or similar followed by ALTER EXTENSION > ... ADD <item> for each member. That would provide a way of handling > dependency loops, which Dimitri's proposed syntax did not, and just in > general seems more elegant. But it's not perfect: for example, > there's no clean way to handle the situation where the extension is > present in the filesystem on the old database but not the new, or > vice versa, and I don't think anyone's proposed *any* really clean > way of handling that yet. I'm not convinced we really need to solve that problem, but one way to solve it 'cleanly' would be to separate the two types of extensions, perhaps by calling them by different names or inventing a namespace for extensions. > Fundamentally, I think this is a pretty hard problem. The OS-level > equivalent of extensions is something like RPMs or .deb files, and I > can't help but observe that those are only used for system-wide > installations, not per-user installs. I think the reason we're having > a hard time coming up with a satisfactory way of making this work is > that an extension as installed from SQL using libpq is a pretty > different beast from an extension as installed via the filesystem, and > bending the existing mechanism to make that work is somewhat painful > no matter how you do it. The argument was made then, and with some > validity, that we just shouldn't make the same mechanism serve both > purposes. What I now understand (that I think I probably didn't fully > understand back then) is that part of the point here is to enable > installation of extensions without requiring local filesystem access; > using a completely different mechanism would defeat that goal. But > I'm still not altogether happy with where that's landed us. I think we're falling into the trap of thinking that whatever this user-installable-collection-of-objects thing is, it must be considered PG 'extensions'. While I agree that some of the things we do for extensions should also be done with these collections of objects (eg: having versions and other meta-data for them), I'm starting to think that's the "small" side of this whole equation and duplicating that meta-data store for these collections would be easier than trying to shoehorn them into the existing notion of 'extensions'. Thanks, Stephen
Robert Haas <robertmhaas@gmail.com> writes: > Speaking only for myself, I think the thing I most disliked about that > proposal was the syntax. I'd rather see each extension member dumped > separately, and then later dump the extension itself as CREATE > EXTENSION ... WITH NO CONTENTS or similar followed by ALTER EXTENSION > ... ADD <item> for each member. That would provide a way of handling > dependency loops, which Dimitri's proposed syntax did not, and just in > general seems more elegant. But it's not perfect: for example, I could have fixed the syntax quite easily, within the 9.3 time frame. > there's no clean way to handle the situation where the extension is > present in the filesystem on the old database but not the new, or > vice versa, and I don't think anyone's proposed *any* really clean > way of handling that yet. Well, my memory is that I did propose thinking about upgrade paths mixing template sources, and Tom objected quite strongly to doing that. IIRC the line of thought is that it's indeed a very complex problem to solve, and renaming the extension when you switch your distribution model might be all you need. Same with incompatible major versions, when there's no integrated upgrade path possible. > Fundamentally, I think this is a pretty hard problem. The OS-level > equivalent of extensions is something like RPMs or .deb files, and I > can't help but observe that those are only used for system-wide > installations, not per-user installs. I think the reason we're having If you want to dive into system-level unprivileged package management, have a look at this: http://www.gnu.org/software/guix/ > a hard time coming up with a satisfactory way of making this work is > that an extension as installed from SQL using libpq is a pretty > different beast from an extension as installed via the filesystem, and > bending the existing mechanism to make that work is somewhat painful > no matter how you do it. The argument was made then, and with some > validity, that we just shouldn't make the same mechanism serve both > purposes. What I now understand (that I think I probably didn't fully > understand back then) is that part of the point here is to enable > installation of extensions without requiring local filesystem access; > using a completely different mechanism would defeat that goal. But > I'm still not altogether happy with where that's landed us. I'd like to better understand what is so wrong about the current design, in terms that I don't feel we already addressed a year ago. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Stephen Frost <sfrost@snowman.net> writes: > I'm not convinced we really need to solve that problem, but one way to > solve it 'cleanly' would be to separate the two types of extensions, > perhaps by calling them by different names or inventing a namespace for > extensions. My understanding is that this line of thought gave us Extension Templates, which are templates, not extensions. > I think we're falling into the trap of thinking that whatever this > user-installable-collection-of-objects thing is, it must be considered > PG 'extensions'. While I agree that some of the things we do for > extensions should also be done with these collections of objects (eg: > having versions and other meta-data for them), I'm starting to think > that's the "small" side of this whole equation and duplicating that > meta-data store for these collections would be easier than trying to > shoehorn them into the existing notion of 'extensions'. My main question when thinking that way is: how do you update from one version to another? The point about extensions is that we separate the author who maintains the upgrade scripts from the DBA who operates the upgrades. It seems to me it's important to keep that property. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
* Dimitri Fontaine (dimitri@2ndQuadrant.fr) wrote: > Stephen Frost <sfrost@snowman.net> writes: > > I'm not convinced we really need to solve that problem, but one way to > > solve it 'cleanly' would be to separate the two types of extensions, > > perhaps by calling them by different names or inventing a namespace for > > extensions. > > My understanding is that this line of thought gave us Extension > Templates, which are templates, not extensions. Errr. "Extension Templates" aren't collections of objects in the database which can be used directly by users. That's what extensions are, except that they have to currently be installed from files on the filesystem. > > I think we're falling into the trap of thinking that whatever this > > user-installable-collection-of-objects thing is, it must be considered > > PG 'extensions'. While I agree that some of the things we do for > > extensions should also be done with these collections of objects (eg: > > having versions and other meta-data for them), I'm starting to think > > that's the "small" side of this whole equation and duplicating that > > meta-data store for these collections would be easier than trying to > > shoehorn them into the existing notion of 'extensions'. > > My main question when thinking that way is: how do you update from one > version to another? You pull down the new "collection-of-objects" pgxn module (or whatever) and find the script that says "go from X.Y to Y.Z" and then run it, and it runs the necessary add-to-collection/remove-from-collection magic through libpq. Note that there would be a tool to do this, of course.. But my point is that it'd be a *client-side* tool, not something which the backend does by reaching out to the internet and downloading files. > The point about extensions is that we separate the author who maintains > the upgrade scripts from the DBA who operates the upgrades. It seems to > me it's important to keep that property. I agree. Thanks, Stephen
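As a sketch of that client-side flow, with all names invented: the tool keeps its own bookkeeping table in the target database, reads the installed version, replays the right upgrade script over libpq, and records the result:

    -- the tool's own metadata, not a system catalog
    CREATE TABLE IF NOT EXISTS pgxn_installed
      (name text PRIMARY KEY, version text NOT NULL);

    -- the tool reads the installed version ...
    SELECT version FROM pgxn_installed WHERE name = 'pair';
    -- ... sends the statements from pair--1.0--1.1.sql, then records:
    UPDATE pgxn_installed SET version = '1.1' WHERE name = 'pair';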
On 3 December 2013 02:02, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Stephen Frost <sfrost@snowman.net> writes: >> On the other hand, I can appreciate the concern that we don't really >> want a dump/restore to include the extension definition when it's >> already on the filesystem. That said, it amazes me that we don't >> include the version # of the extension in pg_dump's 'CREATE EXTENSION' >> command.. How is that not a problem? > > Including the version number would be a problem. > > When you install PostgreSQL 9.1, you only have hstore 1.0. > When you install PostgreSQL 9.2, you only have hstore 1.1. > When you install PostgreSQL 9.3, you only have hstore 1.2. ISTM that the real solution to this particular problem is to decouple the extensions that are currently in contrib from a specific postgres version. We have an extension mechanism now, and a distribution mechanism (which people may or may not like, personally I'd still rather install rpms) so why do we still need to ship these things as a blessed bundle which is tied to a specific release? If things were split out, it would be much easier for extension authors to maintain branches targeting different major versions of pgsql. Users could then upgrade those separately from upgrading the major db version. If this were considered seriously, we could still package up a contrib tarball with the same set as we have now, and for testing we could either teach the buildfarm client to pull the contrib modules from their respective homes, or replace the contrib dirs with git submodule refs. I suppose there's a case to be made that there will always be some extensions which are inseparable - plpgsql is the obvious case - but we do go to a fair bit of effort to keep backward compatibility in that case. So the version number issue isn't as much of a deal. Most extensions we'd expect to be more fluid, and we'd expect users to upgrade at (perhaps) a different rate to the main db. Is there a case for a small number of "built-in" extensions being versionless, but everything else requiring a version number AND dumping out with the version number? Cheers Tom
Tom Dunstan <pgsql@tomd.cc> writes: > On 3 December 2013 02:02, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: >> Stephen Frost <sfrost@snowman.net> writes: >>> On the other hand, I can appreciate the concern that we don't really >>> want a dump/restore to include the extension definition when it's >>> already on the filesystem. That said, it amazes me that we don't >>> include the version # of the extension in pg_dump's 'CREATE EXTENSION' >>> command.. How is that not a problem? >> Including the version number would be a problem. >> >> When you install PostgreSQL 9.1, you only have hstore 1.0. >> When you install PostgreSQL 9.2, you only have hstore 1.1. >> When you install PostgreSQL 9.3, you only have hstore 1.2. > ISTM that the real solution to this particular problem is to decouple > the extensions that are currently in contrib from a specific postgres > version. "Problem"? It's not a bug that you get hstore 1.2 when you dump from 9.2 and reload into 9.3; that's a feature. You wanted an upgrade, presumably, or you'd not have been going to 9.3 in the first place. The entire reason why the extension mechanism works like it does is to allow that sort of reasonably-transparent upgrade. It would not be a step forward to break that by having pg_dump prevent it (which it would fail to do anyway, likely, since the receiving installation might not have 1.1 available to install). regards, tom lane
On 3 December 2013 12:12, Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Problem"? It's not a bug that you get hstore 1.2 when you dump from 9.2 > and reload into 9.3; that's a feature. You wanted an upgrade, presumably, > or you'd not have been going to 9.3 in the first place. The entire reason > why the extension mechanism works like it does is to allow that sort of > reasonably-transparent upgrade. It would not be a step forward to break > that by having pg_dump prevent it (which it would fail to do anyway, > likely, since the receiving installation might not have 1.1 available > to install). Well, I was suggesting further down that extensions which we'd like to consider built-in because they make guarantees about backwards compatibility should simply be versionless, since we expect a transparent upgrade, and as you point out the old version isn't likely to be there. So maybe hstore wasn't a good example. Those should continue to be dumped without versions. (The counter argument is that if hstore 1.2 has awesome new features, why should people wait until the next major version to enjoy them, unless they depend on something changing in core?) Extensions in contrib live in a weird place. Totally builtin stuff should obviously be dumped without versions, and stuff which is completely separate and follows its own release schedule should obviously be versioned. I guess we consider all modules in contrib to offer the same transparent upgrade guarantees as builtins, so they shouldn't be versioned, but it feels like some of them should be, if only because some aren't particularly tied in to the backend all that tightly. But I guess that's a bogus metric, the true metric is whether we want people to treat them as basically built-in, with the upgrade guarantees that go along with that. Cheers Tom
On Sun, 2013-12-01 at 15:48 +0100, Dimitri Fontaine wrote: > Jeff Davis <pgsql@j-davis.com> writes: > > I don't see why we are trying to accommodate a case where the author > > doesn't offer enough full SQL scripts and offers broken downgrade > > scripts; or why that case is different from offering broken upgrade > > scripts. > > That's fair enough I guess. I will work on automating the choice of the > first full script to use then, for next patch version. Can we separate this feature out? It's an issue with extensions today, and I'm eager to make some progress after the explosion of differing opinions today. Robert, do you think this is an acceptable approach to solve your pet peeve here: http://www.postgresql.org/message-id/CA+Tgmoae3Qs4QbQfxOUzZFxRSxA0zy8ibSOYSuuTzDUMPeAkAg@mail.gmail.com As an aside, I'm still not entirely sure why pg_dump omits the version of the extension. It seems to introduce unnecessary ways to fail during restore -- e.g. you've updated to (and depend on) 1.1, but restore will try to restore using an old default of 1.0. Is it so that extension authors can drop support for old SQL scripts and the restore will just find the newer one? If the reason the extension version is left out is because we can't find the upgrade path to the requested version, this feature should fix that. Regards, Jeff Davis
On Mon, 2013-12-02 at 11:07 +0200, Heikki Linnakangas wrote: > On 12/02/2013 05:34 AM, Stephen Frost wrote: > >> 2. When 9.4 gets released, we need some solid advice for extension > >> authors. If they have a native shared library, I assume we just tell > >> them to keep using the file-based templates. But if they have a SQL-only > >> extension, do we tell them to port to the in-catalog templates? What if > >> they port to in-catalog templates, and then decide they just want to > >> optimize one function by writing it in native code? Do they have to port > >> back? What should the authors of SQL-only extensions distribute on PGXN? > >> Should there be a migration period where they offer both kinds of > >> templates until they drop support for 9.3? > > > > This is one of the main things that I think Heikki was trying to drive > > at with his comment- we really don't *want* to make extension authors > > have to do anything different than what they do today. With an external > > tool, they wouldn't need to and it would just be two different ways for > > an extension to be installed into a given database. In the end though, > > if we're telling people to 'port' their extensions, then I think we've > > already lost. > > Exactly. > > There should be no difference between file-based extensions and > catalog-based extensions. It's just two different ways to install the > same extension. The extension author doesn't need to care about that, > it's the DBA that decides which method to use to install it. > > I'm going to object loudly to any proposal that doesn't meet that criteria. But we're arguably already in this position today. For a SQL-only extension, the author can choose between:

1. Using a schema/namespace
   a. installable over libpq
   b. installable by non-superusers
   c. no special handling when it comes to administration

2. Using an extension
   a. convenient metadata (e.g. "requires")
   b. upgrade scripts
   c. omitted from pg_dump so can be managed separately
   d. relocatable
   e. not tied to one schema

And if the author decides to change, it requires porting the extension to the other form. Note: I'm using "extension" loosely here. We might call the SQL-only, user-installable variety something else. So how do we get to the point where we have clear advice to the author of a SQL-only extension? And how do we do that without asking them to port anything? Stephen mentioned using external tools and/or metadata, but to me that sounds like it would require porting the extension away from what's on PGXN today. Regards, Jeff Davis
On 12/03/2013 09:25 AM, Jeff Davis wrote: > On Mon, 2013-12-02 at 11:07 +0200, Heikki Linnakangas wrote: >> There should be no difference between file-based extensions and >> catalog-based extensions. It's just two different ways to install the >> same extension. The extension author doesn't need to care about that, >> it's the DBA that decides which method to use to install it. >> >> I'm going to object loudly to any proposal that doesn't meet that criteria. > > But we're arguably already in this position today. For a SQL-only > extension, the author can choose between:
>
> 1. Using a schema/namespace
>    a. installable over libpq
>    b. installable by non-superusers
>    c. no special handling when it comes to administration
>
> 2. Using an extension
>    a. convenient metadata (e.g. "requires")
>    b. upgrade scripts
>    c. omitted from pg_dump so can be managed separately
>    d. relocatable
>    e. not tied to one schema
>
> And if the author decides to change, it requires porting the extension > to the other form. > > Note: I'm using "extension" loosely here. We might call the SQL-only, > user-installable variety something else. Good point. It's not too hard to install an "extension" written as an extension as plain schema objects, though. You can just run the .sql script through psql. That's what you used to do before extensions were invented. (the scripts in contrib contain an explicit check against that, but I don't think that's common outside contrib) Another perspective is that that's already a situation we'd rather not have. Let's not make it worse by introducing a third way to install an extension, which again requires the extension author to package the extension differently. > So how do we get to the point where we have clear advice to the author > of a SQL-only extension? And how do we do that without asking them to > port anything? Yeah, that's the crucial question of this whole thread. > Stephen mentioned using external tools and/or metadata, but to me that > sounds like it would require porting the extension away from what's on > PGXN today. Why? The external tool can pick the extension in its current form from PGXN, and install it via libpq. The tool might have to jump through some hoops to do it, and we might need some new backend functionality to support it, but I don't see why the extension author needs to do anything. That said, it might make the tool easier to write if we place some new requirements for extension authors. Like, stipulate that the .sql file is in the top-level directory of the extension tarball. But the same extension would still be installable with "make; make install". - Heikki
On Tue, Dec 3, 2013 at 1:31 AM, Jeff Davis <pgsql@j-davis.com> wrote: > On Sun, 2013-12-01 at 15:48 +0100, Dimitri Fontaine wrote: >> Jeff Davis <pgsql@j-davis.com> writes: >> > I don't see why we are trying to accommodate a case where the author >> > doesn't offer enough full SQL scripts and offers broken downgrade >> > scripts; or why that case is different from offering broken upgrade >> > scripts. >> >> That's fair enough I guess. I will work on automating the choice of the >> first full script to use then, for next patch version. > > Can we separate this feature out? It's an issue with extensions today, > and I'm eager to make some progress after the explosion of differing > opinions today. +1 for separating that part out. I thought it was separated, at some point. > Robert, do you think this is an acceptable approach to solve your pet > peeve here: > > http://www.postgresql.org/message-id/CA+Tgmoae3Qs4QbQfxOUzZFxRSxA0zy8ibSOYSuuTzDUMPeAkAg@mail.gmail.com I'd need to look at exactly what's being proposed in more detail. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Dec 2, 2013 at 7:46 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> Just tossing an idea out there. What if you could install an extension >> by specifying not a local file name but a URL. Obviously there's a >> security issue but for example we could allow only https URLs with >> verified domain names that are in a list of approved domain names >> specified by a GUC. > > That's a different feature, but I don't see anything preventing > someone from implementing that as an extension, today, without any > core support at all. It would only be usable in cases where the share > directory is writable by the database server (i.e. low-security > installations) and you'd have to make it a function call rather than > piggybacking on CREATE EXTENSION, but neither of those things sound > bad to me. (And if they are bad, they could be addressed by providing > hooks or event triggers, leaving the rest of the functionality in the > extension module.) Well, all of this is implementable as an extension if you have write access to the database server's share directory. This is all about UI. CREATE EXTENSION is about having the core do the bookkeeping about which files belong to which version of which extension. I thought the fundamental problem the "in-catalog" extensions were trying to solve were the issue with not having access to the filesystem. If that's the case then being able to say create extension from http://... would solve that. If the fundamental problem is that you want multi-tenant databases to be able to have different .so files visible depending on which database is opened then that's a bit trickier. -- greg
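As a sketch of what that could look like (neither the URL form nor the whitelist GUC exists; today's FROM clause takes an old version name, not a URL):

    -- hypothetical whitelist of approved https domains:
    SET extension_template_hosts = 'pgxn.org';
    -- hypothetical URL form of CREATE EXTENSION:
    CREATE EXTENSION pair FROM 'https://pgxn.org/dist/pair/pair--1.0.sql';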
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > > On 3 December 2013 02:02, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > > ISTM that the real solution to this particular problem is to decouple > > the extensions that are currently in contrib from a specific postgres > > version. > > "Problem"? It's not a bug that you get hstore 1.2 when you dump from 9.2 > and reload into 9.3; that's a feature. You wanted an upgrade, presumably, I don't buy this argument at *all* and it's not going to fly when we've got multiple versions of an extension available concurrently. I'm willing to accept that we have limitations from a packaging perspective (today) around extensions, but the notion that my backup utility should intentionally omit information which is required to restore the database to the same state it was in is ridiculous. > or you'd not have been going to 9.3 in the first place. This notion that a given major version of PG only ever has one version of an extension associated with it is *also* wrong and only makes any sense for contrib extensions- which are the exception rather than the rule today. > The entire reason > why the extension mechanism works like it does is to allow that sort of > reasonably-transparent upgrade. It would not be a step forward to break > that by having pg_dump prevent it (which it would fail to do anyway, > likely, since the receiving installation might not have 1.1 available > to install). I agree that there should be a way to *allow* such an upgrade to happen transparently and perhaps we keep it the way it is for contrib extensions as a historical artifact, but other extensions are independent of the PG major version and multiple versions will be available concurrently for them and having pg_dump willfully ignore the extension version is a recipe for broken backups. Step back and consider a user who is just trying to restore his backup of his 9.2 database into a new server, also with 9.2, as quickly as he can to get his system online again. Having four different versions of extension X installed and available for 9.2, no clue or information about which version was installed into which databases and getting mysterious failures and errors because they're not all compatible is not what anyone is going to want to deal with in that situation. I certainly don't see extensions (outside of contrib) in the general sense as being either tied to specific PG versions or being required to maintain the same API that they started with on day 1. Thanks, Stephen
Robert Haas <robertmhaas@gmail.com> writes: >> Can we separate this feature out? It's an issue with extensions today, >> and I'm eager to make some progress after the explosion of differing >> opinions today. > > +1 for separating that part out. I thought it was separated, at some point. http://www.postgresql.org/message-id/CALtqXTetVi-eXhdBSpUey3TrthuG51esWUOd8cUR2t+RxtgEKg@mail.gmail.com http://www.postgresql.org/message-id/m2r4k8jpfl.fsf@2ndQuadrant.fr The only way to bugfix all the reported problems was to have regression testing… and it became a necessary dependency of the extension templates patch, so I just included it. My interdependent-git-branches development fu seems to have totally disappeared after the main extension patch, which needed 7 of those… > I'd need to look exactly what's being proposed in more detail. What I did propose is a new control-file parameter, default_full_version:

+ <term><varname>default_full_version</varname> (<type>string</type>)</term>
+ <listitem>
+  <para>
+   This option allows an extension author to avoid shipping all versions
+   of all scripts when shipping an extension. When a version is requested
+   and the matching script does not exist on disk,
+   set <replaceable>default_full_version</replaceable> to the first
+   script you still ship and PostgreSQL will apply the intermediate
+   upgrade script as per the <command>ALTER EXTENSION UPDATE</command>
+   command.
+  </para>
+  <para>
+   For example, say you did provide the extension <literal>pair</literal>
+   version <literal>1.0</literal> and are now providing the
+   version <literal>1.1</literal>. If you want both current and new users
+   to be able to install the new version, you can provide both the
+   scripts <literal>pair--1.0--1.1.sql</literal>
+   and <literal>pair--1.1.sql</literal>, adding to the already
+   existing <literal>pair--1.0.sql</literal>.
+  </para>
+  <para>
+   When specifying <literal>default_version</literal>
+   and <literal>default_full_version = 1.0</literal> you can instead
+   provide only the scripts <literal>pair--1.0.sql</literal>
+   and <literal>pair--1.0--1.1.sql</literal>. The <command>CREATE
+   EXTENSION pair;</command> will then automatically use the
+   aforementioned scripts to install version 1.0 then update it to 1.1.
+  </para>
+ </listitem>

What Jeff is proposing is to simplify that down and have PostgreSQL auto-discover the upgrade cycle when the version asked for isn't directly available with a creation script. We would keep the behavior depicted here, just in a fully automated way. Working on a separate patch for that, then. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
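Put in terms of the files on disk, the proposed pair example would look roughly like this (a sketch following the patch's parameter name; nothing here is committed behavior):

    # pair.control
    default_version = '1.1'
    default_full_version = '1.0'

    -- shipped scripts: pair--1.0.sql and pair--1.0--1.1.sql only; then
    CREATE EXTENSION pair;
    -- runs pair--1.0.sql followed by pair--1.0--1.1.sql, ending at 1.1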
* Tom Dunstan (pgsql@tomd.cc) wrote: > Extensions in contrib live in a weird place. Totally builtin stuff > should obviously be dumped without versions, and stuff which is > completely separate and follows its own release schedule should > obviously be versioned. I guess we consider all modules in contrib to > offer the same transparent upgrade guarantees as builtins, so they > shouldn't be versioned, but it feels like some of them should be, if > only because some aren't particularly tied in to the backend all that > tightly. But I guess that's a bogus metric, the true metric is whether > we want people to treat them as basically built-in, with the upgrade > guarantees that go along with that. Note that we don't actually make guarantees about either builtins or contrib modules when it comes to major version upgrades. The current way we package contrib and how we tie contrib releases to PG releases means that we can get away with omitting the version and saying "well, if you restore to the same PG major version then you'll get the same contrib extension version, and if you restore into a different version then obviously you want the version of contrib with that major version" but that *only* works for contrib. We need to accept that other extensions exist and that they aren't tied to PG major versions nor to our release schedule. There are a lot more extensions out there today which are *not* in contrib than there are ones which *are*. Thanks, Stephen
* Jeff Davis (pgsql@j-davis.com) wrote: > Stephen mentioned using external tools and/or metadata, but to me that > sounds like it would require porting the extension away from what's on > PGXN today. Not at all- and that'd be the point. An external tool could take the PGXN extension, run 'make', then 'make install' (into a userland directory), extract out the script and then, with a little help from PG, run that script in "extension creation mode" via libpq. Another option, which I generally like better, is to have a new package format for PGXN that contains the results of "make install", more-or-less, analogous to Debian source vs. .deb packages. Perhaps we could even have psql understand that format and be able to install the extension via a backslash command instead of having an external tool, but I think an external tool for dependency tracking and downloading of necessary dependencies a la Debian would be better than teaching psql to do that. Thanks, Stephen
* Greg Stark (stark@mit.edu) wrote: > I thought the fundamental problem the "in-catalog" extensions were > trying to solve were the issue with not having access to the > filesystem. If that's the case then being able to say create extension > from http://... would solve that. That's not really 'solved' unless you feel we can depend on that "create extension from URL" to work at pg_restore time... I wouldn't have guessed that people would accept that, but I've already been wrong about such things in this thread once. Thanks, Stephen
On Tue, Dec 3, 2013 at 8:44 AM, Stephen Frost <sfrost@snowman.net> wrote: > * Tom Lane (tgl@sss.pgh.pa.us) wrote: >> > On 3 December 2013 02:02, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: >> > ISTM that the real solution to this particular problem is to decouple >> > the extensions that are currently in contrib from a specific postgres >> > version. >> >> "Problem"? It's not a bug that you get hstore 1.2 when you dump from 9.2 >> and reload into 9.3; that's a feature. You wanted an upgrade, presumably, > > I don't buy this argument at *all* and it's not going to fly when we've > got multiple versions of an extension available concurrently. I'm > willing to accept that we have limitations from a packaging > perspective (today) around extensions, but the notion that my > backup utility should intentionally omit information which is required > to restore the database to the same state it was in is ridiculous. > >> or you'd not have been going to 9.3 in the first place. > > This notion that a given major version of PG only ever has one version > of an extension associated with it is *also* wrong and only makes any > sense for contrib extensions- which are the exception rather than the > rule today. > >> The entire reason >> why the extension mechanism works like it does is to allow that sort of >> reasonably-transparent upgrade. It would not be a step forward to break >> that by having pg_dump prevent it (which it would fail to do anyway, >> likely, since the receiving installation might not have 1.1 available >> to install). > > I agree that there should be a way to *allow* such an upgrade to happen > transparently and perhaps we keep it the way it is for contrib > extensions as a historical artifact, but other extensions are > independent of the PG major version and multiple versions will be > available concurrently for them and having pg_dump willfully ignore the > extension version is a recipe for broken backups. > > Step back and consider a user who is just trying to restore his backup > of his 9.2 database into a new server, also with 9.2, as quickly as he > can to get his system online again. Having four different versions of > extension X installed and available for 9.2, no clue or information > about which version was installed into which databases and getting > mysterious failures and errors because they're not all compatible is not > what anyone is going to want to deal with in that situation. I think Tom's original idea here was that new versions of extensions *shouldn't ever* be backward-incompatible, and therefore if this problem arises it's the extension author's fault. It isn't however clear that this dream is likely to be realized in practice. For example, the only difference between hstore 1.0 and hstore 1.1 is that we dropped the => operator, for the very good reason that we have been slowly working towards deprecating => as an operator name so that we can eventually use it for the purpose that the SQL standard specifies. Given that we've done it in core, we can hardly say that no one will ever do this anywhere else. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
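For reference, the incompatible change in question boiled down to a couple of lines in hstore's 1.0-to-1.1 upgrade script (paraphrased from memory, not the verbatim contrib script):

    ALTER EXTENSION hstore DROP OPERATOR => (text, text);
    DROP OPERATOR => (text, text);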
On Tue, Dec 3, 2013 at 8:43 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > What Jeff is proposing is to simplify that down and have PostgreSQL auto > discover the upgrade cycle when the version asked for isn't directly > available with a creation script. > > We would keep the behavior depicted here, just in a fully automated way. > > Working on a separate patch for that, then. I like the idea of making it automatic, but it won't work in all cases. For example, suppose someone ships 1.0, 1.0--1.2, 1.1, and 1.1--1.2. Since versions aren't intrinsically ordered, the system has no way of knowing whether it's preferable to run 1.0 and then 1.0--1.2 or instead run 1.1 and then 1.1--1.2. So I think we will need either to introduce a way of ordering version numbers (which Tom has previously opposed) or some concept like your default_full_version. In more normal cases, however, the system can (and probably should) figure out what was intended by choosing the *shortest* path to get to the intended version. For example, if someone ships 1.0, 1.0--1.1, 1.1, and 1.1--1.2, the system should choose to run 1.1 and then 1.1--1.2, not 1.0 and then 1.0--1.1 and then 1.1--1.2. But that can be automatic: only if there are two paths of equal length (as in the example in the previous paragraph) do we need help from the user to figure out what to do. We should also consider the possibility of a user trying to deliberately install an older release. For example, if the user has 1.0, 1.0--1.1, 1.1, 1.1--1.2, and 1.2--1.0 (a downgrade script) with default_full_version = 1.2, an attempt to install 1.0 should run just the 1.0 script, NOT 1.2 and then 1.2--1.0. Putting all that together, I'm inclined to suggest that what we really need is a LIST of version numbers, rather than just one. If one path to the version we're installing is shorter than any other, we choose that, period. If there are multiple paths of equal length, we break the tie by choosing which version number appears first in the aforementioned list. If that still doesn't break the tie, either because none of the starting points are mentioned in that list or because there are multiple equal-length paths starting in the same place, we give up and emit an error. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
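A worked sketch of the shortest-path rule, assuming an extension foo that ships exactly these scripts:

    -- on disk: foo--1.0.sql, foo--1.0--1.1.sql, foo--1.1.sql, foo--1.1--1.2.sql
    CREATE EXTENSION foo VERSION '1.2';
    -- chosen:   foo--1.1.sql, then foo--1.1--1.2.sql               (two steps)
    -- rejected: foo--1.0.sql, foo--1.0--1.1.sql, foo--1.1--1.2.sql (three steps)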
Stephen Frost <sfrost@snowman.net> writes: > That's not really 'solved' unless you feel we can depend on that "create > extension from URL" to work at pg_restore time... I wouldn't have > guessed that people would accept that, but I've already been wrong about > such things in this thread once. Basically, with the extra software I want to build out-of-core, what you have is an externally maintained repository and the scripts are downloaded at CREATE EXTENSION time. With the Extension Template, you then have a solid cache you can rely on at pg_restore time. -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
* Dimitri Fontaine (dimitri@2ndQuadrant.fr) wrote: > Stephen Frost <sfrost@snowman.net> writes: > > That's not really 'solved' unless you feel we can depend on that "create > > extension from URL" to work at pg_restore time... I wouldn't have > > guessed that people would accept that, but I've already been wrong about > > such things in this thread once. > > Basically, with the extra software I want to build out-of-core, what you > have is an externally maintained repository and the scripts are > downloaded at CREATE EXTENSION time. I should have included above "unless we actually dump the extension objects out during pg_dump." > With the Extension Template, you then have a solid cache you can rely on > at pg_restore time. Extension templates are not needed for us to be able to dump and restore extensions. I do not understand why you continue to argue for extension templates as a solution to that problem when it's utter overkill and far worse than just dumping the extension objects out at pg_dump time and having a way to add them back as part of the extension on restore. I understand that you once proposed that and it was shot down but I think we need to move past that now that we've seen what the alternative is.. That isn't to say anything about the code or about you specifically, but, for my part, I really don't like nor see the value of sticking script blobs into the catalog as some kind of representation of database objects. Thanks, Stephen
Robert Haas <robertmhaas@gmail.com> writes: > In more normal cases, however, the system can (and probably should) > figure out what was intended by choosing the *shortest* path to get to > the intended version. For example, if someone ships 1.0, 1.0--1.1, > 1.1, and 1.1--1.2, the system should choose to run 1.1 and then > 1.1--1.2, not 1.0 and then 1.0--1.1 and then 1.1--1.2. But that can > be automatic: only if there are two paths of equal length (as in the > example in the previous paragraph) do we need help from the user to > figure out what to do. Yeah. > We should also consider the possibility of a user trying to > deliberately install an older release. For example, if the user has > 1.0, 1.0--1.1, 1.1, 1.1--1.2, and 1.2--1.0 (a downgrade script) with > default_full_version = 1.2, an attempt to install 1.0 should run just > the 1.0 script, NOT 1.2 and then 1.2--1.0. In what I did, if you want version 1.0 and we have a script --1.0.sql around, then we just use that script, never kicking the path chooser. The path chooser at CREATE EXTENSION time is only exercised when we don't have a direct script to support the specific version you're asking for. > Putting all that together, I'm inclined to suggest that what we really > need is a LIST of version numbers, rather than just one. If one > path to the version we're installing is shorter than any other, we > choose that, period. If there are multiple paths of equal length, we That's what Jeff did propose, yes. > break the tie by choosing which version number appears first in the > aforementioned list. If that still doesn't break the tie, either > because none of the starting points are mentioned in that list or > because there are multiple equal-length paths starting in the same > place, we give up and emit an error. Jeff also mentioned tiebreakers, without entering into any level of detail. We won't be able to just use default_version as the tiebreaker list here, because of the following example:

default_version = 1.2, 1.0

create extension foo version '1.1';

With such a setup it would prefer 1.2--1.1 to 1.0--1.1, which doesn't look like what we want. Instead, we want

default_version = 1.2
create_from_version_candidates = 1.0

create extension foo version '1.1';

Then the tie breaker is the 1.0 in "create_from_version_candidates" so we would run foo--1.0.sql and then foo--1.0--1.1.sql. Comments? Barring objections, I'm going to prepare a new branch to support developing that behavior against only file-based extensions, and submit a spin-off patch to the current CF entry. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Stephen Frost <sfrost@snowman.net> writes: > I understand that you once proposed that and it was shot down but I > think we need to move past that now that we've seen what the alternative > is.. That isn't to say anything about the code or about you > specifically, but, for my part, I really don't like nor see the value of > sticking script blobs into the catalog as some kind of representation of > database objects. Well yeah, I'm having quite a hard time withdrawing that proposal, which is the fourth one in three years, and that had been proposed to me on this very mailing list, and got the infamous "community buy-in" about a year ago. It's always a hard time when you're being told that the main constraints you had to work with suddenly are no more, because after all this work, we realize that imposing those constraints actually made no sense. I understand that it can happen, it still really sucks when it does. <delusional paragraph, censored for lack of humour (incl. sarcasm)> Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Dec 3, 2013, at 9:14 AM, Dimitri Fontaine <dimitri@2ndQuadrant.fr> wrote: > I understand that it can happen, it still really sucks when it does. > > <delusional paragraph, censored for lack of humour (incl. sarcasm)> I have not followed this project closely, Dimitri, but I for one have appreciated your tenacity in following through on it. Extensions are awesome, thanks to you, and I’m happy to see all efforts to make it more so. Thank you. Best, David
On Tue, 2013-12-03 at 10:08 +0200, Heikki Linnakangas wrote: > Another perspective is that that's already a situation we'd rather not > have. Let's not make it worse by introducing a third way to install an > extension, which again requires the extension author to package the > extension differently. +1. > Why? The external tool can pick the extension in its current form from > PGXN, and install it via libpq. The tool might have to jump through some > hoops to do it, and we might need some new backend functionality to > support it, but I don't see why the extension author needs to do anything. Ideally, it would end up the same whether it was installed by either method -- the same entries in pg_extension, etc. I assume that's what you mean by "new backend functionality". This sounds like Inline Extensions to me, which was previously proposed. If I recall, that proposal trailed off because of issues with dump/reload. If you dump the contents of the extension, it's not really an extension; but if you don't, then the administrator can't back up the database (because he might not have all of the extension templates for the extensions installed). That's when the idea appeared for extension templates stored in the catalog, so that the administrator would always have all of the necessary templates present. A few interesting messages I found: http://www.postgresql.org/message-id/CA+TgmobJ-yCHt_utgJJL9WiiPssUAJWFd=3=uLrOB9nhBpCDOA@mail.gmail.com http://www.postgresql.org/message-id/CA+TgmoZTujw7BeyZf1DzdcRBg2DJCvtYaGEyChX8d52OE1Gqyw@mail.gmail.com http://www.postgresql.org/message-id/CA+Tgmoa_0D6Ef8upc03QX0uNhJfzozEosno_OFUCf5jGw+8D0w@mail.gmail.com http://www.postgresql.org/message-id/18054.1354751357@sss.pgh.pa.us (+Robert, Tom because I am referencing their comments) Regards, Jeff Davis
On Tue, 2013-12-03 at 09:20 -0500, Stephen Frost wrote: > * Jeff Davis (pgsql@j-davis.com) wrote: > > Stephen mentioned using external tools and/or metadata, but to me that > > sounds like it would require porting the extension away from what's on > > PGXN today. > > Not at all- and that'd be the point. An external tool could take the > PGXN extension, run 'make', then 'make install' (into a userland > directory), extract out the script and then, with a little help from PG, > run that script in "extension creation mode" via libpq. What is stopping Extension Templates, as proposed, from being this special "extension creation mode"? What would be a better design? It seems like the porting issue is just a matter of finding someone to write a tool to reliably translate packages from PGXN into a form suitable to be sent using SQL commands; which we would need anyway for this special mode. Regards, Jeff Davis
* Jeff Davis (pgsql@j-davis.com) wrote: > This sounds like Inline Extensions to me, which was previously proposed. I've not looked at that proposal very carefully, but I agree that what we're describing is a lot closer to 'inline extensions' than 'extension templates'. > If I recall, that proposal trailed off because of issues with > dump/reload. If you dump the contents of the extension, it's not really > an extension; but if you don't, then the administrator can't back up the > database (because he might not have all of the extension templates for > the extensions installed). That's when the idea appeared for extension > templates stored in the catalog, so that the administrator would always > have all of the necessary templates present. When it comes to dump/reload, I'd much rather see a mechanism which uses our deep understanding of the extension's objects (as database objects) to implement the dump/reload than a text blob which is carried forward from major version to major version and may even fail to run. I realize that's different from extension files which are out on the filesystem, but I do not see that as a bad thing. Thanks, Stephen
Stephen Frost <sfrost@snowman.net> writes: > When it comes to dump/reload, I'd much rather see a mechanism which uses > our deep understanding of the extension's objects (as database objects) > to implement the dump/reload than a text blob which is carried forward > from major version to major version and may even fail to run. Note that we're already doing that in the binary_upgrade code path. I agree that generalizing that approach sounds like a better idea than keeping a text blob around. regards, tom lane
* Jeff Davis (pgsql@j-davis.com) wrote: > On Tue, 2013-12-03 at 09:20 -0500, Stephen Frost wrote: > > * Jeff Davis (pgsql@j-davis.com) wrote: > > > Stephen mentioned using external tools and/or metadata, but to me that > > > sounds like it would require porting the extension away from what's on > > > PGXN today. > > > > Not at all- and that'd be the point. An external tool could take the > > PGXN extension, run 'make', then 'make install' (into a userland > > directory), extract out the script and then, with a little help from PG, > > run that script in "extension creation mode" via libpq. > > What is stopping Extension Templates, as proposed, from being this > special "extension creation mode"? What would be a better design? The extra catalog tables which store SQL scripts in text columns is one of my main objections to the as-proposed Extension Templates. I view those scripts as a poor man's definition of database objects which are defined properly in the catalog already. The other big issue is that there isn't an easy way to see how we could open up the ability to create extensions to non-superusers with this approach. What I think we should really be mulling over is if we need anything further when it comes to non-superuser extensions; a new namespace (eg: schemas for extensions, or maybe prefix for user extensions, or just a notion of ownership which gets combined with the name when doing operations with an extension)? a new name (not extensions, but something else)? > It seems like the porting issue is just a matter of finding someone to > write a tool to reliably translate packages from PGXN into a form > suitable to be sent using SQL commands; which we would need anyway for > this special mode. That's what I was thinking and hoping. :) Of course, we haven't yet figured out exactly what we want this special mode to look like, so it's a bit tricky to ask anyone to write such a tool. I keep thinking this should be something like: create a schema, set the search path to that schema, run the extension script more-or-less as is, then 'register' that schema as being an extension with a certain version. That 'registration' process could also handle renaming the schema, if the user wants the extension in a different schema (or perhaps the initial schema was some kind of "temporary" schema) or moving the objects into an existing schema, if that's what is requested. Thanks, Stephen
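Roughly, the flow being described might look like this from a client session; the pg_register_extension() call is entirely hypothetical and stands in for the 'registration' step:

    CREATE SCHEMA pair_tmp;
    SET search_path TO pair_tmp;
    \i pair--1.0.sql  -- run the PGXN script more-or-less as-is, via psql
    -- hypothetical: register name, version, source schema and target schema
    SELECT pg_register_extension('pair', '1.0', 'pair_tmp', 'public');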
On Tue, 2013-12-03 at 14:41 -0500, Stephen Frost wrote: > * Jeff Davis (pgsql@j-davis.com) wrote: > The extra catalog tables which store SQL scripts in text columns is one > of my main objections to the as-proposed Extension Templates. OK, that's what I thought. This seems like the root of your objection, so let's focus here. > The other big issue is that > there isn't an easy way to see how we could open up the ability to > create extensions to non-superusers with this approach. Is this because of the namespace issue, or is there another problem here, too? > What I think we should really be mulling over is if we need anything > further when it comes to non-superuser extensions; a new namespace (eg: > schemas for extensions, or maybe prefix for user extensions, or just a > notion of ownership which gets combined with the name when doing > operations with an extension)? a new name (not extensions, but > something else)? Agreed. Adding namespaces would best be done sooner rather than later. > That's what I was thinking and hoping. :) Of course, we haven't yet > figured out exactly what we want this special mode to look like, so it's > a bit tricky to ask anyone to write such a tool. I keep thinking this > should be something like: create a schema, set the search path to that > schema, run the extension script more-or-less as is, then 'register' > that schema as being an extension with a certain version. That > 'registration' process could also handle renaming the schema, if the > user wants the extension in a different schema (or perhaps the initial > schema was some kind of "temporary" schema) or moving the objects into > an existing schema, if that's what is requested. An interesting idea to rely on schemas like that, but it seems a little hackish. I'd prefer something that would be sane for a user to do without the assistance of a tool. We can still recommend that they use the PGXN format and the tool, of course. Regards, Jeff Davis
On Tue, 2013-12-03 at 14:31 -0500, Tom Lane wrote: > Stephen Frost <sfrost@snowman.net> writes: > > When it comes to dump/reload, I'd much rather see a mechanism which uses > > our deep understanding of the extension's objects (as database objects) > > to implement the dump/reload than a text blob which is carried forward > > from major version to major version and may even fail to run. > > Note that we're already doing that in the binary_upgrade code path. > I agree that generalizing that approach sounds like a better idea > than keeping a text blob around. So does this take us fully back to Inline Extensions, or is there a distinction that I'm missing? I still don't see that Extension Templates are all bad:
* They preserve the fact that two instances of the same extension (e.g. in different databases) were created from the same template.
* They mirror the file-based templates, so it seems easier to get consistent behavior.
Regards, Jeff Davis
On Tue, 2013-12-03 at 08:44 -0500, Stephen Frost wrote: > * Tom Lane (tgl@sss.pgh.pa.us) wrote: > > > On 3 December 2013 02:02, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > > > ISTM that the real solution to this particular problem is to decouple > > > the extensions that are currently in contrib from a specific postgres > > > version. > > > > "Problem"? It's not a bug that you get hstore 1.2 when you dump from 9.2 > > and reload into 9.3; that's a feature. You wanted an upgrade, presumably, > > I don't buy this argument at *all* and it's not going to fly when we've > got multiple versions of an extension available concurrently. It seems there's a use case for both behaviors; perhaps we should include it in the control information?

preserve_version_on_dump (boolean)

FWIW, I find the current behavior surprising when in the mindset of a SQL extension. But it makes sense for things more closely tied to the backend. Regards, Jeff Davis
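In control-file terms the suggestion might look like this (the parameter name is only a sketch of the proposal, not anything that exists):

    # pair.control -- preserve_version_on_dump is hypothetical
    default_version = '1.2'
    preserve_version_on_dump = true
    # pg_dump would then emit: CREATE EXTENSION pair ... VERSION '1.2';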
On Mon, 2013-12-02 at 15:44 -0500, Stephen Frost wrote: > How are we going to handle new keywords > being added in new major versions? A pg_dump of the extension template > script is then going to be loaded into the new major version but will > not actually be able to be run because it'll error out... Elsewhere in the thread you argued that the version of an extension should be preserved across dump/reload. Surely a given version of the extension corresponds to a specific set of SQL commands (specifically, the SQL text blob on PGXN), so it *should* error out. Otherwise you end up with a weird situation where upgrading a 9.4 install to 9.5 allows you to keep version 1.2 of some extension, but 1.2 won't install directly to 9.5. (By the way, I think this is a problem with pg_upgrade currently.) You're fighting pretty hard against text blobs, but at the same time saying that we should be able to fully make use of existing PGXN extensions, which contain text blobs of SQL. And extension authors are versioning their SQL blobs, not some abstract concepts internal to postgres and the catalogs. Just because we start with blobs from PGXN doesn't mean we need to use blobs everywhere; but I think you're too quick to rule them out. Regards, Jeff Davis
On Tue, 2013-12-03 at 10:23 -0500, Robert Haas wrote: > In more normal cases, however, the system can (and probably should) > figure out what was intended by choosing the *shortest* path to get to > the intended version. For example, if someone ships 1.0, 1.0--1.1, > 1.1, and 1.1--1.2, the system should choose to run 1.1 and then > 1.1--1.2, not 1.0 and then 1.0--1.1 and then 1.1--1.2. But that can > be automatic: only if there are two paths of equal length (as in the > example in the previous paragraph) do we need help from the user to > figure out what to do. Why do we need help from the user? Just pick a path. For an extension update, I understand why someone wouldn't want to accidentally downgrade 5 versions (dropping all of their dependent objects) before updating to the latest. But this doesn't apply to creation. And it just seems really awkward to document, and it's a constant maintenance burden on extension authors to specify their upgrade paths every time they release a new version. > Putting all that together, I'm inclined to suggest that what we really > need is a LIST of version numbers, rather than just one. If one > path to the version we're installing is shorter than any other, we > choose that, period. If there are multiple paths of equal length, we > break the tie by choosing which version number appears first in the > aforementioned list. If that still doesn't break the tie, either > because none of the starting points are mentioned in that list or > because there are multiple equal-length paths starting in the same > place, we give up and emit an error. That seems like extreme overkill, and still doesn't give users full control over upgrade paths. Regards, Jeff Davis
* Jeff Davis (pgsql@j-davis.com) wrote: > On Tue, 2013-12-03 at 14:31 -0500, Tom Lane wrote: > > Stephen Frost <sfrost@snowman.net> writes: > > > When it comes to dump/reload, I'd much rather see a mechanism which uses > > > our deep understanding of the extension's objects (as database objects) > > > to implement the dump/reload than a text blob which is carried forward > > > from major version to major version and may even fail to run. > > > > Note that we're already doing that in the binary_upgrade code path. > > I agree that generalizing that approach sounds like a better idea > > than keeping a text blob around. > > So does this take us fully back to Inline Extensions, or is there a > distinction that I'm missing? I've not really looked at the inline extensions patch/proposal in depth, but I do think that's a lot closer than this. As I understand it, Dimitri had a patch for this, though what I've found is the blog post. Also, there were a lot of discussions about the idea a year or so ago, including folks who haven't spoken up on this discussion. > I still don't see that Extension Templates are all bad: > * They preserve the fact that two instances of the same extension > (e.g. in different databases) were created from the same template. This is only true if we change the extension templates to be shared catalogs, which they aren't today.. > * They mirror the file-based templates, so it seems easier to get > consistent behavior. While it might seem easier and perhaps simpler, I'm not sure that I really buy into the idea that we'd actually be more consistent. Even if we are, I'm not convinced that's what we want here.. The only thing driving us in that direction is that we're calling these 'extensions' too. While I don't want five different extension-like things, I'd rather use a different name from 'extensions' if we feel that the differences between catalog-only extensions and filesystem extensions will cause 'extensions' overall to have terribly inconsistent behavior. Thanks, Stephen
* Jeff Davis (pgsql@j-davis.com) wrote: > On Mon, 2013-12-02 at 15:44 -0500, Stephen Frost wrote: > > How are we going to handle new keywords > > being added in new major versions? A pg_dump of the extension template > > script is then going to be loaded into the new major version but will > > not actually be able to be run because it'll error out... > > Elsewhere in the thread you argued that the version of an extension > should be preserved across dump/reload. Surely a given version of the > extension corresponds to a specific set of SQL commands (specifically, > the SQL text blob on PGXN), so it *should* error out. I *do* think the version should be preserved (though I admit that the argument about things released with PG does make some sense). My point above is that if we dump the text blob out and then just rerun it, it might not work, but if we use pg_dump and dump the extension out as database objects (using the newer version of pg_dump, as we always recommend..), then it's certainly more likely to work even in the face of new keywords and the like which change between releases. > Otherwise you end up with a weird situation where upgrading a 9.4 > install to 9.5 allows you to keep version 1.2 of some extension, but 1.2 > won't install directly to 9.5. (By the way, I think this is a problem > with pg_upgrade currently.) Hmm. I'll grant, that's an interesting situation to consider, but I'm trying to figure out why it's better to make it always break, both on initial installation and when doing a restore from a backup, than only have it (most likely anyway) break on initial installation (to a newer version that may not have existed originally). > You're fighting pretty hard against text blobs, but at the same time > saying that we should be able to fully make use of existing PGXN > extensions, which contain text blobs of SQL. And extension authors are > versioning their SQL blobs, not some abstract concepts internal to > postgres and the catalogs. I don't want text blobs in the backend catalogs. I'm not argueing against text blobs in general (that'd be kinda hard to do..). > Just because we start with blobs from PGXN doesn't mean we need to use > blobs everywhere; but I think you're too quick to rule them out. Perhaps, but I really don't see the point of putting a text blob into the database for a set of objects that we're just going to create in the next moment. That would be, from my point of view anyway, akin to storing 'CREATE TABLE' statements in the catalog next to the actual definition of the table in pg_class/pg_attribute. Thanks, Stephen
On Wed, 2013-12-04 at 09:50 -0500, Stephen Frost wrote: > > I still don't see that Extension Templates are all bad: > > * They preserve the fact that two instances of the same extension > > (e.g. in different databases) were created from the same template. > > This is only true if we change the extension templates to be shared > catalogs, which they aren't today.. I agree with you about that -- I don't like per-DB templates. I guess the challenge is that we might want to use namespaces to support user-installable extensions, and namespaces reside within a DB. But I think we can find some other solution there (e.g. user names rather than schemas), and per-DB templates are just not a good solution anyway. Regards, Jeff Davis
On Tue, 2013-12-03 at 14:31 -0500, Tom Lane wrote: > Stephen Frost <sfrost@snowman.net> writes: > > When it comes to dump/reload, I'd much rather see a mechanism which uses > > our deep understanding of the extension's objects (as database objects) > > to implement the dump/reload than a text blob which is carried forward > > from major version to major version and may even fail to run. > > Note that we're already doing that in the binary_upgrade code path. > I agree that generalizing that approach sounds like a better idea > than keeping a text blob around. The reason for doing it that way in pg_upgrade was to preserve OIDs for types, etc.: http://www.postgresql.org/message-id/20783.1297184036@sss.pgh.pa.us That doesn't seem to apply to ordinary dump/reload. Do you think it's good for other reasons, as well? Regards, Jeff Davis
Jeff Davis <pgsql@j-davis.com> writes: > On Tue, 2013-12-03 at 14:31 -0500, Tom Lane wrote: >> Stephen Frost <sfrost@snowman.net> writes: >>> When it comes to dump/reload, I'd much rather see a mechanism which uses >>> our deep understanding of the extension's objects (as database objects) >>> to implement the dump/reload than a text blob which is carried forward >>> from major version to major version and may even fail to run. >> Note that we're already doing that in the binary_upgrade code path. >> I agree that generalizing that approach sounds like a better idea >> than keeping a text blob around. > The reason for doing it that way in pg_upgrade was to preserve OIDs for > types, etc.: That was *a* reason, but not the only one, I believe. > That doesn't seem to apply to ordinary dump/reload. Do you think it's > good for other reasons, as well? I think Stephen has already argued why it could be a good idea here. But in a nutshell: it seems like there are two use-cases to be supported, one where you want "CREATE EXTENSION hstore" to give you some appropriate version of hstore, and one where you want to restore exactly what you had on the previous installation. It seems to me that "exploding" the extension by dumping, rather than suppressing, its component objects is by far the most reliable way of accomplishing the latter. To point out just one reason why, we've never made any effort to prohibit suitably-privileged users from modifying the objects within an extension. So even if you'd kept around the originally defining text string, it might not represent current reality. And as for relying on some URL or other --- whoever proposed that doesn't live in the same internet I do. URLs aren't immutable, even on days when you can get to them. regards, tom lane
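An "exploded" dump along those lines could lean on syntax that already exists; a sketch, assuming an empty extension is created first the way pg_upgrade's binary-upgrade mode does, and with a made-up member function:

    -- dump the member objects as ordinary DDL...
    CREATE FUNCTION pair(text, text) RETURNS text
        LANGUAGE sql AS $$ SELECT $1 || ',' || $2 $$;
    -- ...then re-attach them, using today's ALTER EXTENSION ... ADD:
    ALTER EXTENSION pair ADD FUNCTION pair(text, text);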
On Tue, Dec 3, 2013 at 11:44 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: >> We should also consider the possibility of a user trying to >> deliberately install an older release. For example, if the user has >> 1.0, 1.0--1.1, 1.1, 1.1--1.2, and 1.2--1.0 (a downgrade script) with >> default_full_version = 1.2, an attempt to install 1.0 should run just >> the 1.0 script, NOT 1.2 and then 1.2--1.0. > > In what I did, if you want version 1.0 and we have a script --1.0.sql > around, then we just use that script, never kicking the path chooser. Oh, right. Duh. Sorry, bad example. I do think we want to avoid using a downgrade script as part of an install though - and to install from the newest possible full version (I kind of like the term "base" version) whenever possible. >> break the tie by choosing which version number appears first in the >> aforementioned list. If that still doesn't break the tie, either >> because none of the starting points are mentioned in that list or >> because there are multiple equal-length paths starting in the same >> place, we give up and emit an error. > > Jeff also mentioned tiebreakers, without entering into any level > of detail. > > We won't be able to just use default_version as the tiebreaker list > here, because of the following example: > > default_version = 1.2, 1.0 > > create extension foo version '1.1'; > > With such a setup it would prefer 1.2--1.1 to 1.0--1.1, which doesn't > look like what we want. Instead, we want > > default_version = 1.2 > create_from_version_candidates = 1.0 > > create extension foo version '1.1'; > > Then the tie breaker is the 1.0 in "create_from_version_candidates" so > we would run foo--1.0.sql and then foo--1.0--1.1.sql. I guess one way to skin this cat would be to just let the user provide an ordering for the versions, i.e.

version_ordering = 1.0 1.1 1.2

When the user asks for version X, we reject any paths that pass through a newer version (so that we never downgrade), and start with the path that begins as close to the target version as possible. For scenarios where people might be installing either an older or newer version, that might be easier to understand than a base-version preference list. > Barring objections, I'm going to prepare a new branch to support > developing that behavior against only file-based extensions, and submit > a spin-off patch to the current CF entry. Not totally sure we're all on the same page yet, but that's not necessarily meant to dissuade you. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
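A worked sketch of that behavior (version_ordering is the hypothetical knob suggested above):

    # foo.control -- hypothetical
    version_ordering = '1.0 1.1 1.2'
    -- on disk: foo--1.0.sql, foo--1.0--1.2.sql, foo--1.1.sql, foo--1.1--1.2.sql
    CREATE EXTENSION foo VERSION '1.2';
    -- any path through a version after 1.2 in the ordering is rejected
    -- (never downgrade); both remaining paths are two steps long, and
    -- foo--1.1.sql + foo--1.1--1.2.sql wins because 1.1 is the closest
    -- permissible starting point to the target 1.2.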
On Wed, Dec 4, 2013 at 3:39 AM, Jeff Davis <pgsql@j-davis.com> wrote: > On Tue, 2013-12-03 at 10:23 -0500, Robert Haas wrote: >> In more normal cases, however, the system can (and probably should) >> figure out what was intended by choosing the *shortest* path to get to >> the intended version. For example, if someone ships 1.0, 1.0--1.1, >> 1.1, and 1.1--1.2, the system should choose to run 1.1 and then >> 1.1--1.2, not 1.0 and then 1.0--1.1 and then 1.1--1.2. But that can >> be automatic: only if there are two paths of equal length (as in the >> example in the previous paragraph) do we need help from the user to >> figure out what to do. > > Why do we need help from the user? Just pick a path. > > For an extension update, I understand why someone wouldn't want to > accidentally downgrade 5 versions (dropping all of their dependent > objects) before updating to the latest. But this doesn't apply to > creation. I suppose. But suppose we have 1.0, 1.1, 1.0--1.2, and 1.1--1.2. Suppose further that 1.1 drops some interfaces present in 1.0, and 1.2 adds new stuff. If the system chooses to run 1.0 and then 1.0--1.2, it'll create all the deprecated interfaces and then drop them again. Now maybe that won't cause any problems, but I bet it will. For example, consider hstore again. If we eventually disallow => as an operator altogether, the 1.0 script won't even run any more. Of course that doesn't matter for core because we've removed it entirely from our repository and don't ship it any more, but an out-of-core extension might well keep around more old scripts than we do, to make it easier to use the same bundle with multiple server versions. Imagine, for example, that 1.0 only works on 9.4 or earlier and 1.2 only works on releases 9.2 or later. The extension author wants to keep the 1.0 script around for the benefit of people who haven't upgraded, so that they can still install the older version that works there; but he also needs the 1.1 base version to be preferred to the 1.0 base version, else installation of 1.2 on 10.0+ will fail completely. My experience with software upgrades is that cases like this, and even weirder things, happen pretty routinely, so I think more control is good. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 12/2/13, 9:14 AM, Dimitri Fontaine wrote: > What I want to build is an “extension distribution” software that knows > how to prepare anything from PGXN (and other places) so that it's fully > ready for being used in the database. Then the main client would run as > a CREATE EXTENSION "ddl_command_start" Event Trigger and would fetch the > prepared extension for you and make it available, then leaving the main > command operate as intended. > > Which is what I think the pex extension is doing, and that's not > coincidental, but it runs the build step on the PostgreSQL server itself > and needs to have a non-trivial set of file-system privileges to be > doing so, and even needs to get root privileges with sudo for some of > its operations. You're thinking of autopex, and while that works, and can be made to work better with certain small changes, I don't think it can ever be the only solution. Many interesting extensions will have external package build- and run-time dependencies, and you need file-system level access to manage those.
On 12/2/13, 2:33 PM, Greg Stark wrote:
> Just tossing an idea out there. What if you could install an extension
> by specifying not a local file name but a URL. Obviously there's a
> security issue, but for example we could allow only https URLs with
> verified domain names that are in a list of approved domain names
> specified by a GUC.

This is similar to what autopex does (https://github.com/petere/autopex).
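Purely hypothetical syntax for Greg's idea, just to fix intuitions; neither the GUC nor a URL form of CREATE EXTENSION exists:

    -- hypothetical GUC listing approved domains
    SET extension_url_whitelist = 'pgxn.org, internal.example.com';

    -- hypothetical command form: would be rejected unless the https URL's
    -- verified domain is on the whitelist above
    CREATE EXTENSION pair URL 'https://pgxn.org/dist/pair/pair.ext';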
On 12/3/13, 9:20 AM, Stephen Frost wrote:
> Another option, which I generally like better, is to have a new package
> format for PGXN that contains the results of "make install",
> more-or-less analogous to Debian source vs. .deb packages.
>
> Perhaps we could even have psql understand that format and be able to
> install the extension via a backslash command instead of having an
> external tool, but I think an external tool for dependency tracking and
> downloading of necessary dependencies ala Debian would be better than
> teaching psql to do that.

How would that handle varying file system layouts on the backend?
* Peter Eisentraut (peter_e@gmx.net) wrote:
> On 12/3/13, 9:20 AM, Stephen Frost wrote:
> > Another option, which I generally like better, is to have a new package
> > format for PGXN that contains the results of "make install",
> > more-or-less analogous to Debian source vs. .deb packages.
> >
> > Perhaps we could even have psql understand that format and be able to
> > install the extension via a backslash command instead of having an
> > external tool, but I think an external tool for dependency tracking and
> > downloading of necessary dependencies ala Debian would be better than
> > teaching psql to do that.
>
> How would that handle varying file system layouts on the backend?

This discussion is all about catalog-only extensions and therefore we don't really care about anything filesystem related...

Thanks,

Stephen
On Wed, 2013-12-04 at 15:28 -0500, Robert Haas wrote:
> My experience with software upgrades is that
> cases like this, and even weirder things, happen pretty routinely, so
> I think more control is good.

There would still be control: just use full SQL scripts appropriately.

I'm sure there's still room for surprise as extensions become more complex. But ultimately, those surprises will happen because of broken upgrade/downgrade scripts, and if those are broken, the user is probably in for a surprise in the near future anyway.

It's fine with me if we help alleviate these problems by using a proper system to organize these upgrades/downgrades. But everything proposed seems pretty bad from the perspective of an extension author -- extra documentation, extra ceremony, more room for error, and more maintenance every time they release a new version. And once we document it, we have to support those behaviors for a long time, which will almost certainly prevent a better solution later.

I think we should just make it simple:

* If there is a full SQL script of the given version, we guarantee that we'll execute that one.
* Otherwise, we execute the shortest path from a full version to the requested version.
* If there's a tie, throw an error.

That leaves us with plenty of room to improve the situation later, for instance if we support ordered versions. (I'm not sure whether ordered versions were rejected outright, or we just didn't have time to do it properly.)

Regards,
Jeff Davis
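Applied to the earlier file layouts, those three rules would behave like this (a sketch of the proposed behaviour, not of what any shipped version does):

    -- scripts on disk: foo--1.0.sql, foo--1.1.sql,
    --                  foo--1.0--1.2.sql, foo--1.1--1.2.sql

    CREATE EXTENSION foo VERSION '1.1';
    -- rule 1: a full script for 1.1 exists, so run foo--1.1.sql, full stop

    CREATE EXTENSION foo VERSION '1.2';
    -- rules 2 and 3: two shortest paths reach 1.2 (from base 1.0 and from
    -- base 1.1), so this proposal throws an error instead of guessing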
On 12/1/13, 10:47 PM, Stephen Frost wrote:
> Having a management system for sets of objects is a *great* idea- and
> one which we already have through schemas. What we don't have is any
> kind of versioning system built-in or other metadata about it, nor do we
> have good tooling which leverages such a versioning or similar system.
> Extensions provide some of that metadata around schemas and object
> definitions,

Schemas can't manage objects that are not in schemas, so that won't work.

It would be great if we could take the dependency tracking mechanism in extensions and expose it separately. I would like to be able to say

    START PACKAGE foo  -- bad name
    ... bunch of DDL ...
    STOP PACKAGE

use it, and later

    DROP PACKAGE foo;

This mechanism already exists in extensions, but it's combined with a bunch of other things. Separating those things (and naming them separately) might clear a few things up.
On Wed, 2013-12-04 at 14:54 -0500, Tom Lane wrote:
> I think Stephen has already argued why it could be a good idea here.
> But in a nutshell: it seems like there are two use-cases to be
> supported, one where you want "CREATE EXTENSION hstore" to give you
> some appropriate version of hstore, and one where you want to restore
> exactly what you had on the previous installation. It seems to me that
> "exploding" the extension by dumping, rather than suppressing, its
> component objects is by far the most reliable way of accomplishing the
> latter.

The behavior of an extension should not depend on how it was installed.

The kind of "extension" being described by Stephen will:

* Not be updatable by doing "ALTER EXTENSION foo UPDATE TO '2.0'"
* Dump out objects that wouldn't be dumped if they had installed the extension using the filesystem

So if we do it this way, then we should pick a new name, like "package". And then we'll need to decide whether it still makes sense to use an external tool to transform a PGXN extension into a form that could be loaded as a package.

Regards,
Jeff Davis
* Jeff Davis (pgsql@j-davis.com) wrote:
> On Wed, 2013-12-04 at 14:54 -0500, Tom Lane wrote:
> > I think Stephen has already argued why it could be a good idea here.
> > But in a nutshell: it seems like there are two use-cases to be
> > supported, one where you want "CREATE EXTENSION hstore" to give you
> > some appropriate version of hstore, and one where you want to restore
> > exactly what you had on the previous installation. It seems to me that
> > "exploding" the extension by dumping, rather than suppressing, its
> > component objects is by far the most reliable way of accomplishing the
> > latter.
>
> The behavior of an extension should not depend on how it was installed.
>
> The kind of "extension" being described by Stephen will:
>
> * Not be updatable by doing "ALTER EXTENSION foo UPDATE TO '2.0'"

That's correct, but consider when the above command actually works today: when the new version is magically made available to the backend without any action happening in PG. That works when the filesystem can be updated independently or the backend can reach out to some other place and pull things down, but that's, really, a pretty special situation. It works today specifically because we expect the OS packaging system to make changes to the filesystem for us, but this whole 'extension template' proposal is about getting away from the filesystem. Instead, I'm suggesting an external tool which can pull down the new version from an external repo and then apply it to the backend. Clearly, my approach supports the general action of updating an extension; it just doesn't expect the filesystem on the server to be changed underneath PG, nor that PG will reach out to some external repository on the basis of a non-superuser request to get the update script.

> * Dump out objects that wouldn't be dumped if they had installed the
> extension using the filesystem

Correct, but the general presumption here is that many of these "extensions" wouldn't even be available for installation on the filesystem anyway.

> So if we do it this way, then we should pick a new name, like "package".

I've been on the fence about this for a while. There are definitely pros and cons to consider, but it would go very much against one of the goals here, which is to avoid asking extension authors (or their users, to some extent..) to change, and I expect it'd also be a lot more work to invent something which is 90% the same as extensions. Perhaps there's no help for it and we'll need "extensions", which are essentially for OS-managed extensions (which can include .so files and friends), and then "packages" for entirely-in-catalog sets of objects with a certain amount of metadata included (version and the like).

> And then we'll need to decide whether it still makes sense to use an
> external tool to transform a PGXN extension into a form that could be
> loaded as a package.

I'd certainly think it would be, but if we're moving away from calling them extensions then I'm afraid extension authors and users would end up having to change something anyway, no matter the tool. Perhaps that's reasonable and we can at least minimize the impact, but much of what extensions offer is, in fact, what's also needed for packages, and I'm not thrilled with the apparent duplication. It just occurred to me that perhaps we can "call" these something different toward the end user but use the existing catalogs and code for extensions to handle our representation, with a few minor tweaks... Not sure if I like that idea, but it's a thought.

Thanks,

Stephen
On Sat, 2013-12-07 at 12:27 -0500, Stephen Frost wrote:
> * Jeff Davis (pgsql@j-davis.com) wrote:
> > The behavior of an extension should not depend on how it was installed.
> >
> > The kind of "extension" being described by Stephen will:
> >
> > * Not be updatable by doing "ALTER EXTENSION foo UPDATE TO '2.0'"
>
> ... [ reason ] ...
>
> > * Dump out objects that wouldn't be dumped if they had installed the
> > extension using the filesystem
>
> ... [ reason ] ...

I understand there are reasons, but I'm having a hard time getting past the idea that "I have extension foo v1.2" now needs to be qualified with "installed using SQL" or "installed using the filesystem" to know what you actually have and how it will behave.

Stepping back, maybe we need to do some more research on existing SQL-only extensions. We might be over-thinking this. How many extensions are really just a collection of functions on existing types? If you define a new data type, almost all of the functions seem to revolve around C code -- not just to define the basic data type, but also the GiST support routines, which then mean you're implementing operators in C too, etc.

Perhaps we should first focus on making SQL-only extensions more useful?

Regards,
Jeff Davis
* Jeff Davis (pgsql@j-davis.com) wrote:
> I understand there are reasons, but I'm having a hard time getting past
> the idea that "I have extension foo v1.2" now needs to be qualified with
> "installed using SQL" or "installed using the filesystem" to know what
> you actually have and how it will behave.

I can certainly understand that.

> Stepping back, maybe we need to do some more research on existing
> SQL-only extensions. We might be over-thinking this. How many extensions
> are really just a collection of functions on existing types? If you
> define a new data type, almost all of the functions seem to revolve
> around C code -- not just to define the basic data type, but also the
> GiST support routines, which then mean you're implementing operators in
> C too, etc.

Fair point- I'll try and find time to review the 100+ extensions on PGXN and classify them (unless someone else would like to, or happens to already know..?).

That said, there are certainly cases where people find the existing extension system stinks for their SQL-only code and therefore don't use it- which is exactly the case I'm in @work. We have a ton of pl/pgsql code (along with schema definitions and the like) and our own build system with versioned releases, which both builds 'full/new' databases based on a certain version *and* verifies that a newly built database matches the current version plus a hand-written update script (of course, we have to write that update script). The point simply being that, were extensions more generally useful, we might see more of them, and we need to consider more than just what's in PGXN, and therefore already built as an extension, today.

> Perhaps we should first focus on making SQL-only extensions more useful?

I'm not following what you're suggesting here..? In general, I agree; do you have specific ideas about how to do that? Ones which don't involve a superuser or modifying the filesystem under PG, and which work with replication and backups..?

Thanks,

Stephen
On Sat, Dec 7, 2013 at 3:12 AM, Jeff Davis <pgsql@j-davis.com> wrote:
> So if we do it this way, then we should pick a new name, like "package".

That was my first reaction as well, when I looked at this a few years ago, but I've since backed away from that position. You're certainly correct that it's awkward to have a single kind of object that behaves in two radically different ways, but it's also pretty awkward to have the same "stuff" installed as one of two completely different types of objects depending on who installed it and how.

If we're targeting deployment of user-written application code, then I can see that it might make sense to have a different concept than "extension" for that, because arguably it's a different problem, though it's no longer clear to me that it's all that much different. But if we're talking about deployment of the same PGXN code (or wherever upstream lives) either by a DBA who is also the sysadmin (and can thus run make install or yum install) or one who is not (and thus wishes to proceed entirely via libpq), then making those two different concepts seems like it might be slicing awfully thin.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, 2013-12-09 at 12:17 -0500, Robert Haas wrote:
> On Sat, Dec 7, 2013 at 3:12 AM, Jeff Davis <pgsql@j-davis.com> wrote:
> > So if we do it this way, then we should pick a new name, like "package".
>
> That was my first reaction as well, when I looked at this a few years
> ago, but I've since backed away from that position. You're certainly
> correct that it's awkward to have a single kind of object that behaves
> in two radically different ways, but it's also pretty awkward to have
> the same "stuff" installed as one of two completely different types of
> objects depending on who installed it and how.

I think the awkwardness is most visible in the resulting documentation and error messages. At the moment, I'm having a difficult time imagining how we explain how this works to users (or, when they make a mistake or don't get the results they expect, explain to them what they did wrong and how to fix it).

Regards,
Jeff Davis
Hi,

Stephen Frost <sfrost@snowman.net> writes:
> * Jeff Davis (pgsql@j-davis.com) wrote:
>> What is stopping Extension Templates, as proposed, from being this
>> special "extension creation mode"? What would be a better design?
>
> The extra catalog tables which store SQL scripts in text columns is one
> of my main objections to the as-proposed Extension Templates. I view
> those scripts as a poor man's definition of database objects which are
> defined properly in the catalog already.

I have a very hard time understanding this objection.

PL/SQL functions are just a SQL script stored as-is in the catalogs. That applies the same way to any other PL language too, with scripts stored as-is in the catalogs in different languages.

Even views are stored in a textual way in the catalogs, albeit in a specific pre-processed format; it's still a text blob that could pass for a script in a backend-specific language, parsed by the rewriter.

So while I hear your objection to the "script in catalog" idea, Stephen, I think we should move forward. We don't have the luxury of only applying patches where no compromise has to be made, where everyone is fully happy with the solution we find as a community.

> The other big issue is that
> there isn't an easy way to see how we could open up the ability to
> create extensions to non-superusers with this approach.

The main proposal here is to only allow the owner of a template to install it as an extension. For superusers, we can implement the needed SET ROLE command automatically in the CREATE EXTENSION command.

Is there another security issue that this “same role” approach is not solving? I don't think so.

>> It seems like the porting issue is just a matter of finding someone to
>> write a tool to reliably translate packages from PGXN into a form
>> suitable to be sent using SQL commands; which we would need anyway for
>> this special mode.

I already mentioned that's on my roadmap, part of the vision I'm trying to implement here. My goal is to deliver the full solution for 9.4, and this Extension Templates facility provides the missing in-core bits of it.

Regards,

-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
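Dimitri's point about PL functions is easy to check against the catalogs, where the bodies really are kept as plain text:

    -- prosrc holds the function body as text for interpreted languages
    SELECT p.proname, l.lanname, left(p.prosrc, 40) AS body_start
      FROM pg_proc p
      JOIN pg_language l ON l.oid = p.prolang
     WHERE l.lanname = 'plpgsql';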
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> Stephen Frost <sfrost@snowman.net> writes:
>> The extra catalog tables which store SQL scripts in text columns is one
>> of my main objections to the as-proposed Extension Templates. I view
>> those scripts as a poor man's definition of database objects which are
>> defined properly in the catalog already.

> I have a very hard time understanding this objection.

Why? I think it has considerable force.

> PL/SQL functions are just a SQL script stored as-is in the catalogs.

Those are the exception, not the rule.

> Even views are stored in a textual way in the catalogs, albeit in a
> specific pre-processed format;

This is utter nonsense. That representation has nothing to do with the original text of the CREATE VIEW command, and the fact that we store it as an ASCII string rather than some binary blob has more to do with debuggability than anything else. The important point is that we can (and sometimes do) transform the view to something else based on semantic understanding of what's in it. And we also have the ability to figure out what the view depends on, something that is mighty hard to get out of a text blob. (The fact that we don't have that for SQL functions is a serious minus of our approach to functions.)

Stephen is concerned that a pure textual representation lacks any deep semantic understanding of what's in the extension, and I think that's indeed something to be concerned about. It's perhaps not a 100% show-stopper, but it's something to be avoided unless the benefits of storing pure text are overwhelming. Which you don't seem to have convinced people of.

> So while I hear your objection to the "script in catalog" idea, Stephen,
> I think we should move forward. We don't have the luxury of only
> applying patches where no compromise has to be made, where everyone is
> fully happy with the solution we find as a community.

You've got that backwards. We do have the luxury of rejecting new features until people are generally satisfied that the basic design is right. There's no overlord decreeing that this must be in 9.4.

			regards, tom lane
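The contrast Tom draws is visible from SQL as well: the view text below is decompiled on demand from the stored parse tree (and pg_depend records exactly which objects the view references); none of that exists for an opaque script blob.

    -- decompiled from the catalog representation, not stored verbatim:
    SELECT pg_get_viewdef('pg_catalog.pg_tables'::regclass, true);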
On Wed, Dec 11, 2013 at 10:43 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> So while I hear your objection to the "script in catalog" idea, Stephen,
>> I think we should move forward. We don't have the luxury of only
>> applying patches where no compromise has to be made, where everyone is
>> fully happy with the solution we find as a community.
>
> You've got that backwards. We do have the luxury of rejecting new
> features until people are generally satisfied that the basic design is
> right. There's no overlord decreeing that this must be in 9.4.

I strongly agree. PostgreSQL has succeeded because we try not to do things at all until we're sure we know how to do them right. Sometimes we lag behind in features or performance as a result of that - but the upside is that when we say something now works, it does. Moreover, it means that the number of bad design decisions we're left to support off into eternity is comparatively small. Those things are of great benefit to our community.

I can certainly understand Dimitri's frustration, in that he's written several versions of this patch and none have been accepted. But what that means is that none of those approaches have consensus behind them, which is another way of saying that, as a community, we really *don't* know the best way to solve this problem, and our community policy in that situation is to take no action until we do. I've certainly had my own share of disappointments about patches I've written which I believed, and in some cases still believe, to be really good work, and I'd really like to be able to force them through. But that's not how it works.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
>> You've got that backwards. We do have the luxury of rejecting new
>> features until people are generally satisfied that the basic design is
>> right. There's no overlord decreeing that this must be in 9.4.
>
> I strongly agree. PostgreSQL has succeeded because we try not to do
> things at all until we're sure we know how to do them right.

I still agree with the principle, or I wouldn't even try. Not in every detail, though, because the current design passed all the usual criteria a year ago.

http://www.postgresql.org/message-id/6466.1354817682@sss.pgh.pa.us

> I can certainly understand Dimitri's frustration, in that he's written
> several versions of this patch and none have been accepted. But what

The design was accepted, last year. It took a year to review it, which is fair enough, only to find new problems again. Circles at their best. You just said on another thread that perfect is the enemy of good. What about applying the same line of thought to this patch?

Regards,

-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Dimitri,

* Dimitri Fontaine (dimitri@2ndQuadrant.fr) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > The extra catalog tables which store SQL scripts in text columns is one
> > of my main objections to the as-proposed Extension Templates. I view
> > those scripts as a poor man's definition of database objects which are
> > defined properly in the catalog already.
>
> I have a very hard time understanding this objection.
>
> PL/SQL functions are just a SQL script stored as-is in the catalogs.
> That applies the same way to any other PL language too, with scripts
> stored as-is in the catalogs in different languages.

Sure- but in those cases only the actual function (which is, by definition, for an *interpreted* language...) is stored as text, not the definition of the function (eg: the CREATE FUNCTION statement), nor all of the metadata, dependency information, etc. Also, what you're proposing would result in having *both* in the same catalog- the canonical form defined in pg_proc and friends, and the SQL text blob in the extension template catalog- and I simply do not see value in that.

> So while I hear your objection to the "script in catalog" idea, Stephen,
> I think we should move forward. We don't have the luxury of only
> applying patches where no compromise has to be made, where everyone is
> fully happy with the solution we find as a community.

I understand that you wish to push this forward regardless of anyone's concerns. While I appreciate your frustration and the time you've spent on this, that isn't going to change my opinion of this approach.

> > The other big issue is that
> > there isn't an easy way to see how we could open up the ability to
> > create extensions to non-superusers with this approach.
>
> The main proposal here is to only allow the owner of a template to
> install it as an extension. For superusers, we can implement the needed
> SET ROLE command automatically in the CREATE EXTENSION command.
>
> Is there another security issue that this “same role” approach is not
> solving? I don't think so.

This isn't kind, and for that I'm sorry, but this feels, to me, like a very hand-wavey "well, I think this would solve all the problems" answer to the concerns raised. I can't answer offhand if this would really solve all of the issues, because I've not tried to implement it or test it out, but I tend to doubt that it would.

Thanks,

Stephen
On Wed, Dec 11, 2013 at 2:49 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>>> You've got that backwards. We do have the luxury of rejecting new
>>> features until people are generally satisfied that the basic design is
>>> right. There's no overlord decreeing that this must be in 9.4.
>>
>> I strongly agree. PostgreSQL has succeeded because we try not to do
>> things at all until we're sure we know how to do them right.
>
> I still agree with the principle, or I wouldn't even try. Not in every
> detail, though, because the current design passed all the usual criteria
> a year ago.
>
> http://www.postgresql.org/message-id/6466.1354817682@sss.pgh.pa.us
>
>> I can certainly understand Dimitri's frustration, in that he's written
>> several versions of this patch and none have been accepted. But what
>
> The design was accepted, last year. It took a year to review it, which
> is fair enough, only to find new problems again. Circles at their best.
> You just said on another thread that perfect is the enemy of good. What
> about applying the same line of thought to this patch?

Sure. For every patch that gets submitted, we have to decide whether it represents an improvement over where we are today, or not. For the record:

1. The patch you're talking about is 2 or 3 orders of magnitude less complicated than this one, and it is pretty easy to see that it will not paint us into a corner. It also happens to fix what I've long considered a deficiency in PostgreSQL. I think it is legitimate for me to have more confidence in that patch than this one.

2. I explicitly did not review this patch, for the precise reason that I thought it needed a fresh pair of eyes. You and I have not always seen eye to eye on this and other extension-related patches, and I don't really want to be the guy who shoots down your patches every year. I was prepared to sit still for this if Stephen felt it was OK. But after both Stephen and Heikki expressed concerns about the approach, I decided to say that I found those concerns valid also.

I happen to think that Stephen did a very good job of explaining why blobs in the catalog could be a very bad plan. Specifically, I believe he mentioned that it creates a dump/restore hazard. If a new keyword gets added in a new server version, a logical dump of the extension made by a new pg_dump against the old server version will restore properly on the new version. Dumping the blobs and re-executing them on the new version may fail. I had not thought of this issue when this was last discussed, or at least I don't remember having thought of it, and based on Tom's having endorsed the previous design, I'm guessing he didn't think of it at the time, either.

I think that Stephen's other points about duplicating data, etc. are somewhat valid as well, but I am particularly sensitive about dump and restore hazards. I don't think you will find any time in the history of this project when I endorsed any change that would create more of them, or declined to endorse fixing them; and if you do, it was probably in my first year of involvement, when I did not understand as well as I do now how much pain such problems create. Users are remarkably skilled at finding these bugs; it's been less than two weeks since we fixed the most recent one; and they cause a lot of pain. The only saving grace is that, up until now, we've pretty much always been able to patch them by changing pg_dump(all).
The problems that this patch would create can't be fixed that way, though: you'd have to manually hack up the blobs stored in the catalog, or manually edit the dumpfile. That's not good.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, 2013-12-11 at 20:49 +0100, Dimitri Fontaine wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > I strongly agree. PostgreSQL has succeeded because we try not to do
> > things at all until we're sure we know how to do them right.
>
> I still agree with the principle, or I wouldn't even try. Not in every
> detail, though, because the current design passed all the usual criteria
> a year ago.
>
> http://www.postgresql.org/message-id/6466.1354817682@sss.pgh.pa.us

For what it's worth, I think the idea of extension templates has good conceptual integrity. Extensions are external blobs. To make them work more smoothly in several ways, we move them into the catalog. They have pretty much the same upsides and downsides as our existing extensions, aside from issues directly related to filesystem vs. catalog.

Stephen had some legitimate concerns. I don't entirely agree, but they are legitimate concerns, and we don't want to just override them. At the same time, I'm skeptical of the alternatives Stephen offered (though I don't think he intended them as a full proposal).

So right now I'm discouraged about the whole idea of installing extensions using SQL. I don't see a lot of great options. On top of that, the inability to handle native code limits the number of extensions that could make use of such a facility, which dampens my enthusiasm.

Regards,
Jeff Davis
* Jeff Davis (pgsql@j-davis.com) wrote:
> For what it's worth, I think the idea of extension templates has good
> conceptual integrity. Extensions are external blobs. To make them work
> more smoothly in several ways, we move them into the catalog. They have
> pretty much the same upsides and downsides as our existing extensions,
> aside from issues directly related to filesystem vs. catalog.

I've never particularly liked the idea that extensions are external blobs, to be honest.

> Stephen had some legitimate concerns. I don't entirely agree, but they
> are legitimate concerns, and we don't want to just override them.
>
> At the same time, I'm skeptical of the alternatives Stephen offered
> (though I don't think he intended them as a full proposal).

It was more thoughts on how I'd expect these things to work. I've also been talking to David about what he'd like to see done with PGXN, and his thinking was a way to automate creating RPMs and DEBs based on PGXN spec files, but he points out that the challenge there is dealing with external dependencies.

> So right now I'm discouraged about the whole idea of installing
> extensions using SQL. I don't see a lot of great options. On top of
> that, the inability to handle native code limits the number of
> extensions that could make use of such a facility, which dampens my
> enthusiasm.

Yeah, having looked at PGXN, it turns out that some 80+% of the extensions there have .c code included, something well beyond what I was expecting, but most of those cases also look to have external dependencies (eg: FDWs), which really makes me doubt this notion that they could be distributed independently and outside of the OS packaging system (or that it would be a particularly good idea to even try...).

Thanks,

Stephen
Stephen Frost <sfrost@snowman.net> writes:
>> Stephen had some legitimate concerns. I don't entirely agree, but they
>> are legitimate concerns, and we don't want to just override them.

The main disturbing concern for me has to do with pg_restore and major upgrades, where the blob we have in the catalogs might not parse correctly in the new version. Of course, it's easy enough to fix either the source database or the pg_restore text itself, as it's just plain SQL in there.

>> At the same time, I'm skeptical of the alternatives Stephen offered
>> (though I don't think he intended them as a full proposal).

I began working out a full proposal from them, and here are the most important conclusions I can offer from that work:

- The main constraint we have to work against is that no matter what, extension objects are not going to be selected for pg_dump. That basically means that the only model of extensions we accept is ever going to be the contrib model, whereas my current attempts are meant to evolve the extension model way beyond contrib. The tension we have there is extremely hard to resolve, which explains the strange idea of storing SQL blobs in the catalogs.

- While it's possible to work out some new versioned container objects, I see mainly 3 consequences of doing so:

  1. We're going to kill extensions, whose design would be limited to only caring about contribs: no data, code in C, any single version of the extension intended to work against only one major version of PostgreSQL. Guess what: I know of no extension maintained by those rules outside of contribs. The third rule is the easy one to work around, of course, except if you consider the pg_restore behaviour, framed as a bug by Stephen.

  2. The new thing would seamlessly allow for data-only extensions, such as census or geolocation data, etc., because we would actually back up that data.

  3. The separation of concerns between the extension author who maintains the install and upgrade scripts and the DBA who applies them is going to be pushed entirely to the client software, and that sounds way more fragile than what we have now.

So I see 3 solutions ahead of us: we allow extensions to manage more than the contrib model; third-party software works around it to make extensions usable for non-contrib things; or we provide a new kind of container in core and kill extensions. Of course, none of them are exclusive, and I think realistically we're going to have to live with at least 2 of those alternatives in the near future.

> It was more thoughts on how I'd expect these things to work. I've also
> been talking to David about what he'd like to see done with PGXN, and his
> thinking was a way to automate creating RPMs and DEBs based on PGXN spec
> files, but he points out that the challenge there is dealing with
> external dependencies.

With all due respect, we don't live in a world where it's customary to have root privileges on your production service anymore.

>> So right now I'm discouraged about the whole idea of installing
>> extensions using SQL. I don't see a lot of great options. On top of
>> that, the inability to handle native code limits the number of
>> extensions that could make use of such a facility, which dampens my
>> enthusiasm.

Rejoice! Have a look at the documentation for dynamic_library_path. Any distribution or packaging software would trivially be able to make it so that the modules (.so) are loaded from a non-system place.
The only missing piece is another path GUC to allow PostgreSQL to search for the extension control files in non-system places too; meanwhile, it's already possible to edit the file system privileges.

> Yeah, having looked at PGXN, it turns out that some 80+% of the
> extensions there have .c code included, something well beyond what I was

I call that a chicken-and-egg problem. Every serious PostgreSQL user will have stored procedure code to maintain, where “serious” means that PostgreSQL is considered a part of the application platform. The only reason why this code is not organized as an extension today is because extensions are restricted to the contrib model. There's absolutely no surprise in discovering that the current extensions in the wild just about all fit the only currently supported model.

> expecting, but most of those cases also look to have external
> dependencies (eg: FDWs), which really makes me doubt this notion that
> they could be distributed independently and outside of the OS packaging
> system (or that it would be a particularly good idea to even try...).

It seems to me that if dynamic_library_path and the system dynamic loader are both looking at the same places, then storing modules and their dependencies there is going to be all you need to make it work. The main packaging systems (Debian and Red Hat) have automatic dependency tracking using `ldd` or the like, so it should be possible to implement the same thing in the software packaging and distribution layer that we keep talking about to complement that feature.

Regards,

-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
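For reference, dynamic_library_path is a real GUC defaulting to '$libdir', while the control-file counterpart wished for above did not exist at the time; the second setting below is purely hypothetical.

    # postgresql.conf
    dynamic_library_path = '/srv/postgres/extlib:$libdir'    # real GUC

    # hypothetical counterpart for control files and scripts:
    # extension_control_path = '/srv/postgres/extensions'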
On Fri, 2013-12-13 at 13:42 -0500, Stephen Frost wrote:
> * Jeff Davis (pgsql@j-davis.com) wrote:
> > For what it's worth, I think the idea of extension templates has good
> > conceptual integrity. Extensions are external blobs. To make them work
> > more smoothly in several ways, we move them into the catalog. They have
> > pretty much the same upsides and downsides as our existing extensions,
> > aside from issues directly related to filesystem vs. catalog.
>
> I've never particularly liked the idea that extensions are external
> blobs, to be honest.

It did feel a bit like you were arguing about extensions as they exist today, rather than extension templates. To make much more progress, it seems like we either need an ingenious idea about how to change existing extensions to work for all purposes, or we need to invent a new concept.

> but most of those cases also look to have external
> dependencies (eg: FDWs), which really makes me doubt this notion that
> they could be distributed independently and outside of the OS packaging
> system (or that it would be a particularly good idea to even try...).

As I pointed out before, many language communities handle libraries outside of the OS packaging system, e.g. cpan, gems, cabal, pear, etc. Arguably, those libraries are more likely to have external dependencies, and yet they find it a better solution anyway.

And you are completely ignoring the common case where people are just using C because *postgres says they have to*. For instance, defining new data types, implementing the GiST/GIN/SP-GiST APIs, or using some C hook in the backend. Those may have no external dependencies at all.

Regards,
Jeff Davis
* Jeff Davis (pgsql@j-davis.com) wrote:
> To make much more progress, it seems like we either need an ingenious
> idea about how to change existing extensions to work for all purposes,
> or we need to invent a new concept.

I'm still in favor of supporting SQL/trusted-language-only 'extensions' and allowing them to be installed via a client by a non-superuser, with perhaps superusers having the ability to install extensions which use interpreted but untrusted languages.

> As I pointed out before, many language communities handle libraries
> outside of the OS packaging system, e.g. cpan, gems, cabal, pear, etc.

Sure, and we have PGXN, which I believe fits quite a bit better into the above list than "extension templates". Even so, we could look to try and add binary distribution capabilities to PGXN, though David seems more interested in having a way to go from PGXN source to RPM or Debian packages. Note also that all of the above operate by downloading something remote and then installing it *locally* - yet we're being asked to build a system "like" that which allows installing to a remote system through a database protocol.

> And you are completely ignoring the common case where people are just
> using C because *postgres says they have to*. For instance, defining new
> data types, implementing the GiST/GIN/SP-GiST APIs, or using some C hook
> in the backend. Those may have no external dependencies at all.

I don't intend to ignore that case- my earlier point was merely that there seems to be a pretty close split between no C, C with external dependencies, and C without external dependencies. Based on what I saw in PGXN, we've got a good bit of all three. The only thing which has even been proposed thus far to address all three cases is PGXN- extension templates don't, nor do 'packages' or whatever we want to call extensions without C code.

Thanks,

Stephen
All:

Can someone summarize the issues with this patch for those of us who haven't been following it closely? I was just chatting with a couple other contributors, and at this point none of us knows what it implements, what it doesn't implement, what the plans are for expanding its feature set (if any), or why Frost doesn't like it. I tried reading through the thread on -hackers, and came away even more confused.

Is there maybe a wiki page for it?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
On Sat, 2013-12-14 at 13:46 -0800, Josh Berkus wrote:
> All:
>
> Can someone summarize the issues with this patch for those of us who
> haven't been following it closely? I was just chatting with a couple
> other contributors, and at this point none of us knows what it
> implements, what it doesn't implement, what the plans are for expanding
> its feature set (if any), or why Frost doesn't like it. I tried
> reading through the thread on -hackers, and came away even more confused.
>
> Is there maybe a wiki page for it?

The patch offers an alternative to dropping files on the filesystem before doing CREATE EXTENSION. Instead, if the extension has no C code, you can put it in the catalog using ordinary SQL access, and execute the same kind of CREATE EXTENSION. Aside from that, it's pretty much identical to existing extensions.

Stephen doesn't like the idea that the SQL in an extension is a blob of text. There are weird cases, like if you make local modifications to objects held in an extension, then dump/reload will lose those local modifications. Another issue, which I agree is dubious in many situations, is that the version of an extension is not preserved across dump/reload (this is actually a feature, which was designed with contrib-style extensions in mind, but can be surprising in other circumstances).

This isn't necessarily a dead-end, but there are a lot of unsettled issues, and it will take some soul-searching to answer them. Is an "extension" a blob of text with a version, that's maintained in some external repo? Is it the job of postgres to ensure that dump/reload creates the same situation that you started with, including local modifications to objects that are part of an extension? Should everything be an extension, or do we need to invent a new concept for some of the use cases? What role do external tools play in all of this?

Regards,
Jeff Davis
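For readers trying to picture the flow, it looks roughly like this; the template-creation syntax shown is a sketch, and the patch's exact grammar may differ:

    -- upload the script through a normal connection (sketch syntax):
    CREATE TEMPLATE FOR EXTENSION pair VERSION '1.0' AS $tpl$
        CREATE TYPE pair AS (k text, v text);
        CREATE FUNCTION pair(text, text) RETURNS pair
            LANGUAGE sql AS 'SELECT ROW($1, $2)::pair';
    $tpl$;

    -- the usual command then instantiates it from the catalogs, with no
    -- file system access required:
    CREATE EXTENSION pair;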
On 12/15/2013 10:47 PM, Jeff Davis wrote:
> The patch offers an alternative to dropping files on the filesystem
> before doing CREATE EXTENSION. Instead, if the extension has no C code,
> you can put it in the catalog using ordinary SQL access, and execute the
> same kind of CREATE EXTENSION. Aside from that, it's pretty much
> identical to existing extensions.

OK, so the idea is that for SQL-only extensions (i.e. ones without .so's) the user shouldn't be required to create a file on disk. That seems straightforward and acceptable.

> Stephen doesn't like the idea that the SQL in an extension is a blob of
> text.

I, personally, would prefer per-object line-items, but I don't think that's a deal-breaker. Having a single text blob does match up with existing Extension design.

Note for Dimitri, though: I think that having line-item objects in dependency order would make this feature vastly more useful for schema-template maintenance. Give it some thought.

> There are weird cases, like if you make local modifications to
> objects held in an extension, then dump/reload will lose those local
> modifications.

What does dump/reload do with regular Extensions currently in that case?

> Another issue, which I agree is dubious in many
> situations, is that the version of an extension is not preserved across
> dump/reload (this is actually a feature, which was designed with
> contrib-style extensions in mind, but can be surprising in other
> circumstances).

Well, this should work with a versioning system, in which dump/reload can load older versions of the extension if they are present, the same as external Extensions do now. Is that built in?

> This isn't necessarily a dead-end, but there are a lot of unsettled
> issues, and it will take some soul-searching to answer them. Is an
> "extension" a blob of text with a version, that's maintained in some
> external repo?

Well, plus potentially binaries and library references, yes. Although you could describe all of Postgres as a bunch of text blobs and some library references, when you get right down to it.

> Is it the job of postgres to ensure that dump/reload
> creates the same situation that you started with, including local
> modifications to objects that are part of an extension?

IMHO: No. AFAIK, if a user modifies, say, information_schema views in PostgreSQL, we don't respect that in dump/restore either. Now, I can see adding to this mechanism a method for tracking such modifications in a way that pg_dump can support them. But that can easily be a version 2 feature.

> Should
> everything be an extension, or do we need to invent a new concept for
> some of the use cases? What role do external tools play in all of this?

So, the reason I was confused by this feature -- and the reason Stephen hates it, I think -- is that I thought it was solving the "Extensions don't follow replication, and they are complicated to install if your OS doesn't have good packages" problem. It's not, and it never will solve that issue. It's solving a completely different problem, to wit:

Some PostgreSQL shops with lots of servers have large internal libraries of functions, views, and similar code that they've written to support their applications, which don't comprise a complete database. This feature would allow them to "package" those libraries, and version, upgrade and track them, without requiring a filesystem-based install. I myself have a couple clients who could benefit from this.
I think the name "Extension Templates" is horrible because it misleads all of us on this list into thinking the proposed feature is something completely other than what it is. I don't have a better name offhand, but that's got to change before it becomes a feature.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
On 12/16/2013 10:53 AM, Josh Berkus wrote:
> Some PostgreSQL shops with lots of servers have large internal libraries
> of functions, views, and similar code that they've written to support
> their applications, which don't comprise a complete database. This
> feature would allow them to "package" those libraries, and version,
> upgrade and track them, without requiring a filesystem-based install. I
> myself have a couple clients who could benefit from this.

cc'ing Jim Nasby, since I think Enova is part of the target market for this feature. Jim, have you taken a look at this?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Josh Berkus <josh@agliodbs.com> writes:
> So, the reason I was confused by this feature -- and the reason Stephen
> hates it, I think -- is that I thought it was solving the "Extensions
> don't follow replication, and they are complicated to install if your OS
> doesn't have good packages" problem. It's not, and it never will solve
> that issue.

> It's solving a completely different problem, to wit:

> Some PostgreSQL shops with lots of servers have large internal libraries
> of functions, views, and similar code that they've written to support
> their applications, which don't comprise a complete database. This
> feature would allow them to "package" those libraries, and version,
> upgrade and track them, without requiring a filesystem-based install. I
> myself have a couple clients who could benefit from this.

Right. I think a lot of the tension comes from people being unconvinced that the existing extension feature is an ideal model for this sort of use-case. Extensions were mainly designed around the notion of a .so with some SQL datatype/function/etc declarations that have to match up with the C code. So it made sense for them to be relatively static things that live in the filesystem. Notably, if you're migrating to a new PG major version, you're at the very least gonna have to recompile the C code and quite likely will need to change it some. So adapting the SQL code if needed goes along with that, and would never be automatic in any case.

OTOH, for a set of pure-SQL objects, it's not necessary that there be a canonical text file somewhere, and we have in principle complete knowledge of the objects' semantics as well as the ability to dump-and-restore into newer PG versions. So it's not at all clear that we should just adopt the existing model with the smallest possible changes --- which AFAICS is basically what this proposal is. Maybe that's the way to go, but we should consider alternatives, and in particular I think there is much more reason to allow inside-the-database mutation of the SQL objects.

> I think the name "Extension Templates" is horrible because it misleads
> all of us on this list into thinking the proposed feature is something
> completely other than what it is. I don't have a better name offhand,
> but that's got to change before it becomes a feature.

Given your previous para, I wonder if "library" or "package" would work better. I agree that "template" isn't le mot juste.

			regards, tom lane
On 12/16/13, 1:00 PM, Josh Berkus wrote:
> On 12/16/2013 10:53 AM, Josh Berkus wrote:
>> Some PostgreSQL shops with lots of servers have large internal libraries
>> of functions, views, and similar code that they've written to support
>> their applications, which don't comprise a complete database. This
>> feature would allow them to "package" those libraries, and version,
>> upgrade and track them, without requiring a filesystem-based install. I
>> myself have a couple clients who could benefit from this.
>
> cc'ing Jim Nasby, since I think Enova is part of the target market for
> this feature. Jim, have you taken a look at this?

The name rings a bell; I think I looked at it in the past. I've read all of this thread that I've currently got (back to 12/11), so I think I've got some idea what's going on.

Enova definitely has libraries of objects, and in fact we're currently working on releasing them via PGXN. That's proving a bit challenging since now we have to find a way to incorporate PGXN into our existing deployment framework (I do NOT want users to have to manually run pgxn client commands). Another complication is that we don't want our production servers downloading random, un-audited code, so we need an internal PGXN mirror.

I think it's probably best if I describe the issues that we've run across, to help the rest of the community understand the pain points. I'll work on doing that. In the meantime, I can say this:

- Being forced to put files down for extensions is a PITA
- We don't have a good way to deal with extensions that have been installed in a non-standard schema, other than search_path, which for a complex database is impractical
- There's a lot that could potentially be done without any external libraries (we've got the equivalent of probably 6-8 modules, none of which require C, and only one uses a Perl module (which is part of the module itself; the only reason for the .pm is to properly factor the code between plperl functions))
- We definitely need a mechanism for declaring deps between modules

Somewhat related to this, I really wish Postgres had the idea of a "class" that was allowed to contain any type of object and could be "instantiated" when needed. For example, if we had an "address class", we could instantiate it once for tracking our customer addresses, and a second time for tracking the addresses customers supply for their employers. Such a mechanism would probably be ideal for what we need, but of course you'd still have the question of how to load a class definition that someone else has published.

-- 
Jim C. Nasby, Data Architect
jim@nasby.net
512.569.9461 (cell)
http://jim.nasby.net
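Absent such a "class" mechanism, the closest approximation today is re-running one definition script per schema; a minimal sketch of Jim's address example:

    -- one schema per "instantiation" of the would-be address class
    CREATE SCHEMA customer_address;
    SET search_path = customer_address;
    CREATE TABLE address (street text, city text, zip text);

    CREATE SCHEMA employer_address;
    SET search_path = employer_address;
    CREATE TABLE address (street text, city text, zip text);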
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Right. I think a lot of the tension comes from people being unconvinced
> that the existing extension feature is an ideal model for this sort of
> use-case. Extensions were mainly designed around the notion of a .so

The effort here is all about extending the Extension Use Case, yes.

> OTOH, for a set of pure-SQL objects, it's not necessary that there be a
> canonical text file somewhere, and we have in principle complete knowledge
> of the objects' semantics as well as the ability to dump-and-restore into
> newer PG versions. So it's not at all clear that we should just adopt the
> existing model with the smallest possible changes --- which AFAICS is
> basically what this proposal is. Maybe that's the way to go, but we
> should consider alternatives, and in particular I think there is much
> more reason to allow inside-the-database mutation of the SQL objects.

My thinking is that if we invent a new mechanism for extensions that are not managed like contribs, we will find out that only contribs are going to be using extensions.

Given the options of either growing extensions into being able to cope with more than a single model or building an entirely new system having most of the same feature set as Extensions, I'm pushing for the option where we build on top of what we have already.

>> I think the name "Extension Templates" is horrible because it misleads
>> all of us on this list into thinking the proposed feature is something
>> completely other than what it is. I don't have a better name offhand,
>> but that's got to change before it becomes a feature.
>
> Given your previous para, I wonder if "library" or "package" would work
> better. I agree that "template" isn't le mot juste.

We can't use “package” because it already means something very different in a directly competing product. I have other proposals, but they are only relevant if we choose not to improve Extensions… right?

Regards,

-- 
Dimitri Fontaine
06 63 07 10 78
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
* Dimitri Fontaine (dimitri@2ndQuadrant.fr) wrote:
> > OTOH, for a set of pure-SQL objects, it's not necessary that there be a
> > canonical text file somewhere, and we have in principle complete knowledge
> > of the objects' semantics as well as the ability to dump-and-restore into
> > newer PG versions. So it's not at all clear that we should just adopt the
> > existing model with the smallest possible changes --- which AFAICS is
> > basically what this proposal is. Maybe that's the way to go, but we
> > should consider alternatives, and in particular I think there is much
> > more reason to allow inside-the-database mutation of the SQL objects.
>
> My thinking is that if we invent a new mechanism for extensions that are
> not managed like contribs, we will find out that only contribs are going
> to be using extensions.

That's only accurate if the new mechanism supports .so's, which seems unlikely to be the case. What I think we'd end up with is a split between extensions, which would be things containing .so's, and "libraries" or what-have-you, which would be more-or-less everything else. That kind of a break-down strikes me as perfectly reasonable. There would also be flexibility in that an author might choose to use an extension even in cases where it's not strictly necessary to do so, for whatever reason they want.

> Given the options of either growing extensions into being able to cope
> with more than a single model or building an entirely new system having
> most of the same feature set as Extensions, I'm pushing for the option
> where we build on top of what we have already.

I'm not sure that we need to throw away everything that exists to add on this new capability; perhaps we can build a generic versioned object-container system on which extensions and packages/libraries/classes/whatever can also be built on (or perhaps that's what 'extensions' end up morphing into).

> We can't use “package” because it already means something very different
> in a directly competing product. I have other proposals, but they are
> only relevant if we choose not to improve Extensions… right?

I'd like to see extensions improved. I don't feel like the proposed 'extension templates' is the way to do that, because I don't think it really solves anything, and it adds a layer that strikes me as wholly unnecessary. I could see pulling in the control file contents as a catalog, adding in dependency information which could be checked against, perhaps hard vs. soft dependencies, and other things that make sense to track for extensions-currently-installed into a given database.

However, as I understand it from the various discussions on this topic outside of this list, the immediate concern is the need for a multi-OS extension distribution network with support for binaries, .so's and .dll's and whatever else, to make installing extensions easier for developers on various platforms. I'm all for someone building that and dealing with the issues associated with that, but building a client for it in core, either in a way where a backend would reach out and download the files or accepting binary .so's through the frontend protocol, isn't the first step in that, and I very much doubt it would ever make sense.

Thanks,

Stephen
Stephen Frost <sfrost@snowman.net> writes:
>> My thinking is that if we invent a new mechanism for extensions that are
>> not managed like contribs, we will find out that only contribs are going
>> to be using extensions.
>
> That's only accurate if the new mechanism supports .so's, which seems
> unlikely to be the case.

Really? Look at dynamic_library_path, then at a classic CREATE FUNCTION command that maps onto a C-provided symbol:

    CREATE OR REPLACE FUNCTION prefix_range_in(cstring)
    RETURNS prefix_range
    AS '$libdir/prefix'
    LANGUAGE C IMMUTABLE STRICT;

A packaging or distribution software will have no problem removing the '$libdir/' part of the magic AS string here. Once removed, prefix.so will be loaded from anywhere on the file system paths listed in the dynamic_library_path GUC.

So now you no longer need file system write privileges on a central place owned by root; it can be anywhere else, and the backend hooks, when properly set up, will be able to benefit from that. The missing bits are where to find the extension control files and scripts.

The only reason why the current proposal mentions *nothing* about how to deal with modules (.so etc) is because each and every time a mention is made of that problem, the answer from Tom is “rejected, full stop”.

> What I think we'd end up with is a split
> between extensions, which would be things containing .so's, and
> "libraries" or what-have-you, which would be more-or-less everything
> else. That kind of a break-down strikes me as perfectly reasonable.

Only if it's the best we can do.

> There would also be flexibility in that an author might choose to use an
> extension even in cases where it's not strictly necessary to do so, for
> whatever reason they want.

Note that of course you can still install proper OS packages when we ship with support for Extension Templates.

> I'd like to see extensions improved. I don't feel like the proposed
> 'extension templates' is the way to do that, because I don't think it
> really solves anything, and it adds a layer that strikes me as wholly
> unnecessary.

You still haven't proposed any other way to go at it, and this is already my fourth detailed proposal. I did spend time working out a full design from what I think you're trying to say hand-wavily in that exchange, and I don't quite like the result, as I see no way for it not to entirely deprecate extensions.

Maybe the proper answer is that we should actually confine extensions to being the way to install contribs and nothing else, and deprecate them for cases where you don't have an OS-level package. It seems really strange to build a facility with such a generic name as “extension” only to resist changing any of it, then stop using it at first opportunity. Also, I'm not sure about the consequences in terms of user trust if we build something new to solve a use case that looks so much the same.

> However, as I understand it from the various discussions on this topic
> outside of this list, the immediate concern is the need for a multi-OS
> extension distribution network with support for binaries, .so's and
> .dll's and whatever else, to make installing extensions easier for
> developers on various platforms. I'm all for someone building that and
> dealing with the issues associated with that, but building a client for
> it in core, either in a way where a backend would reach out and
> download the files or accepting binary .so's through the frontend
> protocol, isn't the first step in that, and I very much doubt it would
> ever make sense.
That's exactly the reason why the first piece of that proposal has absolutely nothing to do with building said client, and is all about how NOT to have to build it in core *ever*. If you don't like what I'm building because it's not solving the problem you want to solve… well don't use what I'm building, right? Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
* Dimitri Fontaine (dimitri@2ndQuadrant.fr) wrote: > Stephen Frost <sfrost@snowman.net> writes: > >> My thinking is that if we invent a new mechanism for extensions that are > >> not managed like contribs, we will find out that only contribs are going > >> to be using extensions. > > > > That's only accurate if the new mechanism supports .so's, which seems > > unlikely to be the case. > > Really? Yes. The 'new mechanism' to which I was referring was when the entire XXX (extension, library, package, whatever) is in the PG catalog and not managed through files on the filesystem, as contrib-like extensions are. I'm quite aware that what you're asking for is technically possible- that's not what this discussion is about. > The only reason why the current proposal mentions *nothing* about how to > deal with modules (.so etc) is because each and every time a mention is > made of that problem, the answer from Tom is “rejected, full stop”. Perhaps I'm not making myself clear here, but *I agree with Tom* on this point. > > There would also be flexibility in that an author might choose to use an > > extension even in cases where it's not strictly necessary to do so, for > > whatever reason they want. > > Note that of course you can still install proper OS packages when we > ship with support for Extension Templates. With the various naming conflicts and other risks associated with doing that, which I don't believe were very clearly addressed. An off-the-cuff answer to that issue is not sufficient, imv. > You still didn't propose any other way to go at it, and this is already > my fourth detailed proposal. I didn't outline a proposal which provides what you want, no. That was intentional. > I did spend time on designing what I think > you're trying to say hand-wavily in that exchange, and I don't quite > like the result, as I see no way for it not to entirely deprecate > extensions. I don't think we need to, nor should we, deprecate extensions entirely when that's the approach which *should be* used for .so-requiring extensions. Obviously, that's my opinion, and you don't agree with it, and it seems neither of us is willing to shift from that position. > Maybe the proper answer is that we should actually confine extensions to > being the way to install contribs and nothing else, and deprecate them > for cases where you don't have an OS-level package. It seems really > strange to build a facility with such a generic name as “extension” only > to resist changing any of it, then stop using it at first opportunity. I'm open to changing how extensions work, to adding dependency information and making other improvements. Being interested in improving the extension system doesn't mean I'm required to support shipping .so's in this manner or installing text script blobs into catalog tables. > > However, as I understand it from the various discussions on this topic > > outside of this list, the immediate concern is the need for a multi-OS > > extension distribution network with support for binaries, .so's and > > .dll's and whatever else, to make installing extensions easier for > > developers on various platforms. I'm all for someone building that and > > dealing with the issues associated with that, but building a client for > > it in core, either in a way where a backend would reach out and > > download the files or accepting binary .so's through the frontend > > protocol, isn't the first step in that and I very much doubt it would > > ever make sense. 
> > That's exactly the reason why the first piece of that proposal has > absolutely nothing to do with building said client, and is all about how > NOT to have to build it in core *ever*. You can already build what you're after without extension templates entirely, if you're allowing files to be stashed out on the filesystem anywhere. Your argument that you need root doesn't hold any water with me on this issue- there are quite a few mechanisms out there already which allow you to trivially become root. You can write a pl/perlu function which sudo's and apt-get installs your favorite extension, if you like. That doesn't mean we should build a system into core which tries to do that for you. And, yes, I know that you pushed for and got the GUC in to allow you to have other places to pull .so's from. Having that flexibility doesn't mean we have to support populating that directory from PG. You probably would have been better off pushing for a GUC that allowed a '.d'-like directory system for extensions to be defined in. That *still* doesn't require extension templates, storing SQL scripts as text blobs in catalog tables, and you can even avoid the whole 'root' concern if you want. > If you don't like what I'm building because it's not solving the problem > you want to solve… well don't use what I'm building, right? I'm pretty sure that I've pointed out a number of issues that go well beyond not liking it. Thanks, Stephen
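[A sketch of the pl/perlu route mentioned above, assuming plperlu is installed and the postgres OS user has passwordless sudo rights for apt-get; the package and function names are hypothetical:]

    CREATE FUNCTION apt_install(pkg text) RETURNS void LANGUAGE plperlu AS $$
      my ($pkg) = @_;
      # refuse anything that is not a plausible Debian package name
      die "bad package name: $pkg" unless $pkg =~ /^[a-z0-9][a-z0-9.+-]*$/;
      system('sudo', 'apt-get', 'install', '-y', $pkg) == 0
          or die "apt-get install $pkg failed";
    $$;

    SELECT apt_install('postgresql-9.3-prefix');
    CREATE EXTENSION prefix;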
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes: > Stephen Frost <sfrost@snowman.net> writes: >> That's only accurate if the new mechanism supports .so's, which seems >> unlikely to be the case. > Really? Yes, really. > So you no longer need file system write privileges into a central place > owned by root; the module can be anywhere else, Modern OSes have security checks that can prevent loading libraries from random places. This is widely seen as not merely a good thing, but security-critical for network-exposed daemons. Of which we are one. I keep telling you this, and it keeps not sinking in. One more time: any feature that does what you want will be dead on arrival so far as vendors like Red Hat are concerned. I don't care how creatively you argue for it, they will refuse to ship it (or at least refuse to disable the SELinux policy that prevents it). Period. Please stop wasting my time with suggestions otherwise, because it won't happen. So what we have left to discuss is whether we want to develop, and base a community extension-distribution infrastructure on, a mechanism that some popular vendors will actively block. I'm inclined to think it's a bad idea, but I just work here. > If you don't like what I'm building because it's not solving the problem > you want to solve… well don't use what I'm building, right? What worries me is that time and effort will go into this instead of something that would be universally acceptable/useful. I grant that there are some installations whose security policies are weak enough that they could use what you want to build. But I'm not sure how many there are, and I'm worried about market fragmentation if we need to have more than one distribution mechanism. Of course, we're already talking about two distribution infrastructures (one for packages including .so's, and one for those without). I see no way around that unless we settle for the status quo. But what you're suggesting will end up with three distribution infrastructures, with duplicative methods for packages including .so's depending on whether they're destined for security-conscious or less-security-conscious platforms. I don't want to end up with that. regards, tom lane
On 12/16/2013 11:44 AM, Tom Lane wrote: > Right. I think a lot of the tension comes from people being unconvinced > that the existing extension feature is an ideal model for this sort of > use-case. Extensions were mainly designed around the notion of a .so > with some SQL datatype/function/etc declarations that have to match up > with the C code. So it made sense for them to be relatively static things > that live in the filesystem. Notably, if you're migrating to a new PG > major version, you're at the very least gonna have to recompile the C code > and quite likely will need to change it some. So adapting the SQL code > if needed goes along with that, and would never be automatic in any case. I see what you mean. On the other hand:

a) introducing a new concept would require a new reserved word
b) it would also require figuring out how it interacts with extensions
c) extensions already have versioning, which this feature needs
d) extensions already have dependencies, which this feature needs

While it splits Extensions into two slightly different concepts, I find that on the whole less confusing than the alternative. On 12/16/2013 05:17 PM, Jim Nasby wrote: > Somewhat related to this, I really wish Postgres had the idea of a > "class", that was allowed to contain any type of object and could be > "instantiated" when needed. For example, if we had an "address class", > we could instantiate it once for tracking our customer addresses, and a > second time for tracking the addresses customers supply for their > employers. Such a mechanism would probably be ideal for what we need, > but of course you'd still have the question of how to load a class > definition that someone else has published. Well, the idea originally (POSTGRES) was for the Type, Domain, and Inheritance system to do just what you propose. Nobody ever worked out all the practicalities and gotchas to make it really work in production, though. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
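[Purely hypothetical syntax for the "class" idea quoted above; nothing like this exists in PostgreSQL:]

    CREATE CLASS address AS (
        CREATE TABLE addr (id serial PRIMARY KEY, street text, city text);
        CREATE FUNCTION format(a addr) RETURNS text
            LANGUAGE sql AS 'SELECT $1.street || '', '' || $1.city';
    );

    -- each instantiation would get its own copy of the contained objects
    INSTANTIATE CLASS address AS customer_address;
    INSTANTIATE CLASS address AS employer_address;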
On 13 December 2013 18:42, Stephen Frost <sfrost@snowman.net> wrote: > * Jeff Davis (pgsql@j-davis.com) wrote: >> For what it's worth, I think the idea of extension templates has good >> conceptual integrity. Extensions are external blobs. To make them work >> more smoothly in several ways, we move them into the catalog. They have >> pretty much the same upsides and downsides of our existing extensions, >> aside from issues directly related to filesystem vs. catalog. > > I've never particularly liked the idea that extensions are external > blobs, to be honest. I've been reading this, trying to catch back up with hackers. This thread is amazing because this feature ought to be a shoo-in. Jeff expresses his points politely, but not strongly enough. I agree with him. I keep seeing people repeat "I don't like blobs" as if that were an objection. There is no danger or damage from doing this. I can't see any higher beauty that we're striving for by holding out. Why not allow the user to choose XML, JSON, YAML, or whatever they choose. Some things need to wait for the right design, like RLS, for a variety of reasons. I don't see any comparison here and I can't see any reason for a claim of veto on grounds of higher wisdom to apply to this case. Blocking this stops nothing, it just forces people to do an extra non-standard backflip to achieve their goals. Is that what we want? Why? >> Stephen had some legitimate concerns. I don't entirely agree, but they >> are legitimate concerns, and we don't want to just override them. >> >> At the same time, I'm skeptical of the alternatives Stephen offered >> (though I don't think he intended them as a full proposal). > > It was more thoughts on how I'd expect these things to work. I've also > been talking to David about what he'd like to see done with PGXN and his > thinking was a way to automate creating RPMs and DEBs based on PGXN spec > files, but he points out that the challenge there is dealing with > external dependencies. > >> So right now I'm discouraged about the whole idea of installing >> extensions using SQL. I don't see a lot of great options. On top of >> that, the inability to handle native code limits the number of >> extensions that could make use of such a facility, which dampens my >> enthusiasm. > > Yeah, having looked at PGXN, it turns out that some 80+% of the > extensions there have .c code included, something well beyond what I was > expecting, but most of those cases also look to have external > dependencies (eg: FDWs), which really makes me doubt this notion that > they could be distributed independently and outside of the OS packaging > system (or that it would be a particularly good idea to even try...). That is clear evidence that the packaging is getting in the way of extensions that don't include binary programs. My only personal interest in this is to stimulate the writing of further extensions, which is fairly clearly hampered by the overhead required for packaging. Who needs old-fashioned package management? Some bigger extensions need it. Smaller ones don't. Who are we to force people to distribute their wares in only certain ways? I can't see any reason to block this, nor any better design than the flexible, neutral and simple one proposed. If there's some secret reason to block this, please let me know off list cos I currently don't get it at all. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
* Simon Riggs (simon@2ndQuadrant.com) wrote: > I keep seeing people repeat "I don't like blobs" as if that were an > objection. There is no danger or damage from doing this. I can't see > any higher beauty that we're striving for by holding out. Why not > allow the user to choose XML, JSON, YAML, or whatever they choose. I have no idea where you're going with this, but I *do* object to sticking an SQL script which defines a bunch of objects into a catalog table *right next to where they are properly defined*. There's just no sense in it that I can see, except that it happens to mimic what we do today- to no particular purpose. > Blocking this stops nothing, it just forces people to do an extra > non-standard backflip to achieve their goals. Is that what we want? > Why? It's hardly non-standard when it's required for 80+% of the extensions that exist today anyway. > That is clear evidence that the packaging is getting in the way of > extensions that don't include binary programs. I'm totally on-board with coming up with a solution for extensions which do not include .so's. Avoiding mention of the .so issue doesn't somehow change this solution into one which actually solves the issue around non-binary extensions. > My only personal interest in this is to stimulate the writing of > further extensions, which is fairly clearly hampered by the overhead > required for packaging. I'm not convinced of that but I agree that we can do better and would like to see a solution which actually makes progress in that regard. I don't feel that this does that- indeed, it hardly changes the actual packaging effort required of extension authors at all. Thanks, Stephen
On 12/17/2013 08:32 PM, Stephen Frost wrote: > * Simon Riggs (simon@2ndQuadrant.com) wrote: >> My only personal interest in this is to stimulate the writing of >> further extensions, which is fairly clearly hampered by the overhead >> required for packaging. > > I'm not convinced of that but I agree that we can do better and would > like to see a solution which actually makes progress in that regard. I > don't feel that this does that- indeed, it hardly changes the actual > packaging effort required of extension authors at all. I'll repeat my requirement: the same extension must be installable the old way and the new way. I've lost track of which of the ideas being discussed satisfy that requirement, but I object to any that doesn't. Considering that, I don't see how any of this is going to reduce the overhead required for packaging. An extension author will write the extension exactly the same way he does today. Perhaps you meant the overhead of installing an extension, ie. the work that the DBA does, not the work that the extension author does? - Heikki
On 17 December 2013 17:54, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> So you no longer need file system write privileges into a central place >> owned by root; the module can be anywhere else, > > Modern OSes have security checks that can prevent loading libraries from > random places. This is widely seen as not merely a good thing, but > security-critical for network-exposed daemons. Of which we are one. > > I keep telling you this, and it keeps not sinking in. One more time: any > feature that does what you want will be dead on arrival so far as vendors > like Red Hat are concerned. I don't care how creatively you argue for it, > they will refuse to ship it (or at least refuse to disable the SELinux > policy that prevents it). Period. Please stop wasting my time with > suggestions otherwise, because it won't happen. > > So what we have left to discuss is whether we want to develop, and base a > community extension-distribution infrastructure on, a mechanism that some > popular vendors will actively block. I'm inclined to think it's a bad > idea, but I just work here. Yes, there is a strong argument against enabling Postgres out-of-the-box to allow loading of .so files from random places and bypassing distro security procedures. But that argument doesn't apply to all types of extension. For example, data. In any case, right now, it's easy to load an FDW and then do an INSERT SELECT from a foreign server into a text column. There are no restrictions on URLs to access foreign servers. Then write a *trusted* PL/pgSQL procedure to execute the contents of the text column to do whatever. All you need is the Postgres foreign data wrapper loaded, an insecure URL and a route to it. I don't see a material difference between that route and the new one proposed. The only difference is really that the new route would be blessed as being the officially recommended way to import objects without needing a file, and to allow them to be backed up and restored. So perhaps all we need is a module that once loaded allows other things to be loaded. (Red Hat etc can then ban that as they see fit.) -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
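[A sketch of the existing route described above, using postgres_fdw; the server, table, and script names are hypothetical:]

    CREATE EXTENSION postgres_fdw;
    CREATE SERVER scripts FOREIGN DATA WRAPPER postgres_fdw
        OPTIONS (host 'scripts.example.com', dbname 'repo');
    CREATE USER MAPPING FOR CURRENT_USER SERVER scripts
        OPTIONS (user 'anon', password '');
    CREATE FOREIGN TABLE remote_scripts (name text, body text) SERVER scripts;

    -- pull the SQL over the wire, then execute it from trusted PL/pgSQL
    DO $$
    DECLARE s text;
    BEGIN
        SELECT body INTO s FROM remote_scripts WHERE name = 'some_package';
        EXECUTE s;
    END $$;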
On 17 December 2013 18:32, Stephen Frost <sfrost@snowman.net> wrote: > * Simon Riggs (simon@2ndQuadrant.com) wrote: >> I keep seeing people repeat "I don't like blobs" as if that were an >> objection. There is no danger or damage from doing this. I can't see >> any higher beauty that we're striving for by holding out. Why not >> allow the user to choose XML, JSON, YAML, or whatever they choose. > > I have no idea where you're going with this, but I *do* object to > sticking an SQL script which defines a bunch of objects into a catalog > table *right next to where they are properly defined*. There's just no > sense in it that I can see, except that it happens to mimic what we do > today- to no particular purpose. The purpose is clear: so it is part of the database backup. It's a fairly boring purpose, not fancy at all. But it is a purpose, indeed *the* purpose. I don't see any technical objection here. We aim to have the simplest implementation that meets the stated need and reasonable extrapolations of that. Text in a catalog table is the simplest implementation. That is not a reason to reject it, especially when we aren't suggesting a viable alternative. I have zero attachment to this design; my interest is in the feature. How do we achieve the feature if not this way? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Simon Riggs <simon@2ndQuadrant.com> writes: > On 17 December 2013 18:32, Stephen Frost <sfrost@snowman.net> wrote: >> I have no idea where you're going with this, but I *do* object to >> sticking an SQL script which defines a bunch of objects into a catalog >> table *right next to where they are properly defined*. There's just no >> sense in it that I can see, except that it happens to mimic what we do >> today- to no particular purpose. > The purpose is clear: so it is part of the database backup. It's a > fairly boring purpose, not fancy at all. But it is a purpose, indeed > *the* purpose. The point Stephen is making is that it's just as easy, and far more reliable, to dump the package-or-whatever-you-call-it by dumping the definitions of the contained objects, as to dump it by dumping the text blob it was originally created from. So I don't see a lot of merit to claiming that we need to keep the text blob for this purpose. We did it differently for extensions in part because you can't dump a .so as a SQL command, so dump-the-contained-objects wasn't going to be a complete backup strategy in any case. But for a package containing only SQL objects, that's not a problem. > We aim to have the simplest implementation that meets the stated need > and reasonable extrapolations of that. Text in a catalog table is the > simplest implementation. That is not a reason to reject it, especially > when we aren't suggesting a viable alternative. The first part of this assertion is debatable, and the claim that no viable alternative has been suggested is outright wrong. regards, tom lane
On 17 December 2013 23:42, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> We aim to have the simplest implementation that meets the stated need >> and reasonable extrapolations of that. Text in a catalog table is the >> simplest implementation. That is not a reason to reject it, especially >> when we aren't suggesting a viable alternative. > > The first part of this assertion is debatable, and the claim that no > viable alternative has been suggested is outright wrong. I just hadn't read about that myself. All I'd read was why this feature should be blocked. Sounds like we have a way forward for this feature then, just not with the current patch. Can someone attempt to summarise the way forward, with any caveats and necessary restrictions? It would save further column inches of debate. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Tom Lane <tgl@sss.pgh.pa.us> writes: > I keep telling you this, and it keeps not sinking in. How can you say that? I've been spending a couple of years on designing and implementing and arguing for a complete feature set where dealing with modules is avoided at all costs. The problem we have now is that I'm being told that the current feature is rejected if it includes anything about modules, and not interesting enough if it's not dealing with modules. I tried my best to make it so that nothing in-core changes wrt modules, while still finding out-of-core solutions to cope with them. It's a failure, ok. I think we need a conclusion on this thread: Extension specs are frozen. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Simon Riggs <simon@2ndQuadrant.com> writes: > On 17 December 2013 23:42, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> We aim to have the simplest implementation that meets the stated need >>> and reasonable extrapolations of that. Text in a catalog table is the >>> simplest implementation. That is not a reason to reject it, especially >>> when we aren't suggesting a viable alternative. >> >> The first part of this assertion is debatable, and the claim that no >> viable alternative has been suggested is outright wrong. With due respect, it's only wrong when you buy into implementing something new rather than improving extensions. > Sounds like we have a way forward for this feature then, just not with > the current patch. > > Can someone attempt to summarise the way forward, with any caveats and > necessary restrictions? It would save further column inches of debate. Here's my attempt:

# Inline Extension, Extension Templates

The problem with *Inline Extension* is the dump and restore policy: the contents of an extension are never to be found in a `pg_dump` script.

The problem with the *Extension Templates* is that we store the extension scripts (plain text blobs) in the catalogs, where we already have the full SQL objects and tools (such as `pg_dump` and `pg_depend`) to manipulate and introspect them.

# The new thing™

A set of SQL objects that can be managed wholesale, with a version string attached. The objects are part of `pg_dump` output, and the whole set can be relocatable.

Name:

  - not `PACKAGE`, which already means something else in Oracle
  - not `MODULE`, that's already the name of a .so file
  - not `SYSTEM`, already something else
  - `BUNDLE`
  - `LIBRARY`
  - `UNIT`

I'll pick UNIT here.

Commands:

  CREATE UNIT name [ SCHEMA ... ] [ [ NOT ] RELOCATABLE ] [ REQUIRE ...];

  WITH UNIT name;
  <commands>
  END UNIT name;

  ALTER UNIT name OWNER TO <role>;
  ALTER UNIT name ADD <object definition>;
  ALTER UNIT name DROP <object definition>;
  ALTER UNIT name SET SCHEMA <new schema>;
  ALTER UNIT name UPDATE TO <version string>;
  ALTER UNIT name SET [ NOT ] RELOCATABLE;
  ALTER UNIT name REQUIRE a, b, c;

  COMMENT ON UNIT name IS '';

  DROP UNIT name [ CASCADE ];

The `UPDATE TO` command only sets a new version string.

# Implementation details

We need a new `pg_unit` catalog that looks almost exactly like the `pg_extension` one, except for the `extconfig` and `extcondition` fields. We need a way to `recordDependencyOnCurrentUnit()`, so another pair of static variables `creating_unit` and `CurrentUnitObject`. Each and every command we do support for creating objects must be made aware of the new `UNIT` concept, including `CREATE EXTENSION`.

The `pg_dump` dependencies have to be set so that all the objects are restored independently first, as of today, and only then does it issue `CREATE UNIT` and a bunch of `ALTER UNIT ADD` commands, one per object.

# Event Trigger support

Event Triggers are to be provided for all the `UNIT` commands.

# Life with Extensions and Units

PostgreSQL now includes two different ways to package SQL objects, with about the same feature set. The only difference is the `pg_restore` behavior: *Extensions* are re-created from external resources, *Units* are re-created from what's in the dump.

The smarts about `ALTER EXTENSION ... UPDATE` are not available when dealing with *UNITs*, leaving the user or the client scripts to take care of that entirely on their own.

In principle, a client can prepare a SQL script from a PGXN distribution and apply it surrounded by `WITH UNIT` and `END UNIT` commands. 
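[A hypothetical session using the proposed syntax; none of this is implemented:]

  CREATE UNIT pair SCHEMA public NOT RELOCATABLE;

  WITH UNIT pair;
      CREATE TYPE pair AS (k text, v text);
      CREATE FUNCTION pair(text, text) RETURNS pair
          LANGUAGE sql AS 'SELECT ROW($1, $2)::pair';
  END UNIT pair;

  ALTER UNIT pair UPDATE TO '1.0';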
Upgrade scripts, once identified, can be run as straight SQL, adding a simple `ALTER UNIT ... UPDATE TO ...` command before the `COMMIT` at the end of the script. Identifying the upgrade script(s) may require implementing the current Extension update smarts into whatever client-side program is going to be built to support installing from PGXN etc.

# Conclusion

The main advantage of the `UNIT` proposal is that it copes very well with relations and other usual schema objects, as the data are preserved at `pg_restore` time.

A `UNIT` can also entirely replace an `EXTENSION`, including when it needs a *module*, provided that the *module* is made available on the server's file system before creating the functions in `LANGUAGE C` that depend on it.

It is possible to write a *UNIT distribution network* where client software drives the installation of SQL objects within a UNIT, and this client software needs to include UNIT update smarts too. It's also possible to build that software as a set of Event Triggers on the `CREATE UNIT` and `ALTER UNIT UPDATE TO` commands.

# Analysis

The main drawback is that rather than building on extensions, both in a technical way and in building user trust, we are basically going to deprecate extensions entirely, giving them a new name and an incompatible way to manage them. Only *contribs* are going to be shipped as extensions, as they are basically the only known extensions following the same delivery rules as the PostgreSQL core product itself.

Almost any other extension existing today builds support for all the PostgreSQL releases in each version of it, meaning that the peculiarities of `pg_dump` and `pg_restore` are not going to apply to a `UNIT` in the same way at all.

Basically with building `UNIT` we realise with hindsight that we failed to build a proper `EXTENSION` system, and we send that message to our users.

-- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
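[The client-driven upgrade described above might then look like this, with the script body elided:]

  BEGIN;
  -- ... contents of the 1.0 to 1.1 upgrade script, run as-is ...
  ALTER UNIT pair UPDATE TO '1.1';
  COMMIT;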
* Dimitri Fontaine (dimitri@2ndQuadrant.fr) wrote: > Here's my attempt: > > # Inline Extension, Extension Templates > > The problem with *Inline Extension* is the dump and restore policy: the > contents of an extension are never to be found in a `pg_dump` script. You keep coming back to this and I think you're taking too narrow a view of the comments made on the prior threads. No, we don't really want extensions which have .sql files out on disk somewhere as part of them to be dumped out through pg_dump because then it becomes unclear which set of scripts should be used during restore. What we're talking about here is intended to not have that issue by using a different namespace, a flag, something which identifies these extensions as being defined through the catalog instead. > # The new thing™ > > A set of SQL objects that can be managed wholesale, with a version string > attached. The objects are part of `pg_dump` output, and the whole set can > be relocatable. I'd like to see more than just a single version string included and I think that'd be beneficial for extensions too. > Name: [...] > I'll pick UNIT here. We can figure that out later. > Commands: > > CREATE UNIT name [ SCHEMA ... ] [ [ NOT ] RELOCATABLE ] [ REQUIRE ...]; > > WITH UNIT name; > <commands> > END UNIT name; Interesting approach- I had considered something similar by having a 'fake' schema created into which you built up the 'UNIT'. The reason I was thinking schema instead of begin/end style commands, as you have above, is because of questions around transactions. Do you think the syntax you have here would require the definition to be all inside of a single transaction? Do we feel that would even be an issue or perhaps that it *should* be done that way? I don't currently have any strong feelings one way or the other on this and I'm curious what others think. > The `UPDATE TO` command only sets a new version string. So, one of the things I had been wondering about is if we could provide a 'diff' tool. Using your 'WITH UNIT' syntax above, an author might need to only write their initial implementation, build up a 'UNIT' based on it, then adjust that implementation with another 'WITH UNIT' clause and then ask PG for the differences. It's not clear if we could make that work but there is definitely a set of desirable capabilities out there, which some other databases have, around automated upgrade script building and doing schema differences. > # Implementation details > # Event Trigger support Not sure we're really ready to get into these yet. > The main drawback is that rather than building on extensions, both in a > technical way and in building user trust, we are basically going to > deprecate extensions entirely, giving them a new name and an incompatible > way to manage them. I don't see this as ending up deprecating extensions, even if we build a new thing with a new name. I would argue that properly supported extensions, such as those in contrib and the other 'main' ones, like PostGIS, and others that have any external dependencies (eg: FDWs) would almost certainly continue as extensions and would be packaged through the normal OS packaging systems. While you plan to use the event trigger mechanism to build something on top of this which tries to act like extensions-but-not, I think that's an extremely narrow and limited use-case that very few people would have any interest in or use. 
> Basically with building `UNIT` we realise with hindsight that we failed to > build a proper `EXTENSION` system, and we send that message to our users. Little difficult to draw conclusions about what our 'hindsight' will look like. Thanks, Stephen
Stephen Frost escribió: > * Dimitri Fontaine (dimitri@2ndQuadrant.fr) wrote: > > Basically with building `UNIT` we realise with hindsight that we failed to > > build a proper `EXTENSION` system, and we send that message to our users. > > Little difficult to draw conclusions about what our 'hindsight' will > look like. I haven't been paying very close attention to this, but I fail to see why extensions are so much of a failure. Surely we can invent a new "kind" of extensions, ones whose contents specifically are dumped by pg_dump. Regular extensions, the kind we have today, still wouldn't, but we could have a flag, say "CREATE EXTENSION ... (WITH DUMP)" or something. That way you don't have to come up with UNIT at all (or whatever). A whole new set of catalogs just to fix up a minor issue with extensions sounds a bit too much to me; we can just add this new thing on top of the existing infrastructure. I didn't much like the WITH UNIT/END UNIT thingy. What's wrong with CREATE foo; ALTER EXTENSION ADD foo? There's a bit of a problem in that if you create the object and die before being able to add it to the extension, it would linger unreferenced; but that's easily fixable by doing the creation in a transaction, I think. (Alternatively, we could have a single command that creates the extension and the contained objects in one fell swoop, similar to how CREATE SCHEMA can do it; but I'm not sure that's all that much better, and from a grammar POV it probably sucks.) -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
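[A sketch of the alternative described above; the WITH DUMP flag is hypothetical, while ALTER EXTENSION ... ADD already exists:]

  BEGIN;
  CREATE EXTENSION my_feature (WITH DUMP);  -- hypothetical option
  CREATE FUNCTION my_func() RETURNS int LANGUAGE sql AS 'SELECT 42';
  ALTER EXTENSION my_feature ADD FUNCTION my_func();
  COMMIT;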
On 12/18/13, 4:22 AM, Dimitri Fontaine wrote: > ALTER UNIT name SET SCHEMA <new schema>; FWIW, with the "units" that we've developed we use schemas to differentiate between public objects and "internal" (private or protected) objects. So single-schema stuff becomes a PITA. Of course, since extensions already work that way I suppose that ship has sailed, but I thought I'd mention it. For those who are curious... we make the distinction of public/protected/private via schemas because we don't want general users to need to wade through that stuff when looking at objects. So the convention we settled on is that public objects go in one schema, protected objects go in a schema of the same name that's prepended with "_", and private objects are in the protected schema but also prepend "_" to their names. IE:

  CREATE SCHEMA awesome_feature;
  CREATE VIEW awesome_feature.have_some_data ...;

  CREATE SCHEMA _awesome_feature;  -- Protected / private stuff
  CREATE VIEW _awesome_feature.stuff_for_database_code_to_see_but_not_users ...;
  CREATE FUNCTION _awesome_feature._do_not_run_this_function_anywhere_outside_of_awesome_feature() ...;

-- Jim C. Nasby, Data Architect jim@nasby.net 512.569.9461 (cell) http://jim.nasby.net
Yeah I think this whole discussion is happening at the wrong level. The problem you're having, despite appearances, is not that people disagree about the best way to accomplish your goals. The problem is that not everyone is convinced your goals are a good idea. Either they just don't understand the goals or they do understand them but don't agree that they're a good idea. Personally I'm in the former category and would welcome a detailed explanation of the goals of the feature and what use cases those goals enable. I think Tom is in the latter category and needs a very good argument for why those goals are important enough to outweigh the downsides. I don't think loading shared libraries from RAM or a temp download directory is a *complete* show stopper the way Tom says but it would require a pretty compelling use case to make it worth the difficulties it would cause. -- greg
On Wed, Dec 18, 2013 at 10:05 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Stephen Frost escribió: >> * Dimitri Fontaine (dimitri@2ndQuadrant.fr) wrote: > >> > Basically with building `UNIT` we realise with hindsight that we failed to >> > build a proper `EXTENSION` system, and we send that message to our users. >> >> Little difficult to draw conclusions about what our 'hindsight' will >> look like. > > I haven't been paying very close attention to this, but I fail to see > why extensions are so much of a failure. Surely we can invent a new > "kind" of extensions, ones whose contents specifically are dumped by > pg_dump. Regular extensions, the kind we have today, still wouldn't, > but we could have a flag, say "CREATE EXTENSION ... (WITH DUMP)" or > something. That way you don't have to come up with UNIT at all (or > whatever). A whole new set of catalogs just to fix up a minor issue > with extensions sounds a bit too much to me; we can just add this new > thing on top of the existing infrastructure. Yep. I'm not very convinced that extensions are a failure. I've certainly had plenty of good experiences with them, and I think others have as well, so I believe Dimitri's allegation that we've somehow failed here is overstated. That having been said, having a flag we can set to dump the extension contents normally rather than just dumping a CREATE EXTENSION statement seems completely reasonable to me. ALTER EXTENSION foo SET (dump_members = true/false); It's even got use cases outside of what Dimitri wants to do, like dumping and restoring an extension that you've manually modified without losing your changes. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
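[The proposed flag in use, e.g. to preserve manual additions to an installed extension across dump and restore; the dump_members option is hypothetical, the rest is existing syntax:]

  CREATE EXTENSION hstore;

  -- a local addition to the extension
  CREATE FUNCTION hstore_keys_sorted(h hstore) RETURNS text[]
      LANGUAGE sql AS 'SELECT ARRAY(SELECT key FROM each($1) ORDER BY key)';
  ALTER EXTENSION hstore ADD FUNCTION hstore_keys_sorted(hstore);

  -- hypothetical: pg_dump would now emit the member objects themselves
  ALTER EXTENSION hstore SET (dump_members = true);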
On 12/19/2013 08:01 AM, Robert Haas wrote: > On Wed, Dec 18, 2013 at 10:05 AM, Alvaro Herrera > <alvherre@2ndquadrant.com> wrote: >> Stephen Frost escribió: >>> * Dimitri Fontaine (dimitri@2ndQuadrant.fr) wrote: >>>> Basically with building `UNIT` we realise with hindsight that we failed to >>>> build a proper `EXTENSION` system, and we send that message to our users. >>> Little difficult to draw conclusions about what our 'hindsight' will >>> look like. >> I haven't been paying very close attention to this, but I fail to see >> why extensions are so much of a failure. Surely we can invent a new >> "kind" of extensions, ones whose contents specifically are dumped by >> pg_dump. Regular extensions, the kind we have today, still wouldn't, >> but we could have a flag, say "CREATE EXTENSION ... (WITH DUMP)" or >> something. That way you don't have to come up with UNIT at all (or >> whatever). A whole new set of catalogs just to fix up a minor issue >> with extensions sounds a bit too much to me; we can just add this new >> thing on top of the existing infrastructure. > Yep. > > I'm not very convinced that extensions are a failure. I've certainly > had plenty of good experiences with them, and I think others have as > well, so I believe Dimitri's allegation that we've somehow failed here > is overstated. Indeed. There might be limitations, but what we have is VERY useful. Let's keep things in proportion here. > That having been said, having a flag we can set to > dump the extension contents normally rather than just dumping a CREATE > EXTENSION statement seems completely reasonable to me. > > ALTER EXTENSION foo SET (dump_members = true/false); > > It's even got use cases outside of what Dimitri wants to do, like > dumping and restoring an extension that you've manually modified > without losing your changes. > Yeah, seems like it might have merit. cheers andrew
On Thursday 19 December 2013 14:01:17, Robert Haas wrote: > On Wed, Dec 18, 2013 at 10:05 AM, Alvaro Herrera > > <alvherre@2ndquadrant.com> wrote: > > Stephen Frost escribió: > >> * Dimitri Fontaine (dimitri@2ndQuadrant.fr) wrote: > >> > Basically with building `UNIT` we realise with hindsight that we > >> > failed to build a proper `EXTENSION` system, and we send that message > >> > to our users. > >> > >> Little difficult to draw conclusions about what our 'hindsight' will > >> look like. > > > > I haven't been paying very close attention to this, but I fail to see > > why extensions are so much of a failure. Surely we can invent a new > > "kind" of extensions, ones whose contents specifically are dumped by > > pg_dump. Regular extensions, the kind we have today, still wouldn't, > > but we could have a flag, say "CREATE EXTENSION ... (WITH DUMP)" or > > something. That way you don't have to come up with UNIT at all (or > > whatever). A whole new set of catalogs just to fix up a minor issue > > with extensions sounds a bit too much to me; we can just add this new > > thing on top of the existing infrastructure. > > Yep. > > I'm not very convinced that extensions are a failure. I've certainly > had plenty of good experiences with them, and I think others have as > well, so I believe Dimitri's allegation that we've somehow failed here > is overstated. That having been said, having a flag we can set to > dump the extension contents normally rather than just dumping a CREATE > EXTENSION statement seems completely reasonable to me. > > ALTER EXTENSION foo SET (dump_members = true/false); > > It's even got use cases outside of what Dimitri wants to do, like > dumping and restoring an extension that you've manually modified > without losing your changes. Isn't there some raw SQL extension authors are supposed to be able to use in order to dump partial configuration tables and similar things (well, what we're supposed to be able to change in an extension)? Yes, there is:

  SELECT pg_catalog.pg_extension_config_dump('my_config', 'WHERE NOT standard_entry');

(it is raw SQL here, but it is not appreciated for Extension 'Templates'.... I stopped trying to figure out/understand many arguments in those Extension email threads) Maybe something around that to have the objects created by an extension dumped as well, and we're done. I even wonder if Dimitri already has a patch for that based on the work done for the Extensions feature. -- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation
* Andrew Dunstan (andrew@dunslane.net) wrote: > >That having been said, having a flag we can set to > >dump the extension contents normally rather than just dumping a CREATE > >EXTENSION statement seems completely reasonable to me. > > > >ALTER EXTENSION foo SET (dump_members = true/false); > > > >It's even got use cases outside of what Dimitri wants to do, like > >dumping and restoring an extension that you've manually modified > >without losing your changes. > > Yeah, seems like it might have merit. I like the simplicity of this approach as well, but I believe Tom had concerns about having some extensions behave quite differently from others (hence the earlier suggestion to name the 'dumpable' ones something different). That said, I'm starting to wonder about a few different options that might be handy- having the extension be dumpable (or maybe an option to pg_dump to dump them from the DB, or not), and perhaps an option to have the version # included in the dump (or an option to exclude it, such as when run by pg_upgrade..?). Perhaps similar things for pg_restore. In any case, this is certainly the way I had been hoping the discussion would go.. Thanks, Stephen
Stephen Frost <sfrost@snowman.net> writes: > That said, I'm starting to wonder about a few > different options that might be handy- having the extension be dumpable > (or maybe an option to pg_dump to dump them from the DB, or not), and > perhaps an option to have the version # included in the dump (or an > option to exclude it, such as when run by pg_upgrade..?). Perhaps > similar things for pg_restore. > > In any case, this is certainly the way I had been hoping the discussion > would go.. http://www.postgresql.org/message-id/18778.1354753982@sss.pgh.pa.us -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Dimitri, * Dimitri Fontaine (dimitri@2ndQuadrant.fr) wrote: > Stephen Frost <sfrost@snowman.net> writes: > > That said, I'm starting to wonder about a few > > different options that might be handy- having the extension be dumpable > > (or maybe an option to pg_dump to dump them from the DB, or not), and > > perhaps an option to have the version # included in the dump (or an > > option to exclude it, such as when run by pg_upgrade..?). Perhaps > > similar things for pg_restore. > > > > In any case, this is certainly the way I had been hoping the discussion > > would go.. > > http://www.postgresql.org/message-id/18778.1354753982@sss.pgh.pa.us If you'd like to add to the discussion, then please do so. This isn't adding anything. Tom brings up good points in that thread from last year but my suggestion was about having a few options *in addition* to keeping track of how extensions were installed and using that as the default. Thanks, Stephen
On Sat, Dec 21, 2013 at 12:10 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Stephen Frost <sfrost@snowman.net> writes: >> That said, I'm starting to wonder about a few >> different options that might be handy- having the extension be dumpable >> (or maybe an option to pg_dump to dump them from the DB, or not), and >> perhaps an option to have the version # included in the dump (or an >> option to exclude it, such as when run by pg_upgrade..?). Perhaps >> similar things for pg_restore. >> >> In any case, this is certainly the way I had been hoping the discussion >> would go.. > > http://www.postgresql.org/message-id/18778.1354753982@sss.pgh.pa.us Fortunately, nobody's proposing that exact design, and I think there are more recent emails where Tom expressed at least some support for the idea of installing an extension purely via SQL, and in fact backed the idea of being able to dump-and-restore the extension members as superior to storing blobs in the catalog. If you want to go beat your head against the wall, I don't blame you, but it's not going to help us make any progress here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Sat, Dec 21, 2013 at 12:10 PM, Dimitri Fontaine > <dimitri@2ndquadrant.fr> wrote: >> Stephen Frost <sfrost@snowman.net> writes: >>> That said, I'm starting to wonder about a few >>> different options that might be handy- having the extension be dumpable >>> (or maybe an option to pg_dump to dump them from the DB, or not), and >>> perhaps an option to have the version # included in the dump (or an >>> option to exclude it, such as when run by pg_upgrade..?). Perhaps >>> similar things for pg_restore. >>> >>> In any case, this is certainly the way I had been hoping the discussion >>> would go.. >> >> http://www.postgresql.org/message-id/18778.1354753982@sss.pgh.pa.us > Fortunately, nobody's proposing that exact design, and I think there > are more recent emails where Tom expressed at least some support for > the idea of installing an extension purely via SQL, and in fact backed > the idea of being able to dump-and-restore the extension members as > superior to storing blobs in the catalog. AFAICT, what I was complaining about there was the idea that the per-extension behavior had to be specified via switches to pg_dump in order to get a valid dump. That doesn't seem too workable --- you think your nightly backup script will know that? But the idea that it's an alterable property of each extension, *stored in the database*, does not fall foul of that complaint. regards, tom lane