Thread: Inline Extension
Hi,

The extension mechanism we added in 9.1 is aimed at allowing fully integrated contrib management, which was a big enough goal to preclude doing anything else in its first release. Now that we have it, we can think some more about what features we want covered, and a pretty obvious one that's been left out is the ability to define and update an extension without resorting to file system support, for those extensions that do not need a shared object library. We could have called those “SQL only” extensions, but to simplify the grammar support I used the “inline” keyword, so there we go.

Please find attached a WIP patch implementing that. Note that the main core benefit of integrating this feature is the ability to easily add regression tests for extension related features. That is not done yet in the attached patch.

I'm sending this quite soon because of the pg_dump support. When an extension is inline, we want to dump its content, as we currently do in the binary dump output. I had in mind that we could output a full CREATE EXTENSION INLINE script in between some dollar-quoting, rather than adding each extension's object with an ALTER EXTENSION ... ADD line the way the pg_upgrade compatibility code currently does. It seems like much more work though, and I'd appreciate input about how exactly to do that (it looks like making pg_dump reentrant, somehow). Or some reason not to bother, and to just rename and share the binary upgrade facility that Bruce already put into pg_dump.

Here's a usage example that will certainly end up in the docs:

  create extension pair inline version '1.0' schema pair not relocatable
  as $pair$
    CREATE TYPE pair AS ( k text, v text );

    CREATE OR REPLACE FUNCTION pair(anyelement, text)
    RETURNS pair LANGUAGE SQL AS 'SELECT ROW($1, $2)::pair';

    CREATE OR REPLACE FUNCTION pair(text, anyelement)
    RETURNS pair LANGUAGE SQL AS 'SELECT ROW($1, $2)::pair';

    CREATE OR REPLACE FUNCTION pair(anyelement, anyelement)
    RETURNS pair LANGUAGE SQL AS 'SELECT ROW($1, $2)::pair';

    CREATE OR REPLACE FUNCTION pair(text, text)
    RETURNS pair LANGUAGE SQL AS 'SELECT ROW($1, $2)::pair;';

    CREATE OPERATOR ~> (LEFTARG = text, RIGHTARG = anyelement, PROCEDURE = pair);
    CREATE OPERATOR ~> (LEFTARG = anyelement, RIGHTARG = text, PROCEDURE = pair);
    CREATE OPERATOR ~> (LEFTARG = anyelement, RIGHTARG = anyelement, PROCEDURE = pair);
    CREATE OPERATOR ~> (LEFTARG = text, RIGHTARG = text, PROCEDURE = pair);
  $pair$;

  alter extension pair update from '1.0' to '1.1'
  with $pair$
    CREATE OR REPLACE FUNCTION key(pair)
    RETURNS text LANGUAGE SQL AS 'SELECT ($1).k;';

    CREATE OR REPLACE FUNCTION value(pair)
    RETURNS text LANGUAGE SQL AS 'SELECT ($1).v;';

    CREATE OPERATOR %% (RIGHTARG = pair, PROCEDURE = key);
    CREATE OPERATOR %# (RIGHTARG = pair, PROCEDURE = value);
  $pair$;

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
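For concreteness, the dump output Dimitri describes might look roughly like the following sketch, which simply reuses the proposed CREATE EXTENSION INLINE syntax at the extension's current version (the exact dump format is precisely what's being asked about, so this is illustrative only):

  -- Hypothetical pg_dump output for an inline extension: one dollar-quoted
  -- script at the current version, rather than an ALTER EXTENSION ... ADD
  -- line per member object.
  CREATE EXTENSION pair INLINE VERSION '1.1' SCHEMA pair NOT RELOCATABLE
  AS $pair$
    CREATE TYPE pair AS ( k text, v text );
    -- ... remaining objects as of version 1.1 ...
  $pair$;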
On 08.01.2012 22:36, Dimitri Fontaine wrote:
> The extension mechanism we added in 9.1 is aimed at allowing fully integrated contrib management, which was a big enough goal to preclude doing anything else in its first release.

Hooray!

> Now that we have it, we can think some more about what features we want covered, and a pretty obvious one that's been left out is the ability to define and update an extension without resorting to file system support, for those extensions that do not need a shared object library. We could have called those “SQL only” extensions, but to simplify the grammar support I used the “inline” keyword, so there we go.

Frankly, I don't see the point of this. If the extension is an independent piece of (SQL) code, developed separately from an application, with its own lifecycle, a .sql file seems like the best way to distribute it. If it's not, i.e. if it's an integral part of the database schema, then why package it as an extension in the first place?

> Please find attached a WIP patch implementing that. Note that the main core benefit of integrating this feature is the ability to easily add regression tests for extension related features. That is not done yet in the attached patch.

I'm not sure I buy that argument. These inline extensions are sufficiently different from regular extensions that I think you'd need to have regression tests for both kinds anyway.

> I'm sending this quite soon because of the pg_dump support. When an extension is inline, we want to dump its content, as we currently do in the binary dump output. I had in mind that we could output a full CREATE EXTENSION INLINE script in between some dollar-quoting, rather than adding each extension's object with an ALTER EXTENSION ... ADD line the way the pg_upgrade compatibility code currently does.

I thought the main point of extensions is that they're not included in pg_dump. Again, if the extension is an integral part of the database, then it probably shouldn't be an extension in the first place.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> Frankly, I don't see the point of this. If the extension is an independent piece of (SQL) code, developed separately from an application, with its own lifecycle, a .sql file seems like the best way to distribute it. If it's not, i.e. if it's an integral part of the database schema, then why package it as an extension in the first place?

It makes it easy to deploy an extension to N databases (my current use case has 256 databases) and to know which version is installed on each server. It's easier to QA your procedures and upgrades when they are packaged as extensions, too.

Now, about the dependency on a SQL file hosting the content: it's easier to just connect to the databases and send them the script in the SQL command rather than deploying a set of files. Deploying files means OS level packaging, either RPM or debian or some other variant, or some other means of easily distributing the files. An SQL connection is all you need if you're not shipping a .so.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> Frankly, I don't see the point of this. If the extension is an independent piece of (SQL) code, developed separately from an application, with its own lifecycle, a .sql file seems like the best way to distribute it. If it's not, i.e. if it's an integral part of the database schema, then why package it as an extension in the first place?

> It makes it easy to deploy an extension to N databases (my current use case has 256 databases) and to know which version is installed on each server. It's easier to QA your procedures and upgrades when they are packaged as extensions, too.

> Now, about the dependency on a SQL file hosting the content: it's easier to just connect to the databases and send them the script in the SQL command rather than deploying a set of files. Deploying files means OS level packaging, either RPM or debian or some other variant, or some other means of easily distributing the files. An SQL connection is all you need if you're not shipping a .so.

I'm with Heikki on not believing that this is a good idea. If you are trying to do careful versioning of a set of object definitions, you want to stick the things in a file; you don't want them just flying by in submitted SQL. Also, a large part of the point of the extension facility is to be able to do uninstall/reinstall and version upgrades/downgrades, none of which are possible unless the extension scripts are stored somewhere.

ISTM your distribution concern would be much better addressed by installing contrib/adminpack and then just using pg_file_write() to put the new extension script into the remote server's library.

regards, tom lane
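A minimal sketch of the adminpack route Tom suggests, for a SQL-only extension (note the hand-waving: adminpack's pg_file_write() only accepts paths the server permits, normally under the data directory, so being able to reach the extension script directory at all is an assumption here, and the paths shown are illustrative):

  CREATE EXTENSION adminpack;

  -- Push the control file and the extension script over the wire.
  SELECT pg_file_write('extension/pair.control',
                       $c$ default_version = '1.0' $c$,
                       false);  -- false = overwrite rather than append
  SELECT pg_file_write('extension/pair--1.0.sql',
                       $s$ CREATE TYPE pair AS (k text, v text); $s$,
                       false);

After that, CREATE EXTENSION pair would work on that server with no ssh access, which is the distribution problem Dimitri is trying to solve.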
On Jan 19, 2012, at 7:21 AM, Dimitri Fontaine wrote:
> Now, about the dependency on a SQL file hosting the content: it's easier to just connect to the databases and send them the script in the SQL command rather than deploying a set of files. Deploying files means OS level packaging, either RPM or debian or some other variant, or some other means of easily distributing the files. An SQL connection is all you need if you're not shipping a .so.

ISTM that if you are managing 256 servers, you’re likely already using a packaging system for the deployment of application dependencies. In which case, to keep things consistent, you ought to distribute your extensions in exactly the same way.

Best,
David
Tom Lane <tgl@sss.pgh.pa.us> writes:
> I'm with Heikki on not believing that this is a good idea. If you are trying to do careful versioning of a set of object definitions, you want to stick the things in a file; you don't want them just flying by in submitted SQL.

I'm trying to open the extension facilities (versions being the first of them, think \dx) to application PL code, and to hosted environments where you're not granted access to the server's file system.

I would agree that the use case doesn't exist if the target is traditional in-house deployments where the sys admins are your colleagues. I've been told that's a smaller and smaller part of the database world, though.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Thu, Jan 19, 2012 at 3:42 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> I'm with Heikki on not believing that this is a good idea. If you are trying to do careful versioning of a set of object definitions, you want to stick the things in a file; you don't want them just flying by in submitted SQL.
>
> I'm trying to open the extension facilities (versions being the first of them, think \dx) to application PL code, and to hosted environments where you're not granted access to the server's file system.

I guess the question is: for what purpose?

As you recognized in your original email, if the extension is inline, then the objects will need to be dumped, because a simple CREATE EXTENSION command is bound to fail. But my understanding was that a major part of the reason - if not the entire reason - was to get pg_dump to emit CREATE EXTENSION bob instead of the component SQL commands. If we take that away, what's the remaining benefit of packaging those objects inside an extension instead of just dumping them "loose" into the database?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Jan 19, 2012 at 3:42 PM, Dimitri Fontaine
>> I'm trying to open the extension facilities (versions being the first of them, think \dx) to application PL code, and to hosted environments where you're not granted access to the server's file system.

> I guess the question is: for what purpose?

> As you recognized in your original email, if the extension is inline, then the objects will need to be dumped, because a simple CREATE EXTENSION command is bound to fail. But my understanding was that a major part of the reason - if not the entire reason - was to get pg_dump to emit CREATE EXTENSION bob instead of the component SQL commands. If we take that away, what's the remaining benefit of packaging those objects inside an extension instead of just dumping them "loose" into the database?

Indeed, it seems like such a thing is not an extension at all anymore, or at least it gives up many of the useful properties of extensions.

Given the entire lack of demand from the field for such a cut-down concept of extension, I think we should not be in a hurry to introduce it. Maybe in a year or two, when we have a clearer idea of how people are actually using extensions, there will be a better argument for it. Right now I'm afraid that we might foreclose our options for other future extension features because these things would be incompatible with such ideas.

regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I guess the question is: for what purpose?
>
> Indeed, it seems like such a thing is not an extension at all anymore, or at least it gives up many of the useful properties of extensions.

I'm thinking that a common name and version number, tracked in the database for a set of related functions (that usually form an API), is a useful enough property that extensions should support more use cases than contrib-like “module centric” extensions (meaning, C coded and shipped with a .so).

> Given the entire lack of demand from the field for such a cut-down concept of extension, I think we should not be in a hurry to introduce it. Maybe in a year or two, when we have a clearer idea of how people are actually using extensions, there will be a better argument for it.

Fair enough, I guess (or at least I'm beginning to understand how alone I am here); let's hear from the field first.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Fri, Jan 20, 2012 at 8:52 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> I guess the question is: for what purpose?
>>
>> Indeed, it seems like such a thing is not an extension at all anymore, or at least it gives up many of the useful properties of extensions.
>
> I'm thinking that a common name and version number, tracked in the database for a set of related functions (that usually form an API), is a useful enough property that extensions should support more use cases than contrib-like “module centric” extensions (meaning, C coded and shipped with a .so).

I see that there is some usefulness there, but I'm not sure that this is the best way to get our hands around it. For one thing, people can and do handle schema versioning and schema upgrade scripts entirely in userland. My last implementation worked by keeping a schema_versions table on the server with one column, a UUID. The deployment tarball contained a file with a list of UUIDs in it, each one associated with an SQL script. At install time, the install script ran through that file in order, ran any scripts whose UUID didn't yet appear in the table, and then added the UUIDs of the run scripts to the table. This might not be what any given person wants, but there's a lot of flexibility to do things like that without any particular support from the DBMS. (Incidentally, this experience is what convinced me that CREATE TABLE IF NOT EXISTS and ALTER TABLE IF EXISTS are good things to have; my system could have been a lot simpler if I'd had those.)

One random design idea I had, related to providing this functionality in the DB core, is to have a command that creates an empty extension, maybe just "CREATE EXTENSION foo EMPTY", and an ALTER command that forcibly changes the DB's notion of what version is installed, like "ALTER EXTENSION foo FORCE VERSION TO '1.1'". That would allow the same sort of thing you're after here by using those two features plus ALTER EXTENSION ADD/DROP, and could also be used to do other things. For example, suppose you have DB servers A and B. A is running an old version of some extension and is in a shared hosting environment where you can't get access to the box. B, on the other hand, is your brand-new, dedicated server. You could upgrade the extension on the old machine "manually", by issuing SQL commands to forcibly change its state, and then do a dump and reload onto B. This might be useful if, for example, B is also running a newer DB server version that won't support the very old version of the extension running on A. This is probably an unusual situation, but maybe there's some value in allowing users who want to do such things a cleaner way to do it than direct catalog hackery.

Anyway, I'm just thinking out loud here - I don't actually have a very strong feeling that I know what all of the solutions are in this area, or even all the problems. I'm interested in hearing about your experiences with the system, and other people's, because I certainly do agree that there's room for improvement. One of my personal pet peeves is that the system doesn't know how to do an install of v1.1 by running the v1.0 script followed by the 1.0-1.1 upgrade script, which I fear is going to lead to a rapid uptick in the number of copies of almost-identical scripts in our git repository.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
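A minimal sketch of the userland scheme Robert describes, assuming a one-column tracking table and a client-side installer loop (table and column names are illustrative, not from his actual system):

  -- Run once per database; IF NOT EXISTS keeps the installer idempotent.
  CREATE TABLE IF NOT EXISTS schema_versions (script_uuid uuid PRIMARY KEY);

  -- For each (uuid, script) pair listed in the deployment manifest, the
  -- installer checks whether the script already ran:
  --   SELECT 1 FROM schema_versions WHERE script_uuid = $1;
  -- and, if no row comes back, runs the script and then records it:
  --   INSERT INTO schema_versions (script_uuid) VALUES ($1);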
Robert Haas <robertmhaas@gmail.com> writes:
> peeves is that the system doesn't know how to do an install of v1.1 by running the v1.0 script followed by the 1.0-1.1 upgrade script, which

Did you try

  CREATE EXTENSION foo FROM 1.0;

Maybe you would want the system to be able to determine the oldest version to start from in order to reach the current default_version given in the control file, but I guess it would be better to add another property like default_full_version or such (last_stop?). It looks like a simple enough project, and the fact that it helps our own shipping and script maintenance could maybe allow us to work on it this late in the cycle, dunno.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
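In control-file terms, the property Dimitri floats might look like this sketch (default_full_version is hypothetical and under discussion here; only default_version and the other documented properties exist today):

  # foo.control -- with a hypothetical default_full_version property
  default_version = '1.1'       # version reported after installation
  default_full_version = '1.0'  # run foo--1.0.sql, then foo--1.0--1.1.sql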
On Fri, Jan 20, 2012 at 11:29 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> peeves is that the system doesn't know how to do an install of v1.1 by running the v1.0 script followed by the 1.0-1.1 upgrade script, which
>
> Did you try
>
>   CREATE EXTENSION foo FROM 1.0;

Well, yes, that works, but I'm going for what you wrote next:

> Maybe you would want the system to be able to determine the oldest version to start from in order to reach the current default_version given in the control file, but I guess it would be better to add another property like default_full_version or such (last_stop?).

Possibly that might be a better design, yes. Especially since we don't order version numbers intrinsically.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
>> Maybe you would want the system to be able to determine the oldest version to start from in order to reach the current default_version given in the control file, but I guess it would be better to add another property like default_full_version or such (last_stop?).
>
> Possibly that might be a better design, yes. Especially since we don't order version numbers intrinsically.

It's quite simple to implement too, see attached. As for the version number ordering, that's a giant rat hole I don't want to be digging in, and I think we can bypass that problem entirely.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Thu, Jan 19, 2012 at 8:15 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>>> Frankly, I don't see the point of this. If the extension is an independent piece of (SQL) code, developed separately from an application, with its own lifecycle, a .sql file seems like the best way to distribute it. If it's not, i.e. if it's an integral part of the database schema, then why package it as an extension in the first place?
> I'm with Heikki on not believing that this is a good idea. If you are trying to do careful versioning of a set of object definitions, you want to stick the things in a file; you don't want them just flying by in submitted SQL. Also, a large part of the point of the extension facility is to be able to do uninstall/reinstall and version upgrades/downgrades, none of which are possible unless the extension scripts are stored somewhere.
>
> ISTM your distribution concern would be much better addressed by installing contrib/adminpack and then just using pg_file_write() to put the new extension script into the remote server's library.

I think this is somewhat rube-goldberg-esque, and it denies non-superuser roles the ability to get more version management of schemas and operators. As-is, many organizations are submitting "migrations" via plain SQL that include committing to a version management table maintained by convention, and as-is that is considered a modern-day best practice.

Unless someone comes up with something fundamentally new in how to handle the problem being solved in user-land via 'migrations' and version tables in Postgres, I think tying completely safe, sandboxed, non-superuser operators to superuser-only (and awkward besides) file management via FEBE on the server side is not going to do anything but redelegate the problem of soundness to every migration/change-management regime, and decrease reuse of useful operators that do not require superuser.

The ship has sailed. Encouraging use of files and .sql buys no soundness, because everyone is moving towards overlaying version management via pure FEBE anyway. At best, this means to me that Postgres is completely neutral about what version management regime you want to use for operators and schemas. At worst, it means Postgres is frustratingly unhelpful in common use cases. And somewhere in the middle is "this may have value, but not enough to be worth maintaining." Given the lack of fast and widely deployed trusted PL integrations, my interpretation of the truth falls somewhere between the latter two interpretations.

--
fdr
On 21.01.2012 00:00, Daniel Farina wrote:
> I think this is somewhat rube-goldberg-esque, and it denies non-superuser roles the ability to get more version management of schemas and operators. As-is, many organizations are submitting "migrations" via plain SQL that include committing to a version management table maintained by convention, and as-is that is considered a modern-day best practice.

Even if you give the version number in the CREATE EXTENSION command, it's only by convention that people actually maintain a sane versioning policy. If people don't take version management seriously, you will quickly end up with five different versions of an extension, all with version number 0.1.

Another approach is to use comments on the objects saying "version 1.23". Those generally move together with the objects themselves; they are included in a pg_dump schema-only dump, for example, while the contents of a table are not.

> The ship has sailed. Encouraging use of files and .sql buys no soundness, because everyone is moving towards overlaying version management via pure FEBE anyway.

What is FEBE?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
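A minimal sketch of the comment convention Heikki mentions, borrowing object names from the pair example (the version string format is purely a convention):

  COMMENT ON TYPE pair IS 'pair extension, version 1.23';
  COMMENT ON FUNCTION pair(text, text) IS 'pair extension, version 1.23';

Because comments travel with the objects, pg_dump --schema-only carries them along, unlike the rows of a hand-rolled version table.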
On Fri, Jan 20, 2012 at 2:48 PM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
> Even if you give the version number in the CREATE EXTENSION command, it's only by convention that people actually maintain a sane versioning policy. If people don't take version management seriously, you will quickly end up with five different versions of an extension, all with version number 0.1.

Projects are taking it seriously, and invest a lot of effort in it. There is no shortage of schema versioning frameworks, of varying levels of maturity... but some are quite complete by the standards of their users. However, there is little knowledge shared between them, and no database gives them much support, so idiosyncrasy becomes inevitable.

Unless one makes loading regular, unpackaged operators via SQL impossible (which, on the face of it, doesn't sound like a good idea to me), the only people who will bother with CREATE EXTENSION with inline content will be those who care and want to maintain version control. I stick to my original assessment: the ship has sailed, or perhaps, more accurately, is sailing.

Perhaps one could gain wisdom by thinking of this problem differently: "I'm going to write a migration tool to help developers in my framework or workflow deal with change, and I really would like it if my database helped me by...". I don't think we should fool ourselves that we'll be letting people shoot themselves in the foot with regard to versioning if we give them versions. People have been painfully and assiduously trying to avoid version problems for some time now, with no help from the database, and they do it entirely via pure SQL statements from Postgres' point of view.

> Another approach is to use comments on the objects saying "version 1.23". Those generally move together with the objects themselves; they are included in a pg_dump schema-only dump, for example, while the contents of a table are not.

Idiosyncratic, but perhaps useful for some people. Again, workarounds are possible, and per my previous statements, even pervasive. That doesn't mean it's desirable to avoid assisting with the problem.

> What is FEBE?

FrontEnd/BackEnd Protocol, I thought? The one libpq speaks. Is there a better name?

Here are some reasons not to do this feature, in my mind:

* Is the world ready yet to think about versioning operators in a database the same way software is versioned?
* Related to that: if the feature is written, it will have to be supported. Does the feature have enough impact at this time?
* Do we know how to design the feature correctly so it will attract use, especially as a Postgres-ism? This is related in some part to the decaying importance (my perception) of being 'database agnostic'.
* Are the awkward user-land versioning strategies good enough? Would it be better to focus elsewhere, where things are even *more* awkward?

My assessment is: I know some people who would use this feature, but a broad swathe are not loading many operators into their databases. It's hard to know if that's because packaging operators in databases has been so abysmally bad in the industry at large, or because the available sandboxed languages are just not appealing for, say, a Python or Ruby developer to write things that are not a series of SQL statements, à la PL/pgSQL, and a lot of that (sans triggers) can be done over the wire anyway. Maybe embedded Javascript can help with this, but it's not the Here and Now.
If I had to invest in complexity with regard to versioning, I'd rather invest in Postgres being able to tell me a hashed version of all, or a controllable subset, of the tables, types, and attributes in the database, so as to quickly pick out drift between a freshly created database (produced by the migrations on disk against development) and any cruft that sneaks into a production server that is thought to be the same as the others. That would have huge impact on common use cases at this very moment in time, so it provides a reasonable backdrop against which to evaluate the cost/impact of the proposed feature.

--
fdr
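A rough sketch of that drift-detection idea using only what already exists: hash a canonical listing of user-visible columns and compare the result across servers (the scope and canonicalization here are illustrative choices, not a complete fingerprint):

  SELECT md5(string_agg(
             format('%s.%s.%s %s', table_schema, table_name,
                    column_name, data_type),
             E'\n' ORDER BY table_schema, table_name, ordinal_position))
         AS schema_fingerprint
  FROM information_schema.columns
  WHERE table_schema NOT IN ('pg_catalog', 'information_schema');

Two databases built from the same migrations should produce the same fingerprint; a mismatch flags the drift Daniel describes, though this particular query ignores functions, operators, and constraints.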
On Fri, Jan 20, 2012 at 3:33 PM, Daniel Farina <daniel@heroku.com> wrote:
> On Fri, Jan 20, 2012 at 2:48 PM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
>> Even if you give the version number in the CREATE EXTENSION command, it's only by convention that people actually maintain a sane versioning policy. If people don't take version management seriously, you will quickly end up with five different versions of an extension, all with version number 0.1.
>
> Projects are taking it seriously, and invest a lot of effort in it. There is no shortage of schema versioning frameworks, of varying levels of maturity... but some are quite complete by the standards of their users. However, there is little knowledge shared between them, and no database gives them much support, so idiosyncrasy becomes inevitable.

Speak of the devil: someone just posted about using extension versioning to manage schemas (using the existing plain-old-files mechanism):

http://philsorber.blogspot.com/2012/01/deploy-schemata-like-boss.html

He also links to a -hackers post Dimitri wrote last December. A few anecdotes do not constitute evidence, but it does look like some people pay attention to any additional versioning foothold they can get.

--
fdr
On Sun, Jan 22, 2012 at 3:20 PM, Daniel Farina <daniel@heroku.com> wrote:
> A few anecdotes do not constitute evidence, but it does look like some people pay attention to any additional versioning foothold they can get.

Sure, but just because some people do it doesn't make it a good idea. I can personally attest to having done many things over the years which initially appeared to be good ideas, but later turned out to be mind-bogglingly dumb. It's true that if we make it easy for people to use the extension mechanism as a way of storing versioning information for user code, then people will use it for that purpose, but the question is whether that's really what we want. I don't see what advantage it gives us.

Dimitri's proposal was to neuter the pg_dump support that is the raison d'être of the extension mechanism. That's clearly necessary if you don't want to end up with an unreloadable database, but it begs the question (which no one's really answered, AFAICT) of what good the extension mechanism is without that feature. There are certainly easier ways to remember a version number than building support for it into core.

If people create their own versioning mechanisms, they can create something which is tailor-made for their particular requirements, rather than relying on decisions which we made in core that may or may not be right for them (e.g. the lack of version ordering, or even that we have versions rather than some more general type of control table). I don't want to prejudge the outcome and say, oh, we should never have support for this concept in core - but neither do I want to embark on that project without a detailed understanding of where and how it is adding value. If people are doing management via "pure FEBE", good for them: but that doesn't explain why it shouldn't be done all in userspace, with all of the flexibility that gives.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> ... If people are doing management via "pure FEBE", good for them: but that doesn't explain why it shouldn't be done all in userspace, with all of the flexibility that gives.

On reflection it seems like this patch is simply offering the wrong solution for the problem. I agree that it could be useful to install extensions without having direct access to the server's filesystem, but it doesn't seem to follow that we must lobotomize existing extension features in order to have that. I pointed out earlier that you could get such functionality via contrib/adminpack, though people not unreasonably complained that that was pretty ugly and low-level. But couldn't we define some SQL-level operations to allow installing extension control and script files?

Probably the worst issue with that is that in typical installations, the share/extension/ directory would be read-only to the server, and a lot of people might be uncomfortable with making it writable. Not sure whether we should consider inventing another place to keep SQL-command-installed extensions, or just say "if you want this functionality you have to make share/extension/ writable".

regards, tom lane
On Sun, Jan 22, 2012 at 8:42 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sun, Jan 22, 2012 at 3:20 PM, Daniel Farina <daniel@heroku.com> wrote:
>> A few anecdotes do not constitute evidence, but it does look like some people pay attention to any additional versioning foothold they can get.
>
> Sure, but just because some people do it doesn't make it a good idea.

True, I did not mean to suggest that this is clearly the best mechanism, only to express support for the general idea that people are appreciative of, and willing to experiment with, any way to make version management better, and people who use those features are *probably* going to try to use them correctly. It's up to us to make the right thing easy, too; otherwise I fear we will see too many lousy version numbers creeping about in the wild.

> Dimitri's proposal was to neuter the pg_dump support that is the raison d'être of the extension mechanism. That's clearly necessary if you don't want to end up with an unreloadable database, but it begs the question (which no one's really answered, AFAICT) of what good the extension mechanism is without that feature.

Oh, no, non-reloadability is a really bad thing -- I'd say a pretty bad deal-breaker -- but as Tom wrote, it does seem like it should somehow be a tractable problem. Is it such a bad idea to store the literal text of the extension's pieces (control file and corresponding SQL program) in catalogs? I'm not sure I understand why everyone is so interested in a special interaction with the file system in some way. By the same token, extensions can be dumped in the literal syntax -- even the ones that were installed from a file.

> There are certainly easier ways to remember a version number than building support for it into core. If people create their own versioning mechanisms, they can create something which is tailor-made for their particular requirements, rather than relying on decisions which we made in core that may or may not be right for them (e.g. the lack of version ordering, or even that we have versions rather than some more general type of control table).

I understand the desire to avoid investing in something that is not what people want. However, in the interest of scoping the discussion to inline extension support, I can't understand the objection to supporting what is basically a different transport for precisely the same semantic operation as having to ssh into a machine and untar some files -- just without the bizarre side channel of ssh and file system mangling when one is loading trustable operators, itself a raft of usability issues if one wishes to enable more software reuse.

--
fdr
Daniel Farina wrote:
> Is it such a bad idea to store the literal text of the extension's pieces (control file and corresponding SQL program) in catalogs? I'm not sure I understand why everyone is so interested in a special interaction with the file system in some way. By the same token, extensions can be dumped in the literal syntax -- even the ones that were installed from a file.
>
>> There are certainly easier ways to remember a version number than building support for it into core. If people create their own versioning mechanisms, they can create something which is tailor-made for their particular requirements, rather than relying on decisions which we made in core that may or may not be right for them (e.g. the lack of version ordering, or even that we have versions rather than some more general type of control table).
>
> I understand the desire to avoid investing in something that is not what people want. However, in the interest of scoping the discussion to inline extension support, I can't understand the objection to supporting what is basically a different transport for precisely the same semantic operation as having to ssh into a machine and untar some files -- just without the bizarre side channel of ssh and file system mangling when one is loading trustable operators, itself a raft of usability issues if one wishes to enable more software reuse.

Or with adminpack: no ssh, but still interaction with the filesystem. Read-write filesystem access is a pain for the DBA. There are hacks that mitigate some of that, but not completely (mount -o ro partitions, for example...).

I am in favor of being able to create an extension directly in plain SQL, without file creation or access, and without system administrator privileges. Why wouldn't we want that?!

--
Cédric Villemain
+33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation
Tom Lane <tgl@sss.pgh.pa.us> writes:
> On reflection it seems like this patch is simply offering the wrong solution for the problem. I agree that it could be useful to install extensions without having direct access to the server's filesystem, but it doesn't seem to follow that we must lobotomize existing extension features in order to have that. I pointed out earlier that you could get such functionality via contrib/adminpack, though people not unreasonably complained that that was pretty ugly and low-level. But couldn't we define some SQL-level operations to allow installing extension control and script files?

Yeah, that's what I was trying to do…

> Probably the worst issue with that is that in typical installations, the share/extension/ directory would be read-only to the server, and a lot of people might be uncomfortable with making it writable. Not sure whether we should consider inventing another place to keep SQL-command-installed extensions, or just say "if you want this functionality you have to make share/extension/ writable".

So I've been wondering about storing the script content in the catalogs, but you would have to store all the update scripts too, and that's useless because you want to dump the current state of the system, which pg_dump is doing just fine.

Back to using the file system on the server when handed the script content over the protocol: we could get there with a new GUC telling the server where to find and store “inline” extensions, right?

  extension_inline_directory = '/path/to/some/writable/place'

Then creating an extension would look in both the system extension directory and the inline one, which covers dump and restore. Creating an inline extension means creating the .control and .sql files in the extension_inline_directory, then running the current code.

I can adapt the patch to this behavior this week.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On 23 January 2012 at 11:53, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> So I've been wondering about storing the script content in the catalogs, but you would have to store all the update scripts too, and that's useless because you want to dump the current state of the system, which pg_dump is doing just fine.
>
> Back to using the file system on the server when handed the script content over the protocol: we could get there with a new GUC telling the server where to find and store “inline” extensions, right?
>
>   extension_inline_directory = '/path/to/some/writable/place'
>
> Then creating an extension would look in both the system extension directory and the inline one, which covers dump and restore. Creating an inline extension means creating the .control and .sql files in the extension_inline_directory, then running the current code.
>
> I can adapt the patch to this behavior this week.

If we agree to have that per cluster, then it can be in $PGDATA/pg_extension or something similar...

--
Cédric Villemain
+33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation
On Mon, Jan 23, 2012 at 5:53 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
>> Probably the worst issue with that is that in typical installations, the share/extension/ directory would be read-only to the server, and a lot of people might be uncomfortable with making it writable. Not sure whether we should consider inventing another place to keep SQL-command-installed extensions, or just say "if you want this functionality you have to make share/extension/ writable".
>
> So I've been wondering about storing the script content in the catalogs, but you would have to store all the update scripts too, and that's useless because you want to dump the current state of the system, which pg_dump is doing just fine.

I'm not convinced that's useless. It would meet Dan's requirement to be able to manage the whole thing via the FE-BE protocol, and we could make the CREATE EXTENSION mechanism transparently search both the catalog and the filesystem when an extension is installed. I'm imagining that we'd create a catalog that would act as a sort of "virtual directory" - e.g.

  CREATE TABLE pg_extension_virtualdir (filename text, content text)

- which would be modifiable by the DBA and would be searched either before or after the filesystem itself. This catalog wouldn't be dumped by pg_dump, and there would be no changes to how extensions whose files are loaded from this catalog are dumped vs. those whose files are loaded from the filesystem. Rather, just as now, it would be the DBA's responsibility to make sure that the extensions needed to reload a given dump file are present on the new system - except now they'd have two choices of where to put the related files: on the file system, or in the per-database "virtual directory".

> Back to using the file system on the server when handed the script content over the protocol: we could get there with a new GUC telling the server where to find and store “inline” extensions, right?
>
>   extension_inline_directory = '/path/to/some/writable/place'
>
> Then creating an extension would look in both the system extension directory and the inline one, which covers dump and restore. Creating an inline extension means creating the .control and .sql files in the extension_inline_directory, then running the current code.

This is another possible approach, but it requires a bit more configuration, and we'd better think carefully about what a malicious non-superuser DBA can do by changing that GUC.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
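Hypothetically, using Robert's proposed "virtual directory" catalog might look like the following sketch (pg_extension_virtualdir and its lookup behavior are proposals, not existing features; the staged content is abbreviated from the pair example):

  -- The DBA stages the control and script files in the catalog...
  INSERT INTO pg_extension_virtualdir (filename, content) VALUES
      ('pair.control',  $$default_version = '1.0'$$),
      ('pair--1.0.sql', $$CREATE TYPE pair AS (k text, v text);$$);

  -- ...and CREATE EXTENSION resolves the files from the virtual
  -- directory just as it would from share/extension/ on disk.
  CREATE EXTENSION pair;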
Robert Haas <robertmhaas@gmail.com> writes:
> "virtual directory" - e.g.
>
>   CREATE TABLE pg_extension_virtualdir (filename text, content text)
>
> - which would be modifiable by the DBA and would be searched either before or after the filesystem itself. This catalog wouldn't be dumped by pg_dump, and there would be no changes to how extensions whose files are loaded from this catalog are dumped vs. those whose files are loaded from the filesystem. Rather, just as

That's the thing I don't like in this approach. Maybe it's just me, but the primary intention in working on extensions was to make dump and restore do the right thing all by themselves. Now, if we have “inline” (SQL only) extensions, the right thing happens to be very different from when you're dealing with contrib-like ones: namely, I would want the script to be dumped.

>> extension_inline_directory = '/path/to/some/writable/place'
>
> This is another possible approach, but it requires a bit more configuration, and we'd better think carefully about what a malicious non-superuser DBA can do by changing that GUC.

I think Cédric nailed it down upthread, proposing that we just use a PGDATA subdirectory called 'pg_extension'. In fact, that would need to be a per-database subdirectory. Then there's nothing to set up, nothing to abuse.

Also remember that we're limiting this feature to SQL-only extensions (because we don't want to be loading our .so from just anywhere in the system, and forcing them into a place owned by root is what gives confidence, IIUC). With SQL-only extensions, it's all non-superuser land anyway.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Mon, Jan 23, 2012 at 8:25 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> "virtual directory" - e.g. CREATE TABLE pg_extension_virtualdir (filename text, content text) - which would be modifiable by the DBA and would be searched either before or after the filesystem itself. This catalog wouldn't be dumped by pg_dump, and there would be no changes to how extensions whose files are loaded from this catalog are dumped vs. those whose files are loaded from the filesystem. Rather, just as
>
> That's the thing I don't like in this approach. Maybe it's just me, but the primary intention in working on extensions was to make dump and restore do the right thing all by themselves.

Well, fair enough: there's no accounting for taste. You could make my solution work with pg_dump in a fully automated fashion if you dumped out the "virtual directory" contents before dumping any CREATE EXTENSION statements, but I'm not going to get up on my soapbox and say that's the world's best design, so if you don't like it, fine! I am pretty concerned that we find a design that does not involve pg_dump needing to dump out the extension contents, though: that seems to me to be missing the point of having extensions in the first place.

> Now, if we have “inline” (SQL only) extensions, the right thing happens to be very different from when you're dealing with contrib-like ones: namely, I would want the script to be dumped.
>
>>> extension_inline_directory = '/path/to/some/writable/place'
>>
>> This is another possible approach, but it requires a bit more configuration, and we'd better think carefully about what a malicious non-superuser DBA can do by changing that GUC.
>
> I think Cédric nailed it down upthread, proposing that we just use a PGDATA subdirectory called 'pg_extension'. In fact, that would need to be a per-database subdirectory. Then there's nothing to set up, nothing to abuse.

Hmm, that might have something going for it. It seems comparatively easy to implement, and it also seems to do a pretty good job hiding the complexity under the hood where users don't have to worry about it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> I am pretty concerned that we find a design that does not involve pg_dump needing to dump out the extension contents, though: that seems to me to be missing the point of having extensions in the first place.

I was just trying to explain where I'm coming from; I'm not wedded to the idea, though, all the more so because I think we're reaching a much better one.

>> I think Cédric nailed it down upthread, proposing that we just use a PGDATA subdirectory called 'pg_extension'. In fact, that would need to be a per-database subdirectory. Then there's nothing to set up, nothing to abuse.
>
> Hmm, that might have something going for it. It seems comparatively easy to implement, and it also seems to do a pretty good job hiding the complexity under the hood where users don't have to worry about it.

And then basebackup and pg_upgrade would just work, and for dump and restore we still need to find something that doesn't violate the POLA.

I think that would mean offering a backend function that lists all files from a given extension and their content, including the control files, and a query that stores that output for “inline” extensions only. The content of the query result is formatted as a series of create extension and alter extension update statements (in the right order) in the dump file, so that it just transparently re-creates the files for you on the new database.

Or do you still want to insist that dump/restore shouldn't care about any extension, inline or not, so that you're given the responsibility to do the exact same thing yourself on the client side?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Mon, Jan 23, 2012 at 10:04 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> And then basebackup and pg_upgrade would just work, and for dump and restore we still need to find something that doesn't violate the POLA.
>
> I think that would mean offering a backend function that lists all files from a given extension and their content, including the control files, and a query that stores that output for “inline” extensions only. The content of the query result is formatted as a series of create extension and alter extension update statements (in the right order) in the dump file, so that it just transparently re-creates the files for you on the new database.

Hmm. But CREATE EXTENSION / ALTER EXTENSION doesn't seem right, because the files in the directory correspond to *available* extensions, not already-created ones. We need some way of dumping and restoring the files themselves, not the extension that can be created from them. I suspect internal functions (pg_whatever) make more sense than new SQL syntax, since this is really only to make pg_dump happy.

> Or do you still want to insist that dump/restore shouldn't care about any extension, inline or not, so that you're given the responsibility to do the exact same thing yourself on the client side?

How about adding a new pg_dump option to suppress this part of the dump? It seems to me that there will be people who want to do that; for example, it might be that all (or some, or one) of the extensions that were installed this way on the old server are installed globally on the new server. We need some way to cope with that. Having a new pg_dump option to suppress this output is not terribly granular, but maybe it'd be enough for round one.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> Hmm. But CREATE EXTENSION / ALTER EXTENSION doesn't seem right, because the files in the directory correspond to *available* extensions, not already-created ones. We need some way of dumping and

I would have limited the dump query to only known installed extensions, right. The update scripts are still needed because with inline extensions you typically never see a 1.2 script, but a 1.0 script, then 1.0--1.1, and then 1.1--1.2.

> restoring the files themselves, not the extension that can be created from them. I suspect internal functions (pg_whatever) make more sense than new SQL syntax, since this is really only to make pg_dump happy.

That could well be, yes, but what would such a function do that the commands are not doing? I'm ok not to invent specific syntax to solve that problem; I just think that we should already have all we need :)

>> Or do you still want to insist that dump/restore shouldn't care about any extension, inline or not, so that you're given the responsibility to do the exact same thing yourself on the client side?
>
> How about adding a new pg_dump option to suppress this part of the dump?

Makes sense, indeed. Well, one could manually filter the dump object list too, of course…

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Mon, Jan 23, 2012 at 10:26 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Hmm. But CREATE EXTENSION / ALTER EXTENSION doesn't seem right, because the files in the directory correspond to *available* extensions, not already-created ones. We need some way of dumping and
>
> I would have limited the dump query to only known installed extensions, right. The update scripts are still needed because with inline extensions you typically never see a 1.2 script, but a 1.0 script, then 1.0--1.1, and then 1.1--1.2.

Hmm, I don't think I like that design. I think we should view this as a way to embed the SQL and control files needed by the extension in the server, rather than as a separate thing called an inline extension. If pg_dump is going to dump those files, it ought to dump them all, not just some subset of them.

>> restoring the files themselves, not the extension that can be created from them. I suspect internal functions (pg_whatever) make more sense than new SQL syntax, since this is really only to make pg_dump happy.
>
> That could well be, yes, but what would such a function do that the commands are not doing? I'm ok not to invent specific syntax to solve that problem; I just think that we should already have all we need :)

I was thinking of something like

  pg_write_extension_file('foobar.control', '....content goes here....');

>>> Or do you still want to insist that dump/restore shouldn't care about any extension, inline or not, so that you're given the responsibility to do the exact same thing yourself on the client side?
>>
>> How about adding a new pg_dump option to suppress this part of the dump?
>
> Makes sense, indeed. Well, one could manually filter the dump object list too, of course…

True. I guess that's another reason why a global flag to shut it all off is probably sufficient, but I'm still in favor of having at least that much. I think the average user is much more likely to find pg_dump --skip-whatever than they are to understand how to do dump object filtering correctly, and even the advanced user may appreciate the shortcut on occasion.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> Hmm, I don't think I like that design. I think we should view this as a way to embed the SQL and control files needed by the extension in the server, rather than as a separate thing called an inline extension. If pg_dump is going to dump those files, it ought to dump them all, not just some subset of them.

Ok, but then, what about .so files? Wouldn't it make sense to be able to ship the executable modules needed as well, and if not, why not?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Mon, Jan 23, 2012 at 10:46 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Hmm, I don't think I like that design. I think we should view this as a way to embed the SQL and control files needed by the extension in the server, rather than as a separate thing called an inline extension. If pg_dump is going to dump those files, it ought to dump them all, not just some subset of them.
>
> Ok, but then, what about .so files? Wouldn't it make sense to be able to ship the executable modules needed as well, and if not, why not?

Sure, that would be as useful as any other part of this feature. We'd have to think carefully about how to make it secure, though: obviously it's no good to let a non-superuser database owner install compiled C code that gets loaded into the backend!

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
>> Ok, but then, what about .so files? Wouldn't it make sense to be able to ship the executable modules needed as well, and if not, why not?
>
> Sure, that would be as useful as any other part of this feature. We'd have to think carefully about how to make it secure, though: obviously it's no good to let a non-superuser database owner install compiled C code that gets loaded into the backend!

The big problem is that .so files are installed in directories where the system postgres user usually is not granted permission to write; that's root business. It has already been asked whether PostgreSQL could load a .so from a non-root location, and I think that to be consistent with your view of the world it would be good to add that feature. Then there's no such beast as an “inline” extension, so much as a way to install an extension over the protocol, without file system level access to the production server. This would need to be superuser only, of course.

That opens up the possibility of shipping the modules to any standby server too. The situation now is unfortunate: create extension hstore on the primary, then use it in some indexes, and the standby has no means to use those indexes until you fix things and install the same extension files there. The module itself could be accepted as a bytea value and written to the right place, where the server knows to load it from.

Now you can dump/restore any extension fully, and we can even ship any extension in the WAL stream (a new message is needed, though). The only remaining issue would be the default policy for including or excluding an extension in the dump, and I would be ok with a default of including none as of now, plus a pg_dump switch that you can repeat to include whichever extensions you wish. The effect is to transport the files and install them in the expected place on the target server where you restore. And now we really process all extensions the same way, and you can update inline an extension that you installed the 9.1 way, and vice versa.

Don't let me speak about extension distribution facilities just now :)

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Mon, Jan 23, 2012 at 8:17 AM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>>> Ok, but then, what about .so files?  Wouldn't it make sense to be able
>>> to also ship the executable modules needed, and if not, why not?
>
> Now you can dump/restore any extension fully, and we can even ship any
> extension in the WAL stream (a new message is needed though).

Things are still a bit ugly in the more complex cases: consider
PostGIS's linkage against libproj and other libraries.  In order to
cover all cases, I feel that what I need is an optional hook (for the
sake of argument, let's say it's another "command" type hook, e.g.
"archive_command") to be executed when extension (un)installation
occurs on a primary or is replayed on a standby, whereby I can
acquire the necessary dependencies for an extension or signal some
kind of error (exactly how that interfaces with the server is
delicate, should one want to supply good error messages to the user).

I think that hook could be useful for a number of reasons:

* Extension distribution (as long as we're downloading dependent
  libraries, why not get the extension too?)

* Extension distribution on standbys, too -- by much the same mechanism

* Extension whitelisting for non-superusers (are you allowed/can you
  even have that extension?)

And, more to the point, if one wants to make replication and
extensions work nicely together, I don't really see an option outside
such a hook, other than insisting on a form of packaging whereby all
dependencies are declared to Postgres and Postgres becomes the
dependency management system for all dependent binary assets.  That
could lead to a more cohesive system, but is also a pretty hefty
burden, both on this project and others.

But getting back to in-line extensions: I think it makes sense to
dump all extensions in their in-line representation, even if in some
situations carrying a copy of the extension in the backup is not
strictly necessary.  The only bloat is including the literal source
text of the extension in the dump.  With an extension-installation
hook, the literal version of the extension could be supplied but
ignored if a side-channel mechanism for getting the extension makes
sense.

--
fdr
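[If such a hook took the same shape as archive_command, the
configuration might read like the following.  This is an entirely
hypothetical GUC and placeholder syntax, sketched by analogy only;
nothing like it exists.]

    # postgresql.conf -- hypothetical setting, by analogy with
    # archive_command; imagine %e expanding to the extension name and
    # %v to the version, with a non-zero exit reported as an error.
    extension_install_command = '/usr/local/bin/fetch-ext-deps %e %v'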
On Mon, Jan 23, 2012 at 7:59 PM, Daniel Farina <daniel@heroku.com> wrote:
> Things are still a bit ugly in the more complex cases: consider
> PostGIS's linkage against libproj and other libraries.  In order to
> cover all cases, I feel that what I need is an optional hook (for the
> sake of argument, let's say it's another "command" type hook, e.g.
> "archive_command") to be executed when extension (un)installation
> occurs on a primary or is replayed on a standby, whereby I can
> acquire the necessary dependencies for an extension or signal some
> kind of error (exactly how that interfaces with the server is
> delicate, should one want to supply good error messages to the user).

There aren't a whole lot of good options for handling errors on the
standby, in general.  If the operation has already occurred on the
master, it's too late to decide that we're not going to have it happen
on the standby as well.

Of course, if we confine ourselves to control and SQL files, then it
doesn't really matter: the catalog entries from the master will make
it to the standby regardless of the absence of the SQL and control
files, and maybe we could simply decide that the operations which
write in pg_extension aren't WAL-logged and can be separately
performed on the standby if you want them there as well.

But, that's a bit unlike how we normally do things.  And if we're
going to WAL-log the writing of the extension files, then I start to
feel like we should go back to putting the data in a system catalog
rather than the filesystem, because inventing a new WAL record type
for this seems like it will make things quite a bit more complicated
for no particular benefit.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Jan 23, 2012 at 7:59 PM, Daniel Farina <daniel@heroku.com> wrote:
>> Things are still a bit ugly in the more complex cases: consider
>> PostGIS's linkage against libproj and other libraries.  In order to

After thinking about that for a while, I think we should not include
the shared module (.so) in the scope of this patch nor of 9.2.
Ensuring that the .so dependencies are met is a problem well outside
the reach of the PostgreSQL system.

>> cover all cases, I feel that what I need is an optional hook (for the
>> sake of argument, let's say it's another "command" type hook, e.g.
>> "archive_command") to be executed when extension (un)installation
>> occurs on a primary or is replayed on a standby, whereby I can
>> acquire the necessary dependencies for an extension or signal some
>> kind of error (exactly how that interfaces with the server is
>> delicate, should one want to supply good error messages to the user).

Supporting command triggers on the replica is on the TODO list.
Having that in 9.2 is far from granted though, but the catalog and
syntax already have support for the feature.

> But, that's a bit unlike how we normally do things.  And if we're
> going to WAL-log the writing of the extension files, then I start to
> feel like we should go back to putting the data in a system catalog
> rather than the filesystem, because inventing a new WAL record type
> for this seems like it will make things quite a bit more complicated
> for no particular benefit.

Exactly, and we should talk about concurrent accesses to the
filesystem too, and about how to have a clean transactional backup
including the scripts when that pg_dump option is used.

So I'm going to prepare the next version of the patch with this
design:

- in-catalog extension scripts for inline extensions:

    pg_extension_script(extoid, oldversion, version, script)

  oldversion is null when create extension is used, except when using
  the create extension from 'unpackaged' form

- see about adding more control properties in the catalog?

- current code that parses the filenames to determine the upgrade path
  will have to be able to take the version strings from the new
  catalog as an alternative, and getting to the script content must be
  able to select from the catalog or read a file on disk

- pg_dump defaults to not dumping extension content

- pg_dump --include-extension-scripts dumps the scripts found either
  in the filesystem or the catalog: a create script first, then any
  number of update scripts as needed to reach the current installed
  version

- same as we have -t, add -e --extension to pg_dump so that you can
  choose to dump only a given extension

The extension dumping will not include the shared modules, so if your
extension depends on them being installed on the server, you will be
much better served with some OS level packaging.

Not for 9.2, but I can't help thinking that if we could manage to host
the .so module itself in the catalogs, we could solve updating it in a
transactional way and, more importantly, host it per-database, rather
than having the modules work per major version (not even per cluster)
while the extension mechanism works per-database inside each cluster.
But that's work for another release.

Any comments before I spend time coding this?

--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
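[Spelled out as SQL for concreteness, the proposed catalog from the
design above could look roughly like this; the column types are
guesses from the description, not taken from any patch.]

    -- Sketch only: the shape of the proposed pg_extension_script
    -- catalog, with types guessed from the design notes above.
    CREATE TABLE pg_extension_script (
        extoid     oid,   -- the extension this script belongs to
        oldversion text,  -- null for the initial create script
        version    text,  -- version this script installs/upgrades to
        script     text   -- the SQL script itself
    );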
On Jan 26, 2012, at 9:40 AM, Dimitri Fontaine wrote:

> So I'm going to prepare the next version of the patch with this
> design:
>
> - in-catalog extension scripts for inline extensions:
>
>     pg_extension_script(extoid, oldversion, version, script)
>
>   oldversion is null when create extension is used, except when using
>   the create extension from 'unpackaged' form

Would you keep all the migration scripts used over time to upgrade
from one version to another?

> - see about adding more control properties in the catalog?
>
> - current code that parses the filenames to determine the upgrade path
>   will have to be able to take the version strings from the new
>   catalog as an alternative, and getting to the script content must be
>   able to select from the catalog or read a file on disk
>
> - pg_dump defaults to not dumping extension content
>
> - pg_dump --include-extension-scripts dumps the scripts found either
>   in the filesystem or the catalog: a create script first, then any
>   number of update scripts as needed to reach the current installed
>   version
>
> - same as we have -t, add -e --extension to pg_dump so that you can
>   choose to dump only a given extension

Also --exclude-extension?

> The extension dumping will not include the shared modules, so if your
> extension depends on them being installed on the server, you will be
> much better served with some OS level packaging.

Or you must make sure it’s installed on the system before you restore.

> Not for 9.2, but I can't help thinking that if we could manage to host
> the .so module itself in the catalogs, we could solve updating it in a
> transactional way and, more importantly, host it per-database, rather
> than having the modules work per major version (not even per cluster)
> while the extension mechanism works per-database inside each cluster.
> But that's work for another release.

+1 Cloud vendors will *love* this.

Best,

David
On Thu, Jan 26, 2012 at 3:48 PM, David E. Wheeler
<david@justatheory.com> wrote:
> On Jan 26, 2012, at 9:40 AM, Dimitri Fontaine wrote:
>
>> Not for 9.2, but I can't help thinking that if we could manage to host
>> the .so module itself in the catalogs, we could solve updating it in a
>> transactional way and, more importantly, host it per-database, rather
>> than having the modules work per major version (not even per cluster)
>> while the extension mechanism works per-database inside each cluster.
>> But that's work for another release.
>
> +1 Cloud vendors will *love* this.

Confirmed.  Let me share my perspective.  I'll begin by describing the
current state of runtime code dependency management for comparison.

In Ruby, any user can push an application to our platform which relies
on any/every ruby "gem" ever released (give or take).  These gems may
require exact releases of other gems, have elaborate system
dependencies, and/or natively compiled code components.  This is
thanks to the rubygems.org repository, the "gem" system, and recently
but crucially, the "bundler" system for resolving and isolating
dependencies.  Releasing a new gem takes moments, and I have
personally released a half dozen of no real consequence to the world
which I use from time to time.

In contrast, the idea that any person or team of people could possibly
review the literally hundreds of gems released each day is no longer
plausible.  The only feasible solution for providing a robust service
is to engineer a solution which can be operated from inside the
cluster to install any library whatsoever.

Our aim is, put simply, to be able to support every extension in the
world, at once, under cascading replication, across major catalogue
upgrades.  We hope this ideal is shared by the community at large,
since our problems are generally the same as other users', just writ
large.

-pvh

PS: As an aside, because of the many problems with in-cluster
multi-tenancy (to pick just one example, resource isolation between
users) I have no security concerns with giving users every ability to
execute code as the cluster owner's UNIX user.  On our service we do
not restrict our users' access to superuser out of spite, but to
reduce the available surface area for self-destruction.

--
Peter van Hardenberg
San Francisco, California
"Everything was beautiful, and nothing hurt." -- Kurt Vonnegut
>> So I'm going to prepare the next version of the patch with this
>> design:
>>
>> - in-catalog extension scripts for inline extensions:
>>
>>     pg_extension_script(extoid, oldversion, version, script)
>>
>>   oldversion is null when create extension is used, except when using
>>   the create extension from 'unpackaged' form
>
> Would you keep all the migration scripts used over time to upgrade
> from one version to another?

Yes, that is the idea.  You can then replay the whole stack of SQL
scripts needed to go from version "foo" to version "bar" (here again,
we don't care about the version format).

>> - same as we have -t, add -e --extension to pg_dump so that you can
>>   choose to dump only a given extension
>
> Also --exclude-extension?

It might be the default.  We need something to dump the content of
pg_catalog.pg_extension_script (or whatever table is going to contain
the SQL code), per extension or for all of them.

--
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/     PostgreSQL: Support 24x7 - Développement, Expertise et Formation
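[Under that scheme, fetching one step of an upgrade path would be a
plain catalog lookup.  The query below is illustrative only, written
against the sketched pg_extension_script shape rather than any real
catalog; pg_extension and its extname column do exist.]

    -- Illustrative: fetch the script taking extension 'pair' from
    -- version '1.0' to the next version, per the sketched catalog.
    SELECT s.version, s.script
      FROM pg_extension_script s
      JOIN pg_extension e ON e.oid = s.extoid
     WHERE e.extname = 'pair'
       AND s.oldversion = '1.0';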
On Jan 27, 2012, at 2:19 AM, Cédric Villemain wrote:

>> Also --exclude-extension?
>
> It might be the default.  We need something to dump the content of
> pg_catalog.pg_extension_script (or whatever table is going to contain
> the SQL code), per extension or for all of them.

I think dim said --no-extensions would be the default, but I’m
thinking it would be useful to have --with-extensions to include them
all, but then be able to --exclude-extension for a select few.

Best,

David
On 28 January 2012 at 21:46, David E. Wheeler <david@justatheory.com>
wrote:
> On Jan 27, 2012, at 2:19 AM, Cédric Villemain wrote:
>
>>> Also --exclude-extension?
>>
>> It might be the default.  We need something to dump the content of
>> pg_catalog.pg_extension_script (or whatever table is going to contain
>> the SQL code), per extension or for all of them.
>
> I think dim said --no-extensions would be the default, but I’m
> thinking it would be useful to have --with-extensions to include them
> all, but then be able to --exclude-extension for a select few.

So am I.

--
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/     PostgreSQL: Support 24x7 - Développement, Expertise et Formation
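[With the switches being floated here, none of which exist in any
released pg_dump, a backup embedding every extension's scripts except
one might be taken as follows; the flag names are proposals only.]

    # Proposed flags, for illustration only: embed all extension
    # scripts in the dump, except those of postgis.
    pg_dump --with-extensions --exclude-extension=postgis mydb > mydb.sql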
On Thu, Jan 26, 2012 at 12:40 PM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:
> So I'm going to prepare the next version of the patch with this
> design:

Considering that this constitutes a major redesign, and that the
updated patch hasn't been posted over a month later, it seems past
time to mark this Returned with Feedback.  So I've done that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> Considering that this constitutes a major redesign, and that the
> updated patch hasn't been posted over a month later, it seems past
> time to mark this Returned with Feedback.  So I've done that.

I was hoping to be able to code that while the CF is running, but
obviously that won't happen given the other patches I'm playing with.

This major redesign you're mentioning is very good feedback already,
and will allow me to implement a non-controversial patch for the next
release.  Thanks for dealing with the CF parts of that.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support