Thread: pg_upgrade libraries check
pg_upgrade is a little over-keen about checking for shared libraries that back functions. In particular, it checks for libraries that support functions created in pg_catalog, even if pg_dump doesn't export the function. The attached patch mimics the filter that pg_dump uses for functions so that only the relevant libraries are checked. This would remove the need for a particularly ugly hack in making the 9.1 backport of JSON binary upgradeable. cheers andrew
Attachment
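For context, the patch itself is not reproduced in the thread, but the idea amounts to narrowing pg_upgrade's library query with the same kind of user-schema filter pg_dump applies. A rough sketch of such a query (an illustration of the idea, not the actual patch) could look like this:

    -- illustrative only: approximates the proposed filter, not taken from the patch
    SELECT DISTINCT p.probin
    FROM pg_catalog.pg_proc p
         JOIN pg_catalog.pg_namespace n ON n.oid = p.pronamespace
    WHERE p.prolang = (SELECT oid FROM pg_catalog.pg_language WHERE lanname = 'c')
      AND p.probin IS NOT NULL
      AND n.nspname NOT IN ('pg_catalog', 'information_schema');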
On Fri, May 25, 2012 at 10:20:29AM -0400, Andrew Dunstan wrote: > pg_upgrade is a little over-keen about checking for shared libraries > that back functions. In particular, it checks for libraries that > support functions created in pg_catalog, even if pg_dump doesn't > export the function. > > The attached patch mimics the filter that pg_dump uses for functions > so that only the relevant libraries are checked. > > This would remove the need for a particularly ugly hack in making > the 9.1 backport of JSON binary upgradeable. Andrew is right that pg_upgrade is overly restrictive in checking _any_ shared object file referenced in pg_proc. I never expected that pg_catalog would have such references, but in Andrew's case it does, and pg_dump doesn't dump them, so I guess pg_upgrade shouldn't check them either. In some sense this is a hack for the JSON type, but it also gives users a way to create shared object references in old clusters that are _not_ checked by pg_upgrade, and not migrated to the new server, so I suppose it is fine. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Fri, May 25, 2012 at 11:08:10PM -0400, Bruce Momjian wrote: > On Fri, May 25, 2012 at 10:20:29AM -0400, Andrew Dunstan wrote: > > pg_upgrade is a little over-keen about checking for shared libraries > > that back functions. In particular, it checks for libraries that > > support functions created in pg_catalog, even if pg_dump doesn't > > export the function. > > > > The attached patch mimics the filter that pg_dump uses for functions > > so that only the relevant libraries are checked. > > > > This would remove the need for a particularly ugly hack in making > > the 9.1 backport of JSON binary upgradeable. > > Andrew is right that pg_upgrade is overly restrictive in checking _any_ > shared object file referenced in pg_proc. I never expected that > pg_catalog would have such references, but in Andrew's case it does, and > pg_dump doesn't dump them, so I guess pg_upgrade shouldn't check them > either. > > In some sense this is a hack for the JSON type, but it also gives users > a way to create shared object references in old clusters that are _not_ > checked by pg_upgrade, and not migrated to the new server, so I suppose > it is fine. OK, now I know it is _not_ fine. :-( I just realized the problem as part of debugging the report of a problem with plpython_call_handler(): http://archives.postgresql.org/pgsql-hackers/2012-03/msg01101.php http://archives.postgresql.org/pgsql-bugs/2012-05/msg00205.php The problem is that functions defined in the "pg_catalog" schema, while not explicitly dumped by pg_dump, are implicitly dumped by things like CREATE LANGUAGE plperl. I have added a pg_upgrade C comment documenting this issue in case we revisit it later. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Attachment
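To spell out what "implicitly dumped" means here: a bare CREATE LANGUAGE in the restore script recreates the language's pg_catalog support functions behind the scenes, so the new cluster still needs the language's shared library even though pg_dump never emitted those CREATE FUNCTION statements. For example:

    -- plperl is used purely as an illustration of the pattern
    CREATE LANGUAGE plperl;   -- fails at restore time if plperl.so is not installed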
On 05/27/2012 06:40 AM, Bruce Momjian wrote: > On Fri, May 25, 2012 at 11:08:10PM -0400, Bruce Momjian wrote: >> On Fri, May 25, 2012 at 10:20:29AM -0400, Andrew Dunstan wrote: >>> pg_upgrade is a little over-keen about checking for shared libraries >>> that back functions. In particular, it checks for libraries that >>> support functions created in pg_catalog, even if pg_dump doesn't >>> export the function. >>> >>> The attached patch mimics the filter that pg_dump uses for functions >>> so that only the relevant libraries are checked. >>> >>> This would remove the need for a particularly ugly hack in making >>> the 9.1 backport of JSON binary upgradeable. >> Andrew is right that pg_upgrade is overly restrictive in checking _any_ >> shared object file referenced in pg_proc. I never expected that >> pg_catalog would have such references, but in Andrew's case it does, and >> pg_dump doesn't dump them, so I guess pg_upgrade shouldn't check them >> either. >> >> In some sense this is a hack for the JSON type, but it also gives users >> a way to create shared object references in old clusters that are _not_ >> checked by pg_upgrade, and not migrated to the new server, so I suppose >> it is fine. > OK, now I know it is _not_ fine. :-( > > I just realized the problem as part of debugging the report of a problem > with plpython_call_handler(): > > http://archives.postgresql.org/pgsql-hackers/2012-03/msg01101.php > http://archives.postgresql.org/pgsql-bugs/2012-05/msg00205.php > > The problem is that functions defined in the "pg_catalog" schema, while > no explicitly dumped by pg_dump, are implicitly dumped by things like > CREATE LANGUAGE plperl. > > I have added a pg_upgrade C comment documenting this issue in case we > revisit it later. "things like CREATE LANGUAGE plperl" is a rather vague phrase. The PL case could be easily handled by adding this to the query: OR EXISTS (SELECT 1 FROM pg_catalog.pg_language WHERE lanplcallfoid = p.oid) Do you know of any other cases that this would miss? The fact is that unless we do something like this there is a potential for unnecessary pg_upgrade failures. The workaround I am currently using for the JSON backport of having to supply a dummy shared library is almost unspeakably ugly. If you won't consider changing the query, how about an option to explicitly instruct pg_upgrade to ignore a certain library in its checks? cheers andrew
On Sun, May 27, 2012 at 08:48:54AM -0400, Andrew Dunstan wrote: > > >I just realized the problem as part of debugging the report of a problem > >with plpython_call_handler(): > > > > http://archives.postgresql.org/pgsql-hackers/2012-03/msg01101.php > > http://archives.postgresql.org/pgsql-bugs/2012-05/msg00205.php > > > >The problem is that functions defined in the "pg_catalog" schema, while > >not explicitly dumped by pg_dump, are implicitly dumped by things like > >CREATE LANGUAGE plperl. > > > >I have added a pg_upgrade C comment documenting this issue in case we > >revisit it later. > > > "things like CREATE LANGUAGE plperl" is a rather vague phrase. The > PL case could be easily handled by adding this to the query: > > OR EXISTS (SELECT 1 FROM pg_catalog.pg_language WHERE lanplcallfoid > = p.oid) > > > Do you know of any other cases that this would miss? The problem is I don't know. I don't know in what places we reference shared object files implicitly but not explicitly, and I can't know what future places we might do this. > The fact is that unless we do something like this there is a > potential for unnecessary pg_upgrade failures. The workaround I am > currently using for the JSON backport of having to supply a dummy > shared library is almost unspeakably ugly. If you won't consider > changing the query, how about an option to explicitly instruct > pg_upgrade to ignore a certain library in its checks? The plpython_call_handler case I mentioned is a good example of a place where a pg_upgrade hack to map plpython to plpython2 has caused pg_upgrade "check" to succeed, but the actual pg_upgrade to fail --- certainly a bad thing, and something we would like to avoid. This kind of tinkering can easily cause such problems. We are not writing a one-off pg_upgrade for JSON-backpatchers here. If you want to create a new pg_upgrade binary with that hack, feel free to do so. Unless someone can explain a second use case for this, I am not ready to risk making pg_upgrade more unstable, and I don't think the community is either. I am not the person who decides if this gets added to pg_upgrade, but I am guessing what the community would want here. FYI, your fix would not address the plpython_call_handler problem because in that case we are actually dumping that function that references the shared object, and the pg_dumpall restore will fail. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Bruce Momjian <bruce@momjian.us> writes: > On Sun, May 27, 2012 at 08:48:54AM -0400, Andrew Dunstan wrote: >> "things like CREATE LANGUAGE plperl" is a rather vague phrase. The >> PL case could be easily handled by adding this to the query: >> OR EXISTS (SELECT 1 FROM pg_catalog.pg_language WHERE lanplcallfoid >> = p.oid) >> Do you know of any other cases that this would miss? Well, laninline and lanvalidator for two ;-) > The problem is I don't know. I don't know in what places we reference > shared object files implicit but not explicitly, and I can't know what > future places we might do this. The "future changes" argument seems like a straw man to me. We're already in the business of adjusting pg_upgrade when we make significant catalog changes. > We are not writing a one-off pg_upgrade for JSON-backpatchers here. I tend to agree with that position, and particularly think that we should not allow the not-community-approved design of the existing JSON backport to drive changes to pg_upgrade. It would be better to ask first if there were a different way to construct that backport that would fit better with pg_upgrade. Having said that, I've got to also say that I think we've fundamentally blown it with the current approach to upgrading extensions. Because we dump all the extension member objects, the extension contents have got to be restorable into a new database version as-is, and that throws away most of the flexibility that we were trying to buy with the extension mechanism. IMO we have *got* to get to a place where both pg_dump and pg_upgrade dump extensions just as "CREATE EXTENSION", and the sooner the better. Once we have that, this type of issue could be addressed by having different contents of the extension creation script for different major server versions --- or maybe even the same server version but different python library versions, to take something on-point for this discussion. For instance, Andrew's problem could be dealt with if the backport were distributed as an extension "json-backport", and then all that's needed in a new installation is an empty extension script of that name. More generally, this would mean that cross-version compatibility problems for extensions could generally be solved in the extension scripts, and not with kluges in pg_upgrade. As things stand, you can be sure that kluging pg_upgrade is going to be the only possible fix for a very wide variety of issues. I don't recall exactly what problems drove us to make pg_upgrade do what it does with extensions, but we need a different fix for them. regards, tom lane
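As an illustration of how much wider that net is: the pg_catalog functions such a clause would exempt are exactly those that some procedural language points at as call handler, inline handler, or validator. A query along these lines (a sketch, not part of any posted patch) lists them:

    -- list the C support functions referenced by any procedural language
    SELECT l.lanname, p.oid::regprocedure AS func, p.probin
    FROM pg_catalog.pg_language l
         JOIN pg_catalog.pg_proc p
           ON p.oid IN (l.lanplcallfoid, l.laninline, l.lanvalidator)
    WHERE p.probin IS NOT NULL;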
On Sun, May 27, 2012 at 11:31:12AM -0400, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > On Sun, May 27, 2012 at 08:48:54AM -0400, Andrew Dunstan wrote: > >> "things like CREATE LANGUAGE plperl" is a rather vague phrase. The > >> PL case could be easily handled by adding this to the query: > >> OR EXISTS (SELECT 1 FROM pg_catalog.pg_language WHERE lanplcallfoid > >> = p.oid) > >> Do you know of any other cases that this would miss? > > Well, laninline and lanvalidator for two ;-) > > > The problem is I don't know. I don't know in what places we reference > > shared object files implicitly but not explicitly, and I can't know what > > future places we might do this. > > The "future changes" argument seems like a straw man to me. We're > already in the business of adjusting pg_upgrade when we make significant > catalog changes. The bottom line is I just don't understand the rules of when a function in the pg_catalog schema implicitly creates something that references a shared object, and unless someone tells me, I am inclined to just have pg_upgrade check everything and throw an error during 'check', rather than throw an error during the upgrade. If someone did tell me, I would be happy with modifying the pg_upgrade query to match. Also, pg_upgrade rarely requires adjustments for major version changes, and we want to keep it that way. > > We are not writing a one-off pg_upgrade for JSON-backpatchers here. > > I tend to agree with that position, and particularly think that we > should not allow the not-community-approved design of the existing > JSON backport to drive changes to pg_upgrade. It would be better to > ask first if there were a different way to construct that backport > that would fit better with pg_upgrade. Yep. A command-line flag just seems too user-visible for this use-case, and too error-prone. I barely understand what is going on, particularly with plpython in "public" (which we don't fully even understand yet), so adding a command-line flag seems like the wrong direction. > Having said that, I've got to also say that I think we've fundamentally > blown it with the current approach to upgrading extensions. Because we > dump all the extension member objects, the extension contents have got > to be restorable into a new database version as-is, and that throws away > most of the flexibility that we were trying to buy with the extension > mechanism. IMO we have *got* to get to a place where both pg_dump and > pg_upgrade dump extensions just as "CREATE EXTENSION", and the sooner > the better. Once we have that, this type of issue could be addressed by > having different contents of the extension creation script for different > major server versions --- or maybe even the same server version but > different python library versions, to take something on-point for this > discussion. For instance, Andrew's problem could be dealt with if the > backport were distributed as an extension "json-backport", and then all > that's needed in a new installation is an empty extension script of that > name. > > More generally, this would mean that cross-version compatibility > problems for extensions could generally be solved in the extension > scripts, and not with kludges in pg_upgrade. As things stand, you can be > sure that kludging pg_upgrade is going to be the only possible fix for > a very wide variety of issues. > > I don't recall exactly what problems drove us to make pg_upgrade do > what it does with extensions, but we need a different fix for them.
Uh, pg_upgrade doesn't do anything special with extensions, so it must have been something people did in pg_dump. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Sun, May 27, 2012 at 12:08:14PM -0400, Bruce Momjian wrote: > > > We are not writing a one-off pg_upgrade for JSON-backpatchers here. > > > > I tend to agree with that position, and particularly think that we > > should not allow the not-community-approved design of the existing > > JSON backport to drive changes to pg_upgrade. It would be better to > > ask first if there were a different way to construct that backport > > that would fit better with pg_upgrade. > > Yep. > > A command-line flag just seems too user-visible for this use-case, and > too error-pone. I barely understand what is going on, particularly with > plpython in "public" (which we don't fully even understand yet), so > adding a command-line flag seems like the wrong direction. FYI, I recommend to Andrew that he just set probin to NULL for the JSON type in the old cluster before performing the upgrade. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
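For concreteness, that workaround would look roughly like the following, run as a superuser in the old cluster immediately before the upgrade; the '$libdir/json' name is only a guess at what a backport might use:

    -- hypothetical library name; substitute whatever the backport's probin actually says
    UPDATE pg_catalog.pg_proc
       SET probin = NULL
     WHERE probin = '$libdir/json'
       AND pronamespace = (SELECT oid FROM pg_catalog.pg_namespace
                           WHERE nspname = 'pg_catalog');

This keeps the functions out of pg_upgrade's library scan, at the cost of leaving them unloadable in the old cluster from that point on.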
On 05/27/2012 11:31 AM, Tom Lane wrote: > > > Having said that, I've got to also say that I think we've fundamentally > blown it with the current approach to upgrading extensions. Because we > dump all the extension member objects, the extension contents have got > to be restorable into a new database version as-is, and that throws away > most of the flexibility that we were trying to buy with the extension > mechanism. IMO we have *got* to get to a place where both pg_dump and > pg_upgrade dump extensions just as "CREATE EXTENSION", and the sooner > the better. Once we have that, this type of issue could be addressed by > having different contents of the extension creation script for different > major server versions --- or maybe even the same server version but > different python library versions, to take something on-point for this > discussion. For instance, Andrew's problem could be dealt with if the > backport were distributed as an extension "json-backport", and then all > that's needed in a new installation is an empty extension script of that > name. It sounds nice, but we'd have to make pg_upgrade drop its current assumption that libraries wanted in the old version will be named the same (one for one) as the libraries wanted in the new version. Currently it looks for every shared library named in probin (other than plpgsql.so) in the old cluster and tries to LOAD it in the new cluster, and errors out if it can't. My current unspeakably ugly workaround for this behaviour is to supply a dummy library for the new cluster. The only other suggestion I have heard (from Bruce) to handle this is to null out the relevant probin entries before doing the upgrade. I'm not sure if that's better or worse. It is certainly just about as ugly. So pg_upgrade definitely needs to get a lot smarter IMNSHO. cheers andrew
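Concretely, the check being described boils down to pg_upgrade connecting to the new cluster and, for each distinct probin value found in the old cluster (plpgsql excepted, as Andrew notes), running the equivalent of:

    LOAD '$libdir/plpython2';   -- library name taken from the old cluster's probin

and reporting an error if any such LOAD fails, which is why a renamed or absent library can only be papered over with a dummy .so or a catalog hack.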
On Sun, May 27, 2012 at 12:31:09PM -0400, Andrew Dunstan wrote: > > > On 05/27/2012 11:31 AM, Tom Lane wrote: > > > > > >Having said that, I've got to also say that I think we've fundamentally > >blown it with the current approach to upgrading extensions. Because we > >dump all the extension member objects, the extension contents have got > >to be restorable into a new database version as-is, and that throws away > >most of the flexibility that we were trying to buy with the extension > >mechanism. IMO we have *got* to get to a place where both pg_dump and > >pg_upgrade dump extensions just as "CREATE EXTENSION", and the sooner > >the better. Once we have that, this type of issue could be addressed by > >having different contents of the extension creation script for different > >major server versions --- or maybe even the same server version but > >different python library versions, to take something on-point for this > >discussion. For instance, Andrew's problem could be dealt with if the > >backport were distributed as an extension "json-backport", and then all > >that's needed in a new installation is an empty extension script of that > >name. > > > > It sounds nice, but we'd have to make pg_upgrade drop its current > assumption that libraries wanted in the old version will be named > the same (one for one) as the libraries wanted in the new version. > Currently it looks for every shared library named in probin (other > than plpgsql.so) in the old cluster and tries to LOAD it in the new > cluster, and errors out if it can't. I didn't fully understand this. Are you saying pg_upgrade will check some extension config file for the library name? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On 05/27/2012 12:50 PM, Bruce Momjian wrote: > On Sun, May 27, 2012 at 12:31:09PM -0400, Andrew Dunstan wrote: >> >> On 05/27/2012 11:31 AM, Tom Lane wrote: >>> >>> Having said that, I've got to also say that I think we've fundamentally >>> blown it with the current approach to upgrading extensions. Because we >>> dump all the extension member objects, the extension contents have got >>> to be restorable into a new database version as-is, and that throws away >>> most of the flexibility that we were trying to buy with the extension >>> mechanism. IMO we have *got* to get to a place where both pg_dump and >>> pg_upgrade dump extensions just as "CREATE EXTENSION", and the sooner >>> the better. Once we have that, this type of issue could be addressed by >>> having different contents of the extension creation script for different >>> major server versions --- or maybe even the same server version but >>> different python library versions, to take something on-point for this >>> discussion. For instance, Andrew's problem could be dealt with if the >>> backport were distributed as an extension "json-backport", and then all >>> that's needed in a new installation is an empty extension script of that >>> name. >> >> >> It sounds nice, but we'd have to make pg_upgrade drop its current >> assumption that libraries wanted in the old version will be named >> the same (one for one) as the libraries wanted in the new version. >> Currently it looks for every shared library named in probin (other >> than plpgsql.so) in the old cluster and tries to LOAD it in the new >> cluster, and errors out if it can't. > I didn't fully understand this. Are you saying pg_upgrade will check > some extension config file for the library name? AIUI, for Tom's scheme to work pg_upgrade would need not to check libraries that belonged to extension functions. Currently it simply assumes that what is present in the old cluster is exactly what will be needed in the new cluster. Tom's proposed scheme would completely invalidate that assumption (which I would argue is already of doubtful validity). Instead of explicitly trying to LOAD the libraries it would have to rely on the "CREATE EXTENSION foo" failing, presumably at some time before it would be too late to abort. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > AIUI, for Tom's scheme to work pg_upgrade would need not to check > libraries that belonged to extension functions. Currently it simply > assumes that what is present in the old cluster is exactly what will be > needed in the new cluster. Tom's proposed scheme would completely > invalidate that assumption (which I would argue is already of doubtful > validity). Instead of explicitly trying to LOAD the libraries it would > have to rely on the "CREATE EXTENSION foo" failing, presumably at some > time before it would be too late to abort. Well, the scheme I had in mind would require pg_upgrade to verify that the new cluster contains an extension control file for each extension in the old cluster (which is something it really oughta do anyway, if it doesn't already). After that, though, it ought not be looking at the individual functions defined by an extension --- it is the extension's business which libraries those are in. The only reason for pg_upgrade to still look at probin at all would be to cover C functions that weren't within extensions. In the long run it might be possible to consider those unsupported, or at least not common enough to justify a safety check in pg_upgrade. regards, tom lane
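The old-cluster side of that check is just an inventory of installed extensions, something like the query below; the new-cluster side would then look for a matching extension control file on disk instead of probing each member function's probin. This is a sketch of the idea, not existing pg_upgrade code:

    -- run in each old-cluster database to collect the extensions that must exist
    SELECT extname, extversion
    FROM pg_catalog.pg_extension
    ORDER BY extname;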
Bruce Momjian <bruce@momjian.us> writes: > On Sun, May 27, 2012 at 11:31:12AM -0400, Tom Lane wrote: >> I don't recall exactly what problems drove us to make pg_upgrade do >> what it does with extensions, but we need a different fix for them. > Uh, pg_upgrade doesn't do anything special with extensions, so it must > have been something people did in pg_dump. Most of the dirty work is in pg_dump --binary-upgrade, but it's pretty disingenuous of you to disavow responsibility for those kluges. They are part and parcel of pg_upgrade IMO. regards, tom lane
On 05/27/2012 02:32 PM, Tom Lane wrote: > Andrew Dunstan<andrew@dunslane.net> writes: >> AIUI, for Tom's scheme to work pg_upgrade would need not to check >> libraries that belonged to extension functions. Currently it simply >> assumes that what is present in the old cluster is exactly what will be >> needed in the new cluster. Tom's proposed scheme would completely >> invalidate that assumption (which I would argue is already of doubtful >> validity). Instead of explicitly trying to LOAD the libraries it would >> have to rely on the "CREATE EXTENSION foo" failing, presumably at some >> time before it would be too late to abort. > Well, the scheme I had in mind would require pg_upgrade to verify that > the new cluster contains an extension control file for each extension in > the old cluster (which is something it really oughta do anyway, if it > doesn't already). After that, though, it ought not be looking at the > individual functions defined by an extension --- it is the extension's > business which libraries those are in. > Yeah, that makes sense. cheers andrew
Tom Lane <tgl@sss.pgh.pa.us> writes: > Well, the scheme I had in mind would require pg_upgrade to verify that > the new cluster contains an extension control file for each extension in > the old cluster (which is something it really oughta do anyway, if it > doesn't already). After that, though, it ought not be looking at the > individual functions defined by an extension --- it is the extension's > business which libraries those are in. I have some plans that we will be discussing later in the new dev cycle that would impact such a method if we're to follow them. To better solve both the per-system (not even cluster) and per-database extension versions and the inline/os-packaged extension discrepancy, I'm thinking that we should move the extension support files from their shared OS location to a per-database location at CREATE EXTENSION time. That would mean managing several copies of those so that each database can actually implement a different version and upgrade cycle of an extension, including its shared object libraries (some more session state) in the simplest cases (no extra shared object dependencies). I don't intend to discuss the details of that proposal here yet; that's just a hint so that you know that I intend to rework the current control file strategy that we have, so any work on that in pg_upgrade in the next cycle will possibly be impacted. > The only reason for pg_upgrade to still look at probin at all would be > to cover C functions that weren't within extensions. In the long run it > might be possible to consider those unsupported, or at least not common > enough to justify a safety check in pg_upgrade. I'm thinking about optionally forbidding creating functions written in C or installed into pg_catalog when not done via an extension script, and maybe later down the road making that the default. The pg_catalog case makes sense as going via an extension's script is the only way I know of to dump/restore the function. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes: > I have some plans that we will be discussing later in the new dev cycle > and that would impact such a method if we're to follow them. To better > solve both the per-system (not even cluster) and per-database extension > versions and the inline/os-packaged extension discrepancy, I'm thinking > that we should move the extension support files from their shared OS > location to a per-database location at CREATE EXTENSION time. As a packager, I can say that moving shared libraries in such a way is an absolute nonstarter, as in don't even bother to propose it because it is not going to happen. Putting shared libraries into a postgres-writable directory will be seen (correctly) as a security hole of the first magnitude, not to mention that in many systems it'd require root privilege anyway to adjust the dynamic linker's search path. You could possibly make per-database copies of the control and script files, but I don't see much point in that if you can't similarly version-control the shared libraries. I think we're better off sticking to the assumption that the files constituting an extension are read-only so far as the database server is concerned. regards, tom lane
On Sun, May 27, 2012 at 11:31 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I don't recall exactly what problems drove us to make pg_upgrade do > what it does with extensions, but we need a different fix for them. Well, you need pg_upgrade to preserve the OIDs of objects that are part of extensions just as you do for any other objects. If pg_dump --binary-upgrade just emits "CREATE EXTENSION snarfle", for some extension snarfle that provides an eponymous type, then the new cluster is going to end up with a type with a different OID than whatever existed in the old cluster, and that's going to break all sorts of things - e.g. arrays already on disk that contain the old type OID. I think that it would be an extremely fine thing if we could fix the world so that we did not have to preserve OIDs across a pg_upgrade; that would be all kinds of wonderful. However, I think that's likely to be quite a difficult project. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
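A small illustration of Robert's array example: the element type's OID is stored inside every array value on disk, so a type recreated by a plain CREATE EXTENSION with a fresh OID would no longer match what existing array data refers to.

    -- assumes the hstore extension is installed; shows the type OID that
    -- on-disk hstore[] array values carry in their headers
    SELECT 'hstore'::regtype::oid;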
Tom Lane <tgl@sss.pgh.pa.us> writes: > is not going to happen. Putting shared libraries into a > postgres-writable directory will be seen (correctly) as a security hole > of the first magnitude, not to mention that in many systems it'd require > root privilege anyway to adjust the dynamic linker's search path. It seems I keep forgetting about security concerns. Thanks for the heads-up, will try to think about something else then. Damn. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Sun, May 27, 2012 at 02:39:28PM -0400, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > On Sun, May 27, 2012 at 11:31:12AM -0400, Tom Lane wrote: > >> I don't recall exactly what problems drove us to make pg_upgrade do > >> what it does with extensions, but we need a different fix for them. > > > Uh, pg_upgrade doesn't do anything special with extensions, so it must > > have been something people did in pg_dump. > > Most of the dirty work is in pg_dump --binary_upgrade, but it's pretty > disingenuous of you to disavow responsibility for those kluges. They > are part and parcel of pg_upgrade IMO. True. I was just saying I did not write any of that code and have not studied it. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Sun, May 27, 2012 at 02:32:52PM -0400, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > > AIUI, for Tom's scheme to work pg_upgrade would need not to check > > libraries that belonged to extension functions. Currently it simply > > assumes that what is present in the old cluster is exactly what will be > > needed in the new cluster. Tom's proposed scheme would completely > > invalidate that assumption (which I would argue is already of doubtful > > validity). Instead of explicitly trying to LOAD the libraries it would > > have to rely on the "CREATE EXTENSION foo" failing, presumably at some > > time before it would be too late to abort. > > Well, the scheme I had in mind would require pg_upgrade to verify that > the new cluster contains an extension control file for each extension in > the old cluster (which is something it really oughta do anyway, if it > doesn't already). After that, though, it ought not be looking at the > individual functions defined by an extension --- it is the extension's > business which libraries those are in. I assumed I could just have pg_upgrade create and drop the extension in the new database to make sure it works. In the JSON backpatch case, the extension file would just do nothing, as has already been suggested. In fact, can we identify right now if a function is used only by an extension? If so, we could try that idea with our existing infrastructure. That would allow extensions to change their shared object file names with no effect on pg_upgrade. This would allow us to skip checking pg_catalog functions because those are now (mostly?) extensions. > The only reason for pg_upgrade to still look at probin at all would be > to cover C functions that weren't within extensions. In the long run it > might be possible to consider those unsupported, or at least not common > enough to justify a safety check in pg_upgrade. What about C triggers and SPI functions? Do we really want to require extension files for them? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Bruce Momjian <bruce@momjian.us> writes: > I assumed I could just have pg_upgrade create and drop the extension in > the new database to make sure it works. In the JSON backpatch case, the > extension file would just do nothing, as has already been suggested. It seems like checking for the control file being present should be sufficient. I don't think it's part of pg_upgrade's job description to test whether the new installation is broken. And we don't really want it cluttering the new installation with dead objects right off the bat (could cause OID issues or LSN issues, for instance). > In fact, can we identify right now if a function is used only by an > extension? ITYM is the function defined by an extension, and the answer to that is "look in pg_depend". regards, tom lane
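For reference, the pg_depend test Tom is pointing at is the 'e' (extension membership) dependency; a sketch of a query listing the C functions an extension owns, whose libraries could therefore be left to CREATE EXTENSION, might be:

    -- C functions that are members of an extension, with the library each needs
    SELECT e.extname, p.oid::regprocedure AS func, p.probin
    FROM pg_catalog.pg_proc p
         JOIN pg_catalog.pg_depend d
           ON d.classid = 'pg_catalog.pg_proc'::regclass
          AND d.objid = p.oid
         JOIN pg_catalog.pg_extension e
           ON d.refclassid = 'pg_catalog.pg_extension'::regclass
          AND d.refobjid = e.oid
    WHERE d.deptype = 'e'
      AND p.probin IS NOT NULL;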
On Tue, May 29, 2012 at 01:46:28PM -0400, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > I assumed I could just have pg_upgrade create and drop the extension in > > the new database to make sure it works. In the JSON backpatch case, the > > extension file would just do nothing, as has already been suggested. > > It seems like checking for the control file being present should be > sufficient. I don't think it's part of pg_upgrade's job description to > test whether the new installation is broken. And we don't really want > it cluttering the new installation with dead objects right off the bat > (could cause OID issues or LSN issues, for instance). True. I just wasn't sure the control file method was fool-proof enough. > > In fact, can we identify right now if a function is used only by an > > extension? > > ITYM is the function defined by an extension, and the answer to that is > "look in pg_depend". So is this something I should be exploring, or not worth it at this time? It would allow changing the names of extension shared object files, but right now I don't know anyone doing that, so I am not sure of the value of the change. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Bruce Momjian <bruce@momjian.us> writes: > On Tue, May 29, 2012 at 01:46:28PM -0400, Tom Lane wrote: >> Bruce Momjian <bruce@momjian.us> writes: >>> In fact, can we identify right now if a function is used only by an >>> extension? >> ITYM is the function defined by an extension, and the answer to that is >> "look in pg_depend". > So is this something I should be exploring, or not worth it at this > time? It would allow changing the names of extension shared object > files, but right now I don't know anyone doing that, so I am not sure of > the value of the change. Well, it'd eliminate the kluges for plpython, as well as possibly fixing Andrew's JSON backport issue, and going forward there are going to be more things like that. So I think it's something to pursue, but it's not going to be a simple change unfortunately. As Robert pointed out, we have to be able to preserve OIDs for at least the types exported by an extension. So that seems to mean that we need some mechanism intermediate between "just run the new extension script" and the current "physically duplicate the extension's contents" approach. I'm not real sure what that should look like. As a straw man, I could imagine making pg_dump --binary-upgrade produce something like CREATE EXTENSION hstore WITH OIDS (TYPE hstore = 12345, TYPE hstore[] = 12346); and then having code in CREATE EXTENSION that does the equivalent of the OID-forcing hacks when we're about to create an object that matches something in the WITH OIDS list. Or maybe it'd be better to leave the SQL command alone, and generalize the existing OID-forcing hooks so that we can pre-set a list of names to force OIDs for, and then everything can happen behind the back of CREATE EXTENSION. regards, tom lane
On Tue, May 29, 2012 at 02:38:03PM -0400, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > On Tue, May 29, 2012 at 01:46:28PM -0400, Tom Lane wrote: > >> Bruce Momjian <bruce@momjian.us> writes: > >>> In fact, can we identify right now if a function is used only by an > >>> extension? > > >> ITYM is the function defined by an extension, and the answer to that is > >> "look in pg_depend". > > > So is this something I should be exploring, or not worth it at this > > time? It would allow changing the names of extension shared object > > files, but right now I don't know anyone doing that, so I am not sure of > > the value of the change. > > Well, it'd eliminate the kluges for plpython, as well as possibly fixing > Andrew's JSON backport issue, and going forward there are going to be > more things like that. So I think it's something to pursue, but it's > not going to be a simple change unfortunately. I am confused how they fix the plpython issue (mapping plpython.so to plpython2.so). Is there an extension for these languages or is it just pg_pltemplate? Which extensions have config files? > As Robert pointed out, we have to be able to preserve OIDs for at least > the types exported by an extension. So that seems to mean that we need > some mechanism intermediate between "just run the new extension script" > and the current "physically duplicate the extension's contents" > approach. I'm not real sure what that should look like. As a straw > man, I could imagine making pg_dump --binary-upgrade produce something > like > > CREATE EXTENSION hstore > WITH OIDS (TYPE hstore = 12345, TYPE hstore[] = 12346); > > and then having code in CREATE EXTENSION that does the equivalent of the > OID-forcing hacks when we're about to create an object that matches > something in the WITH OIDS list. Or maybe it'd be better to leave the > SQL command alone, and generalize the existing OID-forcing hooks so that > we can pre-set a list of names to force OIDs for, and then everything > can happen behind the back of CREATE EXTENSION. Yes, we have used those hooks before, and they are pretty clean. I wonder if the OID preservation designation has to be _in_ the extension script, but I am unclear how that would be passed to pg_upgrade or pg_dumpall. As you said, it is pretty complex, and I am seeing that now. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Bruce Momjian <bruce@momjian.us> writes: > On Tue, May 29, 2012 at 02:38:03PM -0400, Tom Lane wrote: >> Well, it'd eliminate the kluges for plpython, as well as possibly fixing >> Andrew's JSON backport issue, and going forward there are going to be >> more things like that. So I think it's something to pursue, but it's >> not going to be a simple change unfortunately. > I am confused how they fix the plpython issue (mapping plpython.so to > plpython2.so). Is there an extension for these languages or is it just > pg_pltemplate? Which extensions have config files? Hmmm ... that is a good point; right now, there are probably still an awful lot of installations that have PL languages installed "bare" rather than wrapped in an extension, including all the ones where the plpython library has the old name. So a fix that only considers extensions doesn't get you out of the plpython problem. (Eventually I would like to see "CREATE LANGUAGE foo" with no parameters redefined as doing "CREATE EXTENSION foo", and doing away with pg_pltemplate; but it didn't get done for 9.2.) Actually, though, I don't see why pg_upgrade couldn't treat PL languages, even bare ones, the same way I'm proposing for extensions. Right now, it's already the case AFAICS that pg_dump --binary-upgrade does not dump PL support functions for PLs listed in pg_pltemplate; it just emits the CREATE LANGUAGE command and nothing else. This being the case, it's simply wrong for pg_upgrade to be checking shlib names for the old support functions, because that has got exactly nothing to do with whether the CREATE LANGUAGE command will succeed in the new installation. There are at least two ways you could check that condition without assuming that the shlib name has stayed the same: 1. Look for an extension control file for the language name in the new installation; if present, assume the proper shlib is installed too. 2. Look at the pg_pltemplate entries for the language in the new database and see if they point to a shlib that exists. I'd vote for #1 on the grounds that it's simpler and won't break when we remove pg_pltemplate. It does assume that packagers of third-party PLs have gotten off their duffs and extension-ified their languages. I would think that would have been a pretty popular thing to do, though, since it so greatly reduces the installation complexity for a language not known to pg_pltemplate. regards, tom lane
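For completeness, option #2 would amount to asking the new cluster's template which library the language actually needs, rather than trusting the old cluster's probin; a sketch:

    -- plpythonu is just an example language name
    SELECT tmplname, tmplhandler, tmpllibrary
    FROM pg_catalog.pg_pltemplate
    WHERE tmplname = 'plpythonu';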
On Tue, May 29, 2012 at 04:10:41PM -0400, Tom Lane wrote: > Hmmm ... that is a good point; right now, there are probably still an > awful lot of installations that have PL languages installed "bare" > rather than wrapped in an extension, including all the ones where the > plpython library has the old name. So a fix that only considers > extensions doesn't get you out of the plpython problem. > > (Eventually I would like to see "CREATE LANGUAGE foo" with no parameters > redefined as doing "CREATE EXTENSION foo", and doing away with > pg_pltemplate; but it didn't get done for 9.2.) > > Actually, though, I don't see why pg_upgrade couldn't treat PL > languages, even bare ones, the same way I'm proposing for extensions. > Right now, it's already the case AFAICS that pg_dump --binary-upgrade > does not dump PL support functions for PLs listed in pg_pltemplate; > it just emits the CREATE LANGUAGE command and nothing else. This being > the case, it's simply wrong for pg_upgrade to be checking shlib names > for the old support functions, because that has got exactly nothing > to do with whether the CREATE LANGUAGE command will succeed in the new > installation. There are at least two ways you could check that > condition without assuming that the shlib name has stayed the same: > > 1. Look for an extension control file for the language name in the new > installation; if present, assume the proper shlib is installed too. > > 2. Look at the pg_pltemplate entries for the language in the new > database and see if they point to a shlib that exists. > > I'd vote for #1 on the grounds that it's simpler and won't break when > we remove pg_pltemplate. It does assume that packagers of third-party > PLs have gotten off their duffs and extension-ified their languages. > I would think that would have been a pretty popular thing to do, though, > since it so greatly reduces the installation complexity for a language > not known to pg_pltemplate. Uh, there is a pg_upgrade C comment on this, so we did discuss it at some point; see second paragraph: * In Postgres 9.0, Python 3 support was added, and to do that, a * plpython2u language was created with library name plpython2.so * as a symbolic link to plpython.so. In Postgres 9.1, only the * plpython2.so library was created, and both plpythonu and * plpython2u pointing to it. For this reason, any reference to * library name "plpython" in an old PG <= 9.1 cluster must look * for "plpython2" in the new cluster. * * For this case, we could check pg_pltemplate, but that only works * for languages, and does not help with function shared objects, * so we just do a general fix. The bottom line is that we already needed function shared object checking, so we just wrapped languages into that as well. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Bruce Momjian <bruce@momjian.us> writes: > The bottom line is that we already needed function shared object checking, > so we just wrapped languages into that as well. Well, that'd be fine if it actually *worked*, but as per this discussion, it doesn't work; you have to kluge around the fact that the shared library name changed. I'm suggesting that pg_upgrade needs to be revised to treat this stuff at a higher level. Yeah, you need to look at shared library names for bare C functions, but we should be getting away from that wherever there is a higher-level construct such as an extension or PL. regards, tom lane
On Tue, May 29, 2012 at 05:35:18PM -0400, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > The bottom line is that we already needed function shared object checking, > > so we just wrapped languages into that as well. > > Well, that'd be fine if it actually *worked*, but as per this > discussion, it doesn't work; you have to kluge around the fact that > the shared library name changed. I'm suggesting that pg_upgrade needs > to be revised to treat this stuff at a higher level. Yeah, you need to > look at shared library names for bare C functions, but we should be > getting away from that wherever there is a higher-level construct such > as an extension or PL. Well, we have three problems. One is the JSON problem that Andrew wants skipped, another is the plpython rename which we adjust in the check, and the third is the plpython support function that is in the public schema, for which I am working on a patch to throw a meaningful error. The only one that actually is set up for a rename is the second one. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Mon, 2012-05-28 at 16:06 -0400, Robert Haas wrote: > On Sun, May 27, 2012 at 11:31 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I don't recall exactly what problems drove us to make pg_upgrade do > > what it does with extensions, but we need a different fix for them. > > Well, you need pg_upgrade to preserve the OIDs of objects that are > part of extensions just as you do for any other objects. Also, I think it needs to force the extension version to match the old cluster. Otherwise, we could be dealing with on-disk format changes, or other complexities. It doesn't sound difficult, but I thought I'd bring it up. Regards, Jeff Davis
Jeff Davis <pgsql@j-davis.com> writes: > Also, I think it needs to force the extension version to match the old > cluster. Otherwise, we could be dealing with on-disk format changes, or > other complexities. Yeah, I was thinking we might want --binary-upgrade to specify a particular extension version in CREATE EXTENSION instead of letting it default. I have not really thought through the pros and cons of that, but it seems plausible. regards, tom lane
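For what it's worth, the VERSION clause already exists in CREATE EXTENSION, so pinning the version in --binary-upgrade output could look like the line below; whether pg_dump should actually emit it is exactly the open question here:

    -- hypothetical --binary-upgrade output; the version string would come from
    -- the old cluster's pg_extension.extversion
    CREATE EXTENSION hstore WITH VERSION '1.0';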