Thread: pg_upgrade libraries check
pg_upgrade is a little over-keen about checking for shared libraries that back functions. In particular, it checks for libraries that support functions created in pg_catalog, even if pg_dump doesn't export the function. The attached patch mimics the filter that pg_dump uses for functions so that only the relevant libraries are checked. This would remove the need for a particularly ugly hack in making the 9.1 backport of JSON binary upgradeable. cheers andrew
Attachment
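For context, the patch itself is not reproduced in the thread, but the idea amounts to narrowing pg_upgrade's library query with the same kind of user-schema filter pg_dump applies. A rough sketch of such a query (an illustration of the idea, not the actual patch) could look like this:

    -- illustrative only: approximates the proposed filter, not taken from the patch
    SELECT DISTINCT p.probin
    FROM pg_catalog.pg_proc p
         JOIN pg_catalog.pg_namespace n ON n.oid = p.pronamespace
    WHERE p.prolang = (SELECT oid FROM pg_catalog.pg_language WHERE lanname = 'c')
      AND p.probin IS NOT NULL
      AND n.nspname NOT IN ('pg_catalog', 'information_schema');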
On Fri, May 25, 2012 at 10:20:29AM -0400, Andrew Dunstan wrote: > pg_upgrade is a little over-keen about checking for shared libraries > that back functions. In particular, it checks for libraries that > support functions created in pg_catalog, even if pg_dump doesn't > export the function. > > The attached patch mimics the filter that pg_dump uses for functions > so that only the relevant libraries are checked. > > This would remove the need for a particularly ugly hack in making > the 9.1 backport of JSON binary upgradeable. Andrew is right that pg_upgrade is overly restrictive in checking _any_ shared object file referenced in pg_proc. I never expected that pg_catalog would have such references, but in Andrew's case it does, and pg_dump doesn't dump them, so I guess pg_upgrade shouldn't check them either. In some sense this is a hack for the JSON type, but it also gives users a way to create shared object references in old clusters that are _not_ checked by pg_upgrade, and not migrated to the new server, so I suppose it is fine. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Fri, May 25, 2012 at 11:08:10PM -0400, Bruce Momjian wrote: > On Fri, May 25, 2012 at 10:20:29AM -0400, Andrew Dunstan wrote: > > pg_upgrade is a little over-keen about checking for shared libraries > > that back functions. In particular, it checks for libraries that > > support functions created in pg_catalog, even if pg_dump doesn't > > export the function. > > > > The attached patch mimics the filter that pg_dump uses for functions > > so that only the relevant libraries are checked. > > > > This would remove the need for a particularly ugly hack in making > > the 9.1 backport of JSON binary upgradeable. > > Andrew is right that pg_upgrade is overly restrictive in checking _any_ > shared object file referenced in pg_proc. I never expected that > pg_catalog would have such references, but in Andrew's case it does, and > pg_dump doesn't dump them, so I guess pg_upgrade shouldn't check them > either. > > In some sense this is a hack for the JSON type, but it also gives users > a way to create shared object references in old clusters that are _not_ > checked by pg_upgrade, and not migrated to the new server, so I suppose > it is fine. OK, now I know it is _not_ fine. :-( I just realized the problem as part of debugging the report of a problem with plpython_call_handler(): http://archives.postgresql.org/pgsql-hackers/2012-03/msg01101.php http://archives.postgresql.org/pgsql-bugs/2012-05/msg00205.php The problem is that functions defined in the "pg_catalog" schema, while not explicitly dumped by pg_dump, are implicitly dumped by things like CREATE LANGUAGE plperl. I have added a pg_upgrade C comment documenting this issue in case we revisit it later. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Attachment
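To spell out what "implicitly dumped" means here: a bare CREATE LANGUAGE in the restore script recreates the language's pg_catalog support functions behind the scenes, so the new cluster still needs the language's shared library even though pg_dump never emitted those CREATE FUNCTION statements. For example:

    -- plperl is used purely as an illustration of the pattern
    CREATE LANGUAGE plperl;   -- fails at restore time if plperl.so is not installed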
On 05/27/2012 06:40 AM, Bruce Momjian wrote: > On Fri, May 25, 2012 at 11:08:10PM -0400, Bruce Momjian wrote: >> On Fri, May 25, 2012 at 10:20:29AM -0400, Andrew Dunstan wrote: >>> pg_upgrade is a little over-keen about checking for shared libraries >>> that back functions. In particular, it checks for libraries that >>> support functions created in pg_catalog, even if pg_dump doesn't >>> export the function. >>> >>> The attached patch mimics the filter that pg_dump uses for functions >>> so that only the relevant libraries are checked. >>> >>> This would remove the need for a particularly ugly hack in making >>> the 9.1 backport of JSON binary upgradeable. >> Andrew is right that pg_upgrade is overly restrictive in checking _any_ >> shared object file referenced in pg_proc. I never expected that >> pg_catalog would have such references, but in Andrew's case it does, and >> pg_dump doesn't dump them, so I guess pg_upgrade shouldn't check them >> either. >> >> In some sense this is a hack for the JSON type, but it also gives users >> a way to create shared object references in old clusters that are _not_ >> checked by pg_upgrade, and not migrated to the new server, so I suppose >> it is fine. > OK, now I know it is _not_ fine. :-( > > I just realized the problem as part of debugging the report of a problem > with plpython_call_handler(): > > http://archives.postgresql.org/pgsql-hackers/2012-03/msg01101.php > http://archives.postgresql.org/pgsql-bugs/2012-05/msg00205.php > > The problem is that functions defined in the "pg_catalog" schema, while > no explicitly dumped by pg_dump, are implicitly dumped by things like > CREATE LANGUAGE plperl. > > I have added a pg_upgrade C comment documenting this issue in case we > revisit it later. "things like CREATE LANGUAGE plperl" is a rather vague phrase. The PL case could be easily handled by adding this to the query: OR EXISTS (SELECT 1 FROM pg_catalog.pg_language WHERE lanplcallfoid = p.oid) Do you know of any other cases that this would miss? The fact is that unless we do something like this there is a potential for unnecessary pg_upgrade failures. The workaround I am currently using for the JSON backport of having to supply a dummy shared library is almost unspeakably ugly. If you won't consider changing the query, how about an option to explicitly instruct pg_upgrade to ignore a certain library in its checks? cheers andrew
On Sun, May 27, 2012 at 08:48:54AM -0400, Andrew Dunstan wrote: > > >I just realized the problem as part of debugging the report of a problem > >with plpython_call_handler(): > > > > http://archives.postgresql.org/pgsql-hackers/2012-03/msg01101.php > > http://archives.postgresql.org/pgsql-bugs/2012-05/msg00205.php > > > >The problem is that functions defined in the "pg_catalog" schema, while > >not explicitly dumped by pg_dump, are implicitly dumped by things like > >CREATE LANGUAGE plperl. > > > >I have added a pg_upgrade C comment documenting this issue in case we > >revisit it later. > > > "things like CREATE LANGUAGE plperl" is a rather vague phrase. The > PL case could be easily handled by adding this to the query: > > OR EXISTS (SELECT 1 FROM pg_catalog.pg_language WHERE lanplcallfoid > = p.oid) > > > Do you know of any other cases that this would miss? The problem is I don't know. I don't know in what places we reference shared object files implicitly but not explicitly, and I can't know what future places we might do this. > The fact is that unless we do something like this there is a > potential for unnecessary pg_upgrade failures. The workaround I am > currently using for the JSON backport of having to supply a dummy > shared library is almost unspeakably ugly. If you won't consider > changing the query, how about an option to explicitly instruct > pg_upgrade to ignore a certain library in its checks? The plpython_call_handler case I mentioned is a good example of a place where a pg_upgrade hack to map plpython to plpython2 has caused pg_upgrade "check" to succeed, but the actual pg_upgrade to fail --- certainly a bad thing, and something we would like to avoid. This kind of tinkering can easily cause such problems. We are not writing a one-off pg_upgrade for JSON-backpatchers here. If you want to create a new pg_upgrade binary with that hack, feel free to do so. Unless someone can explain a second use case for this, I am not ready to risk making pg_upgrade more unstable, and I don't think the community is either. I am not the person who decides if this gets added to pg_upgrade, but I am guessing what the community would want here. FYI, your fix would not address the plpython_call_handler problem because in that case we are actually dumping that function that references the shared object, and the pg_dumpall restore will fail. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Bruce Momjian <bruce@momjian.us> writes: > On Sun, May 27, 2012 at 08:48:54AM -0400, Andrew Dunstan wrote: >> "things like CREATE LANGUAGE plperl" is a rather vague phrase. The >> PL case could be easily handled by adding this to the query: >> OR EXISTS (SELECT 1 FROM pg_catalog.pg_language WHERE lanplcallfoid >> = p.oid) >> Do you know of any other cases that this would miss? Well, laninline and lanvalidator for two ;-) > The problem is I don't know. I don't know in what places we reference > shared object files implicit but not explicitly, and I can't know what > future places we might do this. The "future changes" argument seems like a straw man to me. We're already in the business of adjusting pg_upgrade when we make significant catalog changes. > We are not writing a one-off pg_upgrade for JSON-backpatchers here. I tend to agree with that position, and particularly think that we should not allow the not-community-approved design of the existing JSON backport to drive changes to pg_upgrade. It would be better to ask first if there were a different way to construct that backport that would fit better with pg_upgrade. Having said that, I've got to also say that I think we've fundamentally blown it with the current approach to upgrading extensions. Because we dump all the extension member objects, the extension contents have got to be restorable into a new database version as-is, and that throws away most of the flexibility that we were trying to buy with the extension mechanism. IMO we have *got* to get to a place where both pg_dump and pg_upgrade dump extensions just as "CREATE EXTENSION", and the sooner the better. Once we have that, this type of issue could be addressed by having different contents of the extension creation script for different major server versions --- or maybe even the same server version but different python library versions, to take something on-point for this discussion. For instance, Andrew's problem could be dealt with if the backport were distributed as an extension "json-backport", and then all that's needed in a new installation is an empty extension script of that name. More generally, this would mean that cross-version compatibility problems for extensions could generally be solved in the extension scripts, and not with kluges in pg_upgrade. As things stand, you can be sure that kluging pg_upgrade is going to be the only possible fix for a very wide variety of issues. I don't recall exactly what problems drove us to make pg_upgrade do what it does with extensions, but we need a different fix for them. regards, tom lane
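As an illustration of how much wider that net is: the pg_catalog functions such a clause would exempt are exactly those that some procedural language points at as call handler, inline handler, or validator. A query along these lines (a sketch, not part of any posted patch) lists them:

    -- list the C support functions referenced by any procedural language
    SELECT l.lanname, p.oid::regprocedure AS func, p.probin
    FROM pg_catalog.pg_language l
         JOIN pg_catalog.pg_proc p
           ON p.oid IN (l.lanplcallfoid, l.laninline, l.lanvalidator)
    WHERE p.probin IS NOT NULL;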
On Sun, May 27, 2012 at 11:31:12AM -0400, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > On Sun, May 27, 2012 at 08:48:54AM -0400, Andrew Dunstan wrote: > >> "things like CREATE LANGUAGE plperl" is a rather vague phrase. The > >> PL case could be easily handled by adding this to the query: > >> OR EXISTS (SELECT 1 FROM pg_catalog.pg_language WHERE lanplcallfoid > >> = p.oid) > >> Do you know of any other cases that this would miss? > > Well, laninline and lanvalidator for two ;-) > > > The problem is I don't know. I don't know in what places we reference > > shared object files implicitly but not explicitly, and I can't know what > > future places we might do this. > > The "future changes" argument seems like a straw man to me. We're > already in the business of adjusting pg_upgrade when we make significant > catalog changes. The bottom line is I just don't understand the rules of when a function in the pg_catalog schema implicitly creates something that references a shared object, and unless someone tells me, I am inclined to just have pg_upgrade check everything and throw an error during 'check', rather than throw an error during the upgrade. If someone did tell me, I would be happy with modifying the pg_upgrade query to match. Also, pg_upgrade rarely requires adjustments for major version changes, and we want to keep it that way. > > We are not writing a one-off pg_upgrade for JSON-backpatchers here. > > I tend to agree with that position, and particularly think that we > should not allow the not-community-approved design of the existing > JSON backport to drive changes to pg_upgrade. It would be better to > ask first if there were a different way to construct that backport > that would fit better with pg_upgrade. Yep. A command-line flag just seems too user-visible for this use-case, and too error-prone. I barely understand what is going on, particularly with plpython in "public" (which we don't fully even understand yet), so adding a command-line flag seems like the wrong direction. > Having said that, I've got to also say that I think we've fundamentally > blown it with the current approach to upgrading extensions. Because we > dump all the extension member objects, the extension contents have got > to be restorable into a new database version as-is, and that throws away > most of the flexibility that we were trying to buy with the extension > mechanism. IMO we have *got* to get to a place where both pg_dump and > pg_upgrade dump extensions just as "CREATE EXTENSION", and the sooner > the better. Once we have that, this type of issue could be addressed by > having different contents of the extension creation script for different > major server versions --- or maybe even the same server version but > different python library versions, to take something on-point for this > discussion. For instance, Andrew's problem could be dealt with if the > backport were distributed as an extension "json-backport", and then all > that's needed in a new installation is an empty extension script of that > name. > > More generally, this would mean that cross-version compatibility > problems for extensions could generally be solved in the extension > scripts, and not with kludges in pg_upgrade. As things stand, you can be > sure that kludging pg_upgrade is going to be the only possible fix for > a very wide variety of issues. > > I don't recall exactly what problems drove us to make pg_upgrade do > what it does with extensions, but we need a different fix for them.
Uh, pg_upgrade doesn't do anything special with extensions, so it must have been something people did in pg_dump. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Sun, May 27, 2012 at 12:08:14PM -0400, Bruce Momjian wrote: > > > We are not writing a one-off pg_upgrade for JSON-backpatchers here. > > > > I tend to agree with that position, and particularly think that we > > should not allow the not-community-approved design of the existing > > JSON backport to drive changes to pg_upgrade. It would be better to > > ask first if there were a different way to construct that backport > > that would fit better with pg_upgrade. > > Yep. > > A command-line flag just seems too user-visible for this use-case, and > too error-pone. I barely understand what is going on, particularly with > plpython in "public" (which we don't fully even understand yet), so > adding a command-line flag seems like the wrong direction. FYI, I recommend to Andrew that he just set probin to NULL for the JSON type in the old cluster before performing the upgrade. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
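For concreteness, that workaround would look roughly like the following, run as a superuser in the old cluster immediately before the upgrade; the '$libdir/json' name is only a guess at what a backport might use:

    -- hypothetical library name; substitute whatever the backport's probin actually says
    UPDATE pg_catalog.pg_proc
       SET probin = NULL
     WHERE probin = '$libdir/json'
       AND pronamespace = (SELECT oid FROM pg_catalog.pg_namespace
                           WHERE nspname = 'pg_catalog');

This keeps the functions out of pg_upgrade's library scan, at the cost of leaving them unloadable in the old cluster from that point on.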
On 05/27/2012 11:31 AM, Tom Lane wrote: > > > Having said that, I've got to also say that I think we've fundamentally > blown it with the current approach to upgrading extensions. Because we > dump all the extension member objects, the extension contents have got > to be restorable into a new database version as-is, and that throws away > most of the flexibility that we were trying to buy with the extension > mechanism. IMO we have *got* to get to a place where both pg_dump and > pg_upgrade dump extensions just as "CREATE EXTENSION", and the sooner > the better. Once we have that, this type of issue could be addressed by > having different contents of the extension creation script for different > major server versions --- or maybe even the same server version but > different python library versions, to take something on-point for this > discussion. For instance, Andrew's problem could be dealt with if the > backport were distributed as an extension "json-backport", and then all > that's needed in a new installation is an empty extension script of that > name. It sounds nice, but we'd have to make pg_upgrade drop its current assumption that libraries wanted in the old version will be named the same (one for one) as the libraries wanted in the new version. Currently it looks for every shared library named in probin (other than plpgsql.so) in the old cluster and tries to LOAD it in the new cluster, and errors out if it can't. My current unspeakably ugly workaround for this behaviour is to supply a dummy library for the new cluster. The only other suggestion I have heard (from Bruce) to handle this is to null out the relevant probin entries before doing the upgrade. I'm not sure if that's better or worse. It is certainly just about as ugly. So pg_upgrade definitely needs to get a lot smarter IMNSHO. cheers andrew
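Concretely, the check being described boils down to pg_upgrade connecting to the new cluster and, for each distinct probin value found in the old cluster (plpgsql excepted, as Andrew notes), running the equivalent of:

    LOAD '$libdir/plpython2';   -- library name taken from the old cluster's probin

and reporting an error if any such LOAD fails, which is why a renamed or absent library can only be papered over with a dummy .so or a catalog hack.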
On Sun, May 27, 2012 at 12:31:09PM -0400, Andrew Dunstan wrote: > > > On 05/27/2012 11:31 AM, Tom Lane wrote: > > > > > >Having said that, I've got to also say that I think we've fundamentally > >blown it with the current approach to upgrading extensions. Because we > >dump all the extension member objects, the extension contents have got > >to be restorable into a new database version as-is, and that throws away > >most of the flexibility that we were trying to buy with the extension > >mechanism. IMO we have *got* to get to a place where both pg_dump and > >pg_upgrade dump extensions just as "CREATE EXTENSION", and the sooner > >the better. Once we have that, this type of issue could be addressed by > >having different contents of the extension creation script for different > >major server versions --- or maybe even the same server version but > >different python library versions, to take something on-point for this > >discussion. For instance, Andrew's problem could be dealt with if the > >backport were distributed as an extension "json-backport", and then all > >that's needed in a new installation is an empty extension script of that > >name. > > > > It sounds nice, but we'd have to make pg_upgrade drop its current > assumption that libraries wanted in the old version will be named > the same (one for one) as the libraries wanted in the new version. > Currently it looks for every shared library named in probin (other > than plpgsql.so) in the old cluster and tries to LOAD it in the new > cluster, and errors out if it can't. I didn't fully understand this. Are you saying pg_upgrade will check some extension config file for the library name? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On 05/27/2012 12:50 PM, Bruce Momjian wrote: > On Sun, May 27, 2012 at 12:31:09PM -0400, Andrew Dunstan wrote: >> >> On 05/27/2012 11:31 AM, Tom Lane wrote: >>> >>> Having said that, I've got to also say that I think we've fundamentally >>> blown it with the current approach to upgrading extensions. Because we >>> dump all the extension member objects, the extension contents have got >>> to be restorable into a new database version as-is, and that throws away >>> most of the flexibility that we were trying to buy with the extension >>> mechanism. IMO we have *got* to get to a place where both pg_dump and >>> pg_upgrade dump extensions just as "CREATE EXTENSION", and the sooner >>> the better. Once we have that, this type of issue could be addressed by >>> having different contents of the extension creation script for different >>> major server versions --- or maybe even the same server version but >>> different python library versions, to take something on-point for this >>> discussion. For instance, Andrew's problem could be dealt with if the >>> backport were distributed as an extension "json-backport", and then all >>> that's needed in a new installation is an empty extension script of that >>> name. >> >> >> It sounds nice, but we'd have to make pg_upgrade drop its current >> assumption that libraries wanted in the old version will be named >> the same (one for one) as the libraries wanted in the new version. >> Currently it looks for every shared library named in probin (other >> than plpgsql.so) in the old cluster and tries to LOAD it in the new >> cluster, and errors out if it can't. > I didn't fully understand this. Are you saying pg_upgrade will check > some extension config file for the library name? AIUI, for Tom's scheme to work pg_upgrade would need not to check libraries that belonged to extension functions. Currently it simply assumes that what is present in the old cluster is exactly what will be needed in the new cluster. Tom's proposed scheme would completely invalidate that assumption (which I would argue is already of doubtful validity). Instead of explicitly trying to LOAD the libraries it would have to rely on the "CREATE EXTENSION foo" failing, presumably at some time before it would be too late to abort. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > AIUI, for Tom's scheme to work pg_upgrade would need not to check > libraries that belonged to extension functions. Currently it simply > assumes that what is present in the old cluster is exactly what will be > needed in the new cluster. Tom's proposed scheme would completely > invalidate that assumption (which I would argue is already of doubtful > validity). Instead of explicitly trying to LOAD the libraries it would > have to rely on the "CREATE EXTENSION foo" failing, presumably at some > time before it would be too late to abort. Well, the scheme I had in mind would require pg_upgrade to verify that the new cluster contains an extension control file for each extension in the old cluster (which is something it really oughta do anyway, if it doesn't already). After that, though, it ought not be looking at the individual functions defined by an extension --- it is the extension's business which libraries those are in. The only reason for pg_upgrade to still look at probin at all would be to cover C functions that weren't within extensions. In the long run it might be possible to consider those unsupported, or at least not common enough to justify a safety check in pg_upgrade. regards, tom lane
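The old-cluster side of that check is just an inventory of installed extensions, something like the query below; the new-cluster side would then look for a matching extension control file on disk instead of probing each member function's probin. This is a sketch of the idea, not existing pg_upgrade code:

    -- run in each old-cluster database to collect the extensions that must exist
    SELECT extname, extversion
    FROM pg_catalog.pg_extension
    ORDER BY extname;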
Bruce Momjian <bruce@momjian.us> writes: > On Sun, May 27, 2012 at 11:31:12AM -0400, Tom Lane wrote: >> I don't recall exactly what problems drove us to make pg_upgrade do >> what it does with extensions, but we need a different fix for them. > Uh, pg_upgrade doesn't do anything special with extensions, so it must > have been something people did in pg_dump. Most of the dirty work is in pg_dump --binary-upgrade, but it's pretty disingenuous of you to disavow responsibility for those kluges. They are part and parcel of pg_upgrade IMO. regards, tom lane
On 05/27/2012 02:32 PM, Tom Lane wrote: > Andrew Dunstan<andrew@dunslane.net> writes: >> AIUI, for Tom's scheme to work pg_upgrade would need not to check >> libraries that belonged to extension functions. Currently it simply >> assumes that what is present in the old cluster is exactly what will be >> needed in the new cluster. Tom's proposed scheme would completely >> invalidate that assumption (which I would argue is already of doubtful >> validity). Instead of explicitly trying to LOAD the libraries it would >> have to rely on the "CREATE EXTENSION foo" failing, presumably at some >> time before it would be too late to abort. > Well, the scheme I had in mind would require pg_upgrade to verify that > the new cluster contains an extension control file for each extension in > the old cluster (which is something it really oughta do anyway, if it > doesn't already). After that, though, it ought not be looking at the > individual functions defined by an extension --- it is the extension's > business which libraries those are in. > Yeah, that makes sense. cheers andrew
Tom Lane <tgl@sss.pgh.pa.us> writes: > Well, the scheme I had in mind would require pg_upgrade to verify that > the new cluster contains an extension control file for each extension in > the old cluster (which is something it really oughta do anyway, if it > doesn't already). After that, though, it ought not be looking at the > individual functions defined by an extension --- it is the extension's > business which libraries those are in. I have some plans that we will be discussing later in the new dev cycle that would impact such a method if we're to follow them. To better solve both the per-system (not even cluster) and per-database extension versions and the inline/os-packaged extension discrepancy, I'm thinking that we should move the extension support files from their shared OS location to a per-database location at CREATE EXTENSION time. That would mean managing several copies of those so that each database can actually implement a different version and upgrade cycle of an extension, including its shared object libraries (some more session state) in the simplest cases (no extra shared object dependencies). I don't intend to discuss the details of that proposal here yet; that's just a hint so that you know that I intend to rework the current control file strategy that we have, so any work on that in pg_upgrade in the next cycle will possibly be impacted. > The only reason for pg_upgrade to still look at probin at all would be > to cover C functions that weren't within extensions. In the long run it > might be possible to consider those unsupported, or at least not common > enough to justify a safety check in pg_upgrade. I'm thinking about optionally forbidding creating functions written in C or installed into pg_catalog when not done via an extension script, and maybe later down the road making that the default. The pg_catalog case makes sense as going via an extension's script is the only way I know of to dump/restore the function. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes: > I have some plans that we will be discussing later in the new dev cycle > and that would impact such a method if we're to follow them. To better > solve both the per-system (not even cluster) and per-database extension > versions and the inline/os-packaged extension discrepancy, I'm thinking > that we should move the extension support files from their shared OS > location to a per-database location at CREATE EXTENSION time. As a packager, I can say that moving shared libraries in such a way is an absolute nonstarter, as in don't even bother to propose it because it is not going to happen. Putting shared libraries into a postgres-writable directory will be seen (correctly) as a security hole of the first magnitude, not to mention that in many systems it'd require root privilege anyway to adjust the dynamic linker's search path. You could possibly make per-database copies of the control and script files, but I don't see much point in that if you can't similarly version-control the shared libraries. I think we're better off sticking to the assumption that the files constituting an extension are read-only so far as the database server is concerned. regards, tom lane
On Sun, May 27, 2012 at 11:31 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I don't recall exactly what problems drove us to make pg_upgrade do > what it does with extensions, but we need a different fix for them. Well, you need pg_upgrade to preserve the OIDs of objects that are part of extensions just as you do for any other objects. If pg_dump --binary-upgrade just emits "CREATE EXTENSION snarfle", for some extension snarfle that provides an eponymous type, then the new cluster is going to end up with a type with a different OID than whatever existed in the old cluster, and that's going to break all sorts of things - e.g. arrays already on disk that contain the old type OID. I think that it would be an extremely fine thing if we could fix the world so that we did not have to preserve OIDs across a pg_upgrade; that would be all kinds of wonderful. However, I think that's likely to be quite a difficult project. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
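A small illustration of Robert's array example: the element type's OID is stored inside every array value on disk, so a type recreated by a plain CREATE EXTENSION with a fresh OID would no longer match what existing array data refers to.

    -- assumes the hstore extension is installed; shows the type OID that
    -- on-disk hstore[] array values carry in their headers
    SELECT 'hstore'::regtype::oid;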
Tom Lane <tgl@sss.pgh.pa.us> writes: > is not going to happen. Putting shared libraries into a > postgres-writable directory will be seen (correctly) as a security hole > of the first magnitude, not to mention that in many systems it'd require > root privilege anyway to adjust the dynamic linker's search path. It seems I keep forgetting about security concerns. Thanks for the heads-up, will try to think about something else then. Damn. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Sun, May 27, 2012 at 02:39:28PM -0400, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > On Sun, May 27, 2012 at 11:31:12AM -0400, Tom Lane wrote: > >> I don't recall exactly what problems drove us to make pg_upgrade do > >> what it does with extensions, but we need a different fix for them. > > > Uh, pg_upgrade doesn't do anything special with extensions, so it must > > have been something people did in pg_dump. > > Most of the dirty work is in pg_dump --binary_upgrade, but it's pretty > disingenuous of you to disavow responsibility for those kluges. They > are part and parcel of pg_upgrade IMO. True. I was just saying I did not write any of that code and have not studied it. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Sun, May 27, 2012 at 02:32:52PM -0400, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > > AIUI, for Tom's scheme to work pg_upgrade would need not to check > > libraries that belonged to extension functions. Currently it simply > > assumes that what is present in the old cluster is exactly what will be > > needed in the new cluster. Tom's proposed scheme would completely > > invalidate that assumption (which I would argue is already of doubtful > > validity). Instead of explicitly trying to LOAD the libraries it would > > have to rely on the "CREATE EXTENSION foo" failing, presumably at some > > time before it would be too late to abort. > > Well, the scheme I had in mind would require pg_upgrade to verify that > the new cluster contains an extension control file for each extension in > the old cluster (which is something it really oughta do anyway, if it > doesn't already). After that, though, it ought not be looking at the > individual functions defined by an extension --- it is the extension's > business which libraries those are in. I assumed I could just have pg_upgrade create and drop the extension in the new database to make sure it works. In the JSON backpatch case, the extension file would just do nothing, as has already been suggested. In fact, can we identify right now if a function is used only by an extension? If so, we could try that idea with our existing infrastructure. That would allow extensions to change their shared object file names with no effect on pg_upgrade. This would allow us to skip checking pg_catalog functions because those are now (mostly?) extensions. > The only reason for pg_upgrade to still look at probin at all would be > to cover C functions that weren't within extensions. In the long run it > might be possible to consider those unsupported, or at least not common > enough to justify a safety check in pg_upgrade. What about C triggers and SPI functions? Do we really want to require extension files for them? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Bruce Momjian <bruce@momjian.us> writes: > I assumed I could just have pg_upgrade create and drop the extension in > the new database to make sure it works. In the JSON backpatch case, the > extension file would just do nothing, as has already been suggested. It seems like checking for the control file being present should be sufficient. I don't think it's part of pg_upgrade's job description to test whether the new installation is broken. And we don't really want it cluttering the new installation with dead objects right off the bat (could cause OID issues or LSN issues, for instance). > In fact, can we identify right now if a function is used only by an > extension? ITYM is the function defined by an extension, and the answer to that is "look in pg_depend". regards, tom lane
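For reference, the pg_depend test Tom is pointing at is the 'e' (extension membership) dependency; a sketch of a query listing the C functions an extension owns, whose libraries could therefore be left to CREATE EXTENSION, might be:

    -- C functions that are members of an extension, with the library each needs
    SELECT e.extname, p.oid::regprocedure AS func, p.probin
    FROM pg_catalog.pg_proc p
         JOIN pg_catalog.pg_depend d
           ON d.classid = 'pg_catalog.pg_proc'::regclass
          AND d.objid = p.oid
         JOIN pg_catalog.pg_extension e
           ON d.refclassid = 'pg_catalog.pg_extension'::regclass
          AND d.refobjid = e.oid
    WHERE d.deptype = 'e'
      AND p.probin IS NOT NULL;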
On Tue, May 29, 2012 at 01:46:28PM -0400, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > I assumed I could just have pg_upgrade create and drop the extension in > > the new database to make sure it works. In the JSON backpatch case, the > > extension file would just do nothing, as has already been suggested. > > It seems like checking for the control file being present should be > sufficient. I don't think it's part of pg_upgrade's job description to > test whether the new installation is broken. And we don't really want > it cluttering the new installation with dead objects right off the bat > (could cause OID issues or LSN issues, for instance). True. I just wasn't sure the control file method was fool-proof enough. > > In fact, can we identify right now if a function is used only by an > > extension? > > ITYM is the function defined by an extension, and the answer to that is > "look in pg_depend". So is this something I should be exploring, or not worth it at this time? It would allow changing the names of extension shared object files, but right now I don't know anyone doing that, so I am not sure of the value of the change. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Bruce Momjian <bruce@momjian.us> writes: > On Tue, May 29, 2012 at 01:46:28PM -0400, Tom Lane wrote: >> Bruce Momjian <bruce@momjian.us> writes: >>> In fact, can we identify right now if a function is used only by an >>> extension? >> ITYM is the function defined by an extension, and the answer to that is >> "look in pg_depend". > So is this something I should be exploring, or not worth it at this > time? It would allow changing the names of extension shared object > files, but right now I don't know anyone doing that, so I am not sure of > the value of the change. Well, it'd eliminate the kluges for plpython, as well as possibly fixing Andrew's JSON backport issue, and going forward there are going to be more things like that. So I think it's something to pursue, but it's not going to be a simple change unfortunately. As Robert pointed out, we have to be able to preserve OIDs for at least the types exported by an extension. So that seems to mean that we need some mechanism intermediate between "just run the new extension script" and the current "physically duplicate the extension's contents" approach. I'm not real sure what that should look like. As a straw man, I could imagine making pg_dump --binary-upgrade produce something like CREATE EXTENSION hstore WITH OIDS (TYPE hstore = 12345, TYPE hstore[] = 12346); and then having code in CREATE EXTENSION that does the equivalent of the OID-forcing hacks when we're about to create an object that matches something in the WITH OIDS list. Or maybe it'd be better to leave the SQL command alone, and generalize the existing OID-forcing hooks so that we can pre-set a list of names to force OIDs for, and then everything can happen behind the back of CREATE EXTENSION. regards, tom lane
On Tue, May 29, 2012 at 02:38:03PM -0400, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > On Tue, May 29, 2012 at 01:46:28PM -0400, Tom Lane wrote: > >> Bruce Momjian <bruce@momjian.us> writes: > >>> In fact, can we identify right now if a function is used only by an > >>> extension? > > >> ITYM is the function defined by an extension, and the answer to that is > >> "look in pg_depend". > > > So is this something I should be exploring, or not worth it at this > > time? It would allow changing the names of extension shared object > > files, but right now I don't know anyone doing that, so I am not sure of > > the value of the change. > > Well, it'd eliminate the kluges for plpython, as well as possibly fixing > Andrew's JSON backport issue, and going forward there are going to be > more things like that. So I think it's something to pursue, but it's > not going to be a simple change unfortunately. I am confused how they fix the plpython issue (mapping plpython.so to plpython2.so). Is there an extension for these languages or is it just pg_pltemplate? Which extensions have config files? > As Robert pointed out, we have to be able to preserve OIDs for at least > the types exported by an extension. So that seems to mean that we need > some mechanism intermediate between "just run the new extension script" > and the current "physically duplicate the extension's contents" > approach. I'm not real sure what that should look like. As a straw > man, I could imagine making pg_dump --binary-upgrade produce something > like > > CREATE EXTENSION hstore > WITH OIDS (TYPE hstore = 12345, TYPE hstore[] = 12346); > > and then having code in CREATE EXTENSION that does the equivalent of the > OID-forcing hacks when we're about to create an object that matches > something in the WITH OIDS list. Or maybe it'd be better to leave the > SQL command alone, and generalize the existing OID-forcing hooks so that > we can pre-set a list of names to force OIDs for, and then everything > can happen behind the back of CREATE EXTENSION. Yes, we have used those hooks before, and they are pretty clean. I wonder if the OID preservation designation has to be _in_ the extension script, but I am unclear how that would be passed to pg_upgrade or pg_dumpall. As you said, it is pretty complex, and I am seeing that now. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Bruce Momjian <bruce@momjian.us> writes: > On Tue, May 29, 2012 at 02:38:03PM -0400, Tom Lane wrote: >> Well, it'd eliminate the kluges for plpython, as well as possibly fixing >> Andrew's JSON backport issue, and going forward there are going to be >> more things like that. So I think it's something to pursue, but it's >> not going to be a simple change unfortunately. > I am confused how they fix the plpython issue (mapping plpython.so to > plpython2.so). Is there an extension for these languages or is it just > pg_pltemplate? Which extensions have config files? Hmmm ... that is a good point; right now, there are probably still an awful lot of installations that have PL languages installed "bare" rather than wrapped in an extension, including all the ones where the plpython library has the old name. So a fix that only considers extensions doesn't get you out of the plpython problem. (Eventually I would like to see "CREATE LANGUAGE foo" with no parameters redefined as doing "CREATE EXTENSION foo", and doing away with pg_pltemplate; but it didn't get done for 9.2.) Actually, though, I don't see why pg_upgrade couldn't treat PL languages, even bare ones, the same way I'm proposing for extensions. Right now, it's already the case AFAICS that pg_dump --binary-upgrade does not dump PL support functions for PLs listed in pg_pltemplate; it just emits the CREATE LANGUAGE command and nothing else. This being the case, it's simply wrong for pg_upgrade to be checking shlib names for the old support functions, because that has got exactly nothing to do with whether the CREATE LANGUAGE command will succeed in the new installation. There are at least two ways you could check that condition without assuming that the shlib name has stayed the same: 1. Look for an extension control file for the language name in the new installation; if present, assume the proper shlib is installed too. 2. Look at the pg_pltemplate entries for the language in the new database and see if they point to a shlib that exists. I'd vote for #1 on the grounds that it's simpler and won't break when we remove pg_pltemplate. It does assume that packagers of third-party PLs have gotten off their duffs and extension-ified their languages. I would think that would have been a pretty popular thing to do, though, since it so greatly reduces the installation complexity for a language not known to pg_pltemplate. regards, tom lane
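For completeness, option #2 would amount to asking the new cluster's template which library the language actually needs, rather than trusting the old cluster's probin; a sketch:

    -- plpythonu is just an example language name
    SELECT tmplname, tmplhandler, tmpllibrary
    FROM pg_catalog.pg_pltemplate
    WHERE tmplname = 'plpythonu';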
On Tue, May 29, 2012 at 04:10:41PM -0400, Tom Lane wrote: > Hmmm ... that is a good point; right now, there are probably still an > awful lot of installations that have PL languages installed "bare" > rather than wrapped in an extension, including all the ones where the > plpython library has the old name. So a fix that only considers > extensions doesn't get you out of the plpython problem. > > (Eventually I would like to see "CREATE LANGUAGE foo" with no parameters > redefined as doing "CREATE EXTENSION foo", and doing away with > pg_pltemplate; but it didn't get done for 9.2.) > > Actually, though, I don't see why pg_upgrade couldn't treat PL > languages, even bare ones, the same way I'm proposing for extensions. > Right now, it's already the case AFAICS that pg_dump --binary-upgrade > does not dump PL support functions for PLs listed in pg_pltemplate; > it just emits the CREATE LANGUAGE command and nothing else. This being > the case, it's simply wrong for pg_upgrade to be checking shlib names > for the old support functions, because that has got exactly nothing > to do with whether the CREATE LANGUAGE command will succeed in the new > installation. There are at least two ways you could check that > condition without assuming that the shlib name has stayed the same: > > 1. Look for an extension control file for the language name in the new > installation; if present, assume the proper shlib is installed too. > > 2. Look at the pg_pltemplate entries for the language in the new > database and see if they point to a shlib that exists. > > I'd vote for #1 on the grounds that it's simpler and won't break when > we remove pg_pltemplate. It does assume that packagers of third-party > PLs have gotten off their duffs and extension-ified their languages. > I would think that would have been a pretty popular thing to do, though, > since it so greatly reduces the installation complexity for a language > not known to pg_pltemplate. Uh, there is a pg_upgrade C comment on this, so we did discuss it at some point; see second paragraph: * In Postgres 9.0, Python 3 support was added, and to do that, a * plpython2u language was created with library name plpython2.so * as a symbolic link to plpython.so. In Postgres 9.1, only the * plpython2.so library was created, and both plpythonu and * plpython2u pointing to it. For this reason, any reference to * library name "plpython" in an old PG <= 9.1 cluster must look * for "plpython2" in the new cluster. * * For this case, we could check pg_pltemplate, but that only works * for languages, and does not help with function shared objects, * so we just do a general fix. The bottom line is that we already needed function shared object checking, so we just wrapped languages into that as well. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Bruce Momjian <bruce@momjian.us> writes: > The bottom line is that we already needed function shared object checking, > so we just wrapped languages into that as well. Well, that'd be fine if it actually *worked*, but as per this discussion, it doesn't work; you have to kluge around the fact that the shared library name changed. I'm suggesting that pg_upgrade needs to be revised to treat this stuff at a higher level. Yeah, you need to look at shared library names for bare C functions, but we should be getting away from that wherever there is a higher-level construct such as an extension or PL. regards, tom lane
On Tue, May 29, 2012 at 05:35:18PM -0400, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > The bottom line is that we already needed function shared object checking, > > so we just wrapped languages into that as well. > > Well, that'd be fine if it actually *worked*, but as per this > discussion, it doesn't work; you have to kluge around the fact that > the shared library name changed. I'm suggesting that pg_upgrade needs > to be revised to treat this stuff at a higher level. Yeah, you need to > look at shared library names for bare C functions, but we should be > getting away from that wherever there is a higher-level construct such > as an extension or PL. Well, we have three problems. One is the JSON problem that Andrew wants skipped, another is the plpython rename which we adjust in the check, and the third is the plpython support function that is in the public schema, for which I am working on a patch to throw a meaningful error. The only one that actually is set up for a rename is the second one. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Mon, 2012-05-28 at 16:06 -0400, Robert Haas wrote: > On Sun, May 27, 2012 at 11:31 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I don't recall exactly what problems drove us to make pg_upgrade do > > what it does with extensions, but we need a different fix for them. > > Well, you need pg_upgrade to preserve the OIDs of objects that are > part of extensions just as you do for any other objects. Also, I think it needs to force the extension version to match the old cluster. Otherwise, we could be dealing with on-disk format changes, or other complexities. It doesn't sound difficult, but I thought I'd bring it up. Regards, Jeff Davis
Jeff Davis <pgsql@j-davis.com> writes: > Also, I think it needs to force the extension version to match the old > cluster. Otherwise, we could be dealing with on-disk format changes, or > other complexities. Yeah, I was thinking we might want --binary-upgrade to specify a particular extension version in CREATE EXTENSION instead of letting it default. I have not really thought through the pros and cons of that, but it seems plausible. regards, tom lane
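For what it's worth, the VERSION clause already exists in CREATE EXTENSION, so pinning the version in --binary-upgrade output could look like the line below; whether pg_dump should actually emit it is exactly the open question here:

    -- hypothetical --binary-upgrade output; the version string would come from
    -- the old cluster's pg_extension.extversion
    CREATE EXTENSION hstore WITH VERSION '1.0';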