Thread: pg_upgade vs config

pg_upgade vs config

From
Andrew Dunstan
Date:
I'm working on updating and making production ready my experimental 
cross version pg_upgrade testing module for the buildfarm. A couple of 
things have emerged that are of concern. This module does a much more 
complete test that our normal test for pg_upgrade, which only checks 
upgrading the standard regression database. This tests all the databases 
the buildfarm creates for testing, including those made by modules in 
contrib.

The biggest issue is this: the upgrade fails completely on 
ltree-plpython and hstore-plpython, presumably because these modules 
rely on the plpython module being loaded first. pg_upgrade rather 
simple-mindedly calls LOAD on the object library to test if it's usable. 
It's a bit embarrassing that we can't upgrade a database using one of 
our own modules. At the very least we should hard-code a way around this 
(e.g. have it load the relevant plpython module first), but more 
generally I think we need a way to tell pg_upgrade that module X relies 
on module Y. In the past ISTR we've said we don't support having 
dependencies between loadable modules, but that ship now seems to have 
sailed.

Second, we get an unexpected difference between the pre-upgrade and 
post-upgrade dumps for the bloom module:
   --- /home/bf/bfr/root/upgrade/HEAD/origin-REL9_6_STABLE.sql   2016-10-02 09:16:03.298341639 -0400   +++
/home/bf/bfr/root/upgrade/HEAD/converted-REL9_6_STABLE-to-HEAD.sql  2016-10-02 09:16:54.889343991 -0400   @@ -7413,6
+7413,20@@     COMMENT ON EXTENSION bloom IS 'bloom access method - signature   file based index';
 

   +--   +-- Name: bloom; Type: ACCESS METHOD; Schema: -; Owner:   +--   +   +CREATE ACCESS METHOD bloom TYPE INDEX
HANDLERpublic.blhandler;   +   +   +--   +-- Name: ACCESS METHOD bloom; Type: COMMENT; Schema: -; Owner:   +--   +
+COMMENTON ACCESS METHOD bloom IS 'bloom index access method';   +   +     SET search_path = public, pg_catalog;
 
     SET default_tablespace = '';



It looks like we have some work to do to teach pg_dump about handling 
access methods in extensions. This doesn't look quite as bad as the 
first issue, but it's a pity 9.6 escaped into the wild with this issue.

cheers

andrew




Re: pg_upgade vs config

From
Michael Paquier
Date:
On Sun, Oct 2, 2016 at 10:40 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
> It looks like we have some work to do to teach pg_dump about handling access
> methods in extensions. This doesn't look quite as bad as the first issue,
> but it's a pity 9.6 escaped into the wild with this issue.

562f06f3 has addressed this issue 3 months ago, and there is a test in
src/test/modules/test_pg_dump.
-- 
Michael



Re: pg_upgade vs config

From
Andrew Dunstan
Date:

On 10/02/2016 09:50 AM, Michael Paquier wrote:
> On Sun, Oct 2, 2016 at 10:40 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> It looks like we have some work to do to teach pg_dump about handling access
>> methods in extensions. This doesn't look quite as bad as the first issue,
>> but it's a pity 9.6 escaped into the wild with this issue.
> 562f06f3 has addressed this issue 3 months ago, and there is a test in
> src/test/modules/test_pg_dump.


So then why are the pre-upgrade and post-upgrade dumps different?

cheers

andrew



Re: pg_upgade vs config

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> The biggest issue is this: the upgrade fails completely on 
> ltree-plpython and hstore-plpython, presumably because these modules 
> rely on the plpython module being loaded first. pg_upgrade rather 
> simple-mindedly calls LOAD on the object library to test if it's usable. 

FWIW, that seems to have worked fine yesterday on prairiedog.

I suspect the explanation is that macOS's dynamic linker is smart enough
to pull in plpython when one of those modules is LOAD'ed.  The ideal fix
would be to make that happen on all platforms.  I'm not actually sure
why it doesn't already; surely every dynamic linker in existence has
such a capability.

[ digs more deeply ... ]  Oh, weird: it looks like this succeeded in
every case except 9.6->HEAD upgrade.  Did we break something recently?
        regards, tom lane



Re: pg_upgade vs config

From
Tom Lane
Date:
I wrote:
> I suspect the explanation is that macOS's dynamic linker is smart enough
> to pull in plpython when one of those modules is LOAD'ed.  The ideal fix
> would be to make that happen on all platforms.  I'm not actually sure
> why it doesn't already; surely every dynamic linker in existence has
> such a capability.

Some experimentation says that this is indeed possible on Linux, at least.
It's a bit of a pain because the .so's we need to reference are not named
"libsomething.so" and thus a straight -l switch doesn't work.  The Linux
ld(1) man page documents that you can write "-l:filename" to override the
addition of "lib", but I have no idea how portable that is to other
toolchains.  (On macOS, and maybe other BSD-derived systems, it looks
like you can do this without the colon, ie -lhstore.so will work.)

Also, it seems that -L../hstore -l:hstore$(DLSUFFIX)
which was my first attempt, doesn't work because you end up with a
hard-coded reference to "../hstore/hstore.so", which rpath searching
doesn't cope with.  I was able to make it work by copying hstore.so
and plpython2.so into contrib/hstore_plpython and then just writing

SHLIB_LINK += -L. -l:hstore$(DLSUFFIX) \-l:plpython$(python_majorversion)$(DLSUFFIX) $(python_libspec)

That results in undecorated references that do work with the rpath.

This all seems depressingly platform-specific, but maybe we can make
it work on enough platforms to be satisfactory.
        regards, tom lane



Re: pg_upgade vs config

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> On 10/02/2016 09:50 AM, Michael Paquier wrote:
>> On Sun, Oct 2, 2016 at 10:40 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>>> It looks like we have some work to do to teach pg_dump about handling access
>>> methods in extensions. This doesn't look quite as bad as the first issue,
>>> but it's a pity 9.6 escaped into the wild with this issue.

>> 562f06f3 has addressed this issue 3 months ago, and there is a test in
>> src/test/modules/test_pg_dump.

> So then why are the pre-upgrade and post-upgrade dumps different?

Because pg_dump with --binary-upgrade neglects to emit

ALTER EXTENSION bloom ADD ACCESS METHOD bloom;

which it would need to do in order to make this work right.  The other
small problem is that there is no such ALTER EXTENSION syntax in the
backend.  This is a rather major oversight in the patch that added DDL
support for access methods, if you ask me.

(Also, I didn't look at what test_pg_dump is testing, but I bet it
isn't attempting to cover --binary-upgrade behavior.)
        regards, tom lane



Re: pg_upgade vs config

From
Andrew Dunstan
Date:

On 10/02/2016 01:53 PM, Tom Lane wrote:
>> So then why are the pre-upgrade and post-upgrade dumps different?
> Because pg_dump with --binary-upgrade neglects to emit
>
> ALTER EXTENSION bloom ADD ACCESS METHOD bloom;

That's what I suspected.

>
> which it would need to do in order to make this work right.  The other
> small problem is that there is no such ALTER EXTENSION syntax in the
> backend.  This is a rather major oversight in the patch that added DDL
> support for access methods, if you ask me.


I agree.

cheers

andrew




Re: pg_upgade vs config

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> On 10/02/2016 01:53 PM, Tom Lane wrote:
>> Because pg_dump with --binary-upgrade neglects to emit
>> ALTER EXTENSION bloom ADD ACCESS METHOD bloom;
>> which it would need to do in order to make this work right.  The other
>> small problem is that there is no such ALTER EXTENSION syntax in the
>> backend.  This is a rather major oversight in the patch that added DDL
>> support for access methods, if you ask me.

> I agree.

Remarkably enough, it seems that only a gram.y production need be added
--- the only other code involved is objectaddress.c, which does seem
to have gotten extended sufficiently.
        regards, tom lane



Re: pg_upgade vs config

From
Andrew Dunstan
Date:

On 10/02/2016 12:54 PM, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> The biggest issue is this: the upgrade fails completely on
>> ltree-plpython and hstore-plpython, presumably because these modules
>> rely on the plpython module being loaded first. pg_upgrade rather
>> simple-mindedly calls LOAD on the object library to test if it's usable.
> FWIW, that seems to have worked fine yesterday on prairiedog.
>
> I suspect the explanation is that macOS's dynamic linker is smart enough
> to pull in plpython when one of those modules is LOAD'ed.  The ideal fix
> would be to make that happen on all platforms.  I'm not actually sure
> why it doesn't already; surely every dynamic linker in existence has
> such a capability.
>
> [ digs more deeply ... ]  Oh, weird: it looks like this succeeded in
> every case except 9.6->HEAD upgrade.  Did we break something recently?


Yeah, my latest version of the test module (soon to hit githyb) also 
does a self upgrade, and these modules pass that on 9.5, whereas they 
fail on 9.6, as well as the 9.6->HEAD and HEAD self-tests failing. So 
indeed it looks like we've broken something. Yet another example of why 
I need to get this test module production ready :-)

cheers

andrew




Re: pg_upgade vs config

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> On 10/02/2016 12:54 PM, Tom Lane wrote:
>> [ digs more deeply ... ]  Oh, weird: it looks like this succeeded in
>> every case except 9.6->HEAD upgrade.  Did we break something recently?

> Yeah, my latest version of the test module (soon to hit githyb) also 
> does a self upgrade, and these modules pass that on 9.5, whereas they 
> fail on 9.6, as well as the 9.6->HEAD and HEAD self-tests failing. So 
> indeed it looks like we've broken something.

I've now checked that simply doing LOAD 'hstore_plpython2.so' in a fresh
session fails, in both HEAD and 9.5, on both Linux and current macOS.
So it doesn't seem that we've broken anything since 9.5 --- it didn't
work before either.  The seeming successes may have been due to chance,
i.e. pg_upgrade probing the libraries in an order that happened to work.
I see no evidence that get_loadable_libraries/check_loadable_libraries
are paying any attention to what order the libraries are checked in.

I've found that the Linux '-l:hstore.so' solution works on HPUX as well,
at least to the extent of being able to run LOAD.  However, it doesn't
seem to be possible to make it work on macOS, which has a hard distinction
between "loadable modules" and "shared libraries".  Our extensions are
the former, which means that e.g. hstore.so is not a shlib and the linker
will refuse to consider it as a linkable library.  I tried converting
them to true shlibs, which is just a matter of changing "-bundle" to
"-dynamiclib" in the link command and dropping "-bundle_loader postgres"
because that's not accepted with "-dynamiclib".  That works about 90%,
but I found that plpython failed its regression test for a very scary
reason: there's a "hash_search()" somewhere in libc on this platform
and the linker was preferentially resolving plpython's call to that
rather than to the one in Postgres.  I don't think we want to risk
that sort of platform dependency :-(

On further reflection, though, this approach is probably a dead end
even without any platform concerns.  The reason is that if we do
'LOAD foo' and the dynamic linker then pulls in 'bar.so', bar.so
will be present in the address space all right, but we will not have
checked for a compatible magic block, nor have called its _PG_init
function if any.  In general, if calls into a module occur without
having called its _PG_init, it might misbehave very badly.

So now I'm thinking you're right, it'd be better to have some solution
whereby dfmgr.c knows about cross-module dependencies and loads the
dependencies first.  Not sure how to approach that.  The extension
"requires" mechanism is tantalizingly close to providing the data
we need, but dfmgr.c doesn't know about that, and there's no concept
of a reverse mapping from .so names to extensions anyway.
        regards, tom lane



Re: pg_upgade vs config

From
Tom Lane
Date:
I wrote:
> So it doesn't seem that we've broken anything since 9.5 --- it didn't
> work before either.  The seeming successes may have been due to chance,
> i.e. pg_upgrade probing the libraries in an order that happened to work.
> I see no evidence that get_loadable_libraries/check_loadable_libraries
> are paying any attention to what order the libraries are checked in.

It occurs to me that a back-patchable workaround for this would be to
make get_loadable_libraries sort the library names in order by length
(and I guess we might as well sort same-length names alphabetically).
This would for example guarantee that hstore_plpython is probed after
both hstore and plpython.  Admittedly, this is a kluge of the first
water.  But I see no prospect of back-patching any real fix, and it
would definitely be better if pg_upgrade didn't fail on these modules.
        regards, tom lane



Re: pg_upgade vs config

From
Andres Freund
Date:
On 2016-10-02 17:59:47 -0400, Tom Lane wrote:
> I've found that the Linux '-l:hstore.so' solution works on HPUX as well,
> at least to the extent of being able to run LOAD.  However, it doesn't
> seem to be possible to make it work on macOS, which has a hard distinction
> between "loadable modules" and "shared libraries".  Our extensions are
> the former, which means that e.g. hstore.so is not a shlib and the linker
> will refuse to consider it as a linkable library.  I tried converting
> them to true shlibs, which is just a matter of changing "-bundle" to
> "-dynamiclib" in the link command and dropping "-bundle_loader postgres"
> because that's not accepted with "-dynamiclib".  That works about 90%,
> but I found that plpython failed its regression test for a very scary
> reason: there's a "hash_search()" somewhere in libc on this platform
> and the linker was preferentially resolving plpython's call to that
> rather than to the one in Postgres.  I don't think we want to risk
> that sort of platform dependency :-(

ISTM that that's a risk independent of this specific issue? On other
platforms, I mean?


> So now I'm thinking you're right, it'd be better to have some solution
> whereby dfmgr.c knows about cross-module dependencies and loads the
> dependencies first.  Not sure how to approach that.  The extension
> "requires" mechanism is tantalizingly close to providing the data
> we need, but dfmgr.c doesn't know about that, and there's no concept
> of a reverse mapping from .so names to extensions anyway.

One, kind of extreme, way to get there would be to resolve the hstore
symbols hstore_plpython needs with load_external_function, during
_PG_init().  Most of these files don't actually depend on a large number
of symbols, so that should actually be doable.

Greetings,

Andres Freund



Re: pg_upgade vs config

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2016-10-02 17:59:47 -0400, Tom Lane wrote:
>> So now I'm thinking you're right, it'd be better to have some solution
>> whereby dfmgr.c knows about cross-module dependencies and loads the
>> dependencies first.  Not sure how to approach that.  The extension
>> "requires" mechanism is tantalizingly close to providing the data
>> we need, but dfmgr.c doesn't know about that, and there's no concept
>> of a reverse mapping from .so names to extensions anyway.

> One, kind of extreme, way to get there would be to resolve the hstore
> symbols hstore_plpython needs with load_external_function, during
> _PG_init().  Most of these files don't actually depend on a large number
> of symbols, so that should actually be doable.

Hm.  That would actually not be a bad idea, perhaps, because the method
we're using right now requires that the linker not bitch about unresolved
symbols at build time, which is a really bad thing that I'd prefer to
turn off.

It's still not a very back-patchable answer, but it's something that
we could get to in HEAD without a huge amount of work.
        regards, tom lane



Re: pg_upgade vs config

From
Tom Lane
Date:
I wrote:
> It occurs to me that a back-patchable workaround for this would be to
> make get_loadable_libraries sort the library names in order by length
> (and I guess we might as well sort same-length names alphabetically).
> This would for example guarantee that hstore_plpython is probed after
> both hstore and plpython.  Admittedly, this is a kluge of the first
> water.  But I see no prospect of back-patching any real fix, and it
> would definitely be better if pg_upgrade didn't fail on these modules.

I've tested the attached and verified that it allows pg_upgrade'ing
of the hstore_plpython regression DB --- or, if I reverse the sort
order, that it reproducibly fails.  I propose back-patching this
at least as far as 9.5, where the transform modules came in.  It might
be a good idea to go all the way back, just so that the behavior is
predictable.

            regards, tom lane

diff --git a/src/bin/pg_upgrade/function.c b/src/bin/pg_upgrade/function.c
index b432b54..5435bff 100644
*** a/src/bin/pg_upgrade/function.c
--- b/src/bin/pg_upgrade/function.c
***************
*** 15,20 ****
--- 15,44 ----


  /*
+  * qsort comparator for pointers to library names
+  *
+  * We sort first by name length, then alphabetically for names of the same
+  * length.  This is to ensure that, eg, "hstore_plpython" sorts after both
+  * "hstore" and "plpython"; otherwise transform modules will probably fail
+  * their LOAD tests.  (The backend ought to cope with that consideration,
+  * but it doesn't yet, and even when it does it'd be a good idea to have
+  * a predictable order of probing here.)
+  */
+ static int
+ library_name_compare(const void *p1, const void *p2)
+ {
+     const char *str1 = *(const char *const *) p1;
+     const char *str2 = *(const char *const *) p2;
+     int            slen1 = strlen(str1);
+     int            slen2 = strlen(str2);
+
+     if (slen1 != slen2)
+         return slen1 - slen2;
+     return strcmp(str1, str2);
+ }
+
+
+ /*
   * get_loadable_libraries()
   *
   *    Fetch the names of all old libraries containing C-language functions.
*************** get_loadable_libraries(void)
*** 38,47 ****
          PGconn       *conn = connectToServer(&old_cluster, active_db->db_name);

          /*
!          * Fetch all libraries referenced in this DB.  We can't exclude the
!          * "pg_catalog" schema because, while such functions are not
!          * explicitly dumped by pg_dump, they do reference implicit objects
!          * that pg_dump does dump, e.g. CREATE LANGUAGE plperl.
           */
          ress[dbnum] = executeQueryOrDie(conn,
                                          "SELECT DISTINCT probin "
--- 62,68 ----
          PGconn       *conn = connectToServer(&old_cluster, active_db->db_name);

          /*
!          * Fetch all libraries referenced in this DB.
           */
          ress[dbnum] = executeQueryOrDie(conn,
                                          "SELECT DISTINCT probin "
*************** get_loadable_libraries(void)
*** 69,76 ****

              res = executeQueryOrDie(conn,
                                      "SELECT 1 "
!                            "FROM    pg_catalog.pg_proc JOIN pg_namespace "
!                              "        ON pronamespace = pg_namespace.oid "
                                 "WHERE proname = 'plpython_call_handler' AND "
                                      "nspname = 'public' AND "
                                      "prolang = 13 /* C */ AND "
--- 90,98 ----

              res = executeQueryOrDie(conn,
                                      "SELECT 1 "
!                                     "FROM pg_catalog.pg_proc p "
!                                     "    JOIN pg_catalog.pg_namespace n "
!                                     "    ON pronamespace = n.oid "
                                 "WHERE proname = 'plpython_call_handler' AND "
                                      "nspname = 'public' AND "
                                      "prolang = 13 /* C */ AND "
*************** get_loadable_libraries(void)
*** 112,124 ****
      if (found_public_plpython_handler)
          pg_fatal("Remove the problem functions from the old cluster to continue.\n");

-     /* Allocate what's certainly enough space */
-     os_info.libraries = (char **) pg_malloc(totaltups * sizeof(char *));
-
      /*
!      * Now remove duplicates across DBs.  This is pretty inefficient code, but
!      * there probably aren't enough entries to matter.
       */
      totaltups = 0;

      for (dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
--- 134,151 ----
      if (found_public_plpython_handler)
          pg_fatal("Remove the problem functions from the old cluster to continue.\n");

      /*
!      * Now we want to remove duplicates across DBs and sort the library names
!      * into order.  This avoids multiple probes of the same library, and
!      * ensures that libraries are probed in a consistent order, which is
!      * important for reproducible behavior if one library depends on another.
!      *
!      * First transfer all the names into one array, then sort, then remove
!      * duplicates.  Note: we strdup each name in the first loop so that we can
!      * safely clear the PGresults in the same loop.  This is a bit wasteful
!      * but it's unlikely there are enough names to matter.
       */
+     os_info.libraries = (char **) pg_malloc(totaltups * sizeof(char *));
      totaltups = 0;

      for (dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
*************** get_loadable_libraries(void)
*** 131,157 ****
          for (rowno = 0; rowno < ntups; rowno++)
          {
              char       *lib = PQgetvalue(res, rowno, 0);
-             bool        dup = false;
-             int            n;

!             for (n = 0; n < totaltups; n++)
!             {
!                 if (strcmp(lib, os_info.libraries[n]) == 0)
!                 {
!                     dup = true;
!                     break;
!                 }
!             }
!             if (!dup)
!                 os_info.libraries[totaltups++] = pg_strdup(lib);
          }
-
          PQclear(res);
      }

-     os_info.num_libraries = totaltups;
-
      pg_free(ress);
  }


--- 158,191 ----
          for (rowno = 0; rowno < ntups; rowno++)
          {
              char       *lib = PQgetvalue(res, rowno, 0);

!             os_info.libraries[totaltups++] = pg_strdup(lib);
          }
          PQclear(res);
      }

      pg_free(ress);
+
+     if (totaltups > 1)
+     {
+         int            i,
+                     lastnondup;
+
+         qsort((void *) os_info.libraries, totaltups, sizeof(char *),
+               library_name_compare);
+
+         for (i = 1, lastnondup = 0; i < totaltups; i++)
+         {
+             if (strcmp(os_info.libraries[i],
+                        os_info.libraries[lastnondup]) != 0)
+                 os_info.libraries[++lastnondup] = os_info.libraries[i];
+             else
+                 pg_free(os_info.libraries[i]);
+         }
+         totaltups = lastnondup + 1;
+     }
+
+     os_info.num_libraries = totaltups;
  }



Re: pg_upgade vs config

From
Andrew Dunstan
Date:

On 10/02/2016 07:21 PM, Tom Lane wrote:
> I wrote:
>> It occurs to me that a back-patchable workaround for this would be to
>> make get_loadable_libraries sort the library names in order by length
>> (and I guess we might as well sort same-length names alphabetically).
>> This would for example guarantee that hstore_plpython is probed after
>> both hstore and plpython.  Admittedly, this is a kluge of the first
>> water.  But I see no prospect of back-patching any real fix, and it
>> would definitely be better if pg_upgrade didn't fail on these modules.
> I've tested the attached and verified that it allows pg_upgrade'ing
> of the hstore_plpython regression DB --- or, if I reverse the sort
> order, that it reproducibly fails.  I propose back-patching this
> at least as far as 9.5, where the transform modules came in.  It might
> be a good idea to go all the way back, just so that the behavior is
> predictable.
>
>             

Yeah, it's a really ugly kluge, but I don't have a better idea.

cheers

andrew