Thread: ALTER TABLE lock downgrades have broken pg_upgrade

ALTER TABLE lock downgrades have broken pg_upgrade

From
Tom Lane
Date:
There is logic in pg_upgrade plus the backend, mostly added by commit
4c6780fd1, to cope with the corner cases that sometimes arise where the
old and new versions have different ideas about whether a given table
needs a TOAST table.  The more common case is where there's a TOAST table
in the old DB, but (perhaps as a result of having dropped all the wide
columns) the new cluster doesn't think the table definition requires a
TOAST table.  The reverse is also possible, although according to the
existing code comments it can only happen when upgrading from pre-9.1.
The way pg_upgrade handles that case is that after running all the
table creation operations it issues this command:

    PQclear(executeQueryOrDie(conn,
                              "ALTER TABLE %s.%s RESET (binary_upgrade_dummy_option);",
                              quote_identifier(PQgetvalue(res, rowno, i_nspname)),
                              quote_identifier(PQgetvalue(res, rowno, i_relname))));

which doesn't actually do anything (no such reloption being set) but
nonetheless triggers a call of AlterTableCreateToastTable, which
will cause a toast table to be created if the new server thinks the
table definition requires one.
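
For readers who want to see what command text that call produces, here
is a minimal standalone sketch. It is not pg_upgrade's code: the names
quote_identifier_sketch() and build_dummy_reset_command() are
hypothetical, and the quoting rule (always wrap in double quotes, double
any embedded double quote) is an assumption about what
quote_identifier() does, chosen to match the quoted names in the error
output later in this message.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for quote_identifier(): wrap the name in double
 * quotes and double any embedded double quote.  Caller frees the result.
 */
char *
quote_identifier_sketch(const char *name)
{
	size_t		len;
	char	   *result;
	char	   *out;

	assert(name != NULL);
	len = strlen(name);
	/* worst case: every character doubled, plus two quotes and a NUL */
	result = malloc(2 * len + 3);
	if (result == NULL)
		return NULL;

	out = result;
	*out++ = '"';
	for (; *name; name++)
	{
		if (*name == '"')
			*out++ = '"';
		*out++ = *name;
	}
	*out++ = '"';
	*out = '\0';
	return result;
}

/*
 * Format the no-op RESET command into buf.  Returns what snprintf()
 * returns, or -1 if an identifier could not be allocated.
 */
int
build_dummy_reset_command(char *buf, size_t buflen,
						  const char *nspname, const char *relname)
{
	char	   *qnsp = quote_identifier_sketch(nspname);
	char	   *qrel = quote_identifier_sketch(relname);
	int			n = -1;

	if (qnsp != NULL && qrel != NULL)
		n = snprintf(buf, buflen,
					 "ALTER TABLE %s.%s RESET (binary_upgrade_dummy_option);",
					 qnsp, qrel);
	free(qnsp);
	free(qrel);
	return n;
}
```

Called with "public" and "i_once_had_a_toast_table", the sketch yields
the same ALTER TABLE text that appears in the failure report.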

Or at least, it did until Simon decided that ALTER TABLE RESET
doesn't require AccessExclusiveLock.  Now you get a failure.
I haven't tried to construct a pre-9.1 database that would trigger
this, but you can make it happen by applying the attached patch
to create a toast-table-less table in the regression tests,
and then doing "make check" in src/bin/pg_upgrade.  You get this:

...
Restoring database schemas in the new cluster
                                                            ok
Creating newly-required TOAST tables                        SQL command failed
ALTER TABLE "public"."i_once_had_a_toast_table" RESET (binary_upgrade_dummy_option);
ERROR:  AccessExclusiveLock required to add toast table.

Failure, exiting
+ rm -rf /tmp/pg_upgrade_check-o0CUMm
make: *** [check] Error 1
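
One way to read this failure is as a lock-level check that sits after
the "does this table need a toast table?" decision. The following is a
simplified model under that assumption, not the backend's actual code;
the type and function names are hypothetical, and the weaker lock level
taken by ALTER TABLE RESET is assumed to be ShareUpdateExclusiveLock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Only the two lock levels relevant to this report (hypothetical names). */
typedef enum
{
	SKETCH_SHARE_UPDATE_EXCLUSIVE_LOCK,
	SKETCH_ACCESS_EXCLUSIVE_LOCK
} SketchLockMode;

typedef struct
{
	bool		has_toast_table;	/* does reltoastrelid point anywhere? */
	bool		needs_toast_table;	/* what the new server's heuristic says */
} SketchRelation;

/*
 * Model of "create a toast table if one is needed".  Returns NULL on
 * success (including the do-nothing case) or an error message.  *created
 * reports whether a toast table was added.
 */
const char *
maybe_create_toast_table_sketch(SketchRelation *rel, SketchLockMode lockmode,
								bool *created)
{
	assert(rel != NULL && created != NULL);
	*created = false;

	/* Nothing to do: already toasted, or the heuristic says not needed. */
	if (rel->has_toast_table || !rel->needs_toast_table)
		return NULL;

	/* A toast table is wanted, so the lock level now matters. */
	if (lockmode != SKETCH_ACCESS_EXCLUSIVE_LOCK)
		return "AccessExclusiveLock required to add toast table.";

	rel->has_toast_table = true;
	*created = true;
	return NULL;
}
```

In this model the RESET is harmless for every table that does not need
a new toast table, which would explain why the problem only shows up
for a table like i_once_had_a_toast_table.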


I think possibly the easiest fix for this is to have pg_upgrade,
instead of RESETting a nonexistent option, RESET something that's
still considered to require AccessExclusiveLock.  "user_catalog_table"
would work, looks like; though I'd want to annotate its entry in
reloptions.c to warn people away from downgrading its lock level.

More generally, though, I wonder how we can have some test coverage
on such cases going forward.  Is the patch below too ugly to commit
permanently, and if so, what other idea can you suggest?

            regards, tom lane

diff --git a/src/test/regress/expected/indirect_toast.out b/src/test/regress/expected/indirect_toast.out
index 4f4bf41..ad7127d 100644
*** a/src/test/regress/expected/indirect_toast.out
--- b/src/test/regress/expected/indirect_toast.out
*************** SELECT substring(toasttest::text, 1, 200
*** 149,151 ****
--- 149,158 ----

  DROP TABLE toasttest;
  DROP FUNCTION update_using_indirect();
+ -- Create a table that has a toast table, then modify it so it appears
+ -- not to have one, and leave it behind after the regression tests end.
+ -- This enables testing of this scenario for pg_upgrade.
+ create table i_once_had_a_toast_table(f1 int, f2 text);
+ insert into i_once_had_a_toast_table values(1, 'foo');
+ update pg_class set reltoastrelid = 0
+   where relname = 'i_once_had_a_toast_table';
diff --git a/src/test/regress/sql/indirect_toast.sql b/src/test/regress/sql/indirect_toast.sql
index d502480..cefbd0b 100644
*** a/src/test/regress/sql/indirect_toast.sql
--- b/src/test/regress/sql/indirect_toast.sql
*************** SELECT substring(toasttest::text, 1, 200
*** 59,61 ****
--- 59,69 ----

  DROP TABLE toasttest;
  DROP FUNCTION update_using_indirect();
+
+ -- Create a table that has a toast table, then modify it so it appears
+ -- not to have one, and leave it behind after the regression tests end.
+ -- This enables testing of this scenario for pg_upgrade.
+ create table i_once_had_a_toast_table(f1 int, f2 text);
+ insert into i_once_had_a_toast_table values(1, 'foo');
+ update pg_class set reltoastrelid = 0
+   where relname = 'i_once_had_a_toast_table';

Re: ALTER TABLE lock downgrades have broken pg_upgrade

From
Alvaro Herrera
Date:
Tom Lane wrote:

> More generally, though, I wonder how we can have some test coverage
> on such cases going forward.  Is the patch below too ugly to commit
> permanently, and if so, what other idea can you suggest?

I suggest a buildfarm animal running a custom buildfarm module that
exercises the pg_upgrade test from every supported version to the latest
stable and to master -- together with your proposed case that leaves a
toastless table around for pg_upgrade to handle.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: ALTER TABLE lock downgrades have broken pg_upgrade

From
Stephen Frost
Date:
* Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
> Tom Lane wrote:
>
> > More generally, though, I wonder how we can have some test coverage
> > on such cases going forward.  Is the patch below too ugly to commit
> > permanently, and if so, what other idea can you suggest?
>
> I suggest a buildfarm animal running a custom buildfarm module that
> exercises the pg_upgrade test from every supported version to the latest
> stable and to master -- together with your proposed case that leaves a
> toastless table around for pg_upgrade to handle.

That would help greatly with pg_dump test coverage as well..  One of the
problems of trying to get good LOC coverage of pg_dump is that a *lot*
of the code is version-specific...

Thanks!

Stephen

Re: ALTER TABLE lock downgrades have broken pg_upgrade

From
Alvaro Herrera
Date:
Stephen Frost wrote:
> * Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
> > Tom Lane wrote:
> > 
> > > More generally, though, I wonder how we can have some test coverage
> > > on such cases going forward.  Is the patch below too ugly to commit
> > > permanently, and if so, what other idea can you suggest?
> > 
> > I suggest a buildfarm animal running a custom buildfarm module that
> > exercises the pg_upgrade test from every supported version to the latest
> > stable and to master -- together with your proposed case that leaves a
> > toastless table around for pg_upgrade to handle.
> 
> That would help greatly with pg_dump test coverage as well..  One of the
> problems of trying to get good LOC coverage of pg_dump is that a *lot*
> of the code is version-specific...

If we can put together a script that runs test.sh for various versions
and then verifies the runs, we could use it in both buildfarm and
coverage.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: ALTER TABLE lock downgrades have broken pg_upgrade

From
Andrew Dunstan
Date:

On 05/03/2016 01:21 PM, Stephen Frost wrote:
> * Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
>> Tom Lane wrote:
>>
>>> More generally, though, I wonder how we can have some test coverage
>>> on such cases going forward.  Is the patch below too ugly to commit
>>> permanently, and if so, what other idea can you suggest?
>> I suggest a buildfarm animal running a custom buildfarm module that
>> exercises the pg_upgrade test from every supported version to the latest
>> stable and to master -- together with your proposed case that leaves a
>> toastless table around for pg_upgrade to handle.
> That would help greatly with pg_dump test coverage as well..  One of the
> problems of trying to get good LOC coverage of pg_dump is that a *lot*
> of the code is version-specific...
>


I have a module that does it, although it's not really stable enough.
But it's a big start.
See 
<https://github.com/PGBuildFarm/client-code/blob/master/PGBuild/Modules/TestUpgradeXversion.pm>

cheers

andrew




Re: ALTER TABLE lock downgrades have broken pg_upgrade

From
Stephen Frost
Date:
* Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
> Stephen Frost wrote:
> > * Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
> > > Tom Lane wrote:
> > >
> > > > More generally, though, I wonder how we can have some test coverage
> > > > on such cases going forward.  Is the patch below too ugly to commit
> > > > permanently, and if so, what other idea can you suggest?
> > >
> > > I suggest a buildfarm animal running a custom buildfarm module that
> > > exercises the pg_upgrade test from every supported version to the latest
> > > stable and to master -- together with your proposed case that leaves a
> > > toastless table around for pg_upgrade to handle.
> >
> > That would help greatly with pg_dump test coverage as well..  One of the
> > problems of trying to get good LOC coverage of pg_dump is that a *lot*
> > of the code is version-specific...
>
> If we can put together a script that runs test.sh for various versions
> and then verifies the runs, we could use it in both buildfarm and
> coverage.

One other point is that pg_dump goes quite a bit farther back than just
what we currently support (or at least, it tries to).  I think that,
generally, that's a good thing, but it does mean we have a lot of cases
that don't get tested nearly as much...

I was able to get back to 7.4 up and running without too much trouble,
but even that doesn't cover all the cases we have in pg_dump.  I'm not
sure if we want to define a "we will support pg_dump back to X" cut-off
or if we want to try and get older versions to run on modern systems,
but it's definitely worth pointing out that we're trying to support much
farther back than what is currently supported in pg_dump today.

Thanks!

Stephen

Re: ALTER TABLE lock downgrades have broken pg_upgrade

From
Andrew Dunstan
Date:

On 05/03/2016 01:28 PM, Andrew Dunstan wrote:
>
>
> On 05/03/2016 01:21 PM, Stephen Frost wrote:
>> * Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
>>> Tom Lane wrote:
>>>
>>>> More generally, though, I wonder how we can have some test coverage
>>>> on such cases going forward.  Is the patch below too ugly to commit
>>>> permanently, and if so, what other idea can you suggest?
>>> I suggest a buildfarm animal running a custom buildfarm module that
>>> exercises the pg_upgrade test from every supported version to the 
>>> latest
>>> stable and to master -- together with your proposed case that leaves a
>>> toastless table around for pg_upgrade to handle.
>> That would help greatly with pg_dump test coverage as well.. One of the
>> problems of trying to get good LOC coverage of pg_dump is that a *lot*
>> of the code is version-specific...
>>
>
>
> I have a module that does it, although it's not really stable enough.
> But it's a big start.
> See 
> <https://github.com/PGBuildFarm/client-code/blob/master/PGBuild/Modules/TestUpgradeXversion.pm>


Incidentally, just as a warning for anyone trying, this uses up quite a
lot of disk space.

You would need several GB available.

cheers

andrew






Re: ALTER TABLE lock downgrades have broken pg_upgrade

From
Andrew Dunstan
Date:

On 05/03/2016 01:33 PM, Andrew Dunstan wrote:
>
>
> On 05/03/2016 01:28 PM, Andrew Dunstan wrote:
>>
>>
>> On 05/03/2016 01:21 PM, Stephen Frost wrote:
>>> * Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
>>>> Tom Lane wrote:
>>>>
>>>>> More generally, though, I wonder how we can have some test coverage
>>>>> on such cases going forward.  Is the patch below too ugly to commit
>>>>> permanently, and if so, what other idea can you suggest?
>>>> I suggest a buildfarm animal running a custom buildfarm module that
>>>> exercises the pg_upgrade test from every supported version to the
>>>> latest
>>>> stable and to master -- together with your proposed case that leaves a
>>>> toastless table around for pg_upgrade to handle.
>>> That would help greatly with pg_dump test coverage as well.. One of the
>>> problems of trying to get good LOC coverage of pg_dump is that a *lot*
>>> of the code is version-specific...
>>>
>>
>>
>> I have a module that does it, although it's not really stable
>> enough. But it's a big start.
>> See
>> <https://github.com/PGBuildFarm/client-code/blob/master/PGBuild/Modules/TestUpgradeXversion.pm>
>
>
> Incidentally, just as a warning for anyone trying, this uses up
> quite a lot of disk space.
>
> You would need several GB available.
>
>

And if this is of any use, here are the dump differences from every live
version to git tip, as of this morning.


cheers

andrew


Attachment

Re: ALTER TABLE lock downgrades have broken pg_upgrade

From
Alvaro Herrera
Date:
Stephen Frost wrote:

> One other point is that pg_dump goes quite a bit farther back than just
> what we currently support (or at least, it tries to).  I think that,
> generally, that's a good thing, but it does mean we have a lot of cases
> that don't get tested nearly as much...
> 
> I was able to get back to 7.4 up and running without too much trouble,
> but even that doesn't cover all the cases we have in pg_dump.  I'm not
> sure if we want to define a "we will support pg_dump back to X" cut-off
> or if we want to try and get older versions to run on modern systems,
> but it's definitely worth pointing out that we're trying to support much
> farther back than what is currently supported in pg_dump today.

Yeah.  Trying to compile old stuff with current tools (Debian jessie):

7.0's configure does not recognize my system:

checking host system type... Invalid configuration `x86_64-unknown-linux-gnu': machine `x86_64-unknown' not recognized

7.1's configure fails for accept() detection:

checking types of arguments for accept()... configure: error: could not determine argument types

7.2's configure works, but apparently it failed to find flex:

make[3]: Entering directory '/home/alvherre/Code/pgsql/source/throwaway/src/backend/bootstrap'
***
ERROR: `flex' is missing on your system. It is needed to create the
file `bootscanner.c'. You can either get flex from a GNU mirror site
or download an official distribution of PostgreSQL, which contains
pre-packaged flex output.
***
Makefile:60: recipe for target 'bootscanner.c' failed
make[3]: *** [bootscanner.c] Error 1


7.3 fails in ecpg preprocessor:

make[4]: Entering directory '/home/alvherre/Code/pgsql/source/throwaway/src/interfaces/ecpg/preproc'
make -C ../../../../src/port all
make[5]: Entering directory '/home/alvherre/Code/pgsql/source/throwaway/src/port'
make[5]: Nothing to be done for 'all'.
make[5]: Leaving directory '/home/alvherre/Code/pgsql/source/throwaway/src/port'
gcc -O2 -fno-strict-aliasing -Wall -Wmissing-prototypes -Wmissing-declarations -Wno-error -I./../include -I.
-I../../../../src/include -DMAJOR_VERSION=2 -DMINOR_VERSION=10 -DPATCHLEVEL=0
-DINCLUDE_PATH=\"/usr/local/pgsql/include\"  -c -o preproc.o preproc.c
 
In file included from preproc.y:5571:0:
pgc.c:170:18: error: conflicting types for ‘yyleng’
extern yy_size_t yyleng;
                 ^
In file included from preproc.y:7:0:
extern.h:32:4: note: previous declaration of ‘yyleng’ was here
   yyleng;
   ^
In file included from preproc.y:5571:0:
pgc.c:304:11: error: conflicting types for ‘yyleng’
yy_size_t yyleng;
          ^
In file included from preproc.y:7:0:
extern.h:32:4: note: previous declaration of ‘yyleng’ was here
   yyleng;
   ^
In file included from preproc.y:5571:0:
pgc.c:2723:16: warning: ‘input’ defined but not used [-Wunused-function]
    static int input  (void)
               ^
<builtin>: recipe for target 'preproc.o' failed
make[4]: *** [preproc.o] Error 1


7.4 seems to work fine.  I suppose it should be fine to remove pg_dump's
support for pre-7.3; people wanting to upgrade from before 7.3 (if any)
could use 9.6's pg_dump as an intermediate jump.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: ALTER TABLE lock downgrades have broken pg_upgrade

From
Andres Freund
Date:
Hi,

On 2016-05-03 12:07:51 -0400, Tom Lane wrote:
> I think possibly the easiest fix for this is to have pg_upgrade,
> instead of RESETting a nonexistent option, RESET something that's
> still considered to require AccessExclusiveLock.  "user_catalog_table"
> would work, looks like; though I'd want to annotate its entry in
> reloptions.c to warn people away from downgrading its lock level.

Alternatively we could just add a function for adding a toast table -
that seems less hacky and less likely to be broken in the future.

- Andres



Re: ALTER TABLE lock downgrades have broken pg_upgrade

From
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
> One other point is that pg_dump goes quite a bit farther back than just
> what we currently support (or at least, it tries to).  I think that,
> generally, that's a good thing, but it does mean we have a lot of cases
> that don't get tested nearly as much...

Yeah.  I do periodically fire up servers back to 7.0 and see if pg_dump
can dump from them, but I don't pretend that that's a very thorough test.

> I was able to get back to 7.4 up and running without too much trouble,
> but even that doesn't cover all the cases we have in pg_dump.  I'm not
> sure if we want to define a "we will support pg_dump back to X" cut-off
> or if we want to try and get older versions to run on modern systems,
> but it's definitely worth pointing out that we're trying to support much
> farther back than what is currently supported in pg_dump today.

I've been thinking of proposing that it's time (not now, at this point,
but for 9.7) to rip out libpq's support for V2 protocol as well as
pg_dump's support for pre-7.4 backends.  That's a quite significant
amount of what at this point is very poorly tested code.  And I doubt
that it would be productive to try to improve the test coverage rather
than just removing it.

There might be an argument for moving pg_dump's cutoff further than that,
but going to 7.3 or later is significant because it would allow removal of
the kluges for schema-less and dependency-less servers.  I suggested 7.4
because it'd comport with removal of V2 wire protocol support, and because
7.4 is also our cutoff for describe support in psql.

I'm hesitant to move the cutoff really far, because we do still hear from
people running really old versions, and sooner or later those people will
want to upgrade.  It'd be good if they could use a modern pg_dump for the
purpose.
        regards, tom lane



Re: ALTER TABLE lock downgrades have broken pg_upgrade

From
Stephen Frost
Date:
* Andres Freund (andres@anarazel.de) wrote:
> On 2016-05-03 12:07:51 -0400, Tom Lane wrote:
> > I think possibly the easiest fix for this is to have pg_upgrade,
> > instead of RESETting a nonexistent option, RESET something that's
> > still considered to require AccessExclusiveLock.  "user_catalog_table"
> > would work, looks like; though I'd want to annotate its entry in
> > reloptions.c to warn people away from downgrading its lock level.
>
> Alternatively we could just add a function for adding a toast table -
> that seems less hacky and less likely to be broken in the future.

+1

Thanks!

Stephen

Re: ALTER TABLE lock downgrades have broken pg_upgrade

From
Andres Freund
Date:
On 2016-05-03 13:47:14 -0400, Tom Lane wrote:
> I've been thinking of proposing that it's time (not now, at this point,
> but for 9.7) to rip out libpq's support for V2 protocol as well as
> pg_dump's support for pre-7.4 backends.

+1


> There might be an argument for moving pg_dump's cutoff further than that,
> but going to 7.3 or later is significant because it would allow removal of
> the kluges for schema-less and dependency-less servers.  I suggested 7.4
> because it'd comport with removal of V2 wire protocol support, and because
> 7.4 is also our cutoff for describe support in psql.

I think we can be a lot more aggressive moving the cutoff for psql than
for pg_dump; but that's more an argument for ripping out some old psql code.


> I'm hesitant to move the cutoff really far, because we do still hear from
> people running really old versions, and sooner or later those people will
> want to upgrade.  It'd be good if they could use a modern pg_dump for the
> purpose.

I think we should consider making the cutoff point for pg_dump somewhat
predictable. E.g. saying that we support 5 more versions than the
actually maintained ones.   The likelihood of breakages seems to
increase a good bit for older versions.
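
As a rough illustration of how such a rule could be evaluated, here is
a small sketch. Everything in it is an assumption made for the example:
the function name, the release list (truncated at 7.4), and the premise
that the five newest entries are the maintained branches at the time of
this thread. Versions are written as integers, so 8.1 is 80100.

```c
#include <assert.h>
#include <stddef.h>

/* Major releases, newest first (assumed list, for illustration only). */
static const int sketch_majors[] = {
	90500, 90400, 90300, 90200, 90100,	/* assumed to be maintained */
	90000, 80400, 80300, 80200, 80100, 80000, 70400
};

#define SKETCH_NMAJORS (sizeof(sketch_majors) / sizeof(sketch_majors[0]))

/*
 * Oldest release pg_dump would support if it covered the maintained
 * branches plus n_extra older ones.  Clamps to the oldest list entry.
 */
int
pg_dump_cutoff_sketch(size_t n_maintained, size_t n_extra)
{
	size_t		idx;

	assert(n_maintained >= 1 && n_maintained <= SKETCH_NMAJORS);
	idx = n_maintained - 1 + n_extra;
	if (idx >= SKETCH_NMAJORS)
		idx = SKETCH_NMAJORS - 1;
	return sketch_majors[idx];
}
```

Under those assumptions, pg_dump_cutoff_sketch(5, 5) gives 80100, i.e.
a cutoff of 8.1, and the cutoff would advance by one release each time
a branch goes out of maintenance.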


Andres



Re: ALTER TABLE lock downgrades have broken pg_upgrade

From
Alvaro Herrera
Date:
Andrew Dunstan wrote:

> And if this is of any use, here are the dump differences from every live
> version to git tip, as of this morning.

Interesting, thanks.  I wonder if some of these diffs could be reduced
further by using pg_dump -Fd instead of a single text dump -- then
internal ordering would not matter, and I see that a large part of these
diffs is where GRANTs appear.  (I don't think it's a problem to use a
newer pg_dump to dump the older databases that don't support -Fd, for
this purpose.)
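
The underlying idea, comparing two dumps without regard to the order in
which statements appear, can be sketched independently of pg_dump. The
code below is only an illustration of order-insensitive comparison; it
says nothing about how the directory format is laid out, and the
function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* qsort() comparator for an array of C strings. */
static int
cmp_str(const void *a, const void *b)
{
	return strcmp(*(const char *const *) a, *(const char *const *) b);
}

/*
 * Return true if the two statement lists contain the same statements,
 * ignoring order.  Works on sorted copies, so the inputs are untouched.
 */
bool
same_statements_ignoring_order(const char *const *a, size_t n_a,
							   const char *const *b, size_t n_b)
{
	const char **sa;
	const char **sb;
	bool		same = true;
	size_t		i;

	if (n_a != n_b)
		return false;
	if (n_a == 0)
		return true;

	sa = malloc(n_a * sizeof(*sa));
	sb = malloc(n_b * sizeof(*sb));
	assert(sa != NULL && sb != NULL);
	memcpy(sa, a, n_a * sizeof(*sa));
	memcpy(sb, b, n_b * sizeof(*sb));

	qsort(sa, n_a, sizeof(*sa), cmp_str);
	qsort(sb, n_b, sizeof(*sb), cmp_str);

	for (i = 0; i < n_a && same; i++)
		same = (strcmp(sa[i], sb[i]) == 0);

	free(sa);
	free(sb);
	return same;
}
```

With a comparison of this kind, a GRANT that merely moved to a
different position in the dump would not show up as a difference.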

How would you recommend running this in the coverage reporting machine?
Currently it's just doing "make check-world".  Could we add a buildfarm
script to run, standalone?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: ALTER TABLE lock downgrades have broken pg_upgrade

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2016-05-03 12:07:51 -0400, Tom Lane wrote:
>> I think possibly the easiest fix for this is to have pg_upgrade,
>> instead of RESETting a nonexistent option, RESET something that's
>> still considered to require AccessExclusiveLock.  "user_catalog_table"
>> would work, looks like; though I'd want to annotate its entry in
>> reloptions.c to warn people away from downgrading its lock level.

> Alternatively we could just add a function for adding a toast table -
> that seems less hacky and less likely to be broken in the future.

We used to have an explicit "ALTER TABLE CREATE TOAST TABLE", IIRC,
but we got rid of it.  In any case, forcible creation is not what
we're after here, we just want to create it if needs_toast_table()
thinks we need one.  I'm fine with it being a hack, as long as we
have test coverage so we notice when somebody breaks the hack.
        regards, tom lane



Re: ALTER TABLE lock downgrades have broken pg_upgrade

From
Andrew Dunstan
Date:

On 05/03/2016 01:58 PM, Alvaro Herrera wrote:
> Andrew Dunstan wrote:
>
>> And if this is of any use, here are the dump differences from every live
>> version to git tip, as of this morning.
> Interesting, thanks.  I wonder if some of these diffs could be reduced
> further by using pg_dump -Fd instead of a single text dump -- then
> internal ordering would not matter, and I see that a large part of these
> diffs is where GRANTs appear.  (I don't think it's a problem to use a
> newer pg_dump to dump the older databases that don't support -Fd, for
> this purpose.)
>
> How would you recommend running this in the coverage reporting machine?
> Currently it's just doing "make check-world".  Could we add a buildfarm
> script to run, standalone?
>


Well, to run it you just run a buildfarm animal, possibly not even 
registered, with the module enabled. The module is actually in every 
buildfarm release, and has been for some time, but it's not enabled. 
Right now even if it runs it doesn't report anything to the server, it 
just outputs success/failure lines to stdout.

cheers

andrew




Re: ALTER TABLE lock downgrades have broken pg_upgrade

From
Tom Lane
Date:
I wrote:
> I haven't tried to construct a pre-9.1 database that would trigger
> this, but you can make it happen by applying the attached patch
> to create a toast-table-less table in the regression tests,
> and then doing "make check" in src/bin/pg_upgrade.  You get this:

> ...
> Restoring database schemas in the new cluster
>                                                             ok
> Creating newly-required TOAST tables                        SQL command failed
> ALTER TABLE "public"."i_once_had_a_toast_table" RESET (binary_upgrade_dummy_option);
> ERROR:  AccessExclusiveLock required to add toast table.
> Failure, exiting

> I think possibly the easiest fix for this is to have pg_upgrade,
> instead of RESETting a nonexistent option, RESET something that's
> still considered to require AccessExclusiveLock.  "user_catalog_table"
> would work, looks like; though I'd want to annotate its entry in
> reloptions.c to warn people away from downgrading its lock level.

I tried fixing it like that.  The alternate RESET target had behaved as
expected when I'd tested by hand, but in pg_upgrade it still fails,
only now with

Creating newly-required TOAST tables                        SQL command failed
ALTER TABLE "public"."i_need_a_toast_table" RESET (user_catalog_table);
ERROR:  pg_type OID value not set when in binary upgrade mode

This implies that there was some totally other patch, probably quite
pg_upgrade-specific, that broke this case independently of the
lock-downgrade change.

My conclusion is that we probably do need a specific pg_upgrade support
function to handle the case, rather than trying to sneak it through via
ALTER TABLE, which means that we won't be able to back-patch a fix.

I have no more time to work on this, but I think it needs to be fixed, and
I definitely think we had better put in test coverage when we do fix it.
Attached is a proposed patch that adds regression test coverage for this
and a related case (and triggers the failures I've been complaining of).

            regards, tom lane

diff --git a/src/test/regress/expected/indirect_toast.out b/src/test/regress/expected/indirect_toast.out
index 4f4bf41..3ed0189 100644
*** a/src/test/regress/expected/indirect_toast.out
--- b/src/test/regress/expected/indirect_toast.out
*************** SELECT substring(toasttest::text, 1, 200
*** 149,151 ****
--- 149,177 ----

  DROP TABLE toasttest;
  DROP FUNCTION update_using_indirect();
+ --
+ -- Create a couple of tables that have unusual TOAST situations, and leave
+ -- them around so that they'll be in the final regression database state.
+ -- This enables testing of these scenarios for pg_upgrade.
+ --
+ -- Table that has a TOAST table, but doesn't really need it.
+ create table i_have_useless_toast_table(f1 int, f2 text);
+ insert into i_have_useless_toast_table values(1, 'foo');
+ alter table i_have_useless_toast_table drop column f2;
+ -- Table that needs a TOAST table and has not got one.  This is uglier...
+ -- we can't actually remove the TOAST table, only unlink it from parent.
+ -- But leaving an orphan TOAST table is good for testing pg_upgrade, anyway.
+ create table i_need_a_toast_table(f1 int, f2 text);
+ insert into i_need_a_toast_table values(1, 'foo');
+ update pg_class set reltoastrelid = 0
+   where relname = 'i_need_a_toast_table';
+ SELECT relname, reltoastrelid <> 0 AS has_toast_table
+    FROM pg_class
+    WHERE oid::regclass IN ('i_have_useless_toast_table', 'i_need_a_toast_table')
+    ORDER BY 1;
+           relname           | has_toast_table
+ ----------------------------+-----------------
+  i_have_useless_toast_table | t
+  i_need_a_toast_table       | f
+ (2 rows)
+
diff --git a/src/test/regress/sql/indirect_toast.sql b/src/test/regress/sql/indirect_toast.sql
index d502480..15640fc 100644
*** a/src/test/regress/sql/indirect_toast.sql
--- b/src/test/regress/sql/indirect_toast.sql
*************** SELECT substring(toasttest::text, 1, 200
*** 59,61 ****
--- 59,85 ----

  DROP TABLE toasttest;
  DROP FUNCTION update_using_indirect();
+
+ --
+ -- Create a couple of tables that have unusual TOAST situations, and leave
+ -- them around so that they'll be in the final regression database state.
+ -- This enables testing of these scenarios for pg_upgrade.
+ --
+
+ -- Table that has a TOAST table, but doesn't really need it.
+ create table i_have_useless_toast_table(f1 int, f2 text);
+ insert into i_have_useless_toast_table values(1, 'foo');
+ alter table i_have_useless_toast_table drop column f2;
+
+ -- Table that needs a TOAST table and has not got one.  This is uglier...
+ -- we can't actually remove the TOAST table, only unlink it from parent.
+ -- But leaving an orphan TOAST table is good for testing pg_upgrade, anyway.
+ create table i_need_a_toast_table(f1 int, f2 text);
+ insert into i_need_a_toast_table values(1, 'foo');
+ update pg_class set reltoastrelid = 0
+   where relname = 'i_need_a_toast_table';
+
+ SELECT relname, reltoastrelid <> 0 AS has_toast_table
+    FROM pg_class
+    WHERE oid::regclass IN ('i_have_useless_toast_table', 'i_need_a_toast_table')
+    ORDER BY 1;

Re: ALTER TABLE lock downgrades have broken pg_upgrade

From
Tom Lane
Date:
I wrote:
> I have no more time to work on this, but I think it needs to be fixed, and
> I definitely think we had better put in test coverage when we do fix it.

Actually, there is a really easy fix we could put in, which is to decide
that optionally_create_toast_tables() is useless and get rid of it.

Point 1: if a table did not have a TOAST table in the old database,
any decision that it needs one in the new database must be a very
close-to-the-edge situation; certainly the 9.2 change we had in this area
was a matter of rounding things off differently.  needs_toast_table()'s
threshold for max tuple length is only a quarter page, so there's a huge
amount of daylight between where we'd choose to create a toast table and
where users would actually see failures from not having one.  It's pretty
hard to believe that cross-version differences in the do-we-need-a-toast-
table heuristic would exceed that.

Point 2: the current code is broken, and will cause a pg_upgrade failure
if the situation of concern occurs.  That's certainly much worse than
not adding a marginally-useful toast table.

Point 3: in view of point 2, the lack of field complaints says that this
situation doesn't actually happen in the field.
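
To put rough numbers on the "daylight" in point 1, here is a sketch of
the arithmetic. It assumes the default 8192-byte page and treats the
threshold as exactly a quarter of that; the real computation also
accounts for page and tuple header overhead, so the true figures are
somewhat smaller. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_BLCKSZ			8192	/* assumed default page size */
#define SKETCH_TOAST_THRESHOLD	(SKETCH_BLCKSZ / 4) /* "a quarter page" */

/* Would the heuristic ask for a toast table at this max tuple length? */
bool
needs_toast_table_sketch(size_t max_tuple_len)
{
	return max_tuple_len > SKETCH_TOAST_THRESHOLD;
}

/* Could a row this long fail to fit on a page without a toast table? */
bool
could_fail_without_toast_sketch(size_t max_tuple_len)
{
	return max_tuple_len > SKETCH_BLCKSZ;
}

/* Width of the zone where a toast table is wanted but not yet essential. */
size_t
toast_margin_sketch(void)
{
	return SKETCH_BLCKSZ - SKETCH_TOAST_THRESHOLD;
}
```

In this simplified model a table whose rows can reach 3000 bytes is
flagged as wanting a toast table, yet such rows still fit on a page;
the margin between the two limits is 6144 bytes, far more than any
plausible cross-version rounding difference in the heuristic.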

Barring complaints, I'll fix this bug by removing that code altogether.
        regards, tom lane



Re: ALTER TABLE lock downgrades have broken pg_upgrade

From
Simon Riggs
Date:
On 3 May 2016 at 18:07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
 
> Or at least, it did until Simon decided that ALTER TABLE RESET
> doesn't require AccessExclusiveLock.

On reflection, this still seems like a good idea.

> Now you get a failure.

Failure condition as an exception to that.

> I haven't tried to construct a pre-9.1 database that would trigger
> this, but you can make it happen by applying the attached patch
> to create a toast-table-less table in the regression tests,
> and then doing "make check" in src/bin/pg_upgrade.  You get this:
>
> ...
> Restoring database schemas in the new cluster
>                                                             ok
> Creating newly-required TOAST tables                        SQL command failed
> ALTER TABLE "public"."i_once_had_a_toast_table" RESET (binary_upgrade_dummy_option);
> ERROR:  AccessExclusiveLock required to add toast table.
>
> Failure, exiting

It appears that pg_upgrade is depending upon an undocumented side-effect of ALTER TABLE RESET.

I would say this side-effect should not exist, which IIUC is the same conclusion as in your latest post.

If pg_upgrade needs this, we should implement a specific function that does what pg_upgrade needs. That way we can isolate the requirement for an AccessExclusiveLock to the place that needs it: pg_upgrade. That will also make it less fragile in the future. I don't think that needs a specific command, just a function.

I accept that it is my bug and should fix it.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: ALTER TABLE lock downgrades have broken pg_upgrade

From
Tom Lane
Date:
Simon Riggs <simon@2ndquadrant.com> writes:
> On 3 May 2016 at 18:07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Or at least, it did until Simon decided that ALTER TABLE RESET
>> doesn't require AccessExclusiveLock.

> On reflection, this still seems like a good idea.

Yes, what pg_upgrade was doing was clearly a hack, and not a very nice one.

> I accept that it is my bug and should fix it.

It's moot at this point, see 1a2c17f8e.
        regards, tom lane



Re: ALTER TABLE lock downgrades have broken pg_upgrade

From
Simon Riggs
Date:
On 7 May 2016 at 16:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > On 3 May 2016 at 18:07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Or at least, it did until Simon decided that ALTER TABLE RESET
> >> doesn't require AccessExclusiveLock.
>
> > On reflection, this still seems like a good idea.
>
> Yes, what pg_upgrade was doing was clearly a hack, and not a very nice one.
>
> > I accept that it is my bug and should fix it.
>
> It's moot at this point, see 1a2c17f8e.

OK, sorry for the noise and thanks for the fix.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: ALTER TABLE lock downgrades have broken pg_upgrade

From
Greg Stark
Date:
For what it's worth, for my historical sort benchmarks I got Postgres
to build right back to 6.5 using modern tools. I have patches if
anyone wants them. Pre-7.3 doesn't actually run because we didn't
support 64-bit architectures before Tom did the V1 api (there was a
set of Alpha patches floating around but they don't seem sufficient
for x86_64). But I suspect if I built it for x86 32-bit that would
clear the immediate problem.


I was considering proposing backpatching some minimal patches to get
old unsupported branches to at least build with modern tools and run.
Just to make it easy for people to test historical behaviour and I
suppose pg_dump or other client testing would be a good use case too.
I was also going to propose turning off all warnings on these
unsupported back branches while I was at it.


One other lesson I learned doing this was that it was a pain referring
to individual git checkouts because we don't have a tag for the point
where we branched the new development. So all my git-describes were
ending up with things like REL9_2_BETA2-619-gff6c78c or
REL9_3_BETA1-925-g6668ad1. You just had to know that REL9_3_BETA1-xxx
was probably a revision sometime during PG 9.5 development since BETA1
was probably where we forked development whereas 9.3 probably forked
off 9_2_BETA2. (And some were things like REL7_4_RC1-1513-g4525418
instead)

I would suggest adding tags for each version on the first development
revision so that we could see things like REL9_5_DEV-nnn would mean
the nnnth commit on the 9.5 development branch.

(What I actually did instead myself was use git describe --tags
--contains $i --match 'REL[0-9]_[0-9]_[0-9]' which gave me things like
REL9_5_0~1334 which means the 1334th commit *before* REL9_5_0. That
was also confusing but at least consistent)
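
The two `git describe` shapes mentioned above can be told apart mechanically. The following is a minimal sketch, assuming only the simple forms quoted in this message; the function name `parse_describe` and the returned dictionary layout are hypothetical, and more complex `--contains` results (for example ones containing `^`) are deliberately not handled.

```python
import re

# Default form: <tag>-<N>-g<abbreviated sha>, i.e. N commits *after* <tag>.
AFTER_TAG = re.compile(r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<sha>[0-9a-f]+)$")
# Simple --contains form: <tag>~<N>, i.e. N commits *before* <tag>.
BEFORE_TAG = re.compile(r"^(?P<tag>[^~^]+)~(?P<count>\d+)$")


def parse_describe(text):
    """Split a git-describe string into tag, commit offset, and direction."""
    match = AFTER_TAG.match(text)
    if match:
        return {
            "tag": match["tag"],
            "offset": int(match["count"]),
            "direction": "after",
            "sha": match["sha"],
        }
    match = BEFORE_TAG.match(text)
    if match:
        return {
            "tag": match["tag"],
            "offset": int(match["count"]),
            "direction": "before",
            "sha": None,
        }
    raise ValueError(f"unrecognised git describe output: {text!r}")


if __name__ == "__main__":
    for name in ("REL9_3_BETA1-925-g6668ad1", "REL9_5_0~1334"):
        print(name, parse_describe(name))
```

The direction field makes explicit what the raw strings leave implicit: the first form counts forward from an older tag, while the second counts backward from a later release tag.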



Re: ALTER TABLE lock downgrades have broken pg_upgrade

From
Bruce Momjian
Date:
On Tue, May  3, 2016 at 12:07:51PM -0400, Tom Lane wrote:
> I think possibly the easiest fix for this is to have pg_upgrade,
> instead of RESETting a nonexistent option, RESET something that's
> still considered to require AccessExclusiveLock.  "user_catalog_table"
> would work, looks like; though I'd want to annotate its entry in
> reloptions.c to warn people away from downgrading its lock level.
> 
> More generally, though, I wonder how we can have some test coverage
> on such cases going forward.  Is the patch below too ugly to commit
> permanently, and if so, what other idea can you suggest?

FYI, I only test _supported_ version combinations for pg_upgrade, i.e. I
don't test pg_upgrade _from_ unsupported versions, though I can see why
maybe I should.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +



Re: ALTER TABLE lock downgrades have broken pg_upgrade

From
Bruce Momjian
Date:
On Fri, May  6, 2016 at 03:32:23PM -0400, Tom Lane wrote:
> > I think possibly the easiest fix for this is to have pg_upgrade,
> > instead of RESETting a nonexistent option, RESET something that's
> > still considered to require AccessExclusiveLock.  "user_catalog_table"
> > would work, looks like; though I'd want to annotate its entry in
> > reloptions.c to warn people away from downgrading its lock level.
> 
> I tried fixing it like that.  The alternate RESET target had behaved as
> expected when I'd tested by hand, but in pg_upgrade it still fails,
> only now with
> 
> Creating newly-required TOAST tables                        SQL command failed
> ALTER TABLE "public"."i_need_a_toast_table" RESET (user_catalog_table);
> ERROR:  pg_type OID value not set when in binary upgrade mode

I think this means that the ALTER TABLE RESET is adding or potentially
adding a pg_type row, and no one called
binary_upgrade_set_next_pg_type_oid().
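
The failure mode described here can be illustrated with a toy model. This is a sketch only, not PostgreSQL code: the class `BinaryUpgradeState` and its methods are hypothetical, and the single-use behaviour of the preset OID is an assumption made for the illustration. It shows why a catalog row created without a preceding "set next OID" call fails in binary upgrade mode.

```python
INVALID_OID = 0


class BinaryUpgradeState:
    """Toy stand-in for binary-upgrade OID bookkeeping (hypothetical)."""

    def __init__(self):
        self.next_pg_type_oid = INVALID_OID

    def set_next_pg_type_oid(self, oid):
        # Plays the role of binary_upgrade_set_next_pg_type_oid().
        self.next_pg_type_oid = oid

    def assign_pg_type_oid(self):
        # In binary upgrade mode the OID must have been preset by the
        # caller; nothing is generated here.
        if self.next_pg_type_oid == INVALID_OID:
            raise RuntimeError(
                "pg_type OID value not set when in binary upgrade mode"
            )
        # Assumption: the preset value is consumed by a single use.
        oid, self.next_pg_type_oid = self.next_pg_type_oid, INVALID_OID
        return oid


if __name__ == "__main__":
    state = BinaryUpgradeState()
    state.set_next_pg_type_oid(16400)
    print(state.assign_pg_type_oid())
```

In this model, an ALTER TABLE RESET that ends up creating a TOAST table corresponds to calling `assign_pg_type_oid()` with nothing preset, which raises the error quoted above.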

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +



Re: ALTER TABLE lock downgrades have broken pg_upgrade

From
Peter Eisentraut
Date:
On 5/3/16 1:25 PM, Alvaro Herrera wrote:
> If we can put together a script that runs test.sh for various versions
> and then verifies the runs, we could use it in both buildfarm and
> coverage.

Not that that would be useless, but note that the value in this case 
(and most others) comes from having a candidate object in the database 
before upgrade that exercises the particular problem, mostly independent 
of what version you upgrade from and to.  So far the way to do that is 
to leave "junk" in the regression test database, but that's clearly a 
bit silly.

I think the way forward is to create a TAP test suite for pg_upgrade 
that specifically exercises a lot of scenarios with small purpose-built 
test databases.

Then, the problem of having to compare dump output across versions also 
goes away more easily.
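
As a rough illustration of the idea, the sketch below pairs small purpose-built setup scripts with a TAP-style report. It is written in Python purely for illustration. The function `run_scenarios`, the `run_upgrade` callable, and the scenario SQL are all hypothetical; only the table name is taken from the failure output earlier in this thread.

```python
def run_scenarios(scenarios, run_upgrade):
    """Return TAP lines for a list of (name, setup_sql) scenarios.

    run_upgrade is a hypothetical callable: it receives the setup SQL for
    one tiny database and returns True if upgrading that database worked.
    """
    lines = [f"1..{len(scenarios)}"]  # TAP plan line
    for number, (name, setup_sql) in enumerate(scenarios, start=1):
        status = "ok" if run_upgrade(setup_sql) else "not ok"
        lines.append(f"{status} {number} - {name}")
    return lines


# Illustrative scenarios only; each builds exactly the object that
# exercises one upgrade corner case.
SCENARIOS = [
    (
        "table that once had a wide column",
        "CREATE TABLE i_once_had_a_toast_table (a int, b text);\n"
        "ALTER TABLE i_once_had_a_toast_table DROP COLUMN b;",
    ),
    (
        "plain table without wide columns",
        "CREATE TABLE plain_table (a int);",
    ),
]


if __name__ == "__main__":
    # Stub that pretends every upgrade succeeds, just to show the output.
    print("\n".join(run_scenarios(SCENARIOS, lambda sql: True)))
```

Each scenario carries its own candidate object, so nothing has to be left behind in the main regression database.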

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: ALTER TABLE lock downgrades have broken pg_upgrade

From
Bruce Momjian
Date:
On Wed, May 11, 2016 at 09:40:09AM -0400, Peter Eisentraut wrote:
> Not that that would be useless, but note that the value in this case (and
> most others) comes from having a candidate object in the database before
> upgrade that exercises the particular problem, mostly independent of what
> version you upgrade from and to.  So far the way to do that is to leave
> "junk" in the regression test database, but that's clearly a bit silly.
>
> I think the way forward is to create a TAP test suite for pg_upgrade that
> specifically exercises a lot of scenarios with small purpose-built test
> databases.
>
> Then, the problem of having to compare dump output across versions also goes
> away more easily.

I do have some small tests like for tablespaces.  I am attaching the SQL
script, if that is helpful.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +


Re: ALTER TABLE lock downgrades have broken pg_upgrade

From
Alvaro Herrera
Date:
Peter Eisentraut wrote:
> On 5/3/16 1:25 PM, Alvaro Herrera wrote:
> >If we can put together a script that runs test.sh for various versions
> >and then verifies the runs, we could use it in both buildfarm and
> >coverage.
> 
> Not that that would be useless, but note that the value in this case (and
> most others) comes from having a candidate object in the database before
> upgrade that exercises the particular problem, mostly independent of what
> version you upgrade from and to.  So far the way to do that is to leave
> "junk" in the regression test database, but that's clearly a bit silly.

True.  We have quite a few places in the standard regression tests that
leave junk behind purposefully for this reason.

> I think the way forward is to create a TAP test suite for pg_upgrade that
> specifically exercises a lot of scenarios with small purpose-built test
> databases.

That works for me.

> Then, the problem of having to compare dump output across versions also goes
> away more easily.

Not sure why, but if you think it does, then it sounds good.  Andrew's
current approach of counting lines in the diff seems brittle and not
entirely trustworthy.
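
The brittleness of a line-count check can be shown with a small sketch. It assumes the check only compares the number of changed lines between two dumps; the details of the actual buildfarm check are not given in this thread. The helper `changed_lines` and the sample dump text are hypothetical.

```python
import difflib


def changed_lines(old_dump, new_dump):
    """Return the added and removed lines between two dump texts."""
    diff = difflib.ndiff(old_dump.splitlines(), new_dump.splitlines())
    # ndiff marks removals with "- " and additions with "+ "; unchanged
    # lines ("  ") and hint lines ("? ") are dropped.
    return [line for line in diff if line.startswith(("- ", "+ "))]


OLD = "CREATE TABLE t (a integer);\nCREATE INDEX t_a ON t (a);"
# A cosmetic difference in how a type name is spelled.
HARMLESS = "CREATE TABLE t (a int);\nCREATE INDEX t_a ON t (a);"
# A real difference: the index is now on another column.
HARMFUL = "CREATE TABLE t (a integer);\nCREATE INDEX t_a ON t (b);"


if __name__ == "__main__":
    for label, new in (("harmless", HARMLESS), ("harmful", HARMFUL)):
        lines = changed_lines(OLD, new)
        print(label, len(lines), lines)
```

Both comparisons report the same number of changed lines, so a count alone cannot distinguish the cosmetic difference from the real one; only the content of the diff can.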

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services