Thread: ALTER TABLE lock downgrades have broken pg_upgrade
There is logic in pg_upgrade plus the backend, mostly added by commit
4c6780fd1, to cope with the corner cases that sometimes arise where the
old and new versions have different ideas about whether a given table
needs a TOAST table.  The more common case is where there's a TOAST table
in the old DB, but (perhaps as a result of having dropped all the wide
columns) the new cluster doesn't think the table definition requires a
TOAST table.  The reverse is also possible, although according to the
existing code comments it can only happen when upgrading from pre-9.1.
The way pg_upgrade handles that case is that after running all the table
creation operations it issues this command:

    PQclear(executeQueryOrDie(conn,
            "ALTER TABLE %s.%s RESET (binary_upgrade_dummy_option);",
            quote_identifier(PQgetvalue(res, rowno, i_nspname)),
            quote_identifier(PQgetvalue(res, rowno, i_relname))));

which doesn't actually do anything (no such reloption being set) but
nonetheless triggers a call of AlterTableCreateToastTable, which will
cause a toast table to be created if the new server thinks the table
definition requires one.

Or at least, it did until Simon decided that ALTER TABLE RESET doesn't
require AccessExclusiveLock.  Now you get a failure.

I haven't tried to construct a pre-9.1 database that would trigger this,
but you can make it happen by applying the attached patch to create a
toast-table-less table in the regression tests, and then doing
"make check" in src/bin/pg_upgrade.  You get this:

    ...
    Restoring database schemas in the new cluster               ok
    Creating newly-required TOAST tables                        SQL command failed
    ALTER TABLE "public"."i_once_had_a_toast_table" RESET (binary_upgrade_dummy_option);
    ERROR:  AccessExclusiveLock required to add toast table.

    Failure, exiting
    + rm -rf /tmp/pg_upgrade_check-o0CUMm
    make: *** [check] Error 1

I think possibly the easiest fix for this is to have pg_upgrade, instead
of RESETting a nonexistent option, RESET something that's still
considered to require AccessExclusiveLock.  "user_catalog_table" would
work, looks like; though I'd want to annotate its entry in reloptions.c
to warn people away from downgrading its lock level.

More generally, though, I wonder how we can have some test coverage on
such cases going forward.  Is the patch below too ugly to commit
permanently, and if so, what other idea can you suggest?

			regards, tom lane

diff --git a/src/test/regress/expected/indirect_toast.out b/src/test/regress/expected/indirect_toast.out
index 4f4bf41..ad7127d 100644
*** a/src/test/regress/expected/indirect_toast.out
--- b/src/test/regress/expected/indirect_toast.out
*************** SELECT substring(toasttest::text, 1, 200
*** 149,151 ****
--- 149,158 ----
  DROP TABLE toasttest;
  DROP FUNCTION update_using_indirect();
+ -- Create a table that has a toast table, then modify it so it appears
+ -- not to have one, and leave it behind after the regression tests end.
+ -- This enables testing of this scenario for pg_upgrade.
+ create table i_once_had_a_toast_table(f1 int, f2 text);
+ insert into i_once_had_a_toast_table values(1, 'foo');
+ update pg_class set reltoastrelid = 0
+ where relname = 'i_once_had_a_toast_table';
diff --git a/src/test/regress/sql/indirect_toast.sql b/src/test/regress/sql/indirect_toast.sql
index d502480..cefbd0b 100644
*** a/src/test/regress/sql/indirect_toast.sql
--- b/src/test/regress/sql/indirect_toast.sql
*************** SELECT substring(toasttest::text, 1, 200
*** 59,61 ****
--- 59,69 ----
  DROP TABLE toasttest;
  DROP FUNCTION update_using_indirect();
+ 
+ -- Create a table that has a toast table, then modify it so it appears
+ -- not to have one, and leave it behind after the regression tests end.
+ -- This enables testing of this scenario for pg_upgrade.
+ create table i_once_had_a_toast_table(f1 int, f2 text);
+ insert into i_once_had_a_toast_table values(1, 'foo');
+ update pg_class set reltoastrelid = 0
+ where relname = 'i_once_had_a_toast_table';
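For reference, here is a small shell sketch of the statement that
executeQueryOrDie call assembles (the schema and table names are just the
ones from the failing example above).  The reloption it names does not
exist, which is the point: the statement is a no-op apart from its side
effect of re-running the TOAST-table check.

```shell
# Build the per-table command the way pg_upgrade's printf-style format does.
# Note: the real code uses quote_identifier(), which also handles embedded
# quotes and reserved words; the plain double-quoting here is a simplification.
schema='public'
table='i_once_had_a_toast_table'
printf 'ALTER TABLE "%s"."%s" RESET (binary_upgrade_dummy_option);\n' \
       "$schema" "$table"
```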
Tom Lane wrote:

> More generally, though, I wonder how we can have some test coverage
> on such cases going forward.  Is the patch below too ugly to commit
> permanently, and if so, what other idea can you suggest?

I suggest a buildfarm animal running a custom buildfarm module that
exercises the pg_upgrade test from every supported version to the latest
stable and to master -- together with your proposed case that leaves a
toastless table around for pg_upgrade to handle.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
* Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
> Tom Lane wrote:
>
> > More generally, though, I wonder how we can have some test coverage
> > on such cases going forward.  Is the patch below too ugly to commit
> > permanently, and if so, what other idea can you suggest?
>
> I suggest a buildfarm animal running a custom buildfarm module that
> exercises the pg_upgrade test from every supported version to the latest
> stable and to master -- together with your proposed case that leaves a
> toastless table around for pg_upgrade to handle.

That would help greatly with pg_dump test coverage as well.  One of the
problems of trying to get good LOC coverage of pg_dump is that a *lot*
of the code is version-specific...

Thanks!

Stephen
Stephen Frost wrote:
> * Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
> > Tom Lane wrote:
> >
> > > More generally, though, I wonder how we can have some test coverage
> > > on such cases going forward.  Is the patch below too ugly to commit
> > > permanently, and if so, what other idea can you suggest?
> >
> > I suggest a buildfarm animal running a custom buildfarm module that
> > exercises the pg_upgrade test from every supported version to the latest
> > stable and to master -- together with your proposed case that leaves a
> > toastless table around for pg_upgrade to handle.
>
> That would help greatly with pg_dump test coverage as well.  One of the
> problems of trying to get good LOC coverage of pg_dump is that a *lot*
> of the code is version-specific...

If we can put together a script that runs test.sh for various versions
and then verifies the runs, we could use it in both buildfarm and
coverage.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 05/03/2016 01:21 PM, Stephen Frost wrote:
> * Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
>> Tom Lane wrote:
>>
>>> More generally, though, I wonder how we can have some test coverage
>>> on such cases going forward.  Is the patch below too ugly to commit
>>> permanently, and if so, what other idea can you suggest?
>> I suggest a buildfarm animal running a custom buildfarm module that
>> exercises the pg_upgrade test from every supported version to the latest
>> stable and to master -- together with your proposed case that leaves a
>> toastless table around for pg_upgrade to handle.
> That would help greatly with pg_dump test coverage as well.  One of the
> problems of trying to get good LOC coverage of pg_dump is that a *lot*
> of the code is version-specific...

I have a module that does it, although it's not really stable enough.
But it's a big start.  See
<https://github.com/PGBuildFarm/client-code/blob/master/PGBuild/Modules/TestUpgradeXversion.pm>

cheers

andrew
* Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
> Stephen Frost wrote:
> > * Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
> > > Tom Lane wrote:
> > >
> > > > More generally, though, I wonder how we can have some test coverage
> > > > on such cases going forward.  Is the patch below too ugly to commit
> > > > permanently, and if so, what other idea can you suggest?
> > >
> > > I suggest a buildfarm animal running a custom buildfarm module that
> > > exercises the pg_upgrade test from every supported version to the latest
> > > stable and to master -- together with your proposed case that leaves a
> > > toastless table around for pg_upgrade to handle.
> >
> > That would help greatly with pg_dump test coverage as well.  One of the
> > problems of trying to get good LOC coverage of pg_dump is that a *lot*
> > of the code is version-specific...
>
> If we can put together a script that runs test.sh for various versions
> and then verifies the runs, we could use it in both buildfarm and
> coverage.

One other point is that pg_dump goes quite a bit farther back than just
what we currently support (or at least, it tries to).  I think that,
generally, that's a good thing, but it does mean we have a lot of cases
that don't get tested nearly as much...

I was able to get versions back to 7.4 up and running without too much
trouble, but even that doesn't cover all the cases we have in pg_dump.
I'm not sure if we want to define a "we will support pg_dump back to X"
cut-off or if we want to try and get older versions to run on modern
systems, but it's definitely worth pointing out that we're trying to
support much farther back than what is currently supported in pg_dump
today.

Thanks!

Stephen
On 05/03/2016 01:28 PM, Andrew Dunstan wrote:
> On 05/03/2016 01:21 PM, Stephen Frost wrote:
>> * Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
>>> Tom Lane wrote:
>>>
>>>> More generally, though, I wonder how we can have some test coverage
>>>> on such cases going forward.  Is the patch below too ugly to commit
>>>> permanently, and if so, what other idea can you suggest?
>>> I suggest a buildfarm animal running a custom buildfarm module that
>>> exercises the pg_upgrade test from every supported version to the
>>> latest stable and to master -- together with your proposed case that
>>> leaves a toastless table around for pg_upgrade to handle.
>> That would help greatly with pg_dump test coverage as well.  One of the
>> problems of trying to get good LOC coverage of pg_dump is that a *lot*
>> of the code is version-specific...
>
> I have a module that does it, although it's not really stable enough.
> But it's a big start.  See
> <https://github.com/PGBuildFarm/client-code/blob/master/PGBuild/Modules/TestUpgradeXversion.pm>

Incidentally, just as a warning for anyone trying, this uses up quite a
lot of disk space.  You would need several GB available.

cheers

andrew
On 05/03/2016 01:33 PM, Andrew Dunstan wrote:
> On 05/03/2016 01:28 PM, Andrew Dunstan wrote:
>> On 05/03/2016 01:21 PM, Stephen Frost wrote:
>>> * Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
>>>> Tom Lane wrote:
>>>>
>>>>> More generally, though, I wonder how we can have some test coverage
>>>>> on such cases going forward.  Is the patch below too ugly to commit
>>>>> permanently, and if so, what other idea can you suggest?
>>>> I suggest a buildfarm animal running a custom buildfarm module that
>>>> exercises the pg_upgrade test from every supported version to the
>>>> latest stable and to master -- together with your proposed case that
>>>> leaves a toastless table around for pg_upgrade to handle.
>>> That would help greatly with pg_dump test coverage as well.  One of the
>>> problems of trying to get good LOC coverage of pg_dump is that a *lot*
>>> of the code is version-specific...
>>
>> I have a module that does it, although it's not really stable enough.
>> But it's a big start.  See
>> <https://github.com/PGBuildFarm/client-code/blob/master/PGBuild/Modules/TestUpgradeXversion.pm>
>
> Incidentally, just as a warning for anyone trying, this uses up quite a
> lot of disk space.  You would need several GB available.

And if this is of any use, here are the dump differences from every live
version to git tip, as of this morning.

cheers

andrew
Stephen Frost wrote:
> One other point is that pg_dump goes quite a bit farther back than just
> what we currently support (or at least, it tries to).  I think that,
> generally, that's a good thing, but it does mean we have a lot of cases
> that don't get tested nearly as much...
>
> I was able to get back to 7.4 up and running without too much trouble,
> but even that doesn't cover all the cases we have in pg_dump.  I'm not
> sure if we want to define a "we will support pg_dump back to X" cut-off
> or if we want to try and get older versions to run on modern systems,
> but it's definitely worth pointing out that we're trying to support much
> farther back than what is currently supported in pg_dump today.

Yeah.  Trying to compile old stuff with current tools (Debian jessie):

7.0's configure does not recognize my system:

    checking host system type... Invalid configuration `x86_64-unknown-linux-gnu': machine `x86_64-unknown' not recognized

7.1's configure fails for accept() detection:

    checking types of arguments for accept()... configure: error: could not determine argument types

7.2's configure works, but apparently it failed to find flex:

    make[3]: Entering directory '/home/alvherre/Code/pgsql/source/throwaway/src/backend/bootstrap'
    *** ERROR: `flex' is missing on your system. It is needed to create the
    file `bootscanner.c'. You can either get flex from a GNU mirror site
    or download an official distribution of PostgreSQL, which contains
    pre-packaged flex output.
    ***
    Makefile:60: recipe for target 'bootscanner.c' failed
    make[3]: *** [bootscanner.c] Error 1

7.3 fails in ecpg preprocessor:

    make[4]: Entering directory '/home/alvherre/Code/pgsql/source/throwaway/src/interfaces/ecpg/preproc'
    make -C ../../../../src/port all
    make[5]: Entering directory '/home/alvherre/Code/pgsql/source/throwaway/src/port'
    make[5]: Nothing to be done for 'all'.
    make[5]: Leaving directory '/home/alvherre/Code/pgsql/source/throwaway/src/port'
    gcc -O2 -fno-strict-aliasing -Wall -Wmissing-prototypes -Wmissing-declarations -Wno-error -I./../include -I. -I../../../../src/include -DMAJOR_VERSION=2 -DMINOR_VERSION=10 -DPATCHLEVEL=0 -DINCLUDE_PATH=\"/usr/local/pgsql/include\" -c -o preproc.o preproc.c
    In file included from preproc.y:5571:0:
    pgc.c:170:18: error: conflicting types for ‘yyleng’
     extern yy_size_t yyleng;
                      ^
    In file included from preproc.y:7:0:
    extern.h:32:4: note: previous declaration of ‘yyleng’ was here
        yyleng;
        ^
    In file included from preproc.y:5571:0:
    pgc.c:304:11: error: conflicting types for ‘yyleng’
    yy_size_t yyleng;
              ^
    In file included from preproc.y:7:0:
    extern.h:32:4: note: previous declaration of ‘yyleng’ was here
        yyleng;
        ^
    In file included from preproc.y:5571:0:
    pgc.c:2723:16: warning: ‘input’ defined but not used [-Wunused-function]
     static int input (void)
                ^
    <builtin>: recipe for target 'preproc.o' failed
    make[4]: *** [preproc.o] Error 1

7.4 seems to work fine.

I suppose it should be fine to remove pg_dump's support for pre-7.3;
people wanting to upgrade from before 7.3 (if any) could use 9.6's
pg_dump as an intermediate jump.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi,

On 2016-05-03 12:07:51 -0400, Tom Lane wrote:
> I think possibly the easiest fix for this is to have pg_upgrade,
> instead of RESETting a nonexistent option, RESET something that's
> still considered to require AccessExclusiveLock.  "user_catalog_table"
> would work, looks like; though I'd want to annotate its entry in
> reloptions.c to warn people away from downgrading its lock level.

Alternatively we could just add a function for adding a toast table -
that seems less hacky and less likely to be broken in the future.

- Andres
Stephen Frost <sfrost@snowman.net> writes:
> One other point is that pg_dump goes quite a bit farther back than just
> what we currently support (or at least, it tries to).  I think that,
> generally, that's a good thing, but it does mean we have a lot of cases
> that don't get tested nearly as much...

Yeah.  I do periodically fire up servers back to 7.0 and see if pg_dump
can dump from them, but I don't pretend that that's a very thorough test.

> I was able to get back to 7.4 up and running without too much trouble,
> but even that doesn't cover all the cases we have in pg_dump.  I'm not
> sure if we want to define a "we will support pg_dump back to X" cut-off
> or if we want to try and get older versions to run on modern systems,
> but it's definitely worth pointing out that we're trying to support much
> farther back than what is currently supported in pg_dump today.

I've been thinking of proposing that it's time (not now, at this point,
but for 9.7) to rip out libpq's support for the V2 protocol as well as
pg_dump's support for pre-7.4 backends.  That's a quite significant
amount of what at this point is very poorly tested code.  And I doubt
that it would be productive to try to improve the test coverage rather
than just removing it.

There might be an argument for moving pg_dump's cutoff further than that,
but going to 7.3 or later is significant because it would allow removal
of the kluges for schema-less and dependency-less servers.  I suggested
7.4 because it'd comport with removal of V2 wire protocol support, and
because 7.4 is also our cutoff for describe support in psql.

I'm hesitant to move the cutoff really far, because we do still hear from
people running really old versions, and sooner or later those people will
want to upgrade.  It'd be good if they could use a modern pg_dump for the
purpose.

			regards, tom lane
* Andres Freund (andres@anarazel.de) wrote:
> On 2016-05-03 12:07:51 -0400, Tom Lane wrote:
> > I think possibly the easiest fix for this is to have pg_upgrade,
> > instead of RESETting a nonexistent option, RESET something that's
> > still considered to require AccessExclusiveLock.  "user_catalog_table"
> > would work, looks like; though I'd want to annotate its entry in
> > reloptions.c to warn people away from downgrading its lock level.
>
> Alternatively we could just add a function for adding a toast table -
> that seems less hacky and less likely to be broken in the future.

+1

Thanks!

Stephen
On 2016-05-03 13:47:14 -0400, Tom Lane wrote:
> I've been thinking of proposing that it's time (not now, at this point,
> but for 9.7) to rip out libpq's support for V2 protocol as well as
> pg_dump's support for pre-7.4 backends.

+1

> There might be an argument for moving pg_dump's cutoff further than that,
> but going to 7.3 or later is significant because it would allow removal of
> the kluges for schema-less and dependency-less servers.  I suggested 7.4
> because it'd comport with removal of V2 wire protocol support, and because
> 7.4 is also our cutoff for describe support in psql.

I think we can be a lot more aggressive moving the cutoff for psql than
for pg_dump; but that's more an argument for ripping out some old psql
code.

> I'm hesitant to move the cutoff really far, because we do still hear from
> people running really old versions, and sooner or later those people will
> want to upgrade.  It'd be good if they could use a modern pg_dump for the
> purpose.

I think we should consider making the cutoff point for pg_dump somewhat
predictable, e.g. saying that we support five more versions than the
actually maintained ones.  The likelihood of breakage seems to increase
a good bit for older versions.

Andres
Andrew Dunstan wrote:

> And if this is of any use, here are the dump differences from every live
> version to git tip, as of this morning.

Interesting, thanks.  I wonder if some of these diffs could be reduced
further by using pg_dump -Fd instead of a single text dump -- then
internal ordering would not matter, and I see that a large part of these
diffs is where GRANTs appear.  (I don't think it's a problem to use a
newer pg_dump to dump the older databases that don't support -Fd, for
this purpose.)

How would you recommend to run this in the coverage reporting machine?
Currently it's just doing "make check-world".  Could we add a buildfarm
script to run, standalone?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
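As a rough illustration of why ordering dominates such diffs, here is a
hedged toy sketch (file names and contents are invented) of normalizing
two plain-text dumps that differ only in statement order before
comparing them; using pg_dump -Fd would sidestep the ordering problem at
the source instead.

```shell
# Toy normalization of two plain-text dumps whose only difference is the
# order of their GRANT statements: strip comments/blank lines, then sort.
normalize() { grep -Ev '^(--|$)' "$1" | sort; }

cat > old_dump.sql <<'EOF'
GRANT SELECT ON t1 TO alice;
GRANT SELECT ON t1 TO bob;
EOF

cat > new_dump.sql <<'EOF'
GRANT SELECT ON t1 TO bob;
GRANT SELECT ON t1 TO alice;
EOF

normalize old_dump.sql > old.norm
normalize new_dump.sql > new.norm

# The raw files differ, but the normalized forms are identical.
if diff -q old.norm new.norm >/dev/null; then
  echo "dumps match after normalization"
fi
```

Sorting whole statements like this only works for single-line statements,
so it is a crude stand-in for real TOC-aware comparison.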
Andres Freund <andres@anarazel.de> writes:
> On 2016-05-03 12:07:51 -0400, Tom Lane wrote:
>> I think possibly the easiest fix for this is to have pg_upgrade,
>> instead of RESETting a nonexistent option, RESET something that's
>> still considered to require AccessExclusiveLock.  "user_catalog_table"
>> would work, looks like; though I'd want to annotate its entry in
>> reloptions.c to warn people away from downgrading its lock level.

> Alternatively we could just add a function for adding a toast table -
> that seems less hacky and less likely to be broken in the future.

We used to have an explicit "ALTER TABLE CREATE TOAST TABLE", IIRC, but
we got rid of it.  In any case, forcible creation is not what we're after
here; we just want to create it if needs_toast_table() thinks we need
one.

I'm fine with it being a hack, as long as we have test coverage so we
notice when somebody breaks the hack.

			regards, tom lane
On 05/03/2016 01:58 PM, Alvaro Herrera wrote:
> Andrew Dunstan wrote:
>
>> And if this is of any use, here are the dump differences from every live
>> version to git tip, as of this morning.
> Interesting, thanks.  I wonder if some of these diffs could be reduced
> further by using pg_dump -Fd instead of a single text dump -- then
> internal ordering would not matter, and I see that a large part of these
> diffs is where GRANTs appear.  (I don't think it's a problem to use a
> newer pg_dump to dump the older databases that don't support -Fd, for
> this purpose.)
>
> How would you recommend to run this in the coverage reporting machine?
> Currently it's just doing "make check-world".  Could we add a buildfarm
> script to run, standalone?

Well, to run it you just run a buildfarm animal, possibly not even
registered, with the module enabled.  The module is actually in every
buildfarm release, and has been for some time, but it's not enabled.
Right now even if it runs it doesn't report anything to the server; it
just outputs success/failure lines to stdout.

cheers

andrew
I wrote:
> I haven't tried to construct a pre-9.1 database that would trigger
> this, but you can make it happen by applying the attached patch
> to create a toast-table-less table in the regression tests,
> and then doing "make check" in src/bin/pg_upgrade.  You get this:
> ...
> Restoring database schemas in the new cluster               ok
> Creating newly-required TOAST tables                        SQL command failed
> ALTER TABLE "public"."i_once_had_a_toast_table" RESET (binary_upgrade_dummy_option);
> ERROR:  AccessExclusiveLock required to add toast table.
> Failure, exiting

> I think possibly the easiest fix for this is to have pg_upgrade,
> instead of RESETting a nonexistent option, RESET something that's
> still considered to require AccessExclusiveLock.  "user_catalog_table"
> would work, looks like; though I'd want to annotate its entry in
> reloptions.c to warn people away from downgrading its lock level.

I tried fixing it like that.  The alternate RESET target had behaved as
expected when I'd tested by hand, but in pg_upgrade it still fails, only
now with

    Creating newly-required TOAST tables                        SQL command failed
    ALTER TABLE "public"."i_need_a_toast_table" RESET (user_catalog_table);
    ERROR:  pg_type OID value not set when in binary upgrade mode

This implies that there was some totally other patch, probably quite
pg_upgrade-specific, that broke this case independently of the
lock-downgrade change.

My conclusion is that we probably do need a specific pg_upgrade support
function to handle the case, rather than trying to sneak it through via
ALTER TABLE, which means that we won't be able to back-patch a fix.

I have no more time to work on this, but I think it needs to be fixed,
and I definitely think we had better put in test coverage when we do fix
it.  Attached is a proposed patch that adds regression test coverage for
this and a related case (and triggers the failures I've been complaining
of).

			regards, tom lane

diff --git a/src/test/regress/expected/indirect_toast.out b/src/test/regress/expected/indirect_toast.out
index 4f4bf41..3ed0189 100644
*** a/src/test/regress/expected/indirect_toast.out
--- b/src/test/regress/expected/indirect_toast.out
*************** SELECT substring(toasttest::text, 1, 200
*** 149,151 ****
--- 149,177 ----
  DROP TABLE toasttest;
  DROP FUNCTION update_using_indirect();
+ --
+ -- Create a couple of tables that have unusual TOAST situations, and leave
+ -- them around so that they'll be in the final regression database state.
+ -- This enables testing of these scenarios for pg_upgrade.
+ --
+ -- Table that has a TOAST table, but doesn't really need it.
+ create table i_have_useless_toast_table(f1 int, f2 text);
+ insert into i_have_useless_toast_table values(1, 'foo');
+ alter table i_have_useless_toast_table drop column f2;
+ -- Table that needs a TOAST table and has not got one.  This is uglier...
+ -- we can't actually remove the TOAST table, only unlink it from parent.
+ -- But leaving an orphan TOAST table is good for testing pg_upgrade, anyway.
+ create table i_need_a_toast_table(f1 int, f2 text);
+ insert into i_need_a_toast_table values(1, 'foo');
+ update pg_class set reltoastrelid = 0
+ where relname = 'i_need_a_toast_table';
+ SELECT relname, reltoastrelid <> 0 AS has_toast_table
+ FROM pg_class
+ WHERE oid::regclass IN ('i_have_useless_toast_table', 'i_need_a_toast_table')
+ ORDER BY 1;
+           relname           | has_toast_table 
+ ----------------------------+-----------------
+  i_have_useless_toast_table | t
+  i_need_a_toast_table       | f
+ (2 rows)
+ 
diff --git a/src/test/regress/sql/indirect_toast.sql b/src/test/regress/sql/indirect_toast.sql
index d502480..15640fc 100644
*** a/src/test/regress/sql/indirect_toast.sql
--- b/src/test/regress/sql/indirect_toast.sql
*************** SELECT substring(toasttest::text, 1, 200
*** 59,61 ****
--- 59,85 ----
  DROP TABLE toasttest;
  DROP FUNCTION update_using_indirect();
+ 
+ --
+ -- Create a couple of tables that have unusual TOAST situations, and leave
+ -- them around so that they'll be in the final regression database state.
+ -- This enables testing of these scenarios for pg_upgrade.
+ --
+ 
+ -- Table that has a TOAST table, but doesn't really need it.
+ create table i_have_useless_toast_table(f1 int, f2 text);
+ insert into i_have_useless_toast_table values(1, 'foo');
+ alter table i_have_useless_toast_table drop column f2;
+ 
+ -- Table that needs a TOAST table and has not got one.  This is uglier...
+ -- we can't actually remove the TOAST table, only unlink it from parent.
+ -- But leaving an orphan TOAST table is good for testing pg_upgrade, anyway.
+ create table i_need_a_toast_table(f1 int, f2 text);
+ insert into i_need_a_toast_table values(1, 'foo');
+ update pg_class set reltoastrelid = 0
+ where relname = 'i_need_a_toast_table';
+ 
+ SELECT relname, reltoastrelid <> 0 AS has_toast_table
+ FROM pg_class
+ WHERE oid::regclass IN ('i_have_useless_toast_table', 'i_need_a_toast_table')
+ ORDER BY 1;
I wrote:
> I have no more time to work on this, but I think it needs to be fixed,
> and I definitely think we had better put in test coverage when we do
> fix it.

Actually, there is a really easy fix we could put in, which is to decide
that optionally_create_toast_tables() is useless and get rid of it.

Point 1: if a table did not have a TOAST table in the old database, any
decision that it needs one in the new database must be a very
close-to-the-edge situation; certainly the 9.2 change we had in this
area was a matter of rounding things off differently.
needs_toast_table()'s threshold for max tuple length is only a quarter
page, so there's a huge amount of daylight between where we'd choose to
create a toast table and where users would actually see failures from
not having one.  It's pretty hard to believe that cross-version
differences in the do-we-need-a-toast-table heuristic would exceed that.

Point 2: the current code is broken, and will cause a pg_upgrade failure
if the situation of concern occurs.  That's certainly much worse than
not adding a marginally-useful toast table.

Point 3: in view of point 2, the lack of field complaints says that this
situation doesn't actually happen in the field.

Barring complaints, I'll fix this bug by removing that code altogether.

			regards, tom lane
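The quarter-page point can be made concrete with a little arithmetic.
The numbers below assume the default 8 kB block size; the real constants
in the server subtract page and tuple header overhead, so the actual
threshold is a bit under a quarter page.

```shell
# Back-of-the-envelope version of the needs_toast_table() headroom argument,
# assuming an 8 kB page with no header overhead (a simplification).
BLCKSZ=8192
TOAST_THRESHOLD=$((BLCKSZ / 4))   # toasting heuristic kicks in around here
echo "toast-creation threshold: ~$TOAST_THRESHOLD bytes"
# A heap tuple can never span pages, so failures only start near BLCKSZ:
echo "daylight before any hard limit: ~$((BLCKSZ - TOAST_THRESHOLD)) bytes"
```

Any cross-version drift in the heuristic would have to eat several
kilobytes of that daylight before a user could actually hit a
row-too-big failure, which supports Point 1.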
On 3 May 2016 at 18:07, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Or at least, it did until Simon decided that ALTER TABLE RESET
> doesn't require AccessExclusiveLock.

On reflection, this still seems like a good idea.

> Now you get a failure.

The failure condition is an exception to that.

> I haven't tried to construct a pre-9.1 database that would trigger
> this, but you can make it happen by applying the attached patch
> to create a toast-table-less table in the regression tests,
> and then doing "make check" in src/bin/pg_upgrade.  You get this:
> ...
> Restoring database schemas in the new cluster               ok
> Creating newly-required TOAST tables                        SQL command failed
> ALTER TABLE "public"."i_once_had_a_toast_table" RESET (binary_upgrade_dummy_option);
> ERROR:  AccessExclusiveLock required to add toast table.
> Failure, exiting

It appears that pg_upgrade is depending upon an undocumented side-effect
of ALTER TABLE RESET.  I would say this side-effect should not exist,
which IIUC is the same conclusion as in your latest post.

If pg_upgrade needs this, we should implement a specific function that
does what pg_upgrade needs.  That way we can isolate the requirement for
an AccessExclusiveLock to the place that needs it: pg_upgrade.  That
will also make it less fragile in the future.  I don't think that needs
a specific command, just a function.

I accept that it is my bug and that I should fix it.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Simon Riggs <simon@2ndquadrant.com> writes:
> On 3 May 2016 at 18:07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Or at least, it did until Simon decided that ALTER TABLE RESET
>> doesn't require AccessExclusiveLock.

> On reflection, this still seems like a good idea.

Yes, what pg_upgrade was doing was clearly a hack, and not a very nice
one.

> I accept that it is my bug and should fix it.

It's moot at this point, see 1a2c17f8e.

			regards, tom lane
On 7 May 2016 at 16:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Simon Riggs <simon@2ndquadrant.com> writes:
>> On 3 May 2016 at 18:07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Or at least, it did until Simon decided that ALTER TABLE RESET
>>> doesn't require AccessExclusiveLock.
>
>> On reflection, this still seems like a good idea.
>
> Yes, what pg_upgrade was doing was clearly a hack, and not a very nice
> one.
>
>> I accept that it is my bug and should fix it.
>
> It's moot at this point, see 1a2c17f8e.

OK, sorry for the noise and thanks for the fix.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
For what it's worth, for my historical sort benchmarks I got Postgres to
build right back to 6.5 using modern tools.  I have patches if anyone
wants them.  Pre-7.3 doesn't actually run because we didn't support
64-bit architectures before Tom did the V1 api (there was a set of Alpha
patches floating around but they don't seem sufficient for x86_64).  But
I suspect if I built it for x86 32-bit that would clear the immediate
problem.

I was considering proposing backpatching some minimal patches to get old
unsupported branches to at least build with modern tools and run.  Just
to make it easy for people to test historical behaviour, and I suppose
pg_dump or other client testing would be a good use case too.  I was
also going to propose turning off all warnings on these unsupported back
branches while I was at it.

One other lesson I learned doing this was that it was a pain referring
to individual git checkouts because we don't have a tag for the point
where we branched the new development.  So all my git-describes were
ending up with things like REL9_2_BETA2-619-gff6c78c or
REL9_3_BETA1-925-g6668ad1.  You just had to know that REL9_3_BETA1-xxx
was probably a revision sometime during PG 9.5 development since BETA1
was probably where we forked development, whereas 9.3 probably forked
off 9_2_BETA2.  (And some were things like REL7_4_RC1-1513-g4525418
instead.)

I would suggest adding tags for each version on the first development
revision, so that something like REL9_5_DEV-nnn would mean the nnnth
commit on the 9.5 development branch.

(What I actually did instead myself was use

    git describe --tags --contains $i --match 'REL[0-9]_[0-9]_[0-9]'

which gave me things like REL9_5_0~1334, which means the 1334th commit
*before* REL9_5_0.  That was also confusing but at least consistent.)
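The effect of the proposed per-branch tag is easy to demonstrate in a
throwaway repository (REL9_5_DEV is the proposed tag name from the
message above, not a tag that exists in the real repo; all commits here
are empty placeholders).

```shell
# Create a scratch repo, tag the first commit of a "development branch",
# then add a few commits and see what git-describe reports.
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "start 9.5 development"
git tag REL9_5_DEV
for i in 1 2 3; do
  git -c user.name=demo -c user.email=demo@example.com \
      commit -q --allow-empty -m "commit $i"
done
git describe --tags    # prints REL9_5_DEV-3-g<abbrev>: 3 commits past the tag
```

With such a tag in place, the "-nnn" suffix directly counts commits on
the development branch, which is exactly the readability Greg is asking
for.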
On Tue, May 3, 2016 at 12:07:51PM -0400, Tom Lane wrote:
> I think possibly the easiest fix for this is to have pg_upgrade,
> instead of RESETting a nonexistent option, RESET something that's
> still considered to require AccessExclusiveLock. "user_catalog_table"
> would work, looks like; though I'd want to annotate its entry in
> reloptions.c to warn people away from downgrading its lock level.
>
> More generally, though, I wonder how we can have some test coverage
> on such cases going forward. Is the patch below too ugly to commit
> permanently, and if so, what other idea can you suggest?

FYI, I only test _supported_ version combinations for pg_upgrade, i.e. I don't test pg_upgrade _from_ unsupported versions, though I can see why maybe I should.

-- 
Bruce Momjian <bruce@momjian.us>  http://momjian.us
EnterpriseDB  http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
On Fri, May 6, 2016 at 03:32:23PM -0400, Tom Lane wrote:
> > I think possibly the easiest fix for this is to have pg_upgrade,
> > instead of RESETting a nonexistent option, RESET something that's
> > still considered to require AccessExclusiveLock. "user_catalog_table"
> > would work, looks like; though I'd want to annotate its entry in
> > reloptions.c to warn people away from downgrading its lock level.
>
> I tried fixing it like that. The alternate RESET target had behaved as
> expected when I'd tested by hand, but in pg_upgrade it still fails,
> only now with
>
>     Creating newly-required TOAST tables    SQL command failed
>     ALTER TABLE "public"."i_need_a_toast_table" RESET (user_catalog_table);
>     ERROR:  pg_type OID value not set when in binary upgrade mode

I think this means that the ALTER TABLE RESET is adding or potentially adding a pg_type row, and no one called binary_upgrade_set_next_pg_type_oid().

-- 
Bruce Momjian <bruce@momjian.us>  http://momjian.us
EnterpriseDB  http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
On 5/3/16 1:25 PM, Alvaro Herrera wrote:
> If we can put together a script that runs test.sh for various versions
> and then verifies the runs, we could use it in both buildfarm and
> coverage.

Not that that would be useless, but note that the value in this case (and most others) comes from having a candidate object in the database before upgrade that exercises the particular problem, mostly independent of what version you upgrade from and to. So far the way to do that is to leave "junk" in the regression test database, but that's clearly a bit silly.

I think the way forward is to create a TAP test suite for pg_upgrade that specifically exercises a lot of scenarios with small purpose-built test databases.

Then, the problem of having to compare dump output across versions also goes away more easily.

-- 
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, May 11, 2016 at 09:40:09AM -0400, Peter Eisentraut wrote:
> Not that that would be useless, but note that the value in this case (and
> most others) comes from having a candidate object in the database before
> upgrade that exercises the particular problem, mostly independent of what
> version you upgrade from and to. So far the way to do that is to leave
> "junk" in the regression test database, but that's clearly a bit silly.
>
> I think the way forward is to create a TAP test suite for pg_upgrade that
> specifically exercises a lot of scenarios with small purpose-built test
> databases.
>
> Then, the problem of having to compare dump output across versions also goes
> away more easily.

I do have some small tests, e.g. for tablespaces. I am attaching the SQL script, if that is helpful.

-- 
Bruce Momjian <bruce@momjian.us>  http://momjian.us
EnterpriseDB  http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
Peter Eisentraut wrote:
> On 5/3/16 1:25 PM, Alvaro Herrera wrote:
> > If we can put together a script that runs test.sh for various versions
> > and then verifies the runs, we could use it in both buildfarm and
> > coverage.
>
> Not that that would be useless, but note that the value in this case (and
> most others) comes from having a candidate object in the database before
> upgrade that exercises the particular problem, mostly independent of what
> version you upgrade from and to. So far the way to do that is to leave
> "junk" in the regression test database, but that's clearly a bit silly.

True. We have quite a few places in the standard regression tests that leave junk behind purposefully for this reason.

> I think the way forward is to create a TAP test suite for pg_upgrade that
> specifically exercises a lot of scenarios with small purpose-built test
> databases.

That works for me.

> Then, the problem of having to compare dump output across versions also goes
> away more easily.

Not sure why, but if you think it does, then it sounds good. Andrew's current approach of counting lines in the diff seems brittle and not entirely trustworthy.

-- 
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services