Thread: BUG #18618: pg_upgrade from 14 to 15+ fails for unlogged table with identity column
BUG #18618: pg_upgrade from 14 to 15+ fails for unlogged table with identity column
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 18618 Logged by: Anthony Hsu Email address: erwaman@gmail.com PostgreSQL version: 15.7 Operating system: Linux Description: If I create an unlogged table with an identity column as follows CREATE UNLOGGED TABLE test (a INT GENERATED BY DEFAULT AS IDENTITY); in PG14 and then run pg_upgrade to PG15+, it fails due to ``` pg_restore: error: could not execute query: ERROR: unexpected request for new relfilenode in binary upgrade mode Command was: -- For binary upgrade, must preserve pg_class oids and relfilenodes SELECT pg_catalog.binary_upgrade_set_next_heap_pg_class_oid('16384'::pg_catalog.oid); SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('16384'::pg_catalog.oid); ALTER TABLE "public"."test" ALTER COLUMN "a" ADD GENERATED BY DEFAULT AS IDENTITY ( SEQUENCE NAME "public"."test_a_seq" START WITH 1 INCREMENT BY 1 NO MINVALUE NO MAXVALUE CACHE 1 ); ALTER SEQUENCE "public"."test_a_seq" SET LOGGED; ``` If I use pg_upgrade from 15.6, it succeeds; if I use pg_upgrade from 15.7 or 15.8, it fails with the above. Upon further testing, I found that if I revert https://github.com/postgres/postgres/commit/d17a3a4c6a34f61a3d4d9faa7a70c14d8d0c0ffb, then pg_upgrade succeeds. This bug also does not happen if I try upgrading from 15 -> 15 (same version), as the "ALTER SEQUENCE "public"."test_a_seq" SET LOGGED;" line is not generated by pg_dump in this case.
Re: BUG #18618: pg_upgrade from 14 to 15+ fails for unlogged table with identity column
From
Tom Lane
Date:
PG Bug reporting form <noreply@postgresql.org> writes: > If I create an unlogged table with an identity column as follows > CREATE UNLOGGED TABLE test (a INT GENERATED BY DEFAULT AS IDENTITY); > in PG14 and then run pg_upgrade to PG15+, it fails due to > ``` > pg_restore: error: could not execute query: ERROR: unexpected request for > new relfilenode in binary upgrade mode > Command was: > -- For binary upgrade, must preserve pg_class oids and relfilenodes > SELECT > pg_catalog.binary_upgrade_set_next_heap_pg_class_oid('16384'::pg_catalog.oid); > SELECT > pg_catalog.binary_upgrade_set_next_heap_relfilenode('16384'::pg_catalog.oid); > ALTER TABLE "public"."test" ALTER COLUMN "a" ADD GENERATED BY DEFAULT AS > IDENTITY ( > SEQUENCE NAME "public"."test_a_seq" > START WITH 1 > INCREMENT BY 1 > NO MINVALUE > NO MAXVALUE > CACHE 1 > ); > ALTER SEQUENCE "public"."test_a_seq" SET LOGGED; > ``` Yeah. pg_dump is trying to recreate the state of affairs in the source database, where the table is unlogged but its owned sequence is logged (because v14 didn't have unlogged sequences). And ALTER SEQUENCE SET LOGGED works by assigning the sequence a new relfilenode and then writing its data into that. That doesn't work in binary-upgrade mode, and if it did work it'd be the wrong thing because we have to preserve the sequence's relfilenode. However, in binary-upgrade mode there seems little need to be concerned about whether we produce a nice WAL trace of the SET LOGGED operation. I think we can just summarily change the sequence's relpersistence field and be done with it, more or less as attached. Anybody see a flaw in that reasoning? The only alternative I can see is to extend the ALTER TABLE ADD GENERATED syntax to allow correct specification of the sequence's LOGGED/UNLOGGED status to begin with. While cleaner, that would be a lot more work and probably result in compatibility problems for normal uses of pg_dump (where the output might get loaded into a server version that lacks the syntax extension). regards, tom lane diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c index b37fd688d3..73ab609153 100644 --- a/src/backend/commands/sequence.c +++ b/src/backend/commands/sequence.c @@ -545,6 +545,46 @@ SequenceChangePersistence(Oid relid, char newrelpersistence) Buffer buf; HeapTupleData seqdatatuple; + /* + * In binary-upgrade mode, we cannot assign a new relfilenumber to the + * sequence, so the normal code path below doesn't work. However, we + * don't really need to go through all the pushups of assigning a new + * relfilenumber in this mode, since there's no concern for concurrent + * operations, rollback-ability, or replication. It seems sufficient to + * just summarily change the sequence's relpersistence value. + */ + if (IsBinaryUpgrade) + { + Relation pg_class; + HeapTuple tuple; + Form_pg_class classform; + + /* Get a writable copy of the pg_class tuple for the sequence. */ + pg_class = table_open(RelationRelationId, RowExclusiveLock); + + tuple = SearchSysCacheCopy1(RELOID, ObjectIdGetDatum(relid)); + if (!HeapTupleIsValid(tuple)) + elog(ERROR, "could not find tuple for relation %u", relid); + classform = (Form_pg_class) GETSTRUCT(tuple); + + /* Apply the update. */ + classform->relpersistence = newrelpersistence; + + CatalogTupleUpdate(pg_class, &tuple->t_self, tuple); + + heap_freetuple(tuple); + + table_close(pg_class, RowExclusiveLock); + + /* + * Make the pg_class row change visible. This will cause the + * sequence's relcache entry to get updated, too. + */ + CommandCounterIncrement(); + + return; + } + /* * ALTER SEQUENCE acquires this lock earlier. If we're processing an * owned sequence for ALTER TABLE, lock now. Without the lock, we'd
Re: BUG #18618: pg_upgrade from 14 to 15+ fails for unlogged table with identity column
From
Tom Lane
Date:
I wrote: > However, in binary-upgrade mode there seems little need to be > concerned about whether we produce a nice WAL trace of the SET LOGGED > operation. I think we can just summarily change the sequence's > relpersistence field and be done with it, more or less as attached. Nope, after further testing that doesn't work at all. In the first place, the SET UNLOGGED case falls foul of some assertions in the relcache code having to do with WAL-skipping; and that code looks complex and fragile enough that I don't want to try to poke holes in it. In the second place, I'd forgotten the need to add or remove an init fork. > The only alternative I can see is to extend the ALTER TABLE ADD > GENERATED syntax to allow correct specification of the sequence's > LOGGED/UNLOGGED status to begin with. While cleaner, that would be > a lot more work and probably result in compatibility problems for > normal uses of pg_dump (where the output might get loaded into a > server version that lacks the syntax extension). So we have to do it like that, and it seems not that bad, especially if we follow the lead of the SEQUENCE NAME option and don't bother to document this stuff. (I don't think that's a great precedent, but I didn't change it here.) Looking at this draft patch a second time, one way that we could reduce the surface area for compatibility problems is to emit these new ADD GENERATED options only in binary-upgrade mode, and continue to use ALTER SEQUENCE SET in standard dumps. But I kind of think that it's not worth the complication: sequences that don't match their table's persistence must be quite rare, else we'd have heard about this problem long ago. regards, tom lane diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c index b37fd688d3..d49d53f35a 100644 --- a/src/backend/commands/sequence.c +++ b/src/backend/commands/sequence.c @@ -1347,7 +1347,9 @@ init_params(ParseState *pstate, List *options, bool for_identity, /* * The parser allows this, but it is only for identity columns, in * which case it is filtered out in parse_utilcmd.c. We only get - * here if someone puts it into a CREATE SEQUENCE. + * here if someone puts it into a CREATE SEQUENCE. (The same is + * true for the equally-undocumented LOGGED and UNLOGGED options, + * but for those, the default error below seems sufficient.) */ ereport(ERROR, (errcode(ERRCODE_SYNTAX_ERROR), diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index 84cef57a70..4fcab9e3f1 100644 --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -4908,6 +4908,11 @@ SeqOptElem: AS SimpleTypename { $$ = makeDefElem("increment", (Node *) $3, @1); } + | LOGGED + { + /* not documented, only used by pg_dump */ + $$ = makeDefElem("logged", NULL, @1); + } | MAXVALUE NumericOnly { $$ = makeDefElem("maxvalue", (Node *) $2, @1); @@ -4945,6 +4950,11 @@ SeqOptElem: AS SimpleTypename { $$ = makeDefElem("restart", (Node *) $3, @1); } + | UNLOGGED + { + /* not documented, only used by pg_dump */ + $$ = makeDefElem("unlogged", NULL, @1); + } ; opt_by: BY diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c index 79cad4ab30..2cae834c2c 100644 --- a/src/backend/parser/parse_utilcmd.c +++ b/src/backend/parser/parse_utilcmd.c @@ -365,30 +365,20 @@ generateSerialExtraStmts(CreateStmtContext *cxt, ColumnDef *column, { ListCell *option; DefElem *nameEl = NULL; + DefElem *loggedEl = NULL; Oid snamespaceid; char *snamespace; char *sname; CreateSeqStmt *seqstmt; AlterSeqStmt *altseqstmt; List *attnamelist; - int nameEl_idx = -1; /* Make a copy of this as we may end up modifying it in the code below */ seqoptions = list_copy(seqoptions); /* - * Determine namespace and name to use for the sequence. - * - * First, check if a sequence name was passed in as an option. This is - * used by pg_dump. Else, generate a name. - * - * Although we use ChooseRelationName, it's not guaranteed that the - * selected sequence name won't conflict; given sufficiently long field - * names, two different serial columns in the same table could be assigned - * the same sequence name, and we'd not notice since we aren't creating - * the sequence quite yet. In practice this seems quite unlikely to be a - * problem, especially since few people would need two serial columns in - * one table. + * Check for special (undocumented) options used by pg_dump, and remove + * them from the seqoptions list if found. */ foreach(option, seqoptions) { @@ -399,12 +389,24 @@ generateSerialExtraStmts(CreateStmtContext *cxt, ColumnDef *column, if (nameEl) errorConflictingDefElem(defel, cxt->pstate); nameEl = defel; - nameEl_idx = foreach_current_index(option); + seqoptions = foreach_delete_current(seqoptions, option); + } + else if (strcmp(defel->defname, "logged") == 0 || + strcmp(defel->defname, "unlogged") == 0) + { + if (loggedEl) + errorConflictingDefElem(defel, cxt->pstate); + loggedEl = defel; + seqoptions = foreach_delete_current(seqoptions, option); } } + /* + * Determine namespace and name to use for the sequence. + */ if (nameEl) { + /* Use specified name */ RangeVar *rv = makeRangeVarFromNameList(castNode(List, nameEl->arg)); snamespace = rv->schemaname; @@ -418,11 +420,20 @@ generateSerialExtraStmts(CreateStmtContext *cxt, ColumnDef *column, snamespace = get_namespace_name(snamespaceid); } sname = rv->relname; - /* Remove the SEQUENCE NAME item from seqoptions */ - seqoptions = list_delete_nth_cell(seqoptions, nameEl_idx); } else { + /* + * Generate a name. + * + * Although we use ChooseRelationName, it's not guaranteed that the + * selected sequence name won't conflict; given sufficiently long + * field names, two different serial columns in the same table could + * be assigned the same sequence name, and we'd not notice since we + * aren't creating the sequence quite yet. In practice this seems + * quite unlikely to be a problem, especially since few people would + * need two serial columns in one table. + */ if (cxt->rel) snamespaceid = RelationGetNamespace(cxt->rel); else @@ -452,13 +463,22 @@ generateSerialExtraStmts(CreateStmtContext *cxt, ColumnDef *column, seqstmt->sequence = makeRangeVar(snamespace, sname, -1); /* - * Copy the persistence of the table. For CREATE TABLE, we get the - * persistence from cxt->relation, which comes from the CreateStmt in - * progress. For ALTER TABLE, the parser won't set + * Determine the persistence of the sequence. If LOGGED or UNLOGGED was + * specified, use that; otherwise copy the persistence of the table. For + * CREATE TABLE, we get the persistence from cxt->relation, which comes + * from the CreateStmt in progress. For ALTER TABLE, the parser won't set * cxt->relation->relpersistence, but we have cxt->rel as the existing * table, so we copy the persistence from there. */ - seqstmt->sequence->relpersistence = cxt->rel ? cxt->rel->rd_rel->relpersistence : cxt->relation->relpersistence; + if (loggedEl) + { + if (strcmp(loggedEl->defname, "logged") == 0) + seqstmt->sequence->relpersistence = RELPERSISTENCE_PERMANENT; + else + seqstmt->sequence->relpersistence = RELPERSISTENCE_UNLOGGED; + } + else + seqstmt->sequence->relpersistence = cxt->rel ? cxt->rel->rd_rel->relpersistence : cxt->relation->relpersistence; seqstmt->options = seqoptions; diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c index 546e7e4ce1..d140d6067b 100644 --- a/src/bin/pg_dump/pg_dump.c +++ b/src/bin/pg_dump/pg_dump.c @@ -17556,6 +17556,15 @@ dumpSequence(Archive *fout, const TableInfo *tbinfo) appendPQExpBufferStr(query, "BY DEFAULT"); appendPQExpBuffer(query, " AS IDENTITY (\n SEQUENCE NAME %s\n", fmtQualifiedDumpable(tbinfo)); + + /* + * Emit persistence option only if it's different from the owning + * table's. This avoids using this new syntax unnecessarily. + */ + if (tbinfo->relpersistence != owning_tab->relpersistence) + appendPQExpBuffer(query, " %s\n", + tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ? + "UNLOGGED" : "LOGGED"); } else { @@ -17588,15 +17597,7 @@ dumpSequence(Archive *fout, const TableInfo *tbinfo) seq->cache, (seq->cycled ? "\n CYCLE" : "")); if (tbinfo->is_identity_sequence) - { appendPQExpBufferStr(query, "\n);\n"); - if (tbinfo->relpersistence != owning_tab->relpersistence) - appendPQExpBuffer(query, - "ALTER SEQUENCE %s SET %s;\n", - fmtQualifiedDumpable(tbinfo), - tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ? - "UNLOGGED" : "LOGGED"); - } else appendPQExpBufferStr(query, ";\n"); diff --git a/src/test/regress/expected/identity.out b/src/test/regress/expected/identity.out index 3d554fe327..f14bfccfb1 100644 --- a/src/test/regress/expected/identity.out +++ b/src/test/regress/expected/identity.out @@ -905,3 +905,19 @@ SELECT * FROM itest16; DROP TABLE itest15; DROP TABLE itest16; +-- For testing of pg_dump and pg_upgrade, leave behind some identity +-- sequences whose logged-ness doesn't match their owning table's. +CREATE TABLE identity_dump_logged (a INT GENERATED ALWAYS AS IDENTITY); +ALTER SEQUENCE identity_dump_logged_a_seq SET UNLOGGED; +CREATE UNLOGGED TABLE identity_dump_unlogged (a INT GENERATED ALWAYS AS IDENTITY); +ALTER SEQUENCE identity_dump_unlogged_a_seq SET LOGGED; +SELECT relname, relpersistence FROM pg_class + WHERE relname ~ '^identity_dump_' ORDER BY 1; + relname | relpersistence +------------------------------+---------------- + identity_dump_logged | p + identity_dump_logged_a_seq | u + identity_dump_unlogged | u + identity_dump_unlogged_a_seq | p +(4 rows) + diff --git a/src/test/regress/sql/identity.sql b/src/test/regress/sql/identity.sql index 84c43a19a3..cb0e05a2f1 100644 --- a/src/test/regress/sql/identity.sql +++ b/src/test/regress/sql/identity.sql @@ -528,3 +528,12 @@ SELECT * FROM itest15; SELECT * FROM itest16; DROP TABLE itest15; DROP TABLE itest16; + +-- For testing of pg_dump and pg_upgrade, leave behind some identity +-- sequences whose logged-ness doesn't match their owning table's. +CREATE TABLE identity_dump_logged (a INT GENERATED ALWAYS AS IDENTITY); +ALTER SEQUENCE identity_dump_logged_a_seq SET UNLOGGED; +CREATE UNLOGGED TABLE identity_dump_unlogged (a INT GENERATED ALWAYS AS IDENTITY); +ALTER SEQUENCE identity_dump_unlogged_a_seq SET LOGGED; +SELECT relname, relpersistence FROM pg_class + WHERE relname ~ '^identity_dump_' ORDER BY 1;
Re: BUG #18618: pg_upgrade from 14 to 15+ fails for unlogged table with identity column
From
Tom Lane
Date:
I wrote: >> The only alternative I can see is to extend the ALTER TABLE ADD >> GENERATED syntax to allow correct specification of the sequence's >> LOGGED/UNLOGGED status to begin with. While cleaner, that would be >> a lot more work and probably result in compatibility problems for >> normal uses of pg_dump (where the output might get loaded into a >> server version that lacks the syntax extension). > So we have to do it like that, and it seems not that bad, especially > if we follow the lead of the SEQUENCE NAME option and don't bother > to document this stuff. (I don't think that's a great precedent, > but I didn't change it here.) Pushed with a bit of further polishing, including adding that missing documentation. regards, tom lane