preserving db/ts/relfilenode OIDs across pg_upgrade (was Re: storing an explicit nonce) - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | preserving db/ts/relfilenode OIDs across pg_upgrade (was Re: storing an explicit nonce) |
Date | |
Msg-id | CA+TgmoYgTwYcUmB=e8+hRHOFA0kkS6Kde85+UNdon6q7bt1niQ@mail.gmail.com Whole thread Raw |
In response to | Re: storing an explicit nonce (Shruthi Gowda <gowdashru@gmail.com>) |
Responses |
Re: preserving db/ts/relfilenode OIDs across pg_upgrade (was Re: storing an explicit nonce)
Re: preserving db/ts/relfilenode OIDs across pg_upgrade (was Re: storing an explicit nonce) Re: preserving db/ts/relfilenode OIDs across pg_upgrade (was Re: storing an explicit nonce) |
List | pgsql-hackers |
On Wed, Aug 11, 2021 at 3:41 AM Shruthi Gowda <gowdashru@gmail.com> wrote: > I have fixed all the issues and now the patch is working as expected. Hi, I'm changing the subject line since the patch does something which was discussed on that thread but isn't really related to the old email subject. In general, I think this patch is uncontroversial and in reasonably good shape. However, there's one part that I'm not too sure about. If Tom Lane happens to be paying attention to this thread, I think his feedback would be particularly useful, since he knows a lot about the inner workings of pg_dump. Opinions from anybody else would be great, too. Anyway, here's the hunk that worries me: + + /* + * Need a separate entry, otherwise the command will be run in the + * same transaction as the CREATE DATABASE command, which is not + * allowed. + */ + ArchiveEntry(fout, + dbCatId, /* catalog ID */ + dbDumpId, /* dump ID */ + ARCHIVE_OPTS(.tag = datname, + .owner = dba, + .description = "SET_DB_OID", + .section = SECTION_PRE_DATA, + .createStmt = setDBIdQry->data, + .dropStmt = NULL)); + To me, adding a separate TOC entry for a thing that is not really a separate object seems like a scary hack that might come back to bite us. Unfortunately, I don't know enough about pg_dump to say exactly how it might come back to bite us, which leaves wide open the possibility that I am completely wrong.... I just think it's the intention that archive entries correspond to actual objects in the database, not commands that we want executed in some particular order. If that criticism is indeed correct, then my proposal would be to instead add a WITH OID = nnn option to CREATE DATABASE and allow it to be used only in binary upgrade mode. That has the disadvantage of being inconsistent with the way that we preserve OIDs everywhere else, but the only other alternatives are (1) do something like the above, (2) remove the requirement that CREATE DATABASE run in its own transaction, and (3) give up. (2) sounds hard and (3) is unappealing. The rest of this email will be detailed review comments on the patch as presented, and thus probably only interesting to someone actually working on the patch. Feel free to skip if that's not you. - I suggest splitting the patch into one portion that deals with database OID and another portion that deals with tablespace OID and relfilenode OID, or maybe splitting it all the way into three separate patches, one for each. This could allow the uncontroversial parts to get committed first while we're wondering what to do about the problem described above. - There are two places in the patch, one in dumpDatabase() and one in generate_old_dump() where blank lines are removed with no other changes. Please avoid whitespace-only hunks. - If possible, please try to pgindent the new code. It's pretty good what you did, but e.g. the declaration of binary_upgrade_next_pg_tablespace_oid probably has less whitespace than pgindent is going to want. - The comments in dumpDatabase() claim that "postgres" and "template1" are handled specially in some way, but there seems to be no code that matches those comments. - heap_create()'s logic around setting create_storage looks slightly redundant. I'm not positive what would be better, but ... suppose you just took the part that's currently gated by if (!IsBinaryUpgrade) and did it unconditionally. Then put if (IsBinaryUpgrade) around the else clause, but delete the last bit from there that sets create_storage. Maybe we'd still want a comment saying that it's intentional that create_storage = true even though it will be overwritten later, but then, I think, we wouldn't need to set create_storage in two different places. Maybe I'm wrong. - If we're not going to do that, then I think you should swap the if and else clauses and reverse the sense of the test. In createdb(), CreateTableSpace(), and a bunch of existing places, we do if (IsBinaryUpgrade) { ... } else { ... } so I don't think it makes sense for this one to instead do if (!IsBinaryUpgrade) { ... } else { ... }. - I'm not sure that I'd bother renaming binary_upgrade_set_pg_class_oids_and_relfilenodes(). It's such a long name, and a relfilenode is kind of an OID, so the current name isn't even really wrong. I'd probably drop the header comment too, since it seems rather obvious. But both of these things are judgement calls. - Inside that function, there is a comment that says "Indexes cannot have toast tables, so we need not make this probe in the index code path." However, you have moved the code from someplace where it didn't happen for indexes to someplace where it happens for both tables and indexes. Therefore the comment, which was true when the code was where it was before, is now false. So you need to update it. - It is not clear to me why pg_upgrade's Makefile needs to be changed to include -DFRONTEND in CPPFLAGS. All of the .c files in this directory include postgres_fe.h rather than postgres.h, and that file has #define FRONTEND 1. Moreover, there are no actual code changes in this directory, so why should the Makefile need any change? - A couple of comment changes - and the commit message - mention data encryption, but that's not a feature that this patch implements, nor are we committed to adding it in the immediate future (or ever, really). So I think those places should be revised to say that we do this because we want the filenames to match between the old and new clusters, and leave the reasons why that might be a good thing up to the reader's imagination. Thanks, -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date: