Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints |
Date | |
Msg-id | CA+TgmobqqvB_AotL2Bbi5xzF53wa7ERCSVy=o=xa2xWdoDGfsw@mail.gmail.com Whole thread Raw |
In response to | Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints (Dilip Kumar <dilipbalaut@gmail.com>) |
Responses |
Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
|
List | pgsql-hackers |
On Fri, Mar 11, 2022 at 5:21 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > Changes, 1) it take Robert's patch as first refactoring patch 2) > Rebase other new relmapper apis on top of that in 0002 3) Some code > refactoring in main patch 0005 and also one problem fix, earlier in > wal log method I have removed ForceSyncCommit(), but IMHO that is > equally valid whether we use file_copy or wal_log because in both > cases we are creating the disk files. 4) Support strategy in createdb > tool and add test case as part of 0006. I think there's something wrong with what this patch is doing with the XLOG records. It adds XLOG_DBASE_CREATEDIR, but the only new XLogInsert() calls in the patch are passing XLOG_DBASE_CREATE, and no existing references are adjusted. Similarly with xl_dbase_create_rec and xl_dbase_createdir_rec. Why would we introduce a new record type and not use it? Let's not call the functions for the different strategies CopyDatabase() and CopyDatabaseWithWal() but rather something that matches up to the strategy names e.g. create_database_using_wal_log() and create_database_using_file_copy(). There's something a little funny about the names wal_log and file_copy ... they're not quite parallel gramatically. But it's probably OK. The changes to createdb_failure_params make me a little nervous. I think we'd be in real trouble if we failed before completing both DropDatabaseBuffers() and ForgetDatabaseSyncRequests(). However, it looks to me like those are both intended to be no-fail operations, so I don't see an actual hazard here. But, hmm, what about on the recovery side? Suppose that we start copying the database block by block and then either (a) the standby is promoted before the copy is finished or (b) the copy fails. Now the standby has data in shared_buffers for a database that does not exist. If that's not bad, then why does createdb_failure_params need to DropDatabaseBuffers()? But I bet it is bad. I wonder if we should be using RelationCopyStorage() rather than this new function RelationCopyStorageUsingBuffer(). That would avoid having the buffers in shared_buffers, dodging the problem. But then we have a problem with checkpoint interlocking: we could begin replay from a checkpoint and find that the pages that were supposed to get copied prior to the checkpoint were actually not copied, because the checkpoint record could be written after we've logged a page being copied and before we actually write the page. Or, we could crash after writing a whole lot of pages and a checkpoint record, but before RelationCopyStorage() fsyncs the destination fork. It doesn't seem advisable to hold off checkpoints for the time it takes to copy an entire relation fork, so the solution is apparently to keep the data in shared buffers after all. But that brings us right back to square one. Have you thought through this whole problem carefully? It seems like a total mess to me at the moment, but maybe I'm missing something. There seems to be no reason to specify specific values for the members of enum CreateDBStrategy. I think the naming of some of the new functions might need work, in particular GetRelInfoFromTuple, GetRelListFromPage, and GetDatabaseRelationList. The names seem kind of generic for what they're doing. Maybe ScanSourceDatabasePgClass, ScanSourceDatabasePgClassPage, ScanSourceDatabasePgClassTuple? -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date: