Re: standby recovery fails (tablespace related) (tentative patch and discussion) - Mailing list pgsql-hackers
From | Paul Guo |
---|---|
Subject | Re: standby recovery fails (tablespace related) (tentative patch and discussion) |
Date | |
Msg-id | 29C02691-6F05-422F-AB13-2B004D60F4CF@vmware.com Whole thread Raw |
In response to | Re: standby recovery fails (tablespace related) (tentative patch and discussion) (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) |
Responses |
Re: standby recovery fails (tablespace related) (tentative patch and discussion)
|
List | pgsql-hackers |
Thanks for the review, please see the replies below. > On Jan 5, 2021, at 9:07 AM, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > > At Wed, 8 Jul 2020 12:56:44 +0000, Paul Guo <guopa@vmware.com> wrote in >> On 2020/01/15 19:18, Paul Guo wrote: >> I further fixed the last test failure (due to a small bug in the test, not in code). Attached are the new patch series.Let's see the CI pipeline result. >> >> Thanks for updating the patches! >> >> I started reading the 0003 patch. >> >> The approach that the 0003 patch uses is not the perfect solution. >> If the standby crashes after tblspc_redo() removes the directory and before >> its subsequent COMMIT record is replayed, PANIC error would occur since >> there can be some unresolved missing directory entries when we reach the >> consistent state. The problem would very rarely happen, though... >> Just idea; calling XLogFlush() to update the minimum recovery point just >> before tblspc_redo() performs destroy_tablespace_directories() may be >> safe and helpful for the problem? > > It seems to me that what the current patch does is too complex. What > we need to do here is to remember every invalid operation then forget > it when the prerequisite object is dropped. > > When a table space is dropped before consistency is established, we > don't need to care what has been performed inside the tablespace. In > this perspective, it is enough to remember tablespace ids when failed > to do something inside it due to the absence of the tablespace and > then forget it when we remove it. We could remember individual > database id to show them in error messages, but I'm not sure it's > useful. The reason log_invalid_page records block numbers is to allow > the machinery handle partial table truncations, but this is not the > case since dropping tablespace cannot leave some of containing > databases. > > As the result, we won't see an unresolved invalid operations in a > dropped tablespace. > > Am I missing something? Yes, removing the database id from the hash key in the log/forget code should be usually fine, but the previous code does stricter/safer checking. Consider the scenario: CREATE DATABASE newdb1 TEMPLATE template_db1; CREATE DATABASE newdb2 TEMPLATE template_db2; <- in case the template_db2 database directory is missing abnormally somehow. DROP DATABASE template_db1; The previous code could detect this but if we remove the database id in the code, this bad scenario is skipped. > > > dbase_redo: > + if (!(stat(parent_path, &st) == 0 && S_ISDIR(st.st_mode))) > + { > + XLogRecordMissingDir(xlrec->tablespace_id, InvalidOid, parent_path); > > This means "record the belonging table space directory if it is not > found OR it is not a directory". The former can be valid but the > latter is unconditionally can not (I don't think we bother considering > symlinks there). Again this is a safer check, in the case the parent_path is a file for example somehow, we should panic finally for the case and let the user checks and then does recovery again. > > + /* > + * Source directory may be missing. E.g. the template database used > + * for creating this database may have been dropped, due to reasons > + * noted above. Moving a database from one tablespace may also be a > + * partner in the crime. > + */ > + if (!(stat(src_path, &st) == 0 && S_ISDIR(st.st_mode))) > + { > + XLogLogMissingDir(xlrec->src_tablespace_id, xlrec->src_db_id, src_path); > > This is a part of *creation* of the target directory. Lack of the > source directory cannot be valid even if the source directory is > dropped afterwards in the WAL stream and we can allow that if the > *target* tablespace is dropped afterwards. As the result, as I > mentioned above, we don't need to record about the database directory. > > By the way the name XLogLogMiss.. is somewhat confusing. How about > XLogReportMissingDir (named after report_invalid_page). Agree with you. Also your words remind me that we should skip the checking if the consistency point is reached. Here is a git diff against the previous patch. I’ll send out the new rebased patches after the consensus is reached. diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c index 7ade385965..c8fe3fe228 100644 --- a/src/backend/access/transam/xlogutils.c +++ b/src/backend/access/transam/xlogutils.c @@ -90,7 +90,7 @@ typedef struct xl_missing_dir static HTAB *missing_dir_tab = NULL; void -XLogLogMissingDir(Oid spcNode, Oid dbNode, char *path) +XLogReportMissingDir(Oid spcNode, Oid dbNode, char *path) { xl_missing_dir_key key; bool found; @@ -103,16 +103,6 @@ XLogLogMissingDir(Oid spcNode, Oid dbNode, char *path) */ Assert(OidIsValid(spcNode)); - if (reachedConsistency) - { - if (dbNode == InvalidOid) - elog(PANIC, "cannot find directory %s (tablespace %d)", - path, spcNode); - else - elog(PANIC, "cannot find directory %s (tablespace %d database %d)", - path, spcNode, dbNode); - } - if (missing_dir_tab == NULL) { /* create hash table when first needed */ diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c index fbff422c3b..7bd6d4efd9 100644 --- a/src/backend/commands/dbcommands.c +++ b/src/backend/commands/dbcommands.c @@ -2205,7 +2205,7 @@ dbase_redo(XLogReaderState *record) (errmsg("some useless files may be left behind in old database directory \"%s\"", dst_path))); } - else + else if (!reachedConsistency) { /* * It is possible that drop tablespace record appearing later in @@ -2221,7 +2221,7 @@ dbase_redo(XLogReaderState *record) get_parent_directory(parent_path); if (!(stat(parent_path, &st) == 0 && S_ISDIR(st.st_mode))) { - XLogLogMissingDir(xlrec->tablespace_id, InvalidOid, parent_path); + XLogReportMissingDir(xlrec->tablespace_id, InvalidOid, parent_path); skip = true; ereport(WARNING, (errmsg("skipping create database WAL record"), @@ -2239,9 +2239,10 @@ dbase_redo(XLogReaderState *record) * noted above. Moving a database from one tablespace may also be a * partner in the crime. */ - if (!(stat(src_path, &st) == 0 && S_ISDIR(st.st_mode))) + if (!(stat(src_path, &st) == 0 && S_ISDIR(st.st_mode)) && + !reachedConsistency) { - XLogLogMissingDir(xlrec->src_tablespace_id, xlrec->src_db_id, src_path); + XLogReportMissingDir(xlrec->src_tablespace_id, xlrec->src_db_id, src_path); skip = true; ereport(WARNING, (errmsg("skipping create database WAL record"), @@ -2311,7 +2312,8 @@ dbase_redo(XLogReaderState *record) (errmsg("some useless files may be left behind in old database directory \"%s\"", dst_path))); - XLogForgetMissingDir(xlrec->tablespace_ids[i], xlrec->db_id); + if (!reachedConsistency) + XLogForgetMissingDir(xlrec->tablespace_ids[i], xlrec->db_id); pfree(dst_path); } diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c index 294c9676b4..15eaa757cc 100644 --- a/src/backend/commands/tablespace.c +++ b/src/backend/commands/tablespace.c @@ -1534,7 +1534,8 @@ tblspc_redo(XLogReaderState *record) { xl_tblspc_drop_rec *xlrec = (xl_tblspc_drop_rec *) XLogRecGetData(record); - XLogForgetMissingDir(xlrec->ts_id, InvalidOid); + if (!reachedConsistency) + XLogForgetMissingDir(xlrec->ts_id, InvalidOid); XLogFlush(record->EndRecPtr); diff --git a/src/include/access/xlogutils.h b/src/include/access/xlogutils.h index da561af5ab..6561d9cebe 100644 --- a/src/include/access/xlogutils.h +++ b/src/include/access/xlogutils.h @@ -23,7 +23,7 @@ extern void XLogDropDatabase(Oid dbid); extern void XLogTruncateRelation(RelFileNode rnode, ForkNumber forkNum, BlockNumber nblocks); -extern void XLogLogMissingDir(Oid spcNode, Oid dbNode, char *path); +extern void XLogReportMissingDir(Oid spcNode, Oid dbNode, char *path); extern void XLogForgetMissingDir(Oid spcNode, Oid dbNode); extern void XLogCheckMissingDirs(void); diff --git a/src/test/recovery/t/011_crash_recovery.pl b/src/test/recovery/t/011_crash_recovery.pl index 748200ebb5..95eb6d26cc 100644 --- a/src/test/recovery/t/011_crash_recovery.pl +++ b/src/test/recovery/t/011_crash_recovery.pl @@ -141,7 +141,7 @@ $node_master->wait_for_catchup($node_standby, 'replay', $node_standby->safe_psql('postgres', 'CHECKPOINT'); # Do immediate shutdown just after a sequence of CREAT DATABASE / DROP -# DATABASE / DROP TABLESPACE. This causes CREATE DATBASE WAL records +# DATABASE / DROP TABLESPACE. This causes CREATE DATABASE WAL records
pgsql-hackers by date: