Thread: Hot standby, misc issues
There are a couple of items on the TODO page
(https://wiki.postgresql.org/wiki/Hot_Standby_TODO) that haven't been
discussed on-list:

> In normal operation, a few commands call ForceSyncCommit() to force
> non-async commit. Should ForceSyncCommit force an XLogFlush() during
> recovery as well?
>
> * Simon says: No, why should it?

For the same reason we call ForceSyncCommit() in normal operation. For
example, in DROP DATABASE, we delete all the files belonging to the
database, and then commit the transaction. If we crash after all the
files have been deleted but before the commit, you have an entry in
pg_database without any files. To minimize the window for that, we use
ForceSyncCommit() to rush the commit record to disk as quickly as
possible. We have the same window during recovery, and forcing an
XLogFlush() (which updates minRecoveryPoint during recovery) would help
to keep it small.

This isn't really related to Hot Standby. If you set the PITR target
time/xid to between the XLOG_DBASE_DROP record and the COMMIT record,
you end up with a zombie pg_database entry.

> @Heikki: Why is error checking in KnownAssignedXidsRemove() #ifdef'd out??

It's explained in the comment:

    /* XXX: This can still happen: If a transaction with a subtransaction
     * that haven't been reported yet aborts, and no WAL records have been
     * written using the subxid, the abort record will contain that subxid
     * and we haven't seen it before.
     */

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
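The window described in the message above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not PostgreSQL code: the types and functions (RecKind, ToyDbState, toy_replay, toy_is_zombie) are hypothetical names made up for the illustration, and the model assumes file removal happens when the drop record is replayed while the pg_database row only disappears once the commit record is replayed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified WAL record kinds; not PostgreSQL's real structs. */
typedef enum { REC_DBASE_DROP, REC_COMMIT } RecKind;

typedef struct {
    bool files_exist;    /* database files are still on disk */
    bool catalog_entry;  /* row is still visible in pg_database */
} ToyDbState;

/* Replay the first n_replayed records of wal[] onto a fresh database. */
static ToyDbState
toy_replay(const RecKind *wal, size_t n_replayed)
{
    ToyDbState st = { true, true };

    assert(wal != NULL || n_replayed == 0);
    for (size_t i = 0; i < n_replayed; i++) {
        if (wal[i] == REC_DBASE_DROP)
            st.files_exist = false;    /* file removal cannot be rolled back */
        else if (wal[i] == REC_COMMIT)
            st.catalog_entry = false;  /* row deletion takes effect at commit */
    }
    return st;
}

/* A "zombie" is a catalog entry whose files are already gone. */
static bool
toy_is_zombie(ToyDbState st)
{
    return st.catalog_entry && !st.files_exist;
}
```

In this model, stopping replay after the drop record but before the commit record (n_replayed == 1 for a two-record WAL) is the only stopping point that leaves a zombie entry, which matches the PITR target case described above.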
On Fri, 2009-12-04 at 10:23 +0200, Heikki Linnakangas wrote:

> > @Heikki: Why is error checking in KnownAssignedXidsRemove() #ifdef'd
> > out??
>
> It's explained in the comment:
>     /* XXX: This can still happen: If a transaction with a subtransaction
>      * that haven't been reported yet aborts, and no WAL records have been
>      * written using the subxid, the abort record will contain that subxid
>      * and we haven't seen it before.
>      */

Just realised that this occurs again because the call to
RecordKnownAssignedTransactionIds() was removed from
xact_commit_abort().

I'm guessing you didn't like the call in that place for some reason,
since I smile while I remember it has been removed twice(!) even though
I put "do not remove" comments on it to describe this corner case.

Not going to put it back a third time.

--
 Simon Riggs           www.2ndQuadrant.com
Simon Riggs wrote:
> On Fri, 2009-12-04 at 10:23 +0200, Heikki Linnakangas wrote:
>>> @Heikki: Why is error checking in KnownAssignedXidsRemove() #ifdef'd
>>> out??
>>
>> It's explained in the comment:
>>     /* XXX: This can still happen: If a transaction with a subtransaction
>>      * that haven't been reported yet aborts, and no WAL records have been
>>      * written using the subxid, the abort record will contain that subxid
>>      * and we haven't seen it before.
>>      */
>
> Just realised that this occurs again because the call to
> RecordKnownAssignedTransactionIds() was removed from
> xact_commit_abort().
>
> I'm guessing you didn't like the call in that place for some reason,
> since I smile while I remember it has been removed twice(!) even though
> I put "do not remove" comments on it to describe this corner case.
>
> Not going to put it back a third time.

:-). Well, it does seem pointless to add entries to the hash table, just
to remove them at the very next line. But you're right, we should still
advance latestObservedXid, and if we do that, we need to memorize any
not-yet-seen XIDs in the known-assigned xids array. So that
RecordKnownAssignedTransactionIds() call needs to be put back.

BTW, if you want to resurrect the check in KnownAssignedXidsRemove(),
you also need to not complain before you reach the running-xacts record
and open up for read-only connections.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
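The corner case discussed above can be sketched with a toy model of the known-assigned xids bookkeeping. This is a rough illustration under stated assumptions, not the actual PostgreSQL implementation: ToyStandby and the toy_* functions are hypothetical, xids are small integers without wraparound, and the set is a plain boolean array rather than a hash table. The record_first flag stands in for whether the RecordKnownAssignedTransactionIds() call is present before removal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t ToyXid;
#define TOY_MAX_XIDS 64

typedef struct {
    ToyXid latest_observed;      /* highest xid seen so far in WAL */
    bool   known[TOY_MAX_XIDS];  /* known[x]: x is assumed to be running */
} ToyStandby;

/* Advance latest_observed to xid, remembering every xid skipped over. */
static void
toy_record_known_assigned(ToyStandby *s, ToyXid xid)
{
    assert(xid < TOY_MAX_XIDS);
    while (s->latest_observed < xid)
        s->known[++s->latest_observed] = true;
}

/* Remove xid; returns false if it had never been recorded, which is
 * the situation the disabled error check would complain about. */
static bool
toy_known_assigned_remove(ToyStandby *s, ToyXid xid)
{
    assert(xid < TOY_MAX_XIDS);
    bool was_known = s->known[xid];

    s->known[xid] = false;
    return was_known;
}

/* Replay an abort record carrying the top-level xid and its subxids.
 * Returns true if every xid was already known when it was removed. */
static bool
toy_replay_abort(ToyStandby *s, const ToyXid *xids, int n, bool record_first)
{
    bool all_known = true;

    for (int i = 0; i < n; i++) {
        if (record_first)
            toy_record_known_assigned(s, xids[i]);
        if (!toy_known_assigned_remove(s, xids[i]))
            all_known = false;
    }
    return all_known;
}
```

With record_first set, an abort record that mentions a subxid never seen in earlier WAL still advances latest_observed past it, and the removal finds the entry it expects. Without it, the subxid is missing at removal time and latest_observed stays behind, which is the behaviour the thread attributes to the removed call.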
On Sat, 2009-12-05 at 22:56 +0200, Heikki Linnakangas wrote:

> So that RecordKnownAssignedTransactionIds() call needs to be put back.

OK

> BTW, if you want to resurrect the check in KnownAssignedXidsRemove(),
> you also need to not complain before you reach the running-xacts record
> and open up for read-only connections.

Yep

--
 Simon Riggs           www.2ndQuadrant.com