Thread: Hot standby, misc issues

Hot standby, misc issues

From
Heikki Linnakangas
Date:
There's a couple of items on the TODO page
(https://wiki.postgresql.org/wiki/Hot_Standby_TODO) that haven't been
discussed on-list:

> In normal operation, a few commands call ForceSyncCommit() to force non-async commit. Should ForceSyncCommit force an
> XLogFlush() during recovery as well?
>
>     * Simon says: No, why should it?

For the same reason we emit ForceSyncCommit() in normal operation. For
example, in DROP DATABASE, we delete all the files belonging to the
database, and then commit the transaction. If we crash after all the
files have been deleted but before the commit, you have an entry in
pg_database without any files. To minimize the window for that, we use
ForceSyncCommit() to rush the commit record to disk as quickly as
possible. We have the same window during recovery, and forcing an
XLogFlush() (which updates minRecoveryPoint during recovery) would help
to keep it small.

This isn't really related to Hot Standby. If you set the PITR target
time/xid to between the XLOG_DBASE_DROP record and the COMMIT record,
you end up with a zombie pg_database entry.
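
To make the window concrete, here is a toy model in C. It is only a sketch, not PostgreSQL code: the LSN values, the DbState struct and both function names are made up for illustration, and the only thing taken from the description above is the order of the two records.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical WAL positions; only their relative order matters. */
enum { LSN_DBASE_DROP = 100, LSN_COMMIT = 200 };

typedef struct {
    bool files_exist;    /* database files still on disk */
    bool catalog_entry;  /* pg_database row still visible */
} DbState;

/* Replay every record at or before stop_lsn and report the resulting state. */
DbState replay_until(int stop_lsn)
{
    DbState s = { true, true };

    assert(stop_lsn >= 0);
    if (stop_lsn >= LSN_DBASE_DROP)
        s.files_exist = false;      /* XLOG_DBASE_DROP removes the files */
    if (stop_lsn >= LSN_COMMIT)
        s.catalog_entry = false;    /* COMMIT makes the row deletion visible */
    return s;
}

/* A zombie is a catalog entry whose files are already gone. */
bool is_zombie(DbState s)
{
    return s.catalog_entry && !s.files_exist;
}
```

In this model, any stop point at or after LSN_DBASE_DROP but before LSN_COMMIT makes is_zombie() return true; stopping earlier, or at or after the commit, does not.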

> @Heikki: Why is error checking in KnownAssignedXidsRemove() #ifdef'd out?? 

It's explained in the comment:
/* XXX: This can still happen: If a transaction with a subtransaction
 * that haven't been reported yet aborts, and no WAL records have been
 * written using the subxid, the abort record will contain that subxid
 * and we haven't seen it before.
 */


--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: Hot standby, misc issues

From
Simon Riggs
Date:
On Fri, 2009-12-04 at 10:23 +0200, Heikki Linnakangas wrote:
> 
> > @Heikki: Why is error checking in KnownAssignedXidsRemove() #ifdef'd
> > out??
> 
> It's explained in the comment:
> /* XXX: This can still happen: If a transaction with a subtransaction
>  * that haven't been reported yet aborts, and no WAL records have been
>  * written using the subxid, the abort record will contain that subxid
>  * and we haven't seen it before.
>  */

Just realised that this occurs again because the call to
RecordKnownAssignedTransactionIds() was removed from
xact_commit_abort().

I'm guessing you didn't like the call in that place for some reason,
since I smile while I remember it has been removed twice(!) even though
I put "do not remove" comments on it to describe this corner case.

Not going to put it back a third time.

-- Simon Riggs           www.2ndQuadrant.com



Re: Hot standby, misc issues

From
Heikki Linnakangas
Date:
Simon Riggs wrote:
> On Fri, 2009-12-04 at 10:23 +0200, Heikki Linnakangas wrote:
>>> @Heikki: Why is error checking in KnownAssignedXidsRemove() #ifdef'd
>>> out??
>>
>> It's explained in the comment:
>> /* XXX: This can still happen: If a transaction with a subtransaction
>>  * that haven't been reported yet aborts, and no WAL records have been
>>  * written using the subxid, the abort record will contain that subxid
>>  * and we haven't seen it before.
>>  */
> 
> Just realised that this occurs again because the call to
> RecordKnownAssignedTransactionIds() was removed from
> xact_commit_abort().
> 
> I'm guessing you didn't like the call in that place for some reason,
> since I smile while I remember it has been removed twice(!) even though
> I put "do not remove" comments on it to describe this corner case.
> 
> Not going to put it back a third time.

:-). Well, it does seem pointless to add entries to the hash table, just
to remove them at the very next line. But you're right, we should still
advance latestObservedXid, and if we do that, we need to memorize any
not-yet-seen XIDs in the known-assigned xids array. So that
RecordKnownAssignedTransactionIds() call needs to be put back.

BTW, if you want to resurrect the check in KnownAssignedXidsRemove(),
you also need to not complain before you reach the running-xacts record
and open up for read-only connections.
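
A rough sketch of both points, as a toy model in C rather than the real procarray code. The struct, the fixed-size array, the snapshot_ready flag and all function names here are hypothetical, and xid wraparound is ignored for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_KNOWN 64

typedef uint32_t Xid;

typedef struct {
    Xid  latest_observed;   /* highest xid seen so far in WAL */
    Xid  known[MAX_KNOWN];  /* xids believed to be still running */
    int  nknown;
    bool snapshot_ready;    /* running-xacts record has been replayed */
} StandbyXids;

/* Remember every xid in (latest_observed, xid] and advance latest_observed. */
void record_known_assigned(StandbyXids *s, Xid xid)
{
    while (s->latest_observed < xid) {
        assert(s->nknown < MAX_KNOWN);
        s->known[s->nknown++] = ++s->latest_observed;
    }
}

/* Remove xid. Returns true only if the xid was missing AND we are already
 * past the running-xacts record, i.e. only when a complaint is justified. */
bool remove_known_assigned(StandbyXids *s, Xid xid)
{
    for (int i = 0; i < s->nknown; i++) {
        if (s->known[i] == xid) {
            s->known[i] = s->known[--s->nknown];
            return false;
        }
    }
    return s->snapshot_ready;
}

/* Replay an abort record for xid and its subxids; returns the number of
 * complaints. With record_first set, unseen xids are recorded before removal. */
int replay_abort(StandbyXids *s, Xid xid, const Xid *subxids, int nsub,
                 bool record_first)
{
    int complaints = 0;

    if (record_first) {
        Xid max = xid;
        for (int i = 0; i < nsub; i++)
            if (subxids[i] > max)
                max = subxids[i];
        record_known_assigned(s, max);
    }
    if (remove_known_assigned(s, xid))
        complaints++;
    for (int i = 0; i < nsub; i++)
        if (remove_known_assigned(s, subxids[i]))
            complaints++;
    return complaints;
}
```

In this model, an abort record carrying a never-seen subxid produces one complaint when the recording step is skipped and none when it is done first, and latest_observed only advances past the subxid in the second case. With snapshot_ready false, no complaint is raised either way.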

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: Hot standby, misc issues

From
Simon Riggs
Date:
On Sat, 2009-12-05 at 22:56 +0200, Heikki Linnakangas wrote:

> So that RecordKnownAssignedTransactionIds() call needs to be put back.

OK

> BTW, if you want to resurrect the check in KnownAssignedXidsRemove(),
> you also need to not complain before you reach the running-xacts record
> and open up for read-only connections.

Yep

-- Simon Riggs           www.2ndQuadrant.com