Hi all,
I have found the following issue in my project.
An update on the PRIMARY while pg_basebackup is running
can leave the STANDBY unable to UPDATE after it is promoted.
On the standby node, when the first XID of a pg_subtrans page
arrives, pg_subtrans must be extended by the ExtendSUBTRANS function.
But if that XID was created while pg_basebackup (with the "-x" option)
was running, pg_subtrans is not extended.
As a result, after the base backup completes and the database is
started from it and promoted, an UPDATE inside a SAVEPOINT fails as follows.
---
postgres=# BEGIN;
postgres=# SAVEPOINT testsavepoint;
postgres=# UPDATE test_tbl SET name = 'test';
ERROR: could not access status of transaction 1409172
DETAIL: Could not read from file "pg_subtrans/0015" at offset 131072: Success.
---
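The file name and offset in the error match the failing XID. Below is
a small sketch of the arithmetic, assuming a default build (BLCKSZ of
8192, 4-byte XIDs, 32 pages per SLRU segment). The struct and function
names are only for illustration; they are not PostgreSQL's.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed defaults of a stock PostgreSQL build. */
#define BLCKSZ                  8192                 /* bytes per SLRU page */
#define XID_SIZE                4                    /* bytes per XID entry */
#define SUBTRANS_XACTS_PER_PAGE (BLCKSZ / XID_SIZE)  /* 2048 entries */
#define SLRU_PAGES_PER_SEGMENT  32

/* Illustrative type, not part of PostgreSQL. */
typedef struct SubtransLocation
{
    uint32_t page;     /* pg_subtrans page number holding the XID */
    uint32_t segment;  /* segment number (file name is 4 hex digits) */
    uint32_t offset;   /* byte offset of that page inside the segment */
} SubtransLocation;

/* Map an XID to the pg_subtrans page, segment file and file offset. */
static SubtransLocation
subtrans_locate(uint32_t xid)
{
    SubtransLocation loc;

    loc.page = xid / SUBTRANS_XACTS_PER_PAGE;
    loc.segment = loc.page / SLRU_PAGES_PER_SEGMENT;
    loc.offset = (loc.page % SLRU_PAGES_PER_SEGMENT) * BLCKSZ;
    return loc;
}

/* Format the segment path the way it appears in the error message. */
static void
subtrans_segment_name(uint32_t segment, char *buf, size_t buflen)
{
    assert(buflen >= sizeof("pg_subtrans/0000"));
    snprintf(buf, buflen, "pg_subtrans/%04X", segment);
}
```

For XID 1409172 this gives page 688, segment 21 (hex 0015) and offset
131072, which are the values in the DETAIL line. The read at that offset
came back short, which is consistent with that page never having been
created on the standby.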
I have also checked the source and found that when StartupXLOG calls
RecordKnownAssignedTransactionIds (in the "main redo apply loop"),
"standbyState" is still STANDBY_INITIALIZED, so the function returns
without reaching ExtendSUBTRANS (which checks and extends the space
for pg_subtrans).
Therefore, after the STANDBY is promoted, when an UPDATE is executed
inside a SAVEPOINT, the following call path is taken and the above
ERROR is raised in the SimpleLruReadPage function.
AssignTransactionId => SubTransSetParent => SimpleLruReadPage
I think that ExtendSUBTRANS must be called even when "standbyState"
is STANDBY_INITIALIZED, in order to avoid the case above.
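To illustrate the idea only, here is a toy model in C. It is not the
attached patch and not the actual PostgreSQL source. All names with a
"toy" or "Toy" prefix are hypothetical stand-ins, and the
extend_when_initialized flag exists only to compare the two behaviours.
It assumes 2048 XIDs per pg_subtrans page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_XACTS_PER_PAGE 2048  /* assumed: 8192-byte page / 4-byte XID */
#define TOY_MAX_PAGES      4096  /* limit of this toy model only */

/* Simplified stand-in for the standby state. */
typedef enum
{
    TOY_STANDBY_DISABLED,
    TOY_STANDBY_INITIALIZED,
    TOY_STANDBY_SNAPSHOT_PENDING,
    TOY_STANDBY_SNAPSHOT_READY
} ToyStandbyState;

/* Toy pg_subtrans: remembers only which pages have been created. */
typedef struct ToySubtrans
{
    bool page_exists[TOY_MAX_PAGES];
} ToySubtrans;

/* Stand-in for ExtendSUBTRANS: create a page when xid is its first entry. */
static void
toy_extend_subtrans(ToySubtrans *st, uint32_t xid)
{
    uint32_t page = xid / TOY_XACTS_PER_PAGE;

    assert(st != NULL);
    if (xid % TOY_XACTS_PER_PAGE != 0 || page >= TOY_MAX_PAGES)
        return;
    st->page_exists[page] = true;
}

/* Stand-in for the redo-time bookkeeping of an observed XID. */
static void
toy_record_known_xid(ToySubtrans *st, ToyStandbyState state,
                     uint32_t xid, bool extend_when_initialized)
{
    if (state <= TOY_STANDBY_INITIALIZED)
    {
        /* Reported behaviour: return here without extending. */
        if (extend_when_initialized)
            toy_extend_subtrans(st, xid);   /* proposed behaviour */
        return;
    }
    toy_extend_subtrans(st, xid);
    /* Known-assigned-XID tracking is omitted in this model. */
}

/* Would a later lookup of xid find its pg_subtrans page? */
static bool
toy_page_exists(const ToySubtrans *st, uint32_t xid)
{
    uint32_t page = xid / TOY_XACTS_PER_PAGE;

    return page < TOY_MAX_PAGES && st->page_exists[page];
}
```

In this model, replaying XID 1409024 (the first XID of page 688) in
TOY_STANDBY_INITIALIZED with the flag off leaves the page missing, so a
later lookup of XID 1409172 finds nothing. With the flag on, the page
exists.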
I have also attached a patch. Could anyone confirm this for me?
Regards,
---
Dang Minh Huong
NEC Soft,Ltd.
http://www.necsoft.com/eng/