Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns |
Date | |
Msg-id | CAD21AoCxVNYK0R=SSPchcx1wm6mx=qJ5W8qDVakFCOAy1UxqAQ@mail.gmail.com Whole thread Raw |
Responses |
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns |
List | pgsql-hackers |
On Tue, Mar 16, 2021 at 1:35 AM Oh, Mike <minsoo@amazon.com> wrote: > > Sending this to pgsql-hackers list to create a CommitFest entry with the attached patch proposal. > > > > Hello, > > We noticed that the logical replication could fail when the Standby::RUNNING_XACT record is generated in the middle ofa catalog modifying transaction and if the logical decoding has to restart from the RUNNING_XACT > > WAL entry. > > The Standby::RUNNING_XACT record is generated periodically (roughly every 15s by default) or during a CHECKPOINT operation. > > > > Detailed problem description: > > Tested on 11.8 & current master. > > The logical replication slot restart_lsn advances in the middle of an open txn that modified the catalog (e.g. TRUNCATEoperation). > > Should the logical decoding has to restart it could fail with an error like this: > > ERROR: could not map filenode "base/13237/442428" Thank you for reporting the issue. I could reproduce this issue by the steps you shared. > > Currently, the system relies on processing Heap2::NEW_CID to keep track of catalog modifying (sub)transactions. > > This context is lost if the logical decoding has to restart from a Standby::RUNNING_XACTS that is written between the NEW_CIDrecord and its parent txn commit. > > If the logical stream restarts from this restart_lsn, then it doesn't have the xid responsible for modifying the catalog. > I agree with your analysis. Since we don’t use commit WAL record to track the transaction that has modified system catalogs, if we decode only the commit record of such transaction, we cannot know the transaction has been modified system catalogs, resulting in the subsequent transaction scans system catalog with the wrong snapshot. With the patch, if the commit WAL record has a XACT_XINFO_HAS_INVALS flag and its xid is included in RUNNING_XACT record written at restart_lsn, we forcibly add the top XID and its sub XIDs as a committed transaction that has modified system catalogs to the snapshot. I might be missing something about your patch but I have some comments on this approach: 1. Commit WAL record may not have invalidation message for system catalogs (e.g., when commit record has only invalidation message for relcache) even if it has XACT_XINFO_HAS_INVALS flag. In this case, the transaction wrongly is added to the snapshot, is that okay? 2. We might add a subtransaction XID as a committed transaction that has modified system catalogs even if it actually didn't. As the comment in SnapBuildBuildSnapshot() describes, we track only the transactions that have modified the system catalog and store in the snapshot (in the ‘xip' array). The patch could break that assumption. However, I’m really not sure how to deal with this point. We cannot know which subtransaction has actually modified system catalogs by using only the commit WAL record. 3. The patch covers only the case where the restart_lsn exactly matches the LSN of RUNNING_XACT. I wonder if there could be a case where the decoding starts at a WAL record other than RUNNING_XACT but the next WAL record is RUNNING_XACT. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
pgsql-hackers by date: