Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns - Mailing list pgsql-hackers

From Jeremy Schneider
Subject Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
Date
Msg-id dfb427c5-6681-f013-2d52-16803af370ce@amazon.com
In response to Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
On 7/29/21 01:25, Masahiko Sawada wrote:
> On Tue, Mar 16, 2021 at 1:35 AM Oh, Mike <minsoo@amazon.com> wrote:
>>
>> Sending this to pgsql-hackers list to create a CommitFest entry with the attached patch proposal.
>>
>> ...
>>
>> Detailed problem description:
>>
>> Tested on 11.8 & current master.
>>
>> The logical replication slot restart_lsn advances in the middle of an open txn that modified the catalog (e.g. TRUNCATE operation).
>>
>> Should logical decoding have to restart, it could fail with an error like this:
>>
>> ERROR:  could not map filenode "base/13237/442428"
> 
> Thank you for reporting the issue.
> 
> I could reproduce this issue by the steps you shared.


I also noticed a bug report from earlier this year in which another PG
user reported the same error, on version 12.3:

https://www.postgresql.org/message-id/flat/16812-3d9df99bd77ff616%40postgresql.org

Today I received a report from a new PG user of this same error message
causing their logical replication to break. This customer was also
running PostgreSQL 12.3 on both source and target side.

Haven't yet dumped WAL or anything, but wanted to point out that the
error is being seen in the wild - I hope we can get a version of this
patch committed soon, as it will help with at least one cause.


-Jeremy

-- 
Jeremy Schneider
Database Engineer
Amazon Web Services


