Disabled logical replication origin session causes primary key errors - Mailing list pgsql-bugs

From Shawn McCoy
Subject Disabled logical replication origin session causes primary key errors
Date
Msg-id CALsgZNCGARa2mcYNVTSj9uoPcJo-tPuWUGECReKpNgTpo31_Pw@mail.gmail.com
List pgsql-bugs
Hello,

We have discovered a recent regression in the origin handling of logical replication apply workers. The cause is that the worker resets its local origin session information while processing an error that is silently handled, which allows the worker to continue. We suspect this was introduced by the recent change made in the following thread: https://www.postgresql.org/message-id/TYAPR01MB5692FAC23BE40C69DA8ED4AFF5B92@TYAPR01MB5692.jpnprd01.prod.outlook.com

The logical replication apply worker initially sets up the origin correctly. However, the first insert calls into the trigger, which raises an exception. This exception executes the error callback that resets the origin session state. The exception is then silently handled, and execution returns to the apply worker. In the second reproduction, a function-based index is used, with the same result.

At this point, the apply worker can still commit these changes, but it has cleared all local origin session state. As a result, the remote-to-local LSN mapping of the origin is not updated, which allows duplicate data to be applied.
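
The sequence described above can be illustrated with a toy model. This is only a sketch in Python, not PostgreSQL code: the names `Subscriber`, `apply_txn`, `replay`, `DuplicateKey`, and the `reset_origin_on_error` flag are hypothetical and exist purely for illustration. It assumes that origin progress is advanced at commit only while the session origin is still set, and that the upstream resends any transaction beyond the recorded progress.

```python
from dataclasses import dataclass, field


class DuplicateKey(Exception):
    """Stands in for a primary key violation on the subscriber."""


@dataclass
class Subscriber:
    # Toy stand-in for subscriber state; not PostgreSQL's real structures.
    rows: set = field(default_factory=set)  # primary keys already applied
    origin_lsn: int = 0                     # recorded remote LSN for the origin
    session_origin: bool = False            # is the origin set up in this session?

    def apply_txn(self, remote_lsn, key, reset_origin_on_error):
        # The worker sets up its origin session before applying changes.
        self.session_origin = True
        try:
            # Simulates the trigger raising an exception during the insert.
            raise RuntimeError("error raised inside trigger")
        except RuntimeError:
            # The error is silently handled, but the error callback
            # may already have cleared the origin session state.
            if reset_origin_on_error:
                self.session_origin = False
        if key in self.rows:
            raise DuplicateKey(key)
        self.rows.add(key)
        # Commit: progress is only recorded if the origin is still set.
        if self.session_origin:
            self.origin_lsn = remote_lsn


def replay(sub, txns, reset_origin_on_error):
    # Resend every transaction beyond the subscriber's recorded progress.
    for lsn, key in txns:
        if lsn > sub.origin_lsn:
            sub.apply_txn(lsn, key, reset_origin_on_error)
```

In this model, replaying the same stream twice is harmless when the origin state survives the handled error, because the recorded LSN filters out the already-applied transaction. When the state is reset, the recorded LSN never advances, so the second replay reapplies the row and hits the simulated primary key error.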

This was tested and observed in at least these versions:

PostgreSQL 16.8
PostgreSQL 17.4

We provide a simple reproduction of the issue below, covering two separate use cases.

Regards,
Shawn McCoy, Drew Callahan, Scott Mead
Attachment
