Re: BUG #18897: Logical replication conflict after using pg_createsubscriber under heavy load - Mailing list pgsql-bugs

From Shlok Kyal
Subject Re: BUG #18897: Logical replication conflict after using pg_createsubscriber under heavy load
Date
Msg-id CANhcyEXNGYzFe+JqUNZh0tT0aBV-0HOWJrcUYrUyduzFsy-Xeg@mail.gmail.com
In response to Re: BUG #18897: Logical replication conflict after using pg_createsubscriber under heavy load  (Zane Duffield <duffieldzane@gmail.com>)
List pgsql-bugs
On Wed, 23 Apr 2025 at 08:43, Zane Duffield <duffieldzane@gmail.com> wrote:
>
> Hi Euler, thanks for your reply.
>
> On Wed, Apr 23, 2025 at 11:58 AM Euler Taveira <euler@eulerto.com> wrote:
>>
>> On Wed, Apr 16, 2025, at 8:14 PM, PG Bug reporting form wrote:
>>
>> I'm in the process of converting our databases from pglogical logical
>> replication to the native logical replication implementation on PostgreSQL
>> 17. One of the bugs we encountered and had to work around with pglogical was
>> the plugin dropping records while converting to a streaming replica to
>> logical via pglogical_create_subscriber (reported
>> https://github.com/2ndQuadrant/pglogical/issues/349). I was trying to
>> confirm that the native logical replication implementation did not have this
>> problem, and I've found that it might have a different problem.
>>
>>
>> pg_createsubscriber uses a different approach than pglogical. While pglogical
>> uses a restore point, pg_createsubscriber uses the LSN from the latest
>> replication slot as a replication start point. The restore point approach is
>> usually suitable to physical replication but might not cover all scenarios for
>> logical replication (such as when there are in progress transactions). Since
>> creating a logical replication slot does find a consistent decoding start
>> point, it is a natural choice to start the logical replication (that also needs
>> to find a decoding start point).
>>
>> I should say that I've been operating under the assumption that
>> pg_createsubscriber is designed for use on a replica for a *live* primary
>> database, if this isn't correct then someone please let me know.
>>
>>
>> pg_createsubscriber expects a physical replica that is preferably stopped
>> before running it.
>
>
> I think pg_createsubscriber actually gives you an error if the replica is not stopped. I was talking about the primary.
>
>>
>> Your script is not waiting enough time until it applies the backlog. Unless,
>> you are seeing a different symptom, there is no bug.
>>
>> You should have used something similar to wait_for_subscription_sync routine
>> (Cluster.pm) before counting the rows. That's what is used in the
>> pg_createsubscriber tests. It guarantees the subscriber has caught up.
>>
>
> It may be true that the script doesn't wait long enough for all systems, but when I reproduced the issue on my machine(s) I confirmed that the logical decoder process was properly stuck on a conflicting primary key, rather than just catching up.
>
> From the log file
>>
>> 2025-04-16 09:17:16.090 AEST [3845786] port=5341 ERROR:  duplicate key value violates unique constraint "test_table_pkey"
>> 2025-04-16 09:17:16.090 AEST [3845786] port=5341 DETAIL:  Key (f1)=(20700) already exists.
>> 2025-04-16 09:17:16.090 AEST [3845786] port=5341 CONTEXT:  processing remote data for replication origin "pg_24576" during message type "INSERT" for replication target relation "public.test_table" in transaction 1581, finished at 0/3720058
>> 2025-04-16 09:17:16.091 AEST [3816845] port=5341 LOG:  background worker "logical replication apply worker" (PID 3845786) exited with exit code 1
>
>
>   wait_for_subscription_sync sounds like a better solution than what I have, but you might still be able to reproduce the problem if you increase the sleep interval on line 198.
>
> I wonder if Shlok could confirm whether they found the conflicting primary key in their reproduction?
>

I have analysed the issue and found that it appears when
'consistent_lsn' happens to point at a COMMIT LSN during the
pg_createsubscriber run.
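
As a SQL-level alternative to the pg_waldump output below, the record at
that LSN can also be inspected with the pg_walinspect extension
(PostgreSQL 15+). This is only a sketch; the LSN is the one from my run:

-- sketch only: assumes the pg_walinspect extension is available;
-- '0/06AEB318' is the consistent_lsn observed in this run
CREATE EXTENSION IF NOT EXISTS pg_walinspect;
SELECT * FROM pg_get_wal_record_info('0/06AEB318');
-- a Transaction/COMMIT record here is the problematic case described below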

pg_waldump on primary node:
rmgr: Standby     len (rec/tot):    102/   102, tx:          0, lsn:
0/06AEB2B0, prev 0/06AEB270, desc: RUNNING_XACTS nextXid 1128
latestCompletedXid 1114 oldestRunningXid 1115; 13 xacts: 1121 1118
1115 1116 1117 1119 1120 1124 1125 1122 1126 1123 1127
rmgr: Transaction len (rec/tot):     46/    46, tx:       1116, lsn:
0/06AEB318, prev 0/06AEB2B0, desc: COMMIT 2025-04-17 19:22:22.789463
IST

logs for replica node:
2025-04-17 19:22:34.759 IST [2410569] port=5341 LOG:  logical
replication apply worker for subscription "sub" has started
2025-04-17 19:22:34.831 IST [2410569] port=5341 ERROR:  duplicate key
value violates unique constraint "test_table_pkey"
2025-04-17 19:22:34.831 IST [2410569] port=5341 DETAIL:  Key
(f1)=(402000) already exists.
2025-04-17 19:22:34.831 IST [2410569] port=5341 CONTEXT:  processing
remote data for replication origin "pg_24576" during message type
"INSERT" for replication target relation "public.test_table" in
transaction 1116, finished at 0/6AEB318
2025-04-17 19:22:34.835 IST [2410562] port=5341 LOG:  background
worker "logical replication apply worker" (PID 2410569) exited with
exit code 1

logs for primary node:
2025-04-17 19:22:34.788 IST [2410570] port=5340 ubuntu@test_db/sub
DETAIL:  Waiting for transactions (approximately 8) older than 1115 to
end.
2025-04-17 19:22:34.788 IST [2410570] port=5340 ubuntu@test_db/sub
STATEMENT:  START_REPLICATION SLOT "sub" LOGICAL 0/6AEB318
(proto_version '4', origin 'any', publication_names '"pub"')
2025-04-17 19:22:34.825 IST [2410570] port=5340 ubuntu@test_db/sub
LOG:  logical decoding found consistent point at 0/6AEB2B0
2025-04-17 19:22:34.825 IST [2410570] port=5340 ubuntu@test_db/sub
DETAIL:  There are no old transactions anymore.
2025-04-17 19:22:34.825 IST [2410570] port=5340 ubuntu@test_db/sub
STATEMENT:  START_REPLICATION SLOT "sub" LOGICAL 0/6AEB318
(proto_version '4', origin 'any', publication_names '"pub"')
2025-04-17 19:22:34.832 IST [2410570] port=5340 ubuntu@test_db/sub
LOG:  could not send data to client: Connection reset by peer
2025-04-17 19:22:34.832 IST [2410570] port=5340 ubuntu@test_db/sub
CONTEXT:  slot "sub", output plugin "pgoutput", in the change
callback, associated LSN 0/695BAA8
2025-04-17 19:22:34.832 IST [2410570] port=5340 ubuntu@test_db/sub
STATEMENT:  START_REPLICATION SLOT "sub" LOGICAL 0/6AEB318
(proto_version '4', origin 'any', publication_names '"pub"')
2025-04-17 19:22:34.832 IST [2410570] port=5340 ubuntu@test_db/sub
LOG:  released logical replication slot "sub"

logs for the pg_createsubscriber run:
2025-04-17 19:22:34.361 IST [2410552] port=5341
ubuntu@test_db/pg_createsubscriber LOG:  statement: SELECT
pg_catalog.pg_replication_origin_advance('pg_24576', '0/6AEB318')
2025-04-17 19:22:34.361 IST [2410552] port=5341
ubuntu@test_db/pg_createsubscriber LOG:  duration: 0.640 ms
2025-04-17 19:22:34.362 IST [2410552] port=5341
ubuntu@test_db/pg_createsubscriber LOG:  statement: ALTER SUBSCRIPTION
"sub" ENABLE

Here '0/06AEB318' is a COMMIT LSN. During the pg_createsubscriber run,
the standby's 'recovery_target_lsn' is set to '0/06AEB318', so after
the standby restarts it already contains the records of the transaction
committed at '0/06AEB318'. After the subscription is created, the
replication origin is advanced to LSN '0/06AEB318'. When the
subscription is then enabled, the apply worker tries to apply that
transaction again and thus hits a primary key conflict.
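
To make the mismatch concrete, here is a minimal sketch of the relevant
steps, using the statements from the pg_createsubscriber log above and
the LSNs from this run:

-- statements issued by pg_createsubscriber on the subscriber (taken
-- from the log above); '0/6AEB318' is the COMMIT LSN of transaction
-- 1116 in this run
SELECT pg_catalog.pg_replication_origin_advance('pg_24576', '0/6AEB318');
ALTER SUBSCRIPTION "sub" ENABLE;
-- the standby was recovered up to recovery_target_lsn = '0/6AEB318',
-- so the rows of transaction 1116 already exist locally; streaming
-- restarts from the COMMIT LSN itself, the transaction is sent again,
-- and the apply worker fails with the duplicate key error shown in the
-- replica log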

In normal logical replication, by contrast, the replication origin is
advanced to the record just after the COMMIT LSN (referred to in the
code as end_lsn).
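
As a way to observe the difference, the origin progress can be checked
on the subscriber; in this run it sits exactly at the COMMIT LSN after
pg_createsubscriber, whereas after a normal apply it would point past
the commit record (the end_lsn). This is only an illustrative query,
with the origin name taken from the logs above:

-- 'pg_24576' is the replication origin name from the logs above
SELECT external_id, remote_lsn
  FROM pg_catalog.pg_replication_origin_status;
-- or, for a single origin:
SELECT pg_replication_origin_progress('pg_24576', false);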

This failure is rare because it occurs only when 'consistent_lsn'
points exactly at a COMMIT record.


Thanks and Regards,
Shlok Kyal


