Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication - Mailing list pgsql-hackers

From: Melih Mutlu
Subject: Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication
Date:
Msg-id: CAGPVpCRWEVhXa7ovrhuSQofx4to7o22oU9iKtrOgAOtz_=Y6vg@mail.gmail.com
In response to: Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication (Amit Kapila <amit.kapila16@gmail.com>)
Responses: Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication (Amit Kapila <amit.kapila16@gmail.com>)
           Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication (Melih Mutlu <m.melihmutlu@gmail.com>)
List: pgsql-hackers
Hi hackers,

I've been working on (and struggling with) this patch for a while, but I haven't updated this thread regularly.
So here is what I've done with the patch so far.

> On Sat, 6 Aug 2022 at 16:01, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> I think there is some basic flaw in slot reuse design. Currently, we
>> copy the table by starting a repeatable read transaction (BEGIN READ
>> ONLY ISOLATION LEVEL REPEATABLE READ) and create a slot that
>> establishes a snapshot which is first used for copy and then LSN
>> returned by it is used in the catchup phase after the copy is done.
>> The patch won't establish such a snapshot before a table copy as it
>> won't create a slot each time. If this understanding is correct, I
>> think we need to use ExportSnapshot/ImportSnapshot functionality to
>> achieve it or do something else to avoid the problem mentioned.

The issue Amit mentioned causes problems such as duplicated rows on the subscriber.
Basically, with this patch, a tablesync worker creates a replication slot only in its first run. To ensure that the table copy and the catchup phase are consistent with each other, the worker needs the correct snapshot and LSN, both of which are returned by the slot creation.
Since the patch does not create a replication slot for each table copy and instead reuses the one created at the beginning, we no longer get a fresh snapshot and LSN for each table. The snapshot is lost as soon as the transaction commits, yet the patch keeps using the same LSN for subsequent tables without the proper snapshot.
As a result, for tables synced in later iterations, the worker might first copy some rows and then insert the same rows again while applying changes from the replication slot.
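
For reference, on master each table sync roughly follows this sequence on the worker's walsender connection (the slot and table names below are only illustrative):

    BEGIN READ ONLY ISOLATION LEVEL REPEATABLE READ;
    CREATE_REPLICATION_SLOT "pg_16398_sync_16384_1" LOGICAL pgoutput (SNAPSHOT 'use');
    -- returns consistent_point (an LSN) and installs the slot's
    -- snapshot into the open transaction
    COPY public.tab TO STDOUT;  -- the copy runs under that snapshot
    COMMIT;
    -- catchup then applies changes starting from consistent_point

With the patch, the CREATE_REPLICATION_SLOT step disappears from the second table onwards, so nothing installs a snapshot that matches the LSN the worker keeps using.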

I discussed some possible ways to resolve this with Amit offline:
1- Copy all tables in one transaction, so that we wouldn't need to deal with snapshots.
It's not easy to keep track of progress this way, and if the transaction fails, we would need to start all over again.

2- Don't lose the first snapshot (by keeping a transaction open with the snapshot imported, or in some other way) and use that same snapshot and LSN for all tables; see the export/import sketch after this list.
I'm not sure about the side effects of keeping a transaction open that long, or of using a snapshot that may be quite old by the time the later tables are copied.
Still, it seems like it might work.

3- For each table, get a new snapshot and LSN from the existing replication slot.
Even though this approach doesn't create a new replication slot each time, preparing the slot to build a snapshot and then taking the snapshot may still be costly.
We need some numbers here to see how much this approach actually improves performance.
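
To make option 2 concrete, here is a rough SQL-level sketch of the export/import idea (the worker would go through the C-level ExportSnapshot/ImportSnapshot machinery instead, and the snapshot id below is made up):

    -- connection 1: keep this transaction open for the whole sync
    BEGIN ISOLATION LEVEL REPEATABLE READ;
    SELECT pg_export_snapshot();  -- returns e.g. '00000003-0000001B-1'

    -- connection 2: copy every table under that one exported snapshot
    BEGIN ISOLATION LEVEL REPEATABLE READ;
    SET TRANSACTION SNAPSHOT '00000003-0000001B-1';
    COPY public.tab TO STDOUT;
    COMMIT;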

I decided to go with approach 3 (creating a new snapshot from an existing replication slot) for now, since it requires fewer changes to the tablesync worker logic than the other options.
To achieve this, the patch introduces a new command in the Streaming Replication Protocol.
The new REPLICATION_SLOT_SNAPSHOT command basically mimics how CREATE_REPLICATION_SLOT builds a snapshot, but without actually creating a new replication slot.
When the tablesync worker decides not to create a new replication slot, it calls this command instead and uses the snapshot it installs and the LSN it returns.
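
With that command, the per-table flow becomes roughly the following (the exact grammar is defined in the attached 0001 patch; the spelling and slot name here are only illustrative):

    -- first table only: create the slot as before, snapshot comes with it
    CREATE_REPLICATION_SLOT "pg_16398_sync_slot" LOGICAL pgoutput (SNAPSHOT 'use');

    -- each subsequent table: reuse the slot, rebuild only the snapshot
    BEGIN READ ONLY ISOLATION LEVEL REPEATABLE READ;
    REPLICATION_SLOT_SNAPSHOT "pg_16398_sync_slot";
    -- returns a fresh LSN and installs a matching snapshot
    COPY public.next_tab TO STDOUT;
    COMMIT;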

Also:
After the changes discussed here [1], concurrent replication origin drops by the apply worker and tablesync workers may block each other due to the locks taken by replorigin_drop_by_name.
I see that this hurts the performance of logical replication quite a bit in terms of speed.
Even though reusing replication origins was discussed earlier in this thread, the patch so far didn't include any change to do so.
The updated version of the patch now reuses replication origins as well. It seems that the origin-reuse change alone already improves performance significantly.
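
At the SQL level, the churn that origin reuse avoids looks roughly like this (the worker actually goes through replorigin_create()/replorigin_drop_by_name() in C; the origin names and LSN below are illustrative):

    -- master: one origin per table sync, dropped when the sync finishes
    SELECT pg_replication_origin_create('pg_16398_16384');
    SELECT pg_replication_origin_drop('pg_16398_16384');  -- the drop is where the lock contention shows up

    -- patch: one origin per worker, simply advanced between tables
    SELECT pg_replication_origin_advance('pg_16398_sync_origin', '0/1693F10');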

Attached are two patches:
0001: Adds the REPLICATION_SLOT_SNAPSHOT command to the replication protocol.
0002: Reuses workers, replication slots, and origins for tablesync.

I would appreciate any feedback/reviews/thoughts on the approach and on both patches.
I will also share some numbers soon comparing the performance of the patch against master.

Best,
--
Melih Mutlu
Microsoft


 