Re: Parallelize stream replication process - Mailing list pgsql-hackers

From Asim Praveen
Subject Re: Parallelize stream replication process
Date
Msg-id 54AB2364-A227-4724-8104-1743F42A8009@vmware.com
In response to Re: Parallelize stream replication process  (Li Japin <japinli@hotmail.com>)
List pgsql-hackers

> On 16-Sep-2020, at 8:32 AM, Li Japin <japinli@hotmail.com> wrote:
>
> Thanks for clarifying the questions!
>
>> On Sep 15, 2020, at 12:41 PM, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
>>
>> I think we must ask a few questions:
>>
>> 1. What's the major gain we get out of this? Is it that the time to
>> stream gets reduced or something else?
>
> I think that when the database fails over, parallel stream replication might shorten the recovery time.
>
>> If the answer to the above point is something solid, then
>> 2. How do we distribute the work to multiple processes?
>> 3. Do we need all of the workers to maintain the order in which they
>> read WAL files (on the publisher) and apply the changes (on the
>> subscriber)?
>> 4. Do we want to map the sender/publisher workers to
>> receiver/subscriber workers on a one-to-one basis? If not, how do we
>> do it?
>> 5. How do sender and receiver workers communicate?
>> 6. What if we have multiple subscribers/receivers?
>>
>> I'm no expert in replication, I may be wrong as well. Others may have
>> better thoughts.
>>
>
> Maybe we can distribute the work to multiple processes according to the WAL record type.
>
> As a first step, I think we can parallelize the replay process. We can classify the WAL records by
> record type or RmgrId, and then replay those classes in parallel where possible.
>

This is a rather hard problem to solve, mainly because the (partial)
order inherent in the WAL stream must be preserved when distributing
subsets of WAL records for parallel replay.  The order can be
characterised as follows:

(1) All records emitted by a transaction must be replayed before
replaying the commit record emitted by that transaction.

(2) Commit records emitted by different transactions must be replayed
in the order in which they appear in the WAL stream.
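
To make the two rules concrete, here is a minimal Python sketch that
checks whether a proposed replay order respects them.  It models only
rules (1) and (2) as stated above, nothing else about real WAL replay.
The Record type, its fields (lsn, xid, is_commit) and the function
respects_wal_order are hypothetical names chosen for illustration;
they are not PostgreSQL structures or APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Simplified, hypothetical stand-in for a WAL record."""
    lsn: int                 # position in the WAL stream (assumed unique)
    xid: int                 # transaction that emitted the record
    is_commit: bool = False  # True for the transaction's commit record


def respects_wal_order(wal, replay):
    """Return True if `replay` is a reordering of `wal` obeying rules (1) and (2)."""
    # The replay must contain exactly the records of the WAL stream.
    if sorted(r.lsn for r in wal) != sorted(r.lsn for r in replay):
        return False

    # Index of each record (by LSN) within the proposed replay order.
    position = {r.lsn: i for i, r in enumerate(replay)}
    commit_pos = {r.xid: position[r.lsn] for r in wal if r.is_commit}

    # Rule (1): every record of a transaction is replayed before its commit.
    for r in wal:
        if not r.is_commit and r.xid in commit_pos:
            if position[r.lsn] > commit_pos[r.xid]:
                return False

    # Rule (2): commit records are replayed in WAL (ascending LSN) order.
    commits = [r.lsn for r in replay if r.is_commit]
    return commits == sorted(commits)


# Two interleaved transactions, xid 10 and xid 20.
wal = [
    Record(1, 10),
    Record(2, 20),
    Record(3, 10, is_commit=True),
    Record(4, 20),
    Record(5, 20, is_commit=True),
]
by_lsn = {r.lsn: r for r in wal}

# Non-commit records may be reordered across transactions.
ok = [by_lsn[n] for n in (2, 4, 1, 3, 5)]
# Commit of xid 20 replayed before commit of xid 10: violates rule (2).
bad_commit_order = [by_lsn[n] for n in (2, 4, 5, 1, 3)]
# Commit of xid 10 replayed before its own record: violates rule (1).
bad_early_commit = [by_lsn[n] for n in (3, 1, 2, 4, 5)]
```

In this sketch the first schedule is accepted and the other two are
rejected, which shows how little freedom the partial order leaves: any
scheme that hands records to parallel workers still has to serialise
on commit records.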

Asim

