Re: logical decoding and replication of sequences - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: logical decoding and replication of sequences
Date
Msg-id aeb2ba8d-e6f4-5486-cc4c-0d4982c291cb@enterprisedb.com
In response to Re: logical decoding and replication of sequences  (Petr Jelinek <petr.jelinek@enterprisedb.com>)
Responses Re: logical decoding and replication of sequences  (Greg Stark <stark@mit.edu>)
Re: logical decoding and replication of sequences  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On 3/23/22 13:46, Petr Jelinek wrote:
> 
>> On 23. 3. 2022, at 12:50, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Tue, Mar 22, 2022 at 5:41 PM Petr Jelinek
>> <petr.jelinek@enterprisedb.com> wrote:
>>>
>>>> On 22. 3. 2022, at 13:09, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>>>
>>>> On Mon, Mar 21, 2022 at 4:25 AM Tomas Vondra
>>>> <tomas.vondra@enterprisedb.com> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Attached is a rebased patch, addressing most of the remaining issues.
>>>>>
>>>>
>>>> It appears that on the apply side, the patch always creates a new
>>>> relfilenode irrespective of whether the sequence message is
>>>> transactional or not. Is it required to create a new relfilenode for
>>>> non-transactional messages? If not that could be costly?
>>>>
>>>
>>>
>>> That's a good catch, I think we should just write the page in the
>>> non-transactional case, no need to mess with relnodes.
>>>
>>
>> What if the current node has also incremented from the existing
>> sequence? Basically, how will we deal with conflicts? It seems we will
>> overwrite the actions done on the existing node which means sequence
>> values can go back.
>>
> 
> 
> I think this is perfectly acceptable behavior, we are replicating state
> from upstream, not reconciling state on downstream.
> 
> You can't really use the builtin sequences to implement distributed
> sequence via replication. If user wants to write to both nodes they
> should not replicate the sequence value and instead offset the sequence
> on each node so they produce different ranges, that's quite common
> approach. One day we might want revisit adding support for custom
> sequence AMs.
> 

Exactly. Moreover, it's about the same behavior as when you update table
data on the subscriber, and then a replicated UPDATE overwrites the
local change.
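
To illustrate the "offset the sequence on each node" approach Petr
mentions, here is a minimal sketch in Python. It is only a toy model of
two independent sequences; the start and increment values are
assumptions picked for the example, not anything taken from the patch.

```python
from itertools import islice


def node_sequence(start, increment):
    """Yield values like a sequence created with a given start and increment (toy model)."""
    value = start
    while True:
        yield value
        value += increment


# Hypothetical two-node setup: both nodes increment by 2, but start at
# different values, so each node hands out its own range.
node_a = list(islice(node_sequence(start=1, increment=2), 5))  # odd values
node_b = list(islice(node_sequence(start=2, increment=2), 5))  # even values

# The ranges never overlap, so neither node needs the other's sequence state.
assert set(node_a).isdisjoint(node_b)
```

With such a setup both nodes can accept writes, and the sequence itself
is simply left out of the replication.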

Attached is a patch fixing the relfilenode issue - now we only allocate
a new relfilenode in the transactional case, and do an in-place update
(similar to setval()) otherwise. Thanks for noticing this.
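
The apply-side rule described above can be sketched as a toy model in
Python. This is not the patch's code: ToySequence,
apply_sequence_change and the relfilenode counter are hypothetical
names used only to show the decision, under the assumption that the
replicated state always replaces the local one.

```python
from dataclasses import dataclass
from itertools import count

# Arbitrary starting number for the toy model's relfilenode allocator.
_next_relfilenode = count(16384)


@dataclass
class ToySequence:
    relfilenode: int
    last_value: int


def apply_sequence_change(seq, new_value, transactional):
    """Apply a replicated sequence state to the local (subscriber) copy."""
    if transactional:
        # Transactional change: the new state goes into fresh storage.
        seq.relfilenode = next(_next_relfilenode)
    # In both cases the upstream state replaces the local one, even if
    # the local value was higher; there is no conflict resolution.
    seq.last_value = new_value
    return seq


seq = ToySequence(relfilenode=next(_next_relfilenode), last_value=100)
original_node = seq.relfilenode

# Non-transactional change: updated in place, storage is kept.
apply_sequence_change(seq, 50, transactional=False)
in_place_node = seq.relfilenode

# Transactional change: a new relfilenode is allocated.
apply_sequence_change(seq, 60, transactional=True)
```

Note that the first call moves the local value from 100 back to 50,
which is the "sequence values can go back" effect discussed earlier in
the thread.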

> 
>> * Currently, the patch uses one sync worker per sequence. It seems to
>> be a waste of resources considering apart from one additional process,
>> we need origin/slot to sync each sequence.
>>
> 
> 
> This is indeed wasteful but not something that I'd consider blocker for
> the patch personally.
> 

Right, and the same argument can be made for tablesync of tiny tables
(which a sequence essentially is). I'm sure there are ways to improve
this, but that can be done later.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company