Re: In logical replication concurrent update of partition key creates a duplicate record on standby. - Mailing list pgsql-hackers

From: amul sul
Subject: Re: In logical replication concurrent update of partition key creates a duplicate record on standby.
Date:
Msg-id: CAAJ_b96ehpQOnwNakOedeGuEUCmjOYJipDbhmBdy34PeypjUrg@mail.gmail.com
In response to: Re: In logical replication concurrent update of partition key creates a duplicate record on standby.  (Amit Kapila <amit.kapila16@gmail.com>)
Responses: Re: In logical replication concurrent update of partition key creates a duplicate record on standby.  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
List: pgsql-hackers
On Wed, Feb 7, 2018 at 6:00 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Wed, Feb 7, 2018 at 3:42 PM, amul sul <sulamul@gmail.com> wrote:
>> On Wed, Feb 7, 2018 at 3:03 PM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
>>> On 7 February 2018 at 13:53, amul sul <sulamul@gmail.com> wrote:
>>>> Hi,
>>>>
>>>> If an update of the partition key involves tuple movement from one
>>>> partition to another, it is carried out as a separate delete on the
>>>> source partition plus an insert on the destination partition.
>>>>
>>>> In logical replication, if an update is performed on the master and the
>>>> standby at the same moment, the replication worker tries to replicate the
>>>> delete + insert operations on the standby. While replaying the master's
>>>> changes on the standby, the worker logs a "concurrent update, retrying"
>>>> message for the delete (because the update on the standby has already
>>>> deleted the tuple) and moves on to replay the next insert operation. The
>>>> standby's own update also performed the same delete + insert as part of
>>>> its partition-key update, with the result that two records end up
>>>> inserted on the standby.
>>>
>>> A first thought on how to resolve this makes me wonder if we can
>>> manage to pass some information through logical decoding indicating that
>>> the delete is part of a partition-key update. This is analogous to how we
>>> set some information locally in the tuple by setting
>>> tp.t_data->t_ctid.ip_blkid to InvalidBlockNumber.
>>>
>>
>> +1,
>>
>
> I also mentioned the same thing in the other thread [1], but I think
> that alone won't solve the dual-record problem you are seeing.  I
> think we also need to do something for the next insert, as you are suggesting.
>
>> Also, if the worker fails to replay the delete operation on the standby,
>> then we need to decide what the next step will be: should we skip the
>> following insert operation, error out, or something else?
>>
>
> That would be tricky; do you see any simple way of doing either of those?
>

Not really. As in ExecUpdate for an update of the partition key, if the delete
fails then the subsequent insert could be skipped; but you are correct, it might
be trickier than I think -- there is no guarantee that the next insert operation
the replication worker is trying to replicate is part of the partition-key
update mechanism.  How can one identify that an insert operation on one relation
is related to a previous delete operation on some other relation?

Regards,
Amul

