Re: logical decoding and replication of sequences, take 2 - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: logical decoding and replication of sequences, take 2
Date
Msg-id 20f4bd55-8c0f-ae60-abd9-6dc70289da7a@enterprisedb.com
Whole thread Raw
In response to Re: logical decoding and replication of sequences, take 2  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
List pgsql-hackers

On 9/13/23 15:18, Ashutosh Bapat wrote:
> On Fri, Aug 18, 2023 at 4:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Fri, Aug 18, 2023 at 10:37 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>>
>>> On Thu, Aug 17, 2023 at 7:13 PM Ashutosh Bapat
>>> <ashutosh.bapat.oss@gmail.com> wrote:
>>>>
>>>> On Wed, Aug 16, 2023 at 7:56 PM Tomas Vondra
>>>> <tomas.vondra@enterprisedb.com> wrote:
>>>>>
>>>>>>
>>>>>> But whether or not that's the case, downstream should not request (and
>>>>>> hence receive) any changes that have been already applied (and
>>>>>> committed) downstream as a principle. I think a way to achieve this is
>>>>>> to update the replorigin_session_origin_lsn so that a sequence change
>>>>>> applied once is not requested (and hence sent) again.
>>>>>>
>>>>>
>>>>> I guess we could update the origin, per attached 0004. We don't have
>>>>> timestamp to set replorigin_session_origin_timestamp, but it seems we
>>>>> don't need that.
>>>>>
>>>>> The attached patch merges the earlier improvements, except for the part
>>>>> that experimented with adding a "fake" transaction (which turned out to
>>>>> have a number of difficult issues).
>>>>
>>>> 0004 looks good to me.
>>>
>>>
>>> + {
>>>   CommitTransactionCommand();
>>> +
>>> + /*
>>> + * Update origin state so we don't try applying this sequence
>>> + * change in case of crash.
>>> + *
>>> + * XXX We don't have replorigin_session_origin_timestamp, but we
>>> + * can just leave that set to 0.
>>> + */
>>> + replorigin_session_origin_lsn = seq.lsn;
>>>
>>> IIUC, your proposal is to update the replorigin_session_origin_lsn, so
>>> that after restart, it doesn't use some prior origin LSN to start with
>>> which can in turn lead the sequence to go backward. If so, it should
>>> be updated before calling CommitTransactionCommand() as we are doing
>>> in apply_handle_commit_internal(). If that is not the intention then
>>> it is not clear to me how updating replorigin_session_origin_lsn after
>>> commit is helpful.
>>>
>>
>> typedef struct ReplicationState
>> {
>> ...
>>     /*
>>      * Location of the latest commit from the remote side.
>>      */
>>     XLogRecPtr    remote_lsn;
>>
>> This is the variable that will be updated with the value of
>> replorigin_session_origin_lsn. This means we will now track some
>> arbitrary LSN location of the remote side in this variable. The above
>> comment makes me wonder if there is anything we are missing or if it
>> is just a matter of updating this comment because before the patch we
>> always adhere to what is written in the comment.
> 
> I don't think we are missing anything. This value is used to track the
> remote LSN upto which all the commits from upstream have been applied
> locally. Since a non-transactional sequence change is like a single
> WAL record transaction, it's LSN acts as the LSN of the mini-commit.
> So it should be fine to update remote_lsn with sequence WAL record's
> end LSN. That's what the patches do. I don't see any hazard. But you
> are right, we need to update comments. Here and also at other places
> like
> replorigin_session_advance() which uses remote_commit as name of the
> argument which gets assigned to ReplicationState::remote_lsn.
> 

I agree - updating the replorigin_session_origin_lsn shouldn't break
anything. As you write, it's essentially a "mini-commit" and the commit
order remains the same.

I'm not sure about resetting replorigin_session_origin_timestamp to 0
though. It's not something we rely on very much (it may not correlated
with the commit order etc.). But why should we set it to 0? We don't do
that for regular commits, right? And IMO it makes sense to just use the
timestamp of the last commit before the sequence change.

FWIW I've left this in a separate commit, but I'll merge that into 0002
in the next patch version.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: logical decoding and replication of sequences, take 2
Next
From: Tomas Vondra
Date:
Subject: Re: logical decoding and replication of sequences, take 2