Re: [HACKERS] logical decoding of two-phase transactions - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [HACKERS] logical decoding of two-phase transactions
Date
Msg-id CAA4eK1LyzdpUo6A_LbxqoAu_aT-y5DpXi1xt9oCy_DT6CTd18w@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] logical decoding of two-phase transactions  (Ajin Cherian <itsajin@gmail.com>)
Responses Re: [HACKERS] logical decoding of two-phase transactions
List pgsql-hackers
On Wed, Nov 11, 2020 at 4:30 PM Ajin Cherian <itsajin@gmail.com> wrote:
>
> Did some further tests on the problem I saw and I see that it does not
> have anything to do with this patch. I picked code from top of head.
> If I have enough changes in a transaction to initiate streaming, then
> it will also stream the same changes after a commit.
>
> BEGIN;
> INSERT INTO stream_test SELECT repeat('a', 10) || g.i FROM
> generate_series(1,800) g(i);
> SELECT data FROM pg_logical_slot_get_changes('regression_slot',
> NULL,NULL, 'two-phase-commit', '1', 'include-xids', '0',
> 'skip-empty-xacts', '1', 'stream-changes', '1');
> ** see streamed output here **
> END;
> SELECT data FROM pg_logical_slot_get_changes('regression_slot',
> NULL,NULL, 'two-phase-commit', '1', 'include-xids', '0',
> 'skip-empty-xacts', '1', 'stream-changes', '1');
> ** see the same streamed output here **
>
> I think this is because since the transaction has not been committed,
> SnapBuildCommitTxn is not called which is what moves the
> "builder->start_decoding_at", and as a result
> later calls to pg_logical_slot_get_changes will start from the
> previous lsn.
>

No, we always move start_decoding_at after streaming changes. It will
be moved because we have advanced the confirmed_flush location after
streaming all the changes (via LogicalConfirmReceivedLocation()) which
will be used to set 'start_decoding_at' when we create decoding
context (CreateDecodingContext) next time. However, we don't advance
'restart_lsn' due to which it start from the same point and accumulate
all changes for transaction each time. Now, after Commit we get an
extra record which is ahead of 'start_decoding_at' and we try to
decode it, it will get all the changes of the transaction. It might be
that we update the documentation for pg_logical_slot_get_changes() to
indicate the same but I don't think this is a problem.

> I did do a quick test in pgoutput using pub/sub and I
> dont see duplication of data there but I haven't
> verified exactly what happens.
>

Yeah, because we always move ahead for WAL locations in that unless
the subscriber/publisher is restarted in which case it should start
from the required location. But still, we can try to see if there is
any bug.

-- 
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Ajin Cherian
Date:
Subject: Re: [HACKERS] logical decoding of two-phase transactions
Next
From: Antonin Houska
Date:
Subject: Re: POC: Cleaning up orphaned files using undo logs