Re: [HACKERS] logical decoding of two-phase transactions - Mailing list pgsql-hackers

From Ajin Cherian
Subject Re: [HACKERS] logical decoding of two-phase transactions
Msg-id CAFPTHDa8aE255wRQAzjhb=rUZaVpfWd6JHgeVpP8Gi_zV132wg@mail.gmail.com
In response to Re: [HACKERS] logical decoding of two-phase transactions  (Peter Smith <smithpb2250@gmail.com>)
Responses Re: [HACKERS] logical decoding of two-phase transactions  (Ajin Cherian <itsajin@gmail.com>)
Re: [HACKERS] logical decoding of two-phase transactions  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
I was doing some testing and found two issues. The first seems to be
behaviour that might be acceptable; the second, not so much.
I was using test_decoding; I am not sure how this might behave with the
pgoutput plugin.

Test 1:
A transaction that is rolled back immediately after the prepare.

SET synchronous_commit = on;
SELECT 'init' FROM
pg_create_logical_replication_slot('regression_slot',
'test_decoding');
CREATE TABLE stream_test(data text);
-- consume DDL
SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL,
NULL, 'include-xids', '0', 'skip-empty-xacts', '1');

BEGIN;
INSERT INTO stream_test SELECT repeat('a', 10) || g.i FROM
generate_series(1, 20) g(i);
PREPARE TRANSACTION 'test1';
ROLLBACK PREPARED 'test1';
SELECT data FROM pg_logical_slot_get_changes('regression_slot',
NULL,NULL, 'two-phase-commit', '1', 'include-xids', '0',
'skip-empty-xacts', '1', 'stream-changes', '1');
==================

Here, what is seen is that while the transaction itself was not decoded
at all, since it was rolled back before it could get decoded, the
ROLLBACK PREPARED is actually decoded.
The result is that the standby could get a spurious ROLLBACK PREPARED.
The current code in worker.c does handle this silently, so this might
not be an issue.
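
To look at this without consuming the changes, the final query of Test 1
can be swapped for its peek variant. This is only a sketch: it assumes
the same slot and options as in Test 1, and the LIKE filter assumes that
test_decoding prints the GID on the ROLLBACK PREPARED line, which should
be checked against the patch.

```sql
-- Sketch: run in place of the final pg_logical_slot_get_changes call in Test 1.
-- pg_logical_slot_peek_changes returns the same rows but leaves them in the slot,
-- so the query can be repeated.
SELECT data
FROM pg_logical_slot_peek_changes('regression_slot', NULL, NULL,
     'two-phase-commit', '1', 'include-xids', '0',
     'skip-empty-xacts', '1', 'stream-changes', '1')
WHERE data LIKE '%test1%';  -- assumption: the GID appears in the output line
```

If the changes were already consumed by a get call, the peek returns
nothing, so the order matters.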

Test 2:
A transaction that is partially streamed and then prepared.

BEGIN;
INSERT INTO stream_test SELECT repeat('a', 10) || g.i FROM
generate_series(1,800) g(i);
SELECT data FROM pg_logical_slot_get_changes('regression_slot',
NULL,NULL, 'two-phase-commit', '1', 'include-xids', '0',
'skip-empty-xacts', '1', 'stream-changes', '1');
SELECT data FROM pg_logical_slot_get_changes('regression_slot',
NULL,NULL, 'two-phase-commit', '1', 'include-xids', '0',
'skip-empty-xacts', '1', 'stream-changes', '1');
PREPARE TRANSACTION 'test1';
SELECT data FROM pg_logical_slot_get_changes('regression_slot',
NULL,NULL, 'two-phase-commit', '1', 'include-xids', '0',
'skip-empty-xacts', '1', 'stream-changes', '1');
ROLLBACK PREPARED 'test1';
==========================

Here, what is seen is that the transaction is streamed twice: first
when it crosses the memory threshold (usually only in the 2nd
pg_logical_slot_get_changes call), and then again after the prepare.
This cannot be right, as it would result in duplication of data on the
standby.
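
For anyone reproducing Test 2: the memory threshold is presumably
logical_decoding_work_mem. The value used in the run above is not
stated, so the setup below is an assumption, not the original
configuration. With the default of 64MB, 800 short rows would be
unlikely to trigger streaming, so a low limit is assumed.

```sql
-- Assumed session setup, not taken from the original test run.
SHOW logical_decoding_work_mem;          -- default is 64MB
SET logical_decoding_work_mem = '64kB';  -- minimum allowed value; makes streaming start early

-- PREPARE TRANSACTION also needs max_prepared_transactions > 0
-- (changing it requires a server restart).
SHOW max_prepared_transactions;
```

The SET applies to the session that calls pg_logical_slot_get_changes,
since the decoding runs in that backend.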

I will debug the second issue and try to arrive at a fix.

regards,
Ajin Cherian
Fujitsu Australia.

On Tue, Nov 10, 2020 at 4:47 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> FYI - I have cross-checked all the v18 patch code against the v18 code
> coverage [1] resulting from running the tests.
>
> The purpose of this study was to identify where there may be any gaps
> in the testing of this patch - e.g., is there some v18 code not
> currently getting executed by the tests?
>
> I found almost all of the normal (not error) code paths are getting executed.
>
> For details, please see the attached study results (MS Excel file).
>
> ===
>
> [1] https://www.postgresql.org/message-id/CAHut%2BPu4BpUr0GfCLqJjXc%3DDcaKSvjDarSN89-4W2nxBeae9hQ%40mail.gmail.com
>
> Kind Regards,
> Peter Smith.
> Fujitsu Australia


