Re: logical replication empty transactions - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: logical replication empty transactions
Msg-id CAA4eK1JFAprzsQc6JjGfLEn9+qqp_eDCUUpLtCMrhHP2fwXOmw@mail.gmail.com
In response to RE: logical replication empty transactions  ("osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com>)
List pgsql-hackers
On Wed, Feb 16, 2022 at 8:45 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

[ideas to skip empty prepare/commit_prepare ....]

>
> I feel if we don't want to change the protocol of commit_prepared,
> we need to make the publisher solely judge whether the prepare was empty or not,
> after the restart.
>
> One idea I thought at the beginning was to utilize and apply
> the existing mechanism to spill ReorderBufferSerializeTXN object to local disk,
> by postponing the prepare txn object cleanup and when the walsender exits
> and commit prepared didn't come, spilling the transaction's data,
> then restoring it after the restart in the DecodePrepare.
> However, this idea wasn't crash-safe fundamentally. It means,
> if the publisher crashes before spilling the empty prepare transaction,
> we fail to detect the prepare was empty and come down to send the commit_prepared
> in the situation where the subscriber didn't get the prepare data again.
> So, I thought to utilize the spill mechanism didn't work for this purpose.
>
> Another idea would be to create an empty file under the pg_replslot/slotname
> with a prefix different from "xid"  in the DecodePrepare before the shutdown
> if the prepare was empty, and bypass the cleanup of the serialized txns
> and check the existence after the restart. But, this is pretty ad-hoc and I wasn't sure
> if to address the corner case of the restart has the strong enough justification
> to create this new file format.
>

I think for this idea to work, you need to create such an empty file
each time we skip an empty prepare, as the system might crash right
after the prepare and we won't get time to create the file at
shutdown. I don't think it is advisable to do I/O just to save a
network message.

> Therefore, in my humble opinion, the idea of protocol change slightly wins,
> since the impact of the protocol change would not be big. We introduced
> the protocol version 3 in the devel version and the number of users should be little.
>

There is also the cost of the additional check (whether the prepared
xact exists) at the time of processing each commit prepared message. I
think if we want to go in this direction then it is better to do it
via a subscription parameter (say skip_empty_prepare_xact or something
like that) so that we pay the additional cost of such a check only
when the user has set that parameter. I feel for now we can document
in comments why we can't skip empty prepared transactions, and maybe
mention these ideas as worth exploring to implement that. OTOH, if
multiple people agree on such a solution, we can even try to implement
it and see if that works.
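
To make the cost argument concrete, here is a rough standalone sketch
of the apply-side logic I have in mind. It is only a toy model and not
PostgreSQL code: the Subscriber struct, apply_prepare(),
apply_commit_prepared() and the extra_checks counter are all made-up
names, and skip_empty_prepare_xact is just the tentative parameter
name from above. The point is that the existence check is paid only
when the parameter is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: none of these names are PostgreSQL APIs. */
#define MAX_PREPARED 8
#define GID_LEN 64

typedef enum { APPLY_COMMITTED, APPLY_SKIPPED, APPLY_ERROR } ApplyResult;

typedef struct Subscriber
{
    bool skip_empty_prepare_xact;          /* hypothetical subscription parameter */
    char prepared[MAX_PREPARED][GID_LEN];  /* GIDs of prepared xacts on the subscriber */
    int  nprepared;
    int  extra_checks;                     /* existence checks paid only for skipping */
} Subscriber;

/* Return the index of gid in the prepared list, or -1 if it is absent. */
static int
locate_prepared(const Subscriber *sub, const char *gid)
{
    for (int i = 0; i < sub->nprepared; i++)
        if (strcmp(sub->prepared[i], gid) == 0)
            return i;
    return -1;
}

/* Record a non-empty prepare received from the publisher. */
bool
apply_prepare(Subscriber *sub, const char *gid)
{
    assert(sub != NULL && gid != NULL);
    if (sub->nprepared >= MAX_PREPARED || strlen(gid) >= GID_LEN)
        return false;
    strcpy(sub->prepared[sub->nprepared++], gid);
    return true;
}

/* Handle a commit prepared message for gid. */
ApplyResult
apply_commit_prepared(Subscriber *sub, const char *gid)
{
    int idx;

    assert(sub != NULL && gid != NULL);

    if (sub->skip_empty_prepare_xact)
    {
        /* Extra lookup: the publisher may have skipped an empty prepare. */
        sub->extra_checks++;
        if (locate_prepared(sub, gid) < 0)
            return APPLY_SKIPPED;
    }

    /* Normal path: committing a prepared xact that does not exist is an error. */
    idx = locate_prepared(sub, gid);
    if (idx < 0)
        return APPLY_ERROR;

    /* Remove the entry by moving the last one into its slot. */
    sub->nprepared--;
    if (idx != sub->nprepared)
        memcpy(sub->prepared[idx], sub->prepared[sub->nprepared], GID_LEN);
    return APPLY_COMMITTED;
}
```

With the parameter off, a commit prepared for an unknown GID fails as
it does today and no extra lookup is counted; with it on, the same
message is silently skipped at the price of one additional check per
commit prepared.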

-- 
With Regards,
Amit Kapila.


