RE: logical replication empty transactions - Mailing list pgsql-hackers
From | osumi.takamichi@fujitsu.com |
---|---|
Subject | RE: logical replication empty transactions |
Date | |
Msg-id | TYCPR01MB837337FF0019ED0AAA7A56D9ED359@TYCPR01MB8373.jpnprd01.prod.outlook.com |
In response to | Re: logical replication empty transactions (Ajin Cherian <itsajin@gmail.com>) |
Responses |
Re: logical replication empty transactions
Re: logical replication empty transactions |
List | pgsql-hackers |
Hi,

I'll quote one other remaining discussion of this thread again to attract more attention from the community.

On Friday, August 13, 2021 8:01 PM Ajin Cherian <itsajin@gmail.com> wrote:
> On Mon, Aug 2, 2021 at 7:20 PM Amit Kapila <amit.kapila16@gmail.com>
> wrote:
> > Few other miscellaneous comments:
> > 1.
> > static void
> > pgoutput_commit_prepared_txn(LogicalDecodingContext *ctx,
> > ReorderBufferTXN *txn,
> > - XLogRecPtr commit_lsn)
> > + XLogRecPtr commit_lsn, XLogRecPtr prepare_end_lsn, TimestampTz
> > + prepare_time)
> > {
> > + PGOutputTxnData *txndata = (PGOutputTxnData *) txn->output_plugin_private;
> > +
> > OutputPluginUpdateProgress(ctx);
> >
> > + /*
> > + * If the BEGIN PREPARE was not yet sent, then it means there were no
> > + * relevant changes encountered, so we can skip the COMMIT PREPARED
> > + * message too.
> > + */
> > + if (txndata)
> > + {
> > + bool skip = !txndata->sent_begin_txn; pfree(txndata);
> > + txn->output_plugin_private = NULL;
> >
> > How is this supposed to work after the restart when prepared is sent
> > before the restart and we are just sending commit_prepared after
> > restart? Won't this lead to sending commit_prepared even when the
> > corresponding prepare is not sent? Can we think of a better way to
> > deal with this?
> >
>
> I have tried to resolve this by adding logic in worker.c to silently ignore spurious
> commit_prepareds. But this change required checking if the prepare exists on
> the subscriber before attempting the commit_prepared but the current API that
> checks this requires prepare time and transaction end_lsn. But for this I had to
> change the protocol of commit_prepared, and I understand that this would
> break backward compatibility between subscriber and publisher (you have
> raised this issue as well).
> I am not sure how else to handle this, let me know if you have any other ideas.
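As a rough illustration of the subscriber-side check described in the quoted message, the following is a minimal, self-contained sketch and not the actual worker.c code. It models the point that the lookup needs the prepare end LSN and the prepare time in addition to the GID, which is why the commit_prepared message would have to carry them. All names here (DemoLsn, DemoTimestamp, DemoPreparedXact, demo_lookup_prepared, demo_should_apply_commit_prepared) are hypothetical stand-ins, not PostgreSQL identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for an LSN and a timestamp; not PostgreSQL types. */
typedef uint64_t DemoLsn;
typedef int64_t DemoTimestamp;

/* One prepared transaction known to the modelled subscriber. */
typedef struct DemoPreparedXact
{
    const char   *gid;
    DemoLsn       prepare_end_lsn;
    DemoTimestamp prepare_time;
} DemoPreparedXact;

/*
 * Return true only if a prepared transaction matches all three keys.
 * The GID alone is not treated as sufficient in this model.
 */
bool
demo_lookup_prepared(const DemoPreparedXact *xacts, size_t n,
                     const char *gid, DemoLsn prepare_end_lsn,
                     DemoTimestamp prepare_time)
{
    assert(gid != NULL);

    for (size_t i = 0; i < n; i++)
    {
        if (strcmp(xacts[i].gid, gid) == 0 &&
            xacts[i].prepare_end_lsn == prepare_end_lsn &&
            xacts[i].prepare_time == prepare_time)
            return true;
    }
    return false;
}

/*
 * Decide what to do with an incoming COMMIT PREPARED: apply it when the
 * matching prepare exists, otherwise ignore it silently as spurious.
 */
bool
demo_should_apply_commit_prepared(const DemoPreparedXact *xacts, size_t n,
                                  const char *gid, DemoLsn prepare_end_lsn,
                                  DemoTimestamp prepare_time)
{
    return demo_lookup_prepared(xacts, n, gid, prepare_end_lsn, prepare_time);
}
```

In this model, a commit_prepared for a transaction whose prepare was skipped as empty finds no matching entry and is dropped, at the cost of sending two extra fields in the message.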
I feel that if we don't want to change the protocol of commit_prepared, we need to make the publisher alone judge, after the restart, whether the prepare was empty or not.

One idea I had at the beginning was to reuse the existing mechanism that spills a transaction to local disk (ReorderBufferSerializeTXN): postpone the cleanup of the prepared txn object, spill the transaction's data when the walsender exits without the commit prepared having arrived, and then restore it after the restart in DecodePrepare. However, this idea is fundamentally not crash-safe. If the publisher crashes before spilling the empty prepared transaction, we fail to detect that the prepare was empty and again end up sending the commit_prepared in a situation where the subscriber didn't get the prepare data. So I concluded that the spill mechanism doesn't work for this purpose.

Another idea would be to create an empty file under pg_replslot/slotname, with a prefix different from "xid", in DecodePrepare before the shutdown if the prepare was empty, bypass the cleanup of the serialized txns, and check the file's existence after the restart. But this is pretty ad hoc, and I wasn't sure whether addressing the corner case of a restart is a strong enough justification for creating this new file format.

Therefore, in my humble opinion, the idea of the protocol change slightly wins, since the impact of the protocol change would not be big. We introduced protocol version 3 in the development version, so the number of users should be small.

Best Regards,
Takamichi Osumi
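For readers who want the marker-file idea spelled out, here is a minimal sketch in plain standard C, assuming a per-slot directory that already exists. It is not PostgreSQL code: the "emptyprep-" prefix, the ".marker" suffix, and every function name are hypothetical, and a real implementation would also need to fsync the file and its directory, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define DEMO_PATH_MAX 1024

/*
 * Build "<slot_dir>/emptyprep-<xid>.marker". The prefix is hypothetical and
 * only needs to differ from "xid" so that a cleanup pass keyed on that
 * prefix would leave the file alone. Returns false if the path is truncated.
 */
bool
demo_marker_path(char *buf, size_t buflen, const char *slot_dir, uint32_t xid)
{
    int n;

    assert(slot_dir != NULL);
    n = snprintf(buf, buflen, "%s/emptyprep-%u.marker", slot_dir, (unsigned) xid);
    return n > 0 && (size_t) n < buflen;
}

/* Record that the prepare for this xid was empty by creating an empty file. */
bool
demo_mark_empty_prepare(const char *slot_dir, uint32_t xid)
{
    char  path[DEMO_PATH_MAX];
    FILE *fp;

    if (!demo_marker_path(path, sizeof(path), slot_dir, xid))
        return false;
    fp = fopen(path, "w");
    if (fp == NULL)
        return false;
    return fclose(fp) == 0;
}

/* After a restart: was the prepare for this xid recorded as empty? */
bool
demo_prepare_was_empty(const char *slot_dir, uint32_t xid)
{
    char  path[DEMO_PATH_MAX];
    FILE *fp;

    if (!demo_marker_path(path, sizeof(path), slot_dir, xid))
        return false;
    fp = fopen(path, "r");
    if (fp == NULL)
        return false;
    fclose(fp);
    return true;
}

/* Remove the marker once the commit prepared has been handled. */
bool
demo_clear_marker(const char *slot_dir, uint32_t xid)
{
    char path[DEMO_PATH_MAX];

    if (!demo_marker_path(path, sizeof(path), slot_dir, xid))
        return false;
    return remove(path) == 0;
}
```

The sketch also shows why the approach feels ad hoc: the only information carried is the file's existence, yet it introduces a new on-disk naming convention with its own creation, lookup, and cleanup rules.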