Re: logical replication empty transactions - Mailing list pgsql-hackers

From Ajin Cherian
Subject Re: logical replication empty transactions
Date
Msg-id CAFPTHDb5JExY2zgx-_6EfBUoeBs5LOxYAH-TT3AxrH8FiSL9UQ@mail.gmail.com
In response to RE: logical replication empty transactions  ("shiy.fnst@fujitsu.com" <shiy.fnst@fujitsu.com>)
List pgsql-hackers
On Wed, Mar 2, 2022 at 1:01 PM shiy.fnst@fujitsu.com
<shiy.fnst@fujitsu.com> wrote:
>
> 4.
> @@ -1617,9 +1829,21 @@ pgoutput_stream_prepare_txn(LogicalDecodingContext *ctx,
>                                                         ReorderBufferTXN *txn,
>                                                         XLogRecPtr prepare_lsn)
>  {
> +       PGOutputTxnData *txndata = txn->output_plugin_private;
> +       bool                    sent_begin_txn = txndata->sent_begin_txn;
> +
>         Assert(rbtxn_is_streamed(txn));
>
> -       OutputPluginUpdateProgress(ctx);
> +       pfree(txndata);
> +       txn->output_plugin_private = NULL;
> +
> +       if (!sent_begin_txn)
> +       {
> +               elog(DEBUG1, "Skipping replication of an empty transaction in stream prepare");
> +               return;
> +       }
> +
> +       OutputPluginUpdateProgress(ctx, false);
>         OutputPluginPrepareWrite(ctx, true);
>         logicalrep_write_stream_prepare(ctx->out, txn, prepare_lsn);
>         OutputPluginWrite(ctx, true);
>
> I notice that the patch skips empty streamed prepared transactions. This
> would cause an error on the subscriber side when the transaction is later
> committed on the publisher side, so I think we'd better not do that.
>
> For example:
> (set logical_decoding_work_mem = 64kB, max_prepared_transactions = 10 in
> postgresql.conf)
>
> -- publisher
> create table test (a int, b text, primary key(a));
> create table test2 (a int, b text, primary key(a));
> create publication pub for table test;
>
> -- subscriber
> create table test (a int, b text, primary key(a));
> create table test2 (a int, b text, primary key(a));
> create subscription sub connection 'dbname=postgres port=5432' publication pub with(two_phase=on, streaming=on);
>
> -- publisher
> begin;
> INSERT INTO test2 SELECT i, md5(i::text) FROM generate_series(1, 1000) s(i);
> prepare transaction 't';
> commit prepared 't';
>
> The error message in subscriber log:
> ERROR:  prepared transaction with identifier "pg_gid_16391_722" does not exist
>

Thanks for the test. I guess this mixed streaming + two-phase case runs
into the same problem we had when skipping two-phase transactions. If
the eventual commit prepared comes after a restart, there is no way of
knowing whether the original transaction was skipped, so we can't know
whether the commit prepared needs to be sent. I tried not skipping the
"stream prepare", but that causes a crash in the apply worker because it
tries to find the non-existent streamed file. We could add logic to
silently ignore a spurious "stream prepare", but that might not be
ideal. Any thoughts on how to address this? Otherwise, we will need to
avoid skipping streamed transactions as well.
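
To make the failure sequence concrete, here is a toy C model of it. This
is only a sketch under stated assumptions: every name in it
(ToySubscriber, ToyTxnState, toy_run_scenario, and so on) is made up for
illustration and is not PostgreSQL code. It also uses the GID 't'
directly, whereas the real subscriber derives its own identifier (such
as "pg_gid_16391_722" in the log above).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Toy model only: none of these names exist in PostgreSQL. */
#define TOY_MAX_GIDS 8
#define TOY_GID_LEN  64

/* Subscriber side: the prepared transactions it currently knows about. */
typedef struct ToySubscriber
{
	char		gids[TOY_MAX_GIDS][TOY_GID_LEN];
	int			ngids;
} ToySubscriber;

/* Publisher side: per-transaction state that lives only in memory. */
typedef struct ToyTxnState
{
	bool		sent_begin_txn;	/* false while the txn looks empty */
} ToyTxnState;

/* Subscriber applies a prepare: remember the GID. */
static void
toy_apply_prepare(ToySubscriber *sub, const char *gid)
{
	if (sub->ngids < TOY_MAX_GIDS)
	{
		snprintf(sub->gids[sub->ngids], TOY_GID_LEN, "%s", gid);
		sub->ngids++;
	}
}

/* Subscriber applies a commit prepared; false models the ERROR case. */
static bool
toy_apply_commit_prepared(ToySubscriber *sub, const char *gid)
{
	for (int i = 0; i < sub->ngids; i++)
	{
		if (strcmp(sub->gids[i], gid) == 0)
		{
			/* Forget the GID by moving the last entry into its slot. */
			if (i != sub->ngids - 1)
				memcpy(sub->gids[i], sub->gids[sub->ngids - 1], TOY_GID_LEN);
			sub->ngids--;
			return true;
		}
	}
	return false;
}

/* Publisher sends the prepare unless it is skipping an empty txn. */
static void
toy_send_prepare(const ToyTxnState *txn, ToySubscriber *sub,
				 const char *gid, bool skip_empty)
{
	if (skip_empty && !txn->sent_begin_txn)
		return;					/* prepare never reaches the subscriber */
	toy_apply_prepare(sub, gid);
}

/*
 * Run "prepare, then commit prepared" for a transaction whose changes
 * were all filtered out (as with table test2 in the example above).
 * Returns whether the subscriber could apply the commit prepared.
 */
bool
toy_run_scenario(bool skip_empty)
{
	ToySubscriber sub = {0};
	ToyTxnState txn = {.sent_begin_txn = false};

	toy_send_prepare(&txn, &sub, "t", skip_empty);

	/*
	 * By now the publisher's per-transaction state is gone (freed at
	 * prepare time, or lost in a restart), so the commit prepared is
	 * sent unconditionally.
	 */
	return toy_apply_commit_prepared(&sub, "t");
}
```

In this model, toy_run_scenario(true) returns false (the subscriber has
no such prepared transaction), while toy_run_scenario(false) returns
true. The model deliberately ignores the separate apply-worker crash on
the missing streamed file.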

regards,
Ajin Cherian
Fujitsu Australia


