Re: logical replication empty transactions - Mailing list pgsql-hackers

From Ajin Cherian
Subject Re: logical replication empty transactions
Date
Msg-id CAFPTHDb8sraUR1qq3-30YwePd6YsMMQEuTLXQMx0K_ncnqih_Q@mail.gmail.com
Whole thread Raw
In response to Re: logical replication empty transactions  (Peter Smith <smithpb2250@gmail.com>)
Responses Re: logical replication empty transactions  (Peter Smith <smithpb2250@gmail.com>)
Re: logical replication empty transactions  (Greg Nancarrow <gregn4422@gmail.com>)
List pgsql-hackers
On Thu, Jul 22, 2021 at 6:11 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Ajin.
>
> I have reviewed the v8 patch and my feedback comments are below:
>
> //////////
>
> 1. Apply v8 gave multiple whitespace warnings.
>
> ------
>
> 2. Commit comment - wording
>
> If (when processing a COMMIT / PREPARE message) we find there had been
> no other change for that transaction, then do not send the COMMIT /
> PREPARE message. This means that pgoutput will skip BEGIN / COMMIT
> or BEGIN PREPARE / PREPARE  messages for transactions that are empty.
>
> =>
>
> Shouldn't this also mention some other messages that may be skipped?
> - COMMIT PREPARED
> - ROLLBACK PREPARED
>

Updated.

> ------
>
> 3. doc/src/sgml/logicaldecoding.sgml - wording
>
> @@ -884,11 +884,20 @@ typedef void (*LogicalDecodePrepareCB) (struct
> LogicalDecodingContext *ctx,
>        The required <function>commit_prepared_cb</function> callback is called
>        whenever a transaction <command>COMMIT PREPARED</command> has
> been decoded.
>        The <parameter>gid</parameter> field, which is part of the
> -      <parameter>txn</parameter> parameter, can be used in this callback.
> +      <parameter>txn</parameter> parameter, can be used in this callback. The
> +      parameters <parameter>prepare_end_lsn</parameter> and
> +      <parameter>prepare_time</parameter> can be used to check if the plugin
> +      has received this <command>PREPARE TRANSACTION</command> command or not.
> +      If yes, it can commit the transaction, otherwise, it can skip the commit.
> +      The <parameter>gid</parameter> alone is not sufficient to determine this
> +      because the downstream may already have a prepared transaction with the
> +      same identifier.
>
> =>
>
> Typo: Should that say "downstream node" instead of just "downstream" ?
>
> ------

Updated.

>
> 4. src/backend/replication/pgoutput/pgoutput.c - pgoutput_begin_txn
> callback comment
>
> @@ -406,14 +413,38 @@ pgoutput_startup(LogicalDecodingContext *ctx,
> OutputPluginOptions *opt,
>
>  /*
>   * BEGIN callback
> + * Don't send BEGIN message here. Instead, postpone it until the first
> + * change. In logical replication, a common scenario is to replicate a set
> + * of tables (instead of all tables) and transactions whose changes were on
>
> =>
>
> Typo: "BEGIN callback" --> "BEGIN callback." (with the period).
>
> And, I think maybe it will be better if it has a separating blank line too.
>
> e.g.
>
> /*
>  * BEGIN callback.
>  *
>  * Don't send BEGIN ....
>
> (NOTE: this review comment applies to other callback function comments
> too, so please hunt them all down)
>
> ------

Updated.

>
> 5. src/backend/replication/pgoutput/pgoutput.c - data / txndata
>
>  static void
>  pgoutput_begin_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn)
>  {
> + PGOutputTxnData    *data = MemoryContextAllocZero(ctx->context,
> + sizeof(PGOutputTxnData));
>
> =>
>
> There is some inconsistent naming of the local variable in the patch.
> Sometimes it is called "data"; Sometimes it is called "txdata" etc. It
> would be better to just stick with the same variable name everywhere.
>
> (NOTE: this comment applies to several places in this patch)
>
> ------

I've changed all occurance of PGOutputTxnData to txndata. Note that
there is another structure PGOutputData which still uses the name
data.

>
> 6. src/backend/replication/pgoutput/pgoutput.c - Strange way to use Assert
>
> + /* If not streaming, should have setup txndata as part of
> BEGIN/BEGIN PREPARE */
> + if (!in_streaming)
> + Assert(txndata);
> +
>
> =>
>
> This style of Assert code seemed strange to me. In production mode
> isn't that going to evaluate to some condition with a ((void) true)
> body? IMO it might be better to just include the streaming check as
> part of the Assert. For example:
>
> BEFORE
> if (!in_streaming)
> Assert(txndata);
>
> AFTER
> Assert(in_streaming || txndata);
>
> (NOTE: This same review comment applies in at least 3 places in this
> patch, so please hunt them all down)
>

Updated.

> ------
>
> 7. src/backend/replication/pgoutput/pgoutput.c - comment wording
>
> @@ -677,6 +810,18 @@ pgoutput_change(LogicalDecodingContext *ctx,
> ReorderBufferTXN *txn,
>   Assert(false);
>   }
>
> + /*
> + * output BEGIN / BEGIN PREPARE if we haven't yet,
> +     * while streaming no need to send BEGIN / BEGIN PREPARE.
> + */
> + if (!in_streaming && !txndata->sent_begin_txn)
>
> =>
>
> English not really that comment is. The comment should also start with
> uppercase.
>
> (NOTE: This same comment was in couple of places in the patch)
>

Updated.

regards,
Ajin Cherian
Fujitsu Australia

Attachment

pgsql-hackers by date:

Previous
From: Aleksander Alekseev
Date:
Subject: Re: proposal: enhancing plpgsql debug API - returns text value of variable content
Next
From: Andrew Dunstan
Date:
Subject: Re: window build doesn't apply PG_CPPFLAGS correctly