Thread: Avoid streaming the transaction which are skipped (in corner cases)

Avoid streaming the transaction which are skipped (in corner cases)

From
Dilip Kumar
Date:
During DecodeCommit() for skipping a transaction we use ReadRecPtr to
check whether to skip this transaction or not.  Whereas in
ReorderBufferCanStartStreaming() we use EndRecPtr to check whether to
stream or not. Generally it will not create a problem but if the
commit record itself is adding some changes to the transaction(e.g.
snapshot) and if the "start_decoding_at" is in between ReadRecPtr and
EndRecPtr then streaming will decide to stream the transaction where
as DecodeCommit will decide to skip it.  And for handling this case in
ReorderBufferForget() we call stream_abort().

So ideally if we are planning to skip the transaction we should never
stream it hence there is no need to stream abort such transaction in
case of skip.

In this patch I have fixed the skip condition in the streaming case
and also added an assert inside ReorderBufferForget() to ensure that
the transaction should have never been streamed.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment

Re: Avoid streaming the transaction which are skipped (in corner cases)

From
Ashutosh Bapat
Date:
Excellent catch. We were looking at this code last week and wondered
the purpose of this abort. Probably we should have some macro or
function to decided whether to skip a transaction based on log record.
That will avoid using different values in different places.

On Fri, Nov 25, 2022 at 1:35 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> During DecodeCommit() for skipping a transaction we use ReadRecPtr to
> check whether to skip this transaction or not.  Whereas in
> ReorderBufferCanStartStreaming() we use EndRecPtr to check whether to
> stream or not. Generally it will not create a problem but if the
> commit record itself is adding some changes to the transaction(e.g.
> snapshot) and if the "start_decoding_at" is in between ReadRecPtr and
> EndRecPtr then streaming will decide to stream the transaction where
> as DecodeCommit will decide to skip it.  And for handling this case in
> ReorderBufferForget() we call stream_abort().
>
> So ideally if we are planning to skip the transaction we should never
> stream it hence there is no need to stream abort such transaction in
> case of skip.
>
> In this patch I have fixed the skip condition in the streaming case
> and also added an assert inside ReorderBufferForget() to ensure that
> the transaction should have never been streamed.
>
> --
> Regards,
> Dilip Kumar
> EnterpriseDB: http://www.enterprisedb.com



-- 
Best Wishes,
Ashutosh Bapat



Re: Avoid streaming the transaction which are skipped (in corner cases)

From
Amit Kapila
Date:
On Fri, Nov 25, 2022 at 1:35 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> During DecodeCommit() for skipping a transaction we use ReadRecPtr to
> check whether to skip this transaction or not.  Whereas in
> ReorderBufferCanStartStreaming() we use EndRecPtr to check whether to
> stream or not. Generally it will not create a problem but if the
> commit record itself is adding some changes to the transaction(e.g.
> snapshot) and if the "start_decoding_at" is in between ReadRecPtr and
> EndRecPtr then streaming will decide to stream the transaction where
> as DecodeCommit will decide to skip it.  And for handling this case in
> ReorderBufferForget() we call stream_abort().
>

The other cases are probably where we don't have FilterByOrigin or
dbid check, for example, XLOG_HEAP2_NEW_CID/XLOG_XACT_INVALIDATIONS.
We anyway actually don't send anything for such cases except empty
start/stop messages. Can we add some flag to txn which says that there
is at least one change like DML that we want to stream? Then we can
use that flag to decide whether to stream or not.

-- 
With Regards,
Amit Kapila.



Re: Avoid streaming the transaction which are skipped (in corner cases)

From
Dilip Kumar
Date:
On Fri, Nov 25, 2022 at 4:04 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> Excellent catch. We were looking at this code last week and wondered
> the purpose of this abort. Probably we should have some macro or
> function to decided whether to skip a transaction based on log record.
> That will avoid using different values in different places.

We do have a common function i.e. SnapBuildXactNeedsSkip() but there
are two problems 1) it has a dependency on the input parameter so the
result may vary based on the input 2) this is only checked based on
the LSN but there are other factors dbid and originid based on those
also transaction could be skipped during DecodeCommit.  So I think one
possible solution could be to remember a dbid and originid in
ReorderBufferTXN as soon as we get the first change which has valid
values for these parameters.  And now as you suggested have a common
function that will be used by streaming as well as by DecodeCommit to
decide on whether to skip or not.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Avoid streaming the transaction which are skipped (in corner cases)

From
Amit Kapila
Date:
On Fri, Nov 25, 2022 at 5:38 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Nov 25, 2022 at 1:35 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > During DecodeCommit() for skipping a transaction we use ReadRecPtr to
> > check whether to skip this transaction or not.  Whereas in
> > ReorderBufferCanStartStreaming() we use EndRecPtr to check whether to
> > stream or not. Generally it will not create a problem but if the
> > commit record itself is adding some changes to the transaction(e.g.
> > snapshot) and if the "start_decoding_at" is in between ReadRecPtr and
> > EndRecPtr then streaming will decide to stream the transaction where
> > as DecodeCommit will decide to skip it.  And for handling this case in
> > ReorderBufferForget() we call stream_abort().
> >
>
> The other cases are probably where we don't have FilterByOrigin or
> dbid check, for example, XLOG_HEAP2_NEW_CID/XLOG_XACT_INVALIDATIONS.
> We anyway actually don't send anything for such cases except empty
> start/stop messages. Can we add some flag to txn which says that there
> is at least one change like DML that we want to stream?
>

We can probably think of using txn_flags for this purpose.

> Then we can
> use that flag to decide whether to stream or not.
>

The other possibility here is to use ReorderBufferTXN's base_snapshot.
Normally, we don't stream unless the base_snapshot is set. However,
there are a few challenges (a) the base_snapshot is set in
SnapBuildProcessChange which is called before DecodeInsert, and
similar APIs. So, now even if we filter out insert due origin or
db_id, the base_snapshot will be set. (b) we currently set it to
execute invalidations even when there are no changes to send to
downstream.

For (a), I guess we can split SnapBuildProcessChange, such that the
part where we set the base snapshot will be done after DecodeInsert
and similar APIs decide to queue the change. I am less sure about
point (b), ideally, if we don't need a snapshot to execute
invalidations, I think we can avoid setting base_snapshot even in that
case. Then we can use it as a check whether we want to stream
anything.

I feel this approach required a lot more changes and bit riskier as
compared to having a flag in txn_flags.

-- 
With Regards,
Amit Kapila.



Re: Avoid streaming the transaction which are skipped (in corner cases)

From
Amit Kapila
Date:
On Sat, Nov 26, 2022 at 10:59 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Nov 25, 2022 at 4:04 PM Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > Excellent catch. We were looking at this code last week and wondered
> > the purpose of this abort. Probably we should have some macro or
> > function to decided whether to skip a transaction based on log record.
> > That will avoid using different values in different places.
>
> We do have a common function i.e. SnapBuildXactNeedsSkip() but there
> are two problems 1) it has a dependency on the input parameter so the
> result may vary based on the input 2) this is only checked based on
> the LSN but there are other factors dbid and originid based on those
> also transaction could be skipped during DecodeCommit.  So I think one
> possible solution could be to remember a dbid and originid in
> ReorderBufferTXN as soon as we get the first change which has valid
> values for these parameters.
>

But is the required information say 'dbid' available in all records,
for example, what about XLOG_XACT_INVALIDATIONS? The other thing to
consider in this regard is if we are planning to have additional
information as mentioned by me in another to decide whether to stream
or not then the additional checks may be redundant anyway. It is a
good idea to have a common check at both places but if not, we can at
least add some comments to say why the check is different.

-- 
With Regards,
Amit Kapila.



Re: Avoid streaming the transaction which are skipped (in corner cases)

From
Dilip Kumar
Date:
On Sat, Nov 26, 2022 at 12:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Nov 25, 2022 at 5:38 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Nov 25, 2022 at 1:35 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > During DecodeCommit() for skipping a transaction we use ReadRecPtr to
> > > check whether to skip this transaction or not.  Whereas in
> > > ReorderBufferCanStartStreaming() we use EndRecPtr to check whether to
> > > stream or not. Generally it will not create a problem but if the
> > > commit record itself is adding some changes to the transaction(e.g.
> > > snapshot) and if the "start_decoding_at" is in between ReadRecPtr and
> > > EndRecPtr then streaming will decide to stream the transaction where
> > > as DecodeCommit will decide to skip it.  And for handling this case in
> > > ReorderBufferForget() we call stream_abort().
> > >
> >
> > The other cases are probably where we don't have FilterByOrigin or
> > dbid check, for example, XLOG_HEAP2_NEW_CID/XLOG_XACT_INVALIDATIONS.
> > We anyway actually don't send anything for such cases except empty
> > start/stop messages. Can we add some flag to txn which says that there
> > is at least one change like DML that we want to stream?
> >
>
> We can probably think of using txn_flags for this purpose.

In the attached patch I have used txn_flags to identify whether it has
any streamable change or not and the transaction will not be selected
for streaming unless it has at least one streamable change.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment

RE: Avoid streaming the transaction which are skipped (in corner cases)

From
"shiy.fnst@fujitsu.com"
Date:
On Sun, Nov 27, 2022 1:33 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> 
> On Sat, Nov 26, 2022 at 12:15 PM Amit Kapila <amit.kapila16@gmail.com>
> wrote:
> >
> > On Fri, Nov 25, 2022 at 5:38 PM Amit Kapila <amit.kapila16@gmail.com>
> wrote:
> > >
> > > On Fri, Nov 25, 2022 at 1:35 PM Dilip Kumar <dilipbalaut@gmail.com>
> wrote:
> > > >
> > > > During DecodeCommit() for skipping a transaction we use ReadRecPtr to
> > > > check whether to skip this transaction or not.  Whereas in
> > > > ReorderBufferCanStartStreaming() we use EndRecPtr to check whether
> to
> > > > stream or not. Generally it will not create a problem but if the
> > > > commit record itself is adding some changes to the transaction(e.g.
> > > > snapshot) and if the "start_decoding_at" is in between ReadRecPtr and
> > > > EndRecPtr then streaming will decide to stream the transaction where
> > > > as DecodeCommit will decide to skip it.  And for handling this case in
> > > > ReorderBufferForget() we call stream_abort().
> > > >
> > >
> > > The other cases are probably where we don't have FilterByOrigin or
> > > dbid check, for example,
> XLOG_HEAP2_NEW_CID/XLOG_XACT_INVALIDATIONS.
> > > We anyway actually don't send anything for such cases except empty
> > > start/stop messages. Can we add some flag to txn which says that there
> > > is at least one change like DML that we want to stream?
> > >
> >
> > We can probably think of using txn_flags for this purpose.
> 
> In the attached patch I have used txn_flags to identify whether it has
> any streamable change or not and the transaction will not be selected
> for streaming unless it has at least one streamable change.
> 

Thanks for your patch.

I saw that the patch added a check when selecting largest transaction, but in
addition to ReorderBufferCheckMemoryLimit(), the transaction can also be
streamed in ReorderBufferProcessPartialChange(). Should we add the check in
this function, too?

diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 9a58c4bfb9..108737b02f 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -768,7 +768,8 @@ ReorderBufferProcessPartialChange(ReorderBuffer *rb, ReorderBufferTXN *txn,
      */
     if (ReorderBufferCanStartStreaming(rb) &&
         !(rbtxn_has_partial_change(toptxn)) &&
-        rbtxn_is_serialized(txn))
+        rbtxn_is_serialized(txn) &&
+        rbtxn_has_streamable_change(txn))
         ReorderBufferStreamTXN(rb, toptxn);
 }

Regards,
Shi yu

Re: Avoid streaming the transaction which are skipped (in corner cases)

From
Dilip Kumar
Date:
On Mon, Nov 28, 2022 at 1:46 PM shiy.fnst@fujitsu.com
<shiy.fnst@fujitsu.com> wrote:
>
> Thanks for your patch.
>
> I saw that the patch added a check when selecting largest transaction, but in
> addition to ReorderBufferCheckMemoryLimit(), the transaction can also be
> streamed in ReorderBufferProcessPartialChange(). Should we add the check in
> this function, too?
>
> diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
> index 9a58c4bfb9..108737b02f 100644
> --- a/src/backend/replication/logical/reorderbuffer.c
> +++ b/src/backend/replication/logical/reorderbuffer.c
> @@ -768,7 +768,8 @@ ReorderBufferProcessPartialChange(ReorderBuffer *rb, ReorderBufferTXN *txn,
>          */
>         if (ReorderBufferCanStartStreaming(rb) &&
>                 !(rbtxn_has_partial_change(toptxn)) &&
> -               rbtxn_is_serialized(txn))
> +               rbtxn_is_serialized(txn) &&
> +               rbtxn_has_streamable_change(txn))
>                 ReorderBufferStreamTXN(rb, toptxn);
>  }

You are right we need this in ReorderBufferProcessPartialChange() as
well.  I will fix this in the next version.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Avoid streaming the transaction which are skipped (in corner cases)

From
Dilip Kumar
Date:
On Mon, Nov 28, 2022 at 3:19 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Mon, Nov 28, 2022 at 1:46 PM shiy.fnst@fujitsu.com
> <shiy.fnst@fujitsu.com> wrote:
> >
> > Thanks for your patch.
> >
> > I saw that the patch added a check when selecting largest transaction, but in
> > addition to ReorderBufferCheckMemoryLimit(), the transaction can also be
> > streamed in ReorderBufferProcessPartialChange(). Should we add the check in
> > this function, too?
> >
> > diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
> > index 9a58c4bfb9..108737b02f 100644
> > --- a/src/backend/replication/logical/reorderbuffer.c
> > +++ b/src/backend/replication/logical/reorderbuffer.c
> > @@ -768,7 +768,8 @@ ReorderBufferProcessPartialChange(ReorderBuffer *rb, ReorderBufferTXN *txn,
> >          */
> >         if (ReorderBufferCanStartStreaming(rb) &&
> >                 !(rbtxn_has_partial_change(toptxn)) &&
> > -               rbtxn_is_serialized(txn))
> > +               rbtxn_is_serialized(txn) &&
> > +               rbtxn_has_streamable_change(txn))
> >                 ReorderBufferStreamTXN(rb, toptxn);
> >  }
>
> You are right we need this in ReorderBufferProcessPartialChange() as
> well.  I will fix this in the next version.

Fixed this in the attached patch.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment

RE: Avoid streaming the transaction which are skipped (in corner cases)

From
"houzj.fnst@fujitsu.com"
Date:
On Tuesday, November 29, 2022 12:08 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

Hi,

> 
> On Mon, Nov 28, 2022 at 3:19 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Mon, Nov 28, 2022 at 1:46 PM shiy.fnst@fujitsu.com
> > <shiy.fnst@fujitsu.com> wrote:
> > >
> > > Thanks for your patch.
> > >
> > > I saw that the patch added a check when selecting largest
> > > transaction, but in addition to ReorderBufferCheckMemoryLimit(), the
> > > transaction can also be streamed in
> > > ReorderBufferProcessPartialChange(). Should we add the check in this
> function, too?
> > >
> > > diff --git a/src/backend/replication/logical/reorderbuffer.c
> > > b/src/backend/replication/logical/reorderbuffer.c
> > > index 9a58c4bfb9..108737b02f 100644
> > > --- a/src/backend/replication/logical/reorderbuffer.c
> > > +++ b/src/backend/replication/logical/reorderbuffer.c
> > > @@ -768,7 +768,8 @@
> ReorderBufferProcessPartialChange(ReorderBuffer *rb, ReorderBufferTXN
> *txn,
> > >          */
> > >         if (ReorderBufferCanStartStreaming(rb) &&
> > >                 !(rbtxn_has_partial_change(toptxn)) &&
> > > -               rbtxn_is_serialized(txn))
> > > +               rbtxn_is_serialized(txn) &&
> > > +               rbtxn_has_streamable_change(txn))
> > >                 ReorderBufferStreamTXN(rb, toptxn);  }
> >
> > You are right we need this in ReorderBufferProcessPartialChange() as
> > well.  I will fix this in the next version.
> 
> Fixed this in the attached patch.

Thanks for updating the patch.

I have few comments about the patch.

1.

1.1.
-    /* For streamed transactions notify the remote node about the abort. */
-    if (rbtxn_is_streamed(txn))
-        rb->stream_abort(rb, txn, lsn);
+    /* the transaction which is being skipped shouldn't have been streamed */
+    Assert(!rbtxn_is_streamed(txn));

1.2
-        rbtxn_is_serialized(txn))
+        rbtxn_is_serialized(txn) &&
+        rbtxn_has_streamable_change(txn))
         ReorderBufferStreamTXN(rb, toptxn);

In the above two places, I think we should do the check for the top-level
transaction(e.g. toptxn) because the patch only set flag for the top-level
transaction.

2.

+    /*
+     * If there are any streamable changes getting queued then get the top
+     * transaction and mark it has streamable change.  This is required for
+     * streaming in-progress transactions, the in-progress transaction will
+     * not be selected for streaming unless it has at least one streamable
+     * change.
+     */
+    if (change->action == REORDER_BUFFER_CHANGE_INSERT ||
+        change->action == REORDER_BUFFER_CHANGE_UPDATE ||
+        change->action == REORDER_BUFFER_CHANGE_DELETE ||
+        change->action == REORDER_BUFFER_CHANGE_INTERNAL_SPEC_INSERT ||
+        change->action == REORDER_BUFFER_CHANGE_TRUNCATE)

I think that a transaction that contains REORDER_BUFFER_CHANGE_MESSAGE can also be
considered as streamable. Is there a reason that we don't check it here ?

Best regards,
Hou zj


Re: Avoid streaming the transaction which are skipped (in corner cases)

From
Ashutosh Bapat
Date:
Hi Dilip,

On Tue, Nov 29, 2022 at 9:38 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> >
> > You are right we need this in ReorderBufferProcessPartialChange() as
> > well.  I will fix this in the next version.
>
> Fixed this in the attached patch.
>

I focused my attention on SnapBuildXactNeedsSkip() usages and I see
they are using different end points of WAL record
1 decode.c        logicalmsg_decode               594
SnapBuildXactNeedsSkip(builder, buf->origptr)))
2 decode.c        DecodeTXNNeedSkip              1250 return
(SnapBuildXactNeedsSkip(ctx->snapshot_builder, buf->origptr) ||
3 reorderbuffer.c AssertTXNLsnOrder               897 if
(SnapBuildXactNeedsSkip(ctx->snapshot_builder,
ctx->reader->EndRecPtr))
4 reorderbuffer.c ReorderBufferCanStartStreaming 3922
!SnapBuildXactNeedsSkip(builder, ctx->reader->EndRecPtr))
5 snapbuild.c     SnapBuildXactNeedsSkip          429
SnapBuildXactNeedsSkip(SnapBuild *builder, XLogRecPtr ptr)

The first two are using origin ptr and the last two are using end ptr.
you have fixed the fourth one. Do we need to fix the third one as
well?

Probably we need to create two wrappers (macros) around
SnapBuildXactNeedsSkip(), one which accepts a XLogRecordBuffer and
other which accepts XLogReaderState. Then use those. That way at least
we have logic unified as to which XLogRecPtr to use.

--
Best Wishes,
Ashutosh Bapat



Re: Avoid streaming the transaction which are skipped (in corner cases)

From
Amit Kapila
Date:
On Fri, Dec 2, 2022 at 4:58 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> Hi Dilip,
>
> On Tue, Nov 29, 2022 at 9:38 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > >
> > > You are right we need this in ReorderBufferProcessPartialChange() as
> > > well.  I will fix this in the next version.
> >
> > Fixed this in the attached patch.
> >
>
> I focused my attention on SnapBuildXactNeedsSkip() usages and I see
> they are using different end points of WAL record
> 1 decode.c        logicalmsg_decode               594
> SnapBuildXactNeedsSkip(builder, buf->origptr)))
> 2 decode.c        DecodeTXNNeedSkip              1250 return
> (SnapBuildXactNeedsSkip(ctx->snapshot_builder, buf->origptr) ||
> 3 reorderbuffer.c AssertTXNLsnOrder               897 if
> (SnapBuildXactNeedsSkip(ctx->snapshot_builder,
> ctx->reader->EndRecPtr))
> 4 reorderbuffer.c ReorderBufferCanStartStreaming 3922
> !SnapBuildXactNeedsSkip(builder, ctx->reader->EndRecPtr))
> 5 snapbuild.c     SnapBuildXactNeedsSkip          429
> SnapBuildXactNeedsSkip(SnapBuild *builder, XLogRecPtr ptr)
>
> The first two are using origin ptr and the last two are using end ptr.
> you have fixed the fourth one. Do we need to fix the third one as
> well?
>

I think we can change the third one as well but I haven't tested it.
Adding Sawada-San for his inputs as it is added in commit 16b1fe0037.
In any case, I think we can do that as a separate patch because it is
not directly related to the streaming case we are trying to solve as
part of this patch.

> Probably we need to create two wrappers (macros) around
> SnapBuildXactNeedsSkip(), one which accepts a XLogRecordBuffer and
> other which accepts XLogReaderState. Then use those. That way at least
> we have logic unified as to which XLogRecPtr to use.
>

I don't know how that will be an improvement because both those have
the start and end locations of the record.

-- 
With Regards,
Amit Kapila.



Re: Avoid streaming the transaction which are skipped (in corner cases)

From
Amit Kapila
Date:
On Tue, Nov 29, 2022 at 12:23 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Tuesday, November 29, 2022 12:08 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> I have few comments about the patch.
>
> 1.
>
> 1.1.
> -       /* For streamed transactions notify the remote node about the abort. */
> -       if (rbtxn_is_streamed(txn))
> -               rb->stream_abort(rb, txn, lsn);
> +       /* the transaction which is being skipped shouldn't have been streamed */
> +       Assert(!rbtxn_is_streamed(txn));
>
> 1.2
> -               rbtxn_is_serialized(txn))
> +               rbtxn_is_serialized(txn) &&
> +               rbtxn_has_streamable_change(txn))
>                 ReorderBufferStreamTXN(rb, toptxn);
>
> In the above two places, I think we should do the check for the top-level
> transaction(e.g. toptxn) because the patch only set flag for the top-level
> transaction.
>

Among these, the first one seems okay because it will check both the
transaction and its subtransactions from that path and none of those
should be marked as streamed. I have fixed the second one in the
attached patch.

> 2.
>
> +       /*
> +        * If there are any streamable changes getting queued then get the top
> +        * transaction and mark it has streamable change.  This is required for
> +        * streaming in-progress transactions, the in-progress transaction will
> +        * not be selected for streaming unless it has at least one streamable
> +        * change.
> +        */
> +       if (change->action == REORDER_BUFFER_CHANGE_INSERT ||
> +               change->action == REORDER_BUFFER_CHANGE_UPDATE ||
> +               change->action == REORDER_BUFFER_CHANGE_DELETE ||
> +               change->action == REORDER_BUFFER_CHANGE_INTERNAL_SPEC_INSERT ||
> +               change->action == REORDER_BUFFER_CHANGE_TRUNCATE)
>
> I think that a transaction that contains REORDER_BUFFER_CHANGE_MESSAGE can also be
> considered as streamable. Is there a reason that we don't check it here ?
>

No, I don't see any reason not to do this check for
REORDER_BUFFER_CHANGE_MESSAGE.

Apart from the above, I have slightly adjusted the comments in the
attached. Do let me know what you think of the attached.

-- 
With Regards,
Amit Kapila.

Attachment

RE: Avoid streaming the transaction which are skipped (in corner cases)

From
"houzj.fnst@fujitsu.com"
Date:
On Saturday, December 3, 2022 7:37 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> 
> On Tue, Nov 29, 2022 at 12:23 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Tuesday, November 29, 2022 12:08 PM Dilip Kumar
> <dilipbalaut@gmail.com> wrote:
> >
> > I have few comments about the patch.
> >
> > 1.
> >
> > 1.1.
> > -       /* For streamed transactions notify the remote node about the abort.
> */
> > -       if (rbtxn_is_streamed(txn))
> > -               rb->stream_abort(rb, txn, lsn);
> > +       /* the transaction which is being skipped shouldn't have been
> streamed */
> > +       Assert(!rbtxn_is_streamed(txn));
> >
> > 1.2
> > -               rbtxn_is_serialized(txn))
> > +               rbtxn_is_serialized(txn) &&
> > +               rbtxn_has_streamable_change(txn))
> >                 ReorderBufferStreamTXN(rb, toptxn);
> >
> > In the above two places, I think we should do the check for the
> > top-level transaction(e.g. toptxn) because the patch only set flag for
> > the top-level transaction.
> >
> 
> Among these, the first one seems okay because it will check both the transaction
> and its subtransactions from that path and none of those should be marked as
> streamed. I have fixed the second one in the attached patch.
> 
> > 2.
> >
> > +       /*
> > +        * If there are any streamable changes getting queued then get the
> top
> > +        * transaction and mark it has streamable change.  This is required
> for
> > +        * streaming in-progress transactions, the in-progress transaction will
> > +        * not be selected for streaming unless it has at least one streamable
> > +        * change.
> > +        */
> > +       if (change->action == REORDER_BUFFER_CHANGE_INSERT ||
> > +               change->action == REORDER_BUFFER_CHANGE_UPDATE ||
> > +               change->action == REORDER_BUFFER_CHANGE_DELETE ||
> > +               change->action ==
> REORDER_BUFFER_CHANGE_INTERNAL_SPEC_INSERT ||
> > +               change->action ==
> REORDER_BUFFER_CHANGE_TRUNCATE)
> >
> > I think that a transaction that contains REORDER_BUFFER_CHANGE_MESSAGE
> > can also be considered as streamable. Is there a reason that we don't check it
> here ?
> >
> 
> No, I don't see any reason not to do this check for
> REORDER_BUFFER_CHANGE_MESSAGE.
> 
> Apart from the above, I have slightly adjusted the comments in the attached. Do
> let me know what you think of the attached.

Thanks for updating the patch. It looks good to me.

Best regards,
Hou zj

Re: Avoid streaming the transaction which are skipped (in corner cases)

From
Amit Kapila
Date:
On Sun, Dec 4, 2022 at 5:14 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Saturday, December 3, 2022 7:37 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > Apart from the above, I have slightly adjusted the comments in the attached. Do
> > let me know what you think of the attached.
>
> Thanks for updating the patch. It looks good to me.
>

I feel the function name ReorderBufferLargestTopTXN() is slightly
misleading because it also checks some of the streaming properties
(like whether the TXN has partial changes and whether it contains any
streamable change). Shall we rename it to
ReorderBufferLargestStreamableTopTXN() or something like that?

The other point to consider is whether we need to have a test case for
this patch. I think before this patch if the size of DDL changes in a
transaction exceeds logical_decoding_work_mem, the empty streams will
be output in the plugin but after this patch, there won't be any such
stream.

-- 
With Regards,
Amit Kapila.



Re: Avoid streaming the transaction which are skipped (in corner cases)

From
Dilip Kumar
Date:
On Mon, Dec 5, 2022 at 8:59 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sun, Dec 4, 2022 at 5:14 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Saturday, December 3, 2022 7:37 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > Apart from the above, I have slightly adjusted the comments in the attached. Do
> > > let me know what you think of the attached.
> >
> > Thanks for updating the patch. It looks good to me.
> >
>
> I feel the function name ReorderBufferLargestTopTXN() is slightly
> misleading because it also checks some of the streaming properties
> (like whether the TXN has partial changes and whether it contains any
> streamable change). Shall we rename it to
> ReorderBufferLargestStreamableTopTXN() or something like that?

Yes that makes sense

> The other point to consider is whether we need to have a test case for
> this patch. I think before this patch if the size of DDL changes in a
> transaction exceeds logical_decoding_work_mem, the empty streams will
> be output in the plugin but after this patch, there won't be any such
> stream.

Yes, we can do that, I will make these two changes.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Avoid streaming the transaction which are skipped (in corner cases)

From
Dilip Kumar
Date:
On Mon, Dec 5, 2022 at 9:21 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Mon, Dec 5, 2022 at 8:59 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Sun, Dec 4, 2022 at 5:14 PM houzj.fnst@fujitsu.com
> > <houzj.fnst@fujitsu.com> wrote:
> > >
> > > On Saturday, December 3, 2022 7:37 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > Apart from the above, I have slightly adjusted the comments in the attached. Do
> > > > let me know what you think of the attached.
> > >
> > > Thanks for updating the patch. It looks good to me.
> > >
> >
> > I feel the function name ReorderBufferLargestTopTXN() is slightly
> > misleading because it also checks some of the streaming properties
> > (like whether the TXN has partial changes and whether it contains any
> > streamable change). Shall we rename it to
> > ReorderBufferLargestStreamableTopTXN() or something like that?
>
> Yes that makes sense

I have done this change in the attached patch.

> > The other point to consider is whether we need to have a test case for
> > this patch. I think before this patch if the size of DDL changes in a
> > transaction exceeds logical_decoding_work_mem, the empty streams will
> > be output in the plugin but after this patch, there won't be any such
> > stream.

I tried this test, but I think generating 64k data with just CID
messages will make the test case really big.  I tried using multiple
sessions such that one session makes the reorder buffer full but
contains partial changes so that we try to stream another transaction
but that is not possible in an automated test to consistently generate
the partial change.

I think we need something like this[1] so that we can better control
the streaming.


[1]https://www.postgresql.org/message-id/OSZPR01MB631042582805A8E8615BC413FD329%40OSZPR01MB6310.jpnprd01.prod.outlook.com


--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment

Re: Avoid streaming the transaction which are skipped (in corner cases)

From
Amit Kapila
Date:
On Mon, Dec 5, 2022 at 3:41 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Mon, Dec 5, 2022 at 9:21 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Mon, Dec 5, 2022 at 8:59 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Sun, Dec 4, 2022 at 5:14 PM houzj.fnst@fujitsu.com
> > > <houzj.fnst@fujitsu.com> wrote:
> > > >
> > > > On Saturday, December 3, 2022 7:37 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > Apart from the above, I have slightly adjusted the comments in the attached. Do
> > > > > let me know what you think of the attached.
> > > >
> > > > Thanks for updating the patch. It looks good to me.
> > > >
> > >
> > > I feel the function name ReorderBufferLargestTopTXN() is slightly
> > > misleading because it also checks some of the streaming properties
> > > (like whether the TXN has partial changes and whether it contains any
> > > streamable change). Shall we rename it to
> > > ReorderBufferLargestStreamableTopTXN() or something like that?
> >
> > Yes that makes sense
>
> I have done this change in the attached patch.
>
> > > The other point to consider is whether we need to have a test case for
> > > this patch. I think before this patch if the size of DDL changes in a
> > > transaction exceeds logical_decoding_work_mem, the empty streams will
> > > be output in the plugin but after this patch, there won't be any such
> > > stream.
>
> I tried this test, but I think generating 64k data with just CID
> messages will make the test case really big.  I tried using multiple
> sessions such that one session makes the reorder buffer full but
> contains partial changes so that we try to stream another transaction
> but that is not possible in an automated test to consistently generate
> the partial change.
>

I also don't see a way to achieve it in an automated way because both
toast and speculative inserts are part of one statement, so we need a
real concurrent test to make it happen. Can anyone else think of a way
to achieve it?

> I think we need something like this[1] so that we can better control
> the streaming.
>

+1. The additional advantage would be that we can generate parallel
apply and new streaming tests with much lesser data. Shi-San, can you
please start a new thread for the GUC patch proposed by you as
indicated by Dilip?

-- 
With Regards,
Amit Kapila.



RE: Avoid streaming the transaction which are skipped (in corner cases)

From
"shiy.fnst@fujitsu.com"
Date:
On Mon, Dec 5, 2022 6:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> 
> On Mon, Dec 5, 2022 at 3:41 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > I think we need something like this[1] so that we can better control
> > the streaming.
> >
> 
> +1. The additional advantage would be that we can generate parallel
> apply and new streaming tests with much lesser data. Shi-San, can you
> please start a new thread for the GUC patch proposed by you as
> indicated by Dilip?
> 

OK, I started a new thread for it. [1]

[1]
https://www.postgresql.org/message-id/OSZPR01MB63104E7449DBE41932DB19F1FD1B9%40OSZPR01MB6310.jpnprd01.prod.outlook.com

Regards,
Shi yu

Re: Avoid streaming the transaction which are skipped (in corner cases)

From
Amit Kapila
Date:
On Tue, Dec 6, 2022 at 11:55 AM shiy.fnst@fujitsu.com
<shiy.fnst@fujitsu.com> wrote:
>
> On Mon, Dec 5, 2022 6:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Dec 5, 2022 at 3:41 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > I think we need something like this[1] so that we can better control
> > > the streaming.
> > >
> >
> > +1. The additional advantage would be that we can generate parallel
> > apply and new streaming tests with much lesser data. Shi-San, can you
> > please start a new thread for the GUC patch proposed by you as
> > indicated by Dilip?
> >
>
> OK, I started a new thread for it. [1]
>

Thanks. I think it is better to go ahead with this patch and once we
decide what is the right thing to do in terms of GUC then we can try
to add additional tests for this. Anyway, it is not that the code
added by this patch is not getting covered by existing tests. What do
you think?

-- 
With Regards,
Amit Kapila.



Re: Avoid streaming the transaction which are skipped (in corner cases)

From
Dilip Kumar
Date:
On Wed, Dec 7, 2022 at 9:28 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Dec 6, 2022 at 11:55 AM shiy.fnst@fujitsu.com
> <shiy.fnst@fujitsu.com> wrote:
> >
> > On Mon, Dec 5, 2022 6:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, Dec 5, 2022 at 3:41 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > >
> > > > I think we need something like this[1] so that we can better control
> > > > the streaming.
> > > >
> > >
> > > +1. The additional advantage would be that we can generate parallel
> > > apply and new streaming tests with much lesser data. Shi-San, can you
> > > please start a new thread for the GUC patch proposed by you as
> > > indicated by Dilip?
> > >
> >
> > OK, I started a new thread for it. [1]
> >
>
> Thanks. I think it is better to go ahead with this patch and once we
> decide what is the right thing to do in terms of GUC then we can try
> to add additional tests for this. Anyway, it is not that the code
> added by this patch is not getting covered by existing tests. What do
> you think?

That makes sense to me.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



RE: Avoid streaming the transaction which are skipped (in corner cases)

From
"shiy.fnst@fujitsu.com"
Date:
On Wed, Dec 7, 2022 12:01 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> 
> On Wed, Dec 7, 2022 at 9:28 AM Amit Kapila <amit.kapila16@gmail.com>
> wrote:
> >
> > On Tue, Dec 6, 2022 at 11:55 AM shiy.fnst@fujitsu.com
> > <shiy.fnst@fujitsu.com> wrote:
> > >
> > > On Mon, Dec 5, 2022 6:57 PM Amit Kapila <amit.kapila16@gmail.com>
> wrote:
> > > >
> > > > On Mon, Dec 5, 2022 at 3:41 PM Dilip Kumar <dilipbalaut@gmail.com>
> wrote:
> > > > >
> > > > > I think we need something like this[1] so that we can better control
> > > > > the streaming.
> > > > >
> > > >
> > > > +1. The additional advantage would be that we can generate parallel
> > > > apply and new streaming tests with much lesser data. Shi-San, can you
> > > > please start a new thread for the GUC patch proposed by you as
> > > > indicated by Dilip?
> > > >
> > >
> > > OK, I started a new thread for it. [1]
> > >
> >
> > Thanks. I think it is better to go ahead with this patch and once we
> > decide what is the right thing to do in terms of GUC then we can try
> > to add additional tests for this. Anyway, it is not that the code
> > added by this patch is not getting covered by existing tests. What do
> > you think?
> 
> That makes sense to me.
> 

+1

Regards,
Shi yu

Re: Avoid streaming the transaction which are skipped (in corner cases)

From
Amit Kapila
Date:
On Wed, Dec 7, 2022 at 9:35 AM shiy.fnst@fujitsu.com
<shiy.fnst@fujitsu.com> wrote:
>
> On Wed, Dec 7, 2022 12:01 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > >
> > > Thanks. I think it is better to go ahead with this patch and once we
> > > decide what is the right thing to do in terms of GUC then we can try
> > > to add additional tests for this. Anyway, it is not that the code
> > > added by this patch is not getting covered by existing tests. What do
> > > you think?
> >
> > That makes sense to me.
> >
>
> +1
>

Pushed.

-- 
With Regards,
Amit Kapila.