Thread: Avoid streaming the transaction which are skipped (in corner cases)
During DecodeCommit(), to decide whether to skip a transaction, we use ReadRecPtr, whereas in ReorderBufferCanStartStreaming() we use EndRecPtr to check whether to stream or not. Generally this will not create a problem, but if the commit record itself adds some changes to the transaction (e.g. a snapshot) and "start_decoding_at" falls between ReadRecPtr and EndRecPtr, then streaming will decide to stream the transaction whereas DecodeCommit() will decide to skip it. To handle this case, ReorderBufferForget() calls stream_abort().

So ideally, if we are planning to skip a transaction we should never stream it; hence there is no need to stream-abort such a transaction in the skip case.

In this patch I have fixed the skip condition in the streaming case and also added an assert inside ReorderBufferForget() to ensure that such a transaction has never been streamed. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Attachment
Excellent catch. We were looking at this code last week and wondered about the purpose of this abort. Probably we should have some macro or function to decide whether to skip a transaction based on the log record. That will avoid using different values in different places. On Fri, Nov 25, 2022 at 1:35 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > During DecodeCommit() for skipping a transaction we use ReadRecPtr to > check whether to skip this transaction or not. Whereas in > ReorderBufferCanStartStreaming() we use EndRecPtr to check whether to > stream or not. Generally it will not create a problem but if the > commit record itself is adding some changes to the transaction(e.g. > snapshot) and if the "start_decoding_at" is in between ReadRecPtr and > EndRecPtr then streaming will decide to stream the transaction where > as DecodeCommit will decide to skip it. And for handling this case in > ReorderBufferForget() we call stream_abort(). > > So ideally if we are planning to skip the transaction we should never > stream it hence there is no need to stream abort such transaction in > case of skip. > > In this patch I have fixed the skip condition in the streaming case > and also added an assert inside ReorderBufferForget() to ensure that > the transaction should have never been streamed. > > -- > Regards, > Dilip Kumar > EnterpriseDB: http://www.enterprisedb.com -- Best Wishes, Ashutosh Bapat
On Fri, Nov 25, 2022 at 1:35 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > During DecodeCommit() for skipping a transaction we use ReadRecPtr to > check whether to skip this transaction or not. Whereas in > ReorderBufferCanStartStreaming() we use EndRecPtr to check whether to > stream or not. Generally it will not create a problem but if the > commit record itself is adding some changes to the transaction(e.g. > snapshot) and if the "start_decoding_at" is in between ReadRecPtr and > EndRecPtr then streaming will decide to stream the transaction where > as DecodeCommit will decide to skip it. And for handling this case in > ReorderBufferForget() we call stream_abort(). > The other cases are probably where we don't have the FilterByOrigin or dbid check, for example, XLOG_HEAP2_NEW_CID/XLOG_XACT_INVALIDATIONS. Anyway, we don't actually send anything for such cases except empty start/stop messages. Can we add some flag to txn which says that there is at least one change, like DML, that we want to stream? Then we can use that flag to decide whether to stream or not. -- With Regards, Amit Kapila.
On Fri, Nov 25, 2022 at 4:04 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: > > Excellent catch. We were looking at this code last week and wondered > the purpose of this abort. Probably we should have some macro or > function to decided whether to skip a transaction based on log record. > That will avoid using different values in different places. We do have a common function, i.e. SnapBuildXactNeedsSkip(), but there are two problems: 1) it depends on the input LSN, so the result may vary based on which LSN is passed; 2) it checks only the LSN, but there are other factors, dbid and originid, based on which a transaction could also be skipped during DecodeCommit(). So I think one possible solution could be to remember the dbid and originid in ReorderBufferTXN as soon as we get the first change that has valid values for these parameters, and then, as you suggested, have a common function that is used by streaming as well as by DecodeCommit() to decide whether to skip or not. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Fri, Nov 25, 2022 at 5:38 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Nov 25, 2022 at 1:35 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > During DecodeCommit() for skipping a transaction we use ReadRecPtr to > > check whether to skip this transaction or not. Whereas in > > ReorderBufferCanStartStreaming() we use EndRecPtr to check whether to > > stream or not. Generally it will not create a problem but if the > > commit record itself is adding some changes to the transaction(e.g. > > snapshot) and if the "start_decoding_at" is in between ReadRecPtr and > > EndRecPtr then streaming will decide to stream the transaction where > > as DecodeCommit will decide to skip it. And for handling this case in > > ReorderBufferForget() we call stream_abort(). > > > > The other cases are probably where we don't have FilterByOrigin or > dbid check, for example, XLOG_HEAP2_NEW_CID/XLOG_XACT_INVALIDATIONS. > We anyway actually don't send anything for such cases except empty > start/stop messages. Can we add some flag to txn which says that there > is at least one change like DML that we want to stream? > We can probably think of using txn_flags for this purpose. > Then we can > use that flag to decide whether to stream or not. > The other possibility here is to use ReorderBufferTXN's base_snapshot. Normally, we don't stream unless the base_snapshot is set. However, there are a few challenges: (a) the base_snapshot is set in SnapBuildProcessChange, which is called before DecodeInsert and similar APIs, so even if we filter out an insert due to origin or db_id, the base_snapshot will already be set; (b) we currently set it to execute invalidations even when there are no changes to send downstream. For (a), I guess we can split SnapBuildProcessChange such that the part where we set the base snapshot is done after DecodeInsert and similar APIs decide to queue the change.
I am less sure about point (b); ideally, if we don't need a snapshot to execute invalidations, I think we can avoid setting base_snapshot even in that case. Then we can use it as a check for whether we want to stream anything. I feel this approach requires a lot more changes and is a bit riskier as compared to having a flag in txn_flags. -- With Regards, Amit Kapila.
On Sat, Nov 26, 2022 at 10:59 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Fri, Nov 25, 2022 at 4:04 PM Ashutosh Bapat > <ashutosh.bapat.oss@gmail.com> wrote: > > > > Excellent catch. We were looking at this code last week and wondered > > the purpose of this abort. Probably we should have some macro or > > function to decided whether to skip a transaction based on log record. > > That will avoid using different values in different places. > > We do have a common function i.e. SnapBuildXactNeedsSkip() but there > are two problems 1) it has a dependency on the input parameter so the > result may vary based on the input 2) this is only checked based on > the LSN but there are other factors dbid and originid based on those > also transaction could be skipped during DecodeCommit. So I think one > possible solution could be to remember a dbid and originid in > ReorderBufferTXN as soon as we get the first change which has valid > values for these parameters. > But is the required information, say 'dbid', available in all records? For example, what about XLOG_XACT_INVALIDATIONS? The other thing to consider in this regard is that if we are planning to have the additional information, as mentioned in my other email, to decide whether to stream or not, then the additional checks may be redundant anyway. It is a good idea to have a common check at both places, but if not, we can at least add some comments to say why the check is different. -- With Regards, Amit Kapila.
On Sat, Nov 26, 2022 at 12:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Nov 25, 2022 at 5:38 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Fri, Nov 25, 2022 at 1:35 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > During DecodeCommit() for skipping a transaction we use ReadRecPtr to > > > check whether to skip this transaction or not. Whereas in > > > ReorderBufferCanStartStreaming() we use EndRecPtr to check whether to > > > stream or not. Generally it will not create a problem but if the > > > commit record itself is adding some changes to the transaction(e.g. > > > snapshot) and if the "start_decoding_at" is in between ReadRecPtr and > > > EndRecPtr then streaming will decide to stream the transaction where > > > as DecodeCommit will decide to skip it. And for handling this case in > > > ReorderBufferForget() we call stream_abort(). > > > > > > > The other cases are probably where we don't have FilterByOrigin or > > dbid check, for example, XLOG_HEAP2_NEW_CID/XLOG_XACT_INVALIDATIONS. > > We anyway actually don't send anything for such cases except empty > > start/stop messages. Can we add some flag to txn which says that there > > is at least one change like DML that we want to stream? > > > > We can probably think of using txn_flags for this purpose. In the attached patch I have used txn_flags to identify whether a transaction has any streamable change or not; the transaction will not be selected for streaming unless it has at least one streamable change. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Attachment
RE: Avoid streaming the transaction which are skipped (in corner cases)
From: "shiy.fnst@fujitsu.com"
On Sun, Nov 27, 2022 1:33 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Sat, Nov 26, 2022 at 12:15 PM Amit Kapila <amit.kapila16@gmail.com> > wrote: > > > > On Fri, Nov 25, 2022 at 5:38 PM Amit Kapila <amit.kapila16@gmail.com> > wrote: > > > > > > On Fri, Nov 25, 2022 at 1:35 PM Dilip Kumar <dilipbalaut@gmail.com> > wrote: > > > > > > > > During DecodeCommit() for skipping a transaction we use ReadRecPtr to > > > > check whether to skip this transaction or not. Whereas in > > > > ReorderBufferCanStartStreaming() we use EndRecPtr to check whether > to > > > > stream or not. Generally it will not create a problem but if the > > > > commit record itself is adding some changes to the transaction(e.g. > > > > snapshot) and if the "start_decoding_at" is in between ReadRecPtr and > > > > EndRecPtr then streaming will decide to stream the transaction where > > > > as DecodeCommit will decide to skip it. And for handling this case in > > > > ReorderBufferForget() we call stream_abort(). > > > > > > > > > > The other cases are probably where we don't have FilterByOrigin or > > > dbid check, for example, > XLOG_HEAP2_NEW_CID/XLOG_XACT_INVALIDATIONS. > > > We anyway actually don't send anything for such cases except empty > > > start/stop messages. Can we add some flag to txn which says that there > > > is at least one change like DML that we want to stream? > > > > > > > We can probably think of using txn_flags for this purpose. > > In the attached patch I have used txn_flags to identify whether it has > any streamable change or not and the transaction will not be selected > for streaming unless it has at least one streamable change. > Thanks for your patch. I saw that the patch added a check when selecting the largest transaction, but in addition to ReorderBufferCheckMemoryLimit(), the transaction can also be streamed in ReorderBufferProcessPartialChange(). Should we add the check in this function, too?
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c index 9a58c4bfb9..108737b02f 100644 --- a/src/backend/replication/logical/reorderbuffer.c +++ b/src/backend/replication/logical/reorderbuffer.c @@ -768,7 +768,8 @@ ReorderBufferProcessPartialChange(ReorderBuffer *rb, ReorderBufferTXN *txn, */ if (ReorderBufferCanStartStreaming(rb) && !(rbtxn_has_partial_change(toptxn)) && - rbtxn_is_serialized(txn)) + rbtxn_is_serialized(txn) && + rbtxn_has_streamable_change(txn)) ReorderBufferStreamTXN(rb, toptxn); } Regards, Shi yu
On Mon, Nov 28, 2022 at 1:46 PM shiy.fnst@fujitsu.com <shiy.fnst@fujitsu.com> wrote: > > Thanks for your patch. > > I saw that the patch added a check when selecting largest transaction, but in > addition to ReorderBufferCheckMemoryLimit(), the transaction can also be > streamed in ReorderBufferProcessPartialChange(). Should we add the check in > this function, too? > > diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c > index 9a58c4bfb9..108737b02f 100644 > --- a/src/backend/replication/logical/reorderbuffer.c > +++ b/src/backend/replication/logical/reorderbuffer.c > @@ -768,7 +768,8 @@ ReorderBufferProcessPartialChange(ReorderBuffer *rb, ReorderBufferTXN *txn, > */ > if (ReorderBufferCanStartStreaming(rb) && > !(rbtxn_has_partial_change(toptxn)) && > - rbtxn_is_serialized(txn)) > + rbtxn_is_serialized(txn) && > + rbtxn_has_streamable_change(txn)) > ReorderBufferStreamTXN(rb, toptxn); > } You are right, we need this in ReorderBufferProcessPartialChange() as well. I will fix this in the next version. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Mon, Nov 28, 2022 at 3:19 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Mon, Nov 28, 2022 at 1:46 PM shiy.fnst@fujitsu.com > <shiy.fnst@fujitsu.com> wrote: > > > > Thanks for your patch. > > > > I saw that the patch added a check when selecting largest transaction, but in > > addition to ReorderBufferCheckMemoryLimit(), the transaction can also be > > streamed in ReorderBufferProcessPartialChange(). Should we add the check in > > this function, too? > > > > diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c > > index 9a58c4bfb9..108737b02f 100644 > > --- a/src/backend/replication/logical/reorderbuffer.c > > +++ b/src/backend/replication/logical/reorderbuffer.c > > @@ -768,7 +768,8 @@ ReorderBufferProcessPartialChange(ReorderBuffer *rb, ReorderBufferTXN *txn, > > */ > > if (ReorderBufferCanStartStreaming(rb) && > > !(rbtxn_has_partial_change(toptxn)) && > > - rbtxn_is_serialized(txn)) > > + rbtxn_is_serialized(txn) && > > + rbtxn_has_streamable_change(txn)) > > ReorderBufferStreamTXN(rb, toptxn); > > } > > You are right we need this in ReorderBufferProcessPartialChange() as > well. I will fix this in the next version. Fixed this in the attached patch. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Attachment
RE: Avoid streaming the transaction which are skipped (in corner cases)
From: "houzj.fnst@fujitsu.com"
On Tuesday, November 29, 2022 12:08 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: Hi, > > On Mon, Nov 28, 2022 at 3:19 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Mon, Nov 28, 2022 at 1:46 PM shiy.fnst@fujitsu.com > > <shiy.fnst@fujitsu.com> wrote: > > > > > > Thanks for your patch. > > > > > > I saw that the patch added a check when selecting largest > > > transaction, but in addition to ReorderBufferCheckMemoryLimit(), the > > > transaction can also be streamed in > > > ReorderBufferProcessPartialChange(). Should we add the check in this > function, too? > > > > > > diff --git a/src/backend/replication/logical/reorderbuffer.c > > > b/src/backend/replication/logical/reorderbuffer.c > > > index 9a58c4bfb9..108737b02f 100644 > > > --- a/src/backend/replication/logical/reorderbuffer.c > > > +++ b/src/backend/replication/logical/reorderbuffer.c > > > @@ -768,7 +768,8 @@ > ReorderBufferProcessPartialChange(ReorderBuffer *rb, ReorderBufferTXN > *txn, > > > */ > > > if (ReorderBufferCanStartStreaming(rb) && > > > !(rbtxn_has_partial_change(toptxn)) && > > > - rbtxn_is_serialized(txn)) > > > + rbtxn_is_serialized(txn) && > > > + rbtxn_has_streamable_change(txn)) > > > ReorderBufferStreamTXN(rb, toptxn); } > > > > You are right we need this in ReorderBufferProcessPartialChange() as > > well. I will fix this in the next version. > > Fixed this in the attached patch.

Thanks for updating the patch. I have a few comments about the patch.

1.

1.1. - /* For streamed transactions notify the remote node about the abort. */ - if (rbtxn_is_streamed(txn)) - rb->stream_abort(rb, txn, lsn); + /* the transaction which is being skipped shouldn't have been streamed */ + Assert(!rbtxn_is_streamed(txn));

1.2 - rbtxn_is_serialized(txn)) + rbtxn_is_serialized(txn) && + rbtxn_has_streamable_change(txn)) ReorderBufferStreamTXN(rb, toptxn);

In the above two places, I think we should do the check for the top-level transaction (e.g. toptxn) because the patch only sets the flag for the top-level transaction.

2.

+ /* + * If there are any streamable changes getting queued then get the top + * transaction and mark it has streamable change. This is required for + * streaming in-progress transactions, the in-progress transaction will + * not be selected for streaming unless it has at least one streamable + * change. + */ + if (change->action == REORDER_BUFFER_CHANGE_INSERT || + change->action == REORDER_BUFFER_CHANGE_UPDATE || + change->action == REORDER_BUFFER_CHANGE_DELETE || + change->action == REORDER_BUFFER_CHANGE_INTERNAL_SPEC_INSERT || + change->action == REORDER_BUFFER_CHANGE_TRUNCATE)

I think that a transaction that contains REORDER_BUFFER_CHANGE_MESSAGE can also be considered as streamable. Is there a reason that we don't check it here?

Best regards, Hou zj
Hi Dilip,

On Tue, Nov 29, 2022 at 9:38 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > You are right we need this in ReorderBufferProcessPartialChange() as > > well. I will fix this in the next version. > > Fixed this in the attached patch. >

I focused my attention on the SnapBuildXactNeedsSkip() usages, and I see they are using different end points of the WAL record:

1. decode.c, logicalmsg_decode, 594: SnapBuildXactNeedsSkip(builder, buf->origptr)))
2. decode.c, DecodeTXNNeedSkip, 1250: return (SnapBuildXactNeedsSkip(ctx->snapshot_builder, buf->origptr) ||
3. reorderbuffer.c, AssertTXNLsnOrder, 897: if (SnapBuildXactNeedsSkip(ctx->snapshot_builder, ctx->reader->EndRecPtr))
4. reorderbuffer.c, ReorderBufferCanStartStreaming, 3922: !SnapBuildXactNeedsSkip(builder, ctx->reader->EndRecPtr))
5. snapbuild.c, SnapBuildXactNeedsSkip, 429: SnapBuildXactNeedsSkip(SnapBuild *builder, XLogRecPtr ptr)

The first two use the origin ptr and the third and fourth use the end ptr. You have fixed the fourth one. Do we need to fix the third one as well?

Probably we need to create two wrappers (macros) around SnapBuildXactNeedsSkip(), one which accepts an XLogRecordBuffer and the other which accepts an XLogReaderState, and then use those. That way at least we have the logic unified as to which XLogRecPtr to use. -- Best Wishes, Ashutosh Bapat
On Fri, Dec 2, 2022 at 4:58 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: > > Hi Dilip, > > On Tue, Nov 29, 2022 at 9:38 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > You are right we need this in ReorderBufferProcessPartialChange() as > > > well. I will fix this in the next version. > > > > Fixed this in the attached patch. > > > > I focused my attention on SnapBuildXactNeedsSkip() usages and I see > they are using different end points of WAL record > 1 decode.c logicalmsg_decode 594 > SnapBuildXactNeedsSkip(builder, buf->origptr))) > 2 decode.c DecodeTXNNeedSkip 1250 return > (SnapBuildXactNeedsSkip(ctx->snapshot_builder, buf->origptr) || > 3 reorderbuffer.c AssertTXNLsnOrder 897 if > (SnapBuildXactNeedsSkip(ctx->snapshot_builder, > ctx->reader->EndRecPtr)) > 4 reorderbuffer.c ReorderBufferCanStartStreaming 3922 > !SnapBuildXactNeedsSkip(builder, ctx->reader->EndRecPtr)) > 5 snapbuild.c SnapBuildXactNeedsSkip 429 > SnapBuildXactNeedsSkip(SnapBuild *builder, XLogRecPtr ptr) > > The first two are using origin ptr and the last two are using end ptr. > you have fixed the fourth one. Do we need to fix the third one as > well? > I think we can change the third one as well but I haven't tested it. Adding Sawada-San for his inputs as it is added in commit 16b1fe0037. In any case, I think we can do that as a separate patch because it is not directly related to the streaming case we are trying to solve as part of this patch. > Probably we need to create two wrappers (macros) around > SnapBuildXactNeedsSkip(), one which accepts a XLogRecordBuffer and > other which accepts XLogReaderState. Then use those. That way at least > we have logic unified as to which XLogRecPtr to use. > I don't know how that will be an improvement because both those have the start and end locations of the record. -- With Regards, Amit Kapila.
On Tue, Nov 29, 2022 at 12:23 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > On Tuesday, November 29, 2022 12:08 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > I have few comments about the patch. > > 1. > > 1.1. > - /* For streamed transactions notify the remote node about the abort. */ > - if (rbtxn_is_streamed(txn)) > - rb->stream_abort(rb, txn, lsn); > + /* the transaction which is being skipped shouldn't have been streamed */ > + Assert(!rbtxn_is_streamed(txn)); > > 1.2 > - rbtxn_is_serialized(txn)) > + rbtxn_is_serialized(txn) && > + rbtxn_has_streamable_change(txn)) > ReorderBufferStreamTXN(rb, toptxn); > > In the above two places, I think we should do the check for the top-level > transaction(e.g. toptxn) because the patch only set flag for the top-level > transaction. > Among these, the first one seems okay because it will check both the transaction and its subtransactions from that path and none of those should be marked as streamed. I have fixed the second one in the attached patch. > 2. > > + /* > + * If there are any streamable changes getting queued then get the top > + * transaction and mark it has streamable change. This is required for > + * streaming in-progress transactions, the in-progress transaction will > + * not be selected for streaming unless it has at least one streamable > + * change. > + */ > + if (change->action == REORDER_BUFFER_CHANGE_INSERT || > + change->action == REORDER_BUFFER_CHANGE_UPDATE || > + change->action == REORDER_BUFFER_CHANGE_DELETE || > + change->action == REORDER_BUFFER_CHANGE_INTERNAL_SPEC_INSERT || > + change->action == REORDER_BUFFER_CHANGE_TRUNCATE) > > I think that a transaction that contains REORDER_BUFFER_CHANGE_MESSAGE can also be > considered as streamable. Is there a reason that we don't check it here ? > No, I don't see any reason not to do this check for REORDER_BUFFER_CHANGE_MESSAGE. Apart from the above, I have slightly adjusted the comments in the attached. 
Do let me know what you think of the attached. -- With Regards, Amit Kapila.
Attachment
RE: Avoid streaming the transaction which are skipped (in corner cases)
From: "houzj.fnst@fujitsu.com"
On Saturday, December 3, 2022 7:37 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Nov 29, 2022 at 12:23 PM houzj.fnst@fujitsu.com > <houzj.fnst@fujitsu.com> wrote: > > > > On Tuesday, November 29, 2022 12:08 PM Dilip Kumar > <dilipbalaut@gmail.com> wrote: > > > > I have few comments about the patch. > > > > 1. > > > > 1.1. > > - /* For streamed transactions notify the remote node about the abort. > */ > > - if (rbtxn_is_streamed(txn)) > > - rb->stream_abort(rb, txn, lsn); > > + /* the transaction which is being skipped shouldn't have been > streamed */ > > + Assert(!rbtxn_is_streamed(txn)); > > > > 1.2 > > - rbtxn_is_serialized(txn)) > > + rbtxn_is_serialized(txn) && > > + rbtxn_has_streamable_change(txn)) > > ReorderBufferStreamTXN(rb, toptxn); > > > > In the above two places, I think we should do the check for the > > top-level transaction(e.g. toptxn) because the patch only set flag for > > the top-level transaction. > > > > Among these, the first one seems okay because it will check both the transaction > and its subtransactions from that path and none of those should be marked as > streamed. I have fixed the second one in the attached patch. > > > 2. > > > > + /* > > + * If there are any streamable changes getting queued then get the > top > > + * transaction and mark it has streamable change. This is required > for > > + * streaming in-progress transactions, the in-progress transaction will > > + * not be selected for streaming unless it has at least one streamable > > + * change. > > + */ > > + if (change->action == REORDER_BUFFER_CHANGE_INSERT || > > + change->action == REORDER_BUFFER_CHANGE_UPDATE || > > + change->action == REORDER_BUFFER_CHANGE_DELETE || > > + change->action == > REORDER_BUFFER_CHANGE_INTERNAL_SPEC_INSERT || > > + change->action == > REORDER_BUFFER_CHANGE_TRUNCATE) > > > > I think that a transaction that contains REORDER_BUFFER_CHANGE_MESSAGE > > can also be considered as streamable. 
Is there a reason that we don't check it > here ? > > > > No, I don't see any reason not to do this check for > REORDER_BUFFER_CHANGE_MESSAGE. > > Apart from the above, I have slightly adjusted the comments in the attached. Do > let me know what you think of the attached. Thanks for updating the patch. It looks good to me. Best regards, Hou zj
On Sun, Dec 4, 2022 at 5:14 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > On Saturday, December 3, 2022 7:37 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > Apart from the above, I have slightly adjusted the comments in the attached. Do > > let me know what you think of the attached. > > Thanks for updating the patch. It looks good to me. > I feel the function name ReorderBufferLargestTopTXN() is slightly misleading because it also checks some of the streaming properties (like whether the TXN has partial changes and whether it contains any streamable change). Shall we rename it to ReorderBufferLargestStreamableTopTXN() or something like that? The other point to consider is whether we need to have a test case for this patch. I think before this patch if the size of DDL changes in a transaction exceeds logical_decoding_work_mem, the empty streams will be output in the plugin but after this patch, there won't be any such stream. -- With Regards, Amit Kapila.
On Mon, Dec 5, 2022 at 8:59 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Sun, Dec 4, 2022 at 5:14 PM houzj.fnst@fujitsu.com > <houzj.fnst@fujitsu.com> wrote: > > > > On Saturday, December 3, 2022 7:37 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > Apart from the above, I have slightly adjusted the comments in the attached. Do > > > let me know what you think of the attached. > > > > Thanks for updating the patch. It looks good to me. > > > > I feel the function name ReorderBufferLargestTopTXN() is slightly > misleading because it also checks some of the streaming properties > (like whether the TXN has partial changes and whether it contains any > streamable change). Shall we rename it to > ReorderBufferLargestStreamableTopTXN() or something like that? Yes, that makes sense. > The other point to consider is whether we need to have a test case for > this patch. I think before this patch if the size of DDL changes in a > transaction exceeds logical_decoding_work_mem, the empty streams will > be output in the plugin but after this patch, there won't be any such > stream. Yes, we can do that; I will make these two changes. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Mon, Dec 5, 2022 at 9:21 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Mon, Dec 5, 2022 at 8:59 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Sun, Dec 4, 2022 at 5:14 PM houzj.fnst@fujitsu.com > > <houzj.fnst@fujitsu.com> wrote: > > > > > > On Saturday, December 3, 2022 7:37 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > Apart from the above, I have slightly adjusted the comments in the attached. Do > > > > let me know what you think of the attached. > > > > > > Thanks for updating the patch. It looks good to me. > > > > > > > I feel the function name ReorderBufferLargestTopTXN() is slightly > > misleading because it also checks some of the streaming properties > > (like whether the TXN has partial changes and whether it contains any > > streamable change). Shall we rename it to > > ReorderBufferLargestStreamableTopTXN() or something like that? > > Yes that makes sense I have done this change in the attached patch. > > The other point to consider is whether we need to have a test case for > > this patch. I think before this patch if the size of DDL changes in a > > transaction exceeds logical_decoding_work_mem, the empty streams will > > be output in the plugin but after this patch, there won't be any such > > stream. I tried this test, but I think generating 64k of data with just CID messages would make the test case really big. I tried using multiple sessions such that one session fills the reorder buffer but contains partial changes, so that we try to stream another transaction; however, it is not possible in an automated test to consistently generate the partial change. I think we need something like this [1] so that we can better control the streaming. [1] https://www.postgresql.org/message-id/OSZPR01MB631042582805A8E8615BC413FD329%40OSZPR01MB6310.jpnprd01.prod.outlook.com -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Attachment
On Mon, Dec 5, 2022 at 3:41 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Mon, Dec 5, 2022 at 9:21 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Mon, Dec 5, 2022 at 8:59 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Sun, Dec 4, 2022 at 5:14 PM houzj.fnst@fujitsu.com > > > <houzj.fnst@fujitsu.com> wrote: > > > > > > > > On Saturday, December 3, 2022 7:37 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > Apart from the above, I have slightly adjusted the comments in the attached. Do > > > > > let me know what you think of the attached. > > > > > > > > Thanks for updating the patch. It looks good to me. > > > > > > > > > > I feel the function name ReorderBufferLargestTopTXN() is slightly > > > misleading because it also checks some of the streaming properties > > > (like whether the TXN has partial changes and whether it contains any > > > streamable change). Shall we rename it to > > > ReorderBufferLargestStreamableTopTXN() or something like that? > > > > Yes that makes sense > > I have done this change in the attached patch. > > > > The other point to consider is whether we need to have a test case for > > > this patch. I think before this patch if the size of DDL changes in a > > > transaction exceeds logical_decoding_work_mem, the empty streams will > > > be output in the plugin but after this patch, there won't be any such > > > stream. > > I tried this test, but I think generating 64k data with just CID > messages will make the test case really big. I tried using multiple > sessions such that one session makes the reorder buffer full but > contains partial changes so that we try to stream another transaction > but that is not possible in an automated test to consistently generate > the partial change. > I also don't see a way to achieve it in an automated way because both toast and speculative inserts are part of one statement, so we need a real concurrent test to make it happen. 
Can anyone else think of a way to achieve it? > I think we need something like this[1] so that we can better control > the streaming. > +1. The additional advantage would be that we can generate parallel apply and new streaming tests with much lesser data. Shi-San, can you please start a new thread for the GUC patch proposed by you as indicated by Dilip? -- With Regards, Amit Kapila.
RE: Avoid streaming the transaction which are skipped (in corner cases)
From: shiy.fnst@fujitsu.com
On Mon, Dec 5, 2022 6:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Dec 5, 2022 at 3:41 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > I think we need something like this[1] so that we can better control
> > the streaming.
> >
>
> +1. The additional advantage would be that we can generate parallel
> apply and new streaming tests with much lesser data. Shi-San, can you
> please start a new thread for the GUC patch proposed by you as
> indicated by Dilip?
>

OK, I started a new thread for it. [1]

[1] https://www.postgresql.org/message-id/OSZPR01MB63104E7449DBE41932DB19F1FD1B9%40OSZPR01MB6310.jpnprd01.prod.outlook.com

Regards,
Shi yu
On Tue, Dec 6, 2022 at 11:55 AM shiy.fnst@fujitsu.com
<shiy.fnst@fujitsu.com> wrote:
>
> On Mon, Dec 5, 2022 6:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Dec 5, 2022 at 3:41 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > I think we need something like this[1] so that we can better control
> > > the streaming.
> > >
> >
> > +1. The additional advantage would be that we can generate parallel
> > apply and new streaming tests with much lesser data. Shi-San, can you
> > please start a new thread for the GUC patch proposed by you as
> > indicated by Dilip?
> >
>
> OK, I started a new thread for it. [1]
>

Thanks. I think it is better to go ahead with this patch and once we
decide what is the right thing to do in terms of GUC then we can try
to add additional tests for this. Anyway, it is not that the code
added by this patch is not getting covered by existing tests. What do
you think?

--
With Regards,
Amit Kapila.
On Wed, Dec 7, 2022 at 9:28 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Dec 6, 2022 at 11:55 AM shiy.fnst@fujitsu.com
> <shiy.fnst@fujitsu.com> wrote:
> >
> > On Mon, Dec 5, 2022 6:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, Dec 5, 2022 at 3:41 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > >
> > > > I think we need something like this[1] so that we can better control
> > > > the streaming.
> > > >
> > >
> > > +1. The additional advantage would be that we can generate parallel
> > > apply and new streaming tests with much lesser data. Shi-San, can you
> > > please start a new thread for the GUC patch proposed by you as
> > > indicated by Dilip?
> > >
> >
> > OK, I started a new thread for it. [1]
> >
>
> Thanks. I think it is better to go ahead with this patch and once we
> decide what is the right thing to do in terms of GUC then we can try
> to add additional tests for this. Anyway, it is not that the code
> added by this patch is not getting covered by existing tests. What do
> you think?

That makes sense to me.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Wed, Dec 7, 2022 12:01 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, Dec 7, 2022 at 9:28 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Dec 6, 2022 at 11:55 AM shiy.fnst@fujitsu.com
> > <shiy.fnst@fujitsu.com> wrote:
> > >
> > > On Mon, Dec 5, 2022 6:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Mon, Dec 5, 2022 at 3:41 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > >
> > > > > I think we need something like this[1] so that we can better control
> > > > > the streaming.
> > > > >
> > > >
> > > > +1. The additional advantage would be that we can generate parallel
> > > > apply and new streaming tests with much lesser data. Shi-San, can you
> > > > please start a new thread for the GUC patch proposed by you as
> > > > indicated by Dilip?
> > > >
> > >
> > > OK, I started a new thread for it. [1]
> > >
> >
> > Thanks. I think it is better to go ahead with this patch and once we
> > decide what is the right thing to do in terms of GUC then we can try
> > to add additional tests for this. Anyway, it is not that the code
> > added by this patch is not getting covered by existing tests. What do
> > you think?
>
> That makes sense to me.
>

+1

Regards,
Shi yu
On Wed, Dec 7, 2022 at 9:35 AM shiy.fnst@fujitsu.com
<shiy.fnst@fujitsu.com> wrote:
>
> On Wed, Dec 7, 2022 12:01 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > > Thanks. I think it is better to go ahead with this patch and once we
> > > decide what is the right thing to do in terms of GUC then we can try
> > > to add additional tests for this. Anyway, it is not that the code
> > > added by this patch is not getting covered by existing tests. What do
> > > you think?
> >
> > That makes sense to me.
> >
>
> +1
>

Pushed.

--
With Regards,
Amit Kapila.