Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions
Date
Msg-id CAA4eK1JN23VBcfDXRMawwUaqU=9zDrZ6PuaR01kHLskJz+jJbw@mail.gmail.com
Whole thread Raw
In response to Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions
List pgsql-hackers
On Mon, May 18, 2020 at 5:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
>
> 3.
> + /*
> + * If streaming is enable and we have serialized this transaction because
> + * it had incomplete tuple.  So if now we have got the complete tuple we
> + * can stream it.
> + */
> + if (ReorderBufferCanStream(rb) && can_stream && rbtxn_is_serialized(toptxn)
> + && !(rbtxn_has_toast_insert(txn)) && !(rbtxn_has_spec_insert(txn)))
> + {
>
> This comment is just saying what you are doing in the if-check.  I
> think you need to explain the rationale behind it. I don't like the
> variable name 'can_stream' because it matches ReorderBufferCanStream
> whereas it is for a different purpose, how about naming it as
> 'change_complete' or something like that.  The check has many
> conditions, can we move it to a separate function to make the code
> here look clean?
>

Do we really need this?  Immediately after this check, we are calling
ReorderBufferCheckMemoryLimit which will anyway stream the changes if
required.  Can we move the changes related to the detection of
incomplete data to a separate function?

Another comments on v20-0010-Bugfix-handling-of-incomplete-toast-tuple:

+ else if (rbtxn_has_toast_insert(txn) &&
+ ChangeIsInsertOrUpdate(change->action))
+ {
+ toptxn->txn_flags &= ~RBTXN_HAS_TOAST_INSERT;
+ can_stream = true;
+ }
..
+#define ChangeIsInsertOrUpdate(action) \
+ (((action) == REORDER_BUFFER_CHANGE_INSERT) || \
+ ((action) == REORDER_BUFFER_CHANGE_UPDATE) || \
+ ((action) == REORDER_BUFFER_CHANGE_INTERNAL_SPEC_INSERT))

How can we clear the RBTXN_HAS_TOAST_INSERT flag on
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_INSERT action?

IIUC, the basic idea used to handle incomplete changes (which is
possible in case of toast tuples and speculative inserts) is to mark
such TXNs as containing incomplete changes and then while finding the
largest top-level TXN for streaming, we ignore such TXN's and move to
next largest TXN.  If none of the TXNs have complete changes then we
choose the largest (sub)transaction and spill the same to make the
in-memory changes below logical_decoding_work_mem threshold.  This
idea can work but the strategy to choose the transaction is suboptimal
for cases where TXNs have some changes which are complete followed by
an incomplete toast or speculative tuple.  I was having an offlist
discussion with Robert on this problem and he suggested that it would
be better if we track the complete part of changes separately and then
we can avoid the drawback mentioned above.  I have thought about this
and I think it can work if we track the size and LSN of completed
changes.  I think we need to ensure that if there is concurrent abort
then we discard all changes for current (sub)transaction not only up
to completed changes LSN whereas if the streaming is successful then
we can truncate the changes only up to completed changes LSN. What do
you think?

I wonder why you have done this as 0010 in the patch series, it should
be as 0006 after the
0005-Implement-streaming-mode-in-ReorderBuffer.patch.  If we can do
that way then it would be easier for me to review.  Is there a reason
for not doing so?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Antonin Houska
Date:
Subject: Re: WIP: Aggregation push-down
Next
From: Dilip Kumar
Date:
Subject: Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions