Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions
Date
Msg-id CAFiTN-vHEuiHAR5403vYBbe7+huwmwBvsFA4JkE6ky1mJYWfMA@mail.gmail.com
Whole thread Raw
In response to Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Thu, May 28, 2020 at 2:41 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, May 28, 2020 at 12:46 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Tue, May 26, 2020 at 12:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > Isn't this problem only for subxact file as we anyway create changes
> > > file as part of start stream message which should have come after
> > > abort?  If so, can't we detect whether subxact file exists probably by
> > > using nsubxacts or something like that?  Can you please once try to
> > > reproduce this scenario to ensure that we are not missing anything?
> >
> > I have tested this, as of now, by default we create both changes and
> > subxact files irrespective of whether we get any subtransactions or
> > not.  Maybe this could be optimized that only if we have any subxact
> > then only create that file otherwise not?  What's your opinion on the
> > same.
> >
>
> Yeah, that makes sense.
>
> > > > > > > 8.
> > > > > > > @@ -2295,6 +2677,13 @@ ReorderBufferXidSetCatalogChanges(ReorderBuffer
> > > > > > > *rb, TransactionId xid,
> > > > > > >   txn = ReorderBufferTXNByXid(rb, xid, true, NULL, lsn, true);
> > > > > > >
> > > > > > >   txn->txn_flags |= RBTXN_HAS_CATALOG_CHANGES;
> > > > > > > +
> > > > > > > + /*
> > > > > > > + * TOCHECK: Mark toplevel transaction as having catalog changes too
> > > > > > > + * if one of its children has.
> > > > > > > + */
> > > > > > > + if (txn->toptxn != NULL)
> > > > > > > + txn->toptxn->txn_flags |= RBTXN_HAS_CATALOG_CHANGES;
> > > > > > >  }
> > > > > > >
> > > > > > > Why are we marking top transaction here?
> > > > > >
> > > > > > We need to mark top transaction to decide whether to build tuplecid
> > > > > > hash or not.  In non-streaming mode, we are only sending during the
> > > > > > commit time, and during commit time we know whether the top
> > > > > > transaction has any catalog changes or not based on the invalidation
> > > > > > message so we are marking the top transaction there in DecodeCommit.
> > > > > > Since here we are not waiting till commit so we need to mark the top
> > > > > > transaction as soon as we mark any of its child transactions.
> > > > > >
> > > > >
> > > > > But how does it help?  We use this flag (via
> > > > > ReorderBufferXidHasCatalogChanges) in SnapBuildCommitTxn which is
> > > > > anyway done in DecodeCommit and that too after setting this flag for
> > > > > the top transaction if required.  So, how will it help in setting it
> > > > > while processing for subxid.  Also, even if we have to do it won't it
> > > > > add the xid needlessly in builder->committed.xip array?
> > > >
> > > > In ReorderBufferBuildTupleCidHash, we use this flag to decide whether
> > > > to build the tuplecid hash or not based on whether it has catalog
> > > > changes or not.
> > > >
> > >
> > > Okay, but you haven't answered the second part of the question: "won't
> > > it add the xid of top transaction needlessly in builder->committed.xip
> > > array, see function SnapBuildCommitTxn?"  IIUC, this can happen
> > > without patch as well because DecodeCommit also sets the flags just
> > > based on invalidation messages irrespective of whether the messages
> > > are generated by top transaction or not, is that right?
> >
> > Yes, with or without the patch it always adds the topxid.  I think
> > purpose for doing this with/without patch is not for the snapshot
> > instead we are marking the top itself that some of its subtxn has the
> > catalog changes so that while building the tuplecid has we can know
> > whether to build the hash or not.  But, having said that I feel in
> > ReorderBufferBuildTupleCidHash why do we need these two checks
> > if (!rbtxn_has_catalog_changes(txn) || dlist_is_empty(&txn->tuplecids))
> > return;
> >
> > I mean it should be enough to just have the check,  because if we have
> > added something to the tuplecids then catalog changes must be there
> > because that time we are setting the catalog changes to true.
> >
> > if (dlist_is_empty(&txn->tuplecids))
> > return;
> >
> > I think in the base code there are multiple things going on
> > 1. If we get new CID we always set the catalog change in that
> > transaction but add the tuplecids in the top transaction.  So
> > basically, top transaction is so far not marked with catalog changes
> > but it has tuplecids.
> > 2. Now, in DecodeCommit the top xid will be marked that it has catalog
> > changes based on the invalidation messages.
> >
>
> I don't think it is advisable to remove that check from base code
> unless we have a strong reason for doing so.  I think here you can
> write better comments about why you are marking the flag for top
> transaction and remove TOCHECK from the comment.

Ok, I will do that.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions
Next
From: Amit Langote
Date:
Subject: Re: Getting ERROR with FOR UPDATE/SHARE for partitioned table.