Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions
Date
Msg-id CAFiTN-uBd3p50s=kfEiSbg93_Fnf_VU+haS4FqBZAm7PmbX69g@mail.gmail.com
Whole thread Raw
In response to Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions
List pgsql-hackers
On Sat, Mar 28, 2020 at 11:56 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Mar 4, 2020 at 9:14 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Wed, Mar 4, 2020 at 3:16 AM Tomas Vondra
> > <tomas.vondra@2ndquadrant.com> wrote:
> > >
> > >
> > > The first thing I realized that WAL-logging of assignments in v12 does
> > > both the "old" logging (using dedicated message) and "new" with
> > > toplevel-XID embedded in the first message. Yes, the patch was wrong,
> > > because it eliminated all calls to ProcArrayApplyXidAssignment() and so
> > > it was trivial to crash the replica due to KnownAssignedXids overflow.
> > > But I don't think re-introducing XLOG_XACT_ASSIGNMENT message is the
> > > right fix.
> > >
> > > I actually proposed doing this (having both ways to log assignments) so
> > > that there's no regression risk with (wal_level < logical). But IIRC
> > > Andres objected to it, argumenting that we should not log the same piece
> > > of information in two very different ways at the same time (IIRC it was
> > > discussed on the FOSDEM dev meeting, so I don't have a link to share).
> > > And I do agree with him ...
> > >
> > > The question is, why couldn't the replica use the same assignment info
> > > we already write for logical decoding? The main challenge is that now
> > > the assignment can be sent in many different xlog messages, from a bunch
> > > of resource managers (essentially, any xlog message with a xid can have
> > > embedded XID of the toplevel xact). So the handling would either need to
> > > happen in every rmgr, or we need to move it before we call the rmgr.
> > >
> > > For exampple, we might do this e.g. in StartupXLOG() I think, per the
> > > attached patch (FWIW this particular fix was written by Masahiko Sawada,
> > > not me). This does the trick for me - I'm no longer able to reproduce
> > > the KnownAssignedXids overflow.
> > >
> > > The one difference is that we used to call ProcArrayApplyXidAssignment
> > > for larger groups of XIDs, as sent in the assignment message. Now we
> > > call it for each individual assignment. I don't know if this is an
> > > issue, but I suppose we might introduce some sort of local caching
> > > (accumulate the assignments into a local array, call the function only
> > > when we have enough of them).
> >
> > Thanks for the pointers,  I will think over these points.
> >
>
> I have looked at the solution proposed and I would like to share my
> findings.  I think calling ProcArrayApplyXidAssignment for each
> subtransaction is not a good idea for a couple of reasons:
> (a) It will just beat the purpose of maintaining KnowAssignedXids
> array which is to avoid looking at pg_subtrans in
> TransactionIdIsInProgress() on standby.  Basically, if we remove it
> for each subXid, it will consider the KnowAssignedXids to be
> overflowed and check pg_subtrans frequently.

Right, I also think this is a problem with this solution.  I think we
may try to avoid this by caching this information.  But, then we will
have to maintain this in some dimensional array which stores
sub-transaction ids per top transaction or we can maintain a list of
sub-transaction for each transaction.  I haven't thought about how
much complexity this solution will add.

> (b)  Calling ProcArrayApplyXidAssignment() for each subtransaction can
> be costly from the perspective of concurrency because it acquires
> ProcArrayLock in Exclusive mode, so concurrently running transactions
> might start blocking at this lock.

Right

 Also, I see that
> SubTransSetParent() makes the page dirty, so it might lead to more
> writes if we spread out setting that by calling it separately for each
> sub-transaction.

Right.

>
> Apart from this, I don't see how the proposed fix is correct because
> as far as I can see it tries to remove the Xid before we even record
> it via RecordKnownAssignedTransactionIds().  It seems after patch
> RecordKnownAssignedTransactionIds() will be called after
> ProcArrayApplyXidAssignment(), how could that be correct.

Valid point.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions
Next
From: "movead.li@highgo.ca"
Date:
Subject: Re: A bug when use get_bit() function for a long bytea string