Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Date
Msg-id CAA4eK1+UA1UJrfw=12gXwhDW=oTc2L-o8Hjp2m20Kt2MNeP=jQ@mail.gmail.com
In response to Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
List pgsql-hackers
On Tue, Jan 28, 2020 at 11:34 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Jan 28, 2020 at 11:28 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Jan 22, 2020 at 10:30 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Tue, Jan 14, 2020 at 10:44 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > >
> > > > Hmm, I think this can turn out to be inefficient because we can easily
> > > > end up spilling the data even when we don't need to do so.  Consider
> > > > cases where part of the streamed changes are for toast, and the remaining
> > > > ones are changes that we would have streamed and hence can be removed.
> > > > In such cases, we could have easily consumed the remaining changes for
> > > > toast without spilling.  Also, I am not sure if spilling changes from
> > > > the hash table is a good idea, as they are no longer in the same order as
> > > > they were in the ReorderBuffer, which means the order in which we would
> > > > normally serialize the changes would change, and that might have some
> > > > impact, so we would need some more study if we want to pursue this idea.
> > > I have fixed this bug and attached it as a separate patch.  I will
> > > merge it into the main patch after we agree on the idea and after some
> > > more testing.
> > >
> > > The idea is that whenever we get a toasted chunk, instead of directly
> > > inserting it into the toast hash, I insert it into a local list, so
> > > that if we don't get the change for the main table we can insert
> > > these changes back into txn->changes.  Once we get the change for the
> > > main table, at that point I prepare the hash table to merge the chunks.
> > >
> >
> >
> > I think this idea will work but appears to be quite costly because (a)
> > you might need to serialize/deserialize the changes multiple times and
> > might attempt streaming multiple times even though you can't, and (b)
> > you need to remove/add the same set of changes from the main list
> > multiple times.
> I agree with this.
> >
> > It seems to me that we need to add all of this new handling because,
> > while deciding whether to stream or not, we don't know whether the txn
> > has changes that can't be streamed.  One idea to make it work is to
> > identify this while decoding the WAL.  I think we need to set a bit in
> > the insert/delete WAL record to identify whether the tuple belongs to
> > a toast relation.  This won't add any additional overhead to the WAL,
> > will remove a lot of complexity from logical decoding, and will also
> > make decoding efficient.  If this is feasible, then we can do the same
> > for speculative insertions.
> The idea looks good to me.  I will work on this.
>

One more thing we can do is to identify whether the tuple belongs to a
toast relation while decoding it.  However, I think to do that we would
need access to the relcache at that time, and that might add some
overhead since we would need to do it for each tuple.  Can we
investigate what it would take to do that, and whether it is better
than setting a bit during WAL logging?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


