RE: Truncate in synchronous logical replication failed - Mailing list pgsql-hackers

From osumi.takamichi@fujitsu.com
Subject RE: Truncate in synchronous logical replication failed
Date
Msg-id OSBPR01MB48888CA6F58CCE11ABD1667AED459@OSBPR01MB4888.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: Truncate in synchronous logical replication failed  (Amit Kapila <amit.kapila16@gmail.com>)
Responses RE: Truncate in synchronous logical replication failed  ("osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com>)
List pgsql-hackers
On Friday, April 23, 2021 3:43 PM  Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Fri, Apr 23, 2021 at 12:04 PM osumi.takamichi@fujitsu.com
> <osumi.takamichi@fujitsu.com> wrote:
> >
> > On Thursday, April 22, 2021 6:33 PM Amit Kapila
> <amit.kapila16@gmail.com> wrote:
> > > On Tue, Apr 20, 2021 at 9:00 AM osumi.takamichi@fujitsu.com
> > > <osumi.takamichi@fujitsu.com> wrote:
> > > >
> > > > On  Tuesday, April 20, 2021 10:53 AM Ajin Cherian
> > > > <itsajin@gmail.com>
> > > wrote:
> > > > >
> > > > > I reviewed the patch, ran make check, no issues. One minor
> comment:
> > > > >
> > > > > Could you add the comment similar to
> > > > > RelationGetIndexAttrBitmap() on why the redo, it's not very
> > > > > obvious to someone reading the code, why we are refetching the
> index list here.
> > > > >
> > > > > + /* Check if we need to redo */
> > > > >
> > > > > + newindexoidlist = RelationGetIndexList(relation);
> > > > Yeah, makes sense. Fixed.
> > >
> > > I don't think here we need to restart to get a stable list of
> > > indexes as we do in RelationGetIndexAttrBitmap. The reason is here
> > > we build the cache entry using a historic snapshot and all the later
> > > changes are absorbed while decoding WAL.
> > I rechecked this and I agree with this.
> > I don't see any problem to remove the redo check.
> > Based on this direction, I'll share my several minor comments.
> >
> > [1] a typo of RelationGetIdentityKeyBitmap's comment
> >
> > + * This is a special purpose function used during logical
> > + replication. Here,
> > + * unlike RelationGetIndexAttrBitmap(), we don't a acquire lock on
> > + the required
> >
> > We have "a" in an unnatural place. It should be "we don't acquire...".
> >
> > [2] suggestion to fix RelationGetIdentityKeyBitmap's comment
> >
> > + * later changes are absorbed while decoding WAL. Due to this reason,
> > + we don't
> > + * need to retry here in case of a change in the set of indexes.
> >
> > I think it's better to use "Because of" instead of "Due to".
> > This patch works to solve the deadlock.
> >
> 
> I am not sure which one is better. I would like to keep it as it is unless you feel
> strongly about point 2.
> 
> > [3] wrong comment in RelationGetIdentityKeyBitmap
> >
> > +       /* Save some values to compare after building attributes */
> >
> > I wrote this comment for the redo check that has been removed already.
> > We can delete it.
> >
> > [4] suggestion to remove local variable relreplindex in
> > RelationGetIdentityKeyBitmap
> >
> > I think we don't need relreplindex.
> > We can just pass relaton->rd_replidindex to RelationIdGetRelation().
> > There is no more reference of the variable.
> >
> > [5] suggestion to fix the place to release indexoidlist
> >
> > I thought we can do its list_free() ealier, after checking if there is
> > no indexes.
> >
> 
> Hmm, why? If there are no indexes then we wouldn't have allocated any
> memory.
> 
> > [6] suggestion to remove period in one comment.
> >
> > +       /* Quick exit if we already computed the result. */
> >
> > This comes from RelationGetIndexAttrBitmap (and my posted versions)
> > but I think we can remove the period to give better alignment shared with
> other comments in the function.
> >
> 
> Can you please update the patch except for the two points to which I
> responded above?
I've combined v5 with above accepted comments.

Just in case, I've conducted make check-world, 
the test of clobber_cache_always mode again for v6
and tight loop test of 100 times for 010_truncate.pl. 
The result is that all passed with no failure.


Best Regards,
    Takamichi Osumi


Attachment

pgsql-hackers by date:

Previous
From: Patrik Novotny
Date:
Subject: Re: RFE: Make statistics robust for unplanned events
Next
From: Amit Kapila
Date:
Subject: Re: Replication slot stats misgivings