RE: Truncate in synchronous logical replication failed - Mailing list pgsql-hackers
From | osumi.takamichi@fujitsu.com |
---|---|
Subject | RE: Truncate in synchronous logical replication failed |
Date | |
Msg-id | OSBPR01MB4888C84371882A4E001E3A44ED459@OSBPR01MB4888.jpnprd01.prod.outlook.com Whole thread Raw |
In response to | Re: Truncate in synchronous logical replication failed (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: Truncate in synchronous logical replication failed
|
List | pgsql-hackers |
On Thursday, April 22, 2021 6:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > On Tue, Apr 20, 2021 at 9:00 AM osumi.takamichi@fujitsu.com > <osumi.takamichi@fujitsu.com> wrote: > > > > On Tuesday, April 20, 2021 10:53 AM Ajin Cherian <itsajin@gmail.com> > wrote: > > > > > > I reviewed the patch, ran make check, no issues. One minor comment: > > > > > > Could you add the comment similar to RelationGetIndexAttrBitmap() on > > > why the redo, it's not very obvious to someone reading the code, why > > > we are refetching the index list here. > > > > > > + /* Check if we need to redo */ > > > > > > + newindexoidlist = RelationGetIndexList(relation); > > Yeah, makes sense. Fixed. > > I don't think here we need to restart to get a stable list of indexes as we do in > RelationGetIndexAttrBitmap. The reason is here we build the cache entry > using a historic snapshot and all the later changes are absorbed while > decoding WAL. I rechecked this and I agree with this. I don't see any problem to remove the redo check. Based on this direction, I'll share my several minor comments. [1] a typo of RelationGetIdentityKeyBitmap's comment + * This is a special purpose function used during logical replication. Here, + * unlike RelationGetIndexAttrBitmap(), we don't a acquire lock on the required We have "a" in an unnatural place. It should be "we don't acquire...". [2] suggestion to fix RelationGetIdentityKeyBitmap's comment + * later changes are absorbed while decoding WAL. Due to this reason, we don't + * need to retry here in case of a change in the set of indexes. I think it's better to use "Because of" instead of "Due to". This patch works to solve the deadlock. [3] wrong comment in RelationGetIdentityKeyBitmap + /* Save some values to compare after building attributes */ I wrote this comment for the redo check that has been removed already. We can delete it. [4] suggestion to remove local variable relreplindex in RelationGetIdentityKeyBitmap I think we don't need relreplindex. We can just pass relaton->rd_replidindex to RelationIdGetRelation(). There is no more reference of the variable. [5] suggestion to fix the place to release indexoidlist I thought we can do its list_free() ealier, after checking if there is no indexes. [6] suggestion to remove period in one comment. + /* Quick exit if we already computed the result. */ This comes from RelationGetIndexAttrBitmap (and my posted versions) but I think we can remove the period to give better alignment shared with other comments in the function. > I have updated that and modified few comments in the > attached patch. Can you please test this in clobber_cache_always mode? I > think just testing subscription/t/010_truncate.pl would be sufficient. I did it. It didn't fail. No problem. > Now, this bug exists in prior versions (>= v11) where we have introduced > decoding of Truncate. Do we want to backpatch this? As no user has reported > this till now apart from Tang who I think got it while doing some other patch > testing, we might want to just push it in HEAD. I am fine either way. Petr, > others, do you have any opinion on this matter? I think we are fine with applying this patch to the HEAD only, since nobody has complained about the hang issue. Best Regards, Takamichi Osumi
pgsql-hackers by date: