Re: Truncate in synchronous logical replication failed - Mailing list pgsql-hackers
From | Japin Li |
---|---|
Subject | Re: Truncate in synchronous logical replication failed |
Date | |
Msg-id | MEYP282MB16691F27E96F23D6D4D5CDF5B64E9@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM Whole thread Raw |
In response to | RE: Truncate in synchronous logical replication failed ("osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com>) |
Responses |
RE: Truncate in synchronous logical replication failed
|
List | pgsql-hackers |
On Tue, 13 Apr 2021 at 21:54, osumi.takamichi@fujitsu.com <osumi.takamichi@fujitsu.com> wrote: > On Monday, April 12, 2021 3:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote: >> On Mon, Apr 12, 2021 at 10:03 AM osumi.takamichi@fujitsu.com >> <osumi.takamichi@fujitsu.com> wrote: >> > but if we take a measure to fix the doc, we have to be careful for the >> > description, because when we remove the primary keys of 'test' tables on the >> scenario in [1], we don't have this issue. >> > It means TRUNCATE in synchronous logical replication is not always >> blocked. >> > >> >> The problem happens only when we try to fetch IDENTITY_KEY attributes >> because pgoutput uses RelationGetIndexAttrBitmap() to get that information >> which locks the required indexes. Now, because TRUNCATE has already >> acquired an exclusive lock on the index, it seems to create a sort of deadlock >> where the actual Truncate operation waits for logical replication of operation to >> complete and logical replication waits for actual Truncate operation to finish. >> >> Do we really need to use RelationGetIndexAttrBitmap() to build IDENTITY_KEY >> attributes? During decoding, we don't even lock the main relation, we just scan >> the system table and build that information using a historic snapshot. Can't we >> do something similar here? > I think we can build the IDENTITY_KEY attributes with NoLock > instead of calling RelationGetIndexAttrBitmap(). > > When we trace back the caller side of logicalrep_write_attrs(), > doing the thing equivalent to RelationGetIndexAttrBitmap() > for INDEX_ATTR_BITMAP_IDENTITY_KEY impacts only pgoutput_truncate. > > OTOH, I can't find codes similar to RelationGetIndexAttrBitmap() > in pgoutput_* functions and in the file of relcache.c. > Therefore, I'd like to discuss how to address the hang. > > My first idea is to extract some parts of RelationGetIndexAttrBitmap() > only for INDEX_ATTR_BITMAP_IDENTITY_KEY and implement those > either in a logicalrep_write_attrs() or as a new function. > RelationGetIndexAttrBitmap() has 'restart' label for goto statement > in order to ensure to return up-to-date attribute bitmaps, so > I prefer having a new function when we choose this direction. > Having that goto in logicalrep_write_attrs() makes it a little bit messy, I felt. > > The other direction might be to extend RelationGetIndexAttrBitmap's function definition > to accept lockmode to give NoLock from logicalrep_write_attrs(). > But, this change impacts on other several callers so is not as good as the first direction above, I think. > > If someone has any better idea, please let me know. > I think the first idea is better than the second. OTOH, can we release the locks before SyncRepWaitForLSN(), since it already flush to local WAL files. -- Regrads, Japin Li. ChengDu WenWu Information Technology Co.,Ltd.
pgsql-hackers by date: