Thread: Truncate in synchronous logical replication failed

Truncate in synchronous logical replication failed

From
"tanghy.fnst@fujitsu.com"
Date:
Hi

I met a problem in synchronous logical replication. The client hangs when TRUNCATE TABLE at publisher.

Example of the procedure:
------publisher------
create table test (a int primary key);
create publication pub for table test;

------subscriber------
create table test (a int primary key);
create subscription sub connection 'dbname=postgres' publication pub;

Then, set synchronous_standby_names = 'sub’  on publisher, and reload publisher.

------publisher------
truncate test;

Then the client of publisher will wait for a long time. A moment later, the publisher and subscriber will report
followingerrors. 
Subscriber log
2021-04-07 12:13:07.700 CST [3542235] logical replication worker ERROR:  terminating logical replication worker due to
timeout
2021-04-07 12:13:07.722 CST [3542217] postmaster LOG:  background worker "logical replication worker" (PID 3542235)
exitedwith exit code 1 
2021-04-07 12:13:07.723 CST [3542357] logical replication worker LOG:  logical replication apply worker for
subscription"sub" has started 
2021-04-07 12:13:07.745 CST [3542357] logical replication worker ERROR:  could not start WAL streaming: ERROR:
replicationslot "sub" is active for PID 3542236 
Publisher log
2021-04-07 12:13:07.745 CST [3542358] walsender ERROR:  replication slot "sub" is active for PID 3542236
2021-04-07 12:13:07.745 CST [3542358] walsender STATEMENT:  START_REPLICATION SLOT "sub" LOGICAL 0/169ECE8
(proto_version'2', publication_names '"pub"') 

I checked the PG-DOC, found it says that “Replication of TRUNCATE commands is supported”[1], so maybe TRUNCATE is not
supportedin synchronous logical replication?  

If my understanding is right, maybe PG-DOC can be modified like this. Any thought?
Replication of TRUNCATE commands is supported
->
Replication of TRUNCATE commands is supported in asynchronous mode

[1]https://www.postgresql.org/docs/devel/logical-replication-restrictions.html

Regards,
Tang



Re: Truncate in synchronous logical replication failed

From
Amit Kapila
Date:
On Wed, Apr 7, 2021 at 12:26 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> Hi
>
> I met a problem in synchronous logical replication. The client hangs when TRUNCATE TABLE at publisher.
>

Can you please check if the behavior is the same for PG-13? This is
just to ensure that we have not introduced any bug in PG-14.

--
With Regards,
Amit Kapila.



RE: Truncate in synchronous logical replication failed

From
"tanghy.fnst@fujitsu.com"
Date:
On Wednesday, April 7, 2021 5:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote

>Can you please check if the behavior is the same for PG-13? This is
>just to ensure that we have not introduced any bug in PG-14.

Yes, same failure happens at PG-13, too.

Regards,
Tang

Re: Truncate in synchronous logical replication failed

From
Japin Li
Date:
On Wed, 07 Apr 2021 at 16:34, tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote:
> On Wednesday, April 7, 2021 5:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote
>
>>Can you please check if the behavior is the same for PG-13? This is
>>just to ensure that we have not introduced any bug in PG-14.
>
> Yes, same failure happens at PG-13, too.
>

I found that when we truncate a table in synchronous logical replication,
LockAcquireExtended() [1] will try to take a lock via fast path and it
failed (FastPathStrongRelationLocks->count[fasthashcode] = 1).
However, it can acquire the lock when in asynchronous logical replication.
I'm not familiar with the locks, any suggestions? What the difference
between sync and async logical replication for locks?

[1]
    if (EligibleForRelationFastPath(locktag, lockmode) &&
        FastPathLocalUseCount < FP_LOCK_SLOTS_PER_BACKEND)
    {
        uint32      fasthashcode = FastPathStrongLockHashPartition(hashcode);
        bool        acquired;

        /*
         * LWLockAcquire acts as a memory sequencing point, so it's safe to
         * assume that any strong locker whose increment to
         * FastPathStrongRelationLocks->counts becomes visible after we test
         * it has yet to begin to transfer fast-path locks.
         */
        LWLockAcquire(&MyProc->fpInfoLock, LW_EXCLUSIVE);
        if (FastPathStrongRelationLocks->count[fasthashcode] != 0)
            acquired = false;
        else
            acquired = FastPathGrantRelationLock(locktag->locktag_field2,
                                                 lockmode);
        LWLockRelease(&MyProc->fpInfoLock);
        if (acquired)
        {
            /*
             * The locallock might contain stale pointers to some old shared
             * objects; we MUST reset these to null before considering the
             * lock to be acquired via fast-path.
             */
            locallock->lock = NULL;
            locallock->proclock = NULL;
            GrantLockLocal(locallock, owner);
            return LOCKACQUIRE_OK;
        }
    }

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.



Re: Truncate in synchronous logical replication failed

From
Japin Li
Date:
On Thu, 08 Apr 2021 at 19:20, Japin Li <japinli@hotmail.com> wrote:
> On Wed, 07 Apr 2021 at 16:34, tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote:
>> On Wednesday, April 7, 2021 5:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote
>>
>>>Can you please check if the behavior is the same for PG-13? This is
>>>just to ensure that we have not introduced any bug in PG-14.
>>
>> Yes, same failure happens at PG-13, too.
>>
>
> I found that when we truncate a table in synchronous logical replication,
> LockAcquireExtended() [1] will try to take a lock via fast path and it
> failed (FastPathStrongRelationLocks->count[fasthashcode] = 1).
> However, it can acquire the lock when in asynchronous logical replication.
> I'm not familiar with the locks, any suggestions? What the difference
> between sync and async logical replication for locks?
>

After some analyze, I find that when the TRUNCATE finish, it will call
SyncRepWaitForLSN(), for asynchronous logical replication, it will exit
early, and then it calls ResourceOwnerRelease(RESOURCE_RELEASE_LOCKS) to
release the locks, so the walsender can acquire the lock.

But for synchronous logical replication, SyncRepWaitForLSN() will wait
for specified LSN to be confirmed, so it cannot release the lock, and
the walsender try to acquire the lock.  Obviously, it cannot acquire the
lock, because the lock hold by the process which performs TRUNCATE
command. This is why the TRUNCATE in synchronous logical replication is
blocked.


I don't know if it makes sense to fix this, if so, how to do fix it?
Thoughts?

--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.



RE: Truncate in synchronous logical replication failed

From
"osumi.takamichi@fujitsu.com"
Date:
Hi


On Saturday, April 10, 2021 11:52 PM Japin Li <japinli@hotmail.com> wrote:
> On Thu, 08 Apr 2021 at 19:20, Japin Li <japinli@hotmail.com> wrote:
> > On Wed, 07 Apr 2021 at 16:34, tanghy.fnst@fujitsu.com
> <tanghy.fnst@fujitsu.com> wrote:
> >> On Wednesday, April 7, 2021 5:28 PM Amit Kapila
> >> <amit.kapila16@gmail.com> wrote
> >>
> >>>Can you please check if the behavior is the same for PG-13? This is
> >>>just to ensure that we have not introduced any bug in PG-14.
> >>
> >> Yes, same failure happens at PG-13, too.
> >>
> >
> > I found that when we truncate a table in synchronous logical
> > replication,
> > LockAcquireExtended() [1] will try to take a lock via fast path and it
> > failed (FastPathStrongRelationLocks->count[fasthashcode] = 1).
> > However, it can acquire the lock when in asynchronous logical replication.
> > I'm not familiar with the locks, any suggestions? What the difference
> > between sync and async logical replication for locks?
> >
>
> After some analyze, I find that when the TRUNCATE finish, it will call
> SyncRepWaitForLSN(), for asynchronous logical replication, it will exit early,
> and then it calls ResourceOwnerRelease(RESOURCE_RELEASE_LOCKS) to
> release the locks, so the walsender can acquire the lock.
>
> But for synchronous logical replication, SyncRepWaitForLSN() will wait for
> specified LSN to be confirmed, so it cannot release the lock, and the
> walsender try to acquire the lock.  Obviously, it cannot acquire the lock,
> because the lock hold by the process which performs TRUNCATE command.
> This is why the TRUNCATE in synchronous logical replication is blocked.
Yeah, the TRUNCATE waits in SyncRepWaitForLSN() while
the walsender is blocked by the AccessExclusiveLock taken by it,
which makes the subscriber cannot take the change and leads to a sort of deadlock.


On Wednesday, April 7, 2021 3:56 PM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote:
> I checked the PG-DOC, found it says that “Replication of TRUNCATE
> commands is supported”[1], so maybe TRUNCATE is not supported in
> synchronous logical replication?
>
> If my understanding is right, maybe PG-DOC can be modified like this. Any
> thought?
> Replication of TRUNCATE commands is supported
> ->
> Replication of TRUNCATE commands is supported in asynchronous mode
I'm not sure if this becomes the final solution,
but if we take a measure to fix the doc, we have to be careful for the description,
because when we remove the primary keys of 'test' tables on the scenario in [1], we don't have this issue.
It means TRUNCATE in synchronous logical replication is not always blocked.

Having the primary key on the pub only causes the hang.
Also, I can observe the same hang using REPLICA IDENTITY USING INDEX and without primary key on the pub,
while I cannot reproduce the problem with the REPLICA IDENTITY FULL and without primary key.
This difference comes from logicalrep_write_attrs() which has a branch to call RelationGetIndexAttrBitmap().
Therefore, the description above is not correct, strictly speaking, I thought.

I'll share my analysis when I get a better idea to address this.

[1] -
https://www.postgresql.org/message-id/OS0PR01MB6113C2499C7DC70EE55ADB82FB759%40OS0PR01MB6113.jpnprd01.prod.outlook.com

Best Regards,
    Takamichi Osumi




Re: Truncate in synchronous logical replication failed

From
Amit Kapila
Date:
On Mon, Apr 12, 2021 at 10:03 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> > I checked the PG-DOC, found it says that “Replication of TRUNCATE
> > commands is supported”[1], so maybe TRUNCATE is not supported in
> > synchronous logical replication?
> >
> > If my understanding is right, maybe PG-DOC can be modified like this. Any
> > thought?
> > Replication of TRUNCATE commands is supported
> > ->
> > Replication of TRUNCATE commands is supported in asynchronous mode
> I'm not sure if this becomes the final solution,
>

I think unless the solution is not possible or extremely complicated
going via this route doesn't seem advisable.

> but if we take a measure to fix the doc, we have to be careful for the description,
> because when we remove the primary keys of 'test' tables on the scenario in [1], we don't have this issue.
> It means TRUNCATE in synchronous logical replication is not always blocked.
>

The problem happens only when we try to fetch IDENTITY_KEY attributes
because pgoutput uses RelationGetIndexAttrBitmap() to get that
information which locks the required indexes. Now, because TRUNCATE
has already acquired an exclusive lock on the index, it seems to
create a sort of deadlock where the actual Truncate operation waits
for logical replication of operation to complete and logical
replication waits for actual Truncate operation to finish.

Do we really need to use RelationGetIndexAttrBitmap() to build
IDENTITY_KEY attributes? During decoding, we don't even lock the main
relation, we just scan the system table and build that information
using a historic snapshot. Can't we do something similar here?

Adding Petr J. and Peter E. to know their views as this seems to be an
old problem (since the decoding of Truncate operation is introduced).

--
With Regards,
Amit Kapila.



RE: Truncate in synchronous logical replication failed

From
"osumi.takamichi@fujitsu.com"
Date:
On Monday, April 12, 2021 3:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Mon, Apr 12, 2021 at 10:03 AM osumi.takamichi@fujitsu.com
> <osumi.takamichi@fujitsu.com> wrote:
> > but if we take a measure to fix the doc, we have to be careful for the
> > description, because when we remove the primary keys of 'test' tables on the
> scenario in [1], we don't have this issue.
> > It means TRUNCATE in synchronous logical replication is not always
> blocked.
> >
> 
> The problem happens only when we try to fetch IDENTITY_KEY attributes
> because pgoutput uses RelationGetIndexAttrBitmap() to get that information
> which locks the required indexes. Now, because TRUNCATE has already
> acquired an exclusive lock on the index, it seems to create a sort of deadlock
> where the actual Truncate operation waits for logical replication of operation to
> complete and logical replication waits for actual Truncate operation to finish.
> 
> Do we really need to use RelationGetIndexAttrBitmap() to build IDENTITY_KEY
> attributes? During decoding, we don't even lock the main relation, we just scan
> the system table and build that information using a historic snapshot. Can't we
> do something similar here?
I think we can build the IDENTITY_KEY attributes with NoLock
instead of calling RelationGetIndexAttrBitmap().

When we trace back the caller side of logicalrep_write_attrs(),
doing the thing equivalent to RelationGetIndexAttrBitmap()
for INDEX_ATTR_BITMAP_IDENTITY_KEY impacts only pgoutput_truncate.

OTOH, I can't find codes similar to RelationGetIndexAttrBitmap()
in pgoutput_* functions and in the file of relcache.c.
Therefore, I'd like to discuss how to address the hang.

My first idea is to extract some parts of RelationGetIndexAttrBitmap()
only for INDEX_ATTR_BITMAP_IDENTITY_KEY and implement those
either in a logicalrep_write_attrs() or as a new function.
RelationGetIndexAttrBitmap() has 'restart' label for goto statement
in order to ensure to return up-to-date attribute bitmaps, so
I prefer having a new function when we choose this direction.
Having that goto in logicalrep_write_attrs() makes it a little bit messy, I felt.

The other direction might be to extend RelationGetIndexAttrBitmap's function definition
to accept lockmode to give NoLock from logicalrep_write_attrs().
But, this change impacts on other several callers so is not as good as the first direction above, I think.

If someone has any better idea, please let me know.

Best Regards,
    Takamichi Osumi


Re: Truncate in synchronous logical replication failed

From
Petr Jelinek
Date:
> On 12 Apr 2021, at 08:58, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Apr 12, 2021 at 10:03 AM osumi.takamichi@fujitsu.com
> <osumi.takamichi@fujitsu.com> wrote:
>>
>>> I checked the PG-DOC, found it says that “Replication of TRUNCATE
>>> commands is supported”[1], so maybe TRUNCATE is not supported in
>>> synchronous logical replication?
>>>
>>> If my understanding is right, maybe PG-DOC can be modified like this. Any
>>> thought?
>>> Replication of TRUNCATE commands is supported
>>> ->
>>> Replication of TRUNCATE commands is supported in asynchronous mode
>> I'm not sure if this becomes the final solution,
>>
>
> I think unless the solution is not possible or extremely complicated
> going via this route doesn't seem advisable.
>
>> but if we take a measure to fix the doc, we have to be careful for the description,
>> because when we remove the primary keys of 'test' tables on the scenario in [1], we don't have this issue.
>> It means TRUNCATE in synchronous logical replication is not always blocked.
>>
>
> The problem happens only when we try to fetch IDENTITY_KEY attributes
> because pgoutput uses RelationGetIndexAttrBitmap() to get that
> information which locks the required indexes. Now, because TRUNCATE
> has already acquired an exclusive lock on the index, it seems to
> create a sort of deadlock where the actual Truncate operation waits
> for logical replication of operation to complete and logical
> replication waits for actual Truncate operation to finish.
>
> Do we really need to use RelationGetIndexAttrBitmap() to build
> IDENTITY_KEY attributes? During decoding, we don't even lock the main
> relation, we just scan the system table and build that information
> using a historic snapshot. Can't we do something similar here?
>
> Adding Petr J. and Peter E. to know their views as this seems to be an
> old problem (since the decoding of Truncate operation is introduced).

We used RelationGetIndexAttrBitmap because it already existed, no other reason. I am not sure what exact locking we
needbut I would have guessed at least AccessShareLock would be needed. 

--
Petr




Re: Truncate in synchronous logical replication failed

From
Japin Li
Date:
On Tue, 13 Apr 2021 at 21:54, osumi.takamichi@fujitsu.com <osumi.takamichi@fujitsu.com> wrote:
> On Monday, April 12, 2021 3:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>> On Mon, Apr 12, 2021 at 10:03 AM osumi.takamichi@fujitsu.com
>> <osumi.takamichi@fujitsu.com> wrote:
>> > but if we take a measure to fix the doc, we have to be careful for the
>> > description, because when we remove the primary keys of 'test' tables on the
>> scenario in [1], we don't have this issue.
>> > It means TRUNCATE in synchronous logical replication is not always
>> blocked.
>> >
>> 
>> The problem happens only when we try to fetch IDENTITY_KEY attributes
>> because pgoutput uses RelationGetIndexAttrBitmap() to get that information
>> which locks the required indexes. Now, because TRUNCATE has already
>> acquired an exclusive lock on the index, it seems to create a sort of deadlock
>> where the actual Truncate operation waits for logical replication of operation to
>> complete and logical replication waits for actual Truncate operation to finish.
>> 
>> Do we really need to use RelationGetIndexAttrBitmap() to build IDENTITY_KEY
>> attributes? During decoding, we don't even lock the main relation, we just scan
>> the system table and build that information using a historic snapshot. Can't we
>> do something similar here?
> I think we can build the IDENTITY_KEY attributes with NoLock
> instead of calling RelationGetIndexAttrBitmap().
>
> When we trace back the caller side of logicalrep_write_attrs(),
> doing the thing equivalent to RelationGetIndexAttrBitmap()
> for INDEX_ATTR_BITMAP_IDENTITY_KEY impacts only pgoutput_truncate.
>
> OTOH, I can't find codes similar to RelationGetIndexAttrBitmap()
> in pgoutput_* functions and in the file of relcache.c.
> Therefore, I'd like to discuss how to address the hang.
>
> My first idea is to extract some parts of RelationGetIndexAttrBitmap()
> only for INDEX_ATTR_BITMAP_IDENTITY_KEY and implement those
> either in a logicalrep_write_attrs() or as a new function.
> RelationGetIndexAttrBitmap() has 'restart' label for goto statement
> in order to ensure to return up-to-date attribute bitmaps, so
> I prefer having a new function when we choose this direction.
> Having that goto in logicalrep_write_attrs() makes it a little bit messy, I felt.
>
> The other direction might be to extend RelationGetIndexAttrBitmap's function definition
> to accept lockmode to give NoLock from logicalrep_write_attrs().
> But, this change impacts on other several callers so is not as good as the first direction above, I think.
>
> If someone has any better idea, please let me know.
>

I think the first idea is better than the second.  OTOH, can we release the
locks before SyncRepWaitForLSN(), since it already flush to local WAL files.

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.



RE: Truncate in synchronous logical replication failed

From
"osumi.takamichi@fujitsu.com"
Date:
On Tuesday, April 13, 2021 11:38 PM Petr Jelinek <petr.jelinek@enterprisedb.com> wrote:
> > On 12 Apr 2021, at 08:58, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > On Mon, Apr 12, 2021 at 10:03 AM osumi.takamichi@fujitsu.com
> > <osumi.takamichi@fujitsu.com> wrote:
> >>
> >>> I checked the PG-DOC, found it says that “Replication of TRUNCATE
> >>> commands is supported”[1], so maybe TRUNCATE is not supported in
> >>> synchronous logical replication?
> >>>
> >>> If my understanding is right, maybe PG-DOC can be modified like
> >>> this. Any thought?
> >>> Replication of TRUNCATE commands is supported
> >>> ->
> >>> Replication of TRUNCATE commands is supported in asynchronous
> mode
> >> I'm not sure if this becomes the final solution,
> >>
> >
> > I think unless the solution is not possible or extremely complicated
> > going via this route doesn't seem advisable.
> >
> >> but if we take a measure to fix the doc, we have to be careful for
> >> the description, because when we remove the primary keys of 'test' tables
> on the scenario in [1], we don't have this issue.
> >> It means TRUNCATE in synchronous logical replication is not always
> blocked.
> >>
> >
> > The problem happens only when we try to fetch IDENTITY_KEY attributes
> > because pgoutput uses RelationGetIndexAttrBitmap() to get that
> > information which locks the required indexes. Now, because TRUNCATE
> > has already acquired an exclusive lock on the index, it seems to
> > create a sort of deadlock where the actual Truncate operation waits
> > for logical replication of operation to complete and logical
> > replication waits for actual Truncate operation to finish.
> >
> > Do we really need to use RelationGetIndexAttrBitmap() to build
> > IDENTITY_KEY attributes? During decoding, we don't even lock the main
> > relation, we just scan the system table and build that information
> > using a historic snapshot. Can't we do something similar here?
> >
> > Adding Petr J. and Peter E. to know their views as this seems to be an
> > old problem (since the decoding of Truncate operation is introduced).
> 
> We used RelationGetIndexAttrBitmap because it already existed, no other
> reason.I am not sure what exact locking we need but I would have guessed at
> least AccessShareLock would be needed.
This was true.

Having a look at the comment of index_open(), there's a description of basic rule
that NoLock should be used if appropriate lock on the index is already taken.
And, making the walsender use NoLock to build the attributes
leads us to the Assert in the relation_open().

Please ignore the two ideas I suggested in another mail,
which doesn't follow the basic and work.

Best Regards,
    Takamichi Osumi


RE: Truncate in synchronous logical replication failed

From
"osumi.takamichi@fujitsu.com"
Date:
On Wednesday, April 14, 2021 11:38 AM Japin Li <japinli@hotmail.com> wrote:
> On Tue, 13 Apr 2021 at 21:54, osumi.takamichi@fujitsu.com
> <osumi.takamichi@fujitsu.com> wrote:
> > On Monday, April 12, 2021 3:58 PM Amit Kapila <amit.kapila16@gmail.com>
> wrote:
> >> On Mon, Apr 12, 2021 at 10:03 AM osumi.takamichi@fujitsu.com
> >> <osumi.takamichi@fujitsu.com> wrote:
> >> > but if we take a measure to fix the doc, we have to be careful for
> >> > the description, because when we remove the primary keys of 'test'
> >> > tables on the
> >> scenario in [1], we don't have this issue.
> >> > It means TRUNCATE in synchronous logical replication is not always
> >> blocked.
> >> >
> >>
> >> The problem happens only when we try to fetch IDENTITY_KEY attributes
> >> because pgoutput uses RelationGetIndexAttrBitmap() to get that
> >> information which locks the required indexes. Now, because TRUNCATE
> >> has already acquired an exclusive lock on the index, it seems to
> >> create a sort of deadlock where the actual Truncate operation waits
> >> for logical replication of operation to complete and logical replication waits
> for actual Truncate operation to finish.
> >>
> >> Do we really need to use RelationGetIndexAttrBitmap() to build
> >> IDENTITY_KEY attributes? During decoding, we don't even lock the main
> >> relation, we just scan the system table and build that information
> >> using a historic snapshot. Can't we do something similar here?
> > I think we can build the IDENTITY_KEY attributes with NoLock instead
> > of calling RelationGetIndexAttrBitmap().
> >
> > When we trace back the caller side of logicalrep_write_attrs(), doing
> > the thing equivalent to RelationGetIndexAttrBitmap() for
> > INDEX_ATTR_BITMAP_IDENTITY_KEY impacts only pgoutput_truncate.
> >
> > OTOH, I can't find codes similar to RelationGetIndexAttrBitmap() in
> > pgoutput_* functions and in the file of relcache.c.
> > Therefore, I'd like to discuss how to address the hang.
> >
> > My first idea is to extract some parts of RelationGetIndexAttrBitmap()
> > only for INDEX_ATTR_BITMAP_IDENTITY_KEY and implement those either
> in
> > a logicalrep_write_attrs() or as a new function.
> > RelationGetIndexAttrBitmap() has 'restart' label for goto statement in
> > order to ensure to return up-to-date attribute bitmaps, so I prefer
> > having a new function when we choose this direction.
> > Having that goto in logicalrep_write_attrs() makes it a little bit messy, I felt.
> >
> > The other direction might be to extend RelationGetIndexAttrBitmap's
> > function definition to accept lockmode to give NoLock from
> logicalrep_write_attrs().
> > But, this change impacts on other several callers so is not as good as the first
> direction above, I think.
> >
> > If someone has any better idea, please let me know.
> >
>
> I think the first idea is better than the second.  OTOH, can we release the locks
> before SyncRepWaitForLSN(), since it already flush to local WAL files.
Thank you for your comments.
I didn't mean to change and touch TRUNCATE side to release the locks,
because I expected that its AccessExclusiveLock
protects any other operation (e.g. DROP INDEX) to the table
which affects IDENTITY KEY building. But, now as I said in another e-mail,
both ideas above can't work. Really sorry for making noises.


Best Regards,
    Takamichi Osumi




Re: Truncate in synchronous logical replication failed

From
Amit Kapila
Date:
On Tue, Apr 13, 2021 at 8:07 PM Petr Jelinek
<petr.jelinek@enterprisedb.com> wrote:
>
> > On 12 Apr 2021, at 08:58, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > The problem happens only when we try to fetch IDENTITY_KEY attributes
> > because pgoutput uses RelationGetIndexAttrBitmap() to get that
> > information which locks the required indexes. Now, because TRUNCATE
> > has already acquired an exclusive lock on the index, it seems to
> > create a sort of deadlock where the actual Truncate operation waits
> > for logical replication of operation to complete and logical
> > replication waits for actual Truncate operation to finish.
> >
> > Do we really need to use RelationGetIndexAttrBitmap() to build
> > IDENTITY_KEY attributes? During decoding, we don't even lock the main
> > relation, we just scan the system table and build that information
> > using a historic snapshot. Can't we do something similar here?
> >
> > Adding Petr J. and Peter E. to know their views as this seems to be an
> > old problem (since the decoding of Truncate operation is introduced).
>
> We used RelationGetIndexAttrBitmap because it already existed, no other reason.
>

Fair enough. But I think we should do something about it because using
the same (RelationGetIndexAttrBitmap) just breaks the synchronous
logical replication. I think this is broken since the logical
replication of Truncate is supported.

> I am not sure what exact locking we need but I would have guessed at least AccessShareLock would be needed.
>

Are you telling that we need AccessShareLock on the index? If so, why
is it different from how we access the relation during decoding
(basically in ReorderBufferProcessTXN, we directly use
RelationIdGetRelation() without any lock on the relation)? I think we
do it that way because we need it to process WAL entries and we need
the relation state based on the historic snapshot, so, even if the
relation is later changed/dropped, we are fine with the old state we
got. Isn't the same thing applies here in logicalrep_write_attrs? If
that is true then some equivalent of RelationGetIndexAttrBitmap where
we use RelationIdGetRelation instead of index_open should work? Am, I
missing something?

-- 
With Regards,
Amit Kapila.



Re: Truncate in synchronous logical replication failed

From
Amit Kapila
Date:
On Wed, Apr 14, 2021 at 3:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Apr 13, 2021 at 8:07 PM Petr Jelinek
> <petr.jelinek@enterprisedb.com> wrote:
> >
> > > On 12 Apr 2021, at 08:58, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > The problem happens only when we try to fetch IDENTITY_KEY attributes
> > > because pgoutput uses RelationGetIndexAttrBitmap() to get that
> > > information which locks the required indexes. Now, because TRUNCATE
> > > has already acquired an exclusive lock on the index, it seems to
> > > create a sort of deadlock where the actual Truncate operation waits
> > > for logical replication of operation to complete and logical
> > > replication waits for actual Truncate operation to finish.
> > >
> > > Do we really need to use RelationGetIndexAttrBitmap() to build
> > > IDENTITY_KEY attributes? During decoding, we don't even lock the main
> > > relation, we just scan the system table and build that information
> > > using a historic snapshot. Can't we do something similar here?
> > >
> > > Adding Petr J. and Peter E. to know their views as this seems to be an
> > > old problem (since the decoding of Truncate operation is introduced).
> >
> > We used RelationGetIndexAttrBitmap because it already existed, no other reason.
> >
>
> Fair enough. But I think we should do something about it because using
> the same (RelationGetIndexAttrBitmap) just breaks the synchronous
> logical replication. I think this is broken since the logical
> replication of Truncate is supported.
>
> > I am not sure what exact locking we need but I would have guessed at least AccessShareLock would be needed.
> >
>
> Are you telling that we need AccessShareLock on the index? If so, why
> is it different from how we access the relation during decoding
> (basically in ReorderBufferProcessTXN, we directly use
> RelationIdGetRelation() without any lock on the relation)? I think we
> do it that way because we need it to process WAL entries and we need
> the relation state based on the historic snapshot, so, even if the
> relation is later changed/dropped, we are fine with the old state we
> got. Isn't the same thing applies here in logicalrep_write_attrs? If
> that is true then some equivalent of RelationGetIndexAttrBitmap where
> we use RelationIdGetRelation instead of index_open should work?
>

Today, again I have thought about this and don't see a problem with
the above idea. If the above understanding is correct, then I think
for our purpose in pgoutput, we just need to call RelationGetIndexList
and then build the idattr list for relation->rd_replidindex. We can
then cache it in relation->rd_idattr. I am not sure if it is really
required to do all the other work in RelationGetIndexAttrBitmap.

-- 
With Regards,
Amit Kapila.



Re: Truncate in synchronous logical replication failed

From
Japin Li
Date:
On Thu, 15 Apr 2021 at 18:22, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Wed, Apr 14, 2021 at 3:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Tue, Apr 13, 2021 at 8:07 PM Petr Jelinek
>> <petr.jelinek@enterprisedb.com> wrote:
>> >
>> > > On 12 Apr 2021, at 08:58, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> > >
>> > > The problem happens only when we try to fetch IDENTITY_KEY attributes
>> > > because pgoutput uses RelationGetIndexAttrBitmap() to get that
>> > > information which locks the required indexes. Now, because TRUNCATE
>> > > has already acquired an exclusive lock on the index, it seems to
>> > > create a sort of deadlock where the actual Truncate operation waits
>> > > for logical replication of operation to complete and logical
>> > > replication waits for actual Truncate operation to finish.
>> > >
>> > > Do we really need to use RelationGetIndexAttrBitmap() to build
>> > > IDENTITY_KEY attributes? During decoding, we don't even lock the main
>> > > relation, we just scan the system table and build that information
>> > > using a historic snapshot. Can't we do something similar here?
>> > >
>> > > Adding Petr J. and Peter E. to know their views as this seems to be an
>> > > old problem (since the decoding of Truncate operation is introduced).
>> >
>> > We used RelationGetIndexAttrBitmap because it already existed, no other reason.
>> >
>>
>> Fair enough. But I think we should do something about it because using
>> the same (RelationGetIndexAttrBitmap) just breaks the synchronous
>> logical replication. I think this is broken since the logical
>> replication of Truncate is supported.
>>
>> > I am not sure what exact locking we need but I would have guessed at least AccessShareLock would be needed.
>> >
>>
>> Are you telling that we need AccessShareLock on the index? If so, why
>> is it different from how we access the relation during decoding
>> (basically in ReorderBufferProcessTXN, we directly use
>> RelationIdGetRelation() without any lock on the relation)? I think we
>> do it that way because we need it to process WAL entries and we need
>> the relation state based on the historic snapshot, so, even if the
>> relation is later changed/dropped, we are fine with the old state we
>> got. Isn't the same thing applies here in logicalrep_write_attrs? If
>> that is true then some equivalent of RelationGetIndexAttrBitmap where
>> we use RelationIdGetRelation instead of index_open should work?
>>
>
> Today, again I have thought about this and don't see a problem with
> the above idea. If the above understanding is correct, then I think
> for our purpose in pgoutput, we just need to call RelationGetIndexList
> and then build the idattr list for relation->rd_replidindex.

Sorry, I don't know how can we build the idattr without open the index.
If we should open the index, then we should use NoLock, since the TRUNCATE
side hold AccessExclusiveLock. As Osumi points out in [1], The NoLock mode
assumes that the  appropriate lock on the index is already taken.

Please let me know if I have misunderstood.

[1]
https://www.postgresql.org/message-id/OSBPR01MB488834BDBD67BFF2FB048B3DED4E9%40OSBPR01MB4888.jpnprd01.prod.outlook.com

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.



Re: Truncate in synchronous logical replication failed

From
Amit Kapila
Date:
On Thu, Apr 15, 2021 at 4:30 PM Japin Li <japinli@hotmail.com> wrote:
>
>
> On Thu, 15 Apr 2021 at 18:22, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > On Wed, Apr 14, 2021 at 3:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >>
> >> On Tue, Apr 13, 2021 at 8:07 PM Petr Jelinek
> >> <petr.jelinek@enterprisedb.com> wrote:
> >> >
> >> > > On 12 Apr 2021, at 08:58, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >> > >
> >> > > The problem happens only when we try to fetch IDENTITY_KEY attributes
> >> > > because pgoutput uses RelationGetIndexAttrBitmap() to get that
> >> > > information which locks the required indexes. Now, because TRUNCATE
> >> > > has already acquired an exclusive lock on the index, it seems to
> >> > > create a sort of deadlock where the actual Truncate operation waits
> >> > > for logical replication of operation to complete and logical
> >> > > replication waits for actual Truncate operation to finish.
> >> > >
> >> > > Do we really need to use RelationGetIndexAttrBitmap() to build
> >> > > IDENTITY_KEY attributes? During decoding, we don't even lock the main
> >> > > relation, we just scan the system table and build that information
> >> > > using a historic snapshot. Can't we do something similar here?
> >> > >
> >> > > Adding Petr J. and Peter E. to know their views as this seems to be an
> >> > > old problem (since the decoding of Truncate operation is introduced).
> >> >
> >> > We used RelationGetIndexAttrBitmap because it already existed, no other reason.
> >> >
> >>
> >> Fair enough. But I think we should do something about it because using
> >> the same (RelationGetIndexAttrBitmap) just breaks the synchronous
> >> logical replication. I think this is broken since the logical
> >> replication of Truncate is supported.
> >>
> >> > I am not sure what exact locking we need but I would have guessed at least AccessShareLock would be needed.
> >> >
> >>
> >> Are you telling that we need AccessShareLock on the index? If so, why
> >> is it different from how we access the relation during decoding
> >> (basically in ReorderBufferProcessTXN, we directly use
> >> RelationIdGetRelation() without any lock on the relation)? I think we
> >> do it that way because we need it to process WAL entries and we need
> >> the relation state based on the historic snapshot, so, even if the
> >> relation is later changed/dropped, we are fine with the old state we
> >> got. Isn't the same thing applies here in logicalrep_write_attrs? If
> >> that is true then some equivalent of RelationGetIndexAttrBitmap where
> >> we use RelationIdGetRelation instead of index_open should work?
> >>
> >
> > Today, again I have thought about this and don't see a problem with
> > the above idea. If the above understanding is correct, then I think
> > for our purpose in pgoutput, we just need to call RelationGetIndexList
> > and then build the idattr list for relation->rd_replidindex.
>
> Sorry, I don't know how can we build the idattr without open the index.
> If we should open the index, then we should use NoLock, since the TRUNCATE
> side hold AccessExclusiveLock. As Osumi points out in [1], The NoLock mode
> assumes that the  appropriate lock on the index is already taken.
>

Why can't we use RelationIdGetRelation() by passing the required
indexOid to it?


-- 
With Regards,
Amit Kapila.



Re: Truncate in synchronous logical replication failed

From
Japin Li
Date:
On Thu, 15 Apr 2021 at 19:25, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Thu, Apr 15, 2021 at 4:30 PM Japin Li <japinli@hotmail.com> wrote:
>>
>>
>> On Thu, 15 Apr 2021 at 18:22, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> > On Wed, Apr 14, 2021 at 3:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>> >>
>> >> On Tue, Apr 13, 2021 at 8:07 PM Petr Jelinek
>> >> <petr.jelinek@enterprisedb.com> wrote:
>> >> >
>> >> > > On 12 Apr 2021, at 08:58, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> >> > >
>> >> > > The problem happens only when we try to fetch IDENTITY_KEY attributes
>> >> > > because pgoutput uses RelationGetIndexAttrBitmap() to get that
>> >> > > information which locks the required indexes. Now, because TRUNCATE
>> >> > > has already acquired an exclusive lock on the index, it seems to
>> >> > > create a sort of deadlock where the actual Truncate operation waits
>> >> > > for logical replication of operation to complete and logical
>> >> > > replication waits for actual Truncate operation to finish.
>> >> > >
>> >> > > Do we really need to use RelationGetIndexAttrBitmap() to build
>> >> > > IDENTITY_KEY attributes? During decoding, we don't even lock the main
>> >> > > relation, we just scan the system table and build that information
>> >> > > using a historic snapshot. Can't we do something similar here?
>> >> > >
>> >> > > Adding Petr J. and Peter E. to know their views as this seems to be an
>> >> > > old problem (since the decoding of Truncate operation is introduced).
>> >> >
>> >> > We used RelationGetIndexAttrBitmap because it already existed, no other reason.
>> >> >
>> >>
>> >> Fair enough. But I think we should do something about it because using
>> >> the same (RelationGetIndexAttrBitmap) just breaks the synchronous
>> >> logical replication. I think this is broken since the logical
>> >> replication of Truncate is supported.
>> >>
>> >> > I am not sure what exact locking we need but I would have guessed at least AccessShareLock would be needed.
>> >> >
>> >>
>> >> Are you telling that we need AccessShareLock on the index? If so, why
>> >> is it different from how we access the relation during decoding
>> >> (basically in ReorderBufferProcessTXN, we directly use
>> >> RelationIdGetRelation() without any lock on the relation)? I think we
>> >> do it that way because we need it to process WAL entries and we need
>> >> the relation state based on the historic snapshot, so, even if the
>> >> relation is later changed/dropped, we are fine with the old state we
>> >> got. Isn't the same thing applies here in logicalrep_write_attrs? If
>> >> that is true then some equivalent of RelationGetIndexAttrBitmap where
>> >> we use RelationIdGetRelation instead of index_open should work?
>> >>
>> >
>> > Today, again I have thought about this and don't see a problem with
>> > the above idea. If the above understanding is correct, then I think
>> > for our purpose in pgoutput, we just need to call RelationGetIndexList
>> > and then build the idattr list for relation->rd_replidindex.
>>
>> Sorry, I don't know how can we build the idattr without open the index.
>> If we should open the index, then we should use NoLock, since the TRUNCATE
>> side hold AccessExclusiveLock. As Osumi points out in [1], The NoLock mode
>> assumes that the  appropriate lock on the index is already taken.
>>
>
> Why can't we use RelationIdGetRelation() by passing the required
> indexOid to it?

Thanks for your reminder.  It might be a way to solve this problem.
I'll try it later.

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.



Re: Truncate in synchronous logical replication failed

From
Japin Li
Date:
On Thu, 15 Apr 2021 at 19:25, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Thu, Apr 15, 2021 at 4:30 PM Japin Li <japinli@hotmail.com> wrote:
>>
>>
>> On Thu, 15 Apr 2021 at 18:22, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> > On Wed, Apr 14, 2021 at 3:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>> >>
>> >> On Tue, Apr 13, 2021 at 8:07 PM Petr Jelinek
>> >> <petr.jelinek@enterprisedb.com> wrote:
>> >> >
>> >> > > On 12 Apr 2021, at 08:58, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> >> > >
>> >> > > The problem happens only when we try to fetch IDENTITY_KEY attributes
>> >> > > because pgoutput uses RelationGetIndexAttrBitmap() to get that
>> >> > > information which locks the required indexes. Now, because TRUNCATE
>> >> > > has already acquired an exclusive lock on the index, it seems to
>> >> > > create a sort of deadlock where the actual Truncate operation waits
>> >> > > for logical replication of operation to complete and logical
>> >> > > replication waits for actual Truncate operation to finish.
>> >> > >
>> >> > > Do we really need to use RelationGetIndexAttrBitmap() to build
>> >> > > IDENTITY_KEY attributes? During decoding, we don't even lock the main
>> >> > > relation, we just scan the system table and build that information
>> >> > > using a historic snapshot. Can't we do something similar here?
>> >> > >
>> >> > > Adding Petr J. and Peter E. to know their views as this seems to be an
>> >> > > old problem (since the decoding of Truncate operation is introduced).
>> >> >
>> >> > We used RelationGetIndexAttrBitmap because it already existed, no other reason.
>> >> >
>> >>
>> >> Fair enough. But I think we should do something about it because using
>> >> the same (RelationGetIndexAttrBitmap) just breaks the synchronous
>> >> logical replication. I think this is broken since the logical
>> >> replication of Truncate is supported.
>> >>
>> >> > I am not sure what exact locking we need but I would have guessed at least AccessShareLock would be needed.
>> >> >
>> >>
>> >> Are you telling that we need AccessShareLock on the index? If so, why
>> >> is it different from how we access the relation during decoding
>> >> (basically in ReorderBufferProcessTXN, we directly use
>> >> RelationIdGetRelation() without any lock on the relation)? I think we
>> >> do it that way because we need it to process WAL entries and we need
>> >> the relation state based on the historic snapshot, so, even if the
>> >> relation is later changed/dropped, we are fine with the old state we
>> >> got. Isn't the same thing applies here in logicalrep_write_attrs? If
>> >> that is true then some equivalent of RelationGetIndexAttrBitmap where
>> >> we use RelationIdGetRelation instead of index_open should work?
>> >>
>> >
>> > Today, again I have thought about this and don't see a problem with
>> > the above idea. If the above understanding is correct, then I think
>> > for our purpose in pgoutput, we just need to call RelationGetIndexList
>> > and then build the idattr list for relation->rd_replidindex.
>>
>> Sorry, I don't know how can we build the idattr without open the index.
>> If we should open the index, then we should use NoLock, since the TRUNCATE
>> side hold AccessExclusiveLock. As Osumi points out in [1], The NoLock mode
>> assumes that the  appropriate lock on the index is already taken.
>>
>
> Why can't we use RelationIdGetRelation() by passing the required
> indexOid to it?

Hi Amit, as your suggested, I try to use RelationIdGetRelation() replace
the index_open() to avoid specify the AccessSharedLock, then the TRUNCATE
will not be blocked.

Attached patch fixed it. Thoughts?

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.


Attachment

RE: Truncate in synchronous logical replication failed

From
"osumi.takamichi@fujitsu.com"
Date:
Hi


On Friday, April 16, 2021 11:02 AM Japin Li <japinli@hotmail.com>
> On Thu, 15 Apr 2021 at 19:25, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > On Thu, Apr 15, 2021 at 4:30 PM Japin Li <japinli@hotmail.com> wrote:
> >>
> >>
> >> On Thu, 15 Apr 2021 at 18:22, Amit Kapila <amit.kapila16@gmail.com>
> wrote:
> >> > On Wed, Apr 14, 2021 at 3:31 PM Amit Kapila
> <amit.kapila16@gmail.com> wrote:
> >> >>
> >> >> On Tue, Apr 13, 2021 at 8:07 PM Petr Jelinek
> >> >> <petr.jelinek@enterprisedb.com> wrote:
> >> >> >
> >> >> > > On 12 Apr 2021, at 08:58, Amit Kapila <amit.kapila16@gmail.com>
> wrote:
> >> >> > >
> >> >> > > The problem happens only when we try to fetch IDENTITY_KEY
> >> >> > > attributes because pgoutput uses RelationGetIndexAttrBitmap()
> >> >> > > to get that information which locks the required indexes. Now,
> >> >> > > because TRUNCATE has already acquired an exclusive lock on the
> >> >> > > index, it seems to create a sort of deadlock where the actual
> >> >> > > Truncate operation waits for logical replication of operation
> >> >> > > to complete and logical replication waits for actual Truncate
> operation to finish.
> >> >> > >
> >> >> > > Do we really need to use RelationGetIndexAttrBitmap() to build
> >> >> > > IDENTITY_KEY attributes? During decoding, we don't even lock
> >> >> > > the main relation, we just scan the system table and build
> >> >> > > that information using a historic snapshot. Can't we do something
> similar here?
> >> >> > >
> >> >> > > Adding Petr J. and Peter E. to know their views as this seems
> >> >> > > to be an old problem (since the decoding of Truncate operation is
> introduced).
> >> >> >
> >> >> > We used RelationGetIndexAttrBitmap because it already existed, no
> other reason.
> >> >> >
> >> >>
> >> >> Fair enough. But I think we should do something about it because
> >> >> using the same (RelationGetIndexAttrBitmap) just breaks the
> >> >> synchronous logical replication. I think this is broken since the
> >> >> logical replication of Truncate is supported.
> >> >>
> >> >> > I am not sure what exact locking we need but I would have guessed
> at least AccessShareLock would be needed.
> >> >> >
> >> >>
> >> >> Are you telling that we need AccessShareLock on the index? If so,
> >> >> why is it different from how we access the relation during
> >> >> decoding (basically in ReorderBufferProcessTXN, we directly use
> >> >> RelationIdGetRelation() without any lock on the relation)? I think
> >> >> we do it that way because we need it to process WAL entries and we
> >> >> need the relation state based on the historic snapshot, so, even
> >> >> if the relation is later changed/dropped, we are fine with the old
> >> >> state we got. Isn't the same thing applies here in
> >> >> logicalrep_write_attrs? If that is true then some equivalent of
> >> >> RelationGetIndexAttrBitmap where we use RelationIdGetRelation
> instead of index_open should work?
> >> >>
> >> >
> >> > Today, again I have thought about this and don't see a problem with
> >> > the above idea. If the above understanding is correct, then I think
> >> > for our purpose in pgoutput, we just need to call
> >> > RelationGetIndexList and then build the idattr list for
> relation->rd_replidindex.
> >>
> >> Sorry, I don't know how can we build the idattr without open the index.
> >> If we should open the index, then we should use NoLock, since the
> >> TRUNCATE side hold AccessExclusiveLock. As Osumi points out in [1],
> >> The NoLock mode assumes that the  appropriate lock on the index is
> already taken.
> >>
> >
> > Why can't we use RelationIdGetRelation() by passing the required
> > indexOid to it?
>
> Thanks for your reminder.  It might be a way to solve this problem.
Yeah. I've made the 1st patch for this issue.

In my env, with the patch
the TRUNCATE in synchronous logical replication doesn't hang.
It's OK with make check-world as well.


Best Regards,
    Takamichi Osumi


Attachment

Re: Truncate in synchronous logical replication failed

From
Amit Kapila
Date:
On Fri, Apr 16, 2021 at 12:56 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> > Thanks for your reminder.  It might be a way to solve this problem.
> Yeah. I've made the 1st patch for this issue.
>
> In my env, with the patch
> the TRUNCATE in synchronous logical replication doesn't hang.
>

Few initial comments:
=====================
1.
+ relreplindex = relation->rd_replidindex;
+
+ /*
+ * build attributes to idindexattrs.
+ */
+ idindexattrs = NULL;
+ foreach(l, indexoidlist)
+ {
+ Oid indexOid = lfirst_oid(l);
+ Relation indexDesc;
+ int i;
+ bool isIDKey; /* replica identity index */
+
+ indexDesc = RelationIdGetRelation(indexOid);

When you have oid of replica identity index (relreplindex) then what
is the need to traverse all the indexes?

2.
It is better to name the function as RelationGet...

-- 
With Regards,
Amit Kapila.



Re: Truncate in synchronous logical replication failed

From
Amit Kapila
Date:
On Fri, Apr 16, 2021 at 12:55 PM Japin Li <japinli@hotmail.com> wrote:
>
> On Thu, 15 Apr 2021 at 19:25, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > On Thu, Apr 15, 2021 at 4:30 PM Japin Li <japinli@hotmail.com> wrote:
> >>
> >>
> >> On Thu, 15 Apr 2021 at 18:22, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >> > On Wed, Apr 14, 2021 at 3:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >> >>
> >> >> On Tue, Apr 13, 2021 at 8:07 PM Petr Jelinek
> >> >> <petr.jelinek@enterprisedb.com> wrote:
> >> >> >
> >> >> > > On 12 Apr 2021, at 08:58, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >> >> > >
> >> >> > > The problem happens only when we try to fetch IDENTITY_KEY attributes
> >> >> > > because pgoutput uses RelationGetIndexAttrBitmap() to get that
> >> >> > > information which locks the required indexes. Now, because TRUNCATE
> >> >> > > has already acquired an exclusive lock on the index, it seems to
> >> >> > > create a sort of deadlock where the actual Truncate operation waits
> >> >> > > for logical replication of operation to complete and logical
> >> >> > > replication waits for actual Truncate operation to finish.
> >> >> > >
> >> >> > > Do we really need to use RelationGetIndexAttrBitmap() to build
> >> >> > > IDENTITY_KEY attributes? During decoding, we don't even lock the main
> >> >> > > relation, we just scan the system table and build that information
> >> >> > > using a historic snapshot. Can't we do something similar here?
> >> >> > >
> >> >> > > Adding Petr J. and Peter E. to know their views as this seems to be an
> >> >> > > old problem (since the decoding of Truncate operation is introduced).
> >> >> >
> >> >> > We used RelationGetIndexAttrBitmap because it already existed, no other reason.
> >> >> >
> >> >>
> >> >> Fair enough. But I think we should do something about it because using
> >> >> the same (RelationGetIndexAttrBitmap) just breaks the synchronous
> >> >> logical replication. I think this is broken since the logical
> >> >> replication of Truncate is supported.
> >> >>
> >> >> > I am not sure what exact locking we need but I would have guessed at least AccessShareLock would be needed.
> >> >> >
> >> >>
> >> >> Are you telling that we need AccessShareLock on the index? If so, why
> >> >> is it different from how we access the relation during decoding
> >> >> (basically in ReorderBufferProcessTXN, we directly use
> >> >> RelationIdGetRelation() without any lock on the relation)? I think we
> >> >> do it that way because we need it to process WAL entries and we need
> >> >> the relation state based on the historic snapshot, so, even if the
> >> >> relation is later changed/dropped, we are fine with the old state we
> >> >> got. Isn't the same thing applies here in logicalrep_write_attrs? If
> >> >> that is true then some equivalent of RelationGetIndexAttrBitmap where
> >> >> we use RelationIdGetRelation instead of index_open should work?
> >> >>
> >> >
> >> > Today, again I have thought about this and don't see a problem with
> >> > the above idea. If the above understanding is correct, then I think
> >> > for our purpose in pgoutput, we just need to call RelationGetIndexList
> >> > and then build the idattr list for relation->rd_replidindex.
> >>
> >> Sorry, I don't know how can we build the idattr without open the index.
> >> If we should open the index, then we should use NoLock, since the TRUNCATE
> >> side hold AccessExclusiveLock. As Osumi points out in [1], The NoLock mode
> >> assumes that the  appropriate lock on the index is already taken.
> >>
> >
> > Why can't we use RelationIdGetRelation() by passing the required
> > indexOid to it?
>
> Hi Amit, as your suggested, I try to use RelationIdGetRelation() replace
> the index_open() to avoid specify the AccessSharedLock, then the TRUNCATE
> will not be blocked.
>

It is okay as POC but we can't change the existing function
RelationGetIndexAttrBitmap. It is used from other places as well. It
might be better to write a separate function for this, something like
what Osumi-San's patch is trying to do.

-- 
With Regards,
Amit Kapila.



RE: Truncate in synchronous logical replication failed

From
"osumi.takamichi@fujitsu.com"
Date:
Hi


On Friday, April 16, 2021 5:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Fri, Apr 16, 2021 at 12:56 PM osumi.takamichi@fujitsu.com
> <osumi.takamichi@fujitsu.com> wrote:
> >
> > > Thanks for your reminder.  It might be a way to solve this problem.
> > Yeah. I've made the 1st patch for this issue.
> >
> > In my env, with the patch
> > the TRUNCATE in synchronous logical replication doesn't hang.
> >
> 
> Few initial comments:
> =====================
> 1.
> + relreplindex = relation->rd_replidindex;
> +
> + /*
> + * build attributes to idindexattrs.
> + */
> + idindexattrs = NULL;
> + foreach(l, indexoidlist)
> + {
> + Oid indexOid = lfirst_oid(l);
> + Relation indexDesc;
> + int i;
> + bool isIDKey; /* replica identity index */
> +
> + indexDesc = RelationIdGetRelation(indexOid);
> 
> When you have oid of replica identity index (relreplindex) then what is the
> need to traverse all the indexes?
Ok. No need to traverse all the indexes. Will fix this part.

> 2.
> It is better to name the function as RelationGet...
You are right. I'll modify this in my next version.


Best Regards,
    Takamichi Osumi


Re: Truncate in synchronous logical replication failed

From
Japin Li
Date:
On Fri, 16 Apr 2021 at 16:52, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Fri, Apr 16, 2021 at 12:55 PM Japin Li <japinli@hotmail.com> wrote:
>>
>> On Thu, 15 Apr 2021 at 19:25, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> > On Thu, Apr 15, 2021 at 4:30 PM Japin Li <japinli@hotmail.com> wrote:
>> >>
>> >>
>> >> On Thu, 15 Apr 2021 at 18:22, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> >> > On Wed, Apr 14, 2021 at 3:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>> >> >>
>> >> >> On Tue, Apr 13, 2021 at 8:07 PM Petr Jelinek
>> >> >> <petr.jelinek@enterprisedb.com> wrote:
>> >> >> >
>> >> >> > > On 12 Apr 2021, at 08:58, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> >> >> > >
>> >> >> > > The problem happens only when we try to fetch IDENTITY_KEY attributes
>> >> >> > > because pgoutput uses RelationGetIndexAttrBitmap() to get that
>> >> >> > > information which locks the required indexes. Now, because TRUNCATE
>> >> >> > > has already acquired an exclusive lock on the index, it seems to
>> >> >> > > create a sort of deadlock where the actual Truncate operation waits
>> >> >> > > for logical replication of operation to complete and logical
>> >> >> > > replication waits for actual Truncate operation to finish.
>> >> >> > >
>> >> >> > > Do we really need to use RelationGetIndexAttrBitmap() to build
>> >> >> > > IDENTITY_KEY attributes? During decoding, we don't even lock the main
>> >> >> > > relation, we just scan the system table and build that information
>> >> >> > > using a historic snapshot. Can't we do something similar here?
>> >> >> > >
>> >> >> > > Adding Petr J. and Peter E. to know their views as this seems to be an
>> >> >> > > old problem (since the decoding of Truncate operation is introduced).
>> >> >> >
>> >> >> > We used RelationGetIndexAttrBitmap because it already existed, no other reason.
>> >> >> >
>> >> >>
>> >> >> Fair enough. But I think we should do something about it because using
>> >> >> the same (RelationGetIndexAttrBitmap) just breaks the synchronous
>> >> >> logical replication. I think this is broken since the logical
>> >> >> replication of Truncate is supported.
>> >> >>
>> >> >> > I am not sure what exact locking we need but I would have guessed at least AccessShareLock would be needed.
>> >> >> >
>> >> >>
>> >> >> Are you telling that we need AccessShareLock on the index? If so, why
>> >> >> is it different from how we access the relation during decoding
>> >> >> (basically in ReorderBufferProcessTXN, we directly use
>> >> >> RelationIdGetRelation() without any lock on the relation)? I think we
>> >> >> do it that way because we need it to process WAL entries and we need
>> >> >> the relation state based on the historic snapshot, so, even if the
>> >> >> relation is later changed/dropped, we are fine with the old state we
>> >> >> got. Isn't the same thing applies here in logicalrep_write_attrs? If
>> >> >> that is true then some equivalent of RelationGetIndexAttrBitmap where
>> >> >> we use RelationIdGetRelation instead of index_open should work?
>> >> >>
>> >> >
>> >> > Today, again I have thought about this and don't see a problem with
>> >> > the above idea. If the above understanding is correct, then I think
>> >> > for our purpose in pgoutput, we just need to call RelationGetIndexList
>> >> > and then build the idattr list for relation->rd_replidindex.
>> >>
>> >> Sorry, I don't know how can we build the idattr without open the index.
>> >> If we should open the index, then we should use NoLock, since the TRUNCATE
>> >> side hold AccessExclusiveLock. As Osumi points out in [1], The NoLock mode
>> >> assumes that the  appropriate lock on the index is already taken.
>> >>
>> >
>> > Why can't we use RelationIdGetRelation() by passing the required
>> > indexOid to it?
>>
>> Hi Amit, as your suggested, I try to use RelationIdGetRelation() replace
>> the index_open() to avoid specify the AccessSharedLock, then the TRUNCATE
>> will not be blocked.
>>
>
> It is okay as POC but we can't change the existing function
> RelationGetIndexAttrBitmap. It is used from other places as well. It
> might be better to write a separate function for this, something like
> what Osumi-San's patch is trying to do.

Agreed!

Hi Osumi-San, can you merge the test case in your next version?

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.



Re: Truncate in synchronous logical replication failed

From
Amit Kapila
Date:
On Fri, Apr 16, 2021 at 3:08 PM Japin Li <japinli@hotmail.com> wrote:
>
> On Fri, 16 Apr 2021 at 16:52, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > On Fri, Apr 16, 2021 at 12:55 PM Japin Li <japinli@hotmail.com> wrote:
> > It is okay as POC but we can't change the existing function
> > RelationGetIndexAttrBitmap. It is used from other places as well. It
> > might be better to write a separate function for this, something like
> > what Osumi-San's patch is trying to do.
>
> Agreed!
>
> Hi Osumi-San, can you merge the test case in your next version?
>

+1. Your test looks reasonable to me.

-- 
With Regards,
Amit Kapila.



Re: Truncate in synchronous logical replication failed

From
Japin Li
Date:
On Fri, 16 Apr 2021 at 17:19, osumi.takamichi@fujitsu.com <osumi.takamichi@fujitsu.com> wrote:
> Hi
>
>
> On Friday, April 16, 2021 5:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>> On Fri, Apr 16, 2021 at 12:56 PM osumi.takamichi@fujitsu.com
>> <osumi.takamichi@fujitsu.com> wrote:
>> >
>> > > Thanks for your reminder.  It might be a way to solve this problem.
>> > Yeah. I've made the 1st patch for this issue.
>> >
>> > In my env, with the patch
>> > the TRUNCATE in synchronous logical replication doesn't hang.
>> >
>>
>> Few initial comments:
>> =====================
>> 1.
>> + relreplindex = relation->rd_replidindex;
>> +
>> + /*
>> + * build attributes to idindexattrs.
>> + */
>> + idindexattrs = NULL;
>> + foreach(l, indexoidlist)
>> + {
>> + Oid indexOid = lfirst_oid(l);
>> + Relation indexDesc;
>> + int i;
>> + bool isIDKey; /* replica identity index */
>> +
>> + indexDesc = RelationIdGetRelation(indexOid);
>>
>> When you have oid of replica identity index (relreplindex) then what is the
>> need to traverse all the indexes?
> Ok. No need to traverse all the indexes. Will fix this part.
>
>> 2.
>> It is better to name the function as RelationGet...
> You are right. I'll modify this in my next version.
>

I took the liberty to address review comments and provide a v2 patch on top of
your's v1 patch, also merge the test case.

Sorry for attaching.

--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.


Attachment

RE: Truncate in synchronous logical replication failed

From
"osumi.takamichi@fujitsu.com"
Date:
On Saturday, April 17, 2021 12:53 AM Japin Li <japinli@hotmail.com>
> On Fri, 16 Apr 2021 at 17:19, osumi.takamichi@fujitsu.com
> <osumi.takamichi@fujitsu.com> wrote:
> > On Friday, April 16, 2021 5:50 PM Amit Kapila <amit.kapila16@gmail.com>
> wrote:
> >> On Fri, Apr 16, 2021 at 12:56 PM osumi.takamichi@fujitsu.com
> >> <osumi.takamichi@fujitsu.com> wrote:
> >> >
> >> > > Thanks for your reminder.  It might be a way to solve this problem.
> >> > Yeah. I've made the 1st patch for this issue.
> >> >
> >> > In my env, with the patch
> >> > the TRUNCATE in synchronous logical replication doesn't hang.
> >> >
> >>
> >> Few initial comments:
> >> =====================
> >> 1.
> >> + relreplindex = relation->rd_replidindex;
> >> +
> >> + /*
> >> + * build attributes to idindexattrs.
> >> + */
> >> + idindexattrs = NULL;
> >> + foreach(l, indexoidlist)
> >> + {
> >> + Oid indexOid = lfirst_oid(l);
> >> + Relation indexDesc;
> >> + int i;
> >> + bool isIDKey; /* replica identity index */
> >> +
> >> + indexDesc = RelationIdGetRelation(indexOid);
> >>
> >> When you have oid of replica identity index (relreplindex) then what
> >> is the need to traverse all the indexes?
> > Ok. No need to traverse all the indexes. Will fix this part.
> >
> >> 2.
> >> It is better to name the function as RelationGet...
> > You are right. I'll modify this in my next version.
> >
>
> I took the liberty to address review comments and provide a v2 patch on top
> of your's v1 patch, also merge the test case.
>
> Sorry for attaching.
No problem. Thank you for updating the patch.
I've conducted some cosmetic changes. Could you please check this ?
That's already applied by pgindent.

I executed RT for this and made no failure.
Just in case, I executed 010_truncate.pl test 100 times in a tight loop,
which also didn't fail.

Best Regards,
    Takamichi Osumi


Attachment

Re: Truncate in synchronous logical replication failed

From
Japin Li
Date:
On Sat, 17 Apr 2021 at 12:03, osumi.takamichi@fujitsu.com <osumi.takamichi@fujitsu.com> wrote:
> On Saturday, April 17, 2021 12:53 AM Japin Li <japinli@hotmail.com>
>> On Fri, 16 Apr 2021 at 17:19, osumi.takamichi@fujitsu.com
>> <osumi.takamichi@fujitsu.com> wrote:
>> > On Friday, April 16, 2021 5:50 PM Amit Kapila <amit.kapila16@gmail.com>
>> wrote:
>> >> On Fri, Apr 16, 2021 at 12:56 PM osumi.takamichi@fujitsu.com
>> >> <osumi.takamichi@fujitsu.com> wrote:
>> >> >
>> >> > > Thanks for your reminder.  It might be a way to solve this problem.
>> >> > Yeah. I've made the 1st patch for this issue.
>> >> >
>> >> > In my env, with the patch
>> >> > the TRUNCATE in synchronous logical replication doesn't hang.
>> >> >
>> >>
>> >> Few initial comments:
>> >> =====================
>> >> 1.
>> >> + relreplindex = relation->rd_replidindex;
>> >> +
>> >> + /*
>> >> + * build attributes to idindexattrs.
>> >> + */
>> >> + idindexattrs = NULL;
>> >> + foreach(l, indexoidlist)
>> >> + {
>> >> + Oid indexOid = lfirst_oid(l);
>> >> + Relation indexDesc;
>> >> + int i;
>> >> + bool isIDKey; /* replica identity index */
>> >> +
>> >> + indexDesc = RelationIdGetRelation(indexOid);
>> >>
>> >> When you have oid of replica identity index (relreplindex) then what
>> >> is the need to traverse all the indexes?
>> > Ok. No need to traverse all the indexes. Will fix this part.
>> >
>> >> 2.
>> >> It is better to name the function as RelationGet...
>> > You are right. I'll modify this in my next version.
>> >
>>
>> I took the liberty to address review comments and provide a v2 patch on top
>> of your's v1 patch, also merge the test case.
>>
>> Sorry for attaching.
> No problem. Thank you for updating the patch.
> I've conducted some cosmetic changes. Could you please check this ?
> That's already applied by pgindent.
>
> I executed RT for this and made no failure.
> Just in case, I executed 010_truncate.pl test 100 times in a tight loop,
> which also didn't fail.
>

LGTM, I made an entry in the commitfest[1], so that the patches will get a
chance to run on all the platforms.

[1] - https://commitfest.postgresql.org/33/3081/

--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.



Re: Truncate in synchronous logical replication failed

From
Ajin Cherian
Date:


On Sat, Apr 17, 2021 at 2:04 PM osumi.takamichi@fujitsu.com <osumi.takamichi@fujitsu.com> wrote:

No problem. Thank you for updating the patch.
I've conducted some cosmetic changes. Could you please check this ?
That's already applied by pgindent.

I executed RT for this and made no failure.
Just in case, I executed 010_truncate.pl test 100 times in a tight loop,
which also didn't fail.


I reviewed the patch, ran make check, no issues. One minor comment:

Could you add the comment similar to RelationGetIndexAttrBitmap() on why the redo, it's not very obvious
to someone reading the code, why we are refetching the index list here.

+ /* Check if we need to redo */
+ newindexoidlist = RelationGetIndexList(relation); 

thanks,
Ajin Cherian
Fujitsu Australia

RE: Truncate in synchronous logical replication failed

From
"osumi.takamichi@fujitsu.com"
Date:
On  Tuesday, April 20, 2021 10:53 AM Ajin Cherian <itsajin@gmail.com> wrote:
> On Sat, Apr 17, 2021 at 2:04 PM osumi.takamichi@fujitsu.com
> <mailto:osumi.takamichi@fujitsu.com>  <osumi.takamichi@fujitsu.com
> <mailto:osumi.takamichi@fujitsu.com> > wrote:
> 
>     No problem. Thank you for updating the patch.
>     I've conducted some cosmetic changes. Could you please check
> this ?
>     That's already applied by pgindent.
> 
>     I executed RT for this and made no failure.
>     Just in case, I executed 010_truncate.pl <http://010_truncate.pl>
> test 100 times in a tight loop,
>     which also didn't fail.
> 
> I reviewed the patch, ran make check, no issues. One minor comment:
> 
> Could you add the comment similar to RelationGetIndexAttrBitmap() on why
> the redo, it's not very obvious to someone reading the code, why we are
> refetching the index list here.
> 
> + /* Check if we need to redo */
> 
> + newindexoidlist = RelationGetIndexList(relation);
Yeah, makes sense. Fixed.
Its indents seem a bit weird but came from pgindent.
Thank you for your review !


Best Regards,
    Takamichi Osumi


Attachment

Re: Truncate in synchronous logical replication failed

From
Amit Kapila
Date:
On Tue, Apr 20, 2021 at 9:00 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> On  Tuesday, April 20, 2021 10:53 AM Ajin Cherian <itsajin@gmail.com> wrote:
> >
> > I reviewed the patch, ran make check, no issues. One minor comment:
> >
> > Could you add the comment similar to RelationGetIndexAttrBitmap() on why
> > the redo, it's not very obvious to someone reading the code, why we are
> > refetching the index list here.
> >
> > + /* Check if we need to redo */
> >
> > + newindexoidlist = RelationGetIndexList(relation);
> Yeah, makes sense. Fixed.

I don't think here we need to restart to get a stable list of indexes
as we do in RelationGetIndexAttrBitmap. The reason is here we build
the cache entry using a historic snapshot and all the later changes
are absorbed while decoding WAL. I have updated that and modified few
comments in the attached patch. Can you please test this in
clobber_cache_always mode? I think just testing
subscription/t/010_truncate.pl would be sufficient.

Now, this bug exists in prior versions (>= v11) where we have
introduced decoding of Truncate. Do we want to backpatch this? As no
user has reported this till now apart from Tang who I think got it
while doing some other patch testing, we might want to just push it in
HEAD. I am fine either way. Petr, others, do you have any opinion on
this matter?

-- 
With Regards,
Amit Kapila.

Attachment

RE: Truncate in synchronous logical replication failed

From
"shiy.fnst@fujitsu.com"
Date:
> I don't think here we need to restart to get a stable list of indexes
> as we do in RelationGetIndexAttrBitmap. The reason is here we build
> the cache entry using a historic snapshot and all the later changes
> are absorbed while decoding WAL. I have updated that and modified few
> comments in the attached patch. Can you please test this in
> clobber_cache_always mode? I think just testing
> subscription/t/010_truncate.pl would be sufficient.

Thanks for your patch. I tested your patch and it passes 'make check-world' and it works as expected.
By the way, I also tested in clobber_cache_always mode, it passed, too.(I only tested subscription/t/010_truncate.pl.)

Regards,
Shi yu


RE: Truncate in synchronous logical replication failed

From
"osumi.takamichi@fujitsu.com"
Date:
On Thursday, April 22, 2021 6:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Tue, Apr 20, 2021 at 9:00 AM osumi.takamichi@fujitsu.com
> <osumi.takamichi@fujitsu.com> wrote:
> >
> > On  Tuesday, April 20, 2021 10:53 AM Ajin Cherian <itsajin@gmail.com>
> wrote:
> > >
> > > I reviewed the patch, ran make check, no issues. One minor comment:
> > >
> > > Could you add the comment similar to RelationGetIndexAttrBitmap() on
> > > why the redo, it's not very obvious to someone reading the code, why
> > > we are refetching the index list here.
> > >
> > > + /* Check if we need to redo */
> > >
> > > + newindexoidlist = RelationGetIndexList(relation);
> > Yeah, makes sense. Fixed.
> 
> I don't think here we need to restart to get a stable list of indexes as we do in
> RelationGetIndexAttrBitmap. The reason is here we build the cache entry
> using a historic snapshot and all the later changes are absorbed while
> decoding WAL.
I rechecked this and I agree with this.
I don't see any problem to remove the redo check.
Based on this direction, I'll share my several minor comments.

[1] a typo of RelationGetIdentityKeyBitmap's comment

+ * This is a special purpose function used during logical replication. Here,
+ * unlike RelationGetIndexAttrBitmap(), we don't a acquire lock on the required

We have "a" in an unnatural place. It should be "we don't acquire...".

[2] suggestion to fix RelationGetIdentityKeyBitmap's comment

+ * later changes are absorbed while decoding WAL. Due to this reason, we don't
+ * need to retry here in case of a change in the set of indexes.

I think it's better to use "Because of" instead of "Due to".
This patch works to solve the deadlock.

[3] wrong comment in RelationGetIdentityKeyBitmap

+       /* Save some values to compare after building attributes */

I wrote this comment for the redo check
that has been removed already. We can delete it.

[4] suggestion to remove local variable relreplindex in RelationGetIdentityKeyBitmap

I think we don't need relreplindex.
We can just pass relaton->rd_replidindex to RelationIdGetRelation().
There is no more reference of the variable.

[5] suggestion to fix the place to release indexoidlist

I thought we can do its list_free() ealier,
after checking if there is no indexes.

[6] suggestion to remove period in one comment.

+       /* Quick exit if we already computed the result. */

This comes from RelationGetIndexAttrBitmap (and my posted versions)
but I think we can remove the period to give better alignment shared with other comments in the function.

> I have updated that and modified few comments in the
> attached patch. Can you please test this in clobber_cache_always mode? I
> think just testing subscription/t/010_truncate.pl would be sufficient.
I did it. It didn't fail. No problem.

> Now, this bug exists in prior versions (>= v11) where we have introduced
> decoding of Truncate. Do we want to backpatch this? As no user has reported
> this till now apart from Tang who I think got it while doing some other patch
> testing, we might want to just push it in HEAD. I am fine either way. Petr,
> others, do you have any opinion on this matter?
I think we are fine with applying this patch to the HEAD only,
since nobody has complained about the hang issue.


Best Regards,
    Takamichi Osumi


Re: Truncate in synchronous logical replication failed

From
Amit Kapila
Date:
On Fri, Apr 23, 2021 at 12:04 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> On Thursday, April 22, 2021 6:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > On Tue, Apr 20, 2021 at 9:00 AM osumi.takamichi@fujitsu.com
> > <osumi.takamichi@fujitsu.com> wrote:
> > >
> > > On  Tuesday, April 20, 2021 10:53 AM Ajin Cherian <itsajin@gmail.com>
> > wrote:
> > > >
> > > > I reviewed the patch, ran make check, no issues. One minor comment:
> > > >
> > > > Could you add the comment similar to RelationGetIndexAttrBitmap() on
> > > > why the redo, it's not very obvious to someone reading the code, why
> > > > we are refetching the index list here.
> > > >
> > > > + /* Check if we need to redo */
> > > >
> > > > + newindexoidlist = RelationGetIndexList(relation);
> > > Yeah, makes sense. Fixed.
> >
> > I don't think here we need to restart to get a stable list of indexes as we do in
> > RelationGetIndexAttrBitmap. The reason is here we build the cache entry
> > using a historic snapshot and all the later changes are absorbed while
> > decoding WAL.
> I rechecked this and I agree with this.
> I don't see any problem to remove the redo check.
> Based on this direction, I'll share my several minor comments.
>
> [1] a typo of RelationGetIdentityKeyBitmap's comment
>
> + * This is a special purpose function used during logical replication. Here,
> + * unlike RelationGetIndexAttrBitmap(), we don't a acquire lock on the required
>
> We have "a" in an unnatural place. It should be "we don't acquire...".
>
> [2] suggestion to fix RelationGetIdentityKeyBitmap's comment
>
> + * later changes are absorbed while decoding WAL. Due to this reason, we don't
> + * need to retry here in case of a change in the set of indexes.
>
> I think it's better to use "Because of" instead of "Due to".
> This patch works to solve the deadlock.
>

I am not sure which one is better. I would like to keep it as it is
unless you feel strongly about point 2.

> [3] wrong comment in RelationGetIdentityKeyBitmap
>
> +       /* Save some values to compare after building attributes */
>
> I wrote this comment for the redo check
> that has been removed already. We can delete it.
>
> [4] suggestion to remove local variable relreplindex in RelationGetIdentityKeyBitmap
>
> I think we don't need relreplindex.
> We can just pass relaton->rd_replidindex to RelationIdGetRelation().
> There is no more reference of the variable.
>
> [5] suggestion to fix the place to release indexoidlist
>
> I thought we can do its list_free() ealier,
> after checking if there is no indexes.
>

Hmm, why? If there are no indexes then we wouldn't have allocated any memory.

> [6] suggestion to remove period in one comment.
>
> +       /* Quick exit if we already computed the result. */
>
> This comes from RelationGetIndexAttrBitmap (and my posted versions)
> but I think we can remove the period to give better alignment shared with other comments in the function.
>

Can you please update the patch except for the two points to which I
responded above?

-- 
With Regards,
Amit Kapila.



RE: Truncate in synchronous logical replication failed

From
"osumi.takamichi@fujitsu.com"
Date:
On Friday, April 23, 2021 3:43 PM  Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Fri, Apr 23, 2021 at 12:04 PM osumi.takamichi@fujitsu.com
> <osumi.takamichi@fujitsu.com> wrote:
> >
> > On Thursday, April 22, 2021 6:33 PM Amit Kapila
> <amit.kapila16@gmail.com> wrote:
> > > On Tue, Apr 20, 2021 at 9:00 AM osumi.takamichi@fujitsu.com
> > > <osumi.takamichi@fujitsu.com> wrote:
> > > >
> > > > On  Tuesday, April 20, 2021 10:53 AM Ajin Cherian
> > > > <itsajin@gmail.com>
> > > wrote:
> > > > >
> > > > > I reviewed the patch, ran make check, no issues. One minor
> comment:
> > > > >
> > > > > Could you add the comment similar to
> > > > > RelationGetIndexAttrBitmap() on why the redo, it's not very
> > > > > obvious to someone reading the code, why we are refetching the
> index list here.
> > > > >
> > > > > + /* Check if we need to redo */
> > > > >
> > > > > + newindexoidlist = RelationGetIndexList(relation);
> > > > Yeah, makes sense. Fixed.
> > >
> > > I don't think here we need to restart to get a stable list of
> > > indexes as we do in RelationGetIndexAttrBitmap. The reason is here
> > > we build the cache entry using a historic snapshot and all the later
> > > changes are absorbed while decoding WAL.
> > I rechecked this and I agree with this.
> > I don't see any problem to remove the redo check.
> > Based on this direction, I'll share my several minor comments.
> >
> > [1] a typo of RelationGetIdentityKeyBitmap's comment
> >
> > + * This is a special purpose function used during logical
> > + replication. Here,
> > + * unlike RelationGetIndexAttrBitmap(), we don't a acquire lock on
> > + the required
> >
> > We have "a" in an unnatural place. It should be "we don't acquire...".
> >
> > [2] suggestion to fix RelationGetIdentityKeyBitmap's comment
> >
> > + * later changes are absorbed while decoding WAL. Due to this reason,
> > + we don't
> > + * need to retry here in case of a change in the set of indexes.
> >
> > I think it's better to use "Because of" instead of "Due to".
> > This patch works to solve the deadlock.
> >
> 
> I am not sure which one is better. I would like to keep it as it is unless you feel
> strongly about point 2.
> 
> > [3] wrong comment in RelationGetIdentityKeyBitmap
> >
> > +       /* Save some values to compare after building attributes */
> >
> > I wrote this comment for the redo check that has been removed already.
> > We can delete it.
> >
> > [4] suggestion to remove local variable relreplindex in
> > RelationGetIdentityKeyBitmap
> >
> > I think we don't need relreplindex.
> > We can just pass relaton->rd_replidindex to RelationIdGetRelation().
> > There is no more reference of the variable.
> >
> > [5] suggestion to fix the place to release indexoidlist
> >
> > I thought we can do its list_free() ealier, after checking if there is
> > no indexes.
> >
> 
> Hmm, why? If there are no indexes then we wouldn't have allocated any
> memory.
> 
> > [6] suggestion to remove period in one comment.
> >
> > +       /* Quick exit if we already computed the result. */
> >
> > This comes from RelationGetIndexAttrBitmap (and my posted versions)
> > but I think we can remove the period to give better alignment shared with
> other comments in the function.
> >
> 
> Can you please update the patch except for the two points to which I
> responded above?
I've combined v5 with above accepted comments.

Just in case, I've conducted make check-world, 
the test of clobber_cache_always mode again for v6
and tight loop test of 100 times for 010_truncate.pl. 
The result is that all passed with no failure.


Best Regards,
    Takamichi Osumi


Attachment

RE: Truncate in synchronous logical replication failed

From
"osumi.takamichi@fujitsu.com"
Date:
On Friday, April 23, 2021 6:03 PM I wrote:
> I've combined v5 with above accepted comments.
> 
> Just in case, I've conducted make check-world, the test of
> clobber_cache_always mode again for v6 and tight loop test of 100 times for
> 010_truncate.pl.
> The result is that all passed with no failure.
I'm sorry, I realized another minor thing which should be fixed.

In v6, I did below.
+Bitmapset *
+RelationGetIdentityKeyBitmap(Relation relation)
+{
+       Bitmapset  *idindexattrs;       /* columns in the replica identity */
...
+       /* Build attributes to idindexattrs */
+       idindexattrs = NULL;

But, we removed the loop, so we can insert NULL
at the beginning to declare idindexattrs.
v7 is the version to update this part and
related comments from v6.

Best Regards,
    Takamichi Osumi


Attachment

Re: Truncate in synchronous logical replication failed

From
Amit Kapila
Date:
On Fri, Apr 23, 2021 at 7:18 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>

The latest patch looks good to me. I have made a minor modification
and added a commit message in the attached. I would like to once again
ask whether anybody else thinks we should backpatch this? Just a
summary for anybody not following this thread:

This patch fixes the Logical Replication of Truncate in synchronous
commit mode. The Truncate operation acquires an exclusive lock on the
target relation and indexes and waits for logical replication of the
operation to finish at commit. Now because we are acquiring the shared
lock on the target index to get index attributes in pgoutput while
sending the changes for the Truncate operation, it leads to a
deadlock.

Actually, we don't need to acquire a lock on the target index as we
build the cache entry using a historic snapshot and all the later
changes are absorbed while decoding WAL. So, we wrote a special
purpose function for logical replication to get a bitmap of replica
identity attribute numbers where we get that information without
locking the target index.

We are planning not to backpatch this as there doesn't seem to be any
field complaint about this issue since it was introduced in commit
5dfd1e5a in v11.

-- 
With Regards,
Amit Kapila.

Attachment

RE: Truncate in synchronous logical replication failed

From
"osumi.takamichi@fujitsu.com"
Date:
On Monday, April 26, 2021 1:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Fri, Apr 23, 2021 at 7:18 PM osumi.takamichi@fujitsu.com
> <osumi.takamichi@fujitsu.com> wrote:
> >
> 
> The latest patch looks good to me. I have made a minor modification and
> added a commit message in the attached.
Thank you for updating the patch.

I think we need one space for "targetindex" in the commit message.
From my side, there is no more additional comments !

> I would like to once again ask
> whether anybody else thinks we should backpatch this? Just a summary for
> anybody not following this thread:
> 
> This patch fixes the Logical Replication of Truncate in synchronous commit
> mode. The Truncate operation acquires an exclusive lock on the target
> relation and indexes and waits for logical replication of the operation to finish
> at commit. Now because we are acquiring the shared lock on the target index
> to get index attributes in pgoutput while sending the changes for the Truncate
> operation, it leads to a deadlock.
> 
> Actually, we don't need to acquire a lock on the target index as we build the
> cache entry using a historic snapshot and all the later changes are absorbed
> while decoding WAL. So, we wrote a special purpose function for logical
> replication to get a bitmap of replica identity attribute numbers where we get
> that information without locking the target index.
> 
> We are planning not to backpatch this as there doesn't seem to be any field
> complaint about this issue since it was introduced in commit 5dfd1e5a in v11.
Please anyone, share your opinion on this matter, when you have.


Best Regards,
    Takamichi Osumi


Re: Truncate in synchronous logical replication failed

From
Japin Li
Date:
On Mon, 26 Apr 2021 at 12:49, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Fri, Apr 23, 2021 at 7:18 PM osumi.takamichi@fujitsu.com
> <osumi.takamichi@fujitsu.com> wrote:
>>
>
> The latest patch looks good to me. I have made a minor modification
> and added a commit message in the attached. I would like to once again
> ask whether anybody else thinks we should backpatch this? Just a
> summary for anybody not following this thread:
>
> This patch fixes the Logical Replication of Truncate in synchronous
> commit mode. The Truncate operation acquires an exclusive lock on the
> target relation and indexes and waits for logical replication of the
> operation to finish at commit. Now because we are acquiring the shared
> lock on the target index to get index attributes in pgoutput while
> sending the changes for the Truncate operation, it leads to a
> deadlock.
>
> Actually, we don't need to acquire a lock on the target index as we
> build the cache entry using a historic snapshot and all the later
> changes are absorbed while decoding WAL. So, we wrote a special
> purpose function for logical replication to get a bitmap of replica
> identity attribute numbers where we get that information without
> locking the target index.
>
> We are planning not to backpatch this as there doesn't seem to be any
> field complaint about this issue since it was introduced in commit
> 5dfd1e5a in v11.

+1 for apply only on HEAD.

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.



Re: Truncate in synchronous logical replication failed

From
Amit Kapila
Date:
On Mon, Apr 26, 2021 at 12:37 PM Japin Li <japinli@hotmail.com> wrote:
>
> On Mon, 26 Apr 2021 at 12:49, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > On Fri, Apr 23, 2021 at 7:18 PM osumi.takamichi@fujitsu.com
> > <osumi.takamichi@fujitsu.com> wrote:
> >>
> >
> > The latest patch looks good to me. I have made a minor modification
> > and added a commit message in the attached. I would like to once again
> > ask whether anybody else thinks we should backpatch this? Just a
> > summary for anybody not following this thread:
> >
> > This patch fixes the Logical Replication of Truncate in synchronous
> > commit mode. The Truncate operation acquires an exclusive lock on the
> > target relation and indexes and waits for logical replication of the
> > operation to finish at commit. Now because we are acquiring the shared
> > lock on the target index to get index attributes in pgoutput while
> > sending the changes for the Truncate operation, it leads to a
> > deadlock.
> >
> > Actually, we don't need to acquire a lock on the target index as we
> > build the cache entry using a historic snapshot and all the later
> > changes are absorbed while decoding WAL. So, we wrote a special
> > purpose function for logical replication to get a bitmap of replica
> > identity attribute numbers where we get that information without
> > locking the target index.
> >
> > We are planning not to backpatch this as there doesn't seem to be any
> > field complaint about this issue since it was introduced in commit
> > 5dfd1e5a in v11.
>
> +1 for apply only on HEAD.
>

Seeing no other suggestions, I have pushed this in HEAD only. Thanks!

-- 
With Regards,
Amit Kapila.



RE: Truncate in synchronous logical replication failed

From
"tanghy.fnst@fujitsu.com"
Date:
On Tuesday, April 27, 2021 1:17 PM, Amit Kapila <amit.kapila16@gmail.com> wrote

>Seeing no other suggestions, I have pushed this in HEAD only. Thanks!

Sorry for the later reply on this issue.
I tested with the latest HEAD, the issue is fixed now. Thanks a lot.

Regards
Tang