Thread: BUG #17716: walsender process hang while decoding 'DROP PUBLICATION' XLOG

BUG #17716: walsender process hang while decoding 'DROP PUBLICATION' XLOG

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      17716
Logged by:          bowenshi
Email address:      zxwsbg12138@gmail.com
PostgreSQL version: 13.9
Operating system:   centos
Description:

Hello, I have found a problem while using logical decoding. I have two
publications, and the walsender process hangs while decoding the DROP PUBLICATION
XLOG of another publication. The pstack output looks like the following:

#0  0x00000000008b9c24 in hash_seq_search ()
#1  0x00007fd14340590c in rel_sync_cache_publication_cb () from
pgoutput.so
#2  0x0000000000897a1a in CallSyscacheCallbacks ()
#3  0x000000000073682e in ReorderBufferCommit ()
#4  0x000000000072b988 in DecodeXactOp ()
#5  0x000000000072bd52 in LogicalDecodingProcessRecord ()
#6  0x000000000074d14f in XLogSendLogical ()
#7  0x000000000074f562 in WalSndLoop ()
#8  0x0000000000750322 in exec_replication_command ()
#9  0x0000000000792301 in PostgresMain ()
#10 0x000000000048854f in ServerLoop ()
#11 0x000000000071b579 in PostmasterMain ()
#12 0x000000000048948d in main ()

When I used perf to analyze it, almost 99% of the time was spent in the
`hash_seq_search` function.

This problem can be reproduced, and I will send the shell scripts later.


The following steps can help reproduce the problem:
1. Run "initinstance.sh" to get two instances, one publisher and one
subscriber; this script creates a subscription as well;
2. Run "createschema.sh" to create 1000 schemas, each schema with 100 tables;
3. Run "createpublication.sh" to create a huge publication, which
publishes 10000 tables;
4. Run "insert.sh" to insert exactly one tuple into each table
created in Step 2;
5. Run "insert into t values(1);" on the publisher; the row can then be
found on the subscriber;
6. Run "drop publication pub_t_large;" on the publisher;
7. Run "insert into t values(1);" on the publisher again; this time the row
cannot be found on the subscriber, because the walsender has hung while
decoding Step 6's XLOG.



On Tue, Dec 13, 2022 at 3:51 PM PG Bug reporting form <noreply@postgresql.org> wrote:
>
> The following bug has been logged on the website:
>
> Bug reference:      17716
> Logged by:          bowenshi
> Email address:      zxwsbg12138@gmail.com
> PostgreSQL version: 13.9
> Operating system:   centos
> Description:
>
> Dears, I have found a problem while using logical decoding.  I have two
> publications, and the walsender process hang while decoding DROP PUBLICATION
> XLOG of another publication. The pstack looks like following:
>
> #0  0x00000000008b9c24 in hash_seq_search ()
> #1  0x00007fd14340590c in rel_sync_cache_publication_cb () from
> pgoutput.so
> #2  0x0000000000897a1a in CallSyscacheCallbacks ()
> #3  0x000000000073682e in ReorderBufferCommit ()
> #4  0x000000000072b988 in DecodeXactOp ()
> #5  0x000000000072bd52 in LogicalDecodingProcessRecord ()
> #6  0x000000000074d14f in XLogSendLogical ()
> #7  0x000000000074f562 in WalSndLoop ()
> #8  0x0000000000750322 in exec_replication_command ()
> #9  0x0000000000792301 in PostgresMain ()
> #10 0x000000000048854f in ServerLoop ()
> #11 0x000000000071b579 in PostmasterMain ()
> #12 0x000000000048948d in main ()
>
> When I user perl to analyze, it spend almost 99% time in `hash_seq_search`
> function.
>
> This problem can be reproduced , and I will send the shell script later.
>

Attachment

Re: BUG #17716: walsender process hang while decoding 'DROP PUBLICATION' XLOG

From
shveta malik
Date:

Hello,
The problem does not reproduce on HEAD, but it does reproduce on 13.9.

Problem Description:
On dropping a large publication with 10,000 tables in it, the walsender hangs in
ReorderBufferCommit -> ReorderBufferExecuteInvalidations -> ... -> hash_seq_search().

Root Cause:
Dropping the publication removes 10,000 tuples (table entries) from the catalogs, which generates invalidation messages for the various cache entries involved. Each table-entry deletion results in 4 invalidation messages, so the complete DROP PUBLICATION operation adds up to approximately 40,000 invalidation messages (plus a few more for the publication drop itself). These invalidation messages are accumulated on the top-level transaction.

Later, when the walsender decodes the XLOG of the DROP PUBLICATION operation, it replays these invalidation messages on each command-id increment. The command id is incremented from 1 to 10,000, and for each command id all 40,000 messages are replayed, because there is no way to identify that only 4 of them belong to a particular command id; the full accumulated set is replayed on every increment. That is on the order of 10,000 x 40,000 = 400 million invalidation executions, which results in the hang-like situation seen in the stack above. As a consequence, an insert executed immediately after the DROP PUBLICATION (even one belonging to a different publication) is not received on the subscriber.
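
For reference, the PG 13 code path being described is roughly the following; this is a simplified paraphrase of the decoding loop in reorderbuffer.c, not the verbatim source:

/*
 * Simplified paraphrase of the PG 13 decoding loop: every time the
 * command id advances, all invalidation messages accumulated for the
 * whole transaction are executed again.
 */
case REORDER_BUFFER_CHANGE_INTERNAL_COMMAND_ID:
    if (command_id < change->data.command_id)
    {
        command_id = change->data.command_id;
        snapshot_now->curcid = command_id;

        /* switch the historic snapshot to the new command id */
        TeardownHistoricSnapshot(false);
        SetupHistoricSnapshot(snapshot_now, txn->tuplecid_hash);

        /*
         * Replays every message in txn->invalidations (~40,000 here);
         * the pgoutput callback for each publication-related message
         * walks the whole relation-sync hash table, which is the
         * hash_seq_search seen in the stack above.
         */
        ReorderBufferExecuteInvalidations(rb, txn);
    }
    break;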

What fixes it on HEAD:
On later versions, a new mechanism was introduced whereby, after each command-id increment, the invalidations to be done for that command id are logged: a record of type XLOG_XACT_INVALIDATIONS is inserted into the XLOG. For the case at hand, each such invalidation record carries the 4 invalidation messages for its command id. So all 40,000 invalidation messages are still inserted into the XLOG, but in pieces: 10,000 invalidation records, each representing the invalidations for one particular command id and each carrying 4 messages.
Later, when the walsender replays the transaction, after each command-id increment it executes only that command id's record (in the order the records were inserted). Thus we perform invalidation 10,000 times, but with only 4 messages each time, instead of 40,000 messages every time as in the previous version.
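
To make the difference concrete, here is a small standalone C sketch (not PostgreSQL code; the counts of 10,000 command ids and 4 messages per command id are taken from the scenario above) that compares how many invalidation messages each scheme ends up executing:

#include <stdio.h>

int
main(void)
{
    const long ncommand_ids = 10000;    /* command-id increments in the txn */
    const long msgs_per_cid = 4;        /* invalidation msgs per command id */
    const long total_msgs = ncommand_ids * msgs_per_cid;   /* 40,000 */

    /* PG 13 scheme: every command-id increment replays ALL accumulated msgs */
    long pg13_executions = ncommand_ids * total_msgs;

    /*
     * PG 14+ scheme: each XLOG_XACT_INVALIDATIONS record carries only the
     * messages of its own command id, so each message is executed once.
     */
    long pg14_executions = total_msgs;

    printf("PG 13  : %ld invalidation executions\n", pg13_executions);
    printf("PG 14+ : %ld invalidation executions\n", pg14_executions);
    return 0;
}

This prints 400,000,000 versus 40,000 executions, which is why the old scheme looks like a hang while the new one is barely noticeable.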

These changes were committed through the commit ids below:

Full change:
commit c55040ccd017962b7b8d3fbcdc184aa90c722a21
Author: Dilip Kumar, Tomas Vondra, Amit Kapila
Date:   Thu Jul 23 08:19:07 2020 +0530

Trivial fix:
commit 36e545cd051ffe58064c1f38faeb64f4feef4aec
Author: Amit Kapila
Date:   Thu Nov 10 16:56:49 2022 +0530


I have not tried it yet on version 14; hopefully we will see it working there as well. I will try it and give an update soon.

thanks
Shveta

On Tue, Dec 13, 2022 at 5:43 PM Bowen Shi <zxwsbg12138@gmail.com> wrote:
The following step can help reproducing the problem:
1. Run "initinstance.sh" and get two instance, one publisher and one
subscriber, this script will create a subscription as well;
2. Run “createschema.sh” and create 1000 schemas, each schema has 100 tables;
3. Run “createpublication.sh” and create a huge publication, which
publishes 10000 tables;
4. Run “insert.sh” and insert exactly one tuple into each table
created in Step 2;
5. Run “insert into t values(1);” in publisher, and then we can found
it in subscriber;
6. Run “drop publication pub_t_large;” in publisher;
7. Run “insert into t values(1);” in publisher, and then we can not
find it in subscriber, cause the walsender has hanged in decoding Step
6’s XLOG;



On Tue, Dec 13, 2022 at 3:51 PM PG Bug reporting form <noreply@postgresql.org> wrote:
>
> The following bug has been logged on the website:
>
> Bug reference:      17716
> Logged by:          bowenshi
> Email address:      zxwsbg12138@gmail.com
> PostgreSQL version: 13.9
> Operating system:   centos
> Description:
>
> Dears, I have found a problem while using logical decoding.  I have two
> publications, and the walsender process hang while decoding DROP PUBLICATION
> XLOG of another publication. The pstack looks like following:
>
> #0  0x00000000008b9c24 in hash_seq_search ()
> #1  0x00007fd14340590c in rel_sync_cache_publication_cb () from
> pgoutput.so
> #2  0x0000000000897a1a in CallSyscacheCallbacks ()
> #3  0x000000000073682e in ReorderBufferCommit ()
> #4  0x000000000072b988 in DecodeXactOp ()
> #5  0x000000000072bd52 in LogicalDecodingProcessRecord ()
> #6  0x000000000074d14f in XLogSendLogical ()
> #7  0x000000000074f562 in WalSndLoop ()
> #8  0x0000000000750322 in exec_replication_command ()
> #9  0x0000000000792301 in PostgresMain ()
> #10 0x000000000048854f in ServerLoop ()
> #11 0x000000000071b579 in PostmasterMain ()
> #12 0x000000000048948d in main ()
>
> When I user perl to analyze, it spend almost 99% time in `hash_seq_search`
> function.
>
> This problem can be reproduced , and I will send the shell script later.
>
Hello,
Thanks for your advice. I ran some tests, and this problem cannot be
reproduced on PG 14+ versions. I think adding the new XLOG record type is
what resolves this problem. But I think the following patch may be helpful
for the PG 13 version.

The invalidation work contains two parts: pgoutput and the relfilenode map. We
have no way to optimize the relfilenode-map part, since that has already been
discussed in a previous thread:
https://www.postgresql.org/message-id/CANDwggKYveEtXjXjqHA6RL3AKSHMsQyfRY6bK+NqhAWJyw8psQ@mail.gmail.com.

However, I'd like to contribute a patch to fix the pgoutput part. We can skip
invalidating the cache after the first time by using a lazy tag, and this works:
it almost doubles walsender performance while decoding this XLOG.
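
Below is a minimal sketch of what such a lazy tag could look like in pgoutput's publication invalidation callback. This is only an illustration of the idea, not the attached patch: the rel_sync_cache_all_invalid flag is hypothetical, and it would have to be reset wherever a cache entry is revalidated (for example in get_rel_sync_entry), otherwise later publication changes would be missed.

/*
 * Illustration only: once the whole RelationSyncCache has been marked
 * invalid, further publication invalidation callbacks can return
 * immediately instead of walking the hash table again.
 */
static bool rel_sync_cache_all_invalid = false;

static void
rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
    HASH_SEQ_STATUS status;
    RelationSyncEntry *entry;

    /* cache not built yet */
    if (RelationSyncCache == NULL)
        return;

    /* every entry is already marked invalid; skip the full-cache walk */
    if (rel_sync_cache_all_invalid)
        return;

    /*
     * We cannot tell which entry the hashvalue belongs to, so mark the
     * whole cache as invalid.
     */
    hash_seq_init(&status, RelationSyncCache);
    while ((entry = (RelationSyncEntry *) hash_seq_search(&status)) != NULL)
        entry->replicate_valid = false;

    rel_sync_cache_all_invalid = true;
}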

I used the test from the last email and reduced the number of relations in the
publication to 1000; the test results are as follows:

Before optimization: 76 min
After optimization: 35 min

Though the result is not good enough, I think this patch is still worthwhile.

Attachment

Re: BUG #17716: walsender process hang while decoding 'DROP PUBLICATION' XLOG

From
shveta malik
Date:
Hello,
The idea looks good to me. For the relation schema cache (the pgoutput one), on receiving an invalidation message for one hash value we invalidate the complete cache, since there is no way to find the entry corresponding to that hash value, so your proposed fix will make a good difference. But I feel it makes sense on HEAD as well.

This complete cache invalidation happens multiple times even on HEAD (10k times in the given case). The cache is mostly empty in the given test case, but consider a case where we have a huge number of publications and subscriptions (so that this cache holds a huge number of entries) and we then drop one large publication with, say, 40k-50k tables; in that case we might see slowness while traversing and invalidating the cache on HEAD as well. The test case with increased magnitude can be tried on HEAD once to see whether the fix is needed there or not.

thanks
Shveta


On Mon, Dec 19, 2022 at 5:52 PM Bowen Shi <zxwsbg12138@gmail.com> wrote:
Hello,
Thanks for your advice. I make some tests and this problem can't be
reproduced in PG 14+ version. I think adding a new XLOG type will help
resolve this problem. But I think the following patch may be helpful
in the PG 13 version.

The invalidation contains two parts: pgoutput and relfilenodeMap. We
have no way to optimize relfilenodeMap part , since it has been
discussed in previous mails
https://www.postgresql.org/message-id/CANDwggKYveEtXjXjqHA6RL3AKSHMsQyfRY6bK+NqhAWJyw8psQ@mail.gmail.com.

However, I'd like to contribute a patch to fix pgoutput part. We can skip
invalidating caches after first time with a lazy tag and this works.
It almost doubles the walsender performance while decoding this XLOG.

I use the test in the last email and reduce the number of relations in
publications to 1000, the test result is following:

Before optimization: 76 min
After optimization: 35 min

Though the result is not good enough, I think this patch is still worthy.

Re: BUG #17716: walsender process hang while decoding 'DROP PUBLICATION' XLOG

From
shveta malik
Date:
Hello,
I tried to reproduce the lag with a bigger-magnitude test case, i.e. I added more
tables to pub_t_large to increase the number of command ids and added a huge
number of tables to the working publication pub_t to increase the number of
entries in the rel-sync cache, but no luck.
No noticeable lag was observed on HEAD with the new invalidation mechanism.

thanks
Shveta


On Tue, Dec 20, 2022 at 11:40 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> Hello,
> The idea looks good to me. For 'relation schema cache (pgoutput one)', on receiving invalidation msg for one
> hash-value, we invalidate the complete cache as there is no way to find an entry corresponding to that hash-value and
> thus your fix-proposal will make good difference. But I feel it makes sense on HEAD as well.
>
> This complete cache invalidation happens multiple times even on HEAD (10k times for the given case). This cache is
> mostly empty in given test-case, but consider the case where we have huge number of publications and subscriptions (to
> make this cache have huge number of entries) and then we try to drop 1 large publication with say 40k-50k tables, in
> that case we might see slowness while traversing and invalidating the concerned cache on HEAD as well. The test case
> with increased magnitude can be tried for HEAD once to see if we need it on HEAD or not.
>
> thanks
> Shveta
>
>
> On Mon, Dec 19, 2022 at 5:52 PM Bowen Shi <zxwsbg12138@gmail.com> wrote:
>>
>> Hello,
>> Thanks for your advice. I make some tests and this problem can't be
>> reproduced in PG 14+ version. I think adding a new XLOG type will help
>> resolve this problem. But I think the following patch may be helpful
>> in the PG 13 version.
>>
>> The invalidation contains two parts: pgoutput and relfilenodeMap. We
>> have no way to optimize relfilenodeMap part , since it has been
>> discussed in previous mails
>> https://www.postgresql.org/message-id/CANDwggKYveEtXjXjqHA6RL3AKSHMsQyfRY6bK+NqhAWJyw8psQ@mail.gmail.com.
>>
>> However, I'd like to contribute a patch to fix pgoutput part. We can skip
>> invalidating caches after first time with a lazy tag and this works.
>> It almost doubles the walsender performance while decoding this XLOG.
>>
>> I use the test in the last email and reduce the number of relations in
>> publications to 1000, the test result is following:
>>
>> Before optimization: 76 min
>> After optimization: 35 min
>>
>> Though the result is not good enough, I think this patch is still worthy.



Hello,
Thanks for your tests.

>
> No noticeable lag observed on HEAD with the new mechanism of invalidation.
>

Does the new invalidation mechanism mean using
XLOG_XACT_INVALIDATIONS to invalidate the cache? And did you use perf or
pstack to watch the walsender process?

thanks
BowenShi



Re: BUG #17716: walsender process hang while decoding 'DROP PUBLICATION' XLOG

From
shveta malik
Date:
Yes, using XLOG_XACT_INVALIDATIONS. It reduces the number of invalidations
executed per command id.
I used pstack, but the decoding was pretty quick, so there was not much chance
to catch stacks. No lag was seen on the next insert on the subscriber after the
drop publication, even with a heavier test case.

thanks
Shveta


On Wed, Dec 21, 2022 at 10:45 AM Bowen Shi <zxwsbg12138@gmail.com> wrote:
>
> Hello,
> Thanks for your tests.
>
> >
> > No noticeable lag observed on HEAD with the new mechanism of invalidation.
> >
>
> Does the new mechanism of invalidation mean using
> 'XLOG_XACT_INVALIDATIONS‘ to invalid cache? And did you use perf or
> pstack to watch the walsender process.
>
> thanks
> BowenShi



Re: BUG #17716: walsender process hang while decoding 'DROP PUBLICATION' XLOG

From
Amit Kapila
Date:
On Mon, Dec 19, 2022 at 9:58 PM Bowen Shi <zxwsbg12138@gmail.com> wrote:
>
> Hello,
> Thanks for your advice. I make some tests and this problem can't be
> reproduced in PG 14+ version. I think adding a new XLOG type will help
> resolve this problem. But I think the following patch may be helpful
> in the PG 13 version.
>
> The invalidation contains two parts: pgoutput and relfilenodeMap. We
> have no way to optimize relfilenodeMap part , since it has been
> discussed in previous mails
> https://www.postgresql.org/message-id/CANDwggKYveEtXjXjqHA6RL3AKSHMsQyfRY6bK+NqhAWJyw8psQ@mail.gmail.com.
>
> However, I'd like to contribute a patch to fix pgoutput part. We can skip
> invalidating caches after first time with a lazy tag and this works.
> It almost doubles the walsender performance while decoding this XLOG.
>

I don't see any obvious problem with your suggested fix but it doesn't
appear to be a very good solution as it will still mark all entries
for an invalidation after any valid DML operation. Also, there doesn't
appear to be much benefit in versions >=v14 (as per tests done till
now). So, instead of partially solving your use case for old versions,
isn't it better if you can upgrade to later versions (>=14)?

-- 
With Regards,
Amit Kapila.