Re: Proposal: Cascade REPLICA IDENTITY changes to leaf partitions - Mailing list pgsql-hackers

From Chao Li
Subject Re: Proposal: Cascade REPLICA IDENTITY changes to leaf partitions
Msg-id 25034397-BBCD-4642-A86B-2811FC82DC64@gmail.com
In response to Re: Proposal: Cascade REPLICA IDENTITY changes to leaf partitions  (Chao Li <li.evan.chao@gmail.com>)
List pgsql-hackers

> On Jan 26, 2026, at 10:51, Chao Li <li.evan.chao@gmail.com> wrote:
>
> In a previous discussion [4], Dmitry Dolgov pointed out a test case that resulted in a DEADLOCK. I ran that test
> against v3. The test still fails, but I no longer observe a deadlock; instead, the server now crashes during partition
> attachment. I will investigate this further.

I tried to investigate the server crash yesterday, but I’m no longer able to reproduce it. From the record of the first
crash I encountered, the call stack looked like this:
```
TRAP: failed Assert("entry->data.lockmode == BUFFER_LOCK_UNLOCK"), File: "bufmgr.c", Line: 5908, PID: 47991
0   postgres                            0x00000001013d9bb0 ExceptionalCondition + 216
1   postgres                            0x0000000101129a80 BufferLockConditional + 88
2   postgres                            0x0000000101129a04 ConditionalLockBuffer + 224
3   postgres                            0x0000000100b8966c _bt_conditionallockbuf + 28
4   postgres                            0x0000000100b88714 _bt_allocbuf + 128
5   postgres                            0x0000000100b858d4 _bt_split + 1496
6   postgres                            0x0000000100b82cec _bt_insertonpg + 1520
7   postgres                            0x0000000100b81220 _bt_doinsert + 608
8   postgres                            0x0000000100b9a008 btinsert + 120
9   postgres                            0x0000000100b7a224 index_insert + 552
10  postgres                            0x0000000100c5dd50 CatalogIndexInsert + 764
11  postgres                            0x0000000100c5df60 CatalogTupleUpdate + 100
12  postgres                            0x0000000100c7e608 ConstraintSetParentConstraint + 580
13  postgres                            0x0000000100de6598 AttachPartitionEnsureIndexes + 1596
14  postgres                            0x0000000100de5cac attachPartitionTable + 80
15  postgres                            0x0000000100dd864c ATExecAttachPartition + 2520
16  postgres                            0x0000000100dcb8e8 ATExecCmd + 4464
17  postgres                            0x0000000100dc6054 ATRewriteCatalogs + 408
18  postgres                            0x0000000100dbfa18 ATController + 256
19  postgres                            0x0000000100dbf84c AlterTable + 96
20  postgres                            0x00000001011a3508 ProcessUtilitySlow + 1704
21  postgres                            0x00000001011a111c standard_ProcessUtility + 3504
22  postgres                            0x00000001011a035c ProcessUtility + 360
23  postgres                            0x000000010119fa10 PortalRunUtility + 216
24  postgres                            0x000000010119eae0 PortalRunMulti + 688
25  postgres                            0x000000010119e018 PortalRun + 788
26  postgres                            0x0000000101198dcc exec_simple_query + 1380
27  postgres                            0x0000000101197ee8 PostgresMain + 3244
28  postgres                            0x000000010118f8d0 BackendInitialize + 0
29  postgres                            0x0000000101061f3c postmaster_child_launch + 456
30  postgres                            0x00000001010696c8 BackendStartup + 304
31  postgres                            0x0000000101067564 ServerLoop + 372
32  postgres                            0x0000000101066044 PostmasterMain + 6440
33  postgres                            0x0000000100ee40a4 main + 924
34  dyld                                0x000000019a36dd54 start + 7184
2026-01-26 09:52:41.240 CST [46845] LOG:  client backend (PID 47991) was terminated by signal 6: Abort trap: 6
```

I noticed that the Assert in bufmgr.c was removed earlier today by commit 333f58637.

However, with the server crash no longer occurring, the DEADLOCK issue reappeared. After some investigation, I
confirmed that the deadlock is not specific to this patch: I can consistently reproduce it with ATTACH PARTITION on the
master branch. That suggests this is a more general problem.

I’ll start a new thread to follow up on the deadlock separately.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/






