Subject: Connections hang indefinitely while taking a gin index's LWLock buffer_content lock
From: chenhj <chjischj@163.com>
Msg-id: 31a702a.14dd.166c1366ac1.Coremail.chjischj@163.com
List: pgsql-hackers
Hi all,

In our PostgreSQL 10.2 database, two sessions (an INSERT and an autovacuum, both working on the same GIN index) became blocked on LWLock:buffer_content.
This blocked checkpointing and dead-tuple recycling, and we had to restart the database.
According to the information collected with gcore, a deadlock occurred while acquiring buffer content locks: each session was waiting for a lock held by the other.
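
(For context, sessions stuck this way show wait_event_type = 'LWLock' and wait_event = 'buffer_content' in pg_stat_activity. The following is only a minimal sketch of the kind of query that can surface them; the column names are those of PostgreSQL 10:)

    -- Sketch: list sessions currently waiting on a buffer_content LWLock
    -- (the wait_event columns exist since 9.6; backend_type since 10)
    SELECT pid, backend_type, state,
           now() - query_start AS stuck_for,
           left(query, 60) AS query
    FROM pg_stat_activity
    WHERE wait_event_type = 'LWLock'
      AND wait_event = 'buffer_content';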

This system is a Citus 7.2.1 cluster. It has been running for more than a year, and this problem had never happened before.
On 10/19 the cluster was expanded from 8 workers to 16 workers; I don't know whether the problem is related to the expansion.
The problem happened on the coordinator node of Citus, but the table involved is a normal table (not a partitioned table).
The load on this coordinator node has always been heavy.

Since 10/25 this issue has occurred three times within 4 days, but we cannot reproduce it in a test environment.
The 3 incidents were stuck on 3 different tables, but the stack is the same each time.
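
(As a minimal sketch of how one might identify which table a stuck autovacuum worker is processing and what it is waiting on; pg_stat_progress_vacuum exists since PostgreSQL 9.6:)

    -- Sketch: which relation is each (auto)vacuum worker processing, and what is it waiting on?
    SELECT p.pid, p.relid::regclass AS table_being_vacuumed, p.phase,
           a.wait_event_type, a.wait_event
    FROM pg_stat_progress_vacuum p
    JOIN pg_stat_activity a USING (pid);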

## stack of insert: acquiring lock 0x2aaab4009564 while holding 0x2aaab670dfe4 and 0x2aaac587ae64
----------------------------------------------------------------------------
    (gdb) bt
    #0  0x00007fe11552379b in do_futex_wait.constprop.1 () from /lib64/libpthread.so.0
    #1  0x00007fe11552382f in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0
    #2  0x00007fe1155238cb in sem_wait@@GLIBC_2.2.5 () from /lib64/libpthread.so.0
    #3  0x000000000069d362 in PGSemaphoreLock (sema=0x2aaaaac02958) at pg_sema.c:310
    #4  0x00000000007095ac in LWLockAcquire (lock=0x2aaab4009564, mode=LW_EXCLUSIVE) at lwlock.c:1233
    #5  0x0000000000490a32 in ginStepRight (buffer=6713826, index=<optimized out>, lockmode=lockmode@entry=2) at ginbtree.c:174
    #6  0x0000000000490c1c in ginFinishSplit (btree=btree@entry=0x7ffd81e4f950, stack=0x28eebf8, freestack=freestack@entry=1 '\001', buildStats=buildStats@entry=0x0) at ginbtree.c:701
    #7  0x0000000000491396 in ginInsertValue (btree=btree@entry=0x7ffd81e4f950, stack=<optimized out>, insertdata=insertdata@entry=0x7ffd81e4f940, buildStats=buildStats@entry=0x0)
        at ginbtree.c:773
    #8  0x000000000048fb42 in ginInsertItemPointers (index=<optimized out>, rootBlkno=rootBlkno@entry=644, items=items@entry=0x2916598, nitem=nitem@entry=353, buildStats=buildStats@entry=0x0)
        at gindatapage.c:1907
    #9  0x000000000048a9ea in ginEntryInsert (ginstate=ginstate@entry=0x28e7ef8, attnum=<optimized out>, key=42920440, category=<optimized out>, items=0x2916598, nitem=353, 
        buildStats=buildStats@entry=0x0) at gininsert.c:214
    #10 0x0000000000496d94 in ginInsertCleanup (ginstate=ginstate@entry=0x28e7ef8, full_clean=full_clean@entry=0 '\000', fill_fsm=fill_fsm@entry=1 '\001', 
        forceCleanup=forceCleanup@entry=0 '\000', stats=stats@entry=0x0) at ginfast.c:883
    #11 0x0000000000497727 in ginHeapTupleFastInsert (ginstate=ginstate@entry=0x28e7ef8, collector=collector@entry=0x7ffd81e4fe60) at ginfast.c:448
    #12 0x000000000048b209 in gininsert (index=<optimized out>, values=0x7ffd81e4ff40, isnull=0x7ffd81e50040 "", ht_ctid=0x280d98c, heapRel=<optimized out>, checkUnique=<optimized out>, 
        indexInfo=0x28b5aa8) at gininsert.c:522
    #13 0x00000000005ee8dd in ExecInsertIndexTuples (slot=slot@entry=0x28b5d58, tupleid=tupleid@entry=0x280d98c, estate=estate@entry=0x28b5288, noDupErr=noDupErr@entry=1 '\001', 
        specConflict=specConflict@entry=0x7ffd81e5013b "", arbiterIndexes=arbiterIndexes@entry=0x28c6dd8) at execIndexing.c:386
    #14 0x000000000060ccf5 in ExecInsert (canSetTag=1 '\001', estate=0x28b5288, onconflict=ONCONFLICT_UPDATE, arbiterIndexes=0x28c6dd8, planSlot=0x28b5d58, slot=0x28b5d58, mtstate=0x28b5628)
        at nodeModifyTable.c:564
    #15 ExecModifyTable (pstate=0x28b5628) at nodeModifyTable.c:1766
    #16 0x00000000005ef612 in standard_ExecutorRun (queryDesc=<optimized out>, direction=<optimized out>, count=<optimized out>, execute_once=<optimized out>)
        at ../../../src/include/executor/executor.h:250
    #17 0x00007fe10607d15d in pgss_ExecutorRun (queryDesc=0x28cc1d8, direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at pg_stat_statements.c:889
    #18 0x000000000071a7ba in ProcessQuery (plan=<optimized out>, 
        sourceText=0x2849058 "INSERT INTO bi_dm.tdm_sncity_workorder_analytic_statistics (CITYCODE,CITYNAME,DISTRICT,DISTRICT_NAME,ZL_WERKS,ZL_WERKS_NAME,AREA_CD,AREA_NM,ZSIZE,DEPT_CD,TMALL_FLG,ORDER_DATE,DISTRIBUTE,WORKORDER,FNSH"..., params=0x28b2260, queryEnv=0x0, dest=0xc9e2a0 <donothingDR>, completionTag=0x7ffd81e507b0 "") at pquery.c:161
    #19 0x000000000071a9f7 in PortalRunMulti (portal=portal@entry=0x289fdd8, isTopLevel=isTopLevel@entry=1 '\001', setHoldSnapshot=setHoldSnapshot@entry=0 '\000', dest=0xc9e2a0 <donothingDR>, 
        dest@entry=0x284d778, altdest=0xc9e2a0 <donothingDR>, altdest@entry=0x284d778, completionTag=completionTag@entry=0x7ffd81e507b0 "") at pquery.c:1286
    #20 0x000000000071b535 in PortalRun (portal=<optimized out>, count=1, isTopLevel=<optimized out>, run_once=<optimized out>, dest=0x284d778, altdest=0x284d778, 
        completionTag=0x7ffd81e507b0 "") at pquery.c:799
    #21 0x0000000000718f84 in PostgresMain (argc=<optimized out>, argv=<optimized out>, dbname=<optimized out>, username=<optimized out>) at postgres.c:1984
    #22 0x000000000047ad3c in BackendRun (port=0x27cbc40) at postmaster.c:4405
    #23 BackendStartup (port=0x27cbc40) at postmaster.c:4077
    #24 ServerLoop () at postmaster.c:1755
    #25 0x00000000006afacf in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x277ba20) at postmaster.c:1363
    #26 0x000000000047bb6f in main (argc=3, argv=0x277ba20) at main.c:228
    (gdb) f 4
    #4  0x00000000007095ac in LWLockAcquire (lock=0x2aaab4009564, mode=LW_EXCLUSIVE) at lwlock.c:1233
    1233 PGSemaphoreLock(proc->sem);
    (gdb) p num_held_lwlocks
    $16 = 2
    (gdb) p InterruptHoldoffCount
    $17 = 3
    (gdb) p held_lwlocks
    $18 = {{lock = 0x2aaab670dfe4, mode = LW_EXCLUSIVE}, {lock = 0x2aaac587ae64, mode = LW_EXCLUSIVE}, {lock = 0x2aaaaac0a380, mode = LW_SHARED}, {lock = 0x2aaab670a2e4, mode = LW_EXCLUSIVE}, {
        lock = 0x2aaab670a2e4, mode = LW_EXCLUSIVE}, {lock = 0x2aaab670a2e4, mode = LW_EXCLUSIVE}, {lock = 0x2aaab670a2e4, mode = LW_EXCLUSIVE}, {lock = 0x2aaab670a2e4, mode = LW_EXCLUSIVE}, {
        lock = 0x2aaab670a2e4, mode = LW_EXCLUSIVE}, {lock = 0x2aaab670a2e4, mode = LW_EXCLUSIVE}, {lock = 0x2aaaaac15a80, mode = LW_EXCLUSIVE}, {lock = 0x0, 
        mode = LW_EXCLUSIVE} <repeats 189 times>}


## stack of autovacuum: acquiring lock 0x2aaab670dfe4 while holding 0x2aaab4009564
--------------------------------------
    (gdb) bt
    #0  0x00007fe11552379b in do_futex_wait.constprop.1 () from /lib64/libpthread.so.0
    #1  0x00007fe11552382f in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0
    #2  0x00007fe1155238cb in sem_wait@@GLIBC_2.2.5 () from /lib64/libpthread.so.0
    #3  0x000000000069d362 in PGSemaphoreLock (sema=0x2aaaaac07eb8) at pg_sema.c:310
    #4  0x00000000007095ac in LWLockAcquire (lock=0x2aaab670dfe4, mode=LW_EXCLUSIVE) at lwlock.c:1233
    #5  0x00000000004947ef in ginScanToDelete (gvs=gvs@entry=0x7ffd81e4d0f0, blkno=701, isRoot=isRoot@entry=0 '\000', parent=parent@entry=0x7ffd81e4c790, myoff=myoff@entry=37) at ginvacuum.c:262
    #6  0x0000000000494874 in ginScanToDelete (gvs=gvs@entry=0x7ffd81e4d0f0, blkno=blkno@entry=9954, isRoot=isRoot@entry=1 '\001', parent=parent@entry=0x7ffd81e4c790, myoff=myoff@entry=0)
        at ginvacuum.c:277
    #7  0x0000000000494ed1 in ginVacuumPostingTreeLeaves (gvs=gvs@entry=0x7ffd81e4d0f0, blkno=9954, isRoot=isRoot@entry=0 '\000') at ginvacuum.c:404
    #8  0x0000000000494e21 in ginVacuumPostingTreeLeaves (gvs=gvs@entry=0x7ffd81e4d0f0, blkno=644, isRoot=isRoot@entry=1 '\001') at ginvacuum.c:372
    #9  0x0000000000495540 in ginVacuumPostingTree (rootBlkno=<optimized out>, gvs=0x7ffd81e4d0f0) at ginvacuum.c:426
    #10 ginbulkdelete (info=0x7ffd81e4f720, stats=<optimized out>, callback=<optimized out>, callback_state=<optimized out>) at ginvacuum.c:649
    #11 0x00000000005e1194 in lazy_vacuum_index (indrel=0x3146e28, stats=stats@entry=0x28ec200, vacrelstats=vacrelstats@entry=0x28ebc28) at vacuumlazy.c:1621
    #12 0x00000000005e214d in lazy_scan_heap (aggressive=<optimized out>, nindexes=<optimized out>, Irel=<optimized out>, vacrelstats=<optimized out>, options=16, onerel=0x28ec1f8)
        at vacuumlazy.c:1311
    #13 lazy_vacuum_rel (onerel=onerel@entry=0x3144fa8, options=options@entry=99, params=params@entry=0x289f270, bstrategy=<optimized out>) at vacuumlazy.c:258
    #14 0x00000000005dfc35 in vacuum_rel (relid=relid@entry=2060396948, relation=relation@entry=0x7ffd81e4fe50, options=options@entry=99, params=params@entry=0x289f270) at vacuum.c:1463
    #15 0x00000000005e0ad2 in vacuum (options=99, relation=relation@entry=0x7ffd81e4fe50, relid=<optimized out>, params=params@entry=0x289f270, va_cols=va_cols@entry=0x0, 
        bstrategy=<optimized out>, bstrategy@entry=0x289e988, isTopLevel=isTopLevel@entry=1 '\001') at vacuum.c:306
    #16 0x00000000006a01ff in autovacuum_do_vac_analyze (bstrategy=0x289e988, tab=0x289f268) at autovacuum.c:3095
    #17 do_autovacuum () at autovacuum.c:2450
    #18 0x00000000006a06a9 in AutoVacWorkerMain (argv=0x0, argc=0) at autovacuum.c:1693
    #19 0x00000000006a0f49 in StartAutoVacWorker () at autovacuum.c:1498
    #20 0x00000000006aeaea in StartAutovacuumWorker () at postmaster.c:5462
    #21 sigusr1_handler (postgres_signal_arg=<optimized out>) at postmaster.c:5159
    #22 <signal handler called>
    #23 0x00007fe11334bb83 in __select_nocancel () from /lib64/libc.so.6
    #24 0x000000000047a3a3 in ServerLoop () at postmaster.c:1719
    #25 0x00000000006afacf in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x277ba20) at postmaster.c:1363
    #26 0x000000000047bb6f in main (argc=3, argv=0x277ba20) at main.c:228
    (gdb) f 4
    #4  0x00000000007095ac in LWLockAcquire (lock=0x2aaab670dfe4, mode=LW_EXCLUSIVE) at lwlock.c:1233
    1233 PGSemaphoreLock(proc->sem);
    (gdb) 
    (gdb)  p num_held_lwlocks
    $1 = 1
    (gdb) p InterruptHoldoffCount
    $2 = 2
    (gdb)  p held_lwlocks
    $3 = {{lock = 0x2aaab4009564, mode = LW_EXCLUSIVE}, {lock = 0x2aaaaac0aa00, mode = LW_SHARED}, {lock = 0x2aaaba8da364, mode = LW_EXCLUSIVE}, {lock = 0x2aaaaac15900, mode = LW_EXCLUSIVE}, {
        lock = 0x2aaab66eb064, mode = LW_EXCLUSIVE}, {lock = 0x2aaaaac15900, mode = LW_EXCLUSIVE}, {lock = 0x2aaaaac15c80, mode = LW_EXCLUSIVE}, {lock = 0x0, 
        mode = LW_EXCLUSIVE} <repeats 193 times>}
    
# env
- CentOS 7.3 64bit
- PostgreSQL 10.2 (installed from the RPM packages downloaded from postgresql.org)

# application SQL

begin;
delete from bi_dm.tdm_sncity_workorder_analytic_statistics where order_date between '2018-10-09' and '2018-10-09';

INSERT INTO bi_dm.tdm_sncity_workorder_analytic_statistics
    (CITYCODE, CITYNAME, DISTRICT, DISTRICT_NAME, ZL_WERKS, ZL_WERKS_NAME, AREA_CD, AREA_NM, ZSIZE, DEPT_CD,
     TMALL_FLG, ORDER_DATE, DISTRIBUTE, WORKORDER, FNSHWORKORDER, STATISFACTION_CHILD, STATISFACTION_MOTHER,
     ONLYONCE_FNSH, REPLY_1H, FNSH_24H, FNSH_48H, ETL_TIME, SALE_PTY, SALE_PTY_FLAG, VBELN_TP, DEALCOUNT_3, NOT_FNSH_NUM)
VALUES ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,$25,$26,$27)
ON CONFLICT (order_date, citycode, cityname, district, district_name, zl_werks, zl_werks_name, area_cd, area_nm,
             zsize, dept_cd, tmall_flg, sale_pty, sale_pty_flag, vbeln_tp)
DO UPDATE SET
    CITYCODE = excluded.CITYCODE, CITYNAME = excluded.CITYNAME, DISTRICT = excluded.DISTRICT,
    DISTRICT_NAME = excluded.DISTRICT_NAME, ZL_WERKS = excluded.ZL_WERKS, ZL_WERKS_NAME = excluded.ZL_WERKS_NAME,
    AREA_CD = excluded.AREA_CD, AREA_NM = excluded.AREA_NM, ZSIZE = excluded.ZSIZE, DEPT_CD = excluded.DEPT_CD,
    TMALL_FLG = excluded.TMALL_FLG, ORDER_DATE = excluded.ORDER_DATE, DISTRIBUTE = excluded.DISTRIBUTE,
    WORKORDER = excluded.WORKORDER, FNSHWORKORDER = excluded.FNSHWORKORDER,
    STATISFACTION_CHILD = excluded.STATISFACTION_CHILD, STATISFACTION_MOTHER = excluded.STATISFACTION_MOTHER,
    ONLYONCE_FNSH = excluded.ONLYONCE_FNSH, REPLY_1H = excluded.REPLY_1H, FNSH_24H = excluded.FNSH_24H,
    FNSH_48H = excluded.FNSH_48H, ETL_TIME = excluded.ETL_TIME, SALE_PTY = excluded.SALE_PTY,
    SALE_PTY_FLAG = excluded.SALE_PTY_FLAG, VBELN_TP = excluded.VBELN_TP, DEALCOUNT_3 = excluded.DEALCOUNT_3,
    NOT_FNSH_NUM = excluded.NOT_FNSH_NUM;

-- the INSERT is executed via a JDBC batch update

commit;

# table def
db1=# \d bi_dm.tdm_sncity_workorder_analytic_statistics
             Table "bi_dm.tdm_sncity_workorder_analytic_statistics"
        Column        |          Type          | Collation | Nullable | Default
----------------------+------------------------+-----------+----------+---------
citycode             | character varying(100) |           |          |
cityname             | character varying(100) |           |          |
district             | character varying(100) |           |          |
district_name        | character varying(100) |           |          |
zl_werks             | character varying(100) |           |          |
zl_werks_name        | character varying(100) |           |          |
area_cd              | character varying(100) |           |          |
area_nm              | character varying(100) |           |          |
zsize                | character varying(100) |           |          |
dept_cd              | character varying(100) |           |          |
tmall_flg            | character varying(100) |           |          |
order_date           | character varying(100) |           |          |
distribute           | integer                |           |          |
workorder            | integer                |           |          |
fnshworkorder        | integer                |           |          |
statisfaction_child  | integer                |           |          |
statisfaction_mother | integer                |           |          |
onlyonce_fnsh        | integer                |           |          |
reply_1h             | integer                |           |          |
fnsh_24h             | integer                |           |          |
fnsh_48h             | integer                |           |          |
etl_time             | character varying(100) |           |          |
sale_pty             | character varying(100) |           |          |
sale_pty_flag        | character varying(100) |           |          |
vbeln_tp             | character varying(100) |           |          |
dealcount_3          | integer                |           |          |
not_fnsh_num         | integer                |           |          |
Indexes:
    "idx1_tdm_sncity_workorder_analytic_statistics" UNIQUE, btree (order_date, citycode, cityname, district, district_name, zl_werks, zl_werks_name, area_cd, area_nm, zsize, dept_cd, tmall_flg, sale_pty, sale_pty_flag, vbeln_tp)
    "gin_tdm_sncity_workorder_analytic_statistics" gin (order_date)

# another phenomenon

The above phenomenon happened only on the primary node. The standby node hung on a different stack four times.
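
(On the standby the stuck process is the startup/recovery process rather than a normal backend; below is a sketch of a query showing what it and the read-only backends are waiting on, using the PostgreSQL 10 pg_stat_activity columns:)

    -- Sketch: what are the recovery process and read-only backends waiting on?
    SELECT pid, backend_type, wait_event_type, wait_event, state
    FROM pg_stat_activity
    WHERE backend_type = 'startup'
       OR wait_event_type = 'LWLock';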

## stack of recovery process
--------------------------------------
#0  0x00007f7aaf72a79b in do_futex_wait.constprop.1 () from /lib64/libpthread.so.0
#1  0x00007f7aaf72a82f in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0
#2  0x00007f7aaf72a8cb in sem_wait@@GLIBC_2.2.5 () from /lib64/libpthread.so.0
#3  0x000000000069d362 in PGSemaphoreLock (sema=0x2aaaaac07ff8) at pg_sema.c:310
#4  0x00000000007095ac in LWLockAcquire (lock=0x2aaaad50e2e4, mode=LW_EXCLUSIVE) at lwlock.c:1233
#5  0x0000000000503c1a in XLogReadBufferForRedoExtended (record=record@entry=0x105ce58, block_id=block_id@entry=2 '\002', mode=mode@entry=RBM_NORMAL, get_cleanup_lock=get_cleanup_lock@entry=0 '\000', buf=buf@entry=0x7fff550d93c0) at xlogutils.c:399
#6  0x0000000000503d00 in XLogReadBufferForRedo (record=record@entry=0x105ce58, block_id=block_id@entry=2 '\002', buf=buf@entry=0x7fff550d93c0) at xlogutils.c:293
#7  0x000000000048bcd5 in ginRedoDeletePage (record=0x105ce58) at ginxlog.c:482
#8  gin_redo (record=0x105ce58) at ginxlog.c:731
#9  0x00000000004f8b83 in StartupXLOG () at xlog.c:7183
#10 0x00000000006b0091 in StartupProcessMain () at startup.c:217
#11 0x000000000050689a in AuxiliaryProcessMain (argc=argc@entry=2, argv=argv@entry=0x7fff550dcbe0) at bootstrap.c:426
#12 0x00000000006ad160 in StartChildProcess (type=StartupProcess) at postmaster.c:5361
#13 0x00000000006afaab in PostmasterMain (argc=argc@entry=1, argv=argv@entry=0x101ba20) at postmaster.c:1355
#14 0x000000000047bb6f in main (argc=1, argv=0x101ba20) at main.c:228

## stack of backend process(tens of such process)
--------------------------------------
#0  0x00007f7aaf72a79b in do_futex_wait.constprop.1 () from /lib64/libpthread.so.0
#1  0x00007f7aaf72a82f in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0
#2  0x00007f7aaf72a8cb in sem_wait@@GLIBC_2.2.5 () from /lib64/libpthread.so.0
#3  0x000000000069d362 in PGSemaphoreLock (sema=0x2aaaaac074b8) at pg_sema.c:310
#4  0x00000000007095ac in LWLockAcquire (lock=0x2aaac6c081e4, mode=LW_SHARED) at lwlock.c:1233
#5  0x0000000000490943 in ginTraverseLock (buffer=7034160, searchMode=searchMode@entry=1 '\001') at ginbtree.c:40
#6  0x00000000004910e0 in ginFindLeafPage (btree=btree@entry=0x1fd6318, searchMode=searchMode@entry=1 '\001', snapshot=snapshot@entry=0x200a958) at ginbtree.c:97
#7  0x000000000048fe5b in ginScanBeginPostingTree (btree=btree@entry=0x1fd6318, index=<optimized out>, rootBlkno=rootBlkno@entry=2246, snapshot=snapshot@entry=0x200a958) at gindatapage.c:1924
#8  0x000000000049319c in startScanEntry (ginstate=ginstate@entry=0x1eade90, entry=<optimized out>, snapshot=0x200a958) at ginget.c:390
#9  0x0000000000493a92 in startScan (scan=0x1e6c348, scan=0x1e6c348) at ginget.c:543
#10 gingetbitmap (scan=0x1e6c348, tbm=0x20df978) at ginget.c:1829
#11 0x00000000004c245a in index_getbitmap (scan=scan@entry=0x1e6c348, bitmap=bitmap@entry=0x20df978) at indexam.c:726
#12 0x0000000000600c52 in MultiExecBitmapIndexScan (node=0x1ebfcd0) at nodeBitmapIndexscan.c:104
#13 0x00000000005f40b9 in MultiExecProcNode (node=<optimized out>) at execProcnode.c:490
#14 0x00000000006004c7 in BitmapHeapNext (node=node@entry=0x1a0c618) at nodeBitmapHeapscan.c:117
#15 0x00000000005f5094 in ExecScanFetch (recheckMtd=0x5ff990 <BitmapHeapRecheck>, accessMtd=0x5ffa20 <BitmapHeapNext>, node=0x1a0c618) at execScan.c:97
#16 ExecScan (node=0x1a0c618, accessMtd=0x5ffa20 <BitmapHeapNext>, recheckMtd=0x5ff990 <BitmapHeapRecheck>) at execScan.c:164
#17 0x0000000000604a3f in ExecProcNode (node=0x1a0c618) at ../../../src/include/executor/executor.h:250
#18 ExecHashJoin (pstate=0x1ebf670) at nodeHashjoin.c:145
#19 0x00000000005fad89 in ExecProcNode (node=0x1ebf670) at ../../../src/include/executor/executor.h:250
#20 fetch_input_tuple (aggstate=aggstate@entry=0x1ebef78) at nodeAgg.c:695
#21 0x00000000005fcfd3 in agg_fill_hash_table (aggstate=0x1ebef78) at nodeAgg.c:2539
#22 ExecAgg (pstate=0x1ebef78) at nodeAgg.c:2151
#23 0x0000000000610636 in ExecProcNode (node=0x1ebef78) at ../../../src/include/executor/executor.h:250
#24 ExecSort (pstate=0x1ebed08) at nodeSort.c:106
#25 0x0000000000607cfd in ExecProcNode (node=0x1ebed08) at ../../../src/include/executor/executor.h:250
#26 ExecLimit (pstate=0x1ebe0f8) at nodeLimit.c:95
#27 0x00000000005ef612 in ExecProcNode (node=0x1ebe0f8) at ../../../src/include/executor/executor.h:250
#28 ExecutePlan (execute_once=<optimized out>, dest=0x10e34f8, direction=<optimized out>, numberTuples=0, sendTuples=1 '\001', operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x1ebe0f8, estate=0x1ebde98) at execMain.c:1722
#29 standard_ExecutorRun (queryDesc=0x18caf88, direction=<optimized out>, count=0, execute_once=<optimized out>) at execMain.c:363
#30 0x00007f7aa028415d in pgss_ExecutorRun (queryDesc=0x18caf88, direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at pg_stat_statements.c:889
#31 0x000000000071a28b in PortalRunSelect (portal=portal@entry=0x1111d88, forward=forward@entry=1 '\001', count=0, count@entry=9223372036854775807, dest=dest@entry=0x10e34f8) at pquery.c:932
#32 0x000000000071b63f in PortalRun (portal=<optimized out>, count=9223372036854775807, isTopLevel=<optimized out>, run_once=<optimized out>, dest=0x10e34f8, altdest=0x10e34f8, completionTag=0x7fff550dc220 "") at pquery.c:773
#33 0x0000000000718f84 in PostgresMain (argc=<optimized out>, argv=<optimized out>, dbname=<optimized out>, username=<optimized out>) at postgres.c:1984
#34 0x000000000047ad3c in BackendRun (port=0x106a590) at postmaster.c:4405
#35 BackendStartup (port=0x106a590) at postmaster.c:4077
#36 ServerLoop () at postmaster.c:1755
#37 0x00000000006afacf in PostmasterMain (argc=argc@entry=1, argv=argv@entry=0x101ba20) at postmaster.c:1363
#38 0x000000000047bb6f in main (argc=1, argv=0x101ba20) at main.c:228

# Similar problems

I noticed that there were some similar bug reports before. 

- https://www.postgresql.org/message-id/CAK7KUdBjF4TN5Z2Rf0fv1MVUa%3DuqQssKMhoTXUa2y88R4PPjsg%40mail.gmail.com
- https://www.postgresql.org/message-id/1509467408126-0.post@n3.nabble.com
- https://www.postgresql-archive.org/Stalls-on-PGSemaphoreLock-td5797385.html

According to https://www.postgresql-archive.org/Stalls-on-PGSemaphoreLock-td5797385.html, this issue may be related to huge pages.
So we turned huge pages off. I don't know whether the problem will happen again, but it has not recurred so far.


BTW:
The related configuration of our machine is as follows:

cat /sys/kernel/mm/transparent_hugepage/defrag
    [always] madvise never 
        
cat /proc/sys/kernel/sem 
250 32000 32 128

/proc/meminfo
AnonHugePages:   4446208 kB
HugePages_Total:   38912
HugePages_Free:     5907
HugePages_Rsvd:       40
HugePages_Surp:        0

Maybe we should only turn off transparent hugepages instead of huge pages.
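
(A minimal sketch for checking, from the SQL side, whether the server itself is configured to use explicit huge pages; the huge_pages GUC exists since PostgreSQL 9.4 and defaults to 'try':)

    -- Sketch: server-side huge page configuration
    SHOW huge_pages;       -- 'try' (default), 'on', or 'off'
    SHOW shared_buffers;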

For now, I am not sure whether this problem is a kernel configuration issue or a bug in the GIN index implementation. Any tips?

Regards,
Chen Huajun
