BUG #17705: Segmentation fault in BufFileLoadBuffer - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #17705: Segmentation fault in BufFileLoadBuffer
Date
Msg-id 17705-1cbd5f5a98e2eedc@postgresql.org
Whole thread Raw
Responses Re: BUG #17705: Segmentation fault in BufFileLoadBuffer
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      17705
Logged by:          Vahur Sinijärv
Email address:      vahursi@gmail.com
PostgreSQL version: 13.9
Operating system:   RHEL 8.7
Description:

Hi!

We started having random segmentation faults with our postgres 13.4 server,
running on RHEL 8.7. It was upgraded to 13.9, but the issue persists. The
database is fairly large, about 250GB on disk.

I got core dump of one of the crashes and it shows SIGSEGV in
BufFileLoadBuffer. I tried to investigate this a little and it seems the
reason for it is that ExecHashJoinGetSavedTuple reads {0, 0} as header,
meaning hashvalue is 0 and tuple length is 0. Line 1277 in nodeHashjoin.c
subtracts sizeof(uint32) from 0 and passes it as size to BufFileRead(). GDB
shows size=18446744073627287632 at frame #1 which is not ((uint64_t) -4),
but -82263984. I think this is caused by BufFileRead which decrements
parameter 'size' by bytes read, so apparently it has read 82263980 bytes,
overwriting BufFile struct passed to BufFileLoadBuffer. Its files field now
contains ascii instead of pointer and file->files[file->curFile]; causes
SIGSEGV.

Why it has read {0, 0} as saved tuple header, or what could have written
these zeroes there, I could not find out...

Stack trace from core dump:
Core was generated by `postgres: mmx mmx ********(32806) INSERT
                      '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000000000077db27 in BufFileLoadBuffer (file=0x7c15548) at
buffile.c:430
430             thisfile = file->files[file->curFile];
(gdb) bt
#0  0x000000000077db27 in BufFileLoadBuffer (file=0x7c15548) at
buffile.c:430
#1  BufFileRead (file=file@entry=0x7c15548, ptr=0x7c15c68,
ptr@entry=0x2da1cbc, size=18446744073627287632)
    at buffile.c:554
#2  0x000000000065b40c in ExecHashJoinGetSavedTuple (file=0x7c15548,
hashvalue=hashvalue@entry=0x7ffe58d22b4c,
    tupleSlot=0x2d6a3b8, hjstate=0x2df8d88) at nodeHashjoin.c:1277
#3  0x000000000065ba96 in ExecHashJoinOuterGetTuple
(hashvalue=0x7ffe58d22b4c, hjstate=0x2df8d88, outerNode=0x2db57f8)
    at nodeHashjoin.c:866
#4  ExecHashJoinImpl (parallel=false, pstate=0x2df8d88) at
nodeHashjoin.c:356
#5  ExecHashJoin (pstate=0x2df8d88) at nodeHashjoin.c:591
#6  0x000000000065bef5 in ExecProcNode (node=0x2df8d88) at
../../../src/include/executor/executor.h:248
#7  ExecHashJoinOuterGetTuple (hashvalue=0x7ffe58d22bdc, hjstate=0x2df8ae8,
outerNode=0x2df8d88) at nodeHashjoin.c:826
#8  ExecHashJoinImpl (parallel=false, pstate=0x2df8ae8) at
nodeHashjoin.c:356
#9  ExecHashJoin (pstate=0x2df8ae8) at nodeHashjoin.c:591
#10 0x000000000065bef5 in ExecProcNode (node=0x2df8ae8) at
../../../src/include/executor/executor.h:248
#11 ExecHashJoinOuterGetTuple (hashvalue=0x7ffe58d22c6c, hjstate=0x2df8848,
outerNode=0x2df8ae8) at nodeHashjoin.c:826
#12 ExecHashJoinImpl (parallel=false, pstate=0x2df8848) at
nodeHashjoin.c:356
#13 ExecHashJoin (pstate=0x2df8848) at nodeHashjoin.c:591
#14 0x000000000065bef5 in ExecProcNode (node=0x2df8848) at
../../../src/include/executor/executor.h:248
#15 ExecHashJoinOuterGetTuple (hashvalue=0x7ffe58d22cfc, hjstate=0x2df85e8,
outerNode=0x2df8848) at nodeHashjoin.c:826
#16 ExecHashJoinImpl (parallel=false, pstate=0x2df85e8) at
nodeHashjoin.c:356
#17 ExecHashJoin (pstate=0x2df85e8) at nodeHashjoin.c:591
#18 0x0000000000666209 in ExecProcNode (node=0x2df85e8) at
../../../src/include/executor/executor.h:248
#19 ExecModifyTable (pstate=0x2df7648) at nodeModifyTable.c:2178
#20 0x000000000063b153 in ExecProcNode (node=0x2df7648) at
../../../src/include/executor/executor.h:248
#21 ExecutePlan (execute_once=<optimized out>, dest=0xd41480 <donothingDR>,
direction=<optimized out>, numberTuples=0,
    sendTuples=<optimized out>, operation=CMD_INSERT,
use_parallel_mode=<optimized out>, planstate=0x2df7648,
    estate=0x2df7218) at execMain.c:1632
#22 standard_ExecutorRun (queryDesc=0x2c2a188, direction=<optimized out>,
count=0, execute_once=<optimized out>)
    at execMain.c:350
#23 0x00000000007aee02 in ProcessQuery (plan=<optimized out>,
sourceText=0x2df1688 "", params=0x0, queryEnv=0x0,
    dest=0xd41480 <donothingDR>, qc=0x7ffe58d231e0) at pquery.c:160
#24 0x00000000007af86b in PortalRunMulti (portal=portal@entry=0x2bc97e8,
isTopLevel=isTopLevel@entry=true,
    setHoldSnapshot=setHoldSnapshot@entry=false, dest=0xd41480
<donothingDR>, dest@entry=0x2b58978,
    altdest=0xd41480 <donothingDR>, altdest@entry=0x2b58978,
qc=qc@entry=0x7ffe58d231e0) at pquery.c:1271
#25 0x00000000007afbda in PortalRun (portal=0x2bc97e8,
count=9223372036854775807, isTopLevel=<optimized out>,
    run_once=<optimized out>, dest=0x2b58978, altdest=0x2b58978,
qc=0x7ffe58d231e0) at pquery.c:788
#26 0x00000000007ad016 in PostgresMain (argc=<optimized out>,
argv=<optimized out>, dbname=<optimized out>,
    username=<optimized out>) at postgres.c:2185
#27 0x000000000072e4ed in BackendRun (port=0x2b786b0, port=0x2b786b0) at
postmaster.c:4550
#28 BackendStartup (port=0x2b786b0) at postmaster.c:4234
#29 ServerLoop () at postmaster.c:1739
#30 0x000000000072f460 in PostmasterMain (argc=argc@entry=15,
argv=argv@entry=0x2b50b00) at postmaster.c:1412
#31 0x0000000000486dd0 in main (argc=15, argv=0x2b50b00) at main.c:210
(gdb) print file
$1 = (BufFile *) 0x7c15548
(gdb) print file->curOffset
$2 = 0
(gdb) print file->files
$3 = (File *) 0x3038333631332c30
(gdb) print file->curFile
$4 = -819986428
(gdb)


pgsql-bugs by date:

Previous
From: Alexander Korotkov
Date:
Subject: Re: Bug in jsonb_path_exists (maybe _match) one-element scalar/variable jsonpath handling
Next
From: Thomas Munro
Date:
Subject: Re: BUG #17705: Segmentation fault in BufFileLoadBuffer