Re: Fix segfault while accessing half-initialized hash table in pgstat_shmem.c - Mailing list pgsql-hackers

From Ranier Vilela
Subject Re: Fix segfault while accessing half-initialized hash table in pgstat_shmem.c
Msg-id CAEudQAogM-u88dbCcGF-ymS1Y_B91=neYg0u7OsVrmXyXfgDNw@mail.gmail.com
In response to Fix segfault while accessing half-initialized hash table in pgstat_shmem.c  (Mikhail Kot <mikhail.kot@databricks.com>)
Responses Re: Fix segfault while accessing half-initialized hash table in pgstat_shmem.c
List pgsql-hackers


On Wed, Sep 3, 2025 at 03:34, Mikhail Kot <mikhail.kot@databricks.com> wrote:
Hi,

I've encountered the following segmentation fault lately. It happens when
Postgres is experiencing high memory pressure. There are multiple OOM errors in
the log as well.

Core was generated by `postgres: neondb_owner neondb ::1(46658) BIND'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  pg_atomic_read_u32_impl (ptr=0x8) at ../../../../src/include/port/atomics/generic.h:48
#1  pg_atomic_read_u32 (ptr=0x8) at ../../../../src/include/port/atomics.h:239
#2  LWLockAttemptLock (lock=lock@entry=0x4, mode=mode@entry=LW_EXCLUSIVE) at lwlock.c:821
#3  0x000056446bce129f in LWLockConditionalAcquire (lock=0x4, mode=mode@entry=LW_EXCLUSIVE) at lwlock.c:1386
#4  0x000056446bd0bacf in pgstat_lock_entry (entry_ref=entry_ref@entry=0x56446d9f4340, nowait=nowait@entry=true) at pgstat_shmem.c:625
#5  0x000056446bd0a3c9 in pgstat_relation_flush_cb (entry_ref=0x56446d9f4340, nowait=<optimized out>) at pgstat_relation.c:794
#6  0x000056446bd069f5 in pgstat_flush_pending_entries (nowait=<optimized out>) at pgstat.c:1217
#7  pgstat_report_stat (force=<optimized out>, force@entry=false) at pgstat.c:658
#8  0x000056446bcf16c1 in PostgresMain (dbname=<optimized out>, username=<optimized out>) at postgres.c:4623
#9  0x000056446bc716b3 in BackendRun (port=<optimized out>, port=<optimized out>) at postmaster.c:4465
#10 BackendStartup (port=<optimized out>) at postmaster.c:4193
#11 ServerLoop () at postmaster.c:1782
#12 0x000056446bc726ea in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x56446cd803b0) at postmaster.c:1466
#13 0x000056446b9d5a00 in main (argc=3, argv=0x56446cd803b0) at main.c:238
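
The very small addresses in frames #0 to #3 (lock=0x4, ptr=0x8) look like field offsets applied to a NULL base pointer rather than real shared-memory addresses. The sketch below illustrates that arithmetic. The struct names and layouts are hypothetical, simplified stand-ins (a 32-bit marker followed by a lock whose state word follows a 16-bit id), not the actual PostgreSQL definitions, and the resulting offsets assume a common ABI where uint32_t is 4-byte aligned.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a lightweight lock. */
typedef struct DemoLWLock
{
    uint16_t tranche;   /* lock group id */
    uint32_t state;     /* word that an atomic read would dereference */
} DemoLWLock;

/* Hypothetical, simplified stand-in for a shared stats entry header. */
typedef struct DemoSharedCommon
{
    uint32_t   magic;   /* validity marker */
    DemoLWLock lock;    /* per-entry lock */
} DemoSharedCommon;

/* Address of the lock field, given the base address of the entry. */
static uintptr_t demo_lock_addr(uintptr_t base)
{
    return base + offsetof(DemoSharedCommon, lock);
}

/* Address of the lock's state word, given the base address of the entry. */
static uintptr_t demo_state_addr(uintptr_t base)
{
    return demo_lock_addr(base) + offsetof(DemoLWLock, state);
}
```

With a base of 0 (a NULL entry body), these helpers produce small nonzero addresses of the same kind as those in the backtrace, which is consistent with the lock being looked up through a pointer that was never set.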

The error originates in pgstat_shmem.c, where shhashent is left in a
half-initialized state if pgstat_init_entry(), which calls dsa_allocate0(),
errors out with OOM. A later access to that shhashent then causes the
segmentation fault. I propose a patch that fixes this issue. The patch is
against the main branch, but the code is nearly identical in Postgres 13-17,
so I suggest backporting it to the other supported versions.
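
To make the failure mode concrete, here is a minimal standalone sketch of a hash-table slot that is published before its body is allocated. Everything in it (DemoEntry, demo_insert_unsafe, the fail_alloc flag that simulates OOM) is hypothetical and only mimics the pattern described above; it is not the pgstat_shmem.c code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_SLOTS 4

typedef struct DemoBody { int lock_word; long counter; } DemoBody;

typedef struct DemoEntry
{
    bool      in_use;   /* slot is visible to lookups */
    unsigned  key;
    DemoBody *body;     /* stays NULL if the body allocation fails */
} DemoEntry;

/* Allocation wrapper; fail_alloc simulates an out-of-memory condition. */
static DemoBody *demo_alloc_body(bool fail_alloc)
{
    return fail_alloc ? NULL : calloc(1, sizeof(DemoBody));
}

/* Publishes the slot first and allocates second, so a failed
 * allocation leaves a visible slot with no body behind it. */
static bool demo_insert_unsafe(DemoEntry *table, unsigned key, bool fail_alloc)
{
    assert(table != NULL);
    DemoEntry *e = &table[key % DEMO_SLOTS];
    e->in_use = true;
    e->key = key;
    e->body = demo_alloc_body(fail_alloc);
    return e->body != NULL;
}

/* Same insert, but withdraws the slot again when allocation fails. */
static bool demo_insert_with_rollback(DemoEntry *table, unsigned key, bool fail_alloc)
{
    if (demo_insert_unsafe(table, key, fail_alloc))
        return true;
    table[key % DEMO_SLOTS].in_use = false;
    return false;
}

/* Returns the slot if it is visible for this key, else NULL. */
static DemoEntry *demo_lookup(DemoEntry *table, unsigned key)
{
    DemoEntry *e = &table[key % DEMO_SLOTS];
    return (e->in_use && e->key == key) ? e : NULL;
}
```

After a failed demo_insert_unsafe(), demo_lookup() still finds the slot, and any code that dereferences its body without checking will crash. After a failed demo_insert_with_rollback(), the lookup finds nothing, so later code never sees the half-built entry.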

The patch changes pgstat_init_entry()'s behaviour so that it returns NULL if
memory allocation fails.

I wonder whether it would be better to raise elog(ERROR) instead, and avoid
the many checks for this NULL.
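
The two options can be compared in a small standalone sketch. PostgreSQL's elog(ERROR) unwinds with a non-local jump, so plain setjmp/longjmp is used here as a stand-in; all names (DemoStats, demo_init_or_null, demo_init_or_error, demo_error_ctx) are hypothetical and nothing below is taken from the patch.

```c
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct DemoStats { long counter; } DemoStats;

/* Hypothetical stand-in for the single error-recovery point. */
static jmp_buf demo_error_ctx;

/* Style 1: failure is reported through the return value,
 * so every caller has to check for NULL. */
static DemoStats *demo_init_or_null(bool fail_alloc)
{
    return fail_alloc ? NULL : calloc(1, sizeof(DemoStats));
}

/* Style 2: failure jumps to the recovery point, so this never
 * returns NULL. Any cleanup of shared state must happen before the jump. */
static DemoStats *demo_init_or_error(bool fail_alloc)
{
    DemoStats *s = demo_init_or_null(fail_alloc);
    if (s == NULL)
        longjmp(demo_error_ctx, 1);
    return s;
}

/* Caller for style 1: explicit NULL check; false means the update was skipped. */
static bool demo_bump_checked(bool fail_alloc)
{
    DemoStats *s = demo_init_or_null(fail_alloc);
    if (s == NULL)
        return false;
    s->counter++;
    free(s);
    return true;
}

/* Caller for style 2: no NULL check; the failure is handled in one place. */
static bool demo_bump_with_error(bool fail_alloc)
{
    if (setjmp(demo_error_ctx) != 0)
        return false;   /* reached only via longjmp from demo_init_or_error */
    DemoStats *s = demo_init_or_error(fail_alloc);
    s->counter++;
    free(s);
    return true;
}
```

Raising the error keeps callers free of NULL checks, but it does not by itself remove a half-initialized entry: according to the report quoted above, the allocation already errors out on OOM, so with either style the entry would presumably still need to be undone before control leaves the function.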

best regards,
Ranier Vilela
