RE: [BUG + PATCH] DSA pagemap out-of-bounds in make_new_segment odd-sized path - Mailing list pgsql-hackers

From
Subject RE: [BUG + PATCH] DSA pagemap out-of-bounds in make_new_segment odd-sized path
Date
Msg-id 00f901dcac2f$4c8cf9a0$e5a6ece0$@icloud.com
In response to [BUG + PATCH] DSA pagemap out-of-bounds in make_new_segment odd-sized path  (<paul.bunn@icloud.com>)
Responses Re: [BUG + PATCH] DSA pagemap out-of-bounds in make_new_segment odd-sized path
List pgsql-hackers

Attached is a new patch to demonstrate the bug in make_new_segment.

It is not meant as a permanent new test, since it hard-codes some assumptions about the x64 platform. It is only useful for validating the bug and the fix.

 

 

From: paul.bunn@icloud.com <paul.bunn@icloud.com>
Sent: Wednesday, March 4, 2026 12:01 AM
To: pgsql-hackers@postgresql.org
Subject: [BUG + PATCH] DSA pagemap out-of-bounds in make_new_segment odd-sized path

 

Hi hackers,

 

Sorry for the earlier email, which was poorly formatted and badly threaded.

 

We've identified a bug in the DSA (Dynamic Shared Memory Area) allocator
that causes memory corruption and crashes during parallel hash joins with
large data sets.  The bug has been present since the DSA implementation
was introduced and affects all supported branches.

 

Attached is a minimal fix (5 lines added, 0 changed).

 

== Bug Summary ==

 

In make_new_segment() (src/backend/utils/mmgr/dsa.c), there are two paths
for computing segment layout:

  Path 1 (geometric): knows total_size upfront, computes
    total_pages = total_size / FPM_PAGE_SIZE
    pagemap entries = total_pages                  <-- correct

  Path 2 (odd-sized, when requested > geometric): works forward from
    usable_pages = requested_pages
    pagemap entries = usable_pages                 <-- BUG

 

The pagemap is indexed by absolute page number.  The FreePageManager hands
out pages with indices from metadata_pages to (metadata_pages +
usable_pages - 1).  Since metadata_pages >= 1, the page indices at the high
end of that range are greater than or equal to usable_pages, so pagemap
accesses for those pages are out-of-bounds.
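
To make the mismatch concrete, here is a minimal sketch of the three entry counts involved.  The function names and SKETCH_PAGE_SIZE are hypothetical stand-ins chosen for illustration, not identifiers from dsa.c, and the page size of 4096 is an assumption that matches the figures used later in this report.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096	/* assumed stand-in for FPM_PAGE_SIZE */

/* Path 1: entries derived from the whole segment, metadata included. */
static size_t
sketch_entries_geometric(size_t total_size)
{
	return total_size / SKETCH_PAGE_SIZE;
}

/* Path 2 as described above: one entry per usable page only. */
static size_t
sketch_entries_odd_sized(size_t usable_pages)
{
	return usable_pages;
}

/*
 * Entries needed when the pagemap is indexed by absolute page number:
 * the highest index is metadata_pages + usable_pages - 1.
 */
static size_t
sketch_entries_needed(size_t metadata_pages, size_t usable_pages)
{
	assert(metadata_pages >= 1);
	return metadata_pages + usable_pages;
}
```

For any segment, sketch_entries_odd_sized(usable_pages) falls short of sketch_entries_needed(metadata_pages, usable_pages) by exactly metadata_pages entries.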

 

== How It Was Found on Postgres 15 ==

 

Multiple parallel worker backends crashed simultaneously with SIGSEGV in
dsa_get_address(), called from dsa_free() in ExecHashTableDetachBatch()
during parallel hash join batch cleanup.

 

Stack trace:

  #0 dsa_get_address (area, dp) at dsa.c:955
  #1 dsa_free (area, dp) at dsa.c:839
  #2 ExecHashTableDetachBatch (hashtable) at nodeHash.c:3189
  #3 ExecParallelHashJoinNewBatch (hjstate) at nodeHashjoin.c:1157

 

All crashing workers computed the same pageno (196,993), which was the last
valid FPM page but lay beyond the end of the pagemap array
(usable_pages = 196,609).

 

== Root Cause (from core dump analysis) ==

 

The crashing segment had:

  usable_pages   = 196,609   (pagemap has this many entries)
  metadata_pages = 385
  total_pages    = 196,994   (= metadata_pages + usable_pages)
  last FPM page  = 196,993   (= metadata_pages + usable_pages - 1)

  pagemap valid indices: 0 .. 196,608
  FPM page indices:      385 .. 196,993

 

Pages 196,609 through 196,993 (385 pages) have no valid pagemap entry.
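
The count of uncovered pages can be checked with a few lines of C.  This is only a sketch: sketch_uncovered_pages is a hypothetical helper written for this email, not a function in dsa.c.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the FPM page indices, metadata_pages .. metadata_pages +
 * usable_pages - 1, that do not fit in a pagemap with pagemap_entries
 * slots (valid indices 0 .. pagemap_entries - 1).
 */
static size_t
sketch_uncovered_pages(size_t metadata_pages, size_t usable_pages,
					   size_t pagemap_entries)
{
	size_t		end = metadata_pages + usable_pages;	/* one past last page */
	size_t		first_bad;

	assert(usable_pages > 0);

	/* The first uncovered index cannot be below the first FPM page. */
	first_bad = pagemap_entries > metadata_pages ? pagemap_entries
		: metadata_pages;

	return end > first_bad ? end - first_bad : 0;
}
```

With the values from the core dump, sketch_uncovered_pages(385, 196609, 196609) returns 385, and it returns 0 once the pagemap has 196,994 entries.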

 

The pagemap array ends 3,072 bytes before the data area starts (padding
zone).  Most out-of-bounds entries fall in this padding and cause silent
corruption.  The last 17 entries fall in the data area itself, causing
bidirectional corruption: pagemap writes destroy allocated object data,
and subsequent pagemap reads return garbage, crashing dsa_free().

 

== The Fix ==

 

After computing metadata_bytes with usable_pages pagemap entries, add
entries for the metadata pages themselves:

  metadata_bytes +=
      ((metadata_bytes / (FPM_PAGE_SIZE - sizeof(dsa_pointer))) + 1) *
      sizeof(dsa_pointer);

 

The divisor (FPM_PAGE_SIZE - sizeof(dsa_pointer)) is 4088 with an 8-byte
dsa_pointer.  It accounts for the circular dependency: each metadata page
costs one pagemap entry (8 bytes), so only 4088 bytes of each 4096-byte
metadata page are net-free for other pagemap entries.  The +1 stands in
for the ceiling of the integer division.
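
The arithmetic of the fix can be checked standalone.  The sketch below is not the patch itself: the SKETCH_* constants assume a 4096-byte page and an 8-byte dsa_pointer, and header_bytes is a hypothetical parameter standing in for the fixed-size metadata that precedes the pagemap.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE	4096	/* assumed stand-in for FPM_PAGE_SIZE */
#define SKETCH_ENTRY_SIZE	8		/* assumed sizeof(dsa_pointer) */

/* Round a byte count up to a whole number of pages. */
static size_t
sketch_pages_for(size_t bytes)
{
	return (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* Unpadded metadata size after adding entries for the metadata pages. */
static size_t
sketch_metadata_bytes_with_fix(size_t header_bytes, size_t usable_pages)
{
	size_t		metadata_bytes = header_bytes +
		usable_pages * SKETCH_ENTRY_SIZE;

	/* Same expression as the fix, with the assumed constants. */
	metadata_bytes +=
		((metadata_bytes / (SKETCH_PAGE_SIZE - SKETCH_ENTRY_SIZE)) + 1) *
		SKETCH_ENTRY_SIZE;

	return metadata_bytes;
}

/* Number of pagemap entries that fit after the fixed-size header. */
static size_t
sketch_pagemap_capacity(size_t header_bytes, size_t metadata_bytes)
{
	assert(metadata_bytes >= header_bytes);
	return (metadata_bytes - header_bytes) / SKETCH_ENTRY_SIZE;
}
```

For example, with a hypothetical header_bytes of 1016 and usable_pages = 196,609, the metadata grows to 386 pages and the pagemap has room for 196,995 entries, which is exactly 386 + 196,609.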

 

 

== Repro ==

It is tricky to write a test, or standalone SQL, that reliably makes the
bug manifest.  This is a long-dormant bug that only causes a crash in rare
circumstances.

 

--

Thanks,

Paul

