Re: PG18 GIN parallel index build crash - invalid memory alloc request size

From Tomas Vondra
Subject Re: PG18 GIN parallel index build crash - invalid memory alloc request size
Msg-id dea2cde1-a5ba-428e-9d20-941de6f71050@vondra.me
In response to PG18 GIN parallel index build crash - invalid memory alloc request size  (Gregory Smith <gregsmithpgsql@gmail.com>)
Responses Re: PG18 GIN parallel index build crash - invalid memory alloc request size
List pgsql-hackers
Hi,

On 10/24/25 05:03, Gregory Smith wrote:
> Testing PostgreSQL 18.0 on Debian from PGDG repo:  18.0-1.pgdg12+3 with
> PostGIS 3.6.0+dfsg-2.pgdg12+1.  Running the osm2pgsql workload to load
> the entire OSM Planet data set in my home lab system.
> 
> I found a weird crash during the recently adjusted parallel GIN index
> building code.  There are 2 parallel workers spawning, one of them
> crashes then everything terminates.  This is one of the last steps in
> OSM loading, I can reproduce just by trying the one statement again:
> 
> gis=# CREATE INDEX ON "public"."planet_osm_polygon" USING GIN (tags);
> ERROR:  invalid memory alloc request size 1113001620
> 
> I see that this area of the code was just being triaged during early
> beta time in May, may need another round.
> 
> The table is 215 GB.  Server has 128GB and only 1/3 is nailed down,
> there's plenty of RAM available.
> 
> Settings include:
> work_mem=1GB
> maintenance_work_mem=20GB
> shared_buffers=48GB
> max_parallel_workers_per_gather = 8
> 

Hmmm, I wonder if the m_w_m is high enough to confuse the trimming logic
in some way. Can you check whether a smaller m_w_m (maybe 128MB-256MB,
e.g. SET maintenance_work_mem = '256MB' in the session before the CREATE
INDEX) makes the issue go away?

> Log files show a number of similarly big allocations working before
> then, here's an example:
> 
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp161831.0.fileset/0.1", size 1073741824
> STATEMENT:  CREATE INDEX ON "public"."planet_osm_polygon" USING BTREE (osm_id)
> ERROR:  invalid memory alloc request size 1137667788
> STATEMENT:  CREATE INDEX ON "public"."planet_osm_polygon" USING GIN (tags)
> CONTEXT:  parallel worker
> 

But that btree allocation is exactly 1GB, which is the palloc limit.
IIRC the tuplesort code uses the "huge" allocation variants, which is
probably why it works fine. The GIN code, on the other hand, does a
plain repalloc(), so it's subject to the MaxAllocSize limit.
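
Roughly speaking (an illustrative snippet, not the actual tuplesort/GIN
code):

#include "postgres.h"
#include "utils/memutils.h"		/* MaxAllocSize */

/*
 * Illustration only: plain repalloc() rejects requests above MaxAllocSize
 * (0x3fffffff, just under 1GB) with "invalid memory alloc request size",
 * while repalloc_huge() accepts sizes up to MaxAllocHugeSize.
 */
static char *
grow_buffer(char *buf, Size newsize)
{
	if (newsize > MaxAllocSize)
		return repalloc_huge(buf, newsize);	/* no 1GB cap */

	return repalloc(buf, newsize);			/* capped at MaxAllocSize */
}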

> And another one to show the size at crash is a little different each time:
> ERROR: Database error: ERROR:  invalid memory alloc request size 1115943018
> 
> Hooked into the error message and it gave this stack trace:
> 
> #0  errfinish (filename=0x5646de247420 "./build/../src/backend/utils/mmgr/mcxt.c",
>     lineno=1174, funcname=0x5646de2477d0 <__func__.3> "MemoryContextSizeFailure")
>     at ./build/../src/backend/utils/error/elog.c:476
> #1  0x00005646ddb4ae9c in MemoryContextSizeFailure (
>     context=context@entry=0x56471ce98c90, size=size@entry=1136261136,
>     flags=flags@entry=0) at ./build/../src/backend/utils/mmgr/mcxt.c:1174
> #2  0x00005646de05898d in MemoryContextCheckSize (flags=0, size=1136261136,
>     context=0x56471ce98c90) at ./build/../src/include/utils/memutils_internal.h:172
> #3  MemoryContextCheckSize (flags=0, size=1136261136, context=0x56471ce98c90)
>     at ./build/../src/include/utils/memutils_internal.h:167
> #4  AllocSetRealloc (pointer=0x7f34f558b040, size=1136261136, flags=0)
>     at ./build/../src/backend/utils/mmgr/aset.c:1203
> #5  0x00005646ddb701c8 in GinBufferStoreTuple (buffer=0x56471cee0d10,
>     tup=0x7f34dfdd2030) at ./build/../src/backend/access/gin/gininsert.c:1497
> #6  0x00005646ddb70503 in _gin_process_worker_data (progress=<optimized out>,
>     worker_sort=0x56471cf13638, state=0x7ffc288b0200)
>     at ./build/../src/backend/access/gin/gininsert.c:1926
> #7  _gin_parallel_scan_and_build (state=state@entry=0x7ffc288b0200,
>     ginshared=ginshared@entry=0x7f4168a5d360,
>     sharedsort=sharedsort@entry=0x7f4168a5d300, heap=heap@entry=0x7f41686e5280,
>     index=index@entry=0x7f41686e4738, sortmem=<optimized out>,
>     progress=<optimized out>) at ./build/../src/backend/access/gin/gininsert.c:2046
> #8  0x00005646ddb71ebf in _gin_parallel_build_main (seg=<optimized out>,
>     toc=0x7f4168a5d000) at ./build/../src/backend/access/gin/gininsert.c:2159
> #9  0x00005646ddbdf882 in ParallelWorkerMain (main_arg=<optimized out>)
>     at ./build/../src/backend/access/transam/parallel.c:1563
> #10 0x00005646dde40670 in BackgroundWorkerMain (startup_data=<optimized out>,
>     startup_data_len=<optimized out>)
>     at ./build/../src/backend/postmaster/bgworker.c:843
> #11 0x00005646dde42a45 in postmaster_child_launch (
>     child_type=child_type@entry=B_BG_WORKER, child_slot=320,
>     startup_data=startup_data@entry=0x56471cdbc8f8,
>     startup_data_len=startup_data_len@entry=1472, client_sock=client_sock@entry=0x0)
>     at ./build/../src/backend/postmaster/launch_backend.c:290
> #12 0x00005646dde44265 in StartBackgroundWorker (rw=0x56471cdbc8f8)
>     at ./build/../src/backend/postmaster/postmaster.c:4157
> #13 maybe_start_bgworkers () at ./build/../src/backend/postmaster/postmaster.c:4323
> #14 0x00005646dde45b13 in LaunchMissingBackgroundProcesses ()
>     at ./build/../src/backend/postmaster/postmaster.c:3397
> #15 ServerLoop () at ./build/../src/backend/postmaster/postmaster.c:1717
> #16 0x00005646dde47f6d in PostmasterMain (argc=argc@entry=5,
>     argv=argv@entry=0x56471cd66dc0)
>     at ./build/../src/backend/postmaster/postmaster.c:1400
> #17 0x00005646ddb4d56c in main (argc=5, argv=0x56471cd66dc0)
>     at ./build/../src/backend/main/main.c:227
> 
> I've frozen my testing at the spot where I can reproduce the problem.  I
> was going to try dropping m_w_m next and turning off the parallel
> execution.  I didn't want to touch anything until after asking if
> there's more data that should be collected from a crashing instance.
> 

Hmm, so it's failing on the repalloc in GinBufferStoreTuple(), which is
merging the "GinTuple" into an in-memory buffer. I'll take a closer look
once I get back from pgconf.eu, but I guess I failed to consider that
the "parts" may be large enough to exceed MaxAllocSize.

The code tries to flush the "frozen" part of the TID list, i.e. the part
that can no longer change, but I think with m_w_m this large it can
happen that the first two buffers are already too large (and the
trimming happens only after the fact).
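
FWIW the reported sizes are consistent with that. Assuming the failed
request is the buffer's TID array (6-byte ItemPointerData entries),
1136261136 / 6 = 189376856 item pointers, i.e. the array for a single
buffer already needs ~1.1GB, past MaxAllocSize (1073741823 bytes). The
other reported sizes (1113001620, 1115943018, 1137667788) are multiples
of 6 too.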

Can you show the contents of buffer and tup? I'm especially interested
in these fields:

  buffer->nitems
  buffer->maxitems
  buffer->nfrozen
  tup->nitems

If I'm right, I think there are two ways to fix this:

(1) apply the trimming earlier, i.e. try to freeze + flush before
actually merging the data (essentially, update nfrozen earlier)

(2) use repalloc_huge (and the huge palloc variant) in
GinBufferStoreTuple (see the sketch below)

Or we probably should do both.
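
For (2), the change would be roughly this (just a sketch from memory,
not a tested patch - I'm assuming the field names items/maxitems/nitems
here):

/*
 * Sketch of (2) in GinBufferStoreTuple(): grow the TID array with the
 * huge allocator, so a single buffer may legally exceed MaxAllocSize.
 * Field names (items, maxitems, nitems) are assumed, not verified.
 */
if (buffer->nitems + tup->nitems > buffer->maxitems)
{
	while (buffer->nitems + tup->nitems > buffer->maxitems)
		buffer->maxitems *= 2;

	buffer->items = repalloc_huge(buffer->items,
								  buffer->maxitems * sizeof(ItemPointerData));
}

With (1) applied too, buffers should rarely get this large in the first
place; (2) on its own merely makes the big allocations legal.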


regards

-- 
Tomas Vondra



