BUG #18334: Segfault when running a query with parallel workers

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      18334
Logged by:          Marcin Barczyński
Email address:      mba.ogolny@gmail.com
PostgreSQL version: 13.13
Operating system:   Ubuntu 22.04.3 LTS
Description:

Obfuscated query:

WITH dt1 AS (
          SELECT
          right(d.p, -length('STR1') -1) || 'STR4' || f.n AS p1
          FROM dc d
            INNER JOIN fc f ON f.pid = d.id
              AND f.vid = d.vid
            WHERE f.vid = func1('STR2')
              AND d.aids && ARRAY[(
                  SELECT id from dc
                  WHERE p = 'STR1' AND vid = func1('STR2')
              )]
              AND right(d.p, -length('STR1') -1) || 'STR4' || f.n != ''
        ), dt2 AS (
          SELECT
          d.p || 'STR4' || f.n AS p2
          FROM dc d
            INNER JOIN fc f ON f.pid = d.id
              AND f.vid = d.vid
            WHERE f.vid = func1('STR3')
              AND d.aids && ARRAY[(
                  SELECT id from dc
                  WHERE p = '' AND vid = func1('STR3')
              )]
              AND d.p || 'STR4' || f.n != ''

        )
        SELECT dt2.p2
                  FROM dt1 RIGHT OUTER JOIN dt2 ON p1 = p2
                  WHERE p1 IS NULL;

Log messages:

2024-02-03 09:16:33.798 EST [3261686-102] app= LOG:  background worker
"parallel worker" (PID 2387431) was terminated by signal 11: Segmentation
fault
2024-02-03 09:16:33.798 EST [3261686-103] app= DETAIL:  Failed process was
running: set max_parallel_workers=8; set work_mem='20GB'; 

Backtrace:

#0  0x0000557ba04345ac in dsa_get_address (area=0x557ba22e9668,
dp=<optimized out>) at
utils/mmgr/./build/../src/backend/utils/mmgr/dsa.c:955
#1  0x0000557ba014ec21 in ExecParallelHashNextTuple (tuple=0x7fc42a891560,
hashtable=0x557ba233dcb8) at
executor/./build/../src/backend/executor/nodeHash.c:3272
#2  ExecParallelScanHashBucket (hjstate=0x557ba22fdf28,
econtext=0x557ba22fddf0) at
executor/./build/../src/backend/executor/nodeHash.c:2059
#3  0x0000557ba01514b5 in ExecHashJoinImpl (parallel=<optimized out>,
pstate=<optimized out>) at
executor/./build/../src/backend/executor/nodeHashjoin.c:455
#4  ExecParallelHashJoin (pstate=<optimized out>) at
executor/./build/../src/backend/executor/nodeHashjoin.c:637
#5  0x0000557ba013547d in ExecProcNodeInstr (node=0x557ba22fdf28) at
executor/./build/../src/backend/executor/execProcnode.c:467
#6  0x0000557ba012b03d in ExecProcNode (node=0x557ba22fdf28) at
executor/./build/../src/include/executor/executor.h:248
#7  ExecutePlan (execute_once=<optimized out>, dest=0x557ba2281a78,
direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>,
operation=CMD_SELECT, use_parallel_mode=<optimized out>,
planstate=0x557ba22fdf28, estate=0x557ba22c1008)
    at executor/./build/../src/backend/executor/execMain.c:1632
#8  standard_ExecutorRun (queryDesc=0x557ba22d17c0, direction=<optimized
out>, count=0, execute_once=<optimized out>) at
executor/./build/../src/backend/executor/execMain.c:350
#9  0x00007fc42a976f25 in pgss_ExecutorRun (queryDesc=0x557ba22d17c0,
direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at
./build/../contrib/pg_stat_statements/pg_stat_statements.c:1045
#10 0x00007fc42e5d56d2 in explain_ExecutorRun (queryDesc=0x557ba22d17c0,
direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at
./build/../contrib/auto_explain/auto_explain.c:334
#11 0x0000557ba0131ba9 in ExecutorRun (execute_once=true, count=<optimized
out>, direction=ForwardScanDirection, queryDesc=0x557ba22d17c0) at
executor/./build/../src/backend/executor/execMain.c:292
#12 ParallelQueryMain (seg=seg@entry=0x557ba2239b18,
toc=toc@entry=0x7fc42a890000) at
executor/./build/../src/backend/executor/execParallel.c:1448
#13 0x0000557b9fff010e in ParallelWorkerMain (main_arg=<optimized out>) at
access/transam/./build/../src/backend/access/transam/parallel.c:1494
#14 0x0000557ba0231ada in StartBackgroundWorker () at
postmaster/./build/../src/backend/postmaster/bgworker.c:890
#15 0x0000557ba0241ffe in do_start_bgworker (rw=<optimized out>) at
postmaster/./build/../src/backend/postmaster/postmaster.c:5896
#16 maybe_start_bgworkers () at
postmaster/./build/../src/backend/postmaster/postmaster.c:6121
#17 0x0000557ba024224d in sigusr1_handler (postgres_signal_arg=<optimized
out>) at postmaster/./build/../src/backend/postmaster/postmaster.c:5281
#18 <signal handler called>
#19 0x00007fc42d65959d in __GI___select (nfds=nfds@entry=8,
readfds=readfds@entry=0x7ffda2d1ba20, writefds=writefds@entry=0x0,
exceptfds=exceptfds@entry=0x0, timeout=timeout@entry=0x7ffda2d1b980) at
../sysdeps/unix/sysv/linux/select.c:69
#20 0x0000557ba02433d6 in ServerLoop () at
postmaster/./build/../src/backend/postmaster/postmaster.c:1706
#21 0x0000557ba02450e5 in PostmasterMain (argc=5, argv=<optimized out>) at
postmaster/./build/../src/backend/postmaster/postmaster.c:1415
#22 0x0000557b9ff5a017 in main (argc=5, argv=0x557ba2121300) at
main/./build/../src/backend/main/main.c:210

It happens non-deterministically but frequently in our environment.

I have a core dump and will gladly send additional info if needed.


Re: BUG #18334: Segfault when running a query with parallel workers

From
Tom Lane
Date:
PG Bug reporting form <noreply@postgresql.org> writes:
> Log messages:

> 2024-02-03 09:16:33.798 EST [3261686-102] app= LOG:  background worker
> "parallel worker" (PID 2387431) was terminated by signal 11: Segmentation
> fault
> 2024-02-03 09:16:33.798 EST [3261686-103] app= DETAIL:  Failed process was
> running: set max_parallel_workers=8; set work_mem='20GB'; 

It's hard to do anything with just the query.  Can you put together a
self-contained test case, including table definitions and some sample
data?  (The data most likely could be dummy generated data.)
It would also be useful to know what non-default settings you are
using.

            regards, tom lane



Re: BUG #18334: Segfault when running a query with parallel workers

From
Marcin Barczyński
Date:
On Tue, Feb 6, 2024 at 2:51 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> PG Bug reporting form <noreply@postgresql.org> writes:
> > Log messages:
>
> > 2024-02-03 09:16:33.798 EST [3261686-102] app= LOG:  background worker
> > "parallel worker" (PID 2387431) was terminated by signal 11: Segmentation
> > fault
> > 2024-02-03 09:16:33.798 EST [3261686-103] app= DETAIL:  Failed process was
> > running: set max_parallel_workers=8; set work_mem='20GB';
>
> It's hard to do anything with just the query.  Can you put together a
> self-contained test case, including table definitions and some sample
> data?  (The data most likely could be dummy generated data.)

No, not really. This issue happens on a production machine and a large
volume of data (terabytes) is likely the cause of the error.

Regards,
Marcin Barczyński



Re: BUG #18334: Segfault when running a query with parallel workers

From
Thomas Munro
Date:
On Wed, Feb 7, 2024 at 12:30 AM Marcin Barczyński <mba.ogolny@gmail.com> wrote:
> On Tue, Feb 6, 2024 at 2:51 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > PG Bug reporting form <noreply@postgresql.org> writes:
> > > 2024-02-03 09:16:33.798 EST [3261686-102] app= LOG:  background worker
> > > "parallel worker" (PID 2387431) was terminated by signal 11: Segmentation
> > > fault
> > > 2024-02-03 09:16:33.798 EST [3261686-103] app= DETAIL:  Failed process was
> > > running: set max_parallel_workers=8; set work_mem='20GB';
> >
> > It's hard to do anything with just the query.  Can you put together a
> > self-contained test case, including table definitions and some sample
> > data?  (The data most likely could be dummy generated data.)
>
> No, not really. This issue happens on a production machine and a large
> volume of data (terabytes) is likely the cause of the error.

Hi,

Could you please show EXPLAIN ANALYZE for the query?  In gdb from that
core, can you please show "info proc mappings", and in frame 0 "print
*area", and in frame 1, "print *tuple" and "print *hashtable"?



Re: BUG #18334: Segfault when running a query with parallel workers

From
Marcin Barczyński
Date:
Hi Thomas,

On Sun, Feb 11, 2024 at 10:31 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> Could you please show EXPLAIN ANALYZE for the query?  In gdb from that
> core, can you please show "info proc mappings", and in frame 0 "print
> *area", and in frame 1, "print *tuple" and "print *hashtable"?

I'm sorry for my late reply.
It happened again, and I'm pasting the info you requested from the core dump.
This is PostgreSQL 13.15.

Stack trace:

#0  0x000056134d5bb011 in dsa_free (area=0x56134e07d718, dp=<optimized
out>) at utils/mmgr/./build/../src/backend/utils/mmgr/dsa.c:840
840 utils/mmgr/./build/../src/backend/utils/mmgr/dsa.c: No such file
or directory.
(gdb) bt
#0  0x000056134d5bb011 in dsa_free (area=0x56134e07d718, dp=<optimized
out>) at utils/mmgr/./build/../src/backend/utils/mmgr/dsa.c:840
#1  0x000056134d2d6a0c in ExecHashTableDetachBatch
(hashtable=hashtable@entry=0x56134e154540) at
executor/./build/../src/backend/executor/nodeHash.c:3181
#2  0x000056134d2d821a in ExecParallelHashJoinNewBatch
(hjstate=0x56134e087b48) at
executor/./build/../src/backend/executor/nodeHashjoin.c:1131
#3  ExecHashJoinImpl (parallel=<optimized out>, pstate=<optimized
out>) at executor/./build/../src/backend/executor/nodeHashjoin.c:590
#4  ExecParallelHashJoin (pstate=<optimized out>) at
executor/./build/../src/backend/executor/nodeHashjoin.c:637
#5  0x000056134d2bbffd in ExecProcNodeInstr (node=0x56134e087b48) at
executor/./build/../src/backend/executor/execProcnode.c:467
#6  0x000056134d2b1bbd in ExecProcNode (node=0x56134e087b48) at
executor/./build/../src/include/executor/executor.h:248
#7  ExecutePlan (execute_once=<optimized out>, dest=0x56134dfe1fe8,
direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>,
operation=CMD_SELECT, use_parallel_mode=<optimized out>,
    planstate=0x56134e087b48, estate=0x56134e087858) at
executor/./build/../src/backend/executor/execMain.c:1632
#8  standard_ExecutorRun (queryDesc=0x56134e0783e0,
direction=<optimized out>, count=0, execute_once=<optimized out>) at
executor/./build/../src/backend/executor/execMain.c:350
#9  0x00007f3a734c9f25 in pgss_ExecutorRun (queryDesc=0x56134e0783e0,
direction=ForwardScanDirection, count=0, execute_once=<optimized out>)
    at ./build/../contrib/pg_stat_statements/pg_stat_statements.c:1045
#10 0x00007f3a771296d2 in explain_ExecutorRun
(queryDesc=0x56134e0783e0, direction=ForwardScanDirection, count=0,
execute_once=<optimized out>)
    at ./build/../contrib/auto_explain/auto_explain.c:334
#11 0x000056134d2b8729 in ExecutorRun (execute_once=true,
count=<optimized out>, direction=ForwardScanDirection,
queryDesc=0x56134e0783e0)
    at executor/./build/../src/backend/executor/execMain.c:292
#12 ParallelQueryMain (seg=seg@entry=0x56134df98db8,
toc=toc@entry=0x7f321dfa4000) at
executor/./build/../src/backend/executor/execParallel.c:1448
#13 0x000056134d1767ce in ParallelWorkerMain (main_arg=<optimized
out>) at access/transam/./build/../src/backend/access/transam/parallel.c:1494
#14 0x000056134d3b981a in StartBackgroundWorker () at
postmaster/./build/../src/backend/postmaster/bgworker.c:890
#15 0x000056134d3c963e in do_start_bgworker (rw=<optimized out>) at
postmaster/./build/../src/backend/postmaster/postmaster.c:5896
#16 maybe_start_bgworkers () at
postmaster/./build/../src/backend/postmaster/postmaster.c:6121
#17 0x000056134d3c988d in sigusr1_handler
(postgres_signal_arg=<optimized out>) at
postmaster/./build/../src/backend/postmaster/postmaster.c:5281
#18 <signal handler called>
#19 0x00007f3a761ac59d in __GI___select (nfds=nfds@entry=8,
readfds=readfds@entry=0x7fff97c44720, writefds=writefds@entry=0x0,
exceptfds=exceptfds@entry=0x0, timeout=timeout@entry=0x7fff97c44680)
    at ../sysdeps/unix/sysv/linux/select.c:69
#20 0x000056134d3caa16 in ServerLoop () at
postmaster/./build/../src/backend/postmaster/postmaster.c:1706
#21 0x000056134d3cc725 in PostmasterMain (argc=5, argv=<optimized
out>) at postmaster/./build/../src/backend/postmaster/postmaster.c:1415
#22 0x000056134d0e0377 in main (argc=5, argv=0x56134de8d300) at
main/./build/../src/backend/main/main.c:210


(gdb) info proc mappings
Mapped address spaces:

          Start Addr           End Addr       Size     Offset objfile
      0x56134cfab000     0x56134d068000    0xbd000        0x0
/usr/lib/postgresql/13/bin/postgres
      0x56134d068000     0x56134d60b000   0x5a3000    0xbd000
/usr/lib/postgresql/13/bin/postgres
      0x56134d60b000     0x56134d827000   0x21c000   0x660000
/usr/lib/postgresql/13/bin/postgres
      0x56134d827000     0x56134d845000    0x1e000   0x87b000
/usr/lib/postgresql/13/bin/postgres
      0x56134d845000     0x56134d854000     0xf000   0x899000
/usr/lib/postgresql/13/bin/postgres
      0x7f2e9599e000     0x7f2f1599e000 0x80000000        0x0
/dev/shm/PostgreSQL.940706000


(gdb) print *area
$1 = {control = 0x7f321dfa4500, mapping_pinned = false, segment_maps =
{{segment = 0x0, mapped_address = 0x7f321dfa4500 "", header =
0x7f321dfa4500, fpm = 0x7f321dfa5d20,
      pagemap = 0x7f321dfa6168}, {segment = 0x56134dfa1ec8,
mapped_address = 0x7f3216cd8000 "", header = 0x7f3216cd8000, fpm =
0x7f3216cd8038, pagemap = 0x7f3216cd8480}, {
      segment = 0x56134dfa1f18, mapped_address = 0x7f31f6bd7000 "",
header = 0x7f31f6bd7000, fpm = 0x7f31f6bd7038, pagemap =
0x7f31f6bd7480}, {segment = 0x56134dfa2078,
      mapped_address = 0x7f30d60a6000 "", header = 0x7f30d60a6000, fpm
= 0x7f30d60a6038, pagemap = 0x7f30d60a6480}, {segment =
0x56134dfa2118, mapped_address = 0x7f30d58a6000 "",
      header = 0x7f30d58a6000, fpm = 0x7f30d58a6038, pagemap =
0x7f30d58a6480}, {segment = 0x56134dfa20c8, mapped_address =
0x7f30d5ca6000 "", header = 0x7f30d5ca6000, fpm = 0x7f30d5ca6038,
      pagemap = 0x7f30d5ca6480}, {segment = 0x56134dfa2168,
mapped_address = 0x7f30d50a6000 "", header = 0x7f30d50a6000, fpm =
0x7f30d50a6038, pagemap = 0x7f30d50a6480}, {
      segment = 0x56134dfa21b8, mapped_address = 0x7f30d449e000 "",
header = 0x7f30d449e000, fpm = 0x7f30d449e038, pagemap =
0x7f30d449e480}, {segment = 0x56134dfa2208,
      mapped_address = 0x7f30d2c90000 "", header = 0x7f30d2c90000, fpm
= 0x7f30d2c90038, pagemap = 0x7f30d2c90480}, {segment =
0x56134dfa2258, mapped_address = 0x7f30cfc76000 "",
      header = 0x7f30cfc76000, fpm = 0x7f30cfc76038, pagemap =
0x7f30cfc76480}, {segment = 0x56134ee12048, mapped_address =
0x7f307599e000 "", header = 0x7f307599e000, fpm = 0x7f307599e038,
      pagemap = 0x7f307599e480}, {segment = 0x56134ee11ff8,
mapped_address = 0x7f307b9d0000 "", header = 0x7f307b9d0000, fpm =
0x7f307b9d0038, pagemap = 0x7f307b9d0480}, {
      segment = 0x56134ee11fa8, mapped_address = 0x7f3087a32000 "",
header = 0x7f3087a32000, fpm = 0x7f3087a32038, pagemap =
0x7f3087a32480}, {segment = 0x56134dfa2dd8,
      mapped_address = 0x7f309faf4000 "", header = 0x7f309faf4000, fpm
= 0x7f309faf4038, pagemap = 0x7f309faf4480}, {segment =
0x56134dfa1fb8, mapped_address = 0x7f30d62d3000 "",
      header = 0x7f30d62d3000, fpm = 0x7f30d62d3038, pagemap =
0x7f30d62d3480}, {segment = 0x56134dfa1f68, mapped_address =
0x7f31365d5000 "", header = 0x7f31365d5000, fpm = 0x7f31365d5038,
      pagemap = 0x7f31365d5480}, {segment = 0x56134ee12098,
mapped_address = 0x7f306599e000 "", header = 0x7f306599e000, fpm =
0x7f306599e038, pagemap = 0x7f306599e480}, {
      segment = 0x56134ee120e8, mapped_address = 0x7f305599e000 "",
header = 0x7f305599e000, fpm = 0x7f305599e038, pagemap =
0x7f305599e480}, {segment = 0x56134ee12138,
      mapped_address = 0x7f303599e000 "", header = 0x7f303599e000, fpm
= 0x7f303599e038, pagemap = 0x7f303599e480}, {segment =
0x56134ee12188, mapped_address = 0x7f301599e000 "",
      header = 0x7f301599e000, fpm = 0x7f301599e038, pagemap =
0x7f301599e480}, {segment = 0x56134ee121d8, mapped_address =
0x7f2fd599e000 "", header = 0x7f2fd599e000, fpm = 0x7f2fd599e038,
      pagemap = 0x7f2fd599e480}, {segment = 0x56134ee12228,
mapped_address = 0x7f2f9599e000 "", header = 0x7f2f9599e000, fpm =
0x7f2f9599e038, pagemap = 0x7f2f9599e480}, {
      segment = 0x56134ee12278, mapped_address = 0x7f2f1599e000 "",
header = 0x7f2f1599e000, fpm = 0x7f2f1599e038, pagemap =
0x7f2f1599e480}, {segment = 0x56134ee122c8,
      mapped_address = 0x7f2e9599e000 "", header = 0x7f2e9599e000, fpm
= 0x7f2e9599e038, pagemap = 0x7f2e9599e480}, {segment = 0x0,
mapped_address = 0x0, header = 0x0, fpm = 0x0,
      pagemap = 0x0} <repeats 1000 times>}, high_segment_index = 23,
freed_segment_counter = 0}


(gdb) frame 1
(gdb) print *hashtable
$2 = {nbuckets = 67108864, log2_nbuckets = 26, nbuckets_original =
67108864, nbuckets_optimal = 67108864, log2_nbuckets_optimal = 26,
buckets = {unshared = 0x7f31f6cd8000,
    shared = 0x7f31f6cd8000}, keepNulls = false, skewEnabled = false,
skewBucket = 0x0, skewBucketLen = 0, nSkewBuckets = 0, skewBucketNums
= 0x0, nbatch = 1, curbatch = 0, nbatch_original = 1,
  nbatch_outstart = 1, growEnabled = true, totalTuples = 65785362,
partialTuples = 5057580, skewTuples = 0, innerBatchFile = 0x0,
outerBatchFile = 0x0, outer_hashfunctions = 0x56134e1e04b8,
  inner_hashfunctions = 0x56134e1e0508, hashStrict = 0x56134e1e0558,
collations = 0x56134e1e0570, spaceUsed = 0, spaceAllowed =
13958643712, spacePeak = 0, spaceUsedSkew = 0,
  spaceAllowedSkew = 279172874, hashCxt = 0x56134e1e03a0, batchCxt =
0x56134e1e23b0, chunks = 0x0, current_chunk = 0x0, area =
0x56134e07d718, parallel_state = 0x7f321dfa4400,
  batches = 0x56134e1e07f8, current_chunk_shared = 0}


This is the code where the crash happened:

https://github.com/postgres/postgres/blob/8e5faba4b918ba6142339c8f55eaa4eb99776a03/src/backend/utils/mmgr/dsa.c#L835-L840:

/* Locate the object, span and pool. */
segment_map = get_segment_by_index(area, DSA_EXTRACT_SEGMENT_NUMBER(dp));
pageno = DSA_EXTRACT_OFFSET(dp) / FPM_PAGE_SIZE;
span_pointer = segment_map->pagemap[pageno];
span = dsa_get_address(area, span_pointer);
superblock = dsa_get_address(area, span->start);

(gdb) print *segment_map
$4 = {segment = 0x56134dfa2dd8, mapped_address = 0x7f309faf4000 "",
header = 0x7f309faf4000, fpm = 0x7f309faf4038, pagemap =
0x7f309faf4480}

(gdb) print pageno
$5 = 196979

(gdb) print span_pointer
$6 = 0

It looks like if `span_pointer` is 0, `span` is NULL, and dereferencing
`span->start` causes the segfault.
`span_pointer` is 0 because all the `segment_map->pagemap` entries are
zeros (see the annotated sketch after these prints):

(gdb) print segment_map->pagemap[0]
$10 = 0
(gdb) print segment_map->pagemap[1]
$11 = 0
(gdb) print segment_map->pagemap[2]
$12 = 0
(gdb) print segment_map->pagemap[265]
$14 = 0
(gdb) print segment_map->pagemap[187387]
$15 = 0
(gdb) print segment_map->pagemap[196979]
$16 = 0
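
Putting the observed values back into that excerpt (just an annotated
sketch, not compilable on its own; as far as I can tell dsa_get_address()
turns an invalid, i.e. zero, dsa_pointer into NULL):

/* values seen in this core dump */
pageno       = DSA_EXTRACT_OFFSET(dp) / FPM_PAGE_SIZE;  /* 196979               */
span_pointer = segment_map->pagemap[pageno];             /* 0 == InvalidDsaPointer */
span         = dsa_get_address(area, span_pointer);      /* NULL                 */
superblock   = dsa_get_address(area, span->start);       /* NULL deref -> SIGSEGV */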


Regards,
Marcin Barczyński



Re: BUG #18334: Segfault when running a query with parallel workers

From
Thomas Munro
Date:
On Thu, May 23, 2024 at 11:59 PM Marcin Barczyński <mba.ogolny@gmail.com> wrote:
> (gdb) print *segment_map
> $4 = {segment = 0x56134dfa2dd8, mapped_address = 0x7f309faf4000 "",
> header = 0x7f309faf4000, fpm = 0x7f309faf4038, pagemap =
> 0x7f309faf4480}
>
> (gdb) print pageno
> $5 = 196979

Hmm.  Page 196979 is an offset of around 769MB within the segment
(pages here are 4k).  What does segment_map->segment->mapped_size
show?  It's OK for the pagemap to contain zeroes, but it should
contain non-zero values for pages that contain the start of an
allocated object.  The actual dsa_pointer has been optimised out but
should be visible from frame #1 as batch->chunks.  I think its higher
24 bits should contain 13 (the element of area->segment_maps that
seems to correspond to the above), and its lower 40 bits should
contain that number ~769MB.
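
Something like this little sketch (based on the macros in dsa.c; with
8-byte dsa_pointers DSA_OFFSET_WIDTH is 40, so the segment index lives in
the top 24 bits) shows how I'd decode it; just plug in the batch->chunks
value from the core:

#include <stdint.h>
#include <stdio.h>

typedef uint64_t dsa_pointer;

#define DSA_OFFSET_WIDTH 40
#define DSA_OFFSET_BITMASK ((((dsa_pointer) 1) << DSA_OFFSET_WIDTH) - 1)
#define DSA_EXTRACT_SEGMENT_NUMBER(dp) ((dp) >> DSA_OFFSET_WIDTH)
#define DSA_EXTRACT_OFFSET(dp) ((dp) & DSA_OFFSET_BITMASK)

int main(void)
{
    /* replace 0 with the batch->chunks value from frame #1 */
    dsa_pointer dp = UINT64_C(0);

    printf("segment index: %llu\n",
           (unsigned long long) DSA_EXTRACT_SEGMENT_NUMBER(dp));
    printf("offset:        %llu bytes (%.1f MB)\n",
           (unsigned long long) DSA_EXTRACT_OFFSET(dp),
           (double) DSA_EXTRACT_OFFSET(dp) / (1024.0 * 1024.0));
    return 0;
}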

The things that are unusually high so far in your emails are the worker
count and work_mem, which together allow quite large hash tables, in
your case up to 13GB.  Perhaps there is a silly arithmetic/type
problem around large numbers somewhere (perhaps somewhere near 4GB+
segments, but I don't expect segment #13 to be very large IIRC).  But
then that would fail more often I think...  It seems to be
rare/intermittent, and yet you don't have any batching or re-bucketing
in your problem (nbatch and nbuckets have their original values), so a
lot of the more complex parts of the PHJ code are not in play here.
Hmm.

I wondered if the tricky edge case where a segment gets unmapped and
then remapped in the same slot could be leading to segment
confusion.  That does involve a bit of memory order footwork.  What
CPU architecture is this?  But alas I can't come up with any case
where that could go wrong even if there is an unknown bug in that
area, because the no-rebatching, no-rebucketing case doesn't free
anything until the end when it frees everything (ie it never frees
something and then allocates, a requirement for slot re-use).



Re: BUG #18334: Segfault when running a query with parallel workers

From
Thomas Munro
Date:
On Fri, May 24, 2024 at 12:45 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> I wondered if the tricky edge case where a segment gets unmapped and
> then remapped in the same slot could be leading to segment
> confusion.  That does involve a bit of memory order footwork.  What
> CPU architecture is this?  But alas I can't come up with any case
> where that could go wrong even if there is an unknown bug in that
> area, because the no-rebatching, no-rebucketing case doesn't free
> anything until the end when it frees everything (ie it never frees
> something and then allocates, a requirement for slot re-use).

... but if I'm missing something there, it might be a clue visible
from gdb if area->control->freed_segment_counter (the one in shared
memory) and area->freed_segment_counter (the one in this backend) have
different values, if your core captured the segments.



Re: BUG #18334: Segfault when running a query with parallel workers

From
Marcin Barczyński
Date:
Thank you for looking into this.

On Fri, May 24, 2024 at 3:33 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> What does segment_map->segment->mapped_size show?

(gdb) print *(segment_map->segment)
$3 = {node = {prev = 0x56134ee11fa8, next = 0x56134dfa2258}, resowner
= 0x56134df98a98, handle = 2051931009, control_slot = 30, impl_private
= 0x0, mapped_address = 0x7f309faf4000,
  mapped_size = 806887424, on_detach = {head = {next = 0x0}}}

> The actual dsa_pointer has been optimised out but
> should be visible from frame #1 as batch->chunks.

(gdb) frame 1
(gdb) print *batch
$4 = {buckets = 0, batch_barrier = {mutex = 0 '\000', phase = 0,
participants = 0, arrived = 0, elected = 0, static_party = false,
condition_variable = {mutex = 0 '\000', wakeup = {head = 0,
        tail = 0}}}, chunks = 0, size = 0, estimated_size = 0, ntuples
= 0, old_ntuples = 0, space_exhausted = false}

> What CPU architecture is this?

x64, AMD EPYC 9374F

> ... but if I'm missing something there, it might be a clue visible
> from gdb if area->control->freed_segment_counter (the one in shared
> memory) and area->freed_segment_counter (the one in this backend) have
> different values, if your core captured the segments.

(gdb) p *area->control
$1 = {segment_header = {magic = 0, usable_pages = 0, size = 0, prev =
0, next = 0, bin = 0, freed = false}, handle = 0, segment_handles = {0
<repeats 1024 times>}, segment_bins = {
    0 <repeats 16 times>}, pools = {{lock = {tranche = 0, state =
{value = 0}, waiters = {head = 0, tail = 0}}, spans = {0, 0, 0, 0}}
<repeats 38 times>}, total_segment_size = 0,
  max_total_segment_size = 0, high_segment_index = 0, refcnt = 0,
pinned = false, freed_segment_counter = 0, lwlock_tranche_id = 0, lock
= {tranche = 0, state = {value = 0}, waiters = {head = 0,
      tail = 0}}}

(gdb) p *area
$2 = {control = 0x7f321dfa4500, mapping_pinned = false, segment_maps =
{{segment = 0x0, mapped_address = 0x7f321dfa4500 "", header =
0x7f321dfa4500, fpm = 0x7f321dfa5d20,
      pagemap = 0x7f321dfa6168}, {segment = 0x56134dfa1ec8,
mapped_address = 0x7f3216cd8000 "", header = 0x7f3216cd8000, fpm =
0x7f3216cd8038, pagemap = 0x7f3216cd8480}, {
      segment = 0x56134dfa1f18, mapped_address = 0x7f31f6bd7000 "",
header = 0x7f31f6bd7000, fpm = 0x7f31f6bd7038, pagemap =
0x7f31f6bd7480}, {segment = 0x56134dfa2078,
      mapped_address = 0x7f30d60a6000 "", header = 0x7f30d60a6000, fpm
= 0x7f30d60a6038, pagemap = 0x7f30d60a6480}, {segment =
0x56134dfa2118, mapped_address = 0x7f30d58a6000 "",
      header = 0x7f30d58a6000, fpm = 0x7f30d58a6038, pagemap =
0x7f30d58a6480}, {segment = 0x56134dfa20c8, mapped_address =
0x7f30d5ca6000 "", header = 0x7f30d5ca6000, fpm = 0x7f30d5ca6038,
      pagemap = 0x7f30d5ca6480}, {segment = 0x56134dfa2168,
mapped_address = 0x7f30d50a6000 "", header = 0x7f30d50a6000, fpm =
0x7f30d50a6038, pagemap = 0x7f30d50a6480}, {
      segment = 0x56134dfa21b8, mapped_address = 0x7f30d449e000 "",
header = 0x7f30d449e000, fpm = 0x7f30d449e038, pagemap =
0x7f30d449e480}, {segment = 0x56134dfa2208,
      mapped_address = 0x7f30d2c90000 "", header = 0x7f30d2c90000, fpm
= 0x7f30d2c90038, pagemap = 0x7f30d2c90480}, {segment =
0x56134dfa2258, mapped_address = 0x7f30cfc76000 "",
      header = 0x7f30cfc76000, fpm = 0x7f30cfc76038, pagemap =
0x7f30cfc76480}, {segment = 0x56134ee12048, mapped_address =
0x7f307599e000 "", header = 0x7f307599e000, fpm = 0x7f307599e038,
      pagemap = 0x7f307599e480}, {segment = 0x56134ee11ff8,
mapped_address = 0x7f307b9d0000 "", header = 0x7f307b9d0000, fpm =
0x7f307b9d0038, pagemap = 0x7f307b9d0480}, {
      segment = 0x56134ee11fa8, mapped_address = 0x7f3087a32000 "",
header = 0x7f3087a32000, fpm = 0x7f3087a32038, pagemap =
0x7f3087a32480}, {segment = 0x56134dfa2dd8,
      mapped_address = 0x7f309faf4000 "", header = 0x7f309faf4000, fpm
= 0x7f309faf4038, pagemap = 0x7f309faf4480}, {segment =
0x56134dfa1fb8, mapped_address = 0x7f30d62d3000 "",
      header = 0x7f30d62d3000, fpm = 0x7f30d62d3038, pagemap =
0x7f30d62d3480}, {segment = 0x56134dfa1f68, mapped_address =
0x7f31365d5000 "", header = 0x7f31365d5000, fpm = 0x7f31365d5038,
      pagemap = 0x7f31365d5480}, {segment = 0x56134ee12098,
mapped_address = 0x7f306599e000 "", header = 0x7f306599e000, fpm =
0x7f306599e038, pagemap = 0x7f306599e480}, {
      segment = 0x56134ee120e8, mapped_address = 0x7f305599e000 "",
header = 0x7f305599e000, fpm = 0x7f305599e038, pagemap =
0x7f305599e480}, {segment = 0x56134ee12138,
      mapped_address = 0x7f303599e000 "", header = 0x7f303599e000, fpm
= 0x7f303599e038, pagemap = 0x7f303599e480}, {segment =
0x56134ee12188, mapped_address = 0x7f301599e000 "",
      header = 0x7f301599e000, fpm = 0x7f301599e038, pagemap =
0x7f301599e480}, {segment = 0x56134ee121d8, mapped_address =
0x7f2fd599e000 "", header = 0x7f2fd599e000, fpm = 0x7f2fd599e038,
      pagemap = 0x7f2fd599e480}, {segment = 0x56134ee12228,
mapped_address = 0x7f2f9599e000 "", header = 0x7f2f9599e000, fpm =
0x7f2f9599e038, pagemap = 0x7f2f9599e480}, {
      segment = 0x56134ee12278, mapped_address = 0x7f2f1599e000 "",
header = 0x7f2f1599e000, fpm = 0x7f2f1599e038, pagemap =
0x7f2f1599e480}, {segment = 0x56134ee122c8,
      mapped_address = 0x7f2e9599e000 "", header = 0x7f2e9599e000, fpm
= 0x7f2e9599e038, pagemap = 0x7f2e9599e480}, {segment = 0x0,
mapped_address = 0x0, header = 0x0, fpm = 0x0,
      pagemap = 0x0} <repeats 1000 times>}, high_segment_index = 23,
freed_segment_counter = 0}


I hope this sheds some light on the issue.

Best regards,
Marcin Barczyński



Re: BUG #18334: Segfault when running a query with parallel workers

From
Marcin Barczyński
Date:
Hello!

If it would make things easier, I can share the core dump.
