Re: SegFault on 9.6.14 - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: SegFault on 9.6.14 |
Date | |
Msg-id | 20190716003449.fjegtxinhrqubysu@development |
In response to | Re: SegFault on 9.6.14 (Jerry Sievers <gsievers19@comcast.net>) |
Responses | Re: SegFault on 9.6.14 |
List | pgsql-hackers |
On Mon, Jul 15, 2019 at 07:22:55PM -0500, Jerry Sievers wrote:
>Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>
>> On Mon, Jul 15, 2019 at 06:48:05PM -0500, Jerry Sievers wrote:
>>
>>>Greetings Hackers.
>>>
>>>We have a reproducible case of $subject that issues a backtrace such as
>>>seen below.
>>>
>>>The query that I'd prefer to sanitize before sending is <30 lines of at
>>>a glance, not terribly complex logic.
>>>
>>>It nonetheless dies hard after a few seconds of running and as expected,
>>>results in an automatic all-backend restart.
>>>
>>>Please advise on how to proceed. Thanks!
>>>
>>>bt
>>>#0 initscan (scan=scan@entry=0x55d7a7daa0b0, key=0x0, keep_startblock=keep_startblock@entry=1 '\001')
>>>    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:233
>>>#1 0x000055d7a72fa8d0 in heap_rescan (scan=0x55d7a7daa0b0, key=key@entry=0x0) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:1529
>>>#2 0x000055d7a7451fef in ExecReScanSeqScan (node=node@entry=0x55d7a7d85100) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeSeqscan.c:280
>>>#3 0x000055d7a742d36e in ExecReScan (node=0x55d7a7d85100) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:158
>>>#4 0x000055d7a7445d38 in ExecReScanGather (node=node@entry=0x55d7a7d84d30) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeGather.c:475
>>>#5 0x000055d7a742d255 in ExecReScan (node=0x55d7a7d84d30) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:166
>>>#6 0x000055d7a7448673 in ExecReScanHashJoin (node=node@entry=0x55d7a7d84110) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeHashjoin.c:1019
>>>#7 0x000055d7a742d29e in ExecReScan (node=node@entry=0x55d7a7d84110) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:226
>>><about 30 lines omitted>
>>>
>>
>> Hmmm, that means it's crashing here:
>>
>>    if (scan->rs_parallel != NULL)
>>        scan->rs_nblocks = scan->rs_parallel->phs_nblocks; <--- here
>>    else
>>        scan->rs_nblocks = RelationGetNumberOfBlocks(scan->rs_rd);
>>
>> But clearly, scan is valid (otherwise it'd crash on the if condition),
>> and scan->rs_parallel must be non-NULL. Which probably means the pointer
>> is (no longer) valid.
>>
>> Could it be that the rs_parallel DSM disappears on rescan, or something
>> like that?
>
>No clue but something I just tried was to disable parallelism by setting
>max_parallel_workers_per_gather to 0 and however the query has not
>finished after a few minutes, there is no crash.
>

That might be a hint that my rough analysis was somewhat correct. The
question is whether the non-parallel plan does the same thing. Maybe it
picks a plan that does not require rescans, or something like that.

>Please advise.
>

It would be useful to see (a) the execution plan of the query, (b) a full
backtrace and (c) a bit of context for the place where it crashed.

Something like (in gdb):

    bt full
    list
    p *scan

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
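
To illustrate the reasoning about rs_parallel above, here is a minimal
standalone C sketch (not PostgreSQL code) of how a pointer into a released
shared segment still passes a != NULL test. FakeSegment, FakeScan and
scan_nblocks are hypothetical names, and the "mapped" flag is an assumption
added so the example stays well-defined instead of actually crashing:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a shared memory segment; not a PostgreSQL type. */
typedef struct FakeSegment
{
    bool mapped;   /* false once the segment has been "detached" */
    int  nblocks;  /* plays the role of phs_nblocks */
} FakeSegment;

/* Hypothetical stand-in for the scan descriptor. */
typedef struct FakeScan
{
    FakeSegment *parallel;      /* plays the role of rs_parallel */
    int          local_nblocks; /* plays the role of RelationGetNumberOfBlocks() */
} FakeScan;

/*
 * Mirrors the shape of the branch quoted from initscan(). Returns the block
 * count, or -1 at the point where real code would read unmapped memory.
 */
int
scan_nblocks(const FakeScan *scan)
{
    if (scan->parallel != NULL)
    {
        /* The NULL test passed, but that says nothing about validity. */
        if (!scan->parallel->mapped)
            return -1;          /* the real code has no such check */
        return scan->parallel->nblocks;
    }
    return scan->local_nblocks;
}
```

With a live segment the function returns the shared block count; once
"mapped" is cleared, the pointer is still non-NULL, so the parallel branch
is taken anyway, which is the situation the analysis suspects. Only an
explicitly NULL pointer falls back to the local count.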