pg13.2: invalid memory alloc request size NNNN - Mailing list pgsql-hackers

From Justin Pryzby
Subject pg13.2: invalid memory alloc request size NNNN
Msg-id 20210212014837.GE1793@telsasoft.com
List pgsql-hackers
ts=# \errverbose 
ERROR:  XX000: invalid memory alloc request size 18446744073709551613

#0  pg_re_throw () at elog.c:1716
#1  0x0000000000a33b12 in errfinish (filename=0xbff20e "mcxt.c", lineno=959, funcname=0xbff2db <__func__.6684> "palloc") at elog.c:502
#2  0x0000000000a6760d in palloc (size=18446744073709551613) at mcxt.c:959
#3  0x00000000009fb149 in text_to_cstring (t=0x2aaae8023010) at varlena.c:212
#4  0x00000000009fbf05 in textout (fcinfo=0x2094538) at varlena.c:557
#5  0x00000000006bdd50 in ExecInterpExpr (state=0x2093990, econtext=0x20933d8, isnull=0x7fff5bf04a87) at execExprInterp.c:1112
#6  0x00000000006d4f18 in ExecEvalExprSwitchContext (state=0x2093990, econtext=0x20933d8, isNull=0x7fff5bf04a87) at ../../../src/include/executor/executor.h:316
#7  0x00000000006d4f81 in ExecProject (projInfo=0x2093988) at ../../../src/include/executor/executor.h:350
#8  0x00000000006d5371 in ExecScan (node=0x20932c8, accessMtd=0x7082e0 <SeqNext>, recheckMtd=0x708385 <SeqRecheck>) at execScan.c:238
#9  0x00000000007083c2 in ExecSeqScan (pstate=0x20932c8) at nodeSeqscan.c:112
#10 0x00000000006d1b00 in ExecProcNodeInstr (node=0x20932c8) at execProcnode.c:466
#11 0x00000000006e742c in ExecProcNode (node=0x20932c8) at ../../../src/include/executor/executor.h:248
#12 0x00000000006e77de in ExecAppend (pstate=0x2089208) at nodeAppend.c:267
#13 0x00000000006d1b00 in ExecProcNodeInstr (node=0x2089208) at execProcnode.c:466
#14 0x000000000070964f in ExecProcNode (node=0x2089208) at ../../../src/include/executor/executor.h:248
#15 0x0000000000709795 in ExecSort (pstate=0x2088ff8) at nodeSort.c:108
#16 0x00000000006d1b00 in ExecProcNodeInstr (node=0x2088ff8) at execProcnode.c:466
#17 0x00000000006d1ad1 in ExecProcNodeFirst (node=0x2088ff8) at execProcnode.c:450
#18 0x00000000006dec36 in ExecProcNode (node=0x2088ff8) at ../../../src/include/executor/executor.h:248
#19 0x00000000006df079 in fetch_input_tuple (aggstate=0x2088a20) at nodeAgg.c:589
#20 0x00000000006e1fad in agg_retrieve_direct (aggstate=0x2088a20) at nodeAgg.c:2368
#21 0x00000000006e1bfd in ExecAgg (pstate=0x2088a20) at nodeAgg.c:2183
#22 0x00000000006d1b00 in ExecProcNodeInstr (node=0x2088a20) at execProcnode.c:466
#23 0x00000000006d1ad1 in ExecProcNodeFirst (node=0x2088a20) at execProcnode.c:450
#24 0x00000000006c6ffa in ExecProcNode (node=0x2088a20) at ../../../src/include/executor/executor.h:248
#25 0x00000000006c966b in ExecutePlan (estate=0x2032f48, planstate=0x2088a20, use_parallel_mode=false, operation=CMD_SELECT, sendTuples=true, numberTuples=0, direction=ForwardScanDirection, dest=0xbb3400 <donothingDR>, execute_once=true) at execMain.c:1632

#3  0x00000000009fb149 in text_to_cstring (t=0x2aaae8023010) at varlena.c:212
212             result = (char *) palloc(len + 1);

(gdb) l
207             /* must cast away the const, unfortunately */
208             text       *tunpacked = pg_detoast_datum_packed(unconstify(text *, t));
209             int                     len = VARSIZE_ANY_EXHDR(tunpacked);
210             char       *result;
211
212             result = (char *) palloc(len + 1);

(gdb) p len
$1 = -4
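
For reference, the huge number in the error message is just that negative length after the implicit conversion to palloc's unsigned size argument. Below is a minimal standalone sketch of the arithmetic, not PostgreSQL code: the demo_* names are hypothetical, and DEMO_MAX_ALLOC_SIZE only mirrors the value of MaxAllocSize (0x3fffffff) from memutils.h. It assumes a 64-bit size_t, as on this machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: same value as PostgreSQL's MaxAllocSize (1 GB - 1). */
#define DEMO_MAX_ALLOC_SIZE ((size_t) 0x3fffffff)

/*
 * Hypothetical helper: the size that reaches palloc() when the caller
 * computes "len + 1" with a signed int and passes it as a size_t.
 */
size_t
demo_request_size(int len)
{
    return (size_t) (len + 1);
}

/* Same test as the allocator's sanity check: reject anything above the cap. */
int
demo_alloc_size_is_valid(size_t size)
{
    return size <= DEMO_MAX_ALLOC_SIZE;
}

/* Print the request size for a given len and whether it would be accepted. */
void
demo_report(int len)
{
    size_t      size = demo_request_size(len);

    printf("len=%d -> request size %zu (%s)\n", len, size,
           demo_alloc_size_is_valid(size) ? "ok" : "invalid");
}

/* Self-check: -4 + 1 = -3, which wraps to SIZE_MAX - 2 as a size_t. */
void
demo_check(void)
{
    assert(demo_request_size(-4) == SIZE_MAX - 2);
    assert(!demo_alloc_size_is_valid(demo_request_size(-4)));
    assert(demo_alloc_size_is_valid(demo_request_size(10)));
}
```

Calling demo_report(-4) from a small main() on a 64-bit build should show the same 18446744073709551613 as the error above, so the failing allocation and the bogus length are one and the same problem.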

This VM had some issue earlier today and I killed it, which caused PG to run
crash recovery.  I'm tentatively blaming the VM issue on zfs, so this could
conceivably be data corruption (although recovery supposedly would have
resolved it).  I just checked, and data_checksums=off.

The query uses mode(), string_agg(), and DISTINCT.

Here's a redacted plan for the query:

 GroupAggregate  (cost=15681340.44..20726393.56 rows=908609 width=618)
   Group Key: (((COALESCE(a.ii, $0) || lpad(a.ii, 5, '0'::text)) || lpad(a.ii, 5, '0'::text))), a.ii, (COALESCE(a.ii, $2)), (CASE (a.ii)::integer WHEN 1 THEN 'qq'::text WHEN 2 THEN 'qq'::text WHEN 3 THEN 'qq'::text WHEN 4 THEN 'qq'::text WHEN 5 THEN 'qq qq'::text WHEN 6 THEN 'qq-qq'::text ELSE a.ii END), (CASE WHEN (COALESCE(a.ii, $3) = substr(a.ii, 1, length(COALESCE(a.ii, $4)))) THEN 'qq qq'::text WHEN (hashed SubPlan 7) THEN 'qq qq'::text ELSE 'qq qq qq'::text END)
   InitPlan 1 (returns $0)
     ->  Seq Scan on d
   InitPlan 3 (returns $2)
     ->  Seq Scan on d d
   InitPlan 4 (returns $3)
     ->  Seq Scan on d d
   InitPlan 5 (returns $4)
     ->  Seq Scan on d d
   InitPlan 6 (returns $5)
     ->  Seq Scan on d d
   ->  Sort  (cost=15681335.39..15704050.62 rows=9086093 width=313)
         Sort Key: (((COALESCE(a.ii, $0) || lpad(a.ii, 5, '0'::text)) || lpad(a.ii, 5, '0'::text))), a.ii, (COALESCE(a.ii, $2)), (CASE (a.ii)::integer WHEN 1 THEN 'qq'::text WHEN 2 THEN 'qq'::text WHEN 3 THEN 'qq'::text WHEN 4 THEN 'qq'::text WHEN 5 THEN 'qq qq'::text WHEN 6 THEN 'qq-qq'::text ELSE a.ii END), (CASE WHEN (COALESCE(a.ii, $3) = substr(a.ii, 1, length(COALESCE(a.ii, $4)))) THEN 'qq qq'::text WHEN (hashed SubPlan 7) THEN 'qq qq'::text ELSE 'qq qq qq'::text END)
         ->  Append  (cost=1.01..13295792.30 rows=9086093 width=313)
               ->  Seq Scan on a a  (cost=1.01..5689033.34 rows=3948764 width=328)
                     Filter: ((ii >= '2021-02-10 00:00:00+10'::timestamp with time zone) AND (ii < '2021-02-11 00:00:00+10'::timestamp with time zone))
                     SubPlan 7
                       ->  Seq Scan on d d  (cost=0.00..1.01 rows=1 width=7)
               ->  Seq Scan on b  (cost=1.01..12.75 rows=1 width=417)
                     Filter: ((ii >= '2021-02-10 00:00:00+10'::timestamp with time zone) AND (ii < '2021-02-11 00:00:00+10'::timestamp with time zone))
                     SubPlan 11
                       ->  Seq Scan on d d  (cost=0.00..1.01 rows=1 width=7)
               ->  Seq Scan on c c  (cost=1.01..7561315.74 rows=5137328 width=302)
                     Filter: ((ii >= '2021-02-10 00:00:00+10'::timestamp with time zone) AND (ii < '2021-02-11 00:00:00+10'::timestamp with time zone))
                     SubPlan 14
                       ->  Seq Scan on d d  (cost=0.00..1.01 rows=1 width=7)

I restored to a test cluster, but so far I haven't been able to reproduce the
issue there, so I'm soliciting suggestions on how to debug it further.

-- 
Justin


