Thread: BUG #6698: sub-query with join producing out of memory in where clause

BUG #6698: sub-query with join producing out of memory in where clause

From
armando.miraglia@stud-inf.unibz.it
Date:
The following bug has been logged on the website:

Bug reference:      6698
Logged by:          Armando Miraglia
Email address:      armando.miraglia@stud-inf.unibz.it
PostgreSQL version: 9.1.2
Operating system:   Arch Linux/Ubuntu
Description:=20=20=20=20=20=20=20=20

Hi everybody!

Fact: while I was trying to produce a c-function I got an OOM which
RhodiumToad helped me to debug. The OOM is reproducible with also standard
query.

Environment: I tested the POC using 9.1.2 but also 9.2devel compiled
"by-hand"

Reproducibility:
- limit the memory usage
ulimit -S -v 500000
- start postgresql
postgres -D ../data.ascii/

- run the following query from psql
SELECT *=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=
=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=
=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20
=20=20=20=20=20=20=20
  FROM generate_series(1,1000000) i=20=20=20=20=20=20=20=20=20=20=20=20=20=
=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=
=20=20=20
=20=20=20=20=20=20=20
 WHERE 100 <=3D (SELECT COUNT(*)=20=20=20=20=20=20=20=20=20=20=20=20=20=20=
=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=
=20=20=20=20=20=20=20
=20=20=20=20=20=20=20
                 FROM unnest(array(select j from
generate_series(i-100,i+100) j)) u1
                      JOIN=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=
=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=
=20=20=20=20=20=20=20=20=20
=20=20=20=20=20=20=20
                      unnest(array(select j from
generate_series(i-100,i+100) j)) u2
                      ON (u1.u1=3Du2.u2));=20=20=20=20=20=20=20=20=20=20=20=
=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20
=20=20=20=20=20=20=20
Error:
- psql side:
ERROR:  out of memory
DETAIL:  Failed on request of size 828.
- server side:
...
  PortalMemory: 8192 total in 1 blocks; 7888 free (0 chunks); 304 used
    PortalHeapMemory: 1024 total in 1 blocks; 824 free (0 chunks); 200 used
      ExecutorState: 458358928 total in 67 blocks; 794136 free (15965
chunks); 457564792 used
        accumArrayResult: 8192 total in 1 blocks; 5744 free (4 chunks); 2448
used
        HashTableContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
          HashBatchContext: 32768 total in 2 blocks; 8416 free (1 chunks);
24352 used
...

As you can see I am using enconding sqlascii.

Thanks a lot!
Armando Miraglia

Re: BUG #6698: sub-query with join producing out of memory in where clause

From
Heikki Linnakangas
Date:
On 19.06.2012 04:01, armando.miraglia@stud-inf.unibz.it wrote:
> The following bug has been logged on the website:
>
> Bug reference:      6698
> Logged by:          Armando Miraglia
> Email address:      armando.miraglia@stud-inf.unibz.it
> PostgreSQL version: 9.1.2
> Operating system:   Arch Linux/Ubuntu
> Description:
>
> Hi everybody!
>
> Fact: while I was trying to produce a c-function I got an OOM which
> RhodiumToad helped me to debug. The OOM is reproducible with also standard
> query.
>
> Environment: I tested the POC using 9.1.2 but also 9.2devel compiled
> "by-hand"
>
> Reproducibility:
> - limit the memory usage
> ulimit -S -v 500000
> - start postgresql
> postgres -D ../data.ascii/
>
> - run the following query from psql
> SELECT *
>
>    FROM generate_series(1,1000000) i
>
>   WHERE 100<= (SELECT COUNT(*)
>
>                   FROM unnest(array(select j from
> generate_series(i-100,i+100) j)) u1
>                        JOIN
>
>                        unnest(array(select j from
> generate_series(i-100,i+100) j)) u2
>                        ON (u1.u1=u2.u2));
>
> Error:
> - psql side:
> ERROR:  out of memory
> DETAIL:  Failed on request of size 828.
> - server side:
> ...
>    PortalMemory: 8192 total in 1 blocks; 7888 free (0 chunks); 304 used
>      PortalHeapMemory: 1024 total in 1 blocks; 824 free (0 chunks); 200 used
>        ExecutorState: 458358928 total in 67 blocks; 794136 free (15965
> chunks); 457564792 used
>          accumArrayResult: 8192 total in 1 blocks; 5744 free (4 chunks); 2448
> used
>          HashTableContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
>            HashBatchContext: 32768 total in 2 blocks; 8416 free (1 chunks);
> 24352 used
> ...

This test case can be further reduced into:

explain analyze
SELECT * FROM generate_series(1,100000) i
WHERE (SELECT array(select repeat('a', 10000) || i) u1) is not null;

We're leaking the array constructed on each row, in ExecSetParamPlan().
At line 1000 in nodeSubplan.c, we create a new array and store it as the
value of the PARAM_EXEC parameter. But it's never free'd. The old value
of the parameter is simply overwritten.

I'm not sure what the correct fix is. I suppose we could pfree() the old
value before overwriting it, but I'm not sure if that's safe, or if
there might still be references to the old value somewhere in the executor.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
> I'm not sure what the correct fix is. I suppose we could pfree() the old=
=20
> value before overwriting it, but I'm not sure if that's safe, or if=20
> there might still be references to the old value somewhere in the executo=
r.

It will resolve the current problem but I am also not sure whether it can c=
reate
any other problem because in this function most of the work is done in per-=
query memory context.
One thing if we can clarify that why per-tuple memory context is not suffic=
ient for this value
than it can be easy to conclude on solution for this problem.=20

Another thing I have noticed is that in function ExecScanSubPlan, the simil=
ar work is not done in
Per-query memory context, I am not sure if it is of any relevance for the p=
roblem you mentioned.


With Regards,
Amit Kapila.
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> This test case can be further reduced into:

> explain analyze
> SELECT * FROM generate_series(1,100000) i
> WHERE (SELECT array(select repeat('a', 10000) || i) u1) is not null;

> We're leaking the array constructed on each row, in ExecSetParamPlan().
> At line 1000 in nodeSubplan.c, we create a new array and store it as the
> value of the PARAM_EXEC parameter. But it's never free'd. The old value
> of the parameter is simply overwritten.

> I'm not sure what the correct fix is. I suppose we could pfree() the old
> value before overwriting it, but I'm not sure if that's safe, or if
> there might still be references to the old value somewhere in the executor.

I think it should be safe to pfree the prior value at this point.  If
you compare the code for scalar subplan results, a few lines above here,
pass-by-ref param values are references into node->curTuple which gets
summarily freed on the next cycle.  So if anybody is keeping a pointer
around then it would already be failing for non-ARRAY cases, and we've
not seen any such reports.  I think we can handle array results the same
way as curTuple, ie, keep a link in the node's state.  (I thought first
about trying to pfree the prior contents of prm->value itself, but I'm
less sure that that is safe --- if memory serves, the same ParamExecData
slot can sometimes be used for multiple purposes.)

Will give it a try ...

            regards, tom lane