Thread: BUG #6698: sub-query with join producing out of memory in where clause
BUG #6698: sub-query with join producing out of memory in where clause
From
armando.miraglia@stud-inf.unibz.it
Date:
The following bug has been logged on the website: Bug reference: 6698 Logged by: Armando Miraglia Email address: armando.miraglia@stud-inf.unibz.it PostgreSQL version: 9.1.2 Operating system: Arch Linux/Ubuntu Description:=20=20=20=20=20=20=20=20 Hi everybody! Fact: while I was trying to produce a c-function I got an OOM which RhodiumToad helped me to debug. The OOM is reproducible with also standard query. Environment: I tested the POC using 9.1.2 but also 9.2devel compiled "by-hand" Reproducibility: - limit the memory usage ulimit -S -v 500000 - start postgresql postgres -D ../data.ascii/ - run the following query from psql SELECT *=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20 =20=20=20=20=20=20=20 FROM generate_series(1,1000000) i=20=20=20=20=20=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= =20=20=20 =20=20=20=20=20=20=20 WHERE 100 <=3D (SELECT COUNT(*)=20=20=20=20=20=20=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20 =20=20=20=20=20=20=20 FROM unnest(array(select j from generate_series(i-100,i+100) j)) u1 JOIN=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20 =20=20=20=20=20=20=20 unnest(array(select j from generate_series(i-100,i+100) j)) u2 ON (u1.u1=3Du2.u2));=20=20=20=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20 =20=20=20=20=20=20=20 Error: - psql side: ERROR: out of memory DETAIL: Failed on request of size 828. - server side: ... PortalMemory: 8192 total in 1 blocks; 7888 free (0 chunks); 304 used PortalHeapMemory: 1024 total in 1 blocks; 824 free (0 chunks); 200 used ExecutorState: 458358928 total in 67 blocks; 794136 free (15965 chunks); 457564792 used accumArrayResult: 8192 total in 1 blocks; 5744 free (4 chunks); 2448 used HashTableContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used HashBatchContext: 32768 total in 2 blocks; 8416 free (1 chunks); 24352 used ... As you can see I am using enconding sqlascii. Thanks a lot! Armando Miraglia
Re: BUG #6698: sub-query with join producing out of memory in where clause
From
Heikki Linnakangas
Date:
On 19.06.2012 04:01, armando.miraglia@stud-inf.unibz.it wrote: > The following bug has been logged on the website: > > Bug reference: 6698 > Logged by: Armando Miraglia > Email address: armando.miraglia@stud-inf.unibz.it > PostgreSQL version: 9.1.2 > Operating system: Arch Linux/Ubuntu > Description: > > Hi everybody! > > Fact: while I was trying to produce a c-function I got an OOM which > RhodiumToad helped me to debug. The OOM is reproducible with also standard > query. > > Environment: I tested the POC using 9.1.2 but also 9.2devel compiled > "by-hand" > > Reproducibility: > - limit the memory usage > ulimit -S -v 500000 > - start postgresql > postgres -D ../data.ascii/ > > - run the following query from psql > SELECT * > > FROM generate_series(1,1000000) i > > WHERE 100<= (SELECT COUNT(*) > > FROM unnest(array(select j from > generate_series(i-100,i+100) j)) u1 > JOIN > > unnest(array(select j from > generate_series(i-100,i+100) j)) u2 > ON (u1.u1=u2.u2)); > > Error: > - psql side: > ERROR: out of memory > DETAIL: Failed on request of size 828. > - server side: > ... > PortalMemory: 8192 total in 1 blocks; 7888 free (0 chunks); 304 used > PortalHeapMemory: 1024 total in 1 blocks; 824 free (0 chunks); 200 used > ExecutorState: 458358928 total in 67 blocks; 794136 free (15965 > chunks); 457564792 used > accumArrayResult: 8192 total in 1 blocks; 5744 free (4 chunks); 2448 > used > HashTableContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used > HashBatchContext: 32768 total in 2 blocks; 8416 free (1 chunks); > 24352 used > ... This test case can be further reduced into: explain analyze SELECT * FROM generate_series(1,100000) i WHERE (SELECT array(select repeat('a', 10000) || i) u1) is not null; We're leaking the array constructed on each row, in ExecSetParamPlan(). At line 1000 in nodeSubplan.c, we create a new array and store it as the value of the PARAM_EXEC parameter. But it's never free'd. The old value of the parameter is simply overwritten. I'm not sure what the correct fix is. I suppose we could pfree() the old value before overwriting it, but I'm not sure if that's safe, or if there might still be references to the old value somewhere in the executor. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
> I'm not sure what the correct fix is. I suppose we could pfree() the old= =20 > value before overwriting it, but I'm not sure if that's safe, or if=20 > there might still be references to the old value somewhere in the executo= r. It will resolve the current problem but I am also not sure whether it can c= reate any other problem because in this function most of the work is done in per-= query memory context. One thing if we can clarify that why per-tuple memory context is not suffic= ient for this value than it can be easy to conclude on solution for this problem.=20 Another thing I have noticed is that in function ExecScanSubPlan, the simil= ar work is not done in Per-query memory context, I am not sure if it is of any relevance for the p= roblem you mentioned. With Regards, Amit Kapila.
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > This test case can be further reduced into: > explain analyze > SELECT * FROM generate_series(1,100000) i > WHERE (SELECT array(select repeat('a', 10000) || i) u1) is not null; > We're leaking the array constructed on each row, in ExecSetParamPlan(). > At line 1000 in nodeSubplan.c, we create a new array and store it as the > value of the PARAM_EXEC parameter. But it's never free'd. The old value > of the parameter is simply overwritten. > I'm not sure what the correct fix is. I suppose we could pfree() the old > value before overwriting it, but I'm not sure if that's safe, or if > there might still be references to the old value somewhere in the executor. I think it should be safe to pfree the prior value at this point. If you compare the code for scalar subplan results, a few lines above here, pass-by-ref param values are references into node->curTuple which gets summarily freed on the next cycle. So if anybody is keeping a pointer around then it would already be failing for non-ARRAY cases, and we've not seen any such reports. I think we can handle array results the same way as curTuple, ie, keep a link in the node's state. (I thought first about trying to pfree the prior contents of prm->value itself, but I'm less sure that that is safe --- if memory serves, the same ParamExecData slot can sometimes be used for multiple purposes.) Will give it a try ... regards, tom lane