Thread: strange OOM errors with EXECUTE in PL/pgSQL
Hi,

one of our local users reported he's getting OOM errors on 9.2, although on 9.1 the same code worked fine. Attached is a simple test case that should give you an OOM error almost immediately. What it does:

1) creates a simple table called "test" with one text column
2) creates a plpgsql function with one parameter, and all that function does is pass the parameter to EXECUTE
3) calls the function with a string containing many INSERTs into the test table

The way the EXECUTE is used is a bit awkward, but the failures seem a bit strange to me. The whole script is ~500kB and most of that is about 11k of very simple INSERT statements:

    insert into test(value) values (''aaaaaaaaaa'');

all of them exactly the same. Yet when it fails with OOM, the log contains memory context stats like these:

    TopMemoryContext: 5303376 total in 649 blocks; 2648 free ...
    PL/pgSQL function context: 8192 total in 1 blocks; 3160 free ...
    TopTransactionContext: 8192 total in 1 blocks; 6304 free ...
    ExecutorState: 8192 total in 1 blocks; 7616 free ...
    ExprContext: 8192 total in 1 blocks; 8160 free ...
    SPI Exec: 33554432 total in 14 blocks; 6005416 free ...
    CachedPlanSource: 3072 total in 2 blocks; 1856 free ...
    CachedPlanSource: 538688 total in 3 blocks; 1744 free ...
    CachedPlanQuery: 3072 total in 2 blocks; 1648 free ...
    CachedPlanSource: 538688 total in 3 blocks; 1744 free ...
    CachedPlanQuery: 3072 total in 2 blocks; 1648 free ...
    CachedPlanSource: 538688 total in 3 blocks; 1744 free ...
    CachedPlanQuery: 3072 total in 2 blocks; 1648 free ...
    CachedPlanSource: 538688 total in 3 blocks; 1744 free ...
    CachedPlanQuery: 3072 total in 2 blocks; 1648 free ...
    CachedPlanSource: 538688 total in 3 blocks; 1744 free ...
    ...

There are ~9500 of these CachedPlanSource + CachedPlanQuery pairs (see the attached log). That seems a bit strange to me, because all the queries in this test case are exactly the same.
The number of queries needed to hit the OOM is inversely proportional to the query length - by using a longer text value (instead of 'aaaaaaaaaaa') you need far fewer queries. I am no expert in this area, but it seems to me that the code does not expect that many INSERTs in a single EXECUTE and does not release the memory for some reason (e.g. because the plans are allocated in the SPI Exec memory context, etc.).

regards
Tomas
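The attached script isn't reproduced in the thread, but based on the description above it can be sketched roughly as follows (the function name run_sql and the use of repeat() to build the string are my guesses; the real attachment simply contains ~11k literal INSERT statements):

```sql
-- Hypothetical reconstruction of the test case described above.
CREATE TABLE test (value text);

-- A function whose only job is to pass its argument to EXECUTE.
CREATE OR REPLACE FUNCTION run_sql(p_sql text) RETURNS void AS $$
BEGIN
    EXECUTE p_sql;  -- the whole multi-statement string goes through one EXECUTE
END;
$$ LANGUAGE plpgsql;

-- Call it with one big string containing ~11k identical INSERTs;
-- on 9.2 this should reproduce the OOM described above.
SELECT run_sql(
    repeat('insert into test(value) values (''aaaaaaaaaa'');', 11000)
);
```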
Tomas Vondra <tv@fuzzy.cz> writes:
> What it does:
> 1) creates a simple table called "test" with one text column.
> 2) creates a plpgsql function with one parameter, and all that function
> does is passing the parameter to EXECUTE
> 3) calls the function with a string containing many INSERTs into the
> test table
> The way the EXECUTE is used is a bit awkward, but the failures seem a
> bit strange to me. The whole script is ~500kB and most of that is about
> 11k of very simple INSERT statements:
> insert into test(value) values (''aaaaaaaaaa'');

The reason this fails is that you've got a half-megabyte source string, and each of the 11000 plans that are due to be created from it saves its own copy of the source string. Hence, 5500 megabytes needed just for source strings.

We could possibly fix this by inventing some sort of reference-sharing arrangement (which'd be complicated and fragile) or by not storing the source strings with the plans (which'd deal a serious blow to our ability to provide helpful error messages). Neither answer seems appealing.

I think it would be a better idea to adopt a less brain-dead way of processing the data. Can't you convert this to a single INSERT with a lot of VALUES rows? Or split it into multiple EXECUTE chunks?

regards, tom lane
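The two alternatives suggested above could look roughly like this (a sketch, assuming the same "test" table as in the report; the chunk size of 1000 is an arbitrary choice):

```sql
-- (a) one INSERT with many VALUES rows - a single statement, a single plan:
INSERT INTO test(value) VALUES
    ('aaaaaaaaaa'),
    ('aaaaaaaaaa'),
    ('aaaaaaaaaa');  -- ... and so on, up to all 11k rows in one statement

-- (b) splitting the work into multiple smaller EXECUTE chunks, so no
-- single source string spawns thousands of plans at once:
DO $$
BEGIN
    FOR i IN 1..11 LOOP
        EXECUTE repeat('insert into test(value) values (''aaaaaaaaaa'');', 1000);
    END LOOP;
END;
$$;
```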
On 20.12.2012 02:29, Tom Lane wrote:
> Tomas Vondra <tv@fuzzy.cz> writes:
>> What it does:
>> 1) creates a simple table called "test" with one text column.
>> 2) creates a plpgsql function with one parameter, and all that function
>> does is passing the parameter to EXECUTE
>> 3) calls the function with a string containing many INSERTs into the
>> test table
>> The way the EXECUTE is used is a bit awkward, but the failures seem a
>> bit strange to me. The whole script is ~500kB and most of that is about
>> 11k of very simple INSERT statements:
>> insert into test(value) values (''aaaaaaaaaa'');
>
> The reason this fails is that you've got a half-megabyte source string,
> and each of the 11000 plans that are due to be created from it saves
> its own copy of the source string. Hence, 5500 megabytes needed just
> for source strings.
>
> We could possibly fix this by inventing some sort of reference-sharing
> arrangement (which'd be complicated and fragile) or by not storing the
> source strings with the plans (which'd deal a serious blow to our
> ability to provide helpful error messages). Neither answer seems
> appealing.

Thanks for the explanation - it hadn't occurred to me that each plan keeps a copy of the whole source string.

> I think it would be a better idea to adopt a less brain-dead way of
> processing the data. Can't you convert this to a single INSERT with a
> lot of VALUES rows? Or split it into multiple EXECUTE chunks?

Well, it's not my app, but I'll recommend that to them. Actually I already did, but I didn't have an explanation of why it behaves like this. The really annoying bit is that on 9.1 this works fine (it's just as crazy an approach as on 9.2, but it does not end with an OOM error).

Tomas
Tom Lane <tgl@sss.pgh.pa.us> writes:
> The reason this fails is that you've got a half-megabyte source string,
> and each of the 11000 plans that are due to be created from it saves
> its own copy of the source string. Hence, 5500 megabytes needed just
> for source strings.
>
> We could possibly fix this by inventing some sort of reference-sharing
> arrangement (which'd be complicated and fragile) or by not storing the
> source strings with the plans (which'd deal a serious blow to our
> ability to provide helpful error messages). Neither answer seems
> appealing.

I don't readily see how complicated and fragile it would be: it looks like a hash table of symbols pointing to source strings, plus reference counting, with each plan referencing its symbol. Now maybe that's what you call complicated and fragile, and even if not, I'm not really sure it would pull its weight. The use case of sending the exact same query string over and over again *in a given session* without using PREPARE/EXECUTE looks quite tiny.

--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
On 12/20/2012 4:47 PM, Dimitri Fontaine wrote:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> The reason this fails is that you've got a half-megabyte source string,
>> and each of the 11000 plans that are due to be created from it saves
>> its own copy of the source string. Hence, 5500 megabytes needed just
>> for source strings.
>>
>> We could possibly fix this by inventing some sort of reference-sharing
>> arrangement (which'd be complicated and fragile) or by not storing the
>> source strings with the plans (which'd deal a serious blow to our
>> ability to provide helpful error messages). Neither answer seems
>> appealing.
>
> I don't readily see how complicated and fragile it would be, it looks
> like a hash table of symbols pointing to source strings and a reference
> counting, and each plan would need to reference that symbol. Now maybe
> that's what you call complicated and fragile, and even if not, I'm not
> really sure it would pull its weight. The use case of sending over and
> over again *in a given session* the exact same query string without
> using PREPARE/EXECUTE looks like quite tiny.

That sounds like a bit of overkill to me.

Don't all the plans result as a plan list from a multi-statement query string, which was parsed into a query tree "list" and each single query tree then planned? I don't think there is any way that a single one of those trees (parse or plan) will be free'd separately. If that is true, then proper usage of memory contexts would make reference counting obsolete, even though all plans refer to the same copy.

Jan

--
Anyone who trades liberty for security deserves neither liberty nor security.
-- Benjamin Franklin
Jan Wieck <JanWieck@Yahoo.com> writes:
> On 12/20/2012 4:47 PM, Dimitri Fontaine wrote:
>> Tom Lane <tgl@sss.pgh.pa.us> writes:
>>> The reason this fails is that you've got a half-megabyte source string,
>>> and each of the 11000 plans that are due to be created from it saves
>>> its own copy of the source string. Hence, 5500 megabytes needed just
>>> for source strings.
>>>
>>> We could possibly fix this by inventing some sort of reference-sharing
>>> arrangement (which'd be complicated and fragile) or by not storing the
>>> source strings with the plans (which'd deal a serious blow to our
>>> ability to provide helpful error messages). Neither answer seems
>>> appealing.

> Don't all the plans result as a plan list from a multi-statement query
> string, which was parsed into a query tree "list" and each single query
> tree then planned? I don't think there is any way that a single one of
> those trees (parse or plan) will be free'd separately. If that is true,
> then proper usage of memory contexts would make reference counting
> obsolete, even though all plans refer to the same copy.

The issue is that a multi-statement string gives rise to multiple CachedPlanSources, which could be freed independently so far as plancache.c knows. (spi.c wouldn't actually attempt to do so.) So you'd really need reference counting, or else some explicit connection between the CachedPlanSources, neither of which seems exactly trivial to me.

As of HEAD this particular complaint is moot anyway, because SPI_execute now goes through the "one-shot CachedPlan" facility, and so it makes no (zero) copies of the source string. It'd still be possible to hit the problem when trying to SPI_prepare a very-many-statement string, but I think the use case for that is pretty darn small. So I'm not excited about adding complication to fix it.

regards, tom lane