Thread: Less than ideal error reporting in pg_stat_statements
A client was getting some hard-to-diagnose out of memory errors. What made this especially confusing was that there was no context reported at all, other than the (enormous) statement that triggered the error. At first I thought the lack of context indicated a palloc had failed during ereport() (since we apparently just toss the previous error when that happens), but it turns out there's some error reporting in pg_stat_statements that's less than ideal.

Attached patch fixes, though I'm not sure if %lld is portable or not. I'll also argue that this is a bug and should be backpatched, but I'm not hell-bent on that.

At the same time I looked for other messages that don't explicitly reference pg_stat_statements; the only others are in pg_stat_statements_internal(), complaining about being called in an inappropriate function context. Presumably at that point there's a reasonable error context stack, so I didn't bother with them.

This still seems a bit fragile to me, though. Anyone working in here has to notice that almost every errmsg mentions pg_stat_statements and decide there's a good reason for that. ISTM it'd be better to push a new ErrorContextCallback onto the stack any time we enter the module. If folks think that's a good idea I'll pursue it as a separate patch.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
On 23 September 2015 at 10:16, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> Attached patch fixes, though I'm not sure if %lld is portable or not.
I think you could probably use INT64_FORMAT, and cast stat.st_size to int64 too.
There's an example in FileRead() in fd.c
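Something like this sketch, modeled on the FileRead() precedent; the message wording here is only illustrative, not taken from the patch:

    ereport(LOG,
            (errcode(ERRCODE_OUT_OF_MEMORY),
             errmsg("out of memory loading pg_stat_statements file"),
             errdetail("file \"%s\": size " INT64_FORMAT,
                       PGSS_TEXT_FILE, (int64) stat.st_size)));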
Regards
David Rowley
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Sep 22, 2015 at 3:16 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> At first I thought the lack of context indicated a palloc had failed during
> ereport() (since we apparently just toss the previous error when that
> happens), but it turns out there's some error reporting in
> pg_stat_statements that's less than ideal. Attached patch fixes, though I'm
> not sure if %lld is portable or not.

+ ereport(LOG,
+     (errcode(ERRCODE_OUT_OF_MEMORY),
+      errmsg("out of memory attempting to pg_stat_statement file"),
+      errdetail("file \"%s\": size %lld", PGSS_TEXT_FILE, stat.st_size)));

Uh, what?

I'm not opposed to this basic idea, but I think the message should be reworded, and that the presence of two separate ereport() call sites like the above is totally unnecessary. The existing MaxAllocSize check is just defensive; no user-visible distinction needs to be made.
--
Peter Geoghegan
On 9/22/15 5:58 PM, Peter Geoghegan wrote:
> On Tue, Sep 22, 2015 at 3:16 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
>> At first I thought the lack of context indicated a palloc had failed during
>> ereport() (since we apparently just toss the previous error when that
>> happens), but it turns out there's some error reporting in
>> pg_stat_statements that's less than ideal. Attached patch fixes, though I'm
>> not sure if %lld is portable or not.
>
> + ereport(LOG,
> +     (errcode(ERRCODE_OUT_OF_MEMORY),
> +      errmsg("out of memory attempting to pg_stat_statement file"),
> +      errdetail("file \"%s\": size %lld", PGSS_TEXT_FILE, stat.st_size)));
>
> Uh, what?

Oops. I'll fix that and address David's concern tomorrow.

> I'm not opposed to this basic idea, but I think the message should be
> reworded, and that the presence of two separate ereport() call sites
> like the above is totally unnecessary. The existing MaxAllocSize check
> is just defensive; no user-visible distinction needs to be made.

I disagree. If you're running this on a 200+GB machine with plenty of free memory and get that error, you're going to be scratching your head. I can see an argument for using the OOM SQLSTATE, but treating an artificial limit the same as a system error seems pretty bogus.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
David Rowley <david.rowley@2ndquadrant.com> writes:
> On 23 September 2015 at 10:16, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
>>> Attached patch fixes, though I'm not sure if %lld is portable or not.

It is not.

> I think you could probably use INT64_FORMAT,

Not in a message you expect to be translatable. There are ways around that, but TBH I do not think that including the file size in the errdetail is valuable enough to be worth the trouble. I'd just leave it out. "insufficient memory to load statement file" seems quite enough.

regards, tom lane
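For reference, the usual way around the translatability problem is to pre-format the value into a buffer and pass it with %s; a sketch only, not something from the patch:

    char        sizebuf[32];

    snprintf(sizebuf, sizeof(sizebuf), INT64_FORMAT, (int64) stat.st_size);
    ereport(LOG,
            (errcode(ERRCODE_OUT_OF_MEMORY),
             errmsg("insufficient memory to load statement file"),
             errdetail("File \"%s\" has size %s.", PGSS_TEXT_FILE, sizebuf)));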
Peter Geoghegan <pg@heroku.com> writes:
> I'm not opposed to this basic idea, but I think the message should be
> reworded, and that the presence of two separate ereport() call sites
> like the above is totally unnecessary. The existing MaxAllocSize check
> is just defensive; no user-visible distinction needs to be made.

I wonder whether the real problem here is failure to truncate statement texts to something sane. Do we really need to record the whole text of multi-megabyte statements? Especially if doing so could render the entire feature nonfunctional?

regards, tom lane
On Tue, Sep 22, 2015 at 4:40 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wonder whether the real problem here is failure to truncate statement
> texts to something sane. Do we really need to record the whole text of
> multi-megabyte statements? Especially if doing so could render the entire
> feature nonfunctional?

I recently encountered a 9.4 customer database that had an insanely large query text stored by pg_stat_statements, apparently created as part of a process of kicking the tires of their new installation. I don't know how large it actually was, but it caused psql to stall for over 10 seconds. Insane queries happen, so truncating query text could conceal the extent of how unreasonable a query is.

I think that the real problem here is that garbage collection needs to deal with OOM more appropriately. That's the only way there could be a problem with an in-flight query, as opposed to a query that looks at pg_stat_statements, which seems to be Nasby's complaint.

My guess is that this very large query involved a very large number of constants, possibly contained inside an "IN ( )". Slight variants of the same query, which a human would probably consider equivalent, have caused artificial pressure on garbage collection.
--
Peter Geoghegan
On Tue, Sep 22, 2015 at 5:01 PM, Peter Geoghegan <pg@heroku.com> wrote:
> My guess is that this very large query involved a very large number of
> constants, possibly contained inside an "IN ( )". Slight variants of
> the same query, which a human would probably consider equivalent,
> have caused artificial pressure on garbage collection.

I could write a patch to do compaction in-place. The basic idea is that there'd be a slow path in the event of an OOM-like condition (i.e. an actual OOM, or when the MaxAllocSize limitation is violated) that first scans through entries, and determines the exact required buffer size for every non-garbage query text. As this iteration/scanning occurs, the entries' offsets in shared memory are rewritten assuming that the first entry starts at 0, the second at 0 + length of first + 1 (for the NUL sentinel byte), and so on. We then allocate a minimal buffer, lseek() and copy into the buffer, so that the expectation of finding query texts at those offsets is actually met. Finally, unlink() the old file, create a new one, and write the new buffer out. I think I wanted to do things that way originally.

If even that exact, minimal buffer size cannot be allocated, then ISTM that the user is out of luck. That will be very rare in practice, but should it occur we log the issue and give up on storing query texts entirely, so as to avoid thrashing while still giving the user something to go on. This new code path is never hit until a garbage collection is required, so hopefully the garbage created was not a pathological issue with a weird workload, but rather something that will not recur for a very long time.

That seems to me to be better than getting into the business of deciding how long of a query text is too long.

I'm doubtful that this had anything to do with MaxAllocSize. You'd certainly need a lot of bloat to be affected by that in any way. I wonder how high pg_stat_statements.max was set to on this system, and how long each query text was on average.
--
Peter Geoghegan
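In rough outline, that slow path might look like the following sketch; entries[], n_entries, fd and the field names are placeholders for this sketch, not actual pg_stat_statements identifiers:

    char       *buf;
    Size        required = 0;
    Size        nextofs = 0;
    int         i;

    /* Pass 1: total size of all non-garbage texts, with NUL sentinels */
    for (i = 0; i < n_entries; i++)
    {
        if (entries[i].query_len < 0)   /* garbage/dropped text */
            continue;
        required += entries[i].query_len + 1;
    }

    /* Allocate exactly what's needed; on failure, give up on texts */
    buf = (char *) malloc(required);
    if (buf == NULL)
    {
        /* log it, unlink the file, and stop storing query texts */
    }

    /* Pass 2: pack each live text into the buffer, rewriting offsets */
    for (i = 0; i < n_entries; i++)
    {
        if (entries[i].query_len < 0)
            continue;
        /* read query_len + 1 bytes from the text's old file offset */
        lseek(fd, entries[i].query_offset, SEEK_SET);
        read(fd, buf + nextofs, entries[i].query_len + 1);
        entries[i].query_offset = nextofs;
        nextofs += entries[i].query_len + 1;
    }

    /* Finally: unlink() the old file, create a new one, write buf out */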
Peter Geoghegan wrote:
> My guess is that this very large query involved a very large number of
> constants, possibly contained inside an "IN ( )". Slight variants of
> the same query, which a human would probably consider equivalent,
> have caused artificial pressure on garbage collection.

So if I have multiple queries like

    SELECT foo FROM bar WHERE baz IN (a, b)
    SELECT foo FROM bar WHERE baz IN (a, b, c)

they are not normalized down to the same? That seems odd.
--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Sep 22, 2015 at 6:55 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> So if I have multiple queries like
>
>     SELECT foo FROM bar WHERE baz IN (a, b)
>     SELECT foo FROM bar WHERE baz IN (a, b, c)
>
> they are not normalized down to the same? That seems odd.

Yes, although in practice it's usually down to a variable number of constants appearing within the "IN ( )", which is more odd IMV. We discussed changing this before. I don't have strong feelings either way.
--
Peter Geoghegan
On 9/22/15 8:01 PM, Peter Geoghegan wrote:
> I'm doubtful that this had anything to do with MaxAllocSize. You'd
> certainly need a lot of bloat to be affected by that in any way. I
> wonder how high pg_stat_statements.max was set to on this system, and
> how long each query text was on average.

max was set to 10000. I don't know about average query text size, but the command that was causing the error was a very large number of individual INSERT ... VALUES statements all in one command.

The machine had plenty of free memory and no ulimit, so I don't see how this could have been anything but MaxAllocSize, unless there's some other failure mode in malloc I don't know about.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
On 9/22/15 6:27 PM, Jim Nasby wrote:
>> + ereport(LOG,
>> +     (errcode(ERRCODE_OUT_OF_MEMORY),
>> +      errmsg("out of memory attempting to pg_stat_statement file"),
>> +      errdetail("file \"%s\": size %lld", PGSS_TEXT_FILE, stat.st_size)));
>>
>> Uh, what?
>
> Oops. I'll fix that and address David's concern tomorrow.

New patch attached. I stripped the size reporting out and simplified the conditionals a bit as well.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
On Wed, Sep 23, 2015 at 3:01 AM, Peter Geoghegan <pg@heroku.com> wrote:
> I think that the real problem here is that garbage collection needs to
> deal with OOM more appropriately.

+1

I've also been seeing lots of log messages saying "LOG: out of memory" on a server that's hosting development databases. I put off debugging this until now because it didn't seem to have any adverse effects on the system.

The file on my system is currently 5.1GB (!). I don't know how it got there -- under normal circumstances we don't have any enormous queries, but perhaps our application bugs during development triggered that. The configuration on this system is pg_stat_statements.max = 10000 and pg_stat_statements.track = all.

----

The comment near gc_qtexts says:

 * This won't be called often in the typical case, since it's likely that
 * there won't be too much churn, and besides, a similar compaction process
 * occurs when serializing to disk at shutdown or as part of resetting.
 * Despite this, it seems prudent to plan for the edge case where the file
 * becomes unreasonably large, with no other method of compaction likely to
 * occur in the foreseeable future.
[...]
 * Load the old texts file. If we fail (out of memory, for instance) just
 * skip the garbage collection.

So, as I understand it: if the system runs low on memory for an extended period, and/or the file grows beyond 1GB (MaxAllocSize), garbage collection stops entirely, meaning it starts leaking disk space until a manual intervention.

It's very frustrating when debugging aids cause further problems on a system. If the in-line compaction doesn't materialize (or it's decided not to backport it), I would propose instead to add a test to pgss_store() to avoid growing the file beyond MaxAllocSize (or perhaps even a lower limit). Surely dropping some statistics is better than this pathology.

Regards,
Marti
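A sketch of that defensive test, as it might sit in pgss_store() just before a new text is appended to the external file; the surrounding variable names are assumptions, not an excerpt:

    /* Refuse to let the query texts file outgrow what gc can reload */
    if (pgss->extent + query_len + 1 > MaxAllocSize)
        stored = false;         /* drop this text; keep the counters */
    else
        stored = qtext_store(norm_query, query_len,
                             &query_offset, &gc_count);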
On Fri, Sep 25, 2015 at 8:51 AM, Marti Raudsepp <marti@juffo.org> wrote:
> I've also been seeing lots of log messages saying "LOG: out of
> memory" on a server that's hosting development databases. I put off
> debugging this until now because it didn't seem to have any adverse
> effects on the system.
>
> The file on my system is currently 5.1GB (!). I don't know how it got
> there -- under normal circumstances we don't have any enormous
> queries, but perhaps our application bugs during development triggered
> that.

It could be explained by legitimate errors during planning, for example. The query text is still stored.

> So, as I understand it: if the system runs low on memory for an
> extended period, and/or the file grows beyond 1GB (MaxAllocSize),
> garbage collection stops entirely, meaning it starts leaking disk
> space until a manual intervention.

I don't think that there is much more to discuss here: this is a bug. I will try and write a patch to fix it shortly. It will be non-trivial, and I'm quite busy right now, so it might take a while.

A short-term remediation is to call pg_stat_statements_reset() on systems affected like this.
--
Peter Geoghegan
On Fri, Sep 25, 2015 at 11:37 AM, Peter Geoghegan <pg@heroku.com> wrote:
>> So, as I understand it: if the system runs low on memory for an
>> extended period, and/or the file grows beyond 1GB (MaxAllocSize),
>> garbage collection stops entirely, meaning it starts leaking disk
>> space until a manual intervention.
>
> I don't think that there is much more to discuss here: this is a bug.
> I will try and write a patch to fix it shortly.

I should add that it only leaks disk space at the rate at which new queries are observed that are not stored within pg_stat_statements (due to an error originating in the planner or something -- they remain "sticky" entries). The reason we've not heard far more problem reports is that it usually never gets out of hand in the first place.

Come to think of it, you'd have to repeatedly have new queries that are never "unstickied"; if you have substantively the same query as an error-during-planning "sticky" entry, it will still probably be able to use that existing entry (it will become "unstickied" by this second execution of what the fingerprinting logic considers to be the same query). In short, you have to have just the right workload to hit the bug.
--
Peter Geoghegan
On Tue, Sep 22, 2015 at 6:01 PM, Peter Geoghegan <pg@heroku.com> wrote:
> I'm doubtful that this had anything to do with MaxAllocSize. You'd
> certainly need a lot of bloat to be affected by that in any way. I
> wonder how high pg_stat_statements.max was set to on this system, and
> how long each query text was on average.

To clarify: I think it probably starts off not having much to do with MaxAllocSize. However, it might well be the case that transient memory pressure results in the problematic code path hitting the MaxAllocSize limitation. So it starts with malloc() returning NULL, which temporarily blocks garbage collection, but in bad cases the MaxAllocSize limitation becomes a permanent barrier to performing a garbage collection (without a manual intervention).
--
Peter Geoghegan
On Fri, Sep 25, 2015 at 8:51 AM, Marti Raudsepp <marti@juffo.org> wrote:
> I've also been seeing lots of log messages saying "LOG: out of
> memory" on a server that's hosting development databases. I put off
> debugging this until now because it didn't seem to have any adverse
> effects on the system.

That's unfortunate.

> It's very frustrating when debugging aids cause further problems on a
> system. If the in-line compaction doesn't materialize (or it's decided
> not to backport it), I would propose instead to add a test to
> pgss_store() to avoid growing the file beyond MaxAllocSize (or perhaps
> even a lower limit). Surely dropping some statistics is better than
> this pathology.

I heard a customer complaint today that seems similar. A Heroku customer attempted a migration from MySQL to PostgreSQL in this manner:

    mysqldump | psql

This at least worked well enough to cause problems for pg_stat_statements (some queries were not rejected by the parser, I suppose).

While I'm opposed to arbitrary limits for tools like pg_stat_statements, I think the following defensive measure might be useful on top of better OOM defenses: Test for query text length within pgss_store() where a pgssJumbleState is passed by caller (the post-parse-analysis hook caller -- not the executor hook caller that has query costs to store). If it is greater than, say, 10 * Max(ASSUMED_MEDIAN_INIT, pgss->cur_median_usage), do not bother to normalize the query text, or store anything at all. Simply return.

Any entry we create at that point will be a sticky entry (pending actually completing execution without the entry being evicted), and it doesn't seem worth worrying about normalizing very large query texts, which tend to be qualitatively similar to utility statements from the user's perspective anyway. Besides, query text normalization has always occurred on a best-effort basis. It's not very uncommon for pg_stat_statements to fail to normalize query texts today for obscure reasons.

This would avoid storing very large query texts again and again when a very large DML statement repeatedly fails (e.g. due to a data integration task that can run into duplicate violations) and is repeatedly rerun with tweaks. Maybe there is a substantively distinct table in each case, because a CREATE TABLE is performed as part of the same transaction, so each failed query gets a new pg_stat_statements entry, and a new, large query text.

I should probably also assume that sticky entries have a generic length (existing pgss->mean_query_len) for the purposes of accumulating query text length within entry_dealloc(). They should not get to contribute to median query length (which throttles garbage collection to prevent thrashing). Anyone have an opinion on that?
--
Peter Geoghegan
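Concretely, the filter might look like the following sketch, using the length-based threshold (Max(ASSUMED_LENGTH_INIT, mean_query_len)) that Peter settles on later in the thread rather than the usage-based constants named above:

    /* In pgss_store(), post-parse-analysis case only (jstate != NULL) */
    if (jstate != NULL &&
        query_len > 10 * Max(ASSUMED_LENGTH_INIT, pgss->mean_query_len))
        return;                 /* don't normalize or store this text */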
On Tue, Sep 22, 2015 at 6:01 PM, Peter Geoghegan <pg@heroku.com> wrote:
> On Tue, Sep 22, 2015 at 5:01 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> My guess is that this very large query involved a very large number of
>> constants, possibly contained inside an "IN ( )". Slight variants of
>> the same query, which a human would probably consider equivalent,
>> have caused artificial pressure on garbage collection.
>
> I could write a patch to do compaction in-place.

In the end, I decided on a simpler approach to fixing this general sort of problem with the attached patch. See the commit message for details. I went this way not because compaction in-place was necessarily a bad idea, but because I think that a minimal approach will work just as well in real world cases.

It would be nice to get this committed before the next point releases are tagged, since I've now heard a handful of complaints like this.
--
Peter Geoghegan
On Fri, Oct 2, 2015 at 2:04 PM, Peter Geoghegan <pg@heroku.com> wrote:
> It would be nice to get this committed before the next point releases
> are tagged, since I've now heard a handful of complaints like this.

I grep'd for SIZE_MAX to make sure that it was something that is available on all supported platforms, since it's C99. The one hit that I originally took to be code actually turns out to be a code-like comment within aset.c.

I think that SIZE_MAX should be replaced by MaxAllocHugeSize before the patch is committed. That should be perfectly portable.
--
Peter Geoghegan
Peter Geoghegan <pg@heroku.com> writes:
> It would be nice to get this committed before the next point releases
> are tagged, since I've now heard a handful of complaints like this.

I'm not too impressed with this bit:

     /* Allocate buffer; beware that off_t might be wider than size_t */
-    if (stat.st_size <= MaxAllocSize)
+    if (stat.st_size <= SIZE_MAX)
         buf = (char *) malloc(stat.st_size);

because there are no, zero, not one uses of SIZE_MAX in our code today, and I do not see such a symbol required by the POSIX v2 spec either. Perhaps this will work, but you're asking us to introduce a brand new portability hazard just hours before a wrap deadline. That is not happening.

Other than that, this seems roughly sane, though I've not read it in detail or tested it. Does anyone have an objection to trying to squeeze in something along this line?

regards, tom lane
On Fri, Oct 2, 2015 at 2:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I'm not too impressed with this bit:
>
>      /* Allocate buffer; beware that off_t might be wider than size_t */
> -    if (stat.st_size <= MaxAllocSize)
> +    if (stat.st_size <= SIZE_MAX)
>          buf = (char *) malloc(stat.st_size);
>
> because there are no, zero, not one uses of SIZE_MAX in our code today,
> and I do not see such a symbol required by the POSIX v2 spec either.

See my remarks just now. This can easily be fixed.
--
Peter Geoghegan
Peter Geoghegan <pg@heroku.com> writes:
> I think that SIZE_MAX should be replaced by MaxAllocHugeSize before
> the patch is committed. That should be perfectly portable.

Hmm ... only back to 9.4, but I guess that's far enough.

regards, tom lane
On Fri, Oct 2, 2015 at 2:42 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Peter Geoghegan <pg@heroku.com> writes:
>> I think that SIZE_MAX should be replaced by MaxAllocHugeSize before
>> the patch is committed. That should be perfectly portable.
>
> Hmm ... only back to 9.4, but I guess that's far enough.

I just realized that the existing gc_fail handler label within gc_qtexts() lacks something like this, too:

    unlink(PGSS_TEXT_FILE);
    (void) AllocateFile(PGSS_TEXT_FILE, PG_BINARY_W);

    SpinLockAcquire(&s->mutex);
    pgss->extent = 0;
    SpinLockRelease(&s->mutex);

I think it should do this anyway, but it makes particular sense in light of the proposed changes. All existing failure cases within gc_qtexts() seem like a good reason to give up forever.

The spinlock acquisition above is actually necessary despite the n_writers trick, because that's only used by qtext_store().
--
Peter Geoghegan
On Fri, Oct 2, 2015 at 3:57 PM, Peter Geoghegan <pg@heroku.com> wrote:
> (void) AllocateFile(PGSS_TEXT_FILE, PG_BINARY_W);

Naturally, a FreeFile() call will also be required for any successfully allocated file.
--
Peter Geoghegan
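Putting those two notes together, the gc_fail path might end up looking roughly like this sketch (qfile is a placeholder name, not a quote from the patch):

    gc_fail:
        /* Give up: discard the query texts file so we start clean */
        unlink(PGSS_TEXT_FILE);
        qfile = AllocateFile(PGSS_TEXT_FILE, PG_BINARY_W);
        if (qfile != NULL)
            FreeFile(qfile);    /* recreated empty; just close it */

        /* Reset the shared extent pointer under the spinlock */
        SpinLockAcquire(&s->mutex);
        pgss->extent = 0;
        SpinLockRelease(&s->mutex);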
On Fri, Oct 2, 2015 at 3:57 PM, Peter Geoghegan <pg@heroku.com> wrote:
> The spinlock acquisition above is actually necessary despite the
> n_writers trick, because that's only used by qtext_store().

Actually, isn't that another bug? The fact that we don't do the same from within gc_qtexts() in normal cases, even with an exclusive lock held? We do this:

    /* Reset the shared extent pointer */
    pgss->extent = extent;

I saw one really weird case on a customer database, with an enormous although totally repetitive query text and one entry total (I mentioned this in passing up-thread). Although I'd be willing to believe it was just a very odd use of the database, since apparently they were doing some kind of stress-test, perhaps it could be better explained by a bug like this.

To recap, for other people: pg_stat_statements_internal() may do this without any shared lock held:

    /* No point in loading file now if there are active writers */
    if (n_writers == 0)
        qbuffer = qtext_load_file(&qbuffer_size);
--
Peter Geoghegan
On Fri, Oct 2, 2015 at 4:11 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Actually, isn't that another bug? The fact that we don't do the same
> from within gc_qtexts() in normal cases, even with an exclusive lock
> held? We do this:

Ah, no. We check pgss->gc_count in any case, so it should be fine. That will also make it safe to do the unlink() as outlined already, because a new qtext_load_file() call from pg_stat_statements_internal() (due to the gc_count bump) will allocate the file again by name.
--
Peter Geoghegan
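The recheck being referred to looks roughly like this (reconstructed from memory, not a verbatim excerpt): once the shared lock is held, pg_stat_statements_internal() reloads the texts file if anything moved underneath its speculative load:

    /* With pgss->lock now held (shared), re-check the speculative load */
    if (qbuffer == NULL ||
        pgss->extent != extent ||
        pgss->gc_count != gc_count)
    {
        free(qbuffer);
        qbuffer = qtext_load_file(&qbuffer_size);
    }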
... do you want to produce an updated patch? I'm not planning to look at it until tomorrow anyway. regards, tom lane
On Fri, Oct 2, 2015 at 4:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> ... do you want to produce an updated patch? I'm not planning to look at
> it until tomorrow anyway.

I'll post a new patch by about midday tomorrow west coast time. Hopefully that works for you.
--
Peter Geoghegan
On Fri, Oct 2, 2015 at 4:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> ... do you want to produce an updated patch? I'm not planning to look at
> it until tomorrow anyway.

Attached, revised patch deals with the issues around removing the query text file when garbage collection encounters trouble. There is no reason to be optimistic about any error within gc_qtexts() not recurring during a future garbage collection. OOM might be an exception, but probably not, since gc_qtexts() is reached when a new entry is created with a new query text, which in general makes OOM progressively more likely.

Thanks for accommodating me here.
--
Peter Geoghegan
Peter Geoghegan <pg@heroku.com> writes:
> Attached, revised patch deals with the issues around removing the
> query text file when garbage collection encounters trouble. There is
> no reason to be optimistic about any error within gc_qtexts() not
> recurring during a future garbage collection.

Hm. I'm unconvinced by the aspects of this that involve using mean_query_len as a filter on which texts will be accepted. While that's not necessarily bad in the abstract, there are way too many implementation artifacts here that will result in random-seeming decisions about whether to normalize. For instance:

* mean_query_len only gets recomputed in entry_dealloc(), which is only run if we exceed pgss_max, and gc_qtexts(), which is only run if we decide the query texts file is more than 50% bloat. So there could be quite a long startup transient before the value gets off its initialization minimum, and I'm suspicious that there might be plausible use-cases where it never does. So it's not so much "restrict to a multiple of the mean query len" as "restrict to some number that might once upon a time have had some relation to the mean query len, or maybe not".

* One could expect that after changing mean_query_len, the population of query texts would change character as a result of the filter behavior changing, so that convergence to stable behavior over the long haul is not exactly self-evident.

* As you've got it here, entry_dealloc() and gc_qtexts() don't compute mean_query_len the same way, because only one of them discriminates against sticky entries. So the value would bounce about rather randomly based on which one had run last.

* I'm not exactly convinced that sticky entries should be ignored for this purpose anyway.

Taking a step back, ISTM the real issue you're fighting here is lots of orphaned sticky entries, but the patch doesn't do anything directly to fix that problem. I wonder if we should do something like making entry_dealloc() and/or gc_qtexts() aggressively remove sticky entries, or at least those with "large" texts.

I think the aspects of this patch that are reasonably uncontroversial are increasing the allowed malloc attempt size in gc_qtexts, flushing the query text file on malloc failure, fixing the missing cleanup steps after a gc failure, and making entry_dealloc's recomputation of mean_query_len sane (which I'll define for the moment as "the same as gc_qtexts would get"). Since we're hard against a release deadline, I propose to commit just those changes, and we can consider the idea of a query size filter and/or redefining mean_query_len at leisure.

regards, tom lane
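"The same as gc_qtexts would get" presumably amounts to counting only entries whose texts are actually present in the file; a sketch of what that recomputation inside entry_dealloc()'s scan could look like (variable names assumed, not quoted from the commit):

    /* Accumulate lengths of valid texts only, as gc_qtexts() would */
    if (entry->query_len >= 0)
    {
        totlen += entry->query_len + 1;     /* +1 for NUL terminator */
        nvalidtexts++;
    }
    ...
    pgss->mean_query_len = (nvalidtexts > 0) ?
        totlen / nvalidtexts : ASSUMED_LENGTH_INIT;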
On Sun, Oct 4, 2015 at 9:27 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Hm. I'm unconvinced by the aspects of this that involve using
> mean_query_len as a filter on which texts will be accepted. While that's
> not necessarily bad in the abstract, there are way too many implementation
> artifacts here that will result in random-seeming decisions about whether
> to normalize.

There are already plausible race conditions that can make query text normalization not occur. I think that's much more likely in practice to cause a failure to normalize than anything proposed here; I've personally observed such things in the wild a few times already. Also, note that mean_query_len is not used directly there -- I decided on Max(ASSUMED_LENGTH_INIT, mean_query_len) instead. mean_query_len is just for cases with very large query texts.

> * mean_query_len only gets recomputed in entry_dealloc(), which is only
> run if we exceed pgss_max, and gc_qtexts(), which is only run if we decide
> the query texts file is more than 50% bloat. So there could be quite a
> long startup transient before the value gets off its initialization
> minimum, and I'm suspicious that there might be plausible use-cases where
> it never does.

ASSUMED_LENGTH_INIT * 5 is a pretty conservative lower bound, I'd say. mean_query_len is only really used for cases where query texts are much longer on average. So in order for that to be a problem, you'd have to have what are, in an absolute sense, very large query texts. I think I noticed no more than a handful of changes in the regression tests, for example.

> * One could expect that after changing mean_query_len, the population of
> query texts would change character as a result of the filter behavior
> changing, so that convergence to stable behavior over the long haul is
> not exactly self-evident.

FWIW, I think that there is a feedback loop today, and that in problem cases that was what allowed it to get out of hand.

> * As you've got it here, entry_dealloc() and gc_qtexts() don't compute
> mean_query_len the same way, because only one of them discriminates
> against sticky entries. So the value would bounce about rather randomly
> based on which one had run last.

entry_dealloc() will naturally run far more frequently than gc_qtexts(). That said, it would be better if they matched.

> * I'm not exactly convinced that sticky entries should be ignored for
> this purpose anyway.

I think that data integration transactions that fail repeatedly are strongly implicated here in practice. That's behind the query size filter thing that you may also take issue with, as well as this.

> Taking a step back, ISTM the real issue you're fighting here is lots of
> orphaned sticky entries, but the patch doesn't do anything directly to fix
> that problem. I wonder if we should do something like making
> entry_dealloc() and/or gc_qtexts() aggressively remove sticky entries,
> or at least those with "large" texts.

Sticky entries are (almost by definition) always aggressively removed, and I hesitate to give certain ones a lower usage_count to begin with, which is the only way to directly be more aggressive that might work better.

> I think the aspects of this patch that are reasonably uncontroversial are
> increasing the allowed malloc attempt size in gc_qtexts, flushing the
> query text file on malloc failure, fixing the missing cleanup steps after
> a gc failure, and making entry_dealloc's recomputation of mean_query_len
> sane (which I'll define for the moment as "the same as gc_qtexts would
> get"). Since we're hard against a release deadline, I propose to commit
> just those changes, and we can consider the idea of a query size filter
> and/or redefining mean_query_len at leisure.

I'm not clear on what you actually propose to do to "make entry_dealloc's recomputation of mean_query_len sane", but I think you are talking about something distinct from what I've proposed, based on your separate remarks about entry_dealloc and the extra discrimination against sticky entries there (vis-a-vis calculating mean query length). I can't decide exactly what you mean, though: neither entry_dealloc nor gc_qtexts care about orphaned query texts in my patch (or in master). Please clarify.

I'd be quite happy if you did everything listed, and also left the extra discrimination against sticky entries within entry_dealloc in -- consider what happens when a huge malloc() ends up swapping with an exclusive lock held, and consider that repeated, failed data integration transactions are implicated in this in a big way when a problem appears in the wild. A big part of the problem here was that garbage collection did not run often enough.

In other words, I'd be fine with *not* doing the query size filter thing for now, since that is something that seems like an extra defense and not core to the problem. I was kind of ambivalent about doing that part myself, actually.
--
Peter Geoghegan
Peter Geoghegan <pg@heroku.com> writes:
> I'm not clear on what you actually propose to do to "make
> entry_dealloc's recomputation of mean_query_len sane", but I think you
> are talking about something distinct from what I've proposed

Ah, right, sorry. I meant to make its result match what gc_qtexts would get, by not falsely counting entries with dropped texts. That's not what you have in your patch but it seems like an easy enough fix.

> I'd be quite happy if you did everything listed, and also left the
> extra discrimination against sticky entries within entry_dealloc in --
> consider what happens when a huge malloc() ends up swapping with an
> exclusive lock held, and consider that repeated, failed data
> integration transactions are implicated in this in a big way when a
> problem appears in the wild. A big part of the problem here was that
> garbage collection did not run often enough.

Hm. The problem I've got with this is that then mean_query_len means something significantly different after entry_dealloc than it does after gc_qtexts. I'd be okay with changing *both* of those functions to ignore sticky entries in the calculation, if that seems reasonable to you.

> In other words, I'd be fine with *not* doing the query size filter
> thing for now, since that is something that seems like an extra
> defense and not core to the problem. I was kind of ambivalent about
> doing that part myself, actually.

Agreed on that part.

regards, tom lane
On Sun, Oct 4, 2015 at 1:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Ah, right, sorry. I meant to make its result match what gc_qtexts would
> get, by not falsely counting entries with dropped texts. That's not
> what you have in your patch but it seems like an easy enough fix.

I'm trying to make mean_query_len representative of *useful* entry query length. I guess I don't have that within gc_qtexts in my patch, but I do have it within entry_dealloc (up to and including considering dropped texts), which FWIW is far more important.

>> I'd be quite happy if you did everything listed, and also left the
>> extra discrimination against sticky entries within entry_dealloc in --
>> consider what happens when a huge malloc() ends up swapping with an
>> exclusive lock held, and consider that repeated, failed data
>> integration transactions are implicated in this in a big way when a
>> problem appears in the wild. A big part of the problem here was that
>> garbage collection did not run often enough.
>
> Hm. The problem I've got with this is that then mean_query_len means
> something significantly different after entry_dealloc than it does
> after gc_qtexts.
>
> I'd be okay with changing *both* of those functions to ignore sticky
> entries in the calculation, if that seems reasonable to you.

That seems perfectly reasonable, yes. Should I leave that to you?

>> In other words, I'd be fine with *not* doing the query size filter
>> thing for now, since that is something that seems like an extra
>> defense and not core to the problem. I was kind of ambivalent about
>> doing that part myself, actually.
>
> Agreed on that part.

We're in full agreement on what needs to happen for the next point release, then. Excellent.
--
Peter Geoghegan
Peter Geoghegan <pg@heroku.com> writes:
> On Sun, Oct 4, 2015 at 1:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Hm. The problem I've got with this is that then mean_query_len means
>> something significantly different after entry_dealloc than it does
>> after gc_qtexts.
>>
>> I'd be okay with changing *both* of those functions to ignore sticky
>> entries in the calculation, if that seems reasonable to you.

> That seems perfectly reasonable, yes. Should I leave that to you?

Sure, I can take it.

regards, tom lane
On Sun, Oct 4, 2015 at 1:12 PM, Peter Geoghegan <pg@heroku.com> wrote:
> On Sun, Oct 4, 2015 at 1:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Ah, right, sorry. I meant to make its result match what gc_qtexts would
>> get, by not falsely counting entries with dropped texts. That's not
>> what you have in your patch but it seems like an easy enough fix.
>
> I'm trying to make mean_query_len representative of *useful* entry
> query length. I guess I don't have that within gc_qtexts in my patch,
> but I do have it within entry_dealloc (up to and including considering
> dropped texts), which FWIW is far more important.

To be clear: I wasn't sure why you thought I falsely count entries with dropped texts within entry_dealloc(). I suppose my sense was that dropped texts ought to not make garbage collection occur too frequently, which could also be a problem. Garbage collection ought to occur when the size of the query text file becomes excessive relative to useful entries. I was worried about the thrashing risk from dropped text entries.

Maybe we could, as an alternative, not forget the original size of dropped query texts, relying only on their offset to indicate the text is invalid. Dropped query texts would then not be special in that sense, which seems like a good thing all around.
--
Peter Geoghegan
Peter Geoghegan <pg@heroku.com> writes:
> On Sun, Oct 4, 2015 at 1:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Hm. The problem I've got with this is that then mean_query_len means
>> something significantly different after entry_dealloc than it does
>> after gc_qtexts.
>>
>> I'd be okay with changing *both* of those functions to ignore sticky
>> entries in the calculation, if that seems reasonable to you.

> That seems perfectly reasonable, yes. Should I leave that to you?

After a closer look I decided that wasn't reasonable at all. Discounting sticky texts would then mean that after a GC cycle, we might still think the query texts file is bloated and issue another GC request, which of course would not shrink the file, so that we'd be doing GC cycles every time we added a new query. The mean query len recorded by gc_qtexts() *has to* match the mean length of what it actually put in the file, not the mean length of what we might wish it would put in the file.

By the same token, I'm back to believing that it's fundamentally bogus for entry_dealloc() to compute mean_query_len that way. The most likely result of that is useless GC cycles. The only thing that will actually free memory when you've got a lot of dead sticky entries is getting rid of the sticky hashtable entries, and the only way to do that is to wait for entry_dealloc() to get rid of 'em. You were unenthused about making that removal more aggressive, which is fine, but you can't have it both ways.

It does strike me that when we do get rid of the sticky entries, cleanup of the texts file might lag a bit longer than it needs to because mean_query_len is computed before not after deleting whatever entries we're going to delete. On average, that shouldn't matter ... but if we are tossing a bunch of dead sticky entries, maybe they would have a higher mean length than the rest? Not sure about it. I put a comment about this into entry_dealloc() for the moment, but we can revisit whether it is worth adding code/cycles to get a more up-to-date mean length.

Anyway, I've committed the aspects of this that we're agreed on.

regards, tom lane
Peter Geoghegan <pg@heroku.com> writes:
> To be clear: I wasn't sure why you thought I falsely count entries with
> dropped texts within entry_dealloc().

In the existing^H^H^Hprevious code, dropped-text entries would essentially act as length-zero summands in the average calculation, whereas I think we agree that they ought to be ignored; otherwise they decrease the computed mean and thereby increase the probability of (useless) GC cycles. In the worst case where the hashtable is mostly dropped-text entries, which would for instance be the prevailing situation shortly after a GC failure, we'd be calculating ridiculously small mean values and that'd prompt extra GC cycles, no?

regards, tom lane
On 10/04/2015 06:16 PM, Tom Lane wrote:
> Peter Geoghegan <pg@heroku.com> writes:
>> To be clear: I wasn't sure why you thought I falsely count entries with
>> dropped texts within entry_dealloc().
>
> In the existing^H^H^Hprevious code, dropped-text entries would essentially
> act as length-zero summands in the average calculation, whereas I think
> we agree that they ought to be ignored; otherwise they decrease the
> computed mean and thereby increase the probability of (useless) GC cycles.
> In the worst case where the hashtable is mostly dropped-text entries,
> which would for instance be the prevailing situation shortly after a GC
> failure, we'd be calculating ridiculously small mean values and that'd
> prompt extra GC cycles, no?

Sorry, I'm a bit late to this party. Does what you have committed mean people are less likely to see "Out of Memory" coming from pg_stat_statements? If not, what can be done about them short of a restart? And what bad effects follow from an event generating them?

The docs seem to be quite silent on these points.

cheers

andrew
Andrew Dunstan <andrew@dunslane.net> writes:
> Sorry, I'm a bit late to this party. Does what you have committed mean
> people are less likely to see "Out of Memory" coming from
> pg_stat_statements? If not, what can be done about them short of a
> restart? And what bad effects follow from an event generating them?

The main thing we've done that will alleviate that is increase the size of query text file that the garbage-collection routine can cope with from MaxAllocSize (1GB) to MaxAllocHugeSize (at least 2GB, lots more on 64-bit machines, though on 32-bit you probably can't get to 2GB anyway ...).

Also, what will now happen if you do get an out-of-memory is that the code will discard stored query texts and truncate the file, so that the problem doesn't recur (at least not till you build up a new set of stored query texts). At this point you still have statistics, but they can only be identified by query ID since the text has been forgotten. I'm not sure how useful that situation really is ...

> The docs seem to be quite silent on these points.

The docs probably ought to describe this situation and recommend reducing pg_stat_statements.max if you want to preserve query texts.

regards, tom lane
On Sun, Oct 4, 2015 at 3:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> That seems perfectly reasonable, yes. Should I leave that to you?
>
> After a closer look I decided that wasn't reasonable at all. Discounting
> sticky texts would then mean that after a GC cycle, we might still think
> the query texts file is bloated and issue another GC request, which of
> course would not shrink the file, so that we'd be doing GC cycles every
> time we added a new query.

That seems unlikely to me. Firstly, there is a generic check of 512 bytes per entry within need_gc_qtexts() -- we never GC if that generic limit is not exceeded, and that's far from tiny. Secondly, how long do you think those sticky entries will stay around in the hashtable to continue to cause trouble, and dominate over regular entries? There'd have to be constant evictions/cache pressure for this situation to occur, because the threshold within need_gc_qtexts() is based on pg_stat_statements.max, and yet many evictions aggressively evict sticky entries very quickly. Seems like a very unusual workload to me.

I think that you're overestimating the cost of discounting sticky entries, which are usually very much in the minority (at least when it matters, with cache pressure), and underestimating the cost of continuing to weigh sticky entries' contribution to mean query text. As I said, my proposal to not have sticky entries contribute to overall mean query text length is based on a problem report involving a continually failing data integration process. That in particular is what I hope to stop having a negative impact on mean query length -- a unique queryid makes for an entry that is bound to remain useless forever (as opposed to just failing the first few times). With the larger limit of MaxAllocHugeSize, I worry about swapping with the exclusive lock held.

The fact that there is a big disparity between mean query length for sticky and non-sticky entries is weird. It was seen to happen in the wild only because the sticky entries that clogged things up were not really distinct to a human -- only to the fingerprinting/jumbling code, more or less by accident, which in a practical sense caused a distortion. That's what I'm targeting. I have a hard time imagining any harm from discounting sticky entries in a realistic case.

> The mean query len recorded by gc_qtexts()
> *has to* match the mean length of what it actually put in the file, not
> the mean length of what we might wish it would put in the file.

Why? Why not simply care about whether or not the file was unreasonably large relative to available, useful query statistics? I think that mean_query_len isn't something that swings all that much with realistic workloads and 5,000 representative entries -- it certainly should not swing wildly. The problem to an extent was that that accidentally stopped being true.

> By the same token, I'm back to believing that it's fundamentally bogus for
> entry_dealloc() to compute mean_query_len that way. The most likely
> result of that is useless GC cycles. The only thing that will actually
> free memory when you've got a lot of dead sticky entries is getting rid of
> the sticky hashtable entries, and the only way to do that is to wait for
> entry_dealloc() to get rid of 'em. You were unenthused about making that
> removal more aggressive, which is fine, but you can't have it both ways.

Meanwhile, mean_query_len grows as more "distinct" entries are created, pushing out their garbage collection further and further, especially when there is a big flood of odd queries that stay sticky. Once again, I'm having a really hard time imagining a minority of current, non-sticky hashtable entries dominating a majority of current, sticky hashtable entries come garbage collection time. Garbage collection is linked to creation of entries, which is also linked to eviction, which aggressively evicts sticky entries in this thrashing scenario that you describe.

> It does strike me that when we do get rid of the sticky entries, cleanup
> of the texts file might lag a bit longer than it needs to because
> mean_query_len is computed before not after deleting whatever entries
> we're going to delete. On average, that shouldn't matter ... but if we
> are tossing a bunch of dead sticky entries, maybe they would have a higher
> mean length than the rest?

Yes, that is something that was observed to happen in the problem case I looked into. You know my solution already.
--
Peter Geoghegan
On Sun, Oct 4, 2015 at 3:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Peter Geoghegan <pg@heroku.com> writes:
>> To be clear: I wasn't sure why you thought I falsely count entries with
>> dropped texts within entry_dealloc().
>
> In the existing^H^H^Hprevious code, dropped-text entries would essentially
> act as length-zero summands in the average calculation, whereas I think
> we agree that they ought to be ignored; otherwise they decrease the
> computed mean and thereby increase the probability of (useless) GC cycles.

Yes, but my patch changed that, too. I suggested that first.
--
Peter Geoghegan
On Wed, Sep 23, 2015 at 1:41 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> max was set to 10000. I don't know about average query text size, but the
> command that was causing the error was a very large number of individual
> INSERT ... VALUES statements all in one command.
>
> The machine had plenty of free memory and no ulimit, so I don't see how this
> could have been anything but MaxAllocSize, unless there's some other failure
> mode in malloc I don't know about.

The patch that Tom committed does __nothing__ to actually address *how* you were able to get into this position in the first place. Sure, now you might be able to recover if you're not affected by the MaxAllocSize limitation, but: What did it take in the first place to need memory > MaxAllocSize to do a garbage collection? Whatever it was that allowed things to get that bad before a garbage collection was attempted is not even considered in the patch committed. The patch just committed is, in short, nothing more than a band-aid, and may do more harm than good due to allocating huge amounts of memory while a very contended exclusive LWLock is held.

You say you have plenty of memory and no ulimit, and I believe that you're right that there was never a conventional OOM where malloc() returns NULL. If we conservatively assume that you were only barely over the MaxAllocSize limitation before a garbage collection was first attempted, then the average query text length would have to be about 50 kilobytes, since only then would 20,000 texts (pg_stat_statements.max * 2) put us over. Does anyone actually think that the *average* SQL query on Jim's client's application was in excess of 50% the size of the ~3,000 line file pg_stat_statements.c? It beggars belief.

I'm annoyed and disappointed that the patch committed does not even begin to address the underlying problem -- it just adds an escape hatch, and fixes another theoretical issue that no one was affected by. Honestly, next time I won't bother.
--
Peter Geoghegan
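That arithmetic checks out: MaxAllocSize is 1GB minus one byte, so

    1,073,741,823 bytes / (10,000 entries * 2 texts) ~= 53.7 kB per text

i.e. the stored texts really would have to average roughly 50 kilobytes apiece before a garbage collection could hit the limit.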
Peter Geoghegan <pg@heroku.com> writes:
> I'm annoyed and disappointed that the patch committed does not even
> begin to address the underlying problem -- it just adds an escape
> hatch, and fixes another theoretical issue that no one was affected
> by. Honestly, next time I won't bother.

The problem as I see it is that what you submitted is a kluge that will have weird and unpredictable side effects. Moreover, it seems to be targeting an extremely narrow problem case, ie large numbers of queries that (a) have long query texts and (b) are distinct to the fingerprinting code and (c) fail. It seems to me that you could get into equal trouble with situations where (c) is not satisfied, and what then?

I'm certainly amenable to doing further work on this problem. But I do not think that what we had was well-enough-thought-out to risk pushing it just hours before a release deadline. Let's arrive at a more carefully considered fix in a leisurely fashion.

regards, tom lane
On 10/05/2015 11:15 AM, Tom Lane wrote:
> Peter Geoghegan <pg@heroku.com> writes:
>> I'm annoyed and disappointed that the patch committed does not even
>> begin to address the underlying problem -- it just adds an escape
>> hatch, and fixes another theoretical issue that no one was affected
>> by. Honestly, next time I won't bother.
>
> The problem as I see it is that what you submitted is a kluge that will
> have weird and unpredictable side effects. Moreover, it seems to be
> targeting an extremely narrow problem case, ie large numbers of queries
> that (a) have long query texts and (b) are distinct to the fingerprinting
> code and (c) fail. It seems to me that you could get into equal trouble
> with situations where (c) is not satisfied, and what then?
>
> I'm certainly amenable to doing further work on this problem. But I do
> not think that what we had was well-enough-thought-out to risk pushing
> it just hours before a release deadline. Let's arrive at a more
> carefully considered fix in a leisurely fashion.

FWIW, (a) and (b) but not (c) is probably the right description for my client who has been seeing problems here.

cheers

andrew
Andrew Dunstan wrote:
> On 10/05/2015 11:15 AM, Tom Lane wrote:
>> The problem as I see it is that what you submitted is a kluge that will
>> have weird and unpredictable side effects. Moreover, it seems to be
>> targeting an extremely narrow problem case, ie large numbers of queries
>> that (a) have long query texts and (b) are distinct to the fingerprinting
>> code and (c) fail. It seems to me that you could get into equal trouble
>> with situations where (c) is not satisfied, and what then?
>
> FWIW, (a) and (b) but not (c) is probably the right description for my
> client who has been seeing problems here.

I think the fact that long IN lists are fingerprinted differently according to the number of elements in the list makes the scenario rather likely -- not particularly narrow.
--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Andrew Dunstan wrote:
>> FWIW, (a) and (b) but not (c) is probably the right description for my
>> client who has been seeing problems here.

> I think the fact that long IN lists are fingerprinted differently
> according to the number of elements in the list makes the scenario
> rather likely -- not particularly narrow.

That's certainly something worth looking at, but I think it's probably more complicated than that. If you just write "WHERE x IN (1,2,3,4)", that gets folded to a ScalarArrayOp with a single array constant, which the existing code would deal with just fine. We need to identify situations where that's not the case but yet we shouldn't distinguish.

In any case, that's just a marginal tweak for one class of query. I suspect the right fix for the core problem is the one Peter mentioned in passing earlier, namely make it possible to do garbage collection without having to slurp the entire file into memory at once. It'd be slower, without a doubt, but we could continue to use the existing code path unless the file gets really large.

regards, tom lane
On Mon, Oct 5, 2015 at 8:50 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> That's certainly something worth looking at, but I think it's probably
> more complicated than that. If you just write "WHERE x IN (1,2,3,4)",
> that gets folded to a ScalarArrayOp with a single array constant, which
> the existing code would deal with just fine. We need to identify
> situations where that's not the case but yet we shouldn't distinguish.

I have a certain amount of sympathy for doing that kind of thing, but significantly less than I have for cases with many failed queries, which is why I focused on that.

> In any case, that's just a marginal tweak for one class of query.

Sometimes narrow cases are also important and representative cases. I don't care if someone has thrashing-type issues with pg_stat_statements when they're doing something really odd that calls into question the purpose of using it to begin with [1]. The two classes of queries we talked about (1. many aborted data integration transactions, and 2. a variable number of constants) are interesting because a reasonable person could have those cases, and run into trouble with pg_stat_statements as a consequence.

> I suspect the right fix for the core problem is the one Peter mentioned
> in passing earlier, namely make it possible to do garbage collection
> without having to slurp the entire file into memory at once. It'd be
> slower, without a doubt, but we could continue to use the existing code
> path unless the file gets really large.

While it would be nice to not lose query texts on OOM, that's not my primary concern. My primary concern is infrequent garbage collection. This fix certainly isn't going to help with the fact that garbage collection can be stalled for far too long in at least quasi-reasonable cases.

[1] http://www.postgresql.org/message-id/52E9D98A.4000007@lab.ntt.co.jp
--
Peter Geoghegan
On 10/5/15 10:50 AM, Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> Andrew Dunstan wrote:
>>> FWIW, (a) and (b) but not (c) is probably the right description for my
>>> client who has been seeing problems here.
>
>> I think the fact that long IN lists are fingerprinted differently
>> according to the number of elements in the list makes the scenario
>> rather likely -- not particularly narrow.
>
> That's certainly something worth looking at, but I think it's probably
> more complicated than that. If you just write "WHERE x IN (1,2,3,4)",
> that gets folded to a ScalarArrayOp with a single array constant, which
> the existing code would deal with just fine. We need to identify
> situations where that's not the case but yet we shouldn't distinguish.
>
> In any case, that's just a marginal tweak for one class of query.
> I suspect the right fix for the core problem is the one Peter mentioned
> in passing earlier, namely make it possible to do garbage collection
> without having to slurp the entire file into memory at once.

To address what Peter raised up-thread: according to my client, the process that was doing this was generating 10,000 inserts per transaction and sending them all as a single statement. They tried cutting it to 1,000 inserts and it still had the problem. Each overall command string could have been megabytes in size. Perhaps it's not worth supporting that, but if that's the decision then there needs to at least be better error reporting around this.

I'll try to test Tom's patch next week to see what effect it has on this.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
On 10/4/15 6:10 PM, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Sorry, I'm a bit late to this party. Does what you have committed mean
>> people are less likely to see "Out of Memory" coming from
>> pg_stat_statements? If not, what can be done about them short of a
>> restart? And what bad effects follow from an event generating them?
>
> The main thing we've done that will alleviate that is increase the size of
> query text file that the garbage-collection routine can cope with from
> MaxAllocSize (1GB) to MaxAllocHugeSize (at least 2GB, lots more on 64-bit
> machines, though on 32-bit you probably can't get to 2GB anyway ...).

FWIW, I've verified on $CLIENT's system that this works as Tom described. The truncation happened somewhere a bit north of 3GB, which seems odd as this is a 64-bit system. But at least there were no OOM errors.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com