Thread: psql leaks memory on query cancellation
Hi,
psql (PostgreSQL) 10.3
Here are the steps to reproduce a leak:
1. connect to 10.3 server, perform the query similar to:
select 'message' || generate_series(1,1000000000);
2. monitoring psql memory usage in htop or similar tool, press ctrl+c at some point where you can clearly distinguish a psql with a big allocated buffer from psql without it.
3. see the query cancelled, but psql memory usage stays the same.
1. connect to 10.3 server, perform the query similar to:
select 'message' || generate_series(1,1000000000);
2. monitoring psql memory usage in htop or similar tool, press ctrl+c at some point where you can clearly distinguish a psql with a big allocated buffer from psql without it.
3. see the query cancelled, but psql memory usage stays the same.
This is especially painful when query you're debugging has a runaway join condition, and you understand it only after it doesn't return in seconds as you've expected.
Is it expected behavior (so I can have a look at something server returned somehow and it's kept there for me), or a plain leak?
Darafei Praliaskouski,
GIS Engineer / Juno Minsk
On 12.04.2018 13:26, Darafei "Komяpa" Praliaskouski wrote:
Hi,psql (PostgreSQL) 10.3Here are the steps to reproduce a leak:
1. connect to 10.3 server, perform the query similar to:
select 'message' || generate_series(1,1000000000);
2. monitoring psql memory usage in htop or similar tool, press ctrl+c at some point where you can clearly distinguish a psql with a big allocated buffer from psql without it.
3. see the query cancelled, but psql memory usage stays the same.This is especially painful when query you're debugging has a runaway join condition, and you understand it only after it doesn't return in seconds as you've expected.Is it expected behavior (so I can have a look at something server returned somehow and it's kept there for me), or a plain leak?Darafei Praliaskouski,GIS Engineer / Juno Minsk
It seems to be effect of glibc malloc which doesn't return deallocated memory to OS.
I attach gdb to psql and print memory usage before/after query interruption:
Arena 0:
system bytes = 989835264
in use bytes = 989811328
Total (incl. mmap):
system bytes = 1258274816
in use bytes = 1258250880
max mmap regions = 1
max mmap bytes = 268439552
^CCancel request sent
ERROR: canceling statement due to user request
postgres=# Arena 0:
system bytes = 1284907008
in use bytes = 278032
Total (incl. mmap):
system bytes = 1284907008
in use bytes = 278032
max mmap regions = 1
max mmap bytes = 536875008
As you can seen there are 1.2 Gb of mapped memory, but only 270kb of it is used.
Unfortunately it is more or less expected behavior of malloc: it is assumed that if application consume a lot of memory at some moment of time and then release it,
then most likely it will reuse it again in future. So there is not so much sense to return it to OS. And it is really not easy to do so because of fragmentation.
You can not easily unmap this 1.2 Gb of mapped space if there is live objects may be just just few bytes length allocated somewhere in this area.
The only solution is to use some memory contexts like in Postgres backends, but it requires complete rewriting of libpq. I am not sure that somebody will want to do it.
-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 12 April 2018 at 18:26, Darafei "Komяpa" Praliaskouski <me@komzpa.net> wrote: > Hi, > > psql (PostgreSQL) 10.3 > > Here are the steps to reproduce a leak: > > 1. connect to 10.3 server, perform the query similar to: > > select 'message' || generate_series(1,1000000000); > > 2. monitoring psql memory usage in htop or similar tool, press ctrl+c at > some point where you can clearly distinguish a psql with a big allocated > buffer from psql without it. > > 3. see the query cancelled, but psql memory usage stays the same. > > This is especially painful when query you're debugging has a runaway join > condition, and you understand it only after it doesn't return in seconds as > you've expected. > > Is it expected behavior (so I can have a look at something server returned > somehow and it's kept there for me), or a plain leak? This is totally normal behaviour for any C program. It's part of why systems should have swap space. But it's generally fairly harmless, as it's uncommon for apps to make huge allocations only once then stay running with low memory use thereafter. It would potentially be a leak if doing the same thing repeatedly led to repeated growth. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
> Is it expected behavior (so I can have a look at something server returned
> somehow and it's kept there for me), or a plain leak?
This is totally normal behaviour for any C program.
Thanks Konstantin and Craig for the help.
sudo apt install libjemalloc1
To mitigate the issue I've changed the allocator on my laptop to jemalloc.
For single psql run on my Ubuntu system:
For single psql run on my Ubuntu system:
sudo apt install libjemalloc1
LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1 psql
A global replacement by putting /usr/lib/x86_64-linux-gnu/libjemalloc.so.1 to /etc/ld.so.preload also had a positive effect fixing this behavior in all the applications, reducing after-boot memory footprint from 7 GB to 3.
A global replacement by putting /usr/lib/x86_64-linux-gnu/libjemalloc.so.1 to /etc/ld.so.preload also had a positive effect fixing this behavior in all the applications, reducing after-boot memory footprint from 7 GB to 3.
Darafei Praliaskouski,
GIS Engineer / Juno Minsk
Craig Ringer <craig@2ndquadrant.com> writes: > On 12 April 2018 at 18:26, Darafei "Komяpa" Praliaskouski <me@komzpa.net> wrote: >> Is it expected behavior (so I can have a look at something server returned >> somehow and it's kept there for me), or a plain leak? > This is totally normal behaviour for any C program. Well, it depends. On some platforms, malloc basically never gives back memory to the OS, but on glibc/Linux it does. What I see here, experimenting with a slightly-cut-down version of the example, is that if psql is allowed to finish the query normally then its memory usage according to "top" *does* go back down to very little at the end. But if one hits control-C partway through, it doesn't. I imagine that this indicates that control-C processing allocates some memory it doesn't free, resulting in an "island" up at the end of memory that prevents glibc from releasing all the free memory it's got. Whether that's an actual leak, or just memory we're holding onto in hopes of reusing it, isn't clear. (valgrind might be useful.) In short, there might be something that could be improved here, but I've not dug into it enough to say for sure. BTW, I also notice that either psql or libpq seems to take a darn long time to release a several-GB-sized query result. That might be improvable as well. regards, tom lane
> I imagine that this indicates that control-C processing allocates some > memory it doesn't free, resulting in an "island" up at the end of memory > that prevents glibc from releasing all the free memory it's got. Whether > that's an actual leak, or just memory we're holding onto in hopes of > reusing it, isn't clear. (valgrind might be useful.) malloc could request memory from kernel by two ways: sbrk() and mmap(), first one has described problem, mmap hasn't. It's described in mallopt(3) in section M_MMAP_THRESHOLD, to test that try to repeat test with MALLOC_MMAP_THRESHOLD_ environment set to 8192. -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
I wrote: > I imagine that this indicates that control-C processing allocates some > memory it doesn't free, resulting in an "island" up at the end of memory > that prevents glibc from releasing all the free memory it's got. Whether > that's an actual leak, or just memory we're holding onto in hopes of > reusing it, isn't clear. (valgrind might be useful.) So I poked into this a little, and realized that the "island" is in fact the error PGresult we get back from the server when the query is cancelled. We used to print that and throw it away ... but since the addition of \errverbose, psql just sits on it, at least till the next error. Since that's up against the end of memory, it prevents freeing all the space belonging to the huge query result we cancelled midway through collection of. The problem, therefore, is that libpq reads the error message and builds a PGresult for it before it throws away the partially-collected query result it was working on. This is dumb. Quite aside from the question of when memory can be released back to the OS, we are putting ourselves at unnecessary risk of OOM. Therefore, I propose the attached patch, which simply sees to it that we discard any partial query result at the start of error message collection not the end. This makes the behavior very much better, at least on Linux. I think this is a back-patchable bug fix; certainly so at least back to 9.6 where \errverbose was added. Versions before that do not show the persistent memory bloat the OP is complaining of, so that what we have here is arguably a performance regression. Comments? > BTW, I also notice that either psql or libpq seems to take a darn > long time to release a several-GB-sized query result. That might > be improvable as well. I looked into that too, and found that what I was seeing was that if I exit my pager with "q", psql continues to run, formatting the enormous query result and writing it to nowhere. Since we've got SIGPIPE disabled, nothing happens to stop it; although if you have the presence of mind to press control-C, you do get control back pretty quickly. I wonder whether fe_utils/print.c ought to be adjusted to stop on output errors, along the lines of - if (cancel_pressed) + if (cancel_pressed || ferror(fout)) break; However, the implications of that aren't entirely clear, so I merely suggest that as a topic for research not an immediately proposed patch. regards, tom lane diff --git a/src/interfaces/libpq/fe-protocol3.c b/src/interfaces/libpq/fe-protocol3.c index d3ca5d2..8345faf 100644 --- a/src/interfaces/libpq/fe-protocol3.c +++ b/src/interfaces/libpq/fe-protocol3.c @@ -880,6 +880,14 @@ pqGetErrorNotice3(PGconn *conn, bool isError) char id; /* + * If this is an error message, pre-emptively clear any incomplete query + * result we may have. We'd just throw it away below anyway, and + * releasing it before collecting the error might avoid out-of-memory. + */ + if (isError) + pqClearAsyncResult(conn); + + /* * Since the fields might be pretty long, we create a temporary * PQExpBuffer rather than using conn->workBuffer. workBuffer is intended * for stuff that is expected to be short. We shouldn't use @@ -943,7 +951,7 @@ pqGetErrorNotice3(PGconn *conn, bool isError) { if (res) res->errMsg = pqResultStrdup(res, workBuf.data); - pqClearAsyncResult(conn); + pqClearAsyncResult(conn); /* redundant, but be safe */ conn->result = res; if (PQExpBufferDataBroken(workBuf)) printfPQExpBuffer(&conn->errorMessage,
Therefore, I propose the attached patch, which simply sees to it that
we discard any partial query result at the start of error message
collection not the end. This makes the behavior very much better,
at least on Linux.
I have tested the build of Postgres with attached patch and confirm that I don't see this problematic behavior anymore even with default allocator. Thank you!
I think this is a back-patchable bug fix; certainly so at least back
to 9.6 where \errverbose was added. Versions before that do not show
the persistent memory bloat the OP is complaining of, so that what
we have here is arguably a performance regression. Comments?
This should not bring regression, since the memory is freed anyway, but is valuable as puts less pressure on client memory requirements for manual data inspection workflows.
Darafei Praliaskouski,
GIS Engineer / Juno Minsk
On 13 April 2018 at 05:10, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Therefore, I propose the attached patch, which simply sees to it that > we discard any partial query result at the start of error message > collection not the end. This makes the behavior very much better, > at least on Linux. > > I think this is a back-patchable bug fix; certainly so at least back > to 9.6 where \errverbose was added. Versions before that do not show > the persistent memory bloat the OP is complaining of, so that what > we have here is arguably a performance regression. Comments? That seems reasonable. It's all dependent on tons of internal details of how libc decides to lay out memory, how it gets memory from the kernel, etc. On some platforms that "island" won't affect anything anyway, and on others I expect that freeing it probably still won't release memory back to the OS. But if it helps on even one common platform and has no downsides, it's well worth it. BTW, https://stackoverflow.com/q/2215259/398670 is relevant. This makes me wonder if we could do something cheeky like mmap MAP_ANONYMOUS'ing /dev/zero ourselves, giving ourselves a separate memory arena for result sets that we can reset when we're done. We don't know which result sets will be big though, so it hardly seems worth it. > > I looked into that too, and found that what I was seeing was that > if I exit my pager with "q", psql continues to run, formatting the > enormous query result and writing it to nowhere. Huh. I'd noticed that behaviour, but it never crossed my care threshold. I think I tend to do just what you describe below habitually. > Since we've got > SIGPIPE disabled, nothing happens to stop it; although if you have > the presence of mind to press control-C, you do get control back > pretty quickly. I wonder whether fe_utils/print.c ought to be > adjusted to stop on output errors, along the lines of > > - if (cancel_pressed) > + if (cancel_pressed || ferror(fout)) > break; > > However, the implications of that aren't entirely clear, so I merely > suggest that as a topic for research not an immediately proposed patch. Definitely separate. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services