Thread: Query with large in clauses uses a lot of memory
I had an issue today where the OOM killer terminated one of my postgres processes. On my server I have 8 GB of RAM, shared_buffers is 1 GB, and work_mem is 24 MB. I have connection pooling which limits us to 25 connections. Even if I'm maxed out there, I'm still only using about 1.6 GB of my 8 GB, which seems like it shouldn't be a problem.

Looking through my postgres logs, I noticed that right about the time of the OOM incident, I had some queries running with pretty massive IN clauses (thank you ruby/ActiveRecord). One of the queries was about 28 MB in size. So, I decided to try an experiment. I wrote 2 queries as follows:

1) select pg_sleep(100);

2) with q (s1, s2) as (select pg_sleep(100), 1)
   select * from q where s2 in ( 1, <about 28 MB worth of comma delimited numbers>)

I ran those queries via psql and did this:

    -sh-4.1$ ps aux | grep -i -E "local|COMMAND" | grep -v ruby
    USER       PID %CPU %MEM     VSZ     RSS TTY STAT START TIME COMMAND
    postgres 20896 27.0 28.2 3416812 2132112 ?   Ss   21:18 0:02 postgres: hireology hireology [local] SELECT
    postgres 20899  0.0  0.0 1281368    4800 ?   Ss   21:18 0:00 postgres: hireology hireology [local] SELECT

It looks to me like the connection running the big query is using about 2 GB more memory than the other one. I could see why it might use *some* more (like 28 MB more?), but 2 GB more seems excessive.

So, the question is: why does it use so much more memory? And is there anything I can do to limit this problem other than fixing the silly queries?

Thanks in advance for any help,
Greig Wise
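For reference, a test file along the lines of query 2 can be generated directly from psql. This is only a sketch: the file path and the 3-million-number count are placeholders of my own (not figures from the thread), chosen so the IN list comes out at a couple dozen megabytes, in the same ballpark as the 28 MB query above.

    -- Build a big-IN-list query file; \t and \pset suppress headers so
    -- only the generated query text lands in the file.
    \t on
    \pset format unaligned
    \o /tmp/big_in_query.sql
    SELECT 'with q (s1, s2) as (select pg_sleep(100), 1) '
        || 'select * from q where s2 in ('
        || string_agg(i::text, ',')
        || ');'
    FROM generate_series(1, 3000000) AS i;
    \o
    \t off

    -- Run the generated query in one session, then inspect memory from a
    -- second shell with the same ps command as above.
    \i /tmp/big_in_query.sql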
greigwise <greigwise@comcast.net> writes:
> So, I decided to try an experiment. I wrote 2 queries as follows:
> 1) select pg_sleep(100);
> 2) with q (s1, s2) as (select pg_sleep(100), 1)
>    select * from q where s2 in ( 1, <about 28 MB worth of comma
>    delimited numbers>)
>
> It looks to me like the connection running the big query is using about
> 2 GB more memory than the other one. I could see why it might use *some*
> more (like 28 MB more?), but 2 GB more seems excessive.

Don't hold your breath waiting for that to get better. Depending on what I assume about the widths of your numbers, you've got something like 3 million Const parse nodes in that query, so the system is eating something like 600-700 bytes per Const --- which works out to not all that many copies, given that one Const node plus List overhead is probably 100 bytes on a 64-bit server. OK, it's not exactly frugal perhaps, but it would not be hard to get to that at all if you're running the query in a way that requires keeping a plancache entry for it.

It would take significant work (and probably some performance sacrifices) to make much of a dent in the space consumption, and even if we put in the work, I'd only expect to be able to dent it a bit --- an order-of-magnitude reduction is not in the cards. Queries with that many parse elements in them are just not cheap.

Now, that WHERE condition will eventually get folded to the form

    s2 = ANY ('{1,2,...}'::integer[])

and that constant array is a *lot* less space-wasteful, only 4 bytes per element (or 8 bytes if we're talking bigints). So the approach I'd advise is trying to send the query with a constant array to begin with --- either write it like that, or like

    s2 = ANY ($1::integer[])

and send the array as an out-of-line parameter. Don't know how hard it might be to arm-wrestle ActiveRecord into doing it like that :-(

			regards, tom lane
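To make the last suggestion concrete, here is a minimal sketch of the array-parameter form using PREPARE/EXECUTE. The table and statement names (widgets, widget_lookup, id) are placeholders of my own, not anything from the thread; a driver or ORM would normally bind the array parameter for you rather than you typing PREPARE by hand.

    -- "widgets"/"id" are made-up names for illustration.
    -- One array parameter instead of millions of individual Consts:
    PREPARE widget_lookup(integer[]) AS
        SELECT * FROM widgets WHERE id = ANY ($1);

    -- The whole list travels as a single array value:
    EXECUTE widget_lookup('{1,2,3,4,5}');

    DEALLOCATE widget_lookup;

The parse tree then carries a single array value rather than one Const node per number, which is the source of the savings Tom describes.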
Wow. Thanks for the prompt answer.

As a follow-up, I was wondering if maybe there would be a way to tell it to NOT try to plan/execute the query (and instead throw an error) if the memory usage exceeded X.

Thanks again.
Greig