Thread: Query with large in clauses uses a lot of memory

Query with large in clauses uses a lot of memory

From
greigwise
Date:
I had an issue today where the OOM killer terminated one of my postgres
processes.
On my server I have 8 GB of RAM, shared_buffers is 1 GB and work_mem is
24 MB.
I have connection pooling which limits us to 25 connections.  Even if I'm
maxed out there, that's still only about 1.6 GB of my 8 GB (1 GB of
shared_buffers plus 25 x 24 MB of work_mem), which seems like it shouldn't
be a problem.

Looking through my postgres logs, I noticed that right about the time of the
OOM incident, I had some queries running with pretty massive in clauses
(thank you ruby/ActiveRecord).  One of the queries was about 28MB in size.

So, I decided to try an experiment.  I wrote 2 queries as follows:
1 ) select pg_sleep(100) ;
2 ) with q (s1, s2) as (select pg_sleep(100), 1)
        select * from q where s2 in ( 1, <about 28 MB worth of comma
delimited numbers>)

I ran those queries via psql and did this:

-sh-4.1$ ps aux | grep -i -E "local|COMMAND" | grep -v ruby
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
postgres 20896 27.0 28.2 3416812 2132112 ?     Ss   21:18   0:02 postgres: hireology hireology [local] SELECT
postgres 20899  0.0  0.0 1281368 4800 ?        Ss   21:18   0:00 postgres: hireology hireology [local] SELECT

It looks to me like the connection running the big query is using about 2GB
more memory than the other one.  I could see why it might use *some* more
(like 28MB more?), but 2GB more seems excessive.

So, the question is: why does it use so much more memory?  And is there
anything I can do to limit this problem other than fixing the silly queries?

Thanks in advance for any help,
Greig Wise





Re: Query with large in clauses uses a lot of memory

From
Tom Lane
Date:
greigwise <greigwise@comcast.net> writes:
> So, I decided to try an experiment.  I wrote 2 queries as follows:
> 1 ) select pg_sleep(100) ;
> 2 ) with q (s1, s2) as (select pg_sleep(100), 1)
>         select * from q where s2 in ( 1, <about 28 MB worth of comma
> delimited numbers>)
>
> It looks to me like the connection running the big query is using about 2GB
> more memory than the other one.  I could see why it might use *some* more
> (like 28MB more?), but 2GB more seems excessive.

Don't hold your breath waiting for that to get better.  Depending on what
I assume about the widths of your numbers, you've got something like 3
million Const parse nodes in that query, so the system is eating something
like 600-700 bytes per Const (roughly your observed 2 GB spread over 3
million nodes).  Since one Const node plus List overhead is probably 100
bytes on a 64-bit server, that amounts to only about half a dozen copies
of each node.  OK, it's not exactly frugal perhaps, but it would not be
hard to get to that many copies if you're running the query in a way that
requires keeping a plancache entry for it.  It would take significant work
(and probably some performance sacrifices) to make much of a dent in the
space consumption, and even if we put in the work, I'd only expect to be
able to dent it a bit --- an order-of-magnitude reduction is not in the
cards.  Queries with that many parse elements in them are just not cheap.

Now, that WHERE condition will eventually get folded to the form

    s2 = ANY ('{1,2,...}'::integer[])

and that constant array is a *lot* less space-wasteful, only 4 bytes
per element (or 8 bytes if we're talking bigints).  So the approach I'd
advise is trying to send the query with a constant array to begin with
--- either write it like that, or like

    s2 = ANY ($1::integer[])

and send the array as an out-of-line parameter.
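
For illustration, here's a minimal sketch of that shape using
PREPARE/EXECUTE in psql (the table and column names are hypothetical):

    -- hypothetical table standing in for the real one
    create table t (s2 integer);

    -- the whole IN-list becomes a single array parameter
    prepare big_in (integer[]) as
        select * from t where s2 = any ($1);

    -- one array constant, ~4 bytes per element, instead of millions of
    -- individual Const parse nodes
    execute big_in ('{1,2,3,4,5}');

A driver that binds the array as a real out-of-line parameter over the
extended protocol gets the same effect without the explicit PREPARE.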

Don't know how hard it might be to arm-wrestle ActiveRecord into doing
it like that :-(

            regards, tom lane


Re: Query with large in clauses uses a lot of memory

From
greigwise
Date:
Wow.  Thanks for the prompt answer.

As a follow-up, I was wondering whether there might be a way to tell it
NOT to try to plan/execute the query (and instead throw an error) if the
memory usage exceeded some threshold X.

Thanks again.

Greig


