Thread: swap storm created by 8.2.3

swap storm created by 8.2.3

From
Joseph Shraibman
Date:
I'm running:

PostgreSQL 8.2.3 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC)
3.4.6 20060404 (Red Hat 3.4.6-3)

My memory settings are:

work_mem = 64MB
shared_buffers = 128MB
temp_buffers = 32MB

I ran a query that was "SELECT field, count(*) INTO TEMP temptable" and
it grew to be 10gig (as reported by top) and brought the whole machine
to its knees.  How do I keep this from happening again?

Re: swap storm created by 8.2.3

From
Ron Johnson
Date:

On 05/22/07 12:17, Joseph Shraibman wrote:
> I'm running:
>
> PostgreSQL 8.2.3 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC)
> 3.4.6 20060404 (Red Hat 3.4.6-3)
>
> My memory settings are:
>
> work_mem = 64MB
> shared_buffers = 128MB
> temp_buffers = 32MB
>
> I ran a query that was "SELECT field, count(*) INTO TEMP temptable" and
> it grew to be 10gig (as reported by top) and brought the whole machine
> to its knees.  How do I keep this from happening again?

It looks to me as if TEMP tables are in-memory structures, not
file-backed objects like "regular" tables.

--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!

Re: swap storm created by 8.2.3

From
Richard Huxton
Date:
Joseph Shraibman wrote:
> I'm running:
>
> PostgreSQL 8.2.3 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC)
> 3.4.6 20060404 (Red Hat 3.4.6-3)
>
> My memory settings are:
>
> work_mem = 64MB
> shared_buffers = 128MB
> temp_buffers = 32MB
>
> I ran a query that was "SELECT field, count(*) INTO TEMP temptable" and
> it grew to be 10gig (as reported by top)

What was the real query?
How many rows are we talking about?

> and brought the whole machine
> to its knees.  How do I keep this from happening again?

Set your per-user limits (man ulimit or man bash) to restrict PG's
overall memory consumption.
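
For example, something along these lines in whatever environment the
postmaster is started from (only a sketch: the location and the 2GB figure
are illustrative, pick whatever suits your RAM):

   # e.g. in the postgres account's shell profile, or the script that
   # starts the postmaster; the limit is inherited by all the backends
   ulimit -v 2097152    # address-space cap, in kbytes (~2GB here)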

--
   Richard Huxton
   Archonet Ltd

Re: swap storm created by 8.2.3

From
Joseph Shraibman
Date:

Richard Huxton wrote:
> Joseph Shraibman wrote:
>> I'm running:
>>
>> PostgreSQL 8.2.3 on x86_64-unknown-linux-gnu, compiled by GCC gcc
>> (GCC) 3.4.6 20060404 (Red Hat 3.4.6-3)
>>
>> My memory settings are:
>>
>> work_mem = 64MB
>> shared_buffers = 128MB
>> temp_buffers = 32MB
>>
>> I ran a query that was "SELECT field, count(*) INTO TEMP temptable"
>> and it grew to be 10gig (as reported by top)
>
> What was the real query?

First I selected 90634 rows (3 ints) into the first temp table, then I
did "select intfield1, count(intfield2) FROM realtable rt WHERE rt.id =
temptable.id and other conditions on rt here GROUP BY intfield1".  The
size of the second temp table should have been no more than 60000 rows.

> How many rows are we talking about?
>
>> and brought the whole machine
>> to its knees.  How do I keep this from happening again?
>
> Set your per-user limits (man ulimit or man bash) to restrict PG's
> overall memory consumption.
>
What happens when PG hits that limit?  Will it start using disk space
for TEMP tables then?

Re: swap storm created by 8.2.3

From
Richard Huxton
Date:
Joseph Shraibman wrote:
>>> I ran a query that was "SELECT field, count(*) INTO TEMP temptable"
>>> and it grew to be 10gig (as reported by top)
>>
>> What was the real query?
>
> First I selected 90634 rows (3 ints) into the first temp table, then I
> did "select intfield1, count(intfield2) FROM realtable rt WHERE rt.id =
> temptable.id and other conditions on rt here GROUP BY intfield1".  The
> size of the second temp table should have been no more than 60000 rows.

I can't see where the behaviour you're seeing is coming from.

I remember hearing of cases where PG has picked a plan that ends up
requiring massively more RAM than it anticipated. Having committed
itself to that plan, the executor then follows it through regardless of
wrong initial assumptions. IIRC one example was building a hash where it
was expecting 10 distinct values and got 10,000 instead (or something of
that sort).

Can you reproduce this?

Can you post the explain for the query?

>> How many rows are we talking about?
>>
>>> and brought the whole machine
>>> to its knees.  How do I keep this from happening again?
>>
>> Set your per-user limits (man ulimit or man bash) to restrict PG's
>> overall memory consumption.
>>
> What happens when PG hits that limit?  Will it start using disk space
> for TEMP tables then?

I'd expect PG to just fail (in that one backend) - it's unlikely there's
anything useful it can do if it's refused memory when it needs it. The
limit is there to stop a runaway query from overpowering your machine,
not to tune performance.

If your memory settings in postgresql.conf are reasonable (and they look
fine), this shouldn't happen. Let's see if an EXPLAIN sheds any light.

--
   Richard Huxton
   Archonet Ltd

Re: swap storm created by 8.2.3

From
Joseph Shraibman
Date:

Richard Huxton wrote:
> Joseph Shraibman wrote:
>>>> I ran a query that was "SELECT field, count(*) INTO TEMP temptable"
>>>> and it grew to be 10gig (as reported by top)
>>>
>>> What was the real query?
>>
>> First I selected 90634 rows (3 ints) into the first temp table, then I
>> did "select intfield1, count(intfield2) FROM realtable rt WHERE rt.id
>> = temptable.id and other conditions on rt here GROUP BY intfield1".
>> The size of the second temp table should have been no more than 60000
>> rows.
>
<SNIP>
>
> If your memory settings in postgresql.conf are reasonable (and they look
> fine), this shouldn't happen. Let's see if an EXPLAIN sheds any light.
>
=> explain SELECT ml.uid, count(ml.jid) AS cnt INTO TEMP tempml FROM  ml
WHERE ml.jid = tempjr1180108653561.id  AND ml.status IN(2,5,20) GROUP BY
ml.uid;
NOTICE:  adding missing FROM-clause entry for table "tempjr1180108653561"
LINE 2: ...INTO TEMP tempml FROM ml WHERE ml.jid = tempjr1180...
                                                              ^
                                         QUERY PLAN
------------------------------------------------------------------------------------------
  HashAggregate  (cost=11960837.72..11967601.06 rows=541067 width=8)
    ->  Hash Join  (cost=9675074.94..11849780.55 rows=22211434 width=8)
          Hash Cond: (tempjr1180108653561.id = ml.jid)
          ->  Seq Scan on tempjr1180108653561  (cost=0.00..31.40 rows=2140 width=4)
          ->  Hash  (cost=6511767.18..6511767.18 rows=181979021 width=8)
                ->  Seq Scan on ml  (cost=0.00..6511767.18 rows=181979021 width=8)
                      Filter: (status = ANY ('{2,5,20}'::integer[]))
(7 rows)

Re: swap storm created by 8.2.3

From
Richard Huxton
Date:
Joseph Shraibman wrote:
>
>
> Richard Huxton wrote:
>> Joseph Shraibman wrote:
>>>>> I ran a query that was "SELECT field, count(*) INTO TEMP temptable"
>>>>> and it grew to be 10gig (as reported by top)
>>>>
>>>> What was the real query?
>>>
>>> First I selected 90634 rows (3 ints) into the first temp table, then
>>> I did "select intfield1, count(intfield2) FROM realtable rt WHERE
>>> rt.id = temptable.id and other conditions on rt here GROUP BY
>>> intfield1".  The size of the second temp table should have been no
>>> more than 60000 rows.
>>
> <SNIP>
>>
>> If your memory settings in postgresql.conf are reasonable (and they
>> look fine), this shouldn't happen. Let's see if an EXPLAIN sheds any
>> light.
>>
> => explain SELECT ml.uid, count(ml.jid) AS cnt INTO TEMP tempml FROM  ml
> WHERE ml.jid = tempjr1180108653561.id  AND ml.status IN(2,5,20) GROUP BY
> ml.uid;

> NOTICE:  adding missing FROM-clause entry for table "tempjr1180108653561"

I'm guessing this is just a typo from your test and you'd normally
include the temp table in the FROM clause.

> LINE 2: ...INTO TEMP tempml FROM ml WHERE ml.jid = tempjr1180...
>                                                              ^
>                                         QUERY PLAN
> ------------------------------------------------------------------------------------------
>
>  HashAggregate  (cost=11960837.72..11967601.06 rows=541067 width=8)
>    ->  Hash Join  (cost=9675074.94..11849780.55 rows=22211434 width=8)

Here you seem to have 22 million rows estimated for your join.

>          Hash Cond: (tempjr1180108653561.id = ml.jid)
>          ->  Seq Scan on tempjr1180108653561  (cost=0.00..31.40 rows=2140 width=4)

Is the 2140 rows here a good estimate?

>          ->  Hash  (cost=6511767.18..6511767.18 rows=181979021 width=8)
>                ->  Seq Scan on ml  (cost=0.00..6511767.18 rows=181979021 width=8)

OK, so the estimate of 22 million matches is because "ml" has 181 million
rows. Is that figure about right too?

>                      Filter: (status = ANY ('{2,5,20}'::integer[]))

Overall it's estimating about 9 times the number of rows you were
expecting (541000 vs 60000). Not enough to account for your extreme
memory usage.

Let's see if that hash-join is really the culprit. Can you run EXPLAIN
and then EXPLAIN ANALYSE on the query, but first issue:

SET enable_hashjoin=off;

If that makes little difference, try the same with enable_hashagg.
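
For example, all in the same session that built the temp table (a sketch
only: I've added the temp table to the FROM list and left out the INTO so
nothing extra gets created, and since EXPLAIN ANALYSE actually runs the
query you'll want the ulimit in place first):

   SET enable_hashjoin = off;
   EXPLAIN ANALYSE
   SELECT ml.uid, count(ml.jid) AS cnt
     FROM ml, tempjr1180108653561
    WHERE ml.jid = tempjr1180108653561.id
      AND ml.status IN (2, 5, 20)
    GROUP BY ml.uid;
   RESET enable_hashjoin;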

--
   Richard Huxton
   Archonet Ltd

Re: swap storm created by 8.2.3

From
Tom Lane
Date:
Richard Huxton <dev@archonet.com> writes:
> Let's see if that hash-join is really the culprit. Can you run EXPLAIN
> and then EXPLAIN ANALYSE on the query, but first issue:
> SET enable_hashjoin=off;
> If that makes little difference, try the same with enable_hashagg.

It seems like it must be the hashagg step --- hashjoin spills to disk in
an orderly fashion when it reaches work_mem, but hashagg doesn't (yet).
However, if we know that there're only going to be 60K hashagg entries,
how could the memory get blown out by that?  Perhaps there's a memory
leak here somewhere.

Please restart your postmaster under a reasonable ulimit setting, so
that it will get ENOMEM rather than going into swap hell, and then try
the query again.  When it runs up against the ulimit it will give an
"out of memory" error and dump some per-context memory usage info into
the postmaster log.  That info is what we need to see.

            regards, tom lane

Re: swap storm created by 8.2.3

From
Joseph S
Date:
I set the ulimit for data to 4194304k:

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) 4194304
file size               (blocks, -f) unlimited
pending signals                 (-i) 1024
max locked memory       (kbytes, -l) 32
max memory size         (kbytes, -m) unlimited
open files                      (-n) 8192
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 32255
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

but after running for a while the process still grew past 10 gig and
created a swap storm (as reported by top):

6266 postgres  15   0 11.2g 3.6g  732 D    9 91.7  21:29.01 postgres: user2 owl 199.107.233.201(45564) EXPLAIN

... and I had to kill -INT it to keep the system responsive.

Tom Lane wrote:
> Richard Huxton <dev@archonet.com> writes:
>> Let's see if that hash-join is really the culprit. Can you run EXPLAIN
>> and then EXPLAIN ANALYSE on the query, but first issue:
>> SET enable_hashjoin=off;
>> If that makes little difference, try the same with enable_hashagg.
>
> It seems like it must be the hashagg step --- hashjoin spills to disk in
> an orderly fashion when it reaches work_mem, but hashagg doesn't (yet).
> However, if we know that there're only going to be 60K hashagg entries,
> how could the memory get blown out by that?  Perhaps there's a memory
> leak here somewhere.
>
> Please restart your postmaster under a reasonable ulimit setting, so
> that it will get ENOMEM rather than going into swap hell, and then try
> the query again.  When it runs up against the ulimit it will give an
> "out of memory" error and dump some per-context memory usage info into
> the postmaster log.  That info is what we need to see.
>
>             regards, tom lane

Re: swap storm created by 8.2.3

From
Tom Lane
Date:
Joseph S <jks@selectacast.net> writes:
> I set the ulimit for data to 4194304k:

I've noticed that ulimit -d doesn't seem to do what you might expect
on Linuxen.  Try -m and/or -v.
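
For instance, something like this before starting the postmaster (the 4GB
figure just mirrors what you used for -d; as far as I can tell -d only
caps brk()-style allocations, and glibc malloc grabs large blocks with
mmap(), which is why your data-seg limit never bit):

   pg_ctl -D /path/to/data stop
   ulimit -v 4194304     # address-space limit, in kbytes; this is what big malloc()s hit
   ulimit -m 4194304     # RSS limit; largely ignored by current Linux kernels
   pg_ctl -D /path/to/data -l postmaster.log start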

            regards, tom lane