Re: slave restarts with kill -9 coming from somewhere, or nowhere - Mailing list pgsql-admin

From Bert
Subject Re: slave restarts with kill -9 coming from somewhere, or nowhere
Date
Msg-id CAFCtE1movdoU5fBnfoCi4gZpMsFr0v0+gsC_fA0_4gOyKk9oPg@mail.gmail.com
In response to Re: slave restarts with kill -9 coming from somewhere, or nowhere  (Bert <biertie@gmail.com>)
Responses Re: slave restarts with kill -9 coming from somewhere, or nowhere  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-admin
Hi,

This is strange: a single connection almost killed the server, so it was not a combination of many connections. I saw one connection grow to over 100GB, and I cancelled it before the OOM killer became active again.

These are my memory settings:
shared_buffers = 20GB 
temp_buffers = 1GB
max_prepared_transactions = 10
work_mem = 4GB
maintenance_work_mem = 1GB
max_stack_depth = 8MB
wal_buffers = 32MB
effective_cache_size = 88GB

The server has 128GB of RAM.

How is it possible that one connection (query) uses all the RAM? And how can I avoid it?

PS: the database is a data warehouse (DWH). I don't need many connections, but I want to process a lot of data fast.
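
One possible contributor, offered as a sketch rather than a diagnosis: work_mem is a limit per sort or hash operation, not per query or per connection, so a plan with many such operations can use a multiple of it. The Python below does the arithmetic for the settings listed above. The function names and the simple additive model are assumptions made for illustration; real usage also depends on the plan, the data, and memory that work_mem does not govern.

```python
# Rough worst-case memory estimate for one backend, using the settings
# from this message. The additive model is an illustrative assumption.
SETTINGS_GB = {
    "shared_buffers": 20,
    "temp_buffers": 1,
    "work_mem": 4,
}
TOTAL_RAM_GB = 128


def worst_case_gb(work_mem_nodes, sessions=1):
    """Shared buffers plus, per session, temp_buffers and one work_mem
    allowance for every sort/hash operation in the plan."""
    per_session = (
        SETTINGS_GB["temp_buffers"]
        + SETTINGS_GB["work_mem"] * work_mem_nodes
    )
    return SETTINGS_GB["shared_buffers"] + sessions * per_session


def nodes_to_exceed_ram():
    """Smallest number of concurrent sort/hash operations in a single
    query whose worst case exceeds the machine's RAM."""
    nodes = 0
    while worst_case_gb(nodes) <= TOTAL_RAM_GB:
        nodes += 1
    return nodes


if __name__ == "__main__":
    print("10 sort/hash nodes:", worst_case_gb(10), "GB")
    print("nodes needed to exceed RAM:", nodes_to_exceed_ram())
```

Under this model, lowering work_mem is the most direct lever: with work_mem = 4GB, a few dozen sort or hash operations in one query are enough to exceed 128GB.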

cheers,
Bert




On Wed, Apr 3, 2013 at 10:10 AM, Bert <biertie@gmail.com> wrote:
Hi all,

I have set vm.overcommit_memory to 1.

It's pretty much a dedicated machine anyway, except for some postgres maintenance scripts I run in python / bash from the server.
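
For reference, a small Python sketch of what the three kernel values of vm.overcommit_memory mean. The helper names are hypothetical, and the descriptions paraphrase the Linux kernel documentation. Note that mode 1 means the kernel never refuses an allocation, so by itself it does not keep the OOM killer away; the PostgreSQL documentation suggests mode 2 for dedicated database servers.

```python
import os

# Meanings of vm.overcommit_memory, paraphrased from the kernel docs.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (kernel default)",
    1: "always overcommit, never refuse an allocation",
    2: "strict accounting, refuse allocations beyond the commit limit",
}


def describe_overcommit(raw_value):
    """Translate the raw file content (e.g. '1\n') into a description."""
    mode = int(raw_value.strip())
    return OVERCOMMIT_MODES.get(mode, "unknown mode")


def read_current_mode(path="/proc/sys/vm/overcommit_memory"):
    """Return the description for this machine, or None when the file
    does not exist (for example on a non-Linux system)."""
    if not os.path.exists(path):
        return None
    with open(path) as handle:
        return describe_overcommit(handle.read())


if __name__ == "__main__":
    print(read_current_mode())
```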

We'll see what it gives.

cheers,
Bert


On Wed, Apr 3, 2013 at 8:45 AM, Bert <biertie@gmail.com> wrote:
Hi Tom,

Thanks for the tip! It was indeed the OOM killer.

Is it wise to disable the OOM killer? Or will the server really go down without postgres doing something about it?

For now I have already lowered the shared_memory value a bit.

cheers,
Bert


On Tue, Apr 2, 2013 at 8:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Bert <biertie@gmail.com> writes:
> I'm running the latest postgres version (9.2.3), and today for the first
> time I encountered this:

> 12774 2013-04-02 18:13:10 CEST LOG:  server process (PID 28463) was
> terminated by signal 9: Killed

AFAIK there are only two possible sources of signal 9: a manual kill,
or the Linux kernel's OOM killer.  If it's the latter there should be
a concurrent entry in the kernel logfiles about this.  If you find one,
suggest reading up on how to disable OOM kills, or at least reconfigure
your system to make them less probable.
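
A minimal Python sketch of that log check, assuming the kernel log has already been read into a string (for example from dmesg output or a syslog file, whose location varies by distribution). The function name is hypothetical, and the sample text is invented to resemble typical OOM-killer lines; it is not taken from the poster's machine, and the exact wording differs between kernel versions.

```python
import re

# The OOM killer typically logs a line containing "Killed process <pid>".
KILLED_PATTERN = re.compile(r"Killed process (\d+)")


def find_oom_kills(log_text):
    """Return the PIDs named in 'Killed process <pid>' log lines."""
    return [int(match.group(1)) for match in KILLED_PATTERN.finditer(log_text)]


# Invented sample, shaped like common kernel messages (assumption).
SAMPLE_LOG = (
    "kernel: Out of memory: Kill process 28463 (postgres) score 512 "
    "or sacrifice child\n"
    "kernel: Killed process 28463 (postgres) total-vm:1048576kB\n"
)

if __name__ == "__main__":
    print(find_oom_kills(SAMPLE_LOG))
```

If a PID returned here matches the one in the PostgreSQL log line ("server process (PID ...) was terminated by signal 9"), the OOM killer is the likely source.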

                        regards, tom lane



--
Bert Desmet
0477/305361



