Re: postgres crashes on insert in 40 different threads - Mailing list pgsql-admin

From Stéphane Schildknecht
Subject Re: postgres crashes on insert in 40 different threads
Date
Msg-id 52121C56.3020101@postgresql.fr
Whole thread Raw
In response to Re: postgres crashes on insert in 40 different threads  (Dzmitry <dzmitry.nikitsin@gmail.com>)
Responses Re: postgres crashes on insert in 40 different threads
List pgsql-admin
Le 19/08/2013 15:06, Dzmitry a écrit :
> Thank you guys ! Found in logs(db-slave1 is my replica  that use streaming
> replication):
>
> Aug 18 15:49:38 db-slave1 kernel: [25094456.525703] postgres invoked
> oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
> Aug 18 15:49:38 db-slave1 kernel: [25094456.525708] postgres cpuset=/
> mems_allowed=0
> Aug 18 15:49:38 db-slave1 kernel: [25094456.525712] Pid: 26418, comm:
> postgres Not tainted 3.2.0-40-virtual #64-Ubuntu
>
>
> Aug 18 15:49:48 db-slave1 kernel: [25094456.621937] Out of memory: Kill
> process 2414 (postgres) score 417 or sacrifice child
> Aug 18 15:49:48 db-slave1 kernel: [25094456.621949] Killed process 2414
> (postgres) total-vm:13057464kB, anon-rss:28560kB, file-rss:12773376kB
>
>
>
> Aug 19 03:18:00 db-slave1 kernel: [25135758.540539] postgres invoked
> oom-killer: gfp_mask=0x280da, order=0, oom_adj=0, oom_score_adj=0
> Aug 19 03:18:01 db-slave1 kernel: [25135758.540544] postgres cpuset=/
> mems_allowed=0
> Aug 19 03:18:07 db-slave1 kernel: [25135758.540548] Pid: 23994, comm:
> postgres Not tainted 3.2.0-40-virtual #64-Ubuntu
>
> Aug 19 03:18:07 db-slave1 kernel: [25135758.626405] Out of memory: Kill
> process 28354 (postgres) score 348 or sacrifice child
>
> Aug 19 03:18:07 db-slave1 kernel: [25135758.626418] Killed process 28354
> (postgres) total-vm:13021248kB, anon-rss:8704kB, file-rss:10686512kB
> Aug 19 03:18:07 db-slave1 kernel: [25135763.068736] postgres invoked
> oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
> Aug 19 03:18:07 db-slave1 kernel: [25135763.068740] postgres cpuset=/
> mems_allowed=0
> Aug 19 03:18:07 db-slave1 kernel: [25135763.068743] Pid: 6780, comm:
> postgres Not tainted 3.2.0-40-virtual #64-Ubuntu
>
> Aug 19 03:18:07 db-slave1 kernel: [25135763.150285] Out of memory: Kill
> process 20322 (postgres) score 301 or sacrifice child
> Aug 19 03:18:07 db-slave1 kernel: [25135763.150297] Killed process 20322
> (postgres) total-vm:13058892kB, anon-rss:47172kB, file-rss:9186604kB
>
> So I will do as Stéphane advised - make shared buffers 6GB. Do you know if
> I need to do anything else - increase shared memory(SHMMAX/SHMMIN)
> parameters ?
>
>
> Right now I have
> Shhmax - 13223870464
> Shmall - 4194304
>
>
>
> Thanks,
>    Dzmitry
>
>
>
>
>
> On 8/19/13 1:05 PM, "Albe Laurenz" <laurenz.albe@wien.gv.at> wrote:
>
>>>> Dzmitry wrote:
>>>>> On 8/19/13 11:36 AM, "Stéphane Schildknecht"
>>>>> <stephane.schildknecht@postgresql.fr> wrote:
>>>>>> Le 19/08/2013 10:07, Dzmitry a écrit :
>>>>>>>   I have postgres server running on ubuntu 12,Intel Xeon 8 CPUs 29 GB
>>>>>>> RAM.
>>>>>>> With following settings:
>>>>>>> max_connections = 550
>>>>>>> shared_buffers = 12GB
>>>>>>> temp_buffers = 8MB
>>>>>>> max_prepared_transactions = 0
>>>>>>> work_mem = 50MB
>>>>>>> maintenance_work_mem = 1GB
>>>>>>> fsync = on
>>>>>>> wal_buffers = 16MB
>>>>>>> commit_delay = 50
>>>>>>> commit_siblings = 7
>>>>>>> checkpoint_segments = 32
>>>>>>> checkpoint_completion_target = 0.9
>>>>>>> effective_cache_size = 22GB
>>>>>>> autovacuum = on
>>>>>>> autovacuum_vacuum_threshold = 1800
>>>>>>> autovacuum_analyze_threshold = 900
>>>>>>>
>>>>>>> I am doing a lot of writes to DB in 40 different threads ­ so every
>>>>>>> thread
>>>>>>> check if record exists ­ if not => insert record, if exists =>
>>>>>>> update
>>>>>>> record.
>>>>>>> During this update, my disk IO almost always ­ 100% and sometimes it
>>>>>>> crash my
>>>>>>> DB with following message:
>>>>>>>
>>>>>>> 2013-08-19 03:18:00 UTC LOG:  checkpointer process (PID 28354) was
>>>>>>> terminated by signal 9: Killed
>>>> [...]
>>>>>>> My DB size is not very big ­ 169GB.
>>>>>>>
>>>>>>> Anyone know how can I get rid of DB crash  ?
>>>>>> The fact that the checkpointer was killed -9 let me think the
>>>>>> OOMKiller has
>>>>>> detected you were out of memory.
>>>>>>
>>>>>> Could that be the case?
>>>>>>
>>>>>> 12GB of shared_buffers on a 29Gb box is too high. You should try to
>>>>>> lower that
>>>>>> value to 6GB, for instance.
>>>>>> 550*50MB, that is 27GB of RAM that PostgreSQL could try to adress.
>>>>>>
>>>>>> I can imagine your system is swapping a lot, and you exhaust swap
>>>>>> memory before crash.
>>>>> I don't think it's the case. I am using newrelic for monitoring my DB
>>>>> servers(I have one master and 2 slaves - all use the same
>>>>> configuration) -
>>>>> memory is not going above 12.5GB, so I have a good reserve, also I
>>>>> don't
>>>>> see any swapping there :(
>>>> You can check by examining /var/log/messages to see if the OOM
>>>> killer is at fault, which is highly likely.
>>>>
>>>> The OOM killer uses heuristics, so it does the wrong thing
>>>> occasionally.
>>>>
>>>> The documentation is helpful:
>>>>
>>>> http://www.postgresql.org/docs/9.2/static/kernel-resources.html#LINUX-ME
>>>> MORY-OVERCOMMIT
>>> Do you mean postgres log file(in postgres.conf)
>>>
>>> log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
>>> log_min_messages = warning
>>>
>>> Or /var/log/messages ? Because I haven't this file :(
>> I meant the latter.
>> /var/log/messages is just where syslog output is directed on some
>> Linux distributions.  I don't know Ubuntu, so sorry if I got
>> it wrong.  Maybe it is /var/log/syslog on Ubuntu.
>> In case of doubt check your syslog configuration.
>>
>> Yours,
>> Laurenz Albe
>
>
>


As Laurenz said, you should have a look at documentation.

It explains how you can lower the risk OOMKiller kills your PostgreSQL processes.
1. You can set vm.overcommit_memory to 2 in sysctl.conf
2. You can adjust the value of oom_score_adj in startup script to prevent
OOMKiller to kill Postmaster
3. You can lower shared_buffers, work_mem and max_connections.

Regards,

--
Stéphane Schildknecht
Loxodata - Conseil, expertise et formations



pgsql-admin by date:

Previous
From: Dzmitry
Date:
Subject: Re: postgres crashes on insert in 40 different threads
Next
From: Scott Ribe
Date:
Subject: Re: postgres crashes on insert in 40 different threads