Re: Memory Allocation - Mailing list pgsql-performance

From Scott Carey
Subject Re: Memory Allocation
In response to Memory Allocation  ("Ryan Hansen" <ryan.hansen@brightbuilders.com>)

Tuning for bulk loading:

 

Make sure the Linux kernel parameters in /proc/sys/vm related to the page cache are set well.

Set swappiness to 0 or 1.

Make sure you understand and configure /proc/sys/vm/dirty_background_ratio and /proc/sys/vm/dirty_ratio well.

With enough RAM, the default on some kernel versions is way, way off (40% of RAM allowed to be dirty pages!  yuck).

http://www.westnet.com/~gsmith/content/linux-pdflush.htm

If postgres is doing a lot of caching for you, you probably want dirty_ratio at 10% or less, and you'll want the OS to start flushing to disk sooner rather than later.  A dirty_background_ratio of 3% with 24GB of RAM is 720MB -- a pretty big buffer.  I would not personally want that buffer to be larger than about 5 seconds' worth of the disk subsystem's maximum write speed.
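
For reference, the relevant sysctl settings would look something like this (the numbers here are just a sketch for a 24GB box doing bulk loads, not a recommendation for your hardware -- size them against your own RAM and disk write speed):

  # /etc/sysctl.conf -- example values only
  vm.swappiness = 1
  vm.dirty_background_ratio = 3    # ~720MB on 24GB; start background flushing early
  vm.dirty_ratio = 10              # hard ceiling before writers block
  # apply without a reboot:  sysctl -p

You can also echo values straight into /proc/sys/vm/* to experiment before making them permanent.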

 

You'll need to tune your background writer to be aggressive enough to actually get data written ahead of checkpoints so that checkpoints don't suck, and tune your checkpoint size and settings as well.  Turn on checkpoint logging on the database and run tests while watching that output.  Ideally, most of your batch writes have made it to the OS before the checkpoint, and the OS has actually started moving most of it to disk.

If your settings are wrong, you'll have the data buffered twice: most or nearly all of it will still be in memory when the checkpoint happens, and the checkpoint will take a LONG time.  The default Linux settings + default postgres settings + large shared_buffers will almost guarantee this situation for bulk loads.  The two sides have to be configured with complementary settings: if you have a large postgres buffer, the OS buffer should be small and write more aggressively; if you have a small postgres buffer, the OS can be lazier and cache much more.
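
As a rough sketch only (assuming 8.3-era settings, with numbers that are starting points to be tuned against your own checkpoint logs, not recommendations), the postgres side of that looks something like:

  # postgresql.conf -- example starting points for bulk loads
  log_checkpoints = on                  # see how long checkpoints take and how much they write
  checkpoint_segments = 64              # more WAL between checkpoints during loads
  checkpoint_timeout = 15min
  checkpoint_completion_target = 0.9    # spread checkpoint I/O out over the interval
  bgwriter_lru_maxpages = 500           # let the bgwriter push dirty buffers out ahead of time
  bgwriter_lru_multiplier = 2.0

The point is the pairing: a big shared_buffers wants a small dirty_ratio and an aggressive background writer, while a small shared_buffers can get away with lazier OS flushing.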

 

 

From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Ryan Hansen
Sent: Wednesday, November 26, 2008 2:10 PM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Memory Allocation

 

Hey all,

 

This may be more of a Linux question than a PG question, but I’m wondering if any of you have successfully allocated more than 8 GB of memory to PG before.

 

I have a fairly robust server running Ubuntu Hardy Heron, 24 GB of memory, and I’ve tried to commit half the memory to PG’s shared buffer, but it seems to fail.  I’m setting the kernel shared memory accordingly using sysctl, which seems to work fine, but when I set the shared buffer in PG and restart the service, it fails if it’s above about 8 GB.  I actually have it currently set at 6 GB.
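
(Roughly along these lines -- the exact numbers are from memory and only illustrative:

  # /etc/sysctl.conf
  kernel.shmmax = 13958643712    # max shared memory segment, bytes (13GB, enough for a 12GB shared_buffers)
  kernel.shmall = 4194304        # total shared memory, in 4kB pages (16GB)

followed by sysctl -p before restarting postgres.)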

 

I don’t have the exact failure message handy, but I can certainly get it if that helps.  Mostly I’m just looking to know if there’s any general reason why it would fail, some inherent kernel or db limitation that I’m unaware of. 

 

If it matters, this DB is going to be hosting and processing hundreds of GB and eventually TB of data.  It's a heavy read-write system, but not transactional processing -- just a lot of data file parsing (python/bash) and bulk loading.  Obviously the disks get hit pretty hard already, so I want to make the most of the large amount of available memory wherever possible, and I'm trying to tune in that direction.

 

Any info is appreciated.

 

Thanks!
