Thread: Setting Shared Buffers , Effective Cache, Sort Mem Parameters
Hi We are in the process of building a new machine for our production database. Below you will see some of the harware specs for the machine. I need some help with setting these parameters (shared buffers, effective cache, sort mem) in the pg_conf file. Also can anyone explain the difference between shared buffers and effective cache , how these are allocated in the main memory (the docs are not clear on this). Here are the Hardware details: Operating System: Red Hat 9 Database Ver: Postgres 7.4 CPU'S : 4 RAM : 4 gig Datafile layout : RAID 1+0 Transaction log : on different RAID1 Array RAID Stripe Size: 8k Thanks! Pallav
On Thu, 22 Apr 2004 13:51:42 -0400, Pallav Kalva <pkalva@deg.cc> wrote: >I need some help with setting these parameters (shared buffers, >effective cache, sort mem) in the pg_conf file. It really depends on the kind of queries you intend to run, the number of concurrent active connections, the size of the working set (active part of the database), what else is running on the machine, and and and ... Setting shared_buffers to 10000, effective_cache_size to 400000 (80% of installed RAM), and sort_mem to a few thousand might be a good start. > Also can anyone explain >the difference between shared buffers and effective cache , how these >are allocated in the main memory (the docs are not clear on this). Shared_buffers directly controls how many pages are allocated as internal cache. Effective_cache_size doesn't allocate anything, it is just a hint to the planner how much cache is available on the system level. Servus Manfred
On Thu, 22 Apr 2004, Pallav Kalva wrote: > Hi > > We are in the process of building a new machine for our production > database. Below you will see some of the harware specs for the machine. > I need some help with setting these parameters (shared buffers, > effective cache, sort mem) in the pg_conf file. Also can anyone explain > the difference between shared buffers and effective cache , how these > are allocated in the main memory (the docs are not clear on this). > > Here are the Hardware details: > Operating System: Red Hat 9 > Database Ver: Postgres 7.4 > CPU'S : 4 > RAM : 4 gig > Datafile layout : RAID 1+0 > Transaction log : on different RAID1 Array > RAID Stripe Size: 8k Read this first: http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html Basically shared buffers are the "play area" for the database backends to toss data in the air and munge it together. The effective cache size reflects the approximate amount of space your operating system is using to buffer Postgresql data. On a dedicated database machine this is about the same as the size of the kernel buffer shown in top. On a mixed machine, you'll have to see how much of what data is getting buffered to get a guesstimate of how much kernel cache is being used for pgsql and how much for other processes. Then divide that number in bytes by 8192, the default block size. On a machine with 1.2 gigs of kernel cache, that'd be about 150,000 blocks. Buffer sizes from 1000 to 10000 blocks are common. Block sizes from 10000 to 50000 can somtimes increase performance, but those sizes only really make sense for machines with lots of ram, and very large datasets being operated on.
On Fri, 23 Apr 2004 10:20:10 -0400, Pallav Kalva <pkalva@deg.cc> wrote: > the database sizes is around 2- 4 gig and >there are 5 of them. this machine is > mainly for the databases and nothing is running on them. Did I understand correctly that you run (or plan to run) five postmasters? Is there a special reason that you cannot put all your tables into one database? > setting shared buffers to 10000 allocates (81Mb) and effective >cache to 400000 would be around (3gig) > does this means that if all of the 81mb of the shared memory gets >allocated it will use rest from the effective > cache of (3g-81mb) ? Simply said, if Postgres wants to access a block, it first looks whether this block is already in shared buffers which should be the case, if the block is one of the last 10000 blocks accessed. Otherwise the block has to be read in. If the OS has the block in its cache, reading it is just a (fast) memory operation, else it involves a (slow) physical disk read. The number of database pages residing in the OS cache is totally out of control of Postgres. Effective_cache_size tells the query planner how many database pages can be *expected* to be present in the OS cache. >increasing the shared buffers space to 2g Setting shared_buffers to half your available memory is the worst thing you can do. You would end up caching exactly the same set of blocks in the internal buffers and in the OS cache, thus effectively making one of the caches useless. Better keep shared_buffers low and let the OS do its job. Servus Manfred
On Fri, 23 Apr 2004, Manfred Koizar wrote: > > Setting shared_buffers to half your available memory is the worst thing > you can do. You would end up caching exactly the same set of blocks in > the internal buffers and in the OS cache, thus effectively making one of > the caches useless. One minor detail... You wouldn't really cache the _exact_ same blocks because cache-hits in shared-buffers (on the most frequently accessed pages) would let the OS cache some other pages in it's cache. But in my experience Manfred's right that there's no benefit and some penalty to making shared_buffers so large it takes a significant piece away from the OS's caching.
Ron Mayer <rm_pg@cheapcomplexdevices.com> writes: > [ on setting shared_buffers = half of RAM ] > One minor detail... You wouldn't really cache the _exact_ same blocks > because cache-hits in shared-buffers (on the most frequently accessed > pages) would let the OS cache some other pages in it's cache. > But in my experience Manfred's right that there's no benefit and > some penalty to making shared_buffers so large it takes a significant > piece away from the OS's caching. True, it'd probably not be the *exact* worst case. But it'd be a good approximation. In practice you should either bet on the kernel doing most of the caching (in which case you set shared_buffers pretty low) or bet on Postgres doing most of the caching (in which case you set shared_buffers to eat most of RAM). The conventional wisdom at this point is to bet the first way; no one has shown performance benefits from setting shared_buffers higher than the low tens of thousands. (Most of the mail list traffic on this predates the existence of pgsql-performance, so check the other list archives too if you go looking for discussion.) It's possible that Jan's recent buffer-management improvements will change the story as of 7.5. I kinda doubt it myself, but it'd be worth re-running any experiments you've done when you start working with 7.5. regards, tom lane
Tom, > It's possible that Jan's recent buffer-management improvements will > change the story as of 7.5. I kinda doubt it myself, but it'd be worth > re-running any experiments you've done when you start working with 7.5. Yes, Jan has indicated to me that he expects to make much heavier use of shared buffers under ARC. But 7.5 still seems to be too unstable for me to test this assertion on a large database. -- Josh Berkus Aglio Database Solutions San Francisco