Thread: Kernel cache vs shared_buffers
We're in the process of upgrading our db server's memory from 2GB to 8GB to improve access performance. This is a dedicated dual Xeon db server not running any significant non-db processes. Our database size on disk is ~11GB, although we expect it to grow to ~20GB. Much of this data is inactive and seldom used or accessed. Read access is our primary concern as our data is normalized and retrieveing a complete product requires many reads to associated tables and indexes to put it all together. Our larger tables have 10-20 rows per disk block, I assume that most blocks will have a mix of frequently accessed and inactive rows. Of course, we wouldn't want to double-cache, so my inclination would be to either give most of the memory to shared_buffers, or to leave that small and let the kernel (Linux 2.6x) do the buffering. I have no idea regarding the inner working of the pg's shared cache, but what I would like to find out is whether it is table-row-based, or disk-block-based. In the case of it being disk-block based, my inclination would be to let the kernel do the buffering. In the case of the cache being table-row-based, I would expect it to be much more space-efficient and I would be inclined to give the memory to the pg. In that case, is it feasible to set shared_buffers to something like 500000 x 8k blocks? We make extensive use of indexes on the larger tables and would seldom, if ever, do sequential scans. Any comments or advice would be great! Michael.
Michael van Rooyen wrote: > I have no idea regarding the inner working of the pg's shared cache, but > what I would like to find out is whether it is table-row-based, or > disk-block-based. It's block based. > In the case of it being disk-block based, my > inclination would be to let the kernel do the buffering. In the case of > the cache being table-row-based, I would expect it to be much more > space-efficient and I would be inclined to give the memory to the pg. > In that case, is it feasible to set shared_buffers to something like > 500000 x 8k blocks? We make extensive use of indexes on the larger > tables and would seldom, if ever, do sequential scans. A common rule of thumb people quote here is to set shared_buffers to 1/4 of available RAM, and leave the rest for OS cache. That's probably a good configuration to start with. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Sat, May 12, 2007 at 03:28:45PM +0100, Heikki Linnakangas wrote: > >In the case of it being disk-block based, my > >inclination would be to let the kernel do the buffering. In the case of > >the cache being table-row-based, I would expect it to be much more > >space-efficient and I would be inclined to give the memory to the pg. > >In that case, is it feasible to set shared_buffers to something like > >500000 x 8k blocks? We make extensive use of indexes on the larger > >tables and would seldom, if ever, do sequential scans. > > A common rule of thumb people quote here is to set shared_buffers to 1/4 > of available RAM, and leave the rest for OS cache. That's probably a > good configuration to start with. If you really care about performance it would be a good idea to start with that and do your own benchmarking. Much of the consensus about shared_buffers was built up before 8.0, and the shared buffer management we have today looks nothing like what was in 7.4. You might find that shared_buffers = 50% of memory or even higher might perform better for your workload. If you do find results like that, please share them. :) -- Jim Nasby jim@nasby.net EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
> A common rule of thumb people quote here is to set shared_buffers to 1/4 > of available RAM, and leave the rest for OS cache. That's probably a > good configuration to start with. > And just for the record: This rule of thumb does NOT apply to PostgreSQL on Windows. My current rule of thumb on Windows: set shared_buffers to minimum * 2 Adjust effective_cache_size to the number given as "system cache" within the task manager. Harald -- GHUM Harald Massa persuadere et programmare Harald Armin Massa Reinsburgstraße 202b 70197 Stuttgart 0173/9409607 fx 01212-5-13695179 - Python: the only language with more web frameworks than keywords.
Harald Armin Massa wrote: >> A common rule of thumb people quote here is to set shared_buffers to 1/4 >> of available RAM, and leave the rest for OS cache. That's probably a >> good configuration to start with. > > And just for the record: This rule of thumb does NOT apply to > PostgreSQL on Windows. My current rule of thumb on Windows: set > shared_buffers to minimum * 2 > Adjust effective_cache_size to the number given as "system cache" > within the task manager. Why? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki, > > PostgreSQL on Windows. My current rule of thumb on Windows: set > > shared_buffers to minimum * 2 > > Adjust effective_cache_size to the number given as "system cache" > > within the task manager. > > Why? I tried with shared_buffers = 50% of available memory, and with 30% of available memory, and the thoughput on complex queries stalled or got worse. I lowered shared_buffers to minimum, and started raising effective_cache_size, and performance on real world queries improved. pg_bench did not fully agree when simulating large numbers concurrent queries. So I tried setting shared_buffers between minimum and 2.5*minimum, and pg_bench speeds recovered and real world queries did similiar. My understanding is that shared_buffers are realised as memory mapped file in win32; and that they are only usually kept in memory. Maybe I understood that wrong. Harald -- GHUM Harald Massa persuadere et programmare Harald Armin Massa Reinsburgstraße 202b 70197 Stuttgart 0173/9409607 fx 01212-5-13695179 - Python: the only language with more web frameworks than keywords.
Harald Armin Massa wrote: > Heikki, > > >> > PostgreSQL on Windows. My current rule of thumb on Windows: set >> > shared_buffers to minimum * 2 >> > Adjust effective_cache_size to the number given as "system cache" >> > within the task manager. >> >> Why? > > I tried with shared_buffers = 50% of available memory, and with 30% of > available memory, and the thoughput on complex queries stalled or got > worse. > > I lowered shared_buffers to minimum, and started raising > effective_cache_size, and performance on real world queries improved. > pg_bench did not fully agree when simulating large numbers concurrent > queries. > > So I tried setting shared_buffers between minimum and 2.5*minimum, and > pg_bench speeds recovered and real world queries did similiar. > > My understanding is that shared_buffers are realised as memory mapped > file in win32; and that they are only usually kept in memory. Maybe I > understood that wrong. Almost. It's a memory mapped region backed by the system pagefile. That said, it would be good to try to figure out *why* this is happening. It's been on my list of things to do to run checks with the profiler (now that we can ;-) with the msvc stuff) and try to figure out where it's slowing down. It could be as simple as that there's much more overhead trying to access the shared memory from different processes than we're used to on other platforms. //Magnus