Thread: postgres optimization
Hi,

I am optimizing a postgres database and need some help with it.
I currently have a server with 2 GB of RAM.

I have set /proc/sys/kernel/shmall and /proc/sys/kernel/shmmax
to 536870912, i.e. 512 MB.
Should I change them to 2 GB? How would that affect performance, would
there be any downsides, or is it better to keep them lower?
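(For reference, a minimal sketch of how these limits can be inspected and
set at runtime on Linux. One caveat, which is an assumption worth checking
on your kernel: shmmax is counted in bytes, but shmall is usually counted
in pages, so writing the same number to both makes shmall far larger than
intended, though harmlessly so.)

    # show current limits
    cat /proc/sys/kernel/shmmax              # max size of one segment, bytes
    cat /proc/sys/kernel/shmall              # total shared memory, in pages

    # set 512 MB at runtime (assumes 4 kB pages; check `getconf PAGE_SIZE`)
    echo 536870912 > /proc/sys/kernel/shmmax
    echo $((536870912 / 4096)) > /proc/sys/kernel/shmall

    # to persist across reboots, in /etc/sysctl.conf:
    #   kernel.shmmax = 536870912
    #   kernel.shmall = 131072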
I then changed shared_buffers in the config file according to this
expression specified in the postgres manual:

max_connections = 128
250 kB + 8.2 kB * shared_buffers + 14.2 kB * max_connections

shared_buffers = 65277
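(Plugging those numbers into the manual's expression as a quick sanity
check -- the constants are approximate and version-dependent, so treat
this as a rough estimate only:)

    echo "250 + 8.2*65277 + 14.2*128" | bc   # => 537339.0 kB, about 525 MB
    echo "536870912 / 1024" | bc             # SHMMAX => 524288 kB, i.e. 512 MB

Taken at face value, that estimate (~525 MB) lands slightly above the
512 MB SHMMAX, so either shared_buffers should come down a little or
SHMMAX should go up.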
What value should I specify for effective_cache_size? I came across this
article, but it is slightly confusing:
"Say there is 1.5GB RAM in your machine, shared buffers are set to 32MB and effective cache size is set to 800MB. So if a query needs 700MB of data set, PostgreSQL would estimate that all the data required should be available in memory and would opt for more aggressive plan in terms of optimization, involving heavier index usage and merge joins. But if effective cache is set to only 200MB, the query planner is liable to opt for the more I/O efficient sequential scan."
Thanks,
Kailash
On Fri, 2005-07-29 at 08:24, Kailash Vyas wrote:
> Hi,
>
> I am optimizing a postgres database and need some help with it.
> I currently have a server with 2 GB of RAM.
>
> I have set /proc/sys/kernel/shmall and /proc/sys/kernel/shmmax
> to 536870912, i.e. 512 MB.
> Should I change them to 2 GB? How would that affect performance, would
> there be any downsides, or is it better to keep them lower?

Generally speaking, you're better off letting your kernel do the
majority of CACHING, and letting postgresql do BUFFERING.

On a linux box, the kernel will use all spare memory to cache disk
accesses automagically. So, assuming postgresql uses up a few hundred
megs, after the machine reaches a steady state, the kernel should be
using the 1.5 gig or so left over for caching.

While the latest versions of postgresql have much improved caching
algorithms for its buffers, it still pretty much dumps the buffers when
the last backend looking at them disconnects. I.e. it's not a
persistent caching system like the kernel cache is.

> I then changed shared_buffers in the config file according to this
> expression specified in the postgres manual:
>
> max_connections = 128
> 250 kB + 8.2 kB * shared_buffers + 14.2 kB * max_connections
>
> shared_buffers = 65277

That's large, but not unreasonable for a machine handling a large
dataset. I assume you do have a decent sized dataset (a gig or more)
to handle.

> What value should I specify for effective_cache_size? I came across
> this article, but it is slightly confusing.

effective_cache_size just tells the query planner about how much memory
the OS is using to cache your dataset.

Bring the machine up, run lots of queries, and check the cache and
buffers with top, and there's your amount. Divide by 8k to get the
setting for effective_cache_size.
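(A quick sketch of that recipe on Linux; `free` reports the same cached
figure as top, and this assumes the thread-era convention that
effective_cache_size is given in 8 kB pages:)

    # after the box has been up and under load for a while:
    free                      # look at the "cached" column, in kB
    # say it shows ~721380 kB cached; convert kB to 8 kB pages:
    echo $((721380 / 8))      # => 90172
    # then in postgresql.conf:
    #   effective_cache_size = 90172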
Hello,

I followed this advice for picking a good effective_cache_size value
(below) from Scott Marlowe, and ran into a bit of trouble:

I looked at the `top' output and saw "721380k cached". So I calculated
the effective cache size using Scott's formula:

721380 / 8 = 90172

Then I changed my effective_cache_size from the previous 10000 to
90172:

effective_cache_size = 90172    # typically 8KB each

I restarted PG and thought I'd see some performance improvements, but I
actually saw a degradation in performance. All of a sudden a query that
took a second started taking a few seconds, and consumed more CPU than
before.

Can anyone explain this drop in performance and increase in CPU usage,
and tell me what I did wrong?

I also noticed that the "721380k cached" number in top dropped to about
300000k (about half). Maybe that was simply due to the PG restart? If
so, does that indicate the kernel had about 400,000k worth of PG data
cached?

Thanks,
Otis

--- Scott Marlowe <smarlowe@g2switchworks.com> wrote:
> effective_cache_size just tells the query planner about how much
> memory the OS is using to cache your dataset.
>
> Bring the machine up, run lots of queries, and check the cache and
> buffers with top, and there's your amount. Divide by 8k to get the
> setting for effective_cache_size.
On Fri, 2005-08-12 at 17:48, ogjunk-pgjedan@yahoo.com wrote:
> Hello,
>
> I followed this advice for picking a good effective_cache_size value
> (below) from Scott Marlowe, and ran into a bit of trouble:
>
> I looked at the `top' output and saw "721380k cached". So I
> calculated the effective cache size using Scott's formula:
>
> 721380 / 8 = 90172
>
> Then I changed my effective_cache_size from the previous 10000 to
> 90172:
>
> effective_cache_size = 90172    # typically 8KB each
>
> I restarted PG and thought I'd see some performance improvements, but
> I actually saw a degradation in performance. All of a sudden a query
> that took a second started taking a few seconds, and consumed more
> CPU than before.

There are a lot of reasons why this may be so, but the most likely is
that when restarting postgresql, the kernel cache got old and was
dumped out, so that it wasn't holding all the data. Basically, a larger
effective_cache_size favors indexes over a seq scan, since the data are
more likely to be in memory when you ask for them.

> Can anyone explain this drop in performance and increase in CPU
> usage, and tell me what I did wrong?
>
> I also noticed that the "721380k cached" number in top dropped to
> about 300000k (about half). Maybe that was simply due to the PG
> restart? If so, does that indicate the kernel had about 400,000k
> worth of PG data cached?

That's pretty normal, over time, with postgresql being idle. Don't
worry about how the query runs the first time you run it so much as
how it runs consistently over time. If the numbers come back to where
they were, then you've got the right setting.

Like I said, effective_cache_size is kind of a lead hammer, used to
nudge the planner one way or another, but it doesn't need to be exact
to be useful.

Take a look at the plan being chosen; it is likely to be doing random
access (i.e. using an index) right now rather than a seq scan. The seq
scan would win on a machine with nothing in the kernel cache, but the
index scan will usually win if everything is already cached. Again,
these are broad strokes of the brush. It all depends on the actual % of
a table returned and a few other things...
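(To see which plan is actually being chosen, EXPLAIN ANALYZE is the
tool; the database, table, and query below are hypothetical stand-ins
for your own slow query:)

    # show the chosen plan along with real execution times
    psql -d mydb -c "EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;"

    # per-session planner toggles make the seq-scan vs. index-scan
    # tradeoff visible for comparison:
    psql -d mydb -c "SET enable_seqscan = off; EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;"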