Thread: Optimization
Greetings,
I am trying to understand the various factors used by Postgres to optimize. I presently have a dual-866 Dell server with 1GB of memory. I've done the following:
set /proc/sys/kernel/shmmax to 512000000
shared_buffers = 32000
sort_mem = 32000
max_connections=64
fsync=false
sort_mem = 32000
max_connections=64
fsync=false
Can someone tell me what effective_cache_size should be set to? what kind of formula to use for this? (I got the other figures from phpbuilder.com, and modified for 512k memory).
The databases I'm using have about 200,000+ news headlines with full-text indexes (which range upwards of a few million records). They are updated about every 5 to 10 minutes, which means I also have to run a vacuum about once every 2 to 3 hours at least. As I get more updates obviously the efficiency goes down. I'm trying to make the most of this system but don't fully understand PG's optimization stuff.
Thanks in advance,
Justin Long
Justin- It sounds like you're on a system similar to ours, so I'll pass along the changes that I made, which seem to have increased performance, and most importantly, haven't hurt anything. The main difference in our environment is that we are less Update/Insert intensive than you are- in our application, 90% of our information (court cases) is static (Closed) and 10% are frequently being updated (Pending/Active). This means I only vacuum once a week. I haven't had chance to churn out objective tests yet, but my subjective judgment is that this set of params works well: Set SHMMAX and SHMALL in the kernel to 134217728 (128MB) Set shared_buffers to 8192 (64MB) Set sort_mem to 16384 (16MB) Set effective_cache_size to 65536 (1/2 GB) The Hardware is a dual-processor Athlon 1.2 Ghz box with 1 GB of RAM and the DB on SCSI RAID drives. The database size is about 8GB, with the largest table 2.5 GB, and the two most commonly queried tables at 1 GB each. The OS is Debian Linux kernel 2.4.x (recompiled custom kernel for dual processor support) The PostgreSQL version is 7.3.2 My reasoning was to increase shared_buffers based on anecdotal recommendations I've seen on this list to 64MB and boost the OS SHMMAX to twice that value to allow adequate room for other shared memory needs, thus reserving 128MB off the top. Of the remaining memory, 256MB goes to 16MB sort space times a guesstimate of 16 simultaneous sorts at any given time. If I leave about 128 MB for headroom, then 1/2 GB should be left available for the effective cache size. I've never been tempted to turn fsync off. That seems like a risky move. Regards, -Nick --------------------------------------------------------------------- Nick Fankhauser nickf@doxpop.com Phone 1.765.965.7363 Fax 1.765.962.9788 doxpop - Court records at your fingertips - http://www.doxpop.com/
Justin- It sounds like you're on a system similar to ours, so I'll pass along the changes that I made, which seem to have increased performance, and most importantly, haven't hurt anything. The main difference in our environment is that we are less Update/Insert intensive than you are- in our application, 90% of our information (court cases) is static (Closed) and 10% are frequently being updated (Pending/Active). This means I only vacuum once a week. I haven't had chance to churn out objective tests yet, but my subjective judgment is that this set of params works well: Set SHMMAX and SHMALL in the kernel to 134217728 (128MB) Set shared_buffers to 8192 (64MB) Set sort_mem to 16384 (16MB) Set effective_cache_size to 65536 (1/2 GB) The Hardware is a dual-processor Athlon 1.2 Ghz box with 1 GB of RAM and the DB on SCSI RAID drives. The database size is about 8GB, with the largest table 2.5 GB, and the two most commonly queried tables at 1 GB each. The OS is Debian Linux kernel 2.4.x (recompiled custom kernel for dual processor support) The PostgreSQL version is 7.3.2 My reasoning was to increase shared_buffers based on anecdotal recommendations I've seen on this list to 64MB and boost the OS SHMMAX to twice that value to allow adequate room for other shared memory needs, thus reserving 128MB. Of the remaining memory, 256MB goes to 16MB sort space times a guesstimate of 16 simultaneous sorts at any given time. If I leave about 128 MB for headroom, then 1/2 GB should be left available for the effective cache size. I've never been tempted to turn fsync off. That seems like a risky move. Regards, -Nick --------------------------------------------------------------------- Nick Fankhauser nickf@doxpop.com Phone 1.765.965.7363 Fax 1.765.962.9788 doxpop - Court records at your fingertips - http://www.doxpop.com/
>Can someone tell me what effective_cache_size should be set to? You may be able to intuit this from my last post, but if I understand correctly, what you should be doing is estimating how much memory is likely to be "left over" for the OS to do disk caching with after all of the basic needs of the OS, PostgreSQL & any other applications are taken care of. You then tell postgresql what to expect in terms of caching resources by putting this number into effective_cache_size, and this allows the query planner come up with a strategy that is optimized for the expected cache size. So the "formula" would be: Figure out how much memory is normally in use allowing adequate margins, subtract this from your total RAM, and make the remainder your effective_cache size. -Nick
Justin, > I am trying to understand the various factors used by Postgres to optimize. I presently have a dual-866 Dell server with 1GB of memory. I've done the following: Please set the performance articles at: http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php -- -Josh Berkus Aglio Database Solutions San Francisco
Justin, > I am trying to understand the various factors used by Postgres to optimize. I presently have a dual-866 Dell server with 1GB of memory. I've done the following: see: http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php which has articles on .conf files. (feel free to link these articles at PHPbuilder.com and elsewhere!) > The databases I'm using have about 200,000+ news headlines with full-text indexes (which range upwards of a few million records). They are updated about every 5 to 10 minutes, which means I also have to run a vacuum about once every 2 to 3 hours at least. As I get more updates obviously the efficiency goes down. I'm trying to make the most of this system but don't fully understand PG's optimization stuff. Unless you're running PostgreSQL 7.1 or earlier, you should be VACUUMing every 10-15 minutes, not every 2-3 hours. Regular VACUUM does not lock your database. You will also want to increase your FSM_relations so that VACUUM is more effective/efficient; again, see the articles. -- -Josh Berkus Aglio Database Solutions San Francisco
On 28 Jul 2003 at 12:27, Josh Berkus wrote: > Unless you're running PostgreSQL 7.1 or earlier, you should be VACUUMing every > 10-15 minutes, not every 2-3 hours. Regular VACUUM does not lock your > database. You will also want to increase your FSM_relations so that VACUUM > is more effective/efficient; again, see the articles. There is an auto-vacuum daemon in contrib and if I understand it correctly, it is not getting much of a field testing. How about you guys installing it and trying it? Bye Shridhar -- O'Reilly's Law of the Kitchen: Cleanliness is next to impossible
On Tue, 29 Jul 2003, Shridhar Daithankar wrote: > On 28 Jul 2003 at 12:27, Josh Berkus wrote: > > Unless you're running PostgreSQL 7.1 or earlier, you should be VACUUMing every > > 10-15 minutes, not every 2-3 hours. Regular VACUUM does not lock your > > database. You will also want to increase your FSM_relations so that VACUUM > > is more effective/efficient; again, see the articles. > > There is an auto-vacuum daemon in contrib and if I understand it correctly, it > is not getting much of a field testing. How about you guys installing it and > trying it? If there is such a daemon, what is it called? As I can't see it. Is it part of gborg? Peter Childs
On 29 Jul 2003 at 8:14, Peter Childs wrote: > On Tue, 29 Jul 2003, Shridhar Daithankar wrote: > > > On 28 Jul 2003 at 12:27, Josh Berkus wrote: > > > Unless you're running PostgreSQL 7.1 or earlier, you should be VACUUMing every > > > 10-15 minutes, not every 2-3 hours. Regular VACUUM does not lock your > > > database. You will also want to increase your FSM_relations so that VACUUM > > > is more effective/efficient; again, see the articles. > > > > There is an auto-vacuum daemon in contrib and if I understand it correctly, it > > is not getting much of a field testing. How about you guys installing it and > > trying it? > > If there is such a daemon, what is it called? As I can't see it. > Is it part of gborg? It is in sources. See contrib module in postgresql CVS, 7.4 beta if you prefer to wait till announement. It is called as pgavd.. Bye Shridhar -- squatcho, n.: The button at the top of a baseball cap. -- "Sniglets", Rich Hall & Friends