Thread: Some help on buffers and other performance tricks
Hi all,

I've got PG 8.0 on Debian sarge set up and I want to speed up performance on the system. The system will run PG, an Apache front-end on port 80, and Tomcat / Cocoon for the webapp. The webapp is not heavily used, so we can give maximum performance to the database.

The database has a lot of work to do; we upload files every day. The current server has 8 databases of around 1 million records each, and this will grow in the future. There's only one main table, with some smaller tables; 95% of the records are in that one table. A lot of updates are done on that table, affecting 10-20% of the records.

The system has 1 GB of RAM, of which I could give 512 MB to PG. The filesystem is ext2, mounted with the noatime option in fstab.

Could I get some suggestions on how to configure my buffers, WAL, ... ?

Met vriendelijke groeten, Bien à vous, Kind regards,

Yves Vindevogel
Implements

Mail: yves.vindevogel@implements.be - Mobile: +32 (478) 80 82 91
Kempische Steenweg 206 - 3500 Hasselt - Tel-Fax: +32 (11) 43 55 76
Web: http://www.implements.be

First they ignore you. Then they laugh at you. Then they fight you. Then you win. -- Mahatma Gandhi
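For concreteness, this is the kind of postgresql.conf starting point the question asks about. It is a minimal sketch for a dedicated 1 GB machine on PG 8.0; every value is an illustrative assumption to be tested against the real workload, not a recommendation made anywhere in the thread. Note that 8.0 takes shared_buffers, effective_cache_size, and wal_buffers in 8 kB pages and work_mem/maintenance_work_mem in kB; unit suffixes like "128MB" only arrived in 8.2.

    # postgresql.conf -- illustrative values only
    shared_buffers = 15000          # ~120 MB in 8 kB pages; 8.0-era practice
                                    # keeps this modest and lets the OS page
                                    # cache do most of the caching
    effective_cache_size = 65536    # ~512 MB: the planner's estimate of how
                                    # much data the OS page cache will hold
    work_mem = 8192                 # 8 MB per sort/hash operation (kB)
    maintenance_work_mem = 65536    # 64 MB for VACUUM and CREATE INDEX (kB)
    checkpoint_segments = 16        # more WAL between checkpoints helps the
                                    # daily bulk uploads
    wal_buffers = 64                # 512 kB of WAL buffers (8 kB pages)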
0= Optimize your schema to be as tight as possible. Your goal is to give yourself the maximum chance that everything you want to work on is in RAM when you need it.
1= Upgrade your RAM to as much as you can possibly strain to afford. 4GB at least. It's that important.
2= If the _entire_ DB does not fit in RAM after upgrading your RAM, the next step is making sure your HD IO subsystem is adequate to your needs.
3= Read the various pg tuning docs that are available and Do The Right Thing.
4= If performance is still not acceptable, then it's time to drill down into what specific actions/queries are problems.

If you get to here and the entire DBMS is still not close to acceptable, your fundamental assumptions have to be re-examined.

Ron

-----Original Message-----
From: Yves Vindevogel <yves.vindevogel@implements.be>
Sent: Nov 9, 2005 3:11 PM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Some help on buffers and other performance tricks
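Step 2 of Ron's list ("does the entire DB fit in RAM?") can be answered from SQL rather than guessed at. A sketch: pg_database_size(), pg_relation_size(), and pg_size_pretty() are built in from 8.1 onward; on the 8.0 system in question, equivalent functions ship in contrib/dbsize.

    -- Total size of the current database:
    SELECT pg_size_pretty(pg_database_size(current_database()));

    -- The ten largest tables and indexes, i.e. what competes for cache:
    SELECT relname, pg_size_pretty(pg_relation_size(oid)) AS size
    FROM pg_class
    WHERE relkind IN ('r', 'i')
    ORDER BY pg_relation_size(oid) DESC
    LIMIT 10;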
Ron Peacetree wrote:

> 0= Optimize your schema to be as tight as possible. Your goal is to give yourself the maximum chance that everything you want to work on is in RAM when you need it.
> 1= Upgrade your RAM to as much as you can possibly strain to afford. 4GB at least. It's that important.
> 2= If the _entire_ DB does not fit in RAM after upgrading your RAM, the next step is making sure your HD IO subsystem is adequate to your needs.
> 3= Read the various pg tuning docs that are available and Do The Right Thing.
> 4= If performance is still not acceptable, then it's time to drill down into what specific actions/queries are problems.
> If you get to here and the entire DBMS is still not close to acceptable, your fundamental assumptions have to be re-examined.

IMHO you should really be examining your queries _before_ you make any investment in hardware, because that hardware may later prove unnecessary.

--
Alvaro Herrera                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
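Examining a query before buying hardware mostly means reading its EXPLAIN ANALYZE output. A hypothetical example; the table and column names are invented, since the original poster never showed his schema:

    -- Run the suspect query under EXPLAIN ANALYZE:
    EXPLAIN ANALYZE
    SELECT * FROM documents WHERE documentname = 'upload-2005-11-09.txt';

    -- A plan along the lines of
    --   Seq Scan on documents  (actual time=812.4..812.4 rows=1 loops=1)
    -- on a selective WHERE clause is the classic sign of a missing index:
    -- the whole table is read to find one row.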
Frank Wiles wrote:

> Obviously there are systems/datasets/quantities where this won't
> always work out best, but for the majority of systems out there
> complicating your schema, maxing your hardware, and THEN tuning
> is IMHO the wrong approach.

I wasn't suggesting complicating the schema -- I was actually thinking of systems where some queries are not using indexes, some queries are plain wrong, etc. Buying a very expensive RAID and then noticing that you just needed to create an index is going to make somebody feel at least somewhat stupid.

--
Alvaro Herrera       Valdivia, Chile   ICBM: S 39º 49' 17.7", W 73º 14' 26.8"
"And a voice from the chaos spoke to me and said 'Smile and be happy,
it could be worse.' And I smiled. And I was happy. And it was worse."
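The fix Alvaro alludes to can be a one-liner. Continuing the hypothetical example above (same invented names):

    -- Create the missing index and refresh planner statistics:
    CREATE INDEX idx_documents_documentname ON documents (documentname);
    ANALYZE documents;

    -- Re-running the EXPLAIN ANALYZE should now show an Index Scan,
    -- typically orders of magnitude cheaper than scanning the whole table.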
On Wed, 9 Nov 2005 21:43:33 -0300
Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

> Frank Wiles wrote:
>
> > Obviously there are systems/datasets/quantities where this won't
> > always work out best, but for the majority of systems out there
> > complicating your schema, maxing your hardware, and THEN tuning
> > is IMHO the wrong approach.
>
> I wasn't suggesting complicating the schema -- I was actually
> thinking of systems where some queries are not using indexes, some
> queries are plain wrong, etc. Buying a very expensive RAID and then
> noticing that you just needed to create an index is going to make
> somebody feel at least somewhat stupid.

Sorry, I was referring to Ron's statement that the first step should be to "Optimize your schema to be as tight as possible."

But I agree, finding out you need an index after spending $$$ on extra hardware would be bad. Especially if you have to explain it to the person forking over the $$$! :)

---------------------------------
  Frank Wiles <frank@wiles.org>
  http://www.wiles.org
---------------------------------
On Wed, 9 Nov 2005 20:07:52 -0300
Alvaro Herrera <alvherre@commandprompt.com> wrote:

> IMHO you should really be examining your queries _before_ you make
> any investment in hardware, because that hardware may later prove
> unnecessary.

It all really depends on what you're doing. For some of the systems I run, 4 GB of RAM is *WAY* overkill, RAID 1+0 is overkill, etc.

In general I would slightly change the "order of operations" from:

1) Buy tons of RAM
2) Buy lots of disk I/O
3) Tune your conf
4) Examine your queries

to:

1) Tune your conf
2) Spend a few minutes examining your queries
3) Buy as much RAM as you can afford
4) Buy as much disk I/O as you can
5) Do in-depth tuning of your queries/conf

Personally I avoid planning my schema around performance at the start. I just try to represent the data in a sensible, normalized way. While I'm sure I subconsciously make decisions based on performance considerations early on, I don't do any major schema overhauls until I find I can't get the performance I need via tuning.

Obviously there are systems/datasets/quantities where this won't always work out best, but for the majority of systems out there complicating your schema, maxing your hardware, and THEN tuning is IMHO the wrong approach.

---------------------------------
  Frank Wiles <frank@wiles.org>
  http://www.wiles.org
---------------------------------
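Frank's step 2, "spend a few minutes examining your queries", is cheapest when the server finds the slow ones for you. One setting, available since 7.4 and so present on the 8.0 system in question, does exactly that; the 200 ms threshold here is an arbitrary assumption:

    # postgresql.conf -- log every statement slower than 200 ms.
    # 0 logs all statements, -1 disables the feature.
    log_min_duration_statement = 200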
The point, Gentlemen, was that Good Architecture is King. That's what I was trying to emphasize by calling proper DB architecture step 0. All other things being equal (and they usually aren't; this sort of stuff is _very_ context dependent), the more of your critical schema that you can fit into RAM during normal operation, the better.

...and it all starts with proper DB design. Otherwise, you are quite right in stating that you risk wasting time, effort, and HW.

Ron

-----Original Message-----
From: Frank Wiles <frank@wiles.org>
Sent: Nov 9, 2005 6:53 PM
To: Alvaro Herrera <alvherre@commandprompt.com>
Cc: rjpeace@earthlink.net, yves.vindevogel@implements.be, pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Some help on buffers and other performance tricks

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
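The list footer's TIP 5 is unusually relevant to this thread: with 10-20% of roughly 8 million rows updated regularly, routine VACUUM has a lot of dead space to track, and the 8.0 default of max_fsm_pages = 20000 is easily too small. The figures below are illustrative guesses, not calculated values (these settings were later removed in 8.4, when the free space map moved to disk):

    # postgresql.conf -- free space map sizing (PG 8.0)
    max_fsm_pages = 200000      # pages with reusable space VACUUM may track
    max_fsm_relations = 1000    # distinct tables/indexes tracked

    # A database-wide VACUUM VERBOSE reports at the end whether these limits
    # suffice; raise max_fsm_pages if "pages needed" exceeds "pages stored".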
On Wed, 2005-11-09 at 22:20, Ron Peacetree wrote:

> The point, Gentlemen, was that Good Architecture is King. That's what
> I was trying to emphasize by calling proper DB architecture step 0.
> All other things being equal (and they usually aren't; this sort of
> stuff is _very_ context dependent), the more of your critical schema
> that you can fit into RAM during normal operation, the better.
>
> ...and it all starts with proper DB design. Otherwise, you are quite
> right in stating that you risk wasting time, effort, and HW.

Very valid point. It's the reason, in my last job, we had a mainline server with dual 2800 MHz CPUs and a big RAID array.

And our development, build, and test system was a dual Pentium Pro 200 with 256 MB of RAM. You notice slow queries real fast on such a box.
On Thu, 10 Nov 2005 09:16:10 -0600
Scott Marlowe <smarlowe@g2switchworks.com> wrote:

> Very valid point. It's the reason, in my last job, we had a mainline
> server with dual 2800 MHz CPUs and a big RAID array.
>
> And our development, build, and test system was a dual Pentium Pro
> 200 with 256 MB of RAM. You notice slow queries real fast on such a
> box.

I know several people who use this development method. It can sometimes lead to premature optimizations, but overall I think it is a great way to work.

---------------------------------
  Frank Wiles <frank@wiles.org>
  http://www.wiles.org
---------------------------------
> The point, Gentlemen, was that Good Architecture is King. That's what
> I was trying to emphasize by calling proper DB architecture step 0.
> All other things being equal (and they usually aren't; this sort of
> stuff is _very_ context dependent), the more of your critical schema
> that you can fit into RAM during normal operation, the better.
>
> ...and it all starts with proper DB design. Otherwise, you are quite
> right in stating that you risk wasting time, effort, and HW.
>
> Ron

+1!

I answer lots of questions on this list that are of the form "query x is running too slow". Often, the first thing that pops into my mind is "why are you running query x in the first place?"

The #1 indicator that something is not right is the 'distinct' clause. Distinct (and its evil cousin, union) are often brought in to paper over problems, typically duplicate rows produced by a join that is wrong or unnecessary.

The human brain is the best optimizer. Even on old hardware the server can handle a *lot* of data. It's just a question of where we add inefficiency: lousy database designs lead to lousy queries or (even worse) extra application code.

Merlin
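A sketch of the pattern Merlin describes, with invented table names: DISTINCT brought in to hide row multiplication from a join, where a semi-join was what was actually meant.

    -- The band-aid: the join duplicates customers with many orders,
    -- and DISTINCT mops up afterwards (sort + de-duplicate):
    SELECT DISTINCT c.id, c.name
    FROM customers c
    JOIN orders o ON o.customer_id = c.id;

    -- What was meant: "customers with at least one order", a semi-join
    -- that never produces duplicates in the first place:
    SELECT c.id, c.name
    FROM customers c
    WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id);

    -- Similarly, UNION implies sort + de-duplicate; when the branches are
    -- known to be disjoint, UNION ALL skips that work entirely.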
On Thu, 2005-11-10 at 09:25, Frank Wiles wrote:

> On Thu, 10 Nov 2005 09:16:10 -0600
> Scott Marlowe <smarlowe@g2switchworks.com> wrote:
>
> > Very valid point. It's the reason, in my last job, we had a mainline
> > server with dual 2800 MHz CPUs and a big RAID array.
> >
> > And our development, build, and test system was a dual Pentium Pro
> > 200 with 256 MB of RAM. You notice slow queries real fast on such
> > a box.
>
> I know several people who use this development method. It can
> sometimes lead to premature optimizations, but overall I think it is
> a great way to work.

Hehe. Yeah, you get used to things running a bit slower pretty quickly. Keep in mind, though, that the test box is likely only supporting one single application at a time, whereas the production server may be running dozens or even hundreds of apps, so it's important to catch performance issues before they get to production.

Plus, the dual PPro 200 WAS running a decent RAID array, even if it was Linux kernel software RAID and not hardware. It was on eight 9 GB SCSI drives, so it was quite fast for reads.

In actuality, a lot of the folks developed their code on their own workstations (generally 1+ GHz machines with 1 GB or more of RAM) and then had to move it over to the PPro 200 for testing and acceptance. So that kind of helps stop the premature optimizations. We were mainly looking to catch stupidity before it got to production.