Re: 121+ million record table perf problems - Mailing list pgsql-performance
From: Greg Smith
Subject: Re: 121+ million record table perf problems
Date:
Msg-id: Pine.GSO.4.64.0705182347110.29971@westnet.com
In response to: 121+ million record table perf problems (cyber-postgres@midnightfantasy.com)
List: pgsql-performance
On Fri, 18 May 2007, cyber-postgres@midnightfantasy.com wrote:

> shared_buffers = 24MB
> work_mem = 256MB
> maintenance_work_mem = 512MB

You should take a minute to follow the suggestions at http://www.westnet.com/~gsmith/content/postgresql/pg-5minute.htm and set dramatically higher values for shared_buffers and effective_cache_size for your server. Also, your work_mem figure may be OK for now, but if you ever do have 10 people connect to this database at once and run big queries, you could have an issue with it set that high--that's a per-client setting.

After you're done with that, you should also follow the suggestions there to do a VACUUM ANALYZE. That may knock out two other potential issues at once. It will take a while to run, but I think you need it badly to sort out what you've already done.

> random_page_cost = 100

I'm not sure what logic prompted this change, but after you correct the above you should return this to its default; if this is helping now, it's only because other things are so far off from where they should be.

> update res set state=5001;
> The update query that started this all I had to kill after 17 hours. It
> should have updated all 121+ million records. That brought my select count
> down to 19 minutes, but still a far cry from acceptable.

You should work on the select side of this first. If that isn't running in a moderate amount of time, trying to get the much more difficult update to happen quickly is hopeless.

Once the select is under control, there are a lot of parameters to adjust that will affect the speed of the updates. The first thing to do is dramatically increase checkpoint_segments; I would set that to at least 30 in your situation.

Also: going to RAID-1 won't make a bit of difference to your update speed; it could even make it worse. Adding more RAM may not help much either. If you don't have one already, the real key to improving performance in a heavy update situation is to get a better disk controller with a cache that helps accelerate writes. The next step after that is to stripe this data across multiple disks in a RAID-0 configuration to split up the I/O.

You have a lot of work ahead of you. Even after you resolve the gross issues here, you have a table with around 10 indexes on it. Maintaining those is far from free; every time you update a single record in that table, the system has to update each of those indexes on top of the record update itself. So you're really asking your system to do around 1.2 billion disk-related operations when you throw out your simple batch update against every row, and good luck getting that to run in a time frame that's less than days long.

The right way to get a feel for what's going on is to drop all the indexes except for the constraints and see how the bulk update runs after the parameter changes suggested above are in place and the database has been cleaned up with vacuum+analyze. Once you have a feel for that, add some indexes back in and see how the update degrades. Then you'll know how adding each one of them impacts your performance. I suspect you're going to have to redesign your indexing scheme before this is over. I don't think your current design is ever going to work the way you expect it to.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
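
For reference, a minimal sketch of what the postgresql.conf changes discussed above might look like. The thread does not say how much RAM the server has, so the sizes below assume roughly 4GB and are starting points only; checkpoint_segments and returning random_page_cost to its default come straight from the advice above, while the rest are common rules of thumb.

    # Sketch of postgresql.conf settings in the spirit of the advice above.
    # Assumes roughly 4GB of RAM, which the thread does not actually state.
    shared_buffers = 1GB          # dramatically higher than the reported 24MB; ~25% of RAM is a common starting point
    effective_cache_size = 3GB    # rough estimate of memory available for caching (OS cache + shared_buffers)
    work_mem = 32MB               # per-client (per sort/hash) setting; 256MB is risky once several big queries run at once
    maintenance_work_mem = 512MB  # left as the original poster had it
    random_page_cost = 4          # back to the default, as suggested
    checkpoint_segments = 30      # "at least 30" for this heavy-update workload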
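
The cleanup step itself is a single command. Run database-wide it touches everything; limiting it to the big table from the thread (res) keeps the long run focused on the problem at hand.

    -- Reclaim dead row versions and refresh planner statistics database-wide.
    VACUUM ANALYZE;

    -- Or restrict the (long) run to the table from the thread.
    VACUUM ANALYZE res;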
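
To work on the select side first, the usual starting point is EXPLAIN ANALYZE. The exact query isn't quoted in this reply, so a plain count over the table is assumed here as a stand-in for the 19-minute "select count" mentioned above.

    -- Show the real plan and where the time goes; the exact query text is not
    -- in this reply, so a plain count over the table is assumed.
    EXPLAIN ANALYZE SELECT count(*) FROM res;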
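
The index experiment in the last paragraph might look like the sketch below. The table's actual index definitions aren't listed anywhere in this reply, so the index name and column used here are placeholders; the 1.2 billion figure above is simply 121 million rows times roughly ten index updates plus the heap update for each row.

    -- Placeholder name: the real index definitions are not given in the thread.
    -- Drop the plain indexes, keeping primary key / unique constraints in place.
    DROP INDEX res_state_idx;
    -- ...repeat for the other non-constraint indexes...

    -- Re-run the bulk update with only the constraint indexes being maintained.
    UPDATE res SET state = 5001;

    -- Add the indexes back one at a time, re-testing the update after each,
    -- to see how much each one costs.
    CREATE INDEX res_state_idx ON res (state);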