Thread: Serious issues with CPU usage
Hi,

I'm having _serious_ issues with postgres hogging the CPU over time. A graph
showing this can be seen at http://andri.estpak.ee/cpu0.png .

The database is running on Redhat 9 (stock 2.4.20-8 kernel), on a reiserfs
partition (~8% usage - no problem there). The problem has occurred with both
PostgreSQL 7.3.2 (the package provided by Redhat as well as a self-rebuilt
package) and 7.3.4 (I used the 7.3.4 SRPM available at the postgres ftp site).

A VACUUM FULL is a remedy to this problem, but a simple VACUUM isn't.

This can be reproduced, I think, by a simple UPDATE command:

database=# EXPLAIN ANALYZE UPDATE table SET random_int_field = 1, last_updated
= NOW() WHERE primary_key = 3772;
                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Index Scan using table_pkey on table  (cost=0.00..6.81 rows=1 width=83) (actual time=0.09..0.10 rows=1 loops=1)
   Index Cond: (primary_key = 3772)
 Total runtime: 0.37 msec

When I repeat this command with a simple <up><enter>, I can see the "Total
runtime" grow ever so slightly - creeping from 0.37 to 0.38, then 0.39 etc.
It would probably get higher if I had the patience. :)

The table "table" used in this example has 2721 rows, so size isn't an issue
here.

Any comments or suggestions are welcome. If more information is needed, let me
know and I'll post the needed details.
<andris@neti.ee> writes:
> I'm having _serious_ issues with postgres hogging the CPU over time. A graph
> showing this can be seen at http://andri.estpak.ee/cpu0.png .

You really haven't shown us anything that would explain that graph ...
repeated UPDATEs will slow down a little until you vacuum, but not by the
ratio you seem to be indicating.  At least not if they're indexscans.

If you've also got sequential-scan queries, and you're doing many zillion
updates between vacuums, the answer is to vacuum more often.  A decent rule
of thumb is to vacuum whenever you've updated more than about 10% of the
rows in a table since your last vacuum.

> A VACUUM FULL is a remedy to this problem, but a simple VACUUM isn't.

I find that odd; maybe there's something else going on here.  But you've
not given enough details to speculate.

			regards, tom lane
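[A rough sketch of how one might watch for that 10% threshold from within
psql. This assumes the stats collector is running with stats_row_level
enabled, and "mytable" stands in for the anonymized table name; note the
counters are cumulative since the last stats reset, not since the last
vacuum, so compare successive readings.]

-- Approximate the update churn against the live row count:
SELECT s.relname,
       s.n_tup_upd + s.n_tup_del AS rows_churned,
       c.reltuples               AS live_rows_at_last_analyze
  FROM pg_stat_user_tables s
  JOIN pg_class c ON c.oid = s.relid
 WHERE s.relname = 'mytable';

-- Once churn exceeds roughly 10% of reltuples, a plain (non-FULL) VACUUM of
-- just that table reclaims the dead tuples without locking out queries:
VACUUM ANALYZE mytable;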
andris@neti.ee wrote on Sat, 06.09.2003 at 00:58:
> Hi,
>
> I'm having _serious_ issues with postgres hogging the CPU over time. A graph
> showing this can be seen at http://andri.estpak.ee/cpu0.png .
>
> The database is running on Redhat 9 (stock 2.4.20-8 kernel), on a reiserfs
> partition (~8% usage - no problem there). The problem has occurred with both
> PostgreSQL 7.3.2 (the package provided by Redhat as well as a self-rebuilt
> package) and 7.3.4 (I used the 7.3.4 SRPM available at the postgres ftp site).
>
> A VACUUM FULL is a remedy to this problem, but a simple VACUUM isn't.

Could it be that the FSM is too small for your vacuum interval?

Also, you could try running REINDEX (instead of, or in addition to, plain
VACUUM) and see if this is an index issue.

> This can be reproduced, I think, by a simple UPDATE command:
>
> database=# EXPLAIN ANALYZE UPDATE table SET random_int_field = 1, last_updated
> = NOW() WHERE primary_key = 3772;
>                                QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------------------------
>  Index Scan using table_pkey on table  (cost=0.00..6.81 rows=1 width=83) (actual time=0.09..0.10 rows=1 loops=1)
>    Index Cond: (primary_key = 3772)
>  Total runtime: 0.37 msec
>
> When I repeat this command with a simple <up><enter>, I can see the "Total
> runtime" grow ever so slightly - creeping from 0.37 to 0.38, then 0.39 etc.
> It would probably get higher if I had the patience. :)
>
> The table "table" used in this example has 2721 rows, so size isn't an issue
> here.

Due to MVCC the raw table size (the file size) can be much bigger if you
don't VACUUM often enough.

> Any comments or suggestions are welcome. If more information is needed, let me
> know and I'll post the needed details.

1. What types of queries do you run, and how often?
2. How is your database tuned (postgresql.conf settings)?
3. How much memory does your machine have?

BTW, are you sure it is postgres that is using up the CPU? I've read that
reiserfs is a CPU hog, so this may be something that does intensive disk
access; some IO stats would be useful, as well as real data and index file
sizes.

You could also set up logging and then check if there are some pathological
queries that run for several hours doing nested seqscans ;)

-----------------------
Hannu
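[A quick way to get the data and index sizes Hannu asks about without leaving
psql - a sketch only: relpages is counted in 8 kB blocks and is refreshed only
by VACUUM/ANALYZE, and "mytable" again stands in for the real table name.]

-- Physical size of the table and its primary-key index, in 8 kB pages:
SELECT relname, relkind, relpages, reltuples
  FROM pg_class
 WHERE relname IN ('mytable', 'mytable_pkey');

-- If the index itself has bloated, rebuild all indexes on the table:
REINDEX TABLE mytable;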
I hope you don't find it too distracting - I decided to answer both emails in
one go.

----

On Saturday 06 September 2003 03:05, Tom Lane wrote:
> indexscans.  If you've also got sequential-scan queries, and you're doing
> many zillion updates between vacuums, the answer is to vacuum more often.
> A decent rule of thumb is to vacuum whenever you've updated more than about
> 10% of the rows in a table since your last vacuum.

Basically I do this:
1) select the ~700 IDs I have to poll
2) poll them
3) update those 700 rows in that "table" I mentioned (~2700 rows total).

And I do this cycle once per minute, so yes, I've got a zillion updates. 700
of 2700 is roughly 25%, so I'd have to vacuum once per minute? The manual
does suggest vacuuming after big changes, but I didn't think it was that bad.

-----

On Saturday 06 September 2003 12:10, Hannu Krosing wrote:
> Could it be that the FSM is too small for your vacuum interval?
>
> Also, you could try running REINDEX (instead of, or in addition to, plain
> VACUUM) and see if this is an index issue.

VACUUM ANALYZE helped to lessen the load. Not as much as VACUUM FULL, but it
still brought the load down to a reasonable level.

> 1. What types of queries do you run, and how often?

First, the cycle posted above; second, every 5 minutes ~40 SELECTs that
involve that table. I left the once-per-minute poller offline this weekend,
and the CPU usage didn't creep up.

> 2. How is your database tuned (postgresql.conf settings)?

shared_buffers = 13000
max_fsm_relations = 100000
max_fsm_pages = 1000000
max_locks_per_transaction = 256
wal_buffers = 64
sort_mem = 32768
vacuum_mem = 16384
fsync = false
effective_cache_size = 60000

Using these settings I was able to bring CPU usage down to a more reasonable
level: http://andri.estpak.ee/cpu1.png

This is much better than the first graph (see http://andri.estpak.ee/cpu0.png ),
but you can still see CPU usage creeping up. VACUUM FULL was done at 03:00 and
09:00. The small drop at ~12:45 is thanks to VACUUM ANALYZE.

If this is the best you can get with postgres right now, then I'll just have
to increase the frequency of VACUUMing, but that feels like a hackish
solution :(

> 3. How much memory does your machine have?

1 gigabyte.

--
andri
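[For reference, the once-a-minute cycle described above boils down to
something like the following - a sketch only, since the real schema wasn't
posted; the table name and the WHERE condition for step 1 are invented, while
the column names are taken from the EXPLAIN ANALYZE example earlier in the
thread.]

-- 1) select the ~700 ids that are due to be polled
SELECT primary_key
  FROM mytable
 WHERE last_updated < NOW() - INTERVAL '1 minute';

-- 2) poll each id outside the database ...

-- 3) write the results back, one UPDATE per polled id
UPDATE mytable
   SET random_int_field = 1, last_updated = NOW()
 WHERE primary_key = 3772;

-- Under MVCC each UPDATE leaves the previous row version behind as a dead
-- tuple, so this cycle adds ~700 dead tuples (about 25% of the table) every
-- minute until a VACUUM reclaims the space.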
On 8 Sep 2003 at 13:50, Andri Saar wrote:
> If this is the best you can get with postgres right now, then I'll just have
> to increase the frequency of VACUUMing, but that feels like a hackish
> solution :(

Use an autovacuum daemon. There is one in the postgresql contrib module. It
was introduced during 7.4 development, and it works with 7.3.x as well.

The current 7.4 CVS head has some problems with the stats collector, but it
should be fine soon. Check it out..

Bye
 Shridhar

--
Punishment becomes ineffective after a certain point.  Men become insensitive.
		-- Eneg, "Patterns of Force", stardate 2534.7
On Mon, 8 Sep 2003 13:50:23 +0300, Andri Saar <andris@estpak.ee> wrote:
>Basically I do this:
>1) select the ~700 IDs I have to poll
>2) poll them
>3) update those 700 rows in that "table" I mentioned (~2700 rows total).
>
>And I do this cycle once per minute, so yes, I've got a zillion updates. 700
>of 2700 is roughly 25%, so I'd have to vacuum once per minute?

With such a small table VACUUM should be a matter of less than one second:

fred=# vacuum verbose t;
INFO:  --Relation public.t--
INFO:  Index t_pkey: Pages 65; Tuples 16384: Deleted 4096.
	CPU 0.01s/0.10u sec elapsed 0.21 sec.
INFO:  Removed 4096 tuples in 154 pages.
	CPU 0.04s/0.02u sec elapsed 0.07 sec.
INFO:  Pages 192: Changed 192, Empty 0; Tup 16384: Vac 4096, Keep 0, UnUsed 0.
	Total CPU 0.08s/0.16u sec elapsed 0.36 sec.
VACUUM
Time: 415.00 ms

And this is on a 400 MHz machine under cygwin, so don't worry if you have a
real computer.

Servus
 Manfred
Andri Saar <andris@estpak.ee> writes:
> If this is the best you can get with postgres right now, then I'll just have
> to increase the frequency of VACUUMing, but that feels like a hackish
> solution :(

Not at all.  The overhead represented by VACUUM would have to be paid
somewhere, somehow, in any database.  Postgres allows you to control exactly
when it gets paid.

It looks to me like throwing a plain VACUUM into your poller cycle (or
possibly VACUUM ANALYZE, depending on how fast the table's stats change)
would solve your problems nicely.

Note that once you have that policy in place, you will want to do one VACUUM
FULL, and possibly a REINDEX, to get the table's physical size back down to
something commensurate with 2700 useful rows.  I shudder to think of where it
had gotten to before.  Routine VACUUMing should hold it to a reasonable size
after that.

			regards, tom lane
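[Put concretely, the maintenance described above might look like this - a
sketch, with "mytable" once more standing in for the real table name.]

-- One-time cleanup to shrink the table and its indexes back to a size
-- commensurate with ~2700 live rows (VACUUM FULL takes an exclusive lock,
-- so run it during a quiet moment):
VACUUM FULL mytable;
REINDEX TABLE mytable;

-- Routine maintenance, appended to the end of every polling cycle
-- (or every few cycles):
VACUUM ANALYZE mytable;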
On Monday 08 September 2003 17:04, Tom Lane wrote:
>
> It looks to me like throwing a plain VACUUM into your poller cycle (or
> possibly VACUUM ANALYZE, depending on how fast the table's stats change)
> would solve your problems nicely.
>

I compiled the pg_autovacuum daemon from the 7.4beta sources as Shridhar
Daithankar recommended, and it seems to work fine.

At first glance I thought VACUUM was a thing you do maybe once per week
during routine administration tasks, like making a full backup, but I was
wrong.

Thanks to all for your help; we can consider this problem solved.

Note to future generations: the default postgres configuration settings are
very conservative, so don't be afraid to VACUUM very often.

andri
On 8 Sep 2003 at 17:31, Andri Saar wrote:
> Note to future generations: the default postgres configuration settings are
> very conservative, so don't be afraid to VACUUM very often.

You should have seen the defaults in earlier postgresql releases - they were
arcane by that standard. 7.4 at least attempts to determine shared_buffers
while doing initdb, which results in much better default performance.

Bye
 Shridhar

--
Ritchie's Rule:
	(1) Everything has some value -- if you use the right currency.
	(2) Paint splashes last longer than the paint job.
	(3) Search and ye shall find -- but make sure it was lost.