Thread: new to postgres (and db management) and performance already a problem :-(
Hi,

We have a horribly designed postgres 8.1.0 database (not my fault!). I am pretty new to database design and management and have really no idea how to diagnose performance problems. The db has only 25-30 tables, and half of them are only there because our codebase needs them (long story, again not my fault!). Basically we have 10 tables that are being accessed, and only a couple of queries that join more than 3 tables. Most of the action takes place on two tables. One of the devs has done some truly atrocious coding and is using the db as his data access mechanism (instead of an in-memory array, and he only needs an array/collection).

It is running on a P4 3000ish (desktop model) running an early Linux 2.6 (mdk 10.1) with 512MB of RAM, so that shouldn't be an issue, as we are talking about only 20000 inserts a day. It probably gets queried about 20000 times a day too (all vb6 via the pg odbc).

So... seeing as I didn't really do any investigation as to setting default sizes for storage and the like, I am wondering whether our performance problems (a programme running 1.5x slower than two weeks ago) might not be coming from the db (or rather, my maintaining of it). I have turned on stats, so as to allow autovacuuming, but have no idea whether that could be related. Is it better to schedule a cron job to do it x times a day? I just left all the default values in postgresql.conf... could I do some tweaking?

Does anyone know of any practical resources that might guide me in sorting out these sorts of problems? Some stuff with practical examples would be good so I could compare with what we have.

Thanks
Antoine

ps. I had a look with top and it didn't look like it was going much over 15% cpu, with memory usage negligible. There are usually about 10 open connections. I couldn't find an easy way to check for disk access.

pps. The db is just one possible reason for our bottleneck, so if you tell me it is very unlikely I will be most reassured!
Re: new to postgres (and db management) and performance already a problem :-(
From
Andrew Sullivan
Date:
On Mon, Jan 16, 2006 at 11:07:52PM +0100, Antoine wrote:
> performance problems (a programme running 1.5x slower than two weeks
> ago) might not be coming from the db (or rather, my maintaining of it).
> I have turned on stats, so as to allow autovacuuming, but have no idea
> whether that could be related. Is it better to schedule a cron job to do
> it x times a day? I just left all the default values in postgres.conf...
> could I do some tweaking?

The first thing you need to do is find out where your problem is. Are queries running slowly? You need to do some EXPLAIN ANALYSE queries to understand that.

A

--
Andrew Sullivan | ajs@crankycanuck.ca
The whole tendency of modern prose is away from concreteness.
		--George Orwell
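[Editor's note: a minimal sketch of what that looks like in psql; the table and column names below are made up, so substitute one of your own slow queries.]

```sql
-- Compare the planner's row estimates against the actual rows and timings.
EXPLAIN ANALYSE
SELECT o.id, c.name
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.created > '2006-01-01';
```

A large gap between estimated and actual row counts usually means the planner's statistics are stale (run ANALYSE); a sequential scan over a big table where you expected an index scan points at a missing index or table bloat.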
Antoine <melser.anton@gmail.com> writes:
> So... seeing as I didn't really do any investigation as to setting
> default sizes for storage and the like - I am wondering whether our
> performance problems (a programme running 1.5x slower than two weeks
> ago) might not be coming from the db (or rather, my maintaining of it).

That does sound like a lack-of-vacuuming problem. If the performance goes back where it was after VACUUM FULL, then you can be pretty sure of it. Note that autovacuum is not designed to fix this for you: it only ever issues regular vacuum, not vacuum full.

> I couldn't find an easy way to check for disk accessings.

Watch the output of "vmstat 1" or "iostat 1" for info about that.

			regards, tom lane
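[Editor's note: a quick way to run the test Tom describes, from a psql session; "mytable" is a placeholder for one of your busy tables.]

```sql
-- Time a representative query, compact the table, then time it again.
\timing on
SELECT count(*) FROM mytable;    -- note the runtime
VACUUM FULL VERBOSE mytable;     -- takes an exclusive lock while it runs
SELECT count(*) FROM mytable;    -- markedly faster now? then bloat was the problem
```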
> That does sound like a lack-of-vacuuming problem. If the performance
> goes back where it was after VACUUM FULL, then you can be pretty sure
> of it. Note that autovacuum is not designed to fix this for you: it
> only ever issues regular vacuum not vacuum full.

in our db system (for a website), i notice performance boosts after a vacuum full. but then, a VACUUM FULL takes 50min+ during which the db is not really accessible to web-users. is there another way to perform maintenance tasks AND leave the db fully operable and accessible?

thanks,
thomas
<me@alternize.com> writes:
> in our db system (for a website), i notice performance boosts after a vacuum
> full. but then, a VACUUM FULL takes 50min+ during which the db is not really
> accessible to web-users. is there another way to perform maintenance tasks
> AND leaving the db fully operable and accessible?

You're not doing regular vacuums often enough.

			regards, tom lane
>> in our db system (for a website), i notice performance boosts after a
>> vacuum full. but then, a VACUUM FULL takes 50min+ during which the db
>> is not really accessible to web-users. is there another way to perform
>> maintenance tasks AND leaving the db fully operable and accessible?
>
> You're not doing regular vacuums often enough.

well, shouldn't autovacuum take care of "regular" vacuums? in addition to autovacuum, tables with data changes are vacuumed and reindexed once a day - still performance seems to degrade slowly until a vacuum full is initiated... could an additional daily vacuum over the entire db (even on tables that only get data added, never changed or removed) help?

- thomas
Re: new to postgres (and db management) and performance already a problem :-(
From
Christopher Browne
Date:
>>> in our db system (for a website), i notice performance boosts after
>>> a vacuum full. but then, a VACUUM FULL takes 50min+ during which the
>>> db is not really accessible to web-users. is there another way to
>>> perform maintenance tasks AND leaving the db fully operable and
>>> accessible?
>>
>> You're not doing regular vacuums often enough.

By the way, you can get that VACUUM FULL to be "less injurious" if you collect a list of tables:

pubs=# select table_schema, table_name from information_schema.tables where table_type = 'BASE TABLE';

And then VACUUM FULL table by table. It'll take the same 50 minutes; it'll be more sporadically "unusable", which may turn out better. But that's just one step better; you want more steps :-).

> well, shouldn't autovacuum take care of "regular" vacuums? in addition
> to autovacuum, tables with data changes are vacuumed and reindexed
> once a day -
> still performance seems to degrade slowly until a vacuum full is
> initiated... could an additional daily vacuum over the entire db (even
> on tables that only get data added, never changed or removed) help?

Tables which never see updates/deletes don't need to get vacuumed very often. They should only need a periodic ANALYZE so that the query optimizer gets the right stats.

There are probably many tables where pg_autovacuum is doing a fine job. What you need to do is figure out which tables *aren't* getting maintained well enough, and see about doing something special to them.

What you may want to do is go table by table and, for each one, do two things:

1) VACUUM VERBOSE, which will report some information about how much dead space there is on the table.

2) Contrib function pgstattuple(), which reports more detailed info about space usage (alas, for just the one table).

You'll find, between these, that there are some tables that have a LOT of dead space.
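[Editor's note: a sketch of that per-table inspection; it assumes the contrib module pgstattuple has been installed in the database, and "mytable" is a placeholder.]

```sql
-- 1) Dead-space summary straight from VACUUM:
VACUUM VERBOSE mytable;   -- check the "removable"/"nonremovable" row version counts

-- 2) More detail from contrib/pgstattuple:
SELECT * FROM pgstattuple('mytable');
-- dead_tuple_percent and free_percent indicate how bloated the table is
```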
At that point, there may be three answers:

a) PG 8.1 pg_autovacuum allows you to modify how often specific tables are vacuumed; upping the numbers for the offending tables may clear things up

b) Schedule cron jobs to periodically (hourly? several times per hour?) VACUUM the "offending" tables

c) You may decide to fall back to VACUUM FULL; if you do so just for a small set of tables, the "time of pain" won't be the 50 minutes you're living with now...

Try a), b), and c) in order on the "offending" tables, as they address the problem at increasing cost...

--
(reverse (concatenate 'string "moc.liamg" "@" "enworbbc")) http://linuxdatabases.info/info/x.html
"Listen, strange women, lyin' in ponds, distributin' swords, is no basis for a system of government. Supreme executive power derives itself from a mandate from the masses, not from some farcical aquatic ceremony." -- Monty Python and the Holy Grail
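[Editor's note: for option a), per-table thresholds in 8.1 live in the pg_autovacuum system catalog. The values below are purely illustrative, and "mytable" is a placeholder.]

```sql
-- Make autovacuum hit 'mytable' once roughly 1000 rows plus 10% of the
-- table have changed, rather than waiting for the global defaults.
INSERT INTO pg_autovacuum
    (vacrelid, enabled,
     vac_base_thresh, vac_scale_factor,
     anl_base_thresh, anl_scale_factor,
     vac_cost_delay, vac_cost_limit)
VALUES ('mytable'::regclass, true,
        1000, 0.1,
        500, 0.05,
        -1, -1);   -- -1 means "fall back to the global cost settings"
```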
> Try a), b), and c) in order on the "offending" tables as they address
> the problem at increasing cost...

thanks a lot for the detailed information! the entire concept of vacuum isn't yet that clear to me, so your explanations and hints are very much appreciated. i'll definitely try these steps this weekend when the next full vacuum is scheduled :-)

best regards,
thomas
Thanks guys, that pretty much answered my question(s) too. I have a sneaking suspicion that vacuuming won't do too much for us however... now that I think about it - we do very little removing, pretty much only inserts and selects. I will give it a vacuum full and see what happens.
Cheers
Antoine
--
This is where I should put some witty comment.
Re: new to postgres (and db management) and performance already a problem :-(
From
Andrew Sullivan
Date:
On Tue, Jan 17, 2006 at 09:14:27AM +0100, Antoine wrote:
> think about it - we do very little removing, pretty much only inserts and
> selects. I will give it a vacuum full and see what happens.

UPDATES? Remember that, in Postgres, UPDATE is effectively DELETE + INSERT (from the point of view of storage, not the point of view of the user).

A

--
Andrew Sullivan | ajs@crankycanuck.ca
The plural of anecdote is not data.
		--Roger Brinner
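[Editor's note: the effect is easy to observe with contrib/pgstattuple; "accounts" is an invented example table.]

```sql
-- Each UPDATE writes a new row version and leaves the old one behind as
-- a dead tuple, exactly as a DELETE followed by an INSERT would.
UPDATE accounts SET balance = balance + 1 WHERE id = 42;
SELECT dead_tuple_count FROM pgstattuple('accounts');  -- old versions show up here
VACUUM accounts;  -- reclaims the dead versions into the free space map
```

So an insert/select workload that also UPDATEs rows still generates dead space and still needs vacuuming.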
Hi, Tom,

Tom Lane wrote:
>> in our db system (for a website), i notice performance boosts after a vacuum
>> full. but then, a VACUUM FULL takes 50min+ during which the db is not really
>> accessible to web-users. is there another way to perform maintenance tasks
>> AND leaving the db fully operable and accessible?
>
> You're not doing regular vacuums often enough.

It may also help to increase the max_fsm_pages setting, so the postmaster has more memory to remember freed pages between VACUUMs.

HTH,
Markus

--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS
Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org
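[Editor's note: a postgresql.conf fragment with illustrative values; the closing lines of a database-wide VACUUM VERBOSE report how many page slots are actually needed, so size from that.]

```
# Free space map: must be large enough to track every page with reusable
# space across the whole cluster. The 8.1 defaults (20000 / 1000) are
# often too small for busy databases.
max_fsm_pages = 200000
max_fsm_relations = 1000
```

On 8.1, changing these requires a server restart.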
Hi, Thomas,

me@alternize.com wrote:
>> Try a), b), and c) in order on the "offending" tables as they address
>> the problem at increasing cost...
>
> thanks alot for the detailed information! the entire concept of vacuum
> isn't yet that clear to me, so your explanations and hints are very much
> appreciated. i'll defenitely try these steps this weekend when the next
> full vacuum was scheduled :-)

Basically, VACUUM scans the whole table and looks for pages containing garbage rows (or row versions), deletes the garbage, and adds those pages to the free space map (if there are free slots). When allocating new rows / row versions, PostgreSQL first tries to fit them into pages from the free space map before allocating new pages. This is why a high max_fsm_pages setting can help when VACUUM frequency is low.

VACUUM FULL additionally moves rows between pages, trying to concentrate all the free space at the end of the tables (aka "defragmentation"), so it can then truncate the files and release the space to the filesystem.

CLUSTER basically rebuilds the table by copying all rows into a new table, in index order, and then dropping the old table, which also reduces fragmentation, though not as thoroughly as VACUUM FULL might.

ANALYZE creates statistics about the distribution of values in a column, allowing the query optimizer to estimate the selectivity of query criteria.

(This explanation is rather simplified, and ignores indices as well as the fact that a table can consist of multiple files. Also, I believe that newer PostgreSQL versions allow VACUUM to truncate files when free pages happen to appear at the very end of the file.)

HTH,
Markus

--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS
Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org
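[Editor's note: the four commands side by side; table/index names are placeholders, and the CLUSTER syntax shown is the pre-8.3 form that applies to 8.1.]

```sql
VACUUM mytable;               -- reclaim dead row versions into the free space map
VACUUM FULL mytable;          -- also compact pages and return space to the OS
                              -- (holds an exclusive lock for the duration)
CLUSTER myindex ON mytable;   -- rewrite the table in index order
ANALYZE mytable;              -- refresh the planner's column statistics
```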