Re: More Praise for 7.4RC2 - Mailing list pgsql-general
From | scott.marlowe |
---|---|
Subject | Re: More Praise for 7.4RC2 |
Date | |
Msg-id | Pine.LNX.4.33.0311131356430.1026-100000@css120.ihs.com Whole thread Raw |
In response to | Re: More Praise for 7.4RC2 (Reece Hart <reece@in-machina.com>) |
Responses |
Re: More Praise for 7.4RC2
Re: More Praise for 7.4RC2 |
List | pgsql-general |
On Thu, 13 Nov 2003, Reece Hart wrote: > On Thu, 2003-11-13 at 10:09, scott.marlowe wrote: > > > Do you vacuum full every so often? If not, and if you've been overflowing > > your fsm, then your tables will just grow without shrinking. > > Also, index growth could be a problem. > > > Hmm. I didn't realize that I needed to vacuum full as well -- I thought > vacuum was sufficient for performance gains, and that full reclaimed > space but didn't result in significant performance gains. I have > reindexed infrequently, but since that locks the table I didn't do that > (or vacuum full) often. I guess I should try out pg_autovacuum, but I > think that full vacuums only to prevent XID wraparound (if age>1.5B > transactions), but not for compaction (is this correct?). Assuming your free space map has enough room, and you vacuum (plain vacuum) often enough, you're dandy. But, let's say you run 10,000 transactions on a 1,000 row table, then run a plain vacuum. Even if your fsm can hold all the space that is free in that table's space, you've got a sparsely populated table that will take a while to seq scan through. I.e. if you don't vacuum (regular) often enough, then your tables may be quite large. I've tested out the pg_autovacuum daemon, and it seems to work quite well for me. What you're really shooting for is a "steady state" table size. Let's say you've got a 10,000 row table, and you average 500 changes an hour on it. If you vacuum it every day it will probably be fine, as by the end of 24 hours, the table will take up the space of about 22,000 rows. So, it will have the same basic performance as if it had 22,000 rows in it. If you start vacuuming it every hour after that first few days, then it will always be about 22,000 rows in size, (assuming the rows aren't noticable bigger or smaller from one version to the next.) While plain vacuums can reclaim the space at the very end of a table, and could theoretically make this table shrink over time, it's highly unlikely to ever drop back down to the approximate 10,000 rows in size started with. Vacuum full will drop it back down to somewhere around there. So, if your table is HIGHLY updated, you may need to run a plain vacuum very often, and that's where the autovacuum daemon comes in handy. Just set it to run every 30 minutes or so, and let it go. It should only vacuum the tables that have had lots of change, and leave the others alone. With the vacuum delay patch that's in testing for 7.5, it may well be that running the autovacuum daemon will become acceptable in places where, right now, vacuum, even the regular kind, produce too much system load / slow down in the middle of the day. > > The real test is to dump the database and reload it to give 7.3.4 a fair > > shake. > > It turns out that I have two copies of this database around at the > moment running on 7.3.4. One was a fresh restore, and that's what I used > to generate the explain. However, the query was run on the older > database which was vacuumed and analyzed (but not vacuum full or > reindexed), and on that instance the query took a long time. On the > fresh install, it takes 72s. In summary: > > 7.3.4, long-running db: eons > 7.3.4, freshly restored: 72s > 7.4RC2, freshly restored: 0.3s You can use the oid2name program in the contrib directory to kinda research which files are big under those trees and see if it's a table or index growth problem. something like: #su - postgres $ oid2name All databases: --------------------------------- 16975 = template0 16976 = postgres $ cd $PGDATA/base/16976 du -s *|sort -n 1004 16640 12232 109169550 65644 109169548 will list the largest files in the postgres database directory. oid2name -d postgres |grep 10169548 109169548 = accounts tells me that it's the accounts table that's taking up all my room. and so on.
pgsql-general by date: