Thread: Can anyone explain this: duplicate dbs.
Wondering if someone could explain a peculiarity for me:
We have a database which takes 1000ms to perform a certain query on.
If I pg_dump that database then create a new database (e.g. "tempdb") and upload the dump file (thus making a duplicate) then the same query only takes 190ms !!
Vacuum, vacuum analyse, and vacuum full analyse do not seem to have an impact on these times.
Can anyone explain why this may be occurring and how I might be able to keep the original database running at the same speed as "tempdb"?
Thanks in advance,
Dave.
> Can anyone explain why this may be occurring and how I might be able to
> keep the original database running at the same speed as "tempdb"?

You're not vacuuming anywhere near often enough. Read up on the database maintenance section of the manual. Then, set up contrib/pg_autovacuum to vacuum your database regularly, or make a cron job to run "vacuumdb -a -z -q" once an hour, say.

You can fix the case where you haven't been vacuuming enough with a one-off VACUUM FULL ANALYZE command, but this will lock tables exclusively as it does its work.

Chris
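For reference, the cron job Chris describes might look like this in the postgres user's crontab (the binary path is an assumption; adjust it for your install):

```
# min hour dom mon dow  command
0     *    *   *   *    /usr/local/pgsql/bin/vacuumdb -a -z -q
```

The `-a` flag vacuums all databases, `-z` also runs ANALYZE to refresh planner statistics, and `-q` suppresses output so cron stays quiet.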
> If I pg_dump that database then create a new database (e.g. "tempdb")
> and upload the dump file (thus making a duplicate) then the same query
> only takes 190ms !!
>
> Vacuum, vacuum analyse, and vacuum full analyse does not seem to have an
> impact on these times.

Damn, for some reason I didn't read that you had already tried vacuum full. In that case, I can't explain it, except perhaps you aren't vacuuming properly, or the right thing, or it's a disk cache thing.

Chris
SpaceBallOne wrote:
> Wondering if someone could explain a peculiarity for me:
>
> We have a database which takes 1000ms to perform a certain query on.
>
> If I pg_dump that database then create a new database (e.g. "tempdb")
> and upload the dump file (thus making a duplicate) then the same query
> only takes 190ms !!
>
> Vacuum, vacuum analyse, and vacuum full analyse does not seem to have
> an impact on these times.
>
> Can anyone explain why this may be occurring and how I might be able
> to keep the original database running at the same speed as "tempdb"?
>
> Thanks in advance,
>
> Dave.

What version of postgres?

There are a few possibilities. If you are having a lot of updates to the table, you can get index bloat, and plain VACUUM doesn't shrink indexes. You have to REINDEX to do that. Though REINDEX takes the same exclusive lock that VACUUM FULL does, so you need to be a little careful with it.

Probably better is to do CLUSTER, as it does a REINDEX and a sort, so your table ends up nicer when you are done.

Also, older versions of postgres had a worse time with index bloat. One thing that caused a lot of problems is a table that you insert into over time, so that all the values are incrementing. If you are deleting older entries, that area won't be re-used because it falls at the back end. I believe newer versions have been fixed.

By the way, I think doing:

CREATE DATABASE tempdb WITH TEMPLATE = originaldb;

is a much faster way of doing dump and load. I *think* it would recreate indexes, etc. If it just does a copy it may not show the dump/restore improvement.

John =:->
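John's REINDEX/CLUSTER suggestions translate to something like the following (the table and index names here are made up for illustration; both commands take exclusive locks, so run them during a quiet period):

```sql
-- Rebuild all of a table's indexes in place, discarding index bloat:
REINDEX TABLE mytable;

-- Or rewrite the table in the order of one of its indexes, which
-- compacts the heap and rebuilds the indexes at the same time:
CLUSTER mytable_pkey ON mytable;

-- Afterwards, refresh the planner statistics:
ANALYZE mytable;
```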
> What version of postgres?

8.0.2 ... but I think I've seen this before on 7.3 ...

> There are a few possibilities. If you are having a lot of updates to the
> table, you can get index bloat. And vacuum doesn't fix indexes. You have
> to "REINDEX" to do that. Though REINDEX has the same lock that VACUUM
> FULL has, so you need to be a little careful with it.
>
> Probably better is to do CLUSTER, as it does a REINDEX and a sort, so
> your table ends up nicer when you are done.

Thanks, will try those next time this problem crops up (I just deleted / recreated the database to speed things up for its users in the office ... probably should have held off to see if I could find a solution first!).

Yes, the database / table in question does have a lot of updates, deletes, and new rows (relatively speaking, for a small business).

Would CLUSTER / REINDEX still have an effect if our queries were done via sequential scan? This is an old database (as in, built by me when I was just starting to learn unix / postgres), so the database design is pretty horrible (little normalisation, no indexes).

Have taken Chris's advice on board too and set up cron to do a vacuumdb hourly instead of my weekly vacuum.

Cheers,
Dave.
> Would CLUSTER / REINDEX still have an effect if our queries were done
> via sequential scan?

SELECTs don't write to the database, so they have no effect at all on vacuuming/analyzing. You only need to worry about that with writes.

> This is a old database (as in built by me when i
> was just starting to learn unix / postgres) so the database design is
> pretty horrible (little normalisation, no indexes).

No indexes? Bloody hell :D

Use EXPLAIN ANALYZE SELECT ... ; on all of your selects to see where they are slow and where you can add indexes...

Chris
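The workflow Chris suggests looks roughly like this (the table and column names are hypothetical, purely for illustration):

```sql
-- See the actual plan and where the time goes:
EXPLAIN ANALYZE SELECT * FROM jobs WHERE customer_id = 42;

-- If the plan shows a Seq Scan on a large table filtered by
-- customer_id, an index on that column will usually help:
CREATE INDEX jobs_customer_id_idx ON jobs (customer_id);

-- Refresh statistics, then re-run EXPLAIN ANALYZE to confirm the
-- planner now chooses an Index Scan:
ANALYZE jobs;
```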
SpaceBallOne wrote:
>> What version of postgres?
>
> 8.0.2 ... but I think I've seen this before on 7.3 ...
>
>> There are a few possibilities. If you are having a lot of updates to the
>> table, you can get index bloat. And vacuum doesn't fix indexes. You have
>> to "REINDEX" to do that. Though REINDEX has the same lock that VACUUM
>> FULL has, so you need to be a little careful with it.
>>
>> Probably better is to do CLUSTER, as it does a REINDEX and a sort, so
>> your table ends up nicer when you are done.
>
> Thanks, will try those next time this problem crops up (I just deleted
> / recreated the database to speed things up for its users in the office
> ... probably should have held off to see if I could find a solution
> first!).
>
> Yes, the database / table in question does have a lot of updates,
> deletes, and new rows (relatively speaking, for a small business).
>
> Would CLUSTER / REINDEX still have an effect if our queries were done
> via sequential scan? This is an old database (as in, built by me when I
> was just starting to learn unix / postgres), so the database design is
> pretty horrible (little normalisation, no indexes).

Well, my first recommendation is to put in some indexes. :) They are relatively easy to set up and can drastically improve SELECT performance.

What does it say at the end of "VACUUM FULL ANALYZE VERBOSE"? That should tell you how many free pages were reclaimed and how big your free space map should be.

If you only did one VACUUM FULL, you might try another, as it sounds like your tables aren't properly compacted. I'm pretty sure plain VACUUM only removes empty pages / marks locations in the free space map so they can be re-used, while VACUUM FULL will move entries around to create free pages. It sounds like it didn't do it properly.

But even so, CLUSTER is still your friend, as it allows you to "presort" the rows in your tables.

> Have taken Chris's advice on board too and set up cron to do a vacuumdb
> hourly instead of my weekly vacuum.
>
> Cheers,
>
> Dave.

John =:->
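The check John describes looks like this (the INFO lines below are illustrative of the shape of the output, not taken from the poster's system):

```sql
-- The last lines of the verbose output summarise free-space-map usage:
VACUUM FULL ANALYZE VERBOSE;
-- INFO:  free space map: ... relations, ... pages stored; ... total pages needed
-- If the "pages needed" figure exceeds max_fsm_pages in postgresql.conf,
-- raise that setting and restart, or dead space will keep accumulating
-- faster than vacuum can track it.
```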
Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
>> If I pg_dump that database then create a new database (e.g. "tempdb")
>> and upload the dump file (thus making a duplicate) then the same query
>> only takes 190ms !!
>> Vacuum, vacuum analyse, and vacuum full analyse does not seem to have an
>> impact on these times.
>
> Damn, for some reason I didn't read that you had already tried vacuum
> full.

I'm thinking index bloat, and a PG version too old for vacuum full to recover any index space. But without any information about PG version or EXPLAIN ANALYZE results, we're all just guessing.

regards, tom lane
John A Meinel <john@arbash-meinel.com> writes:
> By the way, I think doing:
> CREATE DATABASE tempdb WITH TEMPLATE = originaldb;
> Is a much faster way of doing dump and load. I *think* it would recreate
> indexes, etc. If it just does a copy it may not show the dump/restore
> improvement.

CREATE DATABASE just does a physical copy, so it won't do anything at all for bloat issues.

regards, tom lane
On Tue, May 24, 2005 at 21:39:15 -0500, John A Meinel <john@arbash-meinel.com> wrote:
> By the way, I think doing:
> CREATE DATABASE tempdb WITH TEMPLATE = originaldb;
> Is a much faster way of doing dump and load. I *think* it would recreate
> indexes, etc. If it just does a copy it may not show the dump/restore
> improvement.

You need to be careful when doing this. See section 18.3 of the 8.0 docs for caveats.
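For completeness, the template-copy approach being discussed is a one-liner; the main caveat from the docs is that no other sessions may be connected to the source database while the copy runs:

```sql
-- Physical, block-level copy of an existing database. Fails if anyone
-- else is connected to originaldb, and (per Tom's note) copies bloat
-- along with everything else:
CREATE DATABASE tempdb WITH TEMPLATE = originaldb;
```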