Thread: "Vacuum Full Analyze" taking so long
I have a PostgreSQL 8.0.2 database about 180 GB in size, running on a 2.6 Red Hat kernel with 32 GB of RAM and 2 CPUs. I'm running the VACUUM FULL ANALYZE command, and it has been running for at least two consecutive days with no other processes running (it's an offline loading server). I tweaked maintenance_work_mem to its max (2 GB), with work_mem at 8 MB. I have no issues with my checkpoints. I can still see I/O activity against the physical files of the "property" table and its two indexes (the primary key and an R index). The property files are about 128 GB and the indexes are about 15 GB. I have run the same maintenance job on a different box (staging) with an identical hardware config (except with 64 GB instead of 32), and it took less than 12 hours. Any clue or tip is really appreciated.

I also read a comment by Tom Lane that terminating the process should be crash-safe if I had to.

Thanks,

-- Husam
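For reference, a minimal sketch of the settings and command described above, assuming the "property" table from the post. Note that on 8.0 these memory settings take plain integer values in kB; unit suffixes such as '2GB' only arrived in later releases.

    -- Session-level settings as described in the post; on 8.0 the values
    -- are integers in kB (no '2GB'-style suffixes yet).
    SET maintenance_work_mem = 2000000;  -- roughly the 2 GB ceiling mentioned
    SET work_mem = 8192;                 -- 8 MB

    -- The maintenance command in question; VERBOSE reports progress
    -- table by table as it runs.
    VACUUM FULL VERBOSE ANALYZE property;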
Nothing was running except the job. The server did not look stressed out when looking at top and vmstat. We have seen slower query performance when performing load tests, so I ran the re-index on all application indexes and then issued a full vacuum. I ran the same thing on a staging server and it took less than 12 hours. Is there a possibility the DB pages are corrupted? Is there a command to verify that? (In Oracle, there's a dbverify command that checks for corruption at the data-file level.)

The other question I have: what would be the proper approach to rebuilding indexes? I re-index and then run vacuum/analyze. Should I not use the re-index approach, and instead drop the indexes, vacuum the tables, then create the indexes, and then run analyze on the tables and indexes?

Thanks,

-- Husam
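For what it's worth, the rebuild-in-place route described above would look roughly like this - a sketch only; note that REINDEX, like VACUUM FULL, holds an exclusive lock on the table while it runs:

    -- Rebuild every index on the table in place; the table is locked
    -- exclusively for the duration, so run it in a maintenance window.
    REINDEX TABLE property;

    -- Refresh planner statistics afterwards.
    ANALYZE property;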
I'd say, "don't do that." Unless you've deleted a lot of stuff and are expecting the DB to shrink, a full vacuum shouldn't really be needed. On a DB that big, a full vacuum is just going to take a long time.

If you really are shrinking, consider structuring things so you can just drop a table instead of vacuuming it (the drop is fairly instantaneous). If you can't do that, consider dropping the indices, vacuuming, and recreating the indices.

Mike Stone
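A minimal sketch of that drop/vacuum/recreate sequence, with a hypothetical index and column name standing in for the real schema:

    -- Dropping the indexes first means VACUUM FULL does not have to
    -- maintain index entries while it moves tuples around.
    DROP INDEX property_some_idx;        -- hypothetical index name

    -- Shrinks the table files; still takes an exclusive lock.
    VACUUM FULL property;

    -- Recreate the index, then refresh planner statistics.
    CREATE INDEX property_some_idx ON property (some_column);  -- hypothetical column
    ANALYZE property;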
Vacuum full takes an exclusive lock on the tables it runs against, so if you have anything else reading the table while you are trying to run it, the vacuum full will wait, possibly forever, until it can get the lock.

What does the system load look like while you are running this? What does "vmstat 1" show you? Is there load on the system other than the database?

Do you really need to run vacuum full instead of plain vacuum?

- Luke
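If the concern is that the VACUUM FULL is simply parked waiting for that exclusive lock, one way to check from a second session is to look for ungranted locks in the standard pg_locks view - a sketch:

    -- Any row here is a backend waiting on a lock it has not been granted;
    -- a VACUUM FULL stuck behind a reader would show up as an ungranted
    -- AccessExclusiveLock on the target relation.
    SELECT l.pid, l.mode, l.granted, c.relname
    FROM pg_locks l
    LEFT JOIN pg_class c ON c.oid = l.relation
    WHERE NOT l.granted;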
Husam,

On 7/25/05 4:31 PM, "John A Meinel" <john@arbash-meinel.com> wrote:
> I *think* if you are planning on dropping the indexes anyway, just drop
> them, VACUUM ANALYZE, and then recreate them.

I agree - and don't run "VACUUM FULL"; it is quite different from "VACUUM". Also, you should only need to vacuum if you've deleted a lot of data: its job is to reclaim space lost to rows marked deleted. So, in fact, you may not even need to run VACUUM at all.

"VACUUM FULL" is like a disk defragmentation operation within the DBMS, and is only necessary if there is a slowdown in performance from lots and lots of deletes and/or updates, such that new data isn't finding sequential pages for storage - which is rare. Given the need for locking, it's generally better to dump and restore in that case, but again it's a very rare occasion.

I don't know of a command to check for page corruption, but I would think that if you can run VACUUM (not full) you should be OK.

- Luke
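One rough way to gauge whether that kind of dead space has actually built up, before committing to a multi-day VACUUM FULL, is to compare the table's on-disk page count against the planner's live-row estimate - a sketch; the figures are only as fresh as the last ANALYZE:

    -- relpages counts 8 kB pages on disk; reltuples estimates live rows.
    -- A page count far beyond what the live rows should need suggests
    -- bloat worth reclaiming; otherwise plain VACUUM is enough.
    SELECT relname, relpages, reltuples,
           relpages * 8 / 1024 AS approx_mb
    FROM pg_class
    WHERE relname = 'property';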
Tomeh, Husam wrote:
> Should I not use the re-index approach, and instead drop the indexes,
> vacuum the tables, then create the indexes, and then run analyze on the
> tables and indexes?

I *think* if you are planning on dropping the indexes anyway, you can just drop them, VACUUM ANALYZE, and then recreate them. I don't think you have to re-analyze after you have recreated them.

John
=:->
If you just want to rebuild indexes, just drop and recreate.

However, you are also running a VACUUM FULL, so I presume you have deleted a significant number of rows and want to recover the space that was in use by them. In that scenario, it is often better to CLUSTER the table to force a rebuild. While VACUUM FULL moves the tuples around inside the existing file(s), CLUSTER simply creates new file(s), moves all the non-deleted tuples there, and then swaps the old and the new files. There can be a significant performance increase in doing so (but you obviously need to have some free disk space).

If you CLUSTER your table, it will be ordered by the index you specify. There can be a performance increase in doing so, but if you don't want that, you can also do a no-op ALTER TABLE and change a column to a datatype that is the same as it already has. This too will force a rewrite of the table, but without ordering the tuples.

So, in short, my recommendations:
- to rebuild indexes, just drop and recreate the indexes
- to rebuild everything because there is space that can be permanently reclaimed, drop the indexes, cluster or alter the table, recreate the indexes, and analyze the table

Jochem
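A minimal sketch of both rebuild routes, using the property table with hypothetical index and column names; note that on 8.0 the syntax is CLUSTER indexname ON tablename (later releases changed this to CLUSTER table USING index):

    -- Route 1: CLUSTER rewrites the table into fresh files, ordered by
    -- the given index, then swaps them in. Index name is hypothetical.
    CLUSTER property_pkey ON property;

    -- Route 2: a same-type ALTER COLUMN also forces a full table rewrite,
    -- but without reordering the tuples. Column name/type are hypothetical.
    ALTER TABLE property ALTER COLUMN status TYPE varchar(10);

    -- Either way, refresh planner statistics afterwards.
    ANALYZE property;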
Thank you all for your great input. It sure helped.

-- Husam