Thread: "Vacuum Full Analyze" taking so long

"Vacuum Full Analyze" taking so long

From
"Tomeh, Husam"
Date:
I have an 8.02 postgresql database with about 180 GB in size, running on
2.6 RedHat kernel with 32 GB of RAM and 2 CPUs. I'm running the vacuum
full analyze command, and has been running for at least two consecutive
days with no other processes running (it's an offline loading server). I
tweaked the maintenanace_mem to its max (2 GB) with work_mem of 8M. I
have no issues with my checkpoints. I can still I/O activities against
the physical files of the "property" table and its two indexes (primary
key and r index). The property files are about 128GB and indexes are
about 15 GB. I have run the same maintenance job on a different box
(staging) with identical hardware config (except with 64 GB instead of
32) and took less than 12 hours.   Any clue or tip is really
appreciated.

Also read a comment by Tom Lane, that terminating the process should be
crash-safe if I had to.

Thanks,


--
 Husam

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

**********************************************************************
This message contains confidential information intended only for the
use of the addressee(s) named above and may contain information that
is legally privileged.  If you are not the addressee, or the person
responsible for delivering it to the addressee, you are hereby
notified that reading, disseminating, distributing or copying this
message is strictly prohibited.  If you have received this message by
mistake, please immediately notify us by replying to the message and
delete the original message immediately thereafter.

Thank you.                                       FADLD Tag
**********************************************************************


Re: "Vacuum Full Analyze" taking so long

From
"Tomeh, Husam"
Date:
Nothing was running except the job. The server did not look stressed out
looking at top and vmstat. We have seen slower query performance when
performing load tests, so I run the re-index on all application indexes
and then issue a full vacuum. I ran the same thing on a staging server
and it took less than 12 hours. Is there a possibility the DB pages are
corrupted. Is there a command to verify that. (In Oracle, there's a
dbverify command that checks for corruption on the data files level).

The other question I have. What would be the proper approach to rebuild
indexes. I re-indexes and then run vacuum/analyze. Should I not use the
re-index approach, and instead, drop the indexes, vacuum the tables, and
then create the indexes, then run analyze on tables and indexes??

Thanks,


--
 Husam

-----Original Message-----
From: Luke Lonergan [mailto:llonergan@greenplum.com]
Sent: Monday, July 25, 2005 3:49 PM
To: Tomeh, Husam; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] "Vacuum Full Analyze" taking so long

Vacuum full takes an exclusive lock on the tables it runs against, so if
you have anything else reading the table while you are trying to run it,
the vacuum full will wait, possibly forever until it can get the lock.

What does the system load look like while you are running this?  What
does vmstat 1 show you?  Is there load on the system other than the
database?

Do you really need to run vacuum full instead of vacuum?

- Luke


On 7/25/05 2:30 PM, "Tomeh, Husam" <htomeh@firstam.com> wrote:

>
> I have an 8.02 postgresql database with about 180 GB in size, running
> on
> 2.6 RedHat kernel with 32 GB of RAM and 2 CPUs. I'm running the vacuum

> full analyze command, and has been running for at least two
> consecutive days with no other processes running (it's an offline
> loading server). I tweaked the maintenanace_mem to its max (2 GB) with

> work_mem of 8M. I have no issues with my checkpoints. I can still I/O
> activities against the physical files of the "property" table and its
> two indexes (primary key and r index). The property files are about
> 128GB and indexes are about 15 GB. I have run the same maintenance job

> on a different box
> (staging) with identical hardware config (except with 64 GB instead of
> 32) and took less than 12 hours.   Any clue or tip is really
> appreciated.
>
> Also read a comment by Tom Lane, that terminating the process should
> be crash-safe if I had to.
>
> Thanks,
>



**********************************************************************
This message contains confidential information intended only for the
use of the addressee(s) named above and may contain information that
is legally privileged.  If you are not the addressee, or the person
responsible for delivering it to the addressee, you are hereby
notified that reading, disseminating, distributing or copying this
message is strictly prohibited.  If you have received this message by
mistake, please immediately notify us by replying to the message and
delete the original message immediately thereafter.

Thank you.                                       FADLD Tag
**********************************************************************


Re: "Vacuum Full Analyze" taking so long

From
Michael Stone
Date:
I'd say, "don't do that". Unless you've deleted a lot of stuff and are
expecting the DB to shrink, a full vacuum shouldn't really be needed. On
a DB that big a full vacuum is just going to take a long time. If you
really are shrinking, consider structuring things so you can just drop a
table instead of vacuuming it (the drop is fairly instantaneous). If you
can't do that, consider dropping the indices, vacuuming, and recreating
the indices.

Mike Stone

Re: "Vacuum Full Analyze" taking so long

From
"Luke Lonergan"
Date:
Vacuum full takes an exclusive lock on the tables it runs against, so if you
have anything else reading the table while you are trying to run it, the
vacuum full will wait, possibly forever until it can get the lock.

What does the system load look like while you are running this?  What does
vmstat 1 show you?  Is there load on the system other than the database?

Do you really need to run vacuum full instead of vacuum?

- Luke


On 7/25/05 2:30 PM, "Tomeh, Husam" <htomeh@firstam.com> wrote:

>
> I have an 8.02 postgresql database with about 180 GB in size, running on
> 2.6 RedHat kernel with 32 GB of RAM and 2 CPUs. I'm running the vacuum
> full analyze command, and has been running for at least two consecutive
> days with no other processes running (it's an offline loading server). I
> tweaked the maintenanace_mem to its max (2 GB) with work_mem of 8M. I
> have no issues with my checkpoints. I can still I/O activities against
> the physical files of the "property" table and its two indexes (primary
> key and r index). The property files are about 128GB and indexes are
> about 15 GB. I have run the same maintenance job on a different box
> (staging) with identical hardware config (except with 64 GB instead of
> 32) and took less than 12 hours.   Any clue or tip is really
> appreciated.
>
> Also read a comment by Tom Lane, that terminating the process should be
> crash-safe if I had to.
>
> Thanks,
>



Re: "Vacuum Full Analyze" taking so long

From
"Luke Lonergan"
Date:
Husam,

On 7/25/05 4:31 PM, "John A Meinel" <john@arbash-meinel.com> wrote:

> Tomeh, Husam wrote:
>>
>> Nothing was running except the job. The server did not look stressed out
>> looking at top and vmstat. We have seen slower query performance when
>> performing load tests, so I run the re-index on all application indexes
>> and then issue a full vacuum. I ran the same thing on a staging server
>> and it took less than 12 hours. Is there a possibility the DB pages are
>> corrupted. Is there a command to verify that. (In Oracle, there's a
>> dbverify command that checks for corruption on the data files level).
>>
>> The other question I have. What would be the proper approach to rebuild
>> indexes. I re-indexes and then run vacuum/analyze. Should I not use the
>> re-index approach, and instead, drop the indexes, vacuum the tables, and
>> then create the indexes, then run analyze on tables and indexes??
>
> I *think* if you are planning on dropping the indexes anyway, just drop
> them, VACUUM ANALYZE, and then recreate them, I don't think you have to
> re-analyze after you have recreated them.

I agree - and don't run "VACUUM FULL", it is quite different from "VACUUM".
Also - you should only need to vacuum if you've deleted a lot of data.  It's
job is to reclaim space lost to rows marked deleted.  So, in fact, you may
not even need to run VACUUM.

"VACUUM FULL" is like a disk defragmentation operation within the DBMS, and
is only necessary if there is a slowdown in performance from lots and lots
of deletes and/or updates and new data isn't finding sequential pages for
storage, which is rare.  Given the need for locking, it's generally better
to dump and restore in that case, but again it's a very rare occasion.

I don't know of a command to check for page corruption, but I would think
that if you can run VACUUM (not full) you should be OK.

- Luke



Re: "Vacuum Full Analyze" taking so long

From
John A Meinel
Date:
Tomeh, Husam wrote:
>
> Nothing was running except the job. The server did not look stressed out
> looking at top and vmstat. We have seen slower query performance when
> performing load tests, so I run the re-index on all application indexes
> and then issue a full vacuum. I ran the same thing on a staging server
> and it took less than 12 hours. Is there a possibility the DB pages are
> corrupted. Is there a command to verify that. (In Oracle, there's a
> dbverify command that checks for corruption on the data files level).
>
> The other question I have. What would be the proper approach to rebuild
> indexes. I re-indexes and then run vacuum/analyze. Should I not use the
> re-index approach, and instead, drop the indexes, vacuum the tables, and
> then create the indexes, then run analyze on tables and indexes??

I *think* if you are planning on dropping the indexes anyway, just drop
them, VACUUM ANALYZE, and then recreate them, I don't think you have to
re-analyze after you have recreated them.

John
=:->

>
> Thanks,
>
>


Attachment

Re: "Vacuum Full Analyze" taking so long

From
Jochem van Dieten
Date:
Tomeh, Husam wrote:
> The other question I have. What would be the proper approach to rebuild
> indexes. I re-indexes and then run vacuum/analyze. Should I not use the
> re-index approach, and instead, drop the indexes, vacuum the tables, and
> then create the indexes, then run analyze on tables and indexes??

If you just want to rebuild indexes, just drop and recreate.

However, you are also running a VACUUM FULL, so I presume you
have deleted a significant number of rows and want to recover the
space that was in use by them. In that scenario, it is often
better to CLUSTER the table to force a rebuild. While VACUUM FULL
moves the tuples around inside the existing file(s), CLUSTER
simply creates new file(s), moves all the non-deleted tuples
there and then swaps the old and the new files. There can be a
significant performance increase in doing so (but you obviously
need to have some free diskspace).
If you CLUSTER your table it will be ordered by the index you
specify. There can be a performance increase in doing so, but if
you don't want to you can also do a no-op ALTER TABLE and change
a column to a datatype that is the same as it already has. This
too will force a rewrite of the table but without ordering the
tuples.

So in short my recommendations:
- to rebuild indexes, just drop and recreate the indexes
- to rebuild everything because there is space that can
bepermanently reclaimed, drop indexes, cluster or alter the
table, recreate the indexes and anlyze the table

Jochem

Re: "Vacuum Full Analyze" taking so long

From
"Tomeh, Husam"
Date:
Thank you all for your great input. It sure helped.

--
 Husam

-----Original Message-----
From: pgsql-performance-owner@postgresql.org
[mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Jochem van
Dieten
Sent: Tuesday, July 26, 2005 2:58 AM
To: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] "Vacuum Full Analyze" taking so long

Tomeh, Husam wrote:
> The other question I have. What would be the proper approach to
> rebuild indexes. I re-indexes and then run vacuum/analyze. Should I
> not use the re-index approach, and instead, drop the indexes, vacuum
> the tables, and then create the indexes, then run analyze on tables
and indexes??

If you just want to rebuild indexes, just drop and recreate.

However, you are also running a VACUUM FULL, so I presume you have
deleted a significant number of rows and want to recover the space that
was in use by them. In that scenario, it is often better to CLUSTER the
table to force a rebuild. While VACUUM FULL moves the tuples around
inside the existing file(s), CLUSTER simply creates new file(s), moves
all the non-deleted tuples there and then swaps the old and the new
files. There can be a significant performance increase in doing so (but
you obviously need to have some free diskspace).
If you CLUSTER your table it will be ordered by the index you specify.
There can be a performance increase in doing so, but if you don't want
to you can also do a no-op ALTER TABLE and change a column to a datatype
that is the same as it already has. This too will force a rewrite of the
table but without ordering the tuples.

So in short my recommendations:
- to rebuild indexes, just drop and recreate the indexes
- to rebuild everything because there is space that can bepermanently
reclaimed, drop indexes, cluster or alter the table, recreate the
indexes and anlyze the table

Jochem

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq

**********************************************************************
This message contains confidential information intended only for the
use of the addressee(s) named above and may contain information that
is legally privileged.  If you are not the addressee, or the person
responsible for delivering it to the addressee, you are hereby
notified that reading, disseminating, distributing or copying this
message is strictly prohibited.  If you have received this message by
mistake, please immediately notify us by replying to the message and
delete the original message immediately thereafter.

Thank you.                                       FADLD Tag
**********************************************************************