Thread: partial VACUUM FULL

partial VACUUM FULL

From: Joseph Shraibman
If I cancel a VACUUM FULL, is the work that was done up until that point
thrown away?  I have a table that needs vacuuming but I can't accept the
downtime involved in vacuuming.

Re: partial VACUUM FULL

From: Tom Lane
Joseph Shraibman <jks@selectacast.net> writes:
> If I cancel a VACUUM FULL, is the work that was done up until that point
> thrown away?

Yes.

            regards, tom lane

Re: partial VACUUM FULL

From: Bill Moran
Joseph Shraibman wrote:
> If I cancel a VACUUM FULL, is the work that was done up until that point
> thrown away?  I have a table that needs vacuuming but I can't accept the
> downtime involved in vacuuming.

Not sure about the "cancel vacuum full" question, but I had some other thoughts
for you.

Keep in mind that a plain vacuum can do a lot of good if done regularly, and
it doesn't lock tables, thus the database can be in regular use while it's
run.  As a result, there is no downtime involved with regularly scheduled
vacuums.

There _can_ be a performance hit while vacuum is running, so you may need to
take that into account.  But I would expect that the performance hit incurred
during running vacuum will be less than that of not running it for long
periods of time.
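
Something like this in the postgres user's crontab would do it (just a sketch; "mydb" is a placeholder for your database name):

```
# Plain VACUUM ANALYZE every night at 03:00.  vacuumdb --analyze issues
# VACUUM ANALYZE, which takes no exclusive locks, so the database stays
# in regular use while it runs -- unlike VACUUM FULL.
0 3 * * *  vacuumdb --analyze mydb
```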

--
Bill Moran
Potential Technologies
http://www.potentialtech.com


Re: partial VACUUM FULL

From: Joseph Shraibman
My specific case:
I updated every row in a large table.
In the same transaction I tried to run vacuum on this table, which
rolled back the transaction. Then I reran my update.  So this large
table is now 2/3 expired tuples, and the performance hit is noticeable.


Bill Moran wrote:
> Joseph Shraibman wrote:
>
>> If I cancel a VACUUM FULL, is the work that was done up until that
>> point thrown away?  I have a table that needs vacuuming but I can't
>> accept the downtime involved in vacuuming.
>
>
> Not sure about the "cancel vacuum full" question, but I had some other
> thoughts
> for you.
>
> Keep in mind that a plain vacuum can do a lot of good if done regularly,
> and
> it doesn't lock tables, thus the database can be in regular use while it's
> run.  As a result, there is no downtime involved with regularly scheduled
> vacuums.
>
> There _can_ be a performance hit while vacuum is running, so you may
> need to
> take that into account.  But I would expect that the performance hit
> incurred
> during running vacuum will be less than that of not running it for long
> periods of time.
>


Re: partial VACUUM FULL

From: Christopher Petrilli
On Mar 23, 2004, at 3:57 PM, Bill Moran wrote:

> Joseph Shraibman wrote:
>> If I cancel a VACUUM FULL, is the work that was done up until that
>> point thrown away?  I have a table that needs vacuuming but I can't
>> accept the downtime involved in vacuuming.
>
> Not sure about the "cancel vacuum full" question, but I had some other
> thoughts
> for you.
>
> Keep in mind that a plain vacuum can do a lot of good if done
> regularly, and
> it doesn't lock tables, thus the database can be in regular use while
> it's
> run.  As a result, there is no downtime involved with regularly
> scheduled
> vacuums.

Unfortunately, with some things, and I'm not sure why, as I don't
understand the VACUUM stuff that well, I had assumed that running
VACUUM ANALYZE nightly would be enough.  After I noticed that a
specific database (very transient data) had bloated to nearly 7GB, I
ran VACUUM FULL on it, which took an hour or so, and it was reduced
down to under 1GB.

Is there a better way to deal with this?  This is on 7.3, and I wonder
if 7.4 fixed that, but it's been hard to schedule time to upgrade.
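
For what it's worth, a rough way to spot the worst offenders might be something like this (a sketch; reltuples and relpages are only estimates, refreshed by VACUUM and ANALYZE, so run one of those first):

```sql
-- Compare on-disk size (relpages, in 8KB pages) against the estimated
-- live tuple count to find tables whose size is out of proportion.
SELECT relname, relpages, reltuples
FROM pg_class
WHERE relkind = 'r'
ORDER BY relpages DESC
LIMIT 10;
```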

Chris
--
| Christopher Petrilli
| petrilli (at) amber.org


Re: partial VACUUM FULL

From: Bill Moran
Christopher Petrilli wrote:
>
> On Mar 23, 2004, at 3:57 PM, Bill Moran wrote:
>
>> Joseph Shraibman wrote:
>>
>>> If I cancel a VACUUM FULL, is the work that was done up until that
>>> point thrown away?  I have a table that needs vacuuming but I can't
>>> accept the downtime involved in vacuuming.
>>
>> Not sure about the "cancel vacuum full" question, but I had some other
>> thoughts
>> for you.
>>
>> Keep in mind that a plain vacuum can do a lot of good if done
>> regularly, and
>> it doesn't lock tables, thus the database can be in regular use while
>> it's
>> run.  As a result, there is no downtime involved with regularly scheduled
>> vacuums.
>
> Unfortunately, with some things, and I'm not sure why, as I don't
> understand the VACUUM stuff that well, I had assumed that running VACUUM
> ANALYZE nightly would be enough.  After I noticed that a specific
> database (very transient data) had bloated to nearly 7Gb, I ran VACUUM
> FULL on it, which took an hour or so, and it was reduced down to under 1GB.
>
> Is there a better way to deal with this?  This is on 7.3, and I wonder
> if 7.4 fixed that, but it's been hard to schedule time to upgrade.

vacuum's ability to clean things up has certain limitations.  One is that it
can't vacuum dead tuples that are still locked up in a transaction.  Another
is that the setting of vacuum_mem (in postgresql.conf) limits the amount of
cleanup that vacuum can do.

I suspect that the vacuum_mem setting is the problem in your case.  Possible
solutions are:
1) Run vacuum full instead
2) Run vacuum more frequently
3) Increase the value of vacuum_mem

Each of these has its disadvantages.  You should do a little research into
how vacuum works with vacuum_mem to determine which is the best approach for
your circumstances.
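
For option 3, something like this should work (a sketch; vacuum_mem is measured in kilobytes and can be raised per session, and "mytable" is a placeholder):

```sql
-- Raise vacuum_mem for this session only, so a manual vacuum can use
-- more memory than the postgresql.conf default allows.
SET vacuum_mem = 65536;   -- 64MB, for this session only
VACUUM ANALYZE mytable;
```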

--
Bill Moran
Potential Technologies
http://www.potentialtech.com


Re: partial VACUUM FULL

From: Frank Finner
I ran into the same problem. Did you try another VACUUM FULL after the first? It ran a lot faster
than the initial one (about an order of magnitude faster), and now I do a VACUUM FULL every night.
Luckily we have a downtime window at night, so we can lock some tables then. In
earlier releases of PostgreSQL there was no FULL option, so I guess I didn't notice the slightly
different behaviour of VACUUM and forgot to use the FULL option in 7.3.

I also ran into another trap yesterday with a very frequently updated database (some big tables are
dropped and copied from another database every few minutes): some system tables became extremely
large after about 10 weeks. A VACUUM FULL brought no relief (it repeatedly needed more than
15 minutes and used enormous amounts of memory, so the postmaster started swapping heavily), so I
dumped that database, dropped it, recreated it and reloaded it. This worked fine (VACUUM FULL now
needs less than 30 seconds), and I was lucky to have a time window for the drop/create, but I would
really like to know if somebody has a better way. This was with 7.3.5; I did not test it with
7.4.
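
The cycle described above amounts to something like this (a sketch; "mydb" is a placeholder, and it needs a window with no open connections to the database):

```
pg_dump mydb > mydb.sql   # dump schema and data to a script
dropdb mydb               # drop the bloated database
createdb mydb             # recreate it empty
psql -f mydb.sql mydb     # reload from the dump
```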

Regards, Frank.


On Tue, 23 Mar 2004 16:24:15 -0500 Christopher Petrilli <petrilli@amber.org> sat down, thought long
and then wrote:

> Unfortunately, with some things, and I'm not sure why, as I don't 
> understand the VACUUM stuff that well, I had assumed that running 
> VACUUM ANALYZE nightly would be enough.  After I noticed that a 
> specific database (very transient data) had bloated to nearly 7Gb, I 
> ran VACUUM FULL on it, which took an hour or so, and it was reduced 
> down to under 1GB.
> 
> Is there a better way to deal with this?  This is on 7.3, and I wonder 
> if 7.4 fixed that, but it's been hard to schedule time to upgrade.
> 
> Chris
> --
> | Christopher Petrilli
> | petrilli (at) amber.org
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 7: don't forget to increase your free space map settings


-- 
Frank Finner

Memory follows memory, memory defeats memory; some things are banished
only into the realms of our rich imaginings  -  but this does not mean
that they do not or cannot or will not exist - they exist! They exist!
                              (M. Moorcock, "The Revenge Of The Rose")

Re: partial VACUUM FULL

From: Tom Lane
> Christopher Petrilli wrote:
>> Unfortunately, with some things, and I'm not sure why, as I don't
>> understand the VACUUM stuff that well, I had assumed that running VACUUM
>> ANALYZE nightly would be enough.  After I noticed that a specific
>> database (very transient data) had bloated to nearly 7Gb, I ran VACUUM
>> FULL on it, which took an hour or so, and it was reduced down to under 1GB.

Evidently, nightly is not enough.  How often are you turning over the
content of the database's largest tables?  If you want to keep the
wasted space to, say, 50%, then you need to vacuum about as often as
the application will update every row of the table once.  Then you have
at most one dead tuple for every live tuple.
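
As a back-of-the-envelope sketch of that rule (the numbers here are invented for illustration):

```python
# How often to run a plain VACUUM so that dead tuples never outnumber
# live ones, i.e. wasted space stays at or below 50%.
def max_vacuum_interval_hours(live_rows, updated_rows_per_hour):
    # One full turnover of the table per vacuum interval means at most
    # one dead tuple per live tuple by the time vacuum runs.
    return live_rows / updated_rows_per_hour

# A 1,000,000-row table whose application updates 250,000 rows per hour
# should be vacuumed at least every 4 hours.
interval = max_vacuum_interval_hours(1_000_000, 250_000)
print(interval)  # 4.0
```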

Bill Moran <wmoran@potentialtech.com> writes:
> vacuum's ability to clean things up has certain limitations.  One is that it
> can't vacuum dead tuples that are still locked up in a transaction.

Correct.  If you have clients that are sitting around holding open
transactions for very long periods (comparable to your inter-vacuum
interval) then you need to fix those clients.

> Another
> is that the setting of vacuum_mem (in postgresql.conf) limits the amount of
> cleanup that vacuum can do.

This is completely untrue.  Increasing vacuum_mem will likely make
things faster on large tables (by avoiding the need for multiple passes
over the indexes).  It will not change the end result though.

            regards, tom lane

Re: partial VACUUM FULL

From: Tom Lane
Frank Finner <postgresql@finner.de> writes:
> I also ran into another trap yesterday with a very frequently updated
> database (some big tables are dropped and copied from another database
> every few minutes): Some system tables became extremely large after
> about 10 weeks.

Use 7.4.

            regards, tom lane

Re: partial VACUUM FULL

From: Bill Moran
Tom Lane wrote:
>>Another
>>is that the setting of vacuum_mem (in postgresql.conf) limits the amount of
>>cleanup that vacuum can do.
>
> This is completely untrue.  Increasing vacuum_mem will likely make
> things faster on large tables (by avoiding the need for multiple passes
> over the indexes).  It will not change the end result though.

My mistake then.

Was this true for some previous version?  I could have sworn I read somewhere
that vacuum_mem had to be set high enough or vacuum wouldn't be able to clean
everything up (aside from anything locked in transactions).  Now that I'm
looking, I can't find any such reference, so perhaps I misunderstood and
twisted the meaning.

Is performance the only reason for increasing vacuum_mem?

--
Bill Moran
Potential Technologies
http://www.potentialtech.com


Re: partial VACUUM FULL

From: Tom Lane
Bill Moran <wmoran@potentialtech.com> writes:
> Was this true for some previous version?  I could have swore I read somewhere
> that vacuum_mem had to be set high enough or vacuum wouldn't be able to clean
> everything up (aside from anything locked in transactions).

Nope, never been the case.

> Is performance the only reason for increasing vacuum_mem?

Yes.

            regards, tom lane

Re: partial VACUUM FULL

From: Joe Conway
Tom Lane wrote:
> This is completely untrue.  Increasing vacuum_mem will likely make
> things faster on large tables (by avoiding the need for multiple passes
> over the indexes).  It will not change the end result though.

I can attest to that, based on very recent empirical evidence. On a 28
million row table, I saw something like 33% speed-up in going from 256MB
to 320MB for vacuum_mem.

Joe


Re: partial VACUUM FULL

From: Joseph Shraibman
Joe Conway wrote:
> Tom Lane wrote:
>
>> This is completely untrue.  Increasing vacuum_mem will likely make
>> things faster on large tables (by avoiding the need for multiple passes
>> over the indexes).  It will not change the end result though.
>
>
> I can attest to that, based on very recent empirical evidence. On a 28
> million row table, I saw something like 33% speed-up in going from 256MB
> to 320MB for vacuum_mem.
>
> Joe

When I finally ran the VACUUM FULL I set vacuum_mem to 1 gig, but it
never used more than 20 meg.  It took 40 minutes, btw.

Re: partial VACUUM FULL

From: "scott.marlowe"
On Tue, 23 Mar 2004, Tom Lane wrote:

> Bill Moran <wmoran@potentialtech.com> writes:
> > Was this true for some previous version?  I could have swore I read somewhere
> > that vacuum_mem had to be set high enough or vacuum wouldn't be able to clean
> > everything up (aside from anything locked in transactions).
>
> Nope, never been the case.
>
> > Is performance the only reason for increasing vacuum_mem?
>
> Yes.

Maybe Bill's thinking of the fsm settings, and the limit they place on how
much space regular vacuuming can make available for reuse?
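
That is, these postgresql.conf knobs (illustrative values, not recommendations):

```
# Free space map settings.  A plain VACUUM can only remember this many
# freed pages across the whole cluster; if turnover outruns the map,
# freed space goes unreused and the files keep growing until a
# VACUUM FULL (or a dump/reload).
max_fsm_relations = 1000
max_fsm_pages = 200000
```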



Re: partial VACUUM FULL

From: Bill Moran
scott.marlowe wrote:
> On Tue, 23 Mar 2004, Tom Lane wrote:
>
>>Bill Moran <wmoran@potentialtech.com> writes:
>>
>>>Was this true for some previous version?  I could have swore I read somewhere
>>>that vacuum_mem had to be set high enough or vacuum wouldn't be able to clean
>>>everything up (aside from anything locked in transactions).
>>
>>Nope, never been the case.
>>
>>>Is performance the only reason for increasing vacuum_mem?
>>
>>Yes.
>
> Maybe Bill's thinking of the fsm settings and regular vacuums and the
> limitations on how many tuples can be reclaimed by regular vacuuming being
> tied to fsm settings?

It's very likely that's where my confusion stemmed from, Scott.

Thanks for the pointer, I think I can clear up _all_ of my misunderstandings
now.

--
Bill Moran
Potential Technologies
http://www.potentialtech.com