Thread: VACUUM FULL on 24/7 server

VACUUM FULL on 24/7 server

From
Aleksey Serba
Date:
   Hello!

   I have 24/7 production server under high load.
   I need to perform vacuum full on several tables to recover disk
   space / memory  usage frequently ( the server must be online during
vacuum time )
   The one trick that i see is to try to vacuum duplicate of
   production database ( or just some tables ).
   But there are some pitfalls:
       http://www.postgresql.org/docs/7.4/interactive/backup-file.html
       "If you have dug into the details of the file system layout of
       the data you may be tempted to try to back up or restore only
       certain individual tables or databases from their respective
       files or directories. This will not work because the
       information contained in these files contains only half the
       truth. The other half is in the commit log files pg_clog/*,
       which contain the commit status of all transactions. A table
       file is only usable with this information. Of course it is also
       impossible to restore only a table and the associated pg_clog
       data because that would render all other tables in the database
       cluster useless."

    Any thoughts?

    Thanks in advance,
           Aleksey

Re: VACUUM FULL on 24/7 server

From
Tom Lane
Date:
Aleksey Serba <aserba@gmail.com> writes:
>    I have 24/7 production server under high load.
>    I need to perform vacuum full on several tables to recover disk
>    space / memory  usage frequently ( the server must be online during
> vacuum time )

Don't use VACUUM FULL; plain VACUUM should be enough if you are managing
the database properly (for instance, make sure the FSM settings are high
enough).

            regards, tom lane

Re: VACUUM FULL on 24/7 server

From
Christopher Browne
Date:
aserba@gmail.com (Aleksey Serba) wrote:
>    Hello!
>
>    I have 24/7 production server under high load.
>    I need to perform vacuum full on several tables to recover disk
>    space / memory  usage frequently ( the server must be online during
>    vacuum time )

The main thought is: "Don't do that."

It is almost certainly the wrong idea to do a VACUUM FULL.

Assuming that the tables in question aren't so large that they cause
mass eviction of buffers, it should suffice to do a plain VACUUM (and
NOT a "VACUUM FULL") on the tables in question quite frequently.

If the tables _are_ rather large, then you might want to look into the
"lazier" vacuum patch which causes the vacuum process to sleep once in
a while so as to diminish the "eviction" effects.  A version of this
was proposed for 7.3, and I know Jan Wieck has had a patch for 7.4.
--
let name="cbbrowne" and tld="acm.org" in name ^ "@" ^ tld;;
http://linuxfinances.info/info/spiritual.html
I am not a number!
I am a free man!

Re: VACUUM FULL on 24/7 server

From
Gaetano Mendola
Date:
Christopher Browne wrote:
 > aserba@gmail.com (Aleksey Serba) wrote:
 >
 >>   Hello!
 >>
 >>   I have 24/7 production server under high load.
 >>   I need to perform vacuum full on several tables to recover disk
 >>   space / memory  usage frequently ( the server must be online during
 >>   vacuum time )
 >
 >
 > The main thought is: "Don't do that."
 >
 > It is almost certainly the wrong idea to do a VACUUM FULL.
 >
 > Assuming that the tables in question aren't so large that they cause
 > mass eviction of buffers, it should suffice to do a plain VACUUM (and
 > NOT a "VACUUM FULL") on the tables in question quite frequently.

This is easy to say and almost impraticable. I run a 7.4.5 with the autovacuum:

pg_autovacuum -d 3 -v 300 -V 0.5 -S 0.8 -a 200 -A 0.8

I have also a "vacuumdb -z -v -a"  running each six hours and if i don't execute
a vacuum FULL for one weeks I collect almost 400 MB of dead rows :-(
For this reason even with a 7.4.5 I'm obliged to run a vacuum full at least once
a week and a reindex once a month.

And my FSM parameters are large enough:

INFO:  free space map: 141 relations, 26787 pages stored; 26032 total pages needed
DETAIL:  Allocated FSM size: 1000 relations + 2000000 pages = 11780 kB shared memory.


Regards
Gaetano Mendola

PS: I do not have any "idle in transaction" connections around.





Re: VACUUM FULL on 24/7 server

From
Tom Lane
Date:
Gaetano Mendola <mendola@bigfoot.com> writes:
> Christopher Browne wrote:
>>> Assuming that the tables in question aren't so large that they cause
>>> mass eviction of buffers, it should suffice to do a plain VACUUM (and
>>> NOT a "VACUUM FULL") on the tables in question quite frequently.

> This is easy to say and almost impraticable. I run a 7.4.5 with the autovacuum:

> pg_autovacuum -d 3 -v 300 -V 0.5 -S 0.8 -a 200 -A 0.8

I'm not very familiar at all with appropriate settings for autovacuum,
but doesn't the above say to vacuum a table only when the dead space
reaches 50%?  That seems awfully lax to me.  I've always thought one
should vacuum often enough to keep dead space to maybe 10 to 25%.

            regards, tom lane

Re: VACUUM FULL on 24/7 server

From
Gaetano Mendola
Date:
Tom Lane wrote:
> Gaetano Mendola <mendola@bigfoot.com> writes:
>
>>Christopher Browne wrote:
>>
>>>>Assuming that the tables in question aren't so large that they cause
>>>>mass eviction of buffers, it should suffice to do a plain VACUUM (and
>>>>NOT a "VACUUM FULL") on the tables in question quite frequently.
>
>
>>This is easy to say and almost impraticable. I run a 7.4.5 with the autovacuum:
>
>
>>pg_autovacuum -d 3 -v 300 -V 0.5 -S 0.8 -a 200 -A 0.8
>
>
> I'm not very familiar at all with appropriate settings for autovacuum,
> but doesn't the above say to vacuum a table only when the dead space
> reaches 50%?  That seems awfully lax to me.  I've always thought one
> should vacuum often enough to keep dead space to maybe 10 to 25%.

The problem is that I can not set these value per table and per database
so, I had to find some compromise, however I will test in the next days
what happen with -V 0.2

However each six hour I perform a vacuum on all database and the HD space
continue to grow even with FSM parameters large enough.

I'll post in a couple of day about the new settings.

Regards
Gaetano Mendola





Re: VACUUM FULL on 24/7 server

From
"Matthew T. O'Connor"
Date:
On Sun, 2004-10-03 at 21:01, Gaetano Mendola wrote:
> Tom Lane wrote:
> > Gaetano Mendola <mendola@bigfoot.com> writes:
> >
> >>Christopher Browne wrote:
> >>pg_autovacuum -d 3 -v 300 -V 0.5 -S 0.8 -a 200 -A 0.8
> >
> > I'm not very familiar at all with appropriate settings for
autovacuum,
> > but doesn't the above say to vacuum a table only when the dead space
> > reaches 50%?  That seems awfully lax to me.  I've always thought one
> > should vacuum often enough to keep dead space to maybe 10 to 25%.

Yes that is what those options say.  The default values are even more
lax.  I wasn't sure how best to set them, I erred on the conservative
side.

> The problem is that I can not set these value per table and per
database
> so, I had to find some compromise, however I will test in the next
days
> what happen with -V 0.2
>
> However each six hour I perform a vacuum on all database and the HD
space
> continue to grow even with FSM parameters large enough.

Since you are running autovacuum I doubt the doing vacuumdb -a -z is 3
times a day buying you much.  It's not a bad idea to do once in a while.

Given the way Postgres works, it is normal to have slack space in your
tables.  The real question is do your table stop growing?  At some point
you should reach a stead state where you have some percentage of slack
space that stops growing.

You said that after running for a week you have 400M of reclaimable
space.  Is that a problem?  If you don't do a vacuum full for two weeks
is it still 400M?  My guess is most of the 400M is created in the first
few hours (perhaps days) after running your vacuum full.

Matthew


Re: VACUUM FULL on 24/7 server

From
Gaetano Mendola
Date:
Matthew T. O'Connor wrote:
> On Sun, 2004-10-03 at 21:01, Gaetano Mendola wrote:
>
>>Tom Lane wrote:
>>
>>>Gaetano Mendola <mendola@bigfoot.com> writes:
>>>
>>>
>>>>Christopher Browne wrote:
>>>>pg_autovacuum -d 3 -v 300 -V 0.5 -S 0.8 -a 200 -A 0.8
>>>
>>>I'm not very familiar at all with appropriate settings for
>
> autovacuum,
>
>>>but doesn't the above say to vacuum a table only when the dead space
>>>reaches 50%?  That seems awfully lax to me.  I've always thought one
>>>should vacuum often enough to keep dead space to maybe 10 to 25%.
>
>
> Yes that is what those options say.  The default values are even more
> lax.  I wasn't sure how best to set them, I erred on the conservative
> side.
>
>
>>The problem is that I can not set these value per table and per
>
> database
>
>>so, I had to find some compromise, however I will test in the next
>
> days
>
>>what happen with -V 0.2
>>
>>However each six hour I perform a vacuum on all database and the HD
>
> space
>
>>continue to grow even with FSM parameters large enough.
>
>
> Since you are running autovacuum I doubt the doing vacuumdb -a -z is 3
> times a day buying you much.  It's not a bad idea to do once in a while.

The reason is that I have few tables of about 5 milion with ~ 10000 insert per
day. Even with setting  -v 300 -V 0.1 this means these tables will be analyzed
each 50 days. So I  have to force it.


Regards
Gaetano Mendola



















> Given the way Postgres works, it is normal to have slack space in your
> tables.  The real question is do your table stop growing?  At some point
> you should reach a stead state where you have some percentage of slack
> space that stops growing.
>
> You said that after running for a week you have 400M of reclaimable
> space.  Is that a problem?  If you don't do a vacuum full for two weeks
> is it still 400M?  My guess is most of the 400M is created in the first
> few hours (perhaps days) after running your vacuum full.
>
> Matthew
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
>       subscribe-nomail command to majordomo@postgresql.org so that your
>       message can get through to the mailing list cleanly
>

Re: VACUUM FULL on 24/7 server

From
"Matthew T. O'Connor"
Date:
Gaetano Mendola wrote:

> Matthew T. O'Connor wrote:
>
>> Since you are running autovacuum I doubt the doing vacuumdb -a -z is 3
>> times a day buying you much.  It's not a bad idea to do once in a while.
>
>
> The reason is that I have few tables of about 5 milion with ~ 10000
> insert per
> day. Even with setting  -v 300 -V 0.1 this means these tables will be
> analyzed
> each 50 days. So I  have to force it.

I understand, but 10,000 new rows increate your table size only 0.2%, so
it won't significantly effect anything.  Also, if they really are just
inserts then vacuum is totally unnecessary.  I agree that for these
situations pg_autovacuum should be supplemented by vacuumdb -a -z every
once in a while, all I was pointing out was that 3 times a day is
probably excessive.

Matthew


Re: VACUUM FULL on 24/7 server

From
Gaetano Mendola
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Matthew T. O'Connor wrote:
| Gaetano Mendola wrote:
|
|> Matthew T. O'Connor wrote:
|>
|>> Since you are running autovacuum I doubt the doing vacuumdb -a -z is 3
|>> times a day buying you much.  It's not a bad idea to do once in a while.
|>
|>
|>
|> The reason is that I have few tables of about 5 milion with ~ 10000
|> insert per
|> day. Even with setting  -v 300 -V 0.1 this means these tables will be
|> analyzed
|> each 50 days. So I  have to force it.
|
|
| I understand, but 10,000 new rows increate your table size only 0.2%, so
| it won't significantly effect anything.  Also, if they really are just
| inserts then vacuum is totally unnecessary.  I agree that for these
| situations pg_autovacuum should be supplemented by vacuumdb -a -z every
| once in a while, all I was pointing out was that 3 times a day is
| probably excessive.

Right, but the table collect logs, so is mandatory have the statistics up-to-date
in order to obtain index scan for queries that are involving the last 24 hours.
For the vacuum vs the analyze I do vacuum because other tables are not in this
category of "only update" so instead of write tons of line in my crontab I prefer
only one line.



Regards
Gaetano Mendola






-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBYYm77UpzwH2SGd4RAmilAJ98skWgiKI7mqOgYIgigzgpLe0JpQCfRm8/
IPXFZwZVcdJP0RQCE1fPXpw=
=CExm
-----END PGP SIGNATURE-----


Re: VACUUM FULL on 24/7 server

From
"Matthew T. O'Connor"
Date:
On Sun, 2004-10-03 at 21:01, Gaetano Mendola wrote:
> Tom Lane wrote:
> > Gaetano Mendola <mendola@bigfoot.com> writes:
> >
> >>Christopher Browne wrote:
> >>pg_autovacuum -d 3 -v 300 -V 0.5 -S 0.8 -a 200 -A 0.8
> >
> > I'm not very familiar at all with appropriate settings for autovacuum,
> > but doesn't the above say to vacuum a table only when the dead space
> > reaches 50%?  That seems awfully lax to me.  I've always thought one
> > should vacuum often enough to keep dead space to maybe 10 to 25%.

Yes that is what those options say.  The default values are even more
lax.  I wasn't sure how best to set them, I erred on the conservative
side.

> The problem is that I can not set these value per table and per database
> so, I had to find some compromise, however I will test in the next days
> what happen with -V 0.2
>
> However each six hour I perform a vacuum on all database and the HD space
> continue to grow even with FSM parameters large enough.

Since you are running autovacuum I doubt the doing vacuumdb -a -z is 3
times a day buying you much.  It's not a bad idea to do once in a while.

Given the way Postgres works, it is normal to have slack space in your
tables.  The real question is do your table stop growing?  At some point
you should reach a stead state where you have some percentage of slack
space that stops growing.

You said that after running for a week you have 400M of reclaimable
space.  Is that a problem?  If you don't do a vacuum full for two weeks
is it still 400M?  My guess is most of the 400M is created in the first
few hours (perhaps days) after running your vacuum full.

Matthew



Re: VACUUM FULL on 24/7 server

From
Gaetano Mendola
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Matthew T. O'Connor wrote:
| Gaetano Mendola wrote:
|
|> Matthew T. O'Connor wrote:
|>
|>> Since you are running autovacuum I doubt the doing vacuumdb -a -z is 3
|>> times a day buying you much.  It's not a bad idea to do once in a while.
|>
|>
|>
|> The reason is that I have few tables of about 5 milion with ~ 10000
|> insert per
|> day. Even with setting  -v 300 -V 0.1 this means these tables will be
|> analyzed
|> each 50 days. So I  have to force it.
|
|
| I understand, but 10,000 new rows increate your table size only 0.2%, so
| it won't significantly effect anything.  Also, if they really are just
| inserts then vacuum is totally unnecessary.  I agree that for these
| situations pg_autovacuum should be supplemented by vacuumdb -a -z every
| once in a while, all I was pointing out was that 3 times a day is
| probably excessive.

Right, but the table collect logs, so is mandatory have the statistics up-to-date
in order to obtain index scan for queries that are involving the last 24 hours.
For the vacuum vs the analyze I do vacuum because other tables are not in this
category of "only update" so instead of write tons of line in my crontab I prefer
only one line.



Regards
Gaetano Mendola





-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBYYiV7UpzwH2SGd4RAiglAKDpPvuhEZSabwXVBJzcCezH8fYIJQCfW9UY
CUkHUisIuJsbJHU5W0g0FhI=
=u6Xz
-----END PGP SIGNATURE-----