Thread: autovacuum

autovacuum

From
Robert Fitzpatrick
Date:
I have a large database used with our mail filter. The pg_dumpall
results in about 3GB with this being the only database in the system
besides templates and postgres.

I do a vacuum every night after backup and it takes about an hour, is
this normal for this size db? I also have autovacuum enabled and when it
is running during the day, our mail queues will tend to fill up with
slow response to the server. Should I have autovacuum on even if I am
vacuuming the db every night?

Let me know if you need more specifics. Just trying to get some feedback
on if my vacuum is taking too long or if both are necessary...thanks for
the help!

--
Robert


Re: autovacuum

From
Bill Moran
Date:
In response to Robert Fitzpatrick <lists@webtent.net>:

> I have a large database used with our mail filter. The pg_dumpall
> results in about 3GB with this being the only database in the system
> besides templates and postgres.
>
> I do a vacuum every night after backup and it takes about an hour, is
> this normal for this size db?

"normal" is relative.  If it's taking an hour to vacuum 3G, I would say
that either your hardware is undersized/badly configured, or you're
not vacuuming often enough.

> I also have autovacuum enabled and when it
> is running during the day, our mail queues will tend to fill up with
> slow response to the server. Should I have autovacuum on even if I am
> vacuuming the db every night?

I'm not aware of any problems with autovaccum and scheduled vacuum
working together.

That doesn't mean you're vacuuming often enough, however.  Switch your
nightly vacuum to vacuum verbose and capture the output to see how much
work it has to do.  Are your fsm settings high enough?

> Let me know if you need more specifics. Just trying to get some feedback
> on if my vacuum is taking too long or if both are necessary...thanks for
> the help!

How much RAM does the system have?  What's your shared_buffer settings?
What's your maintenance_work_mem set to?

--
Bill Moran
http://www.potentialtech.com

Re: autovacuum

From
Robert Fitzpatrick
Date:
On Thu, 2007-09-20 at 13:22 -0400, Bill Moran wrote:
> In response to Robert Fitzpatrick <lists@webtent.net>:
>
> > I have a large database used with our mail filter. The pg_dumpall
> > results in about 3GB with this being the only database in the system
> > besides templates and postgres.
> >
> > I do a vacuum every night after backup and it takes about an hour, is
> > this normal for this size db?
>
> "normal" is relative.  If it's taking an hour to vacuum 3G, I would say
> that either your hardware is undersized/badly configured, or you're
> not vacuuming often enough.

It is a dual P4 processor supermicro server with 2GB of RAM, so I will
need to go over the configuration then? I didn't think it should take so
long...

> That doesn't mean you're vacuuming often enough, however.  Switch your
> nightly vacuum to vacuum verbose and capture the output to see how much
> work it has to do.  Are your fsm settings high enough?
>
> > Let me know if you need more specifics. Just trying to get some feedback
> > on if my vacuum is taking too long or if both are necessary...thanks for
> > the help!
>
> How much RAM does the system have?  What's your shared_buffer settings?
> What's your maintenance_work_mem set to?

Yes, this is the first time I've had to do any tuning to pgsql, so I
most likely need help in this area. This is 8.2.4 on a FreeBSD 6.2
server...here are those settings currently below. I also had to tweak
BSD loader.conf to allow the changes to work...

max_connections = 250
max_fsm_pages = 204800
shared_buffers = 128MB
effective_cache_size = 256MB
work_mem = 64MB
maintenance_work_mem = 256MB

mx1# cat /etc/loader.conf
kern.ipc.semmni=256
kern.ipc.semmns=512
kern.ipc.semmnu=256
mx1# cat /etc/sysctl.conf
# tuning for PostgreSQL
kern.ipc.shm_use_phys=1
kern.ipc.shmmax=1073741824
kern.ipc.shmall=262144
kern.ipc.semmsl=512
kern.ipc.semmap=256

If I don't have it listed above, then it is default settings for
anything else.

Thanks for the help!

>
--
Robert


Re: autovacuum

From
Bill Moran
Date:
In response to Robert Fitzpatrick <lists@webtent.net>:

> On Thu, 2007-09-20 at 13:22 -0400, Bill Moran wrote:
> > That doesn't mean you're vacuuming often enough, however.  Switch your
> > nightly vacuum to vacuum verbose and capture the output to see how much
> > work it has to do.  Are your fsm settings high enough?

<snip>

> INFO:  free space map contains 30078 pages in 41 relations
> DETAIL:  A total of 30304 page slots are in use (including overhead).
> 30304 page slots are required to track all free space.
> Current limits are:  204800 page slots, 1000 relations, using 1265 kB.

This was what I was most concerned about.  If your FSM settings are too
low, vacuum won't be able to fully do its job.  But it looks like you're
OK here.

--
Bill Moran
http://www.potentialtech.com

Re: autovacuum

From
Bill Moran
Date:
In response to Robert Fitzpatrick <lists@webtent.net>:

> On Thu, 2007-09-20 at 13:22 -0400, Bill Moran wrote:
> > In response to Robert Fitzpatrick <lists@webtent.net>:
> >
> > > I have a large database used with our mail filter. The pg_dumpall
> > > results in about 3GB with this being the only database in the system
> > > besides templates and postgres.
> > >
> > > I do a vacuum every night after backup and it takes about an hour, is
> > > this normal for this size db?
> >
> > "normal" is relative.  If it's taking an hour to vacuum 3G, I would say
> > that either your hardware is undersized/badly configured, or you're
> > not vacuuming often enough.
>
> It is a dual P4 processor supermicro server with 2GB of RAM, so I will
> need to go over the configuration then? I didn't think it should take so
> long...

Why does everyone leave of the IO subsystem?  It's almost as if many
people don't realize that disks exist ...

With 2G of RAM, and a DB that's about 3G, then there's at least a G of
database data _not_ in memory at any time.  As a result, disk speed is
important, and _could_ be part of your problem.  You're not using RAID
5 are you?

> > > Let me know if you need more specifics. Just trying to get some feedback
> > > on if my vacuum is taking too long or if both are necessary...thanks for
> > > the help!
> >
> > How much RAM does the system have?  What's your shared_buffer settings?
> > What's your maintenance_work_mem set to?
>
> Yes, this is the first time I've had to do any tuning to pgsql, so I
> most likely need help in this area. This is 8.2.4 on a FreeBSD 6.2
> server...here are those settings currently below. I also had to tweak
> BSD loader.conf to allow the changes to work...
>
> max_connections = 250
> max_fsm_pages = 204800
> shared_buffers = 128MB

Unless this machine runs programs other than PostgreSQL, raise this to
about 650MB.  You might get better performance from even higher values.
The rule of thumb is allocate 1/4 - 1/3 of the available RAM to
shared_buffers ... subtract the RAM that other programs are using first.

> effective_cache_size = 256MB

More like 1300MB (again, unless this machine is doing other things)

> work_mem = 64MB
> maintenance_work_mem = 256MB
>
> mx1# cat /etc/loader.conf
> kern.ipc.semmni=256
> kern.ipc.semmns=512
> kern.ipc.semmnu=256
> mx1# cat /etc/sysctl.conf
> # tuning for PostgreSQL
> kern.ipc.shm_use_phys=1
> kern.ipc.shmmax=1073741824
> kern.ipc.shmall=262144
> kern.ipc.semmsl=512
> kern.ipc.semmap=256
>
> If I don't have it listed above, then it is default settings for
> anything else.

Watch the system during vacuum to see if it's blocking on IO or CPU.
systat, vmstat, iostat, and top (use 'm' to switch views) are all
good utilities to check on this.

Another possibility is that autovac isn't configured correctly.  Watch
your PostgreSQL logs to see if it's running at all.  If it is, turn up
the logging level until it tells you which tables it's vacuuming.  You
may have to tweak the thresholds to make it more aggressive.

--
Bill Moran
http://www.potentialtech.com

Re: autovacuum

From
"Joshua D. Drake"
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Bill Moran wrote:
> In response to Robert Fitzpatrick <lists@webtent.net>:
>
>> On Thu, 2007-09-20 at 13:22 -0400, Bill Moran wrote:
>>> In response to Robert Fitzpatrick <lists@webtent.net>:
>>>
>>>> I have a large database used with our mail filter. The pg_dumpall
>>>> results in about 3GB with this being the only database in the system
>>>> besides templates and postgres.
>>>>
>>>> I do a vacuum every night after backup and it takes about an hour, is
>>>> this normal for this size db?
>>> "normal" is relative.  If it's taking an hour to vacuum 3G, I would say
>>> that either your hardware is undersized/badly configured, or you're
>>> not vacuuming often enough.
>> It is a dual P4 processor supermicro server with 2GB of RAM, so I will
>> need to go over the configuration then? I didn't think it should take so
>> long...
>
> Why does everyone leave of the IO subsystem?  It's almost as if many
> people don't realize that disks exist ...

I have disks?


- --

      === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
            UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFG8txpATb/zqfZUUQRAlgRAKCUhKoTaTFjYy5fdRktrCuAfWTdlwCgo1wF
Hrg1Dwk+TYxTbWeQEKW2LYY=
=JDqc
-----END PGP SIGNATURE-----

Re: autovacuum

From
Robert Fitzpatrick
Date:
On Thu, 2007-09-20 at 16:38 -0400, Bill Moran wrote:
> In response to Robert Fitzpatrick <lists@webtent.net>:
> Why does everyone leave of the IO subsystem?  It's almost as if many
> people don't realize that disks exist ...
>
> With 2G of RAM, and a DB that's about 3G, then there's at least a G of
> database data _not_ in memory at any time.  As a result, disk speed is
> important, and _could_ be part of your problem.  You're not using RAID
> 5 are you?

Yes, using RAID 5, not good? RAID 5 with hot fix total of 4 drives. All
SATA 80GB drives giving me little under 300GB to work with.

Also, my nightly backup does a pg_dump of the one database and vacuums
only that database as there are no other except template#'s. Then it
does a pg_dumpall. Now, I noticed that we have the -dD flags on
pg_dumpall, not sure why, I took them off. But the strange thing I am
finding is while my one database using a 'pg_dump -F c' only comes out
at 930MB while the pg_dumpall results in 3GB, is that due to the use of
INSERTS by using -dD?

> > max_connections = 250
> > max_fsm_pages = 204800
> > shared_buffers = 128MB
>
> Unless this machine runs programs other than PostgreSQL, raise this to
> about 650MB.  You might get better performance from even higher values.
> The rule of thumb is allocate 1/4 - 1/3 of the available RAM to
> shared_buffers ... subtract the RAM that other programs are using first.

Yes, it runs a few other things like Postfix+amavisd-maia+SA+clamAV, but
low priority MX so it gets little unless the primary is not responding.
Other than that, I use it to run the web GUI (php) for this amavisd-maia
mail server where users can view spam/ham caches. Can I determine the
amount of memory everything else is running by stopping postgres and
look in top to see what is being used?

Thanks for the other pointers...!

--
Robert


Re: autovacuum

From
Erik Jones
Date:
On Sep 20, 2007, at 3:47 PM, Joshua D. Drake wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Bill Moran wrote:
>> In response to Robert Fitzpatrick <lists@webtent.net>:
>>
>>> On Thu, 2007-09-20 at 13:22 -0400, Bill Moran wrote:
>>>> In response to Robert Fitzpatrick <lists@webtent.net>:
>>>>
>>>>> I have a large database used with our mail filter. The pg_dumpall
>>>>> results in about 3GB with this being the only database in the
>>>>> system
>>>>> besides templates and postgres.
>>>>>
>>>>> I do a vacuum every night after backup and it takes about an
>>>>> hour, is
>>>>> this normal for this size db?
>>>> "normal" is relative.  If it's taking an hour to vacuum 3G, I
>>>> would say
>>>> that either your hardware is undersized/badly configured, or you're
>>>> not vacuuming often enough.
>>> It is a dual P4 processor supermicro server with 2GB of RAM, so I
>>> will
>>> need to go over the configuration then? I didn't think it should
>>> take so
>>> long...
>>
>> Why does everyone leave of the IO subsystem?  It's almost as if many
>> people don't realize that disks exist ...
>
> I have disks?

You ARE have disks ;)


Erik Jones

Software Developer | Emma®
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com



Re: autovacuum

From
"Scott Marlowe"
Date:
On 9/20/07, Robert Fitzpatrick <lists@webtent.net> wrote:
> On Thu, 2007-09-20 at 16:38 -0400, Bill Moran wrote:
> > In response to Robert Fitzpatrick <lists@webtent.net>:
> > Why does everyone leave of the IO subsystem?  It's almost as if many
> > people don't realize that disks exist ...
> >
> > With 2G of RAM, and a DB that's about 3G, then there's at least a G of
> > database data _not_ in memory at any time.  As a result, disk speed is
> > important, and _could_ be part of your problem.  You're not using RAID
> > 5 are you?
>
> Yes, using RAID 5, not good? RAID 5 with hot fix total of 4 drives. All
> SATA 80GB drives giving me little under 300GB to work with.

RAID5 optimizes for space, not performance or reliability.  It gets
faster but less reliable as it gets bigger.  If you can afford the
space RAID-10 is generally preferred.

Note however that it is far more important for most general purpose
servers to have a RAID controller that is both fast by nature (i.e.
not $50.00) and has battery backed cache with write thru turned on.

RAID5 on a fast controller with battery backed cache is ok.  But I've
seen software RAID-10 outrun it for certain loads...

Re: autovacuum

From
Gregory Stark
Date:
"Scott Marlowe" <scott.marlowe@gmail.com> writes:

> RAID5 optimizes for space, not performance or reliability.  It gets
> faster but less reliable as it gets bigger.  If you can afford the
> space RAID-10 is generally preferred.

RAID5 can be faster for DSS style work loads. If you're writing data to the
raid in large contiguous chunks then it you get higher bandwidth than RAID1+0.

The problem with RAID5 is that if you're writing random access chunks then
it's even slower than not having a raid at all.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

Re: autovacuum

From
Magnus Hagander
Date:
On Thu, Sep 20, 2007 at 04:33:25PM -0500, Scott Marlowe wrote:
> On 9/20/07, Robert Fitzpatrick <lists@webtent.net> wrote:
> > On Thu, 2007-09-20 at 16:38 -0400, Bill Moran wrote:
> > > In response to Robert Fitzpatrick <lists@webtent.net>:
> > > Why does everyone leave of the IO subsystem?  It's almost as if many
> > > people don't realize that disks exist ...
> > >
> > > With 2G of RAM, and a DB that's about 3G, then there's at least a G of
> > > database data _not_ in memory at any time.  As a result, disk speed is
> > > important, and _could_ be part of your problem.  You're not using RAID
> > > 5 are you?
> >
> > Yes, using RAID 5, not good? RAID 5 with hot fix total of 4 drives. All
> > SATA 80GB drives giving me little under 300GB to work with.
>
> RAID5 optimizes for space, not performance or reliability.  It gets
> faster but less reliable as it gets bigger.  If you can afford the
> space RAID-10 is generally preferred.
>
> Note however that it is far more important for most general purpose
> servers to have a RAID controller that is both fast by nature (i.e.
> not $50.00) and has battery backed cache with write thru turned on.

Surely you mean with write thru turned *off*... Or write-back turned on.
But write thru turned on will make your battery unnecessary...

//Magnus

Re: autovacuum

From
"Scott Marlowe"
Date:
On 9/21/07, Magnus Hagander <magnus@hagander.net> wrote:
> On Thu, Sep 20, 2007 at 04:33:25PM -0500, Scott Marlowe wrote:
> > On 9/20/07, Robert Fitzpatrick <lists@webtent.net> wrote:
> > > On Thu, 2007-09-20 at 16:38 -0400, Bill Moran wrote:
> > > > In response to Robert Fitzpatrick <lists@webtent.net>:
> > > > Why does everyone leave of the IO subsystem?  It's almost as if many
> > > > people don't realize that disks exist ...
> > > >
> > > > With 2G of RAM, and a DB that's about 3G, then there's at least a G of
> > > > database data _not_ in memory at any time.  As a result, disk speed is
> > > > important, and _could_ be part of your problem.  You're not using RAID
> > > > 5 are you?
> > >
> > > Yes, using RAID 5, not good? RAID 5 with hot fix total of 4 drives. All
> > > SATA 80GB drives giving me little under 300GB to work with.
> >
> > RAID5 optimizes for space, not performance or reliability.  It gets
> > faster but less reliable as it gets bigger.  If you can afford the
> > space RAID-10 is generally preferred.
> >
> > Note however that it is far more important for most general purpose
> > servers to have a RAID controller that is both fast by nature (i.e.
> > not $50.00) and has battery backed cache with write thru turned on.
>
> Surely you mean with write thru turned *off*... Or write-back turned on.
> But write thru turned on will make your battery unnecessary...

Yeah, I meant write back turned on...