Thread: Full Vacuum/Reindex vs autovacuum

Full Vacuum/Reindex vs autovacuum

From
Jason Long
Date:
I currently have Postgres 9.0 install after an upgrade.  My database is
relatively small, but complex.  The dump is about 90MB.

Every night when there is no activity I do a full vacuum, a reindex, and
then dump a nightly backup.

Is this optimal with regards to performance?  autovacuum is set to the
default.

--
Thank you for your time,

Jason Long
CEO and Chief Software Engineer
BS Physics, MS Chemical Engineering
http://www.octgsoftware.com
HJBug Founder and President
http://www.hjbug.com


Re: Full Vacuum/Reindex vs autovacuum

From
John R Pierce
Date:
On 11/08/10 10:50 AM, Jason Long wrote:
> I currently have Postgres 9.0 install after an upgrade.  My database is
> relatively small, but complex.  The dump is about 90MB.
>
> Every night when there is no activity I do a full vacuum, a reindex, and
> then dump a nightly backup.
>
> Is this optimal with regards to performance?  autovacuum is set to the
> default.


if you have frequently updated tables that are accessed mostly from
their primary key, it may pay to CLUSTER those tables on said index
rather than doing the full vacuum.

VACUUM FULL is usually not recommended, btw.

Also, if you have tables that get lots of updates that only affect data
and not indexed columns, setting a FILL FACTOR of, say, 70 or 80 (its in
%) might help with performance by better facilitating HOT updates (HOT
is a internal feature added to pg 8.3 to speed up these sorts of updates)



Re: Full Vacuum/Reindex vs autovacuum

From
Scott Marlowe
Date:
On Mon, Nov 8, 2010 at 11:50 AM, Jason Long <jason@octgsoftware.com> wrote:
> I currently have Postgres 9.0 install after an upgrade.  My database is
> relatively small, but complex.  The dump is about 90MB.
>
> Every night when there is no activity I do a full vacuum, a reindex,

One question, why?

> and then dump a nightly backup.

Good idea.

> Is this optimal with regards to performance?  autovacuum is set to the
> default.

that depends very much on your answer to the question of why are you
doing it and what you're trying to gain / work around with vacuum full
/ reindex every night.

Re: Full Vacuum/Reindex vs autovacuum

From
Jason Long
Date:
On Mon, 2010-11-08 at 13:28 -0800, John R Pierce wrote:
> On 11/08/10 10:50 AM, Jason Long wrote:
> > I currently have Postgres 9.0 install after an upgrade.  My database is
> > relatively small, but complex.  The dump is about 90MB.
> >
> > Every night when there is no activity I do a full vacuum, a reindex, and
> > then dump a nightly backup.
> >
> > Is this optimal with regards to performance?  autovacuum is set to the
> > default.
>
>
> if you have frequently updated tables that are accessed mostly from
> their primary key, it may pay to CLUSTER those tables on said index
> rather than doing the full vacuum.
>
> VACUUM FULL is usually not recommended, btw.
>
> Also, if you have tables that get lots of updates that only affect data
> and not indexed columns, setting a FILL FACTOR of, say, 70 or 80 (its in
> %) might help with performance by better facilitating HOT updates (HOT
> is a internal feature added to pg 8.3 to speed up these sorts of updates)
>
>
>

Just so I understand, why is full vacuum not recommended?



Re: Full Vacuum/Reindex vs autovacuum

From
Jason Long
Date:
On Mon, 2010-11-08 at 13:28 -0800, John R Pierce wrote:
> On 11/08/10 10:50 AM, Jason Long wrote:
> > I currently have Postgres 9.0 install after an upgrade.  My database is
> > relatively small, but complex.  The dump is about 90MB.
> >
> > Every night when there is no activity I do a full vacuum, a reindex, and
> > then dump a nightly backup.
> >
> > Is this optimal with regards to performance?  autovacuum is set to the
> > default.
>
>
> if you have frequently updated tables that are accessed mostly from
> their primary key, it may pay to CLUSTER those tables on said index
> rather than doing the full vacuum.
>
> VACUUM FULL is usually not recommended, btw.
>
> Also, if you have tables that get lots of updates that only affect data
> and not indexed columns, setting a FILL FACTOR of, say, 70 or 80 (its in
> %) might help with performance by better facilitating HOT updates (HOT
> is a internal feature added to pg 8.3 to speed up these sorts of updates)
>
>
>

Thanks for the tip on CLUSTER.

My application has a couple hundred tables that all have an int8 for
their primary key.  They are joined heavily on their primary key from
views and dynamically generated SQL.  I am going to looking into
clustering the most frequently updated tables.  Thanks for the tip.

Currently my performance problems are reads to display data.


Re: Full Vacuum/Reindex vs autovacuum

From
Jason Long
Date:
On Mon, 2010-11-08 at 14:58 -0700, Scott Marlowe wrote:
> On Mon, Nov 8, 2010 at 11:50 AM, Jason Long <jason@octgsoftware.com> wrote:
> > I currently have Postgres 9.0 install after an upgrade.  My database is
> > relatively small, but complex.  The dump is about 90MB.
> >
> > Every night when there is no activity I do a full vacuum, a reindex,
>
> One question, why?
>
> > and then dump a nightly backup.
>
> Good idea.
>
> > Is this optimal with regards to performance?  autovacuum is set to the
> > default.
>
> that depends very much on your answer to the question of why are you
> doing it and what you're trying to gain / work around with vacuum full
> / reindex every night.
>

Sorry I am not bumping this.  I meant to send this to the list as well.

I have been doing this for several years.  Since my database is small
and it takes little time to do a full vacuum.  I am doing the reindex
because I thought that was recommended after a full vacuum.

As the data has grown the system is slowing down.  Right now I am
looking at ways to improve performance without getting into the queries
themselves because I am swamped with new development.

Is doing the full vacuum and reindex hurting or helping anything?

Any other quick fixes that I can try?

--
Thank you for your time,

Jason Long
CEO and Chief Software Engineer
BS Physics, MS Chemical Engineering
http://www.octgsoftware.com
HJBug Founder and President
http://www.hjbug.com


Re: Full Vacuum/Reindex vs autovacuum

From
Scott Marlowe
Date:
On Mon, Nov 8, 2010 at 3:42 PM, Jason Long <jason@octgsoftware.com> wrote:
> On Mon, 2010-11-08 at 14:58 -0700, Scott Marlowe wrote:
>> On Mon, Nov 8, 2010 at 11:50 AM, Jason Long <jason@octgsoftware.com> wrote:
>> > I currently have Postgres 9.0 install after an upgrade.  My database is
>> > relatively small, but complex.  The dump is about 90MB.
>> >
>> > Every night when there is no activity I do a full vacuum, a reindex,
>>
>> One question, why?
>>
>> > and then dump a nightly backup.
>>
>> Good idea.
>>
>> > Is this optimal with regards to performance?  autovacuum is set to the
>> > default.
>>
>> that depends very much on your answer to the question of why are you
>> doing it and what you're trying to gain / work around with vacuum full
>> / reindex every night.
>>
>
> I have been doing this for several years.  Since my database is small
> and it takes little time to do a full vacuum.  I am doing the reindex
> because I thought that was recommended after a full vacuum.

Definitely reindex after a full vacuum on previous versions (i.e.
before 9.0) I think with 9.0 vacuum full is like a cluster without any
reordering, so it likely doesn't need reindexing, but I've not played
with 9.0 much yet.

> As the data has grown the system is slowing down.  Right now I am
> looking at ways to improve performance without getting into the queries
> themselves because I am swamped with new development.

OK, so it's a standard maintenance procedure you've been doing for a
while.  That doesn't really explain why you started doing it, but I
can guess that you had some bloat issues way back when and vacuum full
fixed them, so doing it got kind of enshrined in the nightly
maintenance.

> Is doing the full vacuum and reindex hurting or helping anything?

It might help a small amount if you've got regular usage patterns.  If
you routinely update whole tables over and over then it might be
helping.

> Any other quick fixes that I can try?

Increasing work_mem, shared_buffers, changing random_page_cost and /
or seq_page_cost.

Log long running queries and run explain analyze on any that show up very often.

But for real performance, you do often have to "get into the queries"
because an inefficient query may be something you can cut down to
1/10000th the run time with a simple change, and often that change is
impossible to make by tuning the db, only the query can be tuned.  It
might be something simple like you need to cast a type to match some
other type.  Hard to say without looking.

When a 90Meg database is slow, it's almost always poorly written /
non-optimized queries at the heart of it.

Re: Full Vacuum/Reindex vs autovacuum

From
Jason Long
Date:
On Mon, 2010-11-08 at 16:23 -0700, Scott Marlowe wrote:
> On Mon, Nov 8, 2010 at 3:42 PM, Jason Long <jason@octgsoftware.com> wrote:
> > On Mon, 2010-11-08 at 14:58 -0700, Scott Marlowe wrote:
> >> On Mon, Nov 8, 2010 at 11:50 AM, Jason Long <jason@octgsoftware.com> wrote:
> >> > I currently have Postgres 9.0 install after an upgrade.  My database is
> >> > relatively small, but complex.  The dump is about 90MB.
> >> >
> >> > Every night when there is no activity I do a full vacuum, a reindex,
> >>
> >> One question, why?
> >>
> >> > and then dump a nightly backup.
> >>
> >> Good idea.
> >>
> >> > Is this optimal with regards to performance?  autovacuum is set to the
> >> > default.
> >>
> >> that depends very much on your answer to the question of why are you
> >> doing it and what you're trying to gain / work around with vacuum full
> >> / reindex every night.
> >>
> >
> > I have been doing this for several years.  Since my database is small
> > and it takes little time to do a full vacuum.  I am doing the reindex
> > because I thought that was recommended after a full vacuum.
>
> Definitely reindex after a full vacuum on previous versions (i.e.
> before 9.0) I think with 9.0 vacuum full is like a cluster without any
> reordering, so it likely doesn't need reindexing, but I've not played
> with 9.0 much yet.
>
> > As the data has grown the system is slowing down.  Right now I am
> > looking at ways to improve performance without getting into the queries
> > themselves because I am swamped with new development.
>
> OK, so it's a standard maintenance procedure you've been doing for a
> while.  That doesn't really explain why you started doing it, but I
> can guess that you had some bloat issues way back when and vacuum full
> fixed them, so doing it got kind of enshrined in the nightly
> maintenance.
Exactly.
>
> > Is doing the full vacuum and reindex hurting or helping anything?
>
> It might help a small amount if you've got regular usage patterns.  If
> you routinely update whole tables over and over then it might be
> helping.
I rarely update whole tables.
>
> > Any other quick fixes that I can try?
>
> Increasing work_mem, shared_buffers, changing random_page_cost and /
> or seq_page_cost.
I did up those at one point, but saw little improvement.  I will
reinvestigate.
>
> Log long running queries and run explain analyze on any that show up very often.
>
> But for real performance, you do often have to "get into the queries"
> because an inefficient query may be something you can cut down to
> 1/10000th the run time with a simple change, and often that change is
> impossible to make by tuning the db, only the query can be tuned.  It
> might be something simple like you need to cast a type to match some
> other type.  Hard to say without looking.
>
> When a 90Meg database is slow, it's almost always poorly written /
> non-optimized queries at the heart of it.

I have no doubt that poorly written and non-optimized queries are at the
heart of it.  Stupid developer I'll have to fire that lazy bastard... Oh
wait that's me. ;)

I am going to start using auto_explain and logging long running queries.
Also time to learn how to read query plans.  So far I have gotten by by
throwing faster hardware at the problem.



Re: Full Vacuum/Reindex vs autovacuum

From
Scott Marlowe
Date:
On Mon, Nov 8, 2010 at 4:41 PM, Jason Long
<mailing.lists@octgsoftware.com> wrote:
> On Mon, 2010-11-08 at 16:23 -0700, Scott Marlowe wrote:
>> On Mon, Nov 8, 2010 at 3:42 PM, Jason Long <jason@octgsoftware.com> wrote:
>> > On Mon, 2010-11-08 at 14:58 -0700, Scott Marlowe wrote:
>> >> On Mon, Nov 8, 2010 at 11:50 AM, Jason Long <jason@octgsoftware.com> wrote:
>> >> > I currently have Postgres 9.0 install after an upgrade.  My database is
>> >> > relatively small, but complex.  The dump is about 90MB.
>> >> >
>> >> > Every night when there is no activity I do a full vacuum, a reindex,
>> >>
>> >> One question, why?
>> >>
>> >> > and then dump a nightly backup.
>> >>
>> >> Good idea.
>> >>
>> >> > Is this optimal with regards to performance?  autovacuum is set to the
>> >> > default.
>> >>
>> >> that depends very much on your answer to the question of why are you
>> >> doing it and what you're trying to gain / work around with vacuum full
>> >> / reindex every night.
>> >>
>> >
>> > I have been doing this for several years.  Since my database is small
>> > and it takes little time to do a full vacuum.  I am doing the reindex
>> > because I thought that was recommended after a full vacuum.
>>
>> Definitely reindex after a full vacuum on previous versions (i.e.
>> before 9.0) I think with 9.0 vacuum full is like a cluster without any
>> reordering, so it likely doesn't need reindexing, but I've not played
>> with 9.0 much yet.
>>
>> > As the data has grown the system is slowing down.  Right now I am
>> > looking at ways to improve performance without getting into the queries
>> > themselves because I am swamped with new development.
>>
>> OK, so it's a standard maintenance procedure you've been doing for a
>> while.  That doesn't really explain why you started doing it, but I
>> can guess that you had some bloat issues way back when and vacuum full
>> fixed them, so doing it got kind of enshrined in the nightly
>> maintenance.
> Exactly.
>>
>> > Is doing the full vacuum and reindex hurting or helping anything?
>>
>> It might help a small amount if you've got regular usage patterns.  If
>> you routinely update whole tables over and over then it might be
>> helping.
> I rarely update whole tables.
>>
>> > Any other quick fixes that I can try?
>>
>> Increasing work_mem, shared_buffers, changing random_page_cost and /
>> or seq_page_cost.
> I did up those at one point, but saw little improvement.  I will
> reinvestigate.
>>
>> Log long running queries and run explain analyze on any that show up very often.
>>
>> But for real performance, you do often have to "get into the queries"
>> because an inefficient query may be something you can cut down to
>> 1/10000th the run time with a simple change, and often that change is
>> impossible to make by tuning the db, only the query can be tuned.  It
>> might be something simple like you need to cast a type to match some
>> other type.  Hard to say without looking.
>>
>> When a 90Meg database is slow, it's almost always poorly written /
>> non-optimized queries at the heart of it.
>
> I have no doubt that poorly written and non-optimized queries are at the
> heart of it.  Stupid developer I'll have to fire that lazy bastard... Oh
> wait that's me. ;)

Yeah, I've got one of those bastards where I work too.  :)

> I am going to start using auto_explain and logging long running queries.
> Also time to learn how to read query plans.  So far I have gotten by by
> throwing faster hardware at the problem.

Faster hardware, sadly only gets you so far.

For help with explain, start here:
http://explain.depesz.com/

Re: Full Vacuum/Reindex vs autovacuum

From
Jason Long
Date:
On Mon, 2010-11-08 at 16:23 -0700, Scott Marlowe wrote:
> On Mon, Nov 8, 2010 at 3:42 PM, Jason Long <jason@octgsoftware.com> wrote:
> > On Mon, 2010-11-08 at 14:58 -0700, Scott Marlowe wrote:
> >> On Mon, Nov 8, 2010 at 11:50 AM, Jason Long <jason@octgsoftware.com> wrote:
> >> > I currently have Postgres 9.0 install after an upgrade.  My database is
> >> > relatively small, but complex.  The dump is about 90MB.
> >> >
> >> > Every night when there is no activity I do a full vacuum, a reindex,
> >>
> >> One question, why?
> >>
> >> > and then dump a nightly backup.
> >>
> >> Good idea.
> >>
> >> > Is this optimal with regards to performance?  autovacuum is set to the
> >> > default.
> >>
> >> that depends very much on your answer to the question of why are you
> >> doing it and what you're trying to gain / work around with vacuum full
> >> / reindex every night.
> >>
> >
> > I have been doing this for several years.  Since my database is small
> > and it takes little time to do a full vacuum.  I am doing the reindex
> > because I thought that was recommended after a full vacuum.
>
> Definitely reindex after a full vacuum on previous versions (i.e.
> before 9.0) I think with 9.0 vacuum full is like a cluster without any
> reordering, so it likely doesn't need reindexing, but I've not played
> with 9.0 much yet.
>
> > As the data has grown the system is slowing down.  Right now I am
> > looking at ways to improve performance without getting into the queries
> > themselves because I am swamped with new development.
>
> OK, so it's a standard maintenance procedure you've been doing for a
> while.  That doesn't really explain why you started doing it, but I
> can guess that you had some bloat issues way back when and vacuum full
> fixed them, so doing it got kind of enshrined in the nightly
> maintenance.
>
> > Is doing the full vacuum and reindex hurting or helping anything?
>
> It might help a small amount if you've got regular usage patterns.  If
> you routinely update whole tables over and over then it might be
> helping.
>
> > Any other quick fixes that I can try?
>
> Increasing work_mem, shared_buffers, changing random_page_cost and /
> or seq_page_cost.
>
> Log long running queries and run explain analyze on any that show up very often.
>
> But for real performance, you do often have to "get into the queries"
> because an inefficient query may be something you can cut down to
> 1/10000th the run time with a simple change, and often that change is
> impossible to make by tuning the db, only the query can be tuned.  It
> might be something simple like you need to cast a type to match some
> other type.  Hard to say without looking.
>
> When a 90Meg database is slow, it's almost always poorly written /
> non-optimized queries at the heart of it.
>

I stopped doing the nightly vacuum full and reindex.  After 3 months
some queries would not complete within 2 minutes.  Normally these take
less than 5 seconds.  I tried vacuum without full and reindex, but the
problem was still there.  Only vacuum full and reindex returned
performance to normal.  Now I am back to my previous nightly full vacuum
and reindex.

Any suggestions?

--
Thank you for your time,

Jason Long
CEO and Chief Software Engineer
BS Physics, MS Chemical Engineering
http://www.octgsoftware.com
HJBug Founder and President
http://www.hjbug.com


Re: Full Vacuum/Reindex vs autovacuum

From
Jason Long
Date:
On Mon, 2010-11-08 at 16:23 -0700, Scott Marlowe wrote:
> On Mon, Nov 8, 2010 at 3:42 PM, Jason Long <jason@octgsoftware.com> wrote:
> > On Mon, 2010-11-08 at 14:58 -0700, Scott Marlowe wrote:
> >> On Mon, Nov 8, 2010 at 11:50 AM, Jason Long <jason@octgsoftware.com> wrote:
> >> > I currently have Postgres 9.0 install after an upgrade.  My database is
> >> > relatively small, but complex.  The dump is about 90MB.
> >> >
> >> > Every night when there is no activity I do a full vacuum, a reindex,
> >>
> >> One question, why?
> >>
> >> > and then dump a nightly backup.
> >>
> >> Good idea.
> >>
> >> > Is this optimal with regards to performance?  autovacuum is set to the
> >> > default.
> >>
> >> that depends very much on your answer to the question of why are you
> >> doing it and what you're trying to gain / work around with vacuum full
> >> / reindex every night.
> >>
> >
> > I have been doing this for several years.  Since my database is small
> > and it takes little time to do a full vacuum.  I am doing the reindex
> > because I thought that was recommended after a full vacuum.
>
> Definitely reindex after a full vacuum on previous versions (i.e.
> before 9.0) I think with 9.0 vacuum full is like a cluster without any
> reordering, so it likely doesn't need reindexing, but I've not played
> with 9.0 much yet.
>
> > As the data has grown the system is slowing down.  Right now I am
> > looking at ways to improve performance without getting into the queries
> > themselves because I am swamped with new development.
>
> OK, so it's a standard maintenance procedure you've been doing for a
> while.  That doesn't really explain why you started doing it, but I
> can guess that you had some bloat issues way back when and vacuum full
> fixed them, so doing it got kind of enshrined in the nightly
> maintenance.
>
> > Is doing the full vacuum and reindex hurting or helping anything?
>
> It might help a small amount if you've got regular usage patterns.  If
> you routinely update whole tables over and over then it might be
> helping.
>
> > Any other quick fixes that I can try?
>
> Increasing work_mem, shared_buffers, changing random_page_cost and /
> or seq_page_cost.
>
> Log long running queries and run explain analyze on any that show up very often.
>
> But for real performance, you do often have to "get into the queries"
> because an inefficient query may be something you can cut down to
> 1/10000th the run time with a simple change, and often that change is
> impossible to make by tuning the db, only the query can be tuned.  It
> might be something simple like you need to cast a type to match some
> other type.  Hard to say without looking.
>
> When a 90Meg database is slow, it's almost always poorly written /
> non-optimized queries at the heart of it.
>

I stopped doing the nightly vacuum full and reindex.  After 3 months
some queries would not complete within 2 minutes.  Normally these take
less than 5 seconds.  I tried vacuum without full and reindex, but the
problem was still there.  Only vacuum full and reindex returned
performance to normal.  Now I am back to my previous nightly full vacuum
and reindex.

Any suggestions?

--
Thank you for your time,

Jason Long
CEO and Chief Software Engineer
BS Physics, MS Chemical Engineering
http://www.octgsoftware.com
HJBug Founder and President
http://www.hjbug.com


Re: Full Vacuum/Reindex vs autovacuum

From
Andrew Sullivan
Date:
On Mon, Feb 28, 2011 at 01:48:30PM -0600, Jason Long wrote:

> I stopped doing the nightly vacuum full and reindex.  After 3 months
> some queries would not complete within 2 minutes.  Normally these take
> less than 5 seconds.  I tried vacuum without full and reindex, but the
> problem was still there.  Only vacuum full and reindex returned
> performance to normal.  Now I am back to my previous nightly full vacuum
> and reindex.

Are you logging autovacuum?  Is it actually doing what it needs to?  I
haven't had any experience with autovacuum under 9.x, and I certainly
won't say that there's no problem there, but I have a suspicion that
your settings are such that either the full vacuum or (at least as
likely) the reindex is having an effect that autovacuum ought to be
catching.

The first thing I'd do is log all the vacuums so that you can see how
much vacuum full is recovering.  I suggest this only so as not to
disrupt your regular operations; otherwise, I'd suggest going back to
autovacuum and seeing whether reindex alone would help you.

A


--
Andrew Sullivan
ajs@crankycanuck.ca