Thread: Full Vacuum/Reindex vs autovacuum
I currently have Postgres 9.0 install after an upgrade. My database is relatively small, but complex. The dump is about 90MB. Every night when there is no activity I do a full vacuum, a reindex, and then dump a nightly backup. Is this optimal with regards to performance? autovacuum is set to the default. -- Thank you for your time, Jason Long CEO and Chief Software Engineer BS Physics, MS Chemical Engineering http://www.octgsoftware.com HJBug Founder and President http://www.hjbug.com
On 11/08/10 10:50 AM, Jason Long wrote: > I currently have Postgres 9.0 install after an upgrade. My database is > relatively small, but complex. The dump is about 90MB. > > Every night when there is no activity I do a full vacuum, a reindex, and > then dump a nightly backup. > > Is this optimal with regards to performance? autovacuum is set to the > default. if you have frequently updated tables that are accessed mostly from their primary key, it may pay to CLUSTER those tables on said index rather than doing the full vacuum. VACUUM FULL is usually not recommended, btw. Also, if you have tables that get lots of updates that only affect data and not indexed columns, setting a FILL FACTOR of, say, 70 or 80 (its in %) might help with performance by better facilitating HOT updates (HOT is a internal feature added to pg 8.3 to speed up these sorts of updates)
On Mon, Nov 8, 2010 at 11:50 AM, Jason Long <jason@octgsoftware.com> wrote: > I currently have Postgres 9.0 install after an upgrade. My database is > relatively small, but complex. The dump is about 90MB. > > Every night when there is no activity I do a full vacuum, a reindex, One question, why? > and then dump a nightly backup. Good idea. > Is this optimal with regards to performance? autovacuum is set to the > default. that depends very much on your answer to the question of why are you doing it and what you're trying to gain / work around with vacuum full / reindex every night.
On Mon, 2010-11-08 at 13:28 -0800, John R Pierce wrote: > On 11/08/10 10:50 AM, Jason Long wrote: > > I currently have Postgres 9.0 install after an upgrade. My database is > > relatively small, but complex. The dump is about 90MB. > > > > Every night when there is no activity I do a full vacuum, a reindex, and > > then dump a nightly backup. > > > > Is this optimal with regards to performance? autovacuum is set to the > > default. > > > if you have frequently updated tables that are accessed mostly from > their primary key, it may pay to CLUSTER those tables on said index > rather than doing the full vacuum. > > VACUUM FULL is usually not recommended, btw. > > Also, if you have tables that get lots of updates that only affect data > and not indexed columns, setting a FILL FACTOR of, say, 70 or 80 (its in > %) might help with performance by better facilitating HOT updates (HOT > is a internal feature added to pg 8.3 to speed up these sorts of updates) > > > Just so I understand, why is full vacuum not recommended?
On Mon, 2010-11-08 at 13:28 -0800, John R Pierce wrote: > On 11/08/10 10:50 AM, Jason Long wrote: > > I currently have Postgres 9.0 install after an upgrade. My database is > > relatively small, but complex. The dump is about 90MB. > > > > Every night when there is no activity I do a full vacuum, a reindex, and > > then dump a nightly backup. > > > > Is this optimal with regards to performance? autovacuum is set to the > > default. > > > if you have frequently updated tables that are accessed mostly from > their primary key, it may pay to CLUSTER those tables on said index > rather than doing the full vacuum. > > VACUUM FULL is usually not recommended, btw. > > Also, if you have tables that get lots of updates that only affect data > and not indexed columns, setting a FILL FACTOR of, say, 70 or 80 (its in > %) might help with performance by better facilitating HOT updates (HOT > is a internal feature added to pg 8.3 to speed up these sorts of updates) > > > Thanks for the tip on CLUSTER. My application has a couple hundred tables that all have an int8 for their primary key. They are joined heavily on their primary key from views and dynamically generated SQL. I am going to looking into clustering the most frequently updated tables. Thanks for the tip. Currently my performance problems are reads to display data.
On Mon, 2010-11-08 at 14:58 -0700, Scott Marlowe wrote: > On Mon, Nov 8, 2010 at 11:50 AM, Jason Long <jason@octgsoftware.com> wrote: > > I currently have Postgres 9.0 install after an upgrade. My database is > > relatively small, but complex. The dump is about 90MB. > > > > Every night when there is no activity I do a full vacuum, a reindex, > > One question, why? > > > and then dump a nightly backup. > > Good idea. > > > Is this optimal with regards to performance? autovacuum is set to the > > default. > > that depends very much on your answer to the question of why are you > doing it and what you're trying to gain / work around with vacuum full > / reindex every night. > Sorry I am not bumping this. I meant to send this to the list as well. I have been doing this for several years. Since my database is small and it takes little time to do a full vacuum. I am doing the reindex because I thought that was recommended after a full vacuum. As the data has grown the system is slowing down. Right now I am looking at ways to improve performance without getting into the queries themselves because I am swamped with new development. Is doing the full vacuum and reindex hurting or helping anything? Any other quick fixes that I can try? -- Thank you for your time, Jason Long CEO and Chief Software Engineer BS Physics, MS Chemical Engineering http://www.octgsoftware.com HJBug Founder and President http://www.hjbug.com
On Mon, Nov 8, 2010 at 3:42 PM, Jason Long <jason@octgsoftware.com> wrote: > On Mon, 2010-11-08 at 14:58 -0700, Scott Marlowe wrote: >> On Mon, Nov 8, 2010 at 11:50 AM, Jason Long <jason@octgsoftware.com> wrote: >> > I currently have Postgres 9.0 install after an upgrade. My database is >> > relatively small, but complex. The dump is about 90MB. >> > >> > Every night when there is no activity I do a full vacuum, a reindex, >> >> One question, why? >> >> > and then dump a nightly backup. >> >> Good idea. >> >> > Is this optimal with regards to performance? autovacuum is set to the >> > default. >> >> that depends very much on your answer to the question of why are you >> doing it and what you're trying to gain / work around with vacuum full >> / reindex every night. >> > > I have been doing this for several years. Since my database is small > and it takes little time to do a full vacuum. I am doing the reindex > because I thought that was recommended after a full vacuum. Definitely reindex after a full vacuum on previous versions (i.e. before 9.0) I think with 9.0 vacuum full is like a cluster without any reordering, so it likely doesn't need reindexing, but I've not played with 9.0 much yet. > As the data has grown the system is slowing down. Right now I am > looking at ways to improve performance without getting into the queries > themselves because I am swamped with new development. OK, so it's a standard maintenance procedure you've been doing for a while. That doesn't really explain why you started doing it, but I can guess that you had some bloat issues way back when and vacuum full fixed them, so doing it got kind of enshrined in the nightly maintenance. > Is doing the full vacuum and reindex hurting or helping anything? It might help a small amount if you've got regular usage patterns. If you routinely update whole tables over and over then it might be helping. > Any other quick fixes that I can try? Increasing work_mem, shared_buffers, changing random_page_cost and / or seq_page_cost. Log long running queries and run explain analyze on any that show up very often. But for real performance, you do often have to "get into the queries" because an inefficient query may be something you can cut down to 1/10000th the run time with a simple change, and often that change is impossible to make by tuning the db, only the query can be tuned. It might be something simple like you need to cast a type to match some other type. Hard to say without looking. When a 90Meg database is slow, it's almost always poorly written / non-optimized queries at the heart of it.
On Mon, 2010-11-08 at 16:23 -0700, Scott Marlowe wrote: > On Mon, Nov 8, 2010 at 3:42 PM, Jason Long <jason@octgsoftware.com> wrote: > > On Mon, 2010-11-08 at 14:58 -0700, Scott Marlowe wrote: > >> On Mon, Nov 8, 2010 at 11:50 AM, Jason Long <jason@octgsoftware.com> wrote: > >> > I currently have Postgres 9.0 install after an upgrade. My database is > >> > relatively small, but complex. The dump is about 90MB. > >> > > >> > Every night when there is no activity I do a full vacuum, a reindex, > >> > >> One question, why? > >> > >> > and then dump a nightly backup. > >> > >> Good idea. > >> > >> > Is this optimal with regards to performance? autovacuum is set to the > >> > default. > >> > >> that depends very much on your answer to the question of why are you > >> doing it and what you're trying to gain / work around with vacuum full > >> / reindex every night. > >> > > > > I have been doing this for several years. Since my database is small > > and it takes little time to do a full vacuum. I am doing the reindex > > because I thought that was recommended after a full vacuum. > > Definitely reindex after a full vacuum on previous versions (i.e. > before 9.0) I think with 9.0 vacuum full is like a cluster without any > reordering, so it likely doesn't need reindexing, but I've not played > with 9.0 much yet. > > > As the data has grown the system is slowing down. Right now I am > > looking at ways to improve performance without getting into the queries > > themselves because I am swamped with new development. > > OK, so it's a standard maintenance procedure you've been doing for a > while. That doesn't really explain why you started doing it, but I > can guess that you had some bloat issues way back when and vacuum full > fixed them, so doing it got kind of enshrined in the nightly > maintenance. Exactly. > > > Is doing the full vacuum and reindex hurting or helping anything? > > It might help a small amount if you've got regular usage patterns. If > you routinely update whole tables over and over then it might be > helping. I rarely update whole tables. > > > Any other quick fixes that I can try? > > Increasing work_mem, shared_buffers, changing random_page_cost and / > or seq_page_cost. I did up those at one point, but saw little improvement. I will reinvestigate. > > Log long running queries and run explain analyze on any that show up very often. > > But for real performance, you do often have to "get into the queries" > because an inefficient query may be something you can cut down to > 1/10000th the run time with a simple change, and often that change is > impossible to make by tuning the db, only the query can be tuned. It > might be something simple like you need to cast a type to match some > other type. Hard to say without looking. > > When a 90Meg database is slow, it's almost always poorly written / > non-optimized queries at the heart of it. I have no doubt that poorly written and non-optimized queries are at the heart of it. Stupid developer I'll have to fire that lazy bastard... Oh wait that's me. ;) I am going to start using auto_explain and logging long running queries. Also time to learn how to read query plans. So far I have gotten by by throwing faster hardware at the problem.
On Mon, Nov 8, 2010 at 4:41 PM, Jason Long <mailing.lists@octgsoftware.com> wrote: > On Mon, 2010-11-08 at 16:23 -0700, Scott Marlowe wrote: >> On Mon, Nov 8, 2010 at 3:42 PM, Jason Long <jason@octgsoftware.com> wrote: >> > On Mon, 2010-11-08 at 14:58 -0700, Scott Marlowe wrote: >> >> On Mon, Nov 8, 2010 at 11:50 AM, Jason Long <jason@octgsoftware.com> wrote: >> >> > I currently have Postgres 9.0 install after an upgrade. My database is >> >> > relatively small, but complex. The dump is about 90MB. >> >> > >> >> > Every night when there is no activity I do a full vacuum, a reindex, >> >> >> >> One question, why? >> >> >> >> > and then dump a nightly backup. >> >> >> >> Good idea. >> >> >> >> > Is this optimal with regards to performance? autovacuum is set to the >> >> > default. >> >> >> >> that depends very much on your answer to the question of why are you >> >> doing it and what you're trying to gain / work around with vacuum full >> >> / reindex every night. >> >> >> > >> > I have been doing this for several years. Since my database is small >> > and it takes little time to do a full vacuum. I am doing the reindex >> > because I thought that was recommended after a full vacuum. >> >> Definitely reindex after a full vacuum on previous versions (i.e. >> before 9.0) I think with 9.0 vacuum full is like a cluster without any >> reordering, so it likely doesn't need reindexing, but I've not played >> with 9.0 much yet. >> >> > As the data has grown the system is slowing down. Right now I am >> > looking at ways to improve performance without getting into the queries >> > themselves because I am swamped with new development. >> >> OK, so it's a standard maintenance procedure you've been doing for a >> while. That doesn't really explain why you started doing it, but I >> can guess that you had some bloat issues way back when and vacuum full >> fixed them, so doing it got kind of enshrined in the nightly >> maintenance. > Exactly. >> >> > Is doing the full vacuum and reindex hurting or helping anything? >> >> It might help a small amount if you've got regular usage patterns. If >> you routinely update whole tables over and over then it might be >> helping. > I rarely update whole tables. >> >> > Any other quick fixes that I can try? >> >> Increasing work_mem, shared_buffers, changing random_page_cost and / >> or seq_page_cost. > I did up those at one point, but saw little improvement. I will > reinvestigate. >> >> Log long running queries and run explain analyze on any that show up very often. >> >> But for real performance, you do often have to "get into the queries" >> because an inefficient query may be something you can cut down to >> 1/10000th the run time with a simple change, and often that change is >> impossible to make by tuning the db, only the query can be tuned. It >> might be something simple like you need to cast a type to match some >> other type. Hard to say without looking. >> >> When a 90Meg database is slow, it's almost always poorly written / >> non-optimized queries at the heart of it. > > I have no doubt that poorly written and non-optimized queries are at the > heart of it. Stupid developer I'll have to fire that lazy bastard... Oh > wait that's me. ;) Yeah, I've got one of those bastards where I work too. :) > I am going to start using auto_explain and logging long running queries. > Also time to learn how to read query plans. So far I have gotten by by > throwing faster hardware at the problem. Faster hardware, sadly only gets you so far. For help with explain, start here: http://explain.depesz.com/
On Mon, 2010-11-08 at 16:23 -0700, Scott Marlowe wrote: > On Mon, Nov 8, 2010 at 3:42 PM, Jason Long <jason@octgsoftware.com> wrote: > > On Mon, 2010-11-08 at 14:58 -0700, Scott Marlowe wrote: > >> On Mon, Nov 8, 2010 at 11:50 AM, Jason Long <jason@octgsoftware.com> wrote: > >> > I currently have Postgres 9.0 install after an upgrade. My database is > >> > relatively small, but complex. The dump is about 90MB. > >> > > >> > Every night when there is no activity I do a full vacuum, a reindex, > >> > >> One question, why? > >> > >> > and then dump a nightly backup. > >> > >> Good idea. > >> > >> > Is this optimal with regards to performance? autovacuum is set to the > >> > default. > >> > >> that depends very much on your answer to the question of why are you > >> doing it and what you're trying to gain / work around with vacuum full > >> / reindex every night. > >> > > > > I have been doing this for several years. Since my database is small > > and it takes little time to do a full vacuum. I am doing the reindex > > because I thought that was recommended after a full vacuum. > > Definitely reindex after a full vacuum on previous versions (i.e. > before 9.0) I think with 9.0 vacuum full is like a cluster without any > reordering, so it likely doesn't need reindexing, but I've not played > with 9.0 much yet. > > > As the data has grown the system is slowing down. Right now I am > > looking at ways to improve performance without getting into the queries > > themselves because I am swamped with new development. > > OK, so it's a standard maintenance procedure you've been doing for a > while. That doesn't really explain why you started doing it, but I > can guess that you had some bloat issues way back when and vacuum full > fixed them, so doing it got kind of enshrined in the nightly > maintenance. > > > Is doing the full vacuum and reindex hurting or helping anything? > > It might help a small amount if you've got regular usage patterns. If > you routinely update whole tables over and over then it might be > helping. > > > Any other quick fixes that I can try? > > Increasing work_mem, shared_buffers, changing random_page_cost and / > or seq_page_cost. > > Log long running queries and run explain analyze on any that show up very often. > > But for real performance, you do often have to "get into the queries" > because an inefficient query may be something you can cut down to > 1/10000th the run time with a simple change, and often that change is > impossible to make by tuning the db, only the query can be tuned. It > might be something simple like you need to cast a type to match some > other type. Hard to say without looking. > > When a 90Meg database is slow, it's almost always poorly written / > non-optimized queries at the heart of it. > I stopped doing the nightly vacuum full and reindex. After 3 months some queries would not complete within 2 minutes. Normally these take less than 5 seconds. I tried vacuum without full and reindex, but the problem was still there. Only vacuum full and reindex returned performance to normal. Now I am back to my previous nightly full vacuum and reindex. Any suggestions? -- Thank you for your time, Jason Long CEO and Chief Software Engineer BS Physics, MS Chemical Engineering http://www.octgsoftware.com HJBug Founder and President http://www.hjbug.com
On Mon, 2010-11-08 at 16:23 -0700, Scott Marlowe wrote: > On Mon, Nov 8, 2010 at 3:42 PM, Jason Long <jason@octgsoftware.com> wrote: > > On Mon, 2010-11-08 at 14:58 -0700, Scott Marlowe wrote: > >> On Mon, Nov 8, 2010 at 11:50 AM, Jason Long <jason@octgsoftware.com> wrote: > >> > I currently have Postgres 9.0 install after an upgrade. My database is > >> > relatively small, but complex. The dump is about 90MB. > >> > > >> > Every night when there is no activity I do a full vacuum, a reindex, > >> > >> One question, why? > >> > >> > and then dump a nightly backup. > >> > >> Good idea. > >> > >> > Is this optimal with regards to performance? autovacuum is set to the > >> > default. > >> > >> that depends very much on your answer to the question of why are you > >> doing it and what you're trying to gain / work around with vacuum full > >> / reindex every night. > >> > > > > I have been doing this for several years. Since my database is small > > and it takes little time to do a full vacuum. I am doing the reindex > > because I thought that was recommended after a full vacuum. > > Definitely reindex after a full vacuum on previous versions (i.e. > before 9.0) I think with 9.0 vacuum full is like a cluster without any > reordering, so it likely doesn't need reindexing, but I've not played > with 9.0 much yet. > > > As the data has grown the system is slowing down. Right now I am > > looking at ways to improve performance without getting into the queries > > themselves because I am swamped with new development. > > OK, so it's a standard maintenance procedure you've been doing for a > while. That doesn't really explain why you started doing it, but I > can guess that you had some bloat issues way back when and vacuum full > fixed them, so doing it got kind of enshrined in the nightly > maintenance. > > > Is doing the full vacuum and reindex hurting or helping anything? > > It might help a small amount if you've got regular usage patterns. If > you routinely update whole tables over and over then it might be > helping. > > > Any other quick fixes that I can try? > > Increasing work_mem, shared_buffers, changing random_page_cost and / > or seq_page_cost. > > Log long running queries and run explain analyze on any that show up very often. > > But for real performance, you do often have to "get into the queries" > because an inefficient query may be something you can cut down to > 1/10000th the run time with a simple change, and often that change is > impossible to make by tuning the db, only the query can be tuned. It > might be something simple like you need to cast a type to match some > other type. Hard to say without looking. > > When a 90Meg database is slow, it's almost always poorly written / > non-optimized queries at the heart of it. > I stopped doing the nightly vacuum full and reindex. After 3 months some queries would not complete within 2 minutes. Normally these take less than 5 seconds. I tried vacuum without full and reindex, but the problem was still there. Only vacuum full and reindex returned performance to normal. Now I am back to my previous nightly full vacuum and reindex. Any suggestions? -- Thank you for your time, Jason Long CEO and Chief Software Engineer BS Physics, MS Chemical Engineering http://www.octgsoftware.com HJBug Founder and President http://www.hjbug.com
On Mon, Feb 28, 2011 at 01:48:30PM -0600, Jason Long wrote: > I stopped doing the nightly vacuum full and reindex. After 3 months > some queries would not complete within 2 minutes. Normally these take > less than 5 seconds. I tried vacuum without full and reindex, but the > problem was still there. Only vacuum full and reindex returned > performance to normal. Now I am back to my previous nightly full vacuum > and reindex. Are you logging autovacuum? Is it actually doing what it needs to? I haven't had any experience with autovacuum under 9.x, and I certainly won't say that there's no problem there, but I have a suspicion that your settings are such that either the full vacuum or (at least as likely) the reindex is having an effect that autovacuum ought to be catching. The first thing I'd do is log all the vacuums so that you can see how much vacuum full is recovering. I suggest this only so as not to disrupt your regular operations; otherwise, I'd suggest going back to autovacuum and seeing whether reindex alone would help you. A -- Andrew Sullivan ajs@crankycanuck.ca