Thread: slowness what only full vacuum can solve
Hello guys,
We are running Postgres 8.3 on our production servers on Amazon EC2.
We have a reoccurring problem of slowness initially in every 2 months, after enabling autovacuum every 6 months what only full vacuum can solve. It's kind of a problem as it requires 2hrs downtime and we want to avoid that. What we are doing now is moving to Postgres 9.1 as a desperate measure, but we would like to understand better the root cause of the problem.
If you've seen similar before or have an idea about what's going on here we would really appreciate that insight.
Please see bellow the details.
Thanks guys.
Laszlo
Details:
We have a table with 2-3 million rows, with 30 fields with type varchar, text and number. The table is heavily updated and read too. A super simple query like the following takes unacceptable long time to run. 4-15 sec compared to the under a second run when Postgres works well. We tried index rebuild, more disks, more memory, but nothing helped, only full vacuum. It seems to me that Postgres can't keep internal things in order to operate nicely. Also I see extreme high IO load on the server when this problem occurs. And again, after a full vacuum, things are perfect.
You can see the explain output here:
explain analyze select count(comment0_.id) as col_0_0_ from Comment comment0_ where comment0_.sourceId='xxx' limit 2;
Before full vacuum:
'Limit (cost=14446.76..14446.77 rows=1 width=38) (actual time=4664.694..4664.696 rows=1 loops=1)'
' -> Aggregate (cost=14446.76..14446.77 rows=1 width=38) (actual time=4664.691..4664.692 rows=1 loops=1)'
' -> Bitmap Heap Scan on comment comment0_ (cost=96.07..14436.60 rows=4060 width=38) (actual time=9.135..4645.847rows=18468 loops=1)'
' Recheck Cond: ((sourceid)::text = 'xxx'::text)'
' -> Bitmap Index Scan on comment_sourceid_idx (cost=0.00..95.05 rows=4060 width=0) (actual time=6.424..6.424 rows=18468 loops=1)'
' Index Cond: ((sourceid)::text = 'xxx'::text)'
'Total runtime: 4664.766 ms'
After full vacuum:
'Limit (cost=11877.89..11877.90 rows=1 width=38) (actual time=344.031..344.034 rows=1 loops=1)'
' -> Aggregate (cost=11877.89..11877.90 rows=1 width=38) (actual time=344.024..344.024 rows=1 loops=1)'
' -> Bitmap Heap Scan on comment comment0_ (cost=87.46..11869.24 rows=3461 width=38) (actual time=9.391..326.931 rows=18513 loops=1)'
' Recheck Cond: ((sourceid)::text = 'xxx'::text)'
' -> Bitmap Index Scan on comment_sourceid_idx (cost=0.00..86.59 rows=3461 width=0) (actual time=6.749..6.749 rows=18555 loops=1)'
' Index Cond: ((sourceid)::text = 'xxx'::text)'
'Total runtime: 344.464 ms'
Laszlo Fogas <laszlo@falconsocial.com> wrote: > We have a reoccurring problem of slowness initially in every 2 > months, after enabling autovacuum every 6 months what only full > vacuum can solve. It's kind of a problem as it requires 2hrs > downtime and we want to avoid that. What we are doing now is > moving to Postgres 9.1 as a desperate measure, but we would like > to understand better the root cause of the problem. Your autovacuum is not configured to be aggressive enough. The upgrade should help, as autovacuum has been improved in recent releases, but you may need to increase autovacuum_max_workers, decrease autovacuum_naptime, etc. There are a lot of knobs, and I often find I need to make autovacuum a bit more aggressive than the defaults to keep things in good shape. -Kevin
Sorry, I wasn't clear.
Autovacuum runs with default setting, I believe it's daily, or whenever it feels like.
When autovacuum was disabled, we had this problem once every 2 months.
With autovacuum enabled, we had this problem once in every six month. It seems autovacuum could only delay the event, but not prevent it.
On Fri, Aug 10, 2012 at 3:49 PM, François Beausoleil <francois@teksol.info> wrote:
Le 2012-08-10 à 07:53, Laszlo Fogas a écrit :> We are running Postgres 8.3 on our production servers on Amazon EC2.The usual solution is to run autovacuum *more* frequently, not less. It's not perfectly clear, but you say "after enabling autovacuum every 6 months". If that's the case, then it's much too long. autovacuum should be running hourly, if not more often.
>
> We have a reoccurring problem of slowness initially in every 2 months, after enabling autovacuum every 6 months what only full vacuum can solve. It's kind of a problem as it requires 2hrs downtime and we want to avoid that. What we are doing now is moving to Postgres 9.1 as a desperate measure, but we would like to understand better the root cause of the problem.
http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server#autovacuum_max_fsm_pages.2C_max_fsm_relations talks about running more frequently.
Hope that helps!
François Beausoleil
On Fri, 10 Aug 2012 16:14:54 +0200 Laszlo Fogas <laszlo@falconsocial.com> wrote: > Sorry, I wasn't clear. > > Autovacuum runs with default setting, I believe it's daily, or whenever it > feels like. > > When autovacuum was disabled, we had this problem once every 2 months. > > With autovacuum enabled, we had this problem once in every six month. It > seems autovacuum could only delay the event, but not prevent it. The first steps are still the same: change the autovacuum settings to be more aggressive. > On Fri, Aug 10, 2012 at 3:49 PM, François Beausoleil > <francois@teksol.info>wrote: > > > > > Le 2012-08-10 à 07:53, Laszlo Fogas a écrit : > > > We are running Postgres 8.3 on our production servers on Amazon EC2. > > > > > > We have a reoccurring problem of slowness initially in every 2 months, > > after enabling autovacuum every 6 months what only full vacuum can solve. > > It's kind of a problem as it requires 2hrs downtime and we want to avoid > > that. What we are doing now is moving to Postgres 9.1 as a desperate > > measure, but we would like to understand better the root cause of the > > problem. > > > > The usual solution is to run autovacuum *more* frequently, not less. It's > > not perfectly clear, but you say "after enabling autovacuum every 6 > > months". If that's the case, then it's much too long. autovacuum should be > > running hourly, if not more often. > > > > > > http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server#autovacuum_max_fsm_pages.2C_max_fsm_relationstalks aboutrunning more frequently. > > > > Hope that helps! > > François Beausoleil -- Bill Moran <wmoran@potentialtech.com>
Laszlo Fogas <laszlo@falconsocial.com> wrote: > Sorry, I wasn't clear. > > Autovacuum runs with default setting, I believe it's daily, or > whenever it feels like. > > When autovacuum was disabled, we had this problem once every 2 > months. > > With autovacuum enabled, we had this problem once in every six > month. That was quite clear. > It seems autovacuum could only delay the event, but not prevent > it. Autovacuum with default settings improves it drastically. Autovacuum with more aggressive settings might well prevent the problem entirely. -Kevin
Le 2012-08-10 à 07:53, Laszlo Fogas a écrit : > We are running Postgres 8.3 on our production servers on Amazon EC2. > > We have a reoccurring problem of slowness initially in every 2 months, after enabling autovacuum every 6 months what onlyfull vacuum can solve. It's kind of a problem as it requires 2hrs downtime and we want to avoid that. What we are doingnow is moving to Postgres 9.1 as a desperate measure, but we would like to understand better the root cause of the problem. The usual solution is to run autovacuum *more* frequently, not less. It's not perfectly clear, but you say "after enablingautovacuum every 6 months". If that's the case, then it's much too long. autovacuum should be running hourly, ifnot more often. http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server#autovacuum_max_fsm_pages.2C_max_fsm_relations talks about runningmore frequently. Hope that helps! François Beausoleil
Thank you for your answers, we will try to make it more agressive!
On Aug 10, 2012 11:56 PM, "Bill Moran" <wmoran@potentialtech.com> wrote:
On Fri, 10 Aug 2012 16:14:54 +0200 Laszlo Fogas <laszlo@falconsocial.com> wrote:
> Sorry, I wasn't clear.
>
> Autovacuum runs with default setting, I believe it's daily, or whenever it
> feels like.
>
> When autovacuum was disabled, we had this problem once every 2 months.
>
> With autovacuum enabled, we had this problem once in every six month. It
> seems autovacuum could only delay the event, but not prevent it.
The first steps are still the same: change the autovacuum settings to be
more aggressive.
> On Fri, Aug 10, 2012 at 3:49 PM, François Beausoleil
> <francois@teksol.info>wrote:
>
> >
> > Le 2012-08-10 à 07:53, Laszlo Fogas a écrit :
> > > We are running Postgres 8.3 on our production servers on Amazon EC2.
> > >
> > > We have a reoccurring problem of slowness initially in every 2 months,
> > after enabling autovacuum every 6 months what only full vacuum can solve.
> > It's kind of a problem as it requires 2hrs downtime and we want to avoid
> > that. What we are doing now is moving to Postgres 9.1 as a desperate
> > measure, but we would like to understand better the root cause of the
> > problem.
> >
> > The usual solution is to run autovacuum *more* frequently, not less. It's
> > not perfectly clear, but you say "after enabling autovacuum every 6
> > months". If that's the case, then it's much too long. autovacuum should be
> > running hourly, if not more often.
> >
> >
> > http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server#autovacuum_max_fsm_pages.2C_max_fsm_relationstalks about running more frequently.
> >
> > Hope that helps!
> > François Beausoleil
--
Bill Moran <wmoran@potentialtech.com>