Re: Problem after VACUUM ANALYZE - Mailing list pgsql-general

From mljv@planwerk6.de
Subject Re: Problem after VACUUM ANALYZE
Date
Msg-id 200804091017.39896.mljv@planwerk6.de
In response to Re: Problem after VACUUM ANALYZE  ("David Wilson" <david.t.wilson@gmail.com>)
List pgsql-general
On Wednesday, 9 April 2008 at 10:11, David Wilson wrote:
> On Wed, Apr 9, 2008 at 3:29 AM,  <mljv@planwerk6.de> wrote:
> >  But if I run "VACUUM ANALYZE" without concurrent queries, everything
> > runs fine afterwards.
> >
> >  If I run "VACUUM ANALYZE" with even a few concurrent queries, it slows
> > down to a crawl.
> >
> >  Could it be that something like this is happening:
> >  - In the early morning a new DB connection is opened.
> >  - While VACUUM ANALYZE is running, the planner chooses different plans
> > because some index cannot be used, or the statistics are momentarily
> > unavailable while being updated. So the query gets a wrong plan: it
> > uses a seq scan instead of an index scan.
> >  - This wrongly planned statement is prepared, so even after VACUUM
> > ANALYZE is done, the statement keeps using the wrong plan.
> >  - Rising load triggers many concurrent queries with wrong plans, so
> > everything slows down.
>
> I've run into a very similar problem. I have some long-running
> processes that generate a large amount of data and then query it; they
> must periodically drop and rebuild their connections to ensure that
> query plans get re-planned according to updated statistics. This is
> especially true when a new DB is first being brought up with an
> initial set of data (~1 week of compute time, ~30 GB of data and
> ~120M rows) with processes that live through that entire build.

This is not the case at my place. We are not producing lots of data, just a
little at a time via small INSERTs. I don't think the statistics really
change very often, as the tables are rather fixed and data is only added.

> My solution, as mentioned above, is to rebuild the connection approx
> every hour on my long-running processes. This is a tricky trade-off,
> because you want the benefit of not re-planning queries in 99.9% of
> the cases. I'm not really opposed to the current system that requires
> the reconnections, given that the overhead involved in them is
> completely negligible in the grand scheme of things. There are
> downsides to not re-planning queries, and if you can't live with them
> then you should either force re-planning at intervals or avoid
> prepared statements.
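The reconnection workaround quoted above amounts to something like the
following sketch. open_connection() and run_batch() are hypothetical
stand-ins for the application's own connect and work routines; only the
timing check is concrete:

```python
import time

RECONNECT_INTERVAL = 3600  # seconds; rebuild the session roughly hourly

def should_reconnect(connected_at: float, now: float,
                     interval: float = RECONNECT_INTERVAL) -> bool:
    """True once the session is old enough that rebuilding it (and thus
    re-planning all prepared statements) is due."""
    return now - connected_at >= interval

# Worker loop sketch (open_connection/run_batch are hypothetical):
#
#   conn = open_connection()
#   connected_at = time.time()
#   while True:
#       if should_reconnect(connected_at, time.time()):
#           conn.close()              # cached plans die with the session
#           conn = open_connection()  # fresh session => fresh plans
#           connected_at = time.time()
#       run_batch(conn)
```

Since prepared statements are scoped to a session, closing the connection
is enough to discard every cached plan at once.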

In release 8.3:
http://www.postgresql.org/docs/8.3/interactive/release-8-3.html
"Automatically re-plan cached queries when table definitions change or
statistics are updated"

So in 8.3 no reconnection is necessary to re-plan prepared statements.
Maybe that helps you.

kind regards,
janning

