Re: Shouldn't we have a way to avoid "risky" plans? - Mailing list pgsql-performance

From Robert Haas
Subject Re: Shouldn't we have a way to avoid "risky" plans?
Date
Msg-id BANLkTimcUqk4UQEQGpocGsRoQTk-S+R3mw@mail.gmail.com
Whole thread Raw
In response to Re: Shouldn't we have a way to avoid "risky" plans?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Shouldn't we have a way to avoid "risky" plans?
List pgsql-performance
On Fri, Mar 25, 2011 at 10:24 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Josh Berkus <josh@agliodbs.com> writes:
>>> If the planner starts operating on the basis of worst case rather than
>>> expected-case performance, the complaints will be far more numerous than
>>> they are today.
>
>> Yeah, I don't think that's the way to go.  The other thought I had was
>> to accumulate a "risk" stat the same as we accumulate a "cost" stat.
>
>> However, I'm thinking that I'm overengineering what seems to be a fairly
>> isolated problem, in that we might simply need to adjust the costing on
>> this kind of a plan.
>
> mergejoinscansel doesn't currently try to fix up the histogram bounds by
> consulting indexes.  At the time I was afraid of the costs of doing
> that, and I still am; but it would be a way to address this issue.

Apparently, this is a pain point for the MySQL query planner - not so
much for merge joins, which I don't think are supported in any of the
major forks anyway - but the planner's desire to go estimate things by
probing the indexes.  IIRC, the MariaDB guys are looking into adding
persistent statistics to address this problem.  That doesn't
necessarily mean that we shouldn't do this, but it probably does mean
that we should be awfully careful about it.

Another thought is that we might want to consider reducing
autovacuum_analyze_scale_factor.  The root of the original problem
seems to be that the table had some data churn but not enough to cause
an ANALYZE.  Now, if the data churn is random, auto-analyzing after
10% churn might be reasonable, but a lot of data churn is non-random,
and ANALYZE is fairly cheap.  I'm just shooting in the dark here; I
might be all wet.  I think part of the problem is that the AV launcher
isn't very smart about looking at the overall picture.  It'd be nice,
for example, to be able to be more aggressive when the system is quiet
and to be a bit more careful when the system is saturated, but it's a
bit tricky to think about how to make that work, or exactly what the
heuristics should be.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

pgsql-performance by date:

Previous
From: Robert Haas
Date:
Subject: Re: Shouldn't we have a way to avoid "risky" plans?
Next
From: Greg Smith
Date:
Subject: Re: How to configure a read-only database server?