Re: pg_statistic "forced" values - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: pg_statistic "forced" values
Date
Msg-id 1194556796.4251.307.camel@ebony.site
Whole thread Raw
In response to pg_statistic "forced" values  (Jacques Caron <jc@oxado.com>)
List pgsql-hackers
On Wed, 2007-11-07 at 17:18 +0100, Jacques Caron wrote:

> It is well known that in some instances the Postgresql will make 
> estimates of the number of distinct values in a table that can be 
> quite far off reality. This then has a tendency to make the planner 
> lean towards unsavory plans (read: seqscans) because it estimates the 
> number of lines returned by a part of the request as being quite a 
> lot more than they really are.
> 
> The "good" solution would be to fix the estimator, but there has 
> already been long discussions on this topic in the past years and 
> apparently no consensus was found, with alternatives proposed 
> "fixing" some cases where the current estimator is wrong but getting 
> in trouble in others, or requiring quite a bit more CPU/memory/disk 
> I/O to achieve their results (correct me if I'm wrong).
> 
> There is a "simple" way to override this, which is to change the 
> value present in pg_statistic, however it will be overwritten the 
> next time ANALYZE (or VACUUM ANALYZE) is run. This thus requires 
> adding updates to this value every time a request that might be 
> fooled by it is executed, which is cumbersome, and does not 
> facilitate updates of this value (especially with positive values of 
> stadistinct).
> 
> It seems to me it would be a good idea to be able to store a forced 
> value for stadistinct in pg_attribute (with optionally some clauses 
> to set/change/reset it in CREATE TABLE, ALTER TABLE ADD COLUMN and 
> ALTER TABLE ALTER COLUMN, in a way similar to the STATISTICS clauses).

I'm looking at exactly this issue at the moment, though as one issue
amongst many similar ones.

My inclination is to decide what needs to be stored, then debate
separately where it should be stored.

I expect to post a wider proposal in around two weeks.

--  Simon Riggs 2ndQuadrant  http://www.2ndQuadrant.com



pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: Re: Free Space Map thoughts
Next
From: Tom Lane
Date:
Subject: Re: New tzdata available