pg_statistic "forced" values - Mailing list pgsql-hackers

From Jacques Caron
Subject pg_statistic "forced" values
Date
Msg-id 20071107161839.03C8C1165946@zeus.directinfos.com
Whole thread Raw
Responses Re: pg_statistic "forced" values  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-hackers
Hi,

It is well known that in some instances the Postgresql will make 
estimates of the number of distinct values in a table that can be 
quite far off reality. This then has a tendency to make the planner 
lean towards unsavory plans (read: seqscans) because it estimates the 
number of lines returned by a part of the request as being quite a 
lot more than they really are.

The "good" solution would be to fix the estimator, but there has 
already been long discussions on this topic in the past years and 
apparently no consensus was found, with alternatives proposed 
"fixing" some cases where the current estimator is wrong but getting 
in trouble in others, or requiring quite a bit more CPU/memory/disk 
I/O to achieve their results (correct me if I'm wrong).

There is a "simple" way to override this, which is to change the 
value present in pg_statistic, however it will be overwritten the 
next time ANALYZE (or VACUUM ANALYZE) is run. This thus requires 
adding updates to this value every time a request that might be 
fooled by it is executed, which is cumbersome, and does not 
facilitate updates of this value (especially with positive values of 
stadistinct).

It seems to me it would be a good idea to be able to store a forced 
value for stadistinct in pg_attribute (with optionally some clauses 
to set/change/reset it in CREATE TABLE, ALTER TABLE ADD COLUMN and 
ALTER TABLE ALTER COLUMN, in a way similar to the STATISTICS clauses).

Alternatively, it could be a simple boolean to just say "don't update 
stadistinct".

Or did I miss something and this already exists somewhere?

If not, are there any comments or suggestions regarding implementing this?

Thanks,

Jacques.



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Weird type selection choice
Next
From: "Kevin Grittner"
Date:
Subject: Re: Visibility map thoughts