Re: help understanding create statistic - Mailing list pgsql-general

From David Rowley
Subject Re: help understanding create statistic
Date
Msg-id CAKJS1f-Zw24J9FEk=4UnpneMkXd4LBed=ayW8XTH-sxLRoE11g@mail.gmail.com
Whole thread Raw
In response to help understanding create statistic  (Luca Ferrari <fluca1978@gmail.com>)
List pgsql-general
On 28 June 2018 at 21:38, Luca Ferrari <fluca1978@gmail.com> wrote:
> CREATE INDEX idx_year
> ON expenses ( EXTRACT( year FROM day ) );
>
> why is the planner not choosing to use such index on a 'year' raw query?
>
> EXPLAIN SELECT * FROM expenses
> WHERE year = 2016;

The expression in the where clause must match the indexed expression.
You'd need to add an index on just (year) for that to work.

> Am I misunderstaing this functional dependency?

Yeah, the statistics are just there to drive the planner's costing.
They won't serve as proof for anything else.

All you've done by creating those stats is to allow better estimates
for queries such as:

SELECT * FROM expenses WHERE day = '2018-06-28' and year = 2018;

> stxdependencies | {"3 => 5": 1.000000}

It would appear that "3" is the attnum for day and "5" is year. All
that tells the planner is that on the records sampled during analyze
is that each "day" had about exactly 1 year.

There's nothing then to stop you going and adding a record with the
day '2017-01-01' and the year 2018. The stats will remain the same
until you analyze the table again.

If those stats didn't exist, the planner would have multiplied the
selectivity estimates of each item in the WHERE clause individually.
So if about 10% of records had year=2018, and 0.01% had '2018-06-28',
then the selectivity would have been 0.1 *  0.001.   With a functional
dependency of 1, the selectivity just becomes 0.001.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


pgsql-general by date:

Previous
From: Luca Ferrari
Date:
Subject: Re: plperl and plperlu language extentsions
Next
From: "joby.john@nccgroup.trust"
Date:
Subject: Re: Database name with semicolon