Home > mailing lists

Re: BUG #7853: Incorrect statistics in table with many dead rows. - Mailing list pgsql-bugs

From	Jeff Janes
Subject	Re: BUG #7853: Incorrect statistics in table with many dead rows.
Date	February 10, 2013 23:11:02
Msg-id	CAMkU=1yE4Dwr0a7GPKMZjxMUiT6qSY=2Y1oF79myhdkxwWTu_A@mail.gmail.com Whole thread Raw
In response to	Re: BUG #7853: Incorrect statistics in table with many dead rows. (Kevin Grittner <kgrittn@ymail.com>)
Responses	Re: BUG #7853: Incorrect statistics in table with many dead rows. Re: BUG #7853: Incorrect statistics in table with many dead rows.
List	pgsql-bugs

Tree view

On Tue, Feb 5, 2013 at 2:00 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
> "jimbob@seagate.com" <jimbob@seagate.com> wrote:
>
>> INFO:  analyzing "public.stream_file"
>> INFO:  "stream_file": scanned 30000 of 2123642 pages, containing
>> 184517 live rows and 2115512 dead rows; 30000 rows in sample,
>> 158702435 estimated total rows
>
> 184517 live rows in 30000 randomly sampled pages out of 2123642
> total pages, means that the statistics predict that a select
> count(*) will find about  13 million live rows to count.
>
>> After "analyze verbose", the table shows 158 million rows. A
>> select count(1) yields 13.8 million rows.
>
> OK, the estimate was 13 million and there were actually 13.8
> million, but it is a random sample used to generate estimates.
> That seems worse than average, but close enough to be useful.
> The 158.7 million total rows includes dead rows, which must be
> visited to determine visibility, but will not be counted because
> they are not visible to the counting transaction.

To clarify here, the 158.7 million estimate does not *intentionally*
include dead rows.  As you say, the ANALYZE did get a very good
instantaneous estimate of the number of live rows.  However, ANALYZE
doesn't over-write the old estimate, it averages its estimate into the
old one.  After the table shape changes dramatically, the ANALYZE
needs to be run repeatedly before the estimate will converge to the
new reality.  (Of course a cluster or vacuum full will blow away the
old statistics, so the next analyze after that will solely determine
the new statistics.)

I agree, not a bug.

Cheers,

Jeff

pgsql-bugs by date:

From: Tom Lane
Date: 10 February 2013, 18:17:40
Subject: Re: BUG #7866: even after T's child-tables are deleted it cannot have select-rule

From: matti.aarnio@methics.fi
Date: 11 February 2013, 02:05:20
Subject: BUG #7865: Unexpected error code on insert of duplicate to composite primary key

Re: BUG #7853: Incorrect statistics in table with many dead rows. - Mailing list pgsql-bugs

Previous

Next