Re: BUG #7853: Incorrect statistics in table with many dead rows. - Mailing list pgsql-bugs

From Jeff Janes
Subject Re: BUG #7853: Incorrect statistics in table with many dead rows.
Date
Msg-id CAMkU=1x2SNdDqdRVQUMzkZiDW_Vbx73z1obhw8HQ7eQSyiORDQ@mail.gmail.com
Whole thread Raw
In response to Re: BUG #7853: Incorrect statistics in table with many dead rows.  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: BUG #7853: Incorrect statistics in table with many dead rows.
List pgsql-bugs
On Sun, Feb 10, 2013 at 12:10 PM, Jeff Janes <jeff.janes@gmail.com> wrote:

> On Tue, Feb 5, 2013 at 2:00 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
> >
> > OK, the estimate was 13 million and there were actually 13.8
> > million, but it is a random sample used to generate estimates.
> > That seems worse than average, but close enough to be useful.
> > The 158.7 million total rows includes dead rows, which must be
> > visited to determine visibility, but will not be counted because
> > they are not visible to the counting transaction.
>
> To clarify here, the 158.7 million estimate does not *intentionally*
> include dead rows.  As you say, the ANALYZE did get a very good
> instantaneous estimate of the number of live rows.  However, ANALYZE
> doesn't over-write the old estimate, it averages its estimate into the
> old one.  After the table shape changes dramatically, the ANALYZE
> needs to be run repeatedly before the estimate will converge to the
> new reality.  (Of course a cluster or vacuum full will blow away the
> old statistics, so the next analyze after that will solely determine
> the new statistics.)
>

I was incredibly wrong here.  The cluster or vacuum do not blow away the
stats so that the next analyze gets to solely determine them.  Rather, they
impose their own idea of live tuples, and then analyze can only update that
incrementally as it averages itself into the old value.

Worse, the two methods have very different ideas of what constitutes a live
tuple.  ANALYZE thinks tuples that are visible to a current/recent snapshot
are live.  While CLUSTER and VACUUM think tuples that are possibly visible
to anyone are live.

I would say that this is a bug, or at least approaching to being one.  It
is not obvious whether reltuples and n_live_tuples should count the
"recently dead", but it should either be one way or the other and not an
unholy mixture of the two.

As it is now, a cluster or simple vacuum will snap n_live_tuples so that it
counts recently dead, then analyze will slowly converge it to excludes
recently dead, and then the next vacuum will snap it back again.

Of course, all of this only comes into play in the presence of very
long-lived transactions that prevent tuples from going away.  Otherwise the
number recently dead is small enough not to matter.


create table foo as select (random()*1000000)::integer as val from
generate_series(1,50000000);

In a different session, open a transaction and leave it open: begin; create
temp table adlfkj (x serial);

Back in the main session:

delete from foo where val > 100;

run this repeatedly and watch the rows estimate slowly decay:

ANALYZE verbose foo;  explain select count(*) from foo;

Then run this and watch it instantly spring back:

VACUUM VERBOSE foo ; explain select count(*) from foo;

Cheers,

Jeff

pgsql-bugs by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: BUG #7873: pg_restore --clean tries to drop tables that don't exist
Next
From: Rafael Martinez Guerrero
Date:
Subject: Re: BUG #7902: lazy cleanup of extraneous WAL files can cause out of disk issues