Re: lost statistics; analyze needs to execute twice - Mailing list pgsql-bugs

From Tom Lane
Subject Re: lost statistics; analyze needs to execute twice
Date
Msg-id 25519.1251865500@sss.pgh.pa.us
Whole thread Raw
In response to Re: lost statistics; analyze needs to execute twice  (Magnus Hagander <magnus@hagander.net>)
Responses Re: lost statistics; analyze needs to execute twice  (Magnus Hagander <magnus@hagander.net>)
Re: lost statistics; analyze needs to execute twice  (Alvaro Herrera <alvherre@commandprompt.com>)
List pgsql-bugs
Magnus Hagander <magnus@hagander.net> writes:
> On Tue, Sep 1, 2009 at 00:02, Jaime
> Casanova<jcasanov@systemguards.com.ec> wrote:
>> when i issue an "immediate shutdown" the statistics on all tables disappear...

> That is by design. Whenever the server goes into crash recovery on
> startup, it will clean out the statistics. Since the statistics data
> is not kept crashsafe, there is no way to know if it's corrupt or not.

Yeah.  I don't think we'll change that.  "-m immediate" is not the
recommended way to stop the server; it's more like the big red button
that dumps Halon all over your equipment.  You expect to have to clean
up afterwards.

>> and when i try to recover them via an analyze; (on all tables on the
>> database) the result is nothing...
>> i have to exexute the analyze commands twice to compute the statistics

> pg_stat_* are not directly affected by ANALYZE. They collect runtime
> statistics about activity in the tables,

Yeah, but ANALYZE does update the stats collector stats too.  I looked
into what's actually happening here, and it's a bit interesting:

1. Stats collector tables are empty.

2. ANALYZE does its thing and sends a PgStat_MsgAnalyze message.

3. pgstat_recv_analyze *intentionally throws the data away*, on the
grounds that if it were interesting there would already be a stats
table entry for the table.

4. At completion of ANALYZE, the regular tabstat machinery sends
off a tabstat message for the table, because guess what, ANALYZE did a
scan of that table, and there are t_blocks_fetched counts to report.

5. pgstat_recv_tabstat happily creates a table entry.  (The pg_statio
counts in it are nonzero, even though the pg_stat counts aren't.)

6. Now, if you repeat the cycle, the stats collector will accept
the second PgStat_MsgAnalyze message, because this time there's
a stats table entry.

This is a bit silly I guess --- we dropped the data but didn't actually
save any stats-table space.

I'm inclined to think that the don't-create-a-table-entry behavior in
pgstat_recv_vacuum and pgstat_recv_analyze should just be dropped.
I'm dubious that it ever worked as intended.  To have it work right
you'd need to suppress vacuum/analyze physical I/O from the tabstats
counts, which doesn't seem like an amazingly good idea.  Moreover,
autovacuum is unlikely to issue vacuum or analyze against a table
that hasn't already got a stats-table entry, so the filter doesn't
seem likely to buy much if it did work.  There might have been some
value in the idea back when cron-driven database-wide VACUUM ANALYZE
was the standard maintenance mechanism, but that's not the recommended
thing anymore.

Comments?

            regards, tom lane

pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: pg_ctl infinite loop and memory leak
Next
From: Magnus Hagander
Date:
Subject: Re: lost statistics; analyze needs to execute twice