Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum
Date
Msg-id BANLkTimVhdO_bKQagRsH0OLp7MxgJZDryg@mail.gmail.com
Whole thread Raw
In response to Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum  (Robert Haas <robertmhaas@gmail.com>)
Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Wed, May 25, 2011 at 1:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> [ shrug... ]  When you don't have complete information, it's *always*
> the case that you will sometimes make a mistake.  That's not
> justification for paralysis, especially not when the existing code is
> demonstrably broken.
>
> What occurs to me after thinking a bit more is that the existing tuple
> density is likely to be only an estimate, too (one coming from the last
> ANALYZE, which could very well have scanned even less of the table than
> VACUUM did).  So what I now think is that both VACUUM and ANALYZE ought
> to use a calculation of the above form to arrive at a new value for
> pg_class.reltuples.  In both cases it would be pretty easy to track the
> number of pages we looked at while counting tuples, so the same raw
> information is available.
>
>> I am wondering, though, why we're not just inserting a special-purpose
>> hack for TOAST tables.
>
> Because the problem is not specific to TOAST tables.  As things
> currently stand, we will accept the word of an ANALYZE as gospel even if
> it scanned 1% of the table, and completely ignore the results from a
> VACUUM even if it scanned 99% of the table.  This is not sane.

I agree that if VACUUM scanned 99% of the table, it's probably fine to
use its numbers.  It's also fine to use the numbers from ANALYZE,
because those pages are chosen randomly.  What bothers me is the idea
of using a small *non-random* sample, and I'm not sure that
incorporating possibly-bogus results slowly is any better than
incorporating them quickly.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum
Next
From: Greg Smith
Date:
Subject: Re: tackling full page writes