Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means
Date
Msg-id 27720.1511042193@sss.pgh.pa.us
Whole thread Raw
In response to Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List pgsql-hackers
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> On 11/02/2017 08:15 PM, Tom Lane wrote:
>> However, I'm not sure we're there yet, because there remains a fairly
>> nasty discrepancy even once we've gotten everyone onto the same page
>> about reltuples counting just live tuples: VACUUM and ANALYZE have
>> different definitions of what's "live".  In particular they do not treat
>> INSERT_IN_PROGRESS and DELETE_IN_PROGRESS tuples the same.  Should we
>> try to do something about that?  If so, what?  It looks like ANALYZE's
>> behavior is pretty tightly constrained, judging by the comments in
>> acquire_sample_rows.

> ISTM we need to unify those definitions, probably so that VACUUM adopts
> what acquire_sample_rows does. I mean, if ANALYZE assumes that the stats
> will be updated at the end of transaction, why shouldn't VACUUM do the
> same thing?

That was the way I was leaning.  I haven't thought very hard about the
implications, but as long as the change in VACUUM's behavior extends
only to the live-tuple count it reports, it seems like adjusting it
couldn't break anything too badly.

>> Another problem is that it looks like CREATE INDEX will set reltuples
>> to the total number of heap entries it chose to index, because that
>> is what IndexBuildHeapScan counts.  Maybe we should adjust that?

> You mean by only counting live tuples in IndexBuildHeapRangeScan,
> following whatever definition we end up using in VACUUM/ANALYZE?

Right.  One issue is that, as I mentioned, the index AMs probably want to
think about total-tuples-indexed not live-tuples; so for their purposes,
what IndexBuildHeapScan currently counts is the right thing.  We need to
look and see if any AMs are actually using that value rather than just
silently passing it back.  If they are, we might need to go to the trouble
of computing/returning two values.
        regards, tom lane


pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means
Next
From: Tomas Vondra
Date:
Subject: Re: percentile value check can be slow