Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means
Date
Msg-id 3b3d8eac-c709-0d25-088e-b98339a1b28a@2ndquadrant.com
Whole thread Raw
In response to Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

So here is an updated version of the patch/fix, addressing the remaining
issues in v3 posted by Tom in November.


The 0001 part is actually a bugfix in bloom and spgist index AM, which
did something like this:

    reltuples = IndexBuildHeapScan(...)

    result->heap_tuples = result->index_tuples = reltuples;

That is, these two AMs simply used the number of heap rows for the
index. That does not work for partial indexes, of course, where the
correct index reltuples value is likely much lower.

0001 fixes this by tracking the number of actually indexed rows in the
build states, just like in the other index AMs.

A VACUUM or ANALYZE will fix the estimate, of course, but for tables
that are not changing very much it may take quite a while. So I think
this is something we definitely need to back-patch.


The 0002 part is the main part, unifying the definition of reltuples on
three main places:

 a) acquire_sample_rows (ANALYZE)
 b) lazy_scan_heap (VACUUM)
 c) IndexBuildHeapRangeScan (CREATE INDEX)

As the ANALYZE case seems the most constrained, the other two places
were updated to use the same criteria for which rows to include in the
reltuples estimate:

  * HEAPTUPLE_LIVE
  * HEAPTUPLE_INSERT_IN_PROGRESS (same transaction)
  * HEAPTUPLE_DELETE_IN_PROGRESS (not the same trasaction)

This resolves the issue with oscillating reltuples estimates, produced
by VACUUM and ANALYZE (with many non-removable dead tuples).

I've checked all IndexBuildHeapRangeScan callers, and none of them is
using the reltuples estimate for anything except for passing it to
index_update_stats. Aside from the bug fixed in 0001, of course.



regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

pgsql-hackers by date:

Previous
From: Christoph Berg
Date:
Subject: Re: configure's checks for --enable-tap-tests are insufficient
Next
From: Alvaro Herrera
Date:
Subject: Re: configure's checks for --enable-tap-tests are insufficient