Re: vacuum -vs reltuples on insert only index - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: vacuum -vs reltuples on insert only index
Msg-id CAH2-Wzk+8kQ-ZoxoeOBXymStt5SaXZF8RncOB7jP0sZ31WJ8Aw@mail.gmail.com
In response to Re: vacuum -vs reltuples on insert only index  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: vacuum -vs reltuples on insert only index  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Mon, Nov 2, 2020 at 10:03 AM Peter Geoghegan <pg@bowt.ie> wrote:
> Attached is my proposed fix, which takes this approach. I will commit
> this on Wednesday or Thursday, barring any objections.

Just to be clear: I am not proposing that we set
'IndexBulkDeleteResult.estimated_count = true' here, even though
there is a certain sense in which we now accept an unreliable figure
in Postgres 13. This is not what GIN does. That approach doesn't seem
appropriate for nbtree + deduplication, which is much closer to nbtree
in Postgres 12 than to GIN. I believe that the final num_index_tuples
value (generated during cleanup-only nbtree VACUUM) is in general
sufficiently reliable to not be treated as an estimate by vacuumlazy.c
-- the pg_class entry for the index should still be updated in
update_index_statistics().
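
To illustrate the distinction being drawn here, the following is a minimal
sketch in C, not the actual vacuumlazy.c code. It assumes that the caller skips
the pg_class update whenever the index AM flags its count as an estimate. The
types SketchBulkDeleteResult and SketchPgClassEntry and the function
sketch_update_index_statistics() are hypothetical stand-ins invented for this
example; only the field names num_index_tuples and estimated_count come from
the discussion above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two IndexBulkDeleteResult fields discussed. */
typedef struct SketchBulkDeleteResult
{
    double num_index_tuples;    /* tuple count reported by the index AM */
    bool   estimated_count;     /* true if that count is only an estimate */
} SketchBulkDeleteResult;

/* Hypothetical stand-in for the index's pg_class entry. */
typedef struct SketchPgClassEntry
{
    double reltuples;
} SketchPgClassEntry;

/*
 * Assumed behavior: copy the count into the pg_class stand-in only when the
 * index AM did not flag it as an estimate. Returns true if an update happened.
 */
static bool
sketch_update_index_statistics(SketchPgClassEntry *entry,
                               const SketchBulkDeleteResult *stats)
{
    /* No result, or a result flagged as an estimate: leave reltuples alone. */
    if (stats == NULL || stats->estimated_count)
        return false;

    entry->reltuples = stats->num_index_tuples;
    return true;
}
```

Under this model, leaving estimated_count unset for nbtree means the count
from a cleanup-only VACUUM still reaches pg_class, which is the outcome argued
for above.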

In other words, I think that the remaining posting-list related
inaccuracies are comparable to the existing inaccuracies caused by
concurrent page splits during nbtree vacuuming (I describe the problem
right next to an old comment about that issue, in fact). What we have
in both cases is an artifact of how the data is physically represented
and the difficulty it causes us during vacuuming, in certain cases.
There are known error bars. That's why we shouldn't treat
num_index_tuples as merely an estimate.

-- 
Peter Geoghegan


