Re: RFE: Make statistics robust for unplanned events - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: RFE: Make statistics robust for unplanned events
Date
Msg-id CAH2-WzmEAutJuEK1i4y8pCafHCn9jcJ=WB4H=sjS6dJYrdaC8g@mail.gmail.com
In response to Re: RFE: Make statistics robust for unplanned events  (Magnus Hagander <magnus@hagander.net>)
Responses Re: RFE: Make statistics robust for unplanned events  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Wed, Apr 21, 2021 at 5:39 AM Magnus Hagander <magnus@hagander.net> wrote:
> I'm pretty sure everybody would *want* this. At least nobody would be
> against it. The problem is the potential performance cost of it.

VACUUM stores vacrel->new_live_tuples as pg_class.reltuples for the
heap relation being vacuumed. It also stores new_rel_pages as
pg_class.relpages (see vac_update_relstats()). However, it does not
store vacrel->new_dead_tuples in pg_class or in any other durable
location (that information is only passed along via a call to
pgstat_report_vacuum() instead).

We already *almost* pay the full cost of durably storing the
information used by autovacuum.c's relation_needs_vacanalyze() to
determine if a VACUUM is required -- we're only missing
new_dead_tuples/tabentry->n_dead_tuples. Why not go one tiny baby step
further to fix this issue?
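
To make the dependency concrete, here is a minimal sketch of the
dead-tuple test as I understand it from relation_needs_vacanalyze().
It is simplified: it assumes the stock defaults for
autovacuum_vacuum_threshold (50) and autovacuum_vacuum_scale_factor
(0.2), and it ignores per-table reloptions and wraparound-forced
vacuums. The function and macro names are illustrative only and are
not the ones used in autovacuum.c.

```c
#include <stdbool.h>

/* Assumed stock defaults; real values come from GUCs or reloptions. */
#define VAC_BASE_THRESH   50.0   /* autovacuum_vacuum_threshold */
#define VAC_SCALE_FACTOR  0.2    /* autovacuum_vacuum_scale_factor */

/*
 * Simplified dead-tuple trigger: a VACUUM is due once the dead tuple
 * count exceeds the base threshold plus a fraction of reltuples.
 * reltuples is durable (pg_class); n_dead_tuples is not.
 */
bool
needs_vacuum(double reltuples, double n_dead_tuples)
{
    double vacthresh = VAC_BASE_THRESH + VAC_SCALE_FACTOR * reltuples;

    return n_dead_tuples > vacthresh;
}
```

With reltuples at one million, the threshold is 200050 dead tuples. If
n_dead_tuples is reset to zero while reltuples survives, the test
above stays false until the whole count has been accumulated again.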

Admittedly, storing new_dead_tuples durably is not sufficient to allow
ANALYZE to be launched on schedule when there is a hard crash. It is
also insufficient to make sure that insert-driven autovacuums get
launched on schedule. Even so, I'm pretty sure that just making sure
that we store it durably (alongside pg_class.reltuples?) will impose
only a modest additional cost, while fixing Patrik's problem. That
seems likely to be worth it.
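
A rough model of what is and is not covered, for illustration only. It
assumes that a hard crash zeroes every counter kept by the statistics
subsystem, that the stock default thresholds apply, and that the
durable copy would hold the dead tuple count recorded by the last
VACUUM. The struct, the dead_tuples_at_last_vacuum field, and
after_crash() are hypothetical; only n_dead_tuples,
changes_since_analyze and inserts_since_vacuum mirror real tabentry
fields.

```c
#include <stdbool.h>

typedef struct TableCounters
{
    double reltuples;                  /* pg_class.reltuples, durable */
    double n_dead_tuples;              /* drives dead-tuple VACUUM */
    double changes_since_analyze;      /* drives ANALYZE */
    double inserts_since_vacuum;       /* drives insert-driven VACUUM */
    double dead_tuples_at_last_vacuum; /* hypothetical durable copy */
} TableCounters;

/* Hypothetical crash model: non-durable counters are reset to zero. */
TableCounters
after_crash(TableCounters c, bool dead_tuples_durable)
{
    c.n_dead_tuples = dead_tuples_durable ? c.dead_tuples_at_last_vacuum : 0.0;
    c.changes_since_analyze = 0.0;   /* still lost either way */
    c.inserts_since_vacuum = 0.0;    /* still lost either way */
    return c;
}

/* Simplified triggers using the assumed stock default settings. */
bool
vacuum_due(TableCounters c)
{
    return c.n_dead_tuples > 50.0 + 0.2 * c.reltuples;
}

bool
analyze_due(TableCounters c)
{
    return c.changes_since_analyze > 50.0 + 0.1 * c.reltuples;
}

bool
insert_vacuum_due(TableCounters c)
{
    return c.inserts_since_vacuum > 1000.0 + 0.2 * c.reltuples;
}
```

In this model, a table that was due for all three before the crash is
due for none of them afterwards today. With the durable copy, only the
dead-tuple VACUUM trigger can survive the crash, which is the
limitation acknowledged above.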

-- 
Peter Geoghegan


