Re: Autovacuum on partitioned table (autoanalyze) - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: Autovacuum on partitioned table (autoanalyze)
Date
Msg-id 20210408161916.GA5618@alvherre.pgsql
Whole thread Raw
In response to Re: Autovacuum on partitioned table (autoanalyze)  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On 2021-Apr-08, Tomas Vondra wrote:

> On 4/8/21 5:27 PM, Alvaro Herrera wrote:
>
> > Same as for any other relation: ANALYZE would set it, after it's done
> > scanning the table.  We would to make sure that nothing resets it to
> > empty, though, and that it doesn't cause issues elsewhere.  (The patch I
> > sent contains the minimal change to make it work, but of course that's
> > missing having other pieces of code maintain it.)
> 
> So ANALYZE would inspect the child relations, sum the reltuples and set
> it for the parent? IMO that'd be problematic because it'd mean we're
> comparing the current number of changes with reltuples value which may
> be arbitrarily stale (if we haven't analyzed the parent for a while).

What?  Not at all.  reltuples would be set by ANALYZE on one run, and
then the value is available for the future autovacuum run.  That's how
it works for regular tables too, so I'm not sure what you problem have
with that.  The (possibly stale) reltuples value is multiplied by the
scale factor, and added to the analyze_threshold value, and that's
compared with the current changes_since_analyze to determine whether to
analyze or not.

> That's essentially the issue I described when explaining why I think the
> code needs to propagate the changes, reread the stats and then evaluate
> which relations need vacuuming. It's similar to the issue of comparing
> old changes_since_analyze vs. current reltuples, which is why the code
> is rereading the stats before checking the thresholds. This time it's
> the opposite direction - the reltuples might be stale.

Well, I don't think the issue is the same.  reltuples is always stale,
even for regular tables, because that's just how it works.
changes_since_analyze is not stale for regular tables, and that's why it
makes sense to propagate it from partitions to ancestors prior to
checking the analyze condition.

> FWIW I think the current refresh logic is not quite correct, because
> autovac_refresh_stats does some throttling (STATS_READ_DELAY). It
> probably needs a "force" parameter to ensure it actually reads the
> current stats in this one case.

Hmm ... good catch, but actually that throttling only applies to the
launcher.  do_autovacuum runs in the worker, so there's no throttling.

-- 
Álvaro Herrera                            39°49'30"S 73°17'W



pgsql-hackers by date:

Previous
From: David Steele
Date:
Subject: Re: VACUUM (DISABLE_PAGE_SKIPPING on)
Next
From: Tom Lane
Date:
Subject: Re: SQL-standard function body