Re: Autovacuum on partitioned table (autoanalyze) - Mailing list pgsql-hackers

From yuzuko
Subject Re: Autovacuum on partitioned table (autoanalyze)
Date
Msg-id CAKkQ50-gCcXOpop4A=9d3H31yiPkxrgf2FxpMtO=pK6oHe0WDA@mail.gmail.com
Whole thread Raw
In response to Re: Autovacuum on partitioned table (autoanalyze)  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: Autovacuum on partitioned table (autoanalyze)
List pgsql-hackers
Hi,

I fixed the patch according to the following comments.
Attach the latest patch.  It is based on v14 patch Alvaro attached before.

On Mon, Apr 5, 2021 at 4:08 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> On 4/3/21 9:42 PM, Alvaro Herrera wrote:
> > Thanks for the quick rework.  I like this design much better and I think
> > this is pretty close to committable.  Here's a rebased copy with some
> > small cleanups (most notably, avoid calling pgstat_propagate_changes
> > when the partition doesn't have a tabstat entry; also, free the lists
> > that are allocated in a couple of places).
> >
> > I didn't actually verify that it works.
> >
>
> Yeah, this approach seems much simpler, I think. That being said, I
> think there's a couple issues:
>
> 1) I still don't understand why inheritance and declarative partitioning
> are treated differently. Seems unnecessary nad surprising, but maybe
> there's a good reason?
>
As we discussed in this thread,  this patch should handle only declarative
partitioning for now.

>
> 2) pgstat_recv_tabstat
>
> Should it really reset changes_since_analyze_reported in both branches?
> AFAICS if the "found" branch does this
>
>     tabentry->changes_since_analyze_reported = 0;
>
> that means we lose the counter any time tabstats are received, no?
> That'd be wrong, because we'd propagate the changes repeatedly.
>
I changed the changes_since_analyze_reported counter not to reset.

>
> 3) pgstat_recv_analyze
>
> Shouldn't it propagate the counters before resetting them? I understand
> that for the just-analyzed relation we can't do better, but why not to
> propagate the counters to parents? (Not necessarily from this place in
> the stat collector, maybe the analyze process should do that.)
>
I realized that we should propagate the counters for manual ANALYZE too.
thanks to the examples you offered in another email.
I fixed that for manual ANALYZE.

>
> 4) pgstat_recv_reportedchanges
>
> I think this needs to be more careful when updating the value - the
> stats collector might have received other messages modifying those
> counters (including e.g. PGSTAT_MTYPE_ANALYZE with a reset), so maybe we
> can get into situation with
>
>   (changes_since_analyze_reported > changes_since_analyze)
>
> if we just blindly increment the value. I'd bet would lead to funny
> stuff. So maybe this should be careful to never exceed this?
>
pgstat_propagate_changes() calls pgstat_report_reportedchanges()
only if (changes_since_analyze_reported < changes_since_analyze).
So I think we won't get into the such situation
>   (changes_since_analyze_reported > changes_since_analyze)
but am I missing something?

> I also realized relation_needs_vacanalyze is not really doing what I
> suggested - it propagates the counts, but does so in the existing loop
> which checks which relations need vacuum/analyze.
>
> That means we may skip the parent table in the *current* round, because
> it'll see the old (not yet updated) counts. It's likely to be processed
> in the next autovacuum round, but that may actually not happen. The
> trouble is the reltuples for the parent is calculated using *current*
> child reltuples values, but we're comparing it to the *old* value of
> changes_since_analyze. So e.g. if enough rows were inserted into the
> partitions, it may still be below the analyze threshold.
>
Indeed, the partitioned table was not analyzed at the same timing as
its leaf partitions due to the delay of propagating counters.  According
to your proposal, I added a separate loop to propagate the counters
before collecting a list of relations to vacuum/analyze.

--
Best regards,
Yuzuko Hosoya
NTT Open Source Software Center

Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [PATCH] Improve treatment of page special and page header alignment during page init.
Next
From: Andrew Dunstan
Date:
Subject: Re: multi-install PostgresNode fails with older postgres versions