Re: Autovacuum on partitioned table (autoanalyze) - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Autovacuum on partitioned table (autoanalyze)
Date
Msg-id 7f27445b-95cf-24be-fe95-a6f66b7db1b9@enterprisedb.com
Whole thread Raw
In response to Re: Autovacuum on partitioned table (autoanalyze)  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers

On 4/4/21 9:08 PM, Tomas Vondra wrote:
> On 4/3/21 9:42 PM, Alvaro Herrera wrote:
>> Thanks for the quick rework.  I like this design much better and I think
>> this is pretty close to committable.  Here's a rebased copy with some
>> small cleanups (most notably, avoid calling pgstat_propagate_changes
>> when the partition doesn't have a tabstat entry; also, free the lists
>> that are allocated in a couple of places).
>>
>> I didn't actually verify that it works.
>>> ...
> 
> 3) pgstat_recv_analyze
> 
> Shouldn't it propagate the counters before resetting them? I understand
> that for the just-analyzed relation we can't do better, but why not to
> propagate the counters to parents? (Not necessarily from this place in
> the stat collector, maybe the analyze process should do that.)
> 

FWIW the scenario I had in mind is something like this:

create table t (a int, b int) partition by hash (a);
create table p0 partition of t for values with (modulus 2, remainder 0);
create table p1 partition of t for values with (modulus 2, remainder 1);

insert into t select i, i from generate_series(1,1000000) s(i);

select relname, n_mod_since_analyze from pg_stat_user_tables;

test=# select relname, n_mod_since_analyze from pg_stat_user_tables;
 relname | n_mod_since_analyze
---------+---------------------
 t       |                   0
 p0      |              499375
 p1      |              500625
(3 rows)

test=# analyze p0, p1;
ANALYZE
test=# select relname, n_mod_since_analyze from pg_stat_user_tables;
 relname | n_mod_since_analyze
---------+---------------------
 t       |                   0
 p0      |                   0
 p1      |                   0
(3 rows)

This may seem a bit silly - who would analyze the hash partitions
directly? However, with other partitioning schemes (list, range) it's
quite plausible that people load data directly into partition. They can
analyze the parent explicitly too, but with multi-level partitioning
that probably requires analyzing all the ancestors.

The other possible scenario is about rows inserted while p0/p1 are being
processed by autoanalyze. That may actually take quite a bit of time,
depending on vacuum cost limit. So I still think we should propagate the
delta after the analyze, before we reset the counters.


I also realized relation_needs_vacanalyze is not really doing what I
suggested - it propagates the counts, but does so in the existing loop
which checks which relations need vacuum/analyze.

That means we may skip the parent table in the *current* round, because
it'll see the old (not yet updated) counts. It's likely to be processed
in the next autovacuum round, but that may actually not happen. The
trouble is the reltuples for the parent is calculated using *current*
child reltuples values, but we're comparing it to the *old* value of
changes_since_analyze. So e.g. if enough rows were inserted into the
partitions, it may still be below the analyze threshold.

What I proposed is adding a separate loop that *only* propagates the
counts, and then re-read the current stats (perhaps only if we actually
propagated anything). And then decide which relations need analyze.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Mark Dilger
Date:
Subject: Re: pg_amcheck contrib application
Next
From: Michael Paquier
Date:
Subject: Re: Improve error matching patterns in the SSL tests