Re: Autovacuum on partitioned table (autoanalyze) - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: Autovacuum on partitioned table (autoanalyze) |
Date | |
Msg-id | 7f27445b-95cf-24be-fe95-a6f66b7db1b9@enterprisedb.com Whole thread Raw |
In response to | Re: Autovacuum on partitioned table (autoanalyze) (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
List | pgsql-hackers |
On 4/4/21 9:08 PM, Tomas Vondra wrote: > On 4/3/21 9:42 PM, Alvaro Herrera wrote: >> Thanks for the quick rework. I like this design much better and I think >> this is pretty close to committable. Here's a rebased copy with some >> small cleanups (most notably, avoid calling pgstat_propagate_changes >> when the partition doesn't have a tabstat entry; also, free the lists >> that are allocated in a couple of places). >> >> I didn't actually verify that it works. >>> ... > > 3) pgstat_recv_analyze > > Shouldn't it propagate the counters before resetting them? I understand > that for the just-analyzed relation we can't do better, but why not to > propagate the counters to parents? (Not necessarily from this place in > the stat collector, maybe the analyze process should do that.) > FWIW the scenario I had in mind is something like this: create table t (a int, b int) partition by hash (a); create table p0 partition of t for values with (modulus 2, remainder 0); create table p1 partition of t for values with (modulus 2, remainder 1); insert into t select i, i from generate_series(1,1000000) s(i); select relname, n_mod_since_analyze from pg_stat_user_tables; test=# select relname, n_mod_since_analyze from pg_stat_user_tables; relname | n_mod_since_analyze ---------+--------------------- t | 0 p0 | 499375 p1 | 500625 (3 rows) test=# analyze p0, p1; ANALYZE test=# select relname, n_mod_since_analyze from pg_stat_user_tables; relname | n_mod_since_analyze ---------+--------------------- t | 0 p0 | 0 p1 | 0 (3 rows) This may seem a bit silly - who would analyze the hash partitions directly? However, with other partitioning schemes (list, range) it's quite plausible that people load data directly into partition. They can analyze the parent explicitly too, but with multi-level partitioning that probably requires analyzing all the ancestors. The other possible scenario is about rows inserted while p0/p1 are being processed by autoanalyze. That may actually take quite a bit of time, depending on vacuum cost limit. So I still think we should propagate the delta after the analyze, before we reset the counters. I also realized relation_needs_vacanalyze is not really doing what I suggested - it propagates the counts, but does so in the existing loop which checks which relations need vacuum/analyze. That means we may skip the parent table in the *current* round, because it'll see the old (not yet updated) counts. It's likely to be processed in the next autovacuum round, but that may actually not happen. The trouble is the reltuples for the parent is calculated using *current* child reltuples values, but we're comparing it to the *old* value of changes_since_analyze. So e.g. if enough rows were inserted into the partitions, it may still be below the analyze threshold. What I proposed is adding a separate loop that *only* propagates the counts, and then re-read the current stats (perhaps only if we actually propagated anything). And then decide which relations need analyze. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: