Hi Tomas,
Thank you for reviewing the patch.
> Firstly, the patch propagates the changes_since_analyze values from
> do_analyze_rel, i.e. from the worker after it analyzes the relation.
> That may easily lead to cases with unnecessary analyzes - consider a
> partitioned with 4 child relations:
> [ explanation ]
>
I hadn't realized that until now. Indeed, this approach increments the
parent's changes_since_analyze counter by the leaf partition's counter
each time the leaf partition is analyzed, so it causes unnecessary
ANALYZE runs on partitioned tables, as you described.
> I propose a different approach - instead of propagating the counts in
> do_analyze_rel for individual leaf tables, let's do that in bulk in
> relation_needs_vacanalyze. Before the (existing) first pass over
> pg_class, we can add a new one, propagating counts from leaf tables to
> parents.
>
Thank you for your suggestion. I think it could solve all the issues
you mentioned. I modified the patch based on this approach:
- Add a new counter, PgStat_Counter changes_since_analyze_reported,
to track how much of changes_since_analyze has already been propagated
to ancestors. It is used only for internal bookkeeping, and users
probably don't need to see it, so it is not exposed in the
pg_stat_all_tables view for now.
- Add a new function, pgstat_propagate_changes(), which propagates the
changes_since_analyze counter to all ancestors and saves
changes_since_analyze_reported. This function is called in
do_autovacuum() before relation_needs_vacanalyze().
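
To illustrate the bookkeeping I have in mind, here is a rough model in
Python (it is not the patch's C code). TabStat, the parent links, and
propagate_changes() are made-up names for this sketch, and I'm assuming
that only the part of changes_since_analyze not yet reported is added
to each ancestor. Resetting the counters on ANALYZE is left out.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class TabStat:
    """Toy stand-in for a per-table stats entry (hypothetical, not PgStat_StatTabEntry)."""
    changes_since_analyze: int = 0
    changes_since_analyze_reported: int = 0  # portion already pushed to ancestors
    parent: Optional[str] = None             # name of the parent table, if any


def propagate_changes(stats: Dict[str, TabStat], leaf: str) -> int:
    """Add the leaf's not-yet-reported changes to every ancestor.

    Returns the number of changes propagated by this call.
    """
    entry = stats[leaf]
    delta = entry.changes_since_analyze - entry.changes_since_analyze_reported
    if delta <= 0:
        return 0  # nothing new since the last propagation

    # Walk up the inheritance tree and credit each ancestor with the delta.
    parent = entry.parent
    while parent is not None:
        stats[parent].changes_since_analyze += delta
        parent = stats[parent].parent

    # Remember what has been reported so it is not counted twice.
    entry.changes_since_analyze_reported = entry.changes_since_analyze
    return delta
```

In this model, calling propagate_changes() twice in a row for the same
leaf adds nothing the second time, which is the property that should
avoid the repeated counting you pointed out.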
> Note: I do have some ideas about how to improve that, I've started a
> separate thread about it [1].
>
I'm also interested in merging children's statistics for partitioned
tables, because it would make ANALYZE on inheritance trees more
efficient. I'll take a look at that thread later.
> I forgot to mention one additional thing yesterday - I wonder if we need
> to do something similar after a partition is attached/detached. That can
> also change the parent's statistics significantly, so maybe we should
> handle all partition's rows as changes_since_analyze? Not necessarily
> something this patch has to handle, but might be related.
>
Regarding attached/detached partitions, I think we should update the
statistics of partitioned tables according to the new inheritance tree.
The latest patch doesn't handle this case yet, but I'll give it a try soon.
I've attached the v13 patch to this email. Could you please check it again?
--
Best regards,
Yuzuko Hosoya
NTT Open Source Software Center