Re: Autovacuum on partitioned table (autoanalyze) - Mailing list pgsql-hackers
From | Justin Pryzby |
---|---|
Subject | Re: Autovacuum on partitioned table (autoanalyze) |
Date | |
Msg-id | 20200416141910.GM2228@telsasoft.com Whole thread Raw |
In response to | Re: Autovacuum on partitioned table (autoanalyze) (yuzuko <yuzukohosoya@gmail.com>) |
Responses |
Re: Autovacuum on partitioned table (autoanalyze)
|
List | pgsql-hackers |
On Thu, Apr 16, 2020 at 06:16:45PM +0900, yuzuko wrote: > > I think it ought to be possible to configure this feature such that an > > auto-analyze on any child partition would trigger analyze of the parent. I > > think that would be important for maintaining accurate stats of the partition > > key column for many cases involving RANGE-partitioned tables, which are likely > > to rely on histogram rather than MCVs. > > I read your previous email and understand that it would be neccesary to analyze > partitioned tables automatically when any of its children are analyzed. In my > first patch, auto-analyze on partitioned tables worked like this but there were > some comments about performance of autovacuum, especially when partitioned > tables have a lot of children. I reread that part. There was also confusion between autovacuum vacuum and autovacuum analyze. I agree that it *might* be a problem to analyze the parent every time any child is analyzed. But it might also be what's needed for this feature to be useful. > The latest patch lets users set different autovacuum configuration for > each partitioned > tables like this, > create table p3(i int) partition by range(i) with > (autovacuum_analyze_scale_factor=0.0005, autovacuum_analyze_threshold=100); > so users can configure those parameters according to partitioning strategies > and other requirements. > > So I think this patch can solve problem you mentioned. I don't think that adequately allows what's needed. I think it out to be possible to get the "analyze parent whenever a child is analyzed" behavior easily, without having to compute new thershold parameters every time one adds partitions, detaches partitions, loades 10x more data into one of the partitions, load only 10% as much data into the latest partition, etc. For example, say a new customer has bunch of partitioned tables which each currently have only one partition (for the current month), and that's expected to grow to at least 20+ partitions (2+ years of history). How does one set the partitioned table's auto-analyze parameters to analyze whenever any child is analyzed ? I don't think it should be needed to update it every month after computing sum(child tuples). Possibly you could allow that behavior for some special values of the threshold. Like if autovacuum_analyze_threshold=-2, then analyze the parent whenever any of its children are analyzed. I think that use case and that need would be common, but I'd like to hear what others think. -- Justin
pgsql-hackers by date: