Re: document the need to analyze partitioned tables - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: document the need to analyze partitioned tables
Date
Msg-id 20230118174919.GA9837@telsasoft.com
Whole thread Raw
In response to Re: document the need to analyze partitioned tables  (Laurenz Albe <laurenz.albe@cybertec.at>)
Responses Re: document the need to analyze partitioned tables
Re: document the need to analyze partitioned tables
List pgsql-hackers
On Wed, Jan 18, 2023 at 10:15:18AM +0100, Laurenz Albe wrote:
> On Tue, 2023-01-17 at 16:16 -0500, Bruce Momjian wrote:
> > On Tue, Jan 17, 2023 at 03:00:50PM -0600, Justin Pryzby wrote:
> > > Maybe (all?) the clarification the docs need is to say:
> > > "Partitioned tables are not *themselves* processed by autovacuum."
> > 
> > Yes, I think the lack of autovacuum needs to be specifically mentioned
> > since most people assume autovacuum handles _all_ statistics updating.

That's what 61fa6ca79 aimed to do.  Laurenz is suggesting further
clarification.

> > Can someone summarize how bad it is we have no statistics on partitioned
> > tables?  It sounds bad to me.
> 
> Andrey Lepikhov had an example earlier in this thread[1].  It doesn't take
> an exotic query. 
> 
> Attached is a new version of my patch that tries to improve the wording.

I tweaked this a bit to end up with:

> -    Partitioned tables are not processed by autovacuum.  Statistics
> -    should be collected by running a manual <command>ANALYZE</command> when it is
> +    The leaf partitions of a partitioned table are normal tables and are processed
> +    by autovacuum; however, autovacuum does not process the partitioned table itself.
> +    This is no problem as far as <command>VACUUM</command> is concerned, since
> +    there's no need to vacuum the empty, partitioned table.  But, as mentioned in
> +    <xref linkend="vacuum-for-statistics"/>, it also means that autovacuum won't
> +    run <command>ANALYZE</command> on the partitioned table.
> +    Although statistics are automatically gathered on its leaf partitions, some queries also need
> +    statistics on the partitioned table to run optimally.  You should collect statistics by
> +    running a manual <command>ANALYZE</command> when the partitioned table is
>      first populated, and again whenever the distribution of data in its
>      partitions changes significantly.
>     </para>

"partitions are normal tables" was techically wrong, as partitions can
also be partitioned.

-- 
Justin



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Removing redundant grouping columns
Next
From: Tomas Vondra
Date:
Subject: Re: Implement missing join selectivity estimation for range types