Re: document the need to analyze partitioned tables - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: document the need to analyze partitioned tables
Date
Msg-id 20230112232747.GA2111950@nathanxps13
Whole thread Raw
In response to Re: document the need to analyze partitioned tables  (Laurenz Albe <laurenz.albe@cybertec.at>)
Responses Re: document the need to analyze partitioned tables
List pgsql-hackers
On Wed, Oct 05, 2022 at 10:37:01AM +0200, Laurenz Albe wrote:
> On Mon, 2022-03-28 at 15:05 +0200, Tomas Vondra wrote:
>> I've pushed the last version, and backpatched it to 10 (not sure I'd
>> call it a bugfix, but I certainly agree with Justin it's worth
>> mentioning in the docs, even on older branches).
> 
> I'd like to suggest an improvement to this.  The current wording could
> be read to mean that dead tuples won't get cleaned up in partitioned tables.

Well, dead tuples won't get cleaned up in partitioned tables, as
partitioned tables do not have storage.  But I see what you mean.  Readers
might misinterpret this to mean that autovacuum will not process the
partitions.  There's a good definition of what the docs mean by
"partitioned table" [0], but FWIW it took me some time before I
consistently read "partitioned table" to mean "only the thing with relkind
set to 'p'" and not "both the partitioned table and its partitions."  So,
while the current wording it technically correct, I think it'd be
reasonable to expand it to help avoid confusion.

Here is my take on the wording:

    Since all the data for a partitioned table is stored in its partitions,
    autovacuum does not process partitioned tables.  Instead, autovacuum
    processes the individual partitions that are regular tables.  This
    means that autovacuum only gathers statistics for the regular tables
    that serve as partitions and not for the partitioned tables.  Since
    queries may rely on a partitioned table's statistics, you should
    collect statistics via the ANALYZE command when it is first populated,
    and again whenever the distribution of data in its partitions changes
    significantly.

[0] https://www.postgresql.org/docs/devel/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Jacob Champion
Date:
Subject: Re: [EXTERNAL] Re: Support load balancing in libpq
Next
From: Michael Paquier
Date:
Subject: Re: Beautify pg_walinspect docs a bit