ANALYZE on partitioned tables vs on individual partitions - Mailing list pgsql-general

From Michael Harris
Subject ANALYZE on partitioned tables vs on individual partitions
Date
Msg-id CADofcAXVbD0yGp_EaC9chmzsOoSai3jcfBCnyva3j0RRdRvMVA@mail.gmail.com
Whole thread Raw
Responses Re: ANALYZE on partitioned tables vs on individual partitions
List pgsql-general
Hello Experts,

Our application has a database with a large number of partitioned tables used to store time series data.

It is partitioned by time: new data is populated into the current partition, old partitions eventually get dropped, and the partitions in between are largely static.

I had not realized until recently that the partitioned tables themselves are not analyzed by the autovacuum processes. I am now wondering if we should be manually analyzing those.

The problem is they are quite enormous and take a long time to analyze - I tested one yesterday and it took ~30mins, and we have hundreds of them. It might not be practical to regularly analyze them.

My questions are: 

  1. What is the consequence of not having good statistics on partitioned table level, if you do have good statistics on the partition level?

  2. Which planning operations rely on partitioned table level statistics?

My reason for asking is I'm trying to understand the tradeoff between the IO cost associated with frequently analyzing as opposed to possibly non-optimal plans being chosen.

Thanks in advance, and sorry if this has all been covered before. I could not find anything definitive in the docs, and while I did find a few references by googling, none really explained the answers to those questions too well.

Regards
Mike

pgsql-general by date:

Previous
From: Christophe Pettus
Date:
Subject: Re: data checksums
Next
From: Muhammad Ikram
Date:
Subject: Re: Streaming replication issue post upgrade from version 11 to 14 on windows 2016 Server