Re: Autovacuum on partitioned table (autoanalyze) - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: Autovacuum on partitioned table (autoanalyze)
Date
Msg-id 20200425141320.GR28974@telsasoft.com
Whole thread Raw
In response to Re: Autovacuum on partitioned table (autoanalyze)  (Justin Pryzby <pryzby@telsasoft.com>)
Responses Re: Autovacuum on partitioned table (autoanalyze)  (Amit Langote <amitlangote09@gmail.com>)
List pgsql-hackers
On Wed, Mar 18, 2020 at 11:30:39AM -0500, Justin Pryzby wrote:
> In the past, I think there's was talk that maybe someone would invent a clever
> way to dynamically combine all the partitions' statistics, so analyzing the
> parent wasn't needed. [...]

I happened across the thread I was referring to:
https://www.postgresql.org/message-id/7363.1426537103@sss.pgh.pa.us

I'm not opposed to doing things the currently-proposed way (trigger analyze of
partitioned tables based on updates, same as nonpartitioned tables), but we
should think if it's worth doing something totally different, like what Tom
proposed.

Robert had concerns that it would increase planning time.  I imagine that
argument is even stronger now, since PG12 has *less* planning time for large
heirarchies (428b260f8) and advertizes support for "thousands" of partitions.

Tom said:
> we would automatically get statistics that account for
> partitions being eliminated by constraint exclusion, because only the
> non-eliminated partitions are present in the appendrel. And second,

That's a pretty strong benefit.  I don't know if there's a good way to support
both(either) ways of doing things.  Like maybe a reloption that allows
triggering autovacuum on partitioned tables, but if no statistics exist on a
partitioned table, then the planner would dynamically determine the selectivity
by decending into child statistics (Tom's way).  I think the usual way this
would play out is that someone with a small partition heirarchies would
eventually complain about high planning time and then we'd suggest implementing
a manual ANALYZE job.

I'm not sure it's good to support two ways anyway, since 1) I think that gives
different (better) statistics Tom's way (due to excluding stats of excluded
partitions); 2) there's not a good way to put an ANALYZE job in place and then
get rid of parent stats (have to DELETE FROM pg_statistic WHERE
starelid='...'::regclass; 3) if someone implements an ANALYZE job, but they
disable it or it stops working then they have outdated stats forever; 

-- 
Justin



pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays
Next
From: Tom Lane
Date:
Subject: Re: Anybody want to check for Windows timezone updates?