Re: pgsql: autovacuum: handle analyze for partitioned tables - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: pgsql: autovacuum: handle analyze for partitioned tables
Date
Msg-id f8b09ecf-a590-c22b-b948-3d7dba7d30c3@enterprisedb.com
Whole thread Raw
In response to Re: pgsql: autovacuum: handle analyze for partitioned tables  (Justin Pryzby <pryzby@telsasoft.com>)
List pgsql-hackers
On 4/10/21 12:29 AM, Justin Pryzby wrote:
> On Fri, Apr 09, 2021 at 06:16:59PM -0400, Alvaro Herrera wrote:
>> On 2021-Apr-09, Justin Pryzby wrote:
>>
>>> One data point: we do DETACH/ATTACH tables during normal operation, before
>>> type-promoting ALTERs, to avoid worst-case disk use, and to avoid locking the
>>> table for a long time.  It'd be undesirable (but maybe of no great consequence)
>>> to trigger an ALTER when we DETACH them, since we'll re-ATTACH it shortly
>>> afterwards.
>>
>> You mean to trigger an ANALYZE, not to trigger an ALTER, right?
> 
> Oops, right.  It's slightly undesirable for a DETACH to cause an ANALYZE.
> 
>> I think I agree with Tomas: we should do it by default, and offer some
>> way to turn that off.  I suppose a new reloptions, solely for
>> partitioned tables, would be the way to do it.
>>
>>> However, I think DROP should be handled ?
>>
>> DROP of a partition? ... I would think it should do the same as DETACH,
>> right?  Inform that however many rows the partition had, are now changed
>> in ancestors.
> 
> Yes, drop of an (attached) partition.  The case for DROP is clear, since it
> was clearly meant to go away forever.  The case for DETACH seems somewhat less
> clear.
> 
> The current behavior of pg_dump/restore (since 33a53130a) is to CREATE+ATTACH,
> so there's an argument that if DROPping the partition counts towards the
> parent's analyze, then so should CREATE+ATTACH.
> 

I think it's tricky to "optimize" the behavior after ATTACH/DETACH. I'd
argue that in principle, we should aim to keep accurate statistics,  so
ATTACH should be treated as insert of all rows, and DETACH should be
treated as delete of all rows. Se for the purpose of ANALYZE, we should
propagate reltuples as changes_since_analyze after ATTACH/DETACH.

Yes, it may result in more frequent ANALYZE on the parent, but I think
that's necessary. Repeated attach/detach of the same partition may bloat
the value, but I guess that's an example of "If it hurts don't do it."

What I think we might do is offer some light-weight analyze variant,
e.g. based on the merging of statistics (I've posted a PoC patch a
couple days ago.). That would make the ANALYZEs on parent much cheaper,
so those "unnecessary" analyzes would not be an issue.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Zhihong Yu
Date:
Subject: Re: Reference Leak with type
Next
From: Tom Lane
Date:
Subject: Re: Reference Leak with type