Re: Column changes such as an increase in varchar size, can cause extremely slow queries, and postgres should run analyze automatically. - Mailing list pgsql-bugs

From Noah Misch
Subject Re: Column changes such as an increase in varchar size, can cause extremely slow queries, and postgres should run analyze automatically.
Date
Msg-id 20250102052924.2e.nmisch@google.com
Whole thread Raw
In response to Re: Column changes such as an increase in varchar size, can cause extremely slow queries, and postgres should run analyze automatically.  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-bugs
On Fri, Nov 01, 2024 at 10:03:08PM +1300, David Rowley wrote:
> On Fri, 1 Nov 2024 at 19:32, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> > Would it be an option to clear pg_class.reltuples and pg_stat_all_tables.n_mod_since_analyze
> > whenever the statistics for a table are cleared?  Then autoanalyze would trigger after 50
> > modifications.
> 
> I wondered about that too. I also wondered which cases we could have
> ATExecAlterColumnType() not call RemoveStatistics(). If the table is
> going to be rewritten, then we need to, but there must be plenty of
> cases where we could forego removing the stats when there's no
> rewrite.

In the absence of a rewrite, stanullfrac and stawidth always remain valid.
The conditions for keeping other stats are like CheckIndexCompatible()
deciding to skip an index rebuild.  Specifically, I think keeping pg_statistic
entries would require (1) staop and stacoll haven't changed and (2) if staop
is polymorphic, the argument type hasn't changed.  Those conditions would also
suffice to keep non-expression pg_statistic_ext_data.

That may be enough.  If not, type-specific rules would identify more cases
where discarding stats has no value.  For example, if staop is "<" and the
column type is changing between timestamp and timestamptz, none of the "<"
comparisons will change.  Similarly, staop of "=" works the same for oid and
for int4.  Beyond those cases where discarding stats has no value, there may
be cases where the staop behaves differently before and after, but you could
argue that it's better to keep sometimes-incorrect stats than to remove them.
For example, <(oid,oid) and <(int4,int4) behave differently due to signed
vs. unsigned.  Still, the last histogram may be better than no histogram.

Keeping pg_statistic_ext_data for expression stats would require expression
introspection that CheckIndexCompatible() hasn't bothered with.  If we ever
implement such expression introspection, I expect it would cover both
CheckIndexCompatible() and pg_statistic_ext_data.



pgsql-bugs by date:

Previous
From: "David G. Johnston"
Date:
Subject: Re: BUG #18762: Update script produces distinct outputs before and after vacuum analyze operation
Next
From: PG Bug reporting form
Date:
Subject: BUG #18763: pg_get_viewdef returns an old def with new relation name