Re: Statistics Import and Export - Mailing list pgsql-hackers

From Matthias van de Meent
Subject Re: Statistics Import and Export
Date
Msg-id CAEze2WiX_YWEhe4iLSLHEkSJimE0=wU1cGhmPikurZGOTrh1tg@mail.gmail.com
Whole thread Raw
In response to Re: Statistics Import and Export  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Statistics Import and Export
Re: Statistics Import and Export
List pgsql-hackers
On Tue, 23 Apr 2024, 05:52 Tom Lane, <tgl@sss.pgh.pa.us> wrote:
> Jeff Davis <pgsql@j-davis.com> writes:
> > On Mon, 2024-04-22 at 16:19 -0400, Tom Lane wrote:
> >> Loading data without stats, and hoping
> >> that auto-analyze will catch up sooner not later, is exactly the
> >> current behavior that we're doing all this work to get out of.
>
> > That's the disconnect, I think. For me, the main reason I'm excited
> > about this work is as a way to solve the bad-plans-after-upgrade
> > problem and to repro planner issues outside of production. Avoiding the
> > need to ANALYZE at the end of a data load is also a nice convenience,
> > but not a primary driver (for me).
>
> Oh, I don't doubt that there are use-cases for dumping stats without
> data.  I'm just dubious about the reverse.  I think data+stats should
> be the default, even if only because pg_dump's default has always
> been to dump everything.  Then there should be a way to get stats
> only, and maybe a way to get data only.  Maybe this does argue for a
> four-section definition, despite the ensuing churn in the pg_dump API.

I've heard of use cases where dumping stats without data would help
with production database planner debugging on a non-prod system.

Sure, some planner inputs would have to be taken into account too, but
having an exact copy of production stats is at least a start and can
help build models and alerts for what'll happen when the tables grow
larger with the current stats.

As for other planner inputs: table size is relatively easy to shim
with sparse files; cumulative statistics can be copied from a donor
replica if needed, and btree indexes only really really need to
contain their highest and lowest values (and need their height set
correctly).

Kind regards,

Matthias van de Meent



pgsql-hackers by date:

Previous
From: Bertrand Drouvot
Date:
Subject: Re: Avoid orphaned objects dependencies, take 3
Next
From: Andres Freund
Date:
Subject: Re: gcc 12.1.0 warning