Re: Statistics Import and Export - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Statistics Import and Export
Date
Msg-id 3228677.1713844341@sss.pgh.pa.us
In response to Re: Statistics Import and Export  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: Statistics Import and Export
List pgsql-hackers
Jeff Davis <pgsql@j-davis.com> writes:
> On Mon, 2024-04-22 at 16:19 -0400, Tom Lane wrote:
>> Loading data without stats, and hoping
>> that auto-analyze will catch up sooner not later, is exactly the
>> current behavior that we're doing all this work to get out of.

> That's the disconnect, I think. For me, the main reason I'm excited
> about this work is as a way to solve the bad-plans-after-upgrade
> problem and to repro planner issues outside of production. Avoiding the
> need to ANALYZE at the end of a data load is also a nice convenience,
> but not a primary driver (for me).

Oh, I don't doubt that there are use-cases for dumping stats without
data.  I'm just dubious about the reverse.  I think data+stats should
be the default, even if only because pg_dump's default has always
been to dump everything.  Then there should be a way to get stats
only, and maybe a way to get data only.  Maybe this does argue for a
four-section definition, despite the ensuing churn in the pg_dump API.

> Should we just itemize some common use cases for pg_dump, and then
> choose the defaults that are least likely to cause surprise?

Per above, I don't find any difficulty in deciding what should be the
default.  What I think we need to consider is what the pg_dump and
pg_restore switch sets should be.  There are certainly a few different
ways we could present that; maybe we should sketch out the details for
a couple of ways.

            regards, tom lane


