Re: Statistics Import and Export - Mailing list pgsql-hackers
From | Corey Huinker |
---|---|
Subject | Re: Statistics Import and Export |
Date | |
Msg-id | CADkLM=dHMSV=0r-hssUAksYXgi5ugDpSC9xF9mV88h2x9fjCBw@mail.gmail.com Whole thread Raw |
In response to | Re: Statistics Import and Export (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Statistics Import and Export
|
List | pgsql-hackers |
On Sun, Mar 31, 2024 at 2:41 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Corey Huinker <corey.huinker@gmail.com> writes:
> Having given this some thought, I'd be inclined to create a view,
> pg_stats_missing, with the same security barrier as pg_stats, but looking
> for tables that lack stats on at least one column, or lack stats on an
> extended statistics object.
The week before feature freeze is no time to be designing something
like that, unless you've abandoned all hope of getting this into v17.
It was a response to the suggestion that there be some way for tools/automation to read the status of stats. I would view it as a separate patch, as such a view would be useful now for knowing which tables to ANALYZE, regardless of whether this patch goes in or not.
There's a bigger issue though: AFAICS this patch set does nothing
about dumping extended statistics. I surely don't want to hold up
the patch insisting that that has to happen before we can commit the
functionality proposed here. But we cannot rip out pg_upgrade's
support for post-upgrade ANALYZE processing before we do something
about extended statistics, and that means it's premature to be
designing any changes to how that works. So I'd set that whole
topic on the back burner.
So Extended Stats _were_ supported by earlier versions where the medium of communication was JSON. However, there were several problems with adapting that to the current model where we match params to stat types:
* Several of the column types do not have functional input functions, so we must construct the data structure internally and pass them to statext_store().
* The output functions for some of those column types have lists of attnums, with negative values representing positional expressions in the stat definition. This information is not translatable to another system without also passing along the attnum/attname mapping of the source system.
At least three people told me "nobody uses extended stats" and to just drop that from the initial version. Unhappy with this assessment, I inquired as to whether my employer (AWS) had some internal databases that used extended stats so that I could get good test data, and came up with nothing, nor did anyone know of customers who used the feature. So when the fourth person told me that nobody uses extended stats, and not to let a rarely-used feature get in the way of a feature that would benefit nearly 100% of users, I dropped it.
It's possible that we could drop the analyze-in-stages recommendation,
figuring that this functionality will get people to the
able-to-limp-along level immediately and that all that is needed is a
single mop-up ANALYZE pass. But I think we should leave that till we
have a bit more than zero field experience with this feature.
It may be that we leave the recommendation exactly as it is.
Perhaps we enhance the error messages in pg_set_*_stats() to indicate what command would remediate the issue.
pgsql-hackers by date: