Re: Extended Statistics set/restore/clear functions. - Mailing list pgsql-hackers
| From | Corey Huinker |
|---|---|
| Subject | Re: Extended Statistics set/restore/clear functions. |
| Date | |
| Msg-id | CADkLM=ebOoi6xjLBmVtd+3Y=d4VxdBbpFY5_T2EobM58Ry=70g@mail.gmail.com Whole thread Raw |
| In response to | Re: Extended Statistics set/restore/clear functions. (Michael Paquier <michael@paquier.xyz>) |
| Responses |
Re: Extended Statistics set/restore/clear functions.
|
| List | pgsql-hackers |
- v34 did not include a bunch of the fixes I've done in v33, like
error messages tweaks, typo fixes, comment adjustments, and some test
additions. So I had to come back to all that.
I'm sorry. I misinterpreted your follow-on inquiry about the need for the mcv-nulls parameter to mean that you were walking back the need for the v33 patch.
Finally comes the final part of the proposal, as of v34-0002 for the
restore of expression stats. I have been wondering for a couple of
days if the proposed interface by relying on
pg_restore_extended_stats() combined with scans of pg_stats_ext_exprs
is the best thing we can do, and if a different approach with a
different function would be better. I don't have a clean answer to
this question yet, unfortunately.
I agree that the method of serializing the stats columns from pg_stats_ext_exprs into a text[] is a bit opaque.
However, pg_stats_ext_exprs is the only means we have of fetching this data, as it provides the security barrier that pg_statistic_ext_data does not. Prior to my awareness of the security requirement, I had assumed that I could just serialize stxdexpr to text and then unpack it with deconstruct_array, but even that ran into problems with the stavalues ANYARRAY at the end, wherein the type needed for the each row of the stxdexpr array is different based on the object definition...so that wasn't going anywhere even without the security requirement.
I could envision serializing pg_stats_ext_exprs to a kwargs-ish array {"null_frac","0.4","avg_width","4","most_common_vals",...} which would at least remove ambiguity about which string was meant for a given type of stat unpacking, at a cost of some additional overhead. For that matter we could collect then into a 1-level json of param_name:textified_value pairs, but json inputs were not well received before.
If we went to a new function pg_restore_extended_expr_stats() which specified all the fields from pg_stats_ext_exprs plus an array index value, we'd still have some of the ANYARRAY issues necessitating casting the mcv, histogram, and mcelems arrays to a scalar text value, but we'd also have to address issues of the stxdexpr array elements being restored out of order or missing a leading element.
All of that is a long way of explaining how the fairly opaque array of text-casted values won out.
What I am aware of though is that my schedule clock is unfortunately
ticking, meaning that I am running out of time to do more here as I
also need to focus on other things for this release.
Your time is a precious thing, and I thank you for all you've given of it so far.
We have achieved
a lot already in terms of restore support for MCV, ndistinct and
dependencies, and I am curious to see how this will work out during
the beta cycle of v19 without the complexities related to expressions,
so it may not be a bad thing to cool down and see how the situation
evolves and if what's already been done requires any adjustments
If the arc of implementing attribute stats is any guide, we're likely to get some feedback from the buildfarm in the coming days. I'll be watching it.
pgsql-hackers by date: