Re: Extended Statistics set/restore/clear functions. - Mailing list pgsql-hackers

From Corey Huinker
Subject Re: Extended Statistics set/restore/clear functions.
Date
Msg-id CADkLM=c3JivzHNXLt-X_JicYknRYwLTiOCHOPiKagm2_vdrFUg@mail.gmail.com
Whole thread Raw
In response to Re: Extended Statistics set/restore/clear functions.  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Extended Statistics set/restore/clear functions.
Re: Extended Statistics set/restore/clear functions.
List pgsql-hackers
How about the multirange range case in extended_statistics_update()
for the mcv/expression path?

Added.
 
import_expressions() also complains about a statatt_set_slot() with
histograms.

Unsure what you mean here.
 
These ones are less consistent in style, the error hints should be the
errmsg, and there are some s/statitisics/statistics/.  The errhint
should be a full sentence, so I guess that you mean to switch both
fields in such cases.

The 0003 patch of this set addresses this.
 
s/the the/the/, I think this one's mine.  :D

0002
 

While the documentation shows one example with n_distinct,
dependencies and exprs, I'd guess that we should push forward with
something similar regarding most_common_val, most_common_val_nulls,
most_common_freqs, most_common_base_freqs.  It looks particularly
important to expand this part, the relationship they have between each
pther, their expected input format (?), and what they map to in the
catalogs.  If one is specified, all four of them are required, for
example, but that's not the only thing I imagine should keep a track
of.  This data is more complex than the single stats fields.

I think some of this may have been addressed in this patchset.

I've left these patches un-squashed to help differentiate what changes address which problems. They'll likely be re-squashed once those issues are addressed.

The most interesting here is the change I did to statatt_get_type() to get it to work for MCV/expression multiranges. Mostly, the problem arose from the fact that MCVlist will have multirange as a part of the exported tuple text array, so there is a legit need to parse the input of a multirange's text reprepresentation, whereas with attribute stats we only ever need to parse the input of the multirange's element, a plain old range. The fix, as you may see highlights how statatt_get_type() is doing two things (resolving typ* info for an attribute or expression, and fetching the basetype and basetype's ltopr and eqopr) and maybe the basetype/operator stuff should be moved to its own function.

Another possible solution is to change the two refactor the examine_attribute() functions (one in analyze.c, one in extended_stats.c) unifying them where possible, and allowing the unified function to still generate a VacAttrStats even if the attstattarget is 0. That's a heavier lift, and I left things as they are now to demonstrate the issue at hand. 

Rebased as well.

Attachment

pgsql-hackers by date:

Previous
From: Antonin Houska
Date:
Subject: Re: Race conditions in logical decoding
Next
From: tushar
Date:
Subject: Re: Non-text mode for pg_dumpall