Re: Extended Statistics set/restore/clear functions. - Mailing list pgsql-hackers

From Corey Huinker
Subject Re: Extended Statistics set/restore/clear functions.
Date
Msg-id CADkLM=f4+3_HrXSAaEic5VGGd-fOjS4jGnr1WGJYrL7vf3ZJnw@mail.gmail.com
Whole thread Raw
In response to Re: Extended Statistics set/restore/clear functions.  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Extended Statistics set/restore/clear functions.
List pgsql-hackers

Yes, we are entering into some complexity territory here that needs a
very careful lookup, and I am not sure yet if your suggestion of
input arguments is the best thing to do.

With that in mind, I'm going to move forward with a local static version of statatt_get_type() which avoids the typecache stuff altogether. Whatever remains there will be instructive for how we keep/lose/modify the real statatt_get_type().
 

I have been thinking about how to move this patch forward, and I think
that we should split things a bit more the logic (aka my usual
divide-and-conquer strategy) for the restore function, because we
should make sure that each part is sound before merging, and that's
also a risk reduction if we need to revert one part or the other:
1) One patch for ndistinct.
2) One patch for dependencies.
3) One patch for MVC.

All three of these are possible, though the tests of trying to set one of the three types on a stat object incapable of holding it won't be add-able until the next type of stats are settlable.

Expressions would be a 4th patch, and honestly it's the lion's share of the code.

 
4) Dump/restore.

The "exprs" input argument is required by all the three others, meaning
that it needs to stand in the first bits of the patch.  The order of
ndistinct and dependencies does not matter, let's just do these based
on the order of the input arguments.  Dump/restore could be already
integrated even if we have only 1) and 2) restore integrated in the
tree.  It also seems like the "exprs" argument should stand on top of
all the others, as well, so it would make more sense to have it after
"inherited" in extended_stats_argnum?

I think I went with the order they appeared in pg_stats_ext, with exprs tacked on the end. I'm a bit confused why the argument order matters given that the only user exposure is the kwargs-like interface.
 

There are two reasons behind this set of thoughts:
1) The regression tests are bloated.  We rely currently on one single
extended stats object that holds all the types of stats (MCV,
dependencies, ndistinct), meaning that for some of the valid cases we
need to cary all the types in the input of restore function.  It's
also a bit true in terms of the invalid cases.  One thing that I would
recommend to do here is create two simpler extstat objects for each
stxkind (one with expression, one without expression).  You should
still require the "global" extstat object that includes all the
stxkind, which counts for the expression import path, of course.

We can definitely lean on the one-stat-only objects more now that we have them.

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Add WALRCV_CONNECTING state to walreceiver
Next
From: Michael Paquier
Date:
Subject: Re: Extended Statistics set/restore/clear functions.