Re: Multivariate MCV list vs. statistics target - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Multivariate MCV list vs. statistics target
Date
Msg-id 20190620221250.jk62m4j7kr77qkzg@development
In response to Re: Multivariate MCV list vs. statistics target  (Dean Rasheed <dean.a.rasheed@gmail.com>)
Responses Re: Multivariate MCV list vs. statistics target
List pgsql-hackers
On Thu, Jun 20, 2019 at 08:08:44AM +0100, Dean Rasheed wrote:
>On Tue, 18 Jun 2019 at 22:34, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>
>> One slightly inconvenient thing I realized while playing with the
>> address data set is that it's somewhat difficult to set the desired size
>> of the multi-column MCV list.
>>
>> At the moment, we simply use the maximum statistics target of the
>> attributes the MCV list is built on. But that does not allow keeping
>> the default size for per-column stats while increasing only the size
>> of multi-column MCV lists.
>>
>> So I'm thinking we should allow tweaking the statistics target for
>> extended stats, and serialize it in the pg_statistic_ext catalog. Any
>> opinions why that would be a bad idea?
>>
>
>Seems reasonable to me. This might not be the only option we'll ever
>want to add though, so perhaps a "stxoptions text[]" column along the
>lines of a relation's reloptions would be the way to go.
>

I don't know - I kinda dislike the idea of stashing stuff like this into
text[] arrays unless there's a clear need for such flexibility (i.e. a
vision of having more such options), and I'm not sure that's the case here.
And we kinda have a precedent in pg_attribute.attstattarget, so I'd use
the same approach here.
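
For reference, a minimal sketch of the per-column precedent. The table and
column names are made up for illustration; only the SET STATISTICS syntax and
the pg_attribute.attstattarget column are existing PostgreSQL features.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE addresses (city text, zip text);

-- Raise the per-column statistics target for one column. The value is
-- stored in pg_attribute.attstattarget.
ALTER TABLE addresses ALTER COLUMN city SET STATISTICS 1000;

-- Inspect the stored targets for the user columns of the table.
SELECT attname, attstattarget
  FROM pg_attribute
 WHERE attrelid = 'addresses'::regclass
   AND attnum > 0;
```

A dedicated catalog column for extended statistics would presumably be
inspected the same way, without having to parse a text[] array.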

>> I suppose it should be part of the CREATE STATISTICS command, but I'm
>> not sure what'd be the best syntax. We might also have something more
>> similar to ALTER COLUMN, but perhaps
>>
>>      ALTER STATISTICS s SET STATISTICS 1000;
>>
>> looks a bit too weird.
>>
>
>Yes it does look a bit weird, but that's the natural generalisation of
>what we have for per-column statistics, so it's probably preferable to
>do that rather than invent some other syntax that wouldn't be so
>consistent.
>

Yeah, I agree.
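
To make the proposal concrete, here is a rough sketch of how it might be used,
assuming the ALTER STATISTICS ... SET STATISTICS form discussed above is what
gets adopted. The table, column and statistics names are hypothetical, and the
ALTER STATISTICS line is the proposed syntax, not something that exists at the
time of this message.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE addresses (city text, zip text);

-- Multi-column MCV list on (city, zip).
CREATE STATISTICS s (mcv) ON city, zip FROM addresses;

-- Proposed syntax: raise the target for this statistics object only,
-- leaving the per-column targets at their defaults.
ALTER STATISTICS s SET STATISTICS 1000;

-- Rebuild the statistics with the new target.
ANALYZE addresses;
```

The point is that the per-column stats for city and zip keep their default
size, and only the multi-column MCV list grows.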

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



