Re: Multivariate MCV list vs. statistics target - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Multivariate MCV list vs. statistics target
Msg-id 20190629104121.xw4qmpzlhy4e2r4g@development
In response to Re: Multivariate MCV list vs. statistics target  (Dean Rasheed <dean.a.rasheed@gmail.com>)
List pgsql-hackers
On Fri, Jun 21, 2019 at 08:09:18AM +0100, Dean Rasheed wrote:
>On Thu, 20 Jun 2019 at 23:12, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>
>> On Thu, Jun 20, 2019 at 08:08:44AM +0100, Dean Rasheed wrote:
>> >On Tue, 18 Jun 2019 at 22:34, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>> >>
>> >> So I'm thinking we should allow tweaking the statistics for extended
>> >> stats, and serialize it in the pg_statistic_ext catalog. Any opinions
>> >> why that would be a bad idea?
>> >
>> >Seems reasonable to me. This might not be the only option we'll ever
>> >want to add though, so perhaps a "stxoptions text[]" column along the
>> >lines of a relation's reloptions would be the way to go.
>>
>> I don't know - I kinda dislike the idea of stashing stuff like this into
>> text[] arrays unless there's a clear need for such flexibility (i.e.
>> vision to have more such options). Which I'm not sure is the case here.
>> And we kinda have a precedent in pg_attribute.attstattarget, so I'd use
>> the same approach here.
>>
>
>Hmm, maybe. I can certainly understand your dislike of using text[].
>I'm not sure that we can confidently say that multivariate statistics
>won't ever need additional tuning knobs, but I have no idea at the
>moment what they might be, and nothing else has come up so far in all
>the time spent considering MCV lists and histograms, so maybe this is
>OK.
>

OK, attached is a patch implementing this - it adds

    ALTER STATISTICS ... SET STATISTICS ...

which modifies a new stxstattarget column in the pg_statistic_ext catalog,
following the same logic as pg_attribute.attstattarget.

During analyze, the per-ext-statistic value is determined like this:

1) When pg_statistic_ext.stxstattarget != (-1), we just use this value
and we're done.

2) Otherwise we inspect per-column attstattarget values, and use the
largest value. This is what we do now, so it's backwards-compatible
behavior.

3) If the value is still (-1), we use default_statistics_target.
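
To make the three steps above concrete, here is a minimal standalone sketch of the lookup order. It is not the code from the patch: the function name, the STAT_TARGET_UNSET macro and the plain-array interface are all hypothetical, chosen only to illustrate the precedence (per-statistic value, then largest per-column value, then default_statistics_target).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical name for the "not set" sentinel; the catalogs store it as -1. */
#define STAT_TARGET_UNSET (-1)

/*
 * Illustrative only: resolve the statistics target for one extended
 * statistics object.
 *
 *   stxstattarget   value stored for the extended statistic itself
 *   attstattargets  attstattarget of each column the statistic covers
 *   natts           number of entries in attstattargets
 *   default_target  current value of default_statistics_target
 */
int
resolve_ext_stat_target(int stxstattarget,
                        const int *attstattargets, size_t natts,
                        int default_target)
{
    int     target = STAT_TARGET_UNSET;
    size_t  i;

    assert(attstattargets != NULL || natts == 0);

    /* 1) An explicit per-statistic value wins outright. */
    if (stxstattarget != STAT_TARGET_UNSET)
        return stxstattarget;

    /* 2) Otherwise take the largest per-column value, if any is set. */
    for (i = 0; i < natts; i++)
    {
        if (attstattargets[i] > target)
            target = attstattargets[i];
    }

    /* 3) Nothing set anywhere: fall back to the global default. */
    if (target == STAT_TARGET_UNSET)
        target = default_target;

    return target;
}
```

For example, with per-column targets {-1, 500, 200} and a default of 100, an unset per-statistic value resolves to 500, while an explicit per-statistic value of 1000 resolves to 1000 regardless of the columns.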



regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
