Re: Merging statistics from children instead of re-sampling everything - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Merging statistics from children instead of re-sampling everything
Date
Msg-id 0c3db54d-c06d-7fef-ad47-8aa4bd8485b2@enterprisedb.com
Whole thread Raw
In response to Re: Merging statistics from children instead of re-sampling everything  (Justin Pryzby <pryzby@telsasoft.com>)
Responses Re: Merging statistics from children instead of re-sampling everything  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers

On 3/29/21 8:36 PM, Justin Pryzby wrote:
> Thanks for taking a fresh look at this.
> 
> As you've written it, this can apply to either/both partitioned or inheritence.
> I imagine when "MERGE" goes away, this should apply only to partitioned tables.
> (Actually, personally I would advocate to consider applying it to *both*, but I
> don't think that's been the tendency over the last 4 years.  I wrote here about
> some arguably-gratuitous differences between inheritence and partitioning.
> https://www.postgresql.org/message-id/20180601221428.GU5164@telsasoft.com)
> 

Haven't thought about that too much at this point, but I don't see any
reason to not apply it only to both cases. I might be missing something,
but the fact that with declarative partitioning we analyze the children,
while with inheritance we don't, seems a bit strange. Not sure.

>> +     if (*mcv_items > default_statistics_target)
>> +             n = default_statistics_target;
> 
> It should use any non-default stats target of the parent's column
> 

Yeah. That's a simplification, the non-PoC code would have to look at
per-column statistics target for the target / all the children, and make
some calculation based on that. I ignored that in the PoC.

>> +         * ignore anything but valid leaf relatiins with data, but release
> 
> sp: relatiins.
> 
>> +                elog(WARNING, "stats for %d %d not found",
>> +                                        RelationGetRelid(rels[j]), vacattrstats[i]->attr->attnum);
> 
> should be %u %d
> 
> This code duplication is undesirable:
>> +    /* Log the action if appropriate */
>> +     * Determine which columns to analyze
Yeah. Fine for PoC, but needs to be cleaned up in future patch.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Vik Fearing
Date:
Subject: Re: Idea: Avoid JOINs by using path expressions to follow FKs
Next
From: Isaac Morland
Date:
Subject: Add missing function abs (interval)