Re: MCV lists for highly skewed distributions - Mailing list pgsql-hackers

From Dean Rasheed
Subject Re: MCV lists for highly skewed distributions
Date
Msg-id CAEZATCWTSwsV11Xc9fboSzWFoKi_Gz9MQh-P+weP105DL4E0HA@mail.gmail.com
In response to Re: MCV lists for highly skewed distributions  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: MCV lists for highly skewed distributions  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
On 17 March 2018 at 18:40, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> Currently, analyze_mcv_list only checks if the frequency of the current
> item is significantly higher than the non-MCV selectivity. My question
> is if it shouldn't also consider if removing the item from MCV would not
> increase the non-MCV selectivity too much.
>

Oh, I see what you're saying. In theory, each MCV item we remove is
not significantly more common than the non-MCV items at that point, so
removing it shouldn't significantly increase the non-MCV selectivity.
It's possible that the cumulative effect of removing multiple items
might start to add up, but I think it would necessarily be a slow
effect, and it would keep getting slower as more items are removed.
Isn't this equivalent to constructing a sequence of numbers in which
each number is a little greater than the average of all the preceding
numbers, which ends up virtually flat-lining?
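
To make the sequence analogy concrete, here is a small sketch. It is
only a toy model of the numbers described above, not the logic in
analyze_mcv_list. The function name, the starting value of 1.0 and the
10% margin are assumptions chosen purely for illustration.

```python
def slowly_rising_sequence(length, margin=0.1, first=1.0):
    """Build a sequence where each new term is (1 + margin) times the
    mean of all earlier terms (margin and first are illustrative)."""
    values = [first]
    total = first
    while len(values) < length:
        # The next term sits "a little" above the running average.
        nxt = (1.0 + margin) * total / len(values)
        values.append(nxt)
        total += nxt
    return values


seq = slowly_rising_sequence(1000)
# Step sizes between consecutive terms.
steps = [b - a for a, b in zip(seq, seq[1:])]

for n in (2, 10, 100, 1000):
    print(f"term {n}: value={seq[n - 1]:.4f}  step={steps[n - 2]:.6f}")
```

With this recurrence the step from one term to the next works out to
margin * term / n, so the terms keep rising but each step is smaller
than the last, which is the flat-lining behaviour I had in mind.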

Regards,
Dean

