Re: Index only scan sometimes switches to sequential scan for small amount of rows - Mailing list pgsql-performance

From Tomas Vondra
Subject Re: Index only scan sometimes switches to sequential scan for small amount of rows
Date
Msg-id 5513FF23.8090603@2ndquadrant.com
Whole thread Raw
In response to Re: Index only scan sometimes switches to sequential scan for small amount of rows  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: Index only scan sometimes switches to sequential scan for small amount of rows  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-performance
On 26.3.2015 08:48, Jeff Janes wrote:
>
> OK, this is starting to look like a long-standing bug to me.
>
> If it only sees 3 distinct values, and all three are present at least
> twice, it throws all of them into the MCV list. But if one of those 3
> were present just once, then it tests them to see if they qualify.
> The test for inclusion is that it has to be present more than once,
> and that it must be "over-represented" by 25%.
>
> Lets say it sampled 30000 rows and found 29,900 of one value, 99 of
> another, and 1 of a third.
>
> But that turns into the second one needing to be present 12,500 times.
> The average value is present 10,000 times (30,000 samples with 3
> distinct values) and 25 more than that is 12,500.  So it excluded.
>
> It seems to me that a more reasonable criteria is that it must be
> over-represented 25% compared to the average of all the remaining values
> not yet accepted into the MCV list.  I.e. all the greater ones should be
> subtracted out before computing the over-representation threshold.

That might work IMO, but maybe we should increase the coefficient a bit
(say, from 1.25 to 2), not to produce needlessly long MCV lists.


> It is also grossly inconsistent with the other behavior.  If they are
> "29900; 98; 2" then all three go into the MCV.

Isn't the mincount still 12500? How could all three get into the MCV?


--
Tomas Vondra                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


pgsql-performance by date:

Previous
From: Dominique Vallée
Date:
Subject: query faster with a foreign table
Next
From: Jeff Janes
Date:
Subject: Re: Index only scan sometimes switches to sequential scan for small amount of rows