Re: [HACKERS] Index Puzzle for you - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: [HACKERS] Index Puzzle for you
Date
Msg-id 199912291012.FAA24890@candle.pha.pa.us
Whole thread Raw
In response to Re: [HACKERS] Index Puzzle for you  (Kristofer Munn <kmunn@munn.com>)
List pgsql-hackers
> Tom Lane wrote:
> > The thing that jumps out at me from this example is the much larger
> > estimate of returned rows in the second case.  The planner is clearly
> 
> Good catch!  There were 296 possible issues the table.  One had 86,544
> articles associated with it.  The next highest was 5,949.  Then the
> numbers drop to 630, 506, 412, 184 and then the rest are all under 62.
> Out of curiosity, how does vacuum decide on the large estimate?
> 
> The maximum is 86,544.
> The average row return for ixissue = x is 3412.
> The median is 25.
> The mode is 25.
> 
> ixissue is the result of a sequence.
> 
> Thanks for the heads up on this...

Here is the relevent comment from vacuum.c.  It is not perfect, but was
the best thing I could think of.

---------------------------------------------------------------------------

/**  vc_attrstats() -- compute column statistics used by the optimzer**  We compute the column min, max, null and
non-nullcounts.*  Plus we attempt to find the count of the value that occurs most*  frequently in each column.  These
figuresare used to compute *  the selectivity of the column.**  We use a three-bucked cache to get the most frequent
item.* The 'guess' buckets count hits.  A cache miss causes guess1*  to get the most hit 'guess' item in the most
recentcycle, and*  the new item goes into guess2.  Whenever the total count of hits*  of a 'guess' entry is larger than
'best','guess' becomes 'best'.**  This method works perfectly for columns with unique values, and columns*  with only
twounique values, plus nulls.**  It becomes less perfect as the number of unique values increases and*  their
distributionin the table becomes more random.**/
 

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


pgsql-hackers by date:

Previous
From: Kristofer Munn
Date:
Subject: Re: [HACKERS] Index Puzzle for you
Next
From: Adriaan Joubert
Date:
Subject: Re: [HACKERS] Index corruption