Re: Index only scan sometimes switches to sequential scan for small amount of rows - Mailing list pgsql-performance

From Feike Steenbergen
Subject Re: Index only scan sometimes switches to sequential scan for small amount of rows
Date
Msg-id CAK_s-G0BKR7ceVd4k5qL3_bGQ=Obwq+xi0ZpdtoHFaZ5K-EHiA@mail.gmail.com
Whole thread Raw
In response to Re: Index only scan sometimes switches to sequential scan for small amount of rows  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: Index only scan sometimes switches to sequential scan for small amount of rows
List pgsql-performance
I'm posting this as I am trying to understand what has happened.
TLDR: The problem seems to be fixed now.

By bumping the statistics_target we see that most_common_vals is
having its contents filled more often, causing way better estimates:

 attname                | status
 inherited              | f
 null_frac              | 0
 avg_width              | 4
 n_distinct             | 3
 most_common_vals       | {PRINTED,PREPARED,ERROR}
 most_common_freqs      | {0.996863,0.00307333,6.33333e-05}
 histogram_bounds       | (null)
 correlation            | 0.98207
 most_common_elems      | (null)
 most_common_elem_freqs | (null)
 elem_count_histogram   | (null)

Basically 100% of the records are accounted for in these statistics,
the planner now consistently estimates the number of rows to be very
small for other values.

Before bumping the target we didn't have information for 0.34% of the
rows, which in this case means roughly 11K rows.

What is the reasoning behind having at least 2 hits before including
it in the most_common_* columns?


pgsql-performance by date:

Previous
From: Feike Steenbergen
Date:
Subject: Re: Index only scan sometimes switches to sequential scan for small amount of rows
Next
From: Tom Lane
Date:
Subject: Re: Index only scan sometimes switches to sequential scan for small amount of rows