Re: Yet another "Why won't PostgreSQL use my index?" - Mailing list pgsql-general

From Manfred Koizar
Subject Re: Yet another "Why won't PostgreSQL use my index?"
Date
Msg-id tqa4hu04gbuus6ubhse68kbsgn87fu7n1p@4ax.com
Whole thread Raw
In response to Re: Yet another "Why won't PostgreSQL use my index?"  ("Gregory Wood" <gregw@com-stock.com>)
Responses Re: Yet another "Why won't PostgreSQL use my index?"  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
On Thu, 20 Jun 2002 15:00:09 -0400, "Gregory Wood"
<gregw@com-stock.com> wrote:
>I guess the best way to approach that particular tuning problem is to find a
>query where the estimated row numbers is close to the actual page numbers
>and then try different values until the random page reads start to become
>slower than the sequential scan. Fun fun.
>
>Of course if PostgreSQL were estimating the number of rows correctly, that
>would be less of a problem.

Yes, less of a problem.  But it wouldn't make the problem go away.
About two weeks ago I had that kind of fun you mentioned in a
situation where estimated numbers of rows were +/- 5% close to
reality.  However, if the planner believes that one random page read
is necessary for each tuple, but in reality 50 or more tuples can be
fetched with one page read, then the planner is off by a factor 50 or
more in favour of seq scans.

If you know that there is a tendency for your data to be physically
ordered by index value, you can put in a counterweight in favour of
index scans by lowering random_page_cost.  Of course this won't work,
if you have multiple indices implying very different sort orders.

I thought that the planner had a notion of "clustering", but I cannot
recall where I got this idea from.  Must have read something on the
hackers list ...  Sorry, I cannot provide any details.  Most probably
Tom Lane can ...

>Seems that our data is throwing off the
>statistics... we have some values that appear tens of thousands of times and
>others that appear only a few times, with a few values (such as the example
>I sent) in between. Perhaps it's time to look at TABLE SET STATISTICS...

At least it can't hurt :-)

Servus
 Manfred

pgsql-general by date:

Previous
From: Fran Fabrizio
Date:
Subject: Re: selecting all records where a column is null
Next
From: Manfred Koizar
Date:
Subject: Re: selecting all records where a column is null