Home > mailing lists

Re: [SQL] optimizer not using an index... - Mailing list pgsql-sql

From	Tom Lane
Subject	Re: [SQL] optimizer not using an index...
Date	August 27, 1999 10:55:39
Msg-id	6737.935765619@sss.pgh.pa.us Whole thread Raw
In response to	optimizer not using an index... (Howie <caffeine@toodarkpark.org>)
Responses	Re: [SQL] optimizer not using an index...
List	pgsql-sql

Tree view

Howie <caffeine@toodarkpark.org> writes:
> explain reveals that postgres ( 6.5.0 ) isnt using some of my indexes,
> opting instead for a complete table scan ( and drastically slowing things
> down ).

Well, mumble.  The optimizer certainly needs work, but what's your
evidence that this query would be faster if done another way?  Hashing
the smaller tables, as it's doing, ought to be a pretty good strategy.

One way to check is to start your client with environment variable
setting PGOPTIONS="-fh" ("forbid hashjoin") to discourage the optimizer
from using hashes, then check the generated plan for the same query and
see what its actual runtime is.  That's likely to be a suboptimal plan
however, since it'll turn off *all* hashing.  The hash on customers is
probably the thing that's bothering you.

How many result rows do you actually get from this query?  If you
eliminate the customers table from the query, and just do the same join
among the remaining tables, how many rows do you get?  I suspect the
optimizer is drastically off in its estimate of ~300k result rows, and
that's contributing to the problem.

> doing a complete table scan on a column thats indexed isnt really that
> nice, especially since there are 12,000 entries in it.

But it's estimating it's going to have to probe that table 300k times,
which makes the hashjoin look mighty attractive...

> interestingly, when querying on "pincodes.code" instead of
> "customers.name", postgres does NOT use a full table scan; it uses the
> proper indexes:

> Hash Join  (cost=23.78 rows=5 width=222)

Note the drastic difference in the estimated result-row count; that's
undoubtedly what's changing the optimizer's choice of what to do.  You
haven't given us enough detail to understand why this query would be
(or at least seem) more selective than the other, however.

Anyway, this looks to me like it is probably a symptom of poor
selectivity estimation leading to bogus estimates of output row counts
leading to a nonoptimal plan choice.  I have been working on improving
the selectivity estimation for 6.6, and am looking for test cases to
check out the logic on.  Is your database small enough/non proprietary
enough that you could send me a dump?  Or could you strip it down to
a test case that still exhibits the same misbehavior?  If you don't
like either of those, perhaps you could grab a current snapshot, install
your data in a test postmaster, and report back on whether it acts any
different...
        regards, tom lane

pgsql-sql by date:

From: Tom Lane
Date: 27 August 1999, 10:27:36
Subject: Re: [SQL] Inheritance and DELETE

From: Howie
Date: 27 August 1999, 12:05:37
Subject: Re: [SQL] optimizer not using an index...

Re: [SQL] optimizer not using an index... - Mailing list pgsql-sql

Previous

Next