Thread: PostgreSQL index usage discussion.

PostgreSQL index usage discussion.

From
mlw
Date:
We have had several threads about index usage, specifically when PostgreSQL has
the choice of using one or not.

There seems to be a few points of view:

(1) The planner and statistics need to improve, so that erroneously using an
index (or not) happens less frequently or not at all.

(2) Use programmatic hints which allow coders specify which indexes are used
during a query. (ala Oracle)

(3) It is pretty much OK as-is, just use enable_seqscan=false in the query.

My point of view is about this subject is one from personal experience. I had a
database on which PostgreSQL would always (erroneously) choose not to use an
index. Are my experiences typical? Probably not, but are experiences like it
very common? I don't know, but we see a number "Why won't PostgreSQL use my
index" messages to at least conclude that it happens every now and then. In my
experience, when it happens, it is very frustrating.

I think statement (1) is a good idea, but I think it is optimistic to expect
that a statistical analysis of a table will contain enough information for all
possible cases.

Statement (2) would allow the flexibility needed, but as was pointed out, the
hints may become wrong over time as characteristics of the various change.

Statement (3) is not good enough because disabling sequential scans affect
whole queries and sub-queries which would correctly not use an index would be
forced to do so.

My personal preference is that some more specific mechanism than enable_seqscan
be provided for the DBA to assure an index is used. Working on the statistics
and the planner is fine, but I suspect there will always be a strong argument
for manual override in the exceptional cases where it will be needed.

What do you all think? What would be a good plan of attack?


Re: PostgreSQL index usage discussion.

From
Bradley McLean
Date:
* mlw (markw@mohawksoft.com) [020424 18:51]:
> 
> (2) Use programmatic hints which allow coders specify which indexes are used
> during a query. (ala Oracle)

We would certainly use this if it were available.  Hopefully not to
shoot ourselves in the foot, but for the rather common case of having
a small set of important predefined queries that run over data sets
that neither grow significantly nor change in characteristics (for
example, a table of airline routes and fares, with a few million
rows).

We want to squeeze every last bit of performance out of certain
queries, and we're willing to spend the time to verify that the
manual tuning beats the planner.

> What do you all think? What would be a good plan of attack?

I dunno.  If someone comes up with one that I can reasonably
contribute to, I will.

-Brad


Re: PostgreSQL index usage discussion.

From
Hannu Krosing
Date:
On Thu, 2002-04-25 at 00:46, mlw wrote:
> We have had several threads about index usage, specifically when PostgreSQL has
> the choice of using one or not.
> 
> There seems to be a few points of view:
> 
> (1) The planner and statistics need to improve, so that erroneously using an
> index (or not) happens less frequently or not at all.
> 
> (2) Use programmatic hints which allow coders specify which indexes are used
> during a query. (ala Oracle)
> 
> (3) It is pretty much OK as-is, just use enable_seqscan=false in the query.
> 
> My point of view is about this subject is one from personal experience. I had a
> database on which PostgreSQL would always (erroneously) choose not to use an
> index. Are my experiences typical? Probably not, but are experiences like it
> very common?

I have currently 2 databases that run with enable_seqscan=false to avoid
choosing plans that take forever.

> I don't know, but we see a number "Why won't PostgreSQL use my
> index" messages to at least conclude that it happens every now and then. In my
> experience, when it happens, it is very frustrating.
> 
> I think statement (1) is a good idea, but I think it is optimistic to expect
> that a statistical analysis of a table will contain enough information for all
> possible cases.

Perhaps we can come up with some special rules to avoid grossly pessimal
plans.

--------------------
Hannu




Re: PostgreSQL index usage discussion.

From
"Luis Alberto Amigo Navarro"
Date:
>
> (2) Use programmatic hints which allow coders specify which indexes are
used
> during a query. (ala Oracle)
>
As I said before it would be useful a way to improve(not force) using
indexes on particular queries, i.e. lowering the cost of using this index on
this query.
Regards



Re: PostgreSQL index usage discussion.

From
Hannu Krosing
Date:
On Thu, 2002-04-25 at 08:42, Luis Alberto Amigo Navarro wrote:
> >
> > (2) Use programmatic hints which allow coders specify which indexes are
> used
> > during a query. (ala Oracle)
> >
> As I said before it would be useful a way to improve(not force) using
> indexes on particular queries, i.e. lowering the cost of using this index on
> this query.
> Regards

I was told that DB2 has per-table (or rather per-tablespace) knowledge
of disk speeds, so keeping separate random and seqsqan costs for each 
table and index could be a good way here (to force use of a particular
index make its use cheap)

------------
Hannu





Re: PostgreSQL index usage discussion.

From
"Luis Alberto Amigo Navarro"
Date:
> I was told that DB2 has per-table (or rather per-tablespace) knowledge
> of disk speeds, so keeping separate random and seqsqan costs for each
> table and index could be a good way here (to force use of a particular
> index make its use cheap)
>

I was wondering something even easier, keeping 1 cost per index, 1 cost per
seqscan, but being allowed to scale cost for each index on each
query(recommended, null or unrecommended)
Regards