Thread: Non-use of index ?

Non-use of index ?

From
"Peter Galbavy"
Date:
We have two seemingly identical SQL statements:

SELECT user_name, status, count(status) AS total FROM messages
WHERE client_id = '89' AND user_name in ('U66220', 'U66221')
GROUP BY user_name, status;

and

SELECT user_name, status, count(status) AS total FROM messages
WHERE (client_id = '89' AND user_name = 'U66221')  OR (client_id = '89' AND user_name = 'U66220')
GROUP BY user_name, status;

The (relevant) index on that table is:

CREATE INDEX messages_200203_ix2 ON messages_200203 (client_id, user_name);

The first statement is *slow* by an order of magnitude in comparison to the
second. Using EXPLAIN says that only the second statement is using the
index...

Now I can intellectually understand why the index is only used for the
second statement, but shouldn't the optimiser be able to use the index to
match (client_id, user_name) against the index through each combination and
produce the second query by itself ?

Using 'SET ENABLE_SEQSCAN TO OFF' makes no difference.

As a side note, the original SQL from an Oracle database that is being
ported to PostgreSQL did not suffer this 'problem'.

Peter




Re: Non-use of index ?

From
Tom Lane
Date:
"Peter Galbavy" <peter.galbavy@knowtion.net> writes:
> We have two seemingly identical SQL statements:
> SELECT user_name, status, count(status) AS total FROM messages
> WHERE client_id = '89' AND user_name in ('U66220', 'U66221')
> GROUP BY user_name, status;
> and
> SELECT user_name, status, count(status) AS total FROM messages
> WHERE (client_id = '89' AND user_name = 'U66221')
>    OR (client_id = '89' AND user_name = 'U66220')
> GROUP BY user_name, status;

> The (relevant) index on that table is:

> CREATE INDEX messages_200203_ix2 ON messages_200203 (client_id, user_name);

> The first statement is *slow* by an order of magnitude in comparison to the
> second.

Presumably "client_id = '89'" alone is not very selective?

> Now I can intellectually understand why the index is only used for the
> second statement, but shouldn't the optimiser be able to use the index to
> match (client_id, user_name) against the index through each combination and
> produce the second query by itself ?

The Postgres planner does not explore that particular direction of
transformation of user qualifications --- it generally thinks that
conjunctive-normal-form quals are fine as-is, and it won't change them
to disjunctive form unless that makes the quals more compact.  Apparently
Oracle uses a different set of heuristics to decide what transformations
of the quals are worth generating and comparing to indexes.

There is a hack in 7.2 that would have discovered this optimization
if the index had been on (user_name, client_id) ... but as-is, the
possibility of using an OR index search is never considered.
        regards, tom lane