Tom Lane a écrit :
> Benoit Delbosc <bdelbosc@nuxeo.com> writes:
>> I am trying to understand why inside an EXISTS clause the query planner
>> does not use the index:
>
> I'm not sure this plan is as bad as all that. The key point is that the
> planner is expecting 52517 rows that match that users_md5 value (and the
> true number is evidently 51446, so that estimate isn't far off). That's
> about 1/48th of the table. It knows that the EXISTS case can stop as
> soon as it finds one match, so it's betting that a plain seqscan will
> hit a match faster than an index lookup would be able to, ie,
> seqscanning about 48 tuples is faster than one index lookup. This might
> be a bad bet if the users_md5 values are correlated with physical order,
> ie the matches are not randomly scattered but are all towards the end of
> the table.
exact, the data is not randomly scattered but ordered this explains why
in my case seq scan is a bad bet
Barring that, though, it could be a good bet if the table
> isn't swapped in. Which is what the default cost parameters are set
> up to assume.
there are lots of shared buffers and effective memory on this instance,
the query is executed many times I can assume that the table isn't
swapped in right ?
> I suspect your real complaint is that you expect the table to be swapped
> in, in which case what you ought to be doing is adjusting the planner's
> cost parameters. Some playing around here with a similar case suggests
> that even a small reduction in random_page_cost would make it prefer an
> indexscan for this type of situation.
excellent !
Changing the random_page_cost from 4 to 2 do the trick
SET random_page_cost = 2;
EXPLAIN ANALYZE SELECT 1 WHERE EXISTS (SELECT 1 FROM read_acls_cache
WHERE users_md5 = '9bc9012eb29c0bb2ae3cc7b5e78c2acf');
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=1.06..1.07 rows=1 width=0) (actual time=0.048..0.048
rows=1 loops=1)
One-Time Filter: $0
InitPlan 1 (returns $0)
-> Index Scan using read_acls_cache_users_md5_idx on
read_acls_cache (cost=0.00..55664.21 rows=52517 width=0) (actual
time=0.045..0.045 rows=1 loops=1)
Index Cond: ((users_md5)::text =
'9bc9012eb29c0bb2ae3cc7b5e78c2acf'::text)
Total runtime: 0.087 ms
(6 rows)
466/0.087 = 5360 thanks !
kind regards
ben