Home > mailing lists

Re: Bad query plan inside EXISTS clause - Mailing list pgsql-performance

From	Tom Lane
Subject	Re: Bad query plan inside EXISTS clause
Date	March 10, 2010 11:44:34
Msg-id	15947.1268235860@sss.pgh.pa.us Whole thread Raw
In response to	Bad query plan inside EXISTS clause (Benoit Delbosc <bdelbosc@nuxeo.com>)
Responses	Re: Bad query plan inside EXISTS clause
List	pgsql-performance

Tree view

Benoit Delbosc <bdelbosc@nuxeo.com> writes:
> I am trying to understand why inside an EXISTS clause the query planner
>   does not use the index:

I'm not sure this plan is as bad as all that.  The key point is that the
planner is expecting 52517 rows that match that users_md5 value (and the
true number is evidently 51446, so that estimate isn't far off).  That's
about 1/48th of the table.  It knows that the EXISTS case can stop as
soon as it finds one match, so it's betting that a plain seqscan will
hit a match faster than an index lookup would be able to, ie,
seqscanning about 48 tuples is faster than one index lookup.  This might
be a bad bet if the users_md5 values are correlated with physical order,
ie the matches are not randomly scattered but are all towards the end of
the table.  Barring that, though, it could be a good bet if the table
isn't swapped in.  Which is what the default cost parameters are set
up to assume.

I suspect your real complaint is that you expect the table to be swapped
in, in which case what you ought to be doing is adjusting the planner's
cost parameters.  Some playing around here with a similar case suggests
that even a small reduction in random_page_cost would make it prefer an
indexscan for this type of situation.

            regards, tom lane

pgsql-performance by date:

From: Benoit Delbosc
Date: 10 March 2010, 11:42:22
Subject: Re: Bad query plan inside EXISTS clause

From: Harald Fuchs
Date: 10 March 2010, 12:35:20
Subject: Re: Strange workaround for slow query

Re: Bad query plan inside EXISTS clause - Mailing list pgsql-performance

Previous

Next