Re: NOT LIKE much faster than LIKE? - Mailing list pgsql-performance

From Tom Lane
Subject Re: NOT LIKE much faster than LIKE?
Msg-id 24021.1136858688@sss.pgh.pa.us
In response to NOT LIKE much faster than LIKE?  (Andrea Arcangeli <andrea@cpushare.com>)
Responses Re: NOT LIKE much faster than LIKE?  (Andrea Arcangeli <andrea@cpushare.com>)
List pgsql-performance
Andrea Arcangeli <andrea@cpushare.com> writes:
> It just makes no sense to me that the planner takes a different
> decision based on a "not".

Why in the world would you think that?  In general a NOT will change the
selectivity of the WHERE condition tremendously.  If the planner weren't
sensitive to that, *that* would be a bug.  The only case where it's
irrelevant is if the selectivity of the base condition is exactly 50%,
which is not a very reasonable default guess for LIKE.
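
To make the arithmetic concrete, here is a minimal sketch in Python. It
assumes the simplest possible model, in which the selectivity of NOT cond
is one minus the selectivity of cond (NULLs ignored). The function names
and the table size are hypothetical, chosen only for illustration; this
is not the planner's actual code.

```python
def not_selectivity(sel: float) -> float:
    """Selectivity of NOT cond, given the selectivity of cond.

    Simplifying assumption: no NULLs, so the two fractions sum to 1.
    """
    if not 0.0 <= sel <= 1.0:
        raise ValueError("selectivity must be within [0, 1]")
    return 1.0 - sel


def estimated_rows(total_rows: int, sel: float) -> int:
    """Row-count estimate for a condition with the given selectivity."""
    return round(total_rows * sel)


if __name__ == "__main__":
    total = 100_000  # hypothetical table size
    for sel in (0.01, 0.5, 0.9):
        plain = estimated_rows(total, sel)
        negated = estimated_rows(total, not_selectivity(sel))
        print(f"sel={sel}: cond -> {plain} rows, NOT cond -> {negated} rows")
```

Only at a selectivity of 0.5 do the two estimates coincide; for a
condition guessed to match 1% of the rows, the negated form is estimated
to match 99%, which can easily justify a different plan.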

It sounds to me like the problem is misestimation of the selectivity
of the LIKE pattern --- the planner is going to think that
LIKE '%% PREEMPT %%' is fairly selective because of the rather long
match text, when in reality it's probably not so selective on your
data.  But we don't keep any statistics that would allow the actual
number of matching rows to be estimated well.  You might want to think
about changing your data representation so that the pattern-match can be
replaced by a boolean column, or some such, so that the existing
statistics code can make a more reasonable estimate of how many rows are
selected.
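
A rough sketch of that data-representation change follows. It uses
Python's built-in sqlite3 module purely to stay self-contained, so it
shows only the schema change, not PostgreSQL's planner behaviour. The
table name (kernels), the column names (version, is_preempt), and the
sample rows are all hypothetical.

```python
import sqlite3

PATTERN = "%% PREEMPT %%"  # the pattern from the thread, copied verbatim

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one text column holding the string being matched.
conn.execute("CREATE TABLE kernels (id INTEGER PRIMARY KEY, version TEXT)")
conn.executemany(
    "INSERT INTO kernels (version) VALUES (?)",
    [
        ("#1 SMP PREEMPT Mon Jan 9",),
        ("#1 SMP Mon Jan 9",),
        ("#2 PREEMPT Tue Jan 10",),
    ],
)

# Evaluate the pattern match once and store the result as a flag column.
conn.execute(
    "ALTER TABLE kernels ADD COLUMN is_preempt INTEGER NOT NULL DEFAULT 0"
)
conn.execute("UPDATE kernels SET is_preempt = (version LIKE ?)", (PATTERN,))

# Both queries select the same rows; the second needs no pattern match.
like_count = conn.execute(
    "SELECT count(*) FROM kernels WHERE version LIKE ?", (PATTERN,)
).fetchone()[0]
flag_count = conn.execute(
    "SELECT count(*) FROM kernels WHERE is_preempt = 1"
).fetchone()[0]
print(like_count, flag_count)
```

The flag has to be kept in sync whenever the text column changes (in the
application or with a trigger). The payoff is that the fraction of true
values in a plain column is something ordinary per-column statistics can
describe, unlike the match rate of an arbitrary LIKE pattern.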

            regards, tom lane
