Re: Planner estimates cost of 'like' a lot lower than '='?? - Mailing list pgsql-general

From Tom Lane
Subject Re: Planner estimates cost of 'like' a lot lower than '='??
Date
Msg-id 2453.995830142@sss.pgh.pa.us
Whole thread Raw
In response to Re: Planner estimates cost of 'like' a lot lower than '='??  (Mats Lofkvist <mal@algonet.se>)
Responses Re: Planner estimates cost of 'like' a lot lower than '='??  (Mats Lofkvist <mal@algonet.se>)
List pgsql-general
Mats Lofkvist <mal@algonet.se> writes:
> There are 11004 rows matching key2 = 'llll' and 90 rows matching
> value like 't10k__' (all 90 have key2 = 'llll').

Hmph.  On that basis, one would think the planner made the right choice
the first time.  Curious.  Do you have locale support enabled?  If so,
what locale are you using in the database?

> (I'm downloading the CVS tree right now. Do I need to do dump/restore
> or can I just start it on the current data?)

You'll need to dump/reload.  I wouldn't advise running CVS tip on your
production database, even if it were compatible ;-).  Set it up as a
playpen installation, instead.  To do this, give configure a --prefix
pointing at a temporary directory, plus --with-pgport to select a port
number other than the default, and when you initdb and start the
postmaster, specify a data directory inside the temp area.

> I still don't understand how
> "where value = 'xxx'" can be estimated to return 600 times more rows
> than "where value like 'xxx%'" (this is what happens in my simplified
> test).

Because the LIKE test is estimated as a range query (where value >=
'xxx' AND value < 'xxy') which uses entirely different statistics
than the equality test does.

            regards, tom lane

pgsql-general by date:

Previous
From: Mats Lofkvist
Date:
Subject: Re: Planner estimates cost of 'like' a lot lower than '='??
Next
From: Tom Lane
Date:
Subject: Re: Planner estimates cost of 'like' a lot lower than '='??