Re: Planner estimates cost of 'like' a lot lower than '='?? - Mailing list pgsql-general

From Mats Lofkvist
Subject Re: Planner estimates cost of 'like' a lot lower than '='??
Date
Msg-id 20010722201555.116.qmail@kairos.algonet.se
Whole thread Raw
In response to Re: Planner estimates cost of 'like' a lot lower than '='??  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
   From: Tom Lane <tgl@sss.pgh.pa.us>

   Mats Lofkvist <mal@algonet.se> writes:
   > There are 11004 rows matching key2 = 'llll' and 90 rows matching
   > value like 't10k__' (all 90 have key2 = 'llll').

   Hmph.  On that basis, one would think the planner made the right choice
   the first time.  Curious.  Do you have locale support enabled?  If so,
   what locale are you using in the database?

The FreeBSD port sets --enable-locale --enable-multibyte=LATIN1.

   > (I'm downloading the CVS tree right now. Do I need to do dump/restore
   > or can I just start it on the current data?)

   You'll need to dump/reload.  I wouldn't advise running CVS tip on your
   production database, even if it were compatible ;-).  Set it up as a
   playpen installation, instead.  To do this, give configure a --prefix
   pointing at a temporary directory, plus --with-pgport to select a port
   number other than the default, and when you initdb and start the
   postmaster, specify a data directory inside the temp area.

It's all a test database, so trashing is not a (big) problem.

   > I still don't understand how
   > "where value = 'xxx'" can be estimated to return 600 times more rows
   > than "where value like 'xxx%'" (this is what happens in my simplified
   > test).

   Because the LIKE test is estimated as a range query (where value >=
   'xxx' AND value < 'xxy') which uses entirely different statistics
   than the equality test does.

Ok, guess I should have skipped the conclusions and stayed with
describing the real problem then :-)

                           regards, tom lane

      _
Mats Lofkvist
mal@algonet.se

pgsql-general by date:

Previous
From: Joel Burton
Date:
Subject: Re: SQL Regular Expression Question
Next
From: Mats Lofkvist
Date:
Subject: Re: Planner estimates cost of 'like' a lot lower than '='??