Thread: LIKE and indexes?

LIKE and indexes?

From
Alexander Jerusalem
Date:
Hi,

Can anyone telle me if and when a LIKE query uses an index? I've compiled
postgres with locale support. Does that have any influence indexes?

thanks,

Alexander Jerusalem


Re: LIKE and indexes?

From
adb
Date:
When in doubt, try the explain command

Not exactly sure about postgres but in general LIKE can
only use an index in the case of LIKE "Something%"

LIKE "%Something" or LIKE "%Something%"
won't use an index since it would have to scan the entire
index to find all matches.

Alex.



On Wed, 14 Mar 2001, Alexander Jerusalem wrote:

> Hi,
>
> Can anyone telle me if and when a LIKE query uses an index? I've compiled
> postgres with locale support. Does that have any influence indexes?
>
> thanks,
>
> Alexander Jerusalem
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
>


Re: LIKE and indexes?

From
Alexander Jerusalem
Date:
Thanks for your answer, Alex!

I've done an EXPLAIN and I saw that no index was used although my SQL
statement had a wild card only at the end as you pointed out ('blah%'). My
guess is that this is because of the locale support.

Regards,
Alexander Jerusalem


At 01:22 15.03.01, adb wrote:
>When in doubt, try the explain command
>
>Not exactly sure about postgres but in general LIKE can
>only use an index in the case of LIKE "Something%"
>
>LIKE "%Something" or LIKE "%Something%"
>won't use an index since it would have to scan the entire
>index to find all matches.
>
>Alex.
>
>
>
>On Wed, 14 Mar 2001, Alexander Jerusalem wrote:
>
> > Hi,
> >
> > Can anyone telle me if and when a LIKE query uses an index? I've compiled
> > postgres with locale support. Does that have any influence indexes?
> >
> > thanks,
> >
> > Alexander Jerusalem
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 4: Don't 'kill -9' the postmaster
> >
>
>
>---------------------------(end of broadcast)---------------------------
>TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)


Re: Re: LIKE and indexes?

From
"Richard Huxton"
Date:
From: "Alexander Jerusalem" <alexander.jerusalem@pop.chello.at>

> Thanks for your answer, Alex!
>
> I've done an EXPLAIN and I saw that no index was used although my SQL
> statement had a wild card only at the end as you pointed out ('blah%'). My
> guess is that this is because of the locale support.

Try like '^blah%' - maybe the additional hint will help (but maybe not). How
many rows did PG think it was going to access in the explain?

- Richard Huxton


> Regards,
> Alexander Jerusalem
>
>
> At 01:22 15.03.01, adb wrote:
> >When in doubt, try the explain command
> >
> >Not exactly sure about postgres but in general LIKE can
> >only use an index in the case of LIKE "Something%"
> >
> >LIKE "%Something" or LIKE "%Something%"
> >won't use an index since it would have to scan the entire
> >index to find all matches.
> >
> >Alex.
> >
> >
> >
> >On Wed, 14 Mar 2001, Alexander Jerusalem wrote:
> >
> > > Hi,
> > >
> > > Can anyone telle me if and when a LIKE query uses an index? I've
compiled
> > > postgres with locale support. Does that have any influence indexes?
> > >
> > > thanks,
> > >
> > > Alexander Jerusalem
> > >
> > >
> > > ---------------------------(end of
broadcast)---------------------------
> > > TIP 4: Don't 'kill -9' the postmaster
> > >
> >
> >
> >---------------------------(end of broadcast)---------------------------
> >TIP 2: you can get off all lists at once with the unregister command
> >     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
>


Re: Re: LIKE and indexes?

From
"Oliver Elphick"
Date:
"Richard Huxton" wrote:
  >> I've done an EXPLAIN and I saw that no index was used although my SQL
  >> statement had a wild card only at the end as you pointed out ('blah%'). My
  >> guess is that this is because of the locale support.
  >
  >Try like '^blah%' - maybe the additional hint will help (but maybe not). How
  >many rows did PG think it was going to access in the explain?

No, Richard, you can't mix regular expressions with SQL LIKE.  For LIKE, '^'
is an ordinary character to match.

--
Oliver Elphick                                Oliver.Elphick@lfix.co.uk
Isle of Wight                              http://www.lfix.co.uk/oliver
PGP: 1024R/32B8FAA1: 97 EA 1D 47 72 3F 28 47  6B 7E 39 CC 56 E4 C1 47
GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839  932A 614D 4C34 3E1D 0C1C
                 ========================================
     "Blessed are the poor in spirit, for theirs is the
      kingdom of heaven...Blessed are they which do hunger
      and thirst after righteousness, for they shall be
      filled...Blessed are the pure in heart, for they shall
      see God."                   Matthew 5:3,6,8



Re: Re: LIKE and indexes?

From
"Brent R. Matzelle"
Date:
3/15/2001 3:39:27 AM, Alexander Jerusalem <alexander.jerusalem@pop.chello.at>
wrote:

>Thanks for your answer, Alex!
>
>I've done an EXPLAIN and I saw that no index was used although my SQL
>statement had a wild card only at the end as you pointed out ('blah%'). My
>guess is that this is because of the locale support.

How large is the index in question?  I recall reading that PostgreSQL doesn't
utilize an index if the overhead of using it was calculated as higher than a
sequential scan.

Brent


Re: Re: LIKE and indexes?

From
Alexander Jerusalem
Date:
Hi,

The query I'm analyzing is this one:

SELECT count(*) from Person WHERE Person.pc_Id in (select pcpc.pc_fromid
from pcpc inner join corporation on pcpc.pc_toid = corporation.pc_id where
corporation.crp_name1 ilike 'Uni%');

Aggregate  (cost=622544.96..622544.96 rows=1 width=0)
   ->  Seq Scan on person  (cost=0.00..622526.04 rows=7565 width=0)
         SubPlan
           ->  Materialize  (cost=82.27..82.27 rows=1 width=36)
                 ->  Nested Loop  (cost=0.00..82.27 rows=1 width=36)
                       ->  Seq Scan on corporation  (cost=0.00..80.24
rows=1 width=12)
                       ->  Index Scan using i_pcp_pc_toid on
pcpc  (cost=0.00..2.02 rows=1 width=24)


The query takes over 3 seconds without any other load on the same machine
(Pentium III, 1 GHZ, 512 MB RAM) and I'm not sure why because on
MSSQLServer takes only a fraction. The tables are fairly small: the person
table has 7565 rows, the corporation table has 3059 and the relation table
(pcpc) has 2271 rows.


thanks,

Alexander Jerusalem


Re: Re: Re: LIKE and indexes?

From
Bill Huff
Date:
  Alexander,

    My guess is that MSSQL does a better job of optimizing the in clause.

    In postgres an in clause will not ( currently ) use an index, so it
    forces a sequential scan.  However, you can change your query a bit and
    use exists which will use an existing index.

      SELECT count(*)
        FROM Person
       WHERE EXISTS (
         SELECT pcpc.pc_fromid
           FROM pcpc, corporation
          WHERE pcpc.pc_toid = corporation.pc_id  AND
                Person.pc_Id = pcpc.pc_toid       AND
                corporation.crp_name1 like 'Uni%' AND
         );

    That will allow the query to use an index on Person.pc_Id and
    pcpc.pc_toid assuming they exist.

--
Bill

On Thu, Mar 15, 2001 at 11:16:47PM +0100, Alexander Jerusalem wrote:
> Hi,
>
> The query I'm analyzing is this one:
>
> SELECT count(*) from Person WHERE Person.pc_Id in (select pcpc.pc_fromid
> from pcpc inner join corporation on pcpc.pc_toid = corporation.pc_id where
> corporation.crp_name1 ilike 'Uni%');
>
> Aggregate  (cost=622544.96..622544.96 rows=1 width=0)
>    ->  Seq Scan on person  (cost=0.00..622526.04 rows=7565 width=0)
>          SubPlan
>            ->  Materialize  (cost=82.27..82.27 rows=1 width=36)
>                  ->  Nested Loop  (cost=0.00..82.27 rows=1 width=36)
>                        ->  Seq Scan on corporation  (cost=0.00..80.24
> rows=1 width=12)
>                        ->  Index Scan using i_pcp_pc_toid on
> pcpc  (cost=0.00..2.02 rows=1 width=24)
>
>
> The query takes over 3 seconds without any other load on the same machine
> (Pentium III, 1 GHZ, 512 MB RAM) and I'm not sure why because on
> MSSQLServer takes only a fraction. The tables are fairly small: the person
> table has 7565 rows, the corporation table has 3059 and the relation table
> (pcpc) has 2271 rows.
>
>
> thanks,
>
> Alexander Jerusalem
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html

--
     _____
    / ___/___       | Bill Huff / bhuff@colltech.com
   / /__  __/       | Voice: (512) 263-0770 x 262
  / /__/ /          | Fax:   (512) 263-8921
  \___/ /ollective  | Pager: 1-800-946-4646 # 1406217
      \/echnologies |------[ http://www.colltech.com ] ------

Re: Re: Re: LIKE and indexes?

From
Ben
Date:
It's your in subquery. Check out the FAQ:

http://postgresql.readysetnet.com/docs/faq-english.html#4.23

On Thu, 15 Mar 2001, Alexander Jerusalem wrote:

> Hi,
>
> The query I'm analyzing is this one:
>
> SELECT count(*) from Person WHERE Person.pc_Id in (select pcpc.pc_fromid
> from pcpc inner join corporation on pcpc.pc_toid = corporation.pc_id where
> corporation.crp_name1 ilike 'Uni%');
>
> Aggregate  (cost=622544.96..622544.96 rows=1 width=0)
>    ->  Seq Scan on person  (cost=0.00..622526.04 rows=7565 width=0)
>          SubPlan
>            ->  Materialize  (cost=82.27..82.27 rows=1 width=36)
>                  ->  Nested Loop  (cost=0.00..82.27 rows=1 width=36)
>                        ->  Seq Scan on corporation  (cost=0.00..80.24
> rows=1 width=12)
>                        ->  Index Scan using i_pcp_pc_toid on
> pcpc  (cost=0.00..2.02 rows=1 width=24)
>
>
> The query takes over 3 seconds without any other load on the same machine
> (Pentium III, 1 GHZ, 512 MB RAM) and I'm not sure why because on
> MSSQLServer takes only a fraction. The tables are fairly small: the person
> table has 7565 rows, the corporation table has 3059 and the relation table
> (pcpc) has 2271 rows.
>
>
> thanks,
>
> Alexander Jerusalem
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html
>


Re: Re: Re: LIKE and indexes?

From
Tom Lane
Date:
Alexander Jerusalem <alexander.jerusalem@pop.chello.at> writes:
> The query I'm analyzing is this one:

> SELECT count(*) from Person WHERE Person.pc_Id in (select pcpc.pc_fromid
> from pcpc inner join corporation on pcpc.pc_toid = corporation.pc_id where
> corporation.crp_name1 ilike 'Uni%');
                        ^^^^^

Case-insensitive compares cannot use indexes in Postgres, because our
indexes are case-sensitive.

You could make an index on lower(crp_name1) and then do

... where lower(corporation.crp_name1) like 'uni%'

Actually, though, I don't believe that the lack of an indexscan on
corporation is the problem here.  That's a tiny table and it's only
going to be scanned once in this plan.  The real problem is the WHERE
... IN at the top level.  Try changing to a WHERE EXISTS (see the PG
FAQ).

            regards, tom lane

Re: Re: Re: LIKE and indexes?

From
Alexander Jerusalem
Date:
Thank you all, I'm sorry I didn't read this FAQ first...

Alexander Jerusalem


Re: Re: Re: LIKE and indexes?

From
Alexander Jerusalem
Date:
Thanks a lot Bill, that did it! Now it takes only a quarter of a second or
so :-)

Alexander Jerusalem


At 23:33 15.03.01, Bill Huff wrote:

>   Alexander,
>
>     My guess is that MSSQL does a better job of optimizing the in clause.
>
>     In postgres an in clause will not ( currently ) use an index, so it
>     forces a sequential scan.  However, you can change your query a bit and
>     use exists which will use an existing index.
>
>       SELECT count(*)
>         FROM Person
>        WHERE EXISTS (
>          SELECT pcpc.pc_fromid
>            FROM pcpc, corporation
>           WHERE pcpc.pc_toid = corporation.pc_id  AND
>                 Person.pc_Id = pcpc.pc_toid       AND
>                 corporation.crp_name1 like 'Uni%' AND
>          );
>
>     That will allow the query to use an index on Person.pc_Id and
>     pcpc.pc_toid assuming they exist.
>
>--
>Bill
>
>On Thu, Mar 15, 2001 at 11:16:47PM +0100, Alexander Jerusalem wrote:
> > Hi,
> >
> > The query I'm analyzing is this one:
> >
> > SELECT count(*) from Person WHERE Person.pc_Id in (select pcpc.pc_fromid
> > from pcpc inner join corporation on pcpc.pc_toid = corporation.pc_id where
> > corporation.crp_name1 ilike 'Uni%');
> >
> > Aggregate  (cost=622544.96..622544.96 rows=1 width=0)
> >    ->  Seq Scan on person  (cost=0.00..622526.04 rows=7565 width=0)
> >          SubPlan
> >            ->  Materialize  (cost=82.27..82.27 rows=1 width=36)
> >                  ->  Nested Loop  (cost=0.00..82.27 rows=1 width=36)
> >                        ->  Seq Scan on corporation  (cost=0.00..80.24
> > rows=1 width=12)
> >                        ->  Index Scan using i_pcp_pc_toid on
> > pcpc  (cost=0.00..2.02 rows=1 width=24)
> >
> >
> > The query takes over 3 seconds without any other load on the same machine
> > (Pentium III, 1 GHZ, 512 MB RAM) and I'm not sure why because on
> > MSSQLServer takes only a fraction. The tables are fairly small: the person
> > table has 7565 rows, the corporation table has 3059 and the relation table
> > (pcpc) has 2271 rows.
> >
> >
> > thanks,
> >
> > Alexander Jerusalem
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 5: Have you checked our extensive FAQ?
> >
> > http://www.postgresql.org/users-lounge/docs/faq.html
>
>--
>      _____
>     / ___/___       | Bill Huff / bhuff@colltech.com
>    / /__  __/       | Voice: (512) 263-0770 x 262
>   / /__/ /          | Fax:   (512) 263-8921
>   \___/ /ollective  | Pager: 1-800-946-4646 # 1406217
>       \/echnologies |------[ http://www.colltech.com ] ------
>
>---------------------------(end of broadcast)---------------------------
>TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)