Thread: LIKE and indexes?
Hi, Can anyone telle me if and when a LIKE query uses an index? I've compiled postgres with locale support. Does that have any influence indexes? thanks, Alexander Jerusalem
When in doubt, try the explain command Not exactly sure about postgres but in general LIKE can only use an index in the case of LIKE "Something%" LIKE "%Something" or LIKE "%Something%" won't use an index since it would have to scan the entire index to find all matches. Alex. On Wed, 14 Mar 2001, Alexander Jerusalem wrote: > Hi, > > Can anyone telle me if and when a LIKE query uses an index? I've compiled > postgres with locale support. Does that have any influence indexes? > > thanks, > > Alexander Jerusalem > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster >
Thanks for your answer, Alex! I've done an EXPLAIN and I saw that no index was used although my SQL statement had a wild card only at the end as you pointed out ('blah%'). My guess is that this is because of the locale support. Regards, Alexander Jerusalem At 01:22 15.03.01, adb wrote: >When in doubt, try the explain command > >Not exactly sure about postgres but in general LIKE can >only use an index in the case of LIKE "Something%" > >LIKE "%Something" or LIKE "%Something%" >won't use an index since it would have to scan the entire >index to find all matches. > >Alex. > > > >On Wed, 14 Mar 2001, Alexander Jerusalem wrote: > > > Hi, > > > > Can anyone telle me if and when a LIKE query uses an index? I've compiled > > postgres with locale support. Does that have any influence indexes? > > > > thanks, > > > > Alexander Jerusalem > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 4: Don't 'kill -9' the postmaster > > > > >---------------------------(end of broadcast)--------------------------- >TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
From: "Alexander Jerusalem" <alexander.jerusalem@pop.chello.at> > Thanks for your answer, Alex! > > I've done an EXPLAIN and I saw that no index was used although my SQL > statement had a wild card only at the end as you pointed out ('blah%'). My > guess is that this is because of the locale support. Try like '^blah%' - maybe the additional hint will help (but maybe not). How many rows did PG think it was going to access in the explain? - Richard Huxton > Regards, > Alexander Jerusalem > > > At 01:22 15.03.01, adb wrote: > >When in doubt, try the explain command > > > >Not exactly sure about postgres but in general LIKE can > >only use an index in the case of LIKE "Something%" > > > >LIKE "%Something" or LIKE "%Something%" > >won't use an index since it would have to scan the entire > >index to find all matches. > > > >Alex. > > > > > > > >On Wed, 14 Mar 2001, Alexander Jerusalem wrote: > > > > > Hi, > > > > > > Can anyone telle me if and when a LIKE query uses an index? I've compiled > > > postgres with locale support. Does that have any influence indexes? > > > > > > thanks, > > > > > > Alexander Jerusalem > > > > > > > > > ---------------------------(end of broadcast)--------------------------- > > > TIP 4: Don't 'kill -9' the postmaster > > > > > > > > >---------------------------(end of broadcast)--------------------------- > >TIP 2: you can get off all lists at once with the unregister command > > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster >
"Richard Huxton" wrote: >> I've done an EXPLAIN and I saw that no index was used although my SQL >> statement had a wild card only at the end as you pointed out ('blah%'). My >> guess is that this is because of the locale support. > >Try like '^blah%' - maybe the additional hint will help (but maybe not). How >many rows did PG think it was going to access in the explain? No, Richard, you can't mix regular expressions with SQL LIKE. For LIKE, '^' is an ordinary character to match. -- Oliver Elphick Oliver.Elphick@lfix.co.uk Isle of Wight http://www.lfix.co.uk/oliver PGP: 1024R/32B8FAA1: 97 EA 1D 47 72 3F 28 47 6B 7E 39 CC 56 E4 C1 47 GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839 932A 614D 4C34 3E1D 0C1C ======================================== "Blessed are the poor in spirit, for theirs is the kingdom of heaven...Blessed are they which do hunger and thirst after righteousness, for they shall be filled...Blessed are the pure in heart, for they shall see God." Matthew 5:3,6,8
3/15/2001 3:39:27 AM, Alexander Jerusalem <alexander.jerusalem@pop.chello.at> wrote: >Thanks for your answer, Alex! > >I've done an EXPLAIN and I saw that no index was used although my SQL >statement had a wild card only at the end as you pointed out ('blah%'). My >guess is that this is because of the locale support. How large is the index in question? I recall reading that PostgreSQL doesn't utilize an index if the overhead of using it was calculated as higher than a sequential scan. Brent
Hi, The query I'm analyzing is this one: SELECT count(*) from Person WHERE Person.pc_Id in (select pcpc.pc_fromid from pcpc inner join corporation on pcpc.pc_toid = corporation.pc_id where corporation.crp_name1 ilike 'Uni%'); Aggregate (cost=622544.96..622544.96 rows=1 width=0) -> Seq Scan on person (cost=0.00..622526.04 rows=7565 width=0) SubPlan -> Materialize (cost=82.27..82.27 rows=1 width=36) -> Nested Loop (cost=0.00..82.27 rows=1 width=36) -> Seq Scan on corporation (cost=0.00..80.24 rows=1 width=12) -> Index Scan using i_pcp_pc_toid on pcpc (cost=0.00..2.02 rows=1 width=24) The query takes over 3 seconds without any other load on the same machine (Pentium III, 1 GHZ, 512 MB RAM) and I'm not sure why because on MSSQLServer takes only a fraction. The tables are fairly small: the person table has 7565 rows, the corporation table has 3059 and the relation table (pcpc) has 2271 rows. thanks, Alexander Jerusalem
Alexander, My guess is that MSSQL does a better job of optimizing the in clause. In postgres an in clause will not ( currently ) use an index, so it forces a sequential scan. However, you can change your query a bit and use exists which will use an existing index. SELECT count(*) FROM Person WHERE EXISTS ( SELECT pcpc.pc_fromid FROM pcpc, corporation WHERE pcpc.pc_toid = corporation.pc_id AND Person.pc_Id = pcpc.pc_toid AND corporation.crp_name1 like 'Uni%' AND ); That will allow the query to use an index on Person.pc_Id and pcpc.pc_toid assuming they exist. -- Bill On Thu, Mar 15, 2001 at 11:16:47PM +0100, Alexander Jerusalem wrote: > Hi, > > The query I'm analyzing is this one: > > SELECT count(*) from Person WHERE Person.pc_Id in (select pcpc.pc_fromid > from pcpc inner join corporation on pcpc.pc_toid = corporation.pc_id where > corporation.crp_name1 ilike 'Uni%'); > > Aggregate (cost=622544.96..622544.96 rows=1 width=0) > -> Seq Scan on person (cost=0.00..622526.04 rows=7565 width=0) > SubPlan > -> Materialize (cost=82.27..82.27 rows=1 width=36) > -> Nested Loop (cost=0.00..82.27 rows=1 width=36) > -> Seq Scan on corporation (cost=0.00..80.24 > rows=1 width=12) > -> Index Scan using i_pcp_pc_toid on > pcpc (cost=0.00..2.02 rows=1 width=24) > > > The query takes over 3 seconds without any other load on the same machine > (Pentium III, 1 GHZ, 512 MB RAM) and I'm not sure why because on > MSSQLServer takes only a fraction. The tables are fairly small: the person > table has 7565 rows, the corporation table has 3059 and the relation table > (pcpc) has 2271 rows. > > > thanks, > > Alexander Jerusalem > > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html -- _____ / ___/___ | Bill Huff / bhuff@colltech.com / /__ __/ | Voice: (512) 263-0770 x 262 / /__/ / | Fax: (512) 263-8921 \___/ /ollective | Pager: 1-800-946-4646 # 1406217 \/echnologies |------[ http://www.colltech.com ] ------
It's your in subquery. Check out the FAQ: http://postgresql.readysetnet.com/docs/faq-english.html#4.23 On Thu, 15 Mar 2001, Alexander Jerusalem wrote: > Hi, > > The query I'm analyzing is this one: > > SELECT count(*) from Person WHERE Person.pc_Id in (select pcpc.pc_fromid > from pcpc inner join corporation on pcpc.pc_toid = corporation.pc_id where > corporation.crp_name1 ilike 'Uni%'); > > Aggregate (cost=622544.96..622544.96 rows=1 width=0) > -> Seq Scan on person (cost=0.00..622526.04 rows=7565 width=0) > SubPlan > -> Materialize (cost=82.27..82.27 rows=1 width=36) > -> Nested Loop (cost=0.00..82.27 rows=1 width=36) > -> Seq Scan on corporation (cost=0.00..80.24 > rows=1 width=12) > -> Index Scan using i_pcp_pc_toid on > pcpc (cost=0.00..2.02 rows=1 width=24) > > > The query takes over 3 seconds without any other load on the same machine > (Pentium III, 1 GHZ, 512 MB RAM) and I'm not sure why because on > MSSQLServer takes only a fraction. The tables are fairly small: the person > table has 7565 rows, the corporation table has 3059 and the relation table > (pcpc) has 2271 rows. > > > thanks, > > Alexander Jerusalem > > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html >
Alexander Jerusalem <alexander.jerusalem@pop.chello.at> writes: > The query I'm analyzing is this one: > SELECT count(*) from Person WHERE Person.pc_Id in (select pcpc.pc_fromid > from pcpc inner join corporation on pcpc.pc_toid = corporation.pc_id where > corporation.crp_name1 ilike 'Uni%'); ^^^^^ Case-insensitive compares cannot use indexes in Postgres, because our indexes are case-sensitive. You could make an index on lower(crp_name1) and then do ... where lower(corporation.crp_name1) like 'uni%' Actually, though, I don't believe that the lack of an indexscan on corporation is the problem here. That's a tiny table and it's only going to be scanned once in this plan. The real problem is the WHERE ... IN at the top level. Try changing to a WHERE EXISTS (see the PG FAQ). regards, tom lane
Thank you all, I'm sorry I didn't read this FAQ first... Alexander Jerusalem
Thanks a lot Bill, that did it! Now it takes only a quarter of a second or so :-) Alexander Jerusalem At 23:33 15.03.01, Bill Huff wrote: > Alexander, > > My guess is that MSSQL does a better job of optimizing the in clause. > > In postgres an in clause will not ( currently ) use an index, so it > forces a sequential scan. However, you can change your query a bit and > use exists which will use an existing index. > > SELECT count(*) > FROM Person > WHERE EXISTS ( > SELECT pcpc.pc_fromid > FROM pcpc, corporation > WHERE pcpc.pc_toid = corporation.pc_id AND > Person.pc_Id = pcpc.pc_toid AND > corporation.crp_name1 like 'Uni%' AND > ); > > That will allow the query to use an index on Person.pc_Id and > pcpc.pc_toid assuming they exist. > >-- >Bill > >On Thu, Mar 15, 2001 at 11:16:47PM +0100, Alexander Jerusalem wrote: > > Hi, > > > > The query I'm analyzing is this one: > > > > SELECT count(*) from Person WHERE Person.pc_Id in (select pcpc.pc_fromid > > from pcpc inner join corporation on pcpc.pc_toid = corporation.pc_id where > > corporation.crp_name1 ilike 'Uni%'); > > > > Aggregate (cost=622544.96..622544.96 rows=1 width=0) > > -> Seq Scan on person (cost=0.00..622526.04 rows=7565 width=0) > > SubPlan > > -> Materialize (cost=82.27..82.27 rows=1 width=36) > > -> Nested Loop (cost=0.00..82.27 rows=1 width=36) > > -> Seq Scan on corporation (cost=0.00..80.24 > > rows=1 width=12) > > -> Index Scan using i_pcp_pc_toid on > > pcpc (cost=0.00..2.02 rows=1 width=24) > > > > > > The query takes over 3 seconds without any other load on the same machine > > (Pentium III, 1 GHZ, 512 MB RAM) and I'm not sure why because on > > MSSQLServer takes only a fraction. The tables are fairly small: the person > > table has 7565 rows, the corporation table has 3059 and the relation table > > (pcpc) has 2271 rows. > > > > > > thanks, > > > > Alexander Jerusalem > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 5: Have you checked our extensive FAQ? > > > > http://www.postgresql.org/users-lounge/docs/faq.html > >-- > _____ > / ___/___ | Bill Huff / bhuff@colltech.com > / /__ __/ | Voice: (512) 263-0770 x 262 > / /__/ / | Fax: (512) 263-8921 > \___/ /ollective | Pager: 1-800-946-4646 # 1406217 > \/echnologies |------[ http://www.colltech.com ] ------ > >---------------------------(end of broadcast)--------------------------- >TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)