Thread: sequential scans and the like operator
There is a discussion going on on the sql-ledger mailing list concerning whether indexes will provide any performance improvements. The one that caught my eye was whether using LIKE in a statement would force a sequential scan. I tried checking the PG list archives but fts.postgresql.org is a little slow - as many know - plus a quick search using google didn't turn up much. My guesstimation is a leading or mid-field wildcard (_, %) would force a sequential scan but a trailing wildcard would not (should not?). I think (seem to remember) that this was discussed either here or on the HACKERS list. A pointer to the appropriate message/thread would be great. TIA, Rod -- Let Accuracy Triumph Over Victory Zetetic Institute "David's Sling" Marc Stiegler
On Tue, Jan 08, 2002 at 04:50:53AM -0800, Roderick A. Anderson wrote: > There is a discussion going on on the sql-ledger mailing list concerning > whether indexes will provide any performance improvements. The one that > caught my eye was whether using LIKE in a statement would force a > sequential scan. > > I tried checking the PG list archives but fts.postgresql.org is a little > slow - as many know - plus a quick search using google didn't turn up > much. > > My guesstimation is a leading or mid-field wildcard (_, %) would force a > sequential scan but a trailing wildcard would not (should not?). I, myself, don't remember the thread exactly, but I'm pretty sure that was the behavior that we were told to expect... -- Adam Haberlach | Who buys an eight-processor machine and then adam@newsnipple.com | watches 30 movies on it all at the same time? http://newsnipple.com | Beats me. They told us they could sell it, so | we made it. -- George Hoffman, Be Engineer
Roderick A. Anderson wrote: >There is a discussion going on on the sql-ledger mailing list concerning >whether indexes will provide any performance improvements. The one that >caught my eye was whether using LIKE in a statement would force a >sequential scan. > You can always check exaclty what's being done in your queries by using the EXPLAIN command. For example, to test your hypothesis: ------------------ test=# create table foo(f text, i int); CREATE test=# create index foo_t on foo (f); CREATE test=# explain select * from foo where f like '%uiop%'; NOTICE: QUERY PLAN: Seq Scan on foo (cost=0.00..22.50 rows=1 width=36) EXPLAIN test=# explain select * from foo where f = '%uiop%'; NOTICE: QUERY PLAN: Index Scan using foo_t on foo (cost=0.00..17.07 rows=5 width=36) EXPLAIN --------------------- -dj trombley <dtrom@bumba.net>
On Tue, 8 Jan 2002, Dave Trombley wrote: > You can always check exaclty what's being done in your queries by > using the EXPLAIN command. For example, to test your hypothesis: > From your example I'm assuming I don't need data to test the optimizer. Cool! This is why I asked (and a little laziness) instead of trying it. No real data to populate the table with. Thanks for the idea/knowledge. Rod -- Let Accuracy Triumph Over Victory Zetetic Institute "David's Sling" Marc Stiegler
On Tue, 8 Jan 2002, Roderick A. Anderson wrote: > On Tue, 8 Jan 2002, Dave Trombley wrote: > > > You can always check exaclty what's being done in your queries by > > using the EXPLAIN command. For example, to test your hypothesis: > > > > >From your example I'm assuming I don't need data to test the optimizer. > Cool! This is why I asked (and a little laziness) instead of trying it. > No real data to populate the table with. > > Thanks for the idea/knowledge. Ah, no that actually isn't true. For example, I believe that Pg will always Seq Scan a table that has never been vacuum analyzed, or is empty, or has only a few rows. -jwb
Dave Trombley <dtrom@bumba.net> writes: > Roderick A. Anderson wrote: >> There is a discussion going on on the sql-ledger mailing list concerning >> whether indexes will provide any performance improvements. The one that >> caught my eye was whether using LIKE in a statement would force a >> sequential scan. >> > You can always check exaclty what's being done in your queries by > using the EXPLAIN command. Alternatively, search the pgsql mailing list archives; LIKE performance has been discussed more times than I care to think about. The present state of play, I believe, is: * LIKE and regexp match WHERE clauses are potentially indexable if the pattern specifies a fixed prefix of one or more characters. The longer the fixed prefix, the more selective the index condition (and hence the greater the probability the planner will choose to use it). As examples: foo LIKE 'abc%bar' indexable (prefix is abc) foo LIKE '_abc%bar' not indexable (first character not fixed) foo ~ 'abc' not indexable (pattern not anchored left) foo ~ '^abc' indexable (prefix is abc) foo ILIKE 'abc%' not indexable (1st char could be A or a) * If Postgres was compiled with LOCALE support and is using a non-C locale, <blech>LIKE indexing is disabled</blech> because the sort order of the index may not agree with what's needed to perform LIKE searches. This last point is rather nasty since non-C locales are rapidly becoming the usual default, even in the USA. Linuxen tend to default to en_US locale, for example. regards, tom lane
On Tue, Jan 08, 2002 at 10:45:25AM -0800, Roderick A. Anderson wrote: > On Tue, 8 Jan 2002, Dave Trombley wrote: > > > You can always check exaclty what's being done in your queries by > > using the EXPLAIN command. For example, to test your hypothesis: > > > > >From your example I'm assuming I don't need data to test the optimizer. > Cool! This is why I asked (and a little laziness) instead of trying it. > No real data to populate the table with. Uh, except that the results will be different depending on how much data there is. If you only have three rows in the table, it's always going to use a seqscan no matter what the indices are: it's cheaper just to look at all the records than to read the index. That's why VACUUM ANALYSE is so important, by the way. A -- ---- Andrew Sullivan 87 Mowat Avenue Liberty RMS Toronto, Ontario Canada <andrew@libertyrms.info> M6K 3E3 +1 416 646 3304 x110
On Tue, 8 Jan 2002, Tom Lane wrote: > Alternatively, search the pgsql mailing list archives; Yeah this was where I was trying to get but it was _way_ too slow and I have AisleRiot, Gnome Stones, and Demineur to keep my happy while I wait. > * If Postgres was compiled with LOCALE support and is using a non-C > locale, <blech>LIKE indexing is disabled</blech> because the sort order > of the index may not agree with what's needed to perform LIKE searches. > > This last point is rather nasty since non-C locales are rapidly becoming > the usual default, even in the USA. Linuxen tend to default to en_US > locale, for example. Got bitten here once. Thanks for the more indepth explanation. Best, Rod -- Let Accuracy Triumph Over Victory Zetetic Institute "David's Sling" Marc Stiegler