Thread: sequential scans and the like operator

sequential scans and the like operator

From
"Roderick A. Anderson"
Date:
There is a discussion going on on the sql-ledger mailing list concerning
whether indexes will provide any performance improvements.  The one that
caught my eye was whether using LIKE in a statement would force a
sequential scan.

I tried checking the PG list archives but fts.postgresql.org is a little
slow - as many know - plus a quick search using google didn't turn up
much.

My guesstimation is a leading or mid-field wildcard (_, %) would force a
sequential scan but a trailing wildcard would not (should not?).

I think (seem to remember) that this was discussed either here or on
the HACKERS list.  A pointer to the appropriate message/thread would be
great.


TIA,
Rod
--
                      Let Accuracy Triumph Over Victory

                                                       Zetetic Institute
                                                        "David's Sling"
                                                         Marc Stiegler


Re: sequential scans and the like operator

From
Adam Haberlach
Date:
On Tue, Jan 08, 2002 at 04:50:53AM -0800, Roderick A. Anderson wrote:
> There is a discussion going on on the sql-ledger mailing list concerning
> whether indexes will provide any performance improvements.  The one that
> caught my eye was whether using LIKE in a statement would force a
> sequential scan.
>
> I tried checking the PG list archives but fts.postgresql.org is a little
> slow - as many know - plus a quick search using google didn't turn up
> much.
>
> My guesstimation is a leading or mid-field wildcard (_, %) would force a
> sequential scan but a trailing wildcard would not (should not?).

    I, myself, don't remember the thread exactly, but I'm pretty sure that
was the behavior that we were told to expect...

--
Adam Haberlach         | Who buys an eight-processor machine and then
adam@newsnipple.com    | watches 30 movies on it all at the same time?
http://newsnipple.com  | Beats me.  They told us they could sell it, so
                       | we made it.       -- George Hoffman, Be Engineer

Re: sequential scans and the like operator

From
Dave Trombley
Date:
Roderick A. Anderson wrote:

>There is a discussion going on on the sql-ledger mailing list concerning
>whether indexes will provide any performance improvements.  The one that
>caught my eye was whether using LIKE in a statement would force a
>sequential scan.
>
    You can always check exaclty what's being done in your queries by
using the EXPLAIN command.  For example, to test your hypothesis:

------------------
test=# create table foo(f text, i int);
CREATE
test=# create index foo_t on foo (f);
CREATE
test=# explain select * from foo where f like '%uiop%';
NOTICE:  QUERY PLAN:

Seq Scan on foo  (cost=0.00..22.50 rows=1 width=36)

EXPLAIN
test=# explain select * from foo where f  = '%uiop%';
NOTICE:  QUERY PLAN:

Index Scan using foo_t on foo  (cost=0.00..17.07 rows=5 width=36)

EXPLAIN
---------------------
-dj trombley
  <dtrom@bumba.net>



Re: sequential scans and the like operator

From
"Roderick A. Anderson"
Date:
On Tue, 8 Jan 2002, Dave Trombley wrote:

>     You can always check exaclty what's being done in your queries by
> using the EXPLAIN command.  For example, to test your hypothesis:
>

From your example I'm assuming I don't need data to test the optimizer.
Cool!  This is why I asked (and a little laziness) instead of trying it.
No real data to populate the table with.

Thanks for the idea/knowledge.


Rod
--
                      Let Accuracy Triumph Over Victory

                                                       Zetetic Institute
                                                        "David's Sling"
                                                         Marc Stiegler


Re: sequential scans and the like operator

From
"Jeffrey W. Baker"
Date:

On Tue, 8 Jan 2002, Roderick A. Anderson wrote:

> On Tue, 8 Jan 2002, Dave Trombley wrote:
>
> >     You can always check exaclty what's being done in your queries by
> > using the EXPLAIN command.  For example, to test your hypothesis:
> >
>
> >From your example I'm assuming I don't need data to test the optimizer.
> Cool!  This is why I asked (and a little laziness) instead of trying it.
> No real data to populate the table with.
>
> Thanks for the idea/knowledge.

Ah, no that actually isn't true.  For example, I believe that Pg will
always Seq Scan a table that has never been vacuum analyzed, or is empty,
or has only a few rows.

-jwb


Re: sequential scans and the like operator

From
Tom Lane
Date:
Dave Trombley <dtrom@bumba.net> writes:
> Roderick A. Anderson wrote:
>> There is a discussion going on on the sql-ledger mailing list concerning
>> whether indexes will provide any performance improvements.  The one that
>> caught my eye was whether using LIKE in a statement would force a
>> sequential scan.
>>
>     You can always check exaclty what's being done in your queries by
> using the EXPLAIN command.

Alternatively, search the pgsql mailing list archives; LIKE performance
has been discussed more times than I care to think about.  The present
state of play, I believe, is:

* LIKE and regexp match WHERE clauses are potentially indexable if the
pattern specifies a fixed prefix of one or more characters.  The longer
the fixed prefix, the more selective the index condition (and hence the
greater the probability the planner will choose to use it).

As examples:

    foo LIKE 'abc%bar'    indexable (prefix is abc)
    foo LIKE '_abc%bar'    not indexable (first character not fixed)
    foo ~ 'abc'        not indexable (pattern not anchored left)
    foo ~ '^abc'        indexable (prefix is abc)
    foo ILIKE 'abc%'    not indexable (1st char could be A or a)

* If Postgres was compiled with LOCALE support and is using a non-C
locale, <blech>LIKE indexing is disabled</blech> because the sort order
of the index may not agree with what's needed to perform LIKE searches.

This last point is rather nasty since non-C locales are rapidly becoming
the usual default, even in the USA.  Linuxen tend to default to en_US
locale, for example.

            regards, tom lane

Re: sequential scans and the like operator

From
Andrew Sullivan
Date:
On Tue, Jan 08, 2002 at 10:45:25AM -0800, Roderick A. Anderson wrote:
> On Tue, 8 Jan 2002, Dave Trombley wrote:
>
> >     You can always check exaclty what's being done in your queries by
> > using the EXPLAIN command.  For example, to test your hypothesis:
> >
>
> >From your example I'm assuming I don't need data to test the optimizer.
> Cool!  This is why I asked (and a little laziness) instead of trying it.
> No real data to populate the table with.

Uh, except that the results will be different depending on how much
data there is.  If you only have three rows in the table, it's always
going to use a seqscan no matter what the indices are: it's cheaper
just to look at all the records than to read the index.

That's why VACUUM ANALYSE is so important, by the way.

A

--
----
Andrew Sullivan                               87 Mowat Avenue
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M6K 3E3
                                         +1 416 646 3304 x110


Re: sequential scans and the like operator

From
"Roderick A. Anderson"
Date:
On Tue, 8 Jan 2002, Tom Lane wrote:

> Alternatively, search the pgsql mailing list archives;

Yeah this was where I was trying to get but it was _way_ too slow and I
have AisleRiot, Gnome Stones, and Demineur to keep my happy while I
wait.

> * If Postgres was compiled with LOCALE support and is using a non-C
> locale, <blech>LIKE indexing is disabled</blech> because the sort order
> of the index may not agree with what's needed to perform LIKE searches.
>
> This last point is rather nasty since non-C locales are rapidly becoming
> the usual default, even in the USA.  Linuxen tend to default to en_US
> locale, for example.

Got bitten here once.

Thanks for the more indepth explanation.


Best,
Rod
--
                      Let Accuracy Triumph Over Victory

                                                       Zetetic Institute
                                                        "David's Sling"
                                                         Marc Stiegler