Thread: Query plan w/ like clause question

Query plan w/ like clause question

From
Steve Wranovsky
Date:
Hello,

I am having some strange results using a "like" clause under 7.2.1 w/ Linux.  I have a table with about 700,000 rows.
WhenI select on one of the varchar fields that has an index created on it with an exact match, the results comes back
quickly. When I try to a like clause with a '%' wildcard that will return the same results as the exact match, the
querydoes a sequential scan instead of using the index, and takes a significant amount of time to execute.  Below is
theSQL to create the table & index, along with the query plans for the exact match, and the like clause.  Any
suggestionson how to improve the "like" performance would be appreciated...   

Best regards,
Steve

Create Table Objects
   (SOP_Instance_UID VarChar(64) Not Null,
    Object_Entity_Type SmallInt Not Null,
    SOP_Class_UID VarChar(64) Not Null,
    Archived_Time Integer Not Null,
    Update_Time Integer Not Null,
    Object_Size Integer,
    Patient_ID VarChar(64),
    Study_Instance_UID VarChar(64),
    Series_Instance_UID VarChar(64),
    FileSet_UID VarChar(64),
    Primary Key (SOP_Instance_UID),
    Foreign Key (FileSet_UID) References FileSet (FileSet_UID),
    Foreign Key (Patient_ID) References Patient (Patient_ID),
    Foreign Key (Study_Instance_UID) References PatientStudy (Study_Instance_UID),
    Foreign Key (Series_Instance_UID) References Series (Series_Instance_UID)
   );

Create Index Objects_AK1
    On Objects (Patient_ID);

Here's the output from "explain analyze" and the total objects:

mergeark=# explain analyze select patient_id from objects where patient_id = '49.35.34.5.0';
NOTICE:  QUERY PLAN:

Index Scan using objects_ak1 on objects  (cost=0.00..118.06 rows=29 width=15) (actual time=58.30..117.52 rows=50
loops=1)
Total runtime: 117.75 msec

EXPLAIN


mergeark=# explain analyze select patient_id from objects where patient_id like '49.35.34.5.%';
NOTICE:  QUERY PLAN:

Seq Scan on objects  (cost=100000000.00..100057379.33 rows=1 width=15) (actual time=661.30..49266.06 rows=50 loops=1)
Total runtime: 49266.27 msec

EXPLAIN

mergeark=# select count(*) from objects;
 count
--------
 698836
(1 row)


Re: Query plan w/ like clause question

From
Richard Poole
Date:
On Wed, May 29, 2002 at 03:58:53PM -0500, Steve Wranovsky wrote:
>
> I am having some strange results using a "like" clause under 7.2.1 w/ Linux.

...

> When I try to a like clause with a '%' wildcard that will return the
> same results as the exact match, the query does a sequential scan
> instead of using the index, and takes a significant amount of time
> to execute.

An index scan can't be used for a LIKE unless you're running in the C
locale, because other locales may have sorting rules which would give
incorrect results. Probably you're using en_US? Unfortunately you
need to re-run "initdb" to change this, which means dumping and
restoring your databases. See section 3.2 of the Administrator's
Guide.

Richard

Re: Query plan w/ like clause question

From
Steve Wranovsky
Date:
Richard,

Thanks for the quick feedback!  Changing to the C locale fixed the problem and
the queries are now less than a fraction of a second, as I'd expect.  Thanks again.

Steve


At 12:58 AM 5/30/02 +0100, you wrote:
>On Wed, May 29, 2002 at 03:58:53PM -0500, Steve Wranovsky wrote:
>>
>> I am having some strange results using a "like" clause under 7.2.1 w/ Linux.
>
>...
>
>> When I try to a like clause with a '%' wildcard that will return the
>> same results as the exact match, the query does a sequential scan
>> instead of using the index, and takes a significant amount of time
>> to execute.
>
>An index scan can't be used for a LIKE unless you're running in the C
>locale, because other locales may have sorting rules which would give
>incorrect results. Probably you're using en_US? Unfortunately you
>need to re-run "initdb" to change this, which means dumping and
>restoring your databases. See section 3.2 of the Administrator's
>Guide.
>
>Richard


Re: Query plan w/ like clause question

From
Richard Poole
Date:
On Wed, May 29, 2002 at 03:58:53PM -0500, Steve Wranovsky wrote:
>
> I am having some strange results using a "like" clause under 7.2.1 w/ Linux.

...

> When I try to a like clause with a '%' wildcard that will return the
> same results as the exact match, the query does a sequential scan
> instead of using the index, and takes a significant amount of time
> to execute.

An index scan can't be used for a LIKE unless you're running in the C
locale, because other locales may have sorting rules which would give
incorrect results. Probably you're using en_US? Unfortunately you
need to re-run "initdb" to change this, which means dumping and
restoring your databases. See section 3.2 of the Administrator's
Guide.

Richard