Thread: Mixed case text searches

Mixed case text searches

From
Chris Campbell
Date:

Hi list people.  Okay I’ve read the documentation.  Now it’s time to talk to people that actually do this for a living.  Mixed case searches, what is the best practice?

I’m searching for an account name:  Acme Rockets Inc.

 

strSearchString = ‘acme%’

Select * From Accounts Where AccountName = strSearchString

This will of course fail because the case doesn’t match.  So what is the best practice for performance?

 

I could use the Lower() function:

strSearchString = lower(‘acme%’)

Select * From Accounts Where lower(AccountName) = strSearchString

 

Or I could use the ilike operator

strSearchString = ‘acme%

Select * From Accounts Where AccountName ilike  strSearchString

 

It’s also been suggested that I keep a companion column that mirrors the account name column which is forced to lower case.  This seems, well a bit desperate to me.

 

So, from a performance standpoint, what are people doing and why?

 

Many thanks for your replies.

 

Chris Campbell

 

Re: Mixed case text searches

From
Thom Brown
Date:
On 15 June 2010 16:25, Chris Campbell <ccampbell@cascadeds.com> wrote:

Hi list people.  Okay I’ve read the documentation.  Now it’s time to talk to people that actually do this for a living.  Mixed case searches, what is the best practice?

I’m searching for an account name:  Acme Rockets Inc.

 

strSearchString = ‘acme%’

Select * From Accounts Where AccountName = strSearchString

This will of course fail because the case doesn’t match.  So what is the best practice for performance?

 

I could use the Lower() function:

strSearchString = lower(‘acme%’)

Select * From Accounts Where lower(AccountName) = strSearchString

 

Or I could use the ilike operator

strSearchString = ‘acme%

Select * From Accounts Where AccountName ilike  strSearchString

 

It’s also been suggested that I keep a companion column that mirrors the account name column which is forced to lower case.  This seems, well a bit desperate to me.

 

So, from a performance standpoint, what are people doing and why?

 

Many thanks for your replies.

 

Chris Campbell

 

You might want citext (case-insensitive text): http://www.postgresql.org/docs/8.4/static/citext.html

This means while the case is preserved when returning data, queries will match insensitively.  Also indexes will work as expected on them.  The alternative is to use lower() on every query and have a function index using lower().

Regards

Thom

Re: Mixed case text searches

From
Chris Campbell
Date:

(Pg 8.4)

Well I gave it a shot and for whatever reason it doesn’t work for me.  The table got created with accountname field of type citext.  If I run a general select query the field correctly shows that it’s of type citext.  However, when I add my where clause accountname like ‘a%’, it returns no records.  If I cap it (‘A%’) then I get the expected results.  I’m using a BTree index on the field.  Don’t know if that makes a difference or not.  Has anyone gotten this citext type to work?  Is there an undocumented trick to getting it to work?  Thanks

 

 

From: pgsql-novice-owner@postgresql.org [mailto:pgsql-novice-owner@postgresql.org] On Behalf Of Thom Brown
Sent: Tuesday, June 15, 2010 8:35 AM
To: Chris Campbell
Cc: pgsql-novice@postgresql.org
Subject: Re: [NOVICE] Mixed case text searches

 

On 15 June 2010 16:25, Chris Campbell <ccampbell@cascadeds.com> wrote:

Hi list people.  Okay I’ve read the documentation.  Now it’s time to talk to people that actually do this for a living.  Mixed case searches, what is the best practice?

I’m searching for an account name:  Acme Rockets Inc.

 

strSearchString = ‘acme%’

Select * From Accounts Where AccountName = strSearchString

This will of course fail because the case doesn’t match.  So what is the best practice for performance?

 

I could use the Lower() function:

strSearchString = lower(‘acme%’)

Select * From Accounts Where lower(AccountName) = strSearchString

 

Or I could use the ilike operator

strSearchString = ‘acme%

Select * From Accounts Where AccountName ilike  strSearchString

 

It’s also been suggested that I keep a companion column that mirrors the account name column which is forced to lower case.  This seems, well a bit desperate to me.

 

So, from a performance standpoint, what are people doing and why?

 

Many thanks for your replies.

 

Chris Campbell

 

You might want citext (case-insensitive text): http://www.postgresql.org/docs/8.4/static/citext.html

This means while the case is preserved when returning data, queries will match insensitively.  Also indexes will work as expected on them.  The alternative is to use lower() on every query and have a function index using lower().

Regards

Thom

Re: Mixed case text searches

From
Thom Brown
Date:
On 15 June 2010 19:41, Chris Campbell <ccampbell@cascadeds.com> wrote:

(Pg 8.4)

Well I gave it a shot and for whatever reason it doesn’t work for me.  The table got created with accountname field of type citext.  If I run a general select query the field correctly shows that it’s of type citext.  However, when I add my where clause accountname like ‘a%’, it returns no records.  If I cap it (‘A%’) then I get the expected results.  I’m using a BTree index on the field.  Don’t know if that makes a difference or not.  Has anyone gotten this citext type to work?  Is there an undocumented trick to getting it to work?  Thanks

 

Please reply below rather than above messages and in plain text if possible.

I'm not sure why that's not working for you.  I just tried it on 8.4.4 and was fine for me:

test=# create table test_table(
id serial,
stuff citext);
NOTICE:  CREATE TABLE will create implicit sequence "test_table_id_seq" for serial column "test_table.id"
CREATE TABLE

test=# insert into test_table (stuff) values ('animal'),('Alpha'),('aquarium'),('Arctic');
INSERT 0 4

test=# select * from test_table where stuff like 'a%';
 id |  stuff   
----+----------
  1 | animal
  2 | Alpha
  3 | aquarium
  4 | Arctic
(4 rows)

test=# select * from test_table where stuff like 'A%';
 id |  stuff   
----+----------
  1 | animal
  2 | Alpha
  3 | aquarium
  4 | Arctic
(4 rows)

Are you sure you're referring to the correct field in your where clause?

Regards

Thom 

Re: Mixed case text searches

From
Chris Campbell
Date:

On 15 June 2010 19:41, Chris Campbell <ccampbell@cascadeds.com> wrote:

(Pg 8.4)

Well I gave it a shot and for whatever reason it doesn’t work for me.  The table got created with accountname field of type citext.  If I run a general select query the field correctly shows that it’s of type citext.  However, when I add my where clause accountname like ‘a%’, it returns no records.  If I cap it (‘A%’) then I get the expected results.  I’m using a BTree index on the field.  Don’t know if that makes a difference or not.  Has anyone gotten this citext type to work?  Is there an undocumented trick to getting it to work?  Thanks

 

Please reply below rather than above messages and in plain text if possible.

 

I'm not sure why that's not working for you.  I just tried it on 8.4.4 and was fine for me:

 

test=# create table test_table(

id serial,

stuff citext);

NOTICE:  CREATE TABLE will create implicit sequence "test_table_id_seq" for serial column "test_table.id"

CREATE TABLE

 

test=# insert into test_table (stuff) values ('animal'),('Alpha'),('aquarium'),('Arctic');

INSERT 0 4

 

test=# select * from test_table where stuff like 'a%';

 id |  stuff   

----+----------

  1 | animal

  2 | Alpha

  3 | aquarium

  4 | Arctic

(4 rows)

 

test=# select * from test_table where stuff like 'A%';

 id |  stuff   

----+----------

  1 | animal

  2 | Alpha

  3 | aquarium

  4 | Arctic

(4 rows)

 

Are you sure you're referring to the correct field in your where clause?

 

Regards

 

Thom 

 

--

Okay, apparently it makes a difference which schema you import the functions into.  I did not originally import them into the public schema of my database.  In order to use the citext type I had to prefix it with my schema name, which apparently effected its behavior.  Once I removed those functions from my working schema and added them to the public schema it started working for me.

 

Thanks for your assistance!

 

“Are you sure you're referring to the correct field in your where clause?”

Hehe, yeah pretty sure.

 

- CBC