Thread: Mixed case text searches
Hi list people. Okay I’ve read the documentation. Now it’s time to talk to people that actually do this for a living. Mixed case searches, what is the best practice?
I’m searching for an account name: Acme Rockets Inc.
strSearchString = ‘acme%’
Select * From Accounts Where AccountName = strSearchString
This will of course fail because the case doesn’t match. So what is the best practice for performance?
I could use the Lower() function:
strSearchString = lower(‘acme%’)
Select * From Accounts Where lower(AccountName) = strSearchString
Or I could use the ilike operator
strSearchString = ‘acme%
Select * From Accounts Where AccountName ilike strSearchString
It’s also been suggested that I keep a companion column that mirrors the account name column which is forced to lower case. This seems, well a bit desperate to me.
So, from a performance standpoint, what are people doing and why?
Many thanks for your replies.
Chris Campbell
Hi list people. Okay I’ve read the documentation. Now it’s time to talk to people that actually do this for a living. Mixed case searches, what is the best practice?
I’m searching for an account name: Acme Rockets Inc.
strSearchString = ‘acme%’
Select * From Accounts Where AccountName = strSearchString
This will of course fail because the case doesn’t match. So what is the best practice for performance?
I could use the Lower() function:
strSearchString = lower(‘acme%’)
Select * From Accounts Where lower(AccountName) = strSearchString
Or I could use the ilike operator
strSearchString = ‘acme%
Select * From Accounts Where AccountName ilike strSearchString
It’s also been suggested that I keep a companion column that mirrors the account name column which is forced to lower case. This seems, well a bit desperate to me.
So, from a performance standpoint, what are people doing and why?
Many thanks for your replies.
Chris Campbell
This means while the case is preserved when returning data, queries will match insensitively. Also indexes will work as expected on them. The alternative is to use lower() on every query and have a function index using lower().
Regards
Thom
(Pg 8.4)
Well I gave it a shot and for whatever reason it doesn’t work for me. The table got created with accountname field of type citext. If I run a general select query the field correctly shows that it’s of type citext. However, when I add my where clause accountname like ‘a%’, it returns no records. If I cap it (‘A%’) then I get the expected results. I’m using a BTree index on the field. Don’t know if that makes a difference or not. Has anyone gotten this citext type to work? Is there an undocumented trick to getting it to work? Thanks
From: pgsql-novice-owner@postgresql.org [mailto:pgsql-novice-owner@postgresql.org] On Behalf Of Thom Brown
Sent: Tuesday, June 15, 2010 8:35 AM
To: Chris Campbell
Cc: pgsql-novice@postgresql.org
Subject: Re: [NOVICE] Mixed case text searches
On 15 June 2010 16:25, Chris Campbell <ccampbell@cascadeds.com> wrote:
Hi list people. Okay I’ve read the documentation. Now it’s time to talk to people that actually do this for a living. Mixed case searches, what is the best practice?
I’m searching for an account name: Acme Rockets Inc.
strSearchString = ‘acme%’
Select * From Accounts Where AccountName = strSearchString
This will of course fail because the case doesn’t match. So what is the best practice for performance?
I could use the Lower() function:
strSearchString = lower(‘acme%’)
Select * From Accounts Where lower(AccountName) = strSearchString
Or I could use the ilike operator
strSearchString = ‘acme%
Select * From Accounts Where AccountName ilike strSearchString
It’s also been suggested that I keep a companion column that mirrors the account name column which is forced to lower case. This seems, well a bit desperate to me.
So, from a performance standpoint, what are people doing and why?
Many thanks for your replies.
Chris Campbell
You might want citext (case-insensitive text): http://www.postgresql.org/docs/8.4/static/citext.html
This means while the case is preserved when returning data, queries will match insensitively. Also indexes will work as expected on them. The alternative is to use lower() on every query and have a function index using lower().
Regards
Thom
(Pg 8.4)
Well I gave it a shot and for whatever reason it doesn’t work for me. The table got created with accountname field of type citext. If I run a general select query the field correctly shows that it’s of type citext. However, when I add my where clause accountname like ‘a%’, it returns no records. If I cap it (‘A%’) then I get the expected results. I’m using a BTree index on the field. Don’t know if that makes a difference or not. Has anyone gotten this citext type to work? Is there an undocumented trick to getting it to work? Thanks
On 15 June 2010 19:41, Chris Campbell <ccampbell@cascadeds.com> wrote:
(Pg 8.4)
Well I gave it a shot and for whatever reason it doesn’t work for me. The table got created with accountname field of type citext. If I run a general select query the field correctly shows that it’s of type citext. However, when I add my where clause accountname like ‘a%’, it returns no records. If I cap it (‘A%’) then I get the expected results. I’m using a BTree index on the field. Don’t know if that makes a difference or not. Has anyone gotten this citext type to work? Is there an undocumented trick to getting it to work? Thanks
Please reply below rather than above messages and in plain text if possible.
I'm not sure why that's not working for you. I just tried it on 8.4.4 and was fine for me:
test=# create table test_table(
id serial,
stuff citext);
NOTICE: CREATE TABLE will create implicit sequence "test_table_id_seq" for serial column "test_table.id"
CREATE TABLE
test=# insert into test_table (stuff) values ('animal'),('Alpha'),('aquarium'),('Arctic');
INSERT 0 4
test=# select * from test_table where stuff like 'a%';
id | stuff
----+----------
1 | animal
2 | Alpha
3 | aquarium
4 | Arctic
(4 rows)
test=# select * from test_table where stuff like 'A%';
id | stuff
----+----------
1 | animal
2 | Alpha
3 | aquarium
4 | Arctic
(4 rows)
Are you sure you're referring to the correct field in your where clause?
Regards
Thom
--
Okay, apparently it makes a difference which schema you import the functions into. I did not originally import them into the public schema of my database. In order to use the citext type I had to prefix it with my schema name, which apparently effected its behavior. Once I removed those functions from my working schema and added them to the public schema it started working for me.
Thanks for your assistance!
“Are you sure you're referring to the correct field in your where clause?”
Hehe, yeah pretty sure.
- CBC