Re: "_" in a serach pattern - Mailing list pgsql-admin

From Michael Fuhr
Subject Re: "_" in a serach pattern
Date
Msg-id 20070722044134.GA55486@winnie.fuhr.org
Whole thread Raw
In response to "_" in a serach pattern  (Jessica Richard <rjessil@yahoo.com>)
List pgsql-admin
On Fri, Jul 20, 2007 at 09:06:53AM -0700, Jessica Richard wrote:
> I have a huge table that has a char(80) name column (indexed).

Do you really nead char(N) instead of varchar(N) or text?

> The name pattern is like
>
> A_B_C-D.123.456.pdf
> A_B_C-D.123.333.doc.2
>
> When I select name from table where name like 'A_B%', it was doing a
> table scan;

What does EXPLAIN ANALYZE show?  When you created the database,
what locale did you use?  What do "show lc_ctype" and "show
server_encoding" return?

> I guess the underscore "_" was treated like a special character,
> instead of a character "_".

Yes -- see the Pattern Matching section of the documentation:

http://www.postgresql.org/docs/8.2/interactive/functions-matching.html#FUNCTIONS-LIKE

> When I use select name from table where name like 'A\\_B%', it was doing
> an index scan with my result returned very fast,

What does EXPLAIN ANALYZE show for this query?

> but I got a warning about the \\:
>
> WARNING:  nonstandard use of \\ in a string literal
> LINE 1: ...elect name from table where file_name like 'A\\_B...
>                                                              ^
> HINT:  Use the escape string syntax for backslashes, e.g., E'\\'.

As the hint suggests, you can avoid the warning by using E'A\\_B%'.

> By the way, the wild card works if it is at the end of the query, it
> does not seem to work if it is in the middle:
>
> select name from table where name like 'A\\_B%'  -- fine;
>
> select name from table were name like 'A\\_B%.pdf'; -- returns nothing.

This is probably due to the column type being char(N) instead of
varchar(N) or text:

test=> select 'abc'::char(3) like 'a%c';
 ?column?
----------
 t
(1 row)

test=> select 'abc'::char(4) like 'a%c';
 ?column?
----------
 f
(1 row)

test=> select 'abc'::char(4) like 'a%c%';
 ?column?
----------
 t
(1 row)

test=> select 'abc'::char(4) like 'a%c ';
 ?column?
----------
 t
(1 row)

I had thought that char(N)'s padding spaces were supposed to be
semantically insignificant but I don't know if that applies to
pattern matching.

> questions:
>
> 1. How do I get rid of the nonstandard warning, but still using the index search?

Do what the HINT suggests: use E'pattern' instead of 'pattern', or
use dollar quotes.

http://www.postgresql.org/docs/8.2/interactive/sql-syntax-lexical.html#SQL-SYNTAX-DOLLAR-QUOTING

If you're using a non-C locale then you'll need to create an index
using one of the pattern_ops operator classes to get pattern matches
to use an index.

http://www.postgresql.org/docs/8.2/interactive/indexes-opclass.html

> 2. How do I search with a wild card % in the middle of the pattern? Would
>  varchar(80) or char(80) make a difference about the wild card search
>  (% in the middle)?

See the above examples.  I'd recommend using varchar(N) or text
unless you have a good reason to use char(N).

--
Michael Fuhr

pgsql-admin by date:

Previous
From: "Simon Riggs"
Date:
Subject: Re: row level locking
Next
From: Peter Koczan
Date:
Subject: Re: "_" in a serach pattern