Thread: LIKE, CHAR(), and trailing spaces

LIKE, CHAR(), and trailing spaces

From
Bruce Momjian
Date:
I found a little LIKE/CHAR() surprise --- below is a table and query
against a CHAR(10) field:
test=> CREATE TABLE test (x char(10));CREATE TABLE
test=> INSERT INTO test values ('hi');INSERT 0 1
test=> SELECT * FROM test WHERE x = 'hi';     x------------ hi(1 row)

The above works because both sides are converted to 'bpchar';  explain
shows that:
test=> EXPLAIN SELECT * FROM test WHERE x = 'hi';                      QUERY
PLAN------------------------------------------------------Seq Scan on test  (cost=0.00..33.12 rows=9 width=14)
Filter:(x = 'hi'::bpchar)                      ^^^^^^(2 rows)
 

The following does not work:test=> SELECT * FROM test WHERE x LIKE 'hi'; x---(0 rows)


It seems LIKE is considering the trailing CHAR(10) field spaces as
significant, even though our documentations says:
   Values of type <type>character</type> are physically padded   with spaces to the specified width <replaceable>n</>,
andare   stored and displayed that way.  However, the padding spaces are   treated as semantically insignificant.
Trailingspaces are
 
--> disregarded when comparing two values of type <type>character</type>,   and they will be removed when converting a
<type>character</type>value   to one of the other string types.  Note that trailing spaces   <emphasis>are</>
semanticallysignificant in   <type>character varying</type> and <type>text</type> values.
 

It says trailing spaces are not significant for character comparisons
--- the real question is whether LIKE is a comparison.  Obvioiusly '='
is a comparison, but the system does not treat LIKE as a comparison in
terms of trailing spaces.  Is that desired behavior?

I did an EXPLAIN on the query and found '~~' was being used and 'hi' was
being converted to text:
test=> explain select * from test where x like 'hi';                      QUERY
PLAN------------------------------------------------------Seq Scan on test  (cost=0.00..33.12 rows=9 width=14)
Filter:(x ~~ 'hi'::text)                     ^^       ^^^^(2 rows)
 

so I then checked psql \do to see what operators there were for ~~:
test=> \do ~~                                     List of operators   Schema   | Name | Left arg type | Right arg type
|Result type |
Description------------+------+---------------+----------------+-------------+-------------------------pg_catalog | ~~
| bytea         | bytea          | boolean     | matches LIKE expression
 
-->     pg_catalog | ~~   | character     | text           | boolean     | matches LIKE expression pg_catalog | ~~   |
name         | text           | boolean     | matches LIKE expression pg_catalog | ~~   | text          | text
| boolean     | matches LIKE expression(4 rows)
 

The one marked matches the arguments so it seems the comparison being
done is not character and character, but character and text.

I realize trim() could be used to get the desired behavior, but is our
behavior consistent?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: LIKE, CHAR(), and trailing spaces

From
Brendan Jurd
Date:
On 3 February 2011 10:54, Bruce Momjian <bruce@momjian.us> wrote:
> It seems LIKE is considering the trailing CHAR(10) field spaces as
> significant, even though our documentations says:
>
-- snip --
>
> It says trailing spaces are not significant for character comparisons
> --- the real question is whether LIKE is a comparison.  Obvioiusly '='
> is a comparison, but the system does not treat LIKE as a comparison in
> terms of trailing spaces.  Is that desired behavior?

Interesting.  I would have to say that from the user point of view,
LIKE is definitely a comparison, and if the rest of the operators on
bpchar ignore whitespace then LIKE ought to as well.

Is the situation the same for regex matches (~ operators)?

Cheers,
BJ


Re: LIKE, CHAR(), and trailing spaces

From
Bruce Momjian
Date:
Brendan Jurd wrote:
> On 3 February 2011 10:54, Bruce Momjian <bruce@momjian.us> wrote:
> > It seems LIKE is considering the trailing CHAR(10) field spaces as
> > significant, even though our documentations says:
> >
> -- snip --
> >
> > It says trailing spaces are not significant for character comparisons
> > --- the real question is whether LIKE is a comparison. ?Obvioiusly '='
> > is a comparison, but the system does not treat LIKE as a comparison in
> > terms of trailing spaces. ?Is that desired behavior?
> 
> Interesting.  I would have to say that from the user point of view,
> LIKE is definitely a comparison, and if the rest of the operators on
> bpchar ignore whitespace then LIKE ought to as well.
> 
> Is the situation the same for regex matches (~ operators)?

Yes, I think so:
test=> SELECT 'a'::char(10) ~ 'a$'; ?column?---------- f(1 row)test=> SELECT 'a'::char(10) ~ 'a  *$';
?column?----------t(1 row)
 

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: LIKE, CHAR(), and trailing spaces

From
Kenneth Marshall
Date:
On Wed, Feb 02, 2011 at 07:48:38PM -0500, Bruce Momjian wrote:
> Brendan Jurd wrote:
> > On 3 February 2011 10:54, Bruce Momjian <bruce@momjian.us> wrote:
> > > It seems LIKE is considering the trailing CHAR(10) field spaces as
> > > significant, even though our documentations says:
> > >
> > -- snip --
> > >
> > > It says trailing spaces are not significant for character comparisons
> > > --- the real question is whether LIKE is a comparison. ?Obvioiusly '='
> > > is a comparison, but the system does not treat LIKE as a comparison in
> > > terms of trailing spaces. ?Is that desired behavior?
> > 
> > Interesting.  I would have to say that from the user point of view,
> > LIKE is definitely a comparison, and if the rest of the operators on
> > bpchar ignore whitespace then LIKE ought to as well.
> > 
> > Is the situation the same for regex matches (~ operators)?
> 
> Yes, I think so:
> 
>     test=> SELECT 'a'::char(10) ~ 'a$';
>      ?column?
>     ----------
>      f
>     (1 row)
>     
>     test=> SELECT 'a'::char(10) ~ 'a  *$';
>      ?column?
>     ----------
>      t
>     (1 row)
> 
> -- 
>   Bruce Momjian  <bruce@momjian.us>        http://momjian.us
>   EnterpriseDB                             http://enterprisedb.com

In my mind LIKE/~ are pattern matching operators and not a simple
comparison operator. PostgreSQL is doing the right thing in restricting
the somewhat bizarre treatment of trailing spaces to the '=' comparison
function. I can only imagine what would be needed to allow exceptions
to the pattern matching syntax to allow you to actually work with and
match the trailing spaces otherwise.

+10 for leaving the behavior as is.

Regards,
Ken
> 
>   + It's impossible for everything to be true. +
> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
> 


Re: LIKE, CHAR(), and trailing spaces

From
Tom Lane
Date:
Kenneth Marshall <ktm@rice.edu> writes:
> On Wed, Feb 02, 2011 at 07:48:38PM -0500, Bruce Momjian wrote:
>>> It seems LIKE is considering the trailing CHAR(10) field spaces as
>>> significant, even though our documentations says:

> +10 for leaving the behavior as is.

Yeah, we've been around on this before if memory serves.  I don't think
there's a case for changing it that's strong enough to outweigh
backwards-compatibility considerations.

Also, anyone who does want the spaces to be stripped can just add an
explicit cast to text first:   char_variable::text LIKE ...
If we change it then we'll have to provide some other weird notation
to allow people to get at the old behavior (I suppose there are some
people out there relying on it).
        regards, tom lane