Re: varchar comparison and trim() - Mailing list pgsql-novice

From Kevin Grittner
Subject Re: varchar comparison and trim()
Date
Msg-id 20121219195924.14700@gmx.com
Whole thread Raw
In response to varchar comparison and trim()  (ta@lavabit.com)
List pgsql-novice
ta@lavabit.com wrote:

> What would be the easiest way to make PG perform string comparison
> consistently across all string types?

Avoid using the char(n) data type.

> By 'consistent' I refer to this trailing space handling:
> (possibly it's standard but I find it somewhat inconvenient)
>
> select 'aa'::char(4) = 'aa '::char(3) => T
> select 'aa'::varchar(4) = 'aa '::varchar(3) => F
> select 'aa'::char(4) = 'aa '::varchar(3) => T

That behavior is mandated by standard.

> I've tried to redefine varchar to varchar operators applying Trim() to the
> arguments (have put them in separate "myschema" to avoid possible
> interference with who knows what) and that seems to work fine (not quite
> sure about performance loss and those optimizer hints).

Generally, char(n) is slower than varchar(n) or text.

> But during my tests sometimes, somehow, varchar columns having "unique"
> constraint defined, manage to accept both 'aa' and 'aa ' values.

Yes, those are different values. You could add a constraint to
prohibit leading or trailing spaces.

> So, this solution seems a bit "picky". Might I have to convert all varchar
> columns to char if they are covered by "unique" constraint?

I would go the other way. In fact, after running into this issue
where I worked, we took the time to convert every single char(n)
column to varchar(n).

> This way or another, having some sort of permanent "ANSI_PADDING" setting
> would be nice.

That's not likely to happen, to put it mildly. Every
behavior-changing setting is a source of problems, and we only
tolerate the char(n) behavior in the first place because the
standard requires it. Without that, I'm pretty sure it would not be
there at all.

> By the way, I managed to redefine various combinations of text to char,
> text to varchar etc, but not "text to text" operators. PG9.2 quietly
> ignores that redefinition.
> So, I finally gave up using "text" at all and converted all text columns
> that are likely to be used in comparison to varchar (unbounded).

You're likely to be creating a whole new and different set of odd
behaviors and bugs which nobody else in the world will have seen,
and it will be hard for people to help you when you hit problems.
The text type is the most native and normal of all the character
string types; many people stay away from not only char(n) but all
forms of varchar as well, and just use constraints to enforce data
format requirements, including length. (That's not a position I
advocate, but it works.)

-Kevin


pgsql-novice by date:

Previous
From: ta@lavabit.com
Date:
Subject: varchar comparison and trim()
Next
From: ta@lavabit.com
Date:
Subject: RESOLVED: varchar comparison and trim()