Home > mailing lists

Re: [SQL] Comparison semantics of CHAR data type - Mailing list pgsql-hackers

From	Kevin Grittner
Subject	Re: [SQL] Comparison semantics of CHAR data type
Date	October 11, 2013 20:58:58
Msg-id	1381525130.59803.YahooMailNeo@web162902.mail.bf1.yahoo.com Whole thread Raw
In response to	Re: [SQL] Comparison semantics of CHAR data type (Bruce Momjian <bruce@momjian.us>)
Responses	Re: [SQL] Comparison semantics of CHAR data type
List	pgsql-hackers

Tree view

Bruce Momjian <bruce@momjian.us> wrote:
> Thomas Fanghaenel wrote:

>> I was wondering about the proper semantics of CHAR comparisons in some corner
>> cases that involve control characters with values that are less than 0x20
>> (space).

What matters in general isn't where the characters fall when
comparing individual bytes, but how the strings containing them
sort according to the applicable collation.  That said, my
recollection of the spec is that when two CHAR(n) values are
compared, the shorter should be blank-padded before making the
comparison.  *That* said, I think the general advice is to stay
away from CHAR(n) in favor or VARCHAR(n) or TEXT, and I think that
is good advice.

> I am sorry for this long email, but I would be interested to see what
> other hackers think about this issue.

Since we only have the CHAR(n) type to improve compliance with the
SQL specification, and we don't generally encourage its use, I
think we should fix any non-compliant behavior.  That seems to mean
that if you take two CHAR values and compare them, it should give
the same result as comparing the same two values as VARCHAR using
the same collation with the shorter value padded with spaces.

So this is correct:

test=# select 'ab'::char(3) collate "en_US" < E'ab\n'::char(3) collate "en_US";
 ?column?
----------
 t
(1 row)

... because it matches:

test=# select 'ab '::varchar(3) collate "en_US" < E'ab\n'::varchar(3) collate "en_US";
 ?column?
----------
 t
(1 row)

But this is incorrect:

test=# select 'ab'::char(3) collate "C" < E'ab\n'::char(3) collate "C";
 ?column?
----------
 t
(1 row)

... because it doesn't match:

test=# select 'ab '::varchar(3) collate "C" < E'ab\n'::varchar(3) collate "C";
 ?column?
----------
 f
(1 row)

Of course, I have no skin in the game, because it took me about two
weeks after my first time converting a database with CHAR columns
to PostgreSQL to change them all to VARCHAR, and do that as part of
all future conversions.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

pgsql-hackers by date:

From: Josh Berkus
Date: 11 October 2013, 20:23:21
Subject: Re: Auto-tuning work_mem and maintenance_work_mem

From: Tom Lane
Date: 11 October 2013, 21:01:27
Subject: Re: [COMMITTERS] pgsql: Replace duplicate_oids with Perl implementation

Re: [SQL] Comparison semantics of CHAR data type - Mailing list pgsql-hackers

Previous

Next