Re: Select * from users WHERE upper(lastName) = upper('Pringle') - Mailing list pgsql-jdbc

From Tom Lane
Subject Re: Select * from users WHERE upper(lastName) = upper('Pringle')
Date
Msg-id 25368.1042346758@sss.pgh.pa.us
Whole thread Raw
In response to Select * from users WHERE upper(lastName) = upper('Pringle')  ("Hale Pringle" <halepringle@yahoo.com>)
List pgsql-jdbc
"Hale Pringle" <halepringle@yahoo.com> writes:
> In a situation where a column names lastname is char(40):
> SELECT * FROM users WHERE lastname = 'Pringle'   returns one row.
> SELECT * FROM users WHERE UPPER(lastname) = UPPER('Pringle')   will return
> zero rows.
> SELECT * FROM users WHERE TRIM(UPPER(lastname) = TRIM(UPPER('Pringle'))
> will return one row.

lastname is actually 'Pringle                                 ' because
of char(N)'s implicit space-padding.

When you compare it to an untyped literal 'Pringle', the comparison
is assumed to be done under the rules of the char(N) datatype, in
which trailing spaces are not significant --- so you get TRUE.

When you feed lastname to UPPER(), the result is of type TEXT, because
we only have one form of UPPER() and it takes and returns TEXT.  And
trailing spaces *are* significant according to TEXT's equality operator.
So your second case reduces to
  'PRINGLE                                 '::text = 'PRINGLE'::text
which is false.

There has been some talk of altering the coercion rules so that
CHAR(n)-to-TEXT coercion is not direct but implicitly invokes rtrim()
to get rid of the trailing spaces.  This would cause your example to
be effectively
  SELECT * FROM users WHERE UPPER(RTRIM(lastname)) = UPPER('Pringle')
which would avoid most of the surprise factor.  There are probably still
some unpleasant corner cases in this approach, though.

In the meantime: my recommendation is to avoid char(N) like the plague,
except in cases where the data naturally has a fixed width (US state
postal abbreviations, for example, are inherently char(2)).  You should
be using varchar(40) or text for lastname.  char(40) loses on
efficiency, storage space, and every other dimension as well as this
one.

            regards, tom lane

PS: this is not a JDBC issue, but a backend issue.  If you want to
discuss it further, please respect the followup-to: pgsql-sql.

pgsql-jdbc by date:

Previous
From: Dave Cramer
Date:
Subject: Re: Sequence
Next
From: Simon Mitchell
Date:
Subject: Re: Sequence