Thread: [Fwd: Re: [Pljava-dev] char with trailing space, PreparedStatement.setObject & SetString]

There's an inconsistency between the handling of trailing whitespace in
query parameters in the client jdbc driver compared to the PL/Java SPI
based driver. According to Jean-Pierre, the former apparently trims the
trailing spaces before passing the query (see below). What is the
correct behavior?

Regards,
Thomas Hallgren


-------- Original Message --------
Subject:     Re: [Pljava-dev] char with trailing space,
PreparedStatement.setObject & SetString
Date:     Tue, 27 Jun 2006 12:07:19 -0400
From:     JEAN-PIERRE PELLETIER <pelletier_32@sympatico.ca>
To:     thomas@tada.se
CC:     pljava-dev@gborg.postgresql.org



Hi Thomas,

There are very few char columns on my system, I can easily live with
explicit trim
in my application code, I only wanted to let you guys know.

I am not sure which of the two JDBC implementations is right.
psql and pgadmin would both handle char as expected.
As for JDBC, you might want to know how other dbms are handling this.

Thanks for your reply.
Jean-Pierre Pelletier

>From: Thomas Hallgren <thomas@tada.se>
>To: JEAN-PIERRE PELLETIER <pelletier_32@sympatico.ca>
>CC: pljava-dev@gborg.postgresql.org
>Subject: Re: [Pljava-dev] char with trailing
>space,    PreparedStatement.setObject & SetString
>Date: Tue, 27 Jun 2006 17:47:24 +0200
>
>Hi Jean-Pierre,
>I'm not sure this is incorrect behavior. There's nothing in the spec that
>indicates that String values should be trimmed by setString and setObject.
>On the contrary. Some datatypes (the CHAR in particular) are sensitive to
>whitespace according to the SQL standard. Perhaps the client jdbc driver is
>doing something wrong here?
>
>Regards,
>Thomas Hallgren
>
>
>JEAN-PIERRE PELLETIER wrote:
>>Hi,
>>
>>Trailing space are not handled properly by setObject & setString.
>>
>>PreparedStatement pstmt = connection.prepareStatement(
>>    "select * from mytable where mycharcolumn = ?");
>>
>>String myString = "abc ";
>>pstmt.setObject(1, myString); // or setObject(1, myString, Types.CHAR) or
>>setString(1, myString)
>>
>>No rows are returned, but using trim works fine as in:
>>pstmt.setObject(1, myString.trim());
>>
>>My environment is Pl/Java 1.3, Sun JDK 1.5.07,  PostgreSQL 8.1.4, Windows
>>XP SP2
>>
>>With PostgreSQL own (non pl/java) jdbc driver, setObject on char works
>>fine without the trim.
>>
>>Thanks,
>>Jean-Pierre Pelletier
>>
>>
>>_______________________________________________
>>Pljava-dev mailing list
>>Pljava-dev@gborg.postgresql.org
>>http://gborg.postgresql.org/mailman/listinfo/pljava-dev
>>




Re: [Fwd: Re: [Pljava-dev] char with trailing space,

From
Kris Jurka
Date:

On Tue, 27 Jun 2006, Thomas Hallgren wrote:

> There's an inconsistency between the handling of trailing whitespace in
> query parameters in the client jdbc driver compared to the PL/Java SPI
> based driver. According to Jean-Pierre, the former apparently trims the
> trailing spaces before passing the query (see below). What is the
> correct behavior?

The JDBC driver does not trim spaces.  What it does is pass setString()
values with a type of varchar, perhaps pljava is using text and exposing
this subtle difference:

# select 'a '::char(2) = 'a '::text;
  ?column?
----------
  f
(1 row)
# select 'a '::char(2) = 'a '::varchar;
  ?column?
----------
  t
(1 row)

The original archive discussion here:
http://archives.postgresql.org/pgsql-jdbc/2004-10/msg00241.php

Kris Jurka

Kris Jurka wrote:
>
> The JDBC driver does not trim spaces.  What it does is pass
> setString() values with a type of varchar, perhaps pljava is using
> text and exposing this subtle difference:
>
> # select 'a '::char(2) = 'a '::text;
>  ?column?
> ----------
>  f
> (1 row)
> # select 'a '::char(2) = 'a '::varchar;
>  ?column?
> ----------
>  t
> (1 row)
>
I'm missing something. Both text and varchar consider the space significant:


thhal=# select length(' '::char);
 length
--------
      0
(1 row)

thhal=# select length(' '::varchar);
 length
--------
      1
(1 row)

thhal=# select length(' '::text);
 length
--------
      1
(1 row)

so why does it make a difference to use varchar instead of text? I.e.
why are your comparison results different? Why is the space
insignificant when using char?

Regards,
Thomas Hallgren


Thomas Hallgren <thomas@tada.se> writes:
> so why does it make a difference to use varchar instead of text?

It's a question of type resolution rules.  text is the preferred string
type so it "wins" in cross-type situations, in particular
    char_value = text_value
will be interpreted as
    char_value::text text_eq text_value
varchar is not only not a preferred type, it doesn't even have any
comparison ops of its own (it depends on text's ops).  Therefore given
    char_value = varchar_value
the parser is faced with the alternative interpretations
    char_value::text text_eq varchar_value::text
    char_value char_eq varchar_value::char
and it will prefer the latter because it has fewer casts.
See
http://www.postgresql.org/docs/8.1/static/typeconv.html
particularly rules 10.2.3b and 3c.  The upshot of all this
is that a parameter specified as "text" dominates any comparison
to other string datatypes, whereas one specified as "varchar"
does not.  Yeah, it's a bit grotty, but it gets the job done
and it's not an area we're likely to change much.

            regards, tom lane

Tom Lane wrote:
> Thomas Hallgren <thomas@tada.se> writes:
>
>> so why does it make a difference to use varchar instead of text?
>>
>
> It's a question of type resolution rules.  text is the preferred string
> type so it "wins" in cross-type situations, in particular
>     char_value = text_value
> will be interpreted as
>     char_value::text text_eq text_value
> varchar is not only not a preferred type, it doesn't even have any
> comparison ops of its own (it depends on text's ops).  Therefore given
>     char_value = varchar_value
> the parser is faced with the alternative interpretations
>     char_value::text text_eq varchar_value::text
>     char_value char_eq varchar_value::char
> and it will prefer the latter because it has fewer casts.
> See
> http://www.postgresql.org/docs/8.1/static/typeconv.html
> particularly rules 10.2.3b and 3c.  The upshot of all this
> is that a parameter specified as "text" dominates any comparison
> to other string datatypes, whereas one specified as "varchar"
> does not.  Yeah, it's a bit grotty, but it gets the job done
> and it's not an area we're likely to change much.
>
>             regards, tom lane
>
Hmm, OK. I checked my code and I don't use any specific type. Instead, I
use SPI_getargtypeid(ePlan, idx) on the prepared query and the
pgType->typinput function of the type that corresponds to the returned
oid. Perhaps the SPI_getargtypeid should return varchar for all text types?

Regards,
Thomas Hallgren



Thomas Hallgren <thomas@tada.se> writes:
> Hmm, OK. I checked my code and I don't use any specific type. Instead, I
> use SPI_getargtypeid(ePlan, idx) on the prepared query and the
> pgType->typinput function of the type that corresponds to the returned
> oid. Perhaps the SPI_getargtypeid should return varchar for all text types?

Certainly not.  That would break most normal uses of SPI_getargtypeid,
and I don't even see how it fixes your problem; once the query is
prepared these decisions are already made.

            regards, tom lane

Tom Lane wrote:
> Thomas Hallgren <thomas@tada.se> writes:
>
>> Hmm, OK. I checked my code and I don't use any specific type. Instead, I
>> use SPI_getargtypeid(ePlan, idx) on the prepared query and the
>> pgType->typinput function of the type that corresponds to the returned
>> oid. Perhaps the SPI_getargtypeid should return varchar for all text types?
>>
>
> Certainly not.  That would break most normal uses of SPI_getargtypeid,
> and I don't even see how it fixes your problem; once the query is
> prepared these decisions are already made.
>
>             regards, tom lane
>
My problem is simple. When I pass a parameter with a trailing space,
that space is considered significant. When the same thing is done using
the client jdbc driver, the trailing space is insignificant. The client
driver, since it knows nothing about the actual plan, elects to use the
varchar type always. PL/Java uses the actual type that it gets from the
plan when it performs the parameter coercion. I don't think my solution
is wrong but it yields different results and that's undesirable. My
questions stem from an uncertainty on how to go about fixing this so
that the two drivers behave the same way (and for the same reason).
Should this be considered a conscious flaw in the client driver
motivated by the desire to limit the number of round trips? If it is,
what would the best course of action be for me? Implement the same flaw
although I don't have that problem?

Regards,
Thomas Hallgren


Thomas Hallgren <thomas@tada.se> writes:
> My problem is simple. When I pass a parameter with a trailing space,
> that space is considered significant. When the same thing is done using
> the client jdbc driver, the trailing space is insignificant.

Your problem is not simple, and that's not an adequate description of
it, because you are omitting the influence of context on what the
parameter gets typed as.

The real point here is that the JDBC driver is doing something to prompt
the parser to resolve the types of parameter symbols in particular ways,
and (I gather) you're doing something different.  If you want to match
the results they get, then you need to supply the same type information
for parameter symbols to SPI_prepare as they put into Parse messages.

            regards, tom lane

Tom Lane wrote:
> The real point here is that the JDBC driver is doing something to prompt
> the parser to resolve the types of parameter symbols in particular ways,
> and (I gather) you're doing something different.  If you want to match
> the results they get, then you need to supply the same type information
> for parameter symbols to SPI_prepare as they put into Parse messages.
>
>
My mistake. I thought that given a statement like:

  SELECT p.name FROM people p WHERE p.zipcode = ?

the actual type returned by SPI_getargtype would be inferred by the
expression 'p.zipcode'. I guess that's not the case. Instead I need to
provide the information earlier, like:

  SELECT p.name FROM people p WHERE p.zipcode = ?::varchar

is that correct? If so, is there any way that I can automatically infer
the best type?

Regards,
Thomas Hallgren


Thomas Hallgren wrote:

> Should this be considered a conscious flaw in the client driver
> motivated by the desire to limit the number of round trips? If it is,
> what would the best course of action be for me? Implement the same flaw
> although I don't have that problem?

There is considerably more to it than trying to avoid an extra
round-trip (although that is part of it). If we're given a String
parameter, why should we interpret it as anything other than a String?
We don't really want to implement an any-type-to-any-other-type
conversion routine in the driver!

Nevertheless there is a "fix" here -- pass "stringtype=unspecified" as a
URL parameter in the JDBC URL and the driver will pass parameters set
via setString() with unspecified types. This was originally put in to
handle apps that expected to be able to setString("42") on an int
column, but it should deal with your particular situation too. Use at
your own risk. See
http://jdbc.postgresql.org/documentation/head/connect.html#connection-parameters

Actually, one feature that might be nice in future protocol versions is
some ability to say "I think this parameter is one of these types --
please infer a type and fail to prepare if it doesn't match with the
expected types" which would also help with the timestamp-vs-timestamptz
case. Currently, we don't notice a mismatch between the value we passed
and the type that was inferred until well after the query is executed..
We could wait for a Describe before proceeding, but it seems a bit silly
if you can't do a decent implementation without sending Flush messages
everywhere, given the work the protocol does to let you stream messages
at it without waiting for results.

-O

Oliver Jowett wrote:
> Thomas Hallgren wrote:
>
>> Should this be considered a conscious flaw in the client driver
>> motivated by the desire to limit the number of round trips? If it is,
>> what would the best course of action be for me? Implement the same
>> flaw although I don't have that problem?
>
> There is considerably more to it than trying to avoid an extra
> round-trip (although that is part of it). If we're given a String
> parameter, why should we interpret it as anything other than a String?
> We don't really want to implement an any-type-to-any-other-type
> conversion routine in the driver!
>
No, of course not. I fully appreciate the way you do it today. The whole discussion stems
from a misunderstanding on my part. I thought that that type returned by the SPI_getargtype
was inferred from its context within the query. As it turns out it isn't.

> Nevertheless there is a "fix" here -- pass "stringtype=unspecified" as a
> URL parameter in the JDBC URL and the driver will pass parameters set
> via setString() with unspecified types. This was originally put in to
> handle apps that expected to be able to setString("42") on an int
> column, but it should deal with your particular situation too. Use at
> your own risk. See
> http://jdbc.postgresql.org/documentation/head/connect.html#connection-parameters
>
Right. Apparently, what I do is similar to using "stringtype-unspecified" always (and not
just for string types either). My driver has some room for improvements :-)

Anyway, thanks for helping me clearing this up. Now I know what to do.

Regards,
Thomas Hallgren