Thread: Stripping white-space in SELECT statments

Stripping white-space in SELECT statments

From
Thorbjörn Eriksson
Date:
Hello,

I've encountered a strange behavior in postgres 7.2.1 regarding how psql
handles strings ending with space characters.

If I want to search for records where the first column (artnrgrpmtrln_1)
begins with
'201901  ', our system that uses the database creates the following SQL
statement:

select artnrgrpmtrln_1 from sr where (artnrgrpmtrln_1>='201901  ' and
artnrgrpmtrln_1<='201901  ÿ'

The execution of this statement gives the following resultset, which I
didn't expect. What I wanted was the first 8 records only.
    artnrgrpmtrln_1
--------------------------201901  00R18000  0035C2201901  00R18005  0035C3201901  00R18707  007593201901  00R28541
0030D6201901 00R40055  0030D8201901  00R40277T 0030D7201901  00S00406  0030D9201901  00SA0200  003662201901-D00R18000
0035C2201901-D00R18005 0035C3201901-D00R18702  008439201901-D00R18707  007593201901-D00R28541  0030D6201901-D00R40055
0030D8201901-D00R40277T0030D7201901-D00S00406  0030D9201901-D00SA0200  003662201901JW00R18000  0035C2201901JW00R18005
0035C3201901JW00R18707 007593
 

The Table description is below.
                Table "sr"    Column      |     Type      | Modifiers
-----------------+---------------+-----------artnrgrpmtrln_1 | character(24) |mangd_2         | character(8)  |enhet_3
      | character(1)  |text_4          | character(15) |start_5         | character(3)  |lageruppd_6     | character(1)
|materialnr_7    | character(8)  |opfoljd_8       | character(3)  |lgst_9          | character(3)  |
 
Indexes: sr_materialnr_7
Unique keys: sr_artnrgrpmtrln_1

This behaviour seems to have changed since postgreSQL v. 7.2, since it works
there. The reason that we don't use 'LIKE 201901  %' is that it don't use
the index sr_artnrgrpmtrln_1 when doing the lookup.

Is there anyone who can explain this behaviour? Could it be that the parser
strips of the whitespaces in '201901  '?

Best Regards,
Tobbe




Re: Stripping white-space in SELECT statments

From
dima
Date:
> If I want to search for records where the first column (artnrgrpmtrln_1)
> begins with
> '201901  ', our system that uses the database creates the following SQL
> statement:
> 
> select artnrgrpmtrln_1 from sr where (artnrgrpmtrln_1>='201901  ' and
> artnrgrpmtrln_1<='201901  ÿ'
what does "our system" mean?




Re: Stripping white-space in SELECT statments

From
Thorbjörn Eriksson
Date:
By "our system" I mean the software that uses the database. It is a quit old
software written in C that has been ported a couple of times to different
*NIX platforms using different DBMS's. It uses in-house developed functions
that, depending on parameters, creates a variety of SQL statements that in
the end gets executed by PQexec (from libpq i guess).

> -----Ursprungligt meddelande-----
> Från: dima [mailto:_pppp@mail.ru]
> Skickat: den 19 september 2002 13:08
> Till: thorbjorn.eriksson@ec.se
> Kopia: pgsql-sql@postgresql.org
> Ämne: Re: [SQL] Stripping white-space in SELECT statments
>
>
> > If I want to search for records where the first column (artnrgrpmtrln_1)
> > begins with
> > '201901  ', our system that uses the database creates the following SQL
> > statement:
> >
> > select artnrgrpmtrln_1 from sr where (artnrgrpmtrln_1>='201901  ' and
> > artnrgrpmtrln_1<='201901  ÿ'
> what does "our system" mean?
>
>




Re: Stripping white-space in SELECT statments

From
Tom Lane
Date:
Thorbjörn Eriksson <thorbjorn.eriksson@ec.se> writes:
> I've encountered a strange behavior in postgres 7.2.1 regarding how psql
> handles strings ending with space characters.

Perhaps you are running in a non-C locale?  A lot of locales have
sorting rules that are pretty weird about whitespace.

> The reason that we don't use 'LIKE 201901  %' is that it don't use
> the index

This suggests strongly that you are in a non-C locale.  Your external
software seems to be emulating the standard LIKE-to-index optimization;
which as you are now discovering, does not work with non-C sorting
rules (so the system doesn't try to apply it).
        regards, tom lane


Re: Stripping white-space in SELECT statments

From
Stephan Szabo
Date:
On Thu, 19 Sep 2002, [iso-8859-1] Thorbj�rn Eriksson wrote:

> Hello,
>
> I've encountered a strange behavior in postgres 7.2.1 regarding how psql
> handles strings ending with space characters.
>

What locale did you initdb with?  Some locales on some systems have
behavior like that.  To test, you could try the unix sort command
on data like:
201901  Z
201901-D
201901 D
in the same locale and see what order you get (specifically where
does that first line go - on my machine it goes to the end unless
I explicitly choose a locale like "C")



Re: Stripping white-space in SELECT statments

From
Mathieu Arnold
Date:

--On jeudi 19 septembre 2002 13:20 +0200 Thorbjörn Eriksson
<thorbjorn.eriksson@ec.se> wrote:

> Hello,
>
> I've encountered a strange behavior in postgres 7.2.1 regarding how psql
> handles strings ending with space characters.
>
> If I want to search for records where the first column (artnrgrpmtrln_1)
> begins with
> '201901  ', our system that uses the database creates the following SQL
> statement:
>
> select artnrgrpmtrln_1 from sr where (artnrgrpmtrln_1>='201901  ' and
> artnrgrpmtrln_1<='201901  ÿ'

and why not
select artnrgrpmtrln_1
from sr
where substr(artnrgrpmtrln_1, 0, 9) = '201901  '


--
Mathieu Arnold


Re: Stripping white-space in SELECT statments

From
Andreas Joseph Krogh
Date:
On Thursday 19 September 2002 13:41, Thorbjörn Eriksson wrote:
> By "our system" I mean the software that uses the database. It is a quit
> old software written in C that has been ported a couple of times to
> different *NIX platforms using different DBMS's. It uses in-house developed
> functions that, depending on parameters, creates a variety of SQL
> statements that in the end gets executed by PQexec (from libpq i guess).
[snip]
> > > artnrgrpmtrln_1<='201901  ÿ'

This looks like to me that your C software doesn't '\0' terminate its strings
proparly and some random byte gets in the query.

--
Andreas Joseph Krogh <andreak@officenet.no>



Re: Stripping white-space in SELECT statments

From
Thorbjörn Eriksson
Date:
Thank's Tom Lane & Stephan Szabo for pointing out the problem to me.

After some testing it turned out that the swedish locale, 'sv_SE', doesn't
handle sorting spaces as expected, which probably made the SELECT fail. On
the other hand, if I use the 'C' locale, the SELECT works but not the sort
order of the swedish characters 'åäö'.

Does anyone know a solution to this problem, or could give me a hint?


> -----Ursprungligt meddelande-----
> Från: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Skickat: den 19 september 2002 16:32
> Till: thorbjorn.eriksson@ec.se
> Kopia: pgsql-sql@postgresql.org
> Ämne: Re: [SQL] Stripping white-space in SELECT statments
>
>
> Thorbjörn Eriksson <thorbjorn.eriksson@ec.se> writes:
> > I've encountered a strange behavior in postgres 7.2.1 regarding how psql
> > handles strings ending with space characters.
>
> Perhaps you are running in a non-C locale?  A lot of locales have
> sorting rules that are pretty weird about whitespace.
>
> > The reason that we don't use 'LIKE 201901  %' is that it don't use
> > the index
>
> This suggests strongly that you are in a non-C locale.  Your external
> software seems to be emulating the standard LIKE-to-index optimization;
> which as you are now discovering, does not work with non-C sorting
> rules (so the system doesn't try to apply it).
>
>             regards, tom lane
>
>




Re: Stripping white-space in SELECT statments

From
Tom Lane
Date:
Thorbjörn Eriksson <thorbjorn.eriksson@ec.se> writes:
> Thank's Tom Lane & Stephan Szabo for pointing out the problem to me.
> After some testing it turned out that the swedish locale, 'sv_SE', doesn't
> handle sorting spaces as expected, which probably made the SELECT fail. On
> the other hand, if I use the 'C' locale, the SELECT works but not the sort
> order of the swedish characters '���'.

> Does anyone know a solution to this problem, or could give me a hint?

I think you are going to have to create a custom locale definition that
sorts the accented characters as you wish, but does not have the strange
rules about whitespace.  I don't know enough about locale definitions to
give advice on how ... but I'll bet after a couple hours study you could
do it by copying just the parts you want of the existing Swedish locale 
definition.
        regards, tom lane