Thread: Stripping white-space in SELECT statments
Hello, I've encountered a strange behavior in postgres 7.2.1 regarding how psql handles strings ending with space characters. If I want to search for records where the first column (artnrgrpmtrln_1) begins with '201901 ', our system that uses the database creates the following SQL statement: select artnrgrpmtrln_1 from sr where (artnrgrpmtrln_1>='201901 ' and artnrgrpmtrln_1<='201901 ÿ' The execution of this statement gives the following resultset, which I didn't expect. What I wanted was the first 8 records only. artnrgrpmtrln_1 --------------------------201901 00R18000 0035C2201901 00R18005 0035C3201901 00R18707 007593201901 00R28541 0030D6201901 00R40055 0030D8201901 00R40277T 0030D7201901 00S00406 0030D9201901 00SA0200 003662201901-D00R18000 0035C2201901-D00R18005 0035C3201901-D00R18702 008439201901-D00R18707 007593201901-D00R28541 0030D6201901-D00R40055 0030D8201901-D00R40277T0030D7201901-D00S00406 0030D9201901-D00SA0200 003662201901JW00R18000 0035C2201901JW00R18005 0035C3201901JW00R18707 007593 The Table description is below. Table "sr" Column | Type | Modifiers -----------------+---------------+-----------artnrgrpmtrln_1 | character(24) |mangd_2 | character(8) |enhet_3 | character(1) |text_4 | character(15) |start_5 | character(3) |lageruppd_6 | character(1) |materialnr_7 | character(8) |opfoljd_8 | character(3) |lgst_9 | character(3) | Indexes: sr_materialnr_7 Unique keys: sr_artnrgrpmtrln_1 This behaviour seems to have changed since postgreSQL v. 7.2, since it works there. The reason that we don't use 'LIKE 201901 %' is that it don't use the index sr_artnrgrpmtrln_1 when doing the lookup. Is there anyone who can explain this behaviour? Could it be that the parser strips of the whitespaces in '201901 '? Best Regards, Tobbe
> If I want to search for records where the first column (artnrgrpmtrln_1) > begins with > '201901 ', our system that uses the database creates the following SQL > statement: > > select artnrgrpmtrln_1 from sr where (artnrgrpmtrln_1>='201901 ' and > artnrgrpmtrln_1<='201901 ÿ' what does "our system" mean?
By "our system" I mean the software that uses the database. It is a quit old software written in C that has been ported a couple of times to different *NIX platforms using different DBMS's. It uses in-house developed functions that, depending on parameters, creates a variety of SQL statements that in the end gets executed by PQexec (from libpq i guess). > -----Ursprungligt meddelande----- > Från: dima [mailto:_pppp@mail.ru] > Skickat: den 19 september 2002 13:08 > Till: thorbjorn.eriksson@ec.se > Kopia: pgsql-sql@postgresql.org > Ämne: Re: [SQL] Stripping white-space in SELECT statments > > > > If I want to search for records where the first column (artnrgrpmtrln_1) > > begins with > > '201901 ', our system that uses the database creates the following SQL > > statement: > > > > select artnrgrpmtrln_1 from sr where (artnrgrpmtrln_1>='201901 ' and > > artnrgrpmtrln_1<='201901 ÿ' > what does "our system" mean? > >
Thorbjörn Eriksson <thorbjorn.eriksson@ec.se> writes: > I've encountered a strange behavior in postgres 7.2.1 regarding how psql > handles strings ending with space characters. Perhaps you are running in a non-C locale? A lot of locales have sorting rules that are pretty weird about whitespace. > The reason that we don't use 'LIKE 201901 %' is that it don't use > the index This suggests strongly that you are in a non-C locale. Your external software seems to be emulating the standard LIKE-to-index optimization; which as you are now discovering, does not work with non-C sorting rules (so the system doesn't try to apply it). regards, tom lane
On Thu, 19 Sep 2002, [iso-8859-1] Thorbj�rn Eriksson wrote: > Hello, > > I've encountered a strange behavior in postgres 7.2.1 regarding how psql > handles strings ending with space characters. > What locale did you initdb with? Some locales on some systems have behavior like that. To test, you could try the unix sort command on data like: 201901 Z 201901-D 201901 D in the same locale and see what order you get (specifically where does that first line go - on my machine it goes to the end unless I explicitly choose a locale like "C")
--On jeudi 19 septembre 2002 13:20 +0200 Thorbjörn Eriksson <thorbjorn.eriksson@ec.se> wrote: > Hello, > > I've encountered a strange behavior in postgres 7.2.1 regarding how psql > handles strings ending with space characters. > > If I want to search for records where the first column (artnrgrpmtrln_1) > begins with > '201901 ', our system that uses the database creates the following SQL > statement: > > select artnrgrpmtrln_1 from sr where (artnrgrpmtrln_1>='201901 ' and > artnrgrpmtrln_1<='201901 ÿ' and why not select artnrgrpmtrln_1 from sr where substr(artnrgrpmtrln_1, 0, 9) = '201901 ' -- Mathieu Arnold
On Thursday 19 September 2002 13:41, Thorbjörn Eriksson wrote: > By "our system" I mean the software that uses the database. It is a quit > old software written in C that has been ported a couple of times to > different *NIX platforms using different DBMS's. It uses in-house developed > functions that, depending on parameters, creates a variety of SQL > statements that in the end gets executed by PQexec (from libpq i guess). [snip] > > > artnrgrpmtrln_1<='201901 ÿ' This looks like to me that your C software doesn't '\0' terminate its strings proparly and some random byte gets in the query. -- Andreas Joseph Krogh <andreak@officenet.no>
Thank's Tom Lane & Stephan Szabo for pointing out the problem to me. After some testing it turned out that the swedish locale, 'sv_SE', doesn't handle sorting spaces as expected, which probably made the SELECT fail. On the other hand, if I use the 'C' locale, the SELECT works but not the sort order of the swedish characters 'åäö'. Does anyone know a solution to this problem, or could give me a hint? > -----Ursprungligt meddelande----- > Från: Tom Lane [mailto:tgl@sss.pgh.pa.us] > Skickat: den 19 september 2002 16:32 > Till: thorbjorn.eriksson@ec.se > Kopia: pgsql-sql@postgresql.org > Ämne: Re: [SQL] Stripping white-space in SELECT statments > > > Thorbjörn Eriksson <thorbjorn.eriksson@ec.se> writes: > > I've encountered a strange behavior in postgres 7.2.1 regarding how psql > > handles strings ending with space characters. > > Perhaps you are running in a non-C locale? A lot of locales have > sorting rules that are pretty weird about whitespace. > > > The reason that we don't use 'LIKE 201901 %' is that it don't use > > the index > > This suggests strongly that you are in a non-C locale. Your external > software seems to be emulating the standard LIKE-to-index optimization; > which as you are now discovering, does not work with non-C sorting > rules (so the system doesn't try to apply it). > > regards, tom lane > >
Thorbjörn Eriksson <thorbjorn.eriksson@ec.se> writes: > Thank's Tom Lane & Stephan Szabo for pointing out the problem to me. > After some testing it turned out that the swedish locale, 'sv_SE', doesn't > handle sorting spaces as expected, which probably made the SELECT fail. On > the other hand, if I use the 'C' locale, the SELECT works but not the sort > order of the swedish characters '���'. > Does anyone know a solution to this problem, or could give me a hint? I think you are going to have to create a custom locale definition that sorts the accented characters as you wish, but does not have the strange rules about whitespace. I don't know enough about locale definitions to give advice on how ... but I'll bet after a couple hours study you could do it by copying just the parts you want of the existing Swedish locale definition. regards, tom lane