Thread: PG SQL and LIKE clause
Hello, We're porting a huge Library Management System, written using all kind of languages one can think of (C, C++, ESQL/C, Perl, Java, ...) on Linux from the DBS Sybase to PG, millions of lines of code, which works also with DBS Oracle and in the past with INFORMIX-SE and -ONLINE. We got to know that in CHAR columns with trailing blanks a SELECT ... FROM ... WHERE name LIKE 'Ali' does not match in 'name' having 'Ali '. I glanced through our code with grep pipelines and found some hundred places which would be affected by this problem. I'm not interested in a religious discussion if or if not this behaviour of PG is correcter or better than in Sybase. It's just different to Sybase. Any hints to address this problem? Or is there any compile time option for the PG server to address this? Thanks matthias -- Matthias Apitz, ✉ guru@unixarea.de, http://www.unixarea.de/ +49-176-38902045 Public GnuPG key: http://www.unixarea.de/key.pub Mientras haya voluntad de lucha habrá esperanza de vencer.
Hi
pá 13. 9. 2019 v 7:29 odesílatel Matthias Apitz <guru@unixarea.de> napsal:
Hello,
We're porting a huge Library Management System, written using all kind
of languages one can think of (C, C++, ESQL/C, Perl, Java, ...) on Linux
from the DBS Sybase to PG, millions of lines of code, which works also
with DBS Oracle and in the past with INFORMIX-SE and -ONLINE.
We got to know that in CHAR columns with trailing blanks a
SELECT ... FROM ... WHERE name LIKE 'Ali'
does not match in 'name' having 'Ali '.
I glanced through our code with grep pipelines and found some hundred
places which would be affected by this problem. I'm not interested in a
religious discussion if or if not this behaviour of PG is correcter or
better than in Sybase. It's just different to Sybase.
Any hints to address this problem? Or is there any compile time option
for the PG server to address this?
There is not simple solution - you should to write own patch and used patched postgres.
Pavel
Thanks
matthias
--
Matthias Apitz, ✉ guru@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub
Mientras haya voluntad de lucha habrá esperanza de vencer.
On 9/13/19 12:28 AM, Matthias Apitz wrote: > Hello, > > We're porting a huge Library Management System, written using all kind > of languages one can think of (C, C++, ESQL/C, Perl, Java, ...) on Linux > from the DBS Sybase to PG, millions of lines of code, which works also > with DBS Oracle and in the past with INFORMIX-SE and -ONLINE. > > We got to know that in CHAR columns with trailing blanks a > > SELECT ... FROM ... WHERE name LIKE 'Ali' > > does not match in 'name' having 'Ali '. Did you forget the "%"? Because the SQL standard which PostgreSQL follows is: SELECT ... FROM ... WHERE name LIKE 'Ali%' > I glanced through our code with grep pipelines and found some hundred > places which would be affected by this problem. I'm not interested in a > religious discussion if or if not this behaviour of PG is correcter or > better than in Sybase. It's just different to Sybase. SQL Server derives from Sybase, and it would also fail on this statement. > Any hints to address this problem? Or is there any compile time option > for the PG server to address this? > > Thanks > > matthias -- Angular momentum makes the world go 'round.
El día Friday, September 13, 2019 a las 07:33:10AM +0200, Pavel Stehule escribió: > > We got to know that in CHAR columns with trailing blanks a > > > > SELECT ... FROM ... WHERE name LIKE 'Ali' > > > > does not match in 'name' having 'Ali '. > > > > I glanced through our code with grep pipelines and found some hundred > > places which would be affected by this problem. I'm not interested in a > > religious discussion if or if not this behaviour of PG is correcter or > > better than in Sybase. It's just different to Sybase. > > > > Any hints to address this problem? Or is there any compile time option > > for the PG server to address this? > > > > There is not simple solution - you should to write own patch and used > patched postgres. Hello, As we compiled our own PG 11.4 and as we're alone on the server with our software plus PG, this could be an option to consider. Do you have an idea where to look for this in the PG source. Maybe it's just deleting the trailing blanks from the column content before applying the string compare with regular expressions. Thanks matthias -- Matthias Apitz, ✉ guru@unixarea.de, http://www.unixarea.de/ +49-176-38902045 Public GnuPG key: http://www.unixarea.de/key.pub May, 9: Спаси́бо освободители! Thank you very much, Russian liberators!
pá 13. 9. 2019 v 8:49 odesílatel Matthias Apitz <guru@unixarea.de> napsal:
El día Friday, September 13, 2019 a las 07:33:10AM +0200, Pavel Stehule escribió:
> > We got to know that in CHAR columns with trailing blanks a
> >
> > SELECT ... FROM ... WHERE name LIKE 'Ali'
> >
> > does not match in 'name' having 'Ali '.
> >
> > I glanced through our code with grep pipelines and found some hundred
> > places which would be affected by this problem. I'm not interested in a
> > religious discussion if or if not this behaviour of PG is correcter or
> > better than in Sybase. It's just different to Sybase.
> >
> > Any hints to address this problem? Or is there any compile time option
> > for the PG server to address this?
> >
>
> There is not simple solution - you should to write own patch and used
> patched postgres.
Hello,
As we compiled our own PG 11.4 and as we're alone on the server with
our software plus PG, this could be an option to consider.
Do you have an idea where to look for this in the PG source. Maybe it's
just deleting the trailing blanks from the column content before applying
the string compare with regular expressions.
I am not a expert on this area, and I don't know all dependency. Some code is in src/backend/utils/adt/like.c
Pavel
Thanks
matthias
--
Matthias Apitz, ✉ guru@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub
May, 9: Спаси́бо освободители! Thank you very much, Russian liberators!
Matthias Apitz schrieb am 13.09.2019 um 07:28: > We're porting a huge Library Management System, written using all kind > of languages one can think of (C, C++, ESQL/C, Perl, Java, ...) on Linux > from the DBS Sybase to PG, millions of lines of code, which works also > with DBS Oracle and in the past with INFORMIX-SE and -ONLINE. > > We got to know that in CHAR columns with trailing blanks a In a nutshell: do not use the CHAR data type (neither in Oracle nor in Postgres) Unless your application actively checks the data type name of columns in a table you can simply change the type of thosecolumns to varchar (or text) and get rid of the annoying (but required) behaviour of the CHAR type. To be honest, I am surprised this didn't show up in Oracle, as I think the CHAR behaviour there is the same as in Postgres. Thomas
On 9/12/19 11:11 PM, Ron wrote: > On 9/13/19 12:28 AM, Matthias Apitz wrote: >> Hello, >> >> We're porting a huge Library Management System, written using all kind >> of languages one can think of (C, C++, ESQL/C, Perl, Java, ...) on Linux >> from the DBS Sybase to PG, millions of lines of code, which works also >> with DBS Oracle and in the past with INFORMIX-SE and -ONLINE. >> >> We got to know that in CHAR columns with trailing blanks a >> >> SELECT ... FROM ... WHERE name LIKE 'Ali' >> >> does not match in 'name' having 'Ali '. > > Did you forget the "%"? Because the SQL standard which PostgreSQL > follows is: > > SELECT ... FROM ... WHERE name LIKE 'Ali%' The above is if you want to find a string starting with 'Ali'. If you are looking for the complete string 'Ali' then it is appropriate. The OP is looking for a way to automatically match a complete string against a right stripped string from a CHAR field. > >> I glanced through our code with grep pipelines and found some hundred >> places which would be affected by this problem. I'm not interested in a >> religious discussion if or if not this behaviour of PG is correcter or >> better than in Sybase. It's just different to Sybase. > > SQL Server derives from Sybase, and it would also fail on this statement. > >> Any hints to address this problem? Or is there any compile time option >> for the PG server to address this? >> >> Thanks >> >> matthias > -- Adrian Klaver adrian.klaver@aklaver.com
Matthias Apitz <guru@unixarea.de> writes: > We got to know that in CHAR columns with trailing blanks a > SELECT ... FROM ... WHERE name LIKE 'Ali' > does not match in 'name' having 'Ali '. > I glanced through our code with grep pipelines and found some hundred > places which would be affected by this problem. I'm not interested in a > religious discussion if or if not this behaviour of PG is correcter or > better than in Sybase. It's just different to Sybase. > Any hints to address this problem? Don't use char(n) for anything more complicated than cases like it-must-be-a-2-character-state-abbreviation. You can use varchar(n) if you must have an upper limit on the field length. Or text. regards, tom lane
On 9/13/19 9:14 AM, Adrian Klaver wrote: > On 9/12/19 11:11 PM, Ron wrote: >> On 9/13/19 12:28 AM, Matthias Apitz wrote: >>> Hello, >>> >>> We're porting a huge Library Management System, written using all kind >>> of languages one can think of (C, C++, ESQL/C, Perl, Java, ...) on Linux >>> from the DBS Sybase to PG, millions of lines of code, which works also >>> with DBS Oracle and in the past with INFORMIX-SE and -ONLINE. >>> >>> We got to know that in CHAR columns with trailing blanks a >>> >>> SELECT ... FROM ... WHERE name LIKE 'Ali' >>> >>> does not match in 'name' having 'Ali '. >> >> Did you forget the "%"? Because the SQL standard which PostgreSQL >> follows is: >> >> SELECT ... FROM ... WHERE name LIKE 'Ali%' > > The above is if you want to find a string starting with 'Ali'. If you are > looking for the complete string 'Ali' then it is appropriate. The OP is > looking for a way to automatically match a complete string against a right > stripped string from a CHAR field. This is highly dependent on implementation. On the RDBMS that I used to work on, trailing whitespace was automatically stripped from CHAR(xx) fields. -- Angular momentum makes the world go 'round.
On 9/13/19 7:35 AM, Ron wrote: > On 9/13/19 9:14 AM, Adrian Klaver wrote: >> On 9/12/19 11:11 PM, Ron wrote: >>> On 9/13/19 12:28 AM, Matthias Apitz wrote: >>>> Hello, >>>> >>>> We're porting a huge Library Management System, written using all kind >>>> of languages one can think of (C, C++, ESQL/C, Perl, Java, ...) on >>>> Linux >>>> from the DBS Sybase to PG, millions of lines of code, which works also >>>> with DBS Oracle and in the past with INFORMIX-SE and -ONLINE. >>>> >>>> We got to know that in CHAR columns with trailing blanks a >>>> >>>> SELECT ... FROM ... WHERE name LIKE 'Ali' >>>> >>>> does not match in 'name' having 'Ali '. >>> >>> Did you forget the "%"? Because the SQL standard which PostgreSQL >>> follows is: >>> >>> SELECT ... FROM ... WHERE name LIKE 'Ali%' >> >> The above is if you want to find a string starting with 'Ali'. If you >> are looking for the complete string 'Ali' then it is appropriate. The >> OP is looking for a way to automatically match a complete string >> against a right stripped string from a CHAR field. > > This is highly dependent on implementation. On the RDBMS that I used to > work on, trailing whitespace was automatically stripped from CHAR(xx) > fields. > That is what is driving the OP's question. On Sybase the trailing whitespace is stripped apparently, on Postgres it is not, at least for the purposes of LIKE. Though for other comparisons whitespaces are ignored. See below for more information: https://www.postgresql.org/docs/11/datatype-character.html -- Adrian Klaver adrian.klaver@aklaver.com
Is this a possibility?
From https://www.postgresql.org/docs/current/functions-matching.html#FUNCTIONS-POSIX-REGEXP -
~ 'ali[ ]*$' matches strings ending in ali that have zero of more spaces after ali
this would match
'bali'
'ali'
'ali '
'bali '
If you need full string then it would be like this
~ '^ali[ ]*$' matches string only containing ali plus zero or more spaces after ali
this would match
'ali'
'ali '
but not match
'bali'
'bali '
Also switching ~* for ~ makes if case insensitive if necessary.
John
On Thu, Sep 12, 2019 at 10:29 PM Matthias Apitz <guru@unixarea.de> wrote:
Hello,
We're porting a huge Library Management System, written using all kind
of languages one can think of (C, C++, ESQL/C, Perl, Java, ...) on Linux
from the DBS Sybase to PG, millions of lines of code, which works also
with DBS Oracle and in the past with INFORMIX-SE and -ONLINE.
We got to know that in CHAR columns with trailing blanks a
SELECT ... FROM ... WHERE name LIKE 'Ali'
does not match in 'name' having 'Ali '.
I glanced through our code with grep pipelines and found some hundred
places which would be affected by this problem. I'm not interested in a
religious discussion if or if not this behaviour of PG is correcter or
better than in Sybase. It's just different to Sybase.
Any hints to address this problem? Or is there any compile time option
for the PG server to address this?
Thanks
matthias
--
Matthias Apitz, ✉ guru@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub
Mientras haya voluntad de lucha habrá esperanza de vencer.