Thread: BUG #8335: trim() un-document behaviour

BUG #8335: trim() un-document behaviour

From
amutu@amutu.com
Date:
The following bug has been logged on the website:

Bug reference:      8335
Logged by:          Jov
Email address:      amutu@amutu.com
PostgreSQL version: 9.2.4
Operating system:   suse 10 linux 64
Description:

in the postgresql doc 9.4,I find the trim() function like this:


trim([leading | trailing | both] [characters] from string)


so the trim should be pass only one argument with some optional prefix。but I
find the following calls with two argument is successfull but the results is
unexpected and wired:


##first call
postgres=# select trim(trailing ‘/’, ‘fasd/’);
rtrim
——-


(1 row)
-----!!!note: it return titile is rtrim----


## second call
postgres=# select trim(‘/’, ‘fasd/’)
;
btrim
——-


(1 row)
-----!!!note: it return titile is btrim----


it seems trim is transform to r、b、l trim internal,but the above call should
return error or it may produce un-expect results

Re: BUG #8335: trim() un-document behaviour

From
Bruce Momjian
Date:
On Fri, Jul 26, 2013 at 02:23:10AM +0000, amutu@amutu.com wrote:
> in the postgresql doc 9.4,I find the trim() function like this:
>
>
> trim([leading | trailing | both] [characters] from string)
>
>
> so the trim should be pass only one argument with some optional prefix --- but I
> find the following calls with two argument is successfull but the results is
> unexpected and wired:
>
>
> ##first call
> postgres=# select trim(trailing '/, 'fasd/');
> rtrim
> ------
>
>
> (1 row)
> -----!!!note: it return titile is rtrim----
>
>
> ## second call
> postgres=# select trim('/', 'fasd/')
> ;
> btrim
> -----
>
>
> (1 row)
> -----!!!note: it return titile is btrim----
>
>
> it seems trim is transform to rtrim internal but the above call should
> return error or it may produce un-expect results

(I have cleaned up this posting because single-quotes were converted to
Unicode forward-backward quotes):

What is happening is that TRIM() is converted by the parser to calls to
base functions, e.g.

    \df *trim*
                              List of functions
       Schema   | Name  | Result data type | Argument data types |  Type
    ------------+-------+------------------+---------------------+--------
     pg_catalog | btrim | bytea            | bytea, bytea        | normal
     pg_catalog | btrim | text             | text                | normal
     pg_catalog | btrim | text             | text, text          | normal
     pg_catalog | ltrim | text             | text                | normal
     pg_catalog | ltrim | text             | text, text          | normal
     pg_catalog | rtrim | text             | text                | normal
     pg_catalog | rtrim | text             | text, text          | normal

That is why the headings don't say 'trim', but 'btrim', or similar ---
not sure we can easily improve that, and you can change the label with
AS.

The larger problem is the use of ',' instead of FROM, and the backwards
interpretation of the arguments.  The query:

    SELECT trim('/' FROM 'fasd/')

is internally converted to:

    SELECT btrim('fasd/', '/')

Note the arguments are reversed.  The comma syntax does not reverse the
arguments:

    SELECT trim('/', 'fasd/')

is internally converted to:

    SELECT btrim('/', 'fasd/')

You can even use modifiers like TRAILING with comma syntax:

    SELECT trim(TRAILING '/', 'fasd/');

and that uses 'rtrim', but of course the behavior is still reverse of
expected.

Basically the odd comma behavior is because without a FROM, the
arguments are passed directly to btrim/rtrim/ltrim, and these functions
take the origin string first, then the string of characters to remove.
You are right this is undocumented.

The attached patch swaps the arguments in the parser, and allows your
expected behavior:

    SELECT trim('x', 'xfasdx');
     btrim
    -------
     fasd

Another option would be to change the C API for the b/r/ltrim functions,
or disallow the use of the comma TRIM syntax in the parser.

I am a little worried people might be relying on the trim/comma syntax
somewhere.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +

Attachment

Re: BUG #8335: trim() un-document behaviour

From
Romain Billon-Grand
Date:
Running postgres 9.1.4 using pgAdmin3 on a macbook air intel OS 10.6

I just had this bug, using comma:

SELECT trim(trailing texte_natif , 'DATEDENAISSANCE' ) FROM CROP

returned the whole field, including the string 'DATEDENAISSANCE', but except
the very first character, the column being called rtrim.

Tryed to chang it to
SELECT trim(trailing texte_natif from 'DATEDENAISSANCE' ) FROM CROP

returned nothing ( as a dummy in postgres I can't say if it was null fields
or empty string...)


CROP.texte_natif is a quite long texte field,in which the STRING
'DATEDENAISSANCE can be present 0 to 3 times, and never before 20th
character

bests
Romain



--
View this message in context:
http://postgresql.1045698.n5.nabble.com/BUG-8335-trim-un-document-behaviour-tp5765221p5766930.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.

Re: BUG #8335: trim() un-document behaviour

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> The attached patch swaps the arguments in the parser, and allows your
> expected behavior:

This patch is completely unsafe.  It will break stored rules, which may
contain calls using the existing argument order (which will be dumped
without any of the SQL-spec syntactic sugar).  To say nothing of existing
applications that may be relying on calling the underlying functions with
their existing argument order.

The inconsistency in argument order is unfortunate but we're long since
stuck with it, I'm afraid.

            regards, tom lane

Re: BUG #8335: trim() un-document behaviour

From
Bruce Momjian
Date:
On Fri, Aug  9, 2013 at 11:06:15AM -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > The attached patch swaps the arguments in the parser, and allows your
> > expected behavior:
>
> This patch is completely unsafe.  It will break stored rules, which may
> contain calls using the existing argument order (which will be dumped
> without any of the SQL-spec syntactic sugar).  To say nothing of existing
> applications that may be relying on calling the underlying functions with
> their existing argument order.
>
> The inconsistency in argument order is unfortunate but we're long since
> stuck with it, I'm afraid.

Yes, I have thought about this some more and another problem is that
rtrim/btrim/ltrim() use the source string first, so having trim() have
the source string second when using a comma is very confusing, e.g.:

    -- with patch
    SELECT trim('x', 'xabcx');
     btrim
    -------
     abc

    -- btrim
    SELECT btrim('xabcx', 'x');
     btrim
    -------
     abc

I think we can either document what we have, or remove the ability to
use comma with trim().  If we go with documentation, it is going to look
confusing as the optional modifier is going to be on the source string,
e.g.:

    SELECT trim(both 'xabcx', 'x');
     btrim
    -------
     abc

We could modify the grammar to force the modifier on the second
argument, but that is more parser states for limited value.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +

Re: BUG #8335: trim() un-document behaviour

From
Romain Billon-Grand
Date:
Finally could make it using
regexp_replace (my_field_to_trim, '.+my_triming_string' , '') for the "lead=
ing" case

regexp_replace (my_field_to_trim, 'my_triming_string(.+)' , '') for the "tr=
ailing" case

And both of them in this order a for the "both" one.
I don't know why, but=C2=A0 could not use '*?'
Since in no instance my_triming_string was at the extremlity of the sting, =
I did not need it, but I guess something like '(|.+)my_triming_string'could=
 have worked



________________________________
 De=C2=A0: Bruce Momjian [via PostgreSQL] <ml-node+s1045698n5766983h79@n5.n=
abble.com>
=C3=80=C2=A0: Romain Billon-Grand <romainbillongrand@yahoo.fr>=20
Envoy=C3=A9 le : Vendredi 9 ao=C3=BBt 2013 18h24
Objet=C2=A0: Re: BUG #8335: trim() un-document behaviour
=20


On Fri, Aug =C2=A09, 2013 at 11:06:15AM -0400, Tom Lane wrote:=20

> Bruce Momjian <[hidden email]> writes:=20
> > The attached patch swaps the arguments in the parser, and allows your=
=20
> > expected behavior:=20
>=20
> This patch is completely unsafe. =C2=A0It will break stored rules, which =
may=20
> contain calls using the existing argument order (which will be dumped=20
> without any of the SQL-spec syntactic sugar). =C2=A0To say nothing of exi=
sting=20
> applications that may be relying on calling the underlying functions with=
=20
> their existing argument order.=20
>=20
> The inconsistency in argument order is unfortunate but we're long since=
=20
> stuck with it, I'm afraid.=20
Yes, I have thought about this some more and another problem is that=20
rtrim/btrim/ltrim() use the source string first, so having trim() have=20
the source string second when using a comma is very confusing, e.g.:=20

=C2=A0 =C2=A0 =C2=A0 =C2=A0 -- with patch=20
=C2=A0 =C2=A0 =C2=A0 =C2=A0 SELECT trim('x', 'xabcx');=20
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0btrim=20
=C2=A0 =C2=A0 =C2=A0 =C2=A0 -------=20
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0abc=20

=C2=A0 =C2=A0 =C2=A0 =C2=A0 -- btrim=20
=C2=A0 =C2=A0 =C2=A0 =C2=A0 SELECT btrim('xabcx', 'x');=20
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0btrim=20
=C2=A0 =C2=A0 =C2=A0 =C2=A0 -------=20
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0abc=20

I think we can either document what we have, or remove the ability to=20
use comma with trim(). =C2=A0If we go with documentation, it is going to lo=
ok=20
confusing as the optional modifier is going to be on the source string,=20
e.g.:=20

=C2=A0 =C2=A0 =C2=A0 =C2=A0 SELECT trim(both 'xabcx', 'x');=20
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0btrim=20
=C2=A0 =C2=A0 =C2=A0 =C2=A0 -------=20
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0abc=20

We could modify the grammar to force the modifier on the second=20
argument, but that is more parser states for limited value.=20

--=20
=C2=A0 Bruce Momjian =C2=A0<[hidden email]> =C2=A0 =C2=A0 =C2=A0 =C2=A0http=
://momjian.us
=C2=A0 EnterpriseDB =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 http://enterprisedb.com

=C2=A0 + It's impossible for everything to be true. +=20


--=20
Sent via pgsql-bugs mailing list ([hidden email])=20
To make changes to your subscription:=20
http://www.postgresql.org/mailpref/pgsql-bugs


________________________________
=20
If you reply to this email, your message will be added to the discussion be=
low:http://postgresql.1045698.n5.nabble.com/BUG-8335-trim-un-document-behav=
iour-tp5765221p5766983.html=20
To unsubscribe from BUG #8335: trim() un-document behaviour, click here.
NAML



--
View this message in context: http://postgresql.1045698.n5.nabble.com/BUG-8=
335-trim-un-document-behaviour-tp5765221p5767148.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.