Thread: BUG #14441: trim function bug (hit a keyword somewhere internally)

BUG #14441: trim function bug (hit a keyword somewhere internally)

From
mark.pether@gmail.com
Date:
VGhlIGZvbGxvd2luZyBidWcgaGFzIGJlZW4gbG9nZ2VkIG9uIHRoZSB3ZWJz
aXRlOgoKQnVnIHJlZmVyZW5jZTogICAgICAxNDQ0MQpMb2dnZWQgYnk6ICAg
ICAgICAgIG1hcmsgcGV0aGVyCkVtYWlsIGFkZHJlc3M6ICAgICAgbWFyay5w
ZXRoZXJAZ21haWwuY29tClBvc3RncmVTUUwgdmVyc2lvbjogOS41LjQKT3Bl
cmF0aW5nIHN5c3RlbTogICBtYWNvcwpEZXNjcmlwdGlvbjogICAgICAgIAoK
c2VsZWN0IFRSSU0oTEVBRElORyAnRVhDTFVERScgRlJPTSAnQzAwMScpOw0K
DQpXaWxsIGluY29ycmVjdGx5IHN0cmlwIG9mZiB0aGUgJ0MnIGxlYXZpbmcg
JzAwMScuDQoNCg0Kc2VsZWN0IFRSSU0oTEVBRElORyAnZXhjbHVkZScgRlJP
TSAnQzAwMScpOw0KDQpXaWxsIG5vdCBzdHJpcCB0cmltIGFueXRoaW5nIGFz
IGV4cGVjdGVkLg0KDQpDaGVlcnMNCg0KCgo=

Re: BUG #14441: trim function bug (hit a keyword somewhere internally)

From
David Gould
Date:
On Wed, 30 Nov 2016 01:17:10 +0000
mark.pether@gmail.com wrote:

> The following bug has been logged on the website:
>
> Bug reference:      14441
> Logged by:          mark pether
> Email address:      mark.pether@gmail.com
> PostgreSQL version: 9.5.4
> Operating system:   macos
> Description:
>
> select TRIM(LEADING 'EXCLUDE' FROM 'C001');
>
> Will incorrectly strip off the 'C' leaving '001'.


Perhaps I'm misreading the doc, but that looks like the correct result.
That is, any character in 'EXCLUDE' will be stripped from the leading part of
'C001', ie, the 'C'. Why do you think this is a bug?

-dg



--
David Gould                                   daveg@sonic.net
If simplicity worked, the world would be overrun with insects.

Re: BUG #14441: trim function bug (hit a keyword somewhere internally)

From
Tom Lane
Date:
mark.pether@gmail.com writes:
> select TRIM(LEADING 'EXCLUDE' FROM 'C001');
> Will incorrectly strip off the 'C' leaving '001'.

The code appears to believe that the first argument is a set of characters
that are to be removed from the front of the string argument until none
remain.  For instance:

regression=3D# select TRIM(LEADING 'ABC' FROM 'CAABC001');
 ltrim =

-------
 001
(1 row)

So the behavior you mention is expected.

I think what you're expecting is to strip off one or more occurrences of a
substring, but that's not what our version of TRIM does.

A look at the SQL standard indicates that it requires the first argument
to be *exactly one* character in length, and then TRIM strips leading
occurrence(s) of that character.  So our existing implementation is a
plausible extension of that ... as is your expectation, but there's
certainly no reason to favor yours over what's there.  Also:
* the PG code has acted this way since it was introduced in 1997.
* the code comments claim, and a bit of googling seems to confirm,
that this matches Oracle's behavior.
So I do not see us changing this.

You can easily build the behavior you want out of other spare parts.
Regexes would be my first weapon of choice:

regression=3D# select substring('C001' from '^(?:EXCLUDE)*(.*)$');
 substring =

-----------
 C001
(1 row)

regression=3D# select substring('EXCLUDEC001' from '^(?:EXCLUDE)*(.*)$');
 substring =

-----------
 C001
(1 row)

regression=3D# select substring('EXCLUDEEXCLUDEC001' from '^(?:EXCLUDE)*(.=
*)$');
 substring =

-----------
 C001
(1 row)

See
https://www.postgresql.org/docs/current/static/functions-matching.html#FUN=
CTIONS-POSIX-REGEXP

            regards, tom lane

Re: BUG #14441: trim function bug (hit a keyword somewhere internally)

From
mark pether
Date:
Sorry, I misread the document it's not a bug, doco was unclear.

It replaces the largest match which may be all characters or simply
individual characters. I realised this after I created the bug.

I have switched to REGEXP_REPLACE to do the correct operation.

Regards

On Wed, Nov 30, 2016 at 5:09 PM, David Gould <daveg@sonic.net> wrote:

> On Wed, 30 Nov 2016 01:17:10 +0000
> mark.pether@gmail.com wrote:
>
> > The following bug has been logged on the website:
> >
> > Bug reference:      14441
> > Logged by:          mark pether
> > Email address:      mark.pether@gmail.com
> > PostgreSQL version: 9.5.4
> > Operating system:   macos
> > Description:
> >
> > select TRIM(LEADING 'EXCLUDE' FROM 'C001');
> >
> > Will incorrectly strip off the 'C' leaving '001'.
>
>
> Perhaps I'm misreading the doc, but that looks like the correct result.
> That is, any character in 'EXCLUDE' will be stripped from the leading part
> of
> 'C001', ie, the 'C'. Why do you think this is a bug?
>
> -dg
>
>
>
> --
> David Gould                                   daveg@sonic.net
> If simplicity worked, the world would be overrun with insects.
>

Re: BUG #14441: trim function bug (hit a keyword somewhere internally)

From
Tom Lane
Date:
mark pether <mark.pether@gmail.com> writes:
> Sorry, I misread the document it's not a bug, doco was unclear.
> It replaces the largest match which may be all characters or simply
> individual characters. I realised this after I created the bug.

Hmm, yeah, docs fail to make clear what happens with a multicharacter
trim spec.  I think we could fix this with a better example.
Will do something about it.

            regards, tom lane