Thread: BUG #14441: trim function bug (hit a keyword somewhere internally)
VGhlIGZvbGxvd2luZyBidWcgaGFzIGJlZW4gbG9nZ2VkIG9uIHRoZSB3ZWJz aXRlOgoKQnVnIHJlZmVyZW5jZTogICAgICAxNDQ0MQpMb2dnZWQgYnk6ICAg ICAgICAgIG1hcmsgcGV0aGVyCkVtYWlsIGFkZHJlc3M6ICAgICAgbWFyay5w ZXRoZXJAZ21haWwuY29tClBvc3RncmVTUUwgdmVyc2lvbjogOS41LjQKT3Bl cmF0aW5nIHN5c3RlbTogICBtYWNvcwpEZXNjcmlwdGlvbjogICAgICAgIAoK c2VsZWN0IFRSSU0oTEVBRElORyAnRVhDTFVERScgRlJPTSAnQzAwMScpOw0K DQpXaWxsIGluY29ycmVjdGx5IHN0cmlwIG9mZiB0aGUgJ0MnIGxlYXZpbmcg JzAwMScuDQoNCg0Kc2VsZWN0IFRSSU0oTEVBRElORyAnZXhjbHVkZScgRlJP TSAnQzAwMScpOw0KDQpXaWxsIG5vdCBzdHJpcCB0cmltIGFueXRoaW5nIGFz IGV4cGVjdGVkLg0KDQpDaGVlcnMNCg0KCgo=
On Wed, 30 Nov 2016 01:17:10 +0000 mark.pether@gmail.com wrote: > The following bug has been logged on the website: > > Bug reference: 14441 > Logged by: mark pether > Email address: mark.pether@gmail.com > PostgreSQL version: 9.5.4 > Operating system: macos > Description: > > select TRIM(LEADING 'EXCLUDE' FROM 'C001'); > > Will incorrectly strip off the 'C' leaving '001'. Perhaps I'm misreading the doc, but that looks like the correct result. That is, any character in 'EXCLUDE' will be stripped from the leading part of 'C001', ie, the 'C'. Why do you think this is a bug? -dg -- David Gould daveg@sonic.net If simplicity worked, the world would be overrun with insects.
mark.pether@gmail.com writes: > select TRIM(LEADING 'EXCLUDE' FROM 'C001'); > Will incorrectly strip off the 'C' leaving '001'. The code appears to believe that the first argument is a set of characters that are to be removed from the front of the string argument until none remain. For instance: regression=3D# select TRIM(LEADING 'ABC' FROM 'CAABC001'); ltrim = ------- 001 (1 row) So the behavior you mention is expected. I think what you're expecting is to strip off one or more occurrences of a substring, but that's not what our version of TRIM does. A look at the SQL standard indicates that it requires the first argument to be *exactly one* character in length, and then TRIM strips leading occurrence(s) of that character. So our existing implementation is a plausible extension of that ... as is your expectation, but there's certainly no reason to favor yours over what's there. Also: * the PG code has acted this way since it was introduced in 1997. * the code comments claim, and a bit of googling seems to confirm, that this matches Oracle's behavior. So I do not see us changing this. You can easily build the behavior you want out of other spare parts. Regexes would be my first weapon of choice: regression=3D# select substring('C001' from '^(?:EXCLUDE)*(.*)$'); substring = ----------- C001 (1 row) regression=3D# select substring('EXCLUDEC001' from '^(?:EXCLUDE)*(.*)$'); substring = ----------- C001 (1 row) regression=3D# select substring('EXCLUDEEXCLUDEC001' from '^(?:EXCLUDE)*(.= *)$'); substring = ----------- C001 (1 row) See https://www.postgresql.org/docs/current/static/functions-matching.html#FUN= CTIONS-POSIX-REGEXP regards, tom lane
Sorry, I misread the document it's not a bug, doco was unclear. It replaces the largest match which may be all characters or simply individual characters. I realised this after I created the bug. I have switched to REGEXP_REPLACE to do the correct operation. Regards On Wed, Nov 30, 2016 at 5:09 PM, David Gould <daveg@sonic.net> wrote: > On Wed, 30 Nov 2016 01:17:10 +0000 > mark.pether@gmail.com wrote: > > > The following bug has been logged on the website: > > > > Bug reference: 14441 > > Logged by: mark pether > > Email address: mark.pether@gmail.com > > PostgreSQL version: 9.5.4 > > Operating system: macos > > Description: > > > > select TRIM(LEADING 'EXCLUDE' FROM 'C001'); > > > > Will incorrectly strip off the 'C' leaving '001'. > > > Perhaps I'm misreading the doc, but that looks like the correct result. > That is, any character in 'EXCLUDE' will be stripped from the leading part > of > 'C001', ie, the 'C'. Why do you think this is a bug? > > -dg > > > > -- > David Gould daveg@sonic.net > If simplicity worked, the world would be overrun with insects. >
mark pether <mark.pether@gmail.com> writes: > Sorry, I misread the document it's not a bug, doco was unclear. > It replaces the largest match which may be all characters or simply > individual characters. I realised this after I created the bug. Hmm, yeah, docs fail to make clear what happens with a multicharacter trim spec. I think we could fix this with a better example. Will do something about it. regards, tom lane