Thread: Re: [BUGS] BUG #8335: trim() un-document behaviour

Re: [BUGS] BUG #8335: trim() un-document behaviour

From
Bruce Momjian
Date:
On Fri, Aug  9, 2013 at 12:23:59PM -0400, Bruce Momjian wrote:
> Yes, I have thought about this some more and another problem is that
> rtrim/btrim/ltrim() use the source string first, so having trim() have
> the source string second when using a comma is very confusing, e.g.:
>
>     -- with patch
>     SELECT trim('x', 'xabcx');
>      btrim
>     -------
>      abc
>
>     -- btrim
>     SELECT btrim('xabcx', 'x');
>      btrim
>     -------
>      abc
>
> I think we can either document what we have, or remove the ability to
> use comma with trim().  If we go with documentation, it is going to look
> confusing as the optional modifier is going to be on the source string,
> e.g.:
>
>     SELECT trim(both 'xabcx', 'x');
>      btrim
>     -------
>      abc
>
> We could modify the grammar to force the modifier on the second
> argument, but that is more parser states for limited value.

[ moved to hackers ]

Based on my research, I am now proposing a new, attached patch which
eliminates comma in all places in TRIM, e.g. this is no longer valid
either:

    SELECT trim(BOTH FROM 'abc', 'a');
     btrim
    -------
     bc
    (1 row)

I believe the flexible TRIM syntax was introduced when TRIM was added in
1997:

    commit 570620c5698b0c76b26a3ec71692df29375cad16
    Author: Thomas G. Lockhart <lockhart@fourpalms.org>
    Date:   Mon Sep 1 06:00:35 1997 +0000

        Add SQL92 string handling features (SUBSTRING, TRIM, EXTRACT).

We would now only support the documented TRIM syntax.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +

Attachment

Re: [BUGS] BUG #8335: trim() un-document behaviour

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Based on my research, I am now proposing a new, attached patch which
> eliminates comma in all places in TRIM,

This will break even more stuff than the last patch, ie, every single
stored rule or view that contains a TRIM function.  You can *not*
eliminate, or mess with, the expr_list production, because that's what
dumping of these function calls relies on.

It's easily a dozen years too late to change this, Bruce.  Please just
think about documenting the underlying functions, if you feel a need
to do anything here.
        regards, tom lane



Re: [BUGS] BUG #8335: trim() un-document behaviour

From
Tom Lane
Date:
I wrote:
> This will break even more stuff than the last patch, ie, every single
> stored rule or view that contains a TRIM function.  You can *not*
> eliminate, or mess with, the expr_list production, because that's what
> dumping of these function calls relies on.

No, wait, I take that back.  I was thinking that the function call would
dump out as trim(x, y) but actually none of the underlying functions
are named just "trim"; they're btrim, ltrim, or rtrim.  So actually the
dump/reload scenario does not have anything to do with the trim_list
production --- the underlying functions just parse normally in any case.

The question remains why it's a good idea to mess with a syntax behavior
that's been like that for a dozen years or more.  I don't see any upside
to doing that.  As an example of a downside, right now if you try to
pass extra arguments to TRIM() you'll get

regression=# select trim(1,2,3);
ERROR:  function pg_catalog.btrim(integer, integer, integer) does not exist
LINE 1: select trim(1,2,3);              ^
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.

You might wonder why the message mentions "btrim" not "trim", but at least
the complaint is reasonably on-topic.  After this patch, you'd just get
a "syntax error" message, which doesn't seem helpful at all.
        regards, tom lane



Re: [BUGS] BUG #8335: trim() un-document behaviour

From
Bruce Momjian
Date:
On Mon, Aug 12, 2013 at 02:18:01PM -0400, Tom Lane wrote:
> No, wait, I take that back.  I was thinking that the function call would
> dump out as trim(x, y) but actually none of the underlying functions
> are named just "trim"; they're btrim, ltrim, or rtrim.  So actually the
> dump/reload scenario does not have anything to do with the trim_list
> production --- the underlying functions just parse normally in any case.

Right, TRIM is really just a wrapper around btrim/rtrim/ltrim.

> The question remains why it's a good idea to mess with a syntax behavior
> that's been like that for a dozen years or more.  I don't see any upside
> to doing that.  As an example of a downside, right now if you try to
> pass extra arguments to TRIM() you'll get
> 
> regression=# select trim(1,2,3);
> ERROR:  function pg_catalog.btrim(integer, integer, integer) does not exist
> LINE 1: select trim(1,2,3);
>                ^
> HINT:  No function matches the given name and argument types. You might need to add explicit type casts.
> 
> You might wonder why the message mentions "btrim" not "trim", but at least
> the complaint is reasonably on-topic.  After this patch, you'd just get
> a "syntax error" message, which doesn't seem helpful at all.

Well, btrim/rtrim/ltrim only take two arguments, so allowing three for
it to fail later really doesn't seem to help much, compared to a syntax
error.

We did have someone confused by what we have now, as well as me, so I
think there is a reason to clean this up.  It would be a
backward-compatible change, though.

To document this, I think we would need to add only one line:
trim([leading | trailing | both] [characters] from string)
new    trim([leading | trailing | both] [from] string [, characters])

Of course, that second line is non-standard --- do we have to mention
that?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +



Re: [BUGS] BUG #8335: trim() un-document behaviour

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> We did have someone confused by what we have now, as well as me, so I
> think there is a reason to clean this up.  It would be a
> backward-compatible change, though.

backward-INcompatible, I assume you meant.

> To document this, I think we would need to add only one line:

>     trim([leading | trailing | both] [characters] from string)
> new    trim([leading | trailing | both] [from] string [, characters])

> Of course, that second line is non-standard --- do we have to mention
> that?

The second line is wrong no?  We don't allow the LEADING etc keywords
in the expr_list alternative.  Anyway, I'm dubious that we really want
to document a nonstandard syntax --- that would just be encouraging
people to use it, to little benefit.

Now that I've thought about this some more, I think that there was some
previous discussion around this syntax production, and that the reason
we left it like this is that we wanted to leave the door open for
user-defined trim functions that might take extra arguments.  That
discussion probably predated 7.3 (when we added schemas) because the
code's current habit of forcing a "pg_catalog" prefix would make it
a little bit painful to add such functions.  Still, you could do it
with superuser privileges.  Not sure how strong that argument is,
but I think that's where we left it years ago.
        regards, tom lane



Re: [BUGS] BUG #8335: trim() un-document behaviour

From
Bruce Momjian
Date:
On Mon, Aug 12, 2013 at 04:58:23PM -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > We did have someone confused by what we have now, as well as me, so I
> > think there is a reason to clean this up.  It would be a
> > backward-compatible change, though.
>
> backward-INcompatible, I assume you meant.

Yes.

> > To document this, I think we would need to add only one line:
>
> >     trim([leading | trailing | both] [characters] from string)
> > new    trim([leading | trailing | both] [from] string [, characters])
>
> > Of course, that second line is non-standard --- do we have to mention
> > that?
>
> The second line is wrong no?  We don't allow the LEADING etc keywords
> in the expr_list alternative.  Anyway, I'm dubious that we really want
> to document a nonstandard syntax --- that would just be encouraging
> people to use it, to little benefit.

Well, we just call trim_list rule the TRIM keyword rule, so it sure does
work:

    SELECT trim(LEADING FROM 'abc', 'a');
     ltrim
    -------
     bc

So, you are saying we should just leave it undocumented?  It is true we
have gotten near-zero complaints about it in the past.

> Now that I've thought about this some more, I think that there was some
> previous discussion around this syntax production, and that the reason
> we left it like this is that we wanted to leave the door open for
> user-defined trim functions that might take extra arguments.  That
> discussion probably predated 7.3 (when we added schemas) because the
> code's current habit of forcing a "pg_catalog" prefix would make it
> a little bit painful to add such functions.  Still, you could do it
> with superuser privileges.  Not sure how strong that argument is,
> but I think that's where we left it years ago.

Oh, that does make sense why we had this syntax so open.

Attached are docs that add the new syntax, and mention it is
non-standard;  you can see the output here:

    http://momjian.us/tmp/pgsql/functions-string.html#FUNCTIONS-STRING-SQL

We do document three syntaxes for substring() in the same table, one row
for each, so there is precedent for doing this.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +

Attachment

Re: [BUGS] BUG #8335: trim() un-document behaviour

From
Bruce Momjian
Date:
On Mon, Aug 12, 2013 at 05:19:30PM -0400, Bruce Momjian wrote:
> Attached are docs that add the new syntax, and mention it is
> non-standard;  you can see the output here:
>
>     http://momjian.us/tmp/pgsql/functions-string.html#FUNCTIONS-STRING-SQL
>
> We do document three syntaxes for substring() in the same table, one row
> for each, so there is precedent for doing this.

Attached is an updated patch with a proper example.  I could move the
extra syntax into the description of the existing trim entry instead.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +

Attachment

Re: [BUGS] BUG #8335: trim() un-document behaviour

From
Bruce Momjian
Date:
On Mon, Aug 12, 2013 at 11:31:38PM -0400, Bruce Momjian wrote:
> On Mon, Aug 12, 2013 at 05:19:30PM -0400, Bruce Momjian wrote:
> > Attached are docs that add the new syntax, and mention it is
> > non-standard;  you can see the output here:
> > 
> >     http://momjian.us/tmp/pgsql/functions-string.html#FUNCTIONS-STRING-SQL
> > 
> > We do document three syntaxes for substring() in the same table, one row
> > for each, so there is precedent for doing this.
> 
> Attached is an updated patch with a proper example.  I could move the
> extra syntax into the description of the existing trim entry instead.

Patch applied to head.  I did not apply this to 9.3 in case we change
our minds about documenting this.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +



Re: [BUGS] BUG #8335: trim() un-document behaviour

From
Vik Fearing
Date:
On 08/14/2013 11:27 PM, Bruce Momjian wrote:
> On Mon, Aug 12, 2013 at 11:31:38PM -0400, Bruce Momjian wrote:
>> On Mon, Aug 12, 2013 at 05:19:30PM -0400, Bruce Momjian wrote:
>>> Attached are docs that add the new syntax, and mention it is
>>> non-standard;  you can see the output here:
>>>
>>>     http://momjian.us/tmp/pgsql/functions-string.html#FUNCTIONS-STRING-SQL
>>>
>>> We do document three syntaxes for substring() in the same table, one row
>>> for each, so there is precedent for doing this.
>> Attached is an updated patch with a proper example.  I could move the
>> extra syntax into the description of the existing trim entry instead.
> Patch applied to head.  I did not apply this to 9.3 in case we change
> our minds about documenting this.

This commit introduced the following:

doc$ make -s html
Processing HTML.index...
2409 entries loaded...
collateindex.pl: duplicated index entry found: TRIM 
1 entries ignored...
Done.
-- 
Vik



Re: [BUGS] BUG #8335: trim() un-document behaviour

From
Bruce Momjian
Date:
On Wed, Aug 21, 2013 at 12:32:20PM +0200, Vik Fearing wrote:
> On 08/14/2013 11:27 PM, Bruce Momjian wrote:
> > On Mon, Aug 12, 2013 at 11:31:38PM -0400, Bruce Momjian wrote:
> >> On Mon, Aug 12, 2013 at 05:19:30PM -0400, Bruce Momjian wrote:
> >>> Attached are docs that add the new syntax, and mention it is
> >>> non-standard;  you can see the output here:
> >>>
> >>>     http://momjian.us/tmp/pgsql/functions-string.html#FUNCTIONS-STRING-SQL
> >>>
> >>> We do document three syntaxes for substring() in the same table, one row
> >>> for each, so there is precedent for doing this.
> >> Attached is an updated patch with a proper example.  I could move the
> >> extra syntax into the description of the existing trim entry instead.
> > Patch applied to head.  I did not apply this to 9.3 in case we change
> > our minds about documenting this.
> 
> This commit introduced the following:
> 
> doc$ make -s html
> Processing HTML.index...
> 2409 entries loaded...
> collateindex.pl: duplicated index entry found: TRIM 
> 1 entries ignored...
> Done.

Interesting.  I didn't realize HTML shows errors that 'make check'
doesn't.  Anyway, I have removed the second index reference, so the
error should be gone now.  Thanks for the report.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +