Re: BUG #15971: Behaviour of SUBSTR function depending on its arguments - Mailing list pgsql-bugs

From Ashutosh Sharma
Subject Re: BUG #15971: Behaviour of SUBSTR function depending on its arguments
Date
Msg-id CAE9k0Pksw==KG3JkBmwSSdmmHotJEUywryqs0L+15kcY_aO6_Q@mail.gmail.com
Whole thread Raw
In response to Re: BUG #15971: Behaviour of SUBSTR function depending on its arguments  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
On Wed, Aug 21, 2019 at 7:15 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Ashutosh Sharma <ashu.coek88@gmail.com> writes:
> > I don't think it's a bug. I think it's just that the SUBSTR in
> > PostgreSQL is more SQL compliant than any other databases that you
> > mentioned.  As per the SQL standard, if the start position is zero or
> > negative number, it should be adjusted to the start of the string and
> > not to the end of the string and that way I feel the behaviour of
> > SUBSTR in PostgreSQL is more SQL compliant than other databases.
>
> Yes, the standard provides no wiggle room here.  The behavior of
> substring() with integer parameters, as specified in SQL:2008
> 6.29 <string value function> general rule 3, is
>
> 3) If <character substring function> is specified, then:
>
>   a) If the character encoding form of <character value expression> is
>   UTF8, UTF16, or UTF32, then, in the remainder of this General Rule, the
>   term “character” shall be taken to mean “unit specified by <char length
>   units>”.
>
>   b) Let C be the value of the <character value expression>, let LC be the
>   length in characters of C, and let S be the value of the <start
>   position>.
>
>   c) If <string length> is specified, then let L be the value of <string
>   length> and let E be S+L. Otherwise, let E be the larger of LC + 1 and
>   S.
>
>   d) If either C, S, or L is the null value, then the result of the
>   <character substring function> is the null value.
>
>   e) If E is less than S, then an exception condition is raised: data
>   exception — substring error.
>
>   f) Case:
>
>     i) If S is greater than LC or if E is less than 1 (one), then the
>     result of the <character substring function> is a zero-length string.
>
>     ii) Otherwise,
>
>       1) Let S1 be the larger of S and 1 (one). Let E1 be the smaller of E
>       and LC+1. Let L1 be E1–S1.
>
>       2) The result of the <character substring function> is a character
>       string containing the L1 characters of C starting at character
>       number S1 in the same order that the characters appear in C.
>
> I believe our implementation does this exactly.
>
> Even if it were true that Oracle's behavior is more common than the
> spec's definition, it's quite unlikely that we could be talked into
> abandoning spec-compliant behavior to match Oracle.
>

Absolutely. We shouldn't do that. We are more accurate than Oracle as
per the SQL standard.

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com



pgsql-bugs by date:

Previous
From: Michael Paquier
Date:
Subject: Re: BUG #15964: vacuumdb.c:187:10: error: use of undeclaredidentifier 'FD_SETSIZE'
Next
From: Alvaro Herrera
Date:
Subject: Re: BUG #15964: vacuumdb.c:187:10: error: use of undeclaredidentifier 'FD_SETSIZE'