Re: BUG #15971: Behaviour of SUBSTR function depending on its arguments - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #15971: Behaviour of SUBSTR function depending on its arguments
Date
Msg-id 29036.1566395113@sss.pgh.pa.us
Whole thread Raw
In response to Re: BUG #15971: Behaviour of SUBSTR function depending on its arguments  (Ashutosh Sharma <ashu.coek88@gmail.com>)
Responses Re: BUG #15971: Behaviour of SUBSTR function depending on its arguments  (Ashutosh Sharma <ashu.coek88@gmail.com>)
List pgsql-bugs
Ashutosh Sharma <ashu.coek88@gmail.com> writes:
> I don't think it's a bug. I think it's just that the SUBSTR in
> PostgreSQL is more SQL compliant than any other databases that you
> mentioned.  As per the SQL standard, if the start position is zero or
> negative number, it should be adjusted to the start of the string and
> not to the end of the string and that way I feel the behaviour of
> SUBSTR in PostgreSQL is more SQL compliant than other databases.

Yes, the standard provides no wiggle room here.  The behavior of
substring() with integer parameters, as specified in SQL:2008
6.29 <string value function> general rule 3, is

3) If <character substring function> is specified, then:

  a) If the character encoding form of <character value expression> is
  UTF8, UTF16, or UTF32, then, in the remainder of this General Rule, the
  term “character” shall be taken to mean “unit specified by <char length
  units>”.

  b) Let C be the value of the <character value expression>, let LC be the
  length in characters of C, and let S be the value of the <start
  position>.

  c) If <string length> is specified, then let L be the value of <string
  length> and let E be S+L. Otherwise, let E be the larger of LC + 1 and
  S.

  d) If either C, S, or L is the null value, then the result of the
  <character substring function> is the null value.

  e) If E is less than S, then an exception condition is raised: data
  exception — substring error.

  f) Case:

    i) If S is greater than LC or if E is less than 1 (one), then the
    result of the <character substring function> is a zero-length string.

    ii) Otherwise,

      1) Let S1 be the larger of S and 1 (one). Let E1 be the smaller of E
      and LC+1. Let L1 be E1–S1.

      2) The result of the <character substring function> is a character
      string containing the L1 characters of C starting at character
      number S1 in the same order that the characters appear in C.

I believe our implementation does this exactly.

Even if it were true that Oracle's behavior is more common than the
spec's definition, it's quite unlikely that we could be talked into
abandoning spec-compliant behavior to match Oracle.

            regards, tom lane



pgsql-bugs by date:

Previous
From: Michael Paquier
Date:
Subject: Re: BUG #15954: Unable to alter partitioned table to set logged
Next
From: Alvaro Herrera
Date:
Subject: Re: BUG #15964: vacuumdb.c:187:10: error: use of undeclaredidentifier 'FD_SETSIZE'