Thread: Re: add function argument name to substring and substr
On Tue, Jan 21, 2025 at 11:21 PM jian he <jian.universality@gmail.com> wrote:
hi.
attached patch add function argument name to function substring and substr
you can see the visual changes.
mater behavior:
\df substring
List of functions
Schema | Name | Result data type | Argument data types | Type
------------+-----------+------------------+-------------------------+------
pg_catalog | substring | bit | bit, integer | func
pg_catalog | substring | bit | bit, integer, integer | func
pg_catalog | substring | bytea | bytea, integer | func
pg_catalog | substring | bytea | bytea, integer, integer | func
pg_catalog | substring | text | text, integer | func
pg_catalog | substring | text | text, integer, integer | func
pg_catalog | substring | text | text, text | func
pg_catalog | substring | text | text, text, text | func
with patch
List of functions
Schema | Name | Result data type | Argument
data types | Type
------------+-----------+------------------+--------------------------------------------------+------
pg_catalog | substring | bit | bits bit, start integer
| func
pg_catalog | substring | bit | bits bit, start integer,
count integer | func
pg_catalog | substring | bytea | bytes bytea, start integer
| func
pg_catalog | substring | bytea | bytes bytea, start integer,
count integer | func
pg_catalog | substring | text | string text, pattern text
| func
pg_catalog | substring | text | string text, pattern
text, escape_character text | func
pg_catalog | substring | text | string text, start
integer | func
pg_catalog | substring | text | string text, start
integer, count integer | func
I did the same change to the function substr.
since 9.7.2. SIMILAR TO Regular Expressions we use
substring(string, pattern, escape-character)
so i refactor the substring function argument name to
substring(string text, pattern text, escape_character text).
we can make it as ``substring(string text, pattern text, escape text).``
then in 9.7.2, we need to change the substring synopsis section.
one thing I am not sure is about
9.4. String Functions and Operators
Table 9.9. SQL String Functions and Operators.
Do we need entries for substring related functions?
current signature
substring ( string text [ FROM start integer ] [ FOR count integer ] )
is kind of different from
substring(string text, start integer, count integer).
since the previous one can allow keywords "FROM", "FOR", the latter one won't.
Table 9.9 limits itself to those functions defined in the SQL standard; which are basically the ones that use keywords instead of commas.
The substring(string, start, count) function you note is already covered in Table 9.10 but we spell it substr(...)
I don't think adding yet more spellings of this same function is warranted or desirable at this point. I'd maybe add a note if substring(,,,) works to substr saying that substring is a valid alias. I could be convinced to just document though.
David J.
On Tue, Feb 18, 2025 at 6:13 AM David G. Johnston <david.g.johnston@gmail.com> wrote: > > Table 9.9 limits itself to those functions defined in the SQL standard; which are basically the ones that use keywordsinstead of commas. > > The substring(string, start, count) function you note is already covered in Table 9.10 but we spell it substr(...) > > I don't think adding yet more spellings of this same function is warranted or desirable at this point. ok. > I'd maybe add a note if substring(,,,) works to substr saying that substring is a valid alias. I could be convinced tojust document though. > it seems already in the doc. substr ( string text, start integer [, count integer ] ) → text Extracts the substring of string starting at the start'th character, and extending for count characters if that is specified. (Same as substring(string from start for count).) substr ( bytes bytea, start integer [, count integer ] ) → bytea Extracts the substring of bytes starting at the start'th byte, and extending for count bytes if that is specified. (Same as substring(bytes from start for count).) new patch attached. main changes: 1. change 3 argument func argument from (string text, pattern text, escape_character text) to (string text, pattern text, escape text) 2. add synopsis section in 9.7.3. POSIX Regular Expressions for function substring. we only have the synopsis section for function substring in 9.7.2 section, now add it to 9.7.3. also add an example about using named natation call substring: substring(string=>'foobar', pattern=> 'o.b') the patch is small, I just put the \df substring \df substr (before and after patch) output into the commit message.
Attachment
On Mon, Mar 17, 2025 at 8:20 PM jian he <jian.universality@gmail.com> wrote:
On Tue, Feb 18, 2025 at 6:13 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> Table 9.9 limits itself to those functions defined in the SQL standard; which are basically the ones that use keywords instead of commas.
>
> The substring(string, start, count) function you note is already covered in Table 9.10 but we spell it substr(...)
>
> I don't think adding yet more spellings of this same function is warranted or desirable at this point.
ok.
> I'd maybe add a note if substring(,,,) works to substr saying that substring is a valid alias. I could be convinced to just document though.
>
it seems already in the doc.
substr ( string text, start integer [, count integer ] ) → text
Extracts the substring of string starting at the start'th character,
and extending for count characters if that is specified. (Same as
substring(string from start for count).)
substr ( bytes bytea, start integer [, count integer ] ) → bytea
Extracts the substring of bytes starting at the start'th byte, and
extending for count bytes if that is specified. (Same as
substring(bytes from start for count).)
new patch attached.
main changes:
1. change 3 argument func argument from
(string text, pattern text, escape_character text)
to
(string text, pattern text, escape text)
Why? It can only be one character so that existing name seems well chosen.
postgres=# select substring('123^^,123',',','^^');
ERROR: invalid escape string
HINT: Escape string must be empty or one character.
CONTEXT: SQL function "substring" statement 1
ERROR: invalid escape string
HINT: Escape string must be empty or one character.
CONTEXT: SQL function "substring" statement 1
2. add synopsis section in 9.7.3. POSIX Regular Expressions for
function substring.
we only have the synopsis section for function substring in 9.7.2
section, now add it to 9.7.3.
I'd probably try and resolve that the other way...point the reader to the reference page for the function if they want to see syntax. I'm mixed on (leaning against) whether choosing this place to demonstrate all the possible spellings is the best.
also add an example about using named natation call substring:
substring(string=>'foobar', pattern=> 'o.b')
I'm already pulling my hair out at this showing all the insanity that exists without adding this to the mix.
The vast majority of examples throughout the manual use traditional function call syntax func_name(arg1, arg2, etc.); I'd rather keep with convention than start to scatter about alternative syntax choices just to give the random reader who happens upon this fairly esoteric part of the manual the benefit of seeing their options. If that is a goal, then I'd suggest spending some time in our Tutorial adding some more examples with these alternative forms to people looking to be exposed to new things in the place they'd go to look for them. They probably won't learn about them from the Syntax section.
substring(string text, pattern text[, escape-character text])
to Table 9.10
I'd also rename escape to escape-character in the other SQL substring function synopses. The RegEx page got that part correct.
Do as little or as much with the RegEx section as you'd like, though it seems like separate material from $subject. The page seems to already use replaceable names instead of data types so on that score it should be unaffected if we've chosen the same names.
Food for thought, it seems a bit redundant to name the first argument basically the same as the data type. I was thinking that "content" would be a better choice. This is basically a polymorphic function where all the inputs are the same thing just having different types - and that thing is being "content".
David J.
On Wed, Mar 19, 2025 at 8:19 AM David G. Johnston <david.g.johnston@gmail.com> wrote: > > The vast majority of examples throughout the manual use traditional function call syntax func_name(arg1, arg2, etc.); I'd rather keep with convention than start to scatter about alternative syntax choices just to give the random readerwho happens upon this fairly esoteric part of the manual the benefit of seeing their options. If that is a goal, thenI'd suggest spending some time in our Tutorial adding some more examples with these alternative forms to people lookingto be exposed to new things in the place they'd go to look for them. They probably won't learn about them from theSyntax section. > > On the plus side, I agree now we should add: > substring(string text, pattern text[, escape-character text]) > to Table 9.10 > in Table Table 9.9 we have ``` substring ( string text FROM pattern text ) → text Extracts the first substring matching POSIX regular expression; see Section 9.7.3. substring('Thomas' from '...$') → mas ``` can we change to substring ( string text FROM pattern text ) → text substring ( string text, pattern text ) → text Extracts the first substring matching POSIX regular expression; the second format is not standardized. see Section 9.7.3. substring('Thomas' from '...$') → mas if we add to ``substring ( string text, pattern text ) → text`` Table 9.10, then maybe it feels like duplication? (same function in Table 9.9, Table 9.10, then we also need some words saying that they are the same) I do realized we have brief explanation about Table 9.9 and Table 9.10 in second paragraph of https://www.postgresql.org/docs/current/functions-string.html
On Tue, Mar 18, 2025 at 6:20 PM jian he <jian.universality@gmail.com> wrote:
On Wed, Mar 19, 2025 at 8:19 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> The vast majority of examples throughout the manual use traditional function call syntax func_name(arg1, arg2, etc.); I'd rather keep with convention than start to scatter about alternative syntax choices just to give the random reader who happens upon this fairly esoteric part of the manual the benefit of seeing their options. If that is a goal, then I'd suggest spending some time in our Tutorial adding some more examples with these alternative forms to people looking to be exposed to new things in the place they'd go to look for them. They probably won't learn about them from the Syntax section.
>
> On the plus side, I agree now we should add:
> substring(string text, pattern text[, escape-character text])
> to Table 9.10
>
in Table Table 9.9 we have
```
substring ( string text FROM pattern text ) → text
Extracts the first substring matching POSIX regular expression; see
Section 9.7.3.
substring('Thomas' from '...$') → mas
```
can we change to
substring ( string text FROM pattern text ) → text
substring ( string text, pattern text ) → text
Extracts the first substring matching POSIX regular expression;
the second format is not standardized. see Section 9.7.3.
substring('Thomas' from '...$') → mas
No, based on the (I presume) fact that the substring(string, pattern) variant is not defined in the SQL standard and Table 9.9 is reserved for those functions.
It would be a different, but probably worth considering, patch to simply combine Tables 9.9 and 9.10 and just denote which entries are standard and which are not. The decision to split the tables along that property came well before our current table format which seems much more amenable to merging them together.
if we add to
``substring ( string text, pattern text ) → text``
Table 9.10,
then maybe it feels like duplication?
(same function in Table 9.9, Table 9.10, then we also need some words
saying that they are the same)
We can/should add substring(string, pattern) to Table 9.10 for the same reason and the same general wording that substr(string, start) exists on that table.
I would be in favor of adding a similar "same as" comment to the functions in Table 9.9
I just now processed the cross references in Table 9.9 to the POSIX section. The new entry in Table 9.10 would want that too.
David J.
hi. new patch attached. now \df substring looks like: List of functions Schema | Name | Result data type | Argument data types | Type ------------+-----------+------------------+--------------------------------------------------+------ pg_catalog | substring | bit | bits bit, start integer | func pg_catalog | substring | bit | bits bit, start integer, count integer | func pg_catalog | substring | bytea | bytes bytea, start integer | func pg_catalog | substring | bytea | bytes bytea, start integer, count integer | func pg_catalog | substring | text | string text, pattern text | func pg_catalog | substring | text | string text, pattern text, escape_character text | func pg_catalog | substring | text | string text, start integer | func pg_catalog | substring | text | string text, start integer, count integer | func Table 9.10 also have doc entry for substring. in Table 9.10, i have: substring ( string text, pattern text ) → text substring ( string text, pattern text, escape_character text) → text we can put it into one, but one is based on POSIX regular expression, another one is based on SQL, the difference is big, that's why put it into two.
Attachment
On Tue, Mar 18, 2025 at 9:04 PM jian he <jian.universality@gmail.com> wrote:
new patch attached.
I've done v4 with a delta patch.
Decided to standardize on calling the SQL Similar To regular expression escape replaceable "escape" everywhere.
Instead of fully documenting the obsolete syntax I added a note explaining the keyword choice difference. Removed mention of it completely from the Pattern Matching portion of the documentation - that section has enough going on.
I also add "Same as" references for the two pairs of entries. Not married to them but they do seem warranted; having Pattern Matching be required reading to make that connection seems undesirable.
David J.
Attachment
On Tue, Apr 1, 2025 at 6:22 AM David G. Johnston <david.g.johnston@gmail.com> wrote: > > On Tue, Mar 18, 2025 at 9:04 PM jian he <jian.universality@gmail.com> wrote: >> >> >> new patch attached. >> > > I've done v4 with a delta patch. > > Decided to standardize on calling the SQL Similar To regular expression escape replaceable "escape" everywhere. > > Instead of fully documenting the obsolete syntax I added a note explaining the keyword choice difference. Removed mentionof it completely from the Pattern Matching portion of the documentation - that section has enough going on. > > I also add "Same as" references for the two pairs of entries. Not married to them but they do seem warranted; having PatternMatching be required reading to make that connection seems undesirable. > your v4-0001-v3-0001-substring.patch is not the same as my v3-0001-add-argument-name-to-function-substring-and-subst.patch for example: - <para role="func_signature"> + <para role="func_signature"><s> <function>substring</function> ( <parameter>string</parameter> <type>text</type> <literal>FROM</literal> <parameter>pattern</parameter> <type>text</type> <literal>FOR</literal> <parameter>escape</parameter> <type>text</type> ) - <returnvalue>text</returnvalue> + <returnvalue>text</returnvalue></s> can you make sure v4-0001-v3-0001-substring.patch the same as v3-0001-add-argument-name-to-function-substring-and-subst.patch. because I tried git am On Tue, Apr 1, 2025 at 6:22 AM David G. Johnston <david.g.johnston@gmail.com> wrote: > > On Tue, Mar 18, 2025 at 9:04 PM jian he <jian.universality@gmail.com> wrote: >> >> >> new patch attached. >> > > I've done v4 with a delta patch. > > Decided to standardize on calling the SQL Similar To regular expression escape replaceable "escape" everywhere. > > Instead of fully documenting the obsolete syntax I added a note explaining the keyword choice difference. Removed mentionof it completely from the Pattern Matching portion of the documentation - that section has enough going on. > > I also add "Same as" references for the two pairs of entries. Not married to them but they do seem warranted; having PatternMatching be required reading to make that connection seems undesirable. > your v4-0001-v3-0001-substring.patch is not the same as my v3-0001-add-argument-name-to-function-substring-and-subst.patch for example: - <para role="func_signature"> + <para role="func_signature"><s> <function>substring</function> ( <parameter>string</parameter> <type>text</type> <literal>FROM</literal> <parameter>pattern</parameter> <type>text</type> <literal>FOR</literal> <parameter>escape</parameter> <type>text</type> ) - <returnvalue>text</returnvalue> + <returnvalue>text</returnvalue></s> can you make sure v4-0001-v3-0001-substring.patch the same as v3-0001-add-argument-name-to-function-substring-and-subst.patch. because I tried git am On Tue, Apr 1, 2025 at 6:22 AM David G. Johnston <david.g.johnston@gmail.com> wrote: > > On Tue, Mar 18, 2025 at 9:04 PM jian he <jian.universality@gmail.com> wrote: >> >> >> new patch attached. >> > > I've done v4 with a delta patch. > > Decided to standardize on calling the SQL Similar To regular expression escape replaceable "escape" everywhere. > > Instead of fully documenting the obsolete syntax I added a note explaining the keyword choice difference. Removed mentionof it completely from the Pattern Matching portion of the documentation - that section has enough going on. > > I also add "Same as" references for the two pairs of entries. Not married to them but they do seem warranted; having PatternMatching be required reading to make that connection seems undesirable. > can not build docs based on your v4-0001. your v4-0001-v3-0001-substring.patch is not the same as my v3-0001-add-argument-name-to-function-substring-and-subst.patch for example: - <para role="func_signature"> + <para role="func_signature"><s> <function>substring</function> ( <parameter>string</parameter> <type>text</type> <literal>FROM</literal> <parameter>pattern</parameter> <type>text</type> <literal>FOR</literal> <parameter>escape</parameter> <type>text</type> ) - <returnvalue>text</returnvalue> + <returnvalue>text</returnvalue></s> because I tried git am v3-0001-add-argument-name-to-function-substring-and-subst.patch. patch -p1 < v4-0002-v3-0002-delta.patch Then there are several places that differ, it doesn't seem easy to resolve the difference. Can you make sure v4-0001-v3-0001-substring.patch the same as v3-0001-add-argument-name-to-function-substring-and-subst.patch, then I can review your delta patch.
On Mon, Mar 31, 2025 at 9:12 PM jian he <jian.universality@gmail.com> wrote:
On Tue, Apr 1, 2025 at 6:22 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> On Tue, Mar 18, 2025 at 9:04 PM jian he <jian.universality@gmail.com> wrote:
>>
>>
>> new patch attached.
>>
>
> I've done v4 with a delta patch.
your v4-0001-v3-0001-substring.patch is not the same as my
v3-0001-add-argument-name-to-function-substring-and-subst.patch
Sorry about that. v5 attached. Confirmed with diff the v3 and v5 0001 so we should be good.
David J.
Attachment
Em ter., 1 de abr. de 2025 às 02:00, David G. Johnston <david.g.johnston@gmail.com> escreveu:
Wouldn't it be good to add the use of parentheses using posix ? It's useful and rarely documented
<literal>substring('Thomas', '...$')</literal>
+ <literal>substring('Email: johnjohn@mymail.com, Name: John' from '@(.*), Name')</literal>
regards
Marcos
> On 1 Apr 2025, at 15:14, Marcos Pegoraro <marcos@f10.com.br> wrote: > > Em ter., 1 de abr. de 2025 às 02:00, David G. Johnston <david.g.johnston@gmail.com> escreveu: > > Wouldn't it be good to add the use of parentheses using posix ? It's useful and rarely documented > <literal>substring('Thomas', '...$')</literal> > + <literal>substring('Email: johnjohn@mymail.com, Name: John' from '@(.*), Name')</literal> While not commenting on the usefulness of the suggestion; any usage of email addresses in the documentation should use @example.{com|org}. -- Daniel Gustafsson
On Tue, Apr 1, 2025 at 6:15 AM Marcos Pegoraro <marcos@f10.com.br> wrote:
Em ter., 1 de abr. de 2025 às 02:00, David G. Johnston <david.g.johnston@gmail.com> escreveu:Wouldn't it be good to add the use of parentheses using posix ? It's useful and rarely documented<literal>substring('Thomas', '...$')</literal>+ <literal>substring('Email: johnjohn@mymail.com, Name: John' from '@(.*), Name')</literal>
Agreed. A second example using () would be good here.
Was pondering explaining the "no parentheses" case here; but for someone familiar with PREs the behavior is obvious and everyone else has the link needed to learn what is happening.
David J.