Thread: to_regtype() Raises Error

Re: to_regtype() Raises Error

From

Erik Wienhold

Date:

17 September 2023, 22:41:01

On 18/09/2023 00:13 CEST David E. Wheeler <david@justatheory.com> wrote:

> The docs for `to_regtype()` say, “this function will return NULL rather than
> throwing an error if the name is not found.” And it’s true most of the time:
>
> david=# select to_regtype('foo'), to_regtype('clam');
>  to_regtype | to_regtype
> ------------+------------
>  [null]     | [null]
>
> But not others:
>
> david=# select to_regtype('inteval second');
> ERROR:  syntax error at or near "second"
> LINE 1: select to_regtype('inteval second');
>                 ^
> CONTEXT:  invalid type name "inteval second”

Probably a typo and you meant 'interval second' which works.

> I presume this has something to do with not catching errors from the parser?
>
> david=# select to_regtype('clam bake');
> ERROR:  syntax error at or near "bake"
> LINE 1: select to_regtype('clam bake');
>              ^
> CONTEXT:  invalid type name "clam bake"

Double-quoting the type name to treat it as an identifier works:

    test=# select to_regtype('"clam bake"');
     to_regtype
    ------------
     <NULL>
    (1 row)

So it's basically a matter of keywords vs. identifiers.

--
Erik

Re: to_regtype() Raises Error

From

Vik Fearing

Date:

17 September 2023, 22:57:04

On 9/18/23 00:41, Erik Wienhold wrote:
> On 18/09/2023 00:13 CEST David E. Wheeler <david@justatheory.com> wrote:
> 
>> The docs for `to_regtype()` say, “this function will return NULL rather than
>> throwing an error if the name is not found.” And it’s true most of the time:
>>
>> david=# select to_regtype('foo'), to_regtype('clam');
>>   to_regtype | to_regtype
>> ------------+------------
>>   [null]     | [null]
>>
>> But not others:
>>
>> david=# select to_regtype('inteval second');
>> ERROR:  syntax error at or near "second"
>> LINE 1: select to_regtype('inteval second');
>>                  ^
>> CONTEXT:  invalid type name "inteval second”
> 
> Probably a typo and you meant 'interval second' which works.
No, that is precisely the point.  The result should be null instead of 
an error.
-- 
Vik Fearing

Re: to_regtype() Raises Error

From

Tom Lane

Date:

17 September 2023, 23:28:32

Vik Fearing <vik@postgresfriends.org> writes:
> No, that is precisely the point.  The result should be null instead of
> an error.

Yeah, ideally so, but the cost/benefit of making it happen seems
pretty unattractive for now.  See the soft-errors thread at [1],
particularly [2] (but searching in that thread for references to
regclassin, regtypein, and to_reg* will find additional detail).

            regards, tom lane

[1] https://www.postgresql.org/message-id/flat/3bbbb0df-7382-bf87-9737-340ba096e034%40postgrespro.ru
[2] https://www.postgresql.org/message-id/3342239.1671988406%40sss.pgh.pa.us

Re: to_regtype() Raises Error

From

Erik Wienhold

Date:

17 September 2023, 23:33:07

On 18/09/2023 00:57 CEST Vik Fearing <vik@postgresfriends.org> wrote:

> On 9/18/23 00:41, Erik Wienhold wrote:
> > On 18/09/2023 00:13 CEST David E. Wheeler <david@justatheory.com> wrote:
> >
> >> david=# select to_regtype('inteval second');
> >> ERROR:  syntax error at or near "second"
> >> LINE 1: select to_regtype('inteval second');
> >>                  ^
> >> CONTEXT:  invalid type name "inteval second”
> >
> > Probably a typo and you meant 'interval second' which works.
>
> No, that is precisely the point.  The result should be null instead of
> an error.

Well, the docs say "return NULL rather than throwing an error if the name is
not found".  To me "name is not found" implies that it has to be valid syntax
first to even have a name that can be looked up.

String 'inteval second' is a syntax error when interpreted as a type name.
The same when I want to create a table with that typo:

    test=# create table t (a inteval second);
    ERROR:  syntax error at or near "second"
    LINE 1: create table t (a inteval second);

And a custom function is always an option:

    create function to_regtype_lax(name text)
      returns regtype
      language plpgsql
      as $$
    begin
      return to_regtype(name);
    exception
      when others then
        return null;
    end
    $$;

    test=# select to_regtype_lax('inteval second');
     to_regtype_lax
    ----------------
     <NULL>
    (1 row)

--
Erik

Re: to_regtype() Raises Error

From

Laurenz Albe

Date:

17 September 2023, 23:49:22

On Mon, 2023-09-18 at 00:57 +0200, Vik Fearing wrote:
> On 9/18/23 00:41, Erik Wienhold wrote:
> > On 18/09/2023 00:13 CEST David E. Wheeler <david@justatheory.com> wrote:
> > > The docs for `to_regtype()` say, “this function will return NULL rather than
> > > throwing an error if the name is not found.” And it’s true most of the time:
> > >
> > > david=# select to_regtype('foo'), to_regtype('clam');
> > >   to_regtype | to_regtype
> > > ------------+------------
> > >   [null]     | [null]
> > >
> > > But not others:
> > >
> > > david=# select to_regtype('inteval second');
> > > ERROR:  syntax error at or near "second"
> > > LINE 1: select to_regtype('inteval second');
> > >                  ^
> > > CONTEXT:  invalid type name "inteval second”
> >
> > Probably a typo and you meant 'interval second' which works.
> No, that is precisely the point.  The result should be null instead of
> an error.

Right.  I debugged into this, and found this comment to typeStringToTypeName():

 * If the string cannot be parsed as a type, an error is raised,
 * unless escontext is an ErrorSaveContext node, in which case we may
 * fill that and return NULL.  But note that the ErrorSaveContext option
 * is mostly aspirational at present: errors detected by the main
 * grammar, rather than here, will still be thrown.

"escontext" is an ErrorSaveContext node, and it is the parser failing.

Not sure if we can do anything about that or if it is worth the effort.

Perhaps the documentation could reflect the implementation.

Yours,
Laurenz Albe

Re: to_regtype() Raises Error

From

"David G. Johnston"

Date:

18 September 2023, 00:58:48

On Sun, Sep 17, 2023 at 5:34 PM Erik Wienhold <ewie@ewie.name> wrote:

On 18/09/2023 00:57 CEST Vik Fearing <vik@postgresfriends.org> wrote:

> On 9/18/23 00:41, Erik Wienhold wrote:
> > On 18/09/2023 00:13 CEST David E. Wheeler <david@justatheory.com> wrote:
> >
> >> david=# select to_regtype('inteval second');
> >> ERROR: syntax error at or near "second"
> >> LINE 1: select to_regtype('inteval second');
> >> ^
> >> CONTEXT: invalid type name "inteval second”
> >
> > Probably a typo and you meant 'interval second' which works.
>
> No, that is precisely the point. The result should be null instead of
> an error.

Well, the docs say "return NULL rather than throwing an error if the name is
not found".

To me "name is not found" implies that it has to be valid syntax
first to even have a name that can be looked up.

Except there is nothing in the typed literal value that is actually a syntactical problem from the perspective of the user. IOW, the following work just fine:

select to_regtype('character varying'), to_regtype('interval second');

No need for quotes and the space doesn't produce an issue (and in fact adding double quotes to the above causes them to not match since the quoting is taken literally and not syntactically)

The failure to return NULL exposes an implementation detail that we shouldn't be exposing. As Tom said, maybe doing better is too hard to be worthwhile, but that doesn't mean our current behavior is somehow correct.

Put differently, there is no syntax involved when the value being provided is the text literal name of a type as it is stored in pg_type.typname, so the presence of a syntax error is wrong.

David J.

Re: to_regtype() Raises Error

From

Chapman Flack

Date:

18 September 2023, 01:24:59

On 2023-09-17 20:58, David G. Johnston wrote:
> Put differently, there is no syntax involved when the value being 
> provided
> is the text literal name of a type as it is stored in pg_type.typname, 
> so
> the presence of a syntax error is wrong.

Well, the situation is a little weirder than that, because of the 
existence
of SQL standard types with multiple-token names; when you provide the
value 'character varying', you are not providing a name found in
pg_type.typname (while, if you call the same type 'varchar', you are).
For 'character varying', the parser is necessarily involved.

The case with 'interval second' is similar, but even different still;
that isn't a multiple-token type name, but a type name with a
standard-specified bespoke way of writing a typmod. Another place
the parser is necessarily involved, doing another job. (AFAICT,
to_regtype is happy with a typmod attached to the input, and
happily ignores it, so to_regtype('interval second') gives you
interval, to_regtype('character varying(20)') gives you
character varying, and so on.)

Regards,
-Chao

Re: to_regtype() Raises Error

From

"David G. Johnston"

Date:

18 September 2023, 01:58:13

On Sun, Sep 17, 2023 at 6:25 PM Chapman Flack <chap@anastigmatix.net> wrote:

On 2023-09-17 20:58, David G. Johnston wrote:
> Put differently, there is no syntax involved when the value being
> provided
> is the text literal name of a type as it is stored in pg_type.typname,
> so
> the presence of a syntax error is wrong.

Well, the situation is a little weirder than that, because of the
existence
of SQL standard types with multiple-token names; when you provide the
value 'character varying', you are not providing a name found in
pg_type.typname (while, if you call the same type 'varchar', you are).
For 'character varying', the parser is necessarily involved.

Why don't we just populate pg_type with these standard mandated names too?

The case with 'interval second' is similar, but even different still;
that isn't a multiple-token type name, but a type name with a
standard-specified bespoke way of writing a typmod. Another place
the parser is necessarily involved, doing another job. (AFAICT,
to_regtype is happy with a typmod attached to the input, and
happily ignores it, so to_regtype('interval second') gives you
interval, to_regtype('character varying(20)') gives you
character varying, and so on.)

Seems doable to teach the lookup code that suffixes of the form (n) should be ignored when matching the base type, plus maybe some kind of special case for standard mandated typmods on their specific types. There is some ambiguity possible when doing that though:

create type "interval second" as (x int, y int);
select to_regtype('interval second'); --> interval

Or just write a function that deals with the known forms dictated by the standard and delegate checking for valid names against that first before consulting pg_type? It might be some code duplication but it isn't like it's a quickly moving target and I have to imagine it would be faster and allow us to readily implement the soft-error contract.

David J.

Re: to_regtype() Raises Error

From

Chapman Flack

Date:

18 September 2023, 02:44:13

On 2023-09-17 21:58, David G. Johnston wrote:
> ambiguity possible when doing that though:
> 
> create type "interval second" as (x int, y int);
> select to_regtype('interval second'); --> interval

Not ambiguity really: that composite type you just made was named
with a single <delimited identifier>, which is one token. (Also,
being delimited makes it case-sensitive, and always distinct from
an SQL keyword; consider the different types char and "char". Ah,
that SQL committee!)

The argument to regtype there is a single, case-insensitive,
<regular identifier>, a <separator>, and another <regular identifier>,
where in this case the first identifier happens to name a type, the
second one happens to be a typmod, and the separator is rather
simple as <separator> goes.

In this one, both identifiers are part of the type name, and the
separator a little more flamboyant.

select to_regtype('character /* hi!
am I part of the type name? /* what, me too? */ ok! */ -- huh!
varying');
     to_regtype
-------------------
  character varying

As the backend already has one parser that knows all those
lexical and grammar productions, I don't imagine it would be
very appealing to have a second implementation of some of them.
Obviously, to_regtype could add some simplifying requirements
(like "only whitespace for the separator please"), but as you
see above, it currently doesn't.

Regards,
-Chap

Re: to_regtype() Raises Error

From

"David G. Johnston"

Date:

18 September 2023, 02:58:44

On Sunday, September 17, 2023, Chapman Flack <chap@anastigmatix.net> wrote:

In this one, both identifiers are part of the type name, and the
separator a little more flamboyant.

select to_regtype('character /* hi!
am I part of the type name? /* what, me too? */ ok! */ -- huh!
varying');
to_regtype
-------------------
character varying

So, maybe we should be saying:

Parses a string of text, extracts a potential type name from it, and translates that name into an OID. Failure to extract a valid potential type name results in an error while a failure to determine that the extracted name is known to the system results in a null output.

I take specific exception to describing your example as a “textual type name”.

David J.

Re: to_regtype() Raises Error

From

"David E. Wheeler"

Date:

29 January 2024, 15:45:01

Hey there, coming back to this. I poked at the logs in the master branch and saw no mention of to_regtype; did I miss
it?

On Sep 17, 2023, at 10:58 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:

> Parses a string of text, extracts a potential type name from it, and translates that name into an OID.  Failure to
extracta valid potential type name results in an error while a failure to determine that the extracted name is known to
thesystem results in a null output. 
>
> I take specific exception to describing your example as a “textual type name”.

More docs seem like a reasonable compromise. Perhaps it’d be useful to also describe when an error is likely and when
it’snot. 

Best,

David

Re: to_regtype() Raises Error

From

"David G. Johnston"

Date:

02 February 2024, 20:23:48

On Mon, Jan 29, 2024 at 8:45 AM David E. Wheeler <david@justatheory.com> wrote:

Hey there, coming back to this. I poked at the logs in the master branch and saw no mention of to_regtype; did I miss it?

With no feedback regarding my final suggestion I lost interest in it and never produced a patch.

On Sep 17, 2023, at 10:58 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:

> Parses a string of text, extracts a potential type name from it, and translates that name into an OID. Failure to extract a valid potential type name results in an error while a failure to determine that the extracted name is known to the system results in a null output.
>
> I take specific exception to describing your example as a “textual type name”.

More docs seem like a reasonable compromise. Perhaps it’d be useful to also describe when an error is likely and when it’s not.

Seems like most just want to leave well enough alone and deal with the rare question for oddball input on the mailing list. If you are interested enough to come back after 4 months I'd suggest you write up and submit a patch. I'm happy to review it and see if that is enough to get a committer to respond.

David J.

Re: to_regtype() Raises Error

From

"David E. Wheeler"

Date:

02 February 2024, 20:33:51

On Feb 2, 2024, at 3:23 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:

> Seems like most just want to leave well enough alone and deal with the rare question for oddball input on the mailing
list. If you are interested enough to come back after 4 months I'd suggest you write up and submit a patch.  I'm happy
toreview it and see if that is enough to get a committer to respond. 

LOL, “interested enough” is less the right term than “triaging email backlog and following up on a surprisingly
controversialquestion.” I also just like to see decisions made and issues closed one way or another. 

Anyway, I’m happy to submit a documentation patch along the lines you suggested.

D

Re: to_regtype() Raises Error

From

"David E. Wheeler"

Date:

04 February 2024, 19:20:38

On Feb 2, 2024, at 15:33, David E. Wheeler <david@justatheory.com> wrote:

> Anyway, I’m happy to submit a documentation patch along the lines you suggested.

How’s this?

--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25460,11 +25460,12 @@ SELECT collation for ('foo' COLLATE "de_DE");
         <returnvalue>regtype</returnvalue>
        </para>
        <para>
-        Translates a textual type name to its OID.  A similar result is
+        Parses a string of text, extracts a potential type name from it, and
+        translates that name into an OID. A similar result is
         obtained by casting the string to type <type>regtype</type> (see
-        <xref linkend="datatype-oid"/>); however, this function will return
-        <literal>NULL</literal> rather than throwing an error if the name is
-        not found.
+        <xref linkend="datatype-oid"/>). Failure to extract a valid potential
+        type name results in an error; however, if the extracted names is not
+        known to the system, this function will return <literal>NULL</literal>.
        </para></entry>
       </row>
      </tbody>

Does similar wording need to apply to other `to_reg*` functions?

Best,

David

Attachment

v1-0001-Update-to_regtype-docs-regarding-error-condition.patch

Re: to_regtype() Raises Error

From

Erik Wienhold

Date:

05 February 2024, 00:11:48

On 2024-02-04 20:20 +0100, David E. Wheeler wrote:
> On Feb 2, 2024, at 15:33, David E. Wheeler <david@justatheory.com> wrote:
> 
> > Anyway, I’m happy to submit a documentation patch along the lines you suggested.
> 
> How’s this?
> 
> --- a/doc/src/sgml/func.sgml
> +++ b/doc/src/sgml/func.sgml
> @@ -25460,11 +25460,12 @@ SELECT collation for ('foo' COLLATE "de_DE");
>          <returnvalue>regtype</returnvalue>
>         </para>
>         <para>
> -        Translates a textual type name to its OID.  A similar result is
> +        Parses a string of text, extracts a potential type name from it, and
> +        translates that name into an OID. A similar result is
>          obtained by casting the string to type <type>regtype</type> (see
> -        <xref linkend="datatype-oid"/>); however, this function will return
> -        <literal>NULL</literal> rather than throwing an error if the name is
> -        not found.
> +        <xref linkend="datatype-oid"/>). Failure to extract a valid potential
> +        type name results in an error; however, if the extracted names is not

Here "extracted names" should be "extracted name" (singular).
Otherwise, the text looks good.

> +        known to the system, this function will return <literal>NULL</literal>.
>         </para></entry>
>        </row>
>       </tbody>
> 
> Does similar wording need to apply to other `to_reg*` functions?

Just to_regtype() is fine IMO.  The other to_reg* functions don't throw
errors on similar input, e.g.:

    test=> select to_regproc('foo bar');
     to_regproc
    ------------
     <NULL>
    (1 row)

-- 
Erik

Re: to_regtype() Raises Error

From

"David E. Wheeler"

Date:

05 February 2024, 14:01:12

On Feb 4, 2024, at 19:11, Erik Wienhold <ewie@ewie.name> wrote:

> Here "extracted names" should be "extracted name" (singular).
> Otherwise, the text looks good.

Ah, thank you. Updated patch attached.

Best,

David

Attachment

v2-0001-Update-to_regtype-docs-regarding-error-condition.patch

Re: to_regtype() Raises Error

From

"David E. Wheeler"

Date:

20 February 2024, 14:48:15

On Feb 5, 2024, at 09:01, David E. Wheeler <david@justatheory.com> wrote:

> Ah, thank you. Updated patch attached.

I’ve moved this patch into the to_regtype patch thread[1], since it exhibits the same behavior.

Best,

David

[1] https://www.postgresql.org/message-id/60EF4E11-BC1C-4034-B37B-695662D28AD2@justatheory.com

Re: to_regtype() Raises Error

From

David Wheeler

Date:

21 February 2024, 16:54:32

Merged this change into the [to_regtypemod patch](https://commitfest.postgresql.org/47/4807/), which has exactly the
sameissue. 

The new status of this patch is: Needs review

Re: to_regtype() Raises Error

From

"David E. Wheeler"

Date:

21 February 2024, 16:56:49

On Feb 21, 2024, at 11:54 AM, David Wheeler <david@justatheory.com> wrote:

> Merged this change into the [to_regtypemod patch](https://commitfest.postgresql.org/47/4807/), which has exactly the
sameissue. 
>
> The new status of this patch is: Needs review

Bah, withdrawn.

D