Thread: to_regtype() Raises Error
The docs for `to_regtype()` say, “this function will return NULL rather than throwing an error if the name is not found.”And it’s true most of the time: david=# select to_regtype('foo'), to_regtype('clam'); to_regtype | to_regtype ------------+------------ [null] | [null] But not others: david=# select to_regtype('inteval second'); ERROR: syntax error at or near "second" LINE 1: select to_regtype('inteval second'); ^ CONTEXT: invalid type name "inteval second” I presume this has something to do with not catching errors from the parser? david=# select to_regtype('clam bake'); ERROR: syntax error at or near "bake" LINE 1: select to_regtype('clam bake'); ^ CONTEXT: invalid type name "clam bake" Best, David
Attachment
On 18/09/2023 00:13 CEST David E. Wheeler <david@justatheory.com> wrote: > The docs for `to_regtype()` say, “this function will return NULL rather than > throwing an error if the name is not found.” And it’s true most of the time: > > david=# select to_regtype('foo'), to_regtype('clam'); > to_regtype | to_regtype > ------------+------------ > [null] | [null] > > But not others: > > david=# select to_regtype('inteval second'); > ERROR: syntax error at or near "second" > LINE 1: select to_regtype('inteval second'); > ^ > CONTEXT: invalid type name "inteval second” Probably a typo and you meant 'interval second' which works. > I presume this has something to do with not catching errors from the parser? > > david=# select to_regtype('clam bake'); > ERROR: syntax error at or near "bake" > LINE 1: select to_regtype('clam bake'); > ^ > CONTEXT: invalid type name "clam bake" Double-quoting the type name to treat it as an identifier works: test=# select to_regtype('"clam bake"'); to_regtype ------------ <NULL> (1 row) So it's basically a matter of keywords vs. identifiers. -- Erik
On 9/18/23 00:41, Erik Wienhold wrote: > On 18/09/2023 00:13 CEST David E. Wheeler <david@justatheory.com> wrote: > >> The docs for `to_regtype()` say, “this function will return NULL rather than >> throwing an error if the name is not found.” And it’s true most of the time: >> >> david=# select to_regtype('foo'), to_regtype('clam'); >> to_regtype | to_regtype >> ------------+------------ >> [null] | [null] >> >> But not others: >> >> david=# select to_regtype('inteval second'); >> ERROR: syntax error at or near "second" >> LINE 1: select to_regtype('inteval second'); >> ^ >> CONTEXT: invalid type name "inteval second” > > Probably a typo and you meant 'interval second' which works. No, that is precisely the point. The result should be null instead of an error. -- Vik Fearing
Vik Fearing <vik@postgresfriends.org> writes: > No, that is precisely the point. The result should be null instead of > an error. Yeah, ideally so, but the cost/benefit of making it happen seems pretty unattractive for now. See the soft-errors thread at [1], particularly [2] (but searching in that thread for references to regclassin, regtypein, and to_reg* will find additional detail). regards, tom lane [1] https://www.postgresql.org/message-id/flat/3bbbb0df-7382-bf87-9737-340ba096e034%40postgrespro.ru [2] https://www.postgresql.org/message-id/3342239.1671988406%40sss.pgh.pa.us
On 18/09/2023 00:57 CEST Vik Fearing <vik@postgresfriends.org> wrote: > On 9/18/23 00:41, Erik Wienhold wrote: > > On 18/09/2023 00:13 CEST David E. Wheeler <david@justatheory.com> wrote: > > > >> david=# select to_regtype('inteval second'); > >> ERROR: syntax error at or near "second" > >> LINE 1: select to_regtype('inteval second'); > >> ^ > >> CONTEXT: invalid type name "inteval second” > > > > Probably a typo and you meant 'interval second' which works. > > No, that is precisely the point. The result should be null instead of > an error. Well, the docs say "return NULL rather than throwing an error if the name is not found". To me "name is not found" implies that it has to be valid syntax first to even have a name that can be looked up. String 'inteval second' is a syntax error when interpreted as a type name. The same when I want to create a table with that typo: test=# create table t (a inteval second); ERROR: syntax error at or near "second" LINE 1: create table t (a inteval second); And a custom function is always an option: create function to_regtype_lax(name text) returns regtype language plpgsql as $$ begin return to_regtype(name); exception when others then return null; end $$; test=# select to_regtype_lax('inteval second'); to_regtype_lax ---------------- <NULL> (1 row) -- Erik
On Sep 17, 2023, at 19:28, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> No, that is precisely the point. The result should be null instead of >> an error. > > Yeah, ideally so, but the cost/benefit of making it happen seems > pretty unattractive for now. See the soft-errors thread at [1], > particularly [2] (but searching in that thread for references to > regclassin, regtypein, and to_reg* will find additional detail). For my purposes I’m wrapping it in an exception-catching PL/pgSQL function, but it might be worth noting the condition inwhich it *will* raise an error on the docs. Best, David
Attachment
On Mon, 2023-09-18 at 00:57 +0200, Vik Fearing wrote: > On 9/18/23 00:41, Erik Wienhold wrote: > > On 18/09/2023 00:13 CEST David E. Wheeler <david@justatheory.com> wrote: > > > The docs for `to_regtype()` say, “this function will return NULL rather than > > > throwing an error if the name is not found.” And it’s true most of the time: > > > > > > david=# select to_regtype('foo'), to_regtype('clam'); > > > to_regtype | to_regtype > > > ------------+------------ > > > [null] | [null] > > > > > > But not others: > > > > > > david=# select to_regtype('inteval second'); > > > ERROR: syntax error at or near "second" > > > LINE 1: select to_regtype('inteval second'); > > > ^ > > > CONTEXT: invalid type name "inteval second” > > > > Probably a typo and you meant 'interval second' which works. > No, that is precisely the point. The result should be null instead of > an error. Right. I debugged into this, and found this comment to typeStringToTypeName(): * If the string cannot be parsed as a type, an error is raised, * unless escontext is an ErrorSaveContext node, in which case we may * fill that and return NULL. But note that the ErrorSaveContext option * is mostly aspirational at present: errors detected by the main * grammar, rather than here, will still be thrown. "escontext" is an ErrorSaveContext node, and it is the parser failing. Not sure if we can do anything about that or if it is worth the effort. Perhaps the documentation could reflect the implementation. Yours, Laurenz Albe
On Sun, Sep 17, 2023 at 5:34 PM Erik Wienhold <ewie@ewie.name> wrote:
On 18/09/2023 00:57 CEST Vik Fearing <vik@postgresfriends.org> wrote:
> On 9/18/23 00:41, Erik Wienhold wrote:
> > On 18/09/2023 00:13 CEST David E. Wheeler <david@justatheory.com> wrote:
> >
> >> david=# select to_regtype('inteval second');
> >> ERROR: syntax error at or near "second"
> >> LINE 1: select to_regtype('inteval second');
> >> ^
> >> CONTEXT: invalid type name "inteval second”
> >
> > Probably a typo and you meant 'interval second' which works.
>
> No, that is precisely the point. The result should be null instead of
> an error.
Well, the docs say "return NULL rather than throwing an error if the name is
not found".
To me "name is not found" implies that it has to be valid syntax
first to even have a name that can be looked up.
Except there is nothing in the typed literal value that is actually a syntactical problem from the perspective of the user. IOW, the following work just fine:
select to_regtype('character varying'), to_regtype('interval second');
No need for quotes and the space doesn't produce an issue (and in fact adding double quotes to the above causes them to not match since the quoting is taken literally and not syntactically)
The failure to return NULL exposes an implementation detail that we shouldn't be exposing. As Tom said, maybe doing better is too hard to be worthwhile, but that doesn't mean our current behavior is somehow correct.
Put differently, there is no syntax involved when the value being provided is the text literal name of a type as it is stored in pg_type.typname, so the presence of a syntax error is wrong.
David J.
On 2023-09-17 20:58, David G. Johnston wrote: > Put differently, there is no syntax involved when the value being > provided > is the text literal name of a type as it is stored in pg_type.typname, > so > the presence of a syntax error is wrong. Well, the situation is a little weirder than that, because of the existence of SQL standard types with multiple-token names; when you provide the value 'character varying', you are not providing a name found in pg_type.typname (while, if you call the same type 'varchar', you are). For 'character varying', the parser is necessarily involved. The case with 'interval second' is similar, but even different still; that isn't a multiple-token type name, but a type name with a standard-specified bespoke way of writing a typmod. Another place the parser is necessarily involved, doing another job. (AFAICT, to_regtype is happy with a typmod attached to the input, and happily ignores it, so to_regtype('interval second') gives you interval, to_regtype('character varying(20)') gives you character varying, and so on.) Regards, -Chao
On Sun, Sep 17, 2023 at 6:25 PM Chapman Flack <chap@anastigmatix.net> wrote:
On 2023-09-17 20:58, David G. Johnston wrote:
> Put differently, there is no syntax involved when the value being
> provided
> is the text literal name of a type as it is stored in pg_type.typname,
> so
> the presence of a syntax error is wrong.
Well, the situation is a little weirder than that, because of the
existence
of SQL standard types with multiple-token names; when you provide the
value 'character varying', you are not providing a name found in
pg_type.typname (while, if you call the same type 'varchar', you are).
For 'character varying', the parser is necessarily involved.
Why don't we just populate pg_type with these standard mandated names too?
The case with 'interval second' is similar, but even different still;
that isn't a multiple-token type name, but a type name with a
standard-specified bespoke way of writing a typmod. Another place
the parser is necessarily involved, doing another job. (AFAICT,
to_regtype is happy with a typmod attached to the input, and
happily ignores it, so to_regtype('interval second') gives you
interval, to_regtype('character varying(20)') gives you
character varying, and so on.)
Seems doable to teach the lookup code that suffixes of the form (n) should be ignored when matching the base type, plus maybe some kind of special case for standard mandated typmods on their specific types. There is some ambiguity possible when doing that though:
create type "interval second" as (x int, y int);
select to_regtype('interval second'); --> interval
select to_regtype('interval second'); --> interval
Or just write a function that deals with the known forms dictated by the standard and delegate checking for valid names against that first before consulting pg_type? It might be some code duplication but it isn't like it's a quickly moving target and I have to imagine it would be faster and allow us to readily implement the soft-error contract.
David J.
On 2023-09-17 21:58, David G. Johnston wrote: > ambiguity possible when doing that though: > > create type "interval second" as (x int, y int); > select to_regtype('interval second'); --> interval Not ambiguity really: that composite type you just made was named with a single <delimited identifier>, which is one token. (Also, being delimited makes it case-sensitive, and always distinct from an SQL keyword; consider the different types char and "char". Ah, that SQL committee!) The argument to regtype there is a single, case-insensitive, <regular identifier>, a <separator>, and another <regular identifier>, where in this case the first identifier happens to name a type, the second one happens to be a typmod, and the separator is rather simple as <separator> goes. In this one, both identifiers are part of the type name, and the separator a little more flamboyant. select to_regtype('character /* hi! am I part of the type name? /* what, me too? */ ok! */ -- huh! varying'); to_regtype ------------------- character varying As the backend already has one parser that knows all those lexical and grammar productions, I don't imagine it would be very appealing to have a second implementation of some of them. Obviously, to_regtype could add some simplifying requirements (like "only whitespace for the separator please"), but as you see above, it currently doesn't. Regards, -Chap
On Sunday, September 17, 2023, Chapman Flack <chap@anastigmatix.net> wrote:
In this one, both identifiers are part of the type name, and the
separator a little more flamboyant.
select to_regtype('character /* hi!
am I part of the type name? /* what, me too? */ ok! */ -- huh!
varying');
to_regtype
-------------------
character varying
So, maybe we should be saying:
Parses a string of text, extracts a potential type name from it, and translates that name into an OID. Failure to extract a valid potential type name results in an error while a failure to determine that the extracted name is known to the system results in a null output.
I take specific exception to describing your example as a “textual type name”.
David J.
Hey there, coming back to this. I poked at the logs in the master branch and saw no mention of to_regtype; did I miss it? On Sep 17, 2023, at 10:58 PM, David G. Johnston <david.g.johnston@gmail.com> wrote: > Parses a string of text, extracts a potential type name from it, and translates that name into an OID. Failure to extracta valid potential type name results in an error while a failure to determine that the extracted name is known to thesystem results in a null output. > > I take specific exception to describing your example as a “textual type name”. More docs seem like a reasonable compromise. Perhaps it’d be useful to also describe when an error is likely and when it’snot. Best, David
On Mon, Jan 29, 2024 at 8:45 AM David E. Wheeler <david@justatheory.com> wrote:
Hey there, coming back to this. I poked at the logs in the master branch and saw no mention of to_regtype; did I miss it?
With no feedback regarding my final suggestion I lost interest in it and never produced a patch.
On Sep 17, 2023, at 10:58 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:
> Parses a string of text, extracts a potential type name from it, and translates that name into an OID. Failure to extract a valid potential type name results in an error while a failure to determine that the extracted name is known to the system results in a null output.
>
> I take specific exception to describing your example as a “textual type name”.
More docs seem like a reasonable compromise. Perhaps it’d be useful to also describe when an error is likely and when it’s not.
Seems like most just want to leave well enough alone and deal with the rare question for oddball input on the mailing list. If you are interested enough to come back after 4 months I'd suggest you write up and submit a patch. I'm happy to review it and see if that is enough to get a committer to respond.
David J.
On Feb 2, 2024, at 3:23 PM, David G. Johnston <david.g.johnston@gmail.com> wrote: > Seems like most just want to leave well enough alone and deal with the rare question for oddball input on the mailing list. If you are interested enough to come back after 4 months I'd suggest you write up and submit a patch. I'm happy toreview it and see if that is enough to get a committer to respond. LOL, “interested enough” is less the right term than “triaging email backlog and following up on a surprisingly controversialquestion.” I also just like to see decisions made and issues closed one way or another. Anyway, I’m happy to submit a documentation patch along the lines you suggested. D
On Feb 2, 2024, at 15:33, David E. Wheeler <david@justatheory.com> wrote: > Anyway, I’m happy to submit a documentation patch along the lines you suggested. How’s this? --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -25460,11 +25460,12 @@ SELECT collation for ('foo' COLLATE "de_DE"); <returnvalue>regtype</returnvalue> </para> <para> - Translates a textual type name to its OID. A similar result is + Parses a string of text, extracts a potential type name from it, and + translates that name into an OID. A similar result is obtained by casting the string to type <type>regtype</type> (see - <xref linkend="datatype-oid"/>); however, this function will return - <literal>NULL</literal> rather than throwing an error if the name is - not found. + <xref linkend="datatype-oid"/>). Failure to extract a valid potential + type name results in an error; however, if the extracted names is not + known to the system, this function will return <literal>NULL</literal>. </para></entry> </row> </tbody> Does similar wording need to apply to other `to_reg*` functions? Best, David
Attachment
On 2024-02-04 20:20 +0100, David E. Wheeler wrote: > On Feb 2, 2024, at 15:33, David E. Wheeler <david@justatheory.com> wrote: > > > Anyway, I’m happy to submit a documentation patch along the lines you suggested. > > How’s this? > > --- a/doc/src/sgml/func.sgml > +++ b/doc/src/sgml/func.sgml > @@ -25460,11 +25460,12 @@ SELECT collation for ('foo' COLLATE "de_DE"); > <returnvalue>regtype</returnvalue> > </para> > <para> > - Translates a textual type name to its OID. A similar result is > + Parses a string of text, extracts a potential type name from it, and > + translates that name into an OID. A similar result is > obtained by casting the string to type <type>regtype</type> (see > - <xref linkend="datatype-oid"/>); however, this function will return > - <literal>NULL</literal> rather than throwing an error if the name is > - not found. > + <xref linkend="datatype-oid"/>). Failure to extract a valid potential > + type name results in an error; however, if the extracted names is not Here "extracted names" should be "extracted name" (singular). Otherwise, the text looks good. > + known to the system, this function will return <literal>NULL</literal>. > </para></entry> > </row> > </tbody> > > Does similar wording need to apply to other `to_reg*` functions? Just to_regtype() is fine IMO. The other to_reg* functions don't throw errors on similar input, e.g.: test=> select to_regproc('foo bar'); to_regproc ------------ <NULL> (1 row) -- Erik
On Feb 4, 2024, at 19:11, Erik Wienhold <ewie@ewie.name> wrote: > Here "extracted names" should be "extracted name" (singular). > Otherwise, the text looks good. Ah, thank you. Updated patch attached. Best, David
Attachment
On Feb 5, 2024, at 09:01, David E. Wheeler <david@justatheory.com> wrote: > Ah, thank you. Updated patch attached. I’ve moved this patch into the to_regtype patch thread[1], since it exhibits the same behavior. Best, David [1] https://www.postgresql.org/message-id/60EF4E11-BC1C-4034-B37B-695662D28AD2@justatheory.com
Merged this change into the [to_regtypemod patch](https://commitfest.postgresql.org/47/4807/), which has exactly the sameissue. The new status of this patch is: Needs review
On Feb 21, 2024, at 11:54 AM, David Wheeler <david@justatheory.com> wrote: > Merged this change into the [to_regtypemod patch](https://commitfest.postgresql.org/47/4807/), which has exactly the sameissue. > > The new status of this patch is: Needs review Bah, withdrawn. D