Thread: Error-safe user functions

Re: Error-safe user functions

From

Tom Lane

Date:

07 October 2022, 17:37:56

Nikita Glukhov <n.gluhov@postgrespro.ru> writes:
> On 30.09.2022, Tom Lane wrote:
>> I strongly recommend against having a new pg_proc column at all.

> I think the only way to avoid catalog modification (adding new columns
> to pg_proc or pg_type, introducing new function signatures etc.) and
> to avoid adding some additional code to the entry of error-safe
> functions is to bump version of our calling convention.  I simply added
> flag Pg_finfo_record.errorsafe which is set to true when the new
> PG_FUNCTION_INFO_V2_ERRORSAFE() macro is used.

I don't think you got my point at all.

I do not think we need a new pg_proc column (btw, touching pg_proc.dat
is morally equivalent to a pg_proc column), and I do not think we need
a new call-convention version either, because I think that this sort
of thing:

+        /* check whether input function supports returning errors */
+        if (cstate->opts.null_on_error_flags[attnum - 1] &&
+            !func_is_error_safe(in_func_oid))
+            ereport(ERROR,
+                    (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                     errmsg("input function for datatype \"%s\" does not support error handling",
+                            format_type_be(att->atttypid))));

is useless.  It does not benefit anybody to pre-emptively throw an error
because you are afraid that some other code might throw an error later.
That just converts "might fail" to "guaranteed to fail" --- how is that
better?

I think what we want is to document options like NULL_ON_ERROR along the
lines of

    If the input data in one of the specified columns is invalid,
    set the column's value to NULL instead of reporting an error.
    This feature will only work if the column datatype's input
    function has been upgraded to support it; otherwise, an invalid
    input value will result in an error anyway.

and just leave it on the heads of extension authors to get their
code moved forward.  (If we fail to get all the core types converted
by the time v16 reaches feature freeze, we'd have to add some docs
about which ones support this; but maybe we will get all that done
and not need documentation effort.)

Some other recommendations:

* The primary work-product from an initial patch of this sort is an
API specification.  Therefore, your 0001 patch ought to be introducing
some prose documentation somewhere (and I don't mean comments in elog.h,
rather a README file or even the SGML docs --- utils/fmgr/README might
be a good spot).  Getting that text right so that people understand
what to do is more important than any single code detail.  You are not
winning any fans by not bothering with code comments such as per-function
header comments, either.

* Submitting 16-part patch series is a good way to discourage people
from reviewing your work.  I'd toss most of the datatype conversions
overboard for the moment, planning to address them later once the core
patch is committed.  The initial patchset only needs to have one or two
data types done as proof-of-concept.

* I'd toss the test module overboard too.  Once you've got COPY using
the feature, that's a perfectly good testbed.  The effort spent on
the test module would have been better spent on making the COPY support
more complete (ie, get rid of the silly restriction to CSV).

* The 0015 and 0016 patches don't seem to belong here either.  It's
impossible to review these when the code is neither commented nor
connected to any use-case.

* I think that the ereturn macro is the right idea, but I don't understand
the rationale for also inventing PG_RETURN_ERROR.  Also, ereturn's
implementation isn't great --- I don't like duplicating the __VA_ARGS__
text, because that will lead to substantial code bloat.  It'd likely
work better to make ereturn very much like ereport, except with a
different finishing function that contains the throw-or-not logic.
As a small nitpick, I think I'd make ereturn's argument order be return
value then edata then ...; it just seems more sensible that way.

* execnodes.h seems like an *extremely* random place to put struct
FuncCallError; that will force inclusion of execnodes.h in many places
that did not need it before.  Possibly fmgr.h is the right place for it?
In general you need to think about avoiding major inclusion bloat
(and I wonder whether the patchset passes cpluspluscheck).  It might
help to treat ereturn's edata argument as just "void *" and confine
references to the FuncCallError struct to the errfinish-replacement
subroutine, ie drop the tests in PG_GET_ERROR_PTR and do that check
inside elog.c.

* I wonder if there's a way to avoid the CopyErrorData and FreeErrorData
steps in use-cases like this --- that's pure overhead, really, for
COPY's purposes, and it's probably not the only use-case that will
think so.  Maybe we could complicate FuncCallError a little and pass
a flag that indicates that we only want to know whether an error
occurred, not what it was exactly.  On the other hand, if you assume
that errors should be rare, maybe that's useless micro-optimization.
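
The ereturn suggestion in the list above (one ereport-like macro whose finishing function holds the throw-or-not logic, with the return value as the first argument) can be sketched in standalone C. This is only an illustration under stated assumptions: SoftErrorSink, soft_error_finish, EREPORT_OR_RETURN and parse_digit are hypothetical names, and a longjmp stands in for the backend's hard-error path. None of this is the patch's actual code.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the error-return node; not the patch's struct. */
typedef struct SoftErrorSink
{
    bool        error_occurred;
    char        message[128];
} SoftErrorSink;

/* Where a "hard" error unwinds to; stands in for the backend's ERROR path. */
static jmp_buf hard_error_target;

/* Message staged by the macro before the finishing function runs. */
static char pending_message[128];

/*
 * Finishing function: the only place that decides whether to throw.
 * With no sink it throws (does not return); with a sink it records the
 * error and returns so the caller can hand back a dummy value.
 */
static void
soft_error_finish(SoftErrorSink *sink)
{
    assert(pending_message[0] != '\0');     /* a message must be staged */
    if (sink == NULL)
        longjmp(hard_error_target, 1);
    sink->error_occurred = true;
    strcpy(sink->message, pending_message); /* both buffers are 128 bytes */
}

/*
 * Argument order: return value, then the sink, then the message arguments.
 * __VA_ARGS__ is expanded exactly once, so the message text is not duplicated.
 */
#define EREPORT_OR_RETURN(retval, sink, ...) \
    do { \
        snprintf(pending_message, sizeof(pending_message), __VA_ARGS__); \
        soft_error_finish(sink); \
        return retval; \
    } while (0)

/* Example "input function": parses one decimal digit. */
static int
parse_digit(char c, SoftErrorSink *sink)
{
    if (c < '0' || c > '9')
        EREPORT_OR_RETURN(-1, sink, "invalid digit: \"%c\"", c);
    return c - '0';
}

/* Calls parse_digit with no sink and reports whether it threw. */
static bool
parse_digit_threw(char c)
{
    if (setjmp(hard_error_target) != 0)
        return true;
    (void) parse_digit(c, NULL);
    return false;
}
```

A caller that wants soft errors passes a zeroed SoftErrorSink and checks error_occurred afterwards; a caller that passes NULL gets the old throwing behaviour, so the function body is written only once.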

Basically, this patch set should be a lot smaller and not have ambitions
beyond "get the API right" and "make one or two datatypes support COPY
NULL_ON_ERROR".  Add more code once that core functionality gets reviewed
and committed.

            regards, tom lane

Re: Error-safe user functions

From

Corey Huinker

Date:

10 October 2022, 16:54:28


> The idea is simple -- introduce new "error-safe" calling mode of user
> functions by passing special node through FunctionCallInfo.context, in
> which function should write error info and return instead of throwing
> it.  Also such functions should manually free resources before
> returning an error.  This gives ability to avoid PG_TRY/PG_CATCH and
> subtransactions.

I tried something similar when trying to implement TRY_CAST (https://learn.microsoft.com/en-us/sql/t-sql/functions/try-cast-transact-sql?view=sql-server-ver16) late last year. I also considered having a default datum rather than just returning NULL.

I had not considered a new node type. I had considered having every function have a "safe" version, which would be a big duplication of logic requiring a lot of regression tests and possibly fuzzing tests.

Instead, I extended every core input function to have an extra boolean parameter to indicate if failures were allowed, and then an extra Datum parameter for the default value. The input function wouldn't need to check the value of the new parameters until it was already in a situation where it found invalid data, but the extra overhead still remained, and it meant that basically every third-party type extension would need to be changed.

Then I considered whether the cast failure should be completely silent, or if the previous error message should instead be emitted as a LOG/INFO/WARNING message, and if we'd want that to be configurable, so then the boolean parameter became an integer enum:

* regular fail (0)

* use default silently (1)

* use default emit LOG/NOTICE/WARNING (2,3,4)
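
As a rough illustration of the integer failure-mode parameter listed above, the following standalone C sketch parses an integer and consults the mode only after bad input has been detected. The names InputFailureMode, parse_int_strict and int_in_with_mode are hypothetical, and a plain int stands in for a Datum; it is not code from any posted patch.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical modes, numbered as in the list above. */
typedef enum InputFailureMode
{
    FAIL_REGULAR = 0,           /* regular fail */
    FAIL_DEFAULT_SILENT = 1,    /* use default silently */
    FAIL_DEFAULT_LOG = 2,       /* use default, emit LOG */
    FAIL_DEFAULT_NOTICE = 3,    /* use default, emit NOTICE */
    FAIL_DEFAULT_WARNING = 4    /* use default, emit WARNING */
} InputFailureMode;

/* Base-10 parse that rejects trailing junk and out-of-range values. */
static bool
parse_int_strict(const char *str, int *result)
{
    char       *end;
    long        val;

    errno = 0;
    val = strtol(str, &end, 10);
    if (end == str || *end != '\0' || errno == ERANGE ||
        val < INT_MIN || val > INT_MAX)
        return false;
    *result = (int) val;
    return true;
}

/*
 * Returns true if *result is usable (parsed or defaulted).  Returns false
 * only in FAIL_REGULAR mode, where the caller would raise the usual error.
 * The mode is not examined until invalid data has actually been found.
 */
static bool
int_in_with_mode(const char *str, InputFailureMode mode,
                 int default_value, int *result)
{
    static const char *const level_names[] =
        {NULL, NULL, "LOG", "NOTICE", "WARNING"};

    assert(result != NULL);
    if (parse_int_strict(str, result))
        return true;
    if (mode == FAIL_REGULAR)
        return false;
    if (mode >= FAIL_DEFAULT_LOG)
        fprintf(stderr, "%s: invalid input syntax for integer: \"%s\"\n",
                level_names[mode], str);
    *result = default_value;
    return true;
}
```

The sketch also shows the cost noted above: every input function gains two parameters, even though they matter only on the failure path.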

At the time, all of this seemed like too big of a change for a function that isn't even in the SQL Standard, but maybe SQL/JSON changes that.

If so, it would allow for a can-cast-to test that users would find very useful. Something like:

SELECT CASE WHEN 'abc' CAN BE integer THEN 'Integer' ELSE 'Nope' END

There's obviously no standard syntax to support that, but the data cleansing possibilities would be great.

Re: Error-safe user functions

From

Andrew Dunstan

Date:

15 November 2022, 16:35:55

On 2022-10-07 Fr 13:37, Tom Lane wrote:

[ lots of detailed review ]

> Basically, this patch set should be a lot smaller and not have ambitions
> beyond "get the API right" and "make one or two datatypes support COPY
> NULL_ON_ERROR".  Add more code once that core functionality gets reviewed
> and committed.
>
>             

Nikita,

just checking in, are you making progress on this? I think we really
need to get this reviewed and committed ASAP if we are to have a chance
to get the SQL/JSON stuff reworked to use it in time for release 16.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Error-safe user functions

From

Corey Huinker

Date:

21 November 2022, 05:17:12

On Tue, Nov 15, 2022 at 11:36 AM Andrew Dunstan <andrew@dunslane.net> wrote:

> On 2022-10-07 Fr 13:37, Tom Lane wrote:
>
> [ lots of detailed review ]
>
>> Basically, this patch set should be a lot smaller and not have ambitions
>> beyond "get the API right" and "make one or two datatypes support COPY
>> NULL_ON_ERROR". Add more code once that core functionality gets reviewed
>> and committed.
>
> Nikita,
>
> just checking in, are you making progress on this? I think we really
> need to get this reviewed and committed ASAP if we are to have a chance
> to get the SQL/JSON stuff reworked to use it in time for release 16.

I'm making an attempt at this or something very similar to it. I don't yet have a patch ready.

Re: Error-safe user functions

From

Tom Lane

Date:

21 November 2022, 05:26:45

Corey Huinker <corey.huinker@gmail.com> writes:
> On Tue, Nov 15, 2022 at 11:36 AM Andrew Dunstan <andrew@dunslane.net> wrote:
>> Nikita,
>> just checking in, are you making progress on this? I think we really
>> need to get this reviewed and committed ASAP if we are to have a chance
>> to get the SQL/JSON stuff reworked to use it in time for release 16.

> I'm making an attempt at this or something very similar to it. I don't yet
> have a patch ready.

Cool.  We can't delay too much longer on this if we want to have
a credible feature in v16.  Although I want a minimal initial
patch, there will still be a ton of incremental work to do after
the core capability is reviewed and committed, so there's no
time to lose.

            regards, tom lane

Re: Error-safe user functions

From

Michael Paquier

Date:

21 November 2022, 23:59:38

On Mon, Nov 21, 2022 at 12:26:45AM -0500, Tom Lane wrote:
> Corey Huinker <corey.huinker@gmail.com> writes:
>> I'm making an attempt at this or something very similar to it. I don't yet
>> have a patch ready.

Nice to hear that.  If a WIP or a proof of concept takes more than a
few hours, how about beginning a new thread with the ideas you have in
mind so that we could agree about how to shape this error-to-default
conversion facility on data input?

> Cool.  We can't delay too much longer on this if we want to have
> a credible feature in v16.  Although I want a minimal initial
> patch, there will still be a ton of incremental work to do after
> the core capability is reviewed and committed, so there's no
> time to lose.

I was glancing at the patch upthread and the introduction of a v2
scared me, so I've stopped at this point.
--
Michael

Re: Error-safe user functions

From

Andrew Dunstan

Date:

01 December 2022, 17:46:48

On 2022-11-21 Mo 00:26, Tom Lane wrote:
> Corey Huinker <corey.huinker@gmail.com> writes:
>> On Tue, Nov 15, 2022 at 11:36 AM Andrew Dunstan <andrew@dunslane.net> wrote:
>>> Nikita,
>>> just checking in, are you making progress on this? I think we really
>>> need to get this reviewed and committed ASAP if we are to have a chance
>>> to get the SQL/JSON stuff reworked to use it in time for release 16.
>> I'm making an attempt at this or something very similar to it. I don't yet
>> have a patch ready.
> Cool.  We can't delay too much longer on this if we want to have
> a credible feature in v16.  Although I want a minimal initial
> patch, there will still be a ton of incremental work to do after
> the core capability is reviewed and committed, so there's no
> time to lose.
>
>             

OK, here's a set of minimal patches based on Nikita's earlier work and
also some work by my colleague Amul Sul. It tries to follow Tom's
original outline at [1], and do as little else as possible.

Patch 1 introduces the IOCallContext node. The caller should set the
no_error_throw flag and clear the error_found flag, which will be set by
a conforming IO function if an error is found. It also includes a string
field for an error message. I haven't used that, it's more there to
stimulate discussion. Robert suggested to me that maybe it should be an
ErrorData, but I'm not sure how we would use it.

Patch 2 introduces InputFunctionCallContext(), which is similar to
InputFunctionCall() but with an extra context parameter. Note that it's
ok for an input function to return a NULL to this function if an error
is found.

Patches 3 and 4 modify the bool_in() and int4in() functions respectively
to handle an IOCallContext appropriately if provided one in their
fcinfo.context.
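
To make the protocol in patches 1 to 4 concrete, here is a standalone C sketch under stated assumptions. The flag names no_error_throw and error_found come from the description above; the struct name IOCallContextSketch, the error_message field type, and the functions bool_in_sketch and read_bool_null_on_error are hypothetical, and exit() merely stands in for a thrown error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical mock of the context node described above. */
typedef struct IOCallContextSketch
{
    bool        no_error_throw;     /* set by the caller */
    bool        error_found;        /* cleared by caller, set on bad input */
    const char *error_message;      /* optional detail, unused by the caller */
} IOCallContextSketch;

/*
 * Input function for a boolean.  With a conforming context it flags the
 * error and returns a dummy value; otherwise it fails hard.
 */
static bool
bool_in_sketch(const char *str, IOCallContextSketch *context)
{
    if (strcmp(str, "t") == 0 || strcmp(str, "true") == 0)
        return true;
    if (strcmp(str, "f") == 0 || strcmp(str, "false") == 0)
        return false;

    if (context != NULL && context->no_error_throw)
    {
        context->error_found = true;
        context->error_message = "invalid input syntax for type boolean";
        return false;               /* dummy value, ignored by the caller */
    }

    /* Stand-in for throwing an error in the usual way. */
    fprintf(stderr, "ERROR: invalid input syntax for type boolean: \"%s\"\n",
            str);
    exit(1);
}

/* Caller in the style of COPY ... NULL_ON_ERROR: bad input becomes NULL. */
static bool
read_bool_null_on_error(const char *str, bool *isnull)
{
    IOCallContextSketch context = {true, false, NULL};
    bool        value;

    assert(isnull != NULL);
    value = bool_in_sketch(str, &context);
    *isnull = context.error_found;
    return value;
}
```

An input function that has not been converted would simply ignore the context and take the hard-error path, which matches the "might fail" behaviour discussed earlier in the thread.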

Patch 5 introduces COPY FROM ... NULL_ON_ERROR which, in addition to
being useful in itself, provides a test of the previous patches.

I hope we can get a fairly quick agreement so that work can begin on
adjusting at least those things needed for the SQL/JSON patches ASAP.
Our goal should be to adjust all the core input functions, but that's
not quite so urgent, and can be completed in parallel with the SQL/JSON
work.

cheers

andrew

[1] https://www.postgresql.org/message-id/13351.1661965592%40sss.pgh.pa.us

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

On Fri, Dec 2, 2022 at 9:12 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Robert Haas <robertmhaas@gmail.com> writes:
>> I think the design is evolving in your head as you think about this
>> more, which is totally understandable and actually very good. However,
>> this is also why I think that you should produce the patch you
>> actually want instead of letting other people repeatedly submit
>> patches and then complain that they weren't what you had in mind.
>
> OK, Corey hasn't said anything, so I will have a look at this over
> the weekend.
>
> regards, tom lane

Sorry, had several life issues intervene. Glancing over what was discussed because it seems pretty similar to what I had in mind. Will respond back in detail shortly.

Re: Error-safe user functions

From

Corey Huinker

Date:

02 December 2022, 18:15:06

On Fri, Dec 2, 2022 at 9:34 AM Andrew Dunstan <andrew@dunslane.net> wrote:

> On 2022-12-02 Fr 09:12, Tom Lane wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> I think the design is evolving in your head as you think about this
>>> more, which is totally understandable and actually very good. However,
>>> this is also why I think that you should produce the patch you
>>> actually want instead of letting other people repeatedly submit
>>> patches and then complain that they weren't what you had in mind.
>> OK, Corey hasn't said anything, so I will have a look at this over
>> the weekend.
>
> Great. Let's hope we can get this settled early next week and then we
> can get to work on the next tranche of functions, those that will let
> the SQL/JSON work restart.
>
> cheers
>
> andrew
>
> --
> Andrew Dunstan
> EDB: https://www.enterprisedb.com

I'm still working on organizing my patch, but it grew out of a desire to do this:

CAST(value AS TypeName DEFAULT expr)

This is a thing that exists in other forms in other databases and while it may look unrelated, it is essentially the SQL/JSON casts within a nested data structure issue, just a lot simpler.

My original plan had been two new params to all _in() functions: a boolean error_mode and a default expression Datum.

After consulting with Jeff Davis and Michael Paquier, the notion became modifying fcinfo itself to add two booleans:
allow_error (this call is allowed to return if there was an error with INPUT) and
has_error (this function has the concept of a purely-input-based error, and found one)

The nice part about this is that unaware functions can ignore these values, and custom data types that did not check these values would continue to work as before. It wouldn't respect the CAST default, but that's up to the extension writer to fix, and that's a pretty acceptable failure mode.

Where this gets tricky is arrays and complex types: the default expression applies only to the object explicitly cast, so if somebody tried CAST ('{"123","abc"}'::text[] AS integer[] DEFAULT '{0}') the inner casts need to know that they _can_ allow input errors, but have no default to offer; they need merely report their failure upstream...and that's where the issues align with the SQL/JSON issue.
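
The array case above can be sketched in standalone C: the element-level cast may fail softly but has no default of its own, so it only reports failure upward, and the array-level cast is the one that substitutes the DEFAULT. The function names element_to_int and cast_array_or_default are hypothetical, and plain C arrays stand in for text[] and integer[] values.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Inner cast: reports failure to its caller and offers no default. */
static bool
element_to_int(const char *str, int *result)
{
    char       *end;
    long        val;

    errno = 0;
    val = strtol(str, &end, 10);
    if (end == str || *end != '\0' || errno == ERANGE ||
        val < INT_MIN || val > INT_MAX)
        return false;
    *result = (int) val;
    return true;
}

/*
 * Outer cast: converts every element into scratch[].  If any element fails,
 * the default applies to the whole array, not to the single bad element.
 * Returns the array to use and stores its length in *result_len.
 */
static const int *
cast_array_or_default(const char **items, size_t nitems, int *scratch,
                      const int *default_array, size_t default_len,
                      size_t *result_len)
{
    size_t      i;

    assert(result_len != NULL);
    for (i = 0; i < nitems; i++)
    {
        if (!element_to_int(items[i], &scratch[i]))
        {
            *result_len = default_len;
            return default_array;
        }
    }
    *result_len = nitems;
    return scratch;
}
```

With the inputs from the example above, the elements "123" and "abc" yield the one-element default array, because the failure of "abc" propagates to the level that owns the default.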

In pursuing this, I see that my method was simultaneously too little and too much. Too much in the sense that it alters the structure for all fmgr functions, though in a very minor and back-compatible way, and too little in the sense that we could actually return the ereport info in a structure and leave it to the node to decide whether to raise it or not. Though I should add that there are many situations where we don't care about the specifics of the error, we just want to know that one existed and move on, so time spent forming that return structure would be time wasted.

The one gap I see so far in the patch presented is that it returns a null value on bad input, which might be ok if the node has the default, but that then presents the node with having to understand whether it was a null because of bad input vs a null that was expected.

Re: Error-safe user functions

From

Tom Lane

Date:

02 December 2022, 18:46:01

Corey Huinker <corey.huinker@gmail.com> writes:
> I'm still working on organizing my patch, but it grew out of a desire to do
> this:
> CAST(value AS TypeName DEFAULT expr)
> This is a thing that exists in other forms in other databases and while it
> may look unrelated, it is essentially the SQL/JSON casts within a nested
> data structure issue, just a lot simpler.

Okay, maybe that's why I was thinking we had a requirement for
failure-free casts.  Sure, you can transform it to the other thing
by always implementing this as a cast-via-IO, but you could run into
semantics issues that way.  (If memory serves, we already have cases
where casting X to Y gives a different result from casting X to text
to Y.)

> My original plan had been two new params to all _in() functions: a boolean
> error_mode and a default expression Datum.
> After consulting with Jeff Davis and Michael Paquier, the notion of
> modifying fcinfo itself two booleans:
>   allow_error (this call is allowed to return if there was an error with
> INPUT) and
>   has_error (this function has the concept of a purely-input-based error,
> and found one)

Hmm ... my main complaint about that is the lack of any way to report
the details of the error.  I realize that a plain boolean failure flag
might be enough for our immediate use-cases, but I don't want to expend
the amount of effort this is going to involve and then later find we
have a use-case where we need the error details.

The sketch that's in my head at the moment is to make use of the existing
"context" field of FunctionCallInfo: if that contains a node of some
to-be-named type, then we are requesting that errors not be thrown
but instead be reported by passing back an ErrorData using a field of
that node.  The issue about not constructing an ErrorData if the outer
caller doesn't need it could perhaps be addressed by adding some boolean
flag fields in that node, but the details of that need not be known to
the functions reporting errors this way; it'd be a side channel from the
outer caller to elog.c.
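
A minimal standalone sketch of that idea, assuming hypothetical names throughout (NodeTagSketch, ErrorReturnSketch, OtherContextSketch, report_input_error): the context pointer is inspected by node tag, and the "details wanted" flag is a private channel between the outer caller and the error machinery, so the reporting function never looks at it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical node tags; the real node type was still to be named. */
typedef enum NodeTagSketch
{
    T_InvalidSketch = 0,
    T_OtherContextSketch,       /* e.g. some pre-existing context use */
    T_ErrorReturnSketch
} NodeTagSketch;

typedef struct ErrorReturnSketch
{
    NodeTagSketch type;         /* must be the first field */
    bool        error_occurred; /* set when an error is reported */
    bool        details_please; /* caller-to-machinery side channel */
    char        message[128];   /* filled only if details_please */
} ErrorReturnSketch;

typedef struct OtherContextSketch
{
    NodeTagSketch type;         /* must be the first field */
    int         unused;
} OtherContextSketch;

/* A pointer to a struct may be read through a pointer to its first member. */
#define IS_A(ptr, tag) (*(const NodeTagSketch *) (ptr) == (tag))

/*
 * Error machinery: returns true if the error was captured in the context,
 * false if the caller must throw in the usual way.
 */
static bool
report_input_error(void *context, const char *message)
{
    ErrorReturnSketch *er;

    assert(message != NULL);
    if (context == NULL || !IS_A(context, T_ErrorReturnSketch))
        return false;

    er = (ErrorReturnSketch *) context;
    er->error_occurred = true;
    if (er->details_please)
        snprintf(er->message, sizeof(er->message), "%s", message);
    return true;
}
```

A context of any other node type, or no context at all, leaves the existing throwing behaviour untouched, which is why pre-existing users of the context field are unaffected.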

The main objection I can see to this approach is that we only support
one context value per call, so you could not easily combine this
functionality with existing use-cases for the context field.  A quick
census of InitFunctionCallInfoData calls finds aggregates, window
functions, triggers, and procedures, none of which seem like plausible
candidates for wanting no-error behavior, so I'm not too concerned
about that.  (Maybe the error-reporting node could be made a sub-node
of the context node in any future cases where we do need it?)

> The one gap I see so far in the patch presented is that it returns a null
> value on bad input, which might be ok if the node has the default, but that
> then presents the node with having to understand whether it was a null
> because of bad input vs a null that was expected.

Yeah.  That's something we could probably get away with for the case of
input functions only, but I think explicit out-of-band signaling that
there was an error is a more future-proof solution.
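
The ambiguity can be shown with a small standalone C sketch. It assumes a hypothetical conversion in which an empty string legitimately means NULL; NullableInt, convert_in_band and convert_out_of_band are illustrative names only.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct NullableInt
{
    bool        isnull;
    int         value;
} NullableInt;

static bool
parse_int(const char *str, int *result)
{
    char       *end;
    long        val;

    errno = 0;
    val = strtol(str, &end, 10);
    if (end == str || *end != '\0' || errno == ERANGE ||
        val < INT_MIN || val > INT_MAX)
        return false;
    *result = (int) val;
    return true;
}

/* In-band signaling: bad input and a legitimate NULL look identical. */
static NullableInt
convert_in_band(const char *str)
{
    NullableInt r = {true, 0};

    if (str[0] != '\0' && parse_int(str, &r.value))
        r.isnull = false;
    return r;
}

/* Out-of-band signaling: a separate flag says whether an error occurred. */
static NullableInt
convert_out_of_band(const char *str, bool *error_occurred)
{
    NullableInt r = {true, 0};

    assert(error_occurred != NULL);
    *error_occurred = false;
    if (str[0] == '\0')
        return r;               /* legitimate NULL, not an error */
    if (parse_int(str, &r.value))
        r.isnull = false;
    else
        *error_occurred = true; /* NULL result caused by bad input */
    return r;
}
```

With convert_in_band the caller cannot tell "" from "abc"; with convert_out_of_band it can, which is what a node holding a default value would need.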

            regards, tom lane

Re: Error-safe user functions

From

Corey Huinker

Date:

02 December 2022, 19:06:09

On Fri, Dec 2, 2022 at 1:46 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Corey Huinker <corey.huinker@gmail.com> writes:
>> I'm still working on organizing my patch, but it grew out of a desire to do
>> this:
>> CAST(value AS TypeName DEFAULT expr)
>> This is a thing that exists in other forms in other databases and while it
>> may look unrelated, it is essentially the SQL/JSON casts within a nested
>> data structure issue, just a lot simpler.
>
> Okay, maybe that's why I was thinking we had a requirement for
> failure-free casts. Sure, you can transform it to the other thing
> by always implementing this as a cast-via-IO, but you could run into
> semantics issues that way. (If memory serves, we already have cases
> where casting X to Y gives a different result from casting X to text
> to Y.)

Yes, I was setting aside the issue of direct cast functions for my v0.1

>> My original plan had been two new params to all _in() functions: a boolean
>> error_mode and a default expression Datum.
>> After consulting with Jeff Davis and Michael Paquier, the notion of
>> modifying fcinfo itself two booleans:
>> allow_error (this call is allowed to return if there was an error with
>> INPUT) and
>> has_error (this function has the concept of a purely-input-based error,
>> and found one)
>
> Hmm ... my main complaint about that is the lack of any way to report
> the details of the error. I realize that a plain boolean failure flag
> might be enough for our immediate use-cases, but I don't want to expend
> the amount of effort this is going to involve and then later find we
> have a use-case where we need the error details.

I agree, but then we're past a boolean for allow_error, and we probably get into a list of modes like this:

CAST_ERROR_ERROR /* default ereport(), what we do now */
CAST_ERROR_REPORT_FULL /* report that the cast failed, everything that you would have put in the ereport() instead put in a struct that gets returned to caller */
CAST_ERROR_REPORT_SILENT /* report that the cast failed, but nobody cares why so don't even form the ereport strings, good for bulk operations */
CAST_ERROR_WARNING /* report that the cast failed, but emit ereport() as a warning */
CAST_ERROR_[NOTICE,LOG,DEBUG1,..DEBUG5] /* same, but some other loglevel */

> The sketch that's in my head at the moment is to make use of the existing
> "context" field of FunctionCallInfo: if that contains a node of some
> to-be-named type, then we are requesting that errors not be thrown
> but instead be reported by passing back an ErrorData using a field of
> that node. The issue about not constructing an ErrorData if the outer
> caller doesn't need it could perhaps be addressed by adding some boolean
> flag fields in that node, but the details of that need not be known to
> the functions reporting errors this way; it'd be a side channel from the
> outer caller to elog.c.

That should be a good place for it, assuming it's not already used like fn_extra is. It would also squash those cases above into just three: ERROR, REPORT_FULL, and REPORT_SILENT, leaving it up to the node what type of erroring/logging is appropriate.

> The main objection I can see to this approach is that we only support
> one context value per call, so you could not easily combine this
> functionality with existing use-cases for the context field. A quick
> census of InitFunctionCallInfoData calls finds aggregates, window
> functions, triggers, and procedures, none of which seem like plausible
> candidates for wanting no-error behavior, so I'm not too concerned
> about that. (Maybe the error-reporting node could be made a sub-node
> of the context node in any future cases where we do need it?)

A sub-node had occurred to me when fiddling about with fn_extra, so that applies here, but if we're doing a sub-node, then maybe it's worth its own parameter. I struggled with that because of how few of these functions will use it vs how often they're executed.

>> The one gap I see so far in the patch presented is that it returns a null
>> value on bad input, which might be ok if the node has the default, but that
>> then presents the node with having to understand whether it was a null
>> because of bad input vs a null that was expected.
>
> Yeah. That's something we could probably get away with for the case of
> input functions only, but I think explicit out-of-band signaling that
> there was an error is a more future-proof solution.

I think we'll run into it fairly soon, because if I recall correctly, SQL/JSON has a formatting spec that essentially means that we're not calling input functions, we're calling TO_CHAR() and TO_DATE(), but they're very similar to input functions.

One positive side effect of all this is we can get a isa(value, typname) construct like this "for free", we just try the cast and return the value.
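
A rough standalone C sketch of such an isa(value, typname) test, assuming a hypothetical table of error-safe input checks: the cast is attempted and only its success or failure is reported. The names SoftInputFn, soft_types and is_a are illustrative, and the two supported type names are just examples.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* An error-safe input check: true if str is valid for the type. */
typedef bool (*SoftInputFn) (const char *str);

static bool
int_input_ok(const char *str)
{
    char       *end;
    long        val;

    errno = 0;
    val = strtol(str, &end, 10);
    return end != str && *end == '\0' && errno != ERANGE &&
        val >= INT_MIN && val <= INT_MAX;
}

static bool
bool_input_ok(const char *str)
{
    return strcmp(str, "t") == 0 || strcmp(str, "true") == 0 ||
        strcmp(str, "f") == 0 || strcmp(str, "false") == 0;
}

/* Hypothetical lookup table from type name to its error-safe check. */
static const struct
{
    const char *typname;
    SoftInputFn input_ok;
}           soft_types[] =
{
    {"integer", int_input_ok},
    {"boolean", bool_input_ok},
};

/* Returns 1 if value is valid for typname, 0 if not, -1 if type unknown. */
static int
is_a(const char *value, const char *typname)
{
    size_t      i;

    assert(value != NULL && typname != NULL);
    for (i = 0; i < sizeof(soft_types) / sizeof(soft_types[0]); i++)
    {
        if (strcmp(soft_types[i].typname, typname) == 0)
            return soft_types[i].input_ok(value) ? 1 : 0;
    }
    return -1;
}
```

The -1 result models a type whose input function has not been converted, where a real implementation would still have to fall back to a hard error.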

I'm still working on my patch even though it will probably be sidelined in the hopes that it informs us of any subsequent issues.

Re: Error-safe user functions

From

Robert Haas

Date:

02 December 2022, 20:49:09

On Fri, Dec 2, 2022 at 1:46 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The main objection I can see to this approach is that we only support
> one context value per call, so you could not easily combine this
> functionality with existing use-cases for the context field.  A quick
> census of InitFunctionCallInfoData calls finds aggregates, window
> functions, triggers, and procedures, none of which seem like plausible
> candidates for wanting no-error behavior, so I'm not too concerned
> about that.  (Maybe the error-reporting node could be made a sub-node
> of the context node in any future cases where we do need it?)

I kind of wonder why we don't just add another member to FmgrInfo.
It's 48 bytes right now and this would increase the size to 56 bytes,
so it's not as if we're increasing the number of cache lines or even
using up all of the remaining byte space. It's an API break, but
people have to recompile for new major versions anyway, so I guess I
don't really see the downside.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Error-safe user functions

From

Tom Lane

Date:

02 December 2022, 21:19:11

Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Dec 2, 2022 at 1:46 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The main objection I can see to this approach is that we only support
>> one context value per call, so you could not easily combine this
>> functionality with existing use-cases for the context field.

> I kind of wonder why we don't just add another member to FmgrInfo.
> It's 48 bytes right now and this would increase the size to 56 bytes,

This'd be FunctionCallInfoData not FmgrInfo.

I'm not terribly concerned about the size of FunctionCallInfoData,
but I am concerned about the number of cycles spent to initialize it,
because we do that pretty durn often.  So I don't really want to add
fields to it without compelling use-cases, and I don't see one here.

            regards, tom lane

Re: Error-safe user functions

From

Tom Lane

Date:

03 December 2022, 21:46:41

Andrew Dunstan <andrew@dunslane.net> writes:
> Great. Let's hope we can get this settled early next week and then we
> can get to work on the next tranche of functions, those that will let
> the SQL/JSON work restart.

OK, here's a draft proposal.  I should start out by acknowledging that
this steals a great deal from Nikita's original patch as well as yours,
though I editorialized heavily.

0001 is the core infrastructure and documentation for the feature.
(I didn't bother breaking it down further than that.)

0002 fixes boolin and int4in.  That is the work that we're going to
have to replicate in an awful lot of places, and I am pleased by how
short-and-sweet it is.  Of course, stuff like the datetime functions
might be more complex to adapt.

Then 0003 is a quick-hack version of COPY that is able to exercise
all this.  I did not bother with the per-column flags as you had
them, because I'm not sure if they're worth the trouble compared
to a simple boolean; in any case we can add that refinement later.
What I did add was a WARN_ON_ERROR option that exercises the ability
to extract the error message after a soft error.  I'm not proposing
that as a shippable feature, it's just something for testing.

I think there are just a couple of loose ends here:

1. Bikeshedding on my name choices is welcome.  I know Robert is
dissatisfied with "ereturn", but I'm content with that so I didn't
change it here.

2. Everybody has struggled with just where to put the declaration
of the error context structure.  The most natural home for it
probably would be elog.h, but that's out because it cannot depend
on nodes.h, and the struct has to be a Node type to conform to
the fmgr safety guidelines.  What I've done here is to drop it
in nodes.h, as we've done with a couple of other hard-to-classify
node types; but I can't say I'm satisfied with that.

Other plausible answers seem to be:

* Drop it in fmgr.h.  The only real problem is that historically
we've not wanted fmgr.h to depend on nodes.h either.  But I'm not
sure how strong the argument for that really is/was.  If we did
do it like that we could clean up a few kluges, both in this patch
and pre-existing (fmNodePtr at least could go away).

* Invent a whole new header just for this struct.  But then we're
back to the question of what to call it.  Maybe something along the
lines of utils/elog_extras.h ?

            regards, tom lane

diff --git a/doc/src/sgml/ref/create_type.sgml b/doc/src/sgml/ref/create_type.sgml
index 693423e524..e68a6d7231 100644
--- a/doc/src/sgml/ref/create_type.sgml
+++ b/doc/src/sgml/ref/create_type.sgml
@@ -900,6 +900,15 @@ CREATE TYPE <replaceable class="parameter">name</replaceable>
    function is written in C.
   </para>

+  <para>
+   In <productname>PostgreSQL</productname> version 16 and later, it is
+   desirable for base types' input functions to return <quote>safe</quote>
+   errors using the new <function>ereturn()</function> mechanism, rather
+   than throwing <function>ereport()</function> exceptions as in previous
+   versions.  See <filename>src/backend/utils/fmgr/README</filename> for
+   more information.
+  </para>
+
  </refsect1>

  <refsect1>
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 2585e24845..6c8736f0c4 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -686,6 +686,153 @@ errfinish(const char *filename, int lineno, const char *funcname)
 }


+/*
+ * ereturn_start --- begin a "safe" error-reporting cycle
+ *
+ * If "context" isn't an ErrorReturnContext node, this behaves as
+ * errstart(ERROR, domain).
+ *
+ * If it is an ErrorReturnContext node, but the node creator only wants
+ * notification of the fact of a safe error without any details, just set
+ * the error_occurred flag in the ErrorReturnContext node and return false,
+ * which will cause us to skip remaining error processing steps.
+ *
+ * Otherwise, create and initialize error stack entry and return true.
+ * Subsequently, errmsg() and perhaps other routines will be called to further
+ * populate the stack entry.  Finally, ereturn_finish() will be called to
+ * tidy up.
+ */
+bool
+ereturn_start(void *context, const char *domain)
+{
+    ErrorReturnContext *ercontext;
+    ErrorData  *edata;
+
+    /*
+     * Do we have a context for safe error reporting?  If not, just punt to
+     * errstart().
+     */
+    if (context == NULL || !IsA(context, ErrorReturnContext))
+        return errstart(ERROR, domain);
+
+    /* Report that an error was detected */
+    ercontext = (ErrorReturnContext *) context;
+    ercontext->error_occurred = true;
+
+    /* Nothing else to do if caller wants no further details */
+    if (!ercontext->details_please)
+        return false;
+
+    /*
+     * Okay, crank up a stack entry to store the info in.
+     */
+
+    recursion_depth++;
+    if (++errordata_stack_depth >= ERRORDATA_STACK_SIZE)
+    {
+        /*
+         * Wups, stack not big enough.  We treat this as a PANIC condition
+         * because it suggests an infinite loop of errors during error
+         * recovery.
+         */
+        errordata_stack_depth = -1; /* make room on stack */
+        ereport(PANIC, (errmsg_internal("ERRORDATA_STACK_SIZE exceeded")));
+    }
+
+    /* Initialize data for this error frame */
+    edata = &errordata[errordata_stack_depth];
+    MemSet(edata, 0, sizeof(ErrorData));
+    edata->elevel = LOG;        /* signal all is well to ereturn_finish */
+    /* the default text domain is the backend's */
+    edata->domain = domain ? domain : PG_TEXTDOMAIN("postgres");
+    /* initialize context_domain the same way (see set_errcontext_domain()) */
+    edata->context_domain = edata->domain;
+    /* Select default errcode based on the assumed elevel of ERROR */
+    edata->sqlerrcode = ERRCODE_INTERNAL_ERROR;
+    /* errno is saved here so that error parameter eval can't change it */
+    edata->saved_errno = errno;
+
+    /*
+     * Any allocations for this error state level should go into the caller's
+     * context.  We don't need to pollute ErrorContext, or even require it to
+     * exist, in this code path.
+     */
+    edata->assoc_context = CurrentMemoryContext;
+
+    recursion_depth--;
+    return true;
+}
+
+/*
+ * ereturn_finish --- end a "safe" error-reporting cycle
+ *
+ * If ereturn_start() decided this was a regular error, behave as
+ * errfinish().  Otherwise, package up the error details and save
+ * them in the ErrorReturnContext node.
+ */
+void
+ereturn_finish(void *context, const char *filename, int lineno,
+               const char *funcname)
+{
+    ErrorReturnContext *ercontext = (ErrorReturnContext *) context;
+    ErrorData  *edata = &errordata[errordata_stack_depth];
+
+    /* verify stack depth before accessing *edata */
+    CHECK_STACK_DEPTH();
+
+    /*
+     * If ereturn_start punted to errstart, then elevel will be ERROR or
+     * perhaps even PANIC.  Punt likewise to errfinish.
+     */
+    if (edata->elevel >= ERROR)
+        errfinish(filename, lineno, funcname);
+
+    /*
+     * Else, we should package up the stack entry contents and deliver them to
+     * the caller.
+     */
+    recursion_depth++;
+
+    /* Save the last few bits of error state into the stack entry */
+    if (filename)
+    {
+        const char *slash;
+
+        /* keep only base name, useful especially for vpath builds */
+        slash = strrchr(filename, '/');
+        if (slash)
+            filename = slash + 1;
+        /* Some Windows compilers use backslashes in __FILE__ strings */
+        slash = strrchr(filename, '\\');
+        if (slash)
+            filename = slash + 1;
+    }
+
+    edata->filename = filename;
+    edata->lineno = lineno;
+    edata->funcname = funcname;
+    edata->elevel = ERROR;        /* hide the LOG value used above */
+
+    /*
+     * We skip calling backtrace and context functions, which are more likely
+     * to cause trouble than provide useful context; they might act on the
+     * assumption that a transaction abort is about to occur.
+     */
+
+    /*
+     * Make a copy of the error info for the caller.  All the subsidiary
+     * strings are already in the caller's context, so it's sufficient to
+     * flat-copy the stack entry.
+     */
+    ercontext->error_data = palloc_object(ErrorData);
+    memcpy(ercontext->error_data, edata, sizeof(ErrorData));
+
+    /* Exit error-handling context */
+    errordata_stack_depth--;
+    recursion_depth--;
+}
+
+
 /*
  * errcode --- add SQLSTATE error code to the current error
  *
diff --git a/src/backend/utils/fmgr/README b/src/backend/utils/fmgr/README
index 49845f67ac..4aa1ce6d28 100644
--- a/src/backend/utils/fmgr/README
+++ b/src/backend/utils/fmgr/README
@@ -267,6 +267,66 @@ See windowapi.h for more information.
 information about the context of the CALL statement, particularly
 whether it is within an "atomic" execution context.

+* Some callers of datatype input functions (and in future perhaps
+other classes of functions) pass an instance of ErrorReturnContext.
+This indicates that the caller wishes to handle "safe" errors without
+a transaction-terminating exception being thrown: instead, the callee
+should store information about the error cause in the ErrorReturnContext
+struct and return a dummy result value.  Further details appear in
+"Handling Non-Exception Errors" below.
+
+
+Handling Non-Exception Errors
+-----------------------------
+
+Postgres' standard mechanism for reporting errors (ereport() or elog())
+is used for all sorts of error conditions.  This means that throwing
+an exception via ereport(ERROR) requires an expensive transaction or
+subtransaction abort and cleanup, since the exception catcher dare not
+make many assumptions about what has gone wrong.  There are situations
+where we would rather have a lighter-weight mechanism for dealing
+with errors that are known to be safe to recover from without a full
+transaction cleanup.  SQL-callable functions can support this need
+using the ErrorReturnContext context mechanism.
+
+To report a "safe" error, a SQL-callable function should call
+    ereturn(fcinfo->context, ...)
+where it would previously have done
+    ereport(ERROR, ...)
+If the passed "context" is NULL or is not an ErrorReturnContext node,
+then ereturn behaves precisely as ereport(ERROR): the exception is
+thrown via longjmp, so that control does not return.  If "context"
+is an ErrorReturnContext node, then the error information included in
+ereturn's subsidiary reporting calls is stored into the context node
+and control returns normally.  The function should then return a dummy
+value to its caller.  (SQL NULL is recommendable as the dummy value;
+but anything will do, since the caller is expected to ignore the
+function's return value once it sees that an error has been reported
+in the ErrorReturnContext node.)
+
+Considering datatype input functions as examples, typical "safe" error
+conditions include input syntax errors and out-of-range values.  An input
+function typically detects these cases with simple if-tests and can easily
+change the following ereport calls to ereturns.  Error conditions that
+should NOT be handled this way include out-of-memory, internal errors, and
+anything where there is any question about our ability to continue normal
+processing of the transaction.  Those should still be thrown with ereport.
+Because of this restriction, it's typically not necessary to pass the
+error context pointer down very far, as errors reported by palloc or
+other low-level functions are typically reasonable to consider internal.
+
+Because no transaction cleanup will occur, a function that is exiting
+after ereturn() returns normally still bears responsibility for resource
+cleanup.  It is not necessary to be concerned about small leakages of
+palloc'd memory, since the caller should be running the function in a
+short-lived memory context.  However, resources such as locks, open files,
+or buffer pins must be closed out cleanly, as they would be in the
+non-error code path.
+
+Conventions for callers that use the ErrorReturnContext mechanism
+to trap errors are discussed with the declaration of that struct,
+in nodes.h.
+

 Functions Accepting or Returning Sets
 -------------------------------------
diff --git a/src/backend/utils/fmgr/fmgr.c b/src/backend/utils/fmgr/fmgr.c
index 3c210297aa..d200b9c296 100644
--- a/src/backend/utils/fmgr/fmgr.c
+++ b/src/backend/utils/fmgr/fmgr.c
@@ -1548,6 +1548,63 @@ InputFunctionCall(FmgrInfo *flinfo, char *str, Oid typioparam, int32 typmod)
     return result;
 }

+/*
+ * Call a previously-looked-up datatype input function, with non-exception
+ * handling of "safe" errors.
+ *
+ * This is the same as InputFunctionCall, but the caller also passes a
+ * previously-initialized ErrorReturnContext node.  (We declare that as
+ * "void *" to avoid including nodes.h in fmgr.h, but it had better be an
+ * ErrorReturnContext.)  Any "safe" errors detected by the input function
+ * will be reported by filling the ercontext struct.  The caller must
+ * check ercontext->error_occurred before assuming that the function result
+ * is meaningful.
+ */
+Datum
+InputFunctionCallSafe(FmgrInfo *flinfo, char *str,
+                      Oid typioparam, int32 typmod,
+                      void *ercontext)
+{
+    LOCAL_FCINFO(fcinfo, 3);
+    Datum        result;
+
+    Assert(IsA(ercontext, ErrorReturnContext));
+
+    if (str == NULL && flinfo->fn_strict)
+        return (Datum) 0;        /* just return null result */
+
+    InitFunctionCallInfoData(*fcinfo, flinfo, 3, InvalidOid, ercontext, NULL);
+
+    fcinfo->args[0].value = CStringGetDatum(str);
+    fcinfo->args[0].isnull = false;
+    fcinfo->args[1].value = ObjectIdGetDatum(typioparam);
+    fcinfo->args[1].isnull = false;
+    fcinfo->args[2].value = Int32GetDatum(typmod);
+    fcinfo->args[2].isnull = false;
+
+    result = FunctionCallInvoke(fcinfo);
+
+    /* Result value is garbage, and could be null, if an error was reported */
+    if (((ErrorReturnContext *) ercontext)->error_occurred)
+        return (Datum) 0;
+
+    /* Otherwise, should get null result if and only if str is NULL */
+    if (str == NULL)
+    {
+        if (!fcinfo->isnull)
+            elog(ERROR, "input function %u returned non-NULL",
+                 flinfo->fn_oid);
+    }
+    else
+    {
+        if (fcinfo->isnull)
+            elog(ERROR, "input function %u returned NULL",
+                 flinfo->fn_oid);
+    }
+
+    return result;
+}
+
 /*
  * Call a previously-looked-up datatype output function.
  *
diff --git a/src/include/fmgr.h b/src/include/fmgr.h
index 380a82b9de..a95bad117f 100644
--- a/src/include/fmgr.h
+++ b/src/include/fmgr.h
@@ -700,6 +700,9 @@ extern Datum OidFunctionCall9Coll(Oid functionId, Oid collation,
 /* Special cases for convenient invocation of datatype I/O functions. */
 extern Datum InputFunctionCall(FmgrInfo *flinfo, char *str,
                                Oid typioparam, int32 typmod);
+extern Datum InputFunctionCallSafe(FmgrInfo *flinfo, char *str,
+                                   Oid typioparam, int32 typmod,
+                                   void *ercontext);
 extern Datum OidInputFunctionCall(Oid functionId, char *str,
                                   Oid typioparam, int32 typmod);
 extern char *OutputFunctionCall(FmgrInfo *flinfo, Datum val);
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 1f33902947..7979aea16d 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -430,4 +430,31 @@ typedef enum LimitOption
     LIMIT_OPTION_DEFAULT,        /* No limit present */
 } LimitOption;

+/*
+ * ErrorReturnContext -
+ *    function call context node for handling of "safe" errors
+ *
+ * A caller wishing to trap "safe" errors must initialize a struct like this
+ * with all fields zero/NULL except for the NodeTag.  Optionally, set
+ * details_please = true if more than the bare knowledge that a "safe" error
+ * occurred is required.  After calling code that might report an error this
+ * way, check error_occurred to see if an error happened.  If so, and if
+ * details_please is true, error_data has been filled with error details
+ * (stored in the callee's memory context!).  FreeErrorData() can be called
+ * to release error_data, although this step is typically not necessary
+ * if the called code was run in a short-lived context.
+ *
+ * nodes.h isn't a great place for this, but neither elog.h nor fmgr.h
+ * should depend on nodes.h, so we don't really have a better option.
+ */
+typedef struct ErrorReturnContext
+{
+    pg_node_attr(nodetag_only)    /* this is not a member of parse trees */
+
+    NodeTag        type;
+    bool        error_occurred; /* set to true if we detect a "safe" error */
+    bool        details_please; /* does caller want more info than that? */
+    struct ErrorData *error_data;    /* details of error, if so */
+} ErrorReturnContext;
+
 #endif                            /* NODES_H */
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index f107a818e8..2c8dd3d633 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -235,6 +235,46 @@ extern int    getinternalerrposition(void);
     ereport(elevel, errmsg_internal(__VA_ARGS__))


+/*----------
+ * Support for reporting "safe" errors that don't require a full transaction
+ * abort to clean up.  This is to be used in this way:
+ *        ereturn(context,
+ *                errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+ *                errmsg("invalid input syntax for type %s: \"%s\"",
+ *                       "boolean", in_str),
+ *                ... other errxxx() fields as needed ...);
+ *
+ * "context" is a node pointer or NULL, and the remaining auxiliary calls
+ * provide the same error details as in ereport().  If context is not a
+ * pointer to an ErrorReturnContext node, then ereturn(context, ...)
+ * behaves identically to ereport(ERROR, ...).  If context is a pointer
+ * to an ErrorReturnContext node, then the information provided by the
+ * auxiliary calls is stored in the context node and control returns
+ * normally.  The caller of ereturn() must then do any required cleanup
+ * and return control back to its caller.  That caller must check the
+ * ErrorReturnContext node to see whether an error occurred before
+ * it can trust the function's result to be meaningful.
+ *
+ * ereturn_domain() allows a message domain to be specified; it is
+ * precisely analogous to ereport_domain().
+ *----------
+ */
+#define ereturn_domain(context, domain, ...)    \
+    do { \
+        void *context_ = (context); \
+        pg_prevent_errno_in_scope(); \
+        if (ereturn_start(context_, domain)) \
+            __VA_ARGS__, ereturn_finish(context_, __FILE__, __LINE__, __func__); \
+    } while(0)
+
+#define ereturn(context, ...)    \
+    ereturn_domain(context, TEXTDOMAIN, __VA_ARGS__)
+
+extern bool ereturn_start(void *context, const char *domain);
+extern void ereturn_finish(void *context, const char *filename, int lineno,
+                           const char *funcname);
+
+
 /* Support for constructing error strings separately from ereport() calls */

 extern void pre_format_elog_string(int errnumber, const char *domain);
diff --git a/src/backend/utils/adt/bool.c b/src/backend/utils/adt/bool.c
index cd7335287f..d78f862421 100644
--- a/src/backend/utils/adt/bool.c
+++ b/src/backend/utils/adt/bool.c
@@ -148,12 +148,12 @@ boolin(PG_FUNCTION_ARGS)
     if (parse_bool_with_len(str, len, &result))
         PG_RETURN_BOOL(result);

-    ereport(ERROR,
+    ereturn(fcinfo->context,
             (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
              errmsg("invalid input syntax for type %s: \"%s\"",
                     "boolean", in_str)));

-    /* not reached */
+    /* dummy result */
     PG_RETURN_BOOL(false);
 }

diff --git a/src/backend/utils/adt/int.c b/src/backend/utils/adt/int.c
index 42ddae99ef..e1837bee71 100644
--- a/src/backend/utils/adt/int.c
+++ b/src/backend/utils/adt/int.c
@@ -291,7 +291,7 @@ int4in(PG_FUNCTION_ARGS)
 {
     char       *num = PG_GETARG_CSTRING(0);

-    PG_RETURN_INT32(pg_strtoint32(num));
+    PG_RETURN_INT32(pg_strtoint32_safe(num, fcinfo->context));
 }

 /*
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 834ec0b588..7050d5825d 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -164,8 +164,11 @@ invalid_syntax:
 /*
  * Convert input string to a signed 32 bit integer.
  *
- * Allows any number of leading or trailing whitespace characters. Will throw
- * ereport() upon bad input format or overflow.
+ * Allows any number of leading or trailing whitespace characters.
+ *
+ * pg_strtoint32() will throw ereport() upon bad input format or overflow;
+ * while pg_strtoint32_safe() instead returns such complaints in *ercontext,
+ * if it's an ErrorReturnContext.
  *
  * NB: Accumulate input as a negative number, to deal with two's complement
  * representation of the most negative number, which can't be represented as a
@@ -173,6 +176,12 @@ invalid_syntax:
  */
 int32
 pg_strtoint32(const char *s)
+{
+    return pg_strtoint32_safe(s, NULL);
+}
+
+int32
+pg_strtoint32_safe(const char *s, Node *ercontext)
 {
     const char *ptr = s;
     int32        tmp = 0;
@@ -223,18 +232,18 @@ pg_strtoint32(const char *s)
     return tmp;

 out_of_range:
-    ereport(ERROR,
+    ereturn(ercontext,
             (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
              errmsg("value \"%s\" is out of range for type %s",
                     s, "integer")));
+    return 0;                    /* dummy result */

 invalid_syntax:
-    ereport(ERROR,
+    ereturn(ercontext,
             (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
              errmsg("invalid input syntax for type %s: \"%s\"",
                     "integer", s)));
-
-    return 0;                    /* keep compiler quiet */
+    return 0;                    /* dummy result */
 }

 /*
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 81631f1645..894e3d8ba9 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -45,6 +45,7 @@ extern int    namestrcmp(Name name, const char *str);
 /* numutils.c */
 extern int16 pg_strtoint16(const char *s);
 extern int32 pg_strtoint32(const char *s);
+extern int32 pg_strtoint32_safe(const char *s, Node *ercontext);
 extern int64 pg_strtoint64(const char *s);
 extern int    pg_itoa(int16 i, char *a);
 extern int    pg_ultoa_n(uint32 value, char *a);
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index c25b52d0cb..462e4d338b 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -42,6 +42,8 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     FORCE_QUOTE { ( <replaceable class="parameter">column_name</replaceable> [, ...] ) | * }
     FORCE_NOT_NULL ( <replaceable class="parameter">column_name</replaceable> [, ...] )
     FORCE_NULL ( <replaceable class="parameter">column_name</replaceable> [, ...] )
+    NULL_ON_ERROR [ <replaceable class="parameter">boolean</replaceable> ]
+    WARN_ON_ERROR [ <replaceable class="parameter">boolean</replaceable> ]
     ENCODING '<replaceable class="parameter">encoding_name</replaceable>'
 </synopsis>
  </refsynopsisdiv>
@@ -356,6 +358,27 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     </listitem>
    </varlistentry>

+   <varlistentry>
+    <term><literal>NULL_ON_ERROR</literal></term>
+    <listitem>
+     <para>
+      Requests silently replacing any erroneous input values with
+      <literal>NULL</literal>.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term><literal>WARN_ON_ERROR</literal></term>
+    <listitem>
+     <para>
+      Requests replacing any erroneous input values with
+      <literal>NULL</literal>, and emitting a warning message instead of
+      the usual error.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal>ENCODING</literal></term>
     <listitem>
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index db4c9dbc23..d224167111 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -409,6 +409,7 @@ ProcessCopyOptions(ParseState *pstate,
     bool        format_specified = false;
     bool        freeze_specified = false;
     bool        header_specified = false;
+    bool        on_error_specified = false;
     ListCell   *option;

     /* Support external use for option sanity checking */
@@ -520,6 +521,20 @@ ProcessCopyOptions(ParseState *pstate,
                                 defel->defname),
                          parser_errposition(pstate, defel->location)));
         }
+        else if (strcmp(defel->defname, "null_on_error") == 0)
+        {
+            if (on_error_specified)
+                errorConflictingDefElem(defel, pstate);
+            on_error_specified = true;
+            opts_out->null_on_error = defGetBoolean(defel);
+        }
+        else if (strcmp(defel->defname, "warn_on_error") == 0)
+        {
+            if (on_error_specified)
+                errorConflictingDefElem(defel, pstate);
+            on_error_specified = true;
+            opts_out->warn_on_error = defGetBoolean(defel);
+        }
         else if (strcmp(defel->defname, "convert_selectively") == 0)
         {
             /*
@@ -701,6 +716,30 @@ ProcessCopyOptions(ParseState *pstate,
         ereport(ERROR,
                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                  errmsg("CSV quote character must not appear in the NULL specification")));
+
+    /*
+     * The XXX_ON_ERROR options are only supported for input, and only in text
+     * modes.  We could in future extend safe-errors support to datatype
+     * receive functions, but it'd take a lot more work.  Moreover, it's not
+     * clear that receive functions can detect errors very well, so the
+     * feature likely wouldn't work terribly well.
+     */
+    if (opts_out->null_on_error && !is_from)
+        ereport(ERROR,
+                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                 errmsg("COPY NULL_ON_ERROR only available using COPY FROM")));
+    if (opts_out->null_on_error && opts_out->binary)
+        ereport(ERROR,
+                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                 errmsg("cannot specify NULL_ON_ERROR in BINARY mode")));
+    if (opts_out->warn_on_error && !is_from)
+        ereport(ERROR,
+                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                 errmsg("COPY WARN_ON_ERROR only available using COPY FROM")));
+    if (opts_out->warn_on_error && opts_out->binary)
+        ereport(ERROR,
+                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                 errmsg("cannot specify WARN_ON_ERROR in BINARY mode")));
 }

 /*
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 504afcb811..a268ca05d0 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1599,6 +1599,15 @@ BeginCopyFrom(ParseState *pstate,
         }
     }

+    /* For the XXX_ON_ERROR options, we'll need an ErrorReturnContext */
+    if (cstate->opts.null_on_error ||
+        cstate->opts.warn_on_error)
+    {
+        cstate->er_context = makeNode(ErrorReturnContext);
+        /* Error details are only needed for warnings */
+        if (cstate->opts.warn_on_error)
+            cstate->er_context->details_please = true;
+    }

     /* initialize progress */
     pgstat_progress_start_command(PROGRESS_COMMAND_COPY,
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index 097414ef12..901cbea030 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -876,6 +876,7 @@ NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
         char      **field_strings;
         ListCell   *cur;
         int            fldct;
+        bool        safe_mode;
         int            fieldno;
         char       *string;

@@ -889,6 +890,8 @@ NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
                     (errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
                      errmsg("extra data after last expected column")));

+        safe_mode = cstate->opts.null_on_error || cstate->opts.warn_on_error;
+
         fieldno = 0;

         /* Loop to read the user attributes on the line. */
@@ -938,12 +941,50 @@ NextCopyFrom(CopyFromState cstate, ExprContext *econtext,

             cstate->cur_attname = NameStr(att->attname);
             cstate->cur_attval = string;
-            values[m] = InputFunctionCall(&in_functions[m],
-                                          string,
-                                          typioparams[m],
-                                          att->atttypmod);
-            if (string != NULL)
-                nulls[m] = false;
+
+            if (safe_mode)
+            {
+                ErrorReturnContext *er_context = cstate->er_context;
+
+                /* Must reset the error_occurred flag each time */
+                er_context->error_occurred = false;
+
+                values[m] = InputFunctionCallSafe(&in_functions[m],
+                                                  string,
+                                                  typioparams[m],
+                                                  att->atttypmod,
+                                                  er_context);
+                if (er_context->error_occurred)
+                {
+                    /* nulls[m] is already true */
+                    if (cstate->opts.warn_on_error)
+                    {
+                        ErrorData  *edata = er_context->error_data;
+
+                        /* Note that our errcontext callback wasn't used */
+                        ereport(WARNING,
+                                errcode(edata->sqlerrcode),
+                                errmsg_internal("invalid input for column %s: %s",
+                                                cstate->cur_attname,
+                                                edata->message),
+                                errcontext("COPY %s, line %llu",
+                                           cstate->cur_relname,
+                                           (unsigned long long) cstate->cur_lineno));
+                    }
+                }
+                else if (string != NULL)
+                    nulls[m] = false;
+            }
+            else
+            {
+                values[m] = InputFunctionCall(&in_functions[m],
+                                              string,
+                                              typioparams[m],
+                                              att->atttypmod);
+                if (string != NULL)
+                    nulls[m] = false;
+            }
+
             cstate->cur_attname = NULL;
             cstate->cur_attval = NULL;
         }
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index b77b935005..ee38bd0e28 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -57,6 +57,8 @@ typedef struct CopyFormatOptions
     bool       *force_notnull_flags;    /* per-column CSV FNN flags */
     List       *force_null;        /* list of column names */
     bool       *force_null_flags;    /* per-column CSV FN flags */
+    bool        null_on_error;    /* replace erroneous inputs with NULL? */
+    bool        warn_on_error;    /* ... and warn about it? */
     bool        convert_selectively;    /* do selective binary conversion? */
     List       *convert_select; /* list of column names (can be NIL) */
 } CopyFormatOptions;
diff --git a/src/include/commands/copyfrom_internal.h b/src/include/commands/copyfrom_internal.h
index 8d9cc5accd..d578c73107 100644
--- a/src/include/commands/copyfrom_internal.h
+++ b/src/include/commands/copyfrom_internal.h
@@ -97,6 +97,7 @@ typedef struct CopyFromStateData
     int           *defmap;            /* array of default att numbers */
     ExprState **defexprs;        /* array of default att expressions */
     bool        volatile_defexprs;    /* is any of defexprs volatile? */
+    ErrorReturnContext *er_context; /* used for XXX_ON_ERROR options */
     List       *range_table;
     ExprState  *qualexpr;

diff --git a/src/test/regress/expected/copy.out b/src/test/regress/expected/copy.out
index 3fad1c52d1..fa095ec52d 100644
--- a/src/test/regress/expected/copy.out
+++ b/src/test/regress/expected/copy.out
@@ -240,3 +240,25 @@ SELECT * FROM header_copytest ORDER BY a;
 (5 rows)

 drop table header_copytest;
+-- "safe" error handling
+create table on_error_copytest(i int, b bool);
+copy on_error_copytest from stdin with (null_on_error);
+copy on_error_copytest from stdin with (warn_on_error);
+WARNING:  invalid input for column b: invalid input syntax for type boolean: "b"
+WARNING:  invalid input for column i: invalid input syntax for type integer: "err"
+WARNING:  invalid input for column i: invalid input syntax for type integer: "bad"
+WARNING:  invalid input for column b: invalid input syntax for type boolean: "z"
+select * from on_error_copytest;
+ i | b
+---+---
+ 1 |
+   | t
+ 2 | f
+   |
+ 3 | f
+ 4 |
+   | t
+   |
+(8 rows)
+
+drop table on_error_copytest;
diff --git a/src/test/regress/sql/copy.sql b/src/test/regress/sql/copy.sql
index 285022e07c..2d804ad3af 100644
--- a/src/test/regress/sql/copy.sql
+++ b/src/test/regress/sql/copy.sql
@@ -268,3 +268,23 @@ a    c    b

 SELECT * FROM header_copytest ORDER BY a;
 drop table header_copytest;
+
+-- "safe" error handling
+create table on_error_copytest(i int, b bool);
+
+copy on_error_copytest from stdin with (null_on_error);
+1    a
+err    1
+2    f
+bad    x
+\.
+
+copy on_error_copytest from stdin with (warn_on_error);
+3    0
+4    b
+err    t
+bad    z
+\.
+
+select * from on_error_copytest;
+drop table on_error_copytest;
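
The callee/caller contract described in the README hunk above can be tried outside the backend with a small stand-alone mock. This is only a sketch under stated assumptions: MockErrorContext, mock_ereturn(), mock_int4in() and mock_convert_field() are hypothetical names invented for illustration, not part of the patch, and the mock uses a fixed message buffer and exit() where the real code uses ErrorData and ereport(ERROR).

```c
/* Stand-alone mock of the "safe error" pattern; not PostgreSQL code. */
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for ErrorReturnContext (no NodeTag, fixed buffer). */
typedef struct MockErrorContext
{
    bool        error_occurred; /* set when a "safe" error is reported */
    bool        details_please; /* does the caller want the message text? */
    char        message[128];   /* filled only if details_please is true */
} MockErrorContext;

/*
 * Hypothetical stand-in for ereturn().  With no context we fail hard, where
 * the real mechanism would throw via ereport(ERROR).  With a context we
 * record the error and return normally.
 */
void
mock_ereturn(MockErrorContext *ctx, const char *what, const char *input)
{
    if (ctx == NULL)
    {
        fprintf(stderr, "ERROR: %s: \"%s\"\n", what, input);
        exit(1);
    }
    ctx->error_occurred = true;
    if (ctx->details_please)
        snprintf(ctx->message, sizeof(ctx->message), "%s: \"%s\"", what, input);
}

/* Callee: parse an int, reporting bad syntax or overflow as a safe error. */
int
mock_int4in(const char *str, MockErrorContext *ctx)
{
    char       *end;
    long        val;

    errno = 0;
    val = strtol(str, &end, 10);
    if (end == str || *end != '\0')
    {
        mock_ereturn(ctx, "invalid input syntax for type integer", str);
        return 0;               /* dummy result */
    }
    if (errno == ERANGE || val < INT_MIN || val > INT_MAX)
    {
        mock_ereturn(ctx, "value out of range for type integer", str);
        return 0;               /* dummy result */
    }
    return (int) val;
}

/*
 * Caller: convert one field, mapping a safe error to NULL.  Returns true if
 * *value is valid, false if the field should be treated as NULL.
 */
bool
mock_convert_field(const char *str, MockErrorContext *ctx, int *value)
{
    assert(ctx != NULL && value != NULL);
    ctx->error_occurred = false;    /* must reset the flag each time */
    *value = mock_int4in(str, ctx);
    return !ctx->error_occurred;    /* result is garbage after an error */
}
```

A caller would zero-initialize one MockErrorContext, set details_please only if it intends to report the message (as WARN_ON_ERROR does in the patch), and call mock_convert_field() once per field, treating a false return as NULL.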

Re: Error-safe user functions

From

Corey Huinker

Date:

04 December 2022, 03:56:49

> I think there are just a couple of loose ends here:
>
> 1. Bikeshedding on my name choices is welcome. I know Robert is
> dissatisfied with "ereturn", but I'm content with that so I didn't
> change it here.

1. details_please => include_error_data

as this hints the reader directly to the struct to be filled out

2. ereturn_* => errfeedback / error_feedback / efeedback

It is returned, but it's not taking control and the caller could ignore it. I arrived at this after checking https://www.thesaurus.com/browse/report and https://www.thesaurus.com/browse/hint.

> 2. Everybody has struggled with just where to put the declaration
> of the error context structure. The most natural home for it
> probably would be elog.h, but that's out because it cannot depend
> on nodes.h, and the struct has to be a Node type to conform to
> the fmgr safety guidelines. What I've done here is to drop it
> in nodes.h, as we've done with a couple of other hard-to-classify
> node types; but I can't say I'm satisfied with that.
>
> Other plausible answers seem to be:
>
> * Drop it in fmgr.h. The only real problem is that historically
> we've not wanted fmgr.h to depend on nodes.h either. But I'm not
> sure how strong the argument for that really is/was. If we did
> do it like that we could clean up a few kluges, both in this patch
> and pre-existing (fmNodePtr at least could go away).
>
> * Invent a whole new header just for this struct. But then we're
> back to the question of what to call it. Maybe something along the
> lines of utils/elog_extras.h ?

Would moving ErrorReturnContext and the ErrorData struct to their own util/errordata.h allow us to avoid the void pointer for ercontext? If so, that'd be a win because typed pointers give the reader some idea of what is expected there, as well as aiding doxygen-like tools.
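
For what it's worth, plain C already allows a typed pointer without the full struct definition, by way of an incomplete (forward-declared) struct type. The following is a minimal stand-alone sketch of that language feature only; call_input_safe() is a hypothetical name, the struct here omits the NodeTag, and whether this would actually fit the fmgr.h/nodes.h dependency rules is an assumption, not something the patch establishes.

```c
/* Stand-alone illustration of a forward-declared struct; not PostgreSQL code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* --- what a low-level header could contain (assumption) --- */
struct ErrorReturnContext;      /* incomplete type: no definition needed yet */

/* Hypothetical function: the parameter is typed, not "void *" */
int         call_input_safe(const char *str,
                            struct ErrorReturnContext *ercontext);

/* --- what the header that owns the struct would contain --- */
typedef struct ErrorReturnContext
{
    bool        error_occurred; /* set to true if a "safe" error is seen */
    bool        details_please; /* does caller want more info than that? */
} ErrorReturnContext;

/* --- implementation, which sees the full definition --- */
int
call_input_safe(const char *str, struct ErrorReturnContext *ercontext)
{
    assert(ercontext != NULL);
    if (str == NULL || str[0] == '\0')
    {
        ercontext->error_occurred = true;
        return 0;               /* dummy result */
    }
    return (int) str[0];        /* toy "conversion": first character code */
}
```

The declaration compiles with only the forward declaration in scope; only code that dereferences the pointer needs the full definition. This does not help ereturn_start(), whose argument may legitimately be some other node type or NULL.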

Overall this looks like a good foundation.

My own effort was getting bogged down in the number of changes I needed to make in code paths that would never want a failover case for their typecasts, so I'm going to refactor my work on top of this and see where I get stuck.

Re: Error-safe user functions

From

Tom Lane

Date:

04 December 2022, 04:17:08

Corey Huinker <corey.huinker@gmail.com> writes:
> My own effort was getting bogged down in the number of changes I needed to
> make in code paths that would never want a failover case for their
> typecasts, so I'm going to refactor my work on top of this and see where I
> get stuck.

+1, that would be a good way to see if I missed anything.

            regards, tom lane

Re: Error-safe user functions

From

Andrew Dunstan

Date:

04 December 2022, 12:53:15

On 2022-12-03 Sa 16:46, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Great. Let's hope we can get this settled early next week and then we
>> can get to work on the next tranche of functions, those that will let
>> the SQL/JSON work restart.
> OK, here's a draft proposal.  I should start out by acknowledging that
> this steals a great deal from Nikita's original patch as well as yours,
> though I editorialized heavily.
>
> 0001 is the core infrastructure and documentation for the feature.
> (I didn't bother breaking it down further than that.)
>
> 0002 fixes boolin and int4in.  That is the work that we're going to
> have to replicate in an awful lot of places, and I am pleased by how
> short-and-sweet it is.  Of course, stuff like the datetime functions
> might be more complex to adapt.
>
> Then 0003 is a quick-hack version of COPY that is able to exercise
> all this.  I did not bother with the per-column flags as you had
> them, because I'm not sure if they're worth the trouble compared
> to a simple boolean; in any case we can add that refinement later.
> What I did add was a WARN_ON_ERROR option that exercises the ability
> to extract the error message after a soft error.  I'm not proposing
> that as a shippable feature, it's just something for testing.


Overall I think this is pretty good, and I hope we can settle on it quickly.


>
> I think there are just a couple of loose ends here:
>
> 1. Bikeshedding on my name choices is welcome.  I know Robert is
> dissatisfied with "ereturn", but I'm content with that so I didn't
> change it here.


I haven't got anything better than ereturn.

details_please seems more informal than our usual style. details_wanted
maybe?


>
> 2. Everybody has struggled with just where to put the declaration
> of the error context structure.  The most natural home for it
> probably would be elog.h, but that's out because it cannot depend
> on nodes.h, and the struct has to be a Node type to conform to
> the fmgr safety guidelines.  What I've done here is to drop it
> in nodes.h, as we've done with a couple of other hard-to-classify
> node types; but I can't say I'm satisfied with that.
>
> Other plausible answers seem to be:
>
> * Drop it in fmgr.h.  The only real problem is that historically
> we've not wanted fmgr.h to depend on nodes.h either.  But I'm not
> sure how strong the argument for that really is/was.  If we did
> do it like that we could clean up a few kluges, both in this patch
> and pre-existing (fmNodePtr at least could go away).
>
> * Invent a whole new header just for this struct.  But then we're
> back to the question of what to call it.  Maybe something along the
> lines of utils/elog_extras.h ?
>
>             


Maybe a new header misc_nodes.h?

Soon after we get this done I think we'll find we need to extend this to
non-input functions. But that can wait a short while.



cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Error-safe user functions

From

Tom Lane

Date:

04 December 2022, 15:25:23

Andrew Dunstan <andrew@dunslane.net> writes:
> On 2022-12-03 Sa 16:46, Tom Lane wrote:
>> 1. Bikeshedding on my name choices is welcome.  I know Robert is
>> dissatisfied with "ereturn", but I'm content with that so I didn't
>> change it here.

> details_please seems more informal than our usual style. details_wanted
> maybe?

Yeah, Corey didn't like that either.  "details_wanted" works for me.

> Soon after we get this done I think we'll find we need to extend this to
> non-input functions. But that can wait a short while.

I'm curious to know exactly which other use-cases you foresee.
It wouldn't be a bad idea to write some draft code to verify
that this mechanism will work conveniently for them.

            regards, tom lane

Re: Error-safe user functions

From

Andrew Dunstan

Date:

04 December 2022, 16:21:41

On 2022-12-04 Su 10:25, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> On 2022-12-03 Sa 16:46, Tom Lane wrote:
>>> 1. Bikeshedding on my name choices is welcome.  I know Robert is
>>> dissatisfied with "ereturn", but I'm content with that so I didn't
>>> change it here.
>> details_please seems more informal than our usual style. details_wanted
>> maybe?
> Yeah, Corey didn't like that either.  "details_wanted" works for me.
>
>> Soon after we get this done I think we'll find we need to extend this to
>> non-input functions. But that can wait a short while.
> I'm curious to know exactly which other use-cases you foresee.
> It wouldn't be a bad idea to write some draft code to verify
> that this mechanism will work conveniently for them.


The SQL/JSON patches at [1] included fixes for some numeric and datetime
conversion functions as well as various input functions, so that's a
fairly immediate need. More generally, I can see uses for error free
casts, something like, say CAST(foo AS bar ON ERROR blurfl)


cheers


andrew


[1]
https://www.postgresql.org/message-id/f54ebd2b-0e67-d1c6-4ff7-5d732492d1a0%40postgrespro.ru

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Error-safe user functions

From

Vik Fearing

Date:

04 December 2022, 17:01:33

On 12/4/22 17:21, Andrew Dunstan wrote:
> 
> More generally, I can see uses for error free
> casts, something like, say CAST(foo AS bar ON ERROR blurfl)

What I am proposing for inclusion in the standard is basically the same 
as what JSON does:

<cast specification> ::=
CAST <left paren>
     <cast operand> AS <cast target>
     [ FORMAT <cast template> ]
     [ <cast error behavior> ON ERROR ]
     <right paren>

<cast error behavior> ::=
     ERROR
   | NULL
   | DEFAULT <value expression>

Once/If I get that in, I will be pushing to get that syntax in postgres 
as well.
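
As a rough illustration of what the three <cast error behavior>
alternatives could mean for a text-to-integer cast, here is a toy model
in plain C. It is only a sketch: ToyOnError, ToyCastResult and
toy_cast_text_to_int are hypothetical names, and the semantics (NULL
yields a null result, DEFAULT substitutes the supplied value, ERROR
reports a failure) are an assumed reading of the grammar above, not
something the proposal text spells out.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical mirror of <cast error behavior>: ERROR | NULL | DEFAULT. */
typedef enum ToyOnError
{
    TOY_ON_ERROR_ERROR,
    TOY_ON_ERROR_NULL,
    TOY_ON_ERROR_DEFAULT
} ToyOnError;

typedef struct ToyCastResult
{
    bool        failed;     /* set only when the behavior is ERROR */
    bool        isnull;     /* set only when the behavior is NULL */
    int         value;      /* parsed value, or the DEFAULT on failure */
} ToyCastResult;

/* Toy cast: parse a decimal integer, applying the chosen error behavior. */
ToyCastResult
toy_cast_text_to_int(const char *str, ToyOnError behavior, int default_value)
{
    ToyCastResult result = {false, false, 0};
    char       *end;
    long        val;

    errno = 0;
    val = strtol(str, &end, 10);
    if (end != str && *end == '\0' && errno != ERANGE &&
        val >= INT_MIN && val <= INT_MAX)
    {
        /* The cast succeeded; the error behavior is irrelevant. */
        result.value = (int) val;
        return result;
    }

    /* The cast failed; decide what the caller gets back. */
    switch (behavior)
    {
        case TOY_ON_ERROR_NULL:
            result.isnull = true;
            break;
        case TOY_ON_ERROR_DEFAULT:
            result.value = default_value;
            break;
        case TOY_ON_ERROR_ERROR:
            result.failed = true;
            break;
    }
    return result;
}
```

In this model only the ERROR behavior would need to abort the
statement; the other two depend on the conversion being able to report
failure without throwing.
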
-- 
Vik Fearing

Re: Error-safe user functions

From

Michael Paquier

Date:

05 December 2022, 01:17:56

On Sun, Dec 04, 2022 at 06:01:33PM +0100, Vik Fearing wrote:
> Once/If I get that in, I will be pushing to get that syntax in postgres as
> well.

If I may ask, how long would it take to know if this grammar would be
integrated in the standard or not?
--
Michael

Re: Error-safe user functions

From

Robert Haas

Date:

05 December 2022, 15:47:49

On Sat, Dec 3, 2022 at 10:57 PM Corey Huinker <corey.huinker@gmail.com> wrote:
> 2. ereturn_* => errfeedback / error_feedback / feedback

Oh, I like that, especially errfeedback.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Error-safe user functions

From

Tom Lane

Date:

05 December 2022, 16:09:27

Robert Haas <robertmhaas@gmail.com> writes:
> On Sat, Dec 3, 2022 at 10:57 PM Corey Huinker <corey.huinker@gmail.com> wrote:
>> 2. ereturn_* => errfeedback / error_feedback / feedback

> Oh, I like that, especially errfeedback.

efeedback?  But TBH I do not think any of these are better than ereturn.

Whether or not you agree with my position that it'd be best if the new
macro name is the same length as "ereport", I hope we can all agree
that it had better be short.  ereport call nests already tend to contain
quite long lines.  We don't need to add another couple tab-stops worth
of indentation there.

As for it being the same length: if you take a close look at my 0002
patch, you will realize that replacing ereport with a different-length
name will double or triple the number of lines that need to be touched
in many input functions.  Yeah, we could sweep that under the rug to
some extent by submitting non-pgindent'd patches and running a separate
pgindent commit later, but that is not without pain.  I don't want to
go there for the sake of a name that isn't really compellingly The
Right Word, and "feedback" just isn't very compelling IMO.

            regards, tom lane

Re: Error-safe user functions

From

Robert Haas

Date:

05 December 2022, 16:20:02

On Mon, Dec 5, 2022 at 11:09 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > On Sat, Dec 3, 2022 at 10:57 PM Corey Huinker <corey.huinker@gmail.com> wrote:
> >> 2. ereturn_* => errfeedback / error_feedback / feedback
>
> > Oh, I like that, especially errfeedback.
>
> efeedback?  But TBH I do not think any of these are better than ereturn.

I do. Having a macro name that is "return" plus one character is going
to make people think that it returns. I predict that if you insist on
using that name people are still going to be making mistakes based on
that confusion 10 years from now.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Error-safe user functions

From

Andrew Dunstan

Date:

05 December 2022, 16:36:19

On 2022-12-05 Mo 11:20, Robert Haas wrote:
> On Mon, Dec 5, 2022 at 11:09 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> On Sat, Dec 3, 2022 at 10:57 PM Corey Huinker <corey.huinker@gmail.com> wrote:
>>>> 2. ereturn_* => errfeedback / error_feedback / feedback
>>> Oh, I like that, especially errfeedback.
>> efeedback?  But TBH I do not think any of these are better than ereturn.
> I do. Having a macro name that is "return" plus one character is going
> to make people think that it returns. I predict that if you insist on
> using that name people are still going to be making mistakes based on
> that confusion 10 years from now.
>

OK, I take both this point and Tom's about trying to keep it the same
length. So we need something that's  7 letters, doesn't say 'return' and
preferably begins with 'e'. I modestly suggest 'eseterr', or if we like
the 'feedback' idea 'efeedbk'.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Error-safe user functions

From

Joe Conway

Date:

05 December 2022, 16:53:17

On 12/5/22 11:36, Andrew Dunstan wrote:
> 
> On 2022-12-05 Mo 11:20, Robert Haas wrote:
>> On Mon, Dec 5, 2022 at 11:09 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Robert Haas <robertmhaas@gmail.com> writes:
>>>> On Sat, Dec 3, 2022 at 10:57 PM Corey Huinker <corey.huinker@gmail.com> wrote:
>>>>> 2. ereturn_* => errfeedback / error_feedback / feedback
>>>> Oh, I like that, especially errfeedback.
>>> efeedback?  But TBH I do not think any of these are better than ereturn.
>> I do. Having a macro name that is "return" plus one character is going
>> to make people think that it returns. I predict that if you insist on
>> using that name people are still going to be making mistakes based on
>> that confusion 10 years from now.
>>
> 
> OK, I take both this point and Tom's about trying to keep it the same
> length. So we need something that's  7 letters, doesn't say 'return' and
> preferably begins with 'e'. I modestly suggest 'eseterr', or if we like
> the 'feedback' idea 'efeedbk'.

Maybe eretort?

-- 
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: Error-safe user functions

From

Tom Lane

Date:

05 December 2022, 17:09:57

Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Dec 5, 2022 at 11:09 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> efeedback?  But TBH I do not think any of these are better than ereturn.

> I do. Having a macro name that is "return" plus one character is going
> to make people think that it returns.

But it does return, or at least you need to code on the assumption
that it will.  (The cases where it doesn't aren't much different
from any situation where a called subroutine unexpectedly throws
an error.  Callers typically don't have to consider that.)

            regards, tom lane

Re: Error-safe user functions

From

Tom Lane

Date:

05 December 2022, 17:14:44

Joe Conway <mail@joeconway.com> writes:
> On 12/5/22 11:36, Andrew Dunstan wrote:
>> OK, I take both this point and Tom's about trying to keep it the same
>> length. So we need something that's 7 letters, doesn't say 'return' and
>> preferably begins with 'e'. I modestly suggest 'eseterr', or if we like
>> the 'feedback' idea 'efeedbk'.

> Maybe eretort?

Nah, it's so close to ereport that it looks like a typo.  eseterr isn't
awful, perhaps.  Or maybe errXXXX, but I've not thought of suitable XXXX.

            regards, tom lane

Re: Error-safe user functions

From

Robert Haas

Date:

05 December 2022, 17:27:29

On Mon, Dec 5, 2022 at 12:09 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> But it does return, or at least you need to code on the assumption
> that it will.  (The cases where it doesn't aren't much different
> from any situation where a called subroutine unexpectedly throws
> an error.  Callers typically don't have to consider that.)

Are you just trolling me here?

AIUI, the macro never returns in the sense of using the return
statement, unlike PG_RETURN_WHATEVER(), which do. It possibly
transfers control by throwing an error. But that is also true of just
about everything you do in PostgreSQL code, because errors can get
thrown from almost anywhere. So clearly the possibility of a non-local
transfer of control is not the issue here. The issue is the
possibility that there will be NO transfer of control. That is, you
are compelled to write ereturn() and then afterwards you still need a
return statement.

I do not understand how it is possible to sensibly argue that someone
won't see a macro called ereturn() and perhaps come to the false
conclusion that it will always return.
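
The point can be sketched with a toy model in plain C. This is not code
from the patch: ToyErrorContext, toy_soft_error and toy_int_in are
hypothetical stand-ins, and the toy takes a plain message string rather
than ereport-style auxiliary calls. It shows only the control flow under
discussion: the error-recording call comes back to its call site, so the
input function still has to write its own return statement.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the patch's error context struct. */
typedef struct ToyErrorContext
{
    bool        error_occurred;
    const char *message;
} ToyErrorContext;

/*
 * Record the error and return to the call site.  Note that this does
 * NOT return from the function that called it.
 */
void
toy_soft_error(ToyErrorContext *ctx, const char *message)
{
    ctx->error_occurred = true;
    ctx->message = message;
}

/* Toy input function: parse a decimal integer. */
int
toy_int_in(const char *str, ToyErrorContext *ctx)
{
    char       *end;
    long        val;

    errno = 0;
    val = strtol(str, &end, 10);
    if (end == str || *end != '\0')
    {
        toy_soft_error(ctx, "invalid input syntax for type integer");
        return 0;               /* still required: control came back here */
    }
    if (errno == ERANGE || val < INT_MIN || val > INT_MAX)
    {
        toy_soft_error(ctx, "value out of range for type integer");
        return 0;               /* dummy value; the caller checks ctx */
    }
    return (int) val;
}
```

A caller would test ctx.error_occurred after the call and ignore the
dummy return value when it is set.
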

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Error-safe user functions

From

Tom Lane

Date:

05 December 2022, 17:27:42

I wrote:
> Nah, it's so close to ereport that it looks like a typo.  eseterr isn't
> awful, perhaps.  Or maybe errXXXX, but I've not thought of suitable XXXX.

... "errsave", maybe?

            regards, tom lane

Re: Error-safe user functions

From

Robert Haas

Date:

05 December 2022, 17:35:34

On Mon, Dec 5, 2022 at 12:27 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
> > Nah, it's so close to ereport that it looks like a typo.  eseterr isn't
> > awful, perhaps.  Or maybe errXXXX, but I've not thought of suitable XXXX.
>
> ... "errsave", maybe?

eseterr or errsave seem totally fine to me, FWIW.

I would probably choose a more verbose name if I were doing it, but I
do get the point that keeping line lengths reasonable is important,
and if someone were to accuse me of excessive prolixity, I would be
unable to mount much of a defense.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Error-safe user functions

From

Joe Conway

Date:

05 December 2022, 17:38:45

On 12/5/22 12:35, Robert Haas wrote:
> On Mon, Dec 5, 2022 at 12:27 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I wrote:
>> > Nah, it's so close to ereport that it looks like a typo.  eseterr isn't
>> > awful, perhaps.  Or maybe errXXXX, but I've not thought of suitable XXXX.
>>
>> ... "errsave", maybe?
> 
> eseterr or errsave seem totally fine to me, FWIW.

+1

> I would probably choose a more verbose name if I were doing it, but I
> do get the point that keeping line lengths reasonable is important,
> and if someone were to accuse me of excessive prolixity, I would be
> unable to mount much of a defense.

prolixity -- nice word! I won't comment on its applicability to you in 
particular ;-P

-- 
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: Error-safe user functions

From

Alvaro Herrera

Date:

05 December 2022, 17:42:31

On 2022-Dec-05, Tom Lane wrote:

> I wrote:
> > Nah, it's so close to ereport that it looks like a typo.  eseterr isn't
> > awful, perhaps.  Or maybe errXXXX, but I've not thought of suitable XXXX.
> 
> ... "errsave", maybe?

IMO eseterr is quite awful while errsave is not, so here goes my vote
for the latter.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
Tom: There seems to be something broken here.
Teodor: I'm in sackcloth and ashes...  Fixed.
        http://archives.postgresql.org/message-id/482D1632.8010507@sigaev.ru

Re: Error-safe user functions

From

Tom Lane

Date:

05 December 2022, 17:44:47

Robert Haas <robertmhaas@gmail.com> writes:
> AIUI, the macro never returns in the sense of using the return
> statement, unlike PG_RETURN_WHATEVER(), which do.

Oh!  Now I see what you don't like about it.  I thought you
meant "return to the call site", not "return to the call site's
caller".  Agreed that that could be confusing.

            regards, tom lane

Re: Error-safe user functions

From

Robert Haas

Date:

05 December 2022, 17:50:47

On Mon, Dec 5, 2022 at 12:44 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > AIUI, the macro never returns in the sense of using the return
> > statement, unlike PG_RETURN_WHATEVER(), which do.
>
> Oh!  Now I see what you don't like about it.  I thought you
> meant "return to the call site", not "return to the call site's
> caller".  Agreed that that could be confusing.

OK, good. I couldn't figure out what in the world we were arguing
about... apparently I wasn't being as clear as I thought I was.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Error-safe user functions

From

Andrew Dunstan

Date:

05 December 2022, 17:51:49

On 2022-12-05 Mo 12:42, Alvaro Herrera wrote:
> On 2022-Dec-05, Tom Lane wrote:
>
>> I wrote:
>>> Nah, it's so close to ereport that it looks like a typo.  eseterr isn't
>>> awful, perhaps.  Or maybe errXXXX, but I've not thought of suitable XXXX.
>> ... "errsave", maybe?
> IMO eseterr is quite awful while errsave is not, so here goes my vote
> for the latter.


Wait a minute!  Oh, no, sorry, as you were, 'errsave' is fine.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Error-safe user functions

From

Tom Lane

Date:

05 December 2022, 18:00:04

Andrew Dunstan <andrew@dunslane.net> writes:
> Wait a minute!  Oh, no, sorry, as you were, 'errsave' is fine.

Seems like everybody's okay with errsave.  I'll make a v2 in a
little bit.  I'd like to try updating array_in and/or record_in
just to verify that indirection cases work okay, before we consider
the design to be set.

            regards, tom lane

Re: Error-safe user functions

From

Corey Huinker

Date:

05 December 2022, 19:22:27

On Mon, Dec 5, 2022 at 11:36 AM Andrew Dunstan <andrew@dunslane.net> wrote:

> On 2022-12-05 Mo 11:20, Robert Haas wrote:
>> On Mon, Dec 5, 2022 at 11:09 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Robert Haas <robertmhaas@gmail.com> writes:
>>>> On Sat, Dec 3, 2022 at 10:57 PM Corey Huinker <corey.huinker@gmail.com> wrote:
>>>>> 2. ereturn_* => errfeedback / error_feedback / feedback
>>>> Oh, I like that, especially errfeedback.
>>> efeedback? But TBH I do not think any of these are better than ereturn.
>> I do. Having a macro name that is "return" plus one character is going
>> to make people think that it returns. I predict that if you insist on
>> using that name people are still going to be making mistakes based on
>> that confusion 10 years from now.
>>
>
> OK, I take both this point and Tom's about trying to keep it the same
> length. So we need something that's 7 letters, doesn't say 'return' and
> preferably begins with 'e'. I modestly suggest 'eseterr', or if we like
> the 'feedback' idea 'efeedbk'.

Consulting https://www.thesaurus.com/browse/feedback again:
ereply clocks in at 7 characters.

Re: Error-safe user functions

From

Corey Huinker

Date:

05 December 2022, 19:23:38

On Mon, Dec 5, 2022 at 1:00 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Andrew Dunstan <andrew@dunslane.net> writes:
>> Wait a minute! Oh, no, sorry, as you were, 'errsave' is fine.
>
> Seems like everybody's okay with errsave. I'll make a v2 in a
> little bit. I'd like to try updating array_in and/or record_in
> just to verify that indirection cases work okay, before we consider
> the design to be set.

+1 to errsave.

Re: Error-safe user functions

From

Andrew Dunstan

Date:

05 December 2022, 19:55:19

On 2022-12-05 Mo 14:22, Corey Huinker wrote:
>
> On Mon, Dec 5, 2022 at 11:36 AM Andrew Dunstan <andrew@dunslane.net>
> wrote:
>
>
>     On 2022-12-05 Mo 11:20, Robert Haas wrote:
>     > On Mon, Dec 5, 2022 at 11:09 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>     >> Robert Haas <robertmhaas@gmail.com> writes:
>     >>> On Sat, Dec 3, 2022 at 10:57 PM Corey Huinker
>     <corey.huinker@gmail.com> wrote:
>     >>>> 2. ereturn_* => errfeedback / error_feedback / feedback
>     >>> Oh, I like that, especially errfeedback.
>     >> efeedback?  But TBH I do not think any of these are better than
>     ereturn.
>     > I do. Having a macro name that is "return" plus one character is
>     going
>     > to make people think that it returns. I predict that if you
>     insist on
>     > using that name people are still going to be making mistakes
>     based on
>     > that confusion 10 years from now.
>     >
>
>     OK, I take both this point and Tom's about trying to keep it the same
>     length. So we need something that's  7 letters, doesn't say
>     'return' and
>     preferably begins with 'e'. I modestly suggest 'eseterr', or if we
>     like
>     the 'feedback' idea 'efeedbk'.
>
>
>
> Consulting https://www.thesaurus.com/browse/feedback again:
> ereply clocks in at 7 characters.


It does?


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Error-safe user functions

From

Tom Lane

Date:

05 December 2022, 21:40:06

I wrote:
> Seems like everybody's okay with errsave.  I'll make a v2 in a
> little bit.  I'd like to try updating array_in and/or record_in
> just to verify that indirection cases work okay, before we consider
> the design to be set.

v2 as promised, incorporating the discussed renamings as well as some
follow-on ones (ErrorReturnContext -> ErrorSaveContext, notably).

I also tried moving the struct into a new header file, miscnodes.h
after Andrew's suggestion upthread.  That seems at least marginally
cleaner than putting it in nodes.h, although I'm not wedded to this
choice.

I was really glad that I took the trouble to update some less-trivial
input functions, because I learned two things:

* It's better if InputFunctionCallSafe will tolerate the case of not
being passed an ErrorSaveContext.  In the COPY hack it felt worthwhile
to have completely separate code paths calling InputFunctionCallSafe
or InputFunctionCall, but that's less appetizing elsewhere.

* There's a crying need for a macro that wraps up errsave() with an
immediate return.  Hence, ereturn() is reborn from the ashes.  I hope
Robert won't object to that name if it *does* do a return.
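
Both points can be modeled in a few lines of standalone C. This is a
sketch, not the patch's code: every toy_* name is hypothetical, the toy
takes a plain message string instead of ereport-style auxiliary calls,
and a longjmp to a single static jmp_buf stands in for the exception a
real ereport(ERROR) would throw. With a context the error is saved and
control returns; with a NULL context the same call behaves as a hard
error, so one code path serves both kinds of caller.

```c
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for ErrorSaveContext. */
typedef struct ToySaveContext
{
    bool        error_occurred;
    const char *message;
} ToySaveContext;

/* Stand-in for the exception stack: where a "hard" error jumps to. */
static jmp_buf toy_error_jmp;

/*
 * Toy errsave: with a context, record the error and return normally;
 * without one, behave like a thrown error (here, a longjmp).
 */
void
toy_errsave(ToySaveContext *ctx, const char *message)
{
    if (ctx == NULL)
        longjmp(toy_error_jmp, 1);
    ctx->error_occurred = true;
    ctx->message = message;
}

/* Toy ereturn: save the error, then return dummy_value from the caller. */
#define toy_ereturn(ctx, dummy_value, message) \
    do { \
        toy_errsave(ctx, message); \
        return dummy_value; \
    } while (0)

/* Toy input function accepting only "true" and "false". */
bool
toy_bool_in(const char *str, ToySaveContext *ctx)
{
    if (strcmp(str, "true") == 0)
        return true;
    if (strcmp(str, "false") == 0)
        return false;
    toy_ereturn(ctx, false, "invalid input syntax for type boolean");
}

/*
 * Call the input function with no context, as an old-style caller would.
 * Returns 1 if a hard error was "thrown", 0 otherwise.
 */
int
toy_call_without_context(const char *str)
{
    if (setjmp(toy_error_jmp) != 0)
        return 1;
    (void) toy_bool_in(str, NULL);
    return 0;
}
```

The macro keeps the common "report and bail out" case to a single
statement, while toy_errsave stays available for functions that must
release resources before returning.
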

I feel pretty good about this version; it seems committable if there
are not objections.  Not sure if we should commit 0003 like this,
though.

            regards, tom lane

diff --git a/doc/src/sgml/ref/create_type.sgml b/doc/src/sgml/ref/create_type.sgml
index 693423e524..53b8d15f97 100644
--- a/doc/src/sgml/ref/create_type.sgml
+++ b/doc/src/sgml/ref/create_type.sgml
@@ -900,6 +900,15 @@ CREATE TYPE <replaceable class="parameter">name</replaceable>
    function is written in C.
   </para>

+  <para>
+   In <productname>PostgreSQL</productname> version 16 and later, it is
+   desirable for base types' input functions to return <quote>safe</quote>
+   errors using the new <function>errsave()</function> mechanism, rather
+   than throwing <function>ereport()</function> exceptions as in previous
+   versions.  See <filename>src/backend/utils/fmgr/README</filename> for
+   more information.
+  </para>
+
  </refsect1>

  <refsect1>
diff --git a/src/backend/nodes/Makefile b/src/backend/nodes/Makefile
index 4368c30fdb..7c594be583 100644
--- a/src/backend/nodes/Makefile
+++ b/src/backend/nodes/Makefile
@@ -56,6 +56,7 @@ node_headers = \
     nodes/bitmapset.h \
     nodes/extensible.h \
     nodes/lockoptions.h \
+    nodes/miscnodes.h \
     nodes/replnodes.h \
     nodes/supportnodes.h \
     nodes/value.h \
diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl
index 7212bc486f..08992dfd47 100644
--- a/src/backend/nodes/gen_node_support.pl
+++ b/src/backend/nodes/gen_node_support.pl
@@ -68,6 +68,7 @@ my @all_input_files = qw(
   nodes/bitmapset.h
   nodes/extensible.h
   nodes/lockoptions.h
+  nodes/miscnodes.h
   nodes/replnodes.h
   nodes/supportnodes.h
   nodes/value.h
@@ -89,6 +90,7 @@ my @nodetag_only_files = qw(
   executor/tuptable.h
   foreign/fdwapi.h
   nodes/lockoptions.h
+  nodes/miscnodes.h
   nodes/replnodes.h
   nodes/supportnodes.h
 );
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 2585e24845..81727ecb28 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -71,6 +71,7 @@
 #include "libpq/libpq.h"
 #include "libpq/pqformat.h"
 #include "mb/pg_wchar.h"
+#include "nodes/miscnodes.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/bgworker.h"
@@ -686,6 +687,154 @@ errfinish(const char *filename, int lineno, const char *funcname)
 }


+/*
+ * errsave_start --- begin a "safe" error-reporting cycle
+ *
+ * If "context" isn't an ErrorSaveContext node, this behaves as
+ * errstart(ERROR, domain), and the errsave() macro ends up acting
+ * exactly like ereport(ERROR, ...).
+ *
+ * If "context" is an ErrorSaveContext node, but the node creator only wants
+ * notification of the fact of a safe error without any details, just set
+ * the error_occurred flag in the ErrorSaveContext node and return false,
+ * which will cause us to skip the remaining error processing steps.
+ *
+ * Otherwise, create and initialize error stack entry and return true.
+ * Subsequently, errmsg() and perhaps other routines will be called to further
+ * populate the stack entry.  Finally, errsave_finish() will be called to
+ * tidy up.
+ */
+bool
+errsave_start(void *context, const char *domain)
+{
+    ErrorSaveContext *escontext;
+    ErrorData  *edata;
+
+    /*
+     * Do we have a context for safe error reporting?  If not, just punt to
+     * errstart().
+     */
+    if (context == NULL || !IsA(context, ErrorSaveContext))
+        return errstart(ERROR, domain);
+
+    /* Report that an error was detected */
+    escontext = (ErrorSaveContext *) context;
+    escontext->error_occurred = true;
+
+    /* Nothing else to do if caller wants no further details */
+    if (!escontext->details_wanted)
+        return false;
+
+    /*
+     * Okay, crank up a stack entry to store the info in.
+     */
+
+    recursion_depth++;
+    if (++errordata_stack_depth >= ERRORDATA_STACK_SIZE)
+    {
+        /*
+         * Wups, stack not big enough.  We treat this as a PANIC condition
+         * because it suggests an infinite loop of errors during error
+         * recovery.
+         */
+        errordata_stack_depth = -1; /* make room on stack */
+        ereport(PANIC, (errmsg_internal("ERRORDATA_STACK_SIZE exceeded")));
+    }
+
+    /* Initialize data for this error frame */
+    edata = &errordata[errordata_stack_depth];
+    MemSet(edata, 0, sizeof(ErrorData));
+    edata->elevel = LOG;        /* signal all is well to errsave_finish */
+    /* the default text domain is the backend's */
+    edata->domain = domain ? domain : PG_TEXTDOMAIN("postgres");
+    /* initialize context_domain the same way (see set_errcontext_domain()) */
+    edata->context_domain = edata->domain;
+    /* Select default errcode based on the assumed elevel of ERROR */
+    edata->sqlerrcode = ERRCODE_INTERNAL_ERROR;
+    /* errno is saved here so that error parameter eval can't change it */
+    edata->saved_errno = errno;
+
+    /*
+     * Any allocations for this error state level should go into the caller's
+     * context.  We don't need to pollute ErrorContext, or even require it to
+     * exist, in this code path.
+     */
+    edata->assoc_context = CurrentMemoryContext;
+
+    recursion_depth--;
+    return true;
+}
+
+/*
+ * errsave_finish --- end a "safe" error-reporting cycle
+ *
+ * If errsave_start() decided this was a regular error, behave as
+ * errfinish().  Otherwise, package up the error details and save
+ * them in the ErrorSaveContext node.
+ */
+void
+errsave_finish(void *context, const char *filename, int lineno,
+               const char *funcname)
+{
+    ErrorSaveContext *escontext = (ErrorSaveContext *) context;
+    ErrorData  *edata = &errordata[errordata_stack_depth];
+
+    /* verify stack depth before accessing *edata */
+    CHECK_STACK_DEPTH();
+
+    /*
+     * If errsave_start punted to errstart, then elevel will be ERROR or
+     * perhaps even PANIC.  Punt likewise to errfinish.
+     */
+    if (edata->elevel >= ERROR)
+        errfinish(filename, lineno, funcname);
+
+    /*
+     * Else, we should package up the stack entry contents and deliver them to
+     * the caller.
+     */
+    recursion_depth++;
+
+    /* Save the last few bits of error state into the stack entry */
+    if (filename)
+    {
+        const char *slash;
+
+        /* keep only base name, useful especially for vpath builds */
+        slash = strrchr(filename, '/');
+        if (slash)
+            filename = slash + 1;
+        /* Some Windows compilers use backslashes in __FILE__ strings */
+        slash = strrchr(filename, '\\');
+        if (slash)
+            filename = slash + 1;
+    }
+
+    edata->filename = filename;
+    edata->lineno = lineno;
+    edata->funcname = funcname;
+    edata->elevel = ERROR;        /* hide the LOG value used above */
+
+    /*
+     * We skip calling backtrace and context functions, which are more likely
+     * to cause trouble than provide useful context; they might act on the
+     * assumption that a transaction abort is about to occur.
+     */
+
+    /*
+     * Make a copy of the error info for the caller.  All the subsidiary
+     * strings are already in the caller's context, so it's sufficient to
+     * flat-copy the stack entry.
+     */
+    escontext->error_data = palloc_object(ErrorData);
+    memcpy(escontext->error_data, edata, sizeof(ErrorData));
+
+    /* Exit error-handling context */
+    errordata_stack_depth--;
+    recursion_depth--;
+}
+
+
 /*
  * errcode --- add SQLSTATE error code to the current error
  *
diff --git a/src/backend/utils/fmgr/README b/src/backend/utils/fmgr/README
index 49845f67ac..aff8f6fb3e 100644
--- a/src/backend/utils/fmgr/README
+++ b/src/backend/utils/fmgr/README
@@ -267,6 +267,70 @@ See windowapi.h for more information.
 information about the context of the CALL statement, particularly
 whether it is within an "atomic" execution context.

+* Some callers of datatype input functions (and in future perhaps
+other classes of functions) pass an instance of ErrorSaveContext.
+This indicates that the caller wishes to handle "safe" errors without
+a transaction-terminating exception being thrown: instead, the callee
+should store information about the error cause in the ErrorSaveContext
+struct and return a dummy result value.  Further details appear in
+"Handling Non-Exception Errors" below.
+
+
+Handling Non-Exception Errors
+-----------------------------
+
+Postgres' standard mechanism for reporting errors (ereport() or elog())
+is used for all sorts of error conditions.  This means that throwing
+an exception via ereport(ERROR) requires an expensive transaction or
+subtransaction abort and cleanup, since the exception catcher dare not
+make many assumptions about what has gone wrong.  There are situations
+where we would rather have a lighter-weight mechanism for dealing
+with errors that are known to be safe to recover from without a full
+transaction cleanup.  SQL-callable functions can support this need
+using the ErrorSaveContext context mechanism.
+
+To report a "safe" error, a SQL-callable function should call
+    errsave(fcinfo->context, ...)
+where it would previously have done
+    ereport(ERROR, ...)
+If the passed "context" is NULL or is not an ErrorSaveContext node,
+then errsave behaves precisely as ereport(ERROR): the exception is
+thrown via longjmp, so that control does not return.  If "context"
+is an ErrorSaveContext node, then the error information included in
+errsave's subsidiary reporting calls is stored into the context node
+and control returns normally.  The function should then return a dummy
+value to its caller.  (SQL NULL is recommended as the dummy value;
+but anything will do, since the caller is expected to ignore the
+function's return value once it sees that an error has been reported
+in the ErrorSaveContext node.)
+
+If there is nothing to do except return after calling errsave(), use
+    ereturn(fcinfo->context, dummy_value, ...)
+to perform errsave() and then "return dummy_value".
+
+Considering datatype input functions as examples, typical "safe" error
+conditions include input syntax errors and out-of-range values.  An input
+function typically detects such cases with simple if-tests and can easily
+change the following ereport call to errsave.  Error conditions that
+should NOT be handled this way include out-of-memory, internal errors, and
+anything where there is any question about our ability to continue normal
+processing of the transaction.  Those should still be thrown with ereport.
+Because of this restriction, it's typically not necessary to pass the
+ErrorSaveContext pointer down very far, as errors reported by palloc or
+other low-level functions are typically reasonable to consider internal.
+
+Because no transaction cleanup will occur, a function that is exiting
+after errsave() returns still bears responsibility for resource cleanup.
+It is not necessary to be concerned about small leakages of palloc'd
+memory, since the caller should be running the function in a short-lived
+memory context.  However, resources such as locks, open files, or buffer
+pins must be closed out cleanly, as they would be in the non-error code
+path.
+
+Conventions for callers that use the ErrorSaveContext mechanism
+to trap errors are discussed with the declaration of that struct,
+in nodes/miscnodes.h.
+

 Functions Accepting or Returning Sets
 -------------------------------------
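
[Illustration, not part of the patch.] The README text above describes the
errsave() calling pattern in prose. The following is a minimal standalone
sketch of that control flow, assuming nothing from the PostgreSQL tree: every
name beginning with "Mock" or "mock_" is hypothetical and exists only for this
example, a plain message string stands in for ErrorData, and abort() stands in
for the longjmp that ereport(ERROR) performs. It shows only the shape of the
idea: record the error in the context node, return a dummy value, and let the
caller check the flag.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for NodeTag; NOT PostgreSQL's definition. */
typedef enum MockTag
{
    MOCK_T_Invalid = 0,
    MOCK_T_ErrorSaveContext
} MockTag;

/* Hypothetical stand-in for ErrorSaveContext. */
typedef struct MockErrorSaveContext
{
    MockTag     type;           /* must be MOCK_T_ErrorSaveContext */
    bool        error_occurred; /* set when a "safe" error is saved */
    const char *message;        /* stands in for ErrorData */
} MockErrorSaveContext;

/* True if "context" points to a save-context node (mimics an IsA() test). */
bool
mock_is_escontext(const void *context)
{
    return context != NULL &&
        *(const MockTag *) context == MOCK_T_ErrorSaveContext;
}

/*
 * Mimics errsave(): with a save-context node, store the error and return
 * normally; otherwise do not return (abort() replaces the real longjmp).
 */
void
mock_errsave(void *context, const char *msg)
{
    assert(msg != NULL);
    if (!mock_is_escontext(context))
        abort();
    ((MockErrorSaveContext *) context)->error_occurred = true;
    ((MockErrorSaveContext *) context)->message = msg;
}

/* Hypothetical input function: parse a decimal int, saving "safe" errors. */
int
mock_int_in(const char *str, void *context)
{
    char       *end;
    long        val;

    errno = 0;
    val = strtol(str, &end, 10);
    if (end == str || *end != '\0')
    {
        mock_errsave(context, "invalid input syntax");
        return 0;               /* dummy value; caller must not trust it */
    }
    if (errno == ERANGE || val < INT_MIN || val > INT_MAX)
    {
        mock_errsave(context, "value out of range");
        return 0;               /* dummy value */
    }
    return (int) val;
}
```

A caller in this sketch would initialize a MockErrorSaveContext with all
fields zero/NULL except the tag, call mock_int_in(), and check error_occurred
before using the result, which mirrors the caller convention the patch
documents for the real ErrorSaveContext in nodes/miscnodes.h.
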
diff --git a/src/backend/utils/fmgr/fmgr.c b/src/backend/utils/fmgr/fmgr.c
index 3c210297aa..8ef50781ec 100644
--- a/src/backend/utils/fmgr/fmgr.c
+++ b/src/backend/utils/fmgr/fmgr.c
@@ -23,6 +23,7 @@
 #include "lib/stringinfo.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "nodes/miscnodes.h"
 #include "nodes/nodeFuncs.h"
 #include "pgstat.h"
 #include "utils/acl.h"
@@ -1548,6 +1549,61 @@ InputFunctionCall(FmgrInfo *flinfo, char *str, Oid typioparam, int32 typmod)
     return result;
 }

+/*
+ * Call a previously-looked-up datatype input function, with non-exception
+ * handling of "safe" errors.
+ *
+ * This is the same as InputFunctionCall, but the caller may also pass a
+ * previously-initialized ErrorSaveContext node.  (We declare that as
+ * "void *" to avoid including miscnodes.h in fmgr.h.)  If escontext points
+ * to an ErrorSaveContext, any "safe" errors detected by the input function
+ * will be reported by filling the escontext struct.  The caller must
+ * check escontext->error_occurred before assuming that the function result
+ * is meaningful.
+ */
+Datum
+InputFunctionCallSafe(FmgrInfo *flinfo, char *str,
+                      Oid typioparam, int32 typmod,
+                      void *escontext)
+{
+    LOCAL_FCINFO(fcinfo, 3);
+    Datum        result;
+
+    if (str == NULL && flinfo->fn_strict)
+        return (Datum) 0;        /* just return null result */
+
+    InitFunctionCallInfoData(*fcinfo, flinfo, 3, InvalidOid, escontext, NULL);
+
+    fcinfo->args[0].value = CStringGetDatum(str);
+    fcinfo->args[0].isnull = false;
+    fcinfo->args[1].value = ObjectIdGetDatum(typioparam);
+    fcinfo->args[1].isnull = false;
+    fcinfo->args[2].value = Int32GetDatum(typmod);
+    fcinfo->args[2].isnull = false;
+
+    result = FunctionCallInvoke(fcinfo);
+
+    /* Result value is garbage, and could be null, if an error was reported */
+    if (SAFE_ERROR_OCCURRED(escontext))
+        return (Datum) 0;
+
+    /* Otherwise, should get null result if and only if str is NULL */
+    if (str == NULL)
+    {
+        if (!fcinfo->isnull)
+            elog(ERROR, "input function %u returned non-NULL",
+                 flinfo->fn_oid);
+    }
+    else
+    {
+        if (fcinfo->isnull)
+            elog(ERROR, "input function %u returned NULL",
+                 flinfo->fn_oid);
+    }
+
+    return result;
+}
+
 /*
  * Call a previously-looked-up datatype output function.
  *
diff --git a/src/include/fmgr.h b/src/include/fmgr.h
index 380a82b9de..27f98a4413 100644
--- a/src/include/fmgr.h
+++ b/src/include/fmgr.h
@@ -700,6 +700,9 @@ extern Datum OidFunctionCall9Coll(Oid functionId, Oid collation,
 /* Special cases for convenient invocation of datatype I/O functions. */
 extern Datum InputFunctionCall(FmgrInfo *flinfo, char *str,
                                Oid typioparam, int32 typmod);
+extern Datum InputFunctionCallSafe(FmgrInfo *flinfo, char *str,
+                                   Oid typioparam, int32 typmod,
+                                   void *escontext);
 extern Datum OidInputFunctionCall(Oid functionId, char *str,
                                   Oid typioparam, int32 typmod);
 extern char *OutputFunctionCall(FmgrInfo *flinfo, Datum val);
diff --git a/src/include/nodes/meson.build b/src/include/nodes/meson.build
index e63881086e..f0e60935b6 100644
--- a/src/include/nodes/meson.build
+++ b/src/include/nodes/meson.build
@@ -16,6 +16,7 @@ node_support_input_i = [
   'nodes/bitmapset.h',
   'nodes/extensible.h',
   'nodes/lockoptions.h',
+  'nodes/miscnodes.h',
   'nodes/replnodes.h',
   'nodes/supportnodes.h',
   'nodes/value.h',
diff --git a/src/include/nodes/miscnodes.h b/src/include/nodes/miscnodes.h
new file mode 100644
index 0000000000..893c49e02f
--- /dev/null
+++ b/src/include/nodes/miscnodes.h
@@ -0,0 +1,52 @@
+/*-------------------------------------------------------------------------
+ *
+ * miscnodes.h
+ *      Definitions for hard-to-classify node types.
+ *
+ * Node types declared here are not part of parse trees, plan trees,
+ * or execution state trees.  We only assign them NodeTag values because
+ * IsA() tests provide a convenient way to disambiguate what kind of
+ * structure is being passed through assorted APIs, such as function
+ * "context" pointers.
+ *
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/nodes/miscnodes.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef MISCNODES_H
+#define MISCNODES_H
+
+#include "nodes/nodes.h"
+
+/*
+ * ErrorSaveContext -
+ *    function call context node for handling of "safe" errors
+ *
+ * A caller wishing to trap "safe" errors must initialize a struct like this
+ * with all fields zero/NULL except for the NodeTag.  Optionally, set
+ * details_wanted = true if more than the bare knowledge that a "safe" error
+ * occurred is required.  After calling code that might report an error this
+ * way, check error_occurred to see if an error happened.  If so, and if
+ * details_wanted is true, error_data has been filled with error details
+ * (stored in the callee's memory context!).  FreeErrorData() can be called
+ * to release error_data, although this step is typically not necessary
+ * if the called code was run in a short-lived context.
+ */
+typedef struct ErrorSaveContext
+{
+    NodeTag        type;
+    bool        error_occurred; /* set to true if we detect a "safe" error */
+    bool        details_wanted; /* does caller want more info than that? */
+    ErrorData  *error_data;        /* details of error, if so */
+} ErrorSaveContext;
+
+/* Often-useful macro for checking if a safe error was reported */
+#define SAFE_ERROR_OCCURRED(escontext) \
+    ((escontext) != NULL && IsA(escontext, ErrorSaveContext) && \
+     ((ErrorSaveContext *) (escontext))->error_occurred)
+
+#endif                            /* MISCNODES_H */
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index f107a818e8..9d292ea6fd 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -235,6 +235,62 @@ extern int    getinternalerrposition(void);
     ereport(elevel, errmsg_internal(__VA_ARGS__))


+/*----------
+ * Support for reporting "safe" errors that don't require a full transaction
+ * abort to clean up.  This is to be used in this way:
+ *        errsave(context,
+ *                errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+ *                errmsg("invalid input syntax for type %s: \"%s\"",
+ *                       "boolean", in_str),
+ *                ... other errxxx() fields as needed ...);
+ *
+ * "context" is a node pointer or NULL, and the remaining auxiliary calls
+ * provide the same error details as in ereport().  If context is not a
+ * pointer to an ErrorSaveContext node, then errsave(context, ...)
+ * behaves identically to ereport(ERROR, ...).  If context is a pointer
+ * to an ErrorSaveContext node, then the information provided by the
+ * auxiliary calls is stored in the context node and control returns
+ * normally.  The caller of errsave() must then do any required cleanup
+ * and return control back to its caller.  That caller must check the
+ * ErrorSaveContext node to see whether an error occurred before
+ * it can trust the function's result to be meaningful.
+ *
+ * errsave_domain() allows a message domain to be specified; it is
+ * precisely analogous to ereport_domain().
+ *----------
+ */
+#define errsave_domain(context, domain, ...)    \
+    do { \
+        void *context_ = (context); \
+        pg_prevent_errno_in_scope(); \
+        if (errsave_start(context_, domain)) \
+            __VA_ARGS__, errsave_finish(context_, __FILE__, __LINE__, __func__); \
+    } while(0)
+
+#define errsave(context, ...)    \
+    errsave_domain(context, TEXTDOMAIN, __VA_ARGS__)
+
+/*
+ * "ereturn(context, dummy_value, ...);" is exactly the same as
+ * "errsave(context, ...); return dummy_value;".  This saves a bit
+ * of typing in the common case where a function has no cleanup
+ * actions to take after reporting a safe error.  "dummy_value"
+ * can be empty if the function returns void.
+ */
+#define ereturn_domain(context, dummy_value, domain, ...)    \
+    do { \
+        errsave_domain(context, domain, __VA_ARGS__); \
+        return dummy_value; \
+    } while(0)
+
+#define ereturn(context, dummy_value, ...)    \
+    ereturn_domain(context, dummy_value, TEXTDOMAIN, __VA_ARGS__)
+
+extern bool errsave_start(void *context, const char *domain);
+extern void errsave_finish(void *context, const char *filename, int lineno,
+                           const char *funcname);
+
+
 /* Support for constructing error strings separately from ereport() calls */

 extern void pre_format_elog_string(int errnumber, const char *domain);
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index 495e449a9e..245ea5ba09 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -21,6 +21,7 @@
 #include "catalog/pg_type.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
+#include "nodes/miscnodes.h"
 #include "nodes/nodeFuncs.h"
 #include "nodes/supportnodes.h"
 #include "optimizer/optimizer.h"
@@ -90,14 +91,15 @@ typedef struct ArrayIteratorData
 }            ArrayIteratorData;

 static bool array_isspace(char ch);
-static int    ArrayCount(const char *str, int *dim, char typdelim);
+static int    ArrayCount(const char *str, int *dim, char typdelim,
+                       void *escontext);
 static void ReadArrayStr(char *arrayStr, const char *origStr,
                          int nitems, int ndim, int *dim,
                          FmgrInfo *inputproc, Oid typioparam, int32 typmod,
                          char typdelim,
                          int typlen, bool typbyval, char typalign,
                          Datum *values, bool *nulls,
-                         bool *hasnulls, int32 *nbytes);
+                         bool *hasnulls, int32 *nbytes, void *escontext);
 static void ReadArrayBinary(StringInfo buf, int nitems,
                             FmgrInfo *receiveproc, Oid typioparam, int32 typmod,
                             int typlen, bool typbyval, char typalign,
@@ -177,6 +179,7 @@ array_in(PG_FUNCTION_ARGS)
     Oid            element_type = PG_GETARG_OID(1);    /* type of an array
                                                      * element */
     int32        typmod = PG_GETARG_INT32(2);    /* typmod for array elements */
+    void       *escontext = fcinfo->context;
     int            typlen;
     bool        typbyval;
     char        typalign;
@@ -188,8 +191,8 @@ array_in(PG_FUNCTION_ARGS)
                 nitems;
     Datum       *dataPtr;
     bool       *nullsPtr;
-    bool        hasnulls;
-    int32        nbytes;
+    bool        hasnulls = false;
+    int32        nbytes = 0;
     int32        dataoffset;
     ArrayType  *retval;
     int            ndim,
@@ -258,7 +261,7 @@ array_in(PG_FUNCTION_ARGS)
             break;                /* no more dimension items */
         p++;
         if (ndim >= MAXDIM)
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                      errmsg("number of array dimensions (%d) exceeds the maximum allowed (%d)",
                             ndim + 1, MAXDIM)));
@@ -266,7 +269,7 @@ array_in(PG_FUNCTION_ARGS)
         for (q = p; isdigit((unsigned char) *q) || (*q == '-') || (*q == '+'); q++)
              /* skip */ ;
         if (q == p)                /* no digits? */
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"", string),
                      errdetail("\"[\" must introduce explicitly-specified array dimensions.")));
@@ -280,7 +283,7 @@ array_in(PG_FUNCTION_ARGS)
             for (q = p; isdigit((unsigned char) *q) || (*q == '-') || (*q == '+'); q++)
                  /* skip */ ;
             if (q == p)            /* no digits? */
-                ereport(ERROR,
+                ereturn(escontext, (Datum) 0,
                         (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                          errmsg("malformed array literal: \"%s\"", string),
                          errdetail("Missing array dimension value.")));
@@ -291,7 +294,7 @@ array_in(PG_FUNCTION_ARGS)
             lBound[ndim] = 1;
         }
         if (*q != ']')
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"", string),
                      errdetail("Missing \"%s\" after array dimensions.",
@@ -301,7 +304,7 @@ array_in(PG_FUNCTION_ARGS)
         ub = atoi(p);
         p = q + 1;
         if (ub < lBound[ndim])
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
                      errmsg("upper bound cannot be less than lower bound")));

@@ -313,11 +316,13 @@ array_in(PG_FUNCTION_ARGS)
     {
         /* No array dimensions, so intuit dimensions from brace structure */
         if (*p != '{')
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"", string),
                      errdetail("Array value must start with \"{\" or dimension information.")));
-        ndim = ArrayCount(p, dim, typdelim);
+        ndim = ArrayCount(p, dim, typdelim, escontext);
+        if (ndim < 0)
+            PG_RETURN_NULL();
         for (i = 0; i < ndim; i++)
             lBound[i] = 1;
     }
@@ -328,7 +333,7 @@ array_in(PG_FUNCTION_ARGS)

         /* If array dimensions are given, expect '=' operator */
         if (strncmp(p, ASSGN, strlen(ASSGN)) != 0)
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"", string),
                      errdetail("Missing \"%s\" after array dimensions.",
@@ -342,20 +347,22 @@ array_in(PG_FUNCTION_ARGS)
          * were given
          */
         if (*p != '{')
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"", string),
                      errdetail("Array contents must start with \"{\".")));
-        ndim_braces = ArrayCount(p, dim_braces, typdelim);
+        ndim_braces = ArrayCount(p, dim_braces, typdelim, escontext);
+        if (ndim_braces < 0)
+            PG_RETURN_NULL();
         if (ndim_braces != ndim)
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"", string),
                      errdetail("Specified array dimensions do not match array contents.")));
         for (i = 0; i < ndim; ++i)
         {
             if (dim[i] != dim_braces[i])
-                ereport(ERROR,
+                ereturn(escontext, (Datum) 0,
                         (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                          errmsg("malformed array literal: \"%s\"", string),
                          errdetail("Specified array dimensions do not match array contents.")));
@@ -372,8 +379,10 @@ array_in(PG_FUNCTION_ARGS)
 #endif

     /* This checks for overflow of the array dimensions */
-    nitems = ArrayGetNItems(ndim, dim);
-    ArrayCheckBounds(ndim, dim, lBound);
+    nitems = ArrayGetNItemsSafe(ndim, dim, escontext);
+    ArrayCheckBoundsSafe(ndim, dim, lBound, escontext);
+    if (SAFE_ERROR_OCCURRED(escontext))
+        PG_RETURN_NULL();

     /* Empty array? */
     if (nitems == 0)
@@ -387,7 +396,9 @@ array_in(PG_FUNCTION_ARGS)
                  typdelim,
                  typlen, typbyval, typalign,
                  dataPtr, nullsPtr,
-                 &hasnulls, &nbytes);
+                 &hasnulls, &nbytes, escontext);
+    if (SAFE_ERROR_OCCURRED(escontext))
+        PG_RETURN_NULL();
     if (hasnulls)
     {
         dataoffset = ARR_OVERHEAD_WITHNULLS(ndim, nitems);
@@ -451,9 +462,11 @@ array_isspace(char ch)
  *
  * Returns number of dimensions as function result.  The axis lengths are
  * returned in dim[], which must be of size MAXDIM.
+ *
+ * If we detect an error, fill *escontext with error details and return -1.
  */
 static int
-ArrayCount(const char *str, int *dim, char typdelim)
+ArrayCount(const char *str, int *dim, char typdelim, void *escontext)
 {
     int            nest_level = 0,
                 i;
@@ -488,11 +501,10 @@ ArrayCount(const char *str, int *dim, char typdelim)
             {
                 case '\0':
                     /* Signal a premature end of the string */
-                    ereport(ERROR,
+                    ereturn(escontext, -1,
                             (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                              errmsg("malformed array literal: \"%s\"", str),
                              errdetail("Unexpected end of input.")));
-                    break;
                 case '\\':

                     /*
@@ -504,7 +516,7 @@ ArrayCount(const char *str, int *dim, char typdelim)
                         parse_state != ARRAY_ELEM_STARTED &&
                         parse_state != ARRAY_QUOTED_ELEM_STARTED &&
                         parse_state != ARRAY_ELEM_DELIMITED)
-                        ereport(ERROR,
+                        ereturn(escontext, -1,
                                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                  errmsg("malformed array literal: \"%s\"", str),
                                  errdetail("Unexpected \"%c\" character.",
@@ -515,7 +527,7 @@ ArrayCount(const char *str, int *dim, char typdelim)
                     if (*(ptr + 1))
                         ptr++;
                     else
-                        ereport(ERROR,
+                        ereturn(escontext, -1,
                                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                  errmsg("malformed array literal: \"%s\"", str),
                                  errdetail("Unexpected end of input.")));
@@ -530,7 +542,7 @@ ArrayCount(const char *str, int *dim, char typdelim)
                     if (parse_state != ARRAY_LEVEL_STARTED &&
                         parse_state != ARRAY_QUOTED_ELEM_STARTED &&
                         parse_state != ARRAY_ELEM_DELIMITED)
-                        ereport(ERROR,
+                        ereturn(escontext, -1,
                                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                  errmsg("malformed array literal: \"%s\"", str),
                                  errdetail("Unexpected array element.")));
@@ -551,14 +563,14 @@ ArrayCount(const char *str, int *dim, char typdelim)
                         if (parse_state != ARRAY_NO_LEVEL &&
                             parse_state != ARRAY_LEVEL_STARTED &&
                             parse_state != ARRAY_LEVEL_DELIMITED)
-                            ereport(ERROR,
+                            ereturn(escontext, -1,
                                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                      errmsg("malformed array literal: \"%s\"", str),
                                      errdetail("Unexpected \"%c\" character.",
                                                '{')));
                         parse_state = ARRAY_LEVEL_STARTED;
                         if (nest_level >= MAXDIM)
-                            ereport(ERROR,
+                            ereturn(escontext, -1,
                                     (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                                      errmsg("number of array dimensions (%d) exceeds the maximum allowed (%d)",
                                             nest_level + 1, MAXDIM)));
@@ -581,14 +593,14 @@ ArrayCount(const char *str, int *dim, char typdelim)
                             parse_state != ARRAY_QUOTED_ELEM_COMPLETED &&
                             parse_state != ARRAY_LEVEL_COMPLETED &&
                             !(nest_level == 1 && parse_state == ARRAY_LEVEL_STARTED))
-                            ereport(ERROR,
+                            ereturn(escontext, -1,
                                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                      errmsg("malformed array literal: \"%s\"", str),
                                      errdetail("Unexpected \"%c\" character.",
                                                '}')));
                         parse_state = ARRAY_LEVEL_COMPLETED;
                         if (nest_level == 0)
-                            ereport(ERROR,
+                            ereturn(escontext, -1,
                                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                      errmsg("malformed array literal: \"%s\"", str),
                                      errdetail("Unmatched \"%c\" character.", '}')));
@@ -596,7 +608,7 @@ ArrayCount(const char *str, int *dim, char typdelim)

                         if (nelems_last[nest_level] != 0 &&
                             nelems[nest_level] != nelems_last[nest_level])
-                            ereport(ERROR,
+                            ereturn(escontext, -1,
                                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                      errmsg("malformed array literal: \"%s\"", str),
                                      errdetail("Multidimensional arrays must have "
@@ -630,7 +642,7 @@ ArrayCount(const char *str, int *dim, char typdelim)
                                 parse_state != ARRAY_ELEM_COMPLETED &&
                                 parse_state != ARRAY_QUOTED_ELEM_COMPLETED &&
                                 parse_state != ARRAY_LEVEL_COMPLETED)
-                                ereport(ERROR,
+                                ereturn(escontext, -1,
                                         (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                          errmsg("malformed array literal: \"%s\"", str),
                                          errdetail("Unexpected \"%c\" character.",
@@ -653,7 +665,7 @@ ArrayCount(const char *str, int *dim, char typdelim)
                             if (parse_state != ARRAY_LEVEL_STARTED &&
                                 parse_state != ARRAY_ELEM_STARTED &&
                                 parse_state != ARRAY_ELEM_DELIMITED)
-                                ereport(ERROR,
+                                ereturn(escontext, -1,
                                         (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                          errmsg("malformed array literal: \"%s\"", str),
                                          errdetail("Unexpected array element.")));
@@ -673,7 +685,7 @@ ArrayCount(const char *str, int *dim, char typdelim)
     while (*ptr)
     {
         if (!array_isspace(*ptr++))
-            ereport(ERROR,
+            ereturn(escontext, -1,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"", str),
                      errdetail("Junk after closing right brace.")));
@@ -713,9 +725,14 @@ ArrayCount(const char *str, int *dim, char typdelim)
  *    *hasnulls: set true iff there are any null elements.
  *    *nbytes: set to total size of data area needed (including alignment
  *        padding but not including array header overhead).
+ *    *escontext: if this points to an ErrorSaveContext, details of
+ *        any error are reported there.
  *
  * Note that values[] and nulls[] are allocated by the caller, and must have
  * nitems elements.
+ *
+ * If escontext isn't NULL, caller must check for "safe" errors by
+ * examining the escontext.
  */
 static void
 ReadArrayStr(char *arrayStr,
@@ -733,7 +750,8 @@ ReadArrayStr(char *arrayStr,
              Datum *values,
              bool *nulls,
              bool *hasnulls,
-             int32 *nbytes)
+             int32 *nbytes,
+             void *escontext)
 {
     int            i,
                 nest_level = 0;
@@ -784,7 +802,7 @@ ReadArrayStr(char *arrayStr,
             {
                 case '\0':
                     /* Signal a premature end of the string */
-                    ereport(ERROR,
+                    ereturn(escontext,,
                             (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                              errmsg("malformed array literal: \"%s\"",
                                     origStr)));
@@ -793,7 +811,7 @@ ReadArrayStr(char *arrayStr,
                     /* Skip backslash, copy next character as-is. */
                     srcptr++;
                     if (*srcptr == '\0')
-                        ereport(ERROR,
+                        ereturn(escontext,,
                                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                  errmsg("malformed array literal: \"%s\"",
                                         origStr)));
@@ -823,7 +841,7 @@ ReadArrayStr(char *arrayStr,
                     if (!in_quotes)
                     {
                         if (nest_level >= ndim)
-                            ereport(ERROR,
+                            ereturn(escontext,,
                                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                      errmsg("malformed array literal: \"%s\"",
                                             origStr)));
@@ -838,7 +856,7 @@ ReadArrayStr(char *arrayStr,
                     if (!in_quotes)
                     {
                         if (nest_level == 0)
-                            ereport(ERROR,
+                            ereturn(escontext,,
                                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                      errmsg("malformed array literal: \"%s\"",
                                             origStr)));
@@ -891,7 +909,7 @@ ReadArrayStr(char *arrayStr,
         *dstendptr = '\0';

         if (i < 0 || i >= nitems)
-            ereport(ERROR,
+            ereturn(escontext,,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"",
                             origStr)));
@@ -900,16 +918,20 @@ ReadArrayStr(char *arrayStr,
             pg_strcasecmp(itemstart, "NULL") == 0)
         {
             /* it's a NULL item */
-            values[i] = InputFunctionCall(inputproc, NULL,
-                                          typioparam, typmod);
+            values[i] = InputFunctionCallSafe(inputproc, NULL,
+                                              typioparam, typmod,
+                                              escontext);
             nulls[i] = true;
         }
         else
         {
-            values[i] = InputFunctionCall(inputproc, itemstart,
-                                          typioparam, typmod);
+            values[i] = InputFunctionCallSafe(inputproc, itemstart,
+                                              typioparam, typmod,
+                                              escontext);
             nulls[i] = false;
         }
+        if (SAFE_ERROR_OCCURRED(escontext))
+            return;
     }

     /*
@@ -930,7 +952,7 @@ ReadArrayStr(char *arrayStr,
             totbytes = att_align_nominal(totbytes, typalign);
             /* check for overflow of total request */
             if (!AllocSizeIsValid(totbytes))
-                ereport(ERROR,
+                ereturn(escontext,,
                         (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                          errmsg("array size exceeds the maximum allowed (%d)",
                                 (int) MaxAllocSize)));
diff --git a/src/backend/utils/adt/arrayutils.c b/src/backend/utils/adt/arrayutils.c
index 051169a149..2868f4ef10 100644
--- a/src/backend/utils/adt/arrayutils.c
+++ b/src/backend/utils/adt/arrayutils.c
@@ -74,6 +74,16 @@ ArrayGetOffset0(int n, const int *tup, const int *scale)
  */
 int
 ArrayGetNItems(int ndim, const int *dims)
+{
+    return ArrayGetNItemsSafe(ndim, dims, NULL);
+}
+
+/*
+ * This entry point can return the error into an ErrorSaveContext
+ * instead of throwing an exception.  -1 is returned after an error.
+ */
+int
+ArrayGetNItemsSafe(int ndim, const int *dims, void *escontext)
 {
     int32        ret;
     int            i;
@@ -89,7 +99,7 @@ ArrayGetNItems(int ndim, const int *dims)

         /* A negative dimension implies that UB-LB overflowed ... */
         if (dims[i] < 0)
-            ereport(ERROR,
+            ereturn(escontext, -1,
                     (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                      errmsg("array size exceeds the maximum allowed (%d)",
                             (int) MaxArraySize)));
@@ -98,14 +108,14 @@ ArrayGetNItems(int ndim, const int *dims)

         ret = (int32) prod;
         if ((int64) ret != prod)
-            ereport(ERROR,
+            ereturn(escontext, -1,
                     (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                      errmsg("array size exceeds the maximum allowed (%d)",
                             (int) MaxArraySize)));
     }
     Assert(ret >= 0);
     if ((Size) ret > MaxArraySize)
-        ereport(ERROR,
+        ereturn(escontext, -1,
                 (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                  errmsg("array size exceeds the maximum allowed (%d)",
                         (int) MaxArraySize)));
@@ -126,6 +136,17 @@ ArrayGetNItems(int ndim, const int *dims)
  */
 void
 ArrayCheckBounds(int ndim, const int *dims, const int *lb)
+{
+    ArrayCheckBoundsSafe(ndim, dims, lb, NULL);
+}
+
+/*
+ * This entry point can return the error into an ErrorSaveContext
+ * instead of throwing an exception.
+ */
+void
+ArrayCheckBoundsSafe(int ndim, const int *dims, const int *lb,
+                     void *escontext)
 {
     int            i;

@@ -135,7 +156,7 @@ ArrayCheckBounds(int ndim, const int *dims, const int *lb)
         int32        sum PG_USED_FOR_ASSERTS_ONLY;

         if (pg_add_s32_overflow(dims[i], lb[i], &sum))
-            ereport(ERROR,
+            ereturn(escontext,,
                     (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                      errmsg("array lower bound is too large: %d",
                             lb[i])));
diff --git a/src/backend/utils/adt/bool.c b/src/backend/utils/adt/bool.c
index cd7335287f..e291672ae4 100644
--- a/src/backend/utils/adt/bool.c
+++ b/src/backend/utils/adt/bool.c
@@ -148,13 +148,10 @@ boolin(PG_FUNCTION_ARGS)
     if (parse_bool_with_len(str, len, &result))
         PG_RETURN_BOOL(result);

-    ereport(ERROR,
+    ereturn(fcinfo->context, (Datum) 0,
             (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
              errmsg("invalid input syntax for type %s: \"%s\"",
                     "boolean", in_str)));
-
-    /* not reached */
-    PG_RETURN_BOOL(false);
 }

 /*
diff --git a/src/backend/utils/adt/int.c b/src/backend/utils/adt/int.c
index 42ddae99ef..e1837bee71 100644
--- a/src/backend/utils/adt/int.c
+++ b/src/backend/utils/adt/int.c
@@ -291,7 +291,7 @@ int4in(PG_FUNCTION_ARGS)
 {
     char       *num = PG_GETARG_CSTRING(0);

-    PG_RETURN_INT32(pg_strtoint32(num));
+    PG_RETURN_INT32(pg_strtoint32_safe(num, fcinfo->context));
 }

 /*
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index a64422c8d0..0de0bed0e8 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -166,8 +166,11 @@ invalid_syntax:
 /*
  * Convert input string to a signed 32 bit integer.
  *
- * Allows any number of leading or trailing whitespace characters. Will throw
- * ereport() upon bad input format or overflow.
+ * Allows any number of leading or trailing whitespace characters.
+ *
+ * pg_strtoint32() will throw ereport() upon bad input format or overflow;
+ * while pg_strtoint32_safe() instead returns such complaints in *escontext,
+ * if it's an ErrorSaveContext.
  *
  * NB: Accumulate input as an unsigned number, to deal with two's complement
  * representation of the most negative number, which can't be represented as a
@@ -175,6 +178,12 @@ invalid_syntax:
  */
 int32
 pg_strtoint32(const char *s)
+{
+    return pg_strtoint32_safe(s, NULL);
+}
+
+int32
+pg_strtoint32_safe(const char *s, Node *escontext)
 {
     const char *ptr = s;
     uint32        tmp = 0;
@@ -227,18 +236,16 @@ pg_strtoint32(const char *s)
     return (int32) tmp;

 out_of_range:
-    ereport(ERROR,
+    ereturn(escontext, 0,
             (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
              errmsg("value \"%s\" is out of range for type %s",
                     s, "integer")));

 invalid_syntax:
-    ereport(ERROR,
+    ereturn(escontext, 0,
             (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
              errmsg("invalid input syntax for type %s: \"%s\"",
                     "integer", s)));
-
-    return 0;                    /* keep compiler quiet */
 }

 /*
diff --git a/src/backend/utils/adt/rowtypes.c b/src/backend/utils/adt/rowtypes.c
index db843a0fbf..221362ddb8 100644
--- a/src/backend/utils/adt/rowtypes.c
+++ b/src/backend/utils/adt/rowtypes.c
@@ -23,6 +23,7 @@
 #include "funcapi.h"
 #include "libpq/pqformat.h"
 #include "miscadmin.h"
+#include "nodes/miscnodes.h"
 #include "utils/builtins.h"
 #include "utils/datum.h"
 #include "utils/lsyscache.h"
@@ -77,6 +78,7 @@ record_in(PG_FUNCTION_ARGS)
     char       *string = PG_GETARG_CSTRING(0);
     Oid            tupType = PG_GETARG_OID(1);
     int32        tupTypmod = PG_GETARG_INT32(2);
+    void       *escontext = fcinfo->context;
     HeapTupleHeader result;
     TupleDesc    tupdesc;
     HeapTuple    tuple;
@@ -100,7 +102,7 @@ record_in(PG_FUNCTION_ARGS)
      * supply a valid typmod, and then we can do something useful for RECORD.
      */
     if (tupType == RECORDOID && tupTypmod < 0)
-        ereport(ERROR,
+        ereturn(escontext, (Datum) 0,
                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                  errmsg("input of anonymous composite types is not implemented")));

@@ -152,7 +154,7 @@ record_in(PG_FUNCTION_ARGS)
     while (*ptr && isspace((unsigned char) *ptr))
         ptr++;
     if (*ptr++ != '(')
-        ereport(ERROR,
+        ereturn(escontext, (Datum) 0,
                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                  errmsg("malformed record literal: \"%s\"", string),
                  errdetail("Missing left parenthesis.")));
@@ -181,7 +183,7 @@ record_in(PG_FUNCTION_ARGS)
                 ptr++;
             else
                 /* *ptr must be ')' */
-                ereport(ERROR,
+                ereturn(escontext, (Datum) 0,
                         (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                          errmsg("malformed record literal: \"%s\"", string),
                          errdetail("Too few columns.")));
@@ -204,7 +206,7 @@ record_in(PG_FUNCTION_ARGS)
                 char        ch = *ptr++;

                 if (ch == '\0')
-                    ereport(ERROR,
+                    ereturn(escontext, (Datum) 0,
                             (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                              errmsg("malformed record literal: \"%s\"",
                                     string),
@@ -212,7 +214,7 @@ record_in(PG_FUNCTION_ARGS)
                 if (ch == '\\')
                 {
                     if (*ptr == '\0')
-                        ereport(ERROR,
+                        ereturn(escontext, (Datum) 0,
                                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                  errmsg("malformed record literal: \"%s\"",
                                         string),
@@ -252,10 +254,13 @@ record_in(PG_FUNCTION_ARGS)
             column_info->column_type = column_type;
         }

-        values[i] = InputFunctionCall(&column_info->proc,
-                                      column_data,
-                                      column_info->typioparam,
-                                      att->atttypmod);
+        values[i] = InputFunctionCallSafe(&column_info->proc,
+                                          column_data,
+                                          column_info->typioparam,
+                                          att->atttypmod,
+                                          escontext);
+        if (SAFE_ERROR_OCCURRED(escontext))
+            PG_RETURN_NULL();

         /*
          * Prep for next column
@@ -264,7 +269,7 @@ record_in(PG_FUNCTION_ARGS)
     }

     if (*ptr++ != ')')
-        ereport(ERROR,
+        ereturn(escontext, (Datum) 0,
                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                  errmsg("malformed record literal: \"%s\"", string),
                  errdetail("Too many columns.")));
@@ -272,7 +277,7 @@ record_in(PG_FUNCTION_ARGS)
     while (*ptr && isspace((unsigned char) *ptr))
         ptr++;
     if (*ptr)
-        ereport(ERROR,
+        ereturn(escontext, (Datum) 0,
                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                  errmsg("malformed record literal: \"%s\"", string),
                  errdetail("Junk after right parenthesis.")));
diff --git a/src/include/utils/array.h b/src/include/utils/array.h
index 2f794d1168..c56b6937b0 100644
--- a/src/include/utils/array.h
+++ b/src/include/utils/array.h
@@ -447,7 +447,10 @@ extern void array_free_iterator(ArrayIterator iterator);
 extern int    ArrayGetOffset(int n, const int *dim, const int *lb, const int *indx);
 extern int    ArrayGetOffset0(int n, const int *tup, const int *scale);
 extern int    ArrayGetNItems(int ndim, const int *dims);
+extern int    ArrayGetNItemsSafe(int ndim, const int *dims, void *escontext);
 extern void ArrayCheckBounds(int ndim, const int *dims, const int *lb);
+extern void ArrayCheckBoundsSafe(int ndim, const int *dims, const int *lb,
+                                 void *escontext);
 extern void mda_get_range(int n, int *span, const int *st, const int *endp);
 extern void mda_get_prod(int n, const int *range, int *prod);
 extern void mda_get_offset_values(int n, int *dist, const int *prod, const int *span);
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 81631f1645..fbfd8375e3 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -45,6 +45,7 @@ extern int    namestrcmp(Name name, const char *str);
 /* numutils.c */
 extern int16 pg_strtoint16(const char *s);
 extern int32 pg_strtoint32(const char *s);
+extern int32 pg_strtoint32_safe(const char *s, Node *escontext);
 extern int64 pg_strtoint64(const char *s);
 extern int    pg_itoa(int16 i, char *a);
 extern int    pg_ultoa_n(uint32 value, char *a);
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index c25b52d0cb..462e4d338b 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -42,6 +42,8 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     FORCE_QUOTE { ( <replaceable class="parameter">column_name</replaceable> [, ...] ) | * }
     FORCE_NOT_NULL ( <replaceable class="parameter">column_name</replaceable> [, ...] )
     FORCE_NULL ( <replaceable class="parameter">column_name</replaceable> [, ...] )
+    NULL_ON_ERROR [ <replaceable class="parameter">boolean</replaceable> ]
+    WARN_ON_ERROR [ <replaceable class="parameter">boolean</replaceable> ]
     ENCODING '<replaceable class="parameter">encoding_name</replaceable>'
 </synopsis>
  </refsynopsisdiv>
@@ -356,6 +358,27 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     </listitem>
    </varlistentry>

+   <varlistentry>
+    <term><literal>NULL_ON_ERROR</literal></term>
+    <listitem>
+     <para>
+      Requests silently replacing any erroneous input values with
+      <literal>NULL</literal>.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term><literal>WARN_ON_ERROR</literal></term>
+    <listitem>
+     <para>
+      Requests replacing any erroneous input values with
+      <literal>NULL</literal>, and emitting a warning message instead of
+      the usual error.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal>ENCODING</literal></term>
     <listitem>
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index db4c9dbc23..d224167111 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -409,6 +409,7 @@ ProcessCopyOptions(ParseState *pstate,
     bool        format_specified = false;
     bool        freeze_specified = false;
     bool        header_specified = false;
+    bool        on_error_specified = false;
     ListCell   *option;

     /* Support external use for option sanity checking */
@@ -520,6 +521,20 @@ ProcessCopyOptions(ParseState *pstate,
                                 defel->defname),
                          parser_errposition(pstate, defel->location)));
         }
+        else if (strcmp(defel->defname, "null_on_error") == 0)
+        {
+            if (on_error_specified)
+                errorConflictingDefElem(defel, pstate);
+            on_error_specified = true;
+            opts_out->null_on_error = defGetBoolean(defel);
+        }
+        else if (strcmp(defel->defname, "warn_on_error") == 0)
+        {
+            if (on_error_specified)
+                errorConflictingDefElem(defel, pstate);
+            on_error_specified = true;
+            opts_out->warn_on_error = defGetBoolean(defel);
+        }
         else if (strcmp(defel->defname, "convert_selectively") == 0)
         {
             /*
@@ -701,6 +716,30 @@ ProcessCopyOptions(ParseState *pstate,
         ereport(ERROR,
                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                  errmsg("CSV quote character must not appear in the NULL specification")));
+
+    /*
+     * The XXX_ON_ERROR options are only supported for input, and only in text
+     * modes.  We could in future extend safe-errors support to datatype
+     * receive functions, but it'd take a lot more work.  Moreover, it's not
+     * clear that receive functions can detect errors very well, so the
+     * feature likely wouldn't work terribly well.
+     */
+    if (opts_out->null_on_error && !is_from)
+        ereport(ERROR,
+                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                 errmsg("COPY NULL_ON_ERROR only available using COPY FROM")));
+    if (opts_out->null_on_error && opts_out->binary)
+        ereport(ERROR,
+                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                 errmsg("cannot specify NULL_ON_ERROR in BINARY mode")));
+    if (opts_out->warn_on_error && !is_from)
+        ereport(ERROR,
+                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                 errmsg("COPY WARN_ON_ERROR only available using COPY FROM")));
+    if (opts_out->warn_on_error && opts_out->binary)
+        ereport(ERROR,
+                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                 errmsg("cannot specify WARN_ON_ERROR in BINARY mode")));
 }

 /*
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 504afcb811..16b01b6598 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1599,6 +1599,15 @@ BeginCopyFrom(ParseState *pstate,
         }
     }

+    /* For the XXX_ON_ERROR options, we'll need an ErrorSaveContext */
+    if (cstate->opts.null_on_error ||
+        cstate->opts.warn_on_error)
+    {
+        cstate->es_context = makeNode(ErrorSaveContext);
+        /* Error details are only needed for warnings */
+        if (cstate->opts.warn_on_error)
+            cstate->es_context->details_wanted = true;
+    }

     /* initialize progress */
     pgstat_progress_start_command(PROGRESS_COMMAND_COPY,
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index 097414ef12..9cf7d31dd2 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -876,6 +876,7 @@ NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
         char      **field_strings;
         ListCell   *cur;
         int            fldct;
+        bool        safe_mode;
         int            fieldno;
         char       *string;

@@ -889,6 +890,8 @@ NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
                     (errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
                      errmsg("extra data after last expected column")));

+        safe_mode = cstate->opts.null_on_error || cstate->opts.warn_on_error;
+
         fieldno = 0;

         /* Loop to read the user attributes on the line. */
@@ -938,12 +941,50 @@ NextCopyFrom(CopyFromState cstate, ExprContext *econtext,

             cstate->cur_attname = NameStr(att->attname);
             cstate->cur_attval = string;
-            values[m] = InputFunctionCall(&in_functions[m],
-                                          string,
-                                          typioparams[m],
-                                          att->atttypmod);
-            if (string != NULL)
-                nulls[m] = false;
+
+            if (safe_mode)
+            {
+                ErrorSaveContext *es_context = cstate->es_context;
+
+                /* Must reset the error_occurred flag each time */
+                es_context->error_occurred = false;
+
+                values[m] = InputFunctionCallSafe(&in_functions[m],
+                                                  string,
+                                                  typioparams[m],
+                                                  att->atttypmod,
+                                                  es_context);
+                if (es_context->error_occurred)
+                {
+                    /* nulls[m] is already true */
+                    if (cstate->opts.warn_on_error)
+                    {
+                        ErrorData  *edata = es_context->error_data;
+
+                        /* Note that our errcontext callback wasn't used */
+                        ereport(WARNING,
+                                errcode(edata->sqlerrcode),
+                                errmsg_internal("invalid input for column %s: %s",
+                                                cstate->cur_attname,
+                                                edata->message),
+                                errcontext("COPY %s, line %llu",
+                                           cstate->cur_relname,
+                                           (unsigned long long) cstate->cur_lineno));
+                    }
+                }
+                else if (string != NULL)
+                    nulls[m] = false;
+            }
+            else
+            {
+                values[m] = InputFunctionCall(&in_functions[m],
+                                              string,
+                                              typioparams[m],
+                                              att->atttypmod);
+                if (string != NULL)
+                    nulls[m] = false;
+            }
+
             cstate->cur_attname = NULL;
             cstate->cur_attval = NULL;
         }
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index b77b935005..ee38bd0e28 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -57,6 +57,8 @@ typedef struct CopyFormatOptions
     bool       *force_notnull_flags;    /* per-column CSV FNN flags */
     List       *force_null;        /* list of column names */
     bool       *force_null_flags;    /* per-column CSV FN flags */
+    bool        null_on_error;    /* replace erroneous inputs with NULL? */
+    bool        warn_on_error;    /* ... and warn about it? */
     bool        convert_selectively;    /* do selective binary conversion? */
     List       *convert_select; /* list of column names (can be NIL) */
 } CopyFormatOptions;
diff --git a/src/include/commands/copyfrom_internal.h b/src/include/commands/copyfrom_internal.h
index 8d9cc5accd..ee6a11306d 100644
--- a/src/include/commands/copyfrom_internal.h
+++ b/src/include/commands/copyfrom_internal.h
@@ -16,6 +16,7 @@

 #include "commands/copy.h"
 #include "commands/trigger.h"
+#include "nodes/miscnodes.h"

 /*
  * Represents the different source cases we need to worry about at
@@ -97,6 +98,7 @@ typedef struct CopyFromStateData
     int           *defmap;            /* array of default att numbers */
     ExprState **defexprs;        /* array of default att expressions */
     bool        volatile_defexprs;    /* is any of defexprs volatile? */
+    ErrorSaveContext *es_context;    /* used for XXX_ON_ERROR options */
     List       *range_table;
     ExprState  *qualexpr;

diff --git a/src/test/regress/expected/copy.out b/src/test/regress/expected/copy.out
index 3fad1c52d1..f848ce124d 100644
--- a/src/test/regress/expected/copy.out
+++ b/src/test/regress/expected/copy.out
@@ -240,3 +240,27 @@ SELECT * FROM header_copytest ORDER BY a;
 (5 rows)

 drop table header_copytest;
+-- "safe" error handling
+create table on_error_copytest(i int, b bool, ai int[]);
+copy on_error_copytest from stdin with (null_on_error);
+copy on_error_copytest from stdin with (warn_on_error);
+WARNING:  invalid input for column b: invalid input syntax for type boolean: "b"
+WARNING:  invalid input for column ai: malformed array literal: "[0:1000]={3,4}"
+WARNING:  invalid input for column i: invalid input syntax for type integer: "err"
+WARNING:  invalid input for column i: invalid input syntax for type integer: "bad"
+WARNING:  invalid input for column b: invalid input syntax for type boolean: "z"
+WARNING:  invalid input for column ai: invalid input syntax for type integer: "zed"
+select * from on_error_copytest;
+ i | b |     ai
+---+---+-------------
+ 1 |   |
+   | t |
+ 2 | f | {3,4}
+   |   |
+ 3 | f | [3:4]={3,4}
+ 4 |   |
+   | t | {}
+   |   |
+(8 rows)
+
+drop table on_error_copytest;
diff --git a/src/test/regress/sql/copy.sql b/src/test/regress/sql/copy.sql
index 285022e07c..ff77d27cfc 100644
--- a/src/test/regress/sql/copy.sql
+++ b/src/test/regress/sql/copy.sql
@@ -268,3 +268,23 @@ a    c    b

 SELECT * FROM header_copytest ORDER BY a;
 drop table header_copytest;
+
+-- "safe" error handling
+create table on_error_copytest(i int, b bool, ai int[]);
+
+copy on_error_copytest from stdin with (null_on_error);
+1    a    {1,}
+err    1    {x}
+2    f    {3,4}
+bad    x    {,
+\.
+
+copy on_error_copytest from stdin with (warn_on_error);
+3    0    [3:4]={3,4}
+4    b    [0:1000]={3,4}
+err    t    {}
+bad    z    {"zed"}
+\.
+
+select * from on_error_copytest;
+drop table on_error_copytest;

Re: Error-safe user functions

From

Andres Freund

Date:

05 December 2022, 23:47:34

Hi,

On 2022-12-05 16:40:06 -0500, Tom Lane wrote:
> +/*
> + * errsave_start --- begin a "safe" error-reporting cycle
> + *
> + * If "context" isn't an ErrorSaveContext node, this behaves as
> + * errstart(ERROR, domain), and the errsave() macro ends up acting
> + * exactly like ereport(ERROR, ...).
> + *
> + * If "context" is an ErrorSaveContext node, but the node creator only wants
> + * notification of the fact of a safe error without any details, just set
> + * the error_occurred flag in the ErrorSaveContext node and return false,
> + * which will cause us to skip the remaining error processing steps.
> + *
> + * Otherwise, create and initialize error stack entry and return true.
> + * Subsequently, errmsg() and perhaps other routines will be called to further
> + * populate the stack entry.  Finally, errsave_finish() will be called to
> + * tidy up.
> + */
> +bool
> +errsave_start(void *context, const char *domain)

Why is context a void *?


> +{
> +    ErrorSaveContext *escontext;
> +    ErrorData  *edata;
> +
> +    /*
> +     * Do we have a context for safe error reporting?  If not, just punt to
> +     * errstart().
> +     */
> +    if (context == NULL || !IsA(context, ErrorSaveContext))
> +        return errstart(ERROR, domain);

I don't think we should "accept" !IsA(context, ErrorSaveContext) - that
seems likely to hide things like use-after-free.


> +    if (++errordata_stack_depth >= ERRORDATA_STACK_SIZE)
> +    {
> +        /*
> +         * Wups, stack not big enough.  We treat this as a PANIC condition
> +         * because it suggests an infinite loop of errors during error
> +         * recovery.
> +         */
> +        errordata_stack_depth = -1; /* make room on stack */
> +        ereport(PANIC, (errmsg_internal("ERRORDATA_STACK_SIZE exceeded")));
> +    }

This is the fourth copy of this code...



> +/*
> + * errsave_finish --- end a "safe" error-reporting cycle
> + *
> + * If errsave_start() decided this was a regular error, behave as
> + * errfinish().  Otherwise, package up the error details and save
> + * them in the ErrorSaveContext node.
> + */
> +void
> +errsave_finish(void *context, const char *filename, int lineno,
> +               const char *funcname)
> +{
> +    ErrorSaveContext *escontext = (ErrorSaveContext *) context;
> +    ErrorData  *edata = &errordata[errordata_stack_depth];
> +
> +    /* verify stack depth before accessing *edata */
> +    CHECK_STACK_DEPTH();
> +
> +    /*
> +     * If errsave_start punted to errstart, then elevel will be ERROR or
> +     * perhaps even PANIC.  Punt likewise to errfinish.
> +     */
> +    if (edata->elevel >= ERROR)
> +        errfinish(filename, lineno, funcname);

I'd put a pg_unreachable() or such after the errfinish() call.


> +    /*
> +     * Else, we should package up the stack entry contents and deliver them to
> +     * the caller.
> +     */
> +    recursion_depth++;
> +
> +    /* Save the last few bits of error state into the stack entry */
> +    if (filename)
> +    {
> +        const char *slash;
> +
> +        /* keep only base name, useful especially for vpath builds */
> +        slash = strrchr(filename, '/');
> +        if (slash)
> +            filename = slash + 1;
> +        /* Some Windows compilers use backslashes in __FILE__ strings */
> +        slash = strrchr(filename, '\\');
> +        if (slash)
> +            filename = slash + 1;
> +    }
> +
> +    edata->filename = filename;
> +    edata->lineno = lineno;
> +    edata->funcname = funcname;
> +    edata->elevel = ERROR;        /* hide the LOG value used above */
> +
> +    /*
> +     * We skip calling backtrace and context functions, which are more likely
> +     * to cause trouble than provide useful context; they might act on the
> +     * assumption that a transaction abort is about to occur.
> +     */

This seems like a fair bit of duplicated code.


> + * This is the same as InputFunctionCall, but the caller may also pass a
> + * previously-initialized ErrorSaveContext node.  (We declare that as
> + * "void *" to avoid including miscnodes.h in fmgr.h.)

It seems way cleaner to forward declare ErrorSaveContext instead of
using void *.


> If escontext points
> + * to an ErrorSaveContext, any "safe" errors detected by the input function
> + * will be reported by filling the escontext struct.  The caller must
> + * check escontext->error_occurred before assuming that the function result
> + * is meaningful.

I wonder if we shouldn't instead make InputFunctionCallSafe() return a
boolean and return the Datum via a pointer. As callers are otherwise
going to need to do SAFE_ERROR_OCCURRED(escontext) themselves, I think
it should also lead to more concise (and slightly more efficient) code.
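
To make the two calling shapes concrete, here is a minimal standalone sketch. It is not PostgreSQL code: ToyErrorContext, toy_parse_int_flag() and toy_parse_int_bool() are hypothetical names invented for illustration, and the integer parser merely stands in for a datatype input function.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for ErrorSaveContext; not the real struct. */
typedef struct ToyErrorContext
{
    bool        error_occurred;
} ToyErrorContext;

/*
 * Shape A (as in the patch): return the value directly.  The caller must
 * inspect ctx->error_occurred before trusting the result.
 */
static int
toy_parse_int_flag(const char *s, ToyErrorContext *ctx)
{
    char       *end;
    long        v;

    errno = 0;
    v = strtol(s, &end, 10);
    if (end == s || *end != '\0' || errno == ERANGE ||
        v < INT_MIN || v > INT_MAX)
    {
        ctx->error_occurred = true;
        return 0;               /* dummy value; must be ignored */
    }
    return (int) v;
}

/*
 * Shape B (the suggestion): return success/failure and deliver the value
 * through a pointer, so the check and the call are a single expression.
 */
static bool
toy_parse_int_bool(const char *s, ToyErrorContext *ctx, int *result)
{
    ctx->error_occurred = false;
    *result = toy_parse_int_flag(s, ctx);
    return !ctx->error_occurred;
}
```

With shape B a call site reads `if (!toy_parse_int_bool(str, &ctx, &val)) ...`, whereas shape A needs a separate test of the flag after the call.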


> +Datum
> +InputFunctionCallSafe(FmgrInfo *flinfo, char *str,
> +                      Oid typioparam, int32 typmod,
> +                      void *escontext)

Is there a reason not to provide this infrastructure for
ReceiveFunctionCall() as well?


Not that I have a suggestion for a better name, but I don't particularly
like "Safe" denoting non-erroring input function calls. There are too many
interpretations of safe - e.g. safe against privilege escalation issues
or such.



> @@ -252,10 +254,13 @@ record_in(PG_FUNCTION_ARGS)
>              column_info->column_type = column_type;
>          }
>  
> -        values[i] = InputFunctionCall(&column_info->proc,
> -                                      column_data,
> -                                      column_info->typioparam,
> -                                      att->atttypmod);
> +        values[i] = InputFunctionCallSafe(&column_info->proc,
> +                                          column_data,
> +                                          column_info->typioparam,
> +                                          att->atttypmod,
> +                                          escontext);
> +        if (SAFE_ERROR_OCCURRED(escontext))
> +            PG_RETURN_NULL();

It doesn't *quite* seem right to set ->isnull in case of an error. Not
that it has an obvious harm.

Wonder if it's perhaps worth adding VALGRIND_MAKE_MEM_UNDEFINED() calls
to InputFunctionCallSafe() to more easily detect cases where a caller
ignores that an error occurred.


> +            if (safe_mode)
> +            {
> +                ErrorSaveContext *es_context = cstate->es_context;
> +
> +                /* Must reset the error_occurred flag each time */
> +                es_context->error_occurred = false;

I'd put that into the if (es_context->error_occurred) path. Likely the
window for store-forwarding issues is smaller than
InputFunctionCallSafe(), but it's trivial to write it differently...


> diff --git a/src/test/regress/sql/copy.sql b/src/test/regress/sql/copy.sql
> index 285022e07c..ff77d27cfc 100644
> --- a/src/test/regress/sql/copy.sql
> +++ b/src/test/regress/sql/copy.sql
> @@ -268,3 +268,23 @@ a    c    b
>  
>  SELECT * FROM header_copytest ORDER BY a;
>  drop table header_copytest;
> +
> +-- "safe" error handling
> +create table on_error_copytest(i int, b bool, ai int[]);
> +
> +copy on_error_copytest from stdin with (null_on_error);
> +1    a    {1,}
> +err    1    {x}
> +2    f    {3,4}
> +bad    x    {,
> +\.
> +
> +copy on_error_copytest from stdin with (warn_on_error);
> +3    0    [3:4]={3,4}
> +4    b    [0:1000]={3,4}
> +err    t    {}
> +bad    z    {"zed"}
> +\.
> +
> +select * from on_error_copytest;
> +drop table on_error_copytest;

Think it'd be good to have a test for a composite type where one of the
columns safely errors out and the other doesn't.

Greetings,

Andres Freund

Re: Error-safe user functions

From

Tom Lane

Date:

06 December 2022, 00:18:11

Andres Freund <andres@anarazel.de> writes:
> Why is context a void *?

elog.h can't depend on nodes.h, at least not without some rather
fundamental rethinking of our #include relationships.  We could
possibly use the same kind of hack that fmgr.h does:

typedef struct Node *fmNodePtr;

but I'm not sure that's much of an improvement.  Note that it'd
*not* be correct to declare it as anything more specific than Node*,
since the fmgr context pointer is Node* and we're not expecting
callers to do their own IsA checks to see what they were passed.

> I don't think we should "accept" !IsA(context, ErrorSaveContext) - that
> seems likely to hide things like use-after-free.

No, see above.  Moving the IsA checks out to the callers would
not improve the garbage-pointer risk one bit, it would just
add code bloat.
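
A minimal standalone sketch of the point being argued: the callee receives a generic pointer, checks the node tag once, and falls back to the ordinary error path for anything that is not the expected node type. The names here (ToyNodeTag, ToyErrorSave, ToyOther, toy_save_error) are hypothetical and only mimic the Node/IsA idiom; they are not the real definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node tags, mimicking the NodeTag idiom. */
typedef enum ToyNodeTag
{
    T_ToyOther = 1,
    T_ToyErrorSave = 2
} ToyNodeTag;

/* Every "node" starts with its tag, so the tag is readable via void *. */
typedef struct ToyErrorSave
{
    ToyNodeTag  type;
    bool        error_occurred;
} ToyErrorSave;

typedef struct ToyOther
{
    ToyNodeTag  type;
    int         payload;
} ToyOther;

/*
 * Returns true if the error was saved into the context.  Returns false if
 * the caller supplied no context, or some other kind of node, in which case
 * the caller is expected to take the ordinary hard-error path.
 */
static bool
toy_save_error(void *context)
{
    /* A pointer to a struct may be converted to a pointer to its first member. */
    if (context == NULL || *(const ToyNodeTag *) context != T_ToyErrorSave)
        return false;

    ((ToyErrorSave *) context)->error_occurred = true;
    return true;
}
```

Because the tag test lives in one place, callers can pass whatever context pointer they were handed without doing their own type check first.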

> I'd put a pg_unreachable() or such after the errfinish() call.

[ shrug... ]  Kinda pointless IMO, but OK.

> This seems like a fair bit of duplicated code.

I don't think refactoring to remove the duplication would improve it.

>> + * This is the same as InputFunctionCall, but the caller may also pass a
>> + * previously-initialized ErrorSaveContext node.  (We declare that as
>> + * "void *" to avoid including miscnodes.h in fmgr.h.)

> It seems way cleaner to forward declare ErrorSaveContext instead of
> using void *.

Again, it cannot be any more specific than Node*.  But you're right
that we could use fmNodePtr here, and that would be at least a little
nicer.

> I wonder if we shouldn't instead make InputFunctionCallSafe() return a
> boolean and return the Datum via a pointer. As callers are otherwise
> going to need to do SAFE_ERROR_OCCURRED(escontext) themselves, I think
> it should also lead to more concise (and slightly more efficient) code.

Hmm, maybe.  It would be a bigger change from existing code, but
I don't think very many call sites would be impacted.  (But by
the same token, we'd not save much code this way.)  Personally
I put more value on keeping similar APIs between InputFunctionCall
and InputFunctionCallSafe, but I won't argue hard if you're insistent.
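
For reference, the patch keeps each pair of entry points parallel by turning the original function into a thin wrapper that passes NULL as the context (ArrayGetNItems calling ArrayGetNItemsSafe, pg_strtoint32 calling pg_strtoint32_safe). A rough standalone sketch of that shape, using hypothetical toy_* names and exit() in place of a thrown error:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for ErrorSaveContext; not the real struct. */
typedef struct ToyErrorContext
{
    bool        error_occurred;
} ToyErrorContext;

/*
 * Soft-error variant: returns the product of the dimensions, or -1 after an
 * error.  With ctx == NULL it fails hard (here: exit), mimicking how the
 * patch's wrappers keep the old throwing behavior.
 */
static int
toy_nitems_safe(int ndim, const int *dims, ToyErrorContext *ctx)
{
    int64_t     prod = 1;

    for (int i = 0; i < ndim; i++)
    {
        if (dims[i] >= 0)
            prod *= dims[i];
        if (dims[i] < 0 || prod > INT32_MAX)
        {
            if (ctx == NULL)
            {
                fprintf(stderr, "array size exceeds the maximum allowed\n");
                exit(EXIT_FAILURE);
            }
            ctx->error_occurred = true;
            return -1;
        }
    }
    return (int) prod;
}

/* Original entry point: same signature as before, now a thin wrapper. */
static int
toy_nitems(int ndim, const int *dims)
{
    return toy_nitems_safe(ndim, dims, NULL);
}
```

Existing callers of the original function are untouched; only callers that want soft errors switch to the variant taking a context.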

> Is there a reason not to provide this infrastructure for
> ReceiveFunctionCall() as well?

There's a comment in 0003 about that: I doubt that it makes sense
to have no-error semantics for binary input.  That would require
far more trust in the receive functions' ability to detect garbage
input than I think they have in reality.  Perhaps more to the
point, even if we ultimately do that I don't want to do it now.
Including the receive functions in the first-pass conversion would
roughly double the amount of work needed per datatype, and we are
already going to be hard put to it to finish what needs to be done
for v16.

> Not that I have a suggestion for a better name, but I don't particularly
> like "Safe" denoting non-erroring input function calls. There's too many
> interpretations of safe - e.g. safe against privilege escalation issues
> or such.

Yeah, I'm not that thrilled with it either --- but it's a reasonably
on-point modifier, and short.

> It doesn't *quite* seem right to set ->isnull in case of an error. Not
> that it has an obvious harm.

Doesn't matter: if the caller pays attention to either the Datum
value or the isnull flag, it's broken.

> Wonder if it's perhaps worth to add VALGRIND_MAKE_MEM_UNDEFINED() calls
> to InputFunctionCallSafe() to more easily detect cases where a caller
> ignores that an error occurred.

I do not think there are going to be enough callers of
InputFunctionCallSafe that we need such tactics to validate them.

> I'd put that into the if (es_context->error_occurred) path. Likely the
> window for store-forwarding issues is smaller than
> InputFunctionCallSafe(), but it's trivial to write it differently...

Does not seem better to me, and your argument for it seems like the
worst sort of premature micro-optimization.

> Think it'd be good to have a test for a composite type where one of the
> columns safely errors out and the other doesn't.

I wasn't trying all that hard on the error tests, because I think
0003 is just throwaway code at this point.  If we want to seriously
check the input functions' behavior then we need to factorize the
tests so it can be done per-datatype, not in one central place in
the COPY tests.  For the core types it could make sense to provide
some function in pg_regress.c that allows access to the non-exception
code path independently of COPY; but I'm not sure how contrib
datatypes could use that.

In any case, I'm unconvinced that testing each error exit both ways is
likely to be a profitable use of test cycles.  The far more likely source
of problems with this patch series is going to be that we miss converting
some ereport call that is reachable with bad input.  No amount of
testing is going to prove that that didn't happen.

            regards, tom lane

Re: Error-safe user functions

From

Andres Freund

Date:

06 December 2022, 00:56:07

Hi,

On 2022-12-05 19:18:11 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > Why is context a void *?
>
> elog.h can't depend on nodes.h, at least not without some rather
> fundamental rethinking of our #include relationships.  We could
> possibly use the same kind of hack that fmgr.h does:
>
> typedef struct Node *fmNodePtr;
>
> but I'm not sure that's much of an improvement.  Note that it'd
> *not* be correct to declare it as anything more specific than Node*,
> since the fmgr context pointer is Node* and we're not expecting
> callers to do their own IsA checks to see what they were passed.

Ah - I hadn't actually grokked that that's the reason for the
void*. Unless I missed a comment to that regard, entirely possible, it
seems worth explaining that above errsave_start().

> > This seems like a fair bit of duplicated code.
>
> I don't think refactoring to remove the duplication would improve it.

Why? I think a populate_edata() or such seems to make sense. And the
required argument to skip ->backtrace and error_context_stack processing
seem like things that'd be good to document anyway.

> > I wonder if we shouldn't instead make InputFunctionCallSafe() return a
> > boolean and return the Datum via a pointer. As callers are otherwise
> > going to need to do SAFE_ERROR_OCCURRED(escontext) themselves, I think
> > it should also lead to more concise (and slightly more efficient) code.
>
> Hmm, maybe.  It would be a bigger change from existing code, but
> I don't think very many call sites would be impacted.  (But by
> the same token, we'd not save much code this way.)  Personally
> I put more value on keeping similar APIs between InputFunctionCall
> and InputFunctionCallSafe, but I won't argue hard if you're insistent.

I think it's good to diverge from the existing code, because imo the
behaviour is quite different and omitting the SAFE_ERROR_OCCURRED()
check will lead to brokenness.

> > Is there a reason not to provide this infrastructure for
> > ReceiveFunctionCall() as well?
>
> There's a comment in 0003 about that: I doubt that it makes sense
> to have no-error semantics for binary input.  That would require
> far more trust in the receive functions' ability to detect garbage
> input than I think they have in reality.  Perhaps more to the
> point, even if we ultimately do that I don't want to do it now.
> Including the receive functions in the first-pass conversion would
> roughly double the amount of work needed per datatype, and we are
> already going to be hard put to it to finish what needs to be done
> for v16.

Fair enough.

> > Wonder if it's perhaps worth to add VALGRIND_MAKE_MEM_UNDEFINED() calls
> > to InputFunctionCallSafe() to more easily detect cases where a caller
> > ignores that an error occurred.
>
> I do not think there are going to be enough callers of
> InputFunctionCallSafe that we need such tactics to validate them.

I predict that we'll have quite a few bugs due to converting some parts
of the system, but not other parts. But we can add them later, so I'll
not insist on it.

> > I'd put that into the if (es_context->error_occurred) path. Likely the
> > window for store-forwarding issues is smaller than
> > InputFunctionCallSafe(), but it's trivial to write it differently...
>
> Does not seem better to me, and your argument for it seems like the
> worst sort of premature micro-optimization.

Shrug. The COPY code is quite slow today, not because of any single
source, but through death by a thousand cuts.

> > Think it'd be good to have a test for a composite type where one of the
> > columns safely errors out and the other doesn't.
>
> I wasn't trying all that hard on the error tests, because I think
> 0003 is just throwaway code at this point.

I am mainly interested in having *something* test erroring out hard when
using the "Safe" mechanism, which afaict we don't have with the patches
as they stand.  You're right that it'd be better to do that without COPY
in the way, but it doesn't seem all that crucial.

> If we want to seriously check the input functions' behavior then we
> need to factorize the tests so it can be done per-datatype, not in one
> central place in the COPY tests.  For the core types it could make
> sense to provide some function in pg_regress.c that allows access to
> the non-exception code path independently of COPY; but I'm not sure
> how contrib datatypes could use that.

It might be worth adding a function for testing safe input functions
into core PG - it's not like we don't have other such functions.

But perhaps it's even worth having such a function properly exposed:
It's not at all rare to parse text data during ETL and quite often
erroring out fatally is undesirable. As savepoints are undesirable
overhead-wise, there's a lot of SQL out there that tries to do a
pre-check about whether some text could be cast to some other data
type. A function that'd try to cast input to a certain type without
erroring out hard would be quite useful for that.

Greetings,

Andres Freund

Re: Error-safe user functions

From

Tom Lane

Date:

06 December 2022, 01:06:55

Andres Freund <andres@anarazel.de> writes:
> On 2022-12-05 19:18:11 -0500, Tom Lane wrote:
>> but I'm not sure that's much of an improvement.  Note that it'd
>> *not* be correct to declare it as anything more specific than Node*,
>> since the fmgr context pointer is Node* and we're not expecting
>> callers to do their own IsA checks to see what they were passed.

> Ah - I hadn't actually grokked that that's the reason for the
> void*. Unless I missed a comment to that regard, entirely possible, it
> seems worth explaining that above errsave_start().

There's a comment about that in elog.h IIRC, but no harm in saying
it in elog.c as well.

Having said that, I am warming a little bit to making these pointers
be Node* or an alias spelling of that rather than void*.

>> I don't think refactoring to remove the duplication would improve it.

> Why? I think a populate_edata() or such seems to make sense. And the
> required argument to skip ->backtrace and error_context_stack processing
> seem like things that'd be good to document anyway.

Meh.  Well, I'll have a look, but it seems kind of orthogonal to the
main point of the patch.

>> Hmm, maybe.  It would be a bigger change from existing code, but
>> I don't think very many call sites would be impacted.  (But by
>> the same token, we'd not save much code this way.)  Personally
>> I put more value on keeping similar APIs between InputFunctionCall
>> and InputFunctionCallSafe, but I won't argue hard if you're insistent.

> I think it's good to diverge from the existing code, because imo the
> behaviour is quite different and omitting the SAFE_ERROR_OCCURRED()
> check will lead to brokenness.

True, but it only helps for the immediate caller of InputFunctionCallSafe,
not for call levels further out.  Still, I'll give that a look.

>> I wasn't trying all that hard on the error tests, because I think
>> 0003 is just throwaway code at this point.

> I am mainly interested in having *something* test erroring out hard when
> using the "Safe" mechanism, which afaict we don't have with the patches
> as they stand.  You're right that it'd be better to do that without COPY
> in the way, but it doesn't seem all that crucial.

Hmm, either I'm confused or you're stating that backwards --- aren't
the hard-error code paths already tested by our existing tests?

> But perhaps it's even worth having such a function properly exposed:
> It's not at all rare to parse text data during ETL and quite often
> erroring out fatally is undesirable. As savepoints are undesirable
> overhead-wise, there's a lot of SQL out there that tries to do a
> pre-check about whether some text could be cast to some other data
> type. A function that'd try to cast input to a certain type without
> erroring out hard would be quite useful for that.

Corey and Vik are already talking about a non-error CAST variant.
Maybe we should leave this in abeyance until something shows up
for that?  Otherwise we'll be making a nonstandard API for what
will probably ultimately be SQL-spec functionality.  I don't mind
that as regression-test infrastructure, but I'm a bit less excited
about exposing it as a user feature.

            regards, tom lane

Re: Error-safe user functions

From

Andres Freund

Date:

06 December 2022, 01:14:04

Hi,

On 2022-12-05 20:06:55 -0500, Tom Lane wrote:
> >> I wasn't trying all that hard on the error tests, because I think
> >> 0003 is just throwaway code at this point.
> 
> > I am mainly interested in having *something* test erroring out hard when
> > using the "Safe" mechanism, which afaict we don't have with the patches
> > as they stand.  You're right that it'd be better to do that without COPY
> > in the way, but it doesn't seem all that crucial.
> 
> Hmm, either I'm confused or you're stating that backwards --- aren't
> the hard-error code paths already tested by our existing tests?

What I'd like to test is a hard error, either due to an input function
that wasn't converted or because it's a type of error that can't be
handled "softly", but when using the "safe" interface.


> > But perhaps it's even worth having such a function properly exposed:
> > It's not at all rare to parse text data during ETL and quite often
> > erroring out fatally is undesirable. As savepoints are undesirable
> > overhead-wise, there's a lot of SQL out there that tries to do a
> > pre-check about whether some text could be cast to some other data
> > type. A function that'd try to cast input to a certain type without
> > erroring out hard would be quite useful for that.
> 
> Corey and Vik are already talking about a non-error CAST variant.
> Maybe we should leave this in abeyance until something shows up
> for that?  Otherwise we'll be making a nonstandard API for what
> will probably ultimately be SQL-spec functionality.  I don't mind
> that as regression-test infrastructure, but I'm a bit less excited
> about exposing it as a user feature.

Yea, I'm fine with that. I was just thinking out loud on this aspect.

Greetings,

Andres Freund

Re: Error-safe user functions

From

Tom Lane

Date:

06 December 2022, 01:19:26

Andres Freund <andres@anarazel.de> writes:
> On 2022-12-05 20:06:55 -0500, Tom Lane wrote:
>> Hmm, either I'm confused or you're stating that backwards --- aren't
>> the hard-error code paths already tested by our existing tests?

> What I'd like to test is a hard error, either due to an input function
> that wasn't converted or because it's a type of error that can't be
> handled "softly", but when using the "safe" interface.

Oh, I see.  That seems like kind of a problematic requirement,
unless we leave some datatype around that's intentionally not
ever going to be converted.  For datatypes that we do convert,
there shouldn't be any easy way to get to a hard error.

I don't really quite understand why you're worried about that
though.  The hard-error code paths are well tested already.

            regards, tom lane

Re: Error-safe user functions

From

Andres Freund

Date:

06 December 2022, 02:23:23

Hi,

On 2022-12-05 20:19:26 -0500, Tom Lane wrote:
> That seems like kind of a problematic requirement, unless we leave some
> datatype around that's intentionally not ever going to be converted.  For
> datatypes that we do convert, there shouldn't be any easy way to get to a
> hard error.

I suspect there are going to be types we can't convert. But even if not - that
actually makes a *stronger* case for ensuring the path is tested, because
certainly some out of core types aren't going to be converted.

This made me look at fmgr/README again:

> +Considering datatype input functions as examples, typical "safe" error
> +conditions include input syntax errors and out-of-range values.  An input
> +function typically detects such cases with simple if-tests and can easily
> +change the following ereport call to errsave.  Error conditions that
> +should NOT be handled this way include out-of-memory, internal errors, and
> +anything where there is any question about our ability to continue normal
> +processing of the transaction.  Those should still be thrown with ereport.
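
The split described in that README text can be sketched in plain C. This is
only a toy model, not the patch's code: demo_throw(), demo_errsave(),
demo_digit_in() and DemoErrorContext are hypothetical names, and a bare
setjmp/longjmp pair stands in for the real error-unwinding machinery.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the place a thrown error unwinds to (hypothetical). */
static jmp_buf demo_exception_stack;

/* Hypothetical stand-in for ErrorSaveContext. */
typedef struct DemoErrorContext
{
    bool        error_occurred;
    const char *message;
} DemoErrorContext;

/* Hard error: always unwinds, like a thrown ERROR.  Does not return. */
static void
demo_throw(const char *msg)
{
    (void) msg;
    longjmp(demo_exception_stack, 1);
}

/* Soft error: saved if the caller supplied a context, thrown otherwise. */
static void
demo_errsave(DemoErrorContext *escontext, const char *msg)
{
    if (escontext == NULL)
        demo_throw(msg);        /* no context: behave like a hard error */
    assert(escontext != NULL);  /* only reached when a context exists */
    escontext->error_occurred = true;
    escontext->message = msg;
}

/* Toy input function: accepts exactly one decimal digit. */
static int
demo_digit_in(const char *str, DemoErrorContext *escontext)
{
    /* Not a data problem, so it stays a hard error. */
    if (str == NULL)
        demo_throw("internal error: null input string");

    /* Bad input syntax is the typical "safe" case: report it softly. */
    if (str[0] < '0' || str[0] > '9' || str[1] != '\0')
    {
        demo_errsave(escontext, "invalid input syntax");
        return 0;               /* dummy value; caller must check the flag */
    }
    return str[0] - '0';
}
```

Calling demo_digit_in("x", &ctx) sets ctx.error_occurred and returns
normally, while demo_digit_in("x", NULL) unwinds to wherever
setjmp(demo_exception_stack) was last called.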

I wonder if we should provide more guidance around what kind of catalog
access is acceptable before avoiding throwing an error.

This in turn made me look at record_in() in 0002 - I think we might be leaking
a tupledesc refcount in case of errors. Yup:

DROP TABLE IF EXISTS tbl_as_record, tbl_with_record;

CREATE TABLE tbl_as_record(a int, b int);
CREATE TABLE tbl_with_record(composite_col tbl_as_record, non_composite_col int);

COPY tbl_with_record FROM stdin WITH (warn_on_error);
kdjkdf    212
\.

WARNING:  22P02: invalid input for column composite_col: malformed record literal: "kdjkdf"
WARNING:  01000: TupleDesc reference leak: TupleDesc 0x7fb1c5fd0c58 (159584,-1) still referenced
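
The general shape of that kind of leak can be sketched as follows. This is a
toy model under stated assumptions, not record_in() itself: DemoTupleDesc,
DemoErrorContext and demo_record_in() are hypothetical names, and a bare
counter stands in for the tupledesc reference count. The point it tries to
show is that a soft-error return is an ordinary return, so (as far as I
understand it) nothing like abort-time cleanup runs and the function has to
drop its own pin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted tuple descriptor. */
typedef struct DemoTupleDesc
{
    int         refcount;
} DemoTupleDesc;

/* Hypothetical stand-in for ErrorSaveContext. */
typedef struct DemoErrorContext
{
    bool        error_occurred;
} DemoErrorContext;

/* Toy record parser: only checks that the literal starts with '('. */
static bool
demo_record_in(const char *str, DemoTupleDesc *tupdesc,
               DemoErrorContext *escontext)
{
    bool        ok;

    assert(str != NULL && tupdesc != NULL && escontext != NULL);

    tupdesc->refcount++;        /* pin the descriptor while parsing */

    ok = (str[0] == '(');
    if (!ok)
        escontext->error_occurred = true;   /* soft error: just set the flag */

    /*
     * Release the pin on every exit path.  Skipping this on the soft-error
     * path is the kind of oversight that leaves a reference behind.
     */
    tupdesc->refcount--;

    return ok;
}
```

After demo_record_in("kdjkdf", &td, &ctx) returns false, td.refcount is back
to its starting value, which is the property the leak report above shows
being violated.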

> I don't really quite understand why you're worried about that
> though.  The hard-error code paths are well tested already.

Afaict they're not tested when going through InputFunctionCallSafe() / with an
ErrorSaveContext. To me that does seem worth testing.

Greetings,

Andres Freund

Re: Error-safe user functions

From

Tom Lane

Date:

06 December 2022, 02:32:23

Andres Freund <andres@anarazel.de> writes:
> This in turn made me look at record_in() in 0002 - I think we might be leaking
> a tupledesc refcount in case of errors. Yup:

Doh :-( ... I did that function a little too hastily, obviously.
Thanks for catching that.

            regards, tom lane

Re: Error-safe user functions

From

Andrew Dunstan

Date:

06 December 2022, 11:46:10

On 2022-12-05 Mo 20:06, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
>
>> But perhaps it's even worth having such a function properly exposed:
>> It's not at all rare to parse text data during ETL and quite often
>> erroring out fatally is undesirable. As savepoints are undesirable
>> overhead-wise, there's a lot of SQL out there that tries to do a
>> pre-check about whether some text could be cast to some other data
>> type. A function that'd try to cast input to a certain type without
>> erroring out hard would be quite useful for that.
> Corey and Vik are already talking about a non-error CAST variant.


/metoo! :-)


> Maybe we should leave this in abeyance until something shows up
> for that?  Otherwise we'll be making a nonstandard API for what
> will probably ultimately be SQL-spec functionality.  I don't mind
> that as regression-test infrastructure, but I'm a bit less excited
> about exposing it as a user feature.
>             


I think a functional mechanism could be very useful. Who knows when the
standard might specify something in this area?


cheers


andrew

-- 

Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Error-safe user functions

From

Tom Lane

Date:

06 December 2022, 14:42:17

[ continuing the naming quagmire... ]

I wrote:
> Andres Freund <andres@anarazel.de> writes:
>> Not that I have a suggestion for a better name, but I don't particularly
>> like "Safe" denoting non-erroring input function calls. There's too many
>> interpretations of safe - e.g. safe against privilege escalation issues
>> or such.

> Yeah, I'm not that thrilled with it either --- but it's a reasonably
> on-point modifier, and short.

It occurs to me that another spelling could be NoError (or _noerror
where not using camel case).  There's some precedent for that already;
and where we have it, it has the same implication of reporting rather
than throwing certain errors, without making a guarantee about all
errors.  For instance lookup_rowtype_tupdesc_noerror won't prevent
throwing errors if catalog corruption is detected inside the catcaches.

I'm not sure this is any *better* than Safe ... it's longer, less
mellifluous, and still subject to misinterpretation.  But it's
a possible alternative.

            regards, tom lane

Re: Error-safe user functions

From

Andrew Dunstan

Date:

06 December 2022, 15:43:03

On 2022-12-06 Tu 09:42, Tom Lane wrote:
> [ continuing the naming quagmire... ]
>
> I wrote:
>> Andres Freund <andres@anarazel.de> writes:
>>> Not that I have a suggestion for a better name, but I don't particularly
>>> like "Safe" denoting non-erroring input function calls. There's too many
>>> interpretations of safe - e.g. safe against privilege escalation issues
>>> or such.
>> Yeah, I'm not that thrilled with it either --- but it's a reasonably
>> on-point modifier, and short.
> It occurs to me that another spelling could be NoError (or _noerror
> where not using camel case).  There's some precedent for that already;
> and where we have it, it has the same implication of reporting rather
> than throwing certain errors, without making a guarantee about all
> errors.  For instance lookup_rowtype_tupdesc_noerror won't prevent
> throwing errors if catalog corruption is detected inside the catcaches.
>
> I'm not sure this is any *better* than Safe ... it's longer, less
> mellifluous, and still subject to misinterpretation.  But it's
> a possible alternative.
>
>             


Yeah, I don't think there's terribly much to choose between 'safe' and
'noerror' in terms of meaning.

I originally chose InputFunctionCallContext as a more neutral name in
case we wanted to be able to pass some other sort of node for the
context in future.

Maybe that was a little too forward looking.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Error-safe user functions

From

Tom Lane

Date:

06 December 2022, 16:07:01

Andrew Dunstan <andrew@dunslane.net> writes:
> On 2022-12-06 Tu 09:42, Tom Lane wrote:
>> I'm not sure this is any *better* than Safe ... it's longer, less
>> mellifluous, and still subject to misinterpretation.  But it's
>> a possible alternative.

> Yeah, I don't think there's terribly much to choose between 'safe' and
> 'noerror' in terms of meaning.

Yeah, I just wanted to throw it out there and see if anyone thought
it was a better idea.

> I originally chose InputFunctionCallContext as a more neutral name in
> case we wanted to be able to pass some other sort of node for the
> context in future.
> Maybe that was a little too forward looking.

I didn't like that because it seemed to convey nothing at all about
the expected behavior.

            regards, tom lane

Re: Error-safe user functions

From

Robert Haas

Date:

06 December 2022, 17:10:09

On Tue, Dec 6, 2022 at 11:07 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > I originally chose InputFunctionCallContext as a more neutral name in
> > case we wanted to be able to pass some other sort of node for the
> > context in future.
> > Maybe that was a little too forward looking.
>
> I didn't like that because it seemed to convey nothing at all about
> the expected behavior.

I feel like this can go either way. If we pick a name that conveys a
specific intended behavior now, and then later we want to pass some
other sort of node for some purpose other than ignoring errors, it's
unpleasant to have a name that sounds like it can only ignore errors.
But if we never use it for anything other than ignoring errors, a
specific name is clearer.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Error-safe user functions

From

Corey Huinker

Date:

06 December 2022, 18:16:59

On Tue, Dec 6, 2022 at 6:46 AM Andrew Dunstan <andrew@dunslane.net> wrote:

> On 2022-12-05 Mo 20:06, Tom Lane wrote:
> > Andres Freund <andres@anarazel.de> writes:
> >
> >> But perhaps it's even worth having such a function properly exposed:
> >> It's not at all rare to parse text data during ETL and quite often
> >> erroring out fatally is undesirable. As savepoints are undesirable
> >> overhead-wise, there's a lot of SQL out there that tries to do a
> >> pre-check about whether some text could be cast to some other data
> >> type. A function that'd try to cast input to a certain type without
> >> erroring out hard would be quite useful for that.
> > Corey and Vik are already talking about a non-error CAST variant.
>
> /metoo! :-)
>
> > Maybe we should leave this in abeyance until something shows up
> > for that? Otherwise we'll be making a nonstandard API for what
> > will probably ultimately be SQL-spec functionality. I don't mind
> > that as regression-test infrastructure, but I'm a bit less excited
> > about exposing it as a user feature.
> >
>
> I think a functional mechanism could be very useful. Who knows when the
> standard might specify something in this area?

Vik's working on the standard (he put the spec in earlier in this thread). I'm working on implementing it on top of Tom's work, but I'm one patchset behind at the moment.

Once completed, it should be leverage-able in several places, COPY being the most obvious.

What started all this was me noticing that if I implemented TRY_CAST in pl/pgsql with an exception block, then I wasn't able to use parallel query.

Re: Error-safe user functions

From

Tom Lane

Date:

06 December 2022, 20:21:09

OK, here's a v3 responding to the comments from Andres.

0000 is preliminary refactoring of elog.c, with (I trust) no
functional effect.  It gets rid of some pre-existing code duplication
as well as setting up to let 0001's additions be less duplicative.

0001 adopts use of Node pointers in place of "void *".  To do this
I needed an alias type in elog.h equivalent to fmgr.h's fmNodePtr.
I decided that having two different aliases would be too confusing,
so what I did here was to converge both elog.h and fmgr.h on using
the same alias "typedef struct Node *NodePtr".  That has to be in
elog.h since it's included first, from postgres.h.  (I thought of
defining NodePtr in postgres.h, but postgres.h includes elog.h
immediately so that wouldn't have looked very nice.)
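
The alias trick can be illustrated in isolation. This is a minimal sketch,
assuming made-up names throughout (DemoNodeTag, DemoErrorSaveContext,
demo_save_error); it is not the patch's code, and it embeds a whole struct
Node as the first member where the real node structs start with a tag field.
It shows that a header needs only an incomplete struct declaration to name
the pointer type, and that the callee, not the caller, checks what kind of
node it was handed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* "elog.h" side: an incomplete type plus an alias is all the header needs. */
struct Node;
typedef struct Node *NodePtr;

/* "nodes.h" side: the full definition, visible only where it is needed. */
typedef enum DemoNodeTag
{
    T_DemoInvalid = 0,
    T_DemoErrorSaveContext,
    T_DemoOther
} DemoNodeTag;

struct Node
{
    DemoNodeTag type;
};

/* Hypothetical context node; the Node header must be the first member. */
typedef struct DemoErrorSaveContext
{
    struct Node node;
    bool        error_occurred;
} DemoErrorSaveContext;

/*
 * Callee-side check: accept any node pointer (or NULL) and save the error
 * only if it really is our context type.  Returns false when the error
 * cannot be saved, which is where real code would throw instead.
 */
static bool
demo_save_error(NodePtr context)
{
    if (context != NULL && context->type == T_DemoErrorSaveContext)
    {
        /* The cast below relies on node being at offset zero. */
        assert(offsetof(DemoErrorSaveContext, node) == 0);
        ((DemoErrorSaveContext *) context)->error_occurred = true;
        return true;
    }
    return false;
}
```

Passing &es.node for a DemoErrorSaveContext es records the error; passing
NULL or some other node type returns false without touching anything, so
callers never need their own type check.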

I also adopted Andres' recommendation that InputFunctionCallSafe
return boolean.  I'm still not totally sold on that ... but it does
end with array_in and record_in never using SAFE_ERROR_OCCURRED at
all, so maybe the idea's OK.
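
The boolean-returning convention can be sketched like this. It is a toy
model, not the signature from the patch: demo_int_in(),
demo_input_call_safe() and DemoErrorContext are hypothetical names, and Datum
is just a local typedef here. The caller gets success or failure from the
return value and never has to inspect the error context itself.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

typedef uintptr_t Datum;        /* local stand-in, not the real definition */

/* Hypothetical stand-in for ErrorSaveContext: only the flag we need. */
typedef struct DemoErrorContext
{
    bool        error_occurred;
} DemoErrorContext;

/* Toy input function: parses a decimal int, reporting bad input softly. */
static Datum
demo_int_in(const char *str, DemoErrorContext *escontext)
{
    char       *end;
    long        val;

    errno = 0;
    val = strtol(str, &end, 10);
    if (end == str || *end != '\0' || errno == ERANGE ||
        val < INT_MIN || val > INT_MAX)
    {
        escontext->error_occurred = true;
        return (Datum) 0;       /* dummy value; must not be used */
    }
    return (Datum) (intptr_t) val;
}

/* Wrapper: true on success, with the result stored through *result. */
static bool
demo_input_call_safe(const char *str, DemoErrorContext *escontext,
                     Datum *result)
{
    assert(str != NULL && escontext != NULL && result != NULL);
    *result = demo_int_in(str, escontext);
    return !escontext->error_occurred;
}
```

A caller then writes "if (!demo_input_call_safe(str, &ctx, &value)) return
false;", which is hard to get wrong in the way that forgetting a separate
error-flag check would be.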

0002 adjusts the I/O functions for these API changes, and fixes
my silly oversight about error cleanup in record_in.

Given the discussion about testing requirements, I threw away the
COPY hack entirely.  This 0003 provides a couple of SQL-callable
functions that can be used to invoke a specific datatype's input
function.  I haven't documented them, pending bikeshedding on
names etc.  I also arranged to test array_in and record_in with
a datatype that still throws errors, reserving the existing test
type "widget" for that purpose.

(I'm not intending to foreclose development of new COPY features
in this area, just abandoning the idea that that's our initial
test mechanism.)

Thoughts?

            regards, tom lane

diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 2585e24845..f5cd1b7493 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -176,8 +176,15 @@ static char formatted_log_time[FORMATTED_TS_LEN];


 static const char *err_gettext(const char *str) pg_attribute_format_arg(1);
+static ErrorData *get_error_stack_entry(void);
+static void set_stack_entry_domain(ErrorData *edata, const char *domain);
+static void set_stack_entry_location(ErrorData *edata,
+                                     const char *filename, int lineno,
+                                     const char *funcname);
+static bool matches_backtrace_functions(const char *funcname);
 static pg_noinline void set_backtrace(ErrorData *edata, int num_skip);
 static void set_errdata_field(MemoryContextData *cxt, char **ptr, const char *str);
+static void FreeErrorDataContents(ErrorData *edata);
 static void write_console(const char *line, int len);
 static const char *process_log_prefix_padding(const char *p, int *ppadding);
 static void log_line_prefix(StringInfo buf, ErrorData *edata);
@@ -434,27 +441,13 @@ errstart(int elevel, const char *domain)
             debug_query_string = NULL;
         }
     }
-    if (++errordata_stack_depth >= ERRORDATA_STACK_SIZE)
-    {
-        /*
-         * Wups, stack not big enough.  We treat this as a PANIC condition
-         * because it suggests an infinite loop of errors during error
-         * recovery.
-         */
-        errordata_stack_depth = -1; /* make room on stack */
-        ereport(PANIC, (errmsg_internal("ERRORDATA_STACK_SIZE exceeded")));
-    }

     /* Initialize data for this error frame */
-    edata = &errordata[errordata_stack_depth];
-    MemSet(edata, 0, sizeof(ErrorData));
+    edata = get_error_stack_entry();
     edata->elevel = elevel;
     edata->output_to_server = output_to_server;
     edata->output_to_client = output_to_client;
-    /* the default text domain is the backend's */
-    edata->domain = domain ? domain : PG_TEXTDOMAIN("postgres");
-    /* initialize context_domain the same way (see set_errcontext_domain()) */
-    edata->context_domain = edata->domain;
+    set_stack_entry_domain(edata, domain);
     /* Select default errcode based on elevel */
     if (elevel >= ERROR)
         edata->sqlerrcode = ERRCODE_INTERNAL_ERROR;
@@ -462,8 +455,6 @@ errstart(int elevel, const char *domain)
         edata->sqlerrcode = ERRCODE_WARNING;
     else
         edata->sqlerrcode = ERRCODE_SUCCESSFUL_COMPLETION;
-    /* errno is saved here so that error parameter eval can't change it */
-    edata->saved_errno = errno;

     /*
      * Any allocations for this error state level should go into ErrorContext
@@ -474,32 +465,6 @@ errstart(int elevel, const char *domain)
     return true;
 }

-/*
- * Checks whether the given funcname matches backtrace_functions; see
- * check_backtrace_functions.
- */
-static bool
-matches_backtrace_functions(const char *funcname)
-{
-    char       *p;
-
-    if (!backtrace_symbol_list || funcname == NULL || funcname[0] == '\0')
-        return false;
-
-    p = backtrace_symbol_list;
-    for (;;)
-    {
-        if (*p == '\0')            /* end of backtrace_symbol_list */
-            break;
-
-        if (strcmp(funcname, p) == 0)
-            return true;
-        p += strlen(p) + 1;
-    }
-
-    return false;
-}
-
 /*
  * errfinish --- end an error-reporting cycle
  *
@@ -520,23 +485,7 @@ errfinish(const char *filename, int lineno, const char *funcname)
     CHECK_STACK_DEPTH();

     /* Save the last few bits of error state into the stack entry */
-    if (filename)
-    {
-        const char *slash;
-
-        /* keep only base name, useful especially for vpath builds */
-        slash = strrchr(filename, '/');
-        if (slash)
-            filename = slash + 1;
-        /* Some Windows compilers use backslashes in __FILE__ strings */
-        slash = strrchr(filename, '\\');
-        if (slash)
-            filename = slash + 1;
-    }
-
-    edata->filename = filename;
-    edata->lineno = lineno;
-    edata->funcname = funcname;
+    set_stack_entry_location(edata, filename, lineno, funcname);

     elevel = edata->elevel;

@@ -546,6 +495,7 @@ errfinish(const char *filename, int lineno, const char *funcname)
      */
     oldcontext = MemoryContextSwitchTo(ErrorContext);

+    /* Collect backtrace, if enabled and we didn't already */
     if (!edata->backtrace &&
         edata->funcname &&
         backtrace_functions &&
@@ -596,31 +546,7 @@ errfinish(const char *filename, int lineno, const char *funcname)
     EmitErrorReport();

     /* Now free up subsidiary data attached to stack entry, and release it */
-    if (edata->message)
-        pfree(edata->message);
-    if (edata->detail)
-        pfree(edata->detail);
-    if (edata->detail_log)
-        pfree(edata->detail_log);
-    if (edata->hint)
-        pfree(edata->hint);
-    if (edata->context)
-        pfree(edata->context);
-    if (edata->backtrace)
-        pfree(edata->backtrace);
-    if (edata->schema_name)
-        pfree(edata->schema_name);
-    if (edata->table_name)
-        pfree(edata->table_name);
-    if (edata->column_name)
-        pfree(edata->column_name);
-    if (edata->datatype_name)
-        pfree(edata->datatype_name);
-    if (edata->constraint_name)
-        pfree(edata->constraint_name);
-    if (edata->internalquery)
-        pfree(edata->internalquery);
-
+    FreeErrorDataContents(edata);
     errordata_stack_depth--;

     /* Exit error-handling context */
@@ -685,6 +611,120 @@ errfinish(const char *filename, int lineno, const char *funcname)
     CHECK_FOR_INTERRUPTS();
 }

+/*
+ * get_error_stack_entry --- allocate and initialize a new stack entry
+ *
+ * The entry should be freed, when we're done with it, by calling
+ * FreeErrorDataContents() and then decrementing errordata_stack_depth.
+ *
+ * Returning the entry's address is just a notational convenience,
+ * since it had better be errordata[errordata_stack_depth].
+ *
+ * Although the error stack is not large, we don't expect to run out of space.
+ * Using more than one entry implies a new error report during error recovery,
+ * which is possible but already suggests we're in trouble.  If we exhaust the
+ * stack, almost certainly we are in an infinite loop of errors during error
+ * recovery, so we give up and PANIC.
+ *
+ * (Note that this is distinct from the recursion_depth checks, which
+ * guard against recursion while handling a single stack entry.)
+ */
+static ErrorData *
+get_error_stack_entry(void)
+{
+    ErrorData  *edata;
+
+    /* Allocate error frame */
+    errordata_stack_depth++;
+    if (unlikely(errordata_stack_depth >= ERRORDATA_STACK_SIZE))
+    {
+        /* Wups, stack not big enough */
+        errordata_stack_depth = -1; /* make room on stack */
+        ereport(PANIC, (errmsg_internal("ERRORDATA_STACK_SIZE exceeded")));
+    }
+
+    /* Initialize error frame to all zeroes/NULLs */
+    edata = &errordata[errordata_stack_depth];
+    memset(edata, 0, sizeof(ErrorData));
+
+    /* Save errno immediately to ensure error parameter eval can't change it */
+    edata->saved_errno = errno;
+
+    return edata;
+}
+
+/*
+ * set_stack_entry_domain --- fill in the internationalization domain
+ */
+static void
+set_stack_entry_domain(ErrorData *edata, const char *domain)
+{
+    /* the default text domain is the backend's */
+    edata->domain = domain ? domain : PG_TEXTDOMAIN("postgres");
+    /* initialize context_domain the same way (see set_errcontext_domain()) */
+    edata->context_domain = edata->domain;
+}
+
+/*
+ * set_stack_entry_location --- fill in code-location details
+ *
+ * Store the values of __FILE__, __LINE__, and __func__ from the call site.
+ * We make an effort to normalize __FILE__, since compilers are inconsistent
+ * about how much of the path they'll include, and we'd prefer that the
+ * behavior not depend on that (especially, that it not vary with build path).
+ */
+static void
+set_stack_entry_location(ErrorData *edata,
+                         const char *filename, int lineno,
+                         const char *funcname)
+{
+    if (filename)
+    {
+        const char *slash;
+
+        /* keep only base name, useful especially for vpath builds */
+        slash = strrchr(filename, '/');
+        if (slash)
+            filename = slash + 1;
+        /* Some Windows compilers use backslashes in __FILE__ strings */
+        slash = strrchr(filename, '\\');
+        if (slash)
+            filename = slash + 1;
+    }
+
+    edata->filename = filename;
+    edata->lineno = lineno;
+    edata->funcname = funcname;
+}
+
+/*
+ * matches_backtrace_functions --- checks whether the given funcname matches
+ * backtrace_functions
+ *
+ * See check_backtrace_functions.
+ */
+static bool
+matches_backtrace_functions(const char *funcname)
+{
+    const char *p;
+
+    if (!backtrace_symbol_list || funcname == NULL || funcname[0] == '\0')
+        return false;
+
+    p = backtrace_symbol_list;
+    for (;;)
+    {
+        if (*p == '\0')            /* end of backtrace_symbol_list */
+            break;
+
+        if (strcmp(funcname, p) == 0)
+            return true;
+        p += strlen(p) + 1;
+    }
+
+    return false;
+}
+

 /*
  * errcode --- add SQLSTATE error code to the current error
@@ -1611,6 +1651,18 @@ CopyErrorData(void)
  */
 void
 FreeErrorData(ErrorData *edata)
+{
+    FreeErrorDataContents(edata);
+    pfree(edata);
+}
+
+/*
+ * FreeErrorDataContents --- free the subsidiary data of an ErrorData.
+ *
+ * This can be used on either an error stack entry or a copied ErrorData.
+ */
+static void
+FreeErrorDataContents(ErrorData *edata)
 {
     if (edata->message)
         pfree(edata->message);
@@ -1636,7 +1688,6 @@ FreeErrorData(ErrorData *edata)
         pfree(edata->constraint_name);
     if (edata->internalquery)
         pfree(edata->internalquery);
-    pfree(edata);
 }

 /*
@@ -1742,18 +1793,7 @@ ReThrowError(ErrorData *edata)
     recursion_depth++;
     MemoryContextSwitchTo(ErrorContext);

-    if (++errordata_stack_depth >= ERRORDATA_STACK_SIZE)
-    {
-        /*
-         * Wups, stack not big enough.  We treat this as a PANIC condition
-         * because it suggests an infinite loop of errors during error
-         * recovery.
-         */
-        errordata_stack_depth = -1; /* make room on stack */
-        ereport(PANIC, (errmsg_internal("ERRORDATA_STACK_SIZE exceeded")));
-    }
-
-    newedata = &errordata[errordata_stack_depth];
+    newedata = get_error_stack_entry();
     memcpy(newedata, edata, sizeof(ErrorData));

     /* Make copies of separately-allocated fields */
@@ -1854,26 +1894,11 @@ GetErrorContextStack(void)
     ErrorContextCallback *econtext;

     /*
-     * Okay, crank up a stack entry to store the info in.
+     * Crank up a stack entry to store the info in.
      */
     recursion_depth++;

-    if (++errordata_stack_depth >= ERRORDATA_STACK_SIZE)
-    {
-        /*
-         * Wups, stack not big enough.  We treat this as a PANIC condition
-         * because it suggests an infinite loop of errors during error
-         * recovery.
-         */
-        errordata_stack_depth = -1; /* make room on stack */
-        ereport(PANIC, (errmsg_internal("ERRORDATA_STACK_SIZE exceeded")));
-    }
-
-    /*
-     * Things look good so far, so initialize our error frame
-     */
-    edata = &errordata[errordata_stack_depth];
-    MemSet(edata, 0, sizeof(ErrorData));
+    edata = get_error_stack_entry();

     /*
      * Set up assoc_context to be the caller's context, so any allocations
diff --git a/doc/src/sgml/ref/create_type.sgml b/doc/src/sgml/ref/create_type.sgml
index 693423e524..4ea0284ca4 100644
--- a/doc/src/sgml/ref/create_type.sgml
+++ b/doc/src/sgml/ref/create_type.sgml
@@ -900,6 +900,17 @@ CREATE TYPE <replaceable class="parameter">name</replaceable>
    function is written in C.
   </para>

+  <para>
+   In <productname>PostgreSQL</productname> version 16 and later, it is
+   desirable for base types' input functions to return <quote>safe</quote>
+   errors using the
+   new <function>errsave()</function>/<function>ereturn()</function>
+   mechanism, rather than throwing <function>ereport()</function>
+   exceptions as in previous versions.
+   See <filename>src/backend/utils/fmgr/README</filename> for more
+   information.
+  </para>
+
  </refsect1>

  <refsect1>
diff --git a/src/backend/nodes/Makefile b/src/backend/nodes/Makefile
index 4368c30fdb..7c594be583 100644
--- a/src/backend/nodes/Makefile
+++ b/src/backend/nodes/Makefile
@@ -56,6 +56,7 @@ node_headers = \
     nodes/bitmapset.h \
     nodes/extensible.h \
     nodes/lockoptions.h \
+    nodes/miscnodes.h \
     nodes/replnodes.h \
     nodes/supportnodes.h \
     nodes/value.h \
diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl
index 7212bc486f..08992dfd47 100644
--- a/src/backend/nodes/gen_node_support.pl
+++ b/src/backend/nodes/gen_node_support.pl
@@ -68,6 +68,7 @@ my @all_input_files = qw(
   nodes/bitmapset.h
   nodes/extensible.h
   nodes/lockoptions.h
+  nodes/miscnodes.h
   nodes/replnodes.h
   nodes/supportnodes.h
   nodes/value.h
@@ -89,6 +90,7 @@ my @nodetag_only_files = qw(
   executor/tuptable.h
   foreign/fdwapi.h
   nodes/lockoptions.h
+  nodes/miscnodes.h
   nodes/replnodes.h
   nodes/supportnodes.h
 );
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index f5cd1b7493..74e7641485 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -71,6 +71,7 @@
 #include "libpq/libpq.h"
 #include "libpq/pqformat.h"
 #include "mb/pg_wchar.h"
+#include "nodes/miscnodes.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/bgworker.h"
@@ -611,6 +612,128 @@ errfinish(const char *filename, int lineno, const char *funcname)
     CHECK_FOR_INTERRUPTS();
 }

+
+/*
+ * errsave_start --- begin a "safe" error-reporting cycle
+ *
+ * If "context" isn't an ErrorSaveContext node, this behaves as
+ * errstart(ERROR, domain), and the errsave() macro ends up acting
+ * exactly like ereport(ERROR, ...).
+ *
+ * If "context" is an ErrorSaveContext node, but the node creator only wants
+ * notification of the fact of a safe error without any details, just set
+ * the error_occurred flag in the ErrorSaveContext node and return false,
+ * which will cause us to skip the remaining error processing steps.
+ *
+ * Otherwise, create and initialize error stack entry and return true.
+ * Subsequently, errmsg() and perhaps other routines will be called to further
+ * populate the stack entry.  Finally, errsave_finish() will be called to
+ * tidy up.
+ */
+bool
+errsave_start(NodePtr context, const char *domain)
+{
+    ErrorSaveContext *escontext;
+    ErrorData  *edata;
+
+    /*
+     * Do we have a context for safe error reporting?  If not, just punt to
+     * errstart().
+     */
+    if (context == NULL || !IsA(context, ErrorSaveContext))
+        return errstart(ERROR, domain);
+
+    /* Report that an error was detected */
+    escontext = (ErrorSaveContext *) context;
+    escontext->error_occurred = true;
+
+    /* Nothing else to do if caller wants no further details */
+    if (!escontext->details_wanted)
+        return false;
+
+    /*
+     * Okay, crank up a stack entry to store the info in.
+     */
+
+    recursion_depth++;
+
+    /* Initialize data for this error frame */
+    edata = get_error_stack_entry();
+    edata->elevel = LOG;        /* signal all is well to errsave_finish */
+    set_stack_entry_domain(edata, domain);
+    /* Select default errcode based on the assumed elevel of ERROR */
+    edata->sqlerrcode = ERRCODE_INTERNAL_ERROR;
+
+    /*
+     * Any allocations for this error state level should go into the caller's
+     * context.  We don't need to pollute ErrorContext, or even require it to
+     * exist, in this code path.
+     */
+    edata->assoc_context = CurrentMemoryContext;
+
+    recursion_depth--;
+    return true;
+}
+
+/*
+ * errsave_finish --- end a "safe" error-reporting cycle
+ *
+ * If errsave_start() decided this was a regular error, behave as
+ * errfinish().  Otherwise, package up the error details and save
+ * them in the ErrorSaveContext node.
+ */
+void
+errsave_finish(NodePtr context, const char *filename, int lineno,
+               const char *funcname)
+{
+    ErrorSaveContext *escontext = (ErrorSaveContext *) context;
+    ErrorData  *edata = &errordata[errordata_stack_depth];
+
+    /* verify stack depth before accessing *edata */
+    CHECK_STACK_DEPTH();
+
+    /*
+     * If errsave_start punted to errstart, then elevel will be ERROR or
+     * perhaps even PANIC.  Punt likewise to errfinish.
+     */
+    if (edata->elevel >= ERROR)
+    {
+        errfinish(filename, lineno, funcname);
+        pg_unreachable();
+    }
+
+    /*
+     * Else, we should package up the stack entry contents and deliver them to
+     * the caller.
+     */
+    recursion_depth++;
+
+    /* Save the last few bits of error state into the stack entry */
+    set_stack_entry_location(edata, filename, lineno, funcname);
+
+    /* Replace the LOG value that errsave_start inserted */
+    edata->elevel = ERROR;
+
+    /*
+     * We skip calling backtrace and context functions, which are more likely
+     * to cause trouble than provide useful context; they might act on the
+     * assumption that a transaction abort is about to occur.
+     */
+
+    /*
+     * Make a copy of the error info for the caller.  All the subsidiary
+     * strings are already in the caller's context, so it's sufficient to
+     * flat-copy the stack entry.
+     */
+    escontext->error_data = palloc_object(ErrorData);
+    memcpy(escontext->error_data, edata, sizeof(ErrorData));
+
+    /* Exit error-handling context */
+    errordata_stack_depth--;
+    recursion_depth--;
+}
+
+
 /*
  * get_error_stack_entry --- allocate and initialize a new stack entry
  *
diff --git a/src/backend/utils/fmgr/README b/src/backend/utils/fmgr/README
index 49845f67ac..aff8f6fb3e 100644
--- a/src/backend/utils/fmgr/README
+++ b/src/backend/utils/fmgr/README
@@ -267,6 +267,70 @@ See windowapi.h for more information.
 information about the context of the CALL statement, particularly
 whether it is within an "atomic" execution context.

+* Some callers of datatype input functions (and in future perhaps
+other classes of functions) pass an instance of ErrorSaveContext.
+This indicates that the caller wishes to handle "safe" errors without
+a transaction-terminating exception being thrown: instead, the callee
+should store information about the error cause in the ErrorSaveContext
+struct and return a dummy result value.  Further details appear in
+"Handling Non-Exception Errors" below.
+
+
+Handling Non-Exception Errors
+-----------------------------
+
+Postgres' standard mechanism for reporting errors (ereport() or elog())
+is used for all sorts of error conditions.  This means that throwing
+an exception via ereport(ERROR) requires an expensive transaction or
+subtransaction abort and cleanup, since the exception catcher dare not
+make many assumptions about what has gone wrong.  There are situations
+where we would rather have a lighter-weight mechanism for dealing
+with errors that are known to be safe to recover from without a full
+transaction cleanup.  SQL-callable functions can support this need
+using the ErrorSaveContext context mechanism.
+
+To report a "safe" error, a SQL-callable function should call
+    errsave(fcinfo->context, ...)
+where it would previously have done
+    ereport(ERROR, ...)
+If the passed "context" is NULL or is not an ErrorSaveContext node,
+then errsave behaves precisely as ereport(ERROR): the exception is
+thrown via longjmp, so that control does not return.  If "context"
+is an ErrorSaveContext node, then the error information included in
+errsave's subsidiary reporting calls is stored into the context node
+and control returns normally.  The function should then return a dummy
+value to its caller.  (SQL NULL is recommended as the dummy value;
+but anything will do, since the caller is expected to ignore the
+function's return value once it sees that an error has been reported
+in the ErrorSaveContext node.)
+
+If there is nothing to do except return after calling errsave(), use
+    ereturn(fcinfo->context, dummy_value, ...)
+to perform errsave() and then "return dummy_value".
+
+Considering datatype input functions as examples, typical "safe" error
+conditions include input syntax errors and out-of-range values.  An input
+function typically detects such cases with simple if-tests and can easily
+change the ereport call that follows to errsave.  Error conditions that
+should NOT be handled this way include out-of-memory, internal errors, and
+anything where there is any question about our ability to continue normal
+processing of the transaction.  Those should still be thrown with ereport.
+Because of this restriction, it's typically not necessary to pass the
+ErrorSaveContext pointer down very far, as errors reported by palloc or
+other low-level functions are typically reasonable to consider internal.
+
+Because no transaction cleanup will occur, a function that is exiting
+after errsave() returns still bears responsibility for resource cleanup.
+It is not necessary to be concerned about small leakages of palloc'd
+memory, since the caller should be running the function in a short-lived
+memory context.  However, resources such as locks, open files, or buffer
+pins must be closed out cleanly, as they would be in the non-error code
+path.
+
+Conventions for callers that use the ErrorSaveContext mechanism
+to trap errors are discussed with the declaration of that struct,
+in nodes/miscnodes.h.
+

 Functions Accepting or Returning Sets
 -------------------------------------
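For readers who want to see the callee-side convention from the README text in isolation, here is a minimal standalone sketch. It is not backend code: ToySaveContext, toy_errsave() and toy_bool_in() are hypothetical stand-ins invented for illustration, and abort() merely stands in for the longjmp that a real ereport(ERROR) performs. It only models the control flow: with no context the error is "thrown", and with a context the error is recorded and the function returns a dummy value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for ErrorSaveContext; not the real struct. */
typedef struct ToySaveContext
{
    bool        error_occurred; /* set when a "safe" error is reported */
    bool        details_wanted; /* does the caller want the message too? */
    char        message[128];   /* filled only if details_wanted */
} ToySaveContext;

/*
 * Stand-in for errsave().  Without a context we "throw"; abort() is used
 * here only as a placeholder for ereport(ERROR)'s non-local exit.  With a
 * context we record the error and return normally.
 */
static void
toy_errsave(ToySaveContext *cxt, const char *msg, const char *input)
{
    if (cxt == NULL)
    {
        fprintf(stderr, "ERROR: %s: \"%s\"\n", msg, input);
        abort();
    }
    cxt->error_occurred = true;
    if (cxt->details_wanted)
        snprintf(cxt->message, sizeof(cxt->message),
                 "%s: \"%s\"", msg, input);
}

/* Toy input function: accepts only "true" and "false". */
static int
toy_bool_in(const char *str, ToySaveContext *cxt)
{
    assert(str != NULL);
    if (strcmp(str, "true") == 0)
        return 1;
    if (strcmp(str, "false") == 0)
        return 0;

    /* Input syntax error: a typical "safe" error condition. */
    toy_errsave(cxt, "invalid input syntax for type boolean", str);
    return 0;                   /* dummy value; the caller must ignore it */
}
```

A caller would zero a ToySaveContext, optionally set details_wanted, call toy_bool_in(), and look at error_occurred before trusting the result.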
diff --git a/src/backend/utils/fmgr/fmgr.c b/src/backend/utils/fmgr/fmgr.c
index 3c210297aa..443512aa57 100644
--- a/src/backend/utils/fmgr/fmgr.c
+++ b/src/backend/utils/fmgr/fmgr.c
@@ -23,6 +23,7 @@
 #include "lib/stringinfo.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "nodes/miscnodes.h"
 #include "nodes/nodeFuncs.h"
 #include "pgstat.h"
 #include "utils/acl.h"
@@ -1548,6 +1549,66 @@ InputFunctionCall(FmgrInfo *flinfo, char *str, Oid typioparam, int32 typmod)
     return result;
 }

+/*
+ * Call a previously-looked-up datatype input function, with non-exception
+ * handling of "safe" errors.
+ *
+ * This is basically like InputFunctionCall, but the converted Datum is
+ * returned into *result while the function result is true for success or
+ * false for failure.  Also, the caller may pass an ErrorSaveContext node.
+ * (We declare that as "NodePtr" to avoid including nodes.h in fmgr.h.)
+ *
+ * If escontext points to an ErrorSaveContext, any "safe" errors detected by
+ * the input function will be reported by filling the escontext struct and
+ * returning false.  (The caller can alternatively use SAFE_ERROR_OCCURRED(),
+ * but checking the function result instead is usually cheaper.)
+ */
+bool
+InputFunctionCallSafe(FmgrInfo *flinfo, char *str,
+                      Oid typioparam, int32 typmod,
+                      NodePtr escontext,
+                      Datum *result)
+{
+    LOCAL_FCINFO(fcinfo, 3);
+
+    if (str == NULL && flinfo->fn_strict)
+    {
+        *result = (Datum) 0;    /* just return null result */
+        return true;
+    }
+
+    InitFunctionCallInfoData(*fcinfo, flinfo, 3, InvalidOid, escontext, NULL);
+
+    fcinfo->args[0].value = CStringGetDatum(str);
+    fcinfo->args[0].isnull = false;
+    fcinfo->args[1].value = ObjectIdGetDatum(typioparam);
+    fcinfo->args[1].isnull = false;
+    fcinfo->args[2].value = Int32GetDatum(typmod);
+    fcinfo->args[2].isnull = false;
+
+    *result = FunctionCallInvoke(fcinfo);
+
+    /* Result value is garbage, and could be null, if an error was reported */
+    if (SAFE_ERROR_OCCURRED(escontext))
+        return false;
+
+    /* Otherwise, should get null result if and only if str is NULL */
+    if (str == NULL)
+    {
+        if (!fcinfo->isnull)
+            elog(ERROR, "input function %u returned non-NULL",
+                 flinfo->fn_oid);
+    }
+    else
+    {
+        if (fcinfo->isnull)
+            elog(ERROR, "input function %u returned NULL",
+                 flinfo->fn_oid);
+    }
+
+    return true;
+}
+
 /*
  * Call a previously-looked-up datatype output function.
  *
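The caller-side convention of InputFunctionCallSafe (boolean function result, converted value delivered through an output pointer) can be sketched outside the backend as follows. This is only a model under stated assumptions: ToySaveContext, toy_int_in(), toy_input_call_safe() and toy_load_column() are hypothetical names, the "datatype" is a plain C int, and abort() stands in for a thrown error. The caller shown here substitutes a null flag for each rejected value, which is one plausible use of the mechanism.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for ErrorSaveContext; not the real struct. */
typedef struct ToySaveContext
{
    bool        error_occurred;
} ToySaveContext;

/* Toy input function: parse a decimal int, reporting bad input "safely". */
static int
toy_int_in(const char *str, ToySaveContext *cxt)
{
    char       *end;
    long        val;

    errno = 0;
    val = strtol(str, &end, 10);
    if (end == str || *end != '\0' || errno == ERANGE ||
        val < INT_MIN || val > INT_MAX)
    {
        if (cxt == NULL)
            abort();            /* placeholder for a thrown error */
        cxt->error_occurred = true;
        return 0;               /* dummy result */
    }
    return (int) val;
}

/*
 * Analogue of the InputFunctionCallSafe convention: the converted value
 * goes into *result, and the function result says whether to trust it.
 */
static bool
toy_input_call_safe(const char *str, ToySaveContext *cxt, int *result)
{
    *result = toy_int_in(str, cxt);

    /* The result is garbage if a safe error was reported. */
    if (cxt != NULL && cxt->error_occurred)
        return false;
    return true;
}

/* Caller: convert n strings, marking rejected inputs as null. */
static int
toy_load_column(const char *const *strs, int n, int *values, bool *nulls)
{
    int         nbad = 0;

    for (int i = 0; i < n; i++)
    {
        ToySaveContext cxt = {false};   /* fresh context for each value */

        nulls[i] = !toy_input_call_safe(strs[i], &cxt, &values[i]);
        if (nulls[i])
            nbad++;
    }
    return nbad;
}
```

Note that the context is re-initialized for every value; reusing one without clearing error_occurred would make every later call look like a failure.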
diff --git a/src/include/fmgr.h b/src/include/fmgr.h
index 380a82b9de..d739f3dbd9 100644
--- a/src/include/fmgr.h
+++ b/src/include/fmgr.h
@@ -18,8 +18,7 @@
 #ifndef FMGR_H
 #define FMGR_H

-/* We don't want to include primnodes.h here, so make some stub references */
-typedef struct Node *fmNodePtr;
+/* We don't want to include primnodes.h here, so make a stub reference */
 typedef struct Aggref *fmAggrefPtr;

 /* Likewise, avoid including execnodes.h here */
@@ -63,7 +62,7 @@ typedef struct FmgrInfo
     unsigned char fn_stats;        /* collect stats if track_functions > this */
     void       *fn_extra;        /* extra space for use by handler */
     MemoryContext fn_mcxt;        /* memory context to store fn_extra in */
-    fmNodePtr    fn_expr;        /* expression parse tree for call, or NULL */
+    NodePtr        fn_expr;        /* expression parse tree for call, or NULL */
 } FmgrInfo;

 /*
@@ -85,8 +84,8 @@ typedef struct FmgrInfo
 typedef struct FunctionCallInfoBaseData
 {
     FmgrInfo   *flinfo;            /* ptr to lookup info used for this call */
-    fmNodePtr    context;        /* pass info about context of call */
-    fmNodePtr    resultinfo;        /* pass or return extra info about result */
+    NodePtr        context;        /* pass info about context of call */
+    NodePtr        resultinfo;        /* pass or return extra info about result */
     Oid            fncollation;    /* collation for function to use */
 #define FIELDNO_FUNCTIONCALLINFODATA_ISNULL 4
     bool        isnull;            /* function must set true if result is NULL */
@@ -700,6 +699,10 @@ extern Datum OidFunctionCall9Coll(Oid functionId, Oid collation,
 /* Special cases for convenient invocation of datatype I/O functions. */
 extern Datum InputFunctionCall(FmgrInfo *flinfo, char *str,
                                Oid typioparam, int32 typmod);
+extern bool InputFunctionCallSafe(FmgrInfo *flinfo, char *str,
+                                  Oid typioparam, int32 typmod,
+                                  NodePtr escontext,
+                                  Datum *result);
 extern Datum OidInputFunctionCall(Oid functionId, char *str,
                                   Oid typioparam, int32 typmod);
 extern char *OutputFunctionCall(FmgrInfo *flinfo, Datum val);
@@ -719,9 +722,9 @@ extern const Pg_finfo_record *fetch_finfo_record(void *filehandle, const char *f
 extern Oid    fmgr_internal_function(const char *proname);
 extern Oid    get_fn_expr_rettype(FmgrInfo *flinfo);
 extern Oid    get_fn_expr_argtype(FmgrInfo *flinfo, int argnum);
-extern Oid    get_call_expr_argtype(fmNodePtr expr, int argnum);
+extern Oid    get_call_expr_argtype(NodePtr expr, int argnum);
 extern bool get_fn_expr_arg_stable(FmgrInfo *flinfo, int argnum);
-extern bool get_call_expr_arg_stable(fmNodePtr expr, int argnum);
+extern bool get_call_expr_arg_stable(NodePtr expr, int argnum);
 extern bool get_fn_expr_variadic(FmgrInfo *flinfo);
 extern bytea *get_fn_opclass_options(FmgrInfo *flinfo);
 extern bool has_fn_opclass_options(FmgrInfo *flinfo);
diff --git a/src/include/nodes/meson.build b/src/include/nodes/meson.build
index e63881086e..f0e60935b6 100644
--- a/src/include/nodes/meson.build
+++ b/src/include/nodes/meson.build
@@ -16,6 +16,7 @@ node_support_input_i = [
   'nodes/bitmapset.h',
   'nodes/extensible.h',
   'nodes/lockoptions.h',
+  'nodes/miscnodes.h',
   'nodes/replnodes.h',
   'nodes/supportnodes.h',
   'nodes/value.h',
diff --git a/src/include/nodes/miscnodes.h b/src/include/nodes/miscnodes.h
new file mode 100644
index 0000000000..893c49e02f
--- /dev/null
+++ b/src/include/nodes/miscnodes.h
@@ -0,0 +1,52 @@
+/*-------------------------------------------------------------------------
+ *
+ * miscnodes.h
+ *      Definitions for hard-to-classify node types.
+ *
+ * Node types declared here are not part of parse trees, plan trees,
+ * or execution state trees.  We only assign them NodeTag values because
+ * IsA() tests provide a convenient way to disambiguate what kind of
+ * structure is being passed through assorted APIs, such as function
+ * "context" pointers.
+ *
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/nodes/miscnodes.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef MISCNODES_H
+#define MISCNODES_H
+
+#include "nodes/nodes.h"
+
+/*
+ * ErrorSaveContext -
+ *    function call context node for handling of "safe" errors
+ *
+ * A caller wishing to trap "safe" errors must initialize a struct like this
+ * with all fields zero/NULL except for the NodeTag.  Optionally, set
+ * details_wanted = true if more than the bare knowledge that a "safe" error
+ * occurred is required.  After calling code that might report an error this
+ * way, check error_occurred to see if an error happened.  If so, and if
+ * details_wanted is true, error_data has been filled with error details
+ * (stored in the callee's memory context!).  FreeErrorData() can be called
+ * to release error_data, although this step is typically not necessary
+ * if the called code was run in a short-lived context.
+ */
+typedef struct ErrorSaveContext
+{
+    NodeTag        type;
+    bool        error_occurred; /* set to true if we detect a "safe" error */
+    bool        details_wanted; /* does caller want more info than that? */
+    ErrorData  *error_data;        /* details of error, if so */
+} ErrorSaveContext;
+
+/* Often-useful macro for checking if a safe error was reported */
+#define SAFE_ERROR_OCCURRED(escontext) \
+    ((escontext) != NULL && IsA(escontext, ErrorSaveContext) && \
+     ((ErrorSaveContext *) (escontext))->error_occurred)
+
+#endif                            /* MISCNODES_H */
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index f107a818e8..37e4e5cfe4 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -18,6 +18,13 @@

 #include "lib/stringinfo.h"

+/*
+ * We cannot include nodes.h yet, so make a stub reference.  (This is also
+ * used by fmgr.h, which doesn't want to depend on nodes.h either.)
+ */
+typedef struct Node *NodePtr;
+
+
 /* Error level codes */
 #define DEBUG5        10            /* Debugging messages, in categories of
                                  * decreasing detail. */
@@ -235,6 +242,63 @@ extern int    getinternalerrposition(void);
     ereport(elevel, errmsg_internal(__VA_ARGS__))


+/*----------
+ * Support for reporting "safe" errors that don't require a full transaction
+ * abort to clean up.  This is to be used in this way:
+ *        errsave(context,
+ *                errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+ *                errmsg("invalid input syntax for type %s: \"%s\"",
+ *                       "boolean", in_str),
+ *                ... other errxxx() fields as needed ...);
+ *
+ * "context" is a node pointer or NULL, and the remaining auxiliary calls
+ * provide the same error details as in ereport().  If context is not a
+ * pointer to an ErrorSaveContext node, then errsave(context, ...)
+ * behaves identically to ereport(ERROR, ...).  If context is a pointer
+ * to an ErrorSaveContext node, then the information provided by the
+ * auxiliary calls is stored in the context node and control returns
+ * normally.  The caller of errsave() must then do any required cleanup
+ * and return control back to its caller.  That caller must check the
+ * ErrorSaveContext node to see whether an error occurred before
+ * it can trust the function's result to be meaningful.
+ *
+ * errsave_domain() allows a message domain to be specified; it is
+ * precisely analogous to ereport_domain().
+ *----------
+ */
+#define errsave_domain(context, domain, ...)    \
+    do { \
+        NodePtr context_ = (context); \
+        pg_prevent_errno_in_scope(); \
+        if (errsave_start(context_, domain)) \
+            __VA_ARGS__, errsave_finish(context_, __FILE__, __LINE__, __func__); \
+    } while(0)
+
+#define errsave(context, ...)    \
+    errsave_domain(context, TEXTDOMAIN, __VA_ARGS__)
+
+/*
+ * "ereturn(context, dummy_value, ...);" is exactly the same as
+ * "errsave(context, ...); return dummy_value;".  This saves a bit
+ * of typing in the common case where a function has no cleanup
+ * actions to take after reporting a safe error.  "dummy_value"
+ * can be empty if the function returns void.
+ */
+#define ereturn_domain(context, dummy_value, domain, ...)    \
+    do { \
+        errsave_domain(context, domain, __VA_ARGS__); \
+        return dummy_value; \
+    } while(0)
+
+#define ereturn(context, dummy_value, ...)    \
+    ereturn_domain(context, dummy_value, TEXTDOMAIN, __VA_ARGS__)
+
+extern bool errsave_start(NodePtr context, const char *domain);
+extern void errsave_finish(NodePtr context,
+                           const char *filename, int lineno,
+                           const char *funcname);
+
+
 /* Support for constructing error strings separately from ereport() calls */

 extern void pre_format_elog_string(int errnumber, const char *domain);
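One property of the errsave_domain() macro shape that may not be obvious: because the auxiliary calls sit in a comma expression guarded by the start function's result, they are not evaluated at all when the start function returns false. The standalone sketch below models that with hypothetical toy_* names (none of these exist in the patch), an int in place of the error data, and an arbitrary number as the error code. The assert() in toy_save_start() replaces the real fallback to errstart(ERROR), which this model does not attempt to reproduce.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for ErrorSaveContext; not the real struct. */
typedef struct ToySaveContext
{
    bool        error_occurred;
    bool        details_wanted;
    int         code;           /* stands in for the saved error details */
} ToySaveContext;

static ToySaveContext *toy_current; /* entry currently being filled in */
static int  toy_detail_calls;       /* counts auxiliary-call evaluations */

/* Model of the start function: false means "skip the details". */
static bool
toy_save_start(ToySaveContext *cxt)
{
    assert(cxt != NULL);        /* this model has no "throw" path */
    cxt->error_occurred = true;
    if (!cxt->details_wanted)
        return false;
    toy_current = cxt;
    return true;
}

/* Model of an auxiliary call such as errcode(). */
static int
toy_errcode(int code)
{
    toy_detail_calls++;
    toy_current->code = code;
    return 0;
}

/* Model of the finish function. */
static void
toy_save_finish(ToySaveContext *cxt)
{
    (void) cxt;
    toy_current = NULL;
}

/* Same shape as the macros in the patch: guarded comma expression. */
#define toy_errsave(context, ...) \
    do { \
        ToySaveContext *context_ = (context); \
        if (toy_save_start(context_)) \
            __VA_ARGS__, toy_save_finish(context_); \
    } while (0)

#define toy_ereturn(context, dummy_value, ...) \
    do { \
        toy_errsave(context, __VA_ARGS__); \
        return dummy_value; \
    } while (0)

/* Example user: rejects non-positive values with a "safe" error. */
static int
toy_check_positive(int v, ToySaveContext *cxt)
{
    if (v <= 0)
        toy_ereturn(cxt, -1, toy_errcode(22003));
    return v;
}
```

With details_wanted false, toy_errcode() is never called, so a caller that only needs the error_occurred flag does not pay for building the details.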
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index 495e449a9e..c011ebdfd9 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -90,14 +90,15 @@ typedef struct ArrayIteratorData
 }            ArrayIteratorData;

 static bool array_isspace(char ch);
-static int    ArrayCount(const char *str, int *dim, char typdelim);
-static void ReadArrayStr(char *arrayStr, const char *origStr,
+static int    ArrayCount(const char *str, int *dim, char typdelim,
+                       Node *escontext);
+static bool ReadArrayStr(char *arrayStr, const char *origStr,
                          int nitems, int ndim, int *dim,
                          FmgrInfo *inputproc, Oid typioparam, int32 typmod,
                          char typdelim,
                          int typlen, bool typbyval, char typalign,
                          Datum *values, bool *nulls,
-                         bool *hasnulls, int32 *nbytes);
+                         bool *hasnulls, int32 *nbytes, Node *escontext);
 static void ReadArrayBinary(StringInfo buf, int nitems,
                             FmgrInfo *receiveproc, Oid typioparam, int32 typmod,
                             int typlen, bool typbyval, char typalign,
@@ -177,6 +178,7 @@ array_in(PG_FUNCTION_ARGS)
     Oid            element_type = PG_GETARG_OID(1);    /* type of an array
                                                      * element */
     int32        typmod = PG_GETARG_INT32(2);    /* typmod for array elements */
+    Node       *escontext = fcinfo->context;
     int            typlen;
     bool        typbyval;
     char        typalign;
@@ -258,7 +260,7 @@ array_in(PG_FUNCTION_ARGS)
             break;                /* no more dimension items */
         p++;
         if (ndim >= MAXDIM)
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                      errmsg("number of array dimensions (%d) exceeds the maximum allowed (%d)",
                             ndim + 1, MAXDIM)));
@@ -266,7 +268,7 @@ array_in(PG_FUNCTION_ARGS)
         for (q = p; isdigit((unsigned char) *q) || (*q == '-') || (*q == '+'); q++)
              /* skip */ ;
         if (q == p)                /* no digits? */
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"", string),
                      errdetail("\"[\" must introduce explicitly-specified array dimensions.")));
@@ -280,7 +282,7 @@ array_in(PG_FUNCTION_ARGS)
             for (q = p; isdigit((unsigned char) *q) || (*q == '-') || (*q == '+'); q++)
                  /* skip */ ;
             if (q == p)            /* no digits? */
-                ereport(ERROR,
+                ereturn(escontext, (Datum) 0,
                         (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                          errmsg("malformed array literal: \"%s\"", string),
                          errdetail("Missing array dimension value.")));
@@ -291,7 +293,7 @@ array_in(PG_FUNCTION_ARGS)
             lBound[ndim] = 1;
         }
         if (*q != ']')
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"", string),
                      errdetail("Missing \"%s\" after array dimensions.",
@@ -301,7 +303,7 @@ array_in(PG_FUNCTION_ARGS)
         ub = atoi(p);
         p = q + 1;
         if (ub < lBound[ndim])
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
                      errmsg("upper bound cannot be less than lower bound")));

@@ -313,11 +315,13 @@ array_in(PG_FUNCTION_ARGS)
     {
         /* No array dimensions, so intuit dimensions from brace structure */
         if (*p != '{')
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"", string),
                      errdetail("Array value must start with \"{\" or dimension information.")));
-        ndim = ArrayCount(p, dim, typdelim);
+        ndim = ArrayCount(p, dim, typdelim, escontext);
+        if (ndim < 0)
+            PG_RETURN_NULL();
         for (i = 0; i < ndim; i++)
             lBound[i] = 1;
     }
@@ -328,7 +332,7 @@ array_in(PG_FUNCTION_ARGS)

         /* If array dimensions are given, expect '=' operator */
         if (strncmp(p, ASSGN, strlen(ASSGN)) != 0)
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"", string),
                      errdetail("Missing \"%s\" after array dimensions.",
@@ -342,20 +346,22 @@ array_in(PG_FUNCTION_ARGS)
          * were given
          */
         if (*p != '{')
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"", string),
                      errdetail("Array contents must start with \"{\".")));
-        ndim_braces = ArrayCount(p, dim_braces, typdelim);
+        ndim_braces = ArrayCount(p, dim_braces, typdelim, escontext);
+        if (ndim_braces < 0)
+            PG_RETURN_NULL();
         if (ndim_braces != ndim)
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"", string),
                      errdetail("Specified array dimensions do not match array contents.")));
         for (i = 0; i < ndim; ++i)
         {
             if (dim[i] != dim_braces[i])
-                ereport(ERROR,
+                ereturn(escontext, (Datum) 0,
                         (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                          errmsg("malformed array literal: \"%s\"", string),
                          errdetail("Specified array dimensions do not match array contents.")));
@@ -372,8 +378,11 @@ array_in(PG_FUNCTION_ARGS)
 #endif

     /* This checks for overflow of the array dimensions */
-    nitems = ArrayGetNItems(ndim, dim);
-    ArrayCheckBounds(ndim, dim, lBound);
+    nitems = ArrayGetNItemsSafe(ndim, dim, escontext);
+    if (nitems < 0)
+        PG_RETURN_NULL();
+    if (!ArrayCheckBoundsSafe(ndim, dim, lBound, escontext))
+        PG_RETURN_NULL();

     /* Empty array? */
     if (nitems == 0)
@@ -381,13 +390,14 @@ array_in(PG_FUNCTION_ARGS)

     dataPtr = (Datum *) palloc(nitems * sizeof(Datum));
     nullsPtr = (bool *) palloc(nitems * sizeof(bool));
-    ReadArrayStr(p, string,
-                 nitems, ndim, dim,
-                 &my_extra->proc, typioparam, typmod,
-                 typdelim,
-                 typlen, typbyval, typalign,
-                 dataPtr, nullsPtr,
-                 &hasnulls, &nbytes);
+    if (!ReadArrayStr(p, string,
+                      nitems, ndim, dim,
+                      &my_extra->proc, typioparam, typmod,
+                      typdelim,
+                      typlen, typbyval, typalign,
+                      dataPtr, nullsPtr,
+                      &hasnulls, &nbytes, escontext))
+        PG_RETURN_NULL();
     if (hasnulls)
     {
         dataoffset = ARR_OVERHEAD_WITHNULLS(ndim, nitems);
@@ -451,9 +461,12 @@ array_isspace(char ch)
  *
  * Returns number of dimensions as function result.  The axis lengths are
  * returned in dim[], which must be of size MAXDIM.
+ *
+ * If we detect an error, fill *escontext with error details and return -1
+ * (unless escontext isn't provided, in which case errors will be thrown).
  */
 static int
-ArrayCount(const char *str, int *dim, char typdelim)
+ArrayCount(const char *str, int *dim, char typdelim, Node *escontext)
 {
     int            nest_level = 0,
                 i;
@@ -488,11 +501,10 @@ ArrayCount(const char *str, int *dim, char typdelim)
             {
                 case '\0':
                     /* Signal a premature end of the string */
-                    ereport(ERROR,
+                    ereturn(escontext, -1,
                             (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                              errmsg("malformed array literal: \"%s\"", str),
                              errdetail("Unexpected end of input.")));
-                    break;
                 case '\\':

                     /*
@@ -504,7 +516,7 @@ ArrayCount(const char *str, int *dim, char typdelim)
                         parse_state != ARRAY_ELEM_STARTED &&
                         parse_state != ARRAY_QUOTED_ELEM_STARTED &&
                         parse_state != ARRAY_ELEM_DELIMITED)
-                        ereport(ERROR,
+                        ereturn(escontext, -1,
                                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                  errmsg("malformed array literal: \"%s\"", str),
                                  errdetail("Unexpected \"%c\" character.",
@@ -515,7 +527,7 @@ ArrayCount(const char *str, int *dim, char typdelim)
                     if (*(ptr + 1))
                         ptr++;
                     else
-                        ereport(ERROR,
+                        ereturn(escontext, -1,
                                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                  errmsg("malformed array literal: \"%s\"", str),
                                  errdetail("Unexpected end of input.")));
@@ -530,7 +542,7 @@ ArrayCount(const char *str, int *dim, char typdelim)
                     if (parse_state != ARRAY_LEVEL_STARTED &&
                         parse_state != ARRAY_QUOTED_ELEM_STARTED &&
                         parse_state != ARRAY_ELEM_DELIMITED)
-                        ereport(ERROR,
+                        ereturn(escontext, -1,
                                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                  errmsg("malformed array literal: \"%s\"", str),
                                  errdetail("Unexpected array element.")));
@@ -551,14 +563,14 @@ ArrayCount(const char *str, int *dim, char typdelim)
                         if (parse_state != ARRAY_NO_LEVEL &&
                             parse_state != ARRAY_LEVEL_STARTED &&
                             parse_state != ARRAY_LEVEL_DELIMITED)
-                            ereport(ERROR,
+                            ereturn(escontext, -1,
                                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                      errmsg("malformed array literal: \"%s\"", str),
                                      errdetail("Unexpected \"%c\" character.",
                                                '{')));
                         parse_state = ARRAY_LEVEL_STARTED;
                         if (nest_level >= MAXDIM)
-                            ereport(ERROR,
+                            ereturn(escontext, -1,
                                     (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                                      errmsg("number of array dimensions (%d) exceeds the maximum allowed (%d)",
                                             nest_level + 1, MAXDIM)));
@@ -581,14 +593,14 @@ ArrayCount(const char *str, int *dim, char typdelim)
                             parse_state != ARRAY_QUOTED_ELEM_COMPLETED &&
                             parse_state != ARRAY_LEVEL_COMPLETED &&
                             !(nest_level == 1 && parse_state == ARRAY_LEVEL_STARTED))
-                            ereport(ERROR,
+                            ereturn(escontext, -1,
                                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                      errmsg("malformed array literal: \"%s\"", str),
                                      errdetail("Unexpected \"%c\" character.",
                                                '}')));
                         parse_state = ARRAY_LEVEL_COMPLETED;
                         if (nest_level == 0)
-                            ereport(ERROR,
+                            ereturn(escontext, -1,
                                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                      errmsg("malformed array literal: \"%s\"", str),
                                      errdetail("Unmatched \"%c\" character.", '}')));
@@ -596,7 +608,7 @@ ArrayCount(const char *str, int *dim, char typdelim)

                         if (nelems_last[nest_level] != 0 &&
                             nelems[nest_level] != nelems_last[nest_level])
-                            ereport(ERROR,
+                            ereturn(escontext, -1,
                                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                      errmsg("malformed array literal: \"%s\"", str),
                                      errdetail("Multidimensional arrays must have "
@@ -630,7 +642,7 @@ ArrayCount(const char *str, int *dim, char typdelim)
                                 parse_state != ARRAY_ELEM_COMPLETED &&
                                 parse_state != ARRAY_QUOTED_ELEM_COMPLETED &&
                                 parse_state != ARRAY_LEVEL_COMPLETED)
-                                ereport(ERROR,
+                                ereturn(escontext, -1,
                                         (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                          errmsg("malformed array literal: \"%s\"", str),
                                          errdetail("Unexpected \"%c\" character.",
@@ -653,7 +665,7 @@ ArrayCount(const char *str, int *dim, char typdelim)
                             if (parse_state != ARRAY_LEVEL_STARTED &&
                                 parse_state != ARRAY_ELEM_STARTED &&
                                 parse_state != ARRAY_ELEM_DELIMITED)
-                                ereport(ERROR,
+                                ereturn(escontext, -1,
                                         (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                          errmsg("malformed array literal: \"%s\"", str),
                                          errdetail("Unexpected array element.")));
@@ -673,7 +685,7 @@ ArrayCount(const char *str, int *dim, char typdelim)
     while (*ptr)
     {
         if (!array_isspace(*ptr++))
-            ereport(ERROR,
+            ereturn(escontext, -1,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"", str),
                      errdetail("Junk after closing right brace.")));
@@ -713,11 +725,16 @@ ArrayCount(const char *str, int *dim, char typdelim)
  *    *hasnulls: set true iff there are any null elements.
  *    *nbytes: set to total size of data area needed (including alignment
  *        padding but not including array header overhead).
+ *    *escontext: if this points to an ErrorSaveContext, details of
+ *        any error are reported there.
+ *
+ * Result:
+ *    true for success, false for failure (if escontext is provided).
  *
  * Note that values[] and nulls[] are allocated by the caller, and must have
  * nitems elements.
  */
-static void
+static bool
 ReadArrayStr(char *arrayStr,
              const char *origStr,
              int nitems,
@@ -733,7 +750,8 @@ ReadArrayStr(char *arrayStr,
              Datum *values,
              bool *nulls,
              bool *hasnulls,
-             int32 *nbytes)
+             int32 *nbytes,
+             Node *escontext)
 {
     int            i,
                 nest_level = 0;
@@ -784,7 +802,7 @@ ReadArrayStr(char *arrayStr,
             {
                 case '\0':
                     /* Signal a premature end of the string */
-                    ereport(ERROR,
+                    ereturn(escontext, false,
                             (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                              errmsg("malformed array literal: \"%s\"",
                                     origStr)));
@@ -793,7 +811,7 @@ ReadArrayStr(char *arrayStr,
                     /* Skip backslash, copy next character as-is. */
                     srcptr++;
                     if (*srcptr == '\0')
-                        ereport(ERROR,
+                        ereturn(escontext, false,
                                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                  errmsg("malformed array literal: \"%s\"",
                                         origStr)));
@@ -823,7 +841,7 @@ ReadArrayStr(char *arrayStr,
                     if (!in_quotes)
                     {
                         if (nest_level >= ndim)
-                            ereport(ERROR,
+                            ereturn(escontext, false,
                                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                      errmsg("malformed array literal: \"%s\"",
                                             origStr)));
@@ -838,7 +856,7 @@ ReadArrayStr(char *arrayStr,
                     if (!in_quotes)
                     {
                         if (nest_level == 0)
-                            ereport(ERROR,
+                            ereturn(escontext, false,
                                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                      errmsg("malformed array literal: \"%s\"",
                                             origStr)));
@@ -891,7 +909,7 @@ ReadArrayStr(char *arrayStr,
         *dstendptr = '\0';

         if (i < 0 || i >= nitems)
-            ereport(ERROR,
+            ereturn(escontext, false,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"",
                             origStr)));
@@ -900,14 +918,20 @@ ReadArrayStr(char *arrayStr,
             pg_strcasecmp(itemstart, "NULL") == 0)
         {
             /* it's a NULL item */
-            values[i] = InputFunctionCall(inputproc, NULL,
-                                          typioparam, typmod);
+            if (!InputFunctionCallSafe(inputproc, NULL,
+                                       typioparam, typmod,
+                                       escontext,
+                                       &values[i]))
+                return false;
             nulls[i] = true;
         }
         else
         {
-            values[i] = InputFunctionCall(inputproc, itemstart,
-                                          typioparam, typmod);
+            if (!InputFunctionCallSafe(inputproc, itemstart,
+                                       typioparam, typmod,
+                                       escontext,
+                                       &values[i]))
+                return false;
             nulls[i] = false;
         }
     }
@@ -930,7 +954,7 @@ ReadArrayStr(char *arrayStr,
             totbytes = att_align_nominal(totbytes, typalign);
             /* check for overflow of total request */
             if (!AllocSizeIsValid(totbytes))
-                ereport(ERROR,
+                ereturn(escontext, false,
                         (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                          errmsg("array size exceeds the maximum allowed (%d)",
                                 (int) MaxAllocSize)));
@@ -938,6 +962,7 @@ ReadArrayStr(char *arrayStr,
     }
     *hasnulls = hasnull;
     *nbytes = totbytes;
+    return true;
 }


diff --git a/src/backend/utils/adt/arrayutils.c b/src/backend/utils/adt/arrayutils.c
index 051169a149..c52adc6259 100644
--- a/src/backend/utils/adt/arrayutils.c
+++ b/src/backend/utils/adt/arrayutils.c
@@ -74,6 +74,16 @@ ArrayGetOffset0(int n, const int *tup, const int *scale)
  */
 int
 ArrayGetNItems(int ndim, const int *dims)
+{
+    return ArrayGetNItemsSafe(ndim, dims, NULL);
+}
+
+/*
+ * This entry point can return the error into an ErrorSaveContext
+ * instead of throwing an exception.  -1 is returned after an error.
+ */
+int
+ArrayGetNItemsSafe(int ndim, const int *dims, NodePtr escontext)
 {
     int32        ret;
     int            i;
@@ -89,7 +99,7 @@ ArrayGetNItems(int ndim, const int *dims)

         /* A negative dimension implies that UB-LB overflowed ... */
         if (dims[i] < 0)
-            ereport(ERROR,
+            ereturn(escontext, -1,
                     (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                      errmsg("array size exceeds the maximum allowed (%d)",
                             (int) MaxArraySize)));
@@ -98,14 +108,14 @@ ArrayGetNItems(int ndim, const int *dims)

         ret = (int32) prod;
         if ((int64) ret != prod)
-            ereport(ERROR,
+            ereturn(escontext, -1,
                     (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                      errmsg("array size exceeds the maximum allowed (%d)",
                             (int) MaxArraySize)));
     }
     Assert(ret >= 0);
     if ((Size) ret > MaxArraySize)
-        ereport(ERROR,
+        ereturn(escontext, -1,
                 (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                  errmsg("array size exceeds the maximum allowed (%d)",
                         (int) MaxArraySize)));
@@ -126,6 +136,17 @@ ArrayGetNItems(int ndim, const int *dims)
  */
 void
 ArrayCheckBounds(int ndim, const int *dims, const int *lb)
+{
+    (void) ArrayCheckBoundsSafe(ndim, dims, lb, NULL);
+}
+
+/*
+ * This entry point can return the error into an ErrorSaveContext
+ * instead of throwing an exception.
+ */
+bool
+ArrayCheckBoundsSafe(int ndim, const int *dims, const int *lb,
+                     NodePtr escontext)
 {
     int            i;

@@ -135,11 +156,13 @@ ArrayCheckBounds(int ndim, const int *dims, const int *lb)
         int32        sum PG_USED_FOR_ASSERTS_ONLY;

         if (pg_add_s32_overflow(dims[i], lb[i], &sum))
-            ereport(ERROR,
+            ereturn(escontext, false,
                     (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                      errmsg("array lower bound is too large: %d",
                             lb[i])));
     }
+
+    return true;
 }

 /*
diff --git a/src/backend/utils/adt/bool.c b/src/backend/utils/adt/bool.c
index cd7335287f..e291672ae4 100644
--- a/src/backend/utils/adt/bool.c
+++ b/src/backend/utils/adt/bool.c
@@ -148,13 +148,10 @@ boolin(PG_FUNCTION_ARGS)
     if (parse_bool_with_len(str, len, &result))
         PG_RETURN_BOOL(result);

-    ereport(ERROR,
+    ereturn(fcinfo->context, (Datum) 0,
             (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
              errmsg("invalid input syntax for type %s: \"%s\"",
                     "boolean", in_str)));
-
-    /* not reached */
-    PG_RETURN_BOOL(false);
 }

 /*
diff --git a/src/backend/utils/adt/int.c b/src/backend/utils/adt/int.c
index 42ddae99ef..e1837bee71 100644
--- a/src/backend/utils/adt/int.c
+++ b/src/backend/utils/adt/int.c
@@ -291,7 +291,7 @@ int4in(PG_FUNCTION_ARGS)
 {
     char       *num = PG_GETARG_CSTRING(0);

-    PG_RETURN_INT32(pg_strtoint32(num));
+    PG_RETURN_INT32(pg_strtoint32_safe(num, fcinfo->context));
 }

 /*
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index a64422c8d0..0de0bed0e8 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -166,8 +166,11 @@ invalid_syntax:
 /*
  * Convert input string to a signed 32 bit integer.
  *
- * Allows any number of leading or trailing whitespace characters. Will throw
- * ereport() upon bad input format or overflow.
+ * Allows any number of leading or trailing whitespace characters.
+ *
+ * pg_strtoint32() will throw ereport() upon bad input format or overflow;
+ * while pg_strtoint32_safe() instead returns such complaints in *escontext,
+ * if it's an ErrorSaveContext.
  *
  * NB: Accumulate input as an unsigned number, to deal with two's complement
  * representation of the most negative number, which can't be represented as a
@@ -175,6 +178,12 @@ invalid_syntax:
  */
 int32
 pg_strtoint32(const char *s)
+{
+    return pg_strtoint32_safe(s, NULL);
+}
+
+int32
+pg_strtoint32_safe(const char *s, Node *escontext)
 {
     const char *ptr = s;
     uint32        tmp = 0;
@@ -227,18 +236,16 @@ pg_strtoint32(const char *s)
     return (int32) tmp;

 out_of_range:
-    ereport(ERROR,
+    ereturn(escontext, 0,
             (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
              errmsg("value \"%s\" is out of range for type %s",
                     s, "integer")));

 invalid_syntax:
-    ereport(ERROR,
+    ereturn(escontext, 0,
             (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
              errmsg("invalid input syntax for type %s: \"%s\"",
                     "integer", s)));
-
-    return 0;                    /* keep compiler quiet */
 }

 /*
diff --git a/src/backend/utils/adt/rowtypes.c b/src/backend/utils/adt/rowtypes.c
index db843a0fbf..bdafcff02d 100644
--- a/src/backend/utils/adt/rowtypes.c
+++ b/src/backend/utils/adt/rowtypes.c
@@ -77,6 +77,7 @@ record_in(PG_FUNCTION_ARGS)
     char       *string = PG_GETARG_CSTRING(0);
     Oid            tupType = PG_GETARG_OID(1);
     int32        tupTypmod = PG_GETARG_INT32(2);
+    Node       *escontext = fcinfo->context;
     HeapTupleHeader result;
     TupleDesc    tupdesc;
     HeapTuple    tuple;
@@ -100,7 +101,7 @@ record_in(PG_FUNCTION_ARGS)
      * supply a valid typmod, and then we can do something useful for RECORD.
      */
     if (tupType == RECORDOID && tupTypmod < 0)
-        ereport(ERROR,
+        ereturn(escontext, (Datum) 0,
                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                  errmsg("input of anonymous composite types is not implemented")));

@@ -152,10 +153,13 @@ record_in(PG_FUNCTION_ARGS)
     while (*ptr && isspace((unsigned char) *ptr))
         ptr++;
     if (*ptr++ != '(')
-        ereport(ERROR,
+    {
+        errsave(escontext,
                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                  errmsg("malformed record literal: \"%s\"", string),
                  errdetail("Missing left parenthesis.")));
+        goto fail;
+    }

     initStringInfo(&buf);

@@ -181,10 +185,13 @@ record_in(PG_FUNCTION_ARGS)
                 ptr++;
             else
                 /* *ptr must be ')' */
-                ereport(ERROR,
+            {
+                errsave(escontext,
                         (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                          errmsg("malformed record literal: \"%s\"", string),
                          errdetail("Too few columns.")));
+                goto fail;
+            }
         }

         /* Check for null: completely empty input means null */
@@ -204,19 +211,25 @@ record_in(PG_FUNCTION_ARGS)
                 char        ch = *ptr++;

                 if (ch == '\0')
-                    ereport(ERROR,
+                {
+                    errsave(escontext,
                             (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                              errmsg("malformed record literal: \"%s\"",
                                     string),
                              errdetail("Unexpected end of input.")));
+                    goto fail;
+                }
                 if (ch == '\\')
                 {
                     if (*ptr == '\0')
-                        ereport(ERROR,
+                    {
+                        errsave(escontext,
                                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                  errmsg("malformed record literal: \"%s\"",
                                         string),
                                  errdetail("Unexpected end of input.")));
+                        goto fail;
+                    }
                     appendStringInfoChar(&buf, *ptr++);
                 }
                 else if (ch == '"')
@@ -252,10 +265,13 @@ record_in(PG_FUNCTION_ARGS)
             column_info->column_type = column_type;
         }

-        values[i] = InputFunctionCall(&column_info->proc,
-                                      column_data,
-                                      column_info->typioparam,
-                                      att->atttypmod);
+        if (!InputFunctionCallSafe(&column_info->proc,
+                                   column_data,
+                                   column_info->typioparam,
+                                   att->atttypmod,
+                                   escontext,
+                                   &values[i]))
+            goto fail;

         /*
          * Prep for next column
@@ -264,18 +280,24 @@ record_in(PG_FUNCTION_ARGS)
     }

     if (*ptr++ != ')')
-        ereport(ERROR,
+    {
+        errsave(escontext,
                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                  errmsg("malformed record literal: \"%s\"", string),
                  errdetail("Too many columns.")));
+        goto fail;
+    }
     /* Allow trailing whitespace */
     while (*ptr && isspace((unsigned char) *ptr))
         ptr++;
     if (*ptr)
-        ereport(ERROR,
+    {
+        errsave(escontext,
                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                  errmsg("malformed record literal: \"%s\"", string),
                  errdetail("Junk after right parenthesis.")));
+        goto fail;
+    }

     tuple = heap_form_tuple(tupdesc, values, nulls);

@@ -294,6 +316,11 @@ record_in(PG_FUNCTION_ARGS)
     ReleaseTupleDesc(tupdesc);

     PG_RETURN_HEAPTUPLEHEADER(result);
+
+    /* exit here once we've done lookup_rowtype_tupdesc */
+fail:
+    ReleaseTupleDesc(tupdesc);
+    PG_RETURN_NULL();
 }

 /*
diff --git a/src/include/utils/array.h b/src/include/utils/array.h
index 2f794d1168..5ecb436a08 100644
--- a/src/include/utils/array.h
+++ b/src/include/utils/array.h
@@ -447,7 +447,11 @@ extern void array_free_iterator(ArrayIterator iterator);
 extern int    ArrayGetOffset(int n, const int *dim, const int *lb, const int *indx);
 extern int    ArrayGetOffset0(int n, const int *tup, const int *scale);
 extern int    ArrayGetNItems(int ndim, const int *dims);
+extern int    ArrayGetNItemsSafe(int ndim, const int *dims,
+                               NodePtr escontext);
 extern void ArrayCheckBounds(int ndim, const int *dims, const int *lb);
+extern bool ArrayCheckBoundsSafe(int ndim, const int *dims, const int *lb,
+                                 NodePtr escontext);
 extern void mda_get_range(int n, int *span, const int *st, const int *endp);
 extern void mda_get_prod(int n, const int *range, int *prod);
 extern void mda_get_offset_values(int n, int *dist, const int *prod, const int *span);
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 81631f1645..fbfd8375e3 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -45,6 +45,7 @@ extern int    namestrcmp(Name name, const char *str);
 /* numutils.c */
 extern int16 pg_strtoint16(const char *s);
 extern int32 pg_strtoint32(const char *s);
+extern int32 pg_strtoint32_safe(const char *s, Node *escontext);
 extern int64 pg_strtoint64(const char *s);
 extern int    pg_itoa(int16 i, char *a);
 extern int    pg_ultoa_n(uint32 value, char *a);
diff --git a/src/backend/utils/adt/misc.c b/src/backend/utils/adt/misc.c
index 9c13251231..0318441b7e 100644
--- a/src/backend/utils/adt/misc.c
+++ b/src/backend/utils/adt/misc.c
@@ -32,6 +32,7 @@
 #include "common/keywords.h"
 #include "funcapi.h"
 #include "miscadmin.h"
+#include "nodes/miscnodes.h"
 #include "parser/scansup.h"
 #include "pgstat.h"
 #include "postmaster/syslogger.h"
@@ -45,6 +46,22 @@
 #include "utils/ruleutils.h"
 #include "utils/timestamp.h"

+/*
+ * structure to cache metadata needed in pg_input_is_valid_common
+ */
+typedef struct BasicIOData
+{
+    Oid            typoid;
+    Oid            typiofunc;
+    Oid            typioparam;
+    FmgrInfo    proc;
+} BasicIOData;
+
+static bool pg_input_is_valid_common(FunctionCallInfo fcinfo,
+                                     text *txt, Oid typoid, int32 typmod,
+                                     ErrorSaveContext *escontext);
+
+
 /*
  * Common subroutine for num_nulls() and num_nonnulls().
  * Returns true if successful, false if function should return NULL.
@@ -640,6 +657,146 @@ pg_column_is_updatable(PG_FUNCTION_ARGS)
 }


+/*
+ * pg_input_is_valid - test whether string is valid input for datatype.
+ *
+ * Returns true if OK, false if not.
+ *
+ * This will only work usefully if the datatype's input function has been
+ * updated to return "safe" errors via errsave/ereturn.
+ */
+Datum
+pg_input_is_valid(PG_FUNCTION_ARGS)
+{
+    text       *txt = PG_GETARG_TEXT_PP(0);
+    Oid            typoid = PG_GETARG_OID(1);
+    ErrorSaveContext escontext;
+
+    /* Set up empty ErrorSaveContext */
+    memset(&escontext, 0, sizeof(escontext));
+    escontext.type = T_ErrorSaveContext;
+
+    PG_RETURN_BOOL(pg_input_is_valid_common(fcinfo, txt, typoid, -1,
+                                            &escontext));
+}
+
+/* Same, with non-default typmod */
+Datum
+pg_input_is_valid_mod(PG_FUNCTION_ARGS)
+{
+    text       *txt = PG_GETARG_TEXT_PP(0);
+    Oid            typoid = PG_GETARG_OID(1);
+    int32        typmod = PG_GETARG_INT32(2);
+    ErrorSaveContext escontext;
+
+    /* Set up empty ErrorSaveContext */
+    memset(&escontext, 0, sizeof(escontext));
+    escontext.type = T_ErrorSaveContext;
+
+    PG_RETURN_BOOL(pg_input_is_valid_common(fcinfo, txt, typoid, typmod,
+                                            &escontext));
+}
+
+/*
+ * pg_input_invalid_message - get error message if string is invalid input.
+ *
+ * Returns NULL if OK, else the primary message string from the error.
+ *
+ * This will only work usefully if the datatype's input function has been
+ * updated to return "safe" errors via errsave/ereturn.
+ */
+Datum
+pg_input_invalid_message(PG_FUNCTION_ARGS)
+{
+    text       *txt = PG_GETARG_TEXT_PP(0);
+    Oid            typoid = PG_GETARG_OID(1);
+    ErrorSaveContext escontext;
+
+    /* Set up empty ErrorSaveContext, but enable details_wanted */
+    memset(&escontext, 0, sizeof(escontext));
+    escontext.type = T_ErrorSaveContext;
+    escontext.details_wanted = true;
+
+    if (pg_input_is_valid_common(fcinfo, txt, typoid, -1,
+                                 &escontext))
+        PG_RETURN_NULL();
+
+    Assert(escontext.error_occurred);
+    Assert(escontext.error_data != NULL);
+    Assert(escontext.error_data->message != NULL);
+
+    PG_RETURN_TEXT_P(cstring_to_text(escontext.error_data->message));
+}
+
+/* Same, with non-default typmod */
+Datum
+pg_input_invalid_message_mod(PG_FUNCTION_ARGS)
+{
+    text       *txt = PG_GETARG_TEXT_PP(0);
+    Oid            typoid = PG_GETARG_OID(1);
+    int32        typmod = PG_GETARG_INT32(2);
+    ErrorSaveContext escontext;
+
+    /* Set up empty ErrorSaveContext, but enable details_wanted */
+    memset(&escontext, 0, sizeof(escontext));
+    escontext.type = T_ErrorSaveContext;
+    escontext.details_wanted = true;
+
+    if (pg_input_is_valid_common(fcinfo, txt, typoid, typmod,
+                                 &escontext))
+        PG_RETURN_NULL();
+
+    Assert(escontext.error_occurred);
+    Assert(escontext.error_data != NULL);
+    Assert(escontext.error_data->message != NULL);
+
+    PG_RETURN_TEXT_P(cstring_to_text(escontext.error_data->message));
+}
+
+/* Common subroutine for the above */
+static bool
+pg_input_is_valid_common(FunctionCallInfo fcinfo,
+                         text *txt, Oid typoid, int32 typmod,
+                         ErrorSaveContext *escontext)
+{
+    char       *str = text_to_cstring(txt);
+    BasicIOData *my_extra;
+    Datum        converted;
+
+    /*
+     * We arrange to look up the needed I/O info just once per series of
+     * calls, assuming the data type doesn't change underneath us.
+     */
+    my_extra = (BasicIOData *) fcinfo->flinfo->fn_extra;
+    if (my_extra == NULL)
+    {
+        fcinfo->flinfo->fn_extra =
+            MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,
+                               sizeof(BasicIOData));
+        my_extra = (BasicIOData *) fcinfo->flinfo->fn_extra;
+        my_extra->typoid = InvalidOid;
+    }
+
+    if (my_extra->typoid != typoid)
+    {
+        getTypeInputInfo(typoid,
+                         &my_extra->typiofunc,
+                         &my_extra->typioparam);
+        fmgr_info_cxt(my_extra->typiofunc, &my_extra->proc,
+                      fcinfo->flinfo->fn_mcxt);
+        my_extra->typoid = typoid;
+    }
+
+    /* Now we can try to perform the conversion */
+    return InputFunctionCallSafe(&my_extra->proc,
+                                 str,
+                                 my_extra->typioparam,
+                                 typmod,
+                                 (Node *) escontext,
+                                 &converted);
+}
+
+
 /*
  * Is character a valid identifier start?
  * Must match scan.l's {ident_start} character class.
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index f9301b2627..d178a5a8ec 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7060,6 +7060,23 @@
   prorettype => 'regnamespace', proargtypes => 'text',
   prosrc => 'to_regnamespace' },

+{ oid => '8050', descr => 'test whether string is valid input for data type',
+  proname => 'pg_input_is_valid', provolatile => 's', prorettype => 'bool',
+  proargtypes => 'text regtype', prosrc => 'pg_input_is_valid' },
+{ oid => '8051', descr => 'test whether string is valid input for data type',
+  proname => 'pg_input_is_valid', provolatile => 's', prorettype => 'bool',
+  proargtypes => 'text regtype int4', prosrc => 'pg_input_is_valid_mod' },
+{ oid => '8052',
+  descr => 'get error message if string is not valid input for data type',
+  proname => 'pg_input_invalid_message', provolatile => 's',
+  prorettype => 'text', proargtypes => 'text regtype',
+  prosrc => 'pg_input_invalid_message' },
+{ oid => '8053',
+  descr => 'get error message if string is not valid input for data type',
+  proname => 'pg_input_invalid_message', provolatile => 's',
+  prorettype => 'text', proargtypes => 'text regtype int4',
+  prosrc => 'pg_input_invalid_message_mod' },
+
 { oid => '1268',
   descr => 'parse qualified identifier to array of identifiers',
   proname => 'parse_ident', prorettype => '_text', proargtypes => 'text bool',
diff --git a/src/test/regress/expected/arrays.out b/src/test/regress/expected/arrays.out
index 97920f38c2..5253541470 100644
--- a/src/test/regress/expected/arrays.out
+++ b/src/test/regress/expected/arrays.out
@@ -182,6 +182,31 @@ SELECT a,b,c FROM arrtest;
  [4:4]={NULL}  | {3,4}                 | {foo,new_word}
 (3 rows)

+-- test non-error-throwing API
+SELECT pg_input_is_valid('{1,2,3}', 'integer[]');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('{1,2', 'integer[]');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_is_valid('{1,zed}', 'integer[]');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_invalid_message('{1,zed}', 'integer[]');
+           pg_input_invalid_message
+----------------------------------------------
+ invalid input syntax for type integer: "zed"
+(1 row)
+
 -- test mixed slice/scalar subscripting
 select '{{1,2,3},{4,5,6},{7,8,9}}'::int[];
            int4
diff --git a/src/test/regress/expected/boolean.out b/src/test/regress/expected/boolean.out
index 4728fe2dfd..08c7e45803 100644
--- a/src/test/regress/expected/boolean.out
+++ b/src/test/regress/expected/boolean.out
@@ -142,6 +142,25 @@ SELECT bool '' AS error;
 ERROR:  invalid input syntax for type boolean: ""
 LINE 1: SELECT bool '' AS error;
                     ^
+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('true', 'bool');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('asdf', 'bool');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_invalid_message('junk', 'bool');
+           pg_input_invalid_message
+-----------------------------------------------
+ invalid input syntax for type boolean: "junk"
+(1 row)
+
 -- and, or, not in qualifications
 SELECT bool 't' or bool 'f' AS true;
  true
diff --git a/src/test/regress/expected/create_type.out b/src/test/regress/expected/create_type.out
index 0dfc88c1c8..7383fcdbb1 100644
--- a/src/test/regress/expected/create_type.out
+++ b/src/test/regress/expected/create_type.out
@@ -249,6 +249,31 @@ select format_type('bpchar'::regtype, -1);
  bpchar
 (1 row)

+-- Test non-error-throwing APIs using widget, which still throws errors
+SELECT pg_input_is_valid('(1,2,3)', 'widget');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('(1,2)', 'widget');  -- hard error expected
+ERROR:  invalid input syntax for type widget: "(1,2)"
+SELECT pg_input_is_valid('{"(1,2,3)"}', 'widget[]');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('{"(1,2)"}', 'widget[]');  -- hard error expected
+ERROR:  invalid input syntax for type widget: "(1,2)"
+SELECT pg_input_is_valid('("(1,2,3)")', 'mytab');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('("(1,2)")', 'mytab');  -- hard error expected
+ERROR:  invalid input syntax for type widget: "(1,2)"
 -- Test creation of an operator over a user-defined type
 CREATE FUNCTION pt_in_widget(point, widget)
    RETURNS bool
diff --git a/src/test/regress/expected/int4.out b/src/test/regress/expected/int4.out
index fbcc0e8d9e..cc5dfc092c 100644
--- a/src/test/regress/expected/int4.out
+++ b/src/test/regress/expected/int4.out
@@ -45,6 +45,31 @@ SELECT * FROM INT4_TBL;
  -2147483647
 (5 rows)

+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('34', 'int4');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('asdf', 'int4');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_is_valid('1000000000000', 'int4');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_invalid_message('1000000000000', 'int4');
+                pg_input_invalid_message
+--------------------------------------------------------
+ value "1000000000000" is out of range for type integer
+(1 row)
+
 SELECT i.* FROM INT4_TBL i WHERE i.f1 <> int2 '0';
      f1
 -------------
diff --git a/src/test/regress/expected/rowtypes.out b/src/test/regress/expected/rowtypes.out
index a4cc2d8c12..47d4fe7518 100644
--- a/src/test/regress/expected/rowtypes.out
+++ b/src/test/regress/expected/rowtypes.out
@@ -69,6 +69,32 @@ ERROR:  malformed record literal: "(Joe,Blow) /"
 LINE 1: select '(Joe,Blow) /'::fullname;
                ^
 DETAIL:  Junk after right parenthesis.
+-- test non-error-throwing API
+create type twoints as (r integer, i integer);
+SELECT pg_input_is_valid('(1,2)', 'twoints');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('(1,2', 'twoints');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_is_valid('(1,zed)', 'twoints');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_invalid_message('(1,zed)', 'twoints');
+           pg_input_invalid_message
+----------------------------------------------
+ invalid input syntax for type integer: "zed"
+(1 row)
+
 create temp table quadtable(f1 int, q quad);
 insert into quadtable values (1, ((3.3,4.4),(5.5,6.6)));
 insert into quadtable values (2, ((null,4.4),(5.5,6.6)));
diff --git a/src/test/regress/regress.c b/src/test/regress/regress.c
index 548afb4438..d6e6733670 100644
--- a/src/test/regress/regress.c
+++ b/src/test/regress/regress.c
@@ -183,6 +183,11 @@ widget_in(PG_FUNCTION_ARGS)
             coord[i++] = p + 1;
     }

+    /*
+     * Note: DON'T convert this error to "safe" style (errsave/ereturn).  We
+     * want this data type to stay permanently in the hard-error world so that
+     * it can be used for testing that such cases still work reasonably.
+     */
     if (i < NARGS)
         ereport(ERROR,
                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
diff --git a/src/test/regress/sql/arrays.sql b/src/test/regress/sql/arrays.sql
index 791af5c0ce..cb91458b74 100644
--- a/src/test/regress/sql/arrays.sql
+++ b/src/test/regress/sql/arrays.sql
@@ -113,6 +113,12 @@ SELECT a FROM arrtest WHERE a[2] IS NULL;
 DELETE FROM arrtest WHERE a[2] IS NULL AND b IS NULL;
 SELECT a,b,c FROM arrtest;

+-- test non-error-throwing API
+SELECT pg_input_is_valid('{1,2,3}', 'integer[]');
+SELECT pg_input_is_valid('{1,2', 'integer[]');
+SELECT pg_input_is_valid('{1,zed}', 'integer[]');
+SELECT pg_input_invalid_message('{1,zed}', 'integer[]');
+
 -- test mixed slice/scalar subscripting
 select '{{1,2,3},{4,5,6},{7,8,9}}'::int[];
 select ('{{1,2,3},{4,5,6},{7,8,9}}'::int[])[1:2][2];
diff --git a/src/test/regress/sql/boolean.sql b/src/test/regress/sql/boolean.sql
index 4dd47aaf9d..fc463d705c 100644
--- a/src/test/regress/sql/boolean.sql
+++ b/src/test/regress/sql/boolean.sql
@@ -62,6 +62,11 @@ SELECT bool '000' AS error;

 SELECT bool '' AS error;

+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('true', 'bool');
+SELECT pg_input_is_valid('asdf', 'bool');
+SELECT pg_input_invalid_message('junk', 'bool');
+
 -- and, or, not in qualifications

 SELECT bool 't' or bool 'f' AS true;
diff --git a/src/test/regress/sql/create_type.sql b/src/test/regress/sql/create_type.sql
index c6fc4f9029..c25018029c 100644
--- a/src/test/regress/sql/create_type.sql
+++ b/src/test/regress/sql/create_type.sql
@@ -192,6 +192,14 @@ select format_type('bpchar'::regtype, null);
 -- this behavior difference is intentional
 select format_type('bpchar'::regtype, -1);

+-- Test non-error-throwing APIs using widget, which still throws errors
+SELECT pg_input_is_valid('(1,2,3)', 'widget');
+SELECT pg_input_is_valid('(1,2)', 'widget');  -- hard error expected
+SELECT pg_input_is_valid('{"(1,2,3)"}', 'widget[]');
+SELECT pg_input_is_valid('{"(1,2)"}', 'widget[]');  -- hard error expected
+SELECT pg_input_is_valid('("(1,2,3)")', 'mytab');
+SELECT pg_input_is_valid('("(1,2)")', 'mytab');  -- hard error expected
+
 -- Test creation of an operator over a user-defined type

 CREATE FUNCTION pt_in_widget(point, widget)
diff --git a/src/test/regress/sql/int4.sql b/src/test/regress/sql/int4.sql
index f19077f3da..a188731178 100644
--- a/src/test/regress/sql/int4.sql
+++ b/src/test/regress/sql/int4.sql
@@ -17,6 +17,12 @@ INSERT INTO INT4_TBL(f1) VALUES ('');

 SELECT * FROM INT4_TBL;

+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('34', 'int4');
+SELECT pg_input_is_valid('asdf', 'int4');
+SELECT pg_input_is_valid('1000000000000', 'int4');
+SELECT pg_input_invalid_message('1000000000000', 'int4');
+
 SELECT i.* FROM INT4_TBL i WHERE i.f1 <> int2 '0';

 SELECT i.* FROM INT4_TBL i WHERE i.f1 <> int4 '0';
diff --git a/src/test/regress/sql/rowtypes.sql b/src/test/regress/sql/rowtypes.sql
index ad5b7e128f..d558d66eb6 100644
--- a/src/test/regress/sql/rowtypes.sql
+++ b/src/test/regress/sql/rowtypes.sql
@@ -31,6 +31,13 @@ select '[]'::fullname;          -- bad
 select ' (Joe,Blow)  '::fullname;  -- ok, extra whitespace
 select '(Joe,Blow) /'::fullname;  -- bad

+-- test non-error-throwing API
+create type twoints as (r integer, i integer);
+SELECT pg_input_is_valid('(1,2)', 'twoints');
+SELECT pg_input_is_valid('(1,2', 'twoints');
+SELECT pg_input_is_valid('(1,zed)', 'twoints');
+SELECT pg_input_invalid_message('(1,zed)', 'twoints');
+
 create temp table quadtable(f1 int, q quad);

 insert into quadtable values (1, ((3.3,4.4),(5.5,6.6)));

Re: Error-safe user functions

From

Tom Lane

Date:

06 December 2022, 20:29:15

Robert Haas <robertmhaas@gmail.com> writes:
> I feel like this can go either way. If we pick a name that conveys a
> specific intended behavior now, and then later we want to pass some
> other sort of node for some purpose other than ignoring errors, it's
> unpleasant to have a name that sounds like it can only ignore errors.
> But if we never use it for anything other than ignoring errors, a
> specific name is clearer.

With Andres' proposal to make the function return boolean succeed/fail,
I think it's pretty clear that the only useful case is to pass an
ErrorSaveContext.  There may well be future APIs that pass some other
kind of context object to input functions, but they'll presumably
have different goals and want a different sort of wrapper function.

            regards, tom lane

Re: Error-safe user functions

From

Andrew Dunstan

Date:

07 December 2022, 13:47:37

On 2022-12-06 Tu 15:21, Tom Lane wrote:
> OK, here's a v3 responding to the comments from Andres.


Looks pretty good to me.


>
> 0000 is preliminary refactoring of elog.c, with (I trust) no
> functional effect.  It gets rid of some pre-existing code duplication
> as well as setting up to let 0001's additions be less duplicative.
>
> 0001 adopts use of Node pointers in place of "void *".  To do this
> I needed an alias type in elog.h equivalent to fmgr.h's fmNodePtr.
> I decided that having two different aliases would be too confusing,
> so what I did here was to converge both elog.h and fmgr.h on using
> the same alias "typedef struct Node *NodePtr".  That has to be in
> elog.h since it's included first, from postgres.h.  (I thought of
> defining NodePtr in postgres.h, but postgres.h includes elog.h
> immediately so that wouldn't have looked very nice.)
>
> I also adopted Andres' recommendation that InputFunctionCallSafe
> return boolean.  I'm still not totally sold on that ... but it does
> end with array_in and record_in never using SAFE_ERROR_OCCURRED at
> all, so maybe the idea's OK.


Originally I wanted to make the new function look as much like the
original as possible, but I'm not wedded to that either. I can live with
it like this.


>
> 0002 adjusts the I/O functions for these API changes, and fixes
> my silly oversight about error cleanup in record_in.
>
> Given the discussion about testing requirements, I threw away the
> COPY hack entirely.  This 0003 provides a couple of SQL-callable
> functions that can be used to invoke a specific datatype's input
> function.  I haven't documented them, pending bikeshedding on
> names etc.  I also arranged to test array_in and record_in with
> a datatype that still throws errors, reserving the existing test
> type "widget" for that purpose.
>
> (I'm not intending to foreclose development of new COPY features
> in this area, just abandoning the idea that that's our initial
> test mechanism.)
>

The new functions on their own are likely to make plenty of people quite
happy once we've adjusted all the input functions.

Perhaps we should add a type in the regress library that will never have
a safe input function, so we can test that the mechanism works as
expected in that case even after we adjust all the core data types'
input functions.

Otherwise I think we're good to go.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Error-safe user functions

From

Tom Lane

Date:

07 December 2022, 14:20:33

Andrew Dunstan <andrew@dunslane.net> writes:
> Perhaps we should add a type in the regress library that will never have
> a safe input function, so we can test that the mechanism works as
> expected in that case even after we adjust all the core data types'
> input functions.

I was intending that the existing "widget" type be that.  0003 already
adds a comment to widget_in saying not to "fix" its one ereport call.

Returning to the naming quagmire -- it occurred to me just now that
it might be helpful to call this style of error reporting "soft"
errors rather than "safe" errors, which'd provide a nice contrast
with "hard" errors thrown by longjmp'ing.  That would lead to naming
all the variant functions XXXSoft not XXXSafe.  There would still
be commentary to the effect that "soft errors must be safe, in the
sense that there's no question whether it's safe to continue
processing the transaction".  Anybody think that'd be an
improvement?
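
To make the hard/soft contrast concrete, here is a minimal standalone C
sketch. It is not PostgreSQL code: SoftErrorContext, parse_int and
hard_error_raised are made-up names for illustration, and the real
ereport/ErrorSaveContext machinery is considerably more involved. The
assumption is only that a "hard" error leaves the function via longjmp,
while a "soft" error is recorded in a caller-supplied struct and the
function returns normally.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an error-save context (not PostgreSQL's struct). */
typedef struct SoftErrorContext
{
    bool        error_occurred;
} SoftErrorContext;

/* Landing point for "hard" errors; the caller must have called setjmp(). */
static jmp_buf hard_error_jump;

/*
 * Parse a decimal int.  With ctx == NULL a failure is "hard": control
 * longjmp's away and never returns to the caller's next statement.  With
 * ctx != NULL the failure is "soft": it is noted in *ctx and we return false.
 */
static bool
parse_int(const char *str, SoftErrorContext *ctx, int *result)
{
    char       *end;
    long        val;

    assert(str != NULL && result != NULL);

    errno = 0;
    val = strtol(str, &end, 10);
    if (end == str || *end != '\0' || errno == ERANGE ||
        val < INT_MIN || val > INT_MAX)
    {
        if (ctx == NULL)
            longjmp(hard_error_jump, 1);    /* hard error */
        ctx->error_occurred = true;         /* soft error */
        return false;
    }
    *result = (int) val;
    return true;
}

/* Report whether parsing str without a context raised a hard error. */
static bool
hard_error_raised(const char *str)
{
    int         result;

    if (setjmp(hard_error_jump) != 0)
        return true;            /* we got here via longjmp */
    (void) parse_int(str, NULL, &result);
    return false;
}
```

In this sketch the soft path leaves the caller free to carry on with the
next input value, which is the sense in which a soft error has to be
"safe".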

            regards, tom lane

Re: Error-safe user functions

From

"David G. Johnston"

Date:

07 December 2022, 14:51:12

On Wed, Dec 7, 2022 at 7:20 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Returning to the naming quagmire -- it occurred to me just now that
> it might be helpful to call this style of error reporting "soft"
> errors rather than "safe" errors, which'd provide a nice contrast
> with "hard" errors thrown by longjmp'ing. That would lead to naming
> all the variant functions XXXSoft not XXXSafe. There would still
> be commentary to the effect that "soft errors must be safe, in the
> sense that there's no question whether it's safe to continue
> processing the transaction". Anybody think that'd be an
> improvement?

David J.

Re: Error-safe user functions

From

Andrew Dunstan

Date:

07 December 2022, 15:04:01

On 2022-12-07 We 09:20, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Perhaps we should add a type in the regress library that will never have
>> a safe input function, so we can test that the mechanism works as
>> expected in that case even after we adjust all the core data types'
>> input functions.
> I was intending that the existing "widget" type be that.  0003 already
> adds a comment to widget_in saying not to "fix" its one ereport call.


Yeah, I see that, I must have been insufficiently caffeinated.


>
> Returning to the naming quagmire -- it occurred to me just now that
> it might be helpful to call this style of error reporting "soft"
> errors rather than "safe" errors, which'd provide a nice contrast
> with "hard" errors thrown by longjmp'ing.  That would lead to naming
> all the variant functions XXXSoft not XXXSafe.  There would still
> be commentary to the effect that "soft errors must be safe, in the
> sense that there's no question whether it's safe to continue
> processing the transaction".  Anybody think that'd be an
> improvement?
>
>


I'm not sure InputFunctionCallSoft would be an improvement. Maybe
InputFunctionCallSoftError would be clearer, but I don't know that it's
much of an improvement either. The same goes for the other visible changes.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Error-safe user functions

From

Tom Lane

Date:

07 December 2022, 15:23:22

Andrew Dunstan <andrew@dunslane.net> writes:
> On 2022-12-07 We 09:20, Tom Lane wrote:
>> Returning to the naming quagmire -- it occurred to me just now that
>> it might be helpful to call this style of error reporting "soft"
>> errors rather than "safe" errors, which'd provide a nice contrast
>> with "hard" errors thrown by longjmp'ing.  That would lead to naming
>> all the variant functions XXXSoft not XXXSafe.

> I'm not sure InputFunctionCallSoft would be an improvement.

Yeah, after reflecting on it a bit more I'm not that impressed with
that as a function name either.

(I think that "soft error" could be useful as informal terminology.
AFAIR we don't use "hard error" in any formal way either, but there
are certainly comments using that phrase.)

More questions:

* Anyone want to bikeshed about the new SQL-level function names?
I'm reasonably satisfied with "pg_input_is_valid" for the bool-returning
variant, but not so much with "pg_input_invalid_message" for the
error-message-returning variant.  Thinking about "pg_input_error_message"
instead, but that's not stellar either.

* Where in the world shall we document these, if we document them?
The only section of chapter 9 that seems even a little bit appropriate
is "9.26. System Information Functions and Operators", and even there,
they would need their own new table because they don't fit well in any
existing table.

BTW, does anyone else agree that 9.26 is desperately in need of some
<sect2> subdivisions?  It seems to have gotten a lot longer since
I looked at it last.

            regards, tom lane

Re: Error-safe user functions

From

"David G. Johnston"

Date:

07 December 2022, 15:33:11

On Wed, Dec 7, 2022 at 8:04 AM Andrew Dunstan <andrew@dunslane.net> wrote:

> On 2022-12-07 We 09:20, Tom Lane wrote:
> > Andrew Dunstan <andrew@dunslane.net> writes:
> >> Perhaps we should add a type in the regress library that will never have
> >> a safe input function, so we can test that the mechanism works as
> >> expected in that case even after we adjust all the core data types'
> >> input functions.
> > I was intending that the existing "widget" type be that. 0003 already
> > adds a comment to widget_in saying not to "fix" its one ereport call.
>
> Yeah, I see that, I must have been insufficiently caffeinated.
>
> > Returning to the naming quagmire -- it occurred to me just now that
> > it might be helpful to call this style of error reporting "soft"
> > errors rather than "safe" errors, which'd provide a nice contrast
> > with "hard" errors thrown by longjmp'ing. That would lead to naming
> > all the variant functions XXXSoft not XXXSafe. There would still
> > be commentary to the effect that "soft errors must be safe, in the
> > sense that there's no question whether it's safe to continue
> > processing the transaction". Anybody think that'd be an
> > improvement?
>
> I'm not sure InputFunctionCallSoft would be an improvement. Maybe
> InputFunctionCallSoftError would be clearer, but I don't know that it's
> much of an improvement either. The same goes for the other visible changes.

InputFunctionCallSafe -> TryInputFunctionCall

I think in create type saying "input functions to handle errors softly" is an improvement over "input functions to return safe errors".

start->save->finish describes a soft error handling procedure quite well. safe has baggage, all code should be "safe".

fmgr/README: "Handling Non-Exception Errors" -> "Soft Error Handling"

"typical safe error conditions include" -> "error conditions that can be handled softly include"

(pg_input_is_valid) "input function has been updated to return "safe" errors" -> "input function has been updated to soft error handling"

Unrelated observation: "Although the error stack is not large, we don't expect to run out of space." -> "Because the error stack is not large, assume that we will not run out of space and panic if we are wrong."?

David J.

Re: Error-safe user functions

From

"David G. Johnston"

Date:

07 December 2022, 15:49:18

On Wed, Dec 7, 2022 at 8:23 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Andrew Dunstan <andrew@dunslane.net> writes:
> > On 2022-12-07 We 09:20, Tom Lane wrote:
> >> Returning to the naming quagmire -- it occurred to me just now that
> >> it might be helpful to call this style of error reporting "soft"
> >> errors rather than "safe" errors, which'd provide a nice contrast
> >> with "hard" errors thrown by longjmp'ing. That would lead to naming
> >> all the variant functions XXXSoft not XXXSafe.
>
> > I'm not sure InputFunctionCallSoft would be an improvement.
>
> Yeah, after reflecting on it a bit more I'm not that impressed with
> that as a function name either.
>
> (I think that "soft error" could be useful as informal terminology.
> AFAIR we don't use "hard error" in any formal way either, but there
> are certainly comments using that phrase.)
>
> More questions:
>
> * Anyone want to bikeshed about the new SQL-level function names?
> I'm reasonably satisfied with "pg_input_is_valid" for the bool-returning
> variant, but not so much with "pg_input_invalid_message" for the
> error-message-returning variant. Thinking about "pg_input_error_message"
> instead, but that's not stellar either.

Why not do away with two separate functions and define a composite type (boolean, text) for is_valid to return?

> * Where in the world shall we document these, if we document them?
> The only section of chapter 9 that seems even a little bit appropriate
> is "9.26. System Information Functions and Operators", and even there,
> they would need their own new table because they don't fit well in any
> existing table.

I would indeed just add a table there.

> BTW, does anyone else agree that 9.26 is desperately in need of some
> <sect2> subdivisions? It seems to have gotten a lot longer since
> I looked at it last.

I'd be inclined to do something like what we are attempting for Chapter 28 Monitoring Database Activity; introduce pagination through refentry and build our own table of contents into it.

David J.

Re: Error-safe user functions

From

Tom Lane

Date:

07 December 2022, 16:06:16

"David G. Johnston" <david.g.johnston@gmail.com> writes:
> Why not do away with two separate functions and define a composite type
> (boolean, text) for is_valid to return?

I don't see any advantage to that.  It would be harder to use in both
use-cases.

>> BTW, does anyone else agree that 9.26 is desperately in need of some
>> <sect2> subdivisions?  It seems to have gotten a lot longer since
>> I looked at it last.

> I'd be inclined to do something like what we are attempting for Chapter 28
> Monitoring Database Activity; introduce pagination through refentry and
> build our own table of contents into it.

I'd prefer to follow the model that already exists in 9.27,
ie break it up with <sect2>'s, which provide a handy
sub-table-of-contents.

            regards, tom lane

Re: Error-safe user functions

From

"David G. Johnston"

Date:

07 December 2022, 16:15:31

On Wed, Dec 7, 2022 at 9:06 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> "David G. Johnston" <david.g.johnston@gmail.com> writes:
> > Why not do away with two separate functions and define a composite type
> > (boolean, text) for is_valid to return?
>
> I don't see any advantage to that. It would be harder to use in both
> use-cases.

I don't really see a use case for either of them individually. If all you are doing is printing them out in a test and checking the result, in what situation wouldn't you want to check that both the true/false and the message are as expected? Plus, you don't have to figure out a name for the second function.

> >> BTW, does anyone else agree that 9.26 is desperately in need of some
> >> <sect2> subdivisions? It seems to have gotten a lot longer since
> >> I looked at it last.
>
> > I'd be inclined to do something like what we are attempting for Chapter 28
> > Monitoring Database Activity; introduce pagination through refentry and
> > build our own table of contents into it.
>
> I'd prefer to follow the model that already exists in 9.27,
> ie break it up with <sect2>'s, which provide a handy
> sub-table-of-contents.

I have a bigger issue with the non-pagination myself; the extra bit of effort to manually create a tabular ToC (where we can add descriptions) seems like a worthy price to pay.

Are you suggesting we should not go down the path that v8-0003 does in the monitoring section cleanup thread? I find the usability of Chapter 54 System Views to be superior to these two run-on chapters and would rather we emulate it in both these places - for what is in the end very little additional effort, all mechanical in nature.

David J.

Re: Error-safe user functions

From

Tom Lane

Date:

07 December 2022, 16:59:09

"David G. Johnston" <david.g.johnston@gmail.com> writes:
> On Wed, Dec 7, 2022 at 9:06 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> "David G. Johnston" <david.g.johnston@gmail.com> writes:
>>> Why not do away with two separate functions and define a composite type
>>> (boolean, text) for is_valid to return?

>> I don't see any advantage to that.  It would be harder to use in both
>> use-cases.

> I don't really see a use case for either of them individually.

Uh, several people opined that pg_input_is_valid would be of field
interest.  If I thought these were only for testing purposes I wouldn't
be especially concerned about documenting them at all.

> Are you suggesting we should not go down the path that v8-0003 does in the
> monitoring section cleanup thread?  I find the usability of Chapter 54
> System Views to be superior to these two run-on chapters and would rather
> we emulate it in both these places - for what is in the end very little
> additional effort, all mechanical in nature.

I have not been following that thread, and am not really excited about
putting in a huge amount of documentation work here.  I'd just like 9.26
to have a mini-TOC at the page head, which <sect2>'s would be enough for.

            regards, tom lane

Re: Error-safe user functions

From

Corey Huinker

Date:

07 December 2022, 17:01:11

On Wed, Dec 7, 2022 at 9:20 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Andrew Dunstan <andrew@dunslane.net> writes:
> > Perhaps we should add a type in the regress library that will never have
> > a safe input function, so we can test that the mechanism works as
> > expected in that case even after we adjust all the core data types'
> > input functions.
>
> I was intending that the existing "widget" type be that. 0003 already
> adds a comment to widget_in saying not to "fix" its one ereport call.
>
> Returning to the naming quagmire -- it occurred to me just now that
> it might be helpful to call this style of error reporting "soft"
> errors rather than "safe" errors, which'd provide a nice contrast
> with "hard" errors thrown by longjmp'ing. That would lead to naming
> all the variant functions XXXSoft not XXXSafe. There would still
> be commentary to the effect that "soft errors must be safe, in the
> sense that there's no question whether it's safe to continue
> processing the transaction". Anybody think that'd be an
> improvement?

In my attempt to implement CAST...DEFAULT, I noticed that I immediately needed an OidInputFunctionCallSafe. It was trivial to write, and maybe it is something we want to add to the infra patch, but the comments around that function also somewhat indicate that we might want to just do the work in-place and call InputFunctionCallSafe directly. Open to both ideas.

Looking forward, this cascades up into coerce_type and its brethren, and reimplementing those from a Node returner to a boolean returner with a Node parameter seems a bit of a stretch. So I have to pick a point where the code pivots from passing down a safe-mode indicator to passing back a found_error indicator (the two may be combinable, since safe mode is always on when the found_error pointer is not null and always off when it is null), but for the most part things look doable.

Re: Error-safe user functions

From

"David G. Johnston"

Date:

07 December 2022, 17:02:47

On Wed, Dec 7, 2022 at 9:59 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> "David G. Johnston" <david.g.johnston@gmail.com> writes:
>
> > Are you suggesting we should not go down the path that v8-0003 does in the
> > monitoring section cleanup thread? I find the usability of Chapter 54
> > System Views to be superior to these two run-on chapters and would rather
> > we emulate it in both these places - for what is in the end very little
> > additional effort, all mechanical in nature.
>
> I have not been following that thread, and am not really excited about
> putting in a huge amount of documentation work here. I'd just like 9.26
> to have a mini-TOC at the page head, which <sect2>'s would be enough for.

So long as you aren't opposed to the idea if someone else does the work, adding sect2 is better than nothing even if it is just a stop-gap measure.

David J.

Re: Error-safe user functions

From

Andres Freund

Date:

07 December 2022, 17:17:25

On 2022-12-07 09:20:33 -0500, Tom Lane wrote:
> Returning to the naming quagmire -- it occurred to me just now that
> it might be helpful to call this style of error reporting "soft"
> errors rather than "safe" errors, which'd provide a nice contrast
> with "hard" errors thrown by longjmp'ing.

+1

Re: Error-safe user functions

From

Tom Lane

Date:

07 December 2022, 17:17:34

Corey Huinker <corey.huinker@gmail.com> writes:
> In my attempt to implement CAST...DEFAULT, I noticed that I immediately
> needed an
> OidInputFunctionCallSafe, which was trivial but maybe something we want to
> add to the infra patch, but the comments around that function also somewhat
> indicate that we might want to just do the work in-place and call
> InputFunctionCallSafe directly. Open to both ideas.

I'm a bit skeptical of that.  IMO using OidInputFunctionCall is only
appropriate in places that will be executed just once per query.
Otherwise, unless you have zero concern for performance, you should
be caching the function lookup.  (The test functions in my 0003 patch
illustrate the standard way to do that within SQL-callable functions.
If you're implementing CAST as a new kind of executable expression,
the lookup would likely happen in expression compilation.)
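
As a rough illustration of that caching pattern outside the backend, here
is a self-contained C sketch. Everything in it is hypothetical: CallSite
stands in loosely for the fn_extra slot used in the 0003 test functions,
and lookup_input_fn stands in for an expensive catalog lookup. The only
point is that the lookup runs once per series of calls and is redone only
when the requested type changes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical input-function signature for this sketch. */
typedef int (*InputFn) (const char *str);

/* Hypothetical per-call-site state, loosely analogous to fn_extra. */
typedef struct CallSite
{
    void       *extra;
} CallSite;

/* What we cache: which type was looked up, and the function found. */
typedef struct CachedLookup
{
    int         type_id;        /* 0 means nothing cached yet */
    InputFn     fn;
} CachedLookup;

/* Counts how often the "expensive" lookup actually runs. */
static int  lookup_count = 0;

static int  parse_as_int(const char *str) { return atoi(str); }
static int  parse_as_len(const char *str) { return (int) strlen(str); }

/* Stand-in for an expensive catalog lookup keyed by a made-up type id. */
static InputFn
lookup_input_fn(int type_id)
{
    lookup_count++;
    return (type_id == 1) ? parse_as_int : parse_as_len;
}

/* Call the input function for type_id, looking it up only when needed. */
static int
call_input_cached(CallSite *site, int type_id, const char *str)
{
    CachedLookup *cache;

    assert(site != NULL && type_id != 0);

    cache = (CachedLookup *) site->extra;
    if (cache == NULL)
    {
        /* First call at this site: allocate the cache (caller frees it). */
        cache = malloc(sizeof(CachedLookup));
        if (cache == NULL)
            abort();
        cache->type_id = 0;
        site->extra = cache;
    }

    if (cache->type_id != type_id)
    {
        /* Type changed (or first use): redo the lookup and remember it. */
        cache->fn = lookup_input_fn(type_id);
        cache->type_id = type_id;
    }

    return cache->fn(str);
}
```

An Oid-keyed convenience wrapper would, by contrast, pay for the lookup on
every call, which is the performance concern raised above.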

I don't say that OidInputFunctionCallSafe won't ever be useful, but
I doubt it's what we want in CAST.

            regards, tom lane

Re: Error-safe user functions

From

Tom Lane

Date:

07 December 2022, 17:20:55

"David G. Johnston" <david.g.johnston@gmail.com> writes:
> So long as you aren't opposed to the idea if someone else does the work,
> adding sect2 is better than nothing even if it is just a stop-gap measure.

OK, we can agree on that.

As for the other point ---  not sure why I didn't remember this right off,
but the point of two test functions is that one exercises the code path
with details_wanted = true while the other exercises details_wanted =
false.  A combined function would only test the first case.

            regards, tom lane
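
(Illustrative aside, not part of the original message.) The two code paths
mentioned above can be modelled in a few lines of standalone C. This is a
simplified sketch, not the patch: ToyErrorSaveContext, toy_errsave, and
toy_int4_in are hypothetical names, the message is stored in a fixed buffer
rather than an ErrorData, and a NULL context simply exits the process as a
stand-in for a hard ereport(ERROR).

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy counterpart of ErrorSaveContext (field names follow the patch). */
typedef struct ToyErrorSaveContext
{
	bool		error_occurred; /* set when a soft error is reported */
	bool		details_wanted; /* does the caller want the message too? */
	bool		has_message;	/* model-only flag: was message filled in? */
	char		message[128];
} ToyErrorSaveContext;

/* Report an error softly if a context was supplied, else fail hard. */
static void
toy_errsave(ToyErrorSaveContext *ctx, const char *what, const char *input)
{
	if (ctx == NULL)
	{
		/* Stand-in for a hard error: control does not return. */
		fprintf(stderr, "ERROR: %s: \"%s\"\n", what, input);
		exit(EXIT_FAILURE);
	}

	ctx->error_occurred = true;

	/* Skip building the message when only the flag is wanted. */
	if (!ctx->details_wanted)
		return;

	snprintf(ctx->message, sizeof(ctx->message), "%s: \"%s\"", what, input);
	ctx->has_message = true;
}

/* Toy input function: returns a dummy 0 after reporting a soft error. */
static int
toy_int4_in(const char *str, ToyErrorSaveContext *ctx)
{
	char	   *end;
	long		val;

	errno = 0;
	val = strtol(str, &end, 10);

	if (end == str || *end != '\0')
	{
		toy_errsave(ctx, "invalid input syntax for toy integer", str);
		return 0;
	}
	if (errno == ERANGE || val < INT_MIN || val > INT_MAX)
	{
		toy_errsave(ctx, "value out of range for toy integer", str);
		return 0;
	}
	return (int) val;
}
```

A caller that only needs a yes/no answer passes a zeroed context and checks
error_occurred; a caller that wants the text sets details_wanted first. In
this model, as in the description above, the callee's return value is
meaningless once error_occurred is set.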

Re: Error-safe user functions

From

Andres Freund

Date:

07 December 2022, 17:34:27

Hi,

On 2022-12-06 15:21:09 -0500, Tom Lane wrote:
> +{ oid => '8050', descr => 'test whether string is valid input for data type',
> +  proname => 'pg_input_is_valid', provolatile => 's', prorettype => 'bool',
> +  proargtypes => 'text regtype', prosrc => 'pg_input_is_valid' },
> +{ oid => '8051', descr => 'test whether string is valid input for data type',
> +  proname => 'pg_input_is_valid', provolatile => 's', prorettype => 'bool',
> +  proargtypes => 'text regtype int4', prosrc => 'pg_input_is_valid_mod' },
> +{ oid => '8052',
> +  descr => 'get error message if string is not valid input for data type',
> +  proname => 'pg_input_invalid_message', provolatile => 's',
> +  prorettype => 'text', proargtypes => 'text regtype',
> +  prosrc => 'pg_input_invalid_message' },
> +{ oid => '8053',
> +  descr => 'get error message if string is not valid input for data type',
> +  proname => 'pg_input_invalid_message', provolatile => 's',
> +  prorettype => 'text', proargtypes => 'text regtype int4',
> +  prosrc => 'pg_input_invalid_message_mod' },
> +

Is there a guarantee that input functions are stable or immutable? We don't
have any volatile input functions in core PG:

SELECT provolatile, count(*) FROM pg_proc WHERE oid IN (SELECT typinput FROM pg_type) GROUP BY provolatile;

Greetings,

Andres Freund

Re: Error-safe user functions

From

"David G. Johnston"

Date:

07 December 2022, 17:46:26

On Wed, Dec 7, 2022 at 10:34 AM Andres Freund <andres@anarazel.de> wrote:

> +{ oid => '8053',
> + descr => 'get error message if string is not valid input for data type',
> + proname => 'pg_input_invalid_message', provolatile => 's',
> + prorettype => 'text', proargtypes => 'text regtype int4',
> + prosrc => 'pg_input_invalid_message_mod' },
> +

> Is there a guarantee that input functions are stable or immutable? We don't
> have any volatile input functions in core PG:
>
> SELECT provolatile, count(*) FROM pg_proc WHERE oid IN (SELECT typinput FROM pg_type) GROUP BY provolatile;

Effectively yes, though I'm not sure if it is formally documented or otherwise enforced by the system.

The fact that we allow stable is a bit of a sore spot; volatile would be a
terrible property for an I/O function.

David J.

Re: Error-safe user functions

From

Tom Lane

Date:

07 December 2022, 17:51:29

Andres Freund <andres@anarazel.de> writes:
> Is there a guarantee that input functions are stable or immutable?

There's a project policy that that should be true.  That justifies
marking things like record_in as stable --- if the per-column input
functions could be volatile, record_in would need to be as well.
There are other dependencies on it; see e.g. aab353a60, 3db6524fe.

> We don't
> have any volatile input functions in core PG:

Indeed, because type_sanity.sql checks that.

            regards, tom lane

Re: Error-safe user functions

From

Tom Lane

Date:

07 December 2022, 18:00:38

I wrote:
> Andres Freund <andres@anarazel.de> writes:
>> Is there a guarantee that input functions are stable or immutable?

> There's a project policy that that should be true.  That justifies
> marking things like record_in as stable --- if the per-column input
> functions could be volatile, record_in would need to be as well.
> There are other dependencies on it; see e.g. aab353a60, 3db6524fe.

I dug in the archives and found the thread leading up to aab353a60:

https://www.postgresql.org/message-id/flat/AANLkTik8v7O9QR9jjHNVh62h-COC1B0FDUNmEYMdtKjR%40mail.gmail.com

            regards, tom lane

Re: Error-safe user functions

From

Tom Lane

Date:

07 December 2022, 20:16:16

"David G. Johnston" <david.g.johnston@gmail.com> writes:
> On Wed, Dec 7, 2022 at 8:04 AM Andrew Dunstan <andrew@dunslane.net> wrote:
>> I'm not sure InputFunctionCallSoft would be an improvement. Maybe
>> InputFunctionCallSoftError would be clearer, but I don't know that it's
>> much of an improvement either. The same goes for the other visible changes.

> InputFunctionCallSafe -> TryInputFunctionCall

I think we are already using "TryXXX" for code that involves catching
ereport errors.  Since the whole point here is that we are NOT doing
that, I think this naming would be more confusing than helpful.

> Unrelated observation: "Although the error stack is not large, we don't
> expect to run out of space." -> "Because the error stack is not large,
> assume that we will not run out of space and panic if we are wrong."?

That doesn't seem to make the point I wanted to make.

I've adopted your other suggestions in the v4 I'm preparing now.

            regards, tom lane

Re: Error-safe user functions

From

Tom Lane

Date:

07 December 2022, 22:32:21

OK, here's a v4 that I think is possibly committable.

I've changed all the comments and docs to use the "soft error"
terminology, but since using "soft" in the actual function names
didn't seem that appealing, they still use "safe".

I already pushed the 0000 elog-refactoring patch, since that seemed
uncontroversial.  0001 attached covers the same territory as before,
but I regrouped the rest so that 0002 installs the new test support
functions, then 0003 adds both the per-datatype changes and
corresponding test cases for bool, int4, arrays, and records.
The idea here is that 0003 can be pointed to as a sample of what
has to be done to datatype input functions, while the preceding
patches can be cited as relevant documentation.  (I've not decided
whether to squash 0001 and 0002 together or commit them separately.
Does it make sense to break 0003 into 4 separate commits, or is
that overkill?)

Thoughts?

            regards, tom lane

diff --git a/doc/src/sgml/ref/create_type.sgml b/doc/src/sgml/ref/create_type.sgml
index 693423e524..994dfc6526 100644
--- a/doc/src/sgml/ref/create_type.sgml
+++ b/doc/src/sgml/ref/create_type.sgml
@@ -900,6 +900,17 @@ CREATE TYPE <replaceable class="parameter">name</replaceable>
    function is written in C.
   </para>

+  <para>
+   In <productname>PostgreSQL</productname> version 16 and later,
+   it is desirable for base types' input functions to
+   return <quote>soft</quote> errors using the
+   new <function>errsave()</function>/<function>ereturn()</function>
+   mechanism, rather than throwing <function>ereport()</function>
+   exceptions as in previous versions.
+   See <filename>src/backend/utils/fmgr/README</filename> for more
+   information.
+  </para>
+
  </refsect1>

  <refsect1>
diff --git a/src/backend/nodes/Makefile b/src/backend/nodes/Makefile
index 4368c30fdb..7c594be583 100644
--- a/src/backend/nodes/Makefile
+++ b/src/backend/nodes/Makefile
@@ -56,6 +56,7 @@ node_headers = \
     nodes/bitmapset.h \
     nodes/extensible.h \
     nodes/lockoptions.h \
+    nodes/miscnodes.h \
     nodes/replnodes.h \
     nodes/supportnodes.h \
     nodes/value.h \
diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl
index 7212bc486f..08992dfd47 100644
--- a/src/backend/nodes/gen_node_support.pl
+++ b/src/backend/nodes/gen_node_support.pl
@@ -68,6 +68,7 @@ my @all_input_files = qw(
   nodes/bitmapset.h
   nodes/extensible.h
   nodes/lockoptions.h
+  nodes/miscnodes.h
   nodes/replnodes.h
   nodes/supportnodes.h
   nodes/value.h
@@ -89,6 +90,7 @@ my @nodetag_only_files = qw(
   executor/tuptable.h
   foreign/fdwapi.h
   nodes/lockoptions.h
+  nodes/miscnodes.h
   nodes/replnodes.h
   nodes/supportnodes.h
 );
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index f5cd1b7493..a36aeb832e 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -71,6 +71,7 @@
 #include "libpq/libpq.h"
 #include "libpq/pqformat.h"
 #include "mb/pg_wchar.h"
+#include "nodes/miscnodes.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/bgworker.h"
@@ -611,6 +612,128 @@ errfinish(const char *filename, int lineno, const char *funcname)
     CHECK_FOR_INTERRUPTS();
 }

+
+/*
+ * errsave_start --- begin a "soft" error-reporting cycle
+ *
+ * If "context" isn't an ErrorSaveContext node, this behaves as
+ * errstart(ERROR, domain), and the errsave() macro ends up acting
+ * exactly like ereport(ERROR, ...).
+ *
+ * If "context" is an ErrorSaveContext node, but the node creator only wants
+ * notification of the fact of a soft error without any details, just set
+ * the error_occurred flag in the ErrorSaveContext node and return false,
+ * which will cause us to skip the remaining error processing steps.
+ *
+ * Otherwise, create and initialize error stack entry and return true.
+ * Subsequently, errmsg() and perhaps other routines will be called to further
+ * populate the stack entry.  Finally, errsave_finish() will be called to
+ * tidy up.
+ */
+bool
+errsave_start(NodePtr context, const char *domain)
+{
+    ErrorSaveContext *escontext;
+    ErrorData  *edata;
+
+    /*
+     * Do we have a context for soft error reporting?  If not, just punt to
+     * errstart().
+     */
+    if (context == NULL || !IsA(context, ErrorSaveContext))
+        return errstart(ERROR, domain);
+
+    /* Report that a soft error was detected */
+    escontext = (ErrorSaveContext *) context;
+    escontext->error_occurred = true;
+
+    /* Nothing else to do if caller wants no further details */
+    if (!escontext->details_wanted)
+        return false;
+
+    /*
+     * Okay, crank up a stack entry to store the info in.
+     */
+
+    recursion_depth++;
+
+    /* Initialize data for this error frame */
+    edata = get_error_stack_entry();
+    edata->elevel = LOG;        /* signal all is well to errsave_finish */
+    set_stack_entry_domain(edata, domain);
+    /* Select default errcode based on the assumed elevel of ERROR */
+    edata->sqlerrcode = ERRCODE_INTERNAL_ERROR;
+
+    /*
+     * Any allocations for this error state level should go into the caller's
+     * context.  We don't need to pollute ErrorContext, or even require it to
+     * exist, in this code path.
+     */
+    edata->assoc_context = CurrentMemoryContext;
+
+    recursion_depth--;
+    return true;
+}
+
+/*
+ * errsave_finish --- end a "soft" error-reporting cycle
+ *
+ * If errsave_start() decided this was a regular error, behave as
+ * errfinish().  Otherwise, package up the error details and save
+ * them in the ErrorSaveContext node.
+ */
+void
+errsave_finish(NodePtr context, const char *filename, int lineno,
+               const char *funcname)
+{
+    ErrorSaveContext *escontext = (ErrorSaveContext *) context;
+    ErrorData  *edata = &errordata[errordata_stack_depth];
+
+    /* verify stack depth before accessing *edata */
+    CHECK_STACK_DEPTH();
+
+    /*
+     * If errsave_start punted to errstart, then elevel will be ERROR or
+     * perhaps even PANIC.  Punt likewise to errfinish.
+     */
+    if (edata->elevel >= ERROR)
+    {
+        errfinish(filename, lineno, funcname);
+        pg_unreachable();
+    }
+
+    /*
+     * Else, we should package up the stack entry contents and deliver them to
+     * the caller.
+     */
+    recursion_depth++;
+
+    /* Save the last few bits of error state into the stack entry */
+    set_stack_entry_location(edata, filename, lineno, funcname);
+
+    /* Replace the LOG value that errsave_start inserted */
+    edata->elevel = ERROR;
+
+    /*
+     * We skip calling backtrace and context functions, which are more likely
+     * to cause trouble than provide useful context; they might act on the
+     * assumption that a transaction abort is about to occur.
+     */
+
+    /*
+     * Make a copy of the error info for the caller.  All the subsidiary
+     * strings are already in the caller's context, so it's sufficient to
+     * flat-copy the stack entry.
+     */
+    escontext->error_data = palloc_object(ErrorData);
+    memcpy(escontext->error_data, edata, sizeof(ErrorData));
+
+    /* Exit error-handling context */
+    errordata_stack_depth--;
+    recursion_depth--;
+}
+
+
 /*
  * get_error_stack_entry --- allocate and initialize a new stack entry
  *
diff --git a/src/backend/utils/fmgr/README b/src/backend/utils/fmgr/README
index 49845f67ac..9958d38992 100644
--- a/src/backend/utils/fmgr/README
+++ b/src/backend/utils/fmgr/README
@@ -267,6 +267,78 @@ See windowapi.h for more information.
 information about the context of the CALL statement, particularly
 whether it is within an "atomic" execution context.

+* Some callers of datatype input functions (and in future perhaps
+other classes of functions) pass an instance of ErrorSaveContext.
+This indicates that the caller wishes to handle "soft" errors without
+a transaction-terminating exception being thrown: instead, the callee
+should store information about the error cause in the ErrorSaveContext
+struct and return a dummy result value.  Further details appear in
+"Handling Soft Errors" below.
+
+
+Handling Soft Errors
+--------------------
+
+Postgres' standard mechanism for reporting errors (ereport() or elog())
+is used for all sorts of error conditions.  This means that throwing
+an exception via ereport(ERROR) requires an expensive transaction or
+subtransaction abort and cleanup, since the exception catcher dare not
+make many assumptions about what has gone wrong.  There are situations
+where we would rather have a lighter-weight mechanism for dealing
+with errors that are known to be safe to recover from without a full
+transaction cleanup.  SQL-callable functions can support this need
+using the ErrorSaveContext context mechanism.
+
+To report a "soft" error, a SQL-callable function should call
+    errsave(fcinfo->context, ...)
+where it would previously have done
+    ereport(ERROR, ...)
+If the passed "context" is NULL or is not an ErrorSaveContext node,
+then errsave behaves precisely as ereport(ERROR): the exception is
+thrown via longjmp, so that control does not return.  If "context"
+is an ErrorSaveContext node, then the error information included in
+errsave's subsidiary reporting calls is stored into the context node
+and control returns from errsave normally.  The function should then
+return a dummy value to its caller.  (SQL NULL is recommendable as
+the dummy value; but anything will do, since the caller is expected
+to ignore the function's return value once it sees that an error has
+been reported in the ErrorSaveContext node.)
+
+If there is nothing to do except return after calling errsave(),
+you can save a line or two by writing
+    ereturn(fcinfo->context, dummy_value, ...)
+to perform errsave() and then "return dummy_value".
+
+An error reported "softly" must be safe, in the sense that there is
+no question about our ability to continue normal processing of the
+transaction.  Error conditions that should NOT be handled this way
+include out-of-memory, unexpected internal errors, or anything that
+cannot easily be cleaned up after.  Such cases should still be thrown
+with ereport, as they have been in the past.
+
+Considering datatype input functions as examples, typical "soft" error
+conditions include input syntax errors and out-of-range values.  An
+input function typically detects such cases with simple if-tests and
+can easily change the ensuing ereport call to an errsave or ereturn.
+Because of this restriction, it's typically not necessary to pass
+the ErrorSaveContext pointer down very far, as errors reported by
+low-level functions are typically reasonable to consider internal.
+(Another way to frame the distinction is that input functions should
+report all invalid-input conditions softly, but internal problems are
+hard errors.)
+
+Because no transaction cleanup will occur, a function that is exiting
+after errsave() returns will bear responsibility for resource cleanup.
+It is not necessary to be concerned about small leakages of palloc'd
+memory, since the caller should be running the function in a short-lived
+memory context.  However, resources such as locks, open files, or buffer
+pins must be closed out cleanly, as they would be in the non-error code
+path.
+
+Conventions for callers that use the ErrorSaveContext mechanism
+to trap errors are discussed with the declaration of that struct,
+in nodes/miscnodes.h.
+

 Functions Accepting or Returning Sets
 -------------------------------------
diff --git a/src/backend/utils/fmgr/fmgr.c b/src/backend/utils/fmgr/fmgr.c
index 3c210297aa..493e893ada 100644
--- a/src/backend/utils/fmgr/fmgr.c
+++ b/src/backend/utils/fmgr/fmgr.c
@@ -23,6 +23,7 @@
 #include "lib/stringinfo.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "nodes/miscnodes.h"
 #include "nodes/nodeFuncs.h"
 #include "pgstat.h"
 #include "utils/acl.h"
@@ -1548,6 +1549,70 @@ InputFunctionCall(FmgrInfo *flinfo, char *str, Oid typioparam, int32 typmod)
     return result;
 }

+/*
+ * Call a previously-looked-up datatype input function, with non-exception
+ * handling of "soft" errors.
+ *
+ * This is basically like InputFunctionCall, but the converted Datum is
+ * returned into *result while the function result is true for success or
+ * false for failure.  Also, the caller may pass an ErrorSaveContext node.
+ * (We declare that as "NodePtr" to avoid including nodes.h in fmgr.h.)
+ *
+ * If escontext points to an ErrorSaveContext, any "soft" errors detected by
+ * the input function will be reported by filling the escontext struct and
+ * returning false.  (The caller can choose to test SOFT_ERROR_OCCURRED(),
+ * but checking the function result instead is usually cheaper.)
+ *
+ * If escontext does not point to an ErrorSaveContext, errors are reported
+ * via ereport(ERROR), so that there is no functional difference from
+ * InputFunctionCall; the result will always be true if control returns.
+ */
+bool
+InputFunctionCallSafe(FmgrInfo *flinfo, char *str,
+                      Oid typioparam, int32 typmod,
+                      NodePtr escontext,
+                      Datum *result)
+{
+    LOCAL_FCINFO(fcinfo, 3);
+
+    if (str == NULL && flinfo->fn_strict)
+    {
+        *result = (Datum) 0;    /* just return null result */
+        return true;
+    }
+
+    InitFunctionCallInfoData(*fcinfo, flinfo, 3, InvalidOid, escontext, NULL);
+
+    fcinfo->args[0].value = CStringGetDatum(str);
+    fcinfo->args[0].isnull = false;
+    fcinfo->args[1].value = ObjectIdGetDatum(typioparam);
+    fcinfo->args[1].isnull = false;
+    fcinfo->args[2].value = Int32GetDatum(typmod);
+    fcinfo->args[2].isnull = false;
+
+    *result = FunctionCallInvoke(fcinfo);
+
+    /* Result value is garbage, and could be null, if an error was reported */
+    if (SOFT_ERROR_OCCURRED(escontext))
+        return false;
+
+    /* Otherwise, should get null result if and only if str is NULL */
+    if (str == NULL)
+    {
+        if (!fcinfo->isnull)
+            elog(ERROR, "input function %u returned non-NULL",
+                 flinfo->fn_oid);
+    }
+    else
+    {
+        if (fcinfo->isnull)
+            elog(ERROR, "input function %u returned NULL",
+                 flinfo->fn_oid);
+    }
+
+    return true;
+}
+
 /*
  * Call a previously-looked-up datatype output function.
  *
diff --git a/src/include/fmgr.h b/src/include/fmgr.h
index 380a82b9de..d739f3dbd9 100644
--- a/src/include/fmgr.h
+++ b/src/include/fmgr.h
@@ -18,8 +18,7 @@
 #ifndef FMGR_H
 #define FMGR_H

-/* We don't want to include primnodes.h here, so make some stub references */
-typedef struct Node *fmNodePtr;
+/* We don't want to include primnodes.h here, so make a stub reference */
 typedef struct Aggref *fmAggrefPtr;

 /* Likewise, avoid including execnodes.h here */
@@ -63,7 +62,7 @@ typedef struct FmgrInfo
     unsigned char fn_stats;        /* collect stats if track_functions > this */
     void       *fn_extra;        /* extra space for use by handler */
     MemoryContext fn_mcxt;        /* memory context to store fn_extra in */
-    fmNodePtr    fn_expr;        /* expression parse tree for call, or NULL */
+    NodePtr        fn_expr;        /* expression parse tree for call, or NULL */
 } FmgrInfo;

 /*
@@ -85,8 +84,8 @@ typedef struct FmgrInfo
 typedef struct FunctionCallInfoBaseData
 {
     FmgrInfo   *flinfo;            /* ptr to lookup info used for this call */
-    fmNodePtr    context;        /* pass info about context of call */
-    fmNodePtr    resultinfo;        /* pass or return extra info about result */
+    NodePtr        context;        /* pass info about context of call */
+    NodePtr        resultinfo;        /* pass or return extra info about result */
     Oid            fncollation;    /* collation for function to use */
 #define FIELDNO_FUNCTIONCALLINFODATA_ISNULL 4
     bool        isnull;            /* function must set true if result is NULL */
@@ -700,6 +699,10 @@ extern Datum OidFunctionCall9Coll(Oid functionId, Oid collation,
 /* Special cases for convenient invocation of datatype I/O functions. */
 extern Datum InputFunctionCall(FmgrInfo *flinfo, char *str,
                                Oid typioparam, int32 typmod);
+extern bool InputFunctionCallSafe(FmgrInfo *flinfo, char *str,
+                                  Oid typioparam, int32 typmod,
+                                  NodePtr escontext,
+                                  Datum *result);
 extern Datum OidInputFunctionCall(Oid functionId, char *str,
                                   Oid typioparam, int32 typmod);
 extern char *OutputFunctionCall(FmgrInfo *flinfo, Datum val);
@@ -719,9 +722,9 @@ extern const Pg_finfo_record *fetch_finfo_record(void *filehandle, const char *f
 extern Oid    fmgr_internal_function(const char *proname);
 extern Oid    get_fn_expr_rettype(FmgrInfo *flinfo);
 extern Oid    get_fn_expr_argtype(FmgrInfo *flinfo, int argnum);
-extern Oid    get_call_expr_argtype(fmNodePtr expr, int argnum);
+extern Oid    get_call_expr_argtype(NodePtr expr, int argnum);
 extern bool get_fn_expr_arg_stable(FmgrInfo *flinfo, int argnum);
-extern bool get_call_expr_arg_stable(fmNodePtr expr, int argnum);
+extern bool get_call_expr_arg_stable(NodePtr expr, int argnum);
 extern bool get_fn_expr_variadic(FmgrInfo *flinfo);
 extern bytea *get_fn_opclass_options(FmgrInfo *flinfo);
 extern bool has_fn_opclass_options(FmgrInfo *flinfo);
diff --git a/src/include/nodes/meson.build b/src/include/nodes/meson.build
index e63881086e..f0e60935b6 100644
--- a/src/include/nodes/meson.build
+++ b/src/include/nodes/meson.build
@@ -16,6 +16,7 @@ node_support_input_i = [
   'nodes/bitmapset.h',
   'nodes/extensible.h',
   'nodes/lockoptions.h',
+  'nodes/miscnodes.h',
   'nodes/replnodes.h',
   'nodes/supportnodes.h',
   'nodes/value.h',
diff --git a/src/include/nodes/miscnodes.h b/src/include/nodes/miscnodes.h
new file mode 100644
index 0000000000..b50ee60352
--- /dev/null
+++ b/src/include/nodes/miscnodes.h
@@ -0,0 +1,56 @@
+/*-------------------------------------------------------------------------
+ *
+ * miscnodes.h
+ *      Definitions for hard-to-classify node types.
+ *
+ * Node types declared here are not part of parse trees, plan trees,
+ * or execution state trees.  We only assign them NodeTag values because
+ * IsA() tests provide a convenient way to disambiguate what kind of
+ * structure is being passed through assorted APIs, such as function
+ * "context" pointers.
+ *
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/nodes/miscnodes.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef MISCNODES_H
+#define MISCNODES_H
+
+#include "nodes/nodes.h"
+
+/*
+ * ErrorSaveContext -
+ *        function call context node for handling of "soft" errors
+ *
+ * A caller wishing to trap soft errors must initialize a struct like this
+ * with all fields zero/NULL except for the NodeTag.  Optionally, set
+ * details_wanted = true if more than the bare knowledge that a soft error
+ * occurred is required.  The struct is then passed to a SQL-callable function
+ * via the FunctionCallInfo.context field; or below the level of SQL calls,
+ * it could be passed to a subroutine directly.
+ *
+ * After calling code that might report an error this way, check
+ * error_occurred to see if an error happened.  If so, and if details_wanted
+ * is true, error_data has been filled with error details (stored in the
+ * callee's memory context!).  FreeErrorData() can be called to release
+ * error_data, although that step is typically not necessary if the called
+ * code was run in a short-lived context.
+ */
+typedef struct ErrorSaveContext
+{
+    NodeTag        type;
+    bool        error_occurred; /* set to true if we detect a soft error */
+    bool        details_wanted; /* does caller want more info than that? */
+    ErrorData  *error_data;        /* details of error, if so */
+} ErrorSaveContext;
+
+/* Often-useful macro for checking if a soft error was reported */
+#define SOFT_ERROR_OCCURRED(escontext) \
+    ((escontext) != NULL && IsA(escontext, ErrorSaveContext) && \
+     ((ErrorSaveContext *) (escontext))->error_occurred)
+
+#endif                            /* MISCNODES_H */
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index f107a818e8..607c62b17c 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -18,6 +18,13 @@

 #include "lib/stringinfo.h"

+/*
+ * We cannot include nodes.h yet, so make a stub reference.  (This is also
+ * used by fmgr.h, which doesn't want to depend on nodes.h either.)
+ */
+typedef struct Node *NodePtr;
+
+
 /* Error level codes */
 #define DEBUG5        10            /* Debugging messages, in categories of
                                  * decreasing detail. */
@@ -235,6 +242,63 @@ extern int    getinternalerrposition(void);
     ereport(elevel, errmsg_internal(__VA_ARGS__))


+/*----------
+ * Support for reporting "soft" errors that don't require a full transaction
+ * abort to clean up.  This is to be used in this way:
+ *        errsave(context,
+ *                errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+ *                errmsg("invalid input syntax for type %s: \"%s\"",
+ *                       "boolean", in_str),
+ *                ... other errxxx() fields as needed ...);
+ *
+ * "context" is a node pointer or NULL, and the remaining auxiliary calls
+ * provide the same error details as in ereport().  If context is not a
+ * pointer to an ErrorSaveContext node, then errsave(context, ...)
+ * behaves identically to ereport(ERROR, ...).  If context is a pointer
+ * to an ErrorSaveContext node, then the information provided by the
+ * auxiliary calls is stored in the context node and control returns
+ * normally.  The caller of errsave() must then do any required cleanup
+ * and return control back to its caller.  That caller must check the
+ * ErrorSaveContext node to see whether an error occurred before
+ * it can trust the function's result to be meaningful.
+ *
+ * errsave_domain() allows a message domain to be specified; it is
+ * precisely analogous to ereport_domain().
+ *----------
+ */
+#define errsave_domain(context, domain, ...)    \
+    do { \
+        NodePtr context_ = (context); \
+        pg_prevent_errno_in_scope(); \
+        if (errsave_start(context_, domain)) \
+            __VA_ARGS__, errsave_finish(context_, __FILE__, __LINE__, __func__); \
+    } while(0)
+
+#define errsave(context, ...)    \
+    errsave_domain(context, TEXTDOMAIN, __VA_ARGS__)
+
+/*
+ * "ereturn(context, dummy_value, ...);" is exactly the same as
+ * "errsave(context, ...); return dummy_value;".  This saves a bit
+ * of typing in the common case where a function has no cleanup
+ * actions to take after reporting a soft error.  "dummy_value"
+ * can be empty if the function returns void.
+ */
+#define ereturn_domain(context, dummy_value, domain, ...)    \
+    do { \
+        errsave_domain(context, domain, __VA_ARGS__); \
+        return dummy_value; \
+    } while(0)
+
+#define ereturn(context, dummy_value, ...)    \
+    ereturn_domain(context, dummy_value, TEXTDOMAIN, __VA_ARGS__)
+
+extern bool errsave_start(NodePtr context, const char *domain);
+extern void errsave_finish(NodePtr context,
+                           const char *filename, int lineno,
+                           const char *funcname);
+
+
 /* Support for constructing error strings separately from ereport() calls */

 extern void pre_format_elog_string(int errnumber, const char *domain);
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index e57ffce971..4fdd692e8e 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -24683,6 +24683,105 @@ SELECT collation for ('foo' COLLATE "de_DE");

   </sect2>

+  <sect2 id="functions-info-validity">
+   <title>Data Validity Checking Functions</title>
+
+   <para>
+    The functions shown in <xref linkend="functions-info-validity-table"/>
+    can be helpful for checking validity of proposed input data.
+   </para>
+
+   <table id="functions-info-validity-table">
+    <title>Data Validity Checking Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para>
+       <para>
+        Example(s)
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_input_is_valid</primary>
+        </indexterm>
+        <function>pg_input_is_valid</function> (
+          <parameter>string</parameter> <type>text</type>,
+          <parameter>type</parameter> <type>regtype</type>
+          <optional>, <parameter>typmod</parameter> <type>integer</type> </optional>
+        )
+        <returnvalue>boolean</returnvalue>
+       </para>
+       <para>
+        Tests whether the given <parameter>string</parameter> is valid
+        input for the specified data type, returning true or false.
+        Since the data type is named by a <type>regtype</type> parameter,
+        it is possible to just write the type name in single quotes.  An
+        encoded type modifier can also be supplied, if the data type pays
+        attention to that.
+       </para>
+       <para>
+        This function will only work as desired if the data type's input
+        function has been updated to report invalid input as
+        a <quote>soft</quote> error.  Otherwise, invalid input will abort
+        the transaction, just as if the string had been cast to the type
+        directly.
+        </para>
+        <para>
+         <literal>pg_input_is_valid('42', 'integer')</literal>
+         <returnvalue>t</returnvalue>
+        </para>
+        <para>
+         <literal>pg_input_is_valid('42000000000', 'integer')</literal>
+         <returnvalue>f</returnvalue>
+       </para></entry>
+      </row>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_input_error_message</primary>
+        </indexterm>
+        <function>pg_input_error_message</function> (
+          <parameter>string</parameter> <type>text</type>,
+          <parameter>type</parameter> <type>regtype</type>
+          <optional>, <parameter>typmod</parameter> <type>integer</type> </optional>
+        )
+        <returnvalue>text</returnvalue>
+       </para>
+       <para>
+        Tests whether the given <parameter>string</parameter> is valid
+        input for the specified data type; if not, returns the error
+        message that would have been thrown.  If the input is valid, the
+        result is NULL.  The inputs are the same as
+        for <function>pg_input_is_valid</function>.
+       </para>
+       <para>
+        This function will only work as desired if the data type's input
+        function has been updated to report invalid input as
+        a <quote>soft</quote> error.  Otherwise, invalid input will abort
+        the transaction, just as if the string had been cast to the type
+        directly.
+       </para>
+       <para>
+        <literal>pg_input_error_message('42000000000', 'integer')</literal>
+        <returnvalue>value "42000000000" is out of range for type integer</returnvalue>
+       </para></entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+
+  </sect2>
+
   <sect2 id="functions-info-snapshot">
    <title>Transaction ID and Snapshot Information Functions</title>

diff --git a/src/backend/utils/adt/misc.c b/src/backend/utils/adt/misc.c
index 9c13251231..09fae48658 100644
--- a/src/backend/utils/adt/misc.c
+++ b/src/backend/utils/adt/misc.c
@@ -32,6 +32,7 @@
 #include "common/keywords.h"
 #include "funcapi.h"
 #include "miscadmin.h"
+#include "nodes/miscnodes.h"
 #include "parser/scansup.h"
 #include "pgstat.h"
 #include "postmaster/syslogger.h"
@@ -45,6 +46,22 @@
 #include "utils/ruleutils.h"
 #include "utils/timestamp.h"

+/*
+ * structure to cache metadata needed in pg_input_is_valid_common
+ */
+typedef struct BasicIOData
+{
+    Oid            typoid;
+    Oid            typiofunc;
+    Oid            typioparam;
+    FmgrInfo    proc;
+} BasicIOData;
+
+static bool pg_input_is_valid_common(FunctionCallInfo fcinfo,
+                                     text *txt, Oid typoid, int32 typmod,
+                                     ErrorSaveContext *escontext);
+
+
 /*
  * Common subroutine for num_nulls() and num_nonnulls().
  * Returns true if successful, false if function should return NULL.
@@ -640,6 +657,146 @@ pg_column_is_updatable(PG_FUNCTION_ARGS)
 }


+/*
+ * pg_input_is_valid - test whether string is valid input for datatype.
+ *
+ * Returns true if OK, false if not.
+ *
+ * This will only work usefully if the datatype's input function has been
+ * updated to return "soft" errors via errsave/ereturn.
+ */
+Datum
+pg_input_is_valid(PG_FUNCTION_ARGS)
+{
+    text       *txt = PG_GETARG_TEXT_PP(0);
+    Oid            typoid = PG_GETARG_OID(1);
+    ErrorSaveContext escontext;
+
+    /* Set up empty ErrorSaveContext */
+    memset(&escontext, 0, sizeof(escontext));
+    escontext.type = T_ErrorSaveContext;
+
+    PG_RETURN_BOOL(pg_input_is_valid_common(fcinfo, txt, typoid, -1,
+                                            &escontext));
+}
+
+/* Same, with non-default typmod */
+Datum
+pg_input_is_valid_mod(PG_FUNCTION_ARGS)
+{
+    text       *txt = PG_GETARG_TEXT_PP(0);
+    Oid            typoid = PG_GETARG_OID(1);
+    int32        typmod = PG_GETARG_INT32(2);
+    ErrorSaveContext escontext;
+
+    /* Set up empty ErrorSaveContext */
+    memset(&escontext, 0, sizeof(escontext));
+    escontext.type = T_ErrorSaveContext;
+
+    PG_RETURN_BOOL(pg_input_is_valid_common(fcinfo, txt, typoid, typmod,
+                                            &escontext));
+}
+
+/*
+ * pg_input_error_message - test whether string is valid input for datatype.
+ *
+ * Returns NULL if OK, else the primary message string from the error.
+ *
+ * This will only work usefully if the datatype's input function has been
+ * updated to return "soft" errors via errsave/ereturn.
+ */
+Datum
+pg_input_error_message(PG_FUNCTION_ARGS)
+{
+    text       *txt = PG_GETARG_TEXT_PP(0);
+    Oid            typoid = PG_GETARG_OID(1);
+    ErrorSaveContext escontext;
+
+    /* Set up empty ErrorSaveContext, but enable details_wanted */
+    memset(&escontext, 0, sizeof(escontext));
+    escontext.type = T_ErrorSaveContext;
+    escontext.details_wanted = true;
+
+    if (pg_input_is_valid_common(fcinfo, txt, typoid, -1,
+                                 &escontext))
+        PG_RETURN_NULL();
+
+    Assert(escontext.error_occurred);
+    Assert(escontext.error_data != NULL);
+    Assert(escontext.error_data->message != NULL);
+
+    PG_RETURN_TEXT_P(cstring_to_text(escontext.error_data->message));
+}
+
+/* Same, with non-default typmod */
+Datum
+pg_input_error_message_mod(PG_FUNCTION_ARGS)
+{
+    text       *txt = PG_GETARG_TEXT_PP(0);
+    Oid            typoid = PG_GETARG_OID(1);
+    int32        typmod = PG_GETARG_INT32(2);
+    ErrorSaveContext escontext;
+
+    /* Set up empty ErrorSaveContext, but enable details_wanted */
+    memset(&escontext, 0, sizeof(escontext));
+    escontext.type = T_ErrorSaveContext;
+    escontext.details_wanted = true;
+
+    if (pg_input_is_valid_common(fcinfo, txt, typoid, typmod,
+                                 &escontext))
+        PG_RETURN_NULL();
+
+    Assert(escontext.error_occurred);
+    Assert(escontext.error_data != NULL);
+    Assert(escontext.error_data->message != NULL);
+
+    PG_RETURN_TEXT_P(cstring_to_text(escontext.error_data->message));
+}
+
+/* Common subroutine for the above */
+static bool
+pg_input_is_valid_common(FunctionCallInfo fcinfo,
+                         text *txt, Oid typoid, int32 typmod,
+                         ErrorSaveContext *escontext)
+{
+    char       *str = text_to_cstring(txt);
+    BasicIOData *my_extra;
+    Datum        converted;
+
+    /*
+     * We arrange to look up the needed I/O info just once per series of
+     * calls, assuming the data type doesn't change underneath us.
+     */
+    my_extra = (BasicIOData *) fcinfo->flinfo->fn_extra;
+    if (my_extra == NULL)
+    {
+        fcinfo->flinfo->fn_extra =
+            MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,
+                               sizeof(BasicIOData));
+        my_extra = (BasicIOData *) fcinfo->flinfo->fn_extra;
+        my_extra->typoid = InvalidOid;
+    }
+
+    if (my_extra->typoid != typoid)
+    {
+        getTypeInputInfo(typoid,
+                         &my_extra->typiofunc,
+                         &my_extra->typioparam);
+        fmgr_info_cxt(my_extra->typiofunc, &my_extra->proc,
+                      fcinfo->flinfo->fn_mcxt);
+        my_extra->typoid = typoid;
+    }
+
+    /* Now we can try to perform the conversion */
+    return InputFunctionCallSafe(&my_extra->proc,
+                                 str,
+                                 my_extra->typioparam,
+                                 typmod,
+                                 (Node *) escontext,
+                                 &converted);
+}
+
+
 /*
  * Is character a valid identifier start?
  * Must match scan.l's {ident_start} character class.
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index f9301b2627..1593e43e24 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7060,6 +7060,21 @@
   prorettype => 'regnamespace', proargtypes => 'text',
   prosrc => 'to_regnamespace' },

+{ oid => '8050', descr => 'test whether string is valid input for data type',
+  proname => 'pg_input_is_valid', provolatile => 's', prorettype => 'bool',
+  proargtypes => 'text regtype', prosrc => 'pg_input_is_valid' },
+{ oid => '8051', descr => 'test whether string is valid input for data type',
+  proname => 'pg_input_is_valid', provolatile => 's', prorettype => 'bool',
+  proargtypes => 'text regtype int4', prosrc => 'pg_input_is_valid_mod' },
+{ oid => '8052',
+  descr => 'get error message if string is not valid input for data type',
+  proname => 'pg_input_error_message', provolatile => 's', prorettype => 'text',
+  proargtypes => 'text regtype', prosrc => 'pg_input_error_message' },
+{ oid => '8053',
+  descr => 'get error message if string is not valid input for data type',
+  proname => 'pg_input_error_message', provolatile => 's', prorettype => 'text',
+  proargtypes => 'text regtype int4', prosrc => 'pg_input_error_message_mod' },
+
 { oid => '1268',
   descr => 'parse qualified identifier to array of identifiers',
   proname => 'parse_ident', prorettype => '_text', proargtypes => 'text bool',
diff --git a/src/test/regress/expected/create_type.out b/src/test/regress/expected/create_type.out
index 0dfc88c1c8..7383fcdbb1 100644
--- a/src/test/regress/expected/create_type.out
+++ b/src/test/regress/expected/create_type.out
@@ -249,6 +249,31 @@ select format_type('bpchar'::regtype, -1);
  bpchar
 (1 row)

+-- Test non-error-throwing APIs using widget, which still throws errors
+SELECT pg_input_is_valid('(1,2,3)', 'widget');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('(1,2)', 'widget');  -- hard error expected
+ERROR:  invalid input syntax for type widget: "(1,2)"
+SELECT pg_input_is_valid('{"(1,2,3)"}', 'widget[]');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('{"(1,2)"}', 'widget[]');  -- hard error expected
+ERROR:  invalid input syntax for type widget: "(1,2)"
+SELECT pg_input_is_valid('("(1,2,3)")', 'mytab');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('("(1,2)")', 'mytab');  -- hard error expected
+ERROR:  invalid input syntax for type widget: "(1,2)"
 -- Test creation of an operator over a user-defined type
 CREATE FUNCTION pt_in_widget(point, widget)
    RETURNS bool
diff --git a/src/test/regress/regress.c b/src/test/regress/regress.c
index 548afb4438..2977045cc7 100644
--- a/src/test/regress/regress.c
+++ b/src/test/regress/regress.c
@@ -183,6 +183,11 @@ widget_in(PG_FUNCTION_ARGS)
             coord[i++] = p + 1;
     }

+    /*
+     * Note: DON'T convert this error to "soft" style (errsave/ereturn).  We
+     * want this data type to stay permanently in the hard-error world so that
+     * it can be used for testing that such cases still work reasonably.
+     */
     if (i < NARGS)
         ereport(ERROR,
                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
diff --git a/src/test/regress/sql/create_type.sql b/src/test/regress/sql/create_type.sql
index c6fc4f9029..c25018029c 100644
--- a/src/test/regress/sql/create_type.sql
+++ b/src/test/regress/sql/create_type.sql
@@ -192,6 +192,14 @@ select format_type('bpchar'::regtype, null);
 -- this behavior difference is intentional
 select format_type('bpchar'::regtype, -1);

+-- Test non-error-throwing APIs using widget, which still throws errors
+SELECT pg_input_is_valid('(1,2,3)', 'widget');
+SELECT pg_input_is_valid('(1,2)', 'widget');  -- hard error expected
+SELECT pg_input_is_valid('{"(1,2,3)"}', 'widget[]');
+SELECT pg_input_is_valid('{"(1,2)"}', 'widget[]');  -- hard error expected
+SELECT pg_input_is_valid('("(1,2,3)")', 'mytab');
+SELECT pg_input_is_valid('("(1,2)")', 'mytab');  -- hard error expected
+
 -- Test creation of an operator over a user-defined type

 CREATE FUNCTION pt_in_widget(point, widget)
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index 495e449a9e..c011ebdfd9 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -90,14 +90,15 @@ typedef struct ArrayIteratorData
 }            ArrayIteratorData;

 static bool array_isspace(char ch);
-static int    ArrayCount(const char *str, int *dim, char typdelim);
-static void ReadArrayStr(char *arrayStr, const char *origStr,
+static int    ArrayCount(const char *str, int *dim, char typdelim,
+                       Node *escontext);
+static bool ReadArrayStr(char *arrayStr, const char *origStr,
                          int nitems, int ndim, int *dim,
                          FmgrInfo *inputproc, Oid typioparam, int32 typmod,
                          char typdelim,
                          int typlen, bool typbyval, char typalign,
                          Datum *values, bool *nulls,
-                         bool *hasnulls, int32 *nbytes);
+                         bool *hasnulls, int32 *nbytes, Node *escontext);
 static void ReadArrayBinary(StringInfo buf, int nitems,
                             FmgrInfo *receiveproc, Oid typioparam, int32 typmod,
                             int typlen, bool typbyval, char typalign,
@@ -177,6 +178,7 @@ array_in(PG_FUNCTION_ARGS)
     Oid            element_type = PG_GETARG_OID(1);    /* type of an array
                                                      * element */
     int32        typmod = PG_GETARG_INT32(2);    /* typmod for array elements */
+    Node       *escontext = fcinfo->context;
     int            typlen;
     bool        typbyval;
     char        typalign;
@@ -258,7 +260,7 @@ array_in(PG_FUNCTION_ARGS)
             break;                /* no more dimension items */
         p++;
         if (ndim >= MAXDIM)
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                      errmsg("number of array dimensions (%d) exceeds the maximum allowed (%d)",
                             ndim + 1, MAXDIM)));
@@ -266,7 +268,7 @@ array_in(PG_FUNCTION_ARGS)
         for (q = p; isdigit((unsigned char) *q) || (*q == '-') || (*q == '+'); q++)
              /* skip */ ;
         if (q == p)                /* no digits? */
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"", string),
                      errdetail("\"[\" must introduce explicitly-specified array dimensions.")));
@@ -280,7 +282,7 @@ array_in(PG_FUNCTION_ARGS)
             for (q = p; isdigit((unsigned char) *q) || (*q == '-') || (*q == '+'); q++)
                  /* skip */ ;
             if (q == p)            /* no digits? */
-                ereport(ERROR,
+                ereturn(escontext, (Datum) 0,
                         (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                          errmsg("malformed array literal: \"%s\"", string),
                          errdetail("Missing array dimension value.")));
@@ -291,7 +293,7 @@ array_in(PG_FUNCTION_ARGS)
             lBound[ndim] = 1;
         }
         if (*q != ']')
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"", string),
                      errdetail("Missing \"%s\" after array dimensions.",
@@ -301,7 +303,7 @@ array_in(PG_FUNCTION_ARGS)
         ub = atoi(p);
         p = q + 1;
         if (ub < lBound[ndim])
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
                      errmsg("upper bound cannot be less than lower bound")));

@@ -313,11 +315,13 @@ array_in(PG_FUNCTION_ARGS)
     {
         /* No array dimensions, so intuit dimensions from brace structure */
         if (*p != '{')
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"", string),
                      errdetail("Array value must start with \"{\" or dimension information.")));
-        ndim = ArrayCount(p, dim, typdelim);
+        ndim = ArrayCount(p, dim, typdelim, escontext);
+        if (ndim < 0)
+            PG_RETURN_NULL();
         for (i = 0; i < ndim; i++)
             lBound[i] = 1;
     }
@@ -328,7 +332,7 @@ array_in(PG_FUNCTION_ARGS)

         /* If array dimensions are given, expect '=' operator */
         if (strncmp(p, ASSGN, strlen(ASSGN)) != 0)
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"", string),
                      errdetail("Missing \"%s\" after array dimensions.",
@@ -342,20 +346,22 @@ array_in(PG_FUNCTION_ARGS)
          * were given
          */
         if (*p != '{')
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"", string),
                      errdetail("Array contents must start with \"{\".")));
-        ndim_braces = ArrayCount(p, dim_braces, typdelim);
+        ndim_braces = ArrayCount(p, dim_braces, typdelim, escontext);
+        if (ndim_braces < 0)
+            PG_RETURN_NULL();
         if (ndim_braces != ndim)
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"", string),
                      errdetail("Specified array dimensions do not match array contents.")));
         for (i = 0; i < ndim; ++i)
         {
             if (dim[i] != dim_braces[i])
-                ereport(ERROR,
+                ereturn(escontext, (Datum) 0,
                         (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                          errmsg("malformed array literal: \"%s\"", string),
                          errdetail("Specified array dimensions do not match array contents.")));
@@ -372,8 +378,11 @@ array_in(PG_FUNCTION_ARGS)
 #endif

     /* This checks for overflow of the array dimensions */
-    nitems = ArrayGetNItems(ndim, dim);
-    ArrayCheckBounds(ndim, dim, lBound);
+    nitems = ArrayGetNItemsSafe(ndim, dim, escontext);
+    if (nitems < 0)
+        PG_RETURN_NULL();
+    if (!ArrayCheckBoundsSafe(ndim, dim, lBound, escontext))
+        PG_RETURN_NULL();

     /* Empty array? */
     if (nitems == 0)
@@ -381,13 +390,14 @@ array_in(PG_FUNCTION_ARGS)

     dataPtr = (Datum *) palloc(nitems * sizeof(Datum));
     nullsPtr = (bool *) palloc(nitems * sizeof(bool));
-    ReadArrayStr(p, string,
-                 nitems, ndim, dim,
-                 &my_extra->proc, typioparam, typmod,
-                 typdelim,
-                 typlen, typbyval, typalign,
-                 dataPtr, nullsPtr,
-                 &hasnulls, &nbytes);
+    if (!ReadArrayStr(p, string,
+                      nitems, ndim, dim,
+                      &my_extra->proc, typioparam, typmod,
+                      typdelim,
+                      typlen, typbyval, typalign,
+                      dataPtr, nullsPtr,
+                      &hasnulls, &nbytes, escontext))
+        PG_RETURN_NULL();
     if (hasnulls)
     {
         dataoffset = ARR_OVERHEAD_WITHNULLS(ndim, nitems);
@@ -451,9 +461,12 @@ array_isspace(char ch)
  *
  * Returns number of dimensions as function result.  The axis lengths are
  * returned in dim[], which must be of size MAXDIM.
+ *
+ * If we detect an error, fill *escontext with error details and return -1
+ * (unless escontext isn't provided, in which case errors will be thrown).
  */
 static int
-ArrayCount(const char *str, int *dim, char typdelim)
+ArrayCount(const char *str, int *dim, char typdelim, Node *escontext)
 {
     int            nest_level = 0,
                 i;
@@ -488,11 +501,10 @@ ArrayCount(const char *str, int *dim, char typdelim)
             {
                 case '\0':
                     /* Signal a premature end of the string */
-                    ereport(ERROR,
+                    ereturn(escontext, -1,
                             (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                              errmsg("malformed array literal: \"%s\"", str),
                              errdetail("Unexpected end of input.")));
-                    break;
                 case '\\':

                     /*
@@ -504,7 +516,7 @@ ArrayCount(const char *str, int *dim, char typdelim)
                         parse_state != ARRAY_ELEM_STARTED &&
                         parse_state != ARRAY_QUOTED_ELEM_STARTED &&
                         parse_state != ARRAY_ELEM_DELIMITED)
-                        ereport(ERROR,
+                        ereturn(escontext, -1,
                                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                  errmsg("malformed array literal: \"%s\"", str),
                                  errdetail("Unexpected \"%c\" character.",
@@ -515,7 +527,7 @@ ArrayCount(const char *str, int *dim, char typdelim)
                     if (*(ptr + 1))
                         ptr++;
                     else
-                        ereport(ERROR,
+                        ereturn(escontext, -1,
                                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                  errmsg("malformed array literal: \"%s\"", str),
                                  errdetail("Unexpected end of input.")));
@@ -530,7 +542,7 @@ ArrayCount(const char *str, int *dim, char typdelim)
                     if (parse_state != ARRAY_LEVEL_STARTED &&
                         parse_state != ARRAY_QUOTED_ELEM_STARTED &&
                         parse_state != ARRAY_ELEM_DELIMITED)
-                        ereport(ERROR,
+                        ereturn(escontext, -1,
                                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                  errmsg("malformed array literal: \"%s\"", str),
                                  errdetail("Unexpected array element.")));
@@ -551,14 +563,14 @@ ArrayCount(const char *str, int *dim, char typdelim)
                         if (parse_state != ARRAY_NO_LEVEL &&
                             parse_state != ARRAY_LEVEL_STARTED &&
                             parse_state != ARRAY_LEVEL_DELIMITED)
-                            ereport(ERROR,
+                            ereturn(escontext, -1,
                                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                      errmsg("malformed array literal: \"%s\"", str),
                                      errdetail("Unexpected \"%c\" character.",
                                                '{')));
                         parse_state = ARRAY_LEVEL_STARTED;
                         if (nest_level >= MAXDIM)
-                            ereport(ERROR,
+                            ereturn(escontext, -1,
                                     (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                                      errmsg("number of array dimensions (%d) exceeds the maximum allowed (%d)",
                                             nest_level + 1, MAXDIM)));
@@ -581,14 +593,14 @@ ArrayCount(const char *str, int *dim, char typdelim)
                             parse_state != ARRAY_QUOTED_ELEM_COMPLETED &&
                             parse_state != ARRAY_LEVEL_COMPLETED &&
                             !(nest_level == 1 && parse_state == ARRAY_LEVEL_STARTED))
-                            ereport(ERROR,
+                            ereturn(escontext, -1,
                                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                      errmsg("malformed array literal: \"%s\"", str),
                                      errdetail("Unexpected \"%c\" character.",
                                                '}')));
                         parse_state = ARRAY_LEVEL_COMPLETED;
                         if (nest_level == 0)
-                            ereport(ERROR,
+                            ereturn(escontext, -1,
                                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                      errmsg("malformed array literal: \"%s\"", str),
                                      errdetail("Unmatched \"%c\" character.", '}')));
@@ -596,7 +608,7 @@ ArrayCount(const char *str, int *dim, char typdelim)

                         if (nelems_last[nest_level] != 0 &&
                             nelems[nest_level] != nelems_last[nest_level])
-                            ereport(ERROR,
+                            ereturn(escontext, -1,
                                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                      errmsg("malformed array literal: \"%s\"", str),
                                      errdetail("Multidimensional arrays must have "
@@ -630,7 +642,7 @@ ArrayCount(const char *str, int *dim, char typdelim)
                                 parse_state != ARRAY_ELEM_COMPLETED &&
                                 parse_state != ARRAY_QUOTED_ELEM_COMPLETED &&
                                 parse_state != ARRAY_LEVEL_COMPLETED)
-                                ereport(ERROR,
+                                ereturn(escontext, -1,
                                         (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                          errmsg("malformed array literal: \"%s\"", str),
                                          errdetail("Unexpected \"%c\" character.",
@@ -653,7 +665,7 @@ ArrayCount(const char *str, int *dim, char typdelim)
                             if (parse_state != ARRAY_LEVEL_STARTED &&
                                 parse_state != ARRAY_ELEM_STARTED &&
                                 parse_state != ARRAY_ELEM_DELIMITED)
-                                ereport(ERROR,
+                                ereturn(escontext, -1,
                                         (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                          errmsg("malformed array literal: \"%s\"", str),
                                          errdetail("Unexpected array element.")));
@@ -673,7 +685,7 @@ ArrayCount(const char *str, int *dim, char typdelim)
     while (*ptr)
     {
         if (!array_isspace(*ptr++))
-            ereport(ERROR,
+            ereturn(escontext, -1,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"", str),
                      errdetail("Junk after closing right brace.")));
@@ -713,11 +725,16 @@ ArrayCount(const char *str, int *dim, char typdelim)
  *    *hasnulls: set true iff there are any null elements.
  *    *nbytes: set to total size of data area needed (including alignment
  *        padding but not including array header overhead).
+ *    *escontext: if this points to an ErrorSaveContext, details of
+ *        any error are reported there.
+ *
+ * Result:
+ *    true for success, false for failure (if escontext is provided).
  *
  * Note that values[] and nulls[] are allocated by the caller, and must have
  * nitems elements.
  */
-static void
+static bool
 ReadArrayStr(char *arrayStr,
              const char *origStr,
              int nitems,
@@ -733,7 +750,8 @@ ReadArrayStr(char *arrayStr,
              Datum *values,
              bool *nulls,
              bool *hasnulls,
-             int32 *nbytes)
+             int32 *nbytes,
+             Node *escontext)
 {
     int            i,
                 nest_level = 0;
@@ -784,7 +802,7 @@ ReadArrayStr(char *arrayStr,
             {
                 case '\0':
                     /* Signal a premature end of the string */
-                    ereport(ERROR,
+                    ereturn(escontext, false,
                             (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                              errmsg("malformed array literal: \"%s\"",
                                     origStr)));
@@ -793,7 +811,7 @@ ReadArrayStr(char *arrayStr,
                     /* Skip backslash, copy next character as-is. */
                     srcptr++;
                     if (*srcptr == '\0')
-                        ereport(ERROR,
+                        ereturn(escontext, false,
                                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                  errmsg("malformed array literal: \"%s\"",
                                         origStr)));
@@ -823,7 +841,7 @@ ReadArrayStr(char *arrayStr,
                     if (!in_quotes)
                     {
                         if (nest_level >= ndim)
-                            ereport(ERROR,
+                            ereturn(escontext, false,
                                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                      errmsg("malformed array literal: \"%s\"",
                                             origStr)));
@@ -838,7 +856,7 @@ ReadArrayStr(char *arrayStr,
                     if (!in_quotes)
                     {
                         if (nest_level == 0)
-                            ereport(ERROR,
+                            ereturn(escontext, false,
                                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                      errmsg("malformed array literal: \"%s\"",
                                             origStr)));
@@ -891,7 +909,7 @@ ReadArrayStr(char *arrayStr,
         *dstendptr = '\0';

         if (i < 0 || i >= nitems)
-            ereport(ERROR,
+            ereturn(escontext, false,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"",
                             origStr)));
@@ -900,14 +918,20 @@ ReadArrayStr(char *arrayStr,
             pg_strcasecmp(itemstart, "NULL") == 0)
         {
             /* it's a NULL item */
-            values[i] = InputFunctionCall(inputproc, NULL,
-                                          typioparam, typmod);
+            if (!InputFunctionCallSafe(inputproc, NULL,
+                                       typioparam, typmod,
+                                       escontext,
+                                       &values[i]))
+                return false;
             nulls[i] = true;
         }
         else
         {
-            values[i] = InputFunctionCall(inputproc, itemstart,
-                                          typioparam, typmod);
+            if (!InputFunctionCallSafe(inputproc, itemstart,
+                                       typioparam, typmod,
+                                       escontext,
+                                       &values[i]))
+                return false;
             nulls[i] = false;
         }
     }
@@ -930,7 +954,7 @@ ReadArrayStr(char *arrayStr,
             totbytes = att_align_nominal(totbytes, typalign);
             /* check for overflow of total request */
             if (!AllocSizeIsValid(totbytes))
-                ereport(ERROR,
+                ereturn(escontext, false,
                         (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                          errmsg("array size exceeds the maximum allowed (%d)",
                                 (int) MaxAllocSize)));
@@ -938,6 +962,7 @@ ReadArrayStr(char *arrayStr,
     }
     *hasnulls = hasnull;
     *nbytes = totbytes;
+    return true;
 }


diff --git a/src/backend/utils/adt/arrayutils.c b/src/backend/utils/adt/arrayutils.c
index 051169a149..c52adc6259 100644
--- a/src/backend/utils/adt/arrayutils.c
+++ b/src/backend/utils/adt/arrayutils.c
@@ -74,6 +74,16 @@ ArrayGetOffset0(int n, const int *tup, const int *scale)
  */
 int
 ArrayGetNItems(int ndim, const int *dims)
+{
+    return ArrayGetNItemsSafe(ndim, dims, NULL);
+}
+
+/*
+ * This entry point can return the error into an ErrorSaveContext
+ * instead of throwing an exception.  -1 is returned after an error.
+ */
+int
+ArrayGetNItemsSafe(int ndim, const int *dims, NodePtr escontext)
 {
     int32        ret;
     int            i;
@@ -89,7 +99,7 @@ ArrayGetNItems(int ndim, const int *dims)

         /* A negative dimension implies that UB-LB overflowed ... */
         if (dims[i] < 0)
-            ereport(ERROR,
+            ereturn(escontext, -1,
                     (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                      errmsg("array size exceeds the maximum allowed (%d)",
                             (int) MaxArraySize)));
@@ -98,14 +108,14 @@ ArrayGetNItems(int ndim, const int *dims)

         ret = (int32) prod;
         if ((int64) ret != prod)
-            ereport(ERROR,
+            ereturn(escontext, -1,
                     (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                      errmsg("array size exceeds the maximum allowed (%d)",
                             (int) MaxArraySize)));
     }
     Assert(ret >= 0);
     if ((Size) ret > MaxArraySize)
-        ereport(ERROR,
+        ereturn(escontext, -1,
                 (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                  errmsg("array size exceeds the maximum allowed (%d)",
                         (int) MaxArraySize)));
@@ -126,6 +136,17 @@ ArrayGetNItems(int ndim, const int *dims)
  */
 void
 ArrayCheckBounds(int ndim, const int *dims, const int *lb)
+{
+    (void) ArrayCheckBoundsSafe(ndim, dims, lb, NULL);
+}
+
+/*
+ * This entry point can return the error into an ErrorSaveContext
+ * instead of throwing an exception.
+ */
+bool
+ArrayCheckBoundsSafe(int ndim, const int *dims, const int *lb,
+                     NodePtr escontext)
 {
     int            i;

@@ -135,11 +156,13 @@ ArrayCheckBounds(int ndim, const int *dims, const int *lb)
         int32        sum PG_USED_FOR_ASSERTS_ONLY;

         if (pg_add_s32_overflow(dims[i], lb[i], &sum))
-            ereport(ERROR,
+            ereturn(escontext, false,
                     (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                      errmsg("array lower bound is too large: %d",
                             lb[i])));
     }
+
+    return true;
 }

 /*
diff --git a/src/backend/utils/adt/bool.c b/src/backend/utils/adt/bool.c
index cd7335287f..e291672ae4 100644
--- a/src/backend/utils/adt/bool.c
+++ b/src/backend/utils/adt/bool.c
@@ -148,13 +148,10 @@ boolin(PG_FUNCTION_ARGS)
     if (parse_bool_with_len(str, len, &result))
         PG_RETURN_BOOL(result);

-    ereport(ERROR,
+    ereturn(fcinfo->context, (Datum) 0,
             (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
              errmsg("invalid input syntax for type %s: \"%s\"",
                     "boolean", in_str)));
-
-    /* not reached */
-    PG_RETURN_BOOL(false);
 }

 /*
diff --git a/src/backend/utils/adt/int.c b/src/backend/utils/adt/int.c
index 42ddae99ef..e1837bee71 100644
--- a/src/backend/utils/adt/int.c
+++ b/src/backend/utils/adt/int.c
@@ -291,7 +291,7 @@ int4in(PG_FUNCTION_ARGS)
 {
     char       *num = PG_GETARG_CSTRING(0);

-    PG_RETURN_INT32(pg_strtoint32(num));
+    PG_RETURN_INT32(pg_strtoint32_safe(num, fcinfo->context));
 }

 /*
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index a64422c8d0..0de0bed0e8 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -166,8 +166,11 @@ invalid_syntax:
 /*
  * Convert input string to a signed 32 bit integer.
  *
- * Allows any number of leading or trailing whitespace characters. Will throw
- * ereport() upon bad input format or overflow.
+ * Allows any number of leading or trailing whitespace characters.
+ *
+ * pg_strtoint32() will throw ereport() upon bad input format or overflow;
+ * while pg_strtoint32_safe() instead returns such complaints in *escontext,
+ * if it's an ErrorSaveContext.
  *
  * NB: Accumulate input as an unsigned number, to deal with two's complement
  * representation of the most negative number, which can't be represented as a
@@ -175,6 +178,12 @@ invalid_syntax:
  */
 int32
 pg_strtoint32(const char *s)
+{
+    return pg_strtoint32_safe(s, NULL);
+}
+
+int32
+pg_strtoint32_safe(const char *s, Node *escontext)
 {
     const char *ptr = s;
     uint32        tmp = 0;
@@ -227,18 +236,16 @@ pg_strtoint32(const char *s)
     return (int32) tmp;

 out_of_range:
-    ereport(ERROR,
+    ereturn(escontext, 0,
             (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
              errmsg("value \"%s\" is out of range for type %s",
                     s, "integer")));

 invalid_syntax:
-    ereport(ERROR,
+    ereturn(escontext, 0,
             (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
              errmsg("invalid input syntax for type %s: \"%s\"",
                     "integer", s)));
-
-    return 0;                    /* keep compiler quiet */
 }

 /*
diff --git a/src/backend/utils/adt/rowtypes.c b/src/backend/utils/adt/rowtypes.c
index db843a0fbf..bdafcff02d 100644
--- a/src/backend/utils/adt/rowtypes.c
+++ b/src/backend/utils/adt/rowtypes.c
@@ -77,6 +77,7 @@ record_in(PG_FUNCTION_ARGS)
     char       *string = PG_GETARG_CSTRING(0);
     Oid            tupType = PG_GETARG_OID(1);
     int32        tupTypmod = PG_GETARG_INT32(2);
+    Node       *escontext = fcinfo->context;
     HeapTupleHeader result;
     TupleDesc    tupdesc;
     HeapTuple    tuple;
@@ -100,7 +101,7 @@ record_in(PG_FUNCTION_ARGS)
      * supply a valid typmod, and then we can do something useful for RECORD.
      */
     if (tupType == RECORDOID && tupTypmod < 0)
-        ereport(ERROR,
+        ereturn(escontext, (Datum) 0,
                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                  errmsg("input of anonymous composite types is not implemented")));

@@ -152,10 +153,13 @@ record_in(PG_FUNCTION_ARGS)
     while (*ptr && isspace((unsigned char) *ptr))
         ptr++;
     if (*ptr++ != '(')
-        ereport(ERROR,
+    {
+        errsave(escontext,
                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                  errmsg("malformed record literal: \"%s\"", string),
                  errdetail("Missing left parenthesis.")));
+        goto fail;
+    }

     initStringInfo(&buf);

@@ -181,10 +185,13 @@ record_in(PG_FUNCTION_ARGS)
                 ptr++;
             else
                 /* *ptr must be ')' */
-                ereport(ERROR,
+            {
+                errsave(escontext,
                         (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                          errmsg("malformed record literal: \"%s\"", string),
                          errdetail("Too few columns.")));
+                goto fail;
+            }
         }

         /* Check for null: completely empty input means null */
@@ -204,19 +211,25 @@ record_in(PG_FUNCTION_ARGS)
                 char        ch = *ptr++;

                 if (ch == '\0')
-                    ereport(ERROR,
+                {
+                    errsave(escontext,
                             (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                              errmsg("malformed record literal: \"%s\"",
                                     string),
                              errdetail("Unexpected end of input.")));
+                    goto fail;
+                }
                 if (ch == '\\')
                 {
                     if (*ptr == '\0')
-                        ereport(ERROR,
+                    {
+                        errsave(escontext,
                                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                  errmsg("malformed record literal: \"%s\"",
                                         string),
                                  errdetail("Unexpected end of input.")));
+                        goto fail;
+                    }
                     appendStringInfoChar(&buf, *ptr++);
                 }
                 else if (ch == '"')
@@ -252,10 +265,13 @@ record_in(PG_FUNCTION_ARGS)
             column_info->column_type = column_type;
         }

-        values[i] = InputFunctionCall(&column_info->proc,
-                                      column_data,
-                                      column_info->typioparam,
-                                      att->atttypmod);
+        if (!InputFunctionCallSafe(&column_info->proc,
+                                   column_data,
+                                   column_info->typioparam,
+                                   att->atttypmod,
+                                   escontext,
+                                   &values[i]))
+            goto fail;

         /*
          * Prep for next column
@@ -264,18 +280,24 @@ record_in(PG_FUNCTION_ARGS)
     }

     if (*ptr++ != ')')
-        ereport(ERROR,
+    {
+        errsave(escontext,
                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                  errmsg("malformed record literal: \"%s\"", string),
                  errdetail("Too many columns.")));
+        goto fail;
+    }
     /* Allow trailing whitespace */
     while (*ptr && isspace((unsigned char) *ptr))
         ptr++;
     if (*ptr)
-        ereport(ERROR,
+    {
+        errsave(escontext,
                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                  errmsg("malformed record literal: \"%s\"", string),
                  errdetail("Junk after right parenthesis.")));
+        goto fail;
+    }

     tuple = heap_form_tuple(tupdesc, values, nulls);

@@ -294,6 +316,11 @@ record_in(PG_FUNCTION_ARGS)
     ReleaseTupleDesc(tupdesc);

     PG_RETURN_HEAPTUPLEHEADER(result);
+
+    /* exit here once we've done lookup_rowtype_tupdesc */
+fail:
+    ReleaseTupleDesc(tupdesc);
+    PG_RETURN_NULL();
 }

 /*
diff --git a/src/include/utils/array.h b/src/include/utils/array.h
index 2f794d1168..5ecb436a08 100644
--- a/src/include/utils/array.h
+++ b/src/include/utils/array.h
@@ -447,7 +447,11 @@ extern void array_free_iterator(ArrayIterator iterator);
 extern int    ArrayGetOffset(int n, const int *dim, const int *lb, const int *indx);
 extern int    ArrayGetOffset0(int n, const int *tup, const int *scale);
 extern int    ArrayGetNItems(int ndim, const int *dims);
+extern int    ArrayGetNItemsSafe(int ndim, const int *dims,
+                               NodePtr escontext);
 extern void ArrayCheckBounds(int ndim, const int *dims, const int *lb);
+extern bool ArrayCheckBoundsSafe(int ndim, const int *dims, const int *lb,
+                                 NodePtr escontext);
 extern void mda_get_range(int n, int *span, const int *st, const int *endp);
 extern void mda_get_prod(int n, const int *range, int *prod);
 extern void mda_get_offset_values(int n, int *dist, const int *prod, const int *span);
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 81631f1645..fbfd8375e3 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -45,6 +45,7 @@ extern int    namestrcmp(Name name, const char *str);
 /* numutils.c */
 extern int16 pg_strtoint16(const char *s);
 extern int32 pg_strtoint32(const char *s);
+extern int32 pg_strtoint32_safe(const char *s, Node *escontext);
 extern int64 pg_strtoint64(const char *s);
 extern int    pg_itoa(int16 i, char *a);
 extern int    pg_ultoa_n(uint32 value, char *a);
diff --git a/src/test/regress/expected/arrays.out b/src/test/regress/expected/arrays.out
index 97920f38c2..a2f9d7ed16 100644
--- a/src/test/regress/expected/arrays.out
+++ b/src/test/regress/expected/arrays.out
@@ -182,6 +182,31 @@ SELECT a,b,c FROM arrtest;
  [4:4]={NULL}  | {3,4}                 | {foo,new_word}
 (3 rows)

+-- test non-error-throwing API
+SELECT pg_input_is_valid('{1,2,3}', 'integer[]');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('{1,2', 'integer[]');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_is_valid('{1,zed}', 'integer[]');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_error_message('{1,zed}', 'integer[]');
+            pg_input_error_message
+----------------------------------------------
+ invalid input syntax for type integer: "zed"
+(1 row)
+
 -- test mixed slice/scalar subscripting
 select '{{1,2,3},{4,5,6},{7,8,9}}'::int[];
            int4
diff --git a/src/test/regress/expected/boolean.out b/src/test/regress/expected/boolean.out
index 4728fe2dfd..977124b20b 100644
--- a/src/test/regress/expected/boolean.out
+++ b/src/test/regress/expected/boolean.out
@@ -142,6 +142,25 @@ SELECT bool '' AS error;
 ERROR:  invalid input syntax for type boolean: ""
 LINE 1: SELECT bool '' AS error;
                     ^
+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('true', 'bool');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('asdf', 'bool');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_error_message('junk', 'bool');
+            pg_input_error_message
+-----------------------------------------------
+ invalid input syntax for type boolean: "junk"
+(1 row)
+
 -- and, or, not in qualifications
 SELECT bool 't' or bool 'f' AS true;
  true
diff --git a/src/test/regress/expected/int4.out b/src/test/regress/expected/int4.out
index fbcc0e8d9e..b98007bd7a 100644
--- a/src/test/regress/expected/int4.out
+++ b/src/test/regress/expected/int4.out
@@ -45,6 +45,31 @@ SELECT * FROM INT4_TBL;
  -2147483647
 (5 rows)

+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('34', 'int4');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('asdf', 'int4');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_is_valid('1000000000000', 'int4');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_error_message('1000000000000', 'int4');
+                 pg_input_error_message
+--------------------------------------------------------
+ value "1000000000000" is out of range for type integer
+(1 row)
+
 SELECT i.* FROM INT4_TBL i WHERE i.f1 <> int2 '0';
      f1
 -------------
diff --git a/src/test/regress/expected/rowtypes.out b/src/test/regress/expected/rowtypes.out
index a4cc2d8c12..1bcd2b499c 100644
--- a/src/test/regress/expected/rowtypes.out
+++ b/src/test/regress/expected/rowtypes.out
@@ -69,6 +69,32 @@ ERROR:  malformed record literal: "(Joe,Blow) /"
 LINE 1: select '(Joe,Blow) /'::fullname;
                ^
 DETAIL:  Junk after right parenthesis.
+-- test non-error-throwing API
+create type twoints as (r integer, i integer);
+SELECT pg_input_is_valid('(1,2)', 'twoints');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('(1,2', 'twoints');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_is_valid('(1,zed)', 'twoints');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_error_message('(1,zed)', 'twoints');
+            pg_input_error_message
+----------------------------------------------
+ invalid input syntax for type integer: "zed"
+(1 row)
+
 create temp table quadtable(f1 int, q quad);
 insert into quadtable values (1, ((3.3,4.4),(5.5,6.6)));
 insert into quadtable values (2, ((null,4.4),(5.5,6.6)));
diff --git a/src/test/regress/sql/arrays.sql b/src/test/regress/sql/arrays.sql
index 791af5c0ce..38e8dd440b 100644
--- a/src/test/regress/sql/arrays.sql
+++ b/src/test/regress/sql/arrays.sql
@@ -113,6 +113,12 @@ SELECT a FROM arrtest WHERE a[2] IS NULL;
 DELETE FROM arrtest WHERE a[2] IS NULL AND b IS NULL;
 SELECT a,b,c FROM arrtest;

+-- test non-error-throwing API
+SELECT pg_input_is_valid('{1,2,3}', 'integer[]');
+SELECT pg_input_is_valid('{1,2', 'integer[]');
+SELECT pg_input_is_valid('{1,zed}', 'integer[]');
+SELECT pg_input_error_message('{1,zed}', 'integer[]');
+
 -- test mixed slice/scalar subscripting
 select '{{1,2,3},{4,5,6},{7,8,9}}'::int[];
 select ('{{1,2,3},{4,5,6},{7,8,9}}'::int[])[1:2][2];
diff --git a/src/test/regress/sql/boolean.sql b/src/test/regress/sql/boolean.sql
index 4dd47aaf9d..dfaa55dd0f 100644
--- a/src/test/regress/sql/boolean.sql
+++ b/src/test/regress/sql/boolean.sql
@@ -62,6 +62,11 @@ SELECT bool '000' AS error;

 SELECT bool '' AS error;

+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('true', 'bool');
+SELECT pg_input_is_valid('asdf', 'bool');
+SELECT pg_input_error_message('junk', 'bool');
+
 -- and, or, not in qualifications

 SELECT bool 't' or bool 'f' AS true;
diff --git a/src/test/regress/sql/int4.sql b/src/test/regress/sql/int4.sql
index f19077f3da..54420818de 100644
--- a/src/test/regress/sql/int4.sql
+++ b/src/test/regress/sql/int4.sql
@@ -17,6 +17,12 @@ INSERT INTO INT4_TBL(f1) VALUES ('');

 SELECT * FROM INT4_TBL;

+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('34', 'int4');
+SELECT pg_input_is_valid('asdf', 'int4');
+SELECT pg_input_is_valid('1000000000000', 'int4');
+SELECT pg_input_error_message('1000000000000', 'int4');
+
 SELECT i.* FROM INT4_TBL i WHERE i.f1 <> int2 '0';

 SELECT i.* FROM INT4_TBL i WHERE i.f1 <> int4 '0';
diff --git a/src/test/regress/sql/rowtypes.sql b/src/test/regress/sql/rowtypes.sql
index ad5b7e128f..4cd6a49215 100644
--- a/src/test/regress/sql/rowtypes.sql
+++ b/src/test/regress/sql/rowtypes.sql
@@ -31,6 +31,13 @@ select '[]'::fullname;          -- bad
 select ' (Joe,Blow)  '::fullname;  -- ok, extra whitespace
 select '(Joe,Blow) /'::fullname;  -- bad

+-- test non-error-throwing API
+create type twoints as (r integer, i integer);
+SELECT pg_input_is_valid('(1,2)', 'twoints');
+SELECT pg_input_is_valid('(1,2', 'twoints');
+SELECT pg_input_is_valid('(1,zed)', 'twoints');
+SELECT pg_input_error_message('(1,zed)', 'twoints');
+
 create temp table quadtable(f1 int, q quad);

 insert into quadtable values (1, ((3.3,4.4),(5.5,6.6)));

Re: Error-safe user functions

From

Andrew Dunstan

Date:

07 December 2022, 22:50:34

On 2022-12-07 We 17:32, Tom Lane wrote:
> OK, here's a v4 that I think is possibly committable.
>
> I've changed all the comments and docs to use the "soft error"
> terminology, but since using "soft" in the actual function names
> didn't seem that appealing, they still use "safe".
>
> I already pushed the 0000 elog-refactoring patch, since that seemed
> uncontroversial.  0001 attached covers the same territory as before,
> but I regrouped the rest so that 0002 installs the new test support
> functions, then 0003 adds both the per-datatype changes and
> corresponding test cases for bool, int4, arrays, and records.
> The idea here is that 0003 can be pointed to as a sample of what
> has to be done to datatype input functions, while the preceding
> patches can be cited as relevant documentation.  (I've not decided
> whether to squash 0001 and 0002 together or commit them separately.
> Does it make sense to break 0003 into 4 separate commits, or is
> that overkill?)
>

No strong opinion about 0001 and 0002. I'm happy enough with them as
they are, but if you want to squash them that's ok. I wouldn't break up
0003. I think we're going to end up committing the remaining work in
batches, although they would probably be a bit more thematically linked
than these.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Error-safe user functions

From

Tom Lane

Date:

07 December 2022, 22:56:20

Andrew Dunstan <andrew@dunslane.net> writes:
> On 2022-12-07 We 17:32, Tom Lane wrote:
>> Does it make sense to break 0003 into 4 separate commits, or is
>> that overkill?)

> No strong opinion about 0001 and 0002. I'm happy enough with them as
> they are, but if you want to squash them that's ok. I wouldn't break up
> 0003. I think we're going to end up committing the remaining work in
> batches, although they would probably be a bit more thematically linked
> than these.

Yeah, we certainly aren't likely to do this work as
one-commit-per-datatype going forward.  I'm just wondering
how to do these initial commits so that they provide
good reference material.

            regards, tom lane

Re: Error-safe user functions

From

Andres Freund

Date:

07 December 2022, 23:35:18

Hi,

On 2022-12-07 17:32:21 -0500, Tom Lane wrote:
> I already pushed the 0000 elog-refactoring patch, since that seemed
> uncontroversial.  0001 attached covers the same territory as before,
> but I regrouped the rest so that 0002 installs the new test support
> functions, then 0003 adds both the per-datatype changes and
> corresponding test cases for bool, int4, arrays, and records.
> The idea here is that 0003 can be pointed to as a sample of what
> has to be done to datatype input functions, while the preceding
> patches can be cited as relevant documentation.  (I've not decided
> whether to squash 0001 and 0002 together or commit them separately.

I think they make sense as is.


> Does it make sense to break 0003 into 4 separate commits, or is
> that overkill?)

I think it'd be fine either way.


> + * If "context" is an ErrorSaveContext node, but the node creator only wants
> + * notification of the fact of a soft error without any details, just set
> + * the error_occurred flag in the ErrorSaveContext node and return false,
> + * which will cause us to skip the remaining error processing steps.
> + *
> + * Otherwise, create and initialize error stack entry and return true.
> + * Subsequently, errmsg() and perhaps other routines will be called to further
> + * populate the stack entry.  Finally, errsave_finish() will be called to
> + * tidy up.
> + */
> +bool
> +errsave_start(NodePtr context, const char *domain)

I wonder if there are potential use-cases for levels other than ERROR. I can
potentially see us wanting to defer some FATALs, e.g. when they occur in
process exit hooks.


> +{
> +    ErrorSaveContext *escontext;
> +    ErrorData  *edata;
> +
> +    /*
> +     * Do we have a context for soft error reporting?  If not, just punt to
> +     * errstart().
> +     */
> +    if (context == NULL || !IsA(context, ErrorSaveContext))
> +        return errstart(ERROR, domain);
> +
> +    /* Report that a soft error was detected */
> +    escontext = (ErrorSaveContext *) context;
> +    escontext->error_occurred = true;
> +
> +    /* Nothing else to do if caller wants no further details */
> +    if (!escontext->details_wanted)
> +        return false;
> +
> +    /*
> +     * Okay, crank up a stack entry to store the info in.
> +     */
> +
> +    recursion_depth++;
> +
> +    /* Initialize data for this error frame */
> +    edata = get_error_stack_entry();

For a moment I was worried that it could lead to odd behaviour that we don't
do get_error_stack_entry() when !details_wanted, due to not raising an error
we'd otherwise raise. But that's a should-never-be-reached case, so ...


> +/*
> + * errsave_finish --- end a "soft" error-reporting cycle
> + *
> + * If errsave_start() decided this was a regular error, behave as
> + * errfinish().  Otherwise, package up the error details and save
> + * them in the ErrorSaveContext node.
> + */
> +void
> +errsave_finish(NodePtr context, const char *filename, int lineno,
> +               const char *funcname)
> +{
> +    ErrorSaveContext *escontext = (ErrorSaveContext *) context;
> +    ErrorData  *edata = &errordata[errordata_stack_depth];
> +
> +    /* verify stack depth before accessing *edata */
> +    CHECK_STACK_DEPTH();
> +
> +    /*
> +     * If errsave_start punted to errstart, then elevel will be ERROR or
> +     * perhaps even PANIC.  Punt likewise to errfinish.
> +     */
> +    if (edata->elevel >= ERROR)
> +    {
> +        errfinish(filename, lineno, funcname);
> +        pg_unreachable();
> +    }

It seems somewhat ugly to transport this knowledge via edata->elevel, but it's
not too bad.



> +/*
> + * We cannot include nodes.h yet, so make a stub reference.  (This is also
> + * used by fmgr.h, which doesn't want to depend on nodes.h either.)
> + */
> +typedef struct Node *NodePtr;

Seems like it'd be easier to just forward declare the struct, and use the
non-typedef'ed name in the header than to have to deal with these
interdependencies and the differing typenames.


> +/*----------
> + * Support for reporting "soft" errors that don't require a full transaction
> + * abort to clean up.  This is to be used in this way:
> + *        errsave(context,
> + *                errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
> + *                errmsg("invalid input syntax for type %s: \"%s\"",
> + *                       "boolean", in_str),
> + *                ... other errxxx() fields as needed ...);
> + *
> + * "context" is a node pointer or NULL, and the remaining auxiliary calls
> + * provide the same error details as in ereport().  If context is not a
> + * pointer to an ErrorSaveContext node, then errsave(context, ...)
> + * behaves identically to ereport(ERROR, ...).  If context is a pointer
> + * to an ErrorSaveContext node, then the information provided by the
> + * auxiliary calls is stored in the context node and control returns
> + * normally.  The caller of errsave() must then do any required cleanup
> + * and return control back to its caller.  That caller must check the
> + * ErrorSaveContext node to see whether an error occurred before
> + * it can trust the function's result to be meaningful.
> + *
> + * errsave_domain() allows a message domain to be specified; it is
> + * precisely analogous to ereport_domain().
> + *----------
> + */
> +#define errsave_domain(context, domain, ...)    \
> +    do { \
> +        NodePtr context_ = (context); \
> +        pg_prevent_errno_in_scope(); \
> +        if (errsave_start(context_, domain)) \
> +            __VA_ARGS__, errsave_finish(context_, __FILE__, __LINE__, __func__); \
> +    } while(0)

Perhaps worth noting here that the reason why the errsave_start/errsave_finish
split exists differs a bit from the reason in ereport_domain()? "Over there"
it's just about not wanting to incur overhead when the message isn't logged,
but here we'll always have >= ERROR, but ->details_wanted can still lead to
not wanting to incur the overhead.


>  /*
> diff --git a/src/backend/utils/adt/rowtypes.c b/src/backend/utils/adt/rowtypes.c
> index db843a0fbf..bdafcff02d 100644
> --- a/src/backend/utils/adt/rowtypes.c
> +++ b/src/backend/utils/adt/rowtypes.c
> @@ -77,6 +77,7 @@ record_in(PG_FUNCTION_ARGS)
>      char       *string = PG_GETARG_CSTRING(0);
>      Oid            tupType = PG_GETARG_OID(1);
>      int32        tupTypmod = PG_GETARG_INT32(2);
> +    Node       *escontext = fcinfo->context;
>      HeapTupleHeader result;
>      TupleDesc    tupdesc;
>      HeapTuple    tuple;
> @@ -100,7 +101,7 @@ record_in(PG_FUNCTION_ARGS)
>       * supply a valid typmod, and then we can do something useful for RECORD.
>       */
>      if (tupType == RECORDOID && tupTypmod < 0)
> -        ereport(ERROR,
> +        ereturn(escontext, (Datum) 0,
>                  (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>                   errmsg("input of anonymous composite types is not implemented")));
>  

Is it ok that we throw an error in lookup_rowtype_tupdesc()? Normally those
should not be reachable by users, I think? The new testing functions might
reach it, but that seems fine, they're test functions.


Greetings,

Andres Freund

Re: Error-safe user functions

From

Tom Lane

Date:

07 December 2022, 23:52:41

Andres Freund <andres@anarazel.de> writes:
> I wonder if there are potential use-cases for levels other than ERROR. I can
> potentially see us wanting to defer some FATALs, e.g. when they occur in
> process exit hooks.

I thought about that early on, and concluded not.  The whole thing is
moot for levels less than ERROR, of course, and I'm having a hard
time seeing how it could be useful for FATAL or PANIC.  Maybe I just
lack imagination, but if a call is specifying FATAL rather than just
ERROR then it seems to me it's already a special snowflake rather
than something we could fold into a generic non-error behavior.

> For a moment I was worried that it could lead to odd behaviour that we don't
> do get_error_stack_entry() when !details_wanted, due to not raising an error
> we'd otherwise raise. But that's a should-never-be-reached case, so ...

I don't see how.  Returning false out of errsave_start causes the
errsave macro to immediately give control back to the caller, which
will go on about its business.
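
(Editorial sketch, not part of the original message.)  The control flow
described above can be modelled in a few lines of plain C.  Everything here
is a hypothetical stand-in: ToySaveContext, toy_errsave_start(),
toy_errsave_finish(), toy_ereturn() and toy_parse_int() are invented names
that only imitate the shape of the patch, and the no-context path simply
exits instead of doing a real ereport(ERROR).

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Toy stand-in for ErrorSaveContext; hypothetical, not the PostgreSQL struct. */
typedef struct ToySaveContext
{
    bool        error_occurred; /* set whenever a soft error is reported */
    bool        details_wanted; /* does the caller want the message text? */
    char        message[128];   /* filled in only if details_wanted */
} ToySaveContext;

/*
 * Returns true if the message should be built.  With a save context that
 * does not want details, we only set the flag and return false.
 */
static bool
toy_errsave_start(ToySaveContext *ctx)
{
    if (ctx == NULL)
        return true;            /* no context: behave like a hard error */
    ctx->error_occurred = true;
    return ctx->details_wanted;
}

/* Store the message, or abort the program if there is no save context. */
static void
toy_errsave_finish(ToySaveContext *ctx, const char *msg)
{
    if (ctx == NULL)
    {
        fprintf(stderr, "ERROR: %s\n", msg);
        exit(EXIT_FAILURE);
    }
    snprintf(ctx->message, sizeof(ctx->message), "%s", msg);
}

/* Report a soft error, then return a dummy value to the caller. */
#define toy_ereturn(ctx, dummy, msg) \
    do { \
        ToySaveContext *ctx_ = (ctx); \
        if (toy_errsave_start(ctx_)) \
            toy_errsave_finish(ctx_, (msg)); \
        return (dummy); \
    } while (0)

/* Parse a decimal int; on bad input, report softly and return 0. */
static int
toy_parse_int(const char *s, ToySaveContext *ctx)
{
    char       *end;
    long        val;

    assert(s != NULL);
    errno = 0;
    val = strtol(s, &end, 10);
    if (end == s || *end != '\0')
        toy_ereturn(ctx, 0, "invalid input syntax");
    if (errno == ERANGE || val < INT_MIN || val > INT_MAX)
        toy_ereturn(ctx, 0, "value out of range");
    return (int) val;
}
```

A caller passes a ToySaveContext with details_wanted = false when it only
needs a yes/no answer; the "start" step then returns false and the message is
never built, yet toy_parse_int() still returns normally with error_occurred
set.  The caller has to check that flag before trusting the returned value.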

> It seems somewhat ugly to transport this knowledge via edata->elevel, but it's
> not too bad.

The LOG-vs-ERROR business, you mean?  Yeah.  I considered adding another
bool flag to ErrorData, but couldn't convince myself it was worth the
trouble.  If we find a problem we can do that sometime in future.

>> +/*
>> + * We cannot include nodes.h yet, so make a stub reference.  (This is also
>> + * used by fmgr.h, which doesn't want to depend on nodes.h either.)
>> + */
>> +typedef struct Node *NodePtr;

> Seems like it'd be easier to just forward declare the struct, and use the
> non-typedef'ed name in the header than to have to deal with these
> interdependencies and the differing typenames.

Meh.  I'm a little allergic to writing "struct foo *" in function argument
lists, because I so often see gcc pointing out that if struct foo isn't
yet known then that can silently mean something different than you
intended.  With the typedef, it either works or is an error, no halfway
about it.  And the struct way isn't really much better in terms of
having two different notations to use rather than only one.
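
(Editorial sketch, not part of the original message.)  The hazard mentioned
above comes from C's scoping rules: a struct tag that first appears inside a
prototype's parameter list has prototype scope, so it names a different type
from a later file-scope struct with the same tag, and gcc only warns about it.
The two workarounds under discussion look roughly like this; ToyNode,
ToyNodePtr and the two functions are hypothetical names, not PostgreSQL's.

```c
#include <assert.h>
#include <stddef.h>

/*
 * File-scope forward declaration.  Without this line, "struct ToyNode"
 * in the prototype below would have prototype scope only.
 */
struct ToyNode;

/* Option 1: spell out the struct tag in the argument list. */
static int toy_is_null(struct ToyNode *node);

/* Option 2: a pointer typedef; a missing typedef is a hard compile error. */
typedef struct ToyNode *ToyNodePtr;
static int toy_is_null_ptr(ToyNodePtr node);

/* The full definition can come later (for example, from another header). */
struct ToyNode
{
    int         tag;
};

static int
toy_is_null(struct ToyNode *node)
{
    return node == NULL;
}

static int
toy_is_null_ptr(ToyNodePtr node)
{
    /* Both spellings denote the same pointer type. */
    assert(sizeof(ToyNodePtr) == sizeof(struct ToyNode *));
    return toy_is_null(node);
}
```

With option 1, every header that uses the type must remember the bare
"struct ToyNode;" line first.  With option 2, forgetting the typedef cannot
compile at all, which is the "either works or is an error" property.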

> Perhaps worth noting here that the reason why the errsave_start/errsave_finish
> split exists differs a bit from the reason in ereport_domain()? "Over there"
> it's just about not wanting to incur overhead when the message isn't logged,
> but here we'll always have >= ERROR, but ->details_wanted can still lead to
> not wanting to incur the overhead.

Hmmm ... it seems like the same reason to me, we don't want to incur the
overhead if the "start" function says not to.

> Is it ok that we throw an error in lookup_rowtype_tupdesc()?

Yeah, that should fall in the category of internal errors I think.
I don't see how you'd reach that from a bad input string.

(Or to be more precise, the point of pg_input_is_valid is to tell
you whether the input string is valid, not to tell you whether the
type name is valid; if you're worried about the latter you need
a separate and earlier test.)

            regards, tom lane

Re: Error-safe user functions

From

Corey Huinker

Date:

08 December 2022, 03:37:28

On Wed, Dec 7, 2022 at 12:17 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Corey Huinker <corey.huinker@gmail.com> writes:
>> In my attempt to implement CAST...DEFAULT, I noticed that I immediately
>> needed an OidInputFunctionCallSafe, which was trivial but maybe something
>> we want to add to the infra patch, but the comments around that function
>> also somewhat indicate that we might want to just do the work in-place
>> and call InputFunctionCallSafe directly. Open to both ideas.
>
> I'm a bit skeptical of that. IMO using OidInputFunctionCall is only
> appropriate in places that will be executed just once per query.

That is what's happening when the expr of the existing
CAST ( expr AS typename ) is a constant and we want to just resolve the
constant at parse time.

Re: Error-safe user functions

From

Alvaro Herrera

Date:

08 December 2022, 12:00:19

On 2022-Dec-07, David G. Johnston wrote:

> Are you suggesting we should not go down the path that v8-0003 does in the
> monitoring section cleanup thread?  I find the usability of Chapter 54
> System Views to be superior to these two run-on chapters and would rather
> we emulate it in both these places - for what is in the end very little
> additional effort, all mechanical in nature.

I think the new 9.26 is much better now than what we had there two days
ago.  Maybe it would be even better with your proposed changes, but
let's see what you come up with.

As for Chapter 54, while it's a lot better than what we had previously,
I have a complaint about the new presentation: the overview table
appears (at least in the HTML presentation) in a separate page from the
initial page of the chapter.  So to get the intended table of contents I
have to move forward from the unintended table of contents (i.e. from
https://www.postgresql.org/docs/devel/views.html forward to
https://www.postgresql.org/docs/devel/views-overview.html ).  This seems
pointless.  I think it would be better if we just removed the line
<sect1 id="overview">, which would put that table in the "front page".

I also have an issue with Chapter 28, more precisely 28.2.2, where we
have similar TOC-style tables (Tables 28.1 and 28.2), but these ones
seem inferior to the new table in Chapter 54 in that the outgoing links
are in random positions in the text of the table.  It would be better to
put those in a column of their own, so that they are all vertically
aligned and easier to spot/click.  Not sure if you've been here already.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"En las profundidades de nuestro inconsciente hay una obsesiva necesidad
de un universo lógico y coherente. Pero el universo real se halla siempre
un paso más allá de la lógica" (Irulan)

Re: Error-safe user functions

From

Tom Lane

Date:

08 December 2022, 16:31:59

Andres Freund <andres@anarazel.de> writes:
> On 2022-12-07 17:32:21 -0500, Tom Lane wrote:
>> +typedef struct Node *NodePtr;

> Seems like it'd be easier to just forward declare the struct, and use the
> non-typedef'ed name in the header than to have to deal with these
> interdependencies and the differing typenames.

I've been having second thoughts about how to handle this issue.
As we convert more and more datatypes, references to "Node *" are
going to be needed in assorted headers that don't currently have
any reason to #include nodes.h.  Rather than bloating their include
footprints, we'll want to use the alternate spelling, whichever
it is.  (I already had to do this in array.h.)  Some of these headers
might be things that are also read by frontend compiles, in which
case they won't have access to elog.h either, so that NodePtr in
this formulation won't work for them.  (I ran into a variant of that
with an early draft of this patch series.)

If we stick with NodePtr we'll probably end by putting that typedef
into c.h so that it's accessible in frontend as well as backend.
I don't have a huge problem with that, but I concede it's a little ugly.

If we go with "struct Node *" then we can solve such problems by
just repeating "struct Node;" forward-declarations in as many
headers as we have to.  This is a bit ugly too, but maybe less so,
and it's a method we use elsewhere.  The main downside I can see
to it is that we will probably not find out all the places where
we need such declarations until we get field complaints that
"header X doesn't compile for me".  elog.h will have a struct Node
declaration, and that will be visible in every backend compilation
we do as well as every cpluspluscheck/headerscheck test.

Another notational point I'm wondering about is whether we want
to create hundreds of direct references to fcinfo->context.
Is it time to invent

#define PG_GET_CONTEXT()    (fcinfo->context)

and write that instead in all these input functions?
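
A minimal standalone sketch of what such an accessor macro would look
like in use. The struct definitions here are simplified stand-ins made
up for illustration (the real ones live in nodes.h and fmgr.h), and
toy_has_context is a hypothetical function, not part of any patch:

```c
#include <stddef.h>

/* Stand-in types for illustration only; not the PostgreSQL definitions. */
struct Node
{
    int         type;
};

typedef struct FunctionCallInfoBaseData
{
    struct Node *context;           /* pass-through context, may be NULL */
} FunctionCallInfoBaseData;

typedef FunctionCallInfoBaseData *FunctionCallInfo;

/* The proposed accessor: relies on a local variable named fcinfo. */
#define PG_GET_CONTEXT()    (fcinfo->context)

/* Hypothetical function body: report whether a context node was supplied. */
int
toy_has_context(FunctionCallInfo fcinfo)
{
    return PG_GET_CONTEXT() != NULL;
}
```

Like the existing PG_GETARG_xxx macros, this only works inside a
function whose parameter is literally named fcinfo, which is the price
of hiding the field reference.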

Thoughts?

            regards, tom lane

Re: Error-safe user functions

From

Robert Haas

Date:

08 December 2022, 21:00:10

On Thu, Dec 8, 2022 at 11:32 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> If we go with "struct Node *" then we can solve such problems by
> just repeating "struct Node;" forward-declarations in as many
> headers as we have to.

Yes, I think just putting "struct Node;" in as many places as
necessary is the way to go. Or even:

struct Node;
typedef struct Node Node;

...which I think then allows for Node * to be used later.

A small problem with typedef struct Something *SomethingElse is that
it can get hard to keep track of whether some identifier is a pointer
to a struct or just a struct. This doesn't bother me as much as it
does some other hackers, from what I gather anyway, but I think we
should be pretty judicious in using typedef that way. "SomethingPtr"
really has no advantage over "Something *". It is neither shorter nor
clearer.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Error-safe user functions

From

Andres Freund

Date:

08 December 2022, 21:58:36

Hi,

On 2022-12-08 16:00:10 -0500, Robert Haas wrote:
> On Thu, Dec 8, 2022 at 11:32 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > If we go with "struct Node *" then we can solve such problems by
> > just repeating "struct Node;" forward-declarations in as many
> > headers as we have to.
> 
> Yes, I think just putting "struct Node;" in as many places as
> necessary is the way to go. Or even:

+1


> struct Node;
> typedef struct Node Node;

That doesn't work well, because C99 doesn't allow typedefs to be redeclared in
the same scope. IIRC C11 added support for it, and a lot of compilers already
supported it before.
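
A small sketch of the difference, assuming a toy one-field struct Node
(not the real definition); the two accessor functions are hypothetical
and exist only to show that both spellings name the same type:

```c
/* Tag-only declarations may be repeated in any number of headers. */
struct Node;
struct Node;

/*
 * A typedef, by contrast, should appear only once per scope when
 * targeting C99.  Repeating the next line is accepted by C11 and by
 * many older compilers as an extension, but strict C99 rejects it.
 */
typedef struct Node Node;

/* The full definition, as the one "owning" header would provide it. */
struct Node
{
    int         type;
};

/* Both spellings refer to the same type. */
int
tag_via_struct(const struct Node *node)
{
    return node->type;
}

int
tag_via_typedef(const Node *node)
{
    return node->type;
}
```

So headers that only need the pointer type can each carry their own
"struct Node;" line, while the typedef stays in a single place.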

Greetings,

Andres Freund

Re: Error-safe user functions

From

Tom Lane

Date:

08 December 2022, 22:57:09

Andres Freund <andres@anarazel.de> writes:
> On 2022-12-08 16:00:10 -0500, Robert Haas wrote:
>> Yes, I think just putting "struct Node;" in as many places as
>> necessary is the way to go. Or even:

> +1

OK, here's a v5 that does it like that.

I've spent a little time pushing ahead on other input functions,
and realized that my original plan to require a pre-encoded typmod
for these test functions was not very user-friendly.  So in v5
you can write something like

pg_input_is_valid('1234.567', 'numeric(7,4)')

0004 attached finishes up the remaining core numeric datatypes
(int*, float*, numeric).  I ripped out float8in_internal_opt_error
in favor of a function that uses the new APIs.

0005 converts contrib/cube, which I chose to tackle partly because
I'd already touched it in 0004, partly because it seemed like a
good idea to verify that extension modules wouldn't have any
problems with this approach, and partly because I wondered whether
an input function that uses a Bison/Flex parser would have big
problems getting converted.  This one didn't, anyway.

Given that this additional experimentation didn't find any holes
in the API design, I think this is pretty much ready to go.
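
For readers who want the shape of the mechanism without reading the
whole patch, here is a toy analogue that compiles outside the
PostgreSQL tree. Every name in it (SoftErrorContext, soft_error,
toy_int_in, toy_input_is_valid) is made up for illustration; it only
mimics the control flow of errsave() and pg_input_is_valid(), and
abort() stands in for a hard ereport(ERROR):

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for ErrorSaveContext; not the PostgreSQL struct. */
typedef struct SoftErrorContext
{
    bool        error_occurred;     /* set when a soft error is reported */
    bool        details_wanted;     /* does the caller want the message? */
    const char *message;            /* filled only if details_wanted */
} SoftErrorContext;

/*
 * Analogue of errsave(): with no context, fail hard; with a context,
 * record the problem and return control to the caller.
 */
void
soft_error(SoftErrorContext *ctx, const char *msg)
{
    if (ctx == NULL)
        abort();                    /* stands in for ereport(ERROR, ...) */
    ctx->error_occurred = true;
    if (ctx->details_wanted)
        ctx->message = msg;
}

/* Toy "input function": parse a decimal int, reporting bad input softly. */
int
toy_int_in(const char *str, SoftErrorContext *ctx)
{
    char       *end;
    long        val;

    errno = 0;
    val = strtol(str, &end, 10);
    if (end == str || *end != '\0')
    {
        soft_error(ctx, "invalid input syntax");
        return 0;                   /* dummy value; caller must ignore it */
    }
    if (errno == ERANGE || val < INT_MIN || val > INT_MAX)
    {
        soft_error(ctx, "value is out of range");
        return 0;                   /* dummy value again */
    }
    return (int) val;
}

/* Toy analogue of pg_input_is_valid(): true if the string parses. */
bool
toy_input_is_valid(const char *str)
{
    SoftErrorContext ctx = {false, false, NULL};

    (void) toy_int_in(str, &ctx);
    return !ctx.error_occurred;
}
```

The point to notice is that the caller decides whether bad input is
fatal: passing NULL gives the old hard-error behavior, while passing a
context turns the same code path into a cheap validity check.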

            regards, tom lane

diff --git a/doc/src/sgml/ref/create_type.sgml b/doc/src/sgml/ref/create_type.sgml
index 693423e524..994dfc6526 100644
--- a/doc/src/sgml/ref/create_type.sgml
+++ b/doc/src/sgml/ref/create_type.sgml
@@ -900,6 +900,17 @@ CREATE TYPE <replaceable class="parameter">name</replaceable>
    function is written in C.
   </para>

+  <para>
+   In <productname>PostgreSQL</productname> version 16 and later,
+   it is desirable for base types' input functions to
+   return <quote>soft</quote> errors using the
+   new <function>errsave()</function>/<function>ereturn()</function>
+   mechanism, rather than throwing <function>ereport()</function>
+   exceptions as in previous versions.
+   See <filename>src/backend/utils/fmgr/README</filename> for more
+   information.
+  </para>
+
  </refsect1>

  <refsect1>
diff --git a/src/backend/nodes/Makefile b/src/backend/nodes/Makefile
index 4368c30fdb..7c594be583 100644
--- a/src/backend/nodes/Makefile
+++ b/src/backend/nodes/Makefile
@@ -56,6 +56,7 @@ node_headers = \
     nodes/bitmapset.h \
     nodes/extensible.h \
     nodes/lockoptions.h \
+    nodes/miscnodes.h \
     nodes/replnodes.h \
     nodes/supportnodes.h \
     nodes/value.h \
diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl
index 7212bc486f..08992dfd47 100644
--- a/src/backend/nodes/gen_node_support.pl
+++ b/src/backend/nodes/gen_node_support.pl
@@ -68,6 +68,7 @@ my @all_input_files = qw(
   nodes/bitmapset.h
   nodes/extensible.h
   nodes/lockoptions.h
+  nodes/miscnodes.h
   nodes/replnodes.h
   nodes/supportnodes.h
   nodes/value.h
@@ -89,6 +90,7 @@ my @nodetag_only_files = qw(
   executor/tuptable.h
   foreign/fdwapi.h
   nodes/lockoptions.h
+  nodes/miscnodes.h
   nodes/replnodes.h
   nodes/supportnodes.h
 );
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index f5cd1b7493..eb489ea3a7 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -71,6 +71,7 @@
 #include "libpq/libpq.h"
 #include "libpq/pqformat.h"
 #include "mb/pg_wchar.h"
+#include "nodes/miscnodes.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/bgworker.h"
@@ -611,6 +612,128 @@ errfinish(const char *filename, int lineno, const char *funcname)
     CHECK_FOR_INTERRUPTS();
 }

+
+/*
+ * errsave_start --- begin a "soft" error-reporting cycle
+ *
+ * If "context" isn't an ErrorSaveContext node, this behaves as
+ * errstart(ERROR, domain), and the errsave() macro ends up acting
+ * exactly like ereport(ERROR, ...).
+ *
+ * If "context" is an ErrorSaveContext node, but the node creator only wants
+ * notification of the fact of a soft error without any details, we just set
+ * the error_occurred flag in the ErrorSaveContext node and return false,
+ * which will cause us to skip the remaining error processing steps.
+ *
+ * Otherwise, create and initialize error stack entry and return true.
+ * Subsequently, errmsg() and perhaps other routines will be called to further
+ * populate the stack entry.  Finally, errsave_finish() will be called to
+ * tidy up.
+ */
+bool
+errsave_start(struct Node *context, const char *domain)
+{
+    ErrorSaveContext *escontext;
+    ErrorData  *edata;
+
+    /*
+     * Do we have a context for soft error reporting?  If not, just punt to
+     * errstart().
+     */
+    if (context == NULL || !IsA(context, ErrorSaveContext))
+        return errstart(ERROR, domain);
+
+    /* Report that a soft error was detected */
+    escontext = (ErrorSaveContext *) context;
+    escontext->error_occurred = true;
+
+    /* Nothing else to do if caller wants no further details */
+    if (!escontext->details_wanted)
+        return false;
+
+    /*
+     * Okay, crank up a stack entry to store the info in.
+     */
+
+    recursion_depth++;
+
+    /* Initialize data for this error frame */
+    edata = get_error_stack_entry();
+    edata->elevel = LOG;        /* signal all is well to errsave_finish */
+    set_stack_entry_domain(edata, domain);
+    /* Select default errcode based on the assumed elevel of ERROR */
+    edata->sqlerrcode = ERRCODE_INTERNAL_ERROR;
+
+    /*
+     * Any allocations for this error state level should go into the caller's
+     * context.  We don't need to pollute ErrorContext, or even require it to
+     * exist, in this code path.
+     */
+    edata->assoc_context = CurrentMemoryContext;
+
+    recursion_depth--;
+    return true;
+}
+
+/*
+ * errsave_finish --- end a "soft" error-reporting cycle
+ *
+ * If errsave_start() decided this was a regular error, behave as
+ * errfinish().  Otherwise, package up the error details and save
+ * them in the ErrorSaveContext node.
+ */
+void
+errsave_finish(struct Node *context, const char *filename, int lineno,
+               const char *funcname)
+{
+    ErrorSaveContext *escontext = (ErrorSaveContext *) context;
+    ErrorData  *edata = &errordata[errordata_stack_depth];
+
+    /* verify stack depth before accessing *edata */
+    CHECK_STACK_DEPTH();
+
+    /*
+     * If errsave_start punted to errstart, then elevel will be ERROR or
+     * perhaps even PANIC.  Punt likewise to errfinish.
+     */
+    if (edata->elevel >= ERROR)
+    {
+        errfinish(filename, lineno, funcname);
+        pg_unreachable();
+    }
+
+    /*
+     * Else, we should package up the stack entry contents and deliver them to
+     * the caller.
+     */
+    recursion_depth++;
+
+    /* Save the last few bits of error state into the stack entry */
+    set_stack_entry_location(edata, filename, lineno, funcname);
+
+    /* Replace the LOG value that errsave_start inserted */
+    edata->elevel = ERROR;
+
+    /*
+     * We skip calling backtrace and context functions, which are more likely
+     * to cause trouble than provide useful context; they might act on the
+     * assumption that a transaction abort is about to occur.
+     */
+
+    /*
+     * Make a copy of the error info for the caller.  All the subsidiary
+     * strings are already in the caller's context, so it's sufficient to
+     * flat-copy the stack entry.
+     */
+    escontext->error_data = palloc_object(ErrorData);
+    memcpy(escontext->error_data, edata, sizeof(ErrorData));
+
+    /* Exit error-handling context */
+    errordata_stack_depth--;
+    recursion_depth--;
+}
+
+
 /*
  * get_error_stack_entry --- allocate and initialize a new stack entry
  *
diff --git a/src/backend/utils/fmgr/README b/src/backend/utils/fmgr/README
index 49845f67ac..9958d38992 100644
--- a/src/backend/utils/fmgr/README
+++ b/src/backend/utils/fmgr/README
@@ -267,6 +267,78 @@ See windowapi.h for more information.
 information about the context of the CALL statement, particularly
 whether it is within an "atomic" execution context.

+* Some callers of datatype input functions (and in future perhaps
+other classes of functions) pass an instance of ErrorSaveContext.
+This indicates that the caller wishes to handle "soft" errors without
+a transaction-terminating exception being thrown: instead, the callee
+should store information about the error cause in the ErrorSaveContext
+struct and return a dummy result value.  Further details appear in
+"Handling Soft Errors" below.
+
+
+Handling Soft Errors
+--------------------
+
+Postgres' standard mechanism for reporting errors (ereport() or elog())
+is used for all sorts of error conditions.  This means that throwing
+an exception via ereport(ERROR) requires an expensive transaction or
+subtransaction abort and cleanup, since the exception catcher dare not
+make many assumptions about what has gone wrong.  There are situations
+where we would rather have a lighter-weight mechanism for dealing
+with errors that are known to be safe to recover from without a full
+transaction cleanup.  SQL-callable functions can support this need
+using the ErrorSaveContext context mechanism.
+
+To report a "soft" error, a SQL-callable function should call
+    errsave(fcinfo->context, ...)
+where it would previously have done
+    ereport(ERROR, ...)
+If the passed "context" is NULL or is not an ErrorSaveContext node,
+then errsave behaves precisely as ereport(ERROR): the exception is
+thrown via longjmp, so that control does not return.  If "context"
+is an ErrorSaveContext node, then the error information included in
+errsave's subsidiary reporting calls is stored into the context node
+and control returns from errsave normally.  The function should then
+return a dummy value to its caller.  (SQL NULL is recommendable as
+the dummy value; but anything will do, since the caller is expected
+to ignore the function's return value once it sees that an error has
+been reported in the ErrorSaveContext node.)
+
+If there is nothing to do except return after calling errsave(),
+you can save a line or two by writing
+    ereturn(fcinfo->context, dummy_value, ...)
+to perform errsave() and then "return dummy_value".
+
+An error reported "softly" must be safe, in the sense that there is
+no question about our ability to continue normal processing of the
+transaction.  Error conditions that should NOT be handled this way
+include out-of-memory, unexpected internal errors, or anything that
+cannot easily be cleaned up after.  Such cases should still be thrown
+with ereport, as they have been in the past.
+
+Considering datatype input functions as examples, typical "soft" error
+conditions include input syntax errors and out-of-range values.  An
+input function typically detects such cases with simple if-tests and
+can easily change the ensuing ereport call to an errsave or ereturn.
+Because of this restriction, it's typically not necessary to pass
+the ErrorSaveContext pointer down very far, as errors reported by
+low-level functions are typically reasonable to consider internal.
+(Another way to frame the distinction is that input functions should
+report all invalid-input conditions softly, but internal problems are
+hard errors.)
+
+Because no transaction cleanup will occur, a function that is exiting
+after errsave() returns will bear responsibility for resource cleanup.
+It is not necessary to be concerned about small leakages of palloc'd
+memory, since the caller should be running the function in a short-lived
+memory context.  However, resources such as locks, open files, or buffer
+pins must be closed out cleanly, as they would be in the non-error code
+path.
+
+Conventions for callers that use the ErrorSaveContext mechanism
+to trap errors are discussed with the declaration of that struct,
+in nodes/miscnodes.h.
+

 Functions Accepting or Returning Sets
 -------------------------------------
diff --git a/src/backend/utils/fmgr/fmgr.c b/src/backend/utils/fmgr/fmgr.c
index 3c210297aa..a9519c6a8d 100644
--- a/src/backend/utils/fmgr/fmgr.c
+++ b/src/backend/utils/fmgr/fmgr.c
@@ -23,6 +23,7 @@
 #include "lib/stringinfo.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "nodes/miscnodes.h"
 #include "nodes/nodeFuncs.h"
 #include "pgstat.h"
 #include "utils/acl.h"
@@ -1548,6 +1549,70 @@ InputFunctionCall(FmgrInfo *flinfo, char *str, Oid typioparam, int32 typmod)
     return result;
 }

+/*
+ * Call a previously-looked-up datatype input function, with non-exception
+ * handling of "soft" errors.
+ *
+ * This is basically like InputFunctionCall, but the converted Datum is
+ * returned into *result while the function result is true for success or
+ * false for failure.  Also, the caller may pass an ErrorSaveContext node.
+ * (We declare that as "fmNodePtr" to avoid including nodes.h in fmgr.h.)
+ *
+ * If escontext points to an ErrorSaveContext, any "soft" errors detected by
+ * the input function will be reported by filling the escontext struct and
+ * returning false.  (The caller can choose to test SOFT_ERROR_OCCURRED(),
+ * but checking the function result instead is usually cheaper.)
+ *
+ * If escontext does not point to an ErrorSaveContext, errors are reported
+ * via ereport(ERROR), so that there is no functional difference from
+ * InputFunctionCall; the result will always be true if control returns.
+ */
+bool
+InputFunctionCallSafe(FmgrInfo *flinfo, char *str,
+                      Oid typioparam, int32 typmod,
+                      fmNodePtr escontext,
+                      Datum *result)
+{
+    LOCAL_FCINFO(fcinfo, 3);
+
+    if (str == NULL && flinfo->fn_strict)
+    {
+        *result = (Datum) 0;    /* just return null result */
+        return true;
+    }
+
+    InitFunctionCallInfoData(*fcinfo, flinfo, 3, InvalidOid, escontext, NULL);
+
+    fcinfo->args[0].value = CStringGetDatum(str);
+    fcinfo->args[0].isnull = false;
+    fcinfo->args[1].value = ObjectIdGetDatum(typioparam);
+    fcinfo->args[1].isnull = false;
+    fcinfo->args[2].value = Int32GetDatum(typmod);
+    fcinfo->args[2].isnull = false;
+
+    *result = FunctionCallInvoke(fcinfo);
+
+    /* Result value is garbage, and could be null, if an error was reported */
+    if (SOFT_ERROR_OCCURRED(escontext))
+        return false;
+
+    /* Otherwise, should get null result if and only if str is NULL */
+    if (str == NULL)
+    {
+        if (!fcinfo->isnull)
+            elog(ERROR, "input function %u returned non-NULL",
+                 flinfo->fn_oid);
+    }
+    else
+    {
+        if (fcinfo->isnull)
+            elog(ERROR, "input function %u returned NULL",
+                 flinfo->fn_oid);
+    }
+
+    return true;
+}
+
 /*
  * Call a previously-looked-up datatype output function.
  *
diff --git a/src/include/fmgr.h b/src/include/fmgr.h
index 380a82b9de..b7832d0aa2 100644
--- a/src/include/fmgr.h
+++ b/src/include/fmgr.h
@@ -700,6 +700,10 @@ extern Datum OidFunctionCall9Coll(Oid functionId, Oid collation,
 /* Special cases for convenient invocation of datatype I/O functions. */
 extern Datum InputFunctionCall(FmgrInfo *flinfo, char *str,
                                Oid typioparam, int32 typmod);
+extern bool InputFunctionCallSafe(FmgrInfo *flinfo, char *str,
+                                  Oid typioparam, int32 typmod,
+                                  fmNodePtr escontext,
+                                  Datum *result);
 extern Datum OidInputFunctionCall(Oid functionId, char *str,
                                   Oid typioparam, int32 typmod);
 extern char *OutputFunctionCall(FmgrInfo *flinfo, Datum val);
diff --git a/src/include/nodes/meson.build b/src/include/nodes/meson.build
index e63881086e..f0e60935b6 100644
--- a/src/include/nodes/meson.build
+++ b/src/include/nodes/meson.build
@@ -16,6 +16,7 @@ node_support_input_i = [
   'nodes/bitmapset.h',
   'nodes/extensible.h',
   'nodes/lockoptions.h',
+  'nodes/miscnodes.h',
   'nodes/replnodes.h',
   'nodes/supportnodes.h',
   'nodes/value.h',
diff --git a/src/include/nodes/miscnodes.h b/src/include/nodes/miscnodes.h
new file mode 100644
index 0000000000..b50ee60352
--- /dev/null
+++ b/src/include/nodes/miscnodes.h
@@ -0,0 +1,56 @@
+/*-------------------------------------------------------------------------
+ *
+ * miscnodes.h
+ *      Definitions for hard-to-classify node types.
+ *
+ * Node types declared here are not part of parse trees, plan trees,
+ * or execution state trees.  We only assign them NodeTag values because
+ * IsA() tests provide a convenient way to disambiguate what kind of
+ * structure is being passed through assorted APIs, such as function
+ * "context" pointers.
+ *
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/nodes/miscnodes.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef MISCNODES_H
+#define MISCNODES_H
+
+#include "nodes/nodes.h"
+
+/*
+ * ErrorSaveContext -
+ *        function call context node for handling of "soft" errors
+ *
+ * A caller wishing to trap soft errors must initialize a struct like this
+ * with all fields zero/NULL except for the NodeTag.  Optionally, set
+ * details_wanted = true if more than the bare knowledge that a soft error
+ * occurred is required.  The struct is then passed to a SQL-callable function
+ * via the FunctionCallInfo.context field; or below the level of SQL calls,
+ * it could be passed to a subroutine directly.
+ *
+ * After calling code that might report an error this way, check
+ * error_occurred to see if an error happened.  If so, and if details_wanted
+ * is true, error_data has been filled with error details (stored in the
+ * callee's memory context!).  FreeErrorData() can be called to release
+ * error_data, although that step is typically not necessary if the called
+ * code was run in a short-lived context.
+ */
+typedef struct ErrorSaveContext
+{
+    NodeTag        type;
+    bool        error_occurred; /* set to true if we detect a soft error */
+    bool        details_wanted; /* does caller want more info than that? */
+    ErrorData  *error_data;        /* details of error, if so */
+} ErrorSaveContext;
+
+/* Often-useful macro for checking if a soft error was reported */
+#define SOFT_ERROR_OCCURRED(escontext) \
+    ((escontext) != NULL && IsA(escontext, ErrorSaveContext) && \
+     ((ErrorSaveContext *) (escontext))->error_occurred)
+
+#endif                            /* MISCNODES_H */
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index f107a818e8..8025dce335 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -18,6 +18,10 @@

 #include "lib/stringinfo.h"

+/* We cannot include nodes.h yet, so forward-declare struct Node */
+struct Node;
+
+
 /* Error level codes */
 #define DEBUG5        10            /* Debugging messages, in categories of
                                  * decreasing detail. */
@@ -235,6 +239,63 @@ extern int    getinternalerrposition(void);
     ereport(elevel, errmsg_internal(__VA_ARGS__))


+/*----------
+ * Support for reporting "soft" errors that don't require a full transaction
+ * abort to clean up.  This is to be used in this way:
+ *        errsave(context,
+ *                errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+ *                errmsg("invalid input syntax for type %s: \"%s\"",
+ *                       "boolean", in_str),
+ *                ... other errxxx() fields as needed ...);
+ *
+ * "context" is a node pointer or NULL, and the remaining auxiliary calls
+ * provide the same error details as in ereport().  If context is not a
+ * pointer to an ErrorSaveContext node, then errsave(context, ...)
+ * behaves identically to ereport(ERROR, ...).  If context is a pointer
+ * to an ErrorSaveContext node, then the information provided by the
+ * auxiliary calls is stored in the context node and control returns
+ * normally.  The caller of errsave() must then do any required cleanup
+ * and return control back to its caller.  That caller must check the
+ * ErrorSaveContext node to see whether an error occurred before
+ * it can trust the function's result to be meaningful.
+ *
+ * errsave_domain() allows a message domain to be specified; it is
+ * precisely analogous to ereport_domain().
+ *----------
+ */
+#define errsave_domain(context, domain, ...)    \
+    do { \
+        struct Node *context_ = (context); \
+        pg_prevent_errno_in_scope(); \
+        if (errsave_start(context_, domain)) \
+            __VA_ARGS__, errsave_finish(context_, __FILE__, __LINE__, __func__); \
+    } while(0)
+
+#define errsave(context, ...)    \
+    errsave_domain(context, TEXTDOMAIN, __VA_ARGS__)
+
+/*
+ * "ereturn(context, dummy_value, ...);" is exactly the same as
+ * "errsave(context, ...); return dummy_value;".  This saves a bit
+ * of typing in the common case where a function has no cleanup
+ * actions to take after reporting a soft error.  "dummy_value"
+ * can be empty if the function returns void.
+ */
+#define ereturn_domain(context, dummy_value, domain, ...)    \
+    do { \
+        errsave_domain(context, domain, __VA_ARGS__); \
+        return dummy_value; \
+    } while(0)
+
+#define ereturn(context, dummy_value, ...)    \
+    ereturn_domain(context, dummy_value, TEXTDOMAIN, __VA_ARGS__)
+
+extern bool errsave_start(struct Node *context, const char *domain);
+extern void errsave_finish(struct Node *context,
+                           const char *filename, int lineno,
+                           const char *funcname);
+
+
 /* Support for constructing error strings separately from ereport() calls */

 extern void pre_format_elog_string(int errnumber, const char *domain);
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index e57ffce971..ad31fdb737 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -24683,6 +24683,107 @@ SELECT collation for ('foo' COLLATE "de_DE");

   </sect2>

+  <sect2 id="functions-info-validity">
+   <title>Data Validity Checking Functions</title>
+
+   <para>
+    The functions shown in <xref linkend="functions-info-validity-table"/>
+    can be helpful for checking validity of proposed input data.
+   </para>
+
+   <table id="functions-info-validity-table">
+    <title>Data Validity Checking Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para>
+       <para>
+        Example(s)
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_input_is_valid</primary>
+        </indexterm>
+        <function>pg_input_is_valid</function> (
+          <parameter>string</parameter> <type>text</type>,
+          <parameter>type</parameter> <type>text</type>
+        )
+        <returnvalue>boolean</returnvalue>
+       </para>
+       <para>
+        Tests whether the given <parameter>string</parameter> is valid
+        input for the specified data type, returning true or false.
+       </para>
+       <para>
+        This function will only work as desired if the data type's input
+        function has been updated to report invalid input as
+        a <quote>soft</quote> error.  Otherwise, invalid input will abort
+        the transaction, just as if the string had been cast to the type
+        directly.
+        </para>
+        <para>
+         <literal>pg_input_is_valid('42', 'integer')</literal>
+         <returnvalue>t</returnvalue>
+        </para>
+        <para>
+         <literal>pg_input_is_valid('42000000000', 'integer')</literal>
+         <returnvalue>f</returnvalue>
+        </para>
+        <para>
+         <literal>pg_input_is_valid('1234.567', 'numeric(7,4)')</literal>
+         <returnvalue>f</returnvalue>
+       </para></entry>
+      </row>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_input_error_message</primary>
+        </indexterm>
+        <function>pg_input_error_message</function> (
+          <parameter>string</parameter> <type>text</type>,
+          <parameter>type</parameter> <type>text</type>
+        )
+        <returnvalue>text</returnvalue>
+       </para>
+       <para>
+        Tests whether the given <parameter>string</parameter> is valid
+        input for the specified data type; if not, return the error
+        message that would have been thrown.  If the input is valid, the
+        result is NULL.  The inputs are the same as
+        for <function>pg_input_is_valid</function>.
+       </para>
+       <para>
+        This function will only work as desired if the data type's input
+        function has been updated to report invalid input as
+        a <quote>soft</quote> error.  Otherwise, invalid input will abort
+        the transaction, just as if the string had been cast to the type
+        directly.
+        </para>
+        <para>
+         <literal>pg_input_error_message('42000000000', 'integer')</literal>
+         <returnvalue>value "42000000000" is out of range for type integer</returnvalue>
+        </para>
+        <para>
+         <literal>pg_input_error_message('1234.567', 'numeric(7,4)')</literal>
+         <returnvalue>numeric field overflow</returnvalue>
+       </para></entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+
+  </sect2>
+
   <sect2 id="functions-info-snapshot">
    <title>Transaction ID and Snapshot Information Functions</title>

diff --git a/src/backend/utils/adt/misc.c b/src/backend/utils/adt/misc.c
index 9c13251231..20a464fb59 100644
--- a/src/backend/utils/adt/misc.c
+++ b/src/backend/utils/adt/misc.c
@@ -32,6 +32,8 @@
 #include "common/keywords.h"
 #include "funcapi.h"
 #include "miscadmin.h"
+#include "nodes/miscnodes.h"
+#include "parser/parse_type.h"
 #include "parser/scansup.h"
 #include "pgstat.h"
 #include "postmaster/syslogger.h"
@@ -45,6 +47,23 @@
 #include "utils/ruleutils.h"
 #include "utils/timestamp.h"

+
+/*
+ * structure to cache metadata needed in pg_input_is_valid_common
+ */
+typedef struct BasicIOData
+{
+    Oid            typoid;
+    Oid            typiofunc;
+    Oid            typioparam;
+    FmgrInfo    inputproc;
+} BasicIOData;
+
+static bool pg_input_is_valid_common(FunctionCallInfo fcinfo,
+                                     text *txt, text *typname,
+                                     ErrorSaveContext *escontext);
+
+
 /*
  * Common subroutine for num_nulls() and num_nonnulls().
  * Returns true if successful, false if function should return NULL.
@@ -640,6 +659,104 @@ pg_column_is_updatable(PG_FUNCTION_ARGS)
 }


+/*
+ * pg_input_is_valid - test whether string is valid input for datatype.
+ *
+ * Returns true if OK, false if not.
+ *
+ * This will only work usefully if the datatype's input function has been
+ * updated to return "soft" errors via errsave/ereturn.
+ */
+Datum
+pg_input_is_valid(PG_FUNCTION_ARGS)
+{
+    text       *txt = PG_GETARG_TEXT_PP(0);
+    text       *typname = PG_GETARG_TEXT_PP(1);
+    ErrorSaveContext escontext = {T_ErrorSaveContext};
+
+    PG_RETURN_BOOL(pg_input_is_valid_common(fcinfo, txt, typname,
+                                            &escontext));
+}
+
+/*
+ * pg_input_error_message - test whether string is valid input for datatype.
+ *
+ * Returns NULL if OK, else the primary message string from the error.
+ *
+ * This will only work usefully if the datatype's input function has been
+ * updated to return "soft" errors via errsave/ereturn.
+ */
+Datum
+pg_input_error_message(PG_FUNCTION_ARGS)
+{
+    text       *txt = PG_GETARG_TEXT_PP(0);
+    text       *typname = PG_GETARG_TEXT_PP(1);
+    ErrorSaveContext escontext = {T_ErrorSaveContext};
+
+    /* Enable details_wanted */
+    escontext.details_wanted = true;
+
+    if (pg_input_is_valid_common(fcinfo, txt, typname,
+                                 &escontext))
+        PG_RETURN_NULL();
+
+    Assert(escontext.error_occurred);
+    Assert(escontext.error_data != NULL);
+    Assert(escontext.error_data->message != NULL);
+
+    PG_RETURN_TEXT_P(cstring_to_text(escontext.error_data->message));
+}
+
+/* Common subroutine for the above */
+static bool
+pg_input_is_valid_common(FunctionCallInfo fcinfo,
+                         text *txt, text *typname,
+                         ErrorSaveContext *escontext)
+{
+    char       *str = text_to_cstring(txt);
+    char       *typnamestr = text_to_cstring(typname);
+    Oid            typoid;
+    int32        typmod;
+    BasicIOData *my_extra;
+    Datum        converted;
+
+    /* Parse type-name argument to obtain type OID and encoded typmod. */
+    parseTypeString(typnamestr, &typoid, &typmod, false);
+
+    /*
+     * We arrange to look up the needed I/O info just once per series of
+     * calls, assuming the data type doesn't change underneath us.
+     */
+    my_extra = (BasicIOData *) fcinfo->flinfo->fn_extra;
+    if (my_extra == NULL)
+    {
+        fcinfo->flinfo->fn_extra =
+            MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,
+                               sizeof(BasicIOData));
+        my_extra = (BasicIOData *) fcinfo->flinfo->fn_extra;
+        my_extra->typoid = InvalidOid;
+    }
+
+    if (my_extra->typoid != typoid)
+    {
+        getTypeInputInfo(typoid,
+                         &my_extra->typiofunc,
+                         &my_extra->typioparam);
+        fmgr_info_cxt(my_extra->typiofunc, &my_extra->inputproc,
+                      fcinfo->flinfo->fn_mcxt);
+        my_extra->typoid = typoid;
+    }
+
+    /* Now we can try to perform the conversion. */
+    return InputFunctionCallSafe(&my_extra->inputproc,
+                                 str,
+                                 my_extra->typioparam,
+                                 typmod,
+                                 (Node *) escontext,
+                                 &converted);
+}
+
+
 /*
  * Is character a valid identifier start?
  * Must match scan.l's {ident_start} character class.
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index f9301b2627..719599649a 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7060,6 +7060,14 @@
   prorettype => 'regnamespace', proargtypes => 'text',
   prosrc => 'to_regnamespace' },

+{ oid => '8050', descr => 'test whether string is valid input for data type',
+  proname => 'pg_input_is_valid', provolatile => 's', prorettype => 'bool',
+  proargtypes => 'text text', prosrc => 'pg_input_is_valid' },
+{ oid => '8051',
+  descr => 'get error message if string is not valid input for data type',
+  proname => 'pg_input_error_message', provolatile => 's', prorettype => 'text',
+  proargtypes => 'text text', prosrc => 'pg_input_error_message' },
+
 { oid => '1268',
   descr => 'parse qualified identifier to array of identifiers',
   proname => 'parse_ident', prorettype => '_text', proargtypes => 'text bool',
diff --git a/src/test/regress/expected/create_type.out b/src/test/regress/expected/create_type.out
index 0dfc88c1c8..7383fcdbb1 100644
--- a/src/test/regress/expected/create_type.out
+++ b/src/test/regress/expected/create_type.out
@@ -249,6 +249,31 @@ select format_type('bpchar'::regtype, -1);
  bpchar
 (1 row)

+-- Test non-error-throwing APIs using widget, which still throws errors
+SELECT pg_input_is_valid('(1,2,3)', 'widget');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('(1,2)', 'widget');  -- hard error expected
+ERROR:  invalid input syntax for type widget: "(1,2)"
+SELECT pg_input_is_valid('{"(1,2,3)"}', 'widget[]');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('{"(1,2)"}', 'widget[]');  -- hard error expected
+ERROR:  invalid input syntax for type widget: "(1,2)"
+SELECT pg_input_is_valid('("(1,2,3)")', 'mytab');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('("(1,2)")', 'mytab');  -- hard error expected
+ERROR:  invalid input syntax for type widget: "(1,2)"
 -- Test creation of an operator over a user-defined type
 CREATE FUNCTION pt_in_widget(point, widget)
    RETURNS bool
diff --git a/src/test/regress/regress.c b/src/test/regress/regress.c
index 548afb4438..2977045cc7 100644
--- a/src/test/regress/regress.c
+++ b/src/test/regress/regress.c
@@ -183,6 +183,11 @@ widget_in(PG_FUNCTION_ARGS)
             coord[i++] = p + 1;
     }

+    /*
+     * Note: DON'T convert this error to "soft" style (errsave/ereturn).  We
+     * want this data type to stay permanently in the hard-error world so that
+     * it can be used for testing that such cases still work reasonably.
+     */
     if (i < NARGS)
         ereport(ERROR,
                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
diff --git a/src/test/regress/sql/create_type.sql b/src/test/regress/sql/create_type.sql
index c6fc4f9029..c25018029c 100644
--- a/src/test/regress/sql/create_type.sql
+++ b/src/test/regress/sql/create_type.sql
@@ -192,6 +192,14 @@ select format_type('bpchar'::regtype, null);
 -- this behavior difference is intentional
 select format_type('bpchar'::regtype, -1);

+-- Test non-error-throwing APIs using widget, which still throws errors
+SELECT pg_input_is_valid('(1,2,3)', 'widget');
+SELECT pg_input_is_valid('(1,2)', 'widget');  -- hard error expected
+SELECT pg_input_is_valid('{"(1,2,3)"}', 'widget[]');
+SELECT pg_input_is_valid('{"(1,2)"}', 'widget[]');  -- hard error expected
+SELECT pg_input_is_valid('("(1,2,3)")', 'mytab');
+SELECT pg_input_is_valid('("(1,2)")', 'mytab');  -- hard error expected
+
 -- Test creation of an operator over a user-defined type

 CREATE FUNCTION pt_in_widget(point, widget)
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index 495e449a9e..c011ebdfd9 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -90,14 +90,15 @@ typedef struct ArrayIteratorData
 }            ArrayIteratorData;

 static bool array_isspace(char ch);
-static int    ArrayCount(const char *str, int *dim, char typdelim);
-static void ReadArrayStr(char *arrayStr, const char *origStr,
+static int    ArrayCount(const char *str, int *dim, char typdelim,
+                       Node *escontext);
+static bool ReadArrayStr(char *arrayStr, const char *origStr,
                          int nitems, int ndim, int *dim,
                          FmgrInfo *inputproc, Oid typioparam, int32 typmod,
                          char typdelim,
                          int typlen, bool typbyval, char typalign,
                          Datum *values, bool *nulls,
-                         bool *hasnulls, int32 *nbytes);
+                         bool *hasnulls, int32 *nbytes, Node *escontext);
 static void ReadArrayBinary(StringInfo buf, int nitems,
                             FmgrInfo *receiveproc, Oid typioparam, int32 typmod,
                             int typlen, bool typbyval, char typalign,
@@ -177,6 +178,7 @@ array_in(PG_FUNCTION_ARGS)
     Oid            element_type = PG_GETARG_OID(1);    /* type of an array
                                                      * element */
     int32        typmod = PG_GETARG_INT32(2);    /* typmod for array elements */
+    Node       *escontext = fcinfo->context;
     int            typlen;
     bool        typbyval;
     char        typalign;
@@ -258,7 +260,7 @@ array_in(PG_FUNCTION_ARGS)
             break;                /* no more dimension items */
         p++;
         if (ndim >= MAXDIM)
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                      errmsg("number of array dimensions (%d) exceeds the maximum allowed (%d)",
                             ndim + 1, MAXDIM)));
@@ -266,7 +268,7 @@ array_in(PG_FUNCTION_ARGS)
         for (q = p; isdigit((unsigned char) *q) || (*q == '-') || (*q == '+'); q++)
              /* skip */ ;
         if (q == p)                /* no digits? */
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"", string),
                      errdetail("\"[\" must introduce explicitly-specified array dimensions.")));
@@ -280,7 +282,7 @@ array_in(PG_FUNCTION_ARGS)
             for (q = p; isdigit((unsigned char) *q) || (*q == '-') || (*q == '+'); q++)
                  /* skip */ ;
             if (q == p)            /* no digits? */
-                ereport(ERROR,
+                ereturn(escontext, (Datum) 0,
                         (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                          errmsg("malformed array literal: \"%s\"", string),
                          errdetail("Missing array dimension value.")));
@@ -291,7 +293,7 @@ array_in(PG_FUNCTION_ARGS)
             lBound[ndim] = 1;
         }
         if (*q != ']')
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"", string),
                      errdetail("Missing \"%s\" after array dimensions.",
@@ -301,7 +303,7 @@ array_in(PG_FUNCTION_ARGS)
         ub = atoi(p);
         p = q + 1;
         if (ub < lBound[ndim])
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
                      errmsg("upper bound cannot be less than lower bound")));

@@ -313,11 +315,13 @@ array_in(PG_FUNCTION_ARGS)
     {
         /* No array dimensions, so intuit dimensions from brace structure */
         if (*p != '{')
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"", string),
                      errdetail("Array value must start with \"{\" or dimension information.")));
-        ndim = ArrayCount(p, dim, typdelim);
+        ndim = ArrayCount(p, dim, typdelim, escontext);
+        if (ndim < 0)
+            PG_RETURN_NULL();
         for (i = 0; i < ndim; i++)
             lBound[i] = 1;
     }
@@ -328,7 +332,7 @@ array_in(PG_FUNCTION_ARGS)

         /* If array dimensions are given, expect '=' operator */
         if (strncmp(p, ASSGN, strlen(ASSGN)) != 0)
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"", string),
                      errdetail("Missing \"%s\" after array dimensions.",
@@ -342,20 +346,22 @@ array_in(PG_FUNCTION_ARGS)
          * were given
          */
         if (*p != '{')
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"", string),
                      errdetail("Array contents must start with \"{\".")));
-        ndim_braces = ArrayCount(p, dim_braces, typdelim);
+        ndim_braces = ArrayCount(p, dim_braces, typdelim, escontext);
+        if (ndim_braces < 0)
+            PG_RETURN_NULL();
         if (ndim_braces != ndim)
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"", string),
                      errdetail("Specified array dimensions do not match array contents.")));
         for (i = 0; i < ndim; ++i)
         {
             if (dim[i] != dim_braces[i])
-                ereport(ERROR,
+                ereturn(escontext, (Datum) 0,
                         (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                          errmsg("malformed array literal: \"%s\"", string),
                          errdetail("Specified array dimensions do not match array contents.")));
@@ -372,8 +378,11 @@ array_in(PG_FUNCTION_ARGS)
 #endif

     /* This checks for overflow of the array dimensions */
-    nitems = ArrayGetNItems(ndim, dim);
-    ArrayCheckBounds(ndim, dim, lBound);
+    nitems = ArrayGetNItemsSafe(ndim, dim, escontext);
+    if (nitems < 0)
+        PG_RETURN_NULL();
+    if (!ArrayCheckBoundsSafe(ndim, dim, lBound, escontext))
+        PG_RETURN_NULL();

     /* Empty array? */
     if (nitems == 0)
@@ -381,13 +390,14 @@ array_in(PG_FUNCTION_ARGS)

     dataPtr = (Datum *) palloc(nitems * sizeof(Datum));
     nullsPtr = (bool *) palloc(nitems * sizeof(bool));
-    ReadArrayStr(p, string,
-                 nitems, ndim, dim,
-                 &my_extra->proc, typioparam, typmod,
-                 typdelim,
-                 typlen, typbyval, typalign,
-                 dataPtr, nullsPtr,
-                 &hasnulls, &nbytes);
+    if (!ReadArrayStr(p, string,
+                      nitems, ndim, dim,
+                      &my_extra->proc, typioparam, typmod,
+                      typdelim,
+                      typlen, typbyval, typalign,
+                      dataPtr, nullsPtr,
+                      &hasnulls, &nbytes, escontext))
+        PG_RETURN_NULL();
     if (hasnulls)
     {
         dataoffset = ARR_OVERHEAD_WITHNULLS(ndim, nitems);
@@ -451,9 +461,12 @@ array_isspace(char ch)
  *
  * Returns number of dimensions as function result.  The axis lengths are
  * returned in dim[], which must be of size MAXDIM.
+ *
+ * If we detect an error, fill *escontext with error details and return -1
+ * (unless escontext isn't provided, in which case errors will be thrown).
  */
 static int
-ArrayCount(const char *str, int *dim, char typdelim)
+ArrayCount(const char *str, int *dim, char typdelim, Node *escontext)
 {
     int            nest_level = 0,
                 i;
@@ -488,11 +501,10 @@ ArrayCount(const char *str, int *dim, char typdelim)
             {
                 case '\0':
                     /* Signal a premature end of the string */
-                    ereport(ERROR,
+                    ereturn(escontext, -1,
                             (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                              errmsg("malformed array literal: \"%s\"", str),
                              errdetail("Unexpected end of input.")));
-                    break;
                 case '\\':

                     /*
@@ -504,7 +516,7 @@ ArrayCount(const char *str, int *dim, char typdelim)
                         parse_state != ARRAY_ELEM_STARTED &&
                         parse_state != ARRAY_QUOTED_ELEM_STARTED &&
                         parse_state != ARRAY_ELEM_DELIMITED)
-                        ereport(ERROR,
+                        ereturn(escontext, -1,
                                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                  errmsg("malformed array literal: \"%s\"", str),
                                  errdetail("Unexpected \"%c\" character.",
@@ -515,7 +527,7 @@ ArrayCount(const char *str, int *dim, char typdelim)
                     if (*(ptr + 1))
                         ptr++;
                     else
-                        ereport(ERROR,
+                        ereturn(escontext, -1,
                                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                  errmsg("malformed array literal: \"%s\"", str),
                                  errdetail("Unexpected end of input.")));
@@ -530,7 +542,7 @@ ArrayCount(const char *str, int *dim, char typdelim)
                     if (parse_state != ARRAY_LEVEL_STARTED &&
                         parse_state != ARRAY_QUOTED_ELEM_STARTED &&
                         parse_state != ARRAY_ELEM_DELIMITED)
-                        ereport(ERROR,
+                        ereturn(escontext, -1,
                                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                  errmsg("malformed array literal: \"%s\"", str),
                                  errdetail("Unexpected array element.")));
@@ -551,14 +563,14 @@ ArrayCount(const char *str, int *dim, char typdelim)
                         if (parse_state != ARRAY_NO_LEVEL &&
                             parse_state != ARRAY_LEVEL_STARTED &&
                             parse_state != ARRAY_LEVEL_DELIMITED)
-                            ereport(ERROR,
+                            ereturn(escontext, -1,
                                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                      errmsg("malformed array literal: \"%s\"", str),
                                      errdetail("Unexpected \"%c\" character.",
                                                '{')));
                         parse_state = ARRAY_LEVEL_STARTED;
                         if (nest_level >= MAXDIM)
-                            ereport(ERROR,
+                            ereturn(escontext, -1,
                                     (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                                      errmsg("number of array dimensions (%d) exceeds the maximum allowed (%d)",
                                             nest_level + 1, MAXDIM)));
@@ -581,14 +593,14 @@ ArrayCount(const char *str, int *dim, char typdelim)
                             parse_state != ARRAY_QUOTED_ELEM_COMPLETED &&
                             parse_state != ARRAY_LEVEL_COMPLETED &&
                             !(nest_level == 1 && parse_state == ARRAY_LEVEL_STARTED))
-                            ereport(ERROR,
+                            ereturn(escontext, -1,
                                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                      errmsg("malformed array literal: \"%s\"", str),
                                      errdetail("Unexpected \"%c\" character.",
                                                '}')));
                         parse_state = ARRAY_LEVEL_COMPLETED;
                         if (nest_level == 0)
-                            ereport(ERROR,
+                            ereturn(escontext, -1,
                                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                      errmsg("malformed array literal: \"%s\"", str),
                                      errdetail("Unmatched \"%c\" character.", '}')));
@@ -596,7 +608,7 @@ ArrayCount(const char *str, int *dim, char typdelim)

                         if (nelems_last[nest_level] != 0 &&
                             nelems[nest_level] != nelems_last[nest_level])
-                            ereport(ERROR,
+                            ereturn(escontext, -1,
                                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                      errmsg("malformed array literal: \"%s\"", str),
                                      errdetail("Multidimensional arrays must have "
@@ -630,7 +642,7 @@ ArrayCount(const char *str, int *dim, char typdelim)
                                 parse_state != ARRAY_ELEM_COMPLETED &&
                                 parse_state != ARRAY_QUOTED_ELEM_COMPLETED &&
                                 parse_state != ARRAY_LEVEL_COMPLETED)
-                                ereport(ERROR,
+                                ereturn(escontext, -1,
                                         (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                          errmsg("malformed array literal: \"%s\"", str),
                                          errdetail("Unexpected \"%c\" character.",
@@ -653,7 +665,7 @@ ArrayCount(const char *str, int *dim, char typdelim)
                             if (parse_state != ARRAY_LEVEL_STARTED &&
                                 parse_state != ARRAY_ELEM_STARTED &&
                                 parse_state != ARRAY_ELEM_DELIMITED)
-                                ereport(ERROR,
+                                ereturn(escontext, -1,
                                         (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                          errmsg("malformed array literal: \"%s\"", str),
                                          errdetail("Unexpected array element.")));
@@ -673,7 +685,7 @@ ArrayCount(const char *str, int *dim, char typdelim)
     while (*ptr)
     {
         if (!array_isspace(*ptr++))
-            ereport(ERROR,
+            ereturn(escontext, -1,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"", str),
                      errdetail("Junk after closing right brace.")));
@@ -713,11 +725,16 @@ ArrayCount(const char *str, int *dim, char typdelim)
  *    *hasnulls: set true iff there are any null elements.
  *    *nbytes: set to total size of data area needed (including alignment
  *        padding but not including array header overhead).
+ *    *escontext: if this points to an ErrorSaveContext, details of
+ *        any error are reported there.
+ *
+ * Result:
+ *    true for success, false for failure (if escontext is provided).
  *
  * Note that values[] and nulls[] are allocated by the caller, and must have
  * nitems elements.
  */
-static void
+static bool
 ReadArrayStr(char *arrayStr,
              const char *origStr,
              int nitems,
@@ -733,7 +750,8 @@ ReadArrayStr(char *arrayStr,
              Datum *values,
              bool *nulls,
              bool *hasnulls,
-             int32 *nbytes)
+             int32 *nbytes,
+             Node *escontext)
 {
     int            i,
                 nest_level = 0;
@@ -784,7 +802,7 @@ ReadArrayStr(char *arrayStr,
             {
                 case '\0':
                     /* Signal a premature end of the string */
-                    ereport(ERROR,
+                    ereturn(escontext, false,
                             (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                              errmsg("malformed array literal: \"%s\"",
                                     origStr)));
@@ -793,7 +811,7 @@ ReadArrayStr(char *arrayStr,
                     /* Skip backslash, copy next character as-is. */
                     srcptr++;
                     if (*srcptr == '\0')
-                        ereport(ERROR,
+                        ereturn(escontext, false,
                                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                  errmsg("malformed array literal: \"%s\"",
                                         origStr)));
@@ -823,7 +841,7 @@ ReadArrayStr(char *arrayStr,
                     if (!in_quotes)
                     {
                         if (nest_level >= ndim)
-                            ereport(ERROR,
+                            ereturn(escontext, false,
                                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                      errmsg("malformed array literal: \"%s\"",
                                             origStr)));
@@ -838,7 +856,7 @@ ReadArrayStr(char *arrayStr,
                     if (!in_quotes)
                     {
                         if (nest_level == 0)
-                            ereport(ERROR,
+                            ereturn(escontext, false,
                                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                      errmsg("malformed array literal: \"%s\"",
                                             origStr)));
@@ -891,7 +909,7 @@ ReadArrayStr(char *arrayStr,
         *dstendptr = '\0';

         if (i < 0 || i >= nitems)
-            ereport(ERROR,
+            ereturn(escontext, false,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("malformed array literal: \"%s\"",
                             origStr)));
@@ -900,14 +918,20 @@ ReadArrayStr(char *arrayStr,
             pg_strcasecmp(itemstart, "NULL") == 0)
         {
             /* it's a NULL item */
-            values[i] = InputFunctionCall(inputproc, NULL,
-                                          typioparam, typmod);
+            if (!InputFunctionCallSafe(inputproc, NULL,
+                                       typioparam, typmod,
+                                       escontext,
+                                       &values[i]))
+                return false;
             nulls[i] = true;
         }
         else
         {
-            values[i] = InputFunctionCall(inputproc, itemstart,
-                                          typioparam, typmod);
+            if (!InputFunctionCallSafe(inputproc, itemstart,
+                                       typioparam, typmod,
+                                       escontext,
+                                       &values[i]))
+                return false;
             nulls[i] = false;
         }
     }
@@ -930,7 +954,7 @@ ReadArrayStr(char *arrayStr,
             totbytes = att_align_nominal(totbytes, typalign);
             /* check for overflow of total request */
             if (!AllocSizeIsValid(totbytes))
-                ereport(ERROR,
+                ereturn(escontext, false,
                         (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                          errmsg("array size exceeds the maximum allowed (%d)",
                                 (int) MaxAllocSize)));
@@ -938,6 +962,7 @@ ReadArrayStr(char *arrayStr,
     }
     *hasnulls = hasnull;
     *nbytes = totbytes;
+    return true;
 }


diff --git a/src/backend/utils/adt/arrayutils.c b/src/backend/utils/adt/arrayutils.c
index 051169a149..3821f6637b 100644
--- a/src/backend/utils/adt/arrayutils.c
+++ b/src/backend/utils/adt/arrayutils.c
@@ -74,6 +74,16 @@ ArrayGetOffset0(int n, const int *tup, const int *scale)
  */
 int
 ArrayGetNItems(int ndim, const int *dims)
+{
+    return ArrayGetNItemsSafe(ndim, dims, NULL);
+}
+
+/*
+ * This entry point can return the error into an ErrorSaveContext
+ * instead of throwing an exception.  -1 is returned after an error.
+ */
+int
+ArrayGetNItemsSafe(int ndim, const int *dims, struct Node *escontext)
 {
     int32        ret;
     int            i;
@@ -89,7 +99,7 @@ ArrayGetNItems(int ndim, const int *dims)

         /* A negative dimension implies that UB-LB overflowed ... */
         if (dims[i] < 0)
-            ereport(ERROR,
+            ereturn(escontext, -1,
                     (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                      errmsg("array size exceeds the maximum allowed (%d)",
                             (int) MaxArraySize)));
@@ -98,14 +108,14 @@ ArrayGetNItems(int ndim, const int *dims)

         ret = (int32) prod;
         if ((int64) ret != prod)
-            ereport(ERROR,
+            ereturn(escontext, -1,
                     (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                      errmsg("array size exceeds the maximum allowed (%d)",
                             (int) MaxArraySize)));
     }
     Assert(ret >= 0);
     if ((Size) ret > MaxArraySize)
-        ereport(ERROR,
+        ereturn(escontext, -1,
                 (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                  errmsg("array size exceeds the maximum allowed (%d)",
                         (int) MaxArraySize)));
@@ -126,6 +136,17 @@ ArrayGetNItems(int ndim, const int *dims)
  */
 void
 ArrayCheckBounds(int ndim, const int *dims, const int *lb)
+{
+    (void) ArrayCheckBoundsSafe(ndim, dims, lb, NULL);
+}
+
+/*
+ * This entry point can return the error into an ErrorSaveContext
+ * instead of throwing an exception.
+ */
+bool
+ArrayCheckBoundsSafe(int ndim, const int *dims, const int *lb,
+                     struct Node *escontext)
 {
     int            i;

@@ -135,11 +156,13 @@ ArrayCheckBounds(int ndim, const int *dims, const int *lb)
         int32        sum PG_USED_FOR_ASSERTS_ONLY;

         if (pg_add_s32_overflow(dims[i], lb[i], &sum))
-            ereport(ERROR,
+            ereturn(escontext, false,
                     (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                      errmsg("array lower bound is too large: %d",
                             lb[i])));
     }
+
+    return true;
 }

 /*
diff --git a/src/backend/utils/adt/bool.c b/src/backend/utils/adt/bool.c
index cd7335287f..e291672ae4 100644
--- a/src/backend/utils/adt/bool.c
+++ b/src/backend/utils/adt/bool.c
@@ -148,13 +148,10 @@ boolin(PG_FUNCTION_ARGS)
     if (parse_bool_with_len(str, len, &result))
         PG_RETURN_BOOL(result);

-    ereport(ERROR,
+    ereturn(fcinfo->context, (Datum) 0,
             (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
              errmsg("invalid input syntax for type %s: \"%s\"",
                     "boolean", in_str)));
-
-    /* not reached */
-    PG_RETURN_BOOL(false);
 }

 /*
diff --git a/src/backend/utils/adt/int.c b/src/backend/utils/adt/int.c
index 42ddae99ef..e1837bee71 100644
--- a/src/backend/utils/adt/int.c
+++ b/src/backend/utils/adt/int.c
@@ -291,7 +291,7 @@ int4in(PG_FUNCTION_ARGS)
 {
     char       *num = PG_GETARG_CSTRING(0);

-    PG_RETURN_INT32(pg_strtoint32(num));
+    PG_RETURN_INT32(pg_strtoint32_safe(num, fcinfo->context));
 }

 /*
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index a64422c8d0..0de0bed0e8 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -166,8 +166,11 @@ invalid_syntax:
 /*
  * Convert input string to a signed 32 bit integer.
  *
- * Allows any number of leading or trailing whitespace characters. Will throw
- * ereport() upon bad input format or overflow.
+ * Allows any number of leading or trailing whitespace characters.
+ *
+ * pg_strtoint32() will throw ereport() upon bad input format or overflow,
+ * while pg_strtoint32_safe() instead returns such complaints in *escontext,
+ * if it's an ErrorSaveContext.
  *
  * NB: Accumulate input as an unsigned number, to deal with two's complement
  * representation of the most negative number, which can't be represented as a
@@ -175,6 +178,12 @@ invalid_syntax:
  */
 int32
 pg_strtoint32(const char *s)
+{
+    return pg_strtoint32_safe(s, NULL);
+}
+
+int32
+pg_strtoint32_safe(const char *s, Node *escontext)
 {
     const char *ptr = s;
     uint32        tmp = 0;
@@ -227,18 +236,16 @@ pg_strtoint32(const char *s)
     return (int32) tmp;

 out_of_range:
-    ereport(ERROR,
+    ereturn(escontext, 0,
             (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
              errmsg("value \"%s\" is out of range for type %s",
                     s, "integer")));

 invalid_syntax:
-    ereport(ERROR,
+    ereturn(escontext, 0,
             (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
              errmsg("invalid input syntax for type %s: \"%s\"",
                     "integer", s)));
-
-    return 0;                    /* keep compiler quiet */
 }

 /*
diff --git a/src/backend/utils/adt/rowtypes.c b/src/backend/utils/adt/rowtypes.c
index db843a0fbf..bdafcff02d 100644
--- a/src/backend/utils/adt/rowtypes.c
+++ b/src/backend/utils/adt/rowtypes.c
@@ -77,6 +77,7 @@ record_in(PG_FUNCTION_ARGS)
     char       *string = PG_GETARG_CSTRING(0);
     Oid            tupType = PG_GETARG_OID(1);
     int32        tupTypmod = PG_GETARG_INT32(2);
+    Node       *escontext = fcinfo->context;
     HeapTupleHeader result;
     TupleDesc    tupdesc;
     HeapTuple    tuple;
@@ -100,7 +101,7 @@ record_in(PG_FUNCTION_ARGS)
      * supply a valid typmod, and then we can do something useful for RECORD.
      */
     if (tupType == RECORDOID && tupTypmod < 0)
-        ereport(ERROR,
+        ereturn(escontext, (Datum) 0,
                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                  errmsg("input of anonymous composite types is not implemented")));

@@ -152,10 +153,13 @@ record_in(PG_FUNCTION_ARGS)
     while (*ptr && isspace((unsigned char) *ptr))
         ptr++;
     if (*ptr++ != '(')
-        ereport(ERROR,
+    {
+        errsave(escontext,
                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                  errmsg("malformed record literal: \"%s\"", string),
                  errdetail("Missing left parenthesis.")));
+        goto fail;
+    }

     initStringInfo(&buf);

@@ -181,10 +185,13 @@ record_in(PG_FUNCTION_ARGS)
                 ptr++;
             else
                 /* *ptr must be ')' */
-                ereport(ERROR,
+            {
+                errsave(escontext,
                         (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                          errmsg("malformed record literal: \"%s\"", string),
                          errdetail("Too few columns.")));
+                goto fail;
+            }
         }

         /* Check for null: completely empty input means null */
@@ -204,19 +211,25 @@ record_in(PG_FUNCTION_ARGS)
                 char        ch = *ptr++;

                 if (ch == '\0')
-                    ereport(ERROR,
+                {
+                    errsave(escontext,
                             (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                              errmsg("malformed record literal: \"%s\"",
                                     string),
                              errdetail("Unexpected end of input.")));
+                    goto fail;
+                }
                 if (ch == '\\')
                 {
                     if (*ptr == '\0')
-                        ereport(ERROR,
+                    {
+                        errsave(escontext,
                                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                                  errmsg("malformed record literal: \"%s\"",
                                         string),
                                  errdetail("Unexpected end of input.")));
+                        goto fail;
+                    }
                     appendStringInfoChar(&buf, *ptr++);
                 }
                 else if (ch == '"')
@@ -252,10 +265,13 @@ record_in(PG_FUNCTION_ARGS)
             column_info->column_type = column_type;
         }

-        values[i] = InputFunctionCall(&column_info->proc,
-                                      column_data,
-                                      column_info->typioparam,
-                                      att->atttypmod);
+        if (!InputFunctionCallSafe(&column_info->proc,
+                                   column_data,
+                                   column_info->typioparam,
+                                   att->atttypmod,
+                                   escontext,
+                                   &values[i]))
+            goto fail;

         /*
          * Prep for next column
@@ -264,18 +280,24 @@ record_in(PG_FUNCTION_ARGS)
     }

     if (*ptr++ != ')')
-        ereport(ERROR,
+    {
+        errsave(escontext,
                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                  errmsg("malformed record literal: \"%s\"", string),
                  errdetail("Too many columns.")));
+        goto fail;
+    }
     /* Allow trailing whitespace */
     while (*ptr && isspace((unsigned char) *ptr))
         ptr++;
     if (*ptr)
-        ereport(ERROR,
+    {
+        errsave(escontext,
                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                  errmsg("malformed record literal: \"%s\"", string),
                  errdetail("Junk after right parenthesis.")));
+        goto fail;
+    }

     tuple = heap_form_tuple(tupdesc, values, nulls);

@@ -294,6 +316,11 @@ record_in(PG_FUNCTION_ARGS)
     ReleaseTupleDesc(tupdesc);

     PG_RETURN_HEAPTUPLEHEADER(result);
+
+    /* exit here once we've done lookup_rowtype_tupdesc */
+fail:
+    ReleaseTupleDesc(tupdesc);
+    PG_RETURN_NULL();
 }

 /*
diff --git a/src/include/utils/array.h b/src/include/utils/array.h
index 2f794d1168..3f6319aed5 100644
--- a/src/include/utils/array.h
+++ b/src/include/utils/array.h
@@ -447,7 +447,11 @@ extern void array_free_iterator(ArrayIterator iterator);
 extern int    ArrayGetOffset(int n, const int *dim, const int *lb, const int *indx);
 extern int    ArrayGetOffset0(int n, const int *tup, const int *scale);
 extern int    ArrayGetNItems(int ndim, const int *dims);
+extern int    ArrayGetNItemsSafe(int ndim, const int *dims,
+                               struct Node *escontext);
 extern void ArrayCheckBounds(int ndim, const int *dims, const int *lb);
+extern bool ArrayCheckBoundsSafe(int ndim, const int *dims, const int *lb,
+                                 struct Node *escontext);
 extern void mda_get_range(int n, int *span, const int *st, const int *endp);
 extern void mda_get_prod(int n, const int *range, int *prod);
 extern void mda_get_offset_values(int n, int *dist, const int *prod, const int *span);
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 81631f1645..fbfd8375e3 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -45,6 +45,7 @@ extern int    namestrcmp(Name name, const char *str);
 /* numutils.c */
 extern int16 pg_strtoint16(const char *s);
 extern int32 pg_strtoint32(const char *s);
+extern int32 pg_strtoint32_safe(const char *s, Node *escontext);
 extern int64 pg_strtoint64(const char *s);
 extern int    pg_itoa(int16 i, char *a);
 extern int    pg_ultoa_n(uint32 value, char *a);
diff --git a/src/test/regress/expected/arrays.out b/src/test/regress/expected/arrays.out
index 97920f38c2..a2f9d7ed16 100644
--- a/src/test/regress/expected/arrays.out
+++ b/src/test/regress/expected/arrays.out
@@ -182,6 +182,31 @@ SELECT a,b,c FROM arrtest;
  [4:4]={NULL}  | {3,4}                 | {foo,new_word}
 (3 rows)

+-- test non-error-throwing API
+SELECT pg_input_is_valid('{1,2,3}', 'integer[]');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('{1,2', 'integer[]');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_is_valid('{1,zed}', 'integer[]');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_error_message('{1,zed}', 'integer[]');
+            pg_input_error_message
+----------------------------------------------
+ invalid input syntax for type integer: "zed"
+(1 row)
+
 -- test mixed slice/scalar subscripting
 select '{{1,2,3},{4,5,6},{7,8,9}}'::int[];
            int4
diff --git a/src/test/regress/expected/boolean.out b/src/test/regress/expected/boolean.out
index 4728fe2dfd..977124b20b 100644
--- a/src/test/regress/expected/boolean.out
+++ b/src/test/regress/expected/boolean.out
@@ -142,6 +142,25 @@ SELECT bool '' AS error;
 ERROR:  invalid input syntax for type boolean: ""
 LINE 1: SELECT bool '' AS error;
                     ^
+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('true', 'bool');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('asdf', 'bool');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_error_message('junk', 'bool');
+            pg_input_error_message
+-----------------------------------------------
+ invalid input syntax for type boolean: "junk"
+(1 row)
+
 -- and, or, not in qualifications
 SELECT bool 't' or bool 'f' AS true;
  true
diff --git a/src/test/regress/expected/int4.out b/src/test/regress/expected/int4.out
index fbcc0e8d9e..b98007bd7a 100644
--- a/src/test/regress/expected/int4.out
+++ b/src/test/regress/expected/int4.out
@@ -45,6 +45,31 @@ SELECT * FROM INT4_TBL;
  -2147483647
 (5 rows)

+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('34', 'int4');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('asdf', 'int4');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_is_valid('1000000000000', 'int4');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_error_message('1000000000000', 'int4');
+                 pg_input_error_message
+--------------------------------------------------------
+ value "1000000000000" is out of range for type integer
+(1 row)
+
 SELECT i.* FROM INT4_TBL i WHERE i.f1 <> int2 '0';
      f1
 -------------
diff --git a/src/test/regress/expected/rowtypes.out b/src/test/regress/expected/rowtypes.out
index a4cc2d8c12..1bcd2b499c 100644
--- a/src/test/regress/expected/rowtypes.out
+++ b/src/test/regress/expected/rowtypes.out
@@ -69,6 +69,32 @@ ERROR:  malformed record literal: "(Joe,Blow) /"
 LINE 1: select '(Joe,Blow) /'::fullname;
                ^
 DETAIL:  Junk after right parenthesis.
+-- test non-error-throwing API
+create type twoints as (r integer, i integer);
+SELECT pg_input_is_valid('(1,2)', 'twoints');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('(1,2', 'twoints');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_is_valid('(1,zed)', 'twoints');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_error_message('(1,zed)', 'twoints');
+            pg_input_error_message
+----------------------------------------------
+ invalid input syntax for type integer: "zed"
+(1 row)
+
 create temp table quadtable(f1 int, q quad);
 insert into quadtable values (1, ((3.3,4.4),(5.5,6.6)));
 insert into quadtable values (2, ((null,4.4),(5.5,6.6)));
diff --git a/src/test/regress/sql/arrays.sql b/src/test/regress/sql/arrays.sql
index 791af5c0ce..38e8dd440b 100644
--- a/src/test/regress/sql/arrays.sql
+++ b/src/test/regress/sql/arrays.sql
@@ -113,6 +113,12 @@ SELECT a FROM arrtest WHERE a[2] IS NULL;
 DELETE FROM arrtest WHERE a[2] IS NULL AND b IS NULL;
 SELECT a,b,c FROM arrtest;

+-- test non-error-throwing API
+SELECT pg_input_is_valid('{1,2,3}', 'integer[]');
+SELECT pg_input_is_valid('{1,2', 'integer[]');
+SELECT pg_input_is_valid('{1,zed}', 'integer[]');
+SELECT pg_input_error_message('{1,zed}', 'integer[]');
+
 -- test mixed slice/scalar subscripting
 select '{{1,2,3},{4,5,6},{7,8,9}}'::int[];
 select ('{{1,2,3},{4,5,6},{7,8,9}}'::int[])[1:2][2];
diff --git a/src/test/regress/sql/boolean.sql b/src/test/regress/sql/boolean.sql
index 4dd47aaf9d..dfaa55dd0f 100644
--- a/src/test/regress/sql/boolean.sql
+++ b/src/test/regress/sql/boolean.sql
@@ -62,6 +62,11 @@ SELECT bool '000' AS error;

 SELECT bool '' AS error;

+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('true', 'bool');
+SELECT pg_input_is_valid('asdf', 'bool');
+SELECT pg_input_error_message('junk', 'bool');
+
 -- and, or, not in qualifications

 SELECT bool 't' or bool 'f' AS true;
diff --git a/src/test/regress/sql/int4.sql b/src/test/regress/sql/int4.sql
index f19077f3da..54420818de 100644
--- a/src/test/regress/sql/int4.sql
+++ b/src/test/regress/sql/int4.sql
@@ -17,6 +17,12 @@ INSERT INTO INT4_TBL(f1) VALUES ('');

 SELECT * FROM INT4_TBL;

+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('34', 'int4');
+SELECT pg_input_is_valid('asdf', 'int4');
+SELECT pg_input_is_valid('1000000000000', 'int4');
+SELECT pg_input_error_message('1000000000000', 'int4');
+
 SELECT i.* FROM INT4_TBL i WHERE i.f1 <> int2 '0';

 SELECT i.* FROM INT4_TBL i WHERE i.f1 <> int4 '0';
diff --git a/src/test/regress/sql/rowtypes.sql b/src/test/regress/sql/rowtypes.sql
index ad5b7e128f..4cd6a49215 100644
--- a/src/test/regress/sql/rowtypes.sql
+++ b/src/test/regress/sql/rowtypes.sql
@@ -31,6 +31,13 @@ select '[]'::fullname;          -- bad
 select ' (Joe,Blow)  '::fullname;  -- ok, extra whitespace
 select '(Joe,Blow) /'::fullname;  -- bad

+-- test non-error-throwing API
+create type twoints as (r integer, i integer);
+SELECT pg_input_is_valid('(1,2)', 'twoints');
+SELECT pg_input_is_valid('(1,2', 'twoints');
+SELECT pg_input_is_valid('(1,zed)', 'twoints');
+SELECT pg_input_error_message('(1,zed)', 'twoints');
+
 create temp table quadtable(f1 int, q quad);

 insert into quadtable values (1, ((3.3,4.4),(5.5,6.6)));
diff --git a/contrib/cube/cubeparse.y b/contrib/cube/cubeparse.y
index 977dcba965..e6e361736c 100644
--- a/contrib/cube/cubeparse.y
+++ b/contrib/cube/cubeparse.y
@@ -190,18 +190,18 @@ write_box(int dim, char *str1, char *str2)
     s = str1;
     i = 0;
     if (dim > 0)
-        bp->x[i++] = float8in_internal(s, &endptr, "cube", str1);
+        bp->x[i++] = float8in_internal(s, &endptr, "cube", str1, NULL);
     while ((s = strchr(s, ',')) != NULL)
     {
         s++;
-        bp->x[i++] = float8in_internal(s, &endptr, "cube", str1);
+        bp->x[i++] = float8in_internal(s, &endptr, "cube", str1, NULL);
     }
     Assert(i == dim);

     s = str2;
     if (dim > 0)
     {
-        bp->x[i] = float8in_internal(s, &endptr, "cube", str2);
+        bp->x[i] = float8in_internal(s, &endptr, "cube", str2, NULL);
         /* code this way to do right thing with NaN */
         point &= (bp->x[i] == bp->x[0]);
         i++;
@@ -209,7 +209,7 @@ write_box(int dim, char *str1, char *str2)
     while ((s = strchr(s, ',')) != NULL)
     {
         s++;
-        bp->x[i] = float8in_internal(s, &endptr, "cube", str2);
+        bp->x[i] = float8in_internal(s, &endptr, "cube", str2, NULL);
         point &= (bp->x[i] == bp->x[i - dim]);
         i++;
     }
@@ -250,11 +250,11 @@ write_point_as_box(int dim, char *str)
     s = str;
     i = 0;
     if (dim > 0)
-        bp->x[i++] = float8in_internal(s, &endptr, "cube", str);
+        bp->x[i++] = float8in_internal(s, &endptr, "cube", str, NULL);
     while ((s = strchr(s, ',')) != NULL)
     {
         s++;
-        bp->x[i++] = float8in_internal(s, &endptr, "cube", str);
+        bp->x[i++] = float8in_internal(s, &endptr, "cube", str, NULL);
     }
     Assert(i == dim);

diff --git a/src/backend/utils/adt/float.c b/src/backend/utils/adt/float.c
index da97538ebe..b02a19be24 100644
--- a/src/backend/utils/adt/float.c
+++ b/src/backend/utils/adt/float.c
@@ -163,6 +163,7 @@ Datum
 float4in(PG_FUNCTION_ARGS)
 {
     char       *num = PG_GETARG_CSTRING(0);
+    Node       *escontext = fcinfo->context;
     char       *orig_num;
     float        val;
     char       *endptr;
@@ -183,7 +184,7 @@ float4in(PG_FUNCTION_ARGS)
      * strtod() on different platforms.
      */
     if (*num == '\0')
-        ereport(ERROR,
+        ereturn(escontext, (Datum) 0,
                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                  errmsg("invalid input syntax for type %s: \"%s\"",
                         "real", orig_num)));
@@ -257,13 +258,13 @@ float4in(PG_FUNCTION_ARGS)
                 (val >= HUGE_VALF || val <= -HUGE_VALF)
 #endif
                 )
-                ereport(ERROR,
+                ereturn(escontext, (Datum) 0,
                         (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
                          errmsg("\"%s\" is out of range for type real",
                                 orig_num)));
         }
         else
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("invalid input syntax for type %s: \"%s\"",
                             "real", orig_num)));
@@ -275,7 +276,7 @@ float4in(PG_FUNCTION_ARGS)

     /* if there is any junk left at the end of the string, bail out */
     if (*endptr != '\0')
-        ereport(ERROR,
+        ereturn(escontext, (Datum) 0,
                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                  errmsg("invalid input syntax for type %s: \"%s\"",
                         "real", orig_num)));
@@ -337,52 +338,40 @@ float8in(PG_FUNCTION_ARGS)
 {
     char       *num = PG_GETARG_CSTRING(0);

-    PG_RETURN_FLOAT8(float8in_internal(num, NULL, "double precision", num));
+    PG_RETURN_FLOAT8(float8in_internal(num, NULL, "double precision", num,
+                                       fcinfo->context));
 }

-/* Convenience macro: set *have_error flag (if provided) or throw error */
-#define RETURN_ERROR(throw_error, have_error) \
-do { \
-    if (have_error) { \
-        *have_error = true; \
-        return 0.0; \
-    } else { \
-        throw_error; \
-    } \
-} while (0)
-
 /*
- * float8in_internal_opt_error - guts of float8in()
+ * float8in_internal - guts of float8in()
  *
  * This is exposed for use by functions that want a reasonably
  * platform-independent way of inputting doubles.  The behavior is
- * essentially like strtod + ereport on error, but note the following
+ * essentially like strtod + ereturn on error, but note the following
  * differences:
  * 1. Both leading and trailing whitespace are skipped.
- * 2. If endptr_p is NULL, we throw error if there's trailing junk.
+ * 2. If endptr_p is NULL, we report error if there's trailing junk.
  * Otherwise, it's up to the caller to complain about trailing junk.
  * 3. In event of a syntax error, the report mentions the given type_name
  * and prints orig_string as the input; this is meant to support use of
  * this function with types such as "box" and "point", where what we are
  * parsing here is just a substring of orig_string.
  *
+ * If escontext points to an ErrorSaveContext node, that is filled instead
+ * of throwing an error; the caller must check SOFT_ERROR_OCCURRED()
+ * to detect errors.
+ *
  * "num" could validly be declared "const char *", but that results in an
  * unreasonable amount of extra casting both here and in callers, so we don't.
- *
- * When "*have_error" flag is provided, it's set instead of throwing an
- * error.  This is helpful when caller need to handle errors by itself.
  */
-double
-float8in_internal_opt_error(char *num, char **endptr_p,
-                            const char *type_name, const char *orig_string,
-                            bool *have_error)
+float8
+float8in_internal(char *num, char **endptr_p,
+                  const char *type_name, const char *orig_string,
+                  struct Node *escontext)
 {
     double        val;
     char       *endptr;

-    if (have_error)
-        *have_error = false;
-
     /* skip leading whitespace */
     while (*num != '\0' && isspace((unsigned char) *num))
         num++;
@@ -392,11 +381,10 @@ float8in_internal_opt_error(char *num, char **endptr_p,
      * strtod() on different platforms.
      */
     if (*num == '\0')
-        RETURN_ERROR(ereport(ERROR,
-                             (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
-                              errmsg("invalid input syntax for type %s: \"%s\"",
-                                     type_name, orig_string))),
-                     have_error);
+        ereturn(escontext, 0,
+                (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+                 errmsg("invalid input syntax for type %s: \"%s\"",
+                        type_name, orig_string)));

     errno = 0;
     val = strtod(num, &endptr);
@@ -469,20 +457,17 @@ float8in_internal_opt_error(char *num, char **endptr_p,
                 char       *errnumber = pstrdup(num);

                 errnumber[endptr - num] = '\0';
-                RETURN_ERROR(ereport(ERROR,
-                                     (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-                                      errmsg("\"%s\" is out of range for type double precision",
-                                             errnumber))),
-                             have_error);
+                ereturn(escontext, 0,
+                        (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+                         errmsg("\"%s\" is out of range for type double precision",
+                                errnumber)));
             }
         }
         else
-            RETURN_ERROR(ereport(ERROR,
-                                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
-                                  errmsg("invalid input syntax for type "
-                                         "%s: \"%s\"",
-                                         type_name, orig_string))),
-                         have_error);
+            ereturn(escontext, 0,
+                    (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+                     errmsg("invalid input syntax for type %s: \"%s\"",
+                            type_name, orig_string)));
     }

     /* skip trailing whitespace */
@@ -493,27 +478,14 @@ float8in_internal_opt_error(char *num, char **endptr_p,
     if (endptr_p)
         *endptr_p = endptr;
     else if (*endptr != '\0')
-        RETURN_ERROR(ereport(ERROR,
-                             (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
-                              errmsg("invalid input syntax for type "
-                                     "%s: \"%s\"",
-                                     type_name, orig_string))),
-                     have_error);
+        ereturn(escontext, 0,
+                (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+                 errmsg("invalid input syntax for type %s: \"%s\"",
+                        type_name, orig_string)));

     return val;
 }

-/*
- * Interface to float8in_internal_opt_error() without "have_error" argument.
- */
-double
-float8in_internal(char *num, char **endptr_p,
-                  const char *type_name, const char *orig_string)
-{
-    return float8in_internal_opt_error(num, endptr_p, type_name,
-                                       orig_string, NULL);
-}
-

 /*
  *        float8out        - converts float8 number to a string
diff --git a/src/backend/utils/adt/geo_ops.c b/src/backend/utils/adt/geo_ops.c
index d78002b901..721ce6634f 100644
--- a/src/backend/utils/adt/geo_ops.c
+++ b/src/backend/utils/adt/geo_ops.c
@@ -189,7 +189,7 @@ static float8
 single_decode(char *num, char **endptr_p,
               const char *type_name, const char *orig_string)
 {
-    return float8in_internal(num, endptr_p, type_name, orig_string);
+    return float8in_internal(num, endptr_p, type_name, orig_string, NULL);
 }                                /* single_decode() */

 static void
@@ -212,7 +212,7 @@ pair_decode(char *str, float8 *x, float8 *y, char **endptr_p,
     if ((has_delim = (*str == LDELIM)))
         str++;

-    *x = float8in_internal(str, &str, type_name, orig_string);
+    *x = float8in_internal(str, &str, type_name, orig_string, NULL);

     if (*str++ != DELIM)
         ereport(ERROR,
@@ -220,7 +220,7 @@ pair_decode(char *str, float8 *x, float8 *y, char **endptr_p,
                  errmsg("invalid input syntax for type %s: \"%s\"",
                         type_name, orig_string)));

-    *y = float8in_internal(str, &str, type_name, orig_string);
+    *y = float8in_internal(str, &str, type_name, orig_string, NULL);

     if (has_delim)
     {
diff --git a/src/backend/utils/adt/int.c b/src/backend/utils/adt/int.c
index e1837bee71..8de38abd11 100644
--- a/src/backend/utils/adt/int.c
+++ b/src/backend/utils/adt/int.c
@@ -64,7 +64,7 @@ int2in(PG_FUNCTION_ARGS)
 {
     char       *num = PG_GETARG_CSTRING(0);

-    PG_RETURN_INT16(pg_strtoint16(num));
+    PG_RETURN_INT16(pg_strtoint16_safe(num, fcinfo->context));
 }

 /*
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 98d4323755..7d1767ce0f 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -52,7 +52,7 @@ int8in(PG_FUNCTION_ARGS)
 {
     char       *num = PG_GETARG_CSTRING(0);

-    PG_RETURN_INT64(pg_strtoint64(num));
+    PG_RETURN_INT64(pg_strtoint64_safe(num, fcinfo->context));
 }


diff --git a/src/backend/utils/adt/jsonpath_exec.c b/src/backend/utils/adt/jsonpath_exec.c
index 8d83b2edb3..930bd26584 100644
--- a/src/backend/utils/adt/jsonpath_exec.c
+++ b/src/backend/utils/adt/jsonpath_exec.c
@@ -64,6 +64,7 @@
 #include "funcapi.h"
 #include "lib/stringinfo.h"
 #include "miscadmin.h"
+#include "nodes/miscnodes.h"
 #include "regex/regex.h"
 #include "utils/builtins.h"
 #include "utils/date.h"
@@ -1041,15 +1042,15 @@ executeItemOptUnwrapTarget(JsonPathExecContext *cxt, JsonPathItem *jsp,
                     char       *tmp = DatumGetCString(DirectFunctionCall1(numeric_out,
                                                                           NumericGetDatum(jb->val.numeric)));
                     double        val;
-                    bool        have_error = false;
+                    ErrorSaveContext escontext = {T_ErrorSaveContext};

-                    val = float8in_internal_opt_error(tmp,
-                                                      NULL,
-                                                      "double precision",
-                                                      tmp,
-                                                      &have_error);
+                    val = float8in_internal(tmp,
+                                            NULL,
+                                            "double precision",
+                                            tmp,
+                                            (Node *) &escontext);

-                    if (have_error || isinf(val) || isnan(val))
+                    if (escontext.error_occurred || isinf(val) || isnan(val))
                         RETURN_ERROR(ereport(ERROR,
                                              (errcode(ERRCODE_NON_NUMERIC_SQL_JSON_ITEM),
                                              errmsg("numeric argument of jsonpath item method .%s() is out of range for type double precision",
@@ -1062,15 +1063,15 @@ executeItemOptUnwrapTarget(JsonPathExecContext *cxt, JsonPathItem *jsp,
                     double        val;
                     char       *tmp = pnstrdup(jb->val.string.val,
                                                jb->val.string.len);
-                    bool        have_error = false;
+                    ErrorSaveContext escontext = {T_ErrorSaveContext};

-                    val = float8in_internal_opt_error(tmp,
-                                                      NULL,
-                                                      "double precision",
-                                                      tmp,
-                                                      &have_error);
+                    val = float8in_internal(tmp,
+                                            NULL,
+                                            "double precision",
+                                            tmp,
+                                            (Node *) &escontext);

-                    if (have_error || isinf(val) || isnan(val))
+                    if (escontext.error_occurred || isinf(val) || isnan(val))
                         RETURN_ERROR(ereport(ERROR,
                                              (errcode(ERRCODE_NON_NUMERIC_SQL_JSON_ITEM),
                                              errmsg("string argument of jsonpath item method .%s() is not a valid representation of a double precision number",
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index 7f0e93aa80..c024928bc8 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -497,8 +497,9 @@ static void alloc_var(NumericVar *var, int ndigits);
 static void free_var(NumericVar *var);
 static void zero_var(NumericVar *var);

-static const char *set_var_from_str(const char *str, const char *cp,
-                                    NumericVar *dest);
+static bool set_var_from_str(const char *str, const char *cp,
+                             NumericVar *dest, const char **endptr,
+                             Node *escontext);
 static void set_var_from_num(Numeric num, NumericVar *dest);
 static void init_var_from_num(Numeric num, NumericVar *dest);
 static void set_var_from_var(const NumericVar *value, NumericVar *dest);
@@ -512,8 +513,8 @@ static Numeric duplicate_numeric(Numeric num);
 static Numeric make_result(const NumericVar *var);
 static Numeric make_result_opt_error(const NumericVar *var, bool *have_error);

-static void apply_typmod(NumericVar *var, int32 typmod);
-static void apply_typmod_special(Numeric num, int32 typmod);
+static bool apply_typmod(NumericVar *var, int32 typmod, Node *escontext);
+static bool apply_typmod_special(Numeric num, int32 typmod, Node *escontext);

 static bool numericvar_to_int32(const NumericVar *var, int32 *result);
 static bool numericvar_to_int64(const NumericVar *var, int64 *result);
@@ -617,11 +618,11 @@ Datum
 numeric_in(PG_FUNCTION_ARGS)
 {
     char       *str = PG_GETARG_CSTRING(0);
-
 #ifdef NOT_USED
     Oid            typelem = PG_GETARG_OID(1);
 #endif
     int32        typmod = PG_GETARG_INT32(2);
+    Node       *escontext = fcinfo->context;
     Numeric        res;
     const char *cp;

@@ -679,10 +680,12 @@ numeric_in(PG_FUNCTION_ARGS)
          * Use set_var_from_str() to parse a normal numeric value
          */
         NumericVar    value;
+        bool        have_error;

         init_var(&value);

-        cp = set_var_from_str(str, cp, &value);
+        if (!set_var_from_str(str, cp, &value, &cp, escontext))
+            PG_RETURN_NULL();

         /*
          * We duplicate a few lines of code here because we would like to
@@ -693,16 +696,23 @@ numeric_in(PG_FUNCTION_ARGS)
         while (*cp)
         {
             if (!isspace((unsigned char) *cp))
-                ereport(ERROR,
+                ereturn(escontext, (Datum) 0,
                         (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                          errmsg("invalid input syntax for type %s: \"%s\"",
                                 "numeric", str)));
             cp++;
         }

-        apply_typmod(&value, typmod);
+        if (!apply_typmod(&value, typmod, escontext))
+            PG_RETURN_NULL();
+
+        res = make_result_opt_error(&value, &have_error);
+
+        if (have_error)
+            ereturn(escontext, (Datum) 0,
+                    (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+                     errmsg("value overflows numeric format")));

-        res = make_result(&value);
         free_var(&value);

         PG_RETURN_NUMERIC(res);
@@ -712,7 +722,7 @@ numeric_in(PG_FUNCTION_ARGS)
     while (*cp)
     {
         if (!isspace((unsigned char) *cp))
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("invalid input syntax for type %s: \"%s\"",
                             "numeric", str)));
@@ -720,7 +730,8 @@ numeric_in(PG_FUNCTION_ARGS)
     }

     /* As above, throw any typmod error after finishing syntax check */
-    apply_typmod_special(res, typmod);
+    if (!apply_typmod_special(res, typmod, escontext))
+        PG_RETURN_NULL();

     PG_RETURN_NUMERIC(res);
 }
@@ -1058,7 +1069,7 @@ numeric_recv(PG_FUNCTION_ARGS)
     {
         trunc_var(&value, value.dscale);

-        apply_typmod(&value, typmod);
+        (void) apply_typmod(&value, typmod, NULL);

         res = make_result(&value);
     }
@@ -1067,7 +1078,7 @@ numeric_recv(PG_FUNCTION_ARGS)
         /* apply_typmod_special wants us to make the Numeric first */
         res = make_result(&value);

-        apply_typmod_special(res, typmod);
+        (void) apply_typmod_special(res, typmod, NULL);
     }

     free_var(&value);
@@ -1180,7 +1191,7 @@ numeric        (PG_FUNCTION_ARGS)
      */
     if (NUMERIC_IS_SPECIAL(num))
     {
-        apply_typmod_special(num, typmod);
+        (void) apply_typmod_special(num, typmod, NULL);
         PG_RETURN_NUMERIC(duplicate_numeric(num));
     }

@@ -1231,7 +1242,7 @@ numeric        (PG_FUNCTION_ARGS)
     init_var(&var);

     set_var_from_num(num, &var);
-    apply_typmod(&var, typmod);
+    (void) apply_typmod(&var, typmod, NULL);
     new = make_result(&var);

     free_var(&var);
@@ -4395,6 +4406,7 @@ float8_numeric(PG_FUNCTION_ARGS)
     Numeric        res;
     NumericVar    result;
     char        buf[DBL_DIG + 100];
+    const char *endptr;

     if (isnan(val))
         PG_RETURN_NUMERIC(make_result(&const_nan));
@@ -4412,7 +4424,7 @@ float8_numeric(PG_FUNCTION_ARGS)
     init_var(&result);

     /* Assume we need not worry about leading/trailing spaces */
-    (void) set_var_from_str(buf, buf, &result);
+    (void) set_var_from_str(buf, buf, &result, &endptr, NULL);

     res = make_result(&result);

@@ -4488,6 +4500,7 @@ float4_numeric(PG_FUNCTION_ARGS)
     Numeric        res;
     NumericVar    result;
     char        buf[FLT_DIG + 100];
+    const char *endptr;

     if (isnan(val))
         PG_RETURN_NUMERIC(make_result(&const_nan));
@@ -4505,7 +4518,7 @@ float4_numeric(PG_FUNCTION_ARGS)
     init_var(&result);

     /* Assume we need not worry about leading/trailing spaces */
-    (void) set_var_from_str(buf, buf, &result);
+    (void) set_var_from_str(buf, buf, &result, &endptr, NULL);

     res = make_result(&result);

@@ -6804,14 +6817,19 @@ zero_var(NumericVar *var)
  *    Parse a string and put the number into a variable
  *
  * This function does not handle leading or trailing spaces.  It returns
- * the end+1 position parsed, so that caller can check for trailing
- * spaces/garbage if deemed necessary.
+ * the end+1 position parsed into *endptr, so that caller can check for
+ * trailing spaces/garbage if deemed necessary.
  *
  * cp is the place to actually start parsing; str is what to use in error
  * reports.  (Typically cp would be the same except advanced over spaces.)
+ *
+ * Returns true on success, false on failure (if escontext points to an
+ * ErrorSaveContext; otherwise errors are thrown).
  */
-static const char *
-set_var_from_str(const char *str, const char *cp, NumericVar *dest)
+static bool
+set_var_from_str(const char *str, const char *cp,
+                 NumericVar *dest, const char **endptr,
+                 Node *escontext)
 {
     bool        have_dp = false;
     int            i;
@@ -6849,7 +6867,7 @@ set_var_from_str(const char *str, const char *cp, NumericVar *dest)
     }

     if (!isdigit((unsigned char) *cp))
-        ereport(ERROR,
+        ereturn(escontext, false,
                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                  errmsg("invalid input syntax for type %s: \"%s\"",
                         "numeric", str)));
@@ -6873,7 +6891,7 @@ set_var_from_str(const char *str, const char *cp, NumericVar *dest)
         else if (*cp == '.')
         {
             if (have_dp)
-                ereport(ERROR,
+                ereturn(escontext, false,
                         (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                          errmsg("invalid input syntax for type %s: \"%s\"",
                                 "numeric", str)));
@@ -6897,7 +6915,7 @@ set_var_from_str(const char *str, const char *cp, NumericVar *dest)
         cp++;
         exponent = strtol(cp, &endptr, 10);
         if (endptr == cp)
-            ereport(ERROR,
+            ereturn(escontext, false,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("invalid input syntax for type %s: \"%s\"",
                             "numeric", str)));
@@ -6912,7 +6930,7 @@ set_var_from_str(const char *str, const char *cp, NumericVar *dest)
          * for consistency use the same ereport errcode/text as make_result().
          */
         if (exponent >= INT_MAX / 2 || exponent <= -(INT_MAX / 2))
-            ereport(ERROR,
+            ereturn(escontext, false,
                     (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
                      errmsg("value overflows numeric format")));
         dweight += (int) exponent;
@@ -6963,7 +6981,9 @@ set_var_from_str(const char *str, const char *cp, NumericVar *dest)
     strip_var(dest);

     /* Return end+1 position for caller */
-    return cp;
+    *endptr = cp;
+
+    return true;
 }


@@ -7455,9 +7475,12 @@ make_result(const NumericVar *var)
  *
  *    Do bounds checking and rounding according to the specified typmod.
  *    Note that this is only applied to normal finite values.
+ *
+ * Returns true on success, false on failure (if escontext points to an
+ * ErrorSaveContext; otherwise errors are thrown).
  */
-static void
-apply_typmod(NumericVar *var, int32 typmod)
+static bool
+apply_typmod(NumericVar *var, int32 typmod, Node *escontext)
 {
     int            precision;
     int            scale;
@@ -7467,7 +7490,7 @@ apply_typmod(NumericVar *var, int32 typmod)

     /* Do nothing if we have an invalid typmod */
     if (!is_valid_numeric_typmod(typmod))
-        return;
+        return true;

     precision = numeric_typmod_precision(typmod);
     scale = numeric_typmod_scale(typmod);
@@ -7514,7 +7537,7 @@ apply_typmod(NumericVar *var, int32 typmod)
 #error unsupported NBASE
 #endif
                 if (ddigits > maxdigits)
-                    ereport(ERROR,
+                    ereturn(escontext, false,
                             (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
                              errmsg("numeric field overflow"),
                             errdetail("A field with precision %d, scale %d must round to an absolute value less than %s%d.",
@@ -7528,6 +7551,8 @@ apply_typmod(NumericVar *var, int32 typmod)
             ddigits -= DEC_DIGITS;
         }
     }
+
+    return true;
 }

 /*
@@ -7535,9 +7560,12 @@ apply_typmod(NumericVar *var, int32 typmod)
  *
  *    Do bounds checking according to the specified typmod, for an Inf or NaN.
  *    For convenience of most callers, the value is presented in packed form.
+ *
+ * Returns true on success, false on failure (if escontext points to an
+ * ErrorSaveContext; otherwise errors are thrown).
  */
-static void
-apply_typmod_special(Numeric num, int32 typmod)
+static bool
+apply_typmod_special(Numeric num, int32 typmod, Node *escontext)
 {
     int            precision;
     int            scale;
@@ -7551,16 +7579,16 @@ apply_typmod_special(Numeric num, int32 typmod)
      * any finite number of digits.
      */
     if (NUMERIC_IS_NAN(num))
-        return;
+        return true;

     /* Do nothing if we have a default typmod (-1) */
     if (!is_valid_numeric_typmod(typmod))
-        return;
+        return true;

     precision = numeric_typmod_precision(typmod);
     scale = numeric_typmod_scale(typmod);

-    ereport(ERROR,
+    ereturn(escontext, false,
             (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
              errmsg("numeric field overflow"),
              errdetail("A field with precision %d, scale %d cannot hold an infinite value.",
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 0de0bed0e8..ab1564f22d 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -88,15 +88,24 @@ decimalLength64(const uint64 v)
 /*
  * Convert input string to a signed 16 bit integer.
  *
- * Allows any number of leading or trailing whitespace characters. Will throw
- * ereport() upon bad input format or overflow.
+ * Allows any number of leading or trailing whitespace characters.
  *
+ * pg_strtoint16() will throw ereport() upon bad input format or overflow;
+ * while pg_strtoint16_safe() instead returns such complaints in *escontext,
+ * if it's an ErrorSaveContext.
+ *
  * NB: Accumulate input as an unsigned number, to deal with two's complement
  * representation of the most negative number, which can't be represented as a
  * signed positive number.
  */
 int16
 pg_strtoint16(const char *s)
+{
+    return pg_strtoint16_safe(s, NULL);
+}
+
+int16
+pg_strtoint16_safe(const char *s, Node *escontext)
 {
     const char *ptr = s;
     uint16        tmp = 0;
@@ -149,18 +158,16 @@ pg_strtoint16(const char *s)
     return (int16) tmp;

 out_of_range:
-    ereport(ERROR,
+    ereturn(escontext, 0,
             (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
              errmsg("value \"%s\" is out of range for type %s",
                     s, "smallint")));

 invalid_syntax:
-    ereport(ERROR,
+    ereturn(escontext, 0,
             (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
              errmsg("invalid input syntax for type %s: \"%s\"",
                     "smallint", s)));
-
-    return 0;                    /* keep compiler quiet */
 }

 /*
@@ -251,8 +258,11 @@ invalid_syntax:
 /*
  * Convert input string to a signed 64 bit integer.
  *
- * Allows any number of leading or trailing whitespace characters. Will throw
- * ereport() upon bad input format or overflow.
+ * Allows any number of leading or trailing whitespace characters.
+ *
+ * pg_strtoint64() will throw ereport() upon bad input format or overflow;
+ * while pg_strtoint64_safe() instead returns such complaints in *escontext,
+ * if it's an ErrorSaveContext.
  *
  * NB: Accumulate input as an unsigned number, to deal with two's complement
  * representation of the most negative number, which can't be represented as a
@@ -260,6 +270,12 @@ invalid_syntax:
  */
 int64
 pg_strtoint64(const char *s)
+{
+    return pg_strtoint64_safe(s, NULL);
+}
+
+int64
+pg_strtoint64_safe(const char *s, Node *escontext)
 {
     const char *ptr = s;
     uint64        tmp = 0;
@@ -312,18 +328,16 @@ pg_strtoint64(const char *s)
     return (int64) tmp;

 out_of_range:
-    ereport(ERROR,
+    ereturn(escontext, 0,
             (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
              errmsg("value \"%s\" is out of range for type %s",
                     s, "bigint")));

 invalid_syntax:
-    ereport(ERROR,
+    ereturn(escontext, 0,
             (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
              errmsg("invalid input syntax for type %s: \"%s\"",
                     "bigint", s)));
-
-    return 0;                    /* keep compiler quiet */
 }

 /*
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index fbfd8375e3..10d13b0f1e 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -44,9 +44,11 @@ extern int    namestrcmp(Name name, const char *str);

 /* numutils.c */
 extern int16 pg_strtoint16(const char *s);
+extern int16 pg_strtoint16_safe(const char *s, Node *escontext);
 extern int32 pg_strtoint32(const char *s);
 extern int32 pg_strtoint32_safe(const char *s, Node *escontext);
 extern int64 pg_strtoint64(const char *s);
+extern int64 pg_strtoint64_safe(const char *s, Node *escontext);
 extern int    pg_itoa(int16 i, char *a);
 extern int    pg_ultoa_n(uint32 value, char *a);
 extern int    pg_ulltoa_n(uint64 value, char *a);
diff --git a/src/include/utils/float.h b/src/include/utils/float.h
index 4bf0e3ac07..f92860b4a4 100644
--- a/src/include/utils/float.h
+++ b/src/include/utils/float.h
@@ -42,10 +42,8 @@ extern void float_underflow_error(void) pg_attribute_noreturn();
 extern void float_zero_divide_error(void) pg_attribute_noreturn();
 extern int    is_infinite(float8 val);
 extern float8 float8in_internal(char *num, char **endptr_p,
-                                const char *type_name, const char *orig_string);
-extern float8 float8in_internal_opt_error(char *num, char **endptr_p,
-                                          const char *type_name, const char *orig_string,
-                                          bool *have_error);
+                                const char *type_name, const char *orig_string,
+                                struct Node *escontext);
 extern char *float8out_internal(float8 num);
 extern int    float4_cmp_internal(float4 a, float4 b);
 extern int    float8_cmp_internal(float8 a, float8 b);
diff --git a/src/test/regress/expected/float4-misrounded-input.out b/src/test/regress/expected/float4-misrounded-input.out
index 3d5d298b73..24fde6cc9f 100644
--- a/src/test/regress/expected/float4-misrounded-input.out
+++ b/src/test/regress/expected/float4-misrounded-input.out
@@ -81,6 +81,31 @@ INSERT INTO FLOAT4_TBL(f1) VALUES ('123            5');
 ERROR:  invalid input syntax for type real: "123            5"
 LINE 1: INSERT INTO FLOAT4_TBL(f1) VALUES ('123            5');
                                            ^
+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('34.5', 'float4');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('xyz', 'float4');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_is_valid('1e400', 'float4');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_error_message('1e400', 'float4');
+        pg_input_error_message
+---------------------------------------
+ "1e400" is out of range for type real
+(1 row)
+
 -- special inputs
 SELECT 'NaN'::float4;
  float4
diff --git a/src/test/regress/expected/float4.out b/src/test/regress/expected/float4.out
index 6ad5d00aa2..1d7090a90d 100644
--- a/src/test/regress/expected/float4.out
+++ b/src/test/regress/expected/float4.out
@@ -81,6 +81,31 @@ INSERT INTO FLOAT4_TBL(f1) VALUES ('123            5');
 ERROR:  invalid input syntax for type real: "123            5"
 LINE 1: INSERT INTO FLOAT4_TBL(f1) VALUES ('123            5');
                                            ^
+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('34.5', 'float4');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('xyz', 'float4');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_is_valid('1e400', 'float4');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_error_message('1e400', 'float4');
+        pg_input_error_message
+---------------------------------------
+ "1e400" is out of range for type real
+(1 row)
+
 -- special inputs
 SELECT 'NaN'::float4;
  float4
diff --git a/src/test/regress/expected/float8.out b/src/test/regress/expected/float8.out
index de4d57ec9f..2b25784f7f 100644
--- a/src/test/regress/expected/float8.out
+++ b/src/test/regress/expected/float8.out
@@ -68,6 +68,31 @@ INSERT INTO FLOAT8_TBL(f1) VALUES ('123           5');
 ERROR:  invalid input syntax for type double precision: "123           5"
 LINE 1: INSERT INTO FLOAT8_TBL(f1) VALUES ('123           5');
                                            ^
+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('34.5', 'float8');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('xyz', 'float8');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_is_valid('1e4000', 'float8');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_error_message('1e4000', 'float8');
+               pg_input_error_message
+----------------------------------------------------
+ "1e4000" is out of range for type double precision
+(1 row)
+
 -- special inputs
 SELECT 'NaN'::float8;
  float8
diff --git a/src/test/regress/expected/int2.out b/src/test/regress/expected/int2.out
index 109cf9baaa..6a23567b67 100644
--- a/src/test/regress/expected/int2.out
+++ b/src/test/regress/expected/int2.out
@@ -45,6 +45,31 @@ SELECT * FROM INT2_TBL;
  -32767
 (5 rows)

+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('34', 'int2');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('asdf', 'int2');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_is_valid('50000', 'int2');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_error_message('50000', 'int2');
+             pg_input_error_message
+-------------------------------------------------
+ value "50000" is out of range for type smallint
+(1 row)
+
 SELECT * FROM INT2_TBL AS f(a, b);
 ERROR:  table "f" has 1 columns available but 2 columns specified
 SELECT * FROM (TABLE int2_tbl) AS s (a, b);
diff --git a/src/test/regress/expected/int8.out b/src/test/regress/expected/int8.out
index 1ae23cf3f9..90ed061249 100644
--- a/src/test/regress/expected/int8.out
+++ b/src/test/regress/expected/int8.out
@@ -42,6 +42,31 @@ SELECT * FROM INT8_TBL;
  4567890123456789 | -4567890123456789
 (5 rows)

+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('34', 'int8');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('asdf', 'int8');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_is_valid('10000000000000000000', 'int8');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_error_message('10000000000000000000', 'int8');
+                    pg_input_error_message
+--------------------------------------------------------------
+ value "10000000000000000000" is out of range for type bigint
+(1 row)
+
 -- int8/int8 cmp
 SELECT * FROM INT8_TBL WHERE q2 = 4567890123456789;
         q1        |        q2
diff --git a/src/test/regress/expected/numeric.out b/src/test/regress/expected/numeric.out
index 3c610646dc..30a5613ed7 100644
--- a/src/test/regress/expected/numeric.out
+++ b/src/test/regress/expected/numeric.out
@@ -2199,6 +2199,49 @@ SELECT * FROM num_input_test;
  -Infinity
 (13 rows)

+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('34.5', 'numeric');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('34xyz', 'numeric');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_is_valid('1e400000', 'numeric');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_error_message('1e400000', 'numeric');
+     pg_input_error_message
+--------------------------------
+ value overflows numeric format
+(1 row)
+
+SELECT pg_input_is_valid('1234.567', 'numeric(8,4)');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('1234.567', 'numeric(7,4)');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_error_message('1234.567', 'numeric(7,4)');
+ pg_input_error_message
+------------------------
+ numeric field overflow
+(1 row)
+
 --
 -- Test precision and scale typemods
 --
diff --git a/src/test/regress/sql/float4.sql b/src/test/regress/sql/float4.sql
index 612486ecbd..061477726b 100644
--- a/src/test/regress/sql/float4.sql
+++ b/src/test/regress/sql/float4.sql
@@ -36,6 +36,12 @@ INSERT INTO FLOAT4_TBL(f1) VALUES ('5.   0');
 INSERT INTO FLOAT4_TBL(f1) VALUES ('     - 3.0');
 INSERT INTO FLOAT4_TBL(f1) VALUES ('123            5');

+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('34.5', 'float4');
+SELECT pg_input_is_valid('xyz', 'float4');
+SELECT pg_input_is_valid('1e400', 'float4');
+SELECT pg_input_error_message('1e400', 'float4');
+
 -- special inputs
 SELECT 'NaN'::float4;
 SELECT 'nan'::float4;
diff --git a/src/test/regress/sql/float8.sql b/src/test/regress/sql/float8.sql
index 03c134b078..c276a5324c 100644
--- a/src/test/regress/sql/float8.sql
+++ b/src/test/regress/sql/float8.sql
@@ -34,6 +34,12 @@ INSERT INTO FLOAT8_TBL(f1) VALUES ('5.   0');
 INSERT INTO FLOAT8_TBL(f1) VALUES ('    - 3');
 INSERT INTO FLOAT8_TBL(f1) VALUES ('123           5');

+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('34.5', 'float8');
+SELECT pg_input_is_valid('xyz', 'float8');
+SELECT pg_input_is_valid('1e4000', 'float8');
+SELECT pg_input_error_message('1e4000', 'float8');
+
 -- special inputs
 SELECT 'NaN'::float8;
 SELECT 'nan'::float8;
diff --git a/src/test/regress/sql/int2.sql b/src/test/regress/sql/int2.sql
index ea29066b78..98a761a24a 100644
--- a/src/test/regress/sql/int2.sql
+++ b/src/test/regress/sql/int2.sql
@@ -17,6 +17,12 @@ INSERT INTO INT2_TBL(f1) VALUES ('');

 SELECT * FROM INT2_TBL;

+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('34', 'int2');
+SELECT pg_input_is_valid('asdf', 'int2');
+SELECT pg_input_is_valid('50000', 'int2');
+SELECT pg_input_error_message('50000', 'int2');
+
 SELECT * FROM INT2_TBL AS f(a, b);

 SELECT * FROM (TABLE int2_tbl) AS s (a, b);
diff --git a/src/test/regress/sql/int8.sql b/src/test/regress/sql/int8.sql
index 38b771964d..76007b692b 100644
--- a/src/test/regress/sql/int8.sql
+++ b/src/test/regress/sql/int8.sql
@@ -16,6 +16,12 @@ INSERT INTO INT8_TBL(q1) VALUES ('');

 SELECT * FROM INT8_TBL;

+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('34', 'int8');
+SELECT pg_input_is_valid('asdf', 'int8');
+SELECT pg_input_is_valid('10000000000000000000', 'int8');
+SELECT pg_input_error_message('10000000000000000000', 'int8');
+
 -- int8/int8 cmp
 SELECT * FROM INT8_TBL WHERE q2 = 4567890123456789;
 SELECT * FROM INT8_TBL WHERE q2 <> 4567890123456789;
diff --git a/src/test/regress/sql/numeric.sql b/src/test/regress/sql/numeric.sql
index 93bb0996be..7bb34e5021 100644
--- a/src/test/regress/sql/numeric.sql
+++ b/src/test/regress/sql/numeric.sql
@@ -1053,6 +1053,15 @@ INSERT INTO num_input_test(n1) VALUES ('+ infinity');

 SELECT * FROM num_input_test;

+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('34.5', 'numeric');
+SELECT pg_input_is_valid('34xyz', 'numeric');
+SELECT pg_input_is_valid('1e400000', 'numeric');
+SELECT pg_input_error_message('1e400000', 'numeric');
+SELECT pg_input_is_valid('1234.567', 'numeric(8,4)');
+SELECT pg_input_is_valid('1234.567', 'numeric(7,4)');
+SELECT pg_input_error_message('1234.567', 'numeric(7,4)');
+
 --
 -- Test precision and scale typemods
 --
diff --git a/contrib/cube/cube.c b/contrib/cube/cube.c
index 4f32c5dc1d..1fc447511a 100644
--- a/contrib/cube/cube.c
+++ b/contrib/cube/cube.c
@@ -123,8 +123,9 @@ cube_in(PG_FUNCTION_ARGS)

     cube_scanner_init(str, &scanbuflen);

-    cube_yyparse(&result, scanbuflen);
+    cube_yyparse(&result, scanbuflen, fcinfo->context);

+    /* We might as well run this even on failure. */
     cube_scanner_finish();

     PG_RETURN_NDBOX_P(result);
diff --git a/contrib/cube/cubedata.h b/contrib/cube/cubedata.h
index 640a7ca580..96fa41a04e 100644
--- a/contrib/cube/cubedata.h
+++ b/contrib/cube/cubedata.h
@@ -61,9 +61,12 @@ typedef struct NDBOX

 /* in cubescan.l */
 extern int    cube_yylex(void);
-extern void cube_yyerror(NDBOX **result, Size scanbuflen, const char *message) pg_attribute_noreturn();
+extern void cube_yyerror(NDBOX **result, Size scanbuflen,
+                         struct Node *escontext,
+                         const char *message);
 extern void cube_scanner_init(const char *str, Size *scanbuflen);
 extern void cube_scanner_finish(void);

 /* in cubeparse.y */
-extern int    cube_yyparse(NDBOX **result, Size scanbuflen);
+extern int    cube_yyparse(NDBOX **result, Size scanbuflen,
+                         struct Node *escontext);
diff --git a/contrib/cube/cubeparse.y b/contrib/cube/cubeparse.y
index e6e361736c..44450d1027 100644
--- a/contrib/cube/cubeparse.y
+++ b/contrib/cube/cubeparse.y
@@ -7,6 +7,7 @@
 #include "postgres.h"

 #include "cubedata.h"
+#include "nodes/miscnodes.h"
 #include "utils/float.h"

 /* All grammar constructs return strings */
@@ -21,14 +22,17 @@
 #define YYFREE   pfree

 static int item_count(const char *s, char delim);
-static NDBOX *write_box(int dim, char *str1, char *str2);
-static NDBOX *write_point_as_box(int dim, char *str);
+static bool write_box(int dim, char *str1, char *str2,
+                      NDBOX **result, struct Node *escontext);
+static bool write_point_as_box(int dim, char *str,
+                               NDBOX **result, struct Node *escontext);

 %}

 /* BISON Declarations */
 %parse-param {NDBOX **result}
 %parse-param {Size scanbuflen}
+%parse-param {struct Node *escontext}
 %expect 0
 %name-prefix="cube_yy"

@@ -45,7 +49,7 @@ box: O_BRACKET paren_list COMMA paren_list C_BRACKET
         dim = item_count($2, ',');
         if (item_count($4, ',') != dim)
         {
-            ereport(ERROR,
+            errsave(escontext,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("invalid input syntax for cube"),
                      errdetail("Different point dimensions in (%s) and (%s).",
@@ -54,7 +58,7 @@ box: O_BRACKET paren_list COMMA paren_list C_BRACKET
         }
         if (dim > CUBE_MAX_DIM)
         {
-            ereport(ERROR,
+            errsave(escontext,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("invalid input syntax for cube"),
                      errdetail("A cube cannot have more than %d dimensions.",
@@ -62,7 +66,8 @@ box: O_BRACKET paren_list COMMA paren_list C_BRACKET
             YYABORT;
         }

-        *result = write_box( dim, $2, $4 );
+        if (!write_box(dim, $2, $4, result, escontext))
+            YYABORT;
     }

     | paren_list COMMA paren_list
@@ -72,7 +77,7 @@ box: O_BRACKET paren_list COMMA paren_list C_BRACKET
         dim = item_count($1, ',');
         if (item_count($3, ',') != dim)
         {
-            ereport(ERROR,
+            errsave(escontext,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("invalid input syntax for cube"),
                      errdetail("Different point dimensions in (%s) and (%s).",
@@ -81,7 +86,7 @@ box: O_BRACKET paren_list COMMA paren_list C_BRACKET
         }
         if (dim > CUBE_MAX_DIM)
         {
-            ereport(ERROR,
+            errsave(escontext,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("invalid input syntax for cube"),
                      errdetail("A cube cannot have more than %d dimensions.",
@@ -89,7 +94,8 @@ box: O_BRACKET paren_list COMMA paren_list C_BRACKET
             YYABORT;
         }

-        *result = write_box( dim, $1, $3 );
+        if (!write_box(dim, $1, $3, result, escontext))
+            YYABORT;
     }

     | paren_list
@@ -99,7 +105,7 @@ box: O_BRACKET paren_list COMMA paren_list C_BRACKET
         dim = item_count($1, ',');
         if (dim > CUBE_MAX_DIM)
         {
-            ereport(ERROR,
+            errsave(escontext,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("invalid input syntax for cube"),
                      errdetail("A cube cannot have more than %d dimensions.",
@@ -107,7 +113,8 @@ box: O_BRACKET paren_list COMMA paren_list C_BRACKET
             YYABORT;
         }

-        *result = write_point_as_box(dim, $1);
+        if (!write_point_as_box(dim, $1, result, escontext))
+            YYABORT;
     }

     | list
@@ -117,7 +124,7 @@ box: O_BRACKET paren_list COMMA paren_list C_BRACKET
         dim = item_count($1, ',');
         if (dim > CUBE_MAX_DIM)
         {
-            ereport(ERROR,
+            errsave(escontext,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("invalid input syntax for cube"),
                      errdetail("A cube cannot have more than %d dimensions.",
@@ -125,7 +132,8 @@ box: O_BRACKET paren_list COMMA paren_list C_BRACKET
             YYABORT;
         }

-        *result = write_point_as_box(dim, $1);
+        if (!write_point_as_box(dim, $1, result, escontext))
+            YYABORT;
     }
     ;

@@ -173,8 +181,9 @@ item_count(const char *s, char delim)
     return nitems;
 }

-static NDBOX *
-write_box(int dim, char *str1, char *str2)
+static bool
+write_box(int dim, char *str1, char *str2,
+          NDBOX **result, struct Node *escontext)
 {
     NDBOX       *bp;
     char       *s;
@@ -190,18 +199,26 @@ write_box(int dim, char *str1, char *str2)
     s = str1;
     i = 0;
     if (dim > 0)
-        bp->x[i++] = float8in_internal(s, &endptr, "cube", str1, NULL);
+    {
+        bp->x[i++] = float8in_internal(s, &endptr, "cube", str1, escontext);
+        if (SOFT_ERROR_OCCURRED(escontext))
+            return false;
+    }
     while ((s = strchr(s, ',')) != NULL)
     {
         s++;
-        bp->x[i++] = float8in_internal(s, &endptr, "cube", str1, NULL);
+        bp->x[i++] = float8in_internal(s, &endptr, "cube", str1, escontext);
+        if (SOFT_ERROR_OCCURRED(escontext))
+            return false;
     }
     Assert(i == dim);

     s = str2;
     if (dim > 0)
     {
-        bp->x[i] = float8in_internal(s, &endptr, "cube", str2, NULL);
+        bp->x[i] = float8in_internal(s, &endptr, "cube", str2, escontext);
+        if (SOFT_ERROR_OCCURRED(escontext))
+            return false;
         /* code this way to do right thing with NaN */
         point &= (bp->x[i] == bp->x[0]);
         i++;
@@ -209,7 +226,9 @@ write_box(int dim, char *str1, char *str2)
     while ((s = strchr(s, ',')) != NULL)
     {
         s++;
-        bp->x[i] = float8in_internal(s, &endptr, "cube", str2, NULL);
+        bp->x[i] = float8in_internal(s, &endptr, "cube", str2, escontext);
+        if (SOFT_ERROR_OCCURRED(escontext))
+            return false;
         point &= (bp->x[i] == bp->x[i - dim]);
         i++;
     }
@@ -229,11 +248,13 @@ write_box(int dim, char *str1, char *str2)
         SET_POINT_BIT(bp);
     }

-    return bp;
+    *result = bp;
+    return true;
 }

-static NDBOX *
-write_point_as_box(int dim, char *str)
+static bool
+write_point_as_box(int dim, char *str,
+                   NDBOX **result, struct Node *escontext)
 {
     NDBOX        *bp;
     int            i,
@@ -250,13 +271,20 @@ write_point_as_box(int dim, char *str)
     s = str;
     i = 0;
     if (dim > 0)
-        bp->x[i++] = float8in_internal(s, &endptr, "cube", str, NULL);
+    {
+        bp->x[i++] = float8in_internal(s, &endptr, "cube", str, escontext);
+        if (SOFT_ERROR_OCCURRED(escontext))
+            return false;
+    }
     while ((s = strchr(s, ',')) != NULL)
     {
         s++;
-        bp->x[i++] = float8in_internal(s, &endptr, "cube", str, NULL);
+        bp->x[i++] = float8in_internal(s, &endptr, "cube", str, escontext);
+        if (SOFT_ERROR_OCCURRED(escontext))
+            return false;
     }
     Assert(i == dim);

-    return bp;
+    *result = bp;
+    return true;
 }
diff --git a/contrib/cube/cubescan.l b/contrib/cube/cubescan.l
index 6b316f2d54..49cb699216 100644
--- a/contrib/cube/cubescan.l
+++ b/contrib/cube/cubescan.l
@@ -72,11 +72,13 @@ NaN          [nN][aA][nN]

 /* result and scanbuflen are not used, but Bison expects this signature */
 void
-cube_yyerror(NDBOX **result, Size scanbuflen, const char *message)
+cube_yyerror(NDBOX **result, Size scanbuflen,
+             struct Node *escontext,
+             const char *message)
 {
     if (*yytext == YY_END_OF_BUFFER_CHAR)
     {
-        ereport(ERROR,
+        errsave(escontext,
                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                  errmsg("invalid input syntax for cube"),
                  /* translator: %s is typically "syntax error" */
@@ -84,7 +86,7 @@ cube_yyerror(NDBOX **result, Size scanbuflen, const char *message)
     }
     else
     {
-        ereport(ERROR,
+        errsave(escontext,
                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                  errmsg("invalid input syntax for cube"),
                  /* translator: first %s is typically "syntax error" */
diff --git a/contrib/cube/expected/cube.out b/contrib/cube/expected/cube.out
index 5b89cb1a26..dc23e5ccc0 100644
--- a/contrib/cube/expected/cube.out
+++ b/contrib/cube/expected/cube.out
@@ -325,6 +325,31 @@ SELECT '-1e-700'::cube AS cube; -- out of range
 ERROR:  "-1e-700" is out of range for type double precision
 LINE 1: SELECT '-1e-700'::cube AS cube;
                ^
+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('(1,2)', 'cube');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('[(1),]', 'cube');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_is_valid('-1e-700', 'cube');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_error_message('-1e-700', 'cube');
+               pg_input_error_message
+-----------------------------------------------------
+ "-1e-700" is out of range for type double precision
+(1 row)
+
 --
 -- Testing building cubes from float8 values
 --
diff --git a/contrib/cube/sql/cube.sql b/contrib/cube/sql/cube.sql
index 7f8b2e3979..384883d16e 100644
--- a/contrib/cube/sql/cube.sql
+++ b/contrib/cube/sql/cube.sql
@@ -79,6 +79,12 @@ SELECT '1,2a'::cube AS cube; -- 7
 SELECT '1..2'::cube AS cube; -- 7
 SELECT '-1e-700'::cube AS cube; -- out of range

+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('(1,2)', 'cube');
+SELECT pg_input_is_valid('[(1),]', 'cube');
+SELECT pg_input_is_valid('-1e-700', 'cube');
+SELECT pg_input_error_message('-1e-700', 'cube');
+
 --
 -- Testing building cubes from float8 values
 --
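
For readers following the cube conversion above, here is a minimal standalone sketch of the same pattern: the parser returns a bool, hands its result back through an out-parameter, and reports failures into a caller-supplied context instead of throwing. SoftErrorContext, save_error(), parse_double() and input_is_valid() are hypothetical stand-ins for illustration only; they are not the PostgreSQL ErrorSaveContext, errsave() or pg_input_is_valid() implementations.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for an error-save context (not the PostgreSQL struct). */
typedef struct SoftErrorContext
{
    bool        error_occurred;
    char        message[128];
} SoftErrorContext;

/*
 * Hypothetical stand-in for errsave(): with a context, record the error and
 * return to the caller; with NULL, behave like a hard error and bail out.
 */
static void
save_error(SoftErrorContext *escontext, const char *msg, const char *input)
{
    if (escontext == NULL)
    {
        fprintf(stderr, "ERROR:  %s: \"%s\"\n", msg, input);
        exit(1);
    }
    escontext->error_occurred = true;
    snprintf(escontext->message, sizeof(escontext->message),
             "%s: \"%s\"", msg, input);
}

/*
 * Parse one double.  On success, store it in *result and return true.
 * On failure, report through escontext and return false.
 */
static bool
parse_double(const char *str, SoftErrorContext *escontext, double *result)
{
    char       *endptr;
    double      val;

    assert(str != NULL && result != NULL);

    errno = 0;
    val = strtod(str, &endptr);
    if (endptr == str || *endptr != '\0')
    {
        save_error(escontext, "invalid input syntax", str);
        return false;
    }
    if (errno == ERANGE)
    {
        save_error(escontext, "value out of range", str);
        return false;
    }
    *result = val;
    return true;
}

/* Validity test in the spirit of pg_input_is_valid(): never throws. */
static bool
input_is_valid(const char *str)
{
    SoftErrorContext escontext = {false, ""};
    double      ignored;

    return parse_double(str, &escontext, &ignored);
}
```

A caller that wants the old throwing behavior would pass NULL as the context, which is how the unconverted call sites in the patches behave.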

Re: Error-safe user functions

From

Andrew Dunstan

Date:

09 December 2022, 02:15:42

On 2022-12-08 Th 17:57, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
>> On 2022-12-08 16:00:10 -0500, Robert Haas wrote:
>>> Yes, I think just putting "struct Node;" in as many places as
>>> necessary is the way to go. Or even:
>> +1
> OK, here's a v5 that does it like that.
>
> I've spent a little time pushing ahead on other input functions,
> and realized that my original plan to require a pre-encoded typmod
> for these test functions was not very user-friendly.  So in v5
> you can write something like
>
> pg_input_is_valid('1234.567', 'numeric(7,4)')
>
> 0004 attached finishes up the remaining core numeric datatypes
> (int*, float*, numeric).  I ripped out float8in_internal_opt_error
> in favor of a function that uses the new APIs.


Great, that takes care of some of the relatively urgent work.


>
> 0005 converts contrib/cube, which I chose to tackle partly because
> I'd already touched it in 0004, partly because it seemed like a
> good idea to verify that extension modules wouldn't have any
> problems with this approach, and partly because I wondered whether
> an input function that uses a Bison/Flex parser would have big
> problems getting converted.  This one didn't, anyway.


Cool


>
> Given that this additional experimentation didn't find any holes
> in the API design, I think this is pretty much ready to go.
>
>             


I will look in more detail tomorrow, but it LGTM on a quick look.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Error-safe user functions

From

Andres Freund

Date:

09 December 2022, 02:33:51

Hi,

On 2022-12-08 17:57:09 -0500, Tom Lane wrote:
> Given that this additional experimentation didn't find any holes
> in the API design, I think this is pretty much ready to go.

One interesting area is the timestamp / datetime related code. There have been
some past efforts in the area, mostly in 5bc450629b3. See the RETURN_ERROR macro
in formatting.c.

This is not directly about type input functions, but it looks to me that the
functionality in the patchset should work.

I certainly have the hope that it'll make the code look a bit less ugly...

It looks like a fair bit of work to convert this code, so I don't think we
should tie converting formatting.c to the patchset. But it might be a good
idea for Tom to skim the code to see whether anything there affects the
design.

Greetings,

Andres Freund

Re: Error-safe user functions

From

Tom Lane

Date:

09 December 2022, 02:59:50

Andres Freund <andres@anarazel.de> writes:
> On 2022-12-08 17:57:09 -0500, Tom Lane wrote:
>> Given that this additional experimentation didn't find any holes
>> in the API design, I think this is pretty much ready to go.

> One interesting area is timestamp / datetime related code. There's been some
> past efforts in the area, mostly in 5bc450629b3. See the RETURN_ERROR macro in
> formatting.c.
> This is not directly about type input functions, but it looks to me that the
> functionality in the patchset should work.

Yeah, I was planning to take a look at that before walking away from
this stuff.  (I'm sure not volunteering to convert ALL the input
functions, but I'll do the datetime code.)

You're right that formatting.c is doing stuff that's not exactly
an input function, but I don't see why we can't apply the same
API concepts to it.

            regards, tom lane

Re: Error-safe user functions

From

Andrew Dunstan

Date:

09 December 2022, 13:06:58

On 2022-12-08 Th 21:59, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
>> On 2022-12-08 17:57:09 -0500, Tom Lane wrote:
>>> Given that this additional experimentation didn't find any holes
>>> in the API design, I think this is pretty much ready to go.
>> One interesting area is timestamp / datetime related code. There's been some
>> past efforts in the area, mostly in 5bc450629b3. See the RETURN_ERROR macro in
>> formatting.c.
>> This is not directly about type input functions, but it looks to me that the
>> functionality in the patchset should work.
> Yeah, I was planning to take a look at that before walking away from
> this stuff.  (I'm sure not volunteering to convert ALL the input
> functions, but I'll do the datetime code.)
>

Awesome. Perhaps if there are no more comments you can commit what you
currently have so people can start work on other input functions.


Thanks for your work on this.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Error-safe user functions

From

Tom Lane

Date:

09 December 2022, 15:16:33

Andrew Dunstan <andrew@dunslane.net> writes:
> On 2022-12-08 Th 21:59, Tom Lane wrote:
>> Yeah, I was planning to take a look at that before walking away from
>> this stuff.  (I'm sure not volunteering to convert ALL the input
>> functions, but I'll do the datetime code.)

> Awesome. Perhaps if there are no more comments you can commit what you
> currently have so people can start work on other input functions.

Pushed.  As I said, I'll take a look at the datetime area.  Do we
have any volunteers for other input functions?

            regards, tom lane

Re: Error-safe user functions

From

Andrew Dunstan

Date:

09 December 2022, 15:37:56

On 2022-12-09 Fr 10:16, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> On 2022-12-08 Th 21:59, Tom Lane wrote:
>>> Yeah, I was planning to take a look at that before walking away from
>>> this stuff.  (I'm sure not volunteering to convert ALL the input
>>> functions, but I'll do the datetime code.)
>> Awesome. Perhaps if there are no more comments you can commit what you
>> currently have so people can start work on other input functions.
> Pushed.  


Great!


> As I said, I'll take a look at the datetime area.  Do we
> have any volunteers for other input functions?
>
>             


I am currently looking at the json types. I think that will be enough to
let us rework the sql/json patches as discussed a couple of months ago.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Error-safe user functions

From

Amul Sul

Date:

09 December 2022, 16:16:34

On Fri, Dec 9, 2022 at 9:08 PM Andrew Dunstan <andrew@dunslane.net> wrote:
>
>
> On 2022-12-09 Fr 10:16, Tom Lane wrote:
> > Andrew Dunstan <andrew@dunslane.net> writes:
> >> On 2022-12-08 Th 21:59, Tom Lane wrote:
> >>> Yeah, I was planning to take a look at that before walking away from
> >>> this stuff.  (I'm sure not volunteering to convert ALL the input
> >>> functions, but I'll do the datetime code.)
> >> Awesome. Perhaps if there are no more comments you can commit what you
> >> currently have so people can start work on other input functions.
> > Pushed.
>
>
> Great!
>
>
> > As I said, I'll take a look at the datetime area.  Do we
> > have any volunteers for other input functions?
> >
> >
>
>
> I am currently looking at the json types. I think that will be enough to
> let us rework the sql/json patches as discussed a couple of months ago.
>

I will pick a few other input functions, thanks.

Regards,
Amul

Re: Error-safe user functions

From

Corey Huinker

Date:

09 December 2022, 22:54:24

On Fri, Dec 9, 2022 at 11:17 AM Amul Sul <sulamul@gmail.com> wrote:

> On Fri, Dec 9, 2022 at 9:08 PM Andrew Dunstan <andrew@dunslane.net> wrote:
> >
> >
> > On 2022-12-09 Fr 10:16, Tom Lane wrote:
> > > Andrew Dunstan <andrew@dunslane.net> writes:
> > >> On 2022-12-08 Th 21:59, Tom Lane wrote:
> > >>> Yeah, I was planning to take a look at that before walking away from
> > >>> this stuff. (I'm sure not volunteering to convert ALL the input
> > >>> functions, but I'll do the datetime code.)
> > >> Awesome. Perhaps if there are no more comments you can commit what you
> > >> currently have so people can start work on other input functions.
> > > Pushed.
> >
> >
> > Great!
> >
> >
> > > As I said, I'll take a look at the datetime area. Do we
> > > have any volunteers for other input functions?
> > >
> > >
> >
> >
> > I am currently looking at the json types. I think that will be enough to
> > let us rework the sql/json patches as discussed a couple of months ago.
> >
>
> I will pick a few other input functions, thanks.
>
> Regards,
> Amul

I can do a few as well, as I need them done for the CAST With Default effort.

Amul, please let me know which ones you pick so we don't duplicate work.

Re: Error-safe user functions

From

Tom Lane

Date:

10 December 2022, 01:28:47

Andrew Dunstan <andrew@dunslane.net> writes:
> On 2022-12-09 Fr 10:16, Tom Lane wrote:
>> As I said, I'll take a look at the datetime area.  Do we
>> have any volunteers for other input functions?

> I am currently looking at the json types. I think that will be enough to
> let us rework the sql/json patches as discussed a couple of months ago.

Cool.  I've finished up what I wanted to do with the datetime code.

It occurred to me that we're going to have a bit of a problem
with domain_in.  We can certainly make it pass back any soft
errors from the underlying type's input function, and we can
make it return a soft error if a domain constraint evaluates
to false.  However, what happens if some function in a check
constraint throws an error?  Our only hope of trapping that,
given that it's a general user-defined expression, would be
a subtransaction.  Which is exactly what we don't want here.

I think though that it might be okay to just define this as
Not Our Problem.  Although we don't seem to try to enforce it,
non-immutable domain check constraints are strongly deprecated
(the CREATE DOMAIN man page says that we assume immutability).
And not throwing errors is something that we usually consider
should ride along with immutability.  So I think it might be
okay to say "if you want soft error treatment for a domain,
make sure its check constraints don't throw errors".

Thoughts?

            regards, tom lane
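
The layering described above can be sketched in standalone C, under loudly stated assumptions: SoftErrorContext, soft_error(), base_int_in() and domain_in_sketch() are hypothetical names for illustration, not the real domain_in code. The sketch passes a base-type soft error straight through and turns a false check result into a soft error. A check callback that itself aborts is not trapped by anything here, which is the "Not Our Problem" case.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical error-save context; not the PostgreSQL struct. */
typedef struct SoftErrorContext
{
    bool        error_occurred;
    const char *message;
} SoftErrorContext;

/* A domain check that can only answer true or false. */
typedef bool (*check_fn) (long value);

/* Record a soft error and return false, so callers can "return soft_error(...)". */
static bool
soft_error(SoftErrorContext *escontext, const char *message)
{
    /* This sketch covers only the soft path; a NULL context is not handled. */
    assert(escontext != NULL);
    escontext->error_occurred = true;
    escontext->message = message;
    return false;
}

/* Base type input: parse a long, reporting failures softly. */
static bool
base_int_in(const char *str, SoftErrorContext *escontext, long *result)
{
    char       *endptr;
    long        val;

    errno = 0;
    val = strtol(str, &endptr, 10);
    if (endptr == str || *endptr != '\0')
        return soft_error(escontext, "invalid input syntax for type integer");
    if (errno == ERANGE)
        return soft_error(escontext, "value out of range for type integer");
    *result = val;
    return true;
}

/* Domain input: base type first, then the check constraint. */
static bool
domain_in_sketch(const char *str, check_fn check,
                 SoftErrorContext *escontext, long *result)
{
    long        val;

    /* Pass back any soft error from the underlying type's input function. */
    if (!base_int_in(str, escontext, &val))
        return false;

    /* A constraint that evaluates to false becomes a soft error. */
    if (!check(val))
        return soft_error(escontext, "value violates check constraint");

    *result = val;
    return true;
}

/* Example constraint, comparable to CHECK (value > 0). */
static bool
is_positive(long value)
{
    return value > 0;
}
```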

Re: Error-safe user functions

From

Alvaro Herrera

Date:

10 December 2022, 12:20:13

On 2022-Dec-09, Tom Lane wrote:

> I think though that it might be okay to just define this as
> Not Our Problem.  Although we don't seem to try to enforce it,
> non-immutable domain check constraints are strongly deprecated
> (the CREATE DOMAIN man page says that we assume immutability).
> And not throwing errors is something that we usually consider
> should ride along with immutability.  So I think it might be
> okay to say "if you want soft error treatment for a domain,
> make sure its check constraints don't throw errors".

I think that's fine.  If the user does, say, "CHECK (value > 0)" and that
results in a soft error, that seems to me enough support for now.  If
they want to do something more elaborate, they can write C functions.
Maybe eventually we'll want to offer some other mechanism that doesn't
require C, but let's figure out what the requirements are.  I don't
think we know that, at this point.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"I agree with you that absolute truth does not exist...
The problem is that lies do exist, and you are lying" (G. Lama)

Re: Error-safe user functions

From

Tom Lane

Date:

10 December 2022, 14:20:27

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> On 2022-Dec-09, Tom Lane wrote:
>> ...  So I think it might be
>> okay to say "if you want soft error treatment for a domain,
>> make sure its check constraints don't throw errors".

> I think that's fine.  If the user does, say "CHECK (value > 0)" and that
> results in a soft error, that seems to me enough support for now.  If
> they want to do something more elaborate, they can write C functions.
> Maybe eventually we'll want to offer some other mechanism that doesn't
> require C, but let's figure out what the requirements are.  I don't
> think we know that, at this point.

A fallback we can offer to anyone with such a problem is "write a
plpgsql function and wrap the potentially-failing bit in an exception
block".  Then they get to pay the cost of the subtransaction, while
we're not imposing one on everybody else.

            regards, tom lane

Re: Error-safe user functions

From

Andrew Dunstan

Date:

10 December 2022, 14:35:12

On 2022-12-09 Fr 10:37, Andrew Dunstan wrote:
> I am currently looking at the json types. I think that will be enough to
> let us rework the sql/json patches as discussed a couple of months ago.
>

OK, json is a fairly easy case, see attached. But jsonb is a different
kettle of fish. Both the semantic routines called by the parser and the
subsequent call to JsonbValueToJsonb() can raise errors. These are
pretty much all about breaking various limits (for strings, objects,
arrays). There's also a call to numeric_in, but I assume that a string
that's already parsed as a valid json numeric literal won't upset
numeric_in. Many of these occur several calls down the stack, so
adjusting everything to deal with them would be fairly invasive. Perhaps
we could instead document that this class of input error won't be
trapped, at least for jsonb. We could still test for well-formed jsonb
input, just as I propose for json. That means that we would not be able
to trap one of these errors in the ON ERROR clause of JSON_TABLE. I
think we can probably live with that.

Thoughts?

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachment

0001-adjustments-for-json_in.patch

Re: Error-safe user functions

From

Pavel Stehule

Date:

10 December 2022, 15:43:28

On Sat, Dec 10, 2022 at 15:35, Andrew Dunstan <andrew@dunslane.net> wrote:

On 2022-12-09 Fr 10:37, Andrew Dunstan wrote:
> I am currently looking at the json types. I think that will be enough to
> let us rework the sql/json patches as discussed a couple of months ago.
>

OK, json is a fairly easy case, see attached. But jsonb is a different
kettle of fish. Both the semantic routines called by the parser and the
subsequent call to JsonbValueToJsonb() can raise errors. These are
pretty much all about breaking various limits (for strings, objects,
arrays). There's also a call to numeric_in, but I assume that a string
that's already parsed as a valid json numeric literal won't upset
numeric_in. Many of these occur several calls down the stack, so
adjusting everything to deal with them would be fairly invasive. Perhaps
we could instead document that this class of input error won't be
trapped, at least for jsonb. We could still test for well-formed jsonb
input, just as I propose for json. That means that we would not be able
to trap one of these errors in the ON ERROR clause of JSON_TABLE. I
think we can probably live with that.

Thoughts?

Pavel

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Error-safe user functions

From

Corey Huinker

Date:

10 December 2022, 17:19:59

On Sat, Dec 10, 2022 at 9:20 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > On 2022-Dec-09, Tom Lane wrote:
> >> ... So I think it might be
> >> okay to say "if you want soft error treatment for a domain,
> >> make sure its check constraints don't throw errors".
>
> > I think that's fine. If the user does, say "CHECK (value > 0)" and that
> > results in a soft error, that seems to me enough support for now. If
> > they want to do something more elaborate, they can write C functions.
> > Maybe eventually we'll want to offer some other mechanism that doesn't
> > require C, but let's figure out what the requirements are. I don't
> > think we know that, at this point.
>
> A fallback we can offer to anyone with such a problem is "write a
> plpgsql function and wrap the potentially-failing bit in an exception
> block". Then they get to pay the cost of the subtransaction, while
> we're not imposing one on everybody else.
>
> regards, tom lane

That exception block will prevent parallel plans. I'm not saying it isn't the best way forward for us, but wanted to make that side effect clear.

Re: Error-safe user functions

From

Tom Lane

Date:

10 December 2022, 19:38:38

Andrew Dunstan <andrew@dunslane.net> writes:
> OK, json is a fairly easy case, see attached. But jsonb is a different
> kettle of fish. Both the semantic routines called by the parser and the
> subsequent call to JsonbValueToJsonb() can raise errors. These are
> pretty much all about breaking various limits (for strings, objects,
> arrays). There's also a call to numeric_in, but I assume that a string
> that's already parsed as a valid json numeric literal won't upset
> numeric_in.

Um, nope ...

regression=# select '1e1000000'::jsonb;
ERROR:  value overflows numeric format
LINE 1: select '1e1000000'::jsonb;
               ^

> Many of these occur several calls down the stack, so
> adjusting everything to deal with them would be fairly invasive. Perhaps
> we could instead document that this class of input error won't be
> trapped, at least for jsonb.

Seeing that SQL/JSON is one of the major drivers of this whole project,
it seemed a little sad to me that jsonb couldn't manage to implement
what is required.  So I spent a bit of time poking at it.  Attached
is an extended version of your patch that also covers jsonb.

The main thing I soon realized is that the JsonSemAction API is based
on the assumption that semantic actions will report errors by throwing
them.  This is a bit schizophrenic considering the parser itself carefully
hands back error codes instead of throwing anything (excluding palloc
failures of course).  What I propose in the attached is that we change
that API so that action functions return JsonParseErrorType, and add
an enum value denoting "I already logged a suitable error, so you don't
have to".  It was a little tedious to modify all the existing functions
that way, but not hard.  Only the ones used by jsonb_in need to do
anything except "return JSON_SUCCESS", at least for now.
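
As a rough standalone illustration of that API shape (not the actual JsonSemAction code), the sketch below has a semantic action return a status code, with one code meaning "the action already saved its own error". ParseResult, LimitState, limit_scalar() and parse_tokens() are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes, loosely modeled on the idea of JsonParseErrorType. */
typedef enum ParseResult
{
    PARSE_SUCCESS,
    PARSE_INVALID_TOKEN,        /* detected by the parser itself */
    PARSE_ACTION_FAILED         /* the action already recorded an error */
} ParseResult;

/* Semantic action callback: returns a status instead of throwing. */
typedef ParseResult (*scalar_action) (void *state, const char *token);

/* Per-parse state owned by the caller, including the saved error. */
typedef struct LimitState
{
    size_t      max_len;
    bool        error_saved;
    const char *saved_message;
    size_t      accepted;
} LimitState;

/* Example action: enforce a length limit on each token. */
static ParseResult
limit_scalar(void *pstate, const char *token)
{
    LimitState *state = (LimitState *) pstate;

    assert(state != NULL);
    if (strlen(token) > state->max_len)
    {
        /* Record the details here; the parser only sees the status code. */
        state->error_saved = true;
        state->saved_message = "string too long";
        return PARSE_ACTION_FAILED;
    }
    state->accepted++;
    return PARSE_SUCCESS;
}

/* Toy "parser": stops at the first non-success status and hands it back. */
static ParseResult
parse_tokens(const char *const *tokens, size_t ntokens,
             scalar_action action, void *state)
{
    for (size_t i = 0; i < ntokens; i++)
    {
        ParseResult r;

        if (tokens[i][0] == '\0')
            return PARSE_INVALID_TOKEN;
        r = action(state, tokens[i]);
        if (r != PARSE_SUCCESS)
            return r;
    }
    return PARSE_SUCCESS;
}
```

The point of the extra status code is that the top-level caller can tell "report this parser error now" apart from "an error was already saved, just unwind".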

(I wonder if pg_verifybackup's parse_manifest.c could use a second
look at how it's handling errors, given this API.  I didn't study it
closely.)

I have not done anything here about errors within JsonbValueToJsonb.
There would need to be another round of API-extension in that area
if we want to be able to trap its errors.  As you say, those are mostly
about exceeding implementation size limits, so I suppose one could argue
that they are not so different from palloc failure.  It's still annoying.
If people are good with the changes attached, I might take a look at
that.

            regards, tom lane

diff --git a/src/backend/utils/adt/json.c b/src/backend/utils/adt/json.c
index fee2ffb55c..e6896eccfe 100644
--- a/src/backend/utils/adt/json.c
+++ b/src/backend/utils/adt/json.c
@@ -81,9 +81,10 @@ json_in(PG_FUNCTION_ARGS)

     /* validate it */
     lex = makeJsonLexContext(result, false);
-    pg_parse_json_or_ereport(lex, &nullSemAction);
+    if (!pg_parse_json_or_errsave(lex, &nullSemAction, fcinfo->context))
+        PG_RETURN_NULL();

-    /* Internal representation is the same as text, for now */
+    /* Internal representation is the same as text */
     PG_RETURN_TEXT_P(result);
 }

@@ -1337,7 +1338,7 @@ json_typeof(PG_FUNCTION_ARGS)
     /* Lex exactly one token from the input and check its type. */
     result = json_lex(lex);
     if (result != JSON_SUCCESS)
-        json_ereport_error(result, lex);
+        json_errsave_error(result, lex, NULL);
     tok = lex->token_type;
     switch (tok)
     {
diff --git a/src/backend/utils/adt/jsonb.c b/src/backend/utils/adt/jsonb.c
index 9e14922ec2..7c1e5e6144 100644
--- a/src/backend/utils/adt/jsonb.c
+++ b/src/backend/utils/adt/jsonb.c
@@ -33,6 +33,7 @@ typedef struct JsonbInState
 {
     JsonbParseState *parseState;
     JsonbValue *res;
+    Node       *escontext;
 } JsonbInState;

 /* unlike with json categories, we need to treat json and jsonb differently */
@@ -61,15 +62,15 @@ typedef struct JsonbAggState
     Oid            val_output_func;
 } JsonbAggState;

-static inline Datum jsonb_from_cstring(char *json, int len);
-static size_t checkStringLen(size_t len);
-static void jsonb_in_object_start(void *pstate);
-static void jsonb_in_object_end(void *pstate);
-static void jsonb_in_array_start(void *pstate);
-static void jsonb_in_array_end(void *pstate);
-static void jsonb_in_object_field_start(void *pstate, char *fname, bool isnull);
+static inline Datum jsonb_from_cstring(char *json, int len, Node *escontext);
+static bool checkStringLen(size_t len, Node *escontext);
+static JsonParseErrorType jsonb_in_object_start(void *pstate);
+static JsonParseErrorType jsonb_in_object_end(void *pstate);
+static JsonParseErrorType jsonb_in_array_start(void *pstate);
+static JsonParseErrorType jsonb_in_array_end(void *pstate);
+static JsonParseErrorType jsonb_in_object_field_start(void *pstate, char *fname, bool isnull);
 static void jsonb_put_escaped_value(StringInfo out, JsonbValue *scalarVal);
-static void jsonb_in_scalar(void *pstate, char *token, JsonTokenType tokentype);
+static JsonParseErrorType jsonb_in_scalar(void *pstate, char *token, JsonTokenType tokentype);
 static void jsonb_categorize_type(Oid typoid,
                                   JsonbTypeCategory *tcategory,
                                   Oid *outfuncoid);
@@ -98,7 +99,7 @@ jsonb_in(PG_FUNCTION_ARGS)
 {
     char       *json = PG_GETARG_CSTRING(0);

-    return jsonb_from_cstring(json, strlen(json));
+    return jsonb_from_cstring(json, strlen(json), fcinfo->context);
 }

 /*
@@ -122,7 +123,7 @@ jsonb_recv(PG_FUNCTION_ARGS)
     else
         elog(ERROR, "unsupported jsonb version number %d", version);

-    return jsonb_from_cstring(str, nbytes);
+    return jsonb_from_cstring(str, nbytes, NULL);
 }

 /*
@@ -251,9 +252,12 @@ jsonb_typeof(PG_FUNCTION_ARGS)
  * Turns json string into a jsonb Datum.
  *
  * Uses the json parser (with hooks) to construct a jsonb.
+ *
+ * If escontext points to an ErrorSaveContext, errors are reported there
+ * instead of being thrown.
  */
 static inline Datum
-jsonb_from_cstring(char *json, int len)
+jsonb_from_cstring(char *json, int len, Node *escontext)
 {
     JsonLexContext *lex;
     JsonbInState state;
@@ -263,6 +267,7 @@ jsonb_from_cstring(char *json, int len)
     memset(&sem, 0, sizeof(sem));
     lex = makeJsonLexContextCstringLen(json, len, GetDatabaseEncoding(), true);

+    state.escontext = escontext;
     sem.semstate = (void *) &state;

     sem.object_start = jsonb_in_object_start;
@@ -272,58 +277,67 @@ jsonb_from_cstring(char *json, int len)
     sem.scalar = jsonb_in_scalar;
     sem.object_field_start = jsonb_in_object_field_start;

-    pg_parse_json_or_ereport(lex, &sem);
+    if (!pg_parse_json_or_errsave(lex, &sem, escontext))
+        return (Datum) 0;

     /* after parsing, the item member has the composed jsonb structure */
     PG_RETURN_POINTER(JsonbValueToJsonb(state.res));
 }

-static size_t
-checkStringLen(size_t len)
+static bool
+checkStringLen(size_t len, Node *escontext)
 {
     if (len > JENTRY_OFFLENMASK)
-        ereport(ERROR,
+        ereturn(escontext, false,
                 (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                  errmsg("string too long to represent as jsonb string"),
                  errdetail("Due to an implementation restriction, jsonb strings cannot exceed %d bytes.",
                            JENTRY_OFFLENMASK)));

-    return len;
+    return true;
 }

-static void
+static JsonParseErrorType
 jsonb_in_object_start(void *pstate)
 {
     JsonbInState *_state = (JsonbInState *) pstate;

     _state->res = pushJsonbValue(&_state->parseState, WJB_BEGIN_OBJECT, NULL);
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 jsonb_in_object_end(void *pstate)
 {
     JsonbInState *_state = (JsonbInState *) pstate;

     _state->res = pushJsonbValue(&_state->parseState, WJB_END_OBJECT, NULL);
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 jsonb_in_array_start(void *pstate)
 {
     JsonbInState *_state = (JsonbInState *) pstate;

     _state->res = pushJsonbValue(&_state->parseState, WJB_BEGIN_ARRAY, NULL);
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 jsonb_in_array_end(void *pstate)
 {
     JsonbInState *_state = (JsonbInState *) pstate;

     _state->res = pushJsonbValue(&_state->parseState, WJB_END_ARRAY, NULL);
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 jsonb_in_object_field_start(void *pstate, char *fname, bool isnull)
 {
     JsonbInState *_state = (JsonbInState *) pstate;
@@ -331,10 +345,14 @@ jsonb_in_object_field_start(void *pstate, char *fname, bool isnull)

     Assert(fname != NULL);
     v.type = jbvString;
-    v.val.string.len = checkStringLen(strlen(fname));
+    v.val.string.len = strlen(fname);
+    if (!checkStringLen(v.val.string.len, _state->escontext))
+        return JSON_SEM_ACTION_FAILED;
     v.val.string.val = fname;

     _state->res = pushJsonbValue(&_state->parseState, WJB_KEY, &v);
+
+    return JSON_SUCCESS;
 }

 static void
@@ -367,7 +385,7 @@ jsonb_put_escaped_value(StringInfo out, JsonbValue *scalarVal)
 /*
  * For jsonb we always want the de-escaped value - that's what's in token
  */
-static void
+static JsonParseErrorType
 jsonb_in_scalar(void *pstate, char *token, JsonTokenType tokentype)
 {
     JsonbInState *_state = (JsonbInState *) pstate;
@@ -380,7 +398,9 @@ jsonb_in_scalar(void *pstate, char *token, JsonTokenType tokentype)
         case JSON_TOKEN_STRING:
             Assert(token != NULL);
             v.type = jbvString;
-            v.val.string.len = checkStringLen(strlen(token));
+            v.val.string.len = strlen(token);
+            if (!checkStringLen(v.val.string.len, _state->escontext))
+                return JSON_SEM_ACTION_FAILED;
             v.val.string.val = token;
             break;
         case JSON_TOKEN_NUMBER:
@@ -391,10 +411,11 @@ jsonb_in_scalar(void *pstate, char *token, JsonTokenType tokentype)
              */
             Assert(token != NULL);
             v.type = jbvNumeric;
-            numd = DirectFunctionCall3(numeric_in,
-                                       CStringGetDatum(token),
-                                       ObjectIdGetDatum(InvalidOid),
-                                       Int32GetDatum(-1));
+            if (!DirectInputFunctionCallSafe(numeric_in, token,
+                                             InvalidOid, -1,
+                                             _state->escontext,
+                                             &numd))
+                return JSON_SEM_ACTION_FAILED;
             v.val.numeric = DatumGetNumeric(numd);
             break;
         case JSON_TOKEN_TRUE:
@@ -443,6 +464,8 @@ jsonb_in_scalar(void *pstate, char *token, JsonTokenType tokentype)
                 elog(ERROR, "unexpected parent of nested structure");
         }
     }
+
+    return JSON_SUCCESS;
 }

 /*
@@ -726,6 +749,9 @@ jsonb_categorize_type(Oid typoid,
  *
  * If key_scalar is true, the value is stored as a key, so insist
  * it's of an acceptable type, and force it to be a jbvString.
+ *
+ * Note: currently, we assume that result->escontext is NULL and errors
+ * will be thrown.
  */
 static void
 datum_to_jsonb(Datum val, bool is_null, JsonbInState *result,
@@ -898,7 +924,8 @@ datum_to_jsonb(Datum val, bool is_null, JsonbInState *result,
             default:
                 outputstr = OidOutputFunctionCall(outfuncoid, val);
                 jb.type = jbvString;
-                jb.val.string.len = checkStringLen(strlen(outputstr));
+                jb.val.string.len = strlen(outputstr);
+                (void) checkStringLen(jb.val.string.len, NULL);
                 jb.val.string.val = outputstr;
                 break;
         }
@@ -1636,6 +1663,7 @@ jsonb_agg_finalfn(PG_FUNCTION_ARGS)
      * shallow clone is sufficient as we aren't going to change any of the
      * values, just add the final array end marker.
      */
+    memset(&result, 0, sizeof(JsonbInState));

     result.parseState = clone_parse_state(arg->res->parseState);

@@ -1868,6 +1896,7 @@ jsonb_object_agg_finalfn(PG_FUNCTION_ARGS)
      * going to change any of the values, just add the final object end
      * marker.
      */
+    memset(&result, 0, sizeof(JsonbInState));

     result.parseState = clone_parse_state(arg->res->parseState);

diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c
index bfc3f02a86..463a8fdf23 100644
--- a/src/backend/utils/adt/jsonfuncs.c
+++ b/src/backend/utils/adt/jsonfuncs.c
@@ -25,6 +25,7 @@
 #include "lib/stringinfo.h"
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
+#include "nodes/miscnodes.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
 #include "utils/fmgroids.h"
@@ -336,20 +337,20 @@ typedef struct JsObject
 static int    report_json_context(JsonLexContext *lex);

 /* semantic action functions for json_object_keys */
-static void okeys_object_field_start(void *state, char *fname, bool isnull);
-static void okeys_array_start(void *state);
-static void okeys_scalar(void *state, char *token, JsonTokenType tokentype);
+static JsonParseErrorType okeys_object_field_start(void *state, char *fname, bool isnull);
+static JsonParseErrorType okeys_array_start(void *state);
+static JsonParseErrorType okeys_scalar(void *state, char *token, JsonTokenType tokentype);

 /* semantic action functions for json_get* functions */
-static void get_object_start(void *state);
-static void get_object_end(void *state);
-static void get_object_field_start(void *state, char *fname, bool isnull);
-static void get_object_field_end(void *state, char *fname, bool isnull);
-static void get_array_start(void *state);
-static void get_array_end(void *state);
-static void get_array_element_start(void *state, bool isnull);
-static void get_array_element_end(void *state, bool isnull);
-static void get_scalar(void *state, char *token, JsonTokenType tokentype);
+static JsonParseErrorType get_object_start(void *state);
+static JsonParseErrorType get_object_end(void *state);
+static JsonParseErrorType get_object_field_start(void *state, char *fname, bool isnull);
+static JsonParseErrorType get_object_field_end(void *state, char *fname, bool isnull);
+static JsonParseErrorType get_array_start(void *state);
+static JsonParseErrorType get_array_end(void *state);
+static JsonParseErrorType get_array_element_start(void *state, bool isnull);
+static JsonParseErrorType get_array_element_end(void *state, bool isnull);
+static JsonParseErrorType get_scalar(void *state, char *token, JsonTokenType tokentype);

 /* common worker function for json getter functions */
 static Datum get_path_all(FunctionCallInfo fcinfo, bool as_text);
@@ -359,9 +360,9 @@ static Datum get_jsonb_path_all(FunctionCallInfo fcinfo, bool as_text);
 static text *JsonbValueAsText(JsonbValue *v);

 /* semantic action functions for json_array_length */
-static void alen_object_start(void *state);
-static void alen_scalar(void *state, char *token, JsonTokenType tokentype);
-static void alen_array_element_start(void *state, bool isnull);
+static JsonParseErrorType alen_object_start(void *state);
+static JsonParseErrorType alen_scalar(void *state, char *token, JsonTokenType tokentype);
+static JsonParseErrorType alen_array_element_start(void *state, bool isnull);

 /* common workers for json{b}_each* functions */
 static Datum each_worker(FunctionCallInfo fcinfo, bool as_text);
@@ -369,10 +370,10 @@ static Datum each_worker_jsonb(FunctionCallInfo fcinfo, const char *funcname,
                                bool as_text);

 /* semantic action functions for json_each */
-static void each_object_field_start(void *state, char *fname, bool isnull);
-static void each_object_field_end(void *state, char *fname, bool isnull);
-static void each_array_start(void *state);
-static void each_scalar(void *state, char *token, JsonTokenType tokentype);
+static JsonParseErrorType each_object_field_start(void *state, char *fname, bool isnull);
+static JsonParseErrorType each_object_field_end(void *state, char *fname, bool isnull);
+static JsonParseErrorType each_array_start(void *state);
+static JsonParseErrorType each_scalar(void *state, char *token, JsonTokenType tokentype);

 /* common workers for json{b}_array_elements_* functions */
 static Datum elements_worker(FunctionCallInfo fcinfo, const char *funcname,
@@ -381,44 +382,44 @@ static Datum elements_worker_jsonb(FunctionCallInfo fcinfo, const char *funcname
                                    bool as_text);

 /* semantic action functions for json_array_elements */
-static void elements_object_start(void *state);
-static void elements_array_element_start(void *state, bool isnull);
-static void elements_array_element_end(void *state, bool isnull);
-static void elements_scalar(void *state, char *token, JsonTokenType tokentype);
+static JsonParseErrorType elements_object_start(void *state);
+static JsonParseErrorType elements_array_element_start(void *state, bool isnull);
+static JsonParseErrorType elements_array_element_end(void *state, bool isnull);
+static JsonParseErrorType elements_scalar(void *state, char *token, JsonTokenType tokentype);

 /* turn a json object into a hash table */
 static HTAB *get_json_object_as_hash(char *json, int len, const char *funcname);

 /* semantic actions for populate_array_json */
-static void populate_array_object_start(void *_state);
-static void populate_array_array_end(void *_state);
-static void populate_array_element_start(void *_state, bool isnull);
-static void populate_array_element_end(void *_state, bool isnull);
-static void populate_array_scalar(void *_state, char *token, JsonTokenType tokentype);
+static JsonParseErrorType populate_array_object_start(void *_state);
+static JsonParseErrorType populate_array_array_end(void *_state);
+static JsonParseErrorType populate_array_element_start(void *_state, bool isnull);
+static JsonParseErrorType populate_array_element_end(void *_state, bool isnull);
+static JsonParseErrorType populate_array_scalar(void *_state, char *token, JsonTokenType tokentype);

 /* semantic action functions for get_json_object_as_hash */
-static void hash_object_field_start(void *state, char *fname, bool isnull);
-static void hash_object_field_end(void *state, char *fname, bool isnull);
-static void hash_array_start(void *state);
-static void hash_scalar(void *state, char *token, JsonTokenType tokentype);
+static JsonParseErrorType hash_object_field_start(void *state, char *fname, bool isnull);
+static JsonParseErrorType hash_object_field_end(void *state, char *fname, bool isnull);
+static JsonParseErrorType hash_array_start(void *state);
+static JsonParseErrorType hash_scalar(void *state, char *token, JsonTokenType tokentype);

 /* semantic action functions for populate_recordset */
-static void populate_recordset_object_field_start(void *state, char *fname, bool isnull);
-static void populate_recordset_object_field_end(void *state, char *fname, bool isnull);
-static void populate_recordset_scalar(void *state, char *token, JsonTokenType tokentype);
-static void populate_recordset_object_start(void *state);
-static void populate_recordset_object_end(void *state);
-static void populate_recordset_array_start(void *state);
-static void populate_recordset_array_element_start(void *state, bool isnull);
+static JsonParseErrorType populate_recordset_object_field_start(void *state, char *fname, bool isnull);
+static JsonParseErrorType populate_recordset_object_field_end(void *state, char *fname, bool isnull);
+static JsonParseErrorType populate_recordset_scalar(void *state, char *token, JsonTokenType tokentype);
+static JsonParseErrorType populate_recordset_object_start(void *state);
+static JsonParseErrorType populate_recordset_object_end(void *state);
+static JsonParseErrorType populate_recordset_array_start(void *state);
+static JsonParseErrorType populate_recordset_array_element_start(void *state, bool isnull);

 /* semantic action functions for json_strip_nulls */
-static void sn_object_start(void *state);
-static void sn_object_end(void *state);
-static void sn_array_start(void *state);
-static void sn_array_end(void *state);
-static void sn_object_field_start(void *state, char *fname, bool isnull);
-static void sn_array_element_start(void *state, bool isnull);
-static void sn_scalar(void *state, char *token, JsonTokenType tokentype);
+static JsonParseErrorType sn_object_start(void *state);
+static JsonParseErrorType sn_object_end(void *state);
+static JsonParseErrorType sn_array_start(void *state);
+static JsonParseErrorType sn_array_end(void *state);
+static JsonParseErrorType sn_object_field_start(void *state, char *fname, bool isnull);
+static JsonParseErrorType sn_array_element_start(void *state, bool isnull);
+static JsonParseErrorType sn_scalar(void *state, char *token, JsonTokenType tokentype);

 /* worker functions for populate_record, to_record, populate_recordset and to_recordset */
 static Datum populate_recordset_worker(FunctionCallInfo fcinfo, const char *funcname,
@@ -478,33 +479,43 @@ static void setPathArray(JsonbIterator **it, Datum *path_elems,
                          JsonbValue *newval, uint32 nelems, int op_type);

 /* function supporting iterate_json_values */
-static void iterate_values_scalar(void *state, char *token, JsonTokenType tokentype);
-static void iterate_values_object_field_start(void *state, char *fname, bool isnull);
+static JsonParseErrorType iterate_values_scalar(void *state, char *token, JsonTokenType tokentype);
+static JsonParseErrorType iterate_values_object_field_start(void *state, char *fname, bool isnull);

 /* functions supporting transform_json_string_values */
-static void transform_string_values_object_start(void *state);
-static void transform_string_values_object_end(void *state);
-static void transform_string_values_array_start(void *state);
-static void transform_string_values_array_end(void *state);
-static void transform_string_values_object_field_start(void *state, char *fname, bool isnull);
-static void transform_string_values_array_element_start(void *state, bool isnull);
-static void transform_string_values_scalar(void *state, char *token, JsonTokenType tokentype);
+static JsonParseErrorType transform_string_values_object_start(void *state);
+static JsonParseErrorType transform_string_values_object_end(void *state);
+static JsonParseErrorType transform_string_values_array_start(void *state);
+static JsonParseErrorType transform_string_values_array_end(void *state);
+static JsonParseErrorType transform_string_values_object_field_start(void *state, char *fname, bool isnull);
+static JsonParseErrorType transform_string_values_array_element_start(void *state, bool isnull);
+static JsonParseErrorType transform_string_values_scalar(void *state, char *token, JsonTokenType tokentype);
+

 /*
- * pg_parse_json_or_ereport
+ * pg_parse_json_or_errsave
  *
  * This function is like pg_parse_json, except that it does not return a
  * JsonParseErrorType. Instead, in case of any failure, this function will
+ * save error data into *escontext if that's an ErrorSaveContext, otherwise
  * ereport(ERROR).
+ *
+ * Returns a boolean indicating success or failure (failure will only be
+ * returned when escontext is an ErrorSaveContext).
  */
-void
-pg_parse_json_or_ereport(JsonLexContext *lex, JsonSemAction *sem)
+bool
+pg_parse_json_or_errsave(JsonLexContext *lex, JsonSemAction *sem,
+                         Node *escontext)
 {
     JsonParseErrorType result;

     result = pg_parse_json(lex, sem);
     if (result != JSON_SUCCESS)
-        json_ereport_error(result, lex);
+    {
+        json_errsave_error(result, lex, escontext);
+        return false;
+    }
+    return true;
 }

 /*
@@ -608,17 +619,24 @@ jsonb_object_keys(PG_FUNCTION_ARGS)
  * Report a JSON error.
  */
 void
-json_ereport_error(JsonParseErrorType error, JsonLexContext *lex)
+json_errsave_error(JsonParseErrorType error, JsonLexContext *lex,
+                   Node *escontext)
 {
     if (error == JSON_UNICODE_HIGH_ESCAPE ||
         error == JSON_UNICODE_CODE_POINT_ZERO)
-        ereport(ERROR,
+        errsave(escontext,
                 (errcode(ERRCODE_UNTRANSLATABLE_CHARACTER),
                  errmsg("unsupported Unicode escape sequence"),
                  errdetail_internal("%s", json_errdetail(error, lex)),
                  report_json_context(lex)));
+    else if (error == JSON_SEM_ACTION_FAILED)
+    {
+        /* semantic action function had better have reported something */
+        if (!SOFT_ERROR_OCCURRED(escontext))
+            elog(ERROR, "JSON semantic action function did not provide error information");
+    }
     else
-        ereport(ERROR,
+        errsave(escontext,
                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                  errmsg("invalid input syntax for type %s", "json"),
                  errdetail_internal("%s", json_errdetail(error, lex)),
@@ -745,14 +763,14 @@ json_object_keys(PG_FUNCTION_ARGS)
     SRF_RETURN_DONE(funcctx);
 }

-static void
+static JsonParseErrorType
 okeys_object_field_start(void *state, char *fname, bool isnull)
 {
     OkeysState *_state = (OkeysState *) state;

     /* only collecting keys for the top level object */
     if (_state->lex->lex_level != 1)
-        return;
+        return JSON_SUCCESS;

     /* enlarge result array if necessary */
     if (_state->result_count >= _state->result_size)
@@ -764,9 +782,11 @@ okeys_object_field_start(void *state, char *fname, bool isnull)

     /* save a copy of the field name */
     _state->result[_state->result_count++] = pstrdup(fname);
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 okeys_array_start(void *state)
 {
     OkeysState *_state = (OkeysState *) state;
@@ -777,9 +797,11 @@ okeys_array_start(void *state)
                 (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                  errmsg("cannot call %s on an array",
                         "json_object_keys")));
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 okeys_scalar(void *state, char *token, JsonTokenType tokentype)
 {
     OkeysState *_state = (OkeysState *) state;
@@ -790,6 +812,8 @@ okeys_scalar(void *state, char *token, JsonTokenType tokentype)
                 (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                  errmsg("cannot call %s on a scalar",
                         "json_object_keys")));
+
+    return JSON_SUCCESS;
 }

 /*
@@ -1112,7 +1136,7 @@ get_worker(text *json,
     return state->tresult;
 }

-static void
+static JsonParseErrorType
 get_object_start(void *state)
 {
     GetState   *_state = (GetState *) state;
@@ -1127,9 +1151,11 @@ get_object_start(void *state)
          */
         _state->result_start = _state->lex->token_start;
     }
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 get_object_end(void *state)
 {
     GetState   *_state = (GetState *) state;
@@ -1143,9 +1169,11 @@ get_object_end(void *state)

         _state->tresult = cstring_to_text_with_len(start, len);
     }
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 get_object_field_start(void *state, char *fname, bool isnull)
 {
     GetState   *_state = (GetState *) state;
@@ -1188,9 +1216,11 @@ get_object_field_start(void *state, char *fname, bool isnull)
             _state->result_start = _state->lex->token_start;
         }
     }
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 get_object_field_end(void *state, char *fname, bool isnull)
 {
     GetState   *_state = (GetState *) state;
@@ -1237,9 +1267,11 @@ get_object_field_end(void *state, char *fname, bool isnull)
         /* this should be unnecessary but let's do it for cleanliness: */
         _state->result_start = NULL;
     }
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 get_array_start(void *state)
 {
     GetState   *_state = (GetState *) state;
@@ -1260,7 +1292,7 @@ get_array_start(void *state)

             error = json_count_array_elements(_state->lex, &nelements);
             if (error != JSON_SUCCESS)
-                json_ereport_error(error, _state->lex);
+                json_errsave_error(error, _state->lex, NULL);

             if (-_state->path_indexes[lex_level] <= nelements)
                 _state->path_indexes[lex_level] += nelements;
@@ -1275,9 +1307,11 @@ get_array_start(void *state)
          */
         _state->result_start = _state->lex->token_start;
     }
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 get_array_end(void *state)
 {
     GetState   *_state = (GetState *) state;
@@ -1291,9 +1325,11 @@ get_array_end(void *state)

         _state->tresult = cstring_to_text_with_len(start, len);
     }
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 get_array_element_start(void *state, bool isnull)
 {
     GetState   *_state = (GetState *) state;
@@ -1337,9 +1373,11 @@ get_array_element_start(void *state, bool isnull)
             _state->result_start = _state->lex->token_start;
         }
     }
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 get_array_element_end(void *state, bool isnull)
 {
     GetState   *_state = (GetState *) state;
@@ -1379,9 +1417,11 @@ get_array_element_end(void *state, bool isnull)

         _state->result_start = NULL;
     }
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 get_scalar(void *state, char *token, JsonTokenType tokentype)
 {
     GetState   *_state = (GetState *) state;
@@ -1420,6 +1460,8 @@ get_scalar(void *state, char *token, JsonTokenType tokentype)
         /* make sure the next call to get_scalar doesn't overwrite it */
         _state->next_scalar = false;
     }
+
+    return JSON_SUCCESS;
 }

 Datum
@@ -1834,7 +1876,7 @@ jsonb_array_length(PG_FUNCTION_ARGS)
  * a scalar or an object).
  */

-static void
+static JsonParseErrorType
 alen_object_start(void *state)
 {
     AlenState  *_state = (AlenState *) state;
@@ -1844,9 +1886,11 @@ alen_object_start(void *state)
         ereport(ERROR,
                 (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                  errmsg("cannot get array length of a non-array")));
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 alen_scalar(void *state, char *token, JsonTokenType tokentype)
 {
     AlenState  *_state = (AlenState *) state;
@@ -1856,9 +1900,11 @@ alen_scalar(void *state, char *token, JsonTokenType tokentype)
         ereport(ERROR,
                 (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                  errmsg("cannot get array length of a scalar")));
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 alen_array_element_start(void *state, bool isnull)
 {
     AlenState  *_state = (AlenState *) state;
@@ -1866,6 +1912,8 @@ alen_array_element_start(void *state, bool isnull)
     /* just count up all the level 1 elements */
     if (_state->lex->lex_level == 1)
         _state->count++;
+
+    return JSON_SUCCESS;
 }

 /*
@@ -2026,7 +2074,7 @@ each_worker(FunctionCallInfo fcinfo, bool as_text)
 }


-static void
+static JsonParseErrorType
 each_object_field_start(void *state, char *fname, bool isnull)
 {
     EachState  *_state = (EachState *) state;
@@ -2044,9 +2092,11 @@ each_object_field_start(void *state, char *fname, bool isnull)
         else
             _state->result_start = _state->lex->token_start;
     }
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 each_object_field_end(void *state, char *fname, bool isnull)
 {
     EachState  *_state = (EachState *) state;
@@ -2059,7 +2109,7 @@ each_object_field_end(void *state, char *fname, bool isnull)

     /* skip over nested objects */
     if (_state->lex->lex_level != 1)
-        return;
+        return JSON_SUCCESS;

     /* use the tmp context so we can clean up after each tuple is done */
     old_cxt = MemoryContextSwitchTo(_state->tmp_cxt);
@@ -2090,9 +2140,11 @@ each_object_field_end(void *state, char *fname, bool isnull)
     /* clean up and switch back */
     MemoryContextSwitchTo(old_cxt);
     MemoryContextReset(_state->tmp_cxt);
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 each_array_start(void *state)
 {
     EachState  *_state = (EachState *) state;
@@ -2102,9 +2154,11 @@ each_array_start(void *state)
         ereport(ERROR,
                 (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                  errmsg("cannot deconstruct an array as an object")));
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 each_scalar(void *state, char *token, JsonTokenType tokentype)
 {
     EachState  *_state = (EachState *) state;
@@ -2118,6 +2172,8 @@ each_scalar(void *state, char *token, JsonTokenType tokentype)
     /* supply de-escaped value if required */
     if (_state->next_scalar)
         _state->normalized_scalar = token;
+
+    return JSON_SUCCESS;
 }

 /*
@@ -2268,7 +2324,7 @@ elements_worker(FunctionCallInfo fcinfo, const char *funcname, bool as_text)
     PG_RETURN_NULL();
 }

-static void
+static JsonParseErrorType
 elements_array_element_start(void *state, bool isnull)
 {
     ElementsState *_state = (ElementsState *) state;
@@ -2286,9 +2342,11 @@ elements_array_element_start(void *state, bool isnull)
         else
             _state->result_start = _state->lex->token_start;
     }
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 elements_array_element_end(void *state, bool isnull)
 {
     ElementsState *_state = (ElementsState *) state;
@@ -2301,7 +2359,7 @@ elements_array_element_end(void *state, bool isnull)

     /* skip over nested objects */
     if (_state->lex->lex_level != 1)
-        return;
+        return JSON_SUCCESS;

     /* use the tmp context so we can clean up after each tuple is done */
     old_cxt = MemoryContextSwitchTo(_state->tmp_cxt);
@@ -2330,9 +2388,11 @@ elements_array_element_end(void *state, bool isnull)
     /* clean up and switch back */
     MemoryContextSwitchTo(old_cxt);
     MemoryContextReset(_state->tmp_cxt);
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 elements_object_start(void *state)
 {
     ElementsState *_state = (ElementsState *) state;
@@ -2343,9 +2403,11 @@ elements_object_start(void *state)
                 (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                  errmsg("cannot call %s on a non-array",
                         _state->function_name)));
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 elements_scalar(void *state, char *token, JsonTokenType tokentype)
 {
     ElementsState *_state = (ElementsState *) state;
@@ -2360,6 +2422,8 @@ elements_scalar(void *state, char *token, JsonTokenType tokentype)
     /* supply de-escaped value if required */
     if (_state->next_scalar)
         _state->normalized_scalar = token;
+
+    return JSON_SUCCESS;
 }

 /*
@@ -2508,7 +2572,7 @@ populate_array_element(PopulateArrayContext *ctx, int ndim, JsValue *jsv)
 }

 /* json object start handler for populate_array_json() */
-static void
+static JsonParseErrorType
 populate_array_object_start(void *_state)
 {
     PopulateArrayState *state = (PopulateArrayState *) _state;
@@ -2518,10 +2582,12 @@ populate_array_object_start(void *_state)
         populate_array_assign_ndims(state->ctx, ndim);
     else if (ndim < state->ctx->ndims)
         populate_array_report_expected_array(state->ctx, ndim);
+
+    return JSON_SUCCESS;
 }

 /* json array end handler for populate_array_json() */
-static void
+static JsonParseErrorType
 populate_array_array_end(void *_state)
 {
     PopulateArrayState *state = (PopulateArrayState *) _state;
@@ -2533,10 +2599,12 @@ populate_array_array_end(void *_state)

     if (ndim < ctx->ndims)
         populate_array_check_dimension(ctx, ndim);
+
+    return JSON_SUCCESS;
 }

 /* json array element start handler for populate_array_json() */
-static void
+static JsonParseErrorType
 populate_array_element_start(void *_state, bool isnull)
 {
     PopulateArrayState *state = (PopulateArrayState *) _state;
@@ -2549,10 +2617,12 @@ populate_array_element_start(void *_state, bool isnull)
         state->element_type = state->lex->token_type;
         state->element_scalar = NULL;
     }
+
+    return JSON_SUCCESS;
 }

 /* json array element end handler for populate_array_json() */
-static void
+static JsonParseErrorType
 populate_array_element_end(void *_state, bool isnull)
 {
     PopulateArrayState *state = (PopulateArrayState *) _state;
@@ -2588,10 +2658,12 @@ populate_array_element_end(void *_state, bool isnull)

         populate_array_element(ctx, ndim, &jsv);
     }
+
+    return JSON_SUCCESS;
 }

 /* json scalar handler for populate_array_json() */
-static void
+static JsonParseErrorType
 populate_array_scalar(void *_state, char *token, JsonTokenType tokentype)
 {
     PopulateArrayState *state = (PopulateArrayState *) _state;
@@ -2610,6 +2682,8 @@ populate_array_scalar(void *_state, char *token, JsonTokenType tokentype)
         /* element_type must already be set in populate_array_element_start() */
         Assert(state->element_type == tokentype);
     }
+
+    return JSON_SUCCESS;
 }

 /* parse a json array and populate array */
@@ -3491,13 +3565,13 @@ get_json_object_as_hash(char *json, int len, const char *funcname)
     return tab;
 }

-static void
+static JsonParseErrorType
 hash_object_field_start(void *state, char *fname, bool isnull)
 {
     JHashState *_state = (JHashState *) state;

     if (_state->lex->lex_level > 1)
-        return;
+        return JSON_SUCCESS;

     /* remember token type */
     _state->saved_token_type = _state->lex->token_type;
@@ -3513,9 +3587,11 @@ hash_object_field_start(void *state, char *fname, bool isnull)
         /* must be a scalar */
         _state->save_json_start = NULL;
     }
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 hash_object_field_end(void *state, char *fname, bool isnull)
 {
     JHashState *_state = (JHashState *) state;
@@ -3526,7 +3602,7 @@ hash_object_field_end(void *state, char *fname, bool isnull)
      * Ignore nested fields.
      */
     if (_state->lex->lex_level > 1)
-        return;
+        return JSON_SUCCESS;

     /*
      * Ignore field names >= NAMEDATALEN - they can't match a record field.
@@ -3536,7 +3612,7 @@ hash_object_field_end(void *state, char *fname, bool isnull)
      * has previously insisted on exact equality, so we keep this behavior.)
      */
     if (strlen(fname) >= NAMEDATALEN)
-        return;
+        return JSON_SUCCESS;

     hashentry = hash_search(_state->hash, fname, HASH_ENTER, &found);

@@ -3562,9 +3638,11 @@ hash_object_field_end(void *state, char *fname, bool isnull)
         /* must have had a scalar instead */
         hashentry->val = _state->saved_scalar;
     }
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 hash_array_start(void *state)
 {
     JHashState *_state = (JHashState *) state;
@@ -3573,9 +3651,11 @@ hash_array_start(void *state)
         ereport(ERROR,
                 (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                  errmsg("cannot call %s on an array", _state->function_name)));
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 hash_scalar(void *state, char *token, JsonTokenType tokentype)
 {
     JHashState *_state = (JHashState *) state;
@@ -3591,6 +3671,8 @@ hash_scalar(void *state, char *token, JsonTokenType tokentype)
         /* saved_token_type must already be set in hash_object_field_start() */
         Assert(_state->saved_token_type == tokentype);
     }
+
+    return JSON_SUCCESS;
 }


@@ -3840,7 +3922,7 @@ populate_recordset_worker(FunctionCallInfo fcinfo, const char *funcname,
     PG_RETURN_NULL();
 }

-static void
+static JsonParseErrorType
 populate_recordset_object_start(void *state)
 {
     PopulateRecordsetState *_state = (PopulateRecordsetState *) state;
@@ -3856,7 +3938,7 @@ populate_recordset_object_start(void *state)

     /* Nested objects require no special processing */
     if (lex_level > 1)
-        return;
+        return JSON_SUCCESS;

     /* Object at level 1: set up a new hash table for this object */
     ctl.keysize = NAMEDATALEN;
@@ -3866,9 +3948,11 @@ populate_recordset_object_start(void *state)
                                     100,
                                     &ctl,
                                     HASH_ELEM | HASH_STRINGS | HASH_CONTEXT);
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 populate_recordset_object_end(void *state)
 {
     PopulateRecordsetState *_state = (PopulateRecordsetState *) state;
@@ -3876,7 +3960,7 @@ populate_recordset_object_end(void *state)

     /* Nested objects require no special processing */
     if (_state->lex->lex_level > 1)
-        return;
+        return JSON_SUCCESS;

     obj.is_json = true;
     obj.val.json_hash = _state->json_hash;
@@ -3887,9 +3971,11 @@ populate_recordset_object_end(void *state)
     /* Done with hash for this object */
     hash_destroy(_state->json_hash);
     _state->json_hash = NULL;
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 populate_recordset_array_element_start(void *state, bool isnull)
 {
     PopulateRecordsetState *_state = (PopulateRecordsetState *) state;
@@ -3900,15 +3986,18 @@ populate_recordset_array_element_start(void *state, bool isnull)
                 (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                  errmsg("argument of %s must be an array of objects",
                         _state->function_name)));
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 populate_recordset_array_start(void *state)
 {
     /* nothing to do */
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 populate_recordset_scalar(void *state, char *token, JsonTokenType tokentype)
 {
     PopulateRecordsetState *_state = (PopulateRecordsetState *) state;
@@ -3921,15 +4010,17 @@ populate_recordset_scalar(void *state, char *token, JsonTokenType tokentype)

     if (_state->lex->lex_level == 2)
         _state->saved_scalar = token;
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 populate_recordset_object_field_start(void *state, char *fname, bool isnull)
 {
     PopulateRecordsetState *_state = (PopulateRecordsetState *) state;

     if (_state->lex->lex_level > 2)
-        return;
+        return JSON_SUCCESS;

     _state->saved_token_type = _state->lex->token_type;

@@ -3942,9 +4033,11 @@ populate_recordset_object_field_start(void *state, char *fname, bool isnull)
     {
         _state->save_json_start = NULL;
     }
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 populate_recordset_object_field_end(void *state, char *fname, bool isnull)
 {
     PopulateRecordsetState *_state = (PopulateRecordsetState *) state;
@@ -3955,7 +4048,7 @@ populate_recordset_object_field_end(void *state, char *fname, bool isnull)
      * Ignore nested fields.
      */
     if (_state->lex->lex_level > 2)
-        return;
+        return JSON_SUCCESS;

     /*
      * Ignore field names >= NAMEDATALEN - they can't match a record field.
@@ -3965,7 +4058,7 @@ populate_recordset_object_field_end(void *state, char *fname, bool isnull)
      * has previously insisted on exact equality, so we keep this behavior.)
      */
     if (strlen(fname) >= NAMEDATALEN)
-        return;
+        return JSON_SUCCESS;

     hashentry = hash_search(_state->json_hash, fname, HASH_ENTER, &found);

@@ -3991,6 +4084,8 @@ populate_recordset_object_field_end(void *state, char *fname, bool isnull)
         /* must have had a scalar instead */
         hashentry->val = _state->saved_scalar;
     }
+
+    return JSON_SUCCESS;
 }

 /*
@@ -4002,39 +4097,47 @@ populate_recordset_object_field_end(void *state, char *fname, bool isnull)
  * is called.
  */

-static void
+static JsonParseErrorType
 sn_object_start(void *state)
 {
     StripnullState *_state = (StripnullState *) state;

     appendStringInfoCharMacro(_state->strval, '{');
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 sn_object_end(void *state)
 {
     StripnullState *_state = (StripnullState *) state;

     appendStringInfoCharMacro(_state->strval, '}');
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 sn_array_start(void *state)
 {
     StripnullState *_state = (StripnullState *) state;

     appendStringInfoCharMacro(_state->strval, '[');
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 sn_array_end(void *state)
 {
     StripnullState *_state = (StripnullState *) state;

     appendStringInfoCharMacro(_state->strval, ']');
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 sn_object_field_start(void *state, char *fname, bool isnull)
 {
     StripnullState *_state = (StripnullState *) state;
@@ -4047,7 +4150,7 @@ sn_object_field_start(void *state, char *fname, bool isnull)
          * object or array. The flag will be reset in the scalar action.
          */
         _state->skip_next_null = true;
-        return;
+        return JSON_SUCCESS;
     }

     if (_state->strval->data[_state->strval->len - 1] != '{')
@@ -4060,18 +4163,22 @@ sn_object_field_start(void *state, char *fname, bool isnull)
     escape_json(_state->strval, fname);

     appendStringInfoCharMacro(_state->strval, ':');
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 sn_array_element_start(void *state, bool isnull)
 {
     StripnullState *_state = (StripnullState *) state;

     if (_state->strval->data[_state->strval->len - 1] != '[')
         appendStringInfoCharMacro(_state->strval, ',');
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 sn_scalar(void *state, char *token, JsonTokenType tokentype)
 {
     StripnullState *_state = (StripnullState *) state;
@@ -4080,13 +4187,15 @@ sn_scalar(void *state, char *token, JsonTokenType tokentype)
     {
         Assert(tokentype == JSON_TOKEN_NULL);
         _state->skip_next_null = false;
-        return;
+        return JSON_SUCCESS;
     }

     if (tokentype == JSON_TOKEN_STRING)
         escape_json(_state->strval, token);
     else
         appendStringInfoString(_state->strval, token);
+
+    return JSON_SUCCESS;
 }

 /*
@@ -5326,7 +5435,7 @@ iterate_json_values(text *json, uint32 flags, void *action_state,
  * An auxiliary function for iterate_json_values to invoke a specified
  * JsonIterateStringValuesAction for specified values.
  */
-static void
+static JsonParseErrorType
 iterate_values_scalar(void *state, char *token, JsonTokenType tokentype)
 {
     IterateJsonStringValuesState *_state = (IterateJsonStringValuesState *) state;
@@ -5350,9 +5459,11 @@ iterate_values_scalar(void *state, char *token, JsonTokenType tokentype)
             /* do not call callback for any other token */
             break;
     }
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 iterate_values_object_field_start(void *state, char *fname, bool isnull)
 {
     IterateJsonStringValuesState *_state = (IterateJsonStringValuesState *) state;
@@ -5363,6 +5474,8 @@ iterate_values_object_field_start(void *state, char *fname, bool isnull)

         _state->action(_state->action_state, val, strlen(val));
     }
+
+    return JSON_SUCCESS;
 }

 /*
@@ -5430,7 +5543,6 @@ transform_json_string_values(text *json, void *action_state,
     state->action_state = action_state;

     sem->semstate = (void *) state;
-    sem->scalar = transform_string_values_scalar;
     sem->object_start = transform_string_values_object_start;
     sem->object_end = transform_string_values_object_end;
     sem->array_start = transform_string_values_array_start;
@@ -5449,39 +5561,47 @@ transform_json_string_values(text *json, void *action_state,
  * specified JsonTransformStringValuesAction for all values and left everything
  * else untouched.
  */
-static void
+static JsonParseErrorType
 transform_string_values_object_start(void *state)
 {
     TransformJsonStringValuesState *_state = (TransformJsonStringValuesState *) state;

     appendStringInfoCharMacro(_state->strval, '{');
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 transform_string_values_object_end(void *state)
 {
     TransformJsonStringValuesState *_state = (TransformJsonStringValuesState *) state;

     appendStringInfoCharMacro(_state->strval, '}');
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 transform_string_values_array_start(void *state)
 {
     TransformJsonStringValuesState *_state = (TransformJsonStringValuesState *) state;

     appendStringInfoCharMacro(_state->strval, '[');
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 transform_string_values_array_end(void *state)
 {
     TransformJsonStringValuesState *_state = (TransformJsonStringValuesState *) state;

     appendStringInfoCharMacro(_state->strval, ']');
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 transform_string_values_object_field_start(void *state, char *fname, bool isnull)
 {
     TransformJsonStringValuesState *_state = (TransformJsonStringValuesState *) state;
@@ -5495,18 +5615,22 @@ transform_string_values_object_field_start(void *state, char *fname, bool isnull
      */
     escape_json(_state->strval, fname);
     appendStringInfoCharMacro(_state->strval, ':');
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 transform_string_values_array_element_start(void *state, bool isnull)
 {
     TransformJsonStringValuesState *_state = (TransformJsonStringValuesState *) state;

     if (_state->strval->data[_state->strval->len - 1] != '[')
         appendStringInfoCharMacro(_state->strval, ',');
+
+    return JSON_SUCCESS;
 }

-static void
+static JsonParseErrorType
 transform_string_values_scalar(void *state, char *token, JsonTokenType tokentype)
 {
     TransformJsonStringValuesState *_state = (TransformJsonStringValuesState *) state;
@@ -5519,4 +5643,6 @@ transform_string_values_scalar(void *state, char *token, JsonTokenType tokentype
     }
     else
         appendStringInfoString(_state->strval, token);
+
+    return JSON_SUCCESS;
 }
diff --git a/src/backend/utils/fmgr/fmgr.c b/src/backend/utils/fmgr/fmgr.c
index 0d37f69298..7b28a266ce 100644
--- a/src/backend/utils/fmgr/fmgr.c
+++ b/src/backend/utils/fmgr/fmgr.c
@@ -1614,6 +1614,51 @@ InputFunctionCallSafe(FmgrInfo *flinfo, char *str,
     return true;
 }

+/*
+ * Call a directly-named datatype input function, with non-exception
+ * handling of "soft" errors.
+ *
+ * This is like InputFunctionCallSafe, except that it is given a direct
+ * pointer to the C function to call.  We assume that that function is
+ * strict.  Also, the function cannot be one that needs to
+ * look at FmgrInfo, since there won't be any.
+ */
+bool
+DirectInputFunctionCallSafe(PGFunction func, char *str,
+                            Oid typioparam, int32 typmod,
+                            fmNodePtr escontext,
+                            Datum *result)
+{
+    LOCAL_FCINFO(fcinfo, 3);
+
+    if (str == NULL)
+    {
+        *result = (Datum) 0;    /* just return null result */
+        return true;
+    }
+
+    InitFunctionCallInfoData(*fcinfo, NULL, 3, InvalidOid, escontext, NULL);
+
+    fcinfo->args[0].value = CStringGetDatum(str);
+    fcinfo->args[0].isnull = false;
+    fcinfo->args[1].value = ObjectIdGetDatum(typioparam);
+    fcinfo->args[1].isnull = false;
+    fcinfo->args[2].value = Int32GetDatum(typmod);
+    fcinfo->args[2].isnull = false;
+
+    *result = (*func) (fcinfo);
+
+    /* Result value is garbage, and could be null, if an error was reported */
+    if (SOFT_ERROR_OCCURRED(escontext))
+        return false;
+
+    /* Otherwise, shouldn't get null result */
+    if (fcinfo->isnull)
+        elog(ERROR, "input function %p returned NULL", (void *) func);
+
+    return true;
+}
+
 /*
  * Call a previously-looked-up datatype output function.
  *
diff --git a/src/bin/pg_verifybackup/parse_manifest.c b/src/bin/pg_verifybackup/parse_manifest.c
index 6364b01282..beff018e18 100644
--- a/src/bin/pg_verifybackup/parse_manifest.c
+++ b/src/bin/pg_verifybackup/parse_manifest.c
@@ -88,14 +88,14 @@ typedef struct
     char       *manifest_checksum;
 } JsonManifestParseState;

-static void json_manifest_object_start(void *state);
-static void json_manifest_object_end(void *state);
-static void json_manifest_array_start(void *state);
-static void json_manifest_array_end(void *state);
-static void json_manifest_object_field_start(void *state, char *fname,
-                                             bool isnull);
-static void json_manifest_scalar(void *state, char *token,
-                                 JsonTokenType tokentype);
+static JsonParseErrorType json_manifest_object_start(void *state);
+static JsonParseErrorType json_manifest_object_end(void *state);
+static JsonParseErrorType json_manifest_array_start(void *state);
+static JsonParseErrorType json_manifest_array_end(void *state);
+static JsonParseErrorType json_manifest_object_field_start(void *state, char *fname,
+                                                           bool isnull);
+static JsonParseErrorType json_manifest_scalar(void *state, char *token,
+                                               JsonTokenType tokentype);
 static void json_manifest_finalize_file(JsonManifestParseState *parse);
 static void json_manifest_finalize_wal_range(JsonManifestParseState *parse);
 static void verify_manifest_checksum(JsonManifestParseState *parse,
@@ -162,7 +162,7 @@ json_parse_manifest(JsonManifestParseContext *context, char *buffer,
  * WAL range is also expected to be an object. If we're anywhere else in the
  * document, it's an error.
  */
-static void
+static JsonParseErrorType
 json_manifest_object_start(void *state)
 {
     JsonManifestParseState *parse = state;
@@ -191,6 +191,8 @@ json_manifest_object_start(void *state)
                                         "unexpected object start");
             break;
     }
+
+    return JSON_SUCCESS;
 }

 /*
@@ -201,7 +203,7 @@ json_manifest_object_start(void *state)
  * reach the end of an object representing a particular file or WAL range,
  * we must call json_manifest_finalize_file() to save the associated details.
  */
-static void
+static JsonParseErrorType
 json_manifest_object_end(void *state)
 {
     JsonManifestParseState *parse = state;
@@ -224,6 +226,8 @@ json_manifest_object_end(void *state)
                                         "unexpected object end");
             break;
     }
+
+    return JSON_SUCCESS;
 }

 /*
@@ -233,7 +237,7 @@ json_manifest_object_end(void *state)
  * should be an array. Similarly for the "WAL-Ranges" key. No other arrays
  * are expected.
  */
-static void
+static JsonParseErrorType
 json_manifest_array_start(void *state)
 {
     JsonManifestParseState *parse = state;
@@ -251,6 +255,8 @@ json_manifest_array_start(void *state)
                                         "unexpected array start");
             break;
     }
+
+    return JSON_SUCCESS;
 }

 /*
@@ -258,7 +264,7 @@ json_manifest_array_start(void *state)
  *
  * The cases here are analogous to those in json_manifest_array_start.
  */
-static void
+static JsonParseErrorType
 json_manifest_array_end(void *state)
 {
     JsonManifestParseState *parse = state;
@@ -274,12 +280,14 @@ json_manifest_array_end(void *state)
                                         "unexpected array end");
             break;
     }
+
+    return JSON_SUCCESS;
 }

 /*
  * Invoked at the start of each object field in the JSON document.
  */
-static void
+static JsonParseErrorType
 json_manifest_object_field_start(void *state, char *fname, bool isnull)
 {
     JsonManifestParseState *parse = state;
@@ -367,6 +375,8 @@ json_manifest_object_field_start(void *state, char *fname, bool isnull)
                                         "unexpected object field");
             break;
     }
+
+    return JSON_SUCCESS;
 }

 /*
@@ -384,7 +394,7 @@ json_manifest_object_field_start(void *state, char *fname, bool isnull)
  * reach either the end of the object representing this file, or the end
  * of the manifest, as the case may be.
  */
-static void
+static JsonParseErrorType
 json_manifest_scalar(void *state, char *token, JsonTokenType tokentype)
 {
     JsonManifestParseState *parse = state;
@@ -448,6 +458,8 @@ json_manifest_scalar(void *state, char *token, JsonTokenType tokentype)
             json_manifest_parse_failure(parse->context, "unexpected scalar");
             break;
     }
+
+    return JSON_SUCCESS;
 }

 /*
diff --git a/src/common/jsonapi.c b/src/common/jsonapi.c
index 873357aa02..83c286b89b 100644
--- a/src/common/jsonapi.c
+++ b/src/common/jsonapi.c
@@ -298,9 +298,9 @@ parse_scalar(JsonLexContext *lex, JsonSemAction *sem)
         return result;

     /* invoke the callback */
-    (*sfunc) (sem->semstate, val, tok);
+    result = (*sfunc) (sem->semstate, val, tok);

-    return JSON_SUCCESS;
+    return result;
 }

 static JsonParseErrorType
@@ -335,7 +335,11 @@ parse_object_field(JsonLexContext *lex, JsonSemAction *sem)
     isnull = tok == JSON_TOKEN_NULL;

     if (ostart != NULL)
-        (*ostart) (sem->semstate, fname, isnull);
+    {
+        result = (*ostart) (sem->semstate, fname, isnull);
+        if (result != JSON_SUCCESS)
+            return result;
+    }

     switch (tok)
     {
@@ -352,7 +356,12 @@ parse_object_field(JsonLexContext *lex, JsonSemAction *sem)
         return result;

     if (oend != NULL)
-        (*oend) (sem->semstate, fname, isnull);
+    {
+        result = (*oend) (sem->semstate, fname, isnull);
+        if (result != JSON_SUCCESS)
+            return result;
+    }
+
     return JSON_SUCCESS;
 }

@@ -373,7 +382,11 @@ parse_object(JsonLexContext *lex, JsonSemAction *sem)
 #endif

     if (ostart != NULL)
-        (*ostart) (sem->semstate);
+    {
+        result = (*ostart) (sem->semstate);
+        if (result != JSON_SUCCESS)
+            return result;
+    }

     /*
      * Data inside an object is at a higher nesting level than the object
@@ -417,7 +430,11 @@ parse_object(JsonLexContext *lex, JsonSemAction *sem)
     lex->lex_level--;

     if (oend != NULL)
-        (*oend) (sem->semstate);
+    {
+        result = (*oend) (sem->semstate);
+        if (result != JSON_SUCCESS)
+            return result;
+    }

     return JSON_SUCCESS;
 }
@@ -429,13 +446,16 @@ parse_array_element(JsonLexContext *lex, JsonSemAction *sem)
     json_aelem_action aend = sem->array_element_end;
     JsonTokenType tok = lex_peek(lex);
     JsonParseErrorType result;
-
     bool        isnull;

     isnull = tok == JSON_TOKEN_NULL;

     if (astart != NULL)
-        (*astart) (sem->semstate, isnull);
+    {
+        result = (*astart) (sem->semstate, isnull);
+        if (result != JSON_SUCCESS)
+            return result;
+    }

     /* an array element is any object, array or scalar */
     switch (tok)
@@ -454,7 +474,11 @@ parse_array_element(JsonLexContext *lex, JsonSemAction *sem)
         return result;

     if (aend != NULL)
-        (*aend) (sem->semstate, isnull);
+    {
+        result = (*aend) (sem->semstate, isnull);
+        if (result != JSON_SUCCESS)
+            return result;
+    }

     return JSON_SUCCESS;
 }
@@ -475,7 +499,11 @@ parse_array(JsonLexContext *lex, JsonSemAction *sem)
 #endif

     if (astart != NULL)
-        (*astart) (sem->semstate);
+    {
+        result = (*astart) (sem->semstate);
+        if (result != JSON_SUCCESS)
+            return result;
+    }

     /*
      * Data inside an array is at a higher nesting level than the array
@@ -508,7 +536,11 @@ parse_array(JsonLexContext *lex, JsonSemAction *sem)
     lex->lex_level--;

     if (aend != NULL)
-        (*aend) (sem->semstate);
+    {
+        result = (*aend) (sem->semstate);
+        if (result != JSON_SUCCESS)
+            return result;
+    }

     return JSON_SUCCESS;
 }
@@ -1139,6 +1171,9 @@ json_errdetail(JsonParseErrorType error, JsonLexContext *lex)
             return _("Unicode high surrogate must not follow a high surrogate.");
         case JSON_UNICODE_LOW_SURROGATE:
             return _("Unicode low surrogate must follow a high surrogate.");
+        case JSON_SEM_ACTION_FAILED:
+            /* fall through to the error code after switch */
+            break;
     }

     /*
diff --git a/src/include/common/jsonapi.h b/src/include/common/jsonapi.h
index 8d31630e5c..4590ff2476 100644
--- a/src/include/common/jsonapi.h
+++ b/src/include/common/jsonapi.h
@@ -52,7 +52,8 @@ typedef enum
     JSON_UNICODE_ESCAPE_FORMAT,
     JSON_UNICODE_HIGH_ESCAPE,
     JSON_UNICODE_HIGH_SURROGATE,
-    JSON_UNICODE_LOW_SURROGATE
+    JSON_UNICODE_LOW_SURROGATE,
+    JSON_SEM_ACTION_FAILED        /* error should already be reported */
 } JsonParseErrorType;


@@ -84,14 +85,15 @@ typedef struct JsonLexContext
     StringInfo    strval;
 } JsonLexContext;

-typedef void (*json_struct_action) (void *state);
-typedef void (*json_ofield_action) (void *state, char *fname, bool isnull);
-typedef void (*json_aelem_action) (void *state, bool isnull);
-typedef void (*json_scalar_action) (void *state, char *token, JsonTokenType tokentype);
+typedef JsonParseErrorType (*json_struct_action) (void *state);
+typedef JsonParseErrorType (*json_ofield_action) (void *state, char *fname, bool isnull);
+typedef JsonParseErrorType (*json_aelem_action) (void *state, bool isnull);
+typedef JsonParseErrorType (*json_scalar_action) (void *state, char *token, JsonTokenType tokentype);


 /*
  * Semantic Action structure for use in parsing json.
+ *
  * Any of these actions can be NULL, in which case nothing is done at that
  * point, Likewise, semstate can be NULL. Using an all-NULL structure amounts
  * to doing a pure parse with no side-effects, and is therefore exactly
@@ -100,6 +102,11 @@ typedef void (*json_scalar_action) (void *state, char *token, JsonTokenType toke
  * The 'fname' and 'token' strings passed to these actions are palloc'd.
  * They are not free'd or used further by the parser, so the action function
  * is free to do what it wishes with them.
+ *
+ * All action functions return JsonParseErrorType.  If the result isn't
+ * JSON_SUCCESS, the parse is abandoned and that error code is returned.
+ * If it is JSON_SEM_ACTION_FAILED, the action function is responsible
+ * for having reported the error in some appropriate way.
  */
 typedef struct JsonSemAction
 {
diff --git a/src/include/fmgr.h b/src/include/fmgr.h
index b7832d0aa2..972afe3aff 100644
--- a/src/include/fmgr.h
+++ b/src/include/fmgr.h
@@ -704,6 +704,10 @@ extern bool InputFunctionCallSafe(FmgrInfo *flinfo, char *str,
                                   Oid typioparam, int32 typmod,
                                   fmNodePtr escontext,
                                   Datum *result);
+extern bool DirectInputFunctionCallSafe(PGFunction func, char *str,
+                                        Oid typioparam, int32 typmod,
+                                        fmNodePtr escontext,
+                                        Datum *result);
 extern Datum OidInputFunctionCall(Oid functionId, char *str,
                                   Oid typioparam, int32 typmod);
 extern char *OutputFunctionCall(FmgrInfo *flinfo, Datum val);
diff --git a/src/include/utils/jsonfuncs.h b/src/include/utils/jsonfuncs.h
index 865b2ff7c1..7fad0269f6 100644
--- a/src/include/utils/jsonfuncs.h
+++ b/src/include/utils/jsonfuncs.h
@@ -39,11 +39,16 @@ typedef text *(*JsonTransformStringValuesAction) (void *state, char *elem_value,
 /* build a JsonLexContext from a text datum */
 extern JsonLexContext *makeJsonLexContext(text *json, bool need_escapes);

-/* try to parse json, and ereport(ERROR) on failure */
-extern void pg_parse_json_or_ereport(JsonLexContext *lex, JsonSemAction *sem);
+/* try to parse json, and errsave(escontext) on failure */
+extern bool pg_parse_json_or_errsave(JsonLexContext *lex, JsonSemAction *sem,
+                                     struct Node *escontext);

-/* report an error during json lexing or parsing */
-extern void json_ereport_error(JsonParseErrorType error, JsonLexContext *lex);
+#define pg_parse_json_or_ereport(lex, sem) \
+    (void) pg_parse_json_or_errsave(lex, sem, NULL)
+
+/* save an error during json lexing or parsing */
+extern void json_errsave_error(JsonParseErrorType error, JsonLexContext *lex,
+                               struct Node *escontext);

 extern uint32 parse_jsonb_index_flags(Jsonb *jb);
 extern void iterate_jsonb_values(Jsonb *jb, uint32 flags, void *state,
diff --git a/src/test/regress/expected/json.out b/src/test/regress/expected/json.out
index cb181226e9..af96ce4180 100644
--- a/src/test/regress/expected/json.out
+++ b/src/test/regress/expected/json.out
@@ -320,6 +320,25 @@ LINE 1: SELECT '{
 DETAIL:  Expected JSON value, but found "}".
 CONTEXT:  JSON data, line 4: ...yveryveryveryveryveryveryveryverylongfieldname":}
 -- ERROR missing value for last field
+-- test non-error-throwing input
+select pg_input_is_valid('{"a":true}', 'json');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+select pg_input_is_valid('{"a":true', 'json');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+select pg_input_error_message('{"a":true', 'json');
+       pg_input_error_message
+------------------------------------
+ invalid input syntax for type json
+(1 row)
+
 --constructors
 -- array_to_json
 SELECT array_to_json(array(select 1 as a));
diff --git a/src/test/regress/expected/jsonb.out b/src/test/regress/expected/jsonb.out
index b2b3677482..be85676b5b 100644
--- a/src/test/regress/expected/jsonb.out
+++ b/src/test/regress/expected/jsonb.out
@@ -310,6 +310,31 @@ LINE 1: SELECT '{
 DETAIL:  Expected JSON value, but found "}".
 CONTEXT:  JSON data, line 4: ...yveryveryveryveryveryveryveryverylongfieldname":}
 -- ERROR missing value for last field
+-- test non-error-throwing input
+select pg_input_is_valid('{"a":true}', 'jsonb');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+select pg_input_is_valid('{"a":true', 'jsonb');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+select pg_input_error_message('{"a":true', 'jsonb');
+       pg_input_error_message
+------------------------------------
+ invalid input syntax for type json
+(1 row)
+
+select pg_input_error_message('{"a":1e1000000}', 'jsonb');
+     pg_input_error_message
+--------------------------------
+ value overflows numeric format
+(1 row)
+
 -- make sure jsonb is passed through json generators without being escaped
 SELECT array_to_json(ARRAY [jsonb '{"a":1}', jsonb '{"b":[2,3]}']);
       array_to_json
diff --git a/src/test/regress/sql/json.sql b/src/test/regress/sql/json.sql
index 589e0cea36..21534ed959 100644
--- a/src/test/regress/sql/json.sql
+++ b/src/test/regress/sql/json.sql
@@ -81,6 +81,11 @@ SELECT '{
         "averyveryveryveryveryveryveryveryveryverylongfieldname":}'::json;
 -- ERROR missing value for last field

+-- test non-error-throwing input
+select pg_input_is_valid('{"a":true}', 'json');
+select pg_input_is_valid('{"a":true', 'json');
+select pg_input_error_message('{"a":true', 'json');
+
 --constructors
 -- array_to_json

diff --git a/src/test/regress/sql/jsonb.sql b/src/test/regress/sql/jsonb.sql
index 8d25966267..bc44ad1518 100644
--- a/src/test/regress/sql/jsonb.sql
+++ b/src/test/regress/sql/jsonb.sql
@@ -86,6 +86,12 @@ SELECT '{
         "averyveryveryveryveryveryveryveryveryverylongfieldname":}'::jsonb;
 -- ERROR missing value for last field

+-- test non-error-throwing input
+select pg_input_is_valid('{"a":true}', 'jsonb');
+select pg_input_is_valid('{"a":true', 'jsonb');
+select pg_input_error_message('{"a":true', 'jsonb');
+select pg_input_error_message('{"a":1e1000000}', 'jsonb');
+
 -- make sure jsonb is passed through json generators without being escaped
 SELECT array_to_json(ARRAY [jsonb '{"a":1}', jsonb '{"b":[2,3]}']);

Re: Error-safe user functions

From

Tom Lane

Date:

10 December 2022, 21:01:24

Corey Huinker <corey.huinker@gmail.com> writes:
> On Sat, Dec 10, 2022 at 9:20 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> A fallback we can offer to anyone with such a problem is "write a
>> plpgsql function and wrap the potentially-failing bit in an exception
>> block".  Then they get to pay the cost of the subtransaction, while
>> we're not imposing one on everybody else.

> That exception block will prevent parallel plans. I'm not saying it isn't
> the best way forward for us, but wanted to make that side effect clear.

Hmm.  Apropos of that, I notice that domain_in is marked PARALLEL SAFE,
which seems like a bad idea if it could invoke not-so-parallel-safe
expressions.  Do we need to mark it less safe, and if so how much less?

Anyway, assuming that people are okay with the Not Our Problem approach,
the patch is pretty trivial, as attached.  I started to write an addition
to the CREATE DOMAIN man page recommending that domain CHECK constraints
not throw errors, but couldn't get past the bare recommendation.  Normally
I'd want to explain such a thing along the lines of "For example, X won't
work" ... but we don't yet have any committed features that depend on
this.  I'm inclined to leave it like that for now.  If we don't remember
to fix it once we do have some features, I'm sure somebody will ask a
question about it eventually.

            regards, tom lane
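
To make the behavior of the attached patch easier to picture, here is a minimal standalone sketch of the soft-versus-hard reporting split. It is not PostgreSQL code: SketchErrorContext, sketch_errsave and sketch_positiveint_in are hypothetical names invented for this illustration, and exit() merely stands in for a thrown error. The idea shown is that a failure is recorded in the caller's context when one is supplied, and is otherwise fatal.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a soft-error context; not PostgreSQL's struct. */
typedef struct SketchErrorContext
{
    bool        error_occurred; /* set when a soft error was recorded */
    const char *message;        /* points at a static message string */
} SketchErrorContext;

/*
 * Record the error in ctx if the caller supplied one; otherwise treat it
 * as a hard error (modelled here by printing and exiting).
 */
static void
sketch_errsave(SketchErrorContext *ctx, const char *message)
{
    if (ctx == NULL)
    {
        fprintf(stderr, "ERROR:  %s\n", message);
        exit(EXIT_FAILURE);
    }
    ctx->error_occurred = true;
    ctx->message = message;
}

/*
 * Model of input for a domain over an integer type with CHECK (value > 0).
 * Returns true and sets *result on success; returns false after a soft error.
 */
static bool
sketch_positiveint_in(const char *str, SketchErrorContext *ctx, long *result)
{
    char       *end;
    long        value;

    /* Step 1: the base type's input conversion. */
    errno = 0;
    value = strtol(str, &end, 10);
    if (end == str || *end != '\0' || errno == ERANGE)
    {
        sketch_errsave(ctx, "invalid input syntax");
        return false;
    }

    /* Step 2: the domain constraint, evaluated on the converted value. */
    if (!(value > 0))
    {
        sketch_errsave(ctx, "value violates check constraint");
        return false;
    }

    *result = value;
    return true;
}
```

A caller that wants to trap failures passes a zero-initialized SketchErrorContext and inspects error_occurred afterwards; passing NULL gives the traditional fail-hard behavior. What the sketch cannot trap, just like the patch, is an error raised while evaluating the constraint expression itself, since that code never sees the context.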

diff --git a/doc/src/sgml/ref/create_domain.sgml b/doc/src/sgml/ref/create_domain.sgml
index 82a0b87492..73f9f28d6c 100644
--- a/doc/src/sgml/ref/create_domain.sgml
+++ b/doc/src/sgml/ref/create_domain.sgml
@@ -239,6 +239,11 @@ INSERT INTO tab (domcol) VALUES ((SELECT domcol FROM tab WHERE false));
    DOMAIN</command>), adjust the function definition, and re-add the
    constraint, thereby rechecking it against stored data.
   </para>
+
+  <para>
+   It's also good practice to ensure that domain <literal>CHECK</literal>
+   expressions will not throw errors.
+  </para>
  </refsect1>

  <refsect1>
diff --git a/src/backend/utils/adt/domains.c b/src/backend/utils/adt/domains.c
index 3de0cb01a2..99aeaddb5d 100644
--- a/src/backend/utils/adt/domains.c
+++ b/src/backend/utils/adt/domains.c
@@ -126,9 +126,14 @@ domain_state_setup(Oid domainType, bool binary, MemoryContext mcxt)
  * This is roughly similar to the handling of CoerceToDomain nodes in
  * execExpr*.c, but we execute each constraint separately, rather than
  * compiling them in-line within a larger expression.
+ *
+ * If escontext points to an ErrorSaveContext, any failures are reported
+ * there, otherwise they are ereport'ed.  Note that we do not attempt to do
+ * soft reporting of errors raised during execution of CHECK constraints.
  */
 static void
-domain_check_input(Datum value, bool isnull, DomainIOData *my_extra)
+domain_check_input(Datum value, bool isnull, DomainIOData *my_extra,
+                   Node *escontext)
 {
     ExprContext *econtext = my_extra->econtext;
     ListCell   *l;
@@ -144,11 +149,14 @@ domain_check_input(Datum value, bool isnull, DomainIOData *my_extra)
         {
             case DOM_CONSTRAINT_NOTNULL:
                 if (isnull)
-                    ereport(ERROR,
+                {
+                    errsave(escontext,
                             (errcode(ERRCODE_NOT_NULL_VIOLATION),
                              errmsg("domain %s does not allow null values",
                                     format_type_be(my_extra->domain_type)),
                              errdatatype(my_extra->domain_type)));
+                    goto fail;
+                }
                 break;
             case DOM_CONSTRAINT_CHECK:
                 {
@@ -179,13 +187,16 @@ domain_check_input(Datum value, bool isnull, DomainIOData *my_extra)
                     econtext->domainValue_isNull = isnull;

                     if (!ExecCheck(con->check_exprstate, econtext))
-                        ereport(ERROR,
+                    {
+                        errsave(escontext,
                                 (errcode(ERRCODE_CHECK_VIOLATION),
                                  errmsg("value for domain %s violates check constraint \"%s\"",
                                         format_type_be(my_extra->domain_type),
                                         con->name),
                                  errdomainconstraint(my_extra->domain_type,
                                                      con->name)));
+                        goto fail;
+                    }
                     break;
                 }
             default:
@@ -200,6 +211,7 @@ domain_check_input(Datum value, bool isnull, DomainIOData *my_extra)
      * per-tuple memory.  This avoids leaking non-memory resources, if
      * anything in the expression(s) has any.
      */
+fail:
     if (econtext)
         ReScanExprContext(econtext);
 }
@@ -213,6 +225,7 @@ domain_in(PG_FUNCTION_ARGS)
 {
     char       *string;
     Oid            domainType;
+    Node       *escontext = fcinfo->context;
     DomainIOData *my_extra;
     Datum        value;

@@ -245,15 +258,18 @@ domain_in(PG_FUNCTION_ARGS)
     /*
      * Invoke the base type's typinput procedure to convert the data.
      */
-    value = InputFunctionCall(&my_extra->proc,
-                              string,
-                              my_extra->typioparam,
-                              my_extra->typtypmod);
+    if (!InputFunctionCallSafe(&my_extra->proc,
+                               string,
+                               my_extra->typioparam,
+                               my_extra->typtypmod,
+                               escontext,
+                               &value))
+        PG_RETURN_NULL();

     /*
      * Do the necessary checks to ensure it's a valid domain value.
      */
-    domain_check_input(value, (string == NULL), my_extra);
+    domain_check_input(value, (string == NULL), my_extra, escontext);

     if (string == NULL)
         PG_RETURN_NULL();
@@ -309,7 +325,7 @@ domain_recv(PG_FUNCTION_ARGS)
     /*
      * Do the necessary checks to ensure it's a valid domain value.
      */
-    domain_check_input(value, (buf == NULL), my_extra);
+    domain_check_input(value, (buf == NULL), my_extra, NULL);

     if (buf == NULL)
         PG_RETURN_NULL();
@@ -349,7 +365,7 @@ domain_check(Datum value, bool isnull, Oid domainType,
     /*
      * Do the necessary checks to ensure it's a valid domain value.
      */
-    domain_check_input(value, isnull, my_extra);
+    domain_check_input(value, isnull, my_extra, NULL);
 }

 /*
diff --git a/src/test/regress/expected/domain.out b/src/test/regress/expected/domain.out
index 73b010f6ed..25f6bb9e1f 100644
--- a/src/test/regress/expected/domain.out
+++ b/src/test/regress/expected/domain.out
@@ -87,6 +87,56 @@ drop domain domainvarchar restrict;
 drop domain domainnumeric restrict;
 drop domain domainint4 restrict;
 drop domain domaintext;
+-- Test non-error-throwing input
+create domain positiveint int4 check(value > 0);
+create domain weirdfloat float8 check((1 / value) < 10);
+select pg_input_is_valid('1', 'positiveint');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+select pg_input_is_valid('junk', 'positiveint');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+select pg_input_is_valid('-1', 'positiveint');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+select pg_input_error_message('junk', 'positiveint');
+            pg_input_error_message
+-----------------------------------------------
+ invalid input syntax for type integer: "junk"
+(1 row)
+
+select pg_input_error_message('-1', 'positiveint');
+                           pg_input_error_message
+----------------------------------------------------------------------------
+ value for domain positiveint violates check constraint "positiveint_check"
+(1 row)
+
+select pg_input_error_message('junk', 'weirdfloat');
+                 pg_input_error_message
+--------------------------------------------------------
+ invalid input syntax for type double precision: "junk"
+(1 row)
+
+select pg_input_error_message('0.01', 'weirdfloat');
+                          pg_input_error_message
+--------------------------------------------------------------------------
+ value for domain weirdfloat violates check constraint "weirdfloat_check"
+(1 row)
+
+-- We currently can't trap errors raised in the CHECK expression itself
+select pg_input_error_message('0', 'weirdfloat');
+ERROR:  division by zero
+drop domain positiveint;
+drop domain weirdfloat;
 -- Test domains over array types
 create domain domainint4arr int4[1];
 create domain domainchar4arr varchar(4)[2][3];
diff --git a/src/test/regress/sql/domain.sql b/src/test/regress/sql/domain.sql
index f2ca1fb675..1558bd9a33 100644
--- a/src/test/regress/sql/domain.sql
+++ b/src/test/regress/sql/domain.sql
@@ -69,6 +69,25 @@ drop domain domainint4 restrict;
 drop domain domaintext;


+-- Test non-error-throwing input
+
+create domain positiveint int4 check(value > 0);
+create domain weirdfloat float8 check((1 / value) < 10);
+
+select pg_input_is_valid('1', 'positiveint');
+select pg_input_is_valid('junk', 'positiveint');
+select pg_input_is_valid('-1', 'positiveint');
+select pg_input_error_message('junk', 'positiveint');
+select pg_input_error_message('-1', 'positiveint');
+select pg_input_error_message('junk', 'weirdfloat');
+select pg_input_error_message('0.01', 'weirdfloat');
+-- We currently can't trap errors raised in the CHECK expression itself
+select pg_input_error_message('0', 'weirdfloat');
+
+drop domain positiveint;
+drop domain weirdfloat;
+
+
 -- Test domains over array types

 create domain domainint4arr int4[1];

Re: Error-safe user functions

From

Andrew Dunstan

Date:

10 December 2022, 23:11:35

On 2022-12-10 Sa 14:38, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> OK, json is a fairly easy case, see attached. But jsonb is a different
>> kettle of fish. Both the semantic routines called by the parser and the
>> subsequent call to JsonbValueToJsonb() can raise errors. These are
>> pretty much all about breaking various limits (for strings, objects,
>> arrays). There's also a call to numeric_in, but I assume that a string
>> that's already parsed as a valid json numeric literal won't upset
>> numeric_in.
> Um, nope ...
>
> regression=# select '1e1000000'::jsonb;
> ERROR:  value overflows numeric format
> LINE 1: select '1e1000000'::jsonb;
>                ^


Oops, yeah.


>> Many of these occur several calls down the stack, so
>> adjusting everything to deal with them would be fairly invasive. Perhaps
>> we could instead document that this class of input error won't be
>> trapped, at least for jsonb.
> Seeing that SQL/JSON is one of the major drivers of this whole project,
> it seemed a little sad to me that jsonb couldn't manage to implement
> what is required.  So I spent a bit of time poking at it.  Attached
> is an extended version of your patch that also covers jsonb.
>
> The main thing I soon realized is that the JsonSemAction API is based
> on the assumption that semantic actions will report errors by throwing
> them.  This is a bit schizophrenic considering the parser itself carefully
> hands back error codes instead of throwing anything (excluding palloc
> failures of course).  What I propose in the attached is that we change
> that API so that action functions return JsonParseErrorType, and add
> an enum value denoting "I already logged a suitable error, so you don't
> have to".  It was a little tedious to modify all the existing functions
> that way, but not hard.  Only the ones used by jsonb_in need to do
> anything except "return JSON_SUCCESS", at least for now.


Many thanks for doing this, it looks good.

> I have not done anything here about errors within JsonbValueToJsonb.
> There would need to be another round of API-extension in that area
> if we want to be able to trap its errors.  As you say, those are mostly
> about exceeding implementation size limits, so I suppose one could argue
> that they are not so different from palloc failure.  It's still annoying.
> If people are good with the changes attached, I might take a look at
> that.
>
>             


Awesome.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
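
The API change quoted above (semantic actions return a result code, and one extra code means "I already logged a suitable error, so you don't have to") can be sketched in standalone C. This is only an illustration under assumptions: every `Sketch*`/`sketch_*` name, the length limit, and the message strings are hypothetical and are not the actual JsonSemAction or JsonParseErrorType definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a parser result enum. */
typedef enum
{
    SKETCH_SUCCESS,
    SKETCH_SYNTAX_ERROR,        /* problem detected by the parser itself */
    SKETCH_ACTION_FAILED        /* an action already recorded its own error */
} SketchParseResult;

typedef struct
{
    const char *saved_message;  /* NULL until an action records an error */
    size_t      max_len;        /* arbitrary limit, for demonstration only */
} SketchState;

/* Semantic action: hands back a code instead of throwing. */
static SketchParseResult
sketch_scalar_action(SketchState *state, const char *token)
{
    if (strlen(token) > state->max_len)
    {
        state->saved_message = "string too long";
        return SKETCH_ACTION_FAILED;
    }
    return SKETCH_SUCCESS;
}

/* Parser driver: stops at the first non-success result and returns it. */
static SketchParseResult
sketch_parse(SketchState *state, const char **tokens, size_t ntokens)
{
    assert(state != NULL);
    for (size_t i = 0; i < ntokens; i++)
    {
        SketchParseResult r;

        if (tokens[i][0] == '\0')
            return SKETCH_SYNTAX_ERROR;
        r = sketch_scalar_action(state, tokens[i]);
        if (r != SKETCH_SUCCESS)
            return r;
    }
    return SKETCH_SUCCESS;
}

/* Caller: builds a message only if the action did not already record one. */
static const char *
sketch_error_text(const SketchState *state, SketchParseResult r)
{
    switch (r)
    {
        case SKETCH_SUCCESS:
            return NULL;
        case SKETCH_ACTION_FAILED:
            return state->saved_message;
        case SKETCH_SYNTAX_ERROR:
            return "invalid input syntax";
    }
    return NULL;
}
```

In this shape, actions that cannot fail just return the success code, which matches the remark that only the functions used by jsonb_in need to do anything more.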

Re: Error-safe user functions

From

Tom Lane

Date:

11 December 2022, 00:00:13

Andrew Dunstan <andrew@dunslane.net> writes:
> On 2022-12-10 Sa 14:38, Tom Lane wrote:
>> Seeing that SQL/JSON is one of the major drivers of this whole project,
>> it seemed a little sad to me that jsonb couldn't manage to implement
>> what is required.  So I spent a bit of time poking at it.  Attached
>> is an extended version of your patch that also covers jsonb.

> Many thanks for doing this, it looks good.

Cool, thanks.  Looking at my notes, there's one other loose end
I forgot to mention:

                     * Note: pg_unicode_to_server() will throw an error for a
                     * conversion failure, rather than returning a failure
                     * indication.  That seems OK.

We ought to do something about that, but I'm not sure how hard we
ought to work at it.  Perhaps it's sufficient to make a variant of
pg_unicode_to_server that just returns true/false instead of failing,
and add a JsonParseErrorType for "untranslatable character" to let
json_errdetail return a reasonably on-point message.  We could imagine
extending the ErrorSaveContext infrastructure into the encoding
conversion modules, and maybe at some point that'll be worth doing,
but in this particular context it doesn't seem like we'd be getting
a very much better error message.  The main thing that we would get
from such an extension is a chance to capture the report from
report_untranslatable_char.  But what that adds is the ability to
identify exactly which character couldn't be translated --- and in
this use-case there's always just one.

            regards, tom lane
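
A minimal sketch of the "variant that just returns true/false" idea, in standalone C. It assumes a pretend single-byte server encoding that can only represent code points 1..0xFF; the function names, the enum, and the message text are hypothetical, not the real pg_unicode_to_server or json_errdetail code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum
{
    SKETCH_ESCAPE_OK,
    SKETCH_UNTRANSLATABLE_CHARACTER     /* hypothetical new result code */
} SketchEscapeResult;

/*
 * Hypothetical non-throwing converter.  Returns false instead of raising
 * an error when the code point has no equivalent in the pretend encoding.
 */
static bool
sketch_unicode_to_server_try(uint32_t codepoint, unsigned char *out)
{
    assert(out != NULL);
    if (codepoint == 0 || codepoint > 0xFF)
        return false;
    *out = (unsigned char) codepoint;
    return true;
}

/* Lexer-side helper: maps the boolean failure onto a parser result code. */
static SketchEscapeResult
sketch_handle_unicode_escape(uint32_t codepoint, unsigned char *out)
{
    if (!sketch_unicode_to_server_try(codepoint, out))
        return SKETCH_UNTRANSLATABLE_CHARACTER;
    return SKETCH_ESCAPE_OK;
}

/* Message lookup, so the caller can report something reasonably on-point. */
static const char *
sketch_errdetail(SketchEscapeResult r)
{
    if (r == SKETCH_UNTRANSLATABLE_CHARACTER)
        return "Unicode escape value could not be translated to the server's encoding.";
    return NULL;
}
```

Since one \u escape yields exactly one character, a fixed message loses little compared with capturing the converter's own report, which is the trade-off described above.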

Re: Error-safe user functions

From

Andrew Dunstan

Date:

11 December 2022, 14:35:40

On 2022-12-10 Sa 19:00, Tom Lane wrote:
> Looking at my notes, there's one other loose end
> I forgot to mention:
>
>                      * Note: pg_unicode_to_server() will throw an error for a
>                      * conversion failure, rather than returning a failure
>                      * indication.  That seems OK.
>
> We ought to do something about that, but I'm not sure how hard we
> ought to work at it.  Perhaps it's sufficient to make a variant of
> pg_unicode_to_server that just returns true/false instead of failing,
> and add a JsonParseErrorType for "untranslatable character" to let
> json_errdetail return a reasonably on-point message. 


Seems reasonable.


>  We could imagine
> extending the ErrorSaveContext infrastructure into the encoding
> conversion modules, and maybe at some point that'll be worth doing,
> but in this particular context it doesn't seem like we'd be getting
> a very much better error message.  The main thing that we would get
> from such an extension is a chance to capture the report from
> report_untranslatable_char.  But what that adds is the ability to
> identify exactly which character couldn't be translated --- and in
> this use-case there's always just one.
>
>             


Yeah, probably overkill for now.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Error-safe user functions

From

Tom Lane

Date:

11 December 2022, 17:24:11

Andrew Dunstan <andrew@dunslane.net> writes:
> On 2022-12-10 Sa 14:38, Tom Lane wrote:
>> I have not done anything here about errors within JsonbValueToJsonb.
>> There would need to be another round of API-extension in that area
>> if we want to be able to trap its errors.  As you say, those are mostly
>> about exceeding implementation size limits, so I suppose one could argue
>> that they are not so different from palloc failure.  It's still annoying.
>> If people are good with the changes attached, I might take a look at
>> that.

> Awesome.

I spent some time looking at this, and was discouraged to conclude
that the notational mess would probably be substantially out of
proportion to the value.  The main problem is that we'd have to change
the API of pushJsonbValue, which has more than 150 call sites, most
of which would need to grow a new test for failure return.  Maybe
somebody will feel like tackling that at some point, but not me today.

            regards, tom lane

Re: Error-safe user functions

From

Andrew Dunstan

Date:

11 December 2022, 18:01:36

On 2022-12-11 Su 12:24, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> On 2022-12-10 Sa 14:38, Tom Lane wrote:
>>> I have not done anything here about errors within JsonbValueToJsonb.
>>> There would need to be another round of API-extension in that area
>>> if we want to be able to trap its errors.  As you say, those are mostly
>>> about exceeding implementation size limits, so I suppose one could argue
>>> that they are not so different from palloc failure.  It's still annoying.
>>> If people are good with the changes attached, I might take a look at
>>> that.
>> Awesome.
> I spent some time looking at this, and was discouraged to conclude
> that the notational mess would probably be substantially out of
> proportion to the value.  The main problem is that we'd have to change
> the API of pushJsonbValue, which has more than 150 call sites, most
> of which would need to grow a new test for failure return.  Maybe
> somebody will feel like tackling that at some point, but not me today.
>
>             

Yes, I had similar feelings when I looked at it. I don't think this
needs to hold up proceeding with the SQL/JSON rework, which I think can
reasonably restart now.

Maybe as we work through the remaining input functions (there are about
60 core candidates left on my list) we should mark them with a comment
if no adjustment is needed.

I'm going to look at jsonpath and the text types next. I'm somewhat tied
up this week but might get to relook at pushJsonbValue later in the month.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Error-safe user functions

From

Tom Lane

Date:

11 December 2022, 18:29:57

Andrew Dunstan <andrew@dunslane.net> writes:
> Maybe as we work through the remaining input functions (there are about
> 60 core candidates left on my list) we should mark them with a comment
> if no adjustment is needed.

I did a quick pass through them last night.  Assuming that we don't
need to touch the unimplemented input functions (eg for pseudotypes),
I count these core functions as still needing work:

aclitemin
bit_in
box_in
bpcharin
byteain
cash_in
cidin
cidr_in
circle_in
inet_in
int2vectorin
jsonpath_in
line_in
lseg_in
macaddr8_in
macaddr_in
multirange_in
namein
oidin
oidvectorin
path_in
pg_lsn_in
pg_snapshot_in
point_in
poly_in
range_in
regclassin
regcollationin
regconfigin
regdictionaryin
regnamespacein
regoperatorin
regoperin
regprocedurein
regprocin
regrolein
regtypein
tidin
tsqueryin
tsvectorin
uuid_in
varbit_in
varcharin
xid8in
xidin
xml_in

and these contrib functions:

hstore:
hstore_in
intarray:
bqarr_in
isn:
ean13_in
isbn_in
ismn_in
issn_in
upc_in
ltree:
ltree_in
lquery_in
ltxtq_in
seg:
seg_in

Maybe we should have a conversation about which of these are
highest priority to get to a credible feature.  We clearly need
to fix the remaining SQL-spec types (varchar and bpchar, mainly).
At the other extreme, likely nobody would weep if we never fixed
int2vectorin, for instance.

I'm a little concerned about the cost-benefit of fixing the reg* types.
The ones that accept type names actually use the core grammar to parse
those.  Now, we probably could fix the grammar to be non-throwing, but
it'd be very invasive and I'm not sure about the performance impact.
It might be best to content ourselves with soft reporting of lookup
failures, as opposed to syntax problems.

            regards, tom lane

Re: Error-safe user functions

From

Andres Freund

Date:

11 December 2022, 20:41:21

Hi,

On 2022-12-11 12:24:11 -0500, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
> > On 2022-12-10 Sa 14:38, Tom Lane wrote:
> >> I have not done anything here about errors within JsonbValueToJsonb.
> >> There would need to be another round of API-extension in that area
> >> if we want to be able to trap its errors.  As you say, those are mostly
> >> about exceeding implementation size limits, so I suppose one could argue
> >> that they are not so different from palloc failure.  It's still annoying.
> >> If people are good with the changes attached, I might take a look at
> >> that.
> 
> > Awesome.
> 
> I spent some time looking at this, and was discouraged to conclude
> that the notational mess would probably be substantially out of
> proportion to the value.  The main problem is that we'd have to change
> the API of pushJsonbValue, which has more than 150 call sites, most
> of which would need to grow a new test for failure return.  Maybe
> somebody will feel like tackling that at some point, but not me today.

Could we address this more minimally by putting the error state into the
JsonbParseState and add a check for that error state to convertToJsonb() or
such (by passing in the JsonbParseState)? We'd need to return immediately in
pushJsonbValue() if there's already an error, but that's not too bad.

Greetings,

Andres Freund

Re: Error-safe user functions

From

Tom Lane

Date:

11 December 2022, 21:23:38

Andres Freund <andres@anarazel.de> writes:
> On 2022-12-11 12:24:11 -0500, Tom Lane wrote:
>> I spent some time looking at this, and was discouraged to conclude
>> that the notational mess would probably be substantially out of
>> proportion to the value.  The main problem is that we'd have to change
>> the API of pushJsonbValue, which has more than 150 call sites, most
>> of which would need to grow a new test for failure return.  Maybe
>> somebody will feel like tackling that at some point, but not me today.

> Could we address this more minimally by putting the error state into the
> JsonbParseState and add a check for that error state to convertToJsonb() or
> such (by passing in the JsonbParseState)? We'd need to return immediately in
> pushJsonbValue() if there's already an error, but that's not too bad.

We could shoehorn error state into the JsonbParseState, although the
fact that that stack normally starts out empty is a bit of a problem.
I think you'd have to push a dummy entry if you want soft errors,
store the error state pointer into that, and have pushState() copy
down the parent's error pointer.  Kind of ugly, but do-able.  Whether
it's better than replacing that argument with a pointer-to-struct-
that-includes-the-stack-and-the-error-pointer wasn't real clear to me.

What seemed like a mess was getting the calling code to quit early.
I'm not convinced that just putting an immediate exit into pushJsonbValue
would be enough, because the callers tend to assume a series of calls
will behave as they expect.  Probably some of the call sites could
ignore the issue, but you'd still end up with a lot of messy changes
I fear.

            regards, tom lane
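
The "shoehorn" option can be sketched in standalone C: a dummy base entry carries the error pointer, each pushed level copies it down from its parent, and the value-push routine exits immediately once an error is recorded. This is only a sketch under assumptions; the struct layout, the item limit, and all `sketch_*` names are hypothetical and do not reflect the real JsonbParseState or pushJsonbValue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct SketchError
{
    bool        occurred;
    const char *message;
} SketchError;

typedef struct SketchParseState
{
    struct SketchParseState *next;  /* parent level, or NULL */
    SketchError *err;               /* shared; NULL means errors stay hard */
    size_t      nitems;
} SketchParseState;

#define SKETCH_MAX_ITEMS 2          /* stand-in for an implementation limit */

/* Push a new nesting level, copying down the parent's error pointer. */
static SketchParseState *
sketch_push_state(SketchParseState *parent)
{
    SketchParseState *st = calloc(1, sizeof(*st));

    if (st == NULL)
        abort();                    /* out of memory remains a hard failure */
    st->next = parent;
    st->err = parent ? parent->err : NULL;
    return st;
}

/* Dummy base entry whose only job is to carry the error pointer. */
static SketchParseState *
sketch_begin(SketchError *err)
{
    SketchParseState *base = sketch_push_state(NULL);

    base->err = err;
    return base;
}

/* Add one item; returns false on a soft failure (new or earlier). */
static bool
sketch_push_value(SketchParseState *st)
{
    assert(st != NULL);
    if (st->err != NULL && st->err->occurred)
        return false;               /* earlier failure: do nothing further */
    if (st->nitems >= SKETCH_MAX_ITEMS)
    {
        if (st->err == NULL)
            abort();                /* no soft-error context was supplied */
        st->err->occurred = true;
        st->err->message = "too many items";
        return false;
    }
    st->nitems++;
    return true;
}

/* Release a level and all of its ancestors. */
static void
sketch_free_states(SketchParseState *st)
{
    while (st != NULL)
    {
        SketchParseState *parent = st->next;

        free(st);
        st = parent;
    }
}
```

The sketch also shows the concern raised above: the early exit keeps the builder from doing more work, but every caller that issues a series of pushes still has to look at the return value (or the shared error state) to stop cleanly.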

Re: Error-safe user functions

From

Amul Sul

Date:

13 December 2022, 12:33:19

On Mon, Dec 12, 2022 at 12:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Andrew Dunstan <andrew@dunslane.net> writes:
> > Maybe as we work through the remaining input functions (there are about
> > 60 core candidates left on my list) we should mark them with a comment
> > if no adjustment is needed.
>
> I did a quick pass through them last night.  Assuming that we don't
> need to touch the unimplemented input functions (eg for pseudotypes),
> I count these core functions as still needing work:
>
> aclitemin
> bit_in
> box_in
> bpcharin
> byteain
> cash_in
> cidin
> cidr_in
> circle_in
> inet_in
> int2vectorin
> jsonpath_in
> line_in
> lseg_in
> macaddr8_in
> macaddr_in

Attaching patches changing these functions except bpcharin,
byteain, jsonpath_in, and cidin. I am continuing work on the next
items below:

> multirange_in
> namein
> oidin
> oidvectorin
> path_in
> pg_lsn_in
> pg_snapshot_in
> point_in
> poly_in
> range_in
> regclassin
> regcollationin
> regconfigin
> regdictionaryin
> regnamespacein
> regoperatorin
> regoperin
> regprocedurein
> regprocin
> regrolein
> regtypein
> tidin
> tsqueryin
> tsvectorin
> uuid_in
> varbit_in
> varcharin
> xid8in
> xidin
> xml_in
>
> and these contrib functions:
>
> hstore:
> hstore_in
> intarray:
> bqarr_in
> isn:
> ean13_in
> isbn_in
> ismn_in
> issn_in
> upc_in
> ltree:
> ltree_in
> lquery_in
> ltxtq_in
> seg:
> seg_in
>
> Maybe we should have a conversation about which of these are
> highest priority to get to a credible feature.  We clearly need
> to fix the remaining SQL-spec types (varchar and bpchar, mainly).
> At the other extreme, likely nobody would weep if we never fixed
> int2vectorin, for instance.
>
> I'm a little concerned about the cost-benefit of fixing the reg* types.
> The ones that accept type names actually use the core grammar to parse
> those.  Now, we probably could fix the grammar to be non-throwing, but
> it'd be very invasive and I'm not sure about the performance impact.
> It might be best to content ourselves with soft reporting of lookup
> failures, as opposed to syntax problems.
>

Regards,
Amul

Attachment

Re: Error-safe user functions

From

Amul Sul

Date:

14 December 2022, 12:35:02

On Tue, Dec 13, 2022 at 6:03 PM Amul Sul <sulamul@gmail.com> wrote:
>
> On Mon, Dec 12, 2022 at 12:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Andrew Dunstan <andrew@dunslane.net> writes:
> > > Maybe as we work through the remaining input functions (there are about
> > > 60 core candidates left on my list) we should mark them with a comment
> > > if no adjustment is needed.
> >
> > I did a quick pass through them last night.  Assuming that we don't
> > need to touch the unimplemented input functions (eg for pseudotypes),
> > I count these core functions as still needing work:
> >
> > aclitemin
> > bit_in
> > box_in
> > bpcharin
> > byteain
> > cash_in
> > cidin
> > cidr_in
> > circle_in
> > inet_in
> > int2vectorin
> > jsonpath_in
> > line_in
> > lseg_in
> > macaddr8_in
> > macaddr_in
>
> Attaching patches changing these functions except bpcharin,
> byteain, jsonpath_in, and cidin. I am continuing work on the next
> items below:
>
> > multirange_in
> > namein
> > oidin
> > oidvectorin
> > path_in
> > pg_lsn_in
> > pg_snapshot_in
> > point_in
> > poly_in
> > range_in
> > regclassin
> > regcollationin
> > regconfigin
> > regdictionaryin
> > regnamespacein
> > regoperatorin
> > regoperin
> > regprocedurein
> > regprocin
> > regrolein
> > regtypein
> > tidin
> > tsqueryin
> > tsvectorin
> > uuid_in
> > varbit_in
> > varcharin
> > xid8in
> > xidin

Attaching a complete set of patches changing the functions up to this
point in the list, except bpcharin, byteain, and jsonpath_in, which Andrew
is planning to look into. I have skipped the reg* functions.
The multirange_in and range_in changes are a bit complicated and big; I am
planning to resume work on those and the rest of the items in the list
in the last week of this month, thanks.


> > xml_in
> >
> > and these contrib functions:
> >
> > hstore:
> > hstore_in
> > intarray:
> > bqarr_in
> > isn:
> > ean13_in
> > isbn_in
> > ismn_in
> > issn_in
> > upc_in
> > ltree:
> > ltree_in
> > lquery_in
> > ltxtq_in
> > seg:
> > seg_in
> >
> > Maybe we should have a conversation about which of these are
> > highest priority to get to a credible feature.  We clearly need
> > to fix the remaining SQL-spec types (varchar and bpchar, mainly).
> > At the other extreme, likely nobody would weep if we never fixed
> > int2vectorin, for instance.
> >
> > I'm a little concerned about the cost-benefit of fixing the reg* types.
> > The ones that accept type names actually use the core grammar to parse
> > those.  Now, we probably could fix the grammar to be non-throwing, but
> > it'd be very invasive and I'm not sure about the performance impact.
> > It might be best to content ourselves with soft reporting of lookup
> > failures, as opposed to syntax problems.
> >
>

Regards,
Amul

On 2022-12-14 We 17:37, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Thanks, I have been looking at jsonpath, but I'm not quite sure how to
>> get the escontext argument to the yyerror calls in jsonpath_scan.l. Maybe
>> I need to specify a lex-param setting?
> You want a parse-param option in jsonpath_gram.y, I think; adding that
> will persuade Bison to change the signatures of relevant functions.
> Compare the mods I made in contrib/cube in ccff2d20e.
>
>             


Yeah, I started there, but it's substantially more complex - unlike cube
the jsonpath scanner calls the error routines as well as the parser.


Anyway, here's a patch.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachment

jsonpath_error_free.patch

Re: Error-safe user functions

From

Robert Haas

Date:

19 December 2022, 16:34:39

On Fri, Dec 16, 2022 at 1:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The reg* functions probably need a unified plan as to how far
> down we want to push non-error behavior.  The rest of these
> I think just require turning the crank along the same lines
> as in functions already dealt with.

I would be in favor of an aggressive approach. For example, let's look
at regclassin(). It calls oidin(), stringToQualifiedNameList(),
makeRangeVarFromNameList(), and RangeVarGetRelidExtended(). Basically,
oidin() could fail if the input, known to be all digits, is out of
range; stringToQualifiedNameList() could fail due to mismatched
delimiters or improperly-separated names; makeRangeVarFromNameList()
doesn't want to have more than three name components
(db.schema.relation); and RangeVarGetRelidExtended() doesn't like
cross-database references or non-existent relations.

Now, one option here would be to distinguish between something that
could be valid in some database but isn't in this one, like a
non-existent relation name, and one that couldn't ever work anywhere,
like a relation name with four parts or bad quoting. You could decide
that the former kind of error will be reported softly but the latter
is a hard error. But I think that is presuming that we know how users
will want to use this functionality, and I don't think we do. I also
think that it will be confusing to users. Finally, I think it's
different from what we do for other data types. You could equally well
argue that, for int4in, we ought to treat '9999999999' and 'potato'
differently, one a hard error and the other soft. I think it's hard to
puzzle out a decision that makes any sense there, and I don't think
this case is much different. I don't think it's too hard to mentally
separate errors about the validity of the input from, say, out of
memory errors -- but distinguishing between one kind of input
validity check and another seems like a muddle.

It also doesn't seem too bad from an implementation point of view to
try to cover all the cases. The stickiest case looks to be
RangeVarGetRelidExtended() and we might need to give a bit of thought
to how to handle that one. The others don't seem like a big issue, and
oidin() is already done.

> While it'd be good to get all of these done before v16 feature
> freeze, I can't see that any of them represent blockers for
> building features based on soft input error handling.

+1.

-- 
Robert Haas
EDB: http://www.enterprisedb.com
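
To make the "aggressive" option concrete, here is a small standalone C sketch in which a malformed name and a missing relation are both reported through the same soft path. It is a toy under stated assumptions: no quoting rules, a one-entry pretend catalog, and hypothetical `sketch_*` names and message strings. It does not call or imitate the internals of regclassin, stringToQualifiedNameList, or RangeVarGetRelidExtended.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct
{
    bool        occurred;
    const char *message;
} SketchError;

/* Record a soft error and return false so callers can "return" it. */
static bool
sketch_soft_error(SketchError *err, const char *message)
{
    assert(err != NULL);
    err->occurred = true;
    err->message = message;
    return false;
}

/* Hypothetical catalog lookup: only "pg_class" exists. */
static bool
sketch_relation_exists(const char *name)
{
    return strcmp(name, "pg_class") == 0;
}

/*
 * Validate a dotted relation name.  Structural problems and lookup
 * failures are treated alike: both become soft errors.
 */
static bool
sketch_regclass_check(const char *input, SketchError *err)
{
    size_t      ndots = 0;
    const char *last = input;

    for (const char *p = input; *p != '\0'; p++)
    {
        if (*p == '.')
        {
            ndots++;
            last = p + 1;
        }
    }
    if (ndots > 2)              /* more than db.schema.relation */
        return sketch_soft_error(err, "too many dotted names");
    if (*last == '\0')
        return sketch_soft_error(err, "invalid name syntax");
    if (!sketch_relation_exists(last))
        return sketch_soft_error(err, "relation does not exist");
    return true;
}
```

The alternative discussed in the same message would keep the last check soft and turn the first two into hard errors.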

Re: Error-safe user functions

From

Tom Lane

Date:

19 December 2022, 16:44:37

Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Dec 16, 2022 at 1:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The reg* functions probably need a unified plan as to how far
>> down we want to push non-error behavior.  The rest of these
>> I think just require turning the crank along the same lines
>> as in functions already dealt with.

> I would be in favor of an aggressive approach.

I agree that anything based on implementation concerns is going
to look pretty unprincipled to end users.  However ...

> It also doesn't seem too bad from an implementation point of view to
> try to cover all the cases.

... I guess you didn't read my remarks upthread about regtypein.
I do not want to try to make gram.y+scan.l non-error-throwing.

            regards, tom lane

Re: Error-safe user functions

From

Robert Haas

Date:

19 December 2022, 18:25:19

On Mon, Dec 19, 2022 at 11:44 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > It also doesn't seem too bad from an implementation point of view to
> > try to cover all the cases.
>
> ... I guess you didn't read my remarks upthread about regtypein.
> I do not want to try to make gram.y+scan.l non-error-throwing.

Huh, for some reason I'm not seeing an email about that. Do you have a link?

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Error-safe user functions

From

Tom Lane

Date:

19 December 2022, 21:27:00

Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Dec 19, 2022 at 11:44 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> ... I guess you didn't read my remarks upthread about regtypein.
>> I do not want to try to make gram.y+scan.l non-error-throwing.

> Huh, for some reason I'm not seeing an email about that. Do you have a link?

In [1] I wrote

>>> I'm a little concerned about the cost-benefit of fixing the reg* types.
>>> The ones that accept type names actually use the core grammar to parse
>>> those.  Now, we probably could fix the grammar to be non-throwing, but
>>> it'd be very invasive and I'm not sure about the performance impact.
>>> It might be best to content ourselves with soft reporting of lookup
>>> failures, as opposed to syntax problems.

            regards, tom lane

[1] https://www.postgresql.org/message-id/1863335.1670783397%40sss.pgh.pa.us

Re: Error-safe user functions

From

Robert Haas

Date:

19 December 2022, 22:48:53

On Mon, Dec 19, 2022 at 4:27 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> In [1] I wrote
>
> >>> I'm a little concerned about the cost-benefit of fixing the reg* types.
> >>> The ones that accept type names actually use the core grammar to parse
> >>> those.  Now, we probably could fix the grammar to be non-throwing, but
> >>> it'd be very invasive and I'm not sure about the performance impact.
> >>> It might be best to content ourselves with soft reporting of lookup
> >>> failures, as opposed to syntax problems.

Ah right.  I agree that invading the main grammar doesn't seem
terribly appealing. Setting regtypein aside could be a sensible
choice, then. Another option might be to have some way of parsing type
names outside of the main grammar, which would be more work and would
require keeping things in sync, but perhaps it would end up being less
ugly....

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Error-safe user functions

From

Andrew Dunstan

Date:

21 December 2022, 23:19:52

On 2022-12-18 Su 09:42, Andrew Dunstan wrote:
> On 2022-12-14 We 17:37, Tom Lane wrote:
>> Andrew Dunstan <andrew@dunslane.net> writes:
>>> Thanks, I have been looking at jsonpath, but I'm not quite sure how to
>>> get the escontext argument to the yyerror calls in jsonpath_scan.l. Maybe
>>> I need to specify a lex-param setting?
>> You want a parse-param option in jsonpath_gram.y, I think; adding that
>> will persuade Bison to change the signatures of relevant functions.
>> Compare the mods I made in contrib/cube in ccff2d20e.
>>
>>             
>
> Yeah, I started there, but it's substantially more complex - unlike cube
> the jsonpath scanner calls the error routines as well as the parser.
>
>
> Anyway, here's a patch.
>
>

And here's another for contrib/seg

I'm planning to commit these two in the next day or so.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachment

0001-Provide-error-safety-for-contrib-seg-s-input-functio.patch

Re: Error-safe user functions

From

Tom Lane

Date:

22 December 2022, 06:10:54

Andrew Dunstan <andrew@dunslane.net> writes:
> And here's another for contrib/seg
> I'm planning to commit these two in the next day or so.

I didn't look at the jsonpath one yet.  The seg patch passes
an eyeball check, with one minor nit: in seg_atof,

+    *result = float4in_internal(value, NULL, "real", value, escontext);

don't we want to use "seg" as the type_name?

Even more nitpicky, in

+seg_yyerror(SEG *result, struct Node *escontext, const char *message)
 {
+    if (SOFT_ERROR_OCCURRED(escontext))
+        return;

I'd be inclined to add some explanation, say

+seg_yyerror(SEG *result, struct Node *escontext, const char *message)
 {
+    /* if we already reported an error, don't overwrite it */
+    if (SOFT_ERROR_OCCURRED(escontext))
+        return;

            regards, tom lane
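
The reason for that early return can be shown in a standalone C sketch: if a lower-level routine has already saved a specific error, the parser's generic complaint must not replace it. The struct and function names here are hypothetical; in the patch the check is the SOFT_ERROR_OCCURRED(escontext) test quoted above, applied to a Node pointer rather than to a plain struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

typedef struct
{
    bool occurred;
    char message[64];
} SketchErrorContext;

/* Save an error unconditionally (what a lower-level routine would do). */
static void
sketch_save_error(SketchErrorContext *ctx, const char *message)
{
    assert(ctx != NULL);
    ctx->occurred = true;
    snprintf(ctx->message, sizeof(ctx->message), "%s", message);
}

/* Parser error callback: the first recorded error wins. */
static void
sketch_yyerror(SketchErrorContext *ctx, const char *message)
{
    /* if we already reported an error, don't overwrite it */
    if (ctx->occurred)
        return;
    sketch_save_error(ctx, message);
}
```

For instance, if a number-conversion helper has saved "value out of range" and the grammar then fails with "syntax error", the context still holds "value out of range".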

Re: Error-safe user functions

From

Tom Lane

Date:

22 December 2022, 16:44:00

Andrew Dunstan <andrew@dunslane.net> writes:
> Yeah, I started there, but it's substantially more complex - unlike cube
> the jsonpath scanner calls the error routines as well as the parser.
> Anyway, here's a patch.

I looked through this and it seems generally OK.  A minor nitpick is
that we usually write "(Datum) 0" not "(Datum) NULL" for dont-care Datum
values.  A slightly bigger issue is that makeItemLikeRegex still allows
an error to be thrown from RE_compile_and_cache if a bogus regex is
presented.  But that could be dealt with later.

(I wonder why this is using RE_compile_and_cache at all, really,
rather than some other API.  There doesn't seem to be value in
forcing the regex into the cache at this point.)

            regards, tom lane

Re: Error-safe user functions

From

Andrew Dunstan

Date:

23 December 2022, 14:52:12

On 2022-12-22 Th 01:10, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> And here's another for contrib/seg
>> I'm planning to commit these two in the next day or so.
> I didn't look at the jsonpath one yet.  The seg patch passes
> an eyeball check, with one minor nit: in seg_atof,
>
> +    *result = float4in_internal(value, NULL, "real", value, escontext);
>
> don't we want to use "seg" as the type_name?
>
> Even more nitpicky, in
>
> +seg_yyerror(SEG *result, struct Node *escontext, const char *message)
>  {
> +    if (SOFT_ERROR_OCCURRED(escontext))
> +        return;
>
> I'd be inclined to add some explanation, say
>
> +seg_yyerror(SEG *result, struct Node *escontext, const char *message)
>  {
> +    /* if we already reported an error, don't overwrite it */
> +    if (SOFT_ERROR_OCCURRED(escontext))
> +        return;
>
>             


Thanks for the review.


Fixed both of these and pushed.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Error-safe user functions

From

Andrew Dunstan

Date:

23 December 2022, 17:19:44

On 2022-12-22 Th 11:44, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Yeah, I started there, but it's substantially more complex - unlike cube
>> the jsonpath scanner calls the error routines as well as the parser.
>> Anyway, here's a patch.
> I looked through this and it seems generally OK.  A minor nitpick is
> that we usually write "(Datum) 0" not "(Datum) NULL" for dont-care Datum
> values.  


Fixed in the new version attached.


> A slightly bigger issue is that makeItemLikeRegex still allows
> an error to be thrown from RE_compile_and_cache if a bogus regex is
> presented.  But that could be dealt with later.


I'd rather fix it now while we're paying attention.


>
> (I wonder why this is using RE_compile_and_cache at all, really,
> rather than some other API.  There doesn't seem to be value in
> forcing the regex into the cache at this point.)
>
>             


I agree. The attached uses pg_regcomp instead. I had to lift a couple of
lines from regexp.c, but not too many.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachment

jsonpath_error_free-v2.patch

Re: Error-safe user functions

From

Tom Lane

Date:

23 December 2022, 18:53:24

Andrew Dunstan <andrew@dunslane.net> writes:
> On 2022-12-22 Th 11:44, Tom Lane wrote:
>> (I wonder why this is using RE_compile_and_cache at all, really,
>> rather than some other API.  There doesn't seem to be value in
>> forcing the regex into the cache at this point.)

> I agree. The attached uses pg_regcomp instead. I had to lift a couple of
> lines from regexp.c, but not too many.

LGTM.  No further comments.

            regards, tom lane

Re: Error-safe user functions

From

Ted Yu

Date:

23 December 2022, 21:19:07

On Fri, Dec 23, 2022 at 9:20 AM Andrew Dunstan <andrew@dunslane.net> wrote:

> On 2022-12-22 Th 11:44, Tom Lane wrote:
> > Andrew Dunstan <andrew@dunslane.net> writes:
> >> Yeah, I started there, but it's substantially more complex - unlike cube
> >> the jsonpath scanner calls the error routines as well as the parser.
> >> Anyway, here's a patch.
> > I looked through this and it seems generally OK. A minor nitpick is
> > that we usually write "(Datum) 0" not "(Datum) NULL" for dont-care Datum
> > values.
>
> Fixed in the new version attached.
>
> > A slightly bigger issue is that makeItemLikeRegex still allows
> > an error to be thrown from RE_compile_and_cache if a bogus regex is
> > presented. But that could be dealt with later.
>
> I'd rather fix it now while we're paying attention.
>
> >
> > (I wonder why this is using RE_compile_and_cache at all, really,
> > rather than some other API. There doesn't seem to be value in
> > forcing the regex into the cache at this point.)
> >
> >
>
> I agree. The attached uses pg_regcomp instead. I had to lift a couple of
> lines from regexp.c, but not too many.
>
> cheers
>
> andrew
>
> --
> Andrew Dunstan
> EDB: https://www.enterprisedb.com

Hi,

In makeItemLikeRegex :

+ /* See regexp.c for explanation */
+ CHECK_FOR_INTERRUPTS();
+ pg_regerror(re_result, &re_tmp, errMsg, sizeof(errMsg));

+ ereturn(escontext, false,

Since an error is returned, I wonder if the `CHECK_FOR_INTERRUPTS` call is still necessary.

Cheers

Re: Error-safe user functions

From

Tom Lane

Date:

23 December 2022, 21:22:47

Ted Yu <yuzhihong@gmail.com> writes:
> In makeItemLikeRegex :

> +                       /* See regexp.c for explanation */
> +                       CHECK_FOR_INTERRUPTS();
> +                       pg_regerror(re_result, &re_tmp, errMsg,
> sizeof(errMsg));
> +                       ereturn(escontext, false,

> Since an error is returned, I wonder if the `CHECK_FOR_INTERRUPTS` call is
> still necessary.

Yes, it is.  We don't want a query-cancel transformed into a soft error.

            regards, tom lane

Re: Error-safe user functions

From

Ted Yu

Date:

23 December 2022, 21:25:42

On Fri, Dec 23, 2022 at 1:22 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Ted Yu <yuzhihong@gmail.com> writes:
> > In makeItemLikeRegex :
>
> > + /* See regexp.c for explanation */
> > + CHECK_FOR_INTERRUPTS();
> > + pg_regerror(re_result, &re_tmp, errMsg, sizeof(errMsg));
> > + ereturn(escontext, false,
>
> > Since an error is returned, I wonder if the `CHECK_FOR_INTERRUPTS` call is
> > still necessary.
>
> Yes, it is. We don't want a query-cancel transformed into a soft error.
>
> regards, tom lane

Hi,

`ereturn(escontext` calls appear in multiple places in the patch.

What about the other call sites (with respect to checking for interrupts)?

Cheers

Re: Error-safe user functions

From

Tom Lane

Date:

23 December 2022, 21:38:09

Ted Yu <yuzhihong@gmail.com> writes:
> On Fri, Dec 23, 2022 at 1:22 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Ted Yu <yuzhihong@gmail.com> writes:
>>> +                       /* See regexp.c for explanation */
>>> +                       CHECK_FOR_INTERRUPTS();

>> Yes, it is.  We don't want a query-cancel transformed into a soft error.

> `ereturn(escontext` calls appear in multiple places in the patch.
> What about other callsites (w.r.t. checking interrupt) ?

What about them?  The reason this one is special is that backend/regexp
might return a failure code that's specifically "I gave up because
there's a query cancel pending".  We don't want to report that as a soft
error.  It's true that we might cancel the query for real a bit later on
even if this check weren't here, but that doesn't mean it's okay to go
down the soft error path and hope that there'll be a CHECK_FOR_INTERRUPTS
sometime before there's any visible evidence that we did the wrong thing.
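
A minimal, self-contained sketch of the ordering described above. Everything in it is a hypothetical stand-in (`fake_regcomp`, `cancel_pending`, and the `Outcome` enum are not PostgreSQL APIs); it only illustrates that the pending-cancel check has to run before the failure is turned into a soft error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are PostgreSQL APIs. */
enum { RE_OK = 0, RE_BADPAT = 1, RE_CANCEL = 2 };

typedef enum { OUTCOME_OK, OUTCOME_SOFT_ERROR, OUTCOME_CANCELLED } Outcome;

bool cancel_pending = false;    /* set by a (simulated) signal handler */

/* Simulated regex compiler: gives up with RE_CANCEL if a cancel is pending. */
int
fake_regcomp(const char *pattern)
{
    if (cancel_pending)
        return RE_CANCEL;
    return (pattern[0] == '(') ? RE_BADPAT : RE_OK;    /* toy validity rule */
}

/*
 * Compile a pattern, reporting bad patterns softly.  The pending-cancel
 * check runs before the soft-error path, so a cancel is never reported
 * as an invalid regular expression.
 */
Outcome
compile_softly(const char *pattern)
{
    int rc;

    assert(pattern != NULL);
    rc = fake_regcomp(pattern);
    if (rc != RE_OK)
    {
        if (cancel_pending)         /* plays the role of CHECK_FOR_INTERRUPTS() */
            return OUTCOME_CANCELLED;
        return OUTCOME_SOFT_ERROR;  /* plays the role of ereturn(escontext, ...) */
    }
    return OUTCOME_OK;
}
```

With `cancel_pending` set, `compile_softly` reports a cancellation even for a pattern that would otherwise compile, which is the case the real check guards against.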

            regards, tom lane

Re: Error-safe user functions

From

Ted Yu

Date:

24 December 2022, 09:51:10

On Fri, Dec 23, 2022 at 1:22 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Ted Yu <yuzhihong@gmail.com> writes:
> > In makeItemLikeRegex :
>
> > + /* See regexp.c for explanation */
> > + CHECK_FOR_INTERRUPTS();
> > + pg_regerror(re_result, &re_tmp, errMsg, sizeof(errMsg));
> > + ereturn(escontext, false,
>
> > Since an error is returned, I wonder if the `CHECK_FOR_INTERRUPTS` call is
> > still necessary.
>
> Yes, it is. We don't want a query-cancel transformed into a soft error.
>
> regards, tom lane

Hi,

For this case (`invalid regular expression`), the potential user interruption is one reason for stopping execution.

I feel surfacing user interruption somehow masks the underlying error.

The same regex, without user interruption, would exhibit an `invalid regular expression` error.

I think it would be better to surface the error.

Cheers

Re: Error-safe user functions

From

Andrew Dunstan

Date:

24 December 2022, 12:38:44

On 2022-12-24 Sa 04:51, Ted Yu wrote:
>
>
> On Fri, Dec 23, 2022 at 1:22 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>     Ted Yu <yuzhihong@gmail.com> writes:
>     > In makeItemLikeRegex :
>
>     > +                       /* See regexp.c for explanation */
>     > +                       CHECK_FOR_INTERRUPTS();
>     > +                       pg_regerror(re_result, &re_tmp, errMsg,
>     > sizeof(errMsg));
>     > +                       ereturn(escontext, false,
>
>     > Since an error is returned, I wonder if the
>     `CHECK_FOR_INTERRUPTS` call is
>     > still necessary.
>
>     Yes, it is.  We don't want a query-cancel transformed into a soft
>     error.
>
>                             regards, tom lane
>
> Hi,
> For this case (`invalid regular expression`), the potential user
> interruption is one reason for stopping execution.
> I feel surfacing user interruption somehow masks the underlying error.
>
> The same regex, without user interruption, would exhibit an `invalid
> regular expression` error.
> I think it would be better to surface the error.
>
>

All that this patch is doing is replacing a call to
RE_compile_and_cache, which calls CHECK_FOR_INTERRUPTS, with similar
code, which gives us the opportunity to call ereturn instead of ereport.
Note that where escontext is NULL (the common case), ereturn functions
identically to ereport. So unless you want to argue that the logic in
RE_compile_and_cache is wrong I don't see what we're arguing about. If
instead I had altered the API of RE_compile_and_cache to include an
escontext parameter we wouldn't be having this argument at all. The only
reason I didn't do that was the point Tom quite properly raised about
why we're doing any caching here anyway.
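
The two behaviors of ereturn mentioned above can be sketched with a toy model. This is not the real macro: `SoftErrorContext`, `report_error`, and `parse_nonempty` are hypothetical names, and the hard error is simulated with `setjmp`/`longjmp`.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of a soft-error context; not the real ErrorSaveContext. */
typedef struct SoftErrorContext
{
    bool        error_occurred;
    const char *message;
} SoftErrorContext;

jmp_buf hard_error_jump;        /* where a "thrown" error lands */

/*
 * Report an error.  With a context, record it and return false so the
 * caller can return a dummy value.  Without one, "throw" via longjmp,
 * the toy equivalent of ereport(ERROR, ...).
 */
bool
report_error(SoftErrorContext *ctx, const char *message)
{
    assert(message != NULL);
    if (ctx == NULL)
        longjmp(hard_error_jump, 1);
    ctx->error_occurred = true;
    ctx->message = message;
    return false;
}

/* Toy input function: accepts only non-empty strings. */
bool
parse_nonempty(const char *str, SoftErrorContext *ctx)
{
    if (str[0] == '\0')
        return report_error(ctx, "empty input");
    return true;
}

/* Returns 1 if parse_nonempty threw a hard error, 0 if it returned normally. */
int
threw_hard_error(const char *str, SoftErrorContext *ctx)
{
    if (setjmp(hard_error_jump) != 0)
        return 1;
    (void) parse_nonempty(str, ctx);
    return 0;
}
```

Calling `threw_hard_error("", NULL)` takes the throwing path, while passing a context makes the same failure come back as an ordinary return with `error_occurred` set.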

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Error-safe user functions

From

Andrew Dunstan

Date:

24 December 2022, 12:51:25

On 2022-12-23 Fr 13:53, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> On 2022-12-22 Th 11:44, Tom Lane wrote:
>>> (I wonder why this is using RE_compile_and_cache at all, really,
>>> rather than some other API.  There doesn't seem to be value in
>>> forcing the regex into the cache at this point.)
>> I agree. The attached uses pg_regcomp instead. I had to lift a couple of
>> lines from regexp.c, but not too many.
> LGTM.  No further comments.
>
>             

As I was giving this a final polish I noticed this in jspConvertRegexFlags:

    /*
     * We'll never need sub-match details at execution.  While
     * RE_compile_and_execute would set this flag anyway, force it on here to
     * ensure that the regex cache entries created by makeItemLikeRegex are
     * useful.
     */
    cflags |= REG_NOSUB;

Clearly the comment would no longer be true. I guess I should just
remove this?

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Error-safe user functions

From

Ted Yu

Date:

24 December 2022, 14:28:45

On Sat, Dec 24, 2022 at 4:38 AM Andrew Dunstan <andrew@dunslane.net> wrote:

> On 2022-12-24 Sa 04:51, Ted Yu wrote:
> >
> >
> > On Fri, Dec 23, 2022 at 1:22 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > > Ted Yu <yuzhihong@gmail.com> writes:
> > > > In makeItemLikeRegex :
> > >
> > > > + /* See regexp.c for explanation */
> > > > + CHECK_FOR_INTERRUPTS();
> > > > + pg_regerror(re_result, &re_tmp, errMsg, sizeof(errMsg));
> > > > + ereturn(escontext, false,
> > >
> > > > Since an error is returned, I wonder if the `CHECK_FOR_INTERRUPTS` call is
> > > > still necessary.
> > >
> > > Yes, it is. We don't want a query-cancel transformed into a soft
> > > error.
> > >
> > > regards, tom lane
> >
> > Hi,
> > For this case (`invalid regular expression`), the potential user
> > interruption is one reason for stopping execution.
> > I feel surfacing user interruption somehow masks the underlying error.
> >
> > The same regex, without user interruption, would exhibit an `invalid
> > regular expression` error.
> > I think it would be better to surface the error.
> >
> >
>
> All that this patch is doing is replacing a call to
> RE_compile_and_cache, which calls CHECK_FOR_INTERRUPTS, with similar
> code, which gives us the opportunity to call ereturn instead of ereport.
> Note that where escontext is NULL (the common case), ereturn functions
> identically to ereport. So unless you want to argue that the logic in
> RE_compile_and_cache is wrong I don't see what we're arguing about. If
> instead I had altered the API of RE_compile_and_cache to include an
> escontext parameter we wouldn't be having this argument at all. The only
> reason I didn't do that was the point Tom quite properly raised about
> why we're doing any caching here anyway.
>
> cheers
>
> andrew
>
> --
> Andrew Dunstan
> EDB: https://www.enterprisedb.com

Andrew:

Thanks for the response.

Re: Error-safe user functions

From

Tom Lane

Date:

24 December 2022, 15:42:35

Andrew Dunstan <andrew@dunslane.net> writes:
> As I was giving this a final polish I noticed this in jspConvertRegexFlags:

>     /*
>      * We'll never need sub-match details at execution.  While
>      * RE_compile_and_execute would set this flag anyway, force it on here to
>      * ensure that the regex cache entries created by makeItemLikeRegex are
>      * useful.
>      */
>     cflags |= REG_NOSUB;

> Clearly the comment would no longer be true. I guess I should just
> remove this?

Yeah, we can just drop that I guess.  I'm slightly worried that we might
need it again after some future refactoring; but it's not really worth
devising a re-worded comment to justify keeping it.

Also, I realized that I failed in my reviewerly duty by not noticing
that you'd forgotten to pg_regfree the regex after successful
compilation.  Running something like this exposes the memory leak
very quickly:

select pg_input_is_valid('$ ? (@ like_regex "pattern" flag "smixq")', 'jsonpath')
  from generate_series(1,10000000);

The attached delta patch takes care of it.  (Per comment at pg_regcomp,
we don't need this after a failure return.)
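
The same compile-then-free discipline can be sketched with the POSIX regex API in place of pg_regcomp and pg_regfree. The helper name `pattern_is_valid` is hypothetical, and the sketch assumes that POSIX `regcomp`, like pg_regcomp, leaves nothing for the caller to free after a failure return.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if pattern compiles as a POSIX extended regex.
 * On failure, a message is written to errbuf (if errbuf_size > 0).
 */
bool
pattern_is_valid(const char *pattern, char *errbuf, size_t errbuf_size)
{
    regex_t re;
    int     rc;

    assert(pattern != NULL);
    rc = regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB);
    if (rc != 0)
    {
        /* Compilation failed: fetch the message; nothing to free here. */
        regerror(rc, &re, errbuf, errbuf_size);
        return false;
    }

    /* Compilation succeeded: release the compiled form before returning. */
    regfree(&re);
    return true;
}
```

Dropping the `regfree` call would leak one compiled regex per successful validation, which is the kind of growth the generate_series query above makes visible.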

            regards, tom lane

diff --git a/src/backend/utils/adt/jsonpath_gram.y b/src/backend/utils/adt/jsonpath_gram.y
index 8c3a0c7623..30179408f5 100644
--- a/src/backend/utils/adt/jsonpath_gram.y
+++ b/src/backend/utils/adt/jsonpath_gram.y
@@ -560,6 +560,8 @@ makeItemLikeRegex(JsonPathParseItem *expr, JsonPathString *pattern,
                     (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
                      errmsg("invalid regular expression: %s", errMsg)));
         }
+
+        pg_regfree(&re_tmp);
     }

     *result = v;

Re: Error-safe user functions

From

Tom Lane

Date:

24 December 2022, 15:48:11

Ted Yu <yuzhihong@gmail.com> writes:
> On Fri, Dec 23, 2022 at 1:22 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Yes, it is.  We don't want a query-cancel transformed into a soft error.

> The same regex, without user interruption, would exhibit an `invalid
> regular expression` error.

On what grounds do you claim that?  The timing of arrival of the SIGINT
is basically chance --- it might happen while we're inside backend/regex,
or not.  I mean, sure you could claim that a bad regex might run a long
time and thereby be more likely to cause the user to issue a query
cancel, but that's a stretched line of reasoning.

            regards, tom lane

Re: Error-safe user functions

From

Andrew Dunstan

Date:

24 December 2022, 20:23:38

On 2022-12-24 Sa 10:42, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> As I was giving this a final polish I noticed this in jspConvertRegexFlags:
>>     /*
>>      * We'll never need sub-match details at execution.  While
>>      * RE_compile_and_execute would set this flag anyway, force it on here to
>>      * ensure that the regex cache entries created by makeItemLikeRegex are
>>      * useful.
>>      */
>>     cflags |= REG_NOSUB;
>> Clearly the comment would no longer be true. I guess I should just
>> remove this?
> Yeah, we can just drop that I guess.  I'm slightly worried that we might
> need it again after some future refactoring; but it's not really worth
> devising a re-worded comment to justify keeping it.
>
> Also, I realized that I failed in my reviewerly duty by not noticing
> that you'd forgotten to pg_regfree the regex after successful
> compilation.  Running something like this exposes the memory leak
> very quickly:
>
> select pg_input_is_valid('$ ? (@ like_regex "pattern" flag "smixq")', 'jsonpath')
>   from generate_series(1,10000000);
>
> The attached delta patch takes care of it.  (Per comment at pg_regcomp,
> we don't need this after a failure return.)
>
>             


Thanks, pushed with those changes.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Error-safe user functions

From

Tom Lane

Date:

25 December 2022, 17:13:26

Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Dec 16, 2022 at 1:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The reg* functions probably need a unified plan as to how far
>> down we want to push non-error behavior.

> I would be in favor of an aggressive approach.

Here's a proposed patch for converting regprocin and friends
to soft error reporting.  I'll say at the outset that it's an
engineering compromise, and it may be worth going further in
future.  But I doubt it's worth doing more than this for v16,
because the next steps would be pretty invasive.

I've converted all the errors thrown directly within regproc.c,
and also converted parseTypeString, typeStringToTypeName, and
stringToQualifiedNameList to report their own errors softly.
This affected some outside callers, but not so many of them
that I think it's worth inventing compatibility wrappers.

I dealt with lookup failures by just changing the input functions
to call the respective lookup functions with missing_ok = true,
and then throw their own error softly on failure.

Also, I've changed to_regproc() and friends to return NULL
in exactly the same cases that are now soft errors for the
input functions.  Previously they were a bit inconsistent
about what triggered hard errors vs. returning NULL.
(Perhaps we should go further than this, and convert all these
functions to just be DirectInputFunctionCallSafe wrappers
around the corresponding input functions?  That would save
some duplicative code, but I've not done it here.)
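
A rough sketch of the shape described in the last three paragraphs, using toy stand-ins rather than the real catalog code: the lookup is called quietly, the input function raises its own soft error, and the to_reg variant is a thin wrapper over that input function. `SoftErrorContext`, `lookup_relation`, `regclass_in`, and `to_regclass_sketch` are hypothetical names, and the one-entry catalog is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef unsigned int Oid;       /* local stand-in, not PostgreSQL's header */
#define InvalidOid ((Oid) 0)

/* Hypothetical miniature of a soft-error context. */
typedef struct SoftErrorContext
{
    bool        error_occurred;
    const char *message;
} SoftErrorContext;

/* Toy catalog lookup: with missing_ok it returns InvalidOid on a miss. */
Oid
lookup_relation(const char *name, bool missing_ok)
{
    assert(missing_ok);         /* the sketch only models the quiet path */
    if (strcmp(name, "pg_class") == 0)
        return 1259;
    return InvalidOid;
}

/*
 * Toy input function: look up quietly, then report the failure softly.
 * Returns false on failure; *result is valid only when true is returned.
 */
bool
regclass_in(const char *name, SoftErrorContext *ctx, Oid *result)
{
    Oid oid = lookup_relation(name, true);

    if (oid == InvalidOid)
    {
        ctx->error_occurred = true;
        ctx->message = "relation does not exist";
        return false;
    }
    *result = oid;
    return true;
}

/* Toy to_regclass(): the same input function, with failure mapped to "NULL". */
bool
to_regclass_sketch(const char *name, Oid *result)
{
    SoftErrorContext ctx = {false, NULL};

    return regclass_in(name, &ctx, result);    /* false means SQL NULL */
}
```

Because the wrapper shares the input function's code path, the set of inputs that yield NULL is by construction the same as the set that yields a soft error.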

What's not fixed here:

1. As previously discussed, parse errors in type names are
thrown by the main grammar, so getting those to not be
hard errors seems like too big a lift for today.

2. Errors about invalid type modifiers (reported by
typenameTypeMod or type-specific typmodin routines) are not
trapped either.  Fixing this would require extending the
soft-error conventions to typmodin routines, which maybe will
be worth doing someday but it seems pretty far down the
priority list.  Specifying a typmod is surely not main-line
usage for regtypein.

3. Throwing our own error has the demerit that it might be
different from what the underlying lookup function would have
reported.  This is visible in some changes in existing
regression test cases, such as

-ERROR:  schema "ng_catalog" does not exist
+ERROR:  relation "ng_catalog.pg_class" does not exist

This isn't wrong, exactly, but the loss of specificity is
a bit annoying.

4. This still fails to trap errors about "too many dotted names"
and "cross-database references are not implemented", which are
thrown in DeconstructQualifiedName, LookupTypeName,
RangeVarGetRelid, and maybe some other places.

5. We also don't trap errors about "the schema exists but
you don't have USAGE permission to do a lookup in it",
because LookupExplicitNamespace still throws that even
when passed missing_ok = true.

The obvious way to fix #3,#4,#5 is to change pretty much all
of the catalog lookup infrastructure to deal in escontext
arguments instead of "missing_ok" booleans.  That might be
worth doing --- it'd have benefits beyond the immediate
problem, I think --- but I feel it's a bigger lift than we
want to undertake for v16.  It'd be better to spend the time
we have left for v16 on building features that use soft error
reporting than on refining corner cases in the reg* functions.
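
To make the trade-off concrete, here is a toy comparison of the two lookup styles. It is only a sketch: `lookup_quiet`, `lookup_with_context`, and `SoftErrorContext` are hypothetical, and the one-schema catalog is invented for the example. The point is that a lookup which reports through a context can still say which part was missing, whereas a boolean-style lookup can only say that it failed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical miniature of a soft-error context with an owned message. */
typedef struct SoftErrorContext
{
    bool error_occurred;
    char message[128];
} SoftErrorContext;

/* Toy catalog: one schema containing one relation. */
static bool
schema_exists(const char *schema)
{
    return strcmp(schema, "pg_catalog") == 0;
}

static bool
relation_exists(const char *rel)
{
    return strcmp(rel, "pg_class") == 0;
}

/* Style 1: missing_ok-like lookup.  The caller learns only that it failed. */
bool
lookup_quiet(const char *schema, const char *rel)
{
    return schema_exists(schema) && relation_exists(rel);
}

/* Style 2: the lookup reports softly itself, so it can say what was missing. */
bool
lookup_with_context(const char *schema, const char *rel, SoftErrorContext *ctx)
{
    assert(ctx != NULL);
    if (!schema_exists(schema))
    {
        ctx->error_occurred = true;
        snprintf(ctx->message, sizeof(ctx->message),
                 "schema \"%s\" does not exist", schema);
        return false;
    }
    if (!relation_exists(rel))
    {
        ctx->error_occurred = true;
        snprintf(ctx->message, sizeof(ctx->message),
                 "relation \"%s.%s\" does not exist", schema, rel);
        return false;
    }
    return true;
}
```

For the misspelled schema from the regression test, style 1 leaves the caller to compose a generic "relation does not exist" message, while style 2 keeps the more specific schema-level message.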

So I think we should stop more or less here, possibly after
changing the to_regfoo functions to be simple wrappers
around the soft input functions.

            regards, tom lane

diff --git a/src/backend/catalog/objectaddress.c b/src/backend/catalog/objectaddress.c
index 109bdfb33f..a1df8b1ddc 100644
--- a/src/backend/catalog/objectaddress.c
+++ b/src/backend/catalog/objectaddress.c
@@ -2182,7 +2182,7 @@ pg_get_object_address(PG_FUNCTION_ARGS)
             ereport(ERROR,
                     (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                      errmsg("name or argument lists may not contain nulls")));
-        typename = typeStringToTypeName(TextDatumGetCString(elems[0]));
+        typename = typeStringToTypeName(TextDatumGetCString(elems[0]), NULL);
     }
     else if (type == OBJECT_LARGEOBJECT)
     {
@@ -2238,7 +2238,8 @@ pg_get_object_address(PG_FUNCTION_ARGS)
                         (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                          errmsg("name or argument lists may not contain nulls")));
             args = lappend(args,
-                           typeStringToTypeName(TextDatumGetCString(elems[i])));
+                           typeStringToTypeName(TextDatumGetCString(elems[i]),
+                                                NULL));
         }
     }
     else
diff --git a/src/backend/parser/parse_type.c b/src/backend/parser/parse_type.c
index f7ad689459..8f3850aa4e 100644
--- a/src/backend/parser/parse_type.c
+++ b/src/backend/parser/parse_type.c
@@ -727,10 +727,15 @@ pts_error_callback(void *arg)
  * Given a string that is supposed to be a SQL-compatible type declaration,
  * such as "int4" or "integer" or "character varying(32)", parse
  * the string and return the result as a TypeName.
- * If the string cannot be parsed as a type, an error is raised.
+ *
+ * If the string cannot be parsed as a type, an error is raised,
+ * unless escontext is an ErrorSaveContext node, in which case we may
+ * fill that and return NULL.  But note that the ErrorSaveContext option
+ * is mostly aspirational at present: errors detected by the main
+ * grammar, rather than here, will still be thrown.
  */
 TypeName *
-typeStringToTypeName(const char *str)
+typeStringToTypeName(const char *str, Node *escontext)
 {
     List       *raw_parsetree_list;
     TypeName   *typeName;
@@ -763,49 +768,54 @@ typeStringToTypeName(const char *str)
     return typeName;

 fail:
-    ereport(ERROR,
+    ereturn(escontext, NULL,
             (errcode(ERRCODE_SYNTAX_ERROR),
              errmsg("invalid type name \"%s\"", str)));
-    return NULL;                /* keep compiler quiet */
 }

 /*
  * Given a string that is supposed to be a SQL-compatible type declaration,
  * such as "int4" or "integer" or "character varying(32)", parse
  * the string and convert it to a type OID and type modifier.
- * If missing_ok is true, InvalidOid is returned rather than raising an error
- * when the type name is not found.
+ *
+ * If escontext is an ErrorSaveContext node, then errors are reported by
+ * filling escontext and returning false, instead of throwing them.
  */
-void
-parseTypeString(const char *str, Oid *typeid_p, int32 *typmod_p, bool missing_ok)
+bool
+parseTypeString(const char *str, Oid *typeid_p, int32 *typmod_p,
+                Node *escontext)
 {
     TypeName   *typeName;
     Type        tup;

-    typeName = typeStringToTypeName(str);
+    typeName = typeStringToTypeName(str, escontext);
+    if (typeName == NULL)
+        return false;

-    tup = LookupTypeName(NULL, typeName, typmod_p, missing_ok);
+    tup = LookupTypeName(NULL, typeName, typmod_p,
+                         (escontext && IsA(escontext, ErrorSaveContext)));
     if (tup == NULL)
     {
-        if (!missing_ok)
-            ereport(ERROR,
-                    (errcode(ERRCODE_UNDEFINED_OBJECT),
-                     errmsg("type \"%s\" does not exist",
-                            TypeNameToString(typeName)),
-                     parser_errposition(NULL, typeName->location)));
-        *typeid_p = InvalidOid;
+        ereturn(escontext, false,
+                (errcode(ERRCODE_UNDEFINED_OBJECT),
+                 errmsg("type \"%s\" does not exist",
+                        TypeNameToString(typeName))));
     }
     else
     {
         Form_pg_type typ = (Form_pg_type) GETSTRUCT(tup);

         if (!typ->typisdefined)
-            ereport(ERROR,
+        {
+            ReleaseSysCache(tup);
+            ereturn(escontext, false,
                     (errcode(ERRCODE_UNDEFINED_OBJECT),
                      errmsg("type \"%s\" is only a shell",
-                            TypeNameToString(typeName)),
-                     parser_errposition(NULL, typeName->location)));
+                            TypeNameToString(typeName))));
+        }
         *typeid_p = typ->oid;
         ReleaseSysCache(tup);
     }
+
+    return true;
 }
diff --git a/src/backend/tsearch/dict_thesaurus.c b/src/backend/tsearch/dict_thesaurus.c
index b8c08bcf7b..3df29e3345 100644
--- a/src/backend/tsearch/dict_thesaurus.c
+++ b/src/backend/tsearch/dict_thesaurus.c
@@ -599,6 +599,7 @@ thesaurus_init(PG_FUNCTION_ARGS)
     DictThesaurus *d;
     char       *subdictname = NULL;
     bool        fileloaded = false;
+    List       *namelist;
     ListCell   *l;

     d = (DictThesaurus *) palloc0(sizeof(DictThesaurus));
@@ -642,7 +643,8 @@ thesaurus_init(PG_FUNCTION_ARGS)
                 (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                  errmsg("missing Dictionary parameter")));

-    d->subdictOid = get_ts_dict_oid(stringToQualifiedNameList(subdictname), false);
+    namelist = stringToQualifiedNameList(subdictname, NULL);
+    d->subdictOid = get_ts_dict_oid(namelist, false);
     d->subdict = lookup_ts_dictionary_cache(d->subdictOid);

     compileTheLexeme(d);
diff --git a/src/backend/utils/adt/misc.c b/src/backend/utils/adt/misc.c
index 7808fbd448..cc25acb656 100644
--- a/src/backend/utils/adt/misc.c
+++ b/src/backend/utils/adt/misc.c
@@ -724,7 +724,9 @@ pg_input_is_valid_common(FunctionCallInfo fcinfo,
         Oid            typoid;

         /* Parse type-name argument to obtain type OID and encoded typmod. */
-        parseTypeString(typnamestr, &typoid, &my_extra->typmod, false);
+        if (!parseTypeString(typnamestr, &typoid, &my_extra->typmod,
+                             (Node *) escontext))
+            return false;

         /* Update type-specific info if typoid changed. */
         if (my_extra->typoid != typoid)
diff --git a/src/backend/utils/adt/regproc.c b/src/backend/utils/adt/regproc.c
index a6d695d6cb..14d76c856d 100644
--- a/src/backend/utils/adt/regproc.c
+++ b/src/backend/utils/adt/regproc.c
@@ -31,7 +31,9 @@
 #include "catalog/pg_ts_dict.h"
 #include "catalog/pg_type.h"
 #include "lib/stringinfo.h"
+#include "mb/pg_wchar.h"
 #include "miscadmin.h"
+#include "nodes/miscnodes.h"
 #include "parser/parse_type.h"
 #include "parser/scansup.h"
 #include "utils/acl.h"
@@ -43,8 +45,9 @@

 static bool parseNumericOid(char *string, Oid *result, Node *escontext);
 static bool parseDashOrOid(char *string, Oid *result, Node *escontext);
-static void parseNameAndArgTypes(const char *string, bool allowNone,
-                                 List **names, int *nargs, Oid *argtypes);
+static bool parseNameAndArgTypes(const char *string, bool allowNone,
+                                 List **names, int *nargs, Oid *argtypes,
+                                 Node *escontext);


 /*****************************************************************************
@@ -63,12 +66,13 @@ Datum
 regprocin(PG_FUNCTION_ARGS)
 {
     char       *pro_name_or_oid = PG_GETARG_CSTRING(0);
+    Node       *escontext = fcinfo->context;
     RegProcedure result;
     List       *names;
     FuncCandidateList clist;

     /* Handle "-" or numeric OID */
-    if (parseDashOrOid(pro_name_or_oid, &result, fcinfo->context))
+    if (parseDashOrOid(pro_name_or_oid, &result, escontext))
         PG_RETURN_OID(result);

     /* Else it's a name, possibly schema-qualified */
@@ -84,15 +88,18 @@ regprocin(PG_FUNCTION_ARGS)
      * Normal case: parse the name into components and see if it matches any
      * pg_proc entries in the current search path.
      */
-    names = stringToQualifiedNameList(pro_name_or_oid);
-    clist = FuncnameGetCandidates(names, -1, NIL, false, false, false, false);
+    names = stringToQualifiedNameList(pro_name_or_oid, escontext);
+    if (names == NIL)
+        PG_RETURN_NULL();
+
+    clist = FuncnameGetCandidates(names, -1, NIL, false, false, false, true);

     if (clist == NULL)
-        ereport(ERROR,
+        ereturn(escontext, (Datum) 0,
                 (errcode(ERRCODE_UNDEFINED_FUNCTION),
                  errmsg("function \"%s\" does not exist", pro_name_or_oid)));
     else if (clist->next != NULL)
-        ereport(ERROR,
+        ereturn(escontext, (Datum) 0,
                 (errcode(ERRCODE_AMBIGUOUS_FUNCTION),
                  errmsg("more than one function named \"%s\"",
                         pro_name_or_oid)));
@@ -113,12 +120,16 @@ to_regproc(PG_FUNCTION_ARGS)
     char       *pro_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
     List       *names;
     FuncCandidateList clist;
+    ErrorSaveContext escontext = {T_ErrorSaveContext};

     /*
      * Parse the name into components and see if it matches any pg_proc
      * entries in the current search path.
      */
-    names = stringToQualifiedNameList(pro_name);
+    names = stringToQualifiedNameList(pro_name, (Node *) &escontext);
+    if (names == NIL)
+        PG_RETURN_NULL();
+
     clist = FuncnameGetCandidates(names, -1, NIL, false, false, false, true);

     if (clist == NULL || clist->next != NULL)
@@ -222,6 +233,7 @@ Datum
 regprocedurein(PG_FUNCTION_ARGS)
 {
     char       *pro_name_or_oid = PG_GETARG_CSTRING(0);
+    Node       *escontext = fcinfo->context;
     RegProcedure result;
     List       *names;
     int            nargs;
@@ -229,7 +241,7 @@ regprocedurein(PG_FUNCTION_ARGS)
     FuncCandidateList clist;

     /* Handle "-" or numeric OID */
-    if (parseDashOrOid(pro_name_or_oid, &result, fcinfo->context))
+    if (parseDashOrOid(pro_name_or_oid, &result, escontext))
         PG_RETURN_OID(result);

     /* The rest of this wouldn't work in bootstrap mode */
@@ -242,10 +254,13 @@ regprocedurein(PG_FUNCTION_ARGS)
      * which one exactly matches the given argument types.  (There will not be
      * more than one match.)
      */
-    parseNameAndArgTypes(pro_name_or_oid, false, &names, &nargs, argtypes);
+    if (!parseNameAndArgTypes(pro_name_or_oid, false,
+                              &names, &nargs, argtypes,
+                              escontext))
+        PG_RETURN_NULL();

     clist = FuncnameGetCandidates(names, nargs, NIL, false, false,
-                                  false, false);
+                                  false, true);

     for (; clist; clist = clist->next)
     {
@@ -254,7 +269,7 @@ regprocedurein(PG_FUNCTION_ARGS)
     }

     if (clist == NULL)
-        ereport(ERROR,
+        ereturn(escontext, (Datum) 0,
                 (errcode(ERRCODE_UNDEFINED_FUNCTION),
                  errmsg("function \"%s\" does not exist", pro_name_or_oid)));

@@ -276,13 +291,17 @@ to_regprocedure(PG_FUNCTION_ARGS)
     int            nargs;
     Oid            argtypes[FUNC_MAX_ARGS];
     FuncCandidateList clist;
+    ErrorSaveContext escontext = {T_ErrorSaveContext};

     /*
      * Parse the name and arguments, look up potential matches in the current
      * namespace search list, and scan to see which one exactly matches the
      * given argument types.    (There will not be more than one match.)
      */
-    parseNameAndArgTypes(pro_name, false, &names, &nargs, argtypes);
+    if (!parseNameAndArgTypes(pro_name, false,
+                              &names, &nargs, argtypes,
+                              (Node *) &escontext))
+        PG_RETURN_NULL();

     clist = FuncnameGetCandidates(names, nargs, NIL, false, false, false, true);

@@ -484,12 +503,13 @@ Datum
 regoperin(PG_FUNCTION_ARGS)
 {
     char       *opr_name_or_oid = PG_GETARG_CSTRING(0);
+    Node       *escontext = fcinfo->context;
     Oid            result;
     List       *names;
     FuncCandidateList clist;

     /* Handle "0" or numeric OID */
-    if (parseNumericOid(opr_name_or_oid, &result, fcinfo->context))
+    if (parseNumericOid(opr_name_or_oid, &result, escontext))
         PG_RETURN_OID(result);

     /* Else it's a name, possibly schema-qualified */
@@ -502,15 +522,18 @@ regoperin(PG_FUNCTION_ARGS)
      * Normal case: parse the name into components and see if it matches any
      * pg_operator entries in the current search path.
      */
-    names = stringToQualifiedNameList(opr_name_or_oid);
-    clist = OpernameGetCandidates(names, '\0', false);
+    names = stringToQualifiedNameList(opr_name_or_oid, escontext);
+    if (names == NIL)
+        PG_RETURN_NULL();
+
+    clist = OpernameGetCandidates(names, '\0', true);

     if (clist == NULL)
-        ereport(ERROR,
+        ereturn(escontext, (Datum) 0,
                 (errcode(ERRCODE_UNDEFINED_FUNCTION),
                  errmsg("operator does not exist: %s", opr_name_or_oid)));
     else if (clist->next != NULL)
-        ereport(ERROR,
+        ereturn(escontext, (Datum) 0,
                 (errcode(ERRCODE_AMBIGUOUS_FUNCTION),
                  errmsg("more than one operator named %s",
                         opr_name_or_oid)));
@@ -531,12 +554,16 @@ to_regoper(PG_FUNCTION_ARGS)
     char       *opr_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
     List       *names;
     FuncCandidateList clist;
+    ErrorSaveContext escontext = {T_ErrorSaveContext};

     /*
      * Parse the name into components and see if it matches any pg_operator
      * entries in the current search path.
      */
-    names = stringToQualifiedNameList(opr_name);
+    names = stringToQualifiedNameList(opr_name, (Node *) &escontext);
+    if (names == NIL)
+        PG_RETURN_NULL();
+
     clist = OpernameGetCandidates(names, '\0', true);

     if (clist == NULL || clist->next != NULL)
@@ -646,13 +673,14 @@ Datum
 regoperatorin(PG_FUNCTION_ARGS)
 {
     char       *opr_name_or_oid = PG_GETARG_CSTRING(0);
+    Node       *escontext = fcinfo->context;
     Oid            result;
     List       *names;
     int            nargs;
     Oid            argtypes[FUNC_MAX_ARGS];

     /* Handle "0" or numeric OID */
-    if (parseNumericOid(opr_name_or_oid, &result, fcinfo->context))
+    if (parseNumericOid(opr_name_or_oid, &result, escontext))
         PG_RETURN_OID(result);

     /* The rest of this wouldn't work in bootstrap mode */
@@ -665,14 +693,18 @@ regoperatorin(PG_FUNCTION_ARGS)
      * which one exactly matches the given argument types.  (There will not be
      * more than one match.)
      */
-    parseNameAndArgTypes(opr_name_or_oid, true, &names, &nargs, argtypes);
+    if (!parseNameAndArgTypes(opr_name_or_oid, true,
+                              &names, &nargs, argtypes,
+                              escontext))
+        PG_RETURN_NULL();
+
     if (nargs == 1)
-        ereport(ERROR,
+        ereturn(escontext, (Datum) 0,
                 (errcode(ERRCODE_UNDEFINED_PARAMETER),
                  errmsg("missing argument"),
                  errhint("Use NONE to denote the missing argument of a unary operator.")));
     if (nargs != 2)
-        ereport(ERROR,
+        ereturn(escontext, (Datum) 0,
                 (errcode(ERRCODE_TOO_MANY_ARGUMENTS),
                  errmsg("too many arguments"),
                  errhint("Provide two argument types for operator.")));
@@ -680,7 +712,7 @@ regoperatorin(PG_FUNCTION_ARGS)
     result = OpernameGetOprid(names, argtypes[0], argtypes[1]);

     if (!OidIsValid(result))
-        ereport(ERROR,
+        ereturn(escontext, (Datum) 0,
                 (errcode(ERRCODE_UNDEFINED_FUNCTION),
                  errmsg("operator does not exist: %s", opr_name_or_oid)));

@@ -700,23 +732,20 @@ to_regoperator(PG_FUNCTION_ARGS)
     List       *names;
     int            nargs;
     Oid            argtypes[FUNC_MAX_ARGS];
+    ErrorSaveContext escontext = {T_ErrorSaveContext};

     /*
      * Parse the name and arguments, look up potential matches in the current
      * namespace search list, and scan to see which one exactly matches the
      * given argument types.    (There will not be more than one match.)
      */
-    parseNameAndArgTypes(opr_name_or_oid, true, &names, &nargs, argtypes);
-    if (nargs == 1)
-        ereport(ERROR,
-                (errcode(ERRCODE_UNDEFINED_PARAMETER),
-                 errmsg("missing argument"),
-                 errhint("Use NONE to denote the missing argument of a unary operator.")));
+    if (!parseNameAndArgTypes(opr_name_or_oid, true,
+                              &names, &nargs, argtypes,
+                              (Node *) &escontext))
+        PG_RETURN_NULL();
+
     if (nargs != 2)
-        ereport(ERROR,
-                (errcode(ERRCODE_TOO_MANY_ARGUMENTS),
-                 errmsg("too many arguments"),
-                 errhint("Provide two argument types for operator.")));
+        PG_RETURN_NULL();

     result = OpernameGetOprid(names, argtypes[0], argtypes[1]);

@@ -903,11 +932,12 @@ Datum
 regclassin(PG_FUNCTION_ARGS)
 {
     char       *class_name_or_oid = PG_GETARG_CSTRING(0);
+    Node       *escontext = fcinfo->context;
     Oid            result;
     List       *names;

     /* Handle "-" or numeric OID */
-    if (parseDashOrOid(class_name_or_oid, &result, fcinfo->context))
+    if (parseDashOrOid(class_name_or_oid, &result, escontext))
         PG_RETURN_OID(result);

     /* Else it's a name, possibly schema-qualified */
@@ -920,10 +950,18 @@ regclassin(PG_FUNCTION_ARGS)
      * Normal case: parse the name into components and see if it matches any
      * pg_class entries in the current search path.
      */
-    names = stringToQualifiedNameList(class_name_or_oid);
+    names = stringToQualifiedNameList(class_name_or_oid, escontext);
+    if (names == NIL)
+        PG_RETURN_NULL();

     /* We might not even have permissions on this relation; don't lock it. */
-    result = RangeVarGetRelid(makeRangeVarFromNameList(names), NoLock, false);
+    result = RangeVarGetRelid(makeRangeVarFromNameList(names), NoLock, true);
+
+    if (!OidIsValid(result))
+        ereturn(escontext, (Datum) 0,
+                (errcode(ERRCODE_UNDEFINED_TABLE),
+                 errmsg("relation \"%s\" does not exist",
+                        NameListToString(names))));

     PG_RETURN_OID(result);
 }
@@ -939,12 +977,15 @@ to_regclass(PG_FUNCTION_ARGS)
     char       *class_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
     Oid            result;
     List       *names;
+    ErrorSaveContext escontext = {T_ErrorSaveContext};

     /*
      * Parse the name into components and see if it matches any pg_class
      * entries in the current search path.
      */
-    names = stringToQualifiedNameList(class_name);
+    names = stringToQualifiedNameList(class_name, (Node *) &escontext);
+    if (names == NIL)
+        PG_RETURN_NULL();

     /* We might not even have permissions on this relation; don't lock it. */
     result = RangeVarGetRelid(makeRangeVarFromNameList(names), NoLock, true);
@@ -1045,11 +1086,12 @@ Datum
 regcollationin(PG_FUNCTION_ARGS)
 {
     char       *collation_name_or_oid = PG_GETARG_CSTRING(0);
+    Node       *escontext = fcinfo->context;
     Oid            result;
     List       *names;

     /* Handle "-" or numeric OID */
-    if (parseDashOrOid(collation_name_or_oid, &result, fcinfo->context))
+    if (parseDashOrOid(collation_name_or_oid, &result, escontext))
         PG_RETURN_OID(result);

     /* Else it's a name, possibly schema-qualified */
@@ -1062,9 +1104,17 @@ regcollationin(PG_FUNCTION_ARGS)
      * Normal case: parse the name into components and see if it matches any
      * pg_collation entries in the current search path.
      */
-    names = stringToQualifiedNameList(collation_name_or_oid);
+    names = stringToQualifiedNameList(collation_name_or_oid, escontext);
+    if (names == NIL)
+        PG_RETURN_NULL();
+
+    result = get_collation_oid(names, true);

-    result = get_collation_oid(names, false);
+    if (!OidIsValid(result))
+        ereturn(escontext, (Datum) 0,
+                (errcode(ERRCODE_UNDEFINED_OBJECT),
+                 errmsg("collation \"%s\" for encoding \"%s\" does not exist",
+                        NameListToString(names), GetDatabaseEncodingName())));

     PG_RETURN_OID(result);
 }
@@ -1080,12 +1130,15 @@ to_regcollation(PG_FUNCTION_ARGS)
     char       *collation_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
     Oid            result;
     List       *names;
+    ErrorSaveContext escontext = {T_ErrorSaveContext};

     /*
      * Parse the name into components and see if it matches any pg_collation
      * entries in the current search path.
      */
-    names = stringToQualifiedNameList(collation_name);
+    names = stringToQualifiedNameList(collation_name, (Node *) &escontext);
+    if (names == NIL)
+        PG_RETURN_NULL();

     result = get_collation_oid(names, true);

@@ -1192,11 +1245,12 @@ Datum
 regtypein(PG_FUNCTION_ARGS)
 {
     char       *typ_name_or_oid = PG_GETARG_CSTRING(0);
+    Node       *escontext = fcinfo->context;
     Oid            result;
     int32        typmod;

     /* Handle "-" or numeric OID */
-    if (parseDashOrOid(typ_name_or_oid, &result, fcinfo->context))
+    if (parseDashOrOid(typ_name_or_oid, &result, escontext))
         PG_RETURN_OID(result);

     /* Else it's a type name, possibly schema-qualified or decorated */
@@ -1207,9 +1261,10 @@ regtypein(PG_FUNCTION_ARGS)

     /*
      * Normal case: invoke the full parser to deal with special cases such as
-     * array syntax.
+     * array syntax.  We don't need to check for parseTypeString failure,
+     * since we'll just return anyway.
      */
-    parseTypeString(typ_name_or_oid, &result, &typmod, false);
+    (void) parseTypeString(typ_name_or_oid, &result, &typmod, escontext);

     PG_RETURN_OID(result);
 }
@@ -1225,13 +1280,12 @@ to_regtype(PG_FUNCTION_ARGS)
     char       *typ_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
     Oid            result;
     int32        typmod;
+    ErrorSaveContext escontext = {T_ErrorSaveContext};

     /*
      * Invoke the full parser to deal with special cases such as array syntax.
      */
-    parseTypeString(typ_name, &result, &typmod, true);
-
-    if (OidIsValid(result))
+    if (parseTypeString(typ_name, &result, &typmod, (Node *) &escontext))
         PG_RETURN_OID(result);
     else
         PG_RETURN_NULL();
@@ -1318,11 +1372,12 @@ Datum
 regconfigin(PG_FUNCTION_ARGS)
 {
     char       *cfg_name_or_oid = PG_GETARG_CSTRING(0);
+    Node       *escontext = fcinfo->context;
     Oid            result;
     List       *names;

     /* Handle "-" or numeric OID */
-    if (parseDashOrOid(cfg_name_or_oid, &result, fcinfo->context))
+    if (parseDashOrOid(cfg_name_or_oid, &result, escontext))
         PG_RETURN_OID(result);

     /* The rest of this wouldn't work in bootstrap mode */
@@ -1333,9 +1388,17 @@ regconfigin(PG_FUNCTION_ARGS)
      * Normal case: parse the name into components and see if it matches any
      * pg_ts_config entries in the current search path.
      */
-    names = stringToQualifiedNameList(cfg_name_or_oid);
+    names = stringToQualifiedNameList(cfg_name_or_oid, escontext);
+    if (names == NIL)
+        PG_RETURN_NULL();

-    result = get_ts_config_oid(names, false);
+    result = get_ts_config_oid(names, true);
+
+    if (!OidIsValid(result))
+        ereturn(escontext, (Datum) 0,
+                (errcode(ERRCODE_UNDEFINED_OBJECT),
+                 errmsg("text search configuration \"%s\" does not exist",
+                        NameListToString(names))));

     PG_RETURN_OID(result);
 }
@@ -1419,11 +1482,12 @@ Datum
 regdictionaryin(PG_FUNCTION_ARGS)
 {
     char       *dict_name_or_oid = PG_GETARG_CSTRING(0);
+    Node       *escontext = fcinfo->context;
     Oid            result;
     List       *names;

     /* Handle "-" or numeric OID */
-    if (parseDashOrOid(dict_name_or_oid, &result, fcinfo->context))
+    if (parseDashOrOid(dict_name_or_oid, &result, escontext))
         PG_RETURN_OID(result);

     /* The rest of this wouldn't work in bootstrap mode */
@@ -1434,9 +1498,17 @@ regdictionaryin(PG_FUNCTION_ARGS)
      * Normal case: parse the name into components and see if it matches any
      * pg_ts_dict entries in the current search path.
      */
-    names = stringToQualifiedNameList(dict_name_or_oid);
+    names = stringToQualifiedNameList(dict_name_or_oid, escontext);
+    if (names == NIL)
+        PG_RETURN_NULL();
+
+    result = get_ts_dict_oid(names, true);

-    result = get_ts_dict_oid(names, false);
+    if (!OidIsValid(result))
+        ereturn(escontext, (Datum) 0,
+                (errcode(ERRCODE_UNDEFINED_OBJECT),
+                 errmsg("text search dictionary \"%s\" does not exist",
+                        NameListToString(names))));

     PG_RETURN_OID(result);
 }
@@ -1520,11 +1592,12 @@ Datum
 regrolein(PG_FUNCTION_ARGS)
 {
     char       *role_name_or_oid = PG_GETARG_CSTRING(0);
+    Node       *escontext = fcinfo->context;
     Oid            result;
     List       *names;

     /* Handle "-" or numeric OID */
-    if (parseDashOrOid(role_name_or_oid, &result, fcinfo->context))
+    if (parseDashOrOid(role_name_or_oid, &result, escontext))
         PG_RETURN_OID(result);

     /* The rest of this wouldn't work in bootstrap mode */
@@ -1532,14 +1605,22 @@ regrolein(PG_FUNCTION_ARGS)
         elog(ERROR, "regrole values must be OIDs in bootstrap mode");

     /* Normal case: see if the name matches any pg_authid entry. */
-    names = stringToQualifiedNameList(role_name_or_oid);
+    names = stringToQualifiedNameList(role_name_or_oid, escontext);
+    if (names == NIL)
+        PG_RETURN_NULL();

     if (list_length(names) != 1)
-        ereport(ERROR,
+        ereturn(escontext, (Datum) 0,
                 (errcode(ERRCODE_INVALID_NAME),
                  errmsg("invalid name syntax")));

-    result = get_role_oid(strVal(linitial(names)), false);
+    result = get_role_oid(strVal(linitial(names)), true);
+
+    if (!OidIsValid(result))
+        ereturn(escontext, (Datum) 0,
+                (errcode(ERRCODE_UNDEFINED_OBJECT),
+                 errmsg("role \"%s\" does not exist",
+                        strVal(linitial(names)))));

     PG_RETURN_OID(result);
 }
@@ -1555,13 +1636,14 @@ to_regrole(PG_FUNCTION_ARGS)
     char       *role_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
     Oid            result;
     List       *names;
+    ErrorSaveContext escontext = {T_ErrorSaveContext};

-    names = stringToQualifiedNameList(role_name);
+    names = stringToQualifiedNameList(role_name, (Node *) &escontext);
+    if (names == NIL)
+        PG_RETURN_NULL();

     if (list_length(names) != 1)
-        ereport(ERROR,
-                (errcode(ERRCODE_INVALID_NAME),
-                 errmsg("invalid name syntax")));
+        PG_RETURN_NULL();

     result = get_role_oid(strVal(linitial(names)), true);

@@ -1635,11 +1717,12 @@ Datum
 regnamespacein(PG_FUNCTION_ARGS)
 {
     char       *nsp_name_or_oid = PG_GETARG_CSTRING(0);
+    Node       *escontext = fcinfo->context;
     Oid            result;
     List       *names;

     /* Handle "-" or numeric OID */
-    if (parseDashOrOid(nsp_name_or_oid, &result, fcinfo->context))
+    if (parseDashOrOid(nsp_name_or_oid, &result, escontext))
         PG_RETURN_OID(result);

     /* The rest of this wouldn't work in bootstrap mode */
@@ -1647,14 +1730,22 @@ regnamespacein(PG_FUNCTION_ARGS)
         elog(ERROR, "regnamespace values must be OIDs in bootstrap mode");

     /* Normal case: see if the name matches any pg_namespace entry. */
-    names = stringToQualifiedNameList(nsp_name_or_oid);
+    names = stringToQualifiedNameList(nsp_name_or_oid, escontext);
+    if (names == NIL)
+        PG_RETURN_NULL();

     if (list_length(names) != 1)
-        ereport(ERROR,
+        ereturn(escontext, (Datum) 0,
                 (errcode(ERRCODE_INVALID_NAME),
                  errmsg("invalid name syntax")));

-    result = get_namespace_oid(strVal(linitial(names)), false);
+    result = get_namespace_oid(strVal(linitial(names)), true);
+
+    if (!OidIsValid(result))
+        ereturn(escontext, (Datum) 0,
+                (errcode(ERRCODE_UNDEFINED_SCHEMA),
+                 errmsg("schema \"%s\" does not exist",
+                        strVal(linitial(names)))));

     PG_RETURN_OID(result);
 }
@@ -1670,13 +1761,14 @@ to_regnamespace(PG_FUNCTION_ARGS)
     char       *nsp_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
     Oid            result;
     List       *names;
+    ErrorSaveContext escontext = {T_ErrorSaveContext};

-    names = stringToQualifiedNameList(nsp_name);
+    names = stringToQualifiedNameList(nsp_name, (Node *) &escontext);
+    if (names == NIL)
+        PG_RETURN_NULL();

     if (list_length(names) != 1)
-        ereport(ERROR,
-                (errcode(ERRCODE_INVALID_NAME),
-                 errmsg("invalid name syntax")));
+        PG_RETURN_NULL();

     result = get_namespace_oid(strVal(linitial(names)), true);

@@ -1763,9 +1855,13 @@ text_regclass(PG_FUNCTION_ARGS)

 /*
  * Given a C string, parse it into a qualified-name list.
+ *
+ * If escontext is an ErrorSaveContext node, invalid input will be
+ * reported there instead of being thrown, and we return NIL.
+ * (NIL is not possible as a success return, since empty-input is an error.)
  */
 List *
-stringToQualifiedNameList(const char *string)
+stringToQualifiedNameList(const char *string, Node *escontext)
 {
     char       *rawname;
     List       *result = NIL;
@@ -1776,12 +1872,12 @@ stringToQualifiedNameList(const char *string)
     rawname = pstrdup(string);

     if (!SplitIdentifierString(rawname, '.', &namelist))
-        ereport(ERROR,
+        ereturn(escontext, NIL,
                 (errcode(ERRCODE_INVALID_NAME),
                  errmsg("invalid name syntax")));

     if (namelist == NIL)
-        ereport(ERROR,
+        ereturn(escontext, NIL,
                 (errcode(ERRCODE_INVALID_NAME),
                  errmsg("invalid name syntax")));

@@ -1858,10 +1954,14 @@ parseDashOrOid(char *string, Oid *result, Node *escontext)
  *
  * If allowNone is true, accept "NONE" and return it as InvalidOid (this is
  * for unary operators).
+ *
+ * Returns true on success, false on failure (the latter only possible
+ * if escontext is an ErrorSaveContext node).
  */
-static void
+static bool
 parseNameAndArgTypes(const char *string, bool allowNone, List **names,
-                     int *nargs, Oid *argtypes)
+                     int *nargs, Oid *argtypes,
+                     Node *escontext)
 {
     char       *rawname;
     char       *ptr;
@@ -1886,13 +1986,15 @@ parseNameAndArgTypes(const char *string, bool allowNone, List **names,
             break;
     }
     if (*ptr == '\0')
-        ereport(ERROR,
+        ereturn(escontext, false,
                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                  errmsg("expected a left parenthesis")));

     /* Separate the name and parse it into a list */
     *ptr++ = '\0';
-    *names = stringToQualifiedNameList(rawname);
+    *names = stringToQualifiedNameList(rawname, escontext);
+    if (*names == NIL)
+        return false;

     /* Check for the trailing right parenthesis and remove it */
     ptr2 = ptr + strlen(ptr);
@@ -1902,7 +2004,7 @@ parseNameAndArgTypes(const char *string, bool allowNone, List **names,
             break;
     }
     if (*ptr2 != ')')
-        ereport(ERROR,
+        ereturn(escontext, false,
                 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                  errmsg("expected a right parenthesis")));

@@ -1921,7 +2023,7 @@ parseNameAndArgTypes(const char *string, bool allowNone, List **names,
         {
             /* End of string.  Okay unless we had a comma before. */
             if (had_comma)
-                ereport(ERROR,
+                ereturn(escontext, false,
                         (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                          errmsg("expected a type name")));
             break;
@@ -1953,7 +2055,7 @@ parseNameAndArgTypes(const char *string, bool allowNone, List **names,
             }
         }
         if (in_quote || paren_count != 0)
-            ereport(ERROR,
+            ereturn(escontext, false,
                     (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                      errmsg("improper type name")));

@@ -1985,10 +2087,11 @@ parseNameAndArgTypes(const char *string, bool allowNone, List **names,
         else
         {
             /* Use full parser to resolve the type name */
-            parseTypeString(typename, &typeid, &typmod, false);
+            if (!parseTypeString(typename, &typeid, &typmod, escontext))
+                return false;
         }
         if (*nargs >= FUNC_MAX_ARGS)
-            ereport(ERROR,
+            ereturn(escontext, false,
                     (errcode(ERRCODE_TOO_MANY_ARGUMENTS),
                      errmsg("too many arguments")));

@@ -1997,4 +2100,6 @@ parseNameAndArgTypes(const char *string, bool allowNone, List **names,
     }

     pfree(rawname);
+
+    return true;
 }
diff --git a/src/backend/utils/adt/tsvector_op.c b/src/backend/utils/adt/tsvector_op.c
index caeb85b4ca..66ce710598 100644
--- a/src/backend/utils/adt/tsvector_op.c
+++ b/src/backend/utils/adt/tsvector_op.c
@@ -2652,7 +2652,7 @@ tsvector_update_trigger(PG_FUNCTION_ARGS, bool config_column)
     {
         List       *names;

-        names = stringToQualifiedNameList(trigger->tgargs[1]);
+        names = stringToQualifiedNameList(trigger->tgargs[1], NULL);
         /* require a schema so that results are not search path dependent */
         if (list_length(names) < 2)
             ereport(ERROR,
diff --git a/src/backend/utils/cache/ts_cache.c b/src/backend/utils/cache/ts_cache.c
index 450ea34336..043abd341d 100644
--- a/src/backend/utils/cache/ts_cache.c
+++ b/src/backend/utils/cache/ts_cache.c
@@ -38,6 +38,7 @@
 #include "catalog/pg_ts_template.h"
 #include "commands/defrem.h"
 #include "miscadmin.h"
+#include "nodes/miscnodes.h"
 #include "tsearch/ts_cache.h"
 #include "utils/builtins.h"
 #include "utils/catcache.h"
@@ -556,6 +557,8 @@ lookup_ts_config_cache(Oid cfgId)
 Oid
 getTSCurrentConfig(bool emitError)
 {
+    List       *namelist;
+
     /* if we have a cached value, return it */
     if (OidIsValid(TSCurrentConfigCache))
         return TSCurrentConfigCache;
@@ -576,9 +579,22 @@ getTSCurrentConfig(bool emitError)
     }

     /* Look up the config */
-    TSCurrentConfigCache =
-        get_ts_config_oid(stringToQualifiedNameList(TSCurrentConfig),
-                          !emitError);
+    if (emitError)
+    {
+        namelist = stringToQualifiedNameList(TSCurrentConfig, NULL);
+        TSCurrentConfigCache = get_ts_config_oid(namelist, false);
+    }
+    else
+    {
+        ErrorSaveContext escontext = {T_ErrorSaveContext};
+
+        namelist = stringToQualifiedNameList(TSCurrentConfig,
+                                             (Node *) &escontext);
+        if (namelist != NIL)
+            TSCurrentConfigCache = get_ts_config_oid(namelist, true);
+        else
+            TSCurrentConfigCache = InvalidOid;    /* bad name list syntax */
+    }

     return TSCurrentConfigCache;
 }
@@ -594,12 +610,19 @@ check_default_text_search_config(char **newval, void **extra, GucSource source)
      */
     if (IsTransactionState() && MyDatabaseId != InvalidOid)
     {
+        ErrorSaveContext escontext = {T_ErrorSaveContext};
+        List       *namelist;
         Oid            cfgId;
         HeapTuple    tuple;
         Form_pg_ts_config cfg;
         char       *buf;

-        cfgId = get_ts_config_oid(stringToQualifiedNameList(*newval), true);
+        namelist = stringToQualifiedNameList(*newval,
+                                             (Node *) &escontext);
+        if (namelist != NIL)
+            cfgId = get_ts_config_oid(namelist, true);
+        else
+            cfgId = InvalidOid; /* bad name list syntax */

         /*
          * When source == PGC_S_TEST, don't throw a hard error for a
diff --git a/src/backend/utils/fmgr/funcapi.c b/src/backend/utils/fmgr/funcapi.c
index 87cbb1d3e3..51e5893404 100644
--- a/src/backend/utils/fmgr/funcapi.c
+++ b/src/backend/utils/fmgr/funcapi.c
@@ -1876,7 +1876,7 @@ RelationNameGetTupleDesc(const char *relname)
     List       *relname_list;

     /* Open relation and copy the tuple description */
-    relname_list = stringToQualifiedNameList(relname);
+    relname_list = stringToQualifiedNameList(relname, NULL);
     relvar = makeRangeVarFromNameList(relname_list);
     rel = relation_openrv(relvar, AccessShareLock);
     tupdesc = CreateTupleDescCopy(RelationGetDescr(rel));
diff --git a/src/include/parser/parse_type.h b/src/include/parser/parse_type.h
index 4e5624d721..c6c92a0009 100644
--- a/src/include/parser/parse_type.h
+++ b/src/include/parser/parse_type.h
@@ -51,8 +51,9 @@ extern Datum stringTypeDatum(Type tp, char *string, int32 atttypmod);
 extern Oid    typeidTypeRelid(Oid type_id);
 extern Oid    typeOrDomainTypeRelid(Oid type_id);

-extern TypeName *typeStringToTypeName(const char *str);
-extern void parseTypeString(const char *str, Oid *typeid_p, int32 *typmod_p, bool missing_ok);
+extern TypeName *typeStringToTypeName(const char *str, Node *escontext);
+extern bool parseTypeString(const char *str, Oid *typeid_p, int32 *typmod_p,
+                            Node *escontext);

 /* true if typeid is composite, or domain over composite, but not RECORD */
 #define ISCOMPLEX(typeid) (typeOrDomainTypeRelid(typeid) != InvalidOid)
diff --git a/src/include/utils/regproc.h b/src/include/utils/regproc.h
index 0e2965ff93..4c3311d8e2 100644
--- a/src/include/utils/regproc.h
+++ b/src/include/utils/regproc.h
@@ -25,7 +25,7 @@ extern char *format_procedure_extended(Oid procedure_oid, bits16 flags);
 #define FORMAT_OPERATOR_FORCE_QUALIFY    0x02    /* force qualification */
 extern char *format_operator_extended(Oid operator_oid, bits16 flags);

-extern List *stringToQualifiedNameList(const char *string);
+extern List *stringToQualifiedNameList(const char *string, Node *escontext);
 extern char *format_procedure(Oid procedure_oid);
 extern char *format_procedure_qualified(Oid procedure_oid);
 extern void format_procedure_parts(Oid procedure_oid, List **objnames,
diff --git a/src/pl/plperl/plperl.c b/src/pl/plperl/plperl.c
index 8f21e0d701..8143ae40a0 100644
--- a/src/pl/plperl/plperl.c
+++ b/src/pl/plperl/plperl.c
@@ -3613,7 +3613,7 @@ plperl_spi_prepare(char *query, int argc, SV **argv)
             char       *typstr;

             typstr = sv2cstr(argv[i]);
-            parseTypeString(typstr, &typId, &typmod, false);
+            (void) parseTypeString(typstr, &typId, &typmod, NULL);
             pfree(typstr);

             getTypeInputInfo(typId, &typInput, &typIOParam);
diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y
index f7cf2b4b89..fe63766e5d 100644
--- a/src/pl/plpgsql/src/pl_gram.y
+++ b/src/pl/plpgsql/src/pl_gram.y
@@ -3725,7 +3725,7 @@ parse_datatype(const char *string, int location)
     error_context_stack = &syntax_errcontext;

     /* Let the main parser try to parse it under standard SQL rules */
-    typeName = typeStringToTypeName(string);
+    typeName = typeStringToTypeName(string, NULL);
     typenameTypeIdAndMod(NULL, typeName, &type_id, &typmod);

     /* Restore former ereport callback */
diff --git a/src/pl/plpython/plpy_spi.c b/src/pl/plpython/plpy_spi.c
index 6b9f8d5b43..ff87b27de0 100644
--- a/src/pl/plpython/plpy_spi.c
+++ b/src/pl/plpython/plpy_spi.c
@@ -105,7 +105,7 @@ PLy_spi_prepare(PyObject *self, PyObject *args)
              *information for input conversion.
              ********************************************************/

-            parseTypeString(sptr, &typeId, &typmod, false);
+            (void) parseTypeString(sptr, &typeId, &typmod, NULL);

             Py_DECREF(optr);

diff --git a/src/pl/tcl/pltcl.c b/src/pl/tcl/pltcl.c
index 4185fb1221..185d5bed99 100644
--- a/src/pl/tcl/pltcl.c
+++ b/src/pl/tcl/pltcl.c
@@ -615,7 +615,7 @@ call_pltcl_start_proc(Oid prolang, bool pltrusted)
     error_context_stack = &errcallback;

     /* Parse possibly-qualified identifier and look up the function */
-    namelist = stringToQualifiedNameList(start_proc);
+    namelist = stringToQualifiedNameList(start_proc, NULL);
     procOid = LookupFuncName(namelist, 0, NULL, false);

     /* Current user must have permission to call function */
@@ -2603,7 +2603,8 @@ pltcl_SPI_prepare(ClientData cdata, Tcl_Interp *interp,
                         typIOParam;
             int32        typmod;

-            parseTypeString(Tcl_GetString(argsObj[i]), &typId, &typmod, false);
+            (void) parseTypeString(Tcl_GetString(argsObj[i]),
+                                   &typId, &typmod, NULL);

             getTypeInputInfo(typId, &typInput, &typIOParam);

diff --git a/src/test/regress/expected/regproc.out b/src/test/regress/expected/regproc.out
index e45ff5483f..0c5e1d4be6 100644
--- a/src/test/regress/expected/regproc.out
+++ b/src/test/regress/expected/regproc.out
@@ -245,7 +245,7 @@ LINE 1: SELECT regtype('int3');
                        ^
 -- with schemaname
 SELECT regoper('ng_catalog.||/');
-ERROR:  schema "ng_catalog" does not exist
+ERROR:  operator does not exist: ng_catalog.||/
 LINE 1: SELECT regoper('ng_catalog.||/');
                        ^
 SELECT regoperator('ng_catalog.+(int4,int4)');
@@ -253,15 +253,15 @@ ERROR:  operator does not exist: ng_catalog.+(int4,int4)
 LINE 1: SELECT regoperator('ng_catalog.+(int4,int4)');
                            ^
 SELECT regproc('ng_catalog.now');
-ERROR:  schema "ng_catalog" does not exist
+ERROR:  function "ng_catalog.now" does not exist
 LINE 1: SELECT regproc('ng_catalog.now');
                        ^
 SELECT regprocedure('ng_catalog.abs(numeric)');
-ERROR:  schema "ng_catalog" does not exist
+ERROR:  function "ng_catalog.abs(numeric)" does not exist
 LINE 1: SELECT regprocedure('ng_catalog.abs(numeric)');
                             ^
 SELECT regclass('ng_catalog.pg_class');
-ERROR:  schema "ng_catalog" does not exist
+ERROR:  relation "ng_catalog.pg_class" does not exist
 LINE 1: SELECT regclass('ng_catalog.pg_class');
                         ^
 SELECT regtype('ng_catalog.int4');
@@ -269,7 +269,7 @@ ERROR:  schema "ng_catalog" does not exist
 LINE 1: SELECT regtype('ng_catalog.int4');
                        ^
 SELECT regcollation('ng_catalog."POSIX"');
-ERROR:  schema "ng_catalog" does not exist
+ERROR:  collation "ng_catalog.POSIX" for encoding "SQL_ASCII" does not exist
 LINE 1: SELECT regcollation('ng_catalog."POSIX"');
                             ^
 -- schemaname not applicable
@@ -406,7 +406,11 @@ SELECT to_regrole('"regress_regrole_test"');
 (1 row)

 SELECT to_regrole('foo.bar');
-ERROR:  invalid name syntax
+ to_regrole
+------------
+
+(1 row)
+
 SELECT to_regrole('Nonexistent');
  to_regrole
 ------------
@@ -420,7 +424,11 @@ SELECT to_regrole('"Nonexistent"');
 (1 row)

 SELECT to_regrole('foo.bar');
-ERROR:  invalid name syntax
+ to_regrole
+------------
+
+(1 row)
+
 SELECT to_regnamespace('Nonexistent');
  to_regnamespace
 -----------------
@@ -434,4 +442,105 @@ SELECT to_regnamespace('"Nonexistent"');
 (1 row)

 SELECT to_regnamespace('foo.bar');
-ERROR:  invalid name syntax
+ to_regnamespace
+-----------------
+
+(1 row)
+
+-- Test soft-error API
+SELECT pg_input_error_message('ng_catalog.pg_class', 'regclass');
+            pg_input_error_message
+-----------------------------------------------
+ relation "ng_catalog.pg_class" does not exist
+(1 row)
+
+SELECT pg_input_error_message('ng_catalog."POSIX"', 'regcollation');
+                        pg_input_error_message
+----------------------------------------------------------------------
+ collation "ng_catalog.POSIX" for encoding "SQL_ASCII" does not exist
+(1 row)
+
+SELECT pg_input_error_message('no_such_config', 'regconfig');
+                  pg_input_error_message
+-----------------------------------------------------------
+ text search configuration "no_such_config" does not exist
+(1 row)
+
+SELECT pg_input_error_message('no_such_dictionary', 'regdictionary');
+                   pg_input_error_message
+------------------------------------------------------------
+ text search dictionary "no_such_dictionary" does not exist
+(1 row)
+
+SELECT pg_input_error_message('Nonexistent', 'regnamespace');
+       pg_input_error_message
+-------------------------------------
+ schema "nonexistent" does not exist
+(1 row)
+
+SELECT pg_input_error_message('ng_catalog.||/', 'regoper');
+         pg_input_error_message
+-----------------------------------------
+ operator does not exist: ng_catalog.||/
+(1 row)
+
+SELECT pg_input_error_message('-', 'regoper');
+     pg_input_error_message
+--------------------------------
+ more than one operator named -
+(1 row)
+
+SELECT pg_input_error_message('ng_catalog.+(int4,int4)', 'regoperator');
+              pg_input_error_message
+--------------------------------------------------
+ operator does not exist: ng_catalog.+(int4,int4)
+(1 row)
+
+SELECT pg_input_error_message('-', 'regoperator');
+   pg_input_error_message
+-----------------------------
+ expected a left parenthesis
+(1 row)
+
+SELECT pg_input_error_message('ng_catalog.now', 'regproc');
+          pg_input_error_message
+------------------------------------------
+ function "ng_catalog.now" does not exist
+(1 row)
+
+SELECT pg_input_error_message('ng_catalog.abs(numeric)', 'regprocedure');
+              pg_input_error_message
+---------------------------------------------------
+ function "ng_catalog.abs(numeric)" does not exist
+(1 row)
+
+SELECT pg_input_error_message('ng_catalog.abs(numeric', 'regprocedure');
+    pg_input_error_message
+------------------------------
+ expected a right parenthesis
+(1 row)
+
+SELECT pg_input_error_message('regress_regrole_test', 'regrole');
+           pg_input_error_message
+--------------------------------------------
+ role "regress_regrole_test" does not exist
+(1 row)
+
+SELECT pg_input_error_message('no_such_type', 'regtype');
+       pg_input_error_message
+------------------------------------
+ type "no_such_type" does not exist
+(1 row)
+
+-- Some cases that should be soft errors, but are not yet
+SELECT pg_input_error_message('incorrect type name syntax', 'regtype');
+ERROR:  syntax error at or near "type"
+LINE 1: SELECT pg_input_error_message('incorrect type name syntax', ...
+                  ^
+CONTEXT:  invalid type name "incorrect type name syntax"
+SELECT pg_input_error_message('numeric(1,2,3)', 'regtype');  -- bogus typmod
+ERROR:  invalid NUMERIC type modifier
+SELECT pg_input_error_message('way.too.many.names', 'regtype');
+ERROR:  improper qualified name (too many dotted names): way.too.many.names
+SELECT pg_input_error_message('no_such_catalog.schema.name', 'regtype');
+ERROR:  cross-database references are not implemented: no_such_catalog.schema.name
diff --git a/src/test/regress/sql/regproc.sql b/src/test/regress/sql/regproc.sql
index faab0c15ce..aa1f1bb17a 100644
--- a/src/test/regress/sql/regproc.sql
+++ b/src/test/regress/sql/regproc.sql
@@ -120,3 +120,26 @@ SELECT to_regrole('foo.bar');
 SELECT to_regnamespace('Nonexistent');
 SELECT to_regnamespace('"Nonexistent"');
 SELECT to_regnamespace('foo.bar');
+
+-- Test soft-error API
+
+SELECT pg_input_error_message('ng_catalog.pg_class', 'regclass');
+SELECT pg_input_error_message('ng_catalog."POSIX"', 'regcollation');
+SELECT pg_input_error_message('no_such_config', 'regconfig');
+SELECT pg_input_error_message('no_such_dictionary', 'regdictionary');
+SELECT pg_input_error_message('Nonexistent', 'regnamespace');
+SELECT pg_input_error_message('ng_catalog.||/', 'regoper');
+SELECT pg_input_error_message('-', 'regoper');
+SELECT pg_input_error_message('ng_catalog.+(int4,int4)', 'regoperator');
+SELECT pg_input_error_message('-', 'regoperator');
+SELECT pg_input_error_message('ng_catalog.now', 'regproc');
+SELECT pg_input_error_message('ng_catalog.abs(numeric)', 'regprocedure');
+SELECT pg_input_error_message('ng_catalog.abs(numeric', 'regprocedure');
+SELECT pg_input_error_message('regress_regrole_test', 'regrole');
+SELECT pg_input_error_message('no_such_type', 'regtype');
+
+-- Some cases that should be soft errors, but are not yet
+SELECT pg_input_error_message('incorrect type name syntax', 'regtype');
+SELECT pg_input_error_message('numeric(1,2,3)', 'regtype');  -- bogus typmod
+SELECT pg_input_error_message('way.too.many.names', 'regtype');
+SELECT pg_input_error_message('no_such_catalog.schema.name', 'regtype');
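
The calling pattern that the patch above applies to stringToQualifiedNameList and parseNameAndArgTypes can be sketched outside the backend. This is only an illustration under stated assumptions: SoftErrorContext, report_error and count_name_parts are hypothetical stand-ins, not PostgreSQL's ErrorSaveContext or ereturn(), and the "hard" error path simply exits instead of doing a real ereport(ERROR). As in the patch, the failure return value (0 here, NIL there) can never be a successful result, because empty input is itself an error.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for an error-save context; not PostgreSQL's struct. */
typedef struct SoftErrorContext
{
    bool        error_occurred; /* set when a soft error was recorded */
    const char *message;        /* text of the recorded error */
} SoftErrorContext;

/*
 * Record the error in *ctx and return 0 if a context was supplied;
 * otherwise fail hard (here: print and exit, standing in for a thrown error).
 */
static int
report_error(SoftErrorContext *ctx, const char *message)
{
    if (ctx == NULL)
    {
        fprintf(stderr, "ERROR:  %s\n", message);
        exit(EXIT_FAILURE);
    }
    ctx->error_occurred = true;
    ctx->message = message;
    return 0;
}

/*
 * Count the dot-separated parts of a name.  Empty input and empty parts
 * are syntax errors, so 0 is never returned on success.
 */
int
count_name_parts(const char *s, SoftErrorContext *ctx)
{
    int         parts = 1;
    bool        part_empty = true;

    for (; *s != '\0'; s++)
    {
        if (*s == '.')
        {
            if (part_empty)
                return report_error(ctx, "invalid name syntax");
            parts++;
            part_empty = true;
        }
        else
            part_empty = false;
    }
    if (part_empty)
        return report_error(ctx, "invalid name syntax");
    return parts;
}
```

A caller that wants NULL-on-failure behavior, like the to_regxxx functions, would pass a context and test the return value; a caller that wants the old behavior passes NULL, as the PL call sites in the patch do.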

Re: Error-safe user functions

From

Tom Lane

Date:

25 December 2022, 20:38:49

I got annoyed by the fact that types cid, xid, xid8 don't throw an
error even for obvious garbage, because they just believe the
result of strtoul or strtoull without any checking.  That was
probably up to project standards when cidin and xidin were
written; but surely it's not anymore, especially when we can
piggyback on work already done for type oid.

Anybody have an objection to the following?  One note is that
because we already had test cases checking that xid would
accept hex input, I made the common subroutines use "0" not
"10" for strtoul's last argument, meaning that oid will accept
hex now too.
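
The effect of that last-argument change can be seen with plain strtoul, outside PostgreSQL. In this small sketch, to_ulong is a hypothetical wrapper that also reports whether the whole string was consumed:

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical wrapper: convert s with the given base and report, via
 * *fully_consumed, whether strtoul used the whole string.
 */
unsigned long
to_ulong(const char *s, int base, int *fully_consumed)
{
    char       *endptr;
    unsigned long val;

    assert(s != NULL && fully_consumed != NULL);

    val = strtoul(s, &endptr, base);
    *fully_consumed = (endptr != s && *endptr == '\0');
    return val;
}
```

With base 10, "0x2A" yields 0 and leaves "x2A" unparsed, whereas base 0 picks the base from the prefix and yields 42. Base 0 also treats a leading zero as octal, so "010" is 10 with base 10 but 8 with base 0, matching the existing '010'::xid test result.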

            regards, tom lane

diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 7cded73e6e..c67a79344a 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -477,6 +477,180 @@ invalid_syntax:
                     "bigint", s)));
 }

+/*
+ * Convert input string to an unsigned 32 bit integer.
+ *
+ * Allows any number of leading or trailing whitespace characters.
+ *
+ * If endloc isn't NULL, store a pointer to the rest of the string there,
+ * so that caller can parse the rest.  Otherwise, it's an error if anything
+ * but whitespace follows.
+ *
+ * typname is what is reported in error messages.
+ *
+ * If escontext points to an ErrorSaveContext node, that is filled instead
+ * of throwing an error; the caller must check SOFT_ERROR_OCCURRED()
+ * to detect errors.
+ */
+uint32
+uint32in_subr(const char *s, char **endloc,
+              const char *typname, Node *escontext)
+{
+    uint32        result;
+    unsigned long cvt;
+    char       *endptr;
+
+    /* Ensure that empty-input is handled consistently across platforms */
+    if (*s == '\0')
+        ereturn(escontext, 0,
+                (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+                 errmsg("invalid input syntax for type %s: \"%s\"",
+                        typname, s)));
+
+    errno = 0;
+    cvt = strtoul(s, &endptr, 0);
+
+    /*
+     * strtoul() normally only sets ERANGE.  On some systems it also may set
+     * EINVAL, which simply means it couldn't parse the input string. This is
+     * handled by the second "if" consistent across platforms.
+     */
+    if (errno && errno != ERANGE && errno != EINVAL)
+        ereturn(escontext, 0,
+                (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+                 errmsg("invalid input syntax for type %s: \"%s\"",
+                        typname, s)));
+
+    if (endptr == s)
+        ereturn(escontext, 0,
+                (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+                 errmsg("invalid input syntax for type %s: \"%s\"",
+                        typname, s)));
+
+    if (errno == ERANGE)
+        ereturn(escontext, 0,
+                (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+                 errmsg("value \"%s\" is out of range for type %s",
+                        s, typname)));
+
+    if (endloc)
+    {
+        /* caller wants to deal with rest of string */
+        *endloc = endptr;
+    }
+    else
+    {
+        /* allow only whitespace after number */
+        while (*endptr && isspace((unsigned char) *endptr))
+            endptr++;
+        if (*endptr)
+            ereturn(escontext, 0,
+                    (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+                     errmsg("invalid input syntax for type %s: \"%s\"",
+                            typname, s)));
+    }
+
+    result = (uint32) cvt;
+
+    /*
+     * Cope with possibility that unsigned long is wider than uint32, in which
+     * case strtoul will not raise an error for some values that are out of
+     * the range of uint32.
+     *
+     * For backwards compatibility, we want to accept inputs that are given
+     * with a minus sign, so allow the input value if it matches after either
+     * signed or unsigned extension to long.
+     *
+     * To ensure consistent results on 32-bit and 64-bit platforms, make sure
+     * the error message is the same as if strtoul() had returned ERANGE.
+     */
+#if PG_UINT32_MAX != ULONG_MAX
+    if (cvt != (unsigned long) result &&
+        cvt != (unsigned long) ((int) result))
+        ereturn(escontext, 0,
+                (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+                 errmsg("value \"%s\" is out of range for type %s",
+                        s, typname)));
+#endif
+
+    return result;
+}
+
+/*
+ * Convert input string to an unsigned 64 bit integer.
+ *
+ * Allows any number of leading or trailing whitespace characters.
+ *
+ * If endloc isn't NULL, store a pointer to the rest of the string there,
+ * so that caller can parse the rest.  Otherwise, it's an error if anything
+ * but whitespace follows.
+ *
+ * typname is what is reported in error messages.
+ *
+ * If escontext points to an ErrorSaveContext node, that is filled instead
+ * of throwing an error; the caller must check SOFT_ERROR_OCCURRED()
+ * to detect errors.
+ */
+uint64
+uint64in_subr(const char *s, char **endloc,
+              const char *typname, Node *escontext)
+{
+    uint64        result;
+    char       *endptr;
+
+    /* Ensure that empty-input is handled consistently across platforms */
+    if (*s == '\0')
+        ereturn(escontext, 0,
+                (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+                 errmsg("invalid input syntax for type %s: \"%s\"",
+                        typname, s)));
+
+    errno = 0;
+    result = strtou64(s, &endptr, 0);
+
+    /*
+     * strtoul[l]() normally only sets ERANGE.  On some systems it also may
+     * set EINVAL, which simply means it couldn't parse the input string. This
+     * is handled by the second "if" consistent across platforms.
+     */
+    if (errno && errno != ERANGE && errno != EINVAL)
+        ereturn(escontext, 0,
+                (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+                 errmsg("invalid input syntax for type %s: \"%s\"",
+                        typname, s)));
+
+    if (endptr == s)
+        ereturn(escontext, 0,
+                (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+                 errmsg("invalid input syntax for type %s: \"%s\"",
+                        typname, s)));
+
+    if (errno == ERANGE)
+        ereturn(escontext, 0,
+                (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+                 errmsg("value \"%s\" is out of range for type %s",
+                        s, typname)));
+
+    if (endloc)
+    {
+        /* caller wants to deal with rest of string */
+        *endloc = endptr;
+    }
+    else
+    {
+        /* allow only whitespace after number */
+        while (*endptr && isspace((unsigned char) *endptr))
+            endptr++;
+        if (*endptr)
+            ereturn(escontext, 0,
+                    (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+                     errmsg("invalid input syntax for type %s: \"%s\"",
+                            typname, s)));
+    }
+
+    return result;
+}
+
 /*
  * pg_itoa: converts a signed 16-bit integer to its string representation
  * and returns strlen(a).
diff --git a/src/backend/utils/adt/oid.c b/src/backend/utils/adt/oid.c
index 9d382b5cb7..6b70b774d5 100644
--- a/src/backend/utils/adt/oid.c
+++ b/src/backend/utils/adt/oid.c
@@ -32,106 +32,13 @@
  *     USER I/O ROUTINES                                                         *
  *****************************************************************************/

-/*
- * Parse a single OID and return its value.
- *
- * If endloc isn't NULL, store a pointer to the rest of the string there,
- * so that caller can parse the rest.  Otherwise, it's an error if anything
- * but whitespace follows.
- *
- * If escontext points to an ErrorSaveContext node, that is filled instead
- * of throwing an error; the caller must check SOFT_ERROR_OCCURRED()
- * to detect errors.
- */
-static Oid
-oidin_subr(const char *s, char **endloc, Node *escontext)
-{
-    unsigned long cvt;
-    char       *endptr;
-    Oid            result;
-
-    if (*s == '\0')
-        ereturn(escontext, InvalidOid,
-                (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
-                 errmsg("invalid input syntax for type %s: \"%s\"",
-                        "oid", s)));
-
-    errno = 0;
-    cvt = strtoul(s, &endptr, 10);
-
-    /*
-     * strtoul() normally only sets ERANGE.  On some systems it also may set
-     * EINVAL, which simply means it couldn't parse the input string. This is
-     * handled by the second "if" consistent across platforms.
-     */
-    if (errno && errno != ERANGE && errno != EINVAL)
-        ereturn(escontext, InvalidOid,
-                (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
-                 errmsg("invalid input syntax for type %s: \"%s\"",
-                        "oid", s)));
-
-    if (endptr == s && *s != '\0')
-        ereturn(escontext, InvalidOid,
-                (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
-                 errmsg("invalid input syntax for type %s: \"%s\"",
-                        "oid", s)));
-
-    if (errno == ERANGE)
-        ereturn(escontext, InvalidOid,
-                (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-                 errmsg("value \"%s\" is out of range for type %s",
-                        s, "oid")));
-
-    if (endloc)
-    {
-        /* caller wants to deal with rest of string */
-        *endloc = endptr;
-    }
-    else
-    {
-        /* allow only whitespace after number */
-        while (*endptr && isspace((unsigned char) *endptr))
-            endptr++;
-        if (*endptr)
-            ereturn(escontext, InvalidOid,
-                    (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
-                     errmsg("invalid input syntax for type %s: \"%s\"",
-                            "oid", s)));
-    }
-
-    result = (Oid) cvt;
-
-    /*
-     * Cope with possibility that unsigned long is wider than Oid, in which
-     * case strtoul will not raise an error for some values that are out of
-     * the range of Oid.
-     *
-     * For backwards compatibility, we want to accept inputs that are given
-     * with a minus sign, so allow the input value if it matches after either
-     * signed or unsigned extension to long.
-     *
-     * To ensure consistent results on 32-bit and 64-bit platforms, make sure
-     * the error message is the same as if strtoul() had returned ERANGE.
-     */
-#if OID_MAX != ULONG_MAX
-    if (cvt != (unsigned long) result &&
-        cvt != (unsigned long) ((int) result))
-        ereturn(escontext, InvalidOid,
-                (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-                 errmsg("value \"%s\" is out of range for type %s",
-                        s, "oid")));
-#endif
-
-    return result;
-}
-
 Datum
 oidin(PG_FUNCTION_ARGS)
 {
     char       *s = PG_GETARG_CSTRING(0);
     Oid            result;

-    result = oidin_subr(s, NULL, fcinfo->context);
+    result = uint32in_subr(s, NULL, "oid", fcinfo->context);
     PG_RETURN_OID(result);
 }

@@ -218,7 +125,8 @@ oidvectorin(PG_FUNCTION_ARGS)
             oidString++;
         if (*oidString == '\0')
             break;
-        result->values[n] = oidin_subr(oidString, &oidString, escontext);
+        result->values[n] = uint32in_subr(oidString, &oidString,
+                                          "oid", escontext);
         if (SOFT_ERROR_OCCURRED(escontext))
             PG_RETURN_NULL();
     }
@@ -339,7 +247,8 @@ oidparse(Node *node)
              * constants by the lexer.  Accept these if they are valid OID
              * strings.
              */
-            return oidin_subr(castNode(Float, node)->fval, NULL, NULL);
+            return uint32in_subr(castNode(Float, node)->fval, NULL,
+                                 "oid", NULL);
         default:
             elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
     }
diff --git a/src/backend/utils/adt/xid.c b/src/backend/utils/adt/xid.c
index e4b4952a28..6c94aadb6f 100644
--- a/src/backend/utils/adt/xid.c
+++ b/src/backend/utils/adt/xid.c
@@ -31,8 +31,10 @@ Datum
 xidin(PG_FUNCTION_ARGS)
 {
     char       *str = PG_GETARG_CSTRING(0);
+    TransactionId result;

-    PG_RETURN_TRANSACTIONID((TransactionId) strtoul(str, NULL, 0));
+    result = uint32in_subr(str, NULL, "xid", fcinfo->context);
+    PG_RETURN_TRANSACTIONID(result);
 }

 Datum
@@ -183,8 +185,10 @@ Datum
 xid8in(PG_FUNCTION_ARGS)
 {
     char       *str = PG_GETARG_CSTRING(0);
+    uint64        result;

-    PG_RETURN_FULLTRANSACTIONID(FullTransactionIdFromU64(strtou64(str, NULL, 0)));
+    result = uint64in_subr(str, NULL, "xid8", fcinfo->context);
+    PG_RETURN_FULLTRANSACTIONID(FullTransactionIdFromU64(result));
 }

 Datum
@@ -321,8 +325,10 @@ Datum
 cidin(PG_FUNCTION_ARGS)
 {
     char       *str = PG_GETARG_CSTRING(0);
+    CommandId    result;

-    PG_RETURN_COMMANDID((CommandId) strtoul(str, NULL, 0));
+    result = uint32in_subr(str, NULL, "cid", fcinfo->context);
+    PG_RETURN_COMMANDID(result);
 }

 /*
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 15373ba68f..a4a20b5a45 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -51,6 +51,10 @@ extern int32 pg_strtoint32(const char *s);
 extern int32 pg_strtoint32_safe(const char *s, Node *escontext);
 extern int64 pg_strtoint64(const char *s);
 extern int64 pg_strtoint64_safe(const char *s, Node *escontext);
+extern uint32 uint32in_subr(const char *s, char **endloc,
+                            const char *typname, Node *escontext);
+extern uint64 uint64in_subr(const char *s, char **endloc,
+                            const char *typname, Node *escontext);
 extern int    pg_itoa(int16 i, char *a);
 extern int    pg_ultoa_n(uint32 value, char *a);
 extern int    pg_ulltoa_n(uint64 value, char *a);
diff --git a/src/test/regress/expected/xid.out b/src/test/regress/expected/xid.out
index c7b8d299c8..e62f701943 100644
--- a/src/test/regress/expected/xid.out
+++ b/src/test/regress/expected/xid.out
@@ -13,29 +13,58 @@ select '010'::xid,
    8 |  42 | 4294967295 | 4294967295 |    8 |   42 | 18446744073709551615 | 18446744073709551615
 (1 row)

--- garbage values are not yet rejected (perhaps they should be)
+-- garbage values
 select ''::xid;
- xid
------
-   0
+ERROR:  invalid input syntax for type xid: ""
+LINE 1: select ''::xid;
+               ^
+select 'asdf'::xid;
+ERROR:  invalid input syntax for type xid: "asdf"
+LINE 1: select 'asdf'::xid;
+               ^
+select ''::xid8;
+ERROR:  invalid input syntax for type xid8: ""
+LINE 1: select ''::xid8;
+               ^
+select 'asdf'::xid8;
+ERROR:  invalid input syntax for type xid8: "asdf"
+LINE 1: select 'asdf'::xid8;
+               ^
+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('42', 'xid');
+ pg_input_is_valid
+-------------------
+ t
 (1 row)

-select 'asdf'::xid;
- xid
------
-   0
+SELECT pg_input_is_valid('asdf', 'xid');
+ pg_input_is_valid
+-------------------
+ f
 (1 row)

-select ''::xid8;
- xid8
-------
-    0
+SELECT pg_input_error_message('0xffffffffff', 'xid');
+              pg_input_error_message
+---------------------------------------------------
+ value "0xffffffffff" is out of range for type xid
 (1 row)

-select 'asdf'::xid8;
- xid8
-------
-    0
+SELECT pg_input_is_valid('42', 'xid8');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('asdf', 'xid8');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_error_message('0xffffffffffffffffffff', 'xid8');
+                    pg_input_error_message
+--------------------------------------------------------------
+ value "0xffffffffffffffffffff" is out of range for type xid8
 (1 row)

 -- equality
diff --git a/src/test/regress/sql/xid.sql b/src/test/regress/sql/xid.sql
index 2289803681..b6996588ef 100644
--- a/src/test/regress/sql/xid.sql
+++ b/src/test/regress/sql/xid.sql
@@ -10,12 +10,20 @@ select '010'::xid,
        '0xffffffffffffffff'::xid8,
        '-1'::xid8;

--- garbage values are not yet rejected (perhaps they should be)
+-- garbage values
 select ''::xid;
 select 'asdf'::xid;
 select ''::xid8;
 select 'asdf'::xid8;

+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('42', 'xid');
+SELECT pg_input_is_valid('asdf', 'xid');
+SELECT pg_input_error_message('0xffffffffff', 'xid');
+SELECT pg_input_is_valid('42', 'xid8');
+SELECT pg_input_is_valid('asdf', 'xid8');
+SELECT pg_input_error_message('0xffffffffffffffffffff', 'xid8');
+
 -- equality
 select '1'::xid = '1'::xid;
 select '1'::xid != '1'::xid;

Re: Error-safe user functions

From

Andrew Dunstan

Date:

26 December 2022, 13:59:08

On 2022-12-25 Su 12:13, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Fri, Dec 16, 2022 at 1:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> The reg* functions probably need a unified plan as to how far
>>> down we want to push non-error behavior.
>> I would be in favor of an aggressive approach.
> Here's a proposed patch for converting regprocin and friends
> to soft error reporting.  I'll say at the outset that it's an
> engineering compromise, and it may be worth going further in
> future.  But I doubt it's worth doing more than this for v16,
> because the next steps would be pretty invasive.


It's a judgement call, but I'm not too fussed about stopping here for
v16. I see the reg* items as probably the lowest priority to fix.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Error-safe user functions

From

Tom Lane

Date:

26 December 2022, 17:47:02

Here's a proposed patch for making tsvectorin and tsqueryin
report errors softly.  We have to take the changes down a
couple of levels of subroutines, but it's not hugely difficult.

A couple of points worthy of comment:

* To reduce API changes, I made the functions in
tsvector_parser.c and tsquery.c pass around the escontext pointer
in TSVectorParseState and TSQueryParserState respectively.
This is a little duplicative, but since those structs are private
within those files, there's no easy way to share the same
pointer except by adding it as a new parameter to all those
functions.  This also means that if any of the outside callers
of parse_tsquery (in to_tsany.c) wanted to do soft error handling
and wanted their custom PushFunctions to be able to report such
errors, they'd need to pass the escontext via their "opaque"
passthrough structs, making for yet a third copy.  Still,
I judged adding an extra parameter to dozens of functions wasn't
a better way.

* There are two places in tsquery parsing that emit nuisance
NOTICEs about empty queries.  I chose to suppress those when
soft error handling has been requested.  Maybe we should rethink
whether we want them at all?
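
The two points above can be sketched in a few lines of standalone C. This is an illustration only, not the patch: SoftErrorSink is a hypothetical stand-in for ErrorSaveContext, ParserState stands in for the private parse-state structs, and parse_digits is a toy parser. The error sink travels inside the state struct, so the inner helper keeps its signature, and the nuisance notice is printed only when no sink was supplied:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for ErrorSaveContext: records that a soft error happened. */
typedef struct SoftErrorSink
{
    bool        error_occurred;
    const char *message;
} SoftErrorSink;

/* Private parser state; the sink rides along instead of being a parameter. */
typedef struct ParserState
{
    const char *buf;            /* current position in the input */
    SoftErrorSink *escontext;   /* NULL means no soft-error handling */
} ParserState;

/* Inner helper: signature unchanged, reaches the sink through the state. */
static bool
consume_digit(ParserState *state)
{
    if (*state->buf < '0' || *state->buf > '9')
    {
        /* A real hard error would not return; this sketch just fails. */
        if (state->escontext)
        {
            state->escontext->error_occurred = true;
            state->escontext->message = "digit expected";
        }
        return false;
    }
    state->buf++;
    return true;
}

/*
 * Entry point: returns the number of digits parsed, or -1 on error.
 * The notice is emitted only when the caller did not ask for soft errors.
 */
int
parse_digits(const char *input, SoftErrorSink *escontext)
{
    ParserState state = {input, escontext};
    bool        noisy = (escontext == NULL);
    int         n = 0;

    assert(input != NULL);

    while (*state.buf != '\0')
    {
        if (!consume_digit(&state))
            return -1;
        n++;
    }

    if (n == 0 && noisy)
        fprintf(stderr, "NOTICE: input contains no digits\n");

    return n;
}
```

The cost of this layout is the one described above: any second struct that needs to report soft errors has to carry its own copy of the pointer.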

With the other patches I've posted recently, this covers all
of the core datatype input functions.  There are still half
a dozen to tackle in contrib.

            regards, tom lane

diff --git a/src/backend/tsearch/to_tsany.c b/src/backend/tsearch/to_tsany.c
index edeffacc2d..c0030ddaec 100644
--- a/src/backend/tsearch/to_tsany.c
+++ b/src/backend/tsearch/to_tsany.c
@@ -594,7 +594,8 @@ to_tsquery_byid(PG_FUNCTION_ARGS)
     query = parse_tsquery(text_to_cstring(in),
                           pushval_morph,
                           PointerGetDatum(&data),
-                          0);
+                          0,
+                          NULL);

     PG_RETURN_TSQUERY(query);
 }
@@ -630,7 +631,8 @@ plainto_tsquery_byid(PG_FUNCTION_ARGS)
     query = parse_tsquery(text_to_cstring(in),
                           pushval_morph,
                           PointerGetDatum(&data),
-                          P_TSQ_PLAIN);
+                          P_TSQ_PLAIN,
+                          NULL);

     PG_RETURN_POINTER(query);
 }
@@ -667,7 +669,8 @@ phraseto_tsquery_byid(PG_FUNCTION_ARGS)
     query = parse_tsquery(text_to_cstring(in),
                           pushval_morph,
                           PointerGetDatum(&data),
-                          P_TSQ_PLAIN);
+                          P_TSQ_PLAIN,
+                          NULL);

     PG_RETURN_TSQUERY(query);
 }
@@ -704,7 +707,8 @@ websearch_to_tsquery_byid(PG_FUNCTION_ARGS)
     query = parse_tsquery(text_to_cstring(in),
                           pushval_morph,
                           PointerGetDatum(&data),
-                          P_TSQ_WEB);
+                          P_TSQ_WEB,
+                          NULL);

     PG_RETURN_TSQUERY(query);
 }
diff --git a/src/backend/utils/adt/tsquery.c b/src/backend/utils/adt/tsquery.c
index a206926042..1097294d55 100644
--- a/src/backend/utils/adt/tsquery.c
+++ b/src/backend/utils/adt/tsquery.c
@@ -16,6 +16,7 @@

 #include "libpq/pqformat.h"
 #include "miscadmin.h"
+#include "nodes/miscnodes.h"
 #include "tsearch/ts_locale.h"
 #include "tsearch/ts_type.h"
 #include "tsearch/ts_utils.h"
@@ -58,10 +59,16 @@ typedef enum
 /*
  * get token from query string
  *
- * *operator is filled in with OP_* when return values is PT_OPR,
- * but *weight could contain a distance value in case of phrase operator.
- * *strval, *lenval and *weight are filled in when return value is PT_VAL
+ * All arguments except "state" are output arguments.
  *
+ * If return value is PT_OPR, then *operator is filled with an OP_* code
+ * and *weight will contain a distance value in case of phrase operator.
+ *
+ * If return value is PT_VAL, then *lenval, *strval, *weight, and *prefix
+ * are filled.
+ *
+ * If PT_ERR is returned then a soft error has occurred.  If state->escontext
+ * isn't already filled then this should be reported as a generic parse error.
  */
 typedef ts_tokentype (*ts_tokenizer) (TSQueryParserState state, int8 *operator,
                                       int *lenval, char **strval,
@@ -93,6 +100,9 @@ struct TSQueryParserStateData

     /* state for value's parser */
     TSVectorParseState valstate;
+
+    /* context object for soft errors - must match valstate's escontext */
+    Node       *escontext;
 };

 /*
@@ -194,7 +204,7 @@ parse_phrase_operator(TSQueryParserState pstate, int16 *distance)
                 if (ptr == endptr)
                     return false;
                 else if (errno == ERANGE || l < 0 || l > MAXENTRYPOS)
-                    ereport(ERROR,
+                    ereturn(pstate->escontext, false,
                             (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                             errmsg("distance in phrase operator must be an integer value between zero and %d inclusive",
                                     MAXENTRYPOS)));
@@ -301,10 +311,8 @@ gettoken_query_standard(TSQueryParserState state, int8 *operator,
                 }
                 else if (t_iseq(state->buf, ':'))
                 {
-                    ereport(ERROR,
-                            (errcode(ERRCODE_SYNTAX_ERROR),
-                             errmsg("syntax error in tsquery: \"%s\"",
-                                    state->buffer)));
+                    /* generic syntax error message is fine */
+                    return PT_ERR;
                 }
                 else if (!t_isspace(state->buf))
                 {
@@ -320,12 +328,17 @@ gettoken_query_standard(TSQueryParserState state, int8 *operator,
                         state->state = WAITOPERATOR;
                         return PT_VAL;
                     }
+                    else if (SOFT_ERROR_OCCURRED(state->escontext))
+                    {
+                        /* gettoken_tsvector reported a soft error */
+                        return PT_ERR;
+                    }
                     else if (state->state == WAITFIRSTOPERAND)
                     {
                         return PT_END;
                     }
                     else
-                        ereport(ERROR,
+                        ereturn(state->escontext, PT_ERR,
                                 (errcode(ERRCODE_SYNTAX_ERROR),
                                  errmsg("no operand in tsquery: \"%s\"",
                                         state->buffer)));
@@ -354,6 +367,11 @@ gettoken_query_standard(TSQueryParserState state, int8 *operator,
                     *operator = OP_PHRASE;
                     return PT_OPR;
                 }
+                else if (SOFT_ERROR_OCCURRED(state->escontext))
+                {
+                    /* parse_phrase_operator reported a soft error */
+                    return PT_ERR;
+                }
                 else if (t_iseq(state->buf, ')'))
                 {
                     state->buf++;
@@ -438,6 +456,11 @@ gettoken_query_websearch(TSQueryParserState state, int8 *operator,
                         state->state = WAITOPERATOR;
                         return PT_VAL;
                     }
+                    else if (SOFT_ERROR_OCCURRED(state->escontext))
+                    {
+                        /* gettoken_tsvector reported a soft error */
+                        return PT_ERR;
+                    }
                     else if (state->state == WAITFIRSTOPERAND)
                     {
                         return PT_END;
@@ -529,12 +552,12 @@ pushValue_internal(TSQueryParserState state, pg_crc32 valcrc, int distance, int
     QueryOperand *tmp;

     if (distance >= MAXSTRPOS)
-        ereport(ERROR,
+        ereturn(state->escontext,,
                 (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                  errmsg("value is too big in tsquery: \"%s\"",
                         state->buffer)));
     if (lenval >= MAXSTRLEN)
-        ereport(ERROR,
+        ereturn(state->escontext,,
                 (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                  errmsg("operand is too long in tsquery: \"%s\"",
                         state->buffer)));
@@ -562,7 +585,7 @@ pushValue(TSQueryParserState state, char *strval, int lenval, int16 weight, bool
     pg_crc32    valcrc;

     if (lenval >= MAXSTRLEN)
-        ereport(ERROR,
+        ereturn(state->escontext,,
                 (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                  errmsg("word is too long in tsquery: \"%s\"",
                         state->buffer)));
@@ -686,11 +709,17 @@ makepol(TSQueryParserState state,
                 return;
             case PT_ERR:
             default:
-                ereport(ERROR,
-                        (errcode(ERRCODE_SYNTAX_ERROR),
-                         errmsg("syntax error in tsquery: \"%s\"",
-                                state->buffer)));
+                /* don't overwrite a soft error saved by gettoken function */
+                if (!SOFT_ERROR_OCCURRED(state->escontext))
+                    errsave(state->escontext,
+                            (errcode(ERRCODE_SYNTAX_ERROR),
+                             errmsg("syntax error in tsquery: \"%s\"",
+                                    state->buffer)));
+                return;
         }
+        /* detect soft error in pushval or recursion */
+        if (SOFT_ERROR_OCCURRED(state->escontext))
+            return;
     }

     cleanOpStack(state, opstack, &lenstack, OP_OR /* lowest */ );
@@ -769,6 +798,8 @@ findoprnd(QueryItem *ptr, int size, bool *needcleanup)


 /*
+ * Parse the tsquery stored in "buf".
+ *
  * Each value (operand) in the query is passed to pushval. pushval can
  * transform the simple value to an arbitrarily complex expression using
  * pushValue and pushOperator. It must push a single value with pushValue,
@@ -778,12 +809,19 @@ findoprnd(QueryItem *ptr, int size, bool *needcleanup)
  *
  * opaque is passed on to pushval as is, pushval can use it to store its
  * private state.
+ *
+ * The pushval function can record soft errors via escontext.
+ * Callers must check SOFT_ERROR_OCCURRED to detect that.
+ *
+ * A bitmask of flags (see ts_utils.h) and an error context object
+ * can be provided as well.  If a soft error occurs, NULL is returned.
  */
 TSQuery
 parse_tsquery(char *buf,
               PushFunction pushval,
               Datum opaque,
-              int flags)
+              int flags,
+              Node *escontext)
 {
     struct TSQueryParserStateData state;
     int            i;
@@ -791,6 +829,7 @@ parse_tsquery(char *buf,
     int            commonlen;
     QueryItem  *ptr;
     ListCell   *cell;
+    bool        noisy;
     bool        needcleanup;
     int            tsv_flags = P_TSV_OPR_IS_DELIM | P_TSV_IS_TSQUERY;

@@ -808,15 +847,19 @@ parse_tsquery(char *buf,
     else
         state.gettoken = gettoken_query_standard;

+    /* emit nuisance NOTICEs only if not doing soft errors */
+    noisy = !(escontext && IsA(escontext, ErrorSaveContext));
+
     /* init state */
     state.buffer = buf;
     state.buf = buf;
     state.count = 0;
     state.state = WAITFIRSTOPERAND;
     state.polstr = NIL;
+    state.escontext = escontext;

     /* init value parser's state */
-    state.valstate = init_tsvector_parser(state.buffer, tsv_flags);
+    state.valstate = init_tsvector_parser(state.buffer, tsv_flags, escontext);

     /* init list of operand */
     state.sumlen = 0;
@@ -829,11 +872,15 @@ parse_tsquery(char *buf,

     close_tsvector_parser(state.valstate);

+    if (SOFT_ERROR_OCCURRED(escontext))
+        return NULL;
+
     if (state.polstr == NIL)
     {
-        ereport(NOTICE,
-                (errmsg("text-search query doesn't contain lexemes: \"%s\"",
-                        state.buffer)));
+        if (noisy)
+            ereport(NOTICE,
+                    (errmsg("text-search query doesn't contain lexemes: \"%s\"",
+                            state.buffer)));
         query = (TSQuery) palloc(HDRSIZETQ);
         SET_VARSIZE(query, HDRSIZETQ);
         query->size = 0;
@@ -841,7 +888,7 @@ parse_tsquery(char *buf,
     }

     if (TSQUERY_TOO_BIG(list_length(state.polstr), state.sumlen))
-        ereport(ERROR,
+        ereturn(escontext, NULL,
                 (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                  errmsg("tsquery is too large")));
     commonlen = COMPUTESIZE(list_length(state.polstr), state.sumlen);
@@ -889,7 +936,7 @@ parse_tsquery(char *buf,
      * If there are QI_VALSTOP nodes, delete them and simplify the tree.
      */
     if (needcleanup)
-        query = cleanup_tsquery_stopwords(query);
+        query = cleanup_tsquery_stopwords(query, noisy);

     return query;
 }
@@ -908,8 +955,13 @@ Datum
 tsqueryin(PG_FUNCTION_ARGS)
 {
     char       *in = PG_GETARG_CSTRING(0);
+    Node       *escontext = fcinfo->context;

-    PG_RETURN_TSQUERY(parse_tsquery(in, pushval_asis, PointerGetDatum(NULL), 0));
+    PG_RETURN_TSQUERY(parse_tsquery(in,
+                                    pushval_asis,
+                                    PointerGetDatum(NULL),
+                                    0,
+                                    escontext));
 }

 /*
diff --git a/src/backend/utils/adt/tsquery_cleanup.c b/src/backend/utils/adt/tsquery_cleanup.c
index b77a7878dc..94030a75d5 100644
--- a/src/backend/utils/adt/tsquery_cleanup.c
+++ b/src/backend/utils/adt/tsquery_cleanup.c
@@ -383,7 +383,7 @@ calcstrlen(NODE *node)
  * Remove QI_VALSTOP (stopword) nodes from TSQuery.
  */
 TSQuery
-cleanup_tsquery_stopwords(TSQuery in)
+cleanup_tsquery_stopwords(TSQuery in, bool noisy)
 {
     int32        len,
                 lenstr,
@@ -403,8 +403,9 @@ cleanup_tsquery_stopwords(TSQuery in)
     root = clean_stopword_intree(maketree(GETQUERY(in)), &ladd, &radd);
     if (root == NULL)
     {
-        ereport(NOTICE,
-                (errmsg("text-search query contains only stop words or doesn't contain lexemes, ignored")));
+        if (noisy)
+            ereport(NOTICE,
+                    (errmsg("text-search query contains only stop words or doesn't contain lexemes, ignored")));
         out = palloc(HDRSIZETQ);
         out->size = 0;
         SET_VARSIZE(out, HDRSIZETQ);
diff --git a/src/backend/utils/adt/tsvector.c b/src/backend/utils/adt/tsvector.c
index 04c6f33537..0b430d3c47 100644
--- a/src/backend/utils/adt/tsvector.c
+++ b/src/backend/utils/adt/tsvector.c
@@ -15,6 +15,7 @@
 #include "postgres.h"

 #include "libpq/pqformat.h"
+#include "nodes/miscnodes.h"
 #include "tsearch/ts_locale.h"
 #include "tsearch/ts_utils.h"
 #include "utils/builtins.h"
@@ -178,6 +179,7 @@ Datum
 tsvectorin(PG_FUNCTION_ARGS)
 {
     char       *buf = PG_GETARG_CSTRING(0);
+    Node       *escontext = fcinfo->context;
     TSVectorParseState state;
     WordEntryIN *arr;
     int            totallen;
@@ -201,7 +203,7 @@ tsvectorin(PG_FUNCTION_ARGS)
     char       *cur;
     int            buflen = 256;    /* allocated size of tmpbuf */

-    state = init_tsvector_parser(buf, 0);
+    state = init_tsvector_parser(buf, 0, escontext);

     arrlen = 64;
     arr = (WordEntryIN *) palloc(sizeof(WordEntryIN) * arrlen);
@@ -210,14 +212,14 @@ tsvectorin(PG_FUNCTION_ARGS)
     while (gettoken_tsvector(state, &token, &toklen, &pos, &poslen, NULL))
     {
         if (toklen >= MAXSTRLEN)
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                      errmsg("word is too long (%ld bytes, max %ld bytes)",
                             (long) toklen,
                             (long) (MAXSTRLEN - 1))));

         if (cur - tmpbuf > MAXSTRPOS)
-            ereport(ERROR,
+            ereturn(escontext, (Datum) 0,
                     (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                      errmsg("string is too long for tsvector (%ld bytes, max %ld bytes)",
                             (long) (cur - tmpbuf), (long) MAXSTRPOS)));
@@ -261,13 +263,17 @@ tsvectorin(PG_FUNCTION_ARGS)

     close_tsvector_parser(state);

+    /* Did gettoken_tsvector fail? */
+    if (SOFT_ERROR_OCCURRED(escontext))
+        PG_RETURN_NULL();
+
     if (len > 0)
         len = uniqueentry(arr, len, tmpbuf, &buflen);
     else
         buflen = 0;

     if (buflen > MAXSTRPOS)
-        ereport(ERROR,
+        ereturn(escontext, (Datum) 0,
                 (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                  errmsg("string is too long for tsvector (%d bytes, max %d bytes)", buflen, MAXSTRPOS)));

@@ -285,6 +291,7 @@ tsvectorin(PG_FUNCTION_ARGS)
         stroff += arr[i].entry.len;
         if (arr[i].entry.haspos)
         {
+            /* This should be unreachable because of MAXNUMPOS restrictions */
             if (arr[i].poslen > 0xFFFF)
                 elog(ERROR, "positions array too long");

diff --git a/src/backend/utils/adt/tsvector_parser.c b/src/backend/utils/adt/tsvector_parser.c
index e2460d393a..eeea93e622 100644
--- a/src/backend/utils/adt/tsvector_parser.c
+++ b/src/backend/utils/adt/tsvector_parser.c
@@ -20,9 +20,19 @@

 /*
  * Private state of tsvector parser.  Note that tsquery also uses this code to
- * parse its input, hence the boolean flags.  The two flags are both true or
- * both false in current usage, but we keep them separate for clarity.
+ * parse its input, hence the boolean flags.  The oprisdelim and is_tsquery
+ * flags are both true or both false in current usage, but we keep them
+ * separate for clarity.
+ *
+ * If oprisdelim is set, the following characters are treated as delimiters
+ * (in addition to whitespace): ! | & ( )
+ *
  * is_tsquery affects *only* the content of error messages.
+ *
+ * is_web can be true to further modify tsquery parsing.
+ *
+ * If escontext is an ErrorSaveContext node, then soft errors can be
+ * captured there rather than being thrown.
  */
 struct TSVectorParseStateData
 {
@@ -34,16 +44,17 @@ struct TSVectorParseStateData
     bool        oprisdelim;        /* treat ! | * ( ) as delimiters? */
     bool        is_tsquery;        /* say "tsquery" not "tsvector" in errors? */
     bool        is_web;            /* we're in websearch_to_tsquery() */
+    Node       *escontext;        /* for soft error reporting */
 };


 /*
- * Initializes parser for the input string. If oprisdelim is set, the
- * following characters are treated as delimiters in addition to whitespace:
- * ! | & ( )
+ * Initializes a parser state object for the given input string.
+ * A bitmask of flags (see ts_utils.h) and an error context object
+ * can be provided as well.
  */
 TSVectorParseState
-init_tsvector_parser(char *input, int flags)
+init_tsvector_parser(char *input, int flags, Node *escontext)
 {
     TSVectorParseState state;

@@ -56,12 +67,15 @@ init_tsvector_parser(char *input, int flags)
     state->oprisdelim = (flags & P_TSV_OPR_IS_DELIM) != 0;
     state->is_tsquery = (flags & P_TSV_IS_TSQUERY) != 0;
     state->is_web = (flags & P_TSV_IS_WEB) != 0;
+    state->escontext = escontext;

     return state;
 }

 /*
  * Reinitializes parser to parse 'input', instead of previous input.
+ *
+ * Note that bufstart (the string reported in errors) is not changed.
  */
 void
 reset_tsvector_parser(TSVectorParseState state, char *input)
@@ -122,23 +136,26 @@ do { \
 #define WAITPOSDELIM    7
 #define WAITCHARCMPLX    8

-#define PRSSYNTAXERROR prssyntaxerror(state)
+#define PRSSYNTAXERROR return prssyntaxerror(state)

-static void
+static bool
 prssyntaxerror(TSVectorParseState state)
 {
-    ereport(ERROR,
+    errsave(state->escontext,
             (errcode(ERRCODE_SYNTAX_ERROR),
              state->is_tsquery ?
              errmsg("syntax error in tsquery: \"%s\"", state->bufstart) :
              errmsg("syntax error in tsvector: \"%s\"", state->bufstart)));
+    /* In soft error situation, return false as convenience for caller */
+    return false;
 }


 /*
  * Get next token from string being parsed. Returns true if successful,
- * false if end of input string is reached.  On success, these output
- * parameters are filled in:
+ * false if end of input string is reached or soft error.
+ *
+ * On success, these output parameters are filled in:
  *
  * *strval        pointer to token
  * *lenval        length of *strval
@@ -149,7 +166,11 @@ prssyntaxerror(TSVectorParseState state)
  * *poslen        number of elements in *pos_ptr
  * *endptr        scan resumption point
  *
- * Pass NULL for unwanted output parameters.
+ * Pass NULL for any unwanted output parameters.
+ *
+ * If state->escontext is an ErrorSaveContext, then caller must check
+ * SOFT_ERROR_OCCURRED() to determine whether a "false" result means
+ * error or normal end-of-string.
  */
 bool
 gettoken_tsvector(TSVectorParseState state,
@@ -195,7 +216,7 @@ gettoken_tsvector(TSVectorParseState state,
         else if (statecode == WAITNEXTCHAR)
         {
             if (*(state->prsbuf) == '\0')
-                ereport(ERROR,
+                ereturn(state->escontext, false,
                         (errcode(ERRCODE_SYNTAX_ERROR),
                          errmsg("there is no escaped character: \"%s\"",
                                 state->bufstart)));
@@ -313,7 +334,7 @@ gettoken_tsvector(TSVectorParseState state,
                 WEP_SETPOS(pos[npos - 1], LIMITPOS(atoi(state->prsbuf)));
                 /* we cannot get here in tsquery, so no need for 2 errmsgs */
                 if (WEP_GETPOS(pos[npos - 1]) == 0)
-                    ereport(ERROR,
+                    ereturn(state->escontext, false,
                             (errcode(ERRCODE_SYNTAX_ERROR),
                              errmsg("wrong position info in tsvector: \"%s\"",
                                     state->bufstart)));
diff --git a/src/include/tsearch/ts_utils.h b/src/include/tsearch/ts_utils.h
index 6fdd334fff..2297fb6cd5 100644
--- a/src/include/tsearch/ts_utils.h
+++ b/src/include/tsearch/ts_utils.h
@@ -25,11 +25,13 @@
 struct TSVectorParseStateData;    /* opaque struct in tsvector_parser.c */
 typedef struct TSVectorParseStateData *TSVectorParseState;

+/* flag bits that can be passed to init_tsvector_parser: */
 #define P_TSV_OPR_IS_DELIM    (1 << 0)
 #define P_TSV_IS_TSQUERY    (1 << 1)
 #define P_TSV_IS_WEB        (1 << 2)

-extern TSVectorParseState init_tsvector_parser(char *input, int flags);
+extern TSVectorParseState init_tsvector_parser(char *input, int flags,
+                                               Node *escontext);
 extern void reset_tsvector_parser(TSVectorParseState state, char *input);
 extern bool gettoken_tsvector(TSVectorParseState state,
                               char **strval, int *lenval,
@@ -58,13 +60,15 @@ typedef void (*PushFunction) (Datum opaque, TSQueryParserState state,
                                                      * QueryOperand struct */
                               bool prefix);

+/* flag bits that can be passed to parse_tsquery: */
 #define P_TSQ_PLAIN        (1 << 0)
 #define P_TSQ_WEB        (1 << 1)

 extern TSQuery parse_tsquery(char *buf,
                              PushFunction pushval,
                              Datum opaque,
-                             int flags);
+                             int flags,
+                             Node *escontext);

 /* Functions for use by PushFunction implementations */
 extern void pushValue(TSQueryParserState state,
@@ -222,7 +226,7 @@ extern int32 tsCompareString(char *a, int lena, char *b, int lenb, bool prefix);
  * TSQuery Utilities
  */
 extern QueryItem *clean_NOT(QueryItem *ptr, int32 *len);
-extern TSQuery cleanup_tsquery_stopwords(TSQuery in);
+extern TSQuery cleanup_tsquery_stopwords(TSQuery in, bool noisy);

 typedef struct QTNode
 {
diff --git a/src/test/regress/expected/tstypes.out b/src/test/regress/expected/tstypes.out
index 92c1c6e10b..a8785cd708 100644
--- a/src/test/regress/expected/tstypes.out
+++ b/src/test/regress/expected/tstypes.out
@@ -89,6 +89,25 @@ SELECT $$'' '1' '2'$$::tsvector;  -- error, empty lexeme is not allowed
 ERROR:  syntax error in tsvector: "'' '1' '2'"
 LINE 1: SELECT $$'' '1' '2'$$::tsvector;
                ^
+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('foo', 'tsvector');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid($$''$$, 'tsvector');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_error_message($$''$$, 'tsvector');
+     pg_input_error_message
+--------------------------------
+ syntax error in tsvector: "''"
+(1 row)
+
 --Base tsquery test
 SELECT '1'::tsquery;
  tsquery
@@ -372,6 +391,31 @@ SELECT '!!a & !!b'::tsquery;
  !!'a' & !!'b'
 (1 row)

+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('foo', 'tsquery');
+ pg_input_is_valid
+-------------------
+ t
+(1 row)
+
+SELECT pg_input_is_valid('foo!', 'tsquery');
+ pg_input_is_valid
+-------------------
+ f
+(1 row)
+
+SELECT pg_input_error_message('foo!', 'tsquery');
+     pg_input_error_message
+---------------------------------
+ syntax error in tsquery: "foo!"
+(1 row)
+
+SELECT pg_input_error_message('a <100000> b', 'tsquery');
+                                pg_input_error_message
+---------------------------------------------------------------------------------------
+ distance in phrase operator must be an integer value between zero and 16384 inclusive
+(1 row)
+
 --comparisons
 SELECT 'a' < 'b & c'::tsquery as "true";
  true
diff --git a/src/test/regress/sql/tstypes.sql b/src/test/regress/sql/tstypes.sql
index 61e8f49c91..b73dd1cb07 100644
--- a/src/test/regress/sql/tstypes.sql
+++ b/src/test/regress/sql/tstypes.sql
@@ -19,6 +19,11 @@ SELECT '''w'':4A,3B,2C,1D,5 a:8';
 SELECT 'a:3A b:2a'::tsvector || 'ba:1234 a:1B';
 SELECT $$'' '1' '2'$$::tsvector;  -- error, empty lexeme is not allowed

+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('foo', 'tsvector');
+SELECT pg_input_is_valid($$''$$, 'tsvector');
+SELECT pg_input_error_message($$''$$, 'tsvector');
+
 --Base tsquery test
 SELECT '1'::tsquery;
 SELECT '1 '::tsquery;
@@ -68,6 +73,12 @@ SELECT 'a & !!b'::tsquery;
 SELECT '!!a & b'::tsquery;
 SELECT '!!a & !!b'::tsquery;

+-- Also try it with non-error-throwing API
+SELECT pg_input_is_valid('foo', 'tsquery');
+SELECT pg_input_is_valid('foo!', 'tsquery');
+SELECT pg_input_error_message('foo!', 'tsquery');
+SELECT pg_input_error_message('a <100000> b', 'tsquery');
+
 --comparisons
 SELECT 'a' < 'b & c'::tsquery as "true";
 SELECT 'a' > 'b & c'::tsquery as "false";
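
The comment added to gettoken_tsvector() in the patch above says that a "false" result can now mean either end of input or a soft error, so the caller has to check SOFT_ERROR_OCCURRED(). As a rough illustration of that calling pattern, here is a self-contained toy model. It is only a sketch: SoftErrorContext, ToyParser, toy_gettoken() and toy_count_tokens() are hypothetical names invented for this example, and none of it is the actual PostgreSQL code.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an error-save context; not the real struct. */
typedef struct SoftErrorContext
{
    bool        error_occurred; /* set when a soft error is recorded */
    const char *message;        /* static description of the error */
} SoftErrorContext;

/* Toy parser state: scan position plus a soft-error context. */
typedef struct ToyParser
{
    const char *cur;
    SoftErrorContext *escontext;    /* must not be NULL in this sketch */
} ToyParser;

/*
 * Fetch the next run of lowercase letters.  Returns true and sets
 * *tok and *len on success.  Returns false at end of input OR after
 * recording a soft error; the return value alone does not say which.
 */
bool
toy_gettoken(ToyParser *p, const char **tok, size_t *len)
{
    assert(p != NULL && p->escontext != NULL);

    while (*p->cur == ' ')
        p->cur++;
    if (*p->cur == '\0')
        return false;           /* normal end of input */
    if (!islower((unsigned char) *p->cur))
    {
        p->escontext->error_occurred = true;
        p->escontext->message = "syntax error";
        return false;           /* soft error, same return value */
    }
    *tok = p->cur;
    while (islower((unsigned char) *p->cur))
        p->cur++;
    *len = (size_t) (p->cur - *tok);
    return true;
}

/* Count the tokens in input; returns -1 if a soft error was recorded. */
int
toy_count_tokens(const char *input, SoftErrorContext *esc)
{
    ToyParser   p = {input, esc};
    const char *tok;
    size_t      len;
    int         n = 0;

    while (toy_gettoken(&p, &tok, &len))
        n++;

    /* "false" is ambiguous, so consult the context before trusting n. */
    if (esc->error_occurred)
        return -1;
    return n;
}
```

In this model, toy_count_tokens() plays the role that tsvectorin() plays in the patch: it loops until the tokenizer returns false and only then looks at the context to decide whether the loop ended normally.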

Re: Error-safe user functions

From

Andrew Dunstan

Date:

26 December 2022, 19:12:06

On 2022-12-26 Mo 12:47, Tom Lane wrote:
> Here's a proposed patch for making tsvectorin and tsqueryin
> report errors softly.  We have to take the changes down a
> couple of levels of subroutines, but it's not hugely difficult.


Great!


>
> With the other patches I've posted recently, this covers all
> of the core datatype input functions.  There are still half
> a dozen to tackle in contrib.
>
>


Yeah, I'm currently looking at those in ltree.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Error-safe user functions

From

Tom Lane

Date:

26 December 2022, 23:00:00

I wrote:
> (Perhaps we should go further than this, and convert all these
> functions to just be DirectInputFunctionCallSafe wrappers
> around the corresponding input functions?  That would save
> some duplicative code, but I've not done it here.)

I looked closer at that idea, and realized that it would do more than
just save some code: it'd cause the to_regfoo functions to accept
numeric OIDs, as they did not before (and are documented not to).
It is unclear to me whether that inconsistency with the input
functions is really desirable or not --- but I don't offhand see a
good argument for it.  If we change this though, it should probably
happen in a separate commit.  Accordingly, here's a delta patch
doing that.

            regards, tom lane

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 836b9254fb..3bf8d021c3 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -24100,8 +24100,7 @@ SELECT collation for ('foo' COLLATE "de_DE");
         obtained by casting the string to type <type>regclass</type> (see
         <xref linkend="datatype-oid"/>); however, this function will return
         <literal>NULL</literal> rather than throwing an error if the name is
-        not found.  Also unlike the cast, this does not accept
-        a numeric OID as input.
+        not found.
        </para></entry>
       </row>

@@ -24118,8 +24117,7 @@ SELECT collation for ('foo' COLLATE "de_DE");
         obtained by casting the string to type <type>regcollation</type> (see
         <xref linkend="datatype-oid"/>); however, this function will return
         <literal>NULL</literal> rather than throwing an error if the name is
-        not found.  Also unlike the cast, this does not accept
-        a numeric OID as input.
+        not found.
        </para></entry>
       </row>

@@ -24136,8 +24134,7 @@ SELECT collation for ('foo' COLLATE "de_DE");
         obtained by casting the string to type <type>regnamespace</type> (see
         <xref linkend="datatype-oid"/>); however, this function will return
         <literal>NULL</literal> rather than throwing an error if the name is
-        not found.  Also unlike the cast, this does not accept
-        a numeric OID as input.
+        not found.
        </para></entry>
       </row>

@@ -24154,8 +24151,7 @@ SELECT collation for ('foo' COLLATE "de_DE");
         obtained by casting the string to type <type>regoper</type> (see
         <xref linkend="datatype-oid"/>); however, this function will return
         <literal>NULL</literal> rather than throwing an error if the name is
-        not found or is ambiguous.  Also unlike the cast, this does not accept
-        a numeric OID as input.
+        not found or is ambiguous.
        </para></entry>
       </row>

@@ -24172,8 +24168,7 @@ SELECT collation for ('foo' COLLATE "de_DE");
         obtained by casting the string to type <type>regoperator</type> (see
         <xref linkend="datatype-oid"/>); however, this function will return
         <literal>NULL</literal> rather than throwing an error if the name is
-        not found.  Also unlike the cast, this does not accept
-        a numeric OID as input.
+        not found.
        </para></entry>
       </row>

@@ -24190,8 +24185,7 @@ SELECT collation for ('foo' COLLATE "de_DE");
         obtained by casting the string to type <type>regproc</type> (see
         <xref linkend="datatype-oid"/>); however, this function will return
         <literal>NULL</literal> rather than throwing an error if the name is
-        not found or is ambiguous.  Also unlike the cast, this does not accept
-        a numeric OID as input.
+        not found or is ambiguous.
        </para></entry>
       </row>

@@ -24208,8 +24202,7 @@ SELECT collation for ('foo' COLLATE "de_DE");
         obtained by casting the string to type <type>regprocedure</type> (see
         <xref linkend="datatype-oid"/>); however, this function will return
         <literal>NULL</literal> rather than throwing an error if the name is
-        not found.  Also unlike the cast, this does not accept
-        a numeric OID as input.
+        not found.
        </para></entry>
       </row>

@@ -24226,8 +24219,7 @@ SELECT collation for ('foo' COLLATE "de_DE");
         obtained by casting the string to type <type>regrole</type> (see
         <xref linkend="datatype-oid"/>); however, this function will return
         <literal>NULL</literal> rather than throwing an error if the name is
-        not found.  Also unlike the cast, this does not accept
-        a numeric OID as input.
+        not found.
        </para></entry>
       </row>

@@ -24244,8 +24236,7 @@ SELECT collation for ('foo' COLLATE "de_DE");
         obtained by casting the string to type <type>regtype</type> (see
         <xref linkend="datatype-oid"/>); however, this function will return
         <literal>NULL</literal> rather than throwing an error if the name is
-        not found.  Also unlike the cast, this does not accept
-        a numeric OID as input.
+        not found.
        </para></entry>
       </row>
      </tbody>
diff --git a/src/backend/utils/adt/regproc.c b/src/backend/utils/adt/regproc.c
index 14d76c856d..3635a94633 100644
--- a/src/backend/utils/adt/regproc.c
+++ b/src/backend/utils/adt/regproc.c
@@ -118,24 +118,15 @@ Datum
 to_regproc(PG_FUNCTION_ARGS)
 {
     char       *pro_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
-    List       *names;
-    FuncCandidateList clist;
+    Datum        result;
     ErrorSaveContext escontext = {T_ErrorSaveContext};

-    /*
-     * Parse the name into components and see if it matches any pg_proc
-     * entries in the current search path.
-     */
-    names = stringToQualifiedNameList(pro_name, (Node *) &escontext);
-    if (names == NIL)
-        PG_RETURN_NULL();
-
-    clist = FuncnameGetCandidates(names, -1, NIL, false, false, false, true);
-
-    if (clist == NULL || clist->next != NULL)
+    if (!DirectInputFunctionCallSafe(regprocin, pro_name,
+                                     InvalidOid, -1,
+                                     (Node *) &escontext,
+                                     &result))
         PG_RETURN_NULL();
-
-    PG_RETURN_OID(clist->oid);
+    PG_RETURN_DATUM(result);
 }

 /*
@@ -287,31 +278,15 @@ Datum
 to_regprocedure(PG_FUNCTION_ARGS)
 {
     char       *pro_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
-    List       *names;
-    int            nargs;
-    Oid            argtypes[FUNC_MAX_ARGS];
-    FuncCandidateList clist;
+    Datum        result;
     ErrorSaveContext escontext = {T_ErrorSaveContext};

-    /*
-     * Parse the name and arguments, look up potential matches in the current
-     * namespace search list, and scan to see which one exactly matches the
-     * given argument types.    (There will not be more than one match.)
-     */
-    if (!parseNameAndArgTypes(pro_name, false,
-                              &names, &nargs, argtypes,
-                              (Node *) &escontext))
+    if (!DirectInputFunctionCallSafe(regprocedurein, pro_name,
+                                     InvalidOid, -1,
+                                     (Node *) &escontext,
+                                     &result))
         PG_RETURN_NULL();
-
-    clist = FuncnameGetCandidates(names, nargs, NIL, false, false, false, true);
-
-    for (; clist; clist = clist->next)
-    {
-        if (memcmp(clist->args, argtypes, nargs * sizeof(Oid)) == 0)
-            PG_RETURN_OID(clist->oid);
-    }
-
-    PG_RETURN_NULL();
+    PG_RETURN_DATUM(result);
 }

 /*
@@ -552,24 +527,15 @@ Datum
 to_regoper(PG_FUNCTION_ARGS)
 {
     char       *opr_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
-    List       *names;
-    FuncCandidateList clist;
+    Datum        result;
     ErrorSaveContext escontext = {T_ErrorSaveContext};

-    /*
-     * Parse the name into components and see if it matches any pg_operator
-     * entries in the current search path.
-     */
-    names = stringToQualifiedNameList(opr_name, (Node *) &escontext);
-    if (names == NIL)
+    if (!DirectInputFunctionCallSafe(regoperin, opr_name,
+                                     InvalidOid, -1,
+                                     (Node *) &escontext,
+                                     &result))
         PG_RETURN_NULL();
-
-    clist = OpernameGetCandidates(names, '\0', true);
-
-    if (clist == NULL || clist->next != NULL)
-        PG_RETURN_NULL();
-
-    PG_RETURN_OID(clist->oid);
+    PG_RETURN_DATUM(result);
 }

 /*
@@ -728,31 +694,15 @@ Datum
 to_regoperator(PG_FUNCTION_ARGS)
 {
     char       *opr_name_or_oid = text_to_cstring(PG_GETARG_TEXT_PP(0));
-    Oid            result;
-    List       *names;
-    int            nargs;
-    Oid            argtypes[FUNC_MAX_ARGS];
+    Datum        result;
     ErrorSaveContext escontext = {T_ErrorSaveContext};

-    /*
-     * Parse the name and arguments, look up potential matches in the current
-     * namespace search list, and scan to see which one exactly matches the
-     * given argument types.    (There will not be more than one match.)
-     */
-    if (!parseNameAndArgTypes(opr_name_or_oid, true,
-                              &names, &nargs, argtypes,
-                              (Node *) &escontext))
-        PG_RETURN_NULL();
-
-    if (nargs != 2)
+    if (!DirectInputFunctionCallSafe(regoperatorin, opr_name_or_oid,
+                                     InvalidOid, -1,
+                                     (Node *) &escontext,
+                                     &result))
         PG_RETURN_NULL();
-
-    result = OpernameGetOprid(names, argtypes[0], argtypes[1]);
-
-    if (!OidIsValid(result))
-        PG_RETURN_NULL();
-
-    PG_RETURN_OID(result);
+    PG_RETURN_DATUM(result);
 }

 /*
@@ -975,25 +925,15 @@ Datum
 to_regclass(PG_FUNCTION_ARGS)
 {
     char       *class_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
-    Oid            result;
-    List       *names;
+    Datum        result;
     ErrorSaveContext escontext = {T_ErrorSaveContext};

-    /*
-     * Parse the name into components and see if it matches any pg_class
-     * entries in the current search path.
-     */
-    names = stringToQualifiedNameList(class_name, (Node *) &escontext);
-    if (names == NIL)
-        PG_RETURN_NULL();
-
-    /* We might not even have permissions on this relation; don't lock it. */
-    result = RangeVarGetRelid(makeRangeVarFromNameList(names), NoLock, true);
-
-    if (OidIsValid(result))
-        PG_RETURN_OID(result);
-    else
+    if (!DirectInputFunctionCallSafe(regclassin, class_name,
+                                     InvalidOid, -1,
+                                     (Node *) &escontext,
+                                     &result))
         PG_RETURN_NULL();
+    PG_RETURN_DATUM(result);
 }

 /*
@@ -1128,24 +1068,15 @@ Datum
 to_regcollation(PG_FUNCTION_ARGS)
 {
     char       *collation_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
-    Oid            result;
-    List       *names;
+    Datum        result;
     ErrorSaveContext escontext = {T_ErrorSaveContext};

-    /*
-     * Parse the name into components and see if it matches any pg_collation
-     * entries in the current search path.
-     */
-    names = stringToQualifiedNameList(collation_name, (Node *) &escontext);
-    if (names == NIL)
-        PG_RETURN_NULL();
-
-    result = get_collation_oid(names, true);
-
-    if (OidIsValid(result))
-        PG_RETURN_OID(result);
-    else
+    if (!DirectInputFunctionCallSafe(regcollationin, collation_name,
+                                     InvalidOid, -1,
+                                     (Node *) &escontext,
+                                     &result))
         PG_RETURN_NULL();
+    PG_RETURN_DATUM(result);
 }

 /*
@@ -1278,17 +1209,15 @@ Datum
 to_regtype(PG_FUNCTION_ARGS)
 {
     char       *typ_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
-    Oid            result;
-    int32        typmod;
+    Datum        result;
     ErrorSaveContext escontext = {T_ErrorSaveContext};

-    /*
-     * Invoke the full parser to deal with special cases such as array syntax.
-     */
-    if (parseTypeString(typ_name, &result, &typmod, (Node *) &escontext))
-        PG_RETURN_OID(result);
-    else
+    if (!DirectInputFunctionCallSafe(regtypein, typ_name,
+                                     InvalidOid, -1,
+                                     (Node *) &escontext,
+                                     &result))
         PG_RETURN_NULL();
+    PG_RETURN_DATUM(result);
 }

 /*
@@ -1634,23 +1563,15 @@ Datum
 to_regrole(PG_FUNCTION_ARGS)
 {
     char       *role_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
-    Oid            result;
-    List       *names;
+    Datum        result;
     ErrorSaveContext escontext = {T_ErrorSaveContext};

-    names = stringToQualifiedNameList(role_name, (Node *) &escontext);
-    if (names == NIL)
-        PG_RETURN_NULL();
-
-    if (list_length(names) != 1)
-        PG_RETURN_NULL();
-
-    result = get_role_oid(strVal(linitial(names)), true);
-
-    if (OidIsValid(result))
-        PG_RETURN_OID(result);
-    else
+    if (!DirectInputFunctionCallSafe(regrolein, role_name,
+                                     InvalidOid, -1,
+                                     (Node *) &escontext,
+                                     &result))
         PG_RETURN_NULL();
+    PG_RETURN_DATUM(result);
 }

 /*
@@ -1759,23 +1680,15 @@ Datum
 to_regnamespace(PG_FUNCTION_ARGS)
 {
     char       *nsp_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
-    Oid            result;
-    List       *names;
+    Datum        result;
     ErrorSaveContext escontext = {T_ErrorSaveContext};

-    names = stringToQualifiedNameList(nsp_name, (Node *) &escontext);
-    if (names == NIL)
-        PG_RETURN_NULL();
-
-    if (list_length(names) != 1)
-        PG_RETURN_NULL();
-
-    result = get_namespace_oid(strVal(linitial(names)), true);
-
-    if (OidIsValid(result))
-        PG_RETURN_OID(result);
-    else
+    if (!DirectInputFunctionCallSafe(regnamespacein, nsp_name,
+                                     InvalidOid, -1,
+                                     (Node *) &escontext,
+                                     &result))
         PG_RETURN_NULL();
+    PG_RETURN_DATUM(result);
 }

 /*
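
To make the behavioral consequence of the delta patch above concrete: once a to_regfoo function is just a wrapper around the matching input function, it accepts whatever that input function accepts, including an all-digits string taken as an OID. The following self-contained toy model sketches that idea. Everything in it (ToyOid, toy_catalog, toy_regclass_in(), toy_to_regclass()) is hypothetical and only loosely modeled on the real code; in particular the two-entry catalog and the simplified numeric check are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned int ToyOid;

/* Hypothetical two-entry "catalog" used only by this sketch. */
typedef struct ToyCatalogEntry
{
    const char *name;
    ToyOid      oid;
} ToyCatalogEntry;

static const ToyCatalogEntry toy_catalog[] = {
    {"pg_class", 1259},
    {"pg_type", 1247},
};

/*
 * Toy "input function": accepts an all-digits OID or a known name.
 * On failure it sets *soft_error and returns false instead of aborting.
 * (Overflow of very long digit strings is ignored in this sketch.)
 */
bool
toy_regclass_in(const char *str, bool *soft_error, ToyOid *result)
{
    size_t      i;

    assert(str != NULL && soft_error != NULL && result != NULL);

    /* Numeric input is taken as an OID, with no catalog lookup. */
    if (str[0] != '\0' && strspn(str, "0123456789") == strlen(str))
    {
        *result = (ToyOid) strtoul(str, NULL, 10);
        return true;
    }

    /* Otherwise look the name up. */
    for (i = 0; i < sizeof(toy_catalog) / sizeof(toy_catalog[0]); i++)
    {
        if (strcmp(str, toy_catalog[i].name) == 0)
        {
            *result = toy_catalog[i].oid;
            return true;
        }
    }

    *soft_error = true;
    return false;
}

/*
 * Toy to_regclass(): a thin wrapper around the input function.
 * Returning false stands for returning SQL NULL.
 */
bool
toy_to_regclass(const char *str, ToyOid *result)
{
    bool        soft_error = false;

    return toy_regclass_in(str, &soft_error, result);
}
```

Because the wrapper adds no checks of its own, toy_to_regclass("4242", ...) succeeds in this model even though no catalog entry has that OID, which mirrors the point made in the message about numeric OIDs now being accepted.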

Re: Error-safe user functions

From

Andrew Dunstan

Date:

27 December 2022, 13:31:01

On 2022-12-26 Mo 14:12, Andrew Dunstan wrote:
> On 2022-12-26 Mo 12:47, Tom Lane wrote:
>> Here's a proposed patch for making tsvectorin and tsqueryin
>> report errors softly.  We have to take the changes down a
>> couple of levels of subroutines, but it's not hugely difficult.
>
> Great!
>
>
>> With the other patches I've posted recently, this covers all
>> of the core datatype input functions.  There are still half
>> a dozen to tackle in contrib.
>>
>>
>
> Yeah, I'm currently looking at those in ltree.
>
>

Here's a patch that covers the ltree and intarray contrib modules. I
think that would leave just hstore to be done.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachment

ltree-intarray-error-safe.patch

Re: Error-safe user functions

From

Andrew Dunstan

Date:

27 December 2022, 13:36:15

On 2022-12-26 Mo 18:00, Tom Lane wrote:
> I wrote:
>> (Perhaps we should go further than this, and convert all these
>> functions to just be DirectInputFunctionCallSafe wrappers
>> around the corresponding input functions?  That would save
>> some duplicative code, but I've not done it here.)
> I looked closer at that idea, and realized that it would do more than
> just save some code: it'd cause the to_regfoo functions to accept
> numeric OIDs, as they did not before (and are documented not to).
> It is unclear to me whether that inconsistency with the input
> functions is really desirable or not --- but I don't offhand see a
> good argument for it.  If we change this though, it should probably
> happen in a separate commit.  Accordingly, here's a delta patch
> doing that.
>
>


+1 for doing this. The code simplification is nice too.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Error-safe user functions

From

Tom Lane

Date:

27 December 2022, 17:47:18

Andrew Dunstan <andrew@dunslane.net> writes:
> Here's a patch that covers the ltree and intarray contrib modules.

I would probably have done this a little differently --- I think
the added "res" parameters aren't really necessary for most of
these.  But it's not worth arguing over.

> I think that would leave just hstore to be done.

Yeah, that matches my scoreboard.  Are you going to look at
hstore, or do you want me to?

            regards, tom lane

Re: Error-safe user functions

From

Andrew Dunstan

Date:

27 December 2022, 18:05:06


> On Dec 27, 2022, at 12:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Here's a patch that covers the ltree and intarray contrib modules.
>
> I would probably have done this a little differently --- I think
> the added "res" parameters aren't really necessary for most of
> these.  But it's not worth arguing over.

I’ll take another look


>
>> I think that would leave just hstore to be done.
>
> Yeah, that matches my scoreboard.  Are you going to look at
> hstore, or do you want me to?
>
>

Go for it.

Cheers

Andrew

Re: Error-safe user functions

From

Tom Lane

Date:

27 December 2022, 19:51:36

Andrew Dunstan <andrew@dunslane.net> writes:
> On Dec 27, 2022, at 12:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Andrew Dunstan <andrew@dunslane.net> writes:
>>> I think that would leave just hstore to be done.

>> Yeah, that matches my scoreboard.  Are you going to look at
>> hstore, or do you want me to?

> Go for it. 

Done.

            regards, tom lane

Re: Error-safe user functions

From

Amul Sul

Date:

28 December 2022, 06:00:34

On Tue, Dec 27, 2022 at 11:17 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Andrew Dunstan <andrew@dunslane.net> writes:
> > Here's a patch that covers the ltree and intarray contrib modules.
>
> I would probably have done this a little differently --- I think
> the added "res" parameters aren't really necessary for most of
> these.  But it's not worth arguing over.
>

Also, it would be good if we could pass "escontext" through the "state"
argument of makepol(), like commit 78212f210114 did for makepol() of
tsquery.c. The attached patch is the updated version that does the same.

Regards,
Amul

Attachment

v2-ltree-intarray-error-safe.patch

Re: Error-safe user functions

From

Andrew Dunstan

Date:

28 December 2022, 15:04:07

On 2022-12-28 We 01:00, Amul Sul wrote:
> On Tue, Dec 27, 2022 at 11:17 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Andrew Dunstan <andrew@dunslane.net> writes:
>>> Here's a patch that covers the ltree and intarray contrib modules.
>> I would probably have done this a little differently --- I think
>> the added "res" parameters aren't really necessary for most of
>> these.  But it's not worth arguing over.
>>
> Also, it would be good if we could pass "escontext" through the "state"
> argument of makepol(), like commit 78212f210114 did for makepol() of
> tsquery.c. The attached patch is the updated version that does the same.
>


Thanks, I have done both of these things. Looks like we're now done with
this task, thanks everybody.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Error-safe user functions

From

Robert Haas

Date:

03 January 2023, 18:16:40

On Sun, Dec 25, 2022 at 12:13 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Here's a proposed patch for converting regprocin and friends
> to soft error reporting.  I'll say at the outset that it's an
> engineering compromise, and it may be worth going further in
> future.  But I doubt it's worth doing more than this for v16,
> because the next steps would be pretty invasive.

I don't know that I feel particularly good about converting some
errors to be reported softly and others not, especially since the
dividing line around which things fall into which category is pretty
much "well, whatever seemed hard we didn't convert". We could consider
changing it to report everything as a hard error until we can convert
everything, but I'm not sure that's better.

On another note, I can't help noticing that all of these patches seem
to have been committed without any documentation changes. Maybe that's
because there's nothing user-visible that makes any use of these
features yet, but if that's true, then we probably ought to add
something so that the changes are testable. And having done that we
need to explain to users what the behavior actually is: that input
validation errors are trapped but other kinds of failures like out of
memory are not; that most core data types report all input validation
errors softly, and the exceptions; and that for non-core data types
the behavior depends on how the extension is coded. I think it's
really a mistake to suppose that users won't care about or don't need
to know these kinds of details. In my experience, that's just not
true.
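
As a rough illustration of the distinction drawn above (input validation errors are trapped, while failures such as out of memory are not), here is a self-contained toy model. It is a sketch under stated assumptions: SoftErrorContext, toy_hard_error(), toy_input_error() and toy_digits_in() are hypothetical names made up for this example, and exiting the process merely stands in for a thrown error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an error-save context; not the real struct. */
typedef struct SoftErrorContext
{
    bool        error_occurred;
    const char *message;
} SoftErrorContext;

/* Hard failure: in this sketch, print the message and exit. */
void
toy_hard_error(const char *msg)
{
    fprintf(stderr, "ERROR: %s\n", msg);
    exit(EXIT_FAILURE);
}

/*
 * Report invalid input.  With a context, record the error and return so
 * the caller can return a dummy value; without one, fail hard.
 */
void
toy_input_error(SoftErrorContext *esc, const char *msg)
{
    if (esc != NULL)
    {
        esc->error_occurred = true;
        esc->message = msg;
        return;
    }
    toy_hard_error(msg);
}

/*
 * Toy input function: accept a non-empty string of digits and return a
 * malloc'd copy.  Returns NULL after a soft error.
 */
char *
toy_digits_in(const char *str, SoftErrorContext *esc)
{
    char       *copy;

    assert(str != NULL);

    /* Input validation failure: trappable when esc is supplied. */
    if (str[0] == '\0' || strspn(str, "0123456789") != strlen(str))
    {
        toy_input_error(esc, "invalid input syntax");
        return NULL;
    }

    copy = malloc(strlen(str) + 1);
    if (copy == NULL)
    {
        /* Resource failure: always hard, even if esc was supplied. */
        toy_hard_error("out of memory");
        return NULL;            /* not reached */
    }
    strcpy(copy, str);
    return copy;
}
```

A caller that passes a context sees bad input as a NULL result plus a recorded message, whereas the allocation failure path never consults the context at all. Whether a given real datatype behaves this way depends on how its input function is coded, as the message above notes.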

-- 
Robert Haas
EDB: http://www.enterprisedb.com