Re: Error-safe user functions - Mailing list pgsql-hackers

From Corey Huinker
Subject Re: Error-safe user functions
Date
Msg-id CADkLM=cvLMbPSOCsiSATuwbDx3mOn1tH3uifm+fZ=ro_xOA=Lw@mail.gmail.com
Whole thread Raw
In response to Re: Error-safe user functions  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers


On Fri, Dec 2, 2022 at 1:46 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Corey Huinker <corey.huinker@gmail.com> writes:
> I'm still working on organizing my patch, but it grew out of a desire to do
> this:
> CAST(value AS TypeName DEFAULT expr)
> This is a thing that exists in other forms in other databases and while it
> may look unrelated, it is essentially the SQL/JSON casts within a nested
> data structure issue, just a lot simpler.

Okay, maybe that's why I was thinking we had a requirement for
failure-free casts.  Sure, you can transform it to the other thing
by always implementing this as a cast-via-IO, but you could run into
semantics issues that way.  (If memory serves, we already have cases
where casting X to Y gives a different result from casting X to text
to Y.)

Yes, I was setting aside the issue of direct cast functions for my v0.1
 

> My original plan had been two new params to all _in() functions: a boolean
> error_mode and a default expression Datum.
> After consulting with Jeff Davis and Michael Paquier, the notion of
> modifying fcinfo itself two booleans:
>   allow_error (this call is allowed to return if there was an error with
> INPUT) and
>   has_error (this function has the concept of a purely-input-based error,
> and found one)

Hmm ... my main complaint about that is the lack of any way to report
the details of the error.  I realize that a plain boolean failure flag
might be enough for our immediate use-cases, but I don't want to expend
the amount of effort this is going to involve and then later find we
have a use-case where we need the error details.

I agree, but then we're past a boolean for allow_error, and we probably get into a list of modes like this:

CAST_ERROR_ERROR  /* default ereport(), what we do now */
CAST_ERROR_REPORT_FULL /* report that the cast failed, everything that you would have put in the ereport() instead put in a struct that gets returned to caller */
CAST_ERROR_REPORT_SILENT /* report that the cast failed, but nobody cares why so don't even form the ereport strings, good for bulk operations */
CAST_ERROR_WARNING /* report that the cast failed, but emit ereport() as a warning */
CAST_ERROR_[NOTICE,LOG,DEBUG1,..DEBUG5] /* same, but some other loglevel */
 

The sketch that's in my head at the moment is to make use of the existing
"context" field of FunctionCallInfo: if that contains a node of some
to-be-named type, then we are requesting that errors not be thrown
but instead be reported by passing back an ErrorData using a field of
that node.  The issue about not constructing an ErrorData if the outer
caller doesn't need it could perhaps be addressed by adding some boolean
flag fields in that node, but the details of that need not be known to
the functions reporting errors this way; it'd be a side channel from the
outer caller to elog.c.

That should be a good place for it, assuming it's not already used like fn_extra is. It would also squash those cases above into just three: ERROR, REPORT_FULL, and REPORT_SILENT, leaving it up to the node what type of erroring/logging is appropriate.
 

The main objection I can see to this approach is that we only support
one context value per call, so you could not easily combine this
functionality with existing use-cases for the context field.  A quick
census of InitFunctionCallInfoData calls finds aggregates, window
functions, triggers, and procedures, none of which seem like plausible
candidates for wanting no-error behavior, so I'm not too concerned
about that.  (Maybe the error-reporting node could be made a sub-node
of the context node in any future cases where we do need it?)

A subnode had occurred to me when fiddling about with fn_extra, so so that applies here, but if we're doing a sub-node, then maybe it's worth it's own parameter. I struggled with that because of how few of these functions will use it vs how often they're executed.
 

> The one gap I see so far in the patch presented is that it returns a null
> value on bad input, which might be ok if the node has the default, but that
> then presents the node with having to understand whether it was a null
> because of bad input vs a null that was expected.

Yeah.  That's something we could probably get away with for the case of
input functions only, but I think explicit out-of-band signaling that
there was an error is a more future-proof solution.

I think we'll run into it fairly soon, because if I recall correctly, SQL/JSON has a formatting spec that essentially means that we're not calling input functions, we're calling TO_CHAR() and TO_DATE(), but they're very similiar to input functions.

One positive side effect of all this is we can get a isa(value, typname) construct like this "for free", we just try the cast and return the value.

I'm still working on my patch even though it will probably be sidelined in the hopes that it informs us of any subsequent issues.

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Error-safe user functions
Next
From: Nathan Bossart
Date:
Subject: Re: O(n) tasks cause lengthy startups and checkpoints