Re: Error-safe user functions - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Error-safe user functions
Date
Msg-id 20221206005607.lxcrazckoalptwds@alap3.anarazel.de
Whole thread Raw
In response to Re: Error-safe user functions  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Error-safe user functions  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

On 2022-12-05 19:18:11 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > Why is context a void *?
>
> elog.h can't depend on nodes.h, at least not without some rather
> fundamental rethinking of our #include relationships.  We could
> possibly use the same kind of hack that fmgr.h does:
>
> typedef struct Node *fmNodePtr;
>
> but I'm not sure that's much of an improvement.  Note that it'd
> *not* be correct to declare it as anything more specific than Node*,
> since the fmgr context pointer is Node* and we're not expecting
> callers to do their own IsA checks to see what they were passed.

Ah - I hadn't actually grokked that that's the reason for the
void*. Unless I missed a comment to that regard, entirely possible, it
seems worth explaining that above errsave_start().


> > This seems like a fair bit of duplicated code.
>
> I don't think refactoring to remove the duplication would improve it.

Why? I think a populate_edata() or such seems to make sense. And the
required argument to skip ->backtrace and error_context_stack processing
seem like things that'd be good to document anyway.


> > I wonder if we shouldn't instead make InputFunctionCallSafe() return a
> > boolean and return the Datum via a pointer. As callers are otherwise
> > going to need to do SAFE_ERROR_OCCURRED(escontext) themselves, I think
> > it should also lead to more concise (and slightly more efficient) code.
>
> Hmm, maybe.  It would be a bigger change from existing code, but
> I don't think very many call sites would be impacted.  (But by
> the same token, we'd not save much code this way.)  Personally
> I put more value on keeping similar APIs between InputFunctionCall
> and InputFunctionCallSafe, but I won't argue hard if you're insistent.

I think it's good to diverge from the existing code, because imo the
behaviour is quite different and omitting the SAFE_ERROR_OCCURRED()
check will lead to brokenness.


> > Is there a reason not to provide this infrastructure for
> > ReceiveFunctionCall() as well?
>
> There's a comment in 0003 about that: I doubt that it makes sense
> to have no-error semantics for binary input.  That would require
> far more trust in the receive functions' ability to detect garbage
> input than I think they have in reality.  Perhaps more to the
> point, even if we ultimately do that I don't want to do it now.
> Including the receive functions in the first-pass conversion would
> roughly double the amount of work needed per datatype, and we are
> already going to be hard put to it to finish what needs to be done
> for v16.

Fair enough.


> > Wonder if it's perhaps worth to add VALGRIND_MAKE_MEM_UNDEFINED() calls
> > to InputFunctionCallSafe() to more easily detect cases where a caller
> > ignores that an error occured.
>
> I do not think there are going to be enough callers of
> InputFunctionCallSafe that we need such tactics to validate them.

I predict that we'll have quite a few bugs due to converting some parts
of the system, but not other parts. But we can add them later, so I'll
not insist on it.


> > I'd put that into the if (es_context->error_occurred) path. Likely the
> > window for store-forwarding issues is smaller than
> > InputFunctionCallSafe(), but it's trivial to write it differently...
>
> Does not seem better to me, and your argument for it seems like the
> worst sort of premature micro-optimization.

Shrug. The copy code is quite slow today, but not by a single source,
but by death by a thousand cuts.


> > Think it'd be good to have a test for a composite type where one of the
> > columns safely errors out and the other doesn't.
>
> I wasn't trying all that hard on the error tests, because I think
> 0003 is just throwaway code at this point.

I am mainly interested in having *something* test erroring out hard when
using the "Safe" mechanism, which afaict we don't have with the patches
as they stand.  You're right that it'd be better to do that without COPY
in the way, but it doesn't seem all that crucial.


> If we want to seriously check the input functions' behavior then we
> need to factorize the tests so it can be done per-datatype, not in one
> central place in the COPY tests.  For the core types it could make
> sense to provide some function in pg_regress.c that allows access to
> the non-exception code path independently of COPY; but I'm not sure
> how contrib datatypes could use that.

It might be worth adding a function for testing safe input functions
into core PG - it's not like we don't have other such functions.

But perhaps it's even worth having such a function properly exposed:
It's not at all rare to parse text data during ETL and quite often
erroring out fatally is undesirable. As savepoints are undesirable
overhead-wise, there's a lot of SQL out there that tries to do a
pre-check about whether some text could be cast to some other data
type. A function that'd try to cast input to a certain type without
erroring out hard would be quite useful for that.


Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Ian Lawrence Barwick
Date:
Subject: Re: doc: add missing "id" attributes to extension packaging page
Next
From: Ian Lawrence Barwick
Date:
Subject: Re: docs: add missing id elements for developer GUCs