Re: Potential security risk associated with function call - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: Potential security risk associated with function call
Date
Msg-id CAFj8pRB4zkAYnUNvkHPLXbwhRLbXN9hx8ZXRs1xLKCNpb8cMFQ@mail.gmail.com
In response to Re: Potential security risk associated with function call  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Hi

On Tue, Mar 10, 2026 at 20:56, Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2026-03-10 14:08:45 -0400, Tom Lane wrote:
> Matthias van de Meent <boekewurm+postgres@gmail.com> writes:
> > Tangent: I think it could be possible to make extensions (and PG
> > itself) generate more extensive pg_finfo records that contain
> > sufficient information to describe the functions' expected SQL calling
> > signature(s), which PG could then check and verify when the function
> > is catalogued (e.g. through lanvalidator).
>
> I think that'd be a lot of work with little result other than to
> change what sort of manual validation you have to do.  Today, you
> have to check "does the function's actual C code match the SQL
> definition?".  But with this, you'd have to check "does the function's
> actual C code match the pg_finfo record?".  I'm not seeing a huge win
> there.

If we were to do this, I'd assume it'd be something vaguely like

PG_DEFINE_C_FUNCTION(funcname, {argtype1, argtype2}, returntype)
{
    ...
}

Where PG_DEFINE_C_FUNCTION() would evaluate to an extended version of
PG_FUNCTION_INFO_V1() that also declared argument types and also emitted the
function definition.  So there hopefully would be less of a chance of a
mismatch...  Then the CREATE FUNCTION could verify that, if present, the
additional information present in the finfo matches the SQL signature.
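To make the idea concrete, here is a rough standalone sketch of how such a macro could look. Everything prefixed with demo_/Demo/DEMO_ is hypothetical, and Datum and Oid are simplified stand-ins rather than the real PostgreSQL definitions. The argument types are passed as trailing variadic macro arguments instead of a braced list, because the C preprocessor would split a braced list at its commas. The check function only illustrates the comparison CREATE FUNCTION could make; it is not existing PostgreSQL code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for PostgreSQL types (assumptions for this sketch). */
typedef uintptr_t Datum;
typedef unsigned int Oid;

#define BOOLOID 16              /* OIDs of bool, int8 and int4 */
#define INT8OID 20
#define INT4OID 23

/* Hypothetical extended info record: api version plus declared signature. */
typedef struct DemoFinfo
{
    int         api_version;
    int         nargs;
    const Oid  *argtypes;
    Oid         rettype;
} DemoFinfo;

/*
 * Hypothetical macro: emits the info record, an accessor for it, and the
 * function header, so the declared signature and the C code live together.
 */
#define DEMO_DEFINE_C_FUNCTION(funcname, rettype_, ...) \
    static const Oid funcname##_argtypes[] = { __VA_ARGS__ }; \
    static const DemoFinfo funcname##_finfo = { \
        1, \
        (int) (sizeof(funcname##_argtypes) / sizeof(Oid)), \
        funcname##_argtypes, \
        (rettype_) \
    }; \
    const DemoFinfo *demo_finfo_##funcname(void) { return &funcname##_finfo; } \
    Datum funcname(const Datum *args)

/* Example function: int4 < int4, returning bool. */
DEMO_DEFINE_C_FUNCTION(demo_int4lt, BOOLOID, INT4OID, INT4OID)
{
    return (Datum) ((int32_t) args[0] < (int32_t) args[1]);
}

/* The comparison CREATE FUNCTION could run against the SQL signature. */
static int
demo_signature_matches(const DemoFinfo *finfo, int nargs,
                       const Oid *argtypes, Oid rettype)
{
    if (finfo->nargs != nargs || finfo->rettype != rettype)
        return 0;
    for (int i = 0; i < nargs; i++)
    {
        if (finfo->argtypes[i] != argtypes[i])
            return 0;
    }
    return 1;
}
```

With this, a SQL declaration of demo_int4lt(int4, int8) would be rejected at creation time, because the second declared argument type differs from what the C side recorded.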


FWIW, I think we're going to eventually need a more optimized function call
protocol for the most common cases (small number of arguments, no SRF, perhaps
requiring them to be strict, ...). If you look at profiles of queries that do
stuff like aggregate transition invocations or WHERE clause evaluation as part
of a large seqscan, moving things into and out of FunctionCallInfo really adds
up. We spend way more on that than e.g. evaluating an int4lt or int8inc.

Maybe a vectorized executor and vector instructions could be a solution, as an alternative (not a replacement).

The fmgr overhead per row is high, but for a call that handles a batch of 1000 rows it can be minimal.
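A toy sketch of what I mean, in standalone C. DemoCallInfo is only a simplified stand-in for FunctionCallInfo, and all demo_/DEMO_ names are hypothetical; nothing here is real executor or fmgr code. In the first variant the call structure is filled once per row; in the second the call setup is paid once per batch and the inner loop is simple enough for a compiler to vectorize.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BATCH_SIZE 1000

/* Simplified stand-in for a generic per-call argument structure. */
typedef struct DemoCallInfo
{
    int         nargs;
    int64_t     args[2];
    int         isnull[2];
} DemoCallInfo;

/* Row-at-a-time comparison through the generic structure. */
static int64_t
demo_int8lt(const DemoCallInfo *fcinfo)
{
    return fcinfo->args[0] < fcinfo->args[1];
}

/* Per-row path: the call structure is set up again for every row. */
static size_t
demo_count_lt_per_row(const int64_t *col, size_t n, int64_t bound)
{
    size_t      matches = 0;

    for (size_t i = 0; i < n; i++)
    {
        DemoCallInfo fcinfo;

        fcinfo.nargs = 2;
        fcinfo.args[0] = col[i];
        fcinfo.args[1] = bound;
        fcinfo.isnull[0] = 0;
        fcinfo.isnull[1] = 0;
        matches += (size_t) demo_int8lt(&fcinfo);
    }
    return matches;
}

/* Batch comparison: one call handles up to DEMO_BATCH_SIZE rows. */
static void
demo_int8lt_batch(const int64_t *col, size_t n, int64_t bound, uint8_t *out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (uint8_t) (col[i] < bound);
}

/* Batched path: call overhead is paid once per batch, not once per row. */
static size_t
demo_count_lt_batched(const int64_t *col, size_t n, int64_t bound)
{
    uint8_t     out[DEMO_BATCH_SIZE];
    size_t      matches = 0;

    for (size_t start = 0; start < n; start += DEMO_BATCH_SIZE)
    {
        size_t      len = n - start;

        if (len > DEMO_BATCH_SIZE)
            len = DEMO_BATCH_SIZE;
        demo_int8lt_batch(col + start, len, bound, out);
        for (size_t j = 0; j < len; j++)
            matches += out[j];
    }
    return matches;
}
```

Both paths return the same count; the difference is only how often the call setup runs.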

Regards

Pavel
 

Greetings,

Andres Freund

