Thread: Re: [PATCH] Add some documentation on how to call internal functions
Hi,The documentation on extending using C functions,leaves a blank on how to call other internal Postgres functions.This can leave first-timers hanging.I think it’d be helpful to spend some paragraphs on discussing the DirectFunctionCall API.Here’s an attempt (also a PR[0]).Here’s the relevant HTML snippet for convenience:To call another version-1 function, you can use
DirectFunctionCall
. This is particularly useful when you want to call functions defined in the standard internal library, by using an interface similar to their SQL signature.n
(func, arg1,...,argn)Different flavors of similar macros can be found in
fmgr.h
. The main point though is that they expect a C function name to call as their first argument (or itsOid
in some cases), and actual arguments should be supplied asDatum
s. They always returnDatum
.For example, to call the
starts_with(text, text)
from C, you can search through the catalog and find out that its C implementation is based on theDatum text_starts_with(PG_FUNCTION_ARGS)
function.In
fmgr.h
there are also available macros the facilitate conversions between C types andDatum
. For example to turntext*
intoDatum
, you can useDatumGetTextPP(X)
. If your extension defines additional types, it is usually convenient to define similar macros for these types too.I’ve also added the below example function:PG_FUNCTION_INFO_V1(t_starts_with); Datum t_starts_with(PG_FUNCTION_ARGS) { Datum t1 = PG_GETARG_DATUM(0); Datum t2 = PG_GETARG_DATUM(1); bool bool_res; Datum datum_res = DirectFunctionCall2(text_starts_with, t1, t2); bool_res = DatumGetBool(datum_res); PG_RETURN_BOOL(bool_res); }PS1: I was not sure if src/tutorial is still relevant with this part of the documentation.If so, it needs updating too.
Text *t1 = PG_GETARG_TEXT_PP(0); Text *t2 = PG_GETARG_TEXT_PP(1); bool result; result = DatumGetBool(
DirectFunctionCall2(text_starts_with,
PointerGetDatum(t1),
PointerGetDatum(t2)); PG_RETURN_BOOL(result);
or
Datum t1 = PG_GETARG_DATUM(0); Datum t2 = PG_GETARG_DATUM(1); Datum result; result = DirectFunctionCall2(text_starts_with, t1, t2); PG_RETURN_DATUM(result);
On 9 Oct 2024, at 2:57 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:HiI am checking this patchčt 12. 9. 2024 v 11:42 odesílatel Florents Tselai <florents.tselai@gmail.com> napsal:Hi,The documentation on extending using C functions,leaves a blank on how to call other internal Postgres functions.This can leave first-timers hanging.I think it’d be helpful to spend some paragraphs on discussing the DirectFunctionCall API.Here’s an attempt (also a PR[0]).Here’s the relevant HTML snippet for convenience:To call another version-1 function, you can use
DirectFunctionCall
. This is particularly useful when you want to call functions defined in the standard internal library, by using an interface similar to their SQL signature.n
(func, arg1,...,argn)Different flavors of similar macros can be found in
fmgr.h
. The main point though is that they expect a C function name to call as their first argument (or itsOid
in some cases), and actual arguments should be supplied asDatum
s. They always returnDatum
.For example, to call the
starts_with(text, text)
from C, you can search through the catalog and find out that its C implementation is based on theDatum text_starts_with(PG_FUNCTION_ARGS)
function.In
fmgr.h
there are also available macros the facilitate conversions between C types andDatum
. For example to turntext*
intoDatum
, you can useDatumGetTextPP(X)
. If your extension defines additional types, it is usually convenient to define similar macros for these types too.I’ve also added the below example function:PG_FUNCTION_INFO_V1(t_starts_with); Datum t_starts_with(PG_FUNCTION_ARGS) { Datum t1 = PG_GETARG_DATUM(0); Datum t2 = PG_GETARG_DATUM(1); bool bool_res; Datum datum_res = DirectFunctionCall2(text_starts_with, t1, t2); bool_res = DatumGetBool(datum_res); PG_RETURN_BOOL(bool_res); }PS1: I was not sure if src/tutorial is still relevant with this part of the documentation.If so, it needs updating too.I have few points1. I am missing warning so NULL is not supported by DirectFunctionCall - on input and on output. Any argument should not be null, and the result should not be null.
2. The example looks little bit not consistentI proposeText *t1 = PG_GETARG_TEXT_PP(0); Text *t2 = PG_GETARG_TEXT_PP(1); bool result; result = DatumGetBool(
DirectFunctionCall2(text_starts_with,
PointerGetDatum(t1),
PointerGetDatum(t2)); PG_RETURN_BOOL(result);orDatum t1 = PG_GETARG_DATUM(0); Datum t2 = PG_GETARG_DATUM(1); Datum result; result = DirectFunctionCall2(text_starts_with, t1, t2); PG_RETURN_DATUM(result);Both examples can show some interesting patterns (maybe can be used both)
On 9 Oct 2024, at 2:57 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:HiI am checking this patchčt 12. 9. 2024 v 11:42 odesílatel Florents Tselai <florents.tselai@gmail.com> napsal:Hi,The documentation on extending using C functions,leaves a blank on how to call other internal Postgres functions.This can leave first-timers hanging.I think it’d be helpful to spend some paragraphs on discussing the DirectFunctionCall API.Here’s an attempt (also a PR[0]).Here’s the relevant HTML snippet for convenience:To call another version-1 function, you can use
DirectFunctionCall
. This is particularly useful when you want to call functions defined in the standard internal library, by using an interface similar to their SQL signature.n
(func, arg1,...,argn)Different flavors of similar macros can be found in
fmgr.h
. The main point though is that they expect a C function name to call as their first argument (or itsOid
in some cases), and actual arguments should be supplied asDatum
s. They always returnDatum
.For example, to call the
starts_with(text, text)
from C, you can search through the catalog and find out that its C implementation is based on theDatum text_starts_with(PG_FUNCTION_ARGS)
function.In
fmgr.h
there are also available macros the facilitate conversions between C types andDatum
. For example to turntext*
intoDatum
, you can useDatumGetTextPP(X)
. If your extension defines additional types, it is usually convenient to define similar macros for these types too.I’ve also added the below example function:PG_FUNCTION_INFO_V1(t_starts_with); Datum t_starts_with(PG_FUNCTION_ARGS) { Datum t1 = PG_GETARG_DATUM(0); Datum t2 = PG_GETARG_DATUM(1); bool bool_res; Datum datum_res = DirectFunctionCall2(text_starts_with, t1, t2); bool_res = DatumGetBool(datum_res); PG_RETURN_BOOL(bool_res); }PS1: I was not sure if src/tutorial is still relevant with this part of the documentation.If so, it needs updating too.I have few points1. I am missing warning so NULL is not supported by DirectFunctionCall - on input and on output. Any argument should not be null, and the result should not be null.You’re right.2. The example looks little bit not consistentI proposeText *t1 = PG_GETARG_TEXT_PP(0); Text *t2 = PG_GETARG_TEXT_PP(1); bool result; result = DatumGetBool(
DirectFunctionCall2(text_starts_with,
PointerGetDatum(t1),
PointerGetDatum(t2)); PG_RETURN_BOOL(result);orDatum t1 = PG_GETARG_DATUM(0); Datum t2 = PG_GETARG_DATUM(1); Datum result; result = DirectFunctionCall2(text_starts_with, t1, t2); PG_RETURN_DATUM(result);Both examples can show some interesting patterns (maybe can be used both)The example can be extended a bit more.I like your first proposal.My primary purpose with this patch is to show explicitly howone goes from SQL types, to Datum and/or char*/text if necessary,use some internal C function (even if it doesn’t have an SQL equivalent).and then return back an SQL type.So, simply GET_ARG_DATUM() and then PG_RETURN_DATUM() defeats the purpose.
Here's a v2- note on NULL args/returns- ref PointerGetDatum- used your example. Started adding some comments but don't think they're really necessary. The reader gets the point as-is I think.
Pavel Stehule <pavel.stehule@gmail.com> writes: > I'll mark this patch as ready for committer I spent a little time looking at this. I agree that it's better to show conversion of the example function's arguments to and from text* rather than leaving them as Datum. I also do think that we need to incorporate the example into src/tutorial, if only because that provides a chance to test it. And indeed, testing exposed that the example doesn't work: $ cd src/tutorial $ make $ psql postgres psql (18devel) Type "help" for help. postgres=# \i funcs.sql psql:funcs.sql:152: ERROR: could not determine which collation to use for string comparison HINT: Use the COLLATE clause to set the collation explicitly. The problem here is that we failed to pass through the result of PG_GET_COLLATION() to text_starts_with. We could do that, certainly, for a couple more lines of code. But it feels like this is getting into details that obscure the main point. I wonder if it'd be better to choose a different example that calls a non-collation-dependent target function. Anyway, proposed v3 attached. I folded the two patches into one, and editorialized on the text a little. regards, tom lane diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml index af7864a1b5..03fa7633ef 100644 --- a/doc/src/sgml/xfunc.sgml +++ b/doc/src/sgml/xfunc.sgml @@ -2384,6 +2384,48 @@ PG_FUNCTION_INFO_V1(funcname); takes as its argument the actual value to return. </para> + <para> + To call another version-1 function, you can use + <function>DirectFunctionCall<replaceable>n</replaceable>(func, + arg1, ..., argn)</function>. This is particularly useful when you want + to call functions defined in the standard internal library, by using an + interface similar to their SQL signature. + </para> + + <para> + These convenience functions and similar ones can be found + in <filename>fmgr.h</filename>. + The <function>DirectFunctionCall<replaceable>n</replaceable></function> + family expect a C function name as their first argument. There are also + <function>OidFunctionCall<replaceable>n</replaceable></function> which + take the OID of the target function, and some other variants. All of + these expect the function's arguments to be supplied + as <type>Datum</type>s, and likewise they return <type>Datum</type>. + Note that neither arguments nor result are allowed to be NULL when + using these convenience functions. + </para> + + <para> + For example, to call the <function>starts_with(text, text)</function> + function from C, you can search through the catalog and find out that + its C implementation is the + <function>Datum text_starts_with(PG_FUNCTION_ARGS)</function> function. + So you would use <literal>DirectFunctionCall2(text_starts_with, + ...)</literal> to call it. + </para> + + <para> + <filename>fmgr.h</filename> also supplies macros that facilitate + conversions between C types and <type>Datum</type>. For example to + turn <type>Datum</type> into <type>text*</type>, you can + use <function>DatumGetTextPP(X)</function>. While some types have macros + named like <function>TypeGetDatum(X)</function> for the reverse + conversion, <type>text*</type> does not; it's sufficient to use the + generic macro <function>PointerGetDatum(X)</function> for that. + If your extension defines additional types, it is usually convenient to + define similar macros for your types too. + </para> + <para> Here are some examples using the version-1 calling convention: </para> @@ -2482,6 +2524,23 @@ concat_text(PG_FUNCTION_ARGS) memcpy(VARDATA(new_text) + arg1_size, VARDATA_ANY(arg2), arg2_size); PG_RETURN_TEXT_P(new_text); } + +/* A wrapper around starts_with(text, text) */ + +PG_FUNCTION_INFO_V1(t_starts_with); + +Datum +t_starts_with(PG_FUNCTION_ARGS) +{ + text *t1 = PG_GETARG_TEXT_PP(0); + text *t2 = PG_GETARG_TEXT_PP(1); + bool result; + + result = DatumGetBool(DirectFunctionCall2(text_starts_with, + PointerGetDatum(t1), + PointerGetDatum(t2))); + PG_RETURN_BOOL(result); +} ]]> </programlisting> @@ -2513,6 +2572,10 @@ CREATE FUNCTION copytext(text) RETURNS text CREATE FUNCTION concat_text(text, text) RETURNS text AS '<replaceable>DIRECTORY</replaceable>/funcs', 'concat_text' LANGUAGE C STRICT; + +CREATE FUNCTION t_starts_with(text, text) RETURNS boolean + AS '<replaceable>DIRECTORY</replaceable>/funcs', 't_starts_with' + LANGUAGE C STRICT; </programlisting> <para> diff --git a/src/tutorial/funcs.c b/src/tutorial/funcs.c index f597777a1f..73ecc023b1 100644 --- a/src/tutorial/funcs.c +++ b/src/tutorial/funcs.c @@ -11,6 +11,7 @@ #include "postgres.h" /* general Postgres declarations */ #include "executor/executor.h" /* for GetAttributeByName() */ +#include "utils/fmgrprotos.h" /* for text_starts_with() */ #include "utils/geo_decls.h" /* for point type */ PG_MODULE_MAGIC; @@ -102,6 +103,23 @@ concat_text(PG_FUNCTION_ARGS) PG_RETURN_TEXT_P(new_text); } +/* A wrapper around starts_with(text, text) */ + +PG_FUNCTION_INFO_V1(t_starts_with); + +Datum +t_starts_with(PG_FUNCTION_ARGS) +{ + text *t1 = PG_GETARG_TEXT_PP(0); + text *t2 = PG_GETARG_TEXT_PP(1); + bool result; + + result = DatumGetBool(DirectFunctionCall2(text_starts_with, + PointerGetDatum(t1), + PointerGetDatum(t2))); + PG_RETURN_BOOL(result); +} + /* Composite types */ PG_FUNCTION_INFO_V1(c_overpaid); diff --git a/src/tutorial/funcs.source b/src/tutorial/funcs.source index 542b5c81ec..eb56153754 100644 --- a/src/tutorial/funcs.source +++ b/src/tutorial/funcs.source @@ -123,16 +123,25 @@ SELECT * FROM EMP; ----------------------------- CREATE FUNCTION add_one(integer) RETURNS integer - AS '_OBJWD_/funcs' LANGUAGE C; + AS '_OBJWD_/funcs' LANGUAGE C STRICT; + +CREATE FUNCTION add_one(double precision) RETURNS double precision + AS '_OBJWD_/funcs' LANGUAGE C STRICT; CREATE FUNCTION makepoint(point, point) RETURNS point - AS '_OBJWD_/funcs' LANGUAGE C; + AS '_OBJWD_/funcs' LANGUAGE C STRICT; CREATE FUNCTION copytext(text) RETURNS text - AS '_OBJWD_/funcs' LANGUAGE C; + AS '_OBJWD_/funcs' LANGUAGE C STRICT; + +CREATE FUNCTION concat_text(text, text) RETURNS text + AS '_OBJWD_/funcs' LANGUAGE C STRICT; + +CREATE FUNCTION t_starts_with(text, text) RETURNS boolean + AS '_OBJWD_/funcs' LANGUAGE C STRICT; CREATE FUNCTION c_overpaid(EMP, integer) RETURNS boolean - AS '_OBJWD_/funcs' LANGUAGE C; + AS '_OBJWD_/funcs' LANGUAGE C STRICT; SELECT add_one(3) AS four; @@ -140,6 +149,8 @@ SELECT makepoint('(1,2)'::point, '(3,4)'::point ) AS newpoint; SELECT copytext('hello world!'); +SELECT t_starts_with('foobar', 'foo'); + SELECT name, c_overpaid(EMP, 1500) AS overpaid FROM EMP WHERE name = 'Bill' or name = 'Sam'; @@ -147,10 +158,13 @@ WHERE name = 'Bill' or name = 'Sam'; -- remove functions that were created in this file DROP FUNCTION c_overpaid(EMP, integer); +DROP FUNCTION t_starts_with(text, text); +DROP FUNCTION concat_text(text, text); DROP FUNCTION copytext(text); DROP FUNCTION makepoint(point, point); +DROP FUNCTION add_one(double precision); DROP FUNCTION add_one(integer); ---DROP FUNCTION clean_EMP(); +DROP FUNCTION clean_EMP(); DROP FUNCTION high_pay(); DROP FUNCTION new_emp(); DROP FUNCTION add_em(integer, integer);
Pavel Stehule <pavel.stehule@gmail.com> writes:
> I'll mark this patch as ready for committer
I spent a little time looking at this. I agree that it's better to
show conversion of the example function's arguments to and from text*
rather than leaving them as Datum. I also do think that we need to
incorporate the example into src/tutorial, if only because that
provides a chance to test it. And indeed, testing exposed that the
example doesn't work:
$ cd src/tutorial
$ make
$ psql postgres
psql (18devel)
Type "help" for help.
postgres=# \i funcs.sql
psql:funcs.sql:152: ERROR: could not determine which collation to use for string comparison
HINT: Use the COLLATE clause to set the collation explicitly.
The problem here is that we failed to pass through the result of
PG_GET_COLLATION() to text_starts_with. We could do that, certainly,
for a couple more lines of code. But it feels like this is getting
into details that obscure the main point. I wonder if it'd be better
to choose a different example that calls a non-collation-dependent
target function.
<--><--><--><--><--><--><--><--><--><--><--><--> DEFAULT_COLLATION_OID,
<--><--><--><--><--><--><--><--><--><--><--><--> PointerGetDatum(t1),
<--><--><--><--><--><--><--><--><--><--><--><--> PointerGetDatum(t2)));
bytea_left(PG_FUNCTION_ARGS)
{
<-->bytea<-> *t = PG_GETARG_BYTEA_PP(0);
<-->int32<-><-->l = PG_GETARG_INT32(1);
<-->bytea<-> *result;
<-->result = DatumGetByteaPP(DirectFunctionCall3(bytea_substr,
<--><--><--><--><--><--><--><--><--><--><--><--> PointerGetDatum(t),
<--><--><--><--><--><--><--><--><--><--><--><--> Int32GetDatum(1),
<--><--><--><--><--><--><--><--><--><--><--><--> Int32GetDatum(l)));
<-->PG_RETURN_BYTEA_P(result);
}
Anyway, proposed v3 attached. I folded the two patches into one,
and editorialized on the text a little.
regards, tom lane