Thread: Re: [PATCH] Add some documentation on how to call internal functions

Re: [PATCH] Add some documentation on how to call internal functions

From
Pavel Stehule
Date:
Hi

I am checking this patch

čt 12. 9. 2024 v 11:42 odesílatel Florents Tselai <florents.tselai@gmail.com> napsal:
Hi, 

The documentation on extending using C functions, 
leaves a blank on how to call other internal Postgres functions.
This can leave first-timers hanging.

I think it’d be helpful to spend some paragraphs on discussing the DirectFunctionCall API.

Here’s an attempt (also a PR[0]).

Here’s the relevant HTML snippet for convenience:

To call another version-1 function, you can use DirectFunctionCalln(func, arg1,...,argn). This is particularly useful when you want to call functions defined in the standard internal library, by using an interface similar to their SQL signature.

Different flavors of similar macros can be found in fmgr.h. The main point though is that they expect a C function name to call as their first argument (or its Oid in some cases), and actual arguments should be supplied as Datums. They always return Datum.

For example, to call the starts_with(text, text) from C, you can search through the catalog and find out that its C implementation is based on the Datum text_starts_with(PG_FUNCTION_ARGS) function.

In fmgr.h there are also available macros the facilitate conversions between C types and Datum. For example to turn text* into Datum, you can use DatumGetTextPP(X). If your extension defines additional types, it is usually convenient to define similar macros for these types too.

I’ve also added the below example function:

PG_FUNCTION_INFO_V1(t_starts_with);

Datum
t_starts_with(PG_FUNCTION_ARGS)
{    Datum t1 = PG_GETARG_DATUM(0);    Datum t2 = PG_GETARG_DATUM(1);    bool  bool_res;
    Datum datum_res = DirectFunctionCall2(text_starts_with, t1, t2);    bool_res = DatumGetBool(datum_res);
    PG_RETURN_BOOL(bool_res);
}
PS1: I was not sure if src/tutorial is still relevant with this part of the documentation.
If so, it needs updating too.


I have few points

1. I am missing warning so NULL is not supported by DirectFunctionCall - on input and on output. Any argument should not be null, and the result should not be null.

2. The example looks little bit not consistent

I propose

    Text *t1 = PG_GETARG_TEXT_PP(0);    Text *t2 = PG_GETARG_TEXT_PP(1);    bool  result;
    result = DatumGetBool(
DirectFunctionCall2(text_starts_with,
PointerGetDatum(t1),
PointerGetDatum(t2)); PG_RETURN_BOOL(result);

or

    Datum t1 = PG_GETARG_DATUM(0);    Datum t2 = PG_GETARG_DATUM(1);    Datum result;
    result = DirectFunctionCall2(text_starts_with, t1, t2);
    PG_RETURN_DATUM(result);

Both examples can show some interesting patterns (maybe can be used both)

Regards

Pavel
 

Re: [PATCH] Add some documentation on how to call internal functions

From
Florents Tselai
Date:


On 9 Oct 2024, at 2:57 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Hi

I am checking this patch

čt 12. 9. 2024 v 11:42 odesílatel Florents Tselai <florents.tselai@gmail.com> napsal:
Hi, 

The documentation on extending using C functions, 
leaves a blank on how to call other internal Postgres functions.
This can leave first-timers hanging.

I think it’d be helpful to spend some paragraphs on discussing the DirectFunctionCall API.

Here’s an attempt (also a PR[0]).

Here’s the relevant HTML snippet for convenience:

To call another version-1 function, you can use DirectFunctionCalln(func, arg1,...,argn). This is particularly useful when you want to call functions defined in the standard internal library, by using an interface similar to their SQL signature.

Different flavors of similar macros can be found in fmgr.h. The main point though is that they expect a C function name to call as their first argument (or its Oid in some cases), and actual arguments should be supplied as Datums. They always return Datum.

For example, to call the starts_with(text, text) from C, you can search through the catalog and find out that its C implementation is based on the Datum text_starts_with(PG_FUNCTION_ARGS) function.

In fmgr.h there are also available macros the facilitate conversions between C types and Datum. For example to turn text* into Datum, you can use DatumGetTextPP(X). If your extension defines additional types, it is usually convenient to define similar macros for these types too.

I’ve also added the below example function:

PG_FUNCTION_INFO_V1(t_starts_with);

Datum
t_starts_with(PG_FUNCTION_ARGS)
{    Datum t1 = PG_GETARG_DATUM(0);    Datum t2 = PG_GETARG_DATUM(1);    bool  bool_res;
    Datum datum_res = DirectFunctionCall2(text_starts_with, t1, t2);    bool_res = DatumGetBool(datum_res);
    PG_RETURN_BOOL(bool_res);
}
PS1: I was not sure if src/tutorial is still relevant with this part of the documentation.
If so, it needs updating too.


I have few points

1. I am missing warning so NULL is not supported by DirectFunctionCall - on input and on output. Any argument should not be null, and the result should not be null.

You’re right.


2. The example looks little bit not consistent

I propose

    Text *t1 = PG_GETARG_TEXT_PP(0);    Text *t2 = PG_GETARG_TEXT_PP(1);    bool  result;
    result = DatumGetBool(
DirectFunctionCall2(text_starts_with,
PointerGetDatum(t1),
PointerGetDatum(t2)); PG_RETURN_BOOL(result);

or

    Datum t1 = PG_GETARG_DATUM(0);    Datum t2 = PG_GETARG_DATUM(1);    Datum result;
    result = DirectFunctionCall2(text_starts_with, t1, t2);
    PG_RETURN_DATUM(result);

Both examples can show some interesting patterns (maybe can be used both)

The example can be extended a bit more.
I like your first proposal.

My primary purpose with this patch is to show explicitly how 
one goes from SQL types, to Datum and/or char*/text if necessary,
use some internal C function (even if it doesn’t have an SQL equivalent).
and then return back an SQL type.

So, simply GET_ARG_DATUM() and then PG_RETURN_DATUM() defeats the purpose.



Re: [PATCH] Add some documentation on how to call internal functions

From
Pavel Stehule
Date:


st 9. 10. 2024 v 14:39 odesílatel Florents Tselai <florents.tselai@gmail.com> napsal:


On 9 Oct 2024, at 2:57 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Hi

I am checking this patch

čt 12. 9. 2024 v 11:42 odesílatel Florents Tselai <florents.tselai@gmail.com> napsal:
Hi, 

The documentation on extending using C functions, 
leaves a blank on how to call other internal Postgres functions.
This can leave first-timers hanging.

I think it’d be helpful to spend some paragraphs on discussing the DirectFunctionCall API.

Here’s an attempt (also a PR[0]).

Here’s the relevant HTML snippet for convenience:

To call another version-1 function, you can use DirectFunctionCalln(func, arg1,...,argn). This is particularly useful when you want to call functions defined in the standard internal library, by using an interface similar to their SQL signature.

Different flavors of similar macros can be found in fmgr.h. The main point though is that they expect a C function name to call as their first argument (or its Oid in some cases), and actual arguments should be supplied as Datums. They always return Datum.

For example, to call the starts_with(text, text) from C, you can search through the catalog and find out that its C implementation is based on the Datum text_starts_with(PG_FUNCTION_ARGS) function.

In fmgr.h there are also available macros the facilitate conversions between C types and Datum. For example to turn text* into Datum, you can use DatumGetTextPP(X). If your extension defines additional types, it is usually convenient to define similar macros for these types too.

I’ve also added the below example function:

PG_FUNCTION_INFO_V1(t_starts_with);

Datum
t_starts_with(PG_FUNCTION_ARGS)
{    Datum t1 = PG_GETARG_DATUM(0);    Datum t2 = PG_GETARG_DATUM(1);    bool  bool_res;
    Datum datum_res = DirectFunctionCall2(text_starts_with, t1, t2);    bool_res = DatumGetBool(datum_res);
    PG_RETURN_BOOL(bool_res);
}
PS1: I was not sure if src/tutorial is still relevant with this part of the documentation.
If so, it needs updating too.


I have few points

1. I am missing warning so NULL is not supported by DirectFunctionCall - on input and on output. Any argument should not be null, and the result should not be null.

You’re right.


2. The example looks little bit not consistent

I propose

    Text *t1 = PG_GETARG_TEXT_PP(0);    Text *t2 = PG_GETARG_TEXT_PP(1);    bool  result;
    result = DatumGetBool(
DirectFunctionCall2(text_starts_with,
PointerGetDatum(t1),
PointerGetDatum(t2)); PG_RETURN_BOOL(result);

or

    Datum t1 = PG_GETARG_DATUM(0);    Datum t2 = PG_GETARG_DATUM(1);    Datum result;
    result = DirectFunctionCall2(text_starts_with, t1, t2);
    PG_RETURN_DATUM(result);

Both examples can show some interesting patterns (maybe can be used both)

The example can be extended a bit more.
I like your first proposal.

My primary purpose with this patch is to show explicitly how 
one goes from SQL types, to Datum and/or char*/text if necessary,
use some internal C function (even if it doesn’t have an SQL equivalent).
and then return back an SQL type.

So, simply GET_ARG_DATUM() and then PG_RETURN_DATUM() defeats the purpose.

so please, send updated patch + sentence missing null support, and I think so this patch can be ok

Regards

Pavel

Re: [PATCH] Add some documentation on how to call internal functions

From
Pavel Stehule
Date:
Hi

st 9. 10. 2024 v 20:54 odesílatel Florents Tselai <florents.tselai@gmail.com> napsal:
Here's a v2
- note on NULL args/returns
- ref PointerGetDatum
- used your example. Started adding some comments but don't think they're really necessary. The reader gets the point as-is I think.

Now, I don't see any issue

I tested this patch and there is not any problem with patching and compiling

I'll mark this patch as ready for committer

Regards

Pavel




 
Pavel Stehule <pavel.stehule@gmail.com> writes:
> I'll mark this patch as ready for committer

I spent a little time looking at this.  I agree that it's better to
show conversion of the example function's arguments to and from text*
rather than leaving them as Datum.  I also do think that we need to
incorporate the example into src/tutorial, if only because that
provides a chance to test it.  And indeed, testing exposed that the
example doesn't work:

$ cd src/tutorial
$ make
$ psql postgres
psql (18devel)
Type "help" for help.

postgres=# \i funcs.sql

psql:funcs.sql:152: ERROR:  could not determine which collation to use for string comparison
HINT:  Use the COLLATE clause to set the collation explicitly.

The problem here is that we failed to pass through the result of
PG_GET_COLLATION() to text_starts_with.  We could do that, certainly,
for a couple more lines of code.  But it feels like this is getting
into details that obscure the main point.  I wonder if it'd be better
to choose a different example that calls a non-collation-dependent
target function.

Anyway, proposed v3 attached.  I folded the two patches into one,
and editorialized on the text a little.

            regards, tom lane

diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index af7864a1b5..03fa7633ef 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -2384,6 +2384,48 @@ PG_FUNCTION_INFO_V1(funcname);
      takes as its argument the actual value to return.
     </para>

+    <para>
+     To call another version-1 function, you can use
+     <function>DirectFunctionCall<replaceable>n</replaceable>(func,
+     arg1, ..., argn)</function>.  This is particularly useful when you want
+     to call functions defined in the standard internal library, by using an
+     interface similar to their SQL signature.
+    </para>
+
+    <para>
+     These convenience functions and similar ones can be found
+     in <filename>fmgr.h</filename>.
+     The <function>DirectFunctionCall<replaceable>n</replaceable></function>
+     family expect a C function name as their first argument.  There are also
+     <function>OidFunctionCall<replaceable>n</replaceable></function> which
+     take the OID of the target function, and some other variants.  All of
+     these expect the function's arguments to be supplied
+     as <type>Datum</type>s, and likewise they return <type>Datum</type>.
+     Note that neither arguments nor result are allowed to be NULL when
+     using these convenience functions.
+    </para>
+
+    <para>
+     For example, to call the <function>starts_with(text, text)</function>
+     function from C, you can search through the catalog and find out that
+     its C implementation is the
+     <function>Datum text_starts_with(PG_FUNCTION_ARGS)</function> function.
+     So you would use <literal>DirectFunctionCall2(text_starts_with,
+     ...)</literal> to call it.
+    </para>
+
+    <para>
+     <filename>fmgr.h</filename> also supplies macros that facilitate
+     conversions between C types and <type>Datum</type>.  For example to
+     turn <type>Datum</type> into <type>text*</type>, you can
+     use <function>DatumGetTextPP(X)</function>.  While some types have macros
+     named like <function>TypeGetDatum(X)</function> for the reverse
+     conversion, <type>text*</type> does not; it's sufficient to use the
+     generic macro <function>PointerGetDatum(X)</function> for that.
+     If your extension defines additional types, it is usually convenient to
+     define similar macros for your types too.
+    </para>
+
     <para>
      Here are some examples using the version-1 calling convention:
     </para>
@@ -2482,6 +2524,23 @@ concat_text(PG_FUNCTION_ARGS)
     memcpy(VARDATA(new_text) + arg1_size, VARDATA_ANY(arg2), arg2_size);
     PG_RETURN_TEXT_P(new_text);
 }
+
+/* A wrapper around starts_with(text, text) */
+
+PG_FUNCTION_INFO_V1(t_starts_with);
+
+Datum
+t_starts_with(PG_FUNCTION_ARGS)
+{
+    text       *t1 = PG_GETARG_TEXT_PP(0);
+    text       *t2 = PG_GETARG_TEXT_PP(1);
+    bool        result;
+
+    result = DatumGetBool(DirectFunctionCall2(text_starts_with,
+                                              PointerGetDatum(t1),
+                                              PointerGetDatum(t2)));
+    PG_RETURN_BOOL(result);
+}
 ]]>
 </programlisting>

@@ -2513,6 +2572,10 @@ CREATE FUNCTION copytext(text) RETURNS text
 CREATE FUNCTION concat_text(text, text) RETURNS text
      AS '<replaceable>DIRECTORY</replaceable>/funcs', 'concat_text'
      LANGUAGE C STRICT;
+
+CREATE FUNCTION t_starts_with(text, text) RETURNS boolean
+     AS '<replaceable>DIRECTORY</replaceable>/funcs', 't_starts_with'
+     LANGUAGE C STRICT;
 </programlisting>

     <para>
diff --git a/src/tutorial/funcs.c b/src/tutorial/funcs.c
index f597777a1f..73ecc023b1 100644
--- a/src/tutorial/funcs.c
+++ b/src/tutorial/funcs.c
@@ -11,6 +11,7 @@
 #include "postgres.h"            /* general Postgres declarations */

 #include "executor/executor.h"    /* for GetAttributeByName() */
+#include "utils/fmgrprotos.h"    /* for text_starts_with() */
 #include "utils/geo_decls.h"    /* for point type */

 PG_MODULE_MAGIC;
@@ -102,6 +103,23 @@ concat_text(PG_FUNCTION_ARGS)
     PG_RETURN_TEXT_P(new_text);
 }

+/* A wrapper around starts_with(text, text) */
+
+PG_FUNCTION_INFO_V1(t_starts_with);
+
+Datum
+t_starts_with(PG_FUNCTION_ARGS)
+{
+    text       *t1 = PG_GETARG_TEXT_PP(0);
+    text       *t2 = PG_GETARG_TEXT_PP(1);
+    bool        result;
+
+    result = DatumGetBool(DirectFunctionCall2(text_starts_with,
+                                              PointerGetDatum(t1),
+                                              PointerGetDatum(t2)));
+    PG_RETURN_BOOL(result);
+}
+
 /* Composite types */

 PG_FUNCTION_INFO_V1(c_overpaid);
diff --git a/src/tutorial/funcs.source b/src/tutorial/funcs.source
index 542b5c81ec..eb56153754 100644
--- a/src/tutorial/funcs.source
+++ b/src/tutorial/funcs.source
@@ -123,16 +123,25 @@ SELECT * FROM EMP;
 -----------------------------

 CREATE FUNCTION add_one(integer) RETURNS integer
-   AS '_OBJWD_/funcs' LANGUAGE C;
+   AS '_OBJWD_/funcs' LANGUAGE C STRICT;
+
+CREATE FUNCTION add_one(double precision) RETURNS double precision
+   AS '_OBJWD_/funcs' LANGUAGE C STRICT;

 CREATE FUNCTION makepoint(point, point) RETURNS point
-   AS '_OBJWD_/funcs' LANGUAGE C;
+   AS '_OBJWD_/funcs' LANGUAGE C STRICT;

 CREATE FUNCTION copytext(text) RETURNS text
-   AS '_OBJWD_/funcs' LANGUAGE C;
+   AS '_OBJWD_/funcs' LANGUAGE C STRICT;
+
+CREATE FUNCTION concat_text(text, text) RETURNS text
+   AS '_OBJWD_/funcs' LANGUAGE C STRICT;
+
+CREATE FUNCTION t_starts_with(text, text) RETURNS boolean
+   AS '_OBJWD_/funcs' LANGUAGE C STRICT;

 CREATE FUNCTION c_overpaid(EMP, integer) RETURNS boolean
-   AS '_OBJWD_/funcs' LANGUAGE C;
+   AS '_OBJWD_/funcs' LANGUAGE C STRICT;

 SELECT add_one(3) AS four;

@@ -140,6 +149,8 @@ SELECT makepoint('(1,2)'::point, '(3,4)'::point ) AS newpoint;

 SELECT copytext('hello world!');

+SELECT t_starts_with('foobar', 'foo');
+
 SELECT name, c_overpaid(EMP, 1500) AS overpaid
 FROM EMP
 WHERE name = 'Bill' or name = 'Sam';
@@ -147,10 +158,13 @@ WHERE name = 'Bill' or name = 'Sam';
 -- remove functions that were created in this file

 DROP FUNCTION c_overpaid(EMP, integer);
+DROP FUNCTION t_starts_with(text, text);
+DROP FUNCTION concat_text(text, text);
 DROP FUNCTION copytext(text);
 DROP FUNCTION makepoint(point, point);
+DROP FUNCTION add_one(double precision);
 DROP FUNCTION add_one(integer);
---DROP FUNCTION clean_EMP();
+DROP FUNCTION clean_EMP();
 DROP FUNCTION high_pay();
 DROP FUNCTION new_emp();
 DROP FUNCTION add_em(integer, integer);

Re: [PATCH] Add some documentation on how to call internal functions

From
Pavel Stehule
Date:


pá 18. 10. 2024 v 22:23 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> I'll mark this patch as ready for committer

I spent a little time looking at this.  I agree that it's better to
show conversion of the example function's arguments to and from text*
rather than leaving them as Datum.  I also do think that we need to
incorporate the example into src/tutorial, if only because that
provides a chance to test it.  And indeed, testing exposed that the
example doesn't work:

$ cd src/tutorial
$ make
$ psql postgres
psql (18devel)
Type "help" for help.

postgres=# \i funcs.sql

psql:funcs.sql:152: ERROR:  could not determine which collation to use for string comparison
HINT:  Use the COLLATE clause to set the collation explicitly.

The problem here is that we failed to pass through the result of
PG_GET_COLLATION() to text_starts_with.  We could do that, certainly,
for a couple more lines of code.  But it feels like this is getting
into details that obscure the main point.  I wonder if it'd be better
to choose a different example that calls a non-collation-dependent
target function.

This can be a trap for some beginners too. So example of DirectFunctionCall2Coll can be nice

<-->result = DatumGetBool(DirectFunctionCall2Coll(text_starts_with,
<--><--><--><--><--><--><--><--><--><--><--><-->  DEFAULT_COLLATION_OID,
<--><--><--><--><--><--><--><--><--><--><--><-->  PointerGetDatum(t1),
<--><--><--><--><--><--><--><--><--><--><--><-->  PointerGetDatum(t2)));

With comment so text based functions can require collation - and simple solution can be using DEFAULT_COLLATION_OID, and we can
introduce second example of just DirectFunctionCall

Datum
bytea_left(PG_FUNCTION_ARGS)
{
<-->bytea<->   *t = PG_GETARG_BYTEA_PP(0);
<-->int32<-><-->l = PG_GETARG_INT32(1);
<-->bytea<->   *result;

<-->result = DatumGetByteaPP(DirectFunctionCall3(bytea_substr,
<--><--><--><--><--><--><--><--><--><--><--><--> PointerGetDatum(t),
<--><--><--><--><--><--><--><--><--><--><--><--> Int32GetDatum(1),
<--><--><--><--><--><--><--><--><--><--><--><--> Int32GetDatum(l)));
<-->PG_RETURN_BYTEA_P(result);
}

 

Anyway, proposed v3 attached.  I folded the two patches into one,
and editorialized on the text a little.

                        regards, tom lane