Re: Using Expanded Objects other than Arrays from plpgsql - Mailing list pgsql-hackers

From Michel Pelletier
Subject Re: Using Expanded Objects other than Arrays from plpgsql
Date
Msg-id CACxu=vJMqLUZR1N-AT4cXw9JHMsN=5-iYvPvKWy-CcKwz_9aBg@mail.gmail.com
Whole thread Raw
In response to Re: Using Expanded Objects other than Arrays from plpgsql  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Using Expanded Objects other than Arrays from plpgsql
List pgsql-hackers
On Wed, Oct 23, 2024 at 8:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Michel Pelletier <pelletier.michel@gmail.com> writes:
> Here's another example:

> CREATE OR REPLACE FUNCTION test2(graph matrix)
>     RETURNS bigint LANGUAGE plpgsql AS
>     $$
>     BEGIN
>     perform set_element(graph, 1, 1, 1);
>     RETURN nvals(graph);
>     end;
>     $$;
> CREATE FUNCTION
> postgres=# select test2(matrix('int32'));
> DEBUG:  new_matrix
> DEBUG:  matrix_get_flat_size
> DEBUG:  flatten_matrix
> DEBUG:  scalar_int32
> DEBUG:  new_scalar
> DEBUG:  matrix_set_element
> DEBUG:  DatumGetMatrix
> DEBUG:  expand_matrix
> DEBUG:  new_matrix
> DEBUG:  DatumGetScalar
> DEBUG:  matrix_get_flat_size
> DEBUG:  matrix_get_flat_size
> DEBUG:  flatten_matrix
> DEBUG:  context_callback_matrix_free
> DEBUG:  context_callback_scalar_free
> DEBUG:  matrix_nvals
> DEBUG:  DatumGetMatrix
> DEBUG:  expand_matrix
> DEBUG:  new_matrix
> DEBUG:  context_callback_matrix_free
> DEBUG:  context_callback_matrix_free
>  test2
> -------
>      0
> (1 row)

I'm a little confused by your debug output.  What are "scalar_int32"
and "new_scalar", and what part of the plpgsql function is causing
them to be invoked?

GraphBLAS scalars hold a single element value for the matrix type.  Internally, they are simply 1x1 matrices (much like vectors are 1xn matrices).  The function signature is:

set_element(a matrix, i bigint, j bigint, s scalar)

There is a "CAST (integer as scalar)" function (scalar_int32) that casts Postgres integers to GraphBLAS GrB_INT32 scalar elements (which calls new_scalar because like vectors and matrices, they are expanded objects which have a GrB_Scalar "handle").  Scalars are useful for working with individual values, for example reduce() returns a scalar.  There are way more efficient ways to push huge C arrays of values into matrices but for now I'm just working at the element level.

Another thing that confuses me is why there's a second flatten_matrix
operation happening here.  Shouldn't set_element return its result
as a R/W expanded object?

That confuses me too, and my default assumption is always that I'm doing it wrong.  set_element does return a R/W object afaict, here is the return:


where:

#define OS_RETURN_MATRIX(_matrix) return EOHPGetRWDatum(&(_matrix)->hdr)


> I would expect that to return 1.  If I do "graph = set_element(graph, 1, 1,
> 1)" it works.

I think you have a faulty understanding of PERFORM.  It's defined as
"evaluate this expression and throw away the result", so it's *not*
going to change "graph", not even if set_element declares that
argument as INOUT. 

Faulty indeed, I was going from the plpgsql statement documentation:

"Sometimes it is useful to evaluate an expression or SELECT query but discard the result, for example when calling a function that has side-effects but no useful result value."

My understanding of "side-effects" was flawed there, but I'm fine with "x = set_element(x...)" anyway as I was trying to follow the example of array_append et al.
 
(Our interpretation of OUT arguments for functions
is that they're just an alternate notation for specifying the function
result.)  If you want to avoid the explicit assignment back to "graph"
then the thing to do would be to declare set_element as a procedure,
not a function, with an INOUT argument and then call it with CALL.

I'll stick with the assignment. 

That's only cosmetically different though, in that the updated
"graph" value is still passed back much as if it were a function
result, and then the CALL infrastructure knows it has to assign that
back to the argument variable.  And, as I tried to explain earlier,
that code path currently has no mechanism for avoiding making a copy
of the graph somewhere along the line: it will pass "graph" to the
procedure as either a flat Datum or a R/O expanded object, so that
set_element will be required to copy that before modifying it.

Right, I'm still figuring out exactly what that code flow is.  This is my first dive into these corners of the code so thank you for being patient with me.  I promise to write up some expanded object documentation if this works!
 
We can imagine extending whatever we do for "x := f(x)" cases so that
it also works during "CALL p(x)".  But I think that's only going to
yield cosmetic or notational improvements so I don't want to start
with doing that.  Let's focus first on improving the existing
infrastructure for the f(x) case.

+1

-Michel 

pgsql-hackers by date:

Previous
From: Matthew Morrissette Vance
Date:
Subject: Commutation of array SOME/ANY and ALL operators
Next
From: Tom Lane
Date:
Subject: Re: Commutation of array SOME/ANY and ALL operators