Thread: Options to rowwise persist result of stable/immutable function with RECORD result

Hi,

I have noticed a rather odd behaviour that is not strictly a bug, but is unexpected.

It is when a immutable (or stable) PG function is returning results in a record structure a select on these calls the function repeatedly for each element in the output record.

See below for an example.

Sure I can work around this by returning in an array, or materialised as a whole by e.g. a materialised CTE, but what I'm looking for is materialising of just the individual row during processing, if the function is to be called on many rows.

Obviously in theory the returned record could be very complex, so we might not want it materialised in general, but an option to do so would be nice. I would suggest that a WITH could be marked with a new "MATERIALIZED ROW" option (reusing already reserved keywords).

Note how I below have set the cost extreme, in this test, the value does not affect the behaviour..

The result set here have five elements, if i change the type to VOLATILE, the execution time is reduced by a factor of five (see the difference between the stamp of line one and two). It is directly proportional to the number of elements requested from the record (here I requested all) 

(The real life scenario is a function that by a list of reg_ex expessions, splits up the input in numerous fields, And I noticed the behaviour as a raise function added for debug, put out the same repeatedly.)

-----------------

DROP TYPE IF EXISTS septima.foo_type CASCADE;
CREATE TYPE septima.foo_type AS (a text, b text, c text, d text, e text);
DROP FUNCTION IF EXISTS septima.foo(text);
CREATE OR REPLACE FUNCTION septima.foo(inp text) RETURNS septima.foo_type
AS
$BODY$
DECLARE
  result_record septima.foo_type;
  i BIGINT :=12345678;
BEGIN
  WHILE 0<i LOOP
    i=i-1;
  END LOOP;
  RETURN result_record;
END
$BODY$
  LANGUAGE plpgsql IMMUTABLE
  COST 1234567890;
;
WITH x AS (
  SELECT  * FROM (
    SELECT clock_timestamp() rowstart, (g).*, clock_timestamp() rowend FROM (
      SELECT septima.foo(inp) g FROM (
        SELECT '1' inp UNION
        SELECT '2')
    y) x
  ) x
)
SELECT * FROM x;
DROP TYPE IF EXISTS septima.foo_type CASCADE;

Med venlig hilsen
Eske Rahn
Seniorkonsulent
+45 93 87 96 30 
--------------------------
Septima P/S
Frederiksberggade 19, 2. sal
1459 København K
+45 72 30 06 72

Re: Options to rowwise persist result of stable/immutable function with RECORD result

From
"David G. Johnston"
Date:
On Tuesday, March 21, 2023, Eske Rahn <eske@septima.dk> wrote:
Hi,

I have noticed a rather odd behaviour that is not strictly a bug, but is unexpected.

It is when a immutable (or stable) PG function is returning results in a record structure a select on these calls the function repeatedly for each element in the output record.

The LATERAL join modifier exists to handle this kind of situation.

David J.
 
Hi,

Thanks for the quick answer :-D

That was a nice sideeffect of lateral.

In the example, the calling code also gets simplified:

WITH x AS (
  SELECT clock_timestamp() rowstart, *, clock_timestamp() rowend FROM (
    SELECT '1' inp UNION
    SELECT '2'
  ) y,  LATERAL septima.foo(inp) g
)
SELECT * FROM x;


That solved the issue at hand, in a much better way. Thanks

Though I still fail to see why the other way should generally call the function for every column in the result record - if the function is STABLE or IMMUTABLE.

BUT as I can not think up a sensible example where LATERAL will not do the trick, so the oddity becomes academic.
So just a thing to remember: always use lateral with functions with record result types - unless they are volatile)




Med venlig hilsen
Eske Rahn
Seniorkonsulent
+45 93 87 96 30 
--------------------------
Septima P/S
Frederiksberggade 19, 2. sal
1459 København K
+45 72 30 06 72


On Wed, Mar 22, 2023 at 10:50 PM David G. Johnston <david.g.johnston@gmail.com> wrote:
On Tuesday, March 21, 2023, Eske Rahn <eske@septima.dk> wrote:
Hi,

I have noticed a rather odd behaviour that is not strictly a bug, but is unexpected.

It is when a immutable (or stable) PG function is returning results in a record structure a select on these calls the function repeatedly for each element in the output record.

The LATERAL join modifier exists to handle this kind of situation.

David J.
 

Re: Options to rowwise persist result of stable/immutable function with RECORD result

From
"David G. Johnston"
Date:
On Wed, Mar 22, 2023 at 4:32 PM Eske Rahn <eske@septima.dk> wrote:
Hi,

Thanks for the quick answer :-D

That was a nice sideeffect of lateral.

In the example, the calling code also gets simplified:

WITH x AS (
  SELECT clock_timestamp() rowstart, *, clock_timestamp() rowend FROM (
    SELECT '1' inp UNION
    SELECT '2'
  ) y,  LATERAL septima.foo(inp) g
)
SELECT * FROM x;


That solved the issue at hand, in a much better way. Thanks

Though I still fail to see why the other way should generally call the function for every column in the result record - if the function is STABLE or IMMUTABLE.

It gets rewritten to be effectively:

select func_call(...).col1, func_call(...).col2, func_call(...).col3

under the assumption that repeating the function call will be cheap and side-effect free.  It was never ideal but fixing that form of optimization was harder than implementing LATERAL where the multi-column result has a natural output in the form of a multi-column table.  A normal function call in the target list really means "return a single value" which is at odds with writing .* after it.

David J.

Re: Options to rowwise persist result of stable/immutable function with RECORD result

From
"David G. Johnston"
Date:
On Wed, Mar 22, 2023 at 4:46 PM David G. Johnston <david.g.johnston@gmail.com> wrote:
On Wed, Mar 22, 2023 at 4:32 PM Eske Rahn <eske@septima.dk> wrote:
Hi,

Thanks for the quick answer :-D

That was a nice sideeffect of lateral.

In the example, the calling code also gets simplified:

WITH x AS (
  SELECT clock_timestamp() rowstart, *, clock_timestamp() rowend FROM (
    SELECT '1' inp UNION
    SELECT '2'
  ) y,  LATERAL septima.foo(inp) g
)
SELECT * FROM x;


That solved the issue at hand, in a much better way. Thanks

Though I still fail to see why the other way should generally call the function for every column in the result record - if the function is STABLE or IMMUTABLE.

It gets rewritten to be effectively:

select func_call(...).col1, func_call(...).col2, func_call(...).col3

under the assumption that repeating the function call will be cheap and side-effect free.  It was never ideal but fixing that form of optimization was harder than implementing LATERAL where the multi-column result has a natural output in the form of a multi-column table.  A normal function call in the target list really means "return a single value" which is at odds with writing .* after it.


Actually, it is less "optimization" and more "SQL is strongly typed and all columns must be defined during query compilation".

David J.