Thread: Options to rowwise persist result of stable/immutable function with RECORD result
Options to rowwise persist result of stable/immutable function with RECORD result
From
Eske Rahn
Date:
Hi,
I have noticed a rather odd behaviour that is not strictly a bug, but is unexpected.
It is when a immutable (or stable) PG function is returning results in a record structure a select on these calls the function repeatedly for each element in the output record.
See below for an example.It is when a immutable (or stable) PG function is returning results in a record structure a select on these calls the function repeatedly for each element in the output record.
Sure I can work around this by returning in an array, or materialised as a whole by e.g. a materialised CTE, but what I'm looking for is materialising of just the individual row during processing, if the function is to be called on many rows.
Obviously in theory the returned record could be very complex, so we might not want it materialised in general, but an option to do so would be nice. I would suggest that a WITH could be marked with a new "MATERIALIZED ROW" option (reusing already reserved keywords).
Note how I below have set the cost extreme, in this test, the value does not affect the behaviour..
The result set here have five elements, if i change the type to VOLATILE, the execution time is reduced by a factor of five (see the difference between the stamp of line one and two). It is directly proportional to the number of elements requested from the record (here I requested all)
(The real life scenario is a function that by a list of reg_ex expessions, splits up the input in numerous fields, And I noticed the behaviour as a raise function added for debug, put out the same repeatedly.)
-----------------
The result set here have five elements, if i change the type to VOLATILE, the execution time is reduced by a factor of five (see the difference between the stamp of line one and two). It is directly proportional to the number of elements requested from the record (here I requested all)
(The real life scenario is a function that by a list of reg_ex expessions, splits up the input in numerous fields, And I noticed the behaviour as a raise function added for debug, put out the same repeatedly.)
-----------------
DROP TYPE IF EXISTS septima.foo_type CASCADE;
CREATE TYPE septima.foo_type AS (a text, b text, c text, d text, e text);
DROP FUNCTION IF EXISTS septima.foo(text);
CREATE OR REPLACE FUNCTION septima.foo(inp text) RETURNS septima.foo_type
AS
$BODY$
DECLARE
result_record septima.foo_type;
i BIGINT :=12345678;
BEGIN
WHILE 0<i LOOP
i=i-1;
END LOOP;
RETURN result_record;
END
$BODY$
LANGUAGE plpgsql IMMUTABLE
COST 1234567890;
;
WITH x AS (
SELECT * FROM (
SELECT clock_timestamp() rowstart, (g).*, clock_timestamp() rowend FROM (
SELECT septima.foo(inp) g FROM (
SELECT '1' inp UNION
SELECT '2')
y) x
) x
)
SELECT * FROM x;
DROP TYPE IF EXISTS septima.foo_type CASCADE;
CREATE TYPE septima.foo_type AS (a text, b text, c text, d text, e text);
DROP FUNCTION IF EXISTS septima.foo(text);
CREATE OR REPLACE FUNCTION septima.foo(inp text) RETURNS septima.foo_type
AS
$BODY$
DECLARE
result_record septima.foo_type;
i BIGINT :=12345678;
BEGIN
WHILE 0<i LOOP
i=i-1;
END LOOP;
RETURN result_record;
END
$BODY$
LANGUAGE plpgsql IMMUTABLE
COST 1234567890;
;
WITH x AS (
SELECT * FROM (
SELECT clock_timestamp() rowstart, (g).*, clock_timestamp() rowend FROM (
SELECT septima.foo(inp) g FROM (
SELECT '1' inp UNION
SELECT '2')
y) x
) x
)
SELECT * FROM x;
DROP TYPE IF EXISTS septima.foo_type CASCADE;
Eske Rahn
Seniorkonsulent
+45 93 87 96 30
--------------------------
Septima P/S
Frederiksberggade 19, 2. sal
1459 København K
+45 72 30 06 72
Re: Options to rowwise persist result of stable/immutable function with RECORD result
From
"David G. Johnston"
Date:
On Tuesday, March 21, 2023, Eske Rahn <eske@septima.dk> wrote:
Hi,I have noticed a rather odd behaviour that is not strictly a bug, but is unexpected.
It is when a immutable (or stable) PG function is returning results in a record structure a select on these calls the function repeatedly for each element in the output record.
The LATERAL join modifier exists to handle this kind of situation.
David J.
Re: Options to rowwise persist result of stable/immutable function with RECORD result
From
Eske Rahn
Date:
Hi,
That was a nice sideeffect of lateral.
In the example, the calling code also gets simplified:
WITH x AS (
SELECT clock_timestamp() rowstart, *, clock_timestamp() rowend FROM (
SELECT '1' inp UNION
SELECT '2'
) y, LATERAL septima.foo(inp) g
)
SELECT * FROM x;
WITH x AS (
SELECT clock_timestamp() rowstart, *, clock_timestamp() rowend FROM (
SELECT '1' inp UNION
SELECT '2'
) y, LATERAL septima.foo(inp) g
)
SELECT * FROM x;
That solved the issue at hand, in a much better way. Thanks
Though I still fail to see why the other way should generally call the function for every column in the result record - if the function is STABLE or IMMUTABLE.
BUT as I can not think up a sensible example where LATERAL will not do the trick, so the oddity becomes academic.
So just a thing to remember: always use lateral with functions with record result types - unless they are volatile)
Eske Rahn
Seniorkonsulent
+45 93 87 96 30
--------------------------
Septima P/S
Frederiksberggade 19, 2. sal
1459 København K
+45 72 30 06 72
On Wed, Mar 22, 2023 at 10:50 PM David G. Johnston <david.g.johnston@gmail.com> wrote:
On Tuesday, March 21, 2023, Eske Rahn <eske@septima.dk> wrote:Hi,I have noticed a rather odd behaviour that is not strictly a bug, but is unexpected.
It is when a immutable (or stable) PG function is returning results in a record structure a select on these calls the function repeatedly for each element in the output record.The LATERAL join modifier exists to handle this kind of situation.David J.
Re: Options to rowwise persist result of stable/immutable function with RECORD result
From
"David G. Johnston"
Date:
On Wed, Mar 22, 2023 at 4:32 PM Eske Rahn <eske@septima.dk> wrote:
Hi,Thanks for the quick answer :-DThat was a nice sideeffect of lateral.In the example, the calling code also gets simplified:
WITH x AS (
SELECT clock_timestamp() rowstart, *, clock_timestamp() rowend FROM (
SELECT '1' inp UNION
SELECT '2'
) y, LATERAL septima.foo(inp) g
)
SELECT * FROM x;That solved the issue at hand, in a much better way. ThanksThough I still fail to see why the other way should generally call the function for every column in the result record - if the function is STABLE or IMMUTABLE.
It gets rewritten to be effectively:
select func_call(...).col1, func_call(...).col2, func_call(...).col3
under the assumption that repeating the function call will be cheap and side-effect free. It was never ideal but fixing that form of optimization was harder than implementing LATERAL where the multi-column result has a natural output in the form of a multi-column table. A normal function call in the target list really means "return a single value" which is at odds with writing .* after it.
David J.
Re: Options to rowwise persist result of stable/immutable function with RECORD result
From
"David G. Johnston"
Date:
On Wed, Mar 22, 2023 at 4:46 PM David G. Johnston <david.g.johnston@gmail.com> wrote:
On Wed, Mar 22, 2023 at 4:32 PM Eske Rahn <eske@septima.dk> wrote:Hi,Thanks for the quick answer :-DThat was a nice sideeffect of lateral.In the example, the calling code also gets simplified:
WITH x AS (
SELECT clock_timestamp() rowstart, *, clock_timestamp() rowend FROM (
SELECT '1' inp UNION
SELECT '2'
) y, LATERAL septima.foo(inp) g
)
SELECT * FROM x;That solved the issue at hand, in a much better way. ThanksThough I still fail to see why the other way should generally call the function for every column in the result record - if the function is STABLE or IMMUTABLE.It gets rewritten to be effectively:select func_call(...).col1, func_call(...).col2, func_call(...).col3under the assumption that repeating the function call will be cheap and side-effect free. It was never ideal but fixing that form of optimization was harder than implementing LATERAL where the multi-column result has a natural output in the form of a multi-column table. A normal function call in the target list really means "return a single value" which is at odds with writing .* after it.
Actually, it is less "optimization" and more "SQL is strongly typed and all columns must be defined during query compilation".
David J.