Re: dubious optimization of the function in SELECT INTO target list - Mailing list pgsql-general

From Adrian Klaver
Subject Re: dubious optimization of the function in SELECT INTO target list
Date
Msg-id 561431B1.2070107@aklaver.com
Whole thread Raw
In response to dubious optimization of the function in SELECT INTO target list  (Oleksii Kliukin <alexk@hintbits.com>)
Responses Re: dubious optimization of the function in SELECT INTO target list
List pgsql-general
On 10/06/2015 01:13 PM, Oleksii Kliukin wrote:
> Hello,
>
> I have an issue with a function that is called as a part of the SELECT
> INTO target list in pl/pgSQL. I'd like to illustrate it with a simple
> example:
>
> DO $$
> DECLARE l_id integer;
>      BEGIN
>         SELECT test(id) INTO l_id
>          FROM generate_series(1,10) t(id);
>      END;
> $$ LANGUAGE plpgsql;
>
> It looks like the test function in this example is executed only once.
> In order to check this, one can define the test function as following:
>
> CREATE TABLE foo(id integer);
>
> CREATE OR REPLACE FUNCTION public.test(id integer)
> RETURNS integer
> LANGUAGE plpgsql
> AS
> $fn$
>      BEGIN
>          INSERT INTO foo VALUES($1);
>          RETURN $1;
>      END;
> $fn$
>
> Basically, if we invoke the first example, the foo table with have only
> 1 row and not 10, as supplied by the generate_series.
> However, when ORDER BY is attached to the query, or aggregate (such as
> max, min or array_agg) is wrapped around the test(id) call, the test
> function is called exactly 10 times. If I replace the SELECT INTO with
> PERFORM, it would also be called 10 times. Unfortunately, it is not
> possible to use PERFORM directly in the CTE expression.

What CTE expression?

>
> The documentation on the SELECT INTO suggests that the rows returned by
> the test function may be discarded after the first one:
>
> "If STRICT is not specified in the INTO clause, then target will be set
> to the first row returned by the query, or to nulls if the query
> returned no rows. (Note that "the first row" is not well-defined unless
> you've used ORDER BY.) Any result rows after the first row are
> discarded."
> http://www.postgresql.org/docs/current/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-SQL-ONEROW
>
> However, it does not say anything about the number of rows the query
> target list will be evaluated, meaning one may expect it to be evaluated
> more than once. It seems that in the case of the example above
> optimizing out calls to the 'test' function would only produce an
> expected result if the function itself does not have any side-effects,
> e.g.. qualifies as 'stable' or 'immutable'.

How about:

DO $$
DECLARE l_id integer;
     BEGIN
        FOR l_id IN SELECT id
         FROM generate_series(1,10) as id LOOP
             SELECT INTO l_id test(l_id);
        END LOOP;
     END;
$$ LANGUAGE plpgsql;

>
> Is there some (undocumented) restriction on the functions allowed in the
> SELECT target list, and isn't the optimization to limit the number of
> calls to 'test' to 1 wrong in this case?
>
> Kind regards,
>


--
Adrian Klaver
adrian.klaver@aklaver.com


pgsql-general by date:

Previous
From: Oleksii Kliukin
Date:
Subject: dubious optimization of the function in SELECT INTO target list
Next
From: Oleksii Kliukin
Date:
Subject: Re: dubious optimization of the function in SELECT INTO target list