[PATCH] Allow parallelism for plpgsql return expression after commit 556f7b7 - Mailing list pgsql-hackers

From DIPESH DHAMELIYA
Subject [PATCH] Allow parallelism for plpgsql return expression after commit 556f7b7
Date
Msg-id CABgZEgdfbnq9t6xXJnmXbChNTcWFjeM_6nuig41tm327gYi2ig@mail.gmail.com
Whole thread Raw
List pgsql-hackers
Hello everyone,

With the commit 556f7b7bc18d34ddec45392965c3b3038206bb62, Any plpgsql function that returns scalar value would not be able to use parallelism to evaluate a return statement. It will not be considered for parallel execution because we are passing maxtuples = 2 to exec_run_select from exec_eval_expr to evaluate the return expression of the function.

Call stake to ExecutePlan -

#0  ExecutePlan (queryDesc=0x589c390, operation=CMD_SELECT, sendTuples=true, numberTuples=2, direction=ForwardScanDirection, dest=0xe15ca0 <spi_printtupDR>) at execMain.c:1654

#1  0x000000000075edb6 in standard_ExecutorRun (queryDesc=0x589c390, direction=ForwardScanDirection, count=2, execute_once=true) at execMain.c:366

#2  0x00007f5749c9b8d8 in explain_ExecutorRun (queryDesc=0x589c390, direction=ForwardScanDirection, count=2, execute_once=true) at auto_explain.c:334

#3  0x000000000075ec25 in ExecutorRun (queryDesc=0x589c390, direction=ForwardScanDirection, count=2, execute_once=true) at execMain.c:310

#4  0x00000000007c4a48 in _SPI_pquery (queryDesc=0x589c390, fire_triggers=true, tcount=2) at spi.c:2980

#5  0x00000000007c44a9 in _SPI_execute_plan (plan=0x5878780, options=0x7ffc6ad467e0, snapshot=0x0, crosscheck_snapshot=0x0, fire_triggers=true) at spi.c:2747

#6  0x00000000007c135f in SPI_execute_plan_with_paramlist (plan=0x5878780, params=0x0, read_only=false, tcount=2) at spi.c:765

#7  0x00007f5749eb4a8b in exec_run_select (estate=0x7ffc6ad46ba0, expr=0x5892b80, maxtuples=2, portalP=0x0) at pl_exec.c:5840 <-- maxtuples = 2

#8  0x00007f5749eb46fe in exec_eval_expr (estate=0x7ffc6ad46ba0, expr=0x5892b80, isNull=0x7ffc6ad46bc0, rettype=0x7ffc6ad46bc4, rettypmod=0x7ffc6ad468e8) at pl_exec.c:5734


Consider the following simple repro –

 

postgres=# create table test_tab(a int);

CREATE TABLE

postgres=# insert into test_tab (a) SELECT generate_series(1, 1000000);

INSERT 0 1000000

postgres=# analyse test_tab;

ANALYZE

postgres=# create function test_plpgsql() returns int

language plpgsql

as

$$

begin

return (select count(*) from test_tab where a between 5.0 and 999999.0);

end;

$$;

postgres=# LOAD 'auto_explain';

LOAD

postgres=# SET auto_explain.log_min_duration = 0;

SET

postgres=# SET auto_explain.log_analyze = true;

SET

postgres=# SET auto_explain.log_nested_statements = true;

SET

postgres=# select test_plpgsql();

test_plpgsql

--------------

999995

(1 row)

 

Plan logged in logfile -

    Query Text: (select count(*) from test_tab where a between 5.0 and 999999.0)

    Result  (cost=13763.77..13763.78 rows=1 width=8) (actual time=912.108..912.110 rows=1 loops=1)

      InitPlan 1

        ->  Finalize Aggregate  (cost=13763.76..13763.77 rows=1 width=8) (actual time=912.103..912.104 rows=1 loops=1)

              ->  Gather  (cost=13763.54..13763.75 rows=2 width=8) (actual time=912.096..912.098 rows=1 loops=1)

                    Workers Planned: 2

                    Workers Launched: 0

                    ->  Partial Aggregate  (cost=12763.54..12763.55 rows=1 width=8) (actual time=912.095..912.096 rows=1 loops=1)

                          ->  Parallel Seq Scan on test_tab  (cost=0.00..12758.33 rows=2083 width=0) (actual time=0.022..812.253 rows=999995 loops=1)

                                Filter: (((a)::numeric >= 5.0) AND ((a)::numeric <= 999999.0))

                                Rows Removed by Filter: 5



Patch to fix this issue is attached. Proposed fix should not cause any regression because the number of returned rows is anyway being checked later inside exec_eval_expr(…).


Plan logged after fix – 

 

Query Text: (select count(*) from test_tab where a between 5.0 and 999999.0)

    Result  (cost=13763.77..13763.78 rows=1 width=8) (actual time=324.397..328.007 rows=1.00 loops=1)

      InitPlan 1

        ->  Finalize Aggregate  (cost=13763.76..13763.77 rows=1 width=8) (actual time=324.391..327.999 rows=1.00 loops=1)

              ->  Gather  (cost=13763.54..13763.75 rows=2 width=8) (actual time=324.052..327.989 rows=3.00 loops=1)

                    Workers Planned: 2

                    Workers Launched: 2

                    ->  Partial Aggregate  (cost=12763.54..12763.55 rows=1 width=8) (actual time=320.254..320.255 rows=1.00 loops=3)

                          ->  Parallel Seq Scan on test_tab  (cost=0.00..12758.33 rows=2083 width=0) (actual time=0.029..286.410 rows=333331.67 loops=3)

                                Filter: (((a)::numeric >= 5.0) AND ((a)::numeric <= 999999.0))

                                Rows Removed by Filter: 2

Thanks & Regards,

Dipesh





Attachment

pgsql-hackers by date:

Previous
From: ADRIANO BOLLER
Date:
Subject: RFC: Command Restrictions by INI file with Audit Logging (DROP/TRUNCATE/DELETE)
Next
From: "David G. Johnston"
Date:
Subject: Re: pg_createsubscriber: Fix incorrect handling of cleanup flags