Thread: [PERFORM] repeated subplan execution

[PERFORM] repeated subplan execution

From
Jeff Janes
Date:
I have a complicated query which runs the exact same subplan more than once.

Here is a greatly simplified (and rather pointless) query to replicate the issue: 

select aid, sum_bid from
    (select
        aid,
        (select sum(bid) from pgbench_branches
            where bbalance between -10000-abalance and 1+abalance
        ) as sum_bid
        from pgbench_accounts
        where aid between 1 and 1000
        group by aid
    ) asdfsadf
where sum_bid >0;

                                                                     QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Group  (cost=0.44..375841.29 rows=931 width=12) (actual time=1.233..691.200 rows=679 loops=1)
   Group Key: pgbench_accounts.aid
   Filter: ((SubPlan 2) > 0)
   Rows Removed by Filter: 321
   ->  Index Scan using pgbench_accounts_pkey on pgbench_accounts  (cost=0.44..634.32 rows=931 width=8) (actual time=0.040..1.783 rows=1000 loops=1)
         Index Cond: ((aid >= 1) AND (aid <= 1000))
   SubPlan 2
     ->  Aggregate  (cost=403.00..403.01 rows=1 width=8) (actual time=0.406..0.407 rows=1 loops=1000)
           ->  Seq Scan on pgbench_branches pgbench_branches_1  (cost=0.00..403.00 rows=1 width=4) (actual time=0.392..0.402 rows=1 loops=1000)
                 Filter: ((bbalance >= ('-10000'::integer - pgbench_accounts.abalance)) AND (bbalance <= (1 + pgbench_accounts.abalance)))
                 Rows Removed by Filter: 199
   SubPlan 1
     ->  Aggregate  (cost=403.00..403.01 rows=1 width=8) (actual time=0.407..0.407 rows=1 loops=679)
           ->  Seq Scan on pgbench_branches  (cost=0.00..403.00 rows=1 width=4) (actual time=0.388..0.402 rows=1 loops=679)
                 Filter: ((bbalance >= ('-10000'::integer - pgbench_accounts.abalance)) AND (bbalance <= (1 + pgbench_accounts.abalance)))
                 Rows Removed by Filter: 199
 Planning time: 0.534 ms
 Execution time: 691.784 ms




The subplan is not so fast that I wish it to be executed again or every row which passes the filter.  

I can prevent this dual execution using a CTE, but that creates other problems.  Is there a way to get rid of it without resorting to that?

Maybe also a question for bugs and/or hackers, is why should I need to do anything special to avoid dual execution?

Cheers,

Jeff

Re: [PERFORM] repeated subplan execution

From
monika yadav
Date:
Hi All,

I didn't understand why same sub plan for the sub query executed two times? As per the query it should have been executed only once.

Can someone please explain this behaviour of query execution ? 

Thanks a lot.

On Wed, 20 Sep 2017 at 5:01 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
I have a complicated query which runs the exact same subplan more than once.

Here is a greatly simplified (and rather pointless) query to replicate the issue: 

select aid, sum_bid from
    (select
        aid,
        (select sum(bid) from pgbench_branches
            where bbalance between -10000-abalance and 1+abalance
        ) as sum_bid
        from pgbench_accounts
        where aid between 1 and 1000
        group by aid
    ) asdfsadf
where sum_bid >0;

                                                                     QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Group  (cost=0.44..375841.29 rows=931 width=12) (actual time=1.233..691.200 rows=679 loops=1)
   Group Key: pgbench_accounts.aid
   Filter: ((SubPlan 2) > 0)
   Rows Removed by Filter: 321
   ->  Index Scan using pgbench_accounts_pkey on pgbench_accounts  (cost=0.44..634.32 rows=931 width=8) (actual time=0.040..1.783 rows=1000 loops=1)
         Index Cond: ((aid >= 1) AND (aid <= 1000))
   SubPlan 2
     ->  Aggregate  (cost=403.00..403.01 rows=1 width=8) (actual time=0.406..0.407 rows=1 loops=1000)
           ->  Seq Scan on pgbench_branches pgbench_branches_1  (cost=0.00..403.00 rows=1 width=4) (actual time=0.392..0.402 rows=1 loops=1000)
                 Filter: ((bbalance >= ('-10000'::integer - pgbench_accounts.abalance)) AND (bbalance <= (1 + pgbench_accounts.abalance)))
                 Rows Removed by Filter: 199
   SubPlan 1
     ->  Aggregate  (cost=403.00..403.01 rows=1 width=8) (actual time=0.407..0.407 rows=1 loops=679)
           ->  Seq Scan on pgbench_branches  (cost=0.00..403.00 rows=1 width=4) (actual time=0.388..0.402 rows=1 loops=679)
                 Filter: ((bbalance >= ('-10000'::integer - pgbench_accounts.abalance)) AND (bbalance <= (1 + pgbench_accounts.abalance)))
                 Rows Removed by Filter: 199
 Planning time: 0.534 ms
 Execution time: 691.784 ms




The subplan is not so fast that I wish it to be executed again or every row which passes the filter.  

I can prevent this dual execution using a CTE, but that creates other problems.  Is there a way to get rid of it without resorting to that?

Maybe also a question for bugs and/or hackers, is why should I need to do anything special to avoid dual execution?

Cheers,

Jeff

Re: [PERFORM] repeated subplan execution

From
Jeff Janes
Date:
On Tue, Sep 19, 2017 at 7:31 PM, monika yadav <monika.1234yadav@gmail.com> wrote:
Hi All,

I didn't understand why same sub plan for the sub query executed two times? As per the query it should have been executed only once.

Can someone please explain this behaviour of query execution ? 


The sum_bid at the end of the query is an alias for the entire subselect, so it not entirely surprising that it gets interpolated twice. it is just kind of unfortunate from a performance perspective.

The query I originally gave is equivalent to this query:


 select
        aid,
        (select sum(bid) from pgbench_branches
            where bbalance between -10000-abalance and 1+abalance
        ) as sum_bid
        from pgbench_accounts
        where aid between 1 and 1000
        group by aid 
        having (select sum(bid) from pgbench_branches where bbalance between -10000-abalance and 1+abalance ) >0;


In my originally query I just wrapped the whole thing in another select, so that I could use the alias rather than having to mechanically repeat the entire subquery again in the HAVING section.  They give identical plans.

Cheers,

Jeff

Re: [PERFORM] repeated subplan execution

From
monika yadav
Date:
Hi Jeff,

Thanks for the update and clarification. I will look to see a better alternative to resolve this twice execution of same plan.

On Wed, Sep 20, 2017 at 10:16 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
On Tue, Sep 19, 2017 at 7:31 PM, monika yadav <monika.1234yadav@gmail.com> wrote:
Hi All,

I didn't understand why same sub plan for the sub query executed two times? As per the query it should have been executed only once.

Can someone please explain this behaviour of query execution ? 


The sum_bid at the end of the query is an alias for the entire subselect, so it not entirely surprising that it gets interpolated twice. it is just kind of unfortunate from a performance perspective.

The query I originally gave is equivalent to this query:


 select
        aid,
        (select sum(bid) from pgbench_branches
            where bbalance between -10000-abalance and 1+abalance
        ) as sum_bid
        from pgbench_accounts
        where aid between 1 and 1000
        group by aid 
        having (select sum(bid) from pgbench_branches where bbalance between -10000-abalance and 1+abalance ) >0;


In my originally query I just wrapped the whole thing in another select, so that I could use the alias rather than having to mechanically repeat the entire subquery again in the HAVING section.  They give identical plans.

Cheers,

Jeff