Thread: Plan output: actual execution time not considering loops?

From
Thomas Kellerer
Date:

Consider the following execution plan:

Seq Scan on orders o1  (cost=0.00..18818840.86 rows=3500 width=16) (actual time=0.033..8625.104 rows=99906 loops=1)
  Filter: (amount = (SubPlan 1))
  Rows Removed by Filter: 600094
  Buffers: shared hit=7719778
  SubPlan 1
    ->  Aggregate  (cost=26.87..26.87 rows=1 width=32) (actual time=0.012..0.012 rows=1 loops=700000)
          Buffers: shared hit=7714631
          ->  Bitmap Heap Scan on orders o2  (cost=3.45..26.85 rows=8 width=8) (actual time=0.004..0.008 rows=8 loops=700000)
                Recheck Cond: (customer_id = o1.customer_id)
                Heap Blocks: exact=5597311
                Buffers: shared hit=7714631
                ->  Bitmap Index Scan on orders_customer_id_order_date_idx  (cost=0.00..3.45 rows=8 width=0) (actual time=0.003..0.003 rows=8 loops=700000)
                      Index Cond: (customer_id = o1.customer_id)
                      Buffers: shared hit=2117320
Planning time: 0.136 ms
Execution time: 8628.724 ms

My expectation would have been that the "Aggregate" step shows the actual time as a product of the number of loops.

The Bitmap Heap Scan takes 0.008ms for each execution, so shouldn't the "actual time" for the "Aggregate" step be 0.008ms * 700000 (= 5600ms)?

The plan was generated with Postgres 10.4 (on Windows 10, but I think that is irrelevant).

Thomas

Re: Plan output: actual execution time not considering loops?

From
Tom Lane
Date:
Thomas Kellerer <spam_eater@gmx.net> writes:
> Consider the following execution plan:
> ...
>     ->  Aggregate  (cost=26.87..26.87 rows=1 width=32) (actual time=0.012..0.012 rows=1 loops=700000)
>           ->  Bitmap Heap Scan on orders o2  (cost=3.45..26.85 rows=8 width=8) (actual time=0.004..0.008 rows=8 loops=700000)
>                 ->  Bitmap Index Scan on orders_customer_id_order_date_idx  (cost=0.00..3.45 rows=8 width=0) (actual time=0.003..0.003 rows=8 loops=700000)

> My expectation would have been that the "Aggregate" step shows the actual time as a product of the number of loops.

No, that looks fine to me.  The rule of thumb for reading this is total
time spent in/below this node is "actual time" times "number of loops".
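Applied to the plan above, that rule gives a quick sanity check (a back-of-the-envelope calculation, not part of the original mail): the Aggregate's per-loop time multiplied by its loop count accounts for most of the reported runtime.

```sql
-- per-loop "actual time" of the Aggregate node times its loop count
SELECT 0.012 * 700000 AS subplan_total_ms;
-- ~8400 ms, close to the reported 8628.724 ms total execution time
```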

It seems a bit odd that the Agg node would account for a third of the
total execution time when it's only processing 8 rows on average ...
but maybe it's a really expensive aggregate.

Another thought is that the EXPLAIN ANALYZE instrumentation itself
can account for significant per-node-invocation overhead.  If the
total execution time drops significantly when you add "timing off"
to the EXPLAIN options, then that's probably a factor in making
the Agg node look relatively expensive.
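Concretely, that suggestion would look something like this (the TIMING option requires ANALYZE and is available since 9.2; the query itself is elided here):

```sql
EXPLAIN (ANALYZE, TIMING OFF, BUFFERS)
SELECT ...;  -- the query under investigation
```

With TIMING OFF, per-node timings are skipped but row counts, loop counts, and the total execution time are still reported, so comparing the totals with and without it isolates the per-node instrumentation overhead.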

            regards, tom lane


Re: Plan output: actual execution time not considering loops?

From
Thomas Kellerer
Date:
Tom Lane wrote on 20.06.2018 at 16:03:
>> Consider the following execution plan:
>> ...
>>     ->  Aggregate  (cost=26.87..26.87 rows=1 width=32) (actual time=0.012..0.012 rows=1 loops=700000)
>>           ->  Bitmap Heap Scan on orders o2  (cost=3.45..26.85 rows=8 width=8) (actual time=0.004..0.008 rows=8 loops=700000)
>>                 ->  Bitmap Index Scan on orders_customer_id_order_date_idx  (cost=0.00..3.45 rows=8 width=0) (actual time=0.003..0.003 rows=8 loops=700000)
> 
>> My expectation would have been that the "Aggregate" step shows the actual time as a product of the number of loops.
> 
> No, that looks fine to me.  The rule of thumb for reading this is total
> time spent in/below this node is "actual time" times "number of loops".

OK, if that is the rule I can live with that ;)

> It seems a bit odd that the Agg node would account for a third of the
> total execution time when it's only processing 8 rows on average ...
> but maybe it's a really expensive aggregate.

But it's processing those 8 rows 700,000 times - so the total time seems correct.

FWIW, the query looks like this:

    select customer_id, 
           amount, 
           sales_person_id
    from orders o1
    where amount = (select max(o2.amount)
                    from orders o2
                    where o2.customer_id = o1.customer_id);

It's not a real-world query - it's just there to illustrate the drawbacks of correlated subqueries.
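For comparison, a common way to avoid the per-row subplan is to compute the maximum once per customer, e.g. with a window function (a sketch against the same assumed orders table):

```sql
-- compute max(amount) per customer in a single pass over orders,
-- then keep only the rows that match their customer's maximum
SELECT customer_id, amount, sales_person_id
FROM (SELECT customer_id,
             amount,
             sales_person_id,
             max(amount) OVER (PARTITION BY customer_id) AS max_amount
      FROM orders) o
WHERE amount = max_amount;
```

This returns the same rows as the correlated form, but the SubPlan (and its 700,000 executions) disappears from the plan.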
 
> Another thought is that the EXPLAIN ANALYZE instrumentation itself
> can account for significant per-node-invocation overhead.  If the
> total execution time drops significantly when you add "timing off"
> to the EXPLAIN options, then that's probably a factor in making
> the Agg node look relatively expensive.

"timing off" doesn't really change the execution time (it's only about 60 ms faster with timing disabled)

Thanks for the answer, I am not really concerned about the query performance itself, just about the plan ;) 

Thomas