Thread: Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE .

Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE .

From
Masahiko Sawada
Date:
Hi all,

My colleague noticed that the output of EXPLAIN ANALYZE doesn't work
fine for parallel seq scan.

postgres(1)=# explain analyze verbose select count(*) from pgbench_accounts ;
       QUERY PLAN
 

-----------------------------------------------------------------------------------------------------------------------------------------------------Finalize
Aggregate (cost=217018.55..217018.56 rows=1 width=8)
 
(actual time=2640.015..2640.015 rows=1 loops=1)  Output: count(*)  ->  Gather  (cost=217018.33..217018.54 rows=2
width=8)(actual
 
time=2639.064..2640.002 rows=3 loops=1)        Output: (PARTIAL count(*))        Workers Planned: 2        Workers
Launched:2        ->  Partial Aggregate  (cost=216018.33..216018.34 rows=1
 
width=8) (actual time=2632.714..2632.715 rows=1 loops=3)              Output: PARTIAL count(*)              Worker 0:
actualtime=2632.583..2632.584 rows=1 loops=1              Worker 1: actual time=2627.517..2627.517 rows=1 loops=1
      ->  Parallel Seq Scan on public.pgbench_accounts
 
(cost=0.00..205601.67 rows=4166667 width=0) (actual
time=0.042..1685.542 rows=3333333 loops=3)                    Worker 0: actual time=0.033..1657.486 rows=3457968
loops=1                   Worker 1: actual time=0.039..1702.979 rows=3741069 loops=1Planning time: 1.026 msExecution
time:2640.225 ms
 
(15 rows)

For example, the above result shows,
Parallel Seq Scan : actual rows = 3333333
worker 0               : actual rows = 3457968
worker 1               : actual rows = 3741069
Summation of these is 10532370, but actual total rows is 10000000.
I think that Parallel Seq Scan should show actual rows =
10000000(total rows) or actual rows = 2800963(rows collected by
itself). (10000000 maybe better)

After spent time to investigate this behaviour, ISTM that the problem
is nloops of Parallel Seq Scan.
Parallel Seq Scan is done only once, but nloops is incremented to 3.
So its "actual rows" is calculated 3333333(10000000 / 3) at explain.c:L1223.
Is it a bug?

Regards,

--
Masahiko Sawada



Re: Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE .

From
Amit Kapila
Date:
On Mon, Jun 20, 2016 at 11:48 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Hi all,
>
> My colleague noticed that the output of EXPLAIN ANALYZE doesn't work
> fine for parallel seq scan.
>
> postgres(1)=# explain analyze verbose select count(*) from pgbench_accounts ;
>                                                     QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------------------------------
>  Finalize Aggregate  (cost=217018.55..217018.56 rows=1 width=8)
> (actual time=2640.015..2640.015 rows=1 loops=1)
>    Output: count(*)
>    ->  Gather  (cost=217018.33..217018.54 rows=2 width=8) (actual
> time=2639.064..2640.002 rows=3 loops=1)
>          Output: (PARTIAL count(*))
>          Workers Planned: 2
>          Workers Launched: 2
>          ->  Partial Aggregate  (cost=216018.33..216018.34 rows=1
> width=8) (actual time=2632.714..2632.715 rows=1 loops=3)
>                Output: PARTIAL count(*)
>                Worker 0: actual time=2632.583..2632.584 rows=1 loops=1
>                Worker 1: actual time=2627.517..2627.517 rows=1 loops=1
>                ->  Parallel Seq Scan on public.pgbench_accounts
> (cost=0.00..205601.67 rows=4166667 width=0) (actual
> time=0.042..1685.542 rows=3333333 loops=3)
>                      Worker 0: actual time=0.033..1657.486 rows=3457968 loops=1
>                      Worker 1: actual time=0.039..1702.979 rows=3741069 loops=1
>  Planning time: 1.026 ms
>  Execution time: 2640.225 ms
> (15 rows)
>
> For example, the above result shows,
> Parallel Seq Scan : actual rows = 3333333
> worker 0               : actual rows = 3457968
> worker 1               : actual rows = 3741069
> Summation of these is 10532370, but actual total rows is 10000000.
> I think that Parallel Seq Scan should show actual rows =
> 10000000(total rows) or actual rows = 2800963(rows collected by
> itself). (10000000 maybe better)
>

You have to read the rows at Parallel Seq Scan nodes as total count of rows, but you have to consider the loops parameter as well. 
 
>
> After spent time to investigate this behaviour, ISTM that the problem
> is nloops of Parallel Seq Scan.
> Parallel Seq Scan is done only once, but nloops is incremented to 3.
>

nloops here indicates, that it is done for 2 workers and a master backend.

> So its "actual rows" is calculated 3333333(10000000 / 3) at explain.c:L1223.
>

Thats how it should be considered.  You might want to compare the behaviour with other cases where value of nloops is used.

> Is it a bug?
>

I don't think so.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE .

From
Masahiko Sawada
Date:
On Mon, Jun 20, 2016 at 3:42 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Mon, Jun 20, 2016 at 11:48 AM, Masahiko Sawada <sawada.mshk@gmail.com>
> wrote:
>>
>> Hi all,
>>
>> My colleague noticed that the output of EXPLAIN ANALYZE doesn't work
>> fine for parallel seq scan.
>>
>> postgres(1)=# explain analyze verbose select count(*) from
>> pgbench_accounts ;
>>                                                     QUERY PLAN
>>
>>
-----------------------------------------------------------------------------------------------------------------------------------------------------
>>  Finalize Aggregate  (cost=217018.55..217018.56 rows=1 width=8)
>> (actual time=2640.015..2640.015 rows=1 loops=1)
>>    Output: count(*)
>>    ->  Gather  (cost=217018.33..217018.54 rows=2 width=8) (actual
>> time=2639.064..2640.002 rows=3 loops=1)
>>          Output: (PARTIAL count(*))
>>          Workers Planned: 2
>>          Workers Launched: 2
>>          ->  Partial Aggregate  (cost=216018.33..216018.34 rows=1
>> width=8) (actual time=2632.714..2632.715 rows=1 loops=3)
>>                Output: PARTIAL count(*)
>>                Worker 0: actual time=2632.583..2632.584 rows=1 loops=1
>>                Worker 1: actual time=2627.517..2627.517 rows=1 loops=1
>>                ->  Parallel Seq Scan on public.pgbench_accounts
>> (cost=0.00..205601.67 rows=4166667 width=0) (actual
>> time=0.042..1685.542 rows=3333333 loops=3)
>>                      Worker 0: actual time=0.033..1657.486 rows=3457968
>> loops=1
>>                      Worker 1: actual time=0.039..1702.979 rows=3741069
>> loops=1
>>  Planning time: 1.026 ms
>>  Execution time: 2640.225 ms
>> (15 rows)
>>
>> For example, the above result shows,
>> Parallel Seq Scan : actual rows = 3333333
>> worker 0               : actual rows = 3457968
>> worker 1               : actual rows = 3741069
>> Summation of these is 10532370, but actual total rows is 10000000.
>> I think that Parallel Seq Scan should show actual rows =
>> 10000000(total rows) or actual rows = 2800963(rows collected by
>> itself). (10000000 maybe better)
>>
>
> You have to read the rows at Parallel Seq Scan nodes as total count of rows,
> but you have to consider the loops parameter as well.
>

In following case, it look to me that no one collect the tuple.
But it's obviously incorrect, this query collects a tuple(aid = 10) actually.

postgres(1)=# explain analyze verbose select * from pgbench_accounts
where aid = 10;                                                                QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------Gather
(cost=1000.00..217018.43 rows=1 width=97) (actual
 
time=0.541..2094.773 rows=1 loops=1)  Output: aid, bid, abalance, filler  Workers Planned: 2  Workers Launched: 2  ->
ParallelSeq Scan on public.pgbench_accounts
 
(cost=0.00..216018.34 rows=0 width=97) (actual time=1390.109..2088.103
rows=0 loops=3)        Output: aid, bid, abalance, filler        Filter: (pgbench_accounts.aid = 10)        Rows
Removedby Filter: 3333333        Worker 0: actual time=2082.681..2082.681 rows=0 loops=1        Worker 1: actual
time=2087.532..2087.532rows=0 loops=1Planning time: 0.126 msExecution time: 2095.564 ms
 
(12 rows)

How can we consider actual rows and nloops?

Regards,

--
Masahiko Sawada



Re: Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE .

From
Satoshi Nagayasu
Date:
Hi,

2016-06-20 15:42 GMT+09:00 Amit Kapila <amit.kapila16@gmail.com>:
> On Mon, Jun 20, 2016 at 11:48 AM, Masahiko Sawada <sawada.mshk@gmail.com>
> wrote:
>>
>> Hi all,
>>
>> My colleague noticed that the output of EXPLAIN ANALYZE doesn't work
>> fine for parallel seq scan.
>>
>> postgres(1)=# explain analyze verbose select count(*) from
>> pgbench_accounts ;
>>                                                     QUERY PLAN
>>
>>
-----------------------------------------------------------------------------------------------------------------------------------------------------
>>  Finalize Aggregate  (cost=217018.55..217018.56 rows=1 width=8)
>> (actual time=2640.015..2640.015 rows=1 loops=1)
>>    Output: count(*)
>>    ->  Gather  (cost=217018.33..217018.54 rows=2 width=8) (actual
>> time=2639.064..2640.002 rows=3 loops=1)
>>          Output: (PARTIAL count(*))
>>          Workers Planned: 2
>>          Workers Launched: 2
>>          ->  Partial Aggregate  (cost=216018.33..216018.34 rows=1
>> width=8) (actual time=2632.714..2632.715 rows=1 loops=3)
>>                Output: PARTIAL count(*)
>>                Worker 0: actual time=2632.583..2632.584 rows=1 loops=1
>>                Worker 1: actual time=2627.517..2627.517 rows=1 loops=1
>>                ->  Parallel Seq Scan on public.pgbench_accounts
>> (cost=0.00..205601.67 rows=4166667 width=0) (actual
>> time=0.042..1685.542 rows=3333333 loops=3)
>>                      Worker 0: actual time=0.033..1657.486 rows=3457968
>> loops=1
>>                      Worker 1: actual time=0.039..1702.979 rows=3741069
>> loops=1
>>  Planning time: 1.026 ms
>>  Execution time: 2640.225 ms
>> (15 rows)
>>
>> For example, the above result shows,
>> Parallel Seq Scan : actual rows = 3333333
>> worker 0               : actual rows = 3457968
>> worker 1               : actual rows = 3741069
>> Summation of these is 10532370, but actual total rows is 10000000.
>> I think that Parallel Seq Scan should show actual rows =
>> 10000000(total rows) or actual rows = 2800963(rows collected by
>> itself). (10000000 maybe better)
>>
>
> You have to read the rows at Parallel Seq Scan nodes as total count of rows,
> but you have to consider the loops parameter as well.

IMHO, "actual rows" of "Parallel Seq Scan" should not be divided by the loops,
because the total rows processed here is 10000000, not 3333333 * 3.
I think the actual row value shown here "3333333 " is a bit confusing and tricky
for users.

>> After spent time to investigate this behaviour, ISTM that the problem
>> is nloops of Parallel Seq Scan.
>> Parallel Seq Scan is done only once, but nloops is incremented to 3.
>
> nloops here indicates, that it is done for 2 workers and a master backend.

Right, but I'm not sure "loops" is an appropriate word for this context.

>> So its "actual rows" is calculated 3333333(10000000 / 3) at
>> explain.c:L1223.
>>
>
> Thats how it should be considered.  You might want to compare the behaviour
> with other cases where value of nloops is used.
>
>> Is it a bug?
>>
>
> I don't think so.

I would like to propose a few modification for parallel queries to
make it more clear.

>>                ->  Parallel Seq Scan on public.pgbench_accounts (cost=0.00..205601.67 rows=4166667 width=0) (actual
time=0.042..1685.542rows=10000000 workers=2)
 
>>                      Parent: actual time=0.033..1657.486 rows=2800963loops=1
>>                      Worker 0: actual time=0.033..1657.486 rows=3457968 loops=1
>>                      Worker 1: actual time=0.039..1702.979 rows=3741069 loops=1

How about this?

Regards,
-- 
Satoshi Nagayasu <snaga@uptime.jp>



Re: Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE .

From
Amit Langote
Date:
On 2016/06/20 15:42, Amit Kapila wrote:
> On Mon, Jun 20, 2016 at 11:48 AM, Masahiko Sawada <sawada.mshk@gmail.com>
> wrote:
>>
>> Hi all,
>>
>> My colleague noticed that the output of EXPLAIN ANALYZE doesn't work
>> fine for parallel seq scan.
>>

[ ... ]

>> For example, the above result shows,
>> Parallel Seq Scan : actual rows = 3333333
>> worker 0               : actual rows = 3457968
>> worker 1               : actual rows = 3741069
>> Summation of these is 10532370, but actual total rows is 10000000.
>> I think that Parallel Seq Scan should show actual rows =
>> 10000000(total rows) or actual rows = 2800963(rows collected by
>> itself). (10000000 maybe better)
>>
> 
> You have to read the rows at Parallel Seq Scan nodes as total count of
> rows, but you have to consider the loops parameter as well.
> 
>>
>> After spent time to investigate this behaviour, ISTM that the problem
>> is nloops of Parallel Seq Scan.
>> Parallel Seq Scan is done only once, but nloops is incremented to 3.
>>
> 
> nloops here indicates, that it is done for 2 workers and a master backend.

Does the way parallel instrumentation info is currently handled allow to
also show the leader's info maybe like below:

->  Parallel Seq Scan on public.pgbench_accounts
(cost=0.00..205601.67 rows=4166667 width=0) (actual
time=0.042..1685.542 rows=3333333 loops=3)   Leader: actual time=x.xxx..xxxx.xxx rows=2800963 loops=1   Worker 0:
actualtime=0.033..1657.486 rows=3457968 loops=1   Worker 1: actual time=0.039..1702.979 rows=3741069 loops=1
 

Thanks,
Amit





Re: Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE .

From
Amit Kapila
Date:
On Mon, Jun 20, 2016 at 12:24 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
>
> In following case, it look to me that no one collect the tuple.
> But it's obviously incorrect, this query collects a tuple(aid = 10) actually.
>
> postgres(1)=# explain analyze verbose select * from pgbench_accounts
> where aid = 10;
>                                                                  QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------------------------
>  Gather  (cost=1000.00..217018.43 rows=1 width=97) (actual
> time=0.541..2094.773 rows=1 loops=1)
>    Output: aid, bid, abalance, filler
>    Workers Planned: 2
>    Workers Launched: 2
>    ->  Parallel Seq Scan on public.pgbench_accounts
> (cost=0.00..216018.34 rows=0 width=97) (actual time=1390.109..2088.103
> rows=0 loops=3)
>          Output: aid, bid, abalance, filler
>          Filter: (pgbench_accounts.aid = 10)
>          Rows Removed by Filter: 3333333
>          Worker 0: actual time=2082.681..2082.681 rows=0 loops=1
>          Worker 1: actual time=2087.532..2087.532 rows=0 loops=1
>  Planning time: 0.126 ms
>  Execution time: 2095.564 ms
> (12 rows)
>
> How can we consider actual rows and nloops?
>

Here rows is an average value per-execution.   In some cases, you might not get the exact number.  You might find the information in docs [1] useful. Refer Explain Analyze section.

Re: Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE .

From
Amit Kapila
Date:
On Mon, Jun 20, 2016 at 12:47 PM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>
> On 2016/06/20 15:42, Amit Kapila wrote:
> >
> >>
> >> After spent time to investigate this behaviour, ISTM that the problem
> >> is nloops of Parallel Seq Scan.
> >> Parallel Seq Scan is done only once, but nloops is incremented to 3.
> >>
> >
> > nloops here indicates, that it is done for 2 workers and a master backend.
>
> Does the way parallel instrumentation info is currently handled allow to
> also show the leader's info maybe like below:
>
> ->  Parallel Seq Scan on public.pgbench_accounts
> (cost=0.00..205601.67 rows=4166667 width=0) (actual
> time=0.042..1685.542 rows=3333333 loops=3)
>     Leader: actual time=x.xxx..xxxx.xxx rows=2800963 loops=1
>     Worker 0: actual time=0.033..1657.486 rows=3457968 loops=1
>     Worker 1: actual time=0.039..1702.979 rows=3741069 loops=1
>

Currently, we don't have a way to display information like above, however feel free to propose a patch, if you find a neat way to achieve it.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE .

From
Robert Haas
Date:
On Mon, Jun 20, 2016 at 3:06 AM, Satoshi Nagayasu <snaga@uptime.jp> wrote:
> IMHO, "actual rows" of "Parallel Seq Scan" should not be divided by the loops,
> because the total rows processed here is 10000000, not 3333333 * 3.
> I think the actual row value shown here "3333333 " is a bit confusing and tricky
> for users.

I don't think the total number of rows for a node should EVER be
divided by the number of loops.  We've had that behavior for a long
time and I find it to be mind-numbingly stupid.  Whenever I'm reading
an EXPLAIN plan, I end up trying to figure out what's really going on
by multiplying the number of rows shown by the loop count, but often
the row count is very small, like 0 or 1 or 2, so the round-off error
is large and you can't really tell what's happening.  Nobody wants to
know the average number of rows returned per node execution: they
want, as you want here, the total number of rows that node ever
processed.  I doubt we can convince Tom Lane to let us change it, but
feel free to post a patch.

One thing I don't think we can do here is have some weird exception
where parallel query works differently from everything else.  "loops"
just counts the number of times that the node was executed.  It most
often ends up >1 when the plan node is on the inner side of a nested
loop, but parallel query ends up creating that scenario also.  There's
no real way to separate those things out, though.  If a node executes
3 times in one worker, 4 times in another, and once in the leader,
what value are you going to display for loops other than 8?  And if
you accept that's the right answer in that case, then you pretty much
need the answer when it executes once in one worker, once in another
worker, and once in the leader to be 3.  I agree that this is very
confusing - and you're not the first person to complain about it - but
I think that parallel query is merely throwing light on the fact that
the pre-existing behavior of EXPLAIN is poorly chosen, not creating
any fundamentally new issue.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE .

From
"David G. Johnston"
Date:
On Mon, Jun 20, 2016 at 2:54 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>                                                     QUERY PLAN
>>
>> -----------------------------------------------------------------------------------------------------------------------------------------------------
>>  Finalize Aggregate  (cost=217018.55..217018.56 rows=1 width=8)
>> (actual time=2640.015..2640.015 rows=1 loops=1)
>>    Output: count(*)
>>    ->  Gather  (cost=217018.33..217018.54 rows=2 width=8) (actual
>> time=2639.064..2640.002 rows=3 loops=1)
>>          Output: (PARTIAL count(*))
>>          Workers Planned: 2
>>          Workers Launched: 2
>>          ->  Partial Aggregate  (cost=216018.33..216018.34 rows=1
>> width=8) (actual time=2632.714..2632.715 rows=1 loops=3)
>>                Output: PARTIAL count(*)
>>                Worker 0: actual time=2632.583..2632.584 rows=1 loops=1
>>                Worker 1: actual time=2627.517..2627.517 rows=1 loops=1
>>                ->  Parallel Seq Scan on public.pgbench_accounts
>> (cost=0.00..205601.67 rows=4166667 width=0) (actual
>> time=0.042..1685.542 rows=3333333 loops=3)
>>                      Worker 0: actual time=0.033..1657.486 rows=3457968
>> loops=1
>>                      Worker 1: actual time=0.039..1702.979 rows=3741069
>> loops=1
 
postgres(1)=# explain analyze verbose select * from pgbench_accounts
where aid = 10;
                                                                 QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..217018.43 rows=1 width=97) (actual
time=0.541..2094.773 rows=1 loops=1)
   Output: aid, bid, abalance, filler
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Seq Scan on public.pgbench_accounts
(cost=0.00..216018.34 rows=0 width=97) (actual time=1390.109..2088.103
rows=0 loops=3)
         Output: aid, bid, abalance, filler
         Filter: (pgbench_accounts.aid = 10)
         Rows Removed by Filter: 3333333
         Worker 0: actual time=2082.681..2082.681 rows=0 loops=1
         Worker 1: actual time=2087.532..2087.532 rows=0 loops=1
 Planning time: 0.126 ms
 Execution time: 2095.564 ms
(12 rows)


The below discrepancy bothered me but it finally dawned on me that this is exactly what Robert's complaint is about.  The 1 row returned by the leader is averaged across three parallel workers and shows as zero in the "actual" count for the PSS in the second plan.

​Actual rows in parens:

First Plan:

Finalize Agg. (1) <- Gather (3) <- ​Partial Agg (1x3 = 3) { "Leader" (1) + Worker0 (1) + Worker1 (1) } <- ParallelSS (3.33Mx3) { "Leader" (~2M) +  Worker0 (~4M) + Worker1 (~4M) }

Second Plan:

Gather (1) <- ParallelSS (0.33, rounded down :( x 3) { "Leader" (0) + Worker0 (0) + Worker1 (0) }

(rhetorical question, see my first paragraph)
Why, in the first plan, does the PSS node have a total count of 10M which are fed to and aggregated by the Partial Agg node parent, while in the second plan the PSS node shows zero actual rows yet its parent has a row count of 1?

I'm inclined to go with Robert on this - reporting averages seems counter-productive.  Similar information can be had by reporting totals and loop and letting the do the division if needed.

As a middle position if the resultant integer average rows value is 0 can we output <rows=actual loops=0>?  Or omit "loops" altogether?  Maybe this doesn't help on other cases where the round-off error doesn't force the result to zero - I haven't gone that deep into the pool.  Or just start reporting "totalrows=x" in addition to the other information.

While this issue isn't new to Parallel Queries its scope and visibility is greatly expanded now that we have them.  At that same time the history suggests that we don't have to fix this in 9.6 but instead can get a feel for how this plays out in the new reality and then react either via better documentation or implementation changes.

The two obvious issues with changes in this area are:

Backward compatibility (which can be mitigated)
Regression testing expected output changes (which can be funded)

In any case I do think it would be nice to output leader-as-worker data alongside dedicated workers.  The fewer the hidden numbers the better.  In this case seeing a 1 from the leader actual, even if the PSS node has an actual count of 0, would have been enough of a hint that indeed the single row is being passed through.  As it stands the reader doesn't see any rows=1 until the final gather.

David J.