Re: Understanding EXPLAIN ANALYZE estimates when loops != 1 - Mailing list pgsql-general

From Philip Semanchuk
Subject Re: Understanding EXPLAIN ANALYZE estimates when loops != 1
Date
Msg-id 88D7CF3A-B5D2-4585-9CFE-F62097F1EBCA@americanefficient.com
Whole thread Raw
In response to Re: Understanding EXPLAIN ANALYZE estimates when loops != 1  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Understanding EXPLAIN ANALYZE estimates when loops != 1  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-general

> On Aug 19, 2020, at 6:24 PM, David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Thu, 20 Aug 2020 at 09:55, Philip Semanchuk
> <philip@americanefficient.com> wrote:
>> I could use some help interpreting EXPLAIN ANALYZE output.
>>
>> ->  Index Scan using ix_foo on t1 (cost=0.69..68.57 rows=3283 width=105) (actual time=0.006..0.918 rows=3760
loops=94)
>>
>> The actual rows returned by this plan node ~= 3760 * 94 = 353,440.
>
> Yes.  It's total rows / loops rounded to the nearest integer number.
>
>> Did postgres expect (estimate) 3283 rows from this join, or 3283 * 94 = 308,602?
>
> Yes, that's the case at least when the node is not a Parallel node.
> If this index scan was part of a parameterized nested loop, then
> you'll see the estimate of the number of expected loops from the outer
> side of the join.

Thanks, I was wondering where the 94 came from.


> Same question for this node.
>>
>> ->  Parallel Index Scan using pk_xyz on xyz  (cost=0.29..2354.67 rows=54285 width=25) (actual time=0.049..6.326
rows=14864loops=5) 
>>
>> Actual rows ~= 14864 * 5 = 74,320, estimate = 54285 or 54285 * 5?
>
> So parallel plans are a bit more complex.   The row estimates are the
> total estimated rows  / the amount of workers we expect to do useful
> work.  You might expect the divisor there to be an integer number
> since you can't really have 0.5 workers.  However, it's more complex
> than that since the leader has other tasks to take care of such as
> pulling tuples from workers, it's not dedicated to helping out.

Sounds like it help to set max_parallel_workers = 1 before running EXPLAIN ANALYZE in order to simplify the numbers,
yes?Or is there a possibility that doing so would send the planner down an entirely different path? 

>
> If you're into reading C code, then there's more information in
> https://github.com/postgres/postgres/blob/master/src/backend/optimizer/path/costsize.c#L5699
> , if you hunt around for usages of that function then you'll see the
> estimated row counts are divided by the return value of that function.

Yes, I’ve spent some time reading that file and its relatives. It’s been helpful.

Much appreciated
Philip




pgsql-general by date:

Previous
From: "Ko, Christina"
Date:
Subject: RE: EXTERNAL: Re: Loading Oracle Spatial Data to Postgresql
Next
From: Ron
Date:
Subject: Re: EXTERNAL: Re: Loading Oracle Spatial Data to Postgresql