Re: Parallel Seq Scan - Mailing list pgsql-hackers

From Haribabu Kommi
Subject Re: Parallel Seq Scan
Date
Msg-id CAJrrPGfCP7SzwP3g0HCY2WaR-+POz7e2oErR8NpBLyfAWYu_VA@mail.gmail.com
In response to Re: Parallel Seq Scan  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Fri, Sep 18, 2015 at 9:45 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Fri, Sep 18, 2015 at 1:33 PM, Haribabu Kommi <kommi.haribabu@gmail.com>
> wrote:
>>
>> On Thu, Sep 3, 2015 at 8:21 PM, Amit Kapila <amit.kapila16@gmail.com>
>> wrote:
>> >
>> > Attached, find the rebased version of patch.
>>
>> Here are the performance test results:
>>
>> Query        Selectivity  HashAgg +      HashAgg + parallel seq scan (ms)
>>              (million)    seqscan (ms)   2 workers   4 workers   8 workers
>> $1 <= '001'  0.1          16717.00         7086.00     4459.00     2912.00
>> $1 <= '004'  0.4          17962.00         7410.00     4651.00     2977.00
>> $1 <= '008'  0.8          18870.00         7849.00     4868.00     3092.00
>> $1 <= '016'  1.5          21368.00         8645.00     6800.00     3486.00
>> $1 <= '030'  2.7          24622.00        14796.00    13108.00     9981.00
>> $1 <= '060'  5.4          31690.00        29839.00    26544.00    23814.00
>> $1 <= '080'  7.2          37147.00        40485.00    35763.00    32679.00
>>
>
> I think that when the selectivity is more than 5 million rows, the
> Funnel plan probably should not have been selected.  Have you by any
> chance changed cpu_tuple_comm_cost?  If not, then you can try setting
> parallel_setup_cost (maybe to 10) and see whether it still selects the
> Funnel plan.  Is it possible for you to check the cost difference between
> the sequential scan and Funnel plans?  Hopefully explain or explain
> analyze should be sufficient.

Yes, I set cpu_tuple_comm_cost to zero to observe how parallel seq scan
performs at high selectivity. I forgot to mention this in the earlier mail.
Overall, the parallel seq scan performance is good.
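
To make the scaling easier to read, the speedup over the plain seq scan can be
worked out from the timings in the table quoted above. This is only a rough
sketch: the numbers are copied from that table, and the names (timings_ms,
speedups, WORKERS) are made up for illustration.

```python
# Timings in ms, copied from the table quoted earlier in this mail.
# Key: selectivity in millions of rows.
# Value: (HashAgg + seqscan, then HashAgg + parallel seq scan with 2, 4, 8 workers).
timings_ms = {
    0.1: (16717.0, 7086.0, 4459.0, 2912.0),
    0.4: (17962.0, 7410.0, 4651.0, 2977.0),
    0.8: (18870.0, 7849.0, 4868.0, 3092.0),
    1.5: (21368.0, 8645.0, 6800.0, 3486.0),
    2.7: (24622.0, 14796.0, 13108.0, 9981.0),
    5.4: (31690.0, 29839.0, 26544.0, 23814.0),
    7.2: (37147.0, 40485.0, 35763.0, 32679.0),
}
WORKERS = (2, 4, 8)


def speedups(row):
    """Return serial_time / parallel_time for each worker count in the row."""
    serial, *parallel = row
    return [serial / t for t in parallel]


if __name__ == "__main__":
    for selectivity, row in timings_ms.items():
        cells = "  ".join(
            f"{w} workers: {ratio:.2f}x"
            for w, ratio in zip(WORKERS, speedups(row))
        )
        print(f"{selectivity:>4} million rows  {cells}")
```

A ratio below 1 means the parallel plan was slower than the plain seq scan,
which is the case for 2 workers at 7.2 million rows.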


>> And also attached perf results for selectivity of 0.1 million and 5.4
>> million cases for analysis.
>>
>
> I have checked the perf reports, and it seems that when selectivity is
> higher, time is being spent in some kernel calls, which could be due to
> the communication of tuples.

Yes. Also, at low selectivity, the time spent in the tas and s_lock
functions increases as the number of workers increases. This may be
another reason for the scaling problem.

Regards,
Hari Babu
Fujitsu Australia
