On Mon, Jul 6, 2015 at 8:49 PM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
> I ran some performance tests on a 16 core machine with large shared
> buffers, so there is no IO involved.
> With the default value of cpu_tuple_comm_cost, a parallel plan is not
> generated even when selecting 100K records out of 40
> million. So I changed the value to '0' and collected the
> performance readings.
>
> Here are the performance numbers:
>
> selectivity (millions)   Seq scan (ms)   Parallel scan (ms)
>                                          2 workers   4 workers   8 workers
> 0.1                           11498.93     4821.40     3305.84     3291.90
> 0.4                           10942.98     4967.46     3338.58     3374.00
> 0.8                           11619.44     5189.61     3543.86     3534.40
> 1.5                           12585.51     5718.07     4162.71     2994.90
> 2.7                           14725.66     8346.96    10429.05     8049.11
> 5.4                           18719.00    20212.33    21815.19    19026.99
> 7.2                           21955.79    28570.74    28217.60    27042.27
>
> The average table row size is around 500 bytes, and the query's
> selection column width is around 36 bytes.
> When the query selects more than 10% of the table's records,
> parallel scan performance drops off.

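For reference, the quoted timings translate into speedups like this. A quick throwaway sketch; the dict just transcribes the table above:

```python
# Speedup of the parallel scan over the plain seq scan at each
# selectivity, from the timings quoted above.  Values under 1.0 mean
# the parallel plan is actually slower than the serial one.
timings = {
    # selectivity (millions): (seq_ms, 2-worker_ms, 4-worker_ms, 8-worker_ms)
    0.1: (11498.93, 4821.40, 3305.84, 3291.90),
    0.4: (10942.98, 4967.46, 3338.58, 3374.00),
    0.8: (11619.44, 5189.61, 3543.86, 3534.40),
    1.5: (12585.51, 5718.07, 4162.71, 2994.90),
    2.7: (14725.66, 8346.96, 10429.05, 8049.11),
    5.4: (18719.00, 20212.33, 21815.19, 19026.99),
    7.2: (21955.79, 28570.74, 28217.60, 27042.27),
}

for sel, (seq, par2, par4, par8) in timings.items():
    print(f"{sel:4} M rows: "
          f"2w {seq / par2:.2f}x, 4w {seq / par4:.2f}x, 8w {seq / par8:.2f}x")
```

This makes the crossover visible: up through 2.7M selected rows every worker count beats the seq scan, while at 5.4M and 7.2M all the parallel configurations come out slower than serial.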
Thanks for doing this testing. I think that is quite valuable. I am
not too concerned about the fact that queries where more than 10% of
records are selected do not speed up. Obviously, it would be nice to
improve that, but I think that can be left as an area for future
improvement.

One thing I noticed that is a bit dismaying is that we don't get a lot
of benefit from having more workers. Look at the 0.1 data. At 2
workers, if we scaled perfectly, we would be 3x faster (since the
master can do work too), but we are actually 2.4x faster. Each
process is, on average, 80% efficient. That's respectable. At 4
workers, we would be 5x faster with perfect scaling; here we are 3.5x
faster. So the third and fourth worker were about 50% efficient.
Hmm, not as good. But then going up to 8 workers bought us basically
nothing.
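The arithmetic above can be worked out mechanically from the 0.1M-row timings in the quoted table. A small sketch (overall efficiency is observed speedup divided by the number of working processes, i.e. N workers plus the master):

```python
# Scaling efficiency at 0.1M selectivity, from the quoted timings.
seq = 11498.93                              # seq scan, ms
par = {2: 4821.40, 4: 3305.84, 8: 3291.90}  # workers -> parallel scan, ms

for workers, ms in par.items():
    speedup = seq / ms          # observed speedup over the seq scan
    ideal = workers + 1         # perfect scaling, counting the master
    print(f"{workers} workers: {speedup:.2f}x of ideal {ideal}x "
          f"-> {speedup / ideal:.0%} efficient overall")

# Marginal efficiency of workers 3 and 4: the extra speedup those two
# processes added, divided by 2.
marginal = (seq / par[4] - seq / par[2]) / 2
print(f"workers 3 and 4: {marginal:.0%} efficient at the margin")
```

The same marginal calculation for workers 5 through 8 gives essentially zero, which is the "bought us basically nothing" point.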
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company