On Mon, Jul 6, 2015 at 8:49 PM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
> I ran some performance tests on a 16 core machine with large shared
> buffers, so there is no IO involved.
> With the default value of cpu_tuple_comm_cost, a parallel plan is not
> generated even when selecting 100K records out of 40
> million. So I changed the value to '0' and collected the
> performance readings.
>
> Here are the performance numbers:
>
> selectivity (millions)   Seq scan (ms)   Parallel scan (ms)
>                                          2 workers   4 workers   8 workers
> 0.1                           11498.93     4821.40     3305.84     3291.90
> 0.4                           10942.98     4967.46     3338.58     3374.00
> 0.8                           11619.44     5189.61     3543.86     3534.40
> 1.5                           12585.51     5718.07     4162.71     2994.90
> 2.7                           14725.66     8346.96    10429.05     8049.11
> 5.4                           18719.00    20212.33    21815.19    19026.99
> 7.2                           21955.79    28570.74    28217.60    27042.27
>
> The average table row size is around 500 bytes, and the query's
> selection column width is around 36 bytes.
> When the query selects more than 10% of the table's records,
> parallel scan performance drops off.

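For reference, the quoted timings translate into speedups like this. A quick throwaway sketch; the dict just transcribes the table above:

```python
# Speedup of the parallel scan over the plain seq scan at each
# selectivity, from the timings quoted above.  Values under 1.0 mean
# the parallel plan is actually slower than the serial one.
timings = {
    # selectivity (millions): (seq_ms, 2-worker_ms, 4-worker_ms, 8-worker_ms)
    0.1: (11498.93, 4821.40, 3305.84, 3291.90),
    0.4: (10942.98, 4967.46, 3338.58, 3374.00),
    0.8: (11619.44, 5189.61, 3543.86, 3534.40),
    1.5: (12585.51, 5718.07, 4162.71, 2994.90),
    2.7: (14725.66, 8346.96, 10429.05, 8049.11),
    5.4: (18719.00, 20212.33, 21815.19, 19026.99),
    7.2: (21955.79, 28570.74, 28217.60, 27042.27),
}

for sel, (seq, par2, par4, par8) in timings.items():
    print(f"{sel:4} M rows: "
          f"2w {seq / par2:.2f}x, 4w {seq / par4:.2f}x, 8w {seq / par8:.2f}x")
```

This makes the crossover visible: up through 2.7M selected rows every worker count beats the seq scan, while at 5.4M and 7.2M all the parallel configurations come out slower than serial.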
Thanks for doing this testing. I think that is quite valuable. I am
not too concerned about the fact that queries where more than 10% of
records are selected do not speed up. Obviously, it would be nice to
improve that, but I think that can be left as an area for future
improvement.

One thing I noticed that is a bit dismaying is that we don't get a lot
of benefit from having more workers. Look at the 0.1 data. At 2
workers, if we scaled perfectly, we would be 3x faster (since the
master can do work too), but we are actually 2.4x faster. Each
process is, on average, 80% efficient. That's respectable. At 4
workers, we would be 5x faster with perfect scaling; here we are 3.5x
faster. So the third and fourth worker were about 50% efficient.
Hmm, not as good. But then going up to 8 workers bought us basically
nothing.
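The arithmetic above can be worked out mechanically from the 0.1M-row timings in the quoted table. A small sketch (overall efficiency is observed speedup divided by the number of working processes, i.e. N workers plus the master):

```python
# Scaling efficiency at 0.1M selectivity, from the quoted timings.
seq = 11498.93                              # seq scan, ms
par = {2: 4821.40, 4: 3305.84, 8: 3291.90}  # workers -> parallel scan, ms

for workers, ms in par.items():
    speedup = seq / ms          # observed speedup over the seq scan
    ideal = workers + 1         # perfect scaling, counting the master
    print(f"{workers} workers: {speedup:.2f}x of ideal {ideal}x "
          f"-> {speedup / ideal:.0%} efficient overall")

# Marginal efficiency of workers 3 and 4: the extra speedup those two
# processes added, divided by 2.
marginal = (seq / par[4] - seq / par[2]) / 2
print(f"workers 3 and 4: {marginal:.0%} efficient at the margin")
```

The same marginal calculation for workers 5 through 8 gives essentially zero, which is the "bought us basically nothing" point.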
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company