On Fri, Aug 6, 2021 at 4:31 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> Results: (query EXPLAIN ANALYZE SELECT * FROM t;)
> 1) Non-parallel (default)
> Execution Time: 31627.492 ms
>
> 2) Parallel with 4 workers (force by setting parallel_tuple_cost to 0)
> Execution Time: 37498.672 ms
>
> 3) Same as above (2) but with the patch.
> Execution Time: 23649.287 ms
This strikes me as an amazingly good result. I guess before seeing
these results, I would have said that you can't reasonably expect
parallel query to win on a query like this because there isn't enough
for the workers to do. It's not like they are spending time evaluating
filter conditions or anything like that - they're just fetching tuples
off of disk pages and sticking them into a queue. And it's unclear to
me why it should be better to have a bunch of processes doing that
instead of just one. I would have thought, looking at just (1) and
(2), that parallelism gained nothing and communication overhead lost 6
seconds.
But what this suggests is that parallelism gained at least 8 seconds,
and communication overhead lost at least 14 seconds. In fact...
> - If I apply both Experiment#1 and Experiment#2 patches together then,
> we can further reduce the execution time to 20963.539 ms (with 4
> workers and 4MB tuple queue size)
...this suggests that parallelism actually gained at least 10-11
seconds, and the communication overhead lost at least 15-16 seconds.
If that's accurate, it's pretty crazy. We might need to drastically
reduce the value of parallel_tuple_cost if these results hold up and
this patch gets committed.
--
Robert Haas
EDB: http://www.enterprisedb.com