Re: Parallel Aggregate - Mailing list pgsql-hackers

From Haribabu Kommi
Subject Re: Parallel Aggregate
Date
Msg-id CAJrrPGfsF8ony1K1OFED+ZVS+_OnygrCR0z9vYBtvz_f6XMtfQ@mail.gmail.com
In response to Re: Parallel Aggregate  (Haribabu Kommi <kommi.haribabu@gmail.com>)
List pgsql-hackers
On Fri, Dec 11, 2015 at 5:42 PM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
> 3. Performance test to observe the effect of parallel aggregate.

I have attached the performance test report for parallel aggregate.
A summary of the results:
1. With low selectivity, parallel aggregate gives no improvement
over parallel scan, or adds only a very small overhead.

2. Parallel aggregate performs more than 60% better than parallel
scan, because the data transfer overhead is much lower: the hash
aggregate operation reduces the number of tuples that have to be
transferred from the workers to the backend.

The parallel aggregate plan depends on the parallel seq scan below it.
If the parallel seq scan plan is not generated because the tuple
transfer overhead cost is too high at higher selectivity, then
parallel aggregate is not possible either. But with parallel aggregate,
the number of records that have to be transferred from the workers
to the backend may be lower than with parallel seq scan, so the
overall cost of parallel aggregate may still be better.
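
To illustrate the point above, here is a toy cost model, not the planner's
actual costing. The constants, function names and formulas are all
assumptions made up for this sketch. It only shows that when few groups
come out of many qualifying tuples, aggregating in the workers avoids most
of the tuple transfer cost.

```c
/* Hypothetical cost constants, chosen only for illustration. */
#define TOY_TUPLE_TRANSFER_COST 0.1   /* ship one tuple from worker to backend */
#define TOY_CPU_TUPLE_COST      0.01  /* process one tuple */

/*
 * Toy cost of a parallel seq scan whose qualifying tuples are all
 * transferred to the backend, which then aggregates them.
 * selectivity is the fraction of input rows that pass the filter.
 */
double
toy_parallel_scan_then_agg_cost(double input_rows, double selectivity,
                                int workers)
{
    double qualifying = input_rows * selectivity;
    double scan = input_rows * TOY_CPU_TUPLE_COST / workers;
    double transfer = qualifying * TOY_TUPLE_TRANSFER_COST;
    double agg = qualifying * TOY_CPU_TUPLE_COST;

    return scan + transfer + agg;
}

/*
 * Toy cost of aggregating inside each worker first. Each worker sends
 * at most one partial result per group, so the transfer cost depends
 * on the number of groups rather than on the number of qualifying rows.
 */
double
toy_parallel_agg_cost(double input_rows, double selectivity,
                      double groups, int workers)
{
    double qualifying = input_rows * selectivity;
    double per_worker_rows = qualifying / workers;
    double per_worker_out = groups < per_worker_rows ? groups : per_worker_rows;
    double scan = input_rows * TOY_CPU_TUPLE_COST / workers;
    double partial_agg = per_worker_rows * TOY_CPU_TUPLE_COST;
    double transfer = workers * per_worker_out * TOY_TUPLE_TRANSFER_COST;
    double final_agg = workers * per_worker_out * TOY_CPU_TUPLE_COST;

    return scan + partial_agg + transfer + final_agg;
}
```

With these made-up constants, a high-selectivity input (many qualifying
rows, few groups) is dominated by the transfer term in the first function,
while a low-selectivity input gives nearly the same cost in both.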

To handle this problem, how about the following approach?

Add one more member to RelOptInfo, called cheapest_parallel_path,
to store the parallel path that is possible. Wherever a parallel plan
is possible, this member is set to that parallel plan. If a parallel
plan is not possible in the parent nodes, it is set to NULL. Otherwise,
the parallel plan is calculated again at this node, based on the
parallel plan node below it.

Once all the paths are finalized, the grouping planner prepares a
plan for both normal aggregate and parallel aggregate, compares the
two costs, and chooses the cheaper plan.
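
A minimal sketch of that comparison follows. ToyPath and ToyRelOptInfo
are simplified stand-ins invented for this example, not the real
PostgreSQL structures; only the member name cheapest_parallel_path comes
from the proposal. The aggregate step costs are passed in by the caller
as plain numbers.

```c
#include <stddef.h>

/* Toy stand-ins; NOT the real PostgreSQL Path / RelOptInfo definitions. */
typedef struct ToyPath
{
    double total_cost;
} ToyPath;

typedef struct ToyRelOptInfo
{
    ToyPath *cheapest_total_path;    /* best non-parallel path */
    ToyPath *cheapest_parallel_path; /* proposed member; NULL if no
                                      * parallel plan is possible */
} ToyRelOptInfo;

typedef enum
{
    TOY_AGG_NORMAL,
    TOY_AGG_PARALLEL
} ToyAggChoice;

/*
 * Cost both aggregate variants on top of their input paths and pick
 * the cheaper one. normal_agg_cost and parallel_agg_cost are the
 * (hypothetical) costs of the aggregate step itself.
 */
ToyAggChoice
toy_choose_aggregate(const ToyRelOptInfo *rel,
                     double normal_agg_cost,
                     double parallel_agg_cost,
                     double *chosen_total_cost)
{
    double best = rel->cheapest_total_path->total_cost + normal_agg_cost;
    ToyAggChoice choice = TOY_AGG_NORMAL;

    /* Parallel aggregate is only a candidate if a parallel path exists. */
    if (rel->cheapest_parallel_path != NULL)
    {
        double parallel_total =
            rel->cheapest_parallel_path->total_cost + parallel_agg_cost;

        if (parallel_total < best)
        {
            best = parallel_total;
            choice = TOY_AGG_PARALLEL;
        }
    }

    if (chosen_total_cost != NULL)
        *chosen_total_cost = best;
    return choice;
}
```

The comparison happens at the aggregate level rather than at the scan
level, so the parallel path stays available even when the non-parallel
path was cheaper for the scan on its own.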

I haven't yet evaluated the feasibility of the above solution. Any suggestions?

Regards,
Hari Babu
Fujitsu Australia

Attachment
