On Fri, 2 Jul 2021 at 00:28, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>
> On Thu, 1 Jul 2021 at 06:43, David Rowley <dgrowleyml@gmail.com> wrote:
> >
> > Master @ 3788c6678
> >
> > Execution Time: 8306.319 ms
> > Execution Time: 8407.785 ms
> > Execution Time: 8491.056 ms
> >
> > Master + numeric-agg-sumX2-overflow-v2.patch
> > Execution Time: 6633.278 ms
> > Execution Time: 6657.350 ms
> > Execution Time: 6568.184 ms
> >
>
> Hmm, I'm a bit surprised by those numbers. I wouldn't have expected it
> to be spending enough time in the serialization/deserialization code
> for it to make such a difference. I was only able to measure a 2-3%
> performance improvement with the same test, and that was barely above
> the noise.

I ran this again with a few different worker counts after tuning a few
memory settings so that nothing spilled to disk and everything stayed
in RAM, mostly so I could get consistent results.

Here are the results, averaged over 3 runs for each worker count:

Workers   Master (ms)   Patched (ms)   Percent
      8       11094.1        11084.9   100.08%
     16        8711.4         8562.6   101.74%
     32        6961.4         6726.3   103.50%
     64        6137.4         5854.8   104.83%
    128        6090.3         5747.4   105.96%
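
For reference, here is a minimal sketch of how the Percent column can
be derived, assuming it is the Master time divided by the Patched time.
The function name is only for illustration, and recomputing from the
rounded averages above can differ from the posted figure in the last
digit.

```python
# Averages from the table above: (workers, master_ms, patched_ms).
ROWS = [
    (8, 11094.1, 11084.9),
    (16, 8711.4, 8562.6),
    (32, 6961.4, 6726.3),
    (64, 6137.4, 5854.8),
    (128, 6090.3, 5747.4),
]


def percent(master_ms: float, patched_ms: float) -> float:
    """Master runtime as a percentage of the patched runtime.

    Assumption: this is how the Percent column was computed;
    values above 100 mean the patched build was faster.
    """
    return master_ms / patched_ms * 100.0


for workers, master_ms, patched_ms in ROWS:
    print(f"{workers:>7} {percent(master_ms, patched_ms):7.2f}%")
```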

So the gains are much smaller at lower worker counts. I think this is
because most of the gains are in the serial part of the plan, and with
higher worker counts that part makes up a relatively much bigger share
of the total runtime. So performance likely isn't too critical here,
but it is something to keep in mind.
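
To illustrate that reasoning, here is a rough sketch of a simple
Amdahl-style model, assuming the runtime splits into a serial part plus
a parallel part that divides evenly across workers. The timings below
are made-up numbers chosen only to show the trend, not measurements,
and the function names are hypothetical.

```python
def runtime(serial_ms: float, parallel_ms: float, workers: int) -> float:
    """Modelled runtime: serial work plus parallel work split over workers."""
    return serial_ms + parallel_ms / workers


def gain_percent(serial_ms: float, serial_saved_ms: float,
                 parallel_ms: float, workers: int) -> float:
    """Master/patched ratio (as a percentage) when the patch only
    shortens the serial part by serial_saved_ms."""
    master = runtime(serial_ms, parallel_ms, workers)
    patched = runtime(serial_ms - serial_saved_ms, parallel_ms, workers)
    return master / patched * 100.0


# Hypothetical inputs (NOT measured): 5000 ms of serial work, 48000 ms of
# parallelisable work, and a patch that saves 300 ms of the serial work.
SERIAL_MS, SAVED_MS, PARALLEL_MS = 5000.0, 300.0, 48000.0

for workers in (8, 16, 32, 64, 128):
    gain = gain_percent(SERIAL_MS, SAVED_MS, PARALLEL_MS, workers)
    print(f"{workers:>7} {gain:7.2f}%")
```

In this model the absolute saving is the same at every worker count,
but the total runtime shrinks as workers are added, so the relative
gain grows.
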
David