On Thu, 22 Jun 2023 at 20:59, Yuya Watari <watari.yuya@gmail.com> wrote:
> Table 1: Planning time and its speedup of Join Order Benchmark
> (n: the number of partitions of each table)
> (Speedup: higher is better)
> 64 | 115.7%
> 128 | 142.9%
> 256 | 187.7%
Thanks for benchmarking. It certainly looks like a win for larger
sets. Would you be able to profile the 256 partition case to see
where exactly master is so slow? (I'm surprised this patch improves
performance that much.)
I think it's also important to check we don't slow anything down for
more normal-sized sets. The vast majority of sets will contain just a
single word, so we should probably focus on making sure we're not
slowing anything down for those.
To get the ball rolling on that I used the attached plan_times.patch
so that the planner logs the number of nanoseconds elapsed in each
call to standard_planner(). Patching with this and then running make
installcheck kicks out about 35k log lines with times in them.
I ran this on a Linux AMD 3990x machine and also an Apple M2 Pro
machine. Summing the nanoseconds and converting to seconds, I see:
AMD 3990x:
master: 1.384267931 seconds
patched: 1.339178764 seconds (3.37% faster)
M2 Pro:
master: 0.58293 seconds
patched: 0.581483 seconds (0.25% faster)
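For anyone wanting to reproduce the totals, here is a minimal sketch of the summing step. The log line format is an assumption (the message written by plan_times.patch is not shown in this mail), so PLAN_TIME_RE is hypothetical and would need adjusting. The percentages quoted above are consistent with computing master / patched - 1.

```python
import re

# Hypothetical log format: the real message emitted by plan_times.patch
# is not shown in this mail, so adjust this pattern to match it.
PLAN_TIME_RE = re.compile(r"planning time: (\d+) ns")


def total_seconds(lines):
    """Sum the nanosecond counts found in the log lines, in seconds."""
    total_ns = 0
    for line in lines:
        m = PLAN_TIME_RE.search(line)
        if m:
            total_ns += int(m.group(1))
    return total_ns / 1e9


def percent_faster(master_s, patched_s):
    """How much faster patched is than master, as a percentage."""
    return (master_s / patched_s - 1.0) * 100.0


if __name__ == "__main__":
    # Totals reported in this mail.
    print(f"AMD 3990x: {percent_faster(1.384267931, 1.339178764):.2f}% faster")
    print(f"M2 Pro:    {percent_faster(0.58293, 0.581483):.2f}% faster")
```

In practice total_seconds() would be fed the server log from each installcheck run, once for master and once for the patched build.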
So it certainly does not look any slower. Perhaps a little faster on
the Zen 2 machine.
(The M2 only seems to have microsecond resolution in the timer code
whereas the Zen 2 has nanosecond. I don't think this matters much, as
the planner takes enough microseconds to plan even simple queries.)
I've also attached the v4 patch again. I'm going to add this patch to
the commitfest, and if I don't re-attach it the CFbot will pick up
Ranier's patch instead of mine.
David