Hi,

I would like to propose a small patch to the JIT machinery that makes
IR code generation lazy. The reason for postponing IR generation is
that with partitioning the number of generated JIT functions explodes:
many child tables are involved, each with its own JITted functions,
especially when e.g. partitionwise joins/aggregates are enabled.
However, only a fraction of those functions is actually executed,
because the Parallel Append node distributes the workers among its
child nodes. With the attached patch the IR is generated lazily, so
this is no longer a problem.
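
To illustrate the general idea, here is a minimal, self-contained C
sketch. It is not the code from the attached patch and does not use any
PostgreSQL or LLVM API; all names (LazyExpr, lazy_eval,
expensive_compile, simulate_partitions) are hypothetical. It assumes
the expensive step can be hidden behind a function pointer that
initially points at a trampoline, so the cost is only paid for
expressions that are actually evaluated.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PARTS 64

/* Hypothetical stand-in for an expression that may be JIT-compiled. */
typedef struct LazyExpr LazyExpr;
typedef int (*EvalFunc)(LazyExpr *expr, int input);

struct LazyExpr {
    EvalFunc evalfunc;      /* entry point that callers always go through */
    int      addend;        /* toy "expression": input + addend */
    int      compile_count; /* how often the expensive step ran for this expr */
};

/* Counts the expensive step across all expressions. */
int total_compilations = 0;

/* Stand-in for the generated, compiled function. */
int compiled_eval(LazyExpr *expr, int input)
{
    return input + expr->addend;
}

/* Stand-in for the expensive IR generation and compilation step. */
void expensive_compile(LazyExpr *expr)
{
    expr->compile_count++;
    total_compilations++;
    expr->evalfunc = compiled_eval;
}

/* Trampoline: does the expensive work on first call, then forwards. */
int lazy_eval(LazyExpr *expr, int input)
{
    expensive_compile(expr);
    assert(expr->evalfunc != lazy_eval);
    return expr->evalfunc(expr, input);
}

/* Initialization is cheap: it only installs the trampoline. */
void lazy_expr_init(LazyExpr *expr, int addend)
{
    expr->evalfunc = lazy_eval;
    expr->addend = addend;
    expr->compile_count = 0;
}

/*
 * Set up one expression per partition, but evaluate only the first
 * n_executed of them (twice each). Returns how many times the
 * expensive step ran.
 */
int simulate_partitions(int n_partitions, int n_executed)
{
    LazyExpr exprs[MAX_PARTS];
    int before = total_compilations;
    int i;

    assert(n_partitions <= MAX_PARTS && n_executed <= n_partitions);

    for (i = 0; i < n_partitions; i++)
        lazy_expr_init(&exprs[i], i);

    for (i = 0; i < n_executed; i++) {
        (void) exprs[i].evalfunc(&exprs[i], 1); /* triggers compilation */
        (void) exprs[i].evalfunc(&exprs[i], 2); /* reuses compiled code */
    }

    return total_compilations - before;
}
```

In this toy model, setting up 50 expressions and evaluating only 7 of
them runs the expensive step 7 times instead of 50, which is the effect
the lazy generation is after.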

As for benchmarks: with TPC-H- and TPC-DS-like queries on
hash-partitioned tables I have seen query runtimes increase by 20+
seconds, even on the simpler queries. I also created a small benchmark
to reproduce the case easily (see attached sql file):

without patch, using 7 launched workers:
- without jit: ~220ms
- with jit: ~1880ms

without patch, using 50 launched workers:
- without jit: ~280ms
- with jit: ~3400ms

with patch, using 7 launched workers:
- without jit: ~220ms
- with jit: ~590ms

with patch, using 50 launched workers:
- without jit: ~280ms
- with jit: ~530ms

Thoughts?

With regards,
Luc Vlaming
Swarm64