Hi,
Attached is an updated version of the patchset and, more importantly,
some benchmark numbers.
I ran TPC-H at scale 5 on my laptop, with:
-c shared_buffers=20GB # all data fits into memory
-c max_parallel_workers_per_gather=0 # reduce variability
-c work_mem=4GB # other suggestions welcome
-c huge_pages=on # reduces variability
My benchmarking script first prewarms the whole database, then runs the
TPC-H queries in sequence, repeated three times, and compares the shortest
execution time of each query (times in ms, diff in percent):
master q01 min: 16176.843 dev-llvm-off min: 13153.179 [diff -22.99]
master q02 min: 1280.691 dev-llvm-off min: 1263.032 [diff -1.40]
master q03 min: 7330.737 dev-llvm-off min: 7144.982 [diff -2.60]
master q04 min: 1014.347 dev-llvm-off min: 991.008 [diff -2.36]
master q05 min: 5490.103 dev-llvm-off min: 5439.878 [diff -0.92]
master q06 min: 1916.45 dev-llvm-off min: 1818.839 [diff -5.37]
master q07 min: 5282.129 dev-llvm-off min: 5222.879 [diff -1.13]
master q08 min: 1655.824 dev-llvm-off min: 1532.1 [diff -8.08]
master q09 min: 7009.372 dev-llvm-off min: 6724.515 [diff -4.24]
master q10 min: 6017.01 dev-llvm-off min: 5848.337 [diff -2.88]
master q11 min: 316.724 dev-llvm-off min: 292.51 [diff -8.28]
master q12 min: 4829.502 dev-llvm-off min: 4698.14 [diff -2.80]
master q13 min: 8679.991 dev-llvm-off min: 8427.614 [diff -2.99]
master q14 min: 814.109 dev-llvm-off min: 774.805 [diff -5.07]
master q15 min: 1957.248 dev-llvm-off min: 1841.377 [diff -6.29]
master q16 min: 1976.544 dev-llvm-off min: 1936.932 [diff -2.05]
master q17 min: 559.664 dev-llvm-off min: 446.199 [diff -25.43]
master q18 min: 16565.722 dev-llvm-off min: 15475.372 [diff -7.05]
master q19 min: 385.222 dev-llvm-off min: 310.398 [diff -24.11]
master q20 min: 1717.015 dev-llvm-off min: 1389.064 [diff -23.61]
master q22 min: 753.017 dev-llvm-off min: 637.152 [diff -18.18]
(Note that there's a fair amount of per-run variance; I've seen one or
two of the small differences go the other way in previous runs.)
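For anyone wanting to reproduce the diff column: judging from the numbers
above, it appears to be the difference relative to the dev-llvm-off time, in
percent, so negative means the patched build is faster. A minimal sketch of
that arithmetic follows; the function names are made up for illustration and
this is not the actual benchmarking script.

```python
def best_of(run_times_ms):
    """Return the shortest execution time among repeated runs of one query."""
    return min(run_times_ms)


def relative_diff(master_ms, dev_ms):
    """Percentage difference relative to the dev time.

    Assumption: this formula is inferred from the reported table,
    not taken from the benchmarking script itself.
    """
    return (dev_ms - master_ms) / dev_ms * 100.0


# Minima reported above for q01 and q17.
print(f"q01 [diff {relative_diff(16176.843, 13153.179):.2f}]")
print(f"q17 [diff {relative_diff(559.664, 446.199):.2f}]")
```

Because the diff is relative to the faster (dev) time, the percentages are
slightly larger in magnitude than they would be relative to master.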
It's clearly visible that the performance gain heavily depends on the
type of query, which makes sense: expression evaluation isn't a
bottleneck everywhere. The actual gains in expression evaluation are
larger than the maximum here, but bottlenecks shift.
Besides cleanups, the important changes in this version of the patches
are:
- I've added a fastpath for the interpretation of very simple
expressions (non-sysvar, single column Vars and Consts). Before that,
performance regressed slightly for cases that evaluated a *lot* of
Vars/Consts. The only case where I could actually reproduce that is in
large hash-joins where the to-be-hashed value is extracted on its own.
- I moved the invocation of expression evaluation back to a
callback. That's better for predicting branches both with JITing and
when using fastpath functions.
- removed unused EXEC_EVALDEBUG code
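
To illustrate the first two points for readers who haven't looked at the
patches: the idea is that the evaluation function is chosen once, when the
expression is prepared, and is then invoked through a callback, so trivial
expressions never enter the general interpreter. The toy model below is only
a sketch of that idea in Python; all names in it (Var, Const, Add, prepare,
interpret) are hypothetical and none of it is code from the patchset.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Union


@dataclass(frozen=True)
class Var:
    attno: int  # 0-based column index in this toy model


@dataclass(frozen=True)
class Const:
    value: object


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Var, Const, Add]


def interpret(expr: Expr, row: Sequence[object]) -> object:
    """General (slow) path: walk the expression tree for every row."""
    if isinstance(expr, Var):
        return row[expr.attno]
    if isinstance(expr, Const):
        return expr.value
    if isinstance(expr, Add):
        return interpret(expr.left, row) + interpret(expr.right, row)
    raise TypeError(f"unsupported expression node: {expr!r}")


def prepare(expr: Expr) -> Callable[[Sequence[object]], object]:
    """Pick the evaluation callback once, at preparation time."""
    if isinstance(expr, Var):
        attno = expr.attno
        return lambda row: row[attno]       # fastpath: single column
    if isinstance(expr, Const):
        value = expr.value
        return lambda row: value            # fastpath: constant
    return lambda row: interpret(expr, row)  # everything else


# A hash-join key that is just one column takes the fastpath.
hash_key = prepare(Var(1))
print(hash_key(("a", "b", "c")))
```

The caller always goes through the same callback, regardless of whether a
fastpath or the general interpreter ended up behind it.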
- Andres