On 11 October 2015 at 13:20, Peter Geoghegan <pg@heroku.com> wrote:
> It's worth considering that for some (possibly legitimate) reason, the built-in function call is ignored by your compiler, since GCC has license to do that. You might try this on both master and patched builds:
You're right, gcc did not include the prefetch instructions.
I've tested again on the same machine, this time with clang 3.7 instead of gcc 4.8.3.
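For anyone who wants to check whether their own compiler keeps the prefetch, here is a minimal sketch. It is not the patch itself, and it is not the command from the quoted message. The names sketch_prefetch and sum_with_prefetch are made up for illustration; only __builtin_prefetch is the real GCC/clang builtin.

```c
#include <stddef.h>

/*
 * GCC and clang both define __GNUC__ and provide __builtin_prefetch.
 * For any other compiler this hypothetical macro expands to a no-op.
 */
#if defined(__GNUC__)
#define sketch_prefetch(addr) __builtin_prefetch((addr), 0, 3)
#else
#define sketch_prefetch(addr) ((void) 0)
#endif

/*
 * Sum the values reached through an array of pointers, hinting the
 * next pointed-to value before reading the current one.  The prefetch
 * is only a hint, so the result is the same with or without it.
 */
long
sum_with_prefetch(long *const *ptrs, size_t n)
{
    long    total = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (i + 1 < n)
            sketch_prefetch(ptrs[i + 1]);
        total += *ptrs[i];
    }
    return total;
}
```

Compiling a file like this to assembly (for example with -O2 -S) and searching the output for a prefetch mnemonic should show whether the hint survived; on x86-64 I would expect something like prefetcht0, but that depends on the target and flags.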
I've conducted the same tests again. All times are in milliseconds. Results are the average and median over 10 runs.
set work_mem ='1GB';
create table t1 as select md5(random()::text) from generate_series(1,10000000);
vacuum freeze t1;
Running one query at a time, the results are as follows:
Test1: select count(distinct md5) from t1;
          Master       Patched      Gain
Average   10853.679    10132.544    107.12%
Median    10754.193    10005.001    107.49%
Test2: select sum(rn) from (select row_number() over (order by md5) rn from t1) a;
          Master       Patched      Gain
Average   11495.8703   11475.0081   100.18%
Median    11495.6015   11455.944    100.35%
Test3: create index t1_md5_idx on t1(md5);
          Master       Patched      Gain
Average   36464.4632   37830.3879   96.39%
Median    35946.608    36765.0055   97.77%
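The Gain column is not defined above, but the figures are consistent with master time divided by patched time, expressed as a percentage, so values over 100% mean the patched build was faster. A small sketch under that assumption (gain_percent is a hypothetical helper name, not something from the patch or the spreadsheet):

```c
/*
 * Gain as it appears to be computed in the tables: master time over
 * patched time, as a percentage.  This formula is inferred from the
 * reported numbers, not stated in the original message.
 */
double
gain_percent(double master_ms, double patched_ms)
{
    return master_ms / patched_ms * 100.0;
}
```

For example, gain_percent(10853.679, 10132.544) comes out at roughly 107.12, matching the Test1 average, and gain_percent(36464.4632, 37830.3879) at roughly 96.39, matching the Test3 average.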
I also decided to run multiple queries at once, to see if there were any cache pollution problems with the prefetching.
Test 1: pgbench -T 600 -c 16 -j 16 -f test1.sql -n
Test 2: pgbench -T 600 -c 16 -j 16 -f test2.sql -n
(tps output from pgbench was converted to milliseconds with 1/TPS*1000)
          Master       Patched      Gain
Test 1    1375.413     1358.494     101.25%
Test 2    1594.753     1588.340     100.40%
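The conversion from pgbench's tps figure to milliseconds mentioned above is just 1/TPS*1000. As a sketch (tps_to_ms is a hypothetical helper name):

```c
/*
 * Convert a transactions-per-second figure to milliseconds per
 * transaction, using the 1/TPS*1000 formula quoted above.
 * The caller must pass a tps value greater than zero.
 */
double
tps_to_ms(double tps)
{
    return 1.0 / tps * 1000.0;
}
```

Since tps is reported across all 16 clients, the result is the average time between completed transactions overall, not the latency an individual client sees, so these numbers are only comparable with each other and not with the single-query timings.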
CPU: 1 x Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
I've attached a spreadsheet with all of the results.