I was profiling a case involving UPDATEs into a table with too many
indexes (brought to mind by MySQL's sql-bench, about which more later)
and got this rather surprising result for routines costing more than
1% of the total runtime:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  s/call   s/call  name
 64.03     86.20    86.20   133608    0.00     0.00  XLogInsert
  3.50     90.91     4.71  2484787    0.00     0.00  _bt_compare
  2.92     94.84     3.93   839893    0.00     0.00  hash_search
  2.77     98.57     3.73  1875815    0.00     0.00  LWLockAcquire
  1.89    101.12     2.55  1887972    0.00     0.00  LWLockRelease
  1.27    102.83     1.71   125234    0.00     0.00  _bt_getroot
  1.01    104.19     1.36   403342    0.00     0.00  PinBuffer
  1.00    105.54     1.35   840002    0.00     0.00  hash_any

I suppose that the bulk of the CPU cycles being attributed to XLogInsert
are going into the inlined CRC calculations. Maybe we need to think
twice about the cost/benefit ratio of using 64-bit CRCs to protect xlog
records that are often only a few dozen bytes.
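
To make the cost question concrete, here is a minimal sketch in C of a
bit-at-a-time CRC-32 and CRC-64 over a small buffer. It is not the xlog
code: the function names sketch_crc32 and sketch_crc64 are hypothetical,
and the polynomials and init/final-XOR values are assumptions (the
common IEEE 802.3 CRC-32 and the CRC-64/XZ parameter set), chosen only
because their check values are well known. Real implementations are
normally table-driven rather than bitwise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise reflected CRC-32, IEEE 802.3 polynomial (assumed variant). */
uint32_t
sketch_crc32(const unsigned char *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;          /* initial value: all ones */

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
    {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ ((crc & 1u) ? 0xEDB88320u : 0u);
    }
    return crc ^ 0xFFFFFFFFu;            /* final XOR: all ones */
}

/* Bitwise reflected CRC-64, CRC-64/XZ parameters (assumed variant). */
uint64_t
sketch_crc64(const unsigned char *buf, size_t len)
{
    uint64_t crc = UINT64_C(0xFFFFFFFFFFFFFFFF);

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
    {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^
                ((crc & 1u) ? UINT64_C(0xC96C5795D7870F42) : 0u);
    }
    return crc ^ UINT64_C(0xFFFFFFFFFFFFFFFF);
}
```

Both functions do the same number of steps per input byte; the
difference is that every shift and XOR in the 64-bit version works on a
64-bit value, which takes more than one instruction on a machine without
native 64-bit registers. Timing the two in a loop over buffers of a few
dozen bytes would be one way to estimate how much of the XLogInsert
figure could plausibly be CRC work.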
regards, tom lane