On Tue, 2005-05-10 at 18:22 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > The cause of the performance problem has been attributed to it being a
> > 64-bit rather than 32-bit calculation. That is certainly part of it, but
> > I have seen evidence that there is an Intel processor stall associated
> > with the use of a single byte constant somewhere in the algorithm.
>
> That's awfully vague --- can't you give any more detail?
>
> I have seen XLogInsert eating significant amounts of time (up to 10% of
> total CPU time) on non-Intel architectures, so I think that dropping
> down to 32 bits is warranted in any case. But if you are correct then
> that might not fix the problem on Intel machines. We need more info.

I have seen an Intel VTune report that shows a memory stall causing high
latency associated with a single assembly instruction in the compiled
code of the CRC calculation. The instruction was manipulating a single
byte only. I couldn't tell exactly which line of PostgreSQL code
produced the assembler. This could be either a partial register stall or
a memory order buffer stall (or something else?).

Here's a discussion of these kinds of stalls:
http://www.gamasutra.com/features/19991221/barad_pfv.htm
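
To illustrate the kind of single-byte operation I mean, here is a minimal
sketch of a generic byte-at-a-time, table-driven CRC-32 loop in C. It is
*not* the PostgreSQL CRC code (which is the 64-bit calculation under
discussion), and the names crc_table, crc32_init and crc32_buf are made up
for this example. Whether a compiler turns the byte-wide step into an
instruction that stalls depends on the compiler, flags and CPU; the sketch
only shows where such a step sits in a loop of this shape.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names: a generic CRC-32, not PostgreSQL's implementation. */
static uint32_t crc_table[256];

/* Build the 256-entry lookup table for the reflected polynomial 0xEDB88320. */
static void crc32_init(void)
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = i;
        for (int k = 0; k < 8; k++)
            c = (c & 1u) ? (c >> 1) ^ 0xEDB88320u : (c >> 1);
        crc_table[i] = c;
    }
}

/* Compute the CRC-32 of a buffer one byte at a time. */
static uint32_t crc32_buf(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t crc = 0xFFFFFFFFu;

    while (len-- > 0) {
        /*
         * The single-byte step: only the low 8 bits of (crc ^ input byte)
         * are used as the table index. A compiler may implement this with
         * a byte-wide register or memory access.
         */
        uint32_t idx = (crc ^ *p++) & 0xFFu;
        crc = crc_table[idx] ^ (crc >> 8);
    }
    return crc ^ 0xFFFFFFFFu;
}
```

Call crc32_init() once before the first crc32_buf() call. Widening the
index to a full 32-bit value before the lookup, as done with idx above, is
one common way to steer a compiler away from byte-register writes, but I
have not measured whether it matters for the case in the report.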

Sorry, but that's all I know. I will try to obtain the report, which is
not in my possession.

I do *not* know with any certainty how the time spent in the CRC calc
proper on an idealised CPU compares with the time lost to this
hardware-specific interaction. I don't know whether non-Intel CPUs are
affected either.

Best Regards, Simon Riggs