On Fri, 07 Apr 2006 16:06:02 -0400
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > The pSeries isn't much older than our Xeon machine, and I expected
> > the performance level to be exemplary out of the box..
>
> I'm fairly surprised too. One thing I note from your comparison of
> settings is that the default WAL sync method is different on the two
> operating systems.
We're very read-focussed: there's update activity, sure, but the IO is
only pushing about 500KByte/sec on average, usually much less. I also
have fsync switched off. Yes, that's dangerous, but I just want to
eliminate IO completely as a contributing factor.
> Does AIX have anything comparable to oprofile or dtrace?
I've used neither on Linux, but a quick Google search turned up a few
articles along the lines of 'in theory it shouldn't be hard to port to
AIX...' but nothing concrete. My guess is IBM sell a tool to do this.
Hell, the C++ compiler is £1200... (hence our use of GCC 4.1 to compile pg)
> Failing a low-level profiler, there should at least be
> something comparable to strace --- you should try watching some of
> the backends with strace and see what their behavior is when the
> performance goes south. Lots of delaying select()s or semop()s would
> be a red flag.
There's truss installed, which seems to do the same job as strace on
Linux, and here's a wildly non-scientific glance: I watched the
'topas' output (top for AIX), identified a PID that was doing a lot of
work, then attached truss to that PID. In addition to lots of send(),
recv() and lseek()s, about once a minute I saw hundreds of calls to
__semop() interspersed with _select(), followed by tons of
lseek()+kread()+__semop(), and then I can see the kwrite() to the pg
logfile:
246170: kwrite(2, " L O G : d u", 8) = 8 etc.
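To make the glance a bit less non-scientific, something like the rough
Python sketch below could tally the calls per syscall name from saved
truss output. It assumes every call line looks like the kwrite line
above ("PID: name(args) = result", with the PID prefix optional); the
function name tally_syscalls is made up for illustration, and real truss
output may need a different pattern.

```python
import re
from collections import Counter

# Assumed line shape, modelled on the kwrite line above:
#   "246170: kwrite(2, ...) = 8"
# The leading "PID:" is treated as optional.
CALL_RE = re.compile(r"^\s*(?:\d+:\s*)?(\w+)\(")


def tally_syscalls(lines):
    """Count how often each syscall name appears in truss-style lines."""
    counts = Counter()
    for line in lines:
        match = CALL_RE.match(line)
        if match:  # lines that do not look like a call are skipped
            counts[match.group(1)] += 1
    return counts
```

Feeding it an open file of captured output and printing
tally_syscalls(fh).most_common() would show whether __semop() and
_select() really dominate during the once-a-minute bursts.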
Cheers,
Gavin.