Re: Bad iostat numbers - Mailing list pgsql-performance

From Greg Smith
Subject Re: Bad iostat numbers
Msg-id Pine.GSO.4.64.0612032336250.19679@westnet.com
In response to Bad iostat numbers  ("Carlos H. Reimer" <carlos.reimer@opendb.com.br>)
Responses Re: Bad iostat numbers
List pgsql-performance
On Thu, 30 Nov 2006, Carlos H. Reimer wrote:

> I would like to discover how much cache is present in
> the controller, how can I find this value from Linux?

As far as I know there is no cache on an Adaptec 39320.  The write-back
cache Linux was reporting on was the one in the drives, which is 8MB; see
http://www.seagate.com/cda/products/discsales/enterprise/tech/1,1593,541,00.html
Be warned that running your database with the combination of an uncached
controller plus disks with write caching is dangerous to your database
integrity.

There is a common problem with the Linux driver for this card (aic7902)
where it enters what they're calling an "Infinite Interrupt Loop".
That seems to match your readings:

> Here is a typical iostat -x:
> Device:    rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s    rkB/s    wkB/s
> sda          0.00   7.80  0.40  6.40   41.60  113.60    20.80    56.80
> avgrq-sz avgqu-sz   await  svctm  %util
> 22.82 570697.50   10.59 147.06 100.00

An avgqu-sz of 570697.50 is extremely large.  That explains why the
utilization is 100%: there's a massive number of I/O operations queued up
that aren't getting flushed out.  The read and write data says these
drives are barely doing anything, as 20kB/s and 57kB/s are practically
idle; they're not even remotely close to saturated.
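
As a rough sanity check on those numbers, here is a small sketch that
applies Little's Law (average queue depth is about request rate times
average wait) to the iostat sample quoted above.  The function name and
the dictionary layout are my own illustration, not anything iostat
provides, and the result is only an approximation of how iostat derives
its columns:

```python
# Sketch only: cross-check one iostat -x sample for internal consistency.
# Little's Law: average queue depth ~= arrival rate * average time in system.

def expected_queue_depth(reads_per_s, writes_per_s, await_ms):
    """Queue depth implied by the request rate and the average wait (await)."""
    return (reads_per_s + writes_per_s) * await_ms / 1000.0

# Values copied from the sda line of the quoted iostat output.
sample = {
    "r/s": 0.40,
    "w/s": 6.40,
    "await": 10.59,          # milliseconds
    "svctm": 147.06,         # milliseconds
    "avgqu-sz": 570697.50,
}

expected = expected_queue_depth(sample["r/s"], sample["w/s"], sample["await"])
ratio = sample["avgqu-sz"] / expected

print("queue depth implied by rate and await: %.3f" % expected)
print("queue depth reported by iostat:        %.2f" % sample["avgqu-sz"])
print("reported / implied:                    %.0f" % ratio)

# A service time larger than the total wait is a second sign that the
# counters behind this sample cannot be trusted.
print("svctm exceeds await:", sample["svctm"] > sample["await"])
```

Fewer than 7 requests per second, each waiting about 10 ms, implies a
queue depth well under 1, so the reported figure is off by millions.
That points at bogus accounting or stuck requests rather than real load.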

See http://lkml.org/lkml/2005/10/1/47 for a suggested workaround that may
reduce the magnitude of this issue; lowering the card's speed to U160 in
the BIOS was also listed as a useful workaround.  You might get better
results by upgrading to a newer Linux kernel, and just rebooting to clear
out the garbage might help if you haven't tried that yet.

On the pessimistic side, other people reporting issues with this
controller are:

http://lkml.org/lkml/2005/12/17/55
http://www.ussg.iu.edu/hypermail/linux/kernel/0512.2/0390.html
http://www.linuxforums.org/forum/peripherals-hardware/59306-scsi-hangs-boot.html
and even under FreeBSD at
http://lists.freebsd.org/pipermail/aic7xxx/2003-August/003973.html

This Adaptec card just barely works under Linux, which happens regularly
with their controllers, and my guess is that you've run into one of the
ways it goes crazy sometimes.  I just chuckled when checking
http://linux.adaptec.com/ again and noticing they can't even be bothered
to keep that server up at all.  According to

http://www.adaptec.com/en-US/downloads/linux_source/linux_source_code?productId=ASC-39320-R&dn=Adaptec+SCSI+Card+39320-R

the driver for your card is "*minimally tested* for Linux Kernel v2.6 on
all platforms."  Adaptec doesn't care about Linux support on their
products; if you want a SCSI controller that actually works under Linux,
get an LSI MegaRAID.

If this were really a Postgres problem, I wouldn't expect %iowait=1.10.
Were the database engine waiting to read/write data, that number would be
dramatically higher.  Whatever is generating all these I/O requests, it's
not waiting for them to complete like the database would be.  Besides the
driver problems that I'm very suspicious of, I'd suspect a runaway process
writing garbage to the disks might also cause this behavior.
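
To look for a runaway writer, one option is the per-process I/O counters
in /proc/<pid>/io.  This is a sketch under assumptions: those files only
exist on kernels built with per-task I/O accounting (later than many 2.6
kernels), you generally need root to read other users' processes, and the
helper names below are made up for illustration:

```python
# Sketch only: rank processes by cumulative bytes written, using the
# "key: value" counters that Linux exposes in /proc/<pid>/io.
import os

def parse_proc_io(text):
    """Parse the 'key: value' lines of a /proc/<pid>/io file into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            counters[key.strip()] = int(value)
    return counters

def top_writers(limit=5, proc_root="/proc"):
    """Return up to `limit` (write_bytes, pid) pairs, largest writers first."""
    totals = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "io")) as handle:
                counters = parse_proc_io(handle.read())
        except OSError:
            continue  # process exited, or we lack permission to read it
        totals.append((counters.get("write_bytes", 0), int(entry)))
    return sorted(totals, reverse=True)[:limit]
```

Calling top_writers() twice a few seconds apart and comparing the
write_bytes values shows which process, if any, is generating the
writes.  If nothing stands out, that makes the driver the more likely
culprit.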

> Ive taken a look in the /var/log/messages and found some temperature
> messages about the disk drives:
> Nov 30 11:08:07 totall smartd[1620]: Device: /dev/sda, Temperature changed 2
> Celsius to 51 Celsius since last report
> Can this temperature influence in the performance?

That's close to the upper tolerance for this drive (55 degrees), which
means the drive is being cooked and will likely wear out quickly.  But
that won't slow it down, and you'd get much scarier messages out of smartd
if the drives had a real problem.  You should improve cooling in this case
if you want the drives to have a healthy life, but odds are low that this
is relevant to your performance issue.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
