Thread: "iowait" bug?
I just discovered this on a LinkedIn user group:

http://bugzilla.kernel.org/show_bug.cgi?id=12309

Is anyone here seeing evidence of this in PostgreSQL??

--
M. Edward (Ed) Borasky
http://www.linkedin.com/in/edborasky

I've never met a happy clam. In fact, most of them were pretty steamed.
2009/3/21 M. Edward (Ed) Borasky <zznmeb@gmail.com>:
> I just discovered this on a LinkedIn user group:
>
> http://bugzilla.kernel.org/show_bug.cgi?id=12309
>
> Is anyone here seeing evidence of this in PostgreSQL??

I've been hit by an I/O wait problem, as described here:
https://bugzilla.redhat.com/show_bug.cgi?id=444759

I mentioned it in that other bug report, but no one seems to have followed up on that lead.

Regards,
Laurent
On Sun, Mar 22, 2009 at 8:49 AM, Laurent Wandrebeck <l.wandrebeck@gmail.com> wrote:
> 2009/3/21 M. Edward (Ed) Borasky <zznmeb@gmail.com>:
>> I just discovered this on a LinkedIn user group:
>>
>> http://bugzilla.kernel.org/show_bug.cgi?id=12309
>>
>> Is anyone here seeing evidence of this in PostgreSQL??
> I've been hit by an I/O wait problem, as described here:
> https://bugzilla.redhat.com/show_bug.cgi?id=444759
> I mentioned it in that other bug report, but no one seems to have followed up on that lead.

We applied this MWI (Memory-Write-Invalidate) patch on 3 PostgreSQL servers and saw a great performance improvement. The hardware is a 3ware controller, 8 SAS HDDs, an octo-core (2x4) Xeon and 32GB RAM, running a custom 2.6.18 kernel.

--
Laurent Laborde
http://www.over-blog.com/
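For concreteness, the MWI patch being discussed here boils down to the controller driver asking the PCI layer to enable Memory-Write-Invalidate for the card. A minimal sketch of what such a change looks like, assuming a generic PCI storage driver probe routine (the function name and placement are illustrative, not the actual 3w-9xxx patch):

/*
 * Illustrative sketch only: the probe routine below is hypothetical.
 * The real change lives in the controller driver's own probe path.
 */
#include <linux/pci.h>

static int example_controller_probe(struct pci_dev *pdev,
                                    const struct pci_device_id *id)
{
        int err;

        err = pci_enable_device(pdev);
        if (err)
                return err;

        pci_set_master(pdev);

        /*
         * Ask the PCI core to enable Memory-Write-Invalidate.  This is
         * best-effort: pci_try_set_mwi() can fail (for instance if the
         * cache line size cannot be set), and the driver is expected to
         * keep working without it.
         */
        if (pci_try_set_mwi(pdev))
                dev_info(&pdev->dev, "MWI not enabled, continuing without it\n");

        /* ... normal controller setup would continue here ... */
        return 0;
}

pci_try_set_mwi() is deliberately the non-fatal variant: it returns an error code rather than failing the probe, so enabling it is a pure hint to the controller about how to burst writes onto the bus.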
On Fri, 20 Mar 2009, M. Edward (Ed) Borasky wrote:
> I just discovered this on a LinkedIn user group:
> http://bugzilla.kernel.org/show_bug.cgi?id=12309

I would bet there are at least 3 different bugs in that one. That bug report got a lot of press via Slashdot a few months ago, and it has picked up all sorts of people who have I/O wait issues, but they don't all have the same cause. The 3ware-specific problem Laurent mentioned is an example; that's not the same thing most of the people there are running into, since the typical reporter there has disks attached directly to their motherboard. The irony here is that #12309 was a fork of #7372, made to start over with a clean discussion slate because the same thing happened to that earlier one.

The original problem reported there showed up in 2.6.20, so I've been able to avoid this whole thing by sticking to the stock RHEL5 kernel (2.6.18) on most of the production systems I deal with. (Except for my system with an Areca card--that one needs 2.6.22 or later to be stable, and seems to have no unexpected I/O wait issues. I think this is because the card takes over the lowest level of I/O scheduling from Linux when it pushes from its cache onto the disks.)

Some of the people there reported significant improvement by tuning the pdflush tunables; that's something I've had to do a few times now on systems to get rid of unexpected write lulls. I wrote up a walkthrough on one of them at http://notemagnet.blogspot.com/2008/08/linux-write-cache-mystery.html that goes over how to tell if you're running into that problem and what to do about it; something else I wrote on that topic already made it into the bug report, in comment #150.

--
* Greg Smith  gsmith@gregsmith.com  http://www.gregsmith.com  Baltimore, MD
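For anyone who wants to see where those knobs live without reading the whole walkthrough, here is a minimal sketch (not taken from Greg's post; it just reads the standard /proc/sys/vm entries) that prints the current pdflush-related settings:

/* Print the dirty-writeback tunables the thread is talking about. */
#include <stdio.h>

static void show(const char *name)
{
        char path[128];
        char buf[64];
        FILE *f;

        snprintf(path, sizeof(path), "/proc/sys/vm/%s", name);
        f = fopen(path, "r");
        if (!f) {
                printf("%-28s (not available)\n", name);
                return;
        }
        if (fgets(buf, sizeof(buf), f))
                printf("%-28s %s", name, buf);
        fclose(f);
}

int main(void)
{
        show("dirty_background_ratio");    /* % of RAM dirty before background writeback starts */
        show("dirty_ratio");               /* % of RAM dirty before writers are forced to block  */
        show("dirty_expire_centisecs");    /* how old dirty data may get before it must be written */
        show("dirty_writeback_centisecs"); /* how often the writeback threads wake up            */
        return 0;
}

Lowering dirty_background_ratio and dirty_ratio (via sysctl or by writing to the same files) makes the kernel start flushing dirty pages sooner; that is the general direction this kind of tuning takes, with the right values depending on how much RAM and controller cache the box has.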
2009/3/22 Greg Smith <gsmith@gregsmith.com>:
> On Fri, 20 Mar 2009, M. Edward (Ed) Borasky wrote:
>
>> I just discovered this on a LinkedIn user group:
>> http://bugzilla.kernel.org/show_bug.cgi?id=12309
>
> I would bet there are at least 3 different bugs in that one. That bug report got a lot of press via Slashdot a few months ago, and it has picked up all sorts of people who have I/O wait issues, but they don't all have the same cause. The 3ware-specific problem Laurent mentioned is an example; that's not the same thing most of the people there are running into, since the typical reporter there has disks attached directly to their motherboard. The irony here is that #12309 was a fork of #7372, made to start over with a clean discussion slate because the same thing happened to that earlier one.

That I/O wait problem is not 3ware-specific. A friend of mine hit the same problem (and the same fix) with aacraid. I'd bet a couple of coins that the controllers showing this problem are the ones that do not set MWI.

Quickly grepping the Linux sources (2.6.28.8) for pci_try_set_mwi (only disk controller drivers shown here):

230:pata_cs5530.c
3442:sata_mv.c
2016:3w-9xxx.c
147:qla_init.c
2412:lpfc_init.c
171:cs5530.c

> The original problem reported there showed up in 2.6.20, so I've been able to avoid this whole thing by sticking to the stock RHEL5 kernel (2.6.18) on most of the production systems I deal with. (Except for my system with an Areca card--that one needs 2.6.22 or later to be stable, and seems to have no unexpected I/O wait issues. I think this is because the card takes over the lowest level of I/O scheduling from Linux when it pushes from its cache onto the disks.)

I thought about the Completely Fair Scheduler at first, but that one came in around 2.6.21. Some tests were done with different I/O schedulers, and they do not seem to be the real cause of the I/O wait. A bad interaction between a hardware RAID card's cache and the system asking the card to write at the same time could be a reason. Unfortunately, I've also run into it on a now-retired box at work that was running a single disk plugged into the motherboard controller. So there's something else under the hood... but my (very) limited kernel knowledge can't help more here.

> Some of the people there reported significant improvement by tuning the pdflush tunables; that's something I've had to do a few times now on systems to get rid of unexpected write lulls. I wrote up a walkthrough on one of them at http://notemagnet.blogspot.com/2008/08/linux-write-cache-mystery.html that goes over how to tell if you're running into that problem and what to do about it; something else I wrote on that topic already made it into the bug report, in comment #150.

I think that forcing the system to write smaller amounts of data more often just hides the problem rather than correcting it. But well, that's just a feeling, not science. I hope some real hacker will be able to spot the problem(s) so they can be fixed. Anyway, I keep a couple of coins on MWI as a source of the problem :-)

Regards,
Laurent
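If you want to test that bet on a live box, the MWI enable is bit 4 of the 16-bit PCI command register at offset 0x04 in configuration space, which the kernel exposes through sysfs. A rough sketch (the device address below is only a placeholder; substitute the one lspci reports for your controller):

/* Check whether Memory-Write-Invalidate is enabled on one PCI device. */
#include <stdio.h>
#include <stdint.h>

#define PCI_CONFIG "/sys/bus/pci/devices/0000:03:00.0/config"  /* placeholder address */
#define PCI_COMMAND_OFFSET     0x04
#define PCI_COMMAND_INVALIDATE 0x0010  /* Memory-Write-Invalidate enable bit */

int main(void)
{
        FILE *f = fopen(PCI_CONFIG, "rb");
        uint8_t cmd[2];

        if (!f) {
                perror("open config space");
                return 1;
        }
        if (fseek(f, PCI_COMMAND_OFFSET, SEEK_SET) != 0 ||
            fread(cmd, 1, sizeof(cmd), f) != sizeof(cmd)) {
                perror("read command register");
                fclose(f);
                return 1;
        }
        fclose(f);

        /* PCI config space is little-endian. */
        uint16_t command = (uint16_t)cmd[0] | ((uint16_t)cmd[1] << 8);
        printf("PCI command register: 0x%04x, MWI %s\n", command,
               (command & PCI_COMMAND_INVALIDATE) ? "enabled" : "disabled");
        return 0;
}

The same bit shows up in the Control line of lspci -vv output as MemWINV+ or MemWINV-, which is quicker if you only want a yes/no answer.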
On Mon, 23 Mar 2009, Laurent Wandrebeck wrote:

> I thought about the Completely Fair Scheduler at first, but that one came in around 2.6.21.

CFS showed up in 2.6.23.

> I think that forcing the system to write smaller amounts of data more often just hides the problem rather than correcting it.

That's one possibility. I've been considering things like whether the OS is getting bogged down managing things like the elevator sorting for outstanding writes. If there is something about that process that gets really inefficient in proportion to the size of the pending queue, it would both match the kinds of symptoms people are reporting and explain why things improve just from reducing the maximum size of that queue by lowering the pdflush tunables.

Anyway, the point I was trying to make is that there sure seem to be multiple problems mixed into that one bug report, and it's starting to look just as unmanageably messy as the older bug that had to be abandoned. It would have been nice if somebody had kicked out all the diversions it wandered into, to keep the focus a bit better. Anybody using an SSD device, USB, or ext4 should have been punted somewhere else, for example; there are plenty of reports that don't require any of those things.

--
* Greg Smith  gsmith@gregsmith.com  http://www.gregsmith.com  Baltimore, MD
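One way to see whether that pending write queue is actually ballooning on a given system is to watch the Dirty and Writeback lines in /proc/meminfo while the workload runs. A rough one-second poll, as a sketch (any monitoring tool that graphs those two counters does the same job):

/* Poll the dirty-page and writeback counters once a second (stop with Ctrl-C). */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
        char line[256];

        for (;;) {
                FILE *f = fopen("/proc/meminfo", "r");
                if (!f) {
                        perror("/proc/meminfo");
                        return 1;
                }
                while (fgets(line, sizeof(line), f)) {
                        if (strncmp(line, "Dirty:", 6) == 0 ||
                            strncmp(line, "Writeback:", 10) == 0)
                                fputs(line, stdout);
                }
                fclose(f);
                puts("---");
                fflush(stdout);
                sleep(1);
        }
}

A Dirty figure that climbs steadily and then drains in one long burst, with I/O wait spiking during the drain, is the pattern the pdflush tuning discussed above is meant to smooth out.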