Thread: odd iostat graph

odd iostat graph

From
Alan McKay
Date:
Hey folks,

Earlier in the week I wrote a Munin plugin that takes the "await" and
"average queue length" fields from "iostat -x" and graphs them.

This seems rather odd to me :

http://picasaweb.google.ca/alan.mckay/Work#5380253477470243954

That is Qlen.   And await looks similar

http://picasaweb.google.ca/alan.mckay/Work#5380254090296723426

This is on an IBM 3650 with the 2 main "internal" drives set up in a
mirrored config, and sdb made up of the 6 other drives set up in a RAID5
with a global hot spare.  (4 drives in the array + 1 to make it RAID5 +
a global hot spare)

We aren't seeing any performance problems on this per se.   But that
just seems like a really odd graph to me.  Can anyone explain it?  In
particular, how regular it is?

cheers,
-Alan

--
“Don't eat anything you've ever seen advertised on TV”
         - Michael Pollan, author of "In Defense of Food"

Re: odd iostat graph

From
Scott Carey
Date:
On 9/11/09 9:58 AM, "Alan McKay" <alan.mckay@gmail.com> wrote:

> Hey folks,
>
> Earlier in the week I wrote a Munin plugin that takes the "await" and
> "average queue length" fields from "iostat -x" and graphs them.
>
> This seems rather odd to me :
>
> http://picasaweb.google.ca/alan.mckay/Work#5380253477470243954
>
> That is Qlen.   And await looks similar
>
> http://picasaweb.google.ca/alan.mckay/Work#5380254090296723426
>
> This is on an IBM 3650 with the 2 main "internal" drives set up in a
> mirrored config, and sdb made up of the 6 other drives set up in a RAID5
> with a global hot spare.  (4 drives in the array + 1 to make it RAID5 +
> a global hot spare)
>
> We aren't seeing any performance problems on this per se.   But that
> just seems like a really odd graph to me.  Can anyone explain it?  In
> particular, how regular it is?

My guess is this is checkpoint related.
Find out when your checkpoints are happening.  The drops are most likely due
to the sync() on all outstanding writes at the end of each checkpoint.  The
rise is probably small writes not yet on disk in the OS buffer cache.  If
this is due to checkpoints, I would expect a burst of write volume to disk
at the same time of the drop.

You can change your logging settings to output the time of each checkpoint
and some stats about them.
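For example, in postgresql.conf (assuming 8.3 or later, where `log_checkpoints` is available; `checkpoint_segments` was the WAL-volume trigger in versions of that era):

```
# Log the start/end of each checkpoint plus buffer and sync stats
log_checkpoints = on

# Default checkpoint spacing; a checkpoint also triggers early when
# enough WAL has accumulated (checkpoint_segments in 8.x)
checkpoint_timeout = 5min
checkpoint_segments = 3
```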

>
> cheers,
> -Alan
>
> --
> “Don't eat anything you've ever seen advertised on TV”
>          - Michael Pollan, author of "In Defense of Food"
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>


Re: odd iostat graph

From
Alan McKay
Date:
> My guess is this is checkpoint related.

I'll assume "checkpoint" is a PG term that I'm not yet familiar with -
will query my DBA :-)

If this is the OS buffer cache, wouldn't that be cached an awfully long
time?  i.e. we're in big trouble if we get a bad crash?


--
“Don't eat anything you've ever seen advertised on TV”
         - Michael Pollan, author of "In Defense of Food"

Re: odd iostat graph

From
"Kevin Grittner"
Date:
Alan McKay <alan.mckay@gmail.com> wrote:

>>  My guess is this is checkpoint related.
>
> I'll assume "checkpoint" is a PG term that I'm not yet familiar with -
> will query my DBA :-)

A checkpoint flushes all dirty PostgreSQL buffers to the OS and then
tells the OS to write them to disk.  The exact details of how that's
done and the timings involved vary with PostgreSQL version and
configuration.

> If this OS buffer cache, wouldn't that be cached an awfully long
> time?  i.e. we're in big trouble if we get a bad crash?

Before the commit of a database transaction completes, the changes
involved in it are written to a write-ahead log (WAL).  A checkpoint
is also recorded in the WAL.  On recovery from a crash, PostgreSQL
replays all activity from committed transactions after the last
checkpoint, so nothing from a committed transaction is lost.
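If you want to see the checkpoint-related settings in effect, a quick check from psql (setting names valid in the 8.x series; `synchronous_commit` only exists from 8.3 on):

```sql
-- How often timed checkpoints occur
SHOW checkpoint_timeout;    -- default: 5min
-- Whether WAL is forced to disk at all
SHOW fsync;                 -- default: on
-- 8.3+: whether commit waits for the WAL flush
SHOW synchronous_commit;    -- default: on
```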

-Kevin

Re: odd iostat graph

From
Greg Smith
Date:
On Fri, 11 Sep 2009, Alan McKay wrote:

> We aren't seeing any performance problems on this per-se.   But that
> just seems like a really odd graph to me.  Can anyone explain it?  In
> particular, how regular it is?

What's the scale on the bottom there?  The label says "by week" but the
way your message is written makes me think it's actually a much smaller
time frame.  If those valleys are around five minutes apart, those
are the checkpoints finishing; the shape of the graph is right for it to
be those.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: odd iostat graph

From
Alan McKay
Date:
> What's the scale on the bottom there?  The label says "by week" but the way
> your message is written makes me think it's actually a much smaller time
> frame.  If those valleys are around five minutes apart, those are the
> checkpoints finishing; the shape of the graph is right for it to be those.

No, that's about 6 days of stats.
The numbers 04, 05, 06 are Sept 04, 05, 06 ...

This is the part I found oddest - that this takes place over such a
huge timeframe!

Munin takes a snapshot every 5 minutes, and this graph shows it
averaged over that timeframe.

--
“Don't eat anything you've ever seen advertised on TV”
         - Michael Pollan, author of "In Defense of Food"

Re: odd iostat graph

From
Greg Smith
Date:
On Fri, 11 Sep 2009, Alan McKay wrote:

> Munin takes a snapshot every 5 minutes, and this graph shows it
> averaged over that timeframe.

The default postgresql.conf puts a checkpoint every 5 minutes as well.
It's not going to be as exact as Munin's timing, though; the checkpoints
will run just a little longer than that.  I wonder if you're seeing the
checkpoint pattern anyway.  If the checkpoint period is fairly stable
(which it could be in your case), is slightly longer than the monitoring
one, and each checkpoint has the same basic shape (also usually true),
each monitoring sample is going to trace out the usual checkpoint
pattern by sampling a slightly earlier point from each successive one.
The shape of the curve you're seeing seems way too close to the standard
checkpoint one to be just a coincidence.
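The stretched-out pattern described above can be demonstrated with a toy simulation (the 310 s checkpoint period and the sawtooth queue shape are assumptions chosen for illustration, not measurements from Alan's system):

```python
# Toy simulation: a checkpoint cycle slightly longer than the 5-minute
# sampling interval makes the fast sawtooth reappear at a much slower
# apparent rate in the sampled data.

SAMPLE_PERIOD = 300       # Munin snapshot interval, seconds
CHECKPOINT_PERIOD = 310   # assumed checkpoint cycle, slightly longer

def queue_length(t):
    """Toy within-cycle shape: the queue builds steadily, then the
    sync() at the end of the checkpoint empties it."""
    phase = t % CHECKPOINT_PERIOD
    return phase / CHECKPOINT_PERIOD   # ramps 0 -> 1, then drops to 0

# Each successive sample lands 10 s earlier in the checkpoint cycle
# than the last, so the ~5 minute sawtooth shows up stretched across
# 31 samples (~2.6 hours) instead.
samples = [queue_length(n * SAMPLE_PERIOD) for n in range(62)]
```

Shrinking the assumed period difference stretches the apparent cycle further: at a difference of ~0.2 s instead of 10 s, one apparent cycle would span several days, which is the kind of timescale the graph shows.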

In any case, a 5 minute sampling period is unfortunately a really bad
choice for a PG database, because it is so close to the checkpoint
timing.  You have to take measurements at least once a minute to see
the checkpoints happening usefully at all.  I think you're stuck
generating an activity graph some other way, with finer resolution, to
get to the bottom of this.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD