Thread: Identifying the nature of blocking I/O

Identifying the nature of blocking I/O

From
Peter Schuller
Date:
[for the purpose of this post, 'blocking' refers to an I/O operation
taking a long time for reasons other than the amount of work the I/O
operation itself actually implies; not to use of blocking I/O calls or
anything like that]

Hello,

I have a situation in which deterministic latency is a lot more
important than throughput.

I realize this is a hugely complex topic and that there is interaction
between many different things (pg buffer cache, os buffer cache, raid
controller caching, wal buffers, storage layout, etc). I already know
several things I definitely want to do to improve things.

But in general, it would be very interesting to see, at any given
moment, what PostgreSQL backends are actually blocking on from the
perspective of PostgreSQL.

So for example, if I have 30 COMMITs active, I would like to know whether
each is simply waiting on the WAL fsync or is actually waiting on a data
fsync because a checkpoint is in progress. Similarly, for non-commits,
whether they are blocking because the WAL buffers are full and
writing them out is blocking, etc.

This would make it easier to observe and draw conclusions when
tweaking different things in pg/the os/the raid controller.

Is there currently a way of dumping such information? I.e., asking PG
"what are backends waiting on right now?".

--
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <peter.schuller@infidyne.com>'
Key retrieval: Send an E-Mail to getpgpkey@scode.org
E-Mail: peter.schuller@infidyne.com Web: http://www.scode.org



Re: Identifying the nature of blocking I/O

From
Craig Ringer
Date:
Peter Schuller wrote:

> But in general, it would be very interesting to see, at any given
> moment, what PostgreSQL backends are actually blocking on from the
> perspective of PostgreSQL.

The recent work on DTrace support for PostgreSQL will probably give you
the easiest path to useful results. You'll probably need an OpenSolaris
or (I think) FreeBSD host, though, rather than a Linux host.

--
Craig Ringer

Re: Identifying the nature of blocking I/O

From
"Scott Carey"
Date:
More info/notes on DTrace --

DTrace is available now on MacOSX, Solaris 10, OpenSolaris, and FreeBSD.
Linux however is still in the dark ages when it comes to system monitoring, especially with I/O.

You can write some custom DTrace scripts to map any of the basic Postgres operations or processes to things that it is waiting on in the OS.  You can definitely write a script that would be able to track the I/O in reads and writes caused by a transaction, how long those took, what the I/O sizes were, and even what portion of the disk it went to.

http://lethargy.org/~jesus/archives/74-PostgreSQL-performance-through-the-eyes-of-DTrace.html
http://www.brendangregg.com/dtrace.html#DTraceToolkit

Even without the custom DTrace probes in Postgres, DTrace gives you the ability to see what the OS is doing, how long it is taking, and what processes, files, locks, or other things are involved.  Most important however is the ability to correlate things and not just deal with high level aggregates like more simplistic tools.  It takes some work and it is not the easiest thing to use, as its power comes at a complexity cost.



On Sun, Aug 24, 2008 at 5:30 PM, Craig Ringer <craig@postnewspapers.com.au> wrote:
Peter Schuller wrote:

> But in general, it would be very interesting to see, at any given
> moment, what PostgreSQL backends are actually blocking on from the
> perspective of PostgreSQL.

The recent work on DTrace support for PostgreSQL will probably give you
the easiest path to useful results. You'll probably need an OpenSolaris
or (I think) FreeBSD host, though, rather than a Linux host.

--
Craig Ringer


Re: Identifying the nature of blocking I/O

From
Tom Lane
Date:
Craig Ringer <craig@postnewspapers.com.au> writes:
> Peter Schuller wrote:
>> But in general, it would be very interesting to see, at any given
>> moment, what PostgreSQL backends are actually blocking on from the
>> perspective of PostgreSQL.

> The recent work on DTrace support for PostgreSQL will probably give you
> the easiest path to useful results. You'll probably need an OpenSolaris
> or (I think) FreeBSD host, though, rather than a Linux host.

<cant-resist>get a mac</cant-resist>

(Mind you, I don't think Apple sells any hardware that would be really
suitable for a big-ass database server.  But for development purposes,
OS X on a recent laptop is a pretty nice unix-at-the-core-plus-eye-candy
environment.)

            regards, tom lane

Re: Identifying the nature of blocking I/O

From
Tom Lane
Date:
"Scott Carey" <scott@richrelevance.com> writes:
> DTrace is available now on MacOSX, Solaris 10, OpenSolaris, and FreeBSD.
> Linux however is still in the dark ages when it comes to system monitoring,
> especially with I/O.

Oh, after poking around a bit, I should note that some of my Red Hat
compatriots think that "systemtap" is the long-term Linux answer here.
I know zip about it myself, but it's something to read up on if you are
looking for better performance monitoring on Linux.

            regards, tom lane

Re: Identifying the nature of blocking I/O

From
Alvaro Herrera
Date:
Tom Lane wrote:
> "Scott Carey" <scott@richrelevance.com> writes:
> > DTrace is available now on MacOSX, Solaris 10, OpenSolaris, and FreeBSD.
> > Linux however is still in the dark ages when it comes to system monitoring,
> > especially with I/O.
>
> Oh, after poking around a bit, I should note that some of my Red Hat
> compatriots think that "systemtap" is the long-term Linux answer here.
> I know zip about it myself, but it's something to read up on if you are
> looking for better performance monitoring on Linux.

FWIW there are a number of tracing options on Linux, none of which is
said to be yet at the level of DTrace.  See here for an article on the
topic: http://lwn.net/Articles/291091/

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: Identifying the nature of blocking I/O

From
"Alexander Staubo"
Date:
On Mon, Aug 25, 2008 at 3:34 AM, Scott Carey <scott@richrelevance.com> wrote:
> DTrace is available now on MacOSX, Solaris 10, OpenSolaris, and FreeBSD.
> Linux however is still in the dark ages when it comes to system monitoring,
> especially with I/O.

While that's true, newer 2.6 kernel versions at least have I/O
accounting built in, something which only used to be available through
the "atop" accounting kernel patch:

$ cat /proc/22785/io
rchar: 31928
wchar: 138
syscr: 272
syscw: 4
read_bytes: 0
write_bytes: 0
cancelled_write_bytes: 0
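
For scripting against these counters, a minimal Python sketch follows. It assumes the "key: value" layout shown in the output above; the function names parse_proc_io and read_proc_io are made up for illustration, and reading another user's /proc/<pid>/io normally requires matching privileges.

```python
from pathlib import Path


def parse_proc_io(text):
    """Parse 'key: value' lines (the /proc/<pid>/io layout) into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        # Skip anything that is not a 'name: integer' pair.
        if sep and value.strip().isdigit():
            counters[key.strip()] = int(value)
    return counters


def read_proc_io(pid):
    """Read and parse /proc/<pid>/io (Linux only, needs I/O accounting)."""
    return parse_proc_io(Path(f"/proc/{pid}/io").read_text())


# The sample output quoted in this message, used as test input.
SAMPLE = """rchar: 31928
wchar: 138
syscr: 272
syscw: 4
read_bytes: 0
write_bytes: 0
cancelled_write_bytes: 0
"""

if __name__ == "__main__":
    for name, value in parse_proc_io(SAMPLE).items():
        print(name, value)
```

Calling read_proc_io on the PID of a PostgreSQL backend would return the same seven counters for that process.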

Alexander.

Re: Identifying the nature of blocking I/O

From
RW
Date:
This doesn't exactly match the topic, but it is sometimes helpful.
If you have I/O accounting enabled on a kernel >= 2.6.20 (it needs
to be compiled with

CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
CONFIG_TASK_XACCT=y
CONFIG_TASK_IO_ACCOUNTING=y
)

and the sysstat package (>= 7.1.5) installed, you can use the "pidstat"
command, which shows you the processes doing I/O in kB/s.
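
Where pidstat is not available, a similar per-process kB/s figure can be approximated by sampling the kernel's I/O accounting counters twice. The Python sketch below is an illustration only: it assumes the read_bytes and write_bytes fields of /proc/<pid>/io are the relevant counters, and the function name io_rate_kb_per_s is hypothetical, not part of sysstat.

```python
def io_rate_kb_per_s(before, after, interval_s):
    """Return (read_kB_per_s, write_kB_per_s) between two counter snapshots.

    'before' and 'after' are dicts holding at least the 'read_bytes' and
    'write_bytes' counters, taken interval_s seconds apart.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    read_kb = (after["read_bytes"] - before["read_bytes"]) / 1024 / interval_s
    write_kb = (after["write_bytes"] - before["write_bytes"]) / 1024 / interval_s
    return read_kb, write_kb


if __name__ == "__main__":
    # Two made-up snapshots taken 2 seconds apart.
    first = {"read_bytes": 0, "write_bytes": 0}
    second = {"read_bytes": 4096, "write_bytes": 10240}
    print(io_rate_kb_per_s(first, second, 2.0))
```

In practice the two snapshots would come from reading /proc/<pid>/io for the same backend at the start and end of the interval.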

Robert





Re: Identifying the nature of blocking I/O

From
"Jonah H. Harris"
Date:
On Fri, Aug 22, 2008 at 7:52 AM, Peter Schuller
<peter.schuller@infidyne.com> wrote:
> Is there currently a way of dumping such information? I.e., asking PG
> "what are backends waiting on right now?".

Unfortunately, not within Postgres itself.  The question, "what is the
database waiting on?" is a good one, and one Oracle understood in the
early 90's.  It is for that reason that EnterpriseDB added RITA, the
Runtime Instrumentation and Tracing Architecture, to their Advanced
Server product.  RITA gives DBAs some of the same information as the
Oracle Wait Interface does regarding what the database is waiting for,
such as locks, I/O, and which relation/block.  While it's not as
efficient as DTrace due to Linux's lack of a good high-resolution
user-mode timer, no one has found it to have a noticeable overhead on
the throughput of a system in benchmarks or real-world applications.

If you're on a DTrace platform, I would suggest using it.  Otherwise,
you can try and use strace/ltrace on Linux, but that's probably not
going to get you the answers you need quickly or easily enough.  Until
enough users ask for this type of feature, the community isn't going
to see it as valuable enough to add to the core engine.  IIRC,
systemtap is pretty much dead :(
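
A much cruder Linux-only check than strace, and not something proposed in this thread, is to look at the process state field in /proc/<pid>/stat: state "D" means uninterruptible sleep, which usually indicates a process blocked on I/O. It says nothing about which I/O PostgreSQL is waiting for. The Python sketch below assumes the documented "pid (comm) state ..." layout; the function names and the sample lines are hypothetical.

```python
def parse_stat_state(stat_line):
    """Return (pid, comm, state) from one /proc/<pid>/stat line.

    The command name is wrapped in parentheses and may itself contain
    spaces or ')', so split on the last ')' rather than on whitespace.
    """
    left, _, rest = stat_line.rpartition(")")
    pid_str, _, comm = left.partition("(")
    state = rest.split()[0]
    return int(pid_str), comm, state


def blocked_on_io(stat_lines):
    """Return (pid, comm) for every process in state 'D'."""
    parsed = (parse_stat_state(line) for line in stat_lines)
    return [(pid, comm) for pid, comm, state in parsed if state == "D"]


if __name__ == "__main__":
    # Made-up, truncated stat lines for illustration.
    lines = [
        "22785 (postgres) D 1 22785 22785 0",
        "22786 (postgres) S 1 22786 22786 0",
    ]
    print(blocked_on_io(lines))
```

Sampling this repeatedly for all backend PIDs gives a rough picture of how many are stuck in the kernel at a given moment.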

--
Jonah H. Harris, Senior DBA
myYearbook.com

Re: Identifying the nature of blocking I/O

From
Greg Smith
Date:
On Sun, 24 Aug 2008, Tom Lane wrote:

> Mind you, I don't think Apple sells any hardware that would be really
> suitable for a big-ass database server.

If you have money to burn, you can get an XServe with up to 8 cores and
32GB of RAM, and get a card to connect it to a Fiber Channel disk array.
For only moderately large requirements, you can even get a card with 256MB
of battery-backed cache (rebranded LSI) to attach the 3 drives in the
chassis.  None of these are very cost effective compared to servers like
the popular HP models people mention here regularly, but it is possible.

As for Systemtap on Linux, it is possible that it will accumulate
enough of a standard library to be usable by regular admins one day, but I
don't see any sign that's a priority for development.  Right now what you
have to know in order to write useful scripts is so much more complicated
than DTrace, where there's all sorts of useful things you can script
trivially.  I think a good part of DTrace's success comes from flattening
that learning curve.  Take a look at the one-liners at
http://www.solarisinternals.com/wiki/index.php/DTraceToolkit and compare
them against http://sourceware.org/systemtap/examples/

That complexity works against the tool on so many levels.  For example, I
can easily imagine selling even a paranoid admin on running a simple
DTrace script like the one-line examples.  Whereas every Systemtap example
I've seen looks pretty scary at first, and I can't imagine a DBA in a
typical enterprise environment being able to convince their associated
admin team they're perfectly safe to run in production.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD