Thread: how to find the sql using most of the i/o in an oltp system

From: Uwe Bartels
Hi,

I've read a lot about Postgres tuning and have done some of it. But one of the
first things, when you start tuning a system that is completely new to you, is
checking which SQL statements consume most of the resources.

CPU-intensive SQL seems easy to find.
But how do I find I/O-intensive SQL as quickly as possible?

Tuning SQL statements is something I'm familiar with. Finding a SQL statement
that takes too long due to I/O is probably easy as well. But what about
statements that take about 100 ms, read a lot, and are executed several
times per second?

To ask the same question differently: is there a way to check how many
pages/kB a SQL statement reads from shared buffers and how many pages/kB
from disk?
Is there a way to read or create historical records of how often and how
fast a SQL statement is run, and how many reads it performed from shared
buffers, from disk, and so on?


Best regards,
Uwe

Re: how to find the sql using most of the i/o in an oltp system

From: "Uwe Bartels"
Hi Alexander,

thanks for your answer.
What you wrote about Postgres I already knew. I just tried logging all statements with statistics. That produces a lot of unstructured data in a log file, but it's the best I've found so far.
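[For anyone trying the same approach, a minimal postgresql.conf fragment for that kind of per-statement logging might look like the sketch below; the specific values are a suggestion, not from the original mail:]

```
# postgresql.conf -- log every statement with timing and resource usage
log_min_duration_statement = 0   # 0 = log all statements with their duration
log_statement_stats = on         # parser/planner/executor resource usage per statement
```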

The database is running on a Solaris box, so DTrace is no problem. I couldn't find any DTrace scripts for Postgres, though. Do you know of any scripts besides this sample script?
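[Lacking a ready-made script, one hedged starting point is to aggregate read(2) bytes per postgres backend straight from DTrace's syscall provider; this is a sketch, and the execname filter is an assumption about the local setup:]

```
# Sum the bytes requested by read(2) for each postgres backend;
# Ctrl-C prints the per-pid totals.
dtrace -n 'syscall::read:entry /execname == "postgres"/ { @bytes[pid] = sum(arg2); }'
```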

Thanks.
Uwe

On Sun, Jun 15, 2008 at 4:03 PM, Alexander Staubo <alex@bengler.no> wrote:
On Sun, Jun 15, 2008 at 3:48 PM, Uwe Bartels <uwe.bartels@gmail.com> wrote:
> Tuning a sql statements I'm familiar with. Finding a sql statement which
> takes too long due to i/o is probably easy as well. But how about statements
> that take about 100 ms, that read a lot and that are executed several times
> per second?

Take a look at the PostgreSQL manual chapter on monitoring and statistics:

 http://www.postgresql.org/docs/8.3/interactive/monitoring.html

If you have access to DTrace (available on Solaris, OS X and possibly
FreeBSD), you could hook the low-level system calls to reads and
writes. If you don't have access to DTrace, the pg_statio_* set of
tables is your main option. In particular, pg_statio_user_tables and
pg_statio_user_indexes. See the documentation for the meaning of the
individual columns.
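[For example, a quick way to rank tables by physical reads against the standard pg_statio_user_tables view; the hit-ratio arithmetic is an editorial sketch, not from the docs:]

```sql
-- Tables doing the most physical reads, with their buffer hit ratio
SELECT relname,
       heap_blks_read,                     -- blocks read from disk/OS cache
       heap_blks_hit,                      -- blocks found in shared buffers
       round(100.0 * heap_blks_hit
             / nullif(heap_blks_hit + heap_blks_read, 0), 1) AS hit_pct
FROM pg_statio_user_tables
ORDER BY heap_blks_read DESC
LIMIT 10;
```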

Unfortunately, the statistics tables are not transaction-specific
(indeed I believe they only update once you commit the transaction,
and then only after a delay), meaning they capture statistics about
everything currently going on in the database. The only way to capture
statistics about a single query, then, is to run it in complete
isolation.
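[In practice that isolation can be approximated by zeroing the counters first; a sketch, noting that pg_stat_reset() requires superuser and resets counters for the current database:]

```sql
SELECT pg_stat_reset();   -- zero the statistics counters (superuser only)
-- ... run the statement under test, allow for the collector's delay ...
SELECT relname, heap_blks_read, heap_blks_hit
FROM pg_statio_user_tables
WHERE heap_blks_read + heap_blks_hit > 0;
```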

Alexander.

Re: how to find the sql using most of the i/o in an oltp system

From: "Gavin M. Roy"
Check out pgFouine.

Sent from my iPhone
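[pgFouine parses the statement log produced by log_min_duration_statement into an HTML report of the most expensive queries; an invocation might look like this, though the flags shown are assumptions from pgFouine's own docs, not from this thread:]

```
# requires statement logging to a file first (log_min_duration_statement)
pgfouine.php -file /var/log/postgresql.log -top 20 > report.html
```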
