Thread: understanding pg_stat* numbers

understanding pg_stat* numbers

From

Oleg Bartunov

Date:

26 March 2005, 10:41:59

Hi there,

I'm investigating one performance issue with tsearch2 index  and trying to 
interperet io statiscs from pg_statio_user_tables, pg_stat_user_tables.
But from documentation it's not clear what numbers I shoud take into
account and I'm a bit confused :)
I'm looking for blocks *actually* read from disk, since IO is the most 
important factor.

I reseted stats and run my  query and then obtained statistics:

=# select pg_stat_reset();

=# explain analyze select 1 from message_parts where  message_parts.index_fts @@ '\'star\'';
                                QUERY PLAN 
 

-------------------------------------------------------------------------------------------------------------------------------------
IndexScan using a_gist_key on message_parts  (cost=0.00..1381.92 rows=469 width=0) (actual time=0.785..1236.086
rows=5142loops=1)   Index Cond: (index_fts @@ '\'star\''::tsquery) Total runtime: 1240.274 ms
 
(3 rows)



=# select 'StatB:',heap_blks_read,heap_blks_hit,idx_blks_read, idx_blks_hit from pg_statio_user_tables where
relname='message_parts';?column? | heap_blks_read | heap_blks_hit | idx_blks_read | idx_blks_hit 
 
----------+----------------+---------------+---------------+-------------- StatB:   |           1888 |          1700 |
       1056 |         7226
 
(1 row)

=# select 'StatR:',seq_scan,seq_tup_read,idx_scan,idx_tup_fetch from pg_stat_user_tables where relname='message_parts';
?column?| seq_scan | seq_tup_read | idx_scan | idx_tup_fetch 
 
----------+----------+--------------+----------+--------------- StatR:   |        0 |            0 |        1 |
5939
 
(1 row)

From documentation:

the total number of disk blocks read from table    - 1888
the number of buffer hits from table               - 1700
the numbers of disk blocks read from indices       - 1056
the number of buffer hits from indices             - 7226

total numbers of rows returned by index scan       - 5939

So, the total number of table blocks read  is (1888+1700), and index blocks 
is (1056+7226) ?  Or from 1888 table blocks read there were 1700 blocks
already in buffer, but then I dont' understand index stats.

Since disk io is the most important performance factor,
should I look mostly on heap_blks_read and idx_blks_read ?

My query returns 5142 rows, while I see from  idx_tup_fetch that
index returns 5939 rows. So, does it means that 5939 table rows was actually read
from disk and checked for lossines (index is lossy) and 797 hits was
actually false drops ?

    Regards,        Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

Re: understanding pg_stat* numbers

From

Tom Lane

Date:

26 March 2005, 21:05:15

Oleg Bartunov <oleg@sai.msu.su> writes:
> I'm looking for blocks *actually* read from disk, since IO is the most 
> important factor.

Well, you won't find that out from Postgres, since it has no idea
whether a read() request was satisfied from kernel disk cache or had
to actually go to disk.

You could turn on log_statement_stats to get some possibly-more-reliable
numbers from the kernel via getrusage(), but this will only tell you
about totals across the whole backend process, not per-relation ...
        regards, tom lane

Re: understanding pg_stat* numbers

From

Oleg Bartunov

Date:

26 March 2005, 21:35:48

On Sat, 26 Mar 2005, Tom Lane wrote:

> Oleg Bartunov <oleg@sai.msu.su> writes:
>> I'm looking for blocks *actually* read from disk, since IO is the most
>> important factor.
>
> Well, you won't find that out from Postgres, since it has no idea
> whether a read() request was satisfied from kernel disk cache or had
> to actually go to disk.

so, what's the usefulness of pg_stat* ?

From my experiments I feel heap_blks_read is the table blocks read from disk, 
well, sort of, and heap_blks_hit - from buffer. At least, I observed when I 
repeat the same query several times (heap_blks_read+heap_blks_hit) doesn't 
changed while more blocks come from heap_blks_hit.

>
> You could turn on log_statement_stats to get some possibly-more-reliable
> numbers from the kernel via getrusage(), but this will only tell you
> about totals across the whole backend process, not per-relation ...
>

Hmm, it's impossible to do researching :( We have so many parameters and
almost no reliable stats. How do you believe you did a good choice ?
It's common in db worlds to have IO statistics (timings are not important)
to research algorithms and implementation.

>             regards, tom lane
>
    Regards,        Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

Re: understanding pg_stat* numbers

From

"Qingqing Zhou"

Date:

31 March 2005, 15:33:38

"Oleg Bartunov" <oleg@sai.msu.su> writes
> From my experiments I feel heap_blks_read is the table blocks read from
disk,
> well, sort of, and heap_blks_hit - from buffer. At least, I observed when
I
> repeat the same query several times (heap_blks_read+heap_blks_hit) doesn't
> changed while more blocks come from heap_blks_hit.
>

PG relies on the OS to schedule the *real* IO you mean. So heap_blks_read
may come from OS kernel buffer, or from disk. You can minimize/disable your
OS file buffers (if it supports this function) or change the mdopen() using
O_DIRECT to make the heap_blks_read approaches the *real* IO if you want -
but for sure this hurts performance.

> It's common in db worlds to have IO statistics (timings are not important)
> to research algorithms and implementation.
>

The *real* IO you mean might be an elegant choice in academic, but is a
subjective matter in practice. For example, we can only assume statistically
N random IO costs the same time with another N random IO, because disk
header has its own scheduling logic controlled by the layout of these
blocks, etc, which you can't decide. Not to speak of the IOs in concurrent
environment.  If so, use heap_blks_read/write is already good enough?

Regards,
Qingqing

Re: understanding pg_stat* numbers

From

Oleg Bartunov

Date:

01 April 2005, 08:55:01

Qingqing,

Thanks for the info, I do know all issues you mentioned. I want to know
if there is any possibility to research performance problem in PostgreSQL
without voodoo techiques, at least at the level of pg itself. At this aspect, 
I could answer myself: heap_blks_read is the # of blocks
'ordered' to read from disk and heap_blks_hit is the # of blocks
read from pg buffers.

One mystery remains, why stats show heap_blks_read > 0 for indexed search ?        select 1 from foo where id=5
I did pg_stat_reset() before run query.

Oleg

On Thu, 31 Mar 2005, Qingqing Zhou wrote:

>
> "Oleg Bartunov" <oleg@sai.msu.su> writes
>> From my experiments I feel heap_blks_read is the table blocks read from
> disk,
>> well, sort of, and heap_blks_hit - from buffer. At least, I observed when
> I
>> repeat the same query several times (heap_blks_read+heap_blks_hit) doesn't
>> changed while more blocks come from heap_blks_hit.
>>
>
> PG relies on the OS to schedule the *real* IO you mean. So heap_blks_read
> may come from OS kernel buffer, or from disk. You can minimize/disable your
> OS file buffers (if it supports this function) or change the mdopen() using
> O_DIRECT to make the heap_blks_read approaches the *real* IO if you want -
> but for sure this hurts performance.
>
>> It's common in db worlds to have IO statistics (timings are not important)
>> to research algorithms and implementation.
>>
>
> The *real* IO you mean might be an elegant choice in academic, but is a
> subjective matter in practice. For example, we can only assume statistically
> N random IO costs the same time with another N random IO, because disk
> header has its own scheduling logic controlled by the layout of these
> blocks, etc, which you can't decide. Not to speak of the IOs in concurrent
> environment.  If so, use heap_blks_read/write is already good enough?
>
> Regards,
> Qingqing
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>
        Regards,            Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

Re: understanding pg_stat* numbers

From

"Qingqing Zhou"

Date:

01 April 2005, 10:37:40

"Oleg Bartunov" <oleg@sai.msu.su> writes
> One mystery remains, why stats show heap_blks_read > 0 for indexed search
?
>      select 1 from foo where id=5
> I did pg_stat_reset() before run query.
>

There is no clustered index in PG so far, so all the data item has to be
stay in the heap. In brief, the executor has to check the visibility of each
row in the heap pointed by the index. For performance consideration, if one
row is invisible("killed"), PG could remember its status it in the index
item identifier, so next time it just ignore it.

Regards,
Qingqing