Thread: understanding pg_stat* numbers
Hi there, I'm investigating one performance issue with tsearch2 index and trying to interperet io statiscs from pg_statio_user_tables, pg_stat_user_tables. But from documentation it's not clear what numbers I shoud take into account and I'm a bit confused :) I'm looking for blocks *actually* read from disk, since IO is the most important factor. I reseted stats and run my query and then obtained statistics: =# select pg_stat_reset(); =# explain analyze select 1 from message_parts where message_parts.index_fts @@ '\'star\''; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------- IndexScan using a_gist_key on message_parts (cost=0.00..1381.92 rows=469 width=0) (actual time=0.785..1236.086 rows=5142loops=1) Index Cond: (index_fts @@ '\'star\''::tsquery) Total runtime: 1240.274 ms (3 rows) =# select 'StatB:',heap_blks_read,heap_blks_hit,idx_blks_read, idx_blks_hit from pg_statio_user_tables where relname='message_parts';?column? | heap_blks_read | heap_blks_hit | idx_blks_read | idx_blks_hit ----------+----------------+---------------+---------------+-------------- StatB: | 1888 | 1700 | 1056 | 7226 (1 row) =# select 'StatR:',seq_scan,seq_tup_read,idx_scan,idx_tup_fetch from pg_stat_user_tables where relname='message_parts'; ?column?| seq_scan | seq_tup_read | idx_scan | idx_tup_fetch ----------+----------+--------------+----------+--------------- StatR: | 0 | 0 | 1 | 5939 (1 row) From documentation: the total number of disk blocks read from table - 1888 the number of buffer hits from table - 1700 the numbers of disk blocks read from indices - 1056 the number of buffer hits from indices - 7226 total numbers of rows returned by index scan - 5939 So, the total number of table blocks read is (1888+1700), and index blocks is (1056+7226) ? Or from 1888 table blocks read there were 1700 blocks already in buffer, but then I dont' understand index stats. Since disk io is the most important performance factor, should I look mostly on heap_blks_read and idx_blks_read ? My query returns 5142 rows, while I see from idx_tup_fetch that index returns 5939 rows. So, does it means that 5939 table rows was actually read from disk and checked for lossines (index is lossy) and 797 hits was actually false drops ? Regards, Oleg _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
Oleg Bartunov <oleg@sai.msu.su> writes: > I'm looking for blocks *actually* read from disk, since IO is the most > important factor. Well, you won't find that out from Postgres, since it has no idea whether a read() request was satisfied from kernel disk cache or had to actually go to disk. You could turn on log_statement_stats to get some possibly-more-reliable numbers from the kernel via getrusage(), but this will only tell you about totals across the whole backend process, not per-relation ... regards, tom lane
On Sat, 26 Mar 2005, Tom Lane wrote: > Oleg Bartunov <oleg@sai.msu.su> writes: >> I'm looking for blocks *actually* read from disk, since IO is the most >> important factor. > > Well, you won't find that out from Postgres, since it has no idea > whether a read() request was satisfied from kernel disk cache or had > to actually go to disk. so, what's the usefulness of pg_stat* ? From my experiments I feel heap_blks_read is the table blocks read from disk, well, sort of, and heap_blks_hit - from buffer. At least, I observed when I repeat the same query several times (heap_blks_read+heap_blks_hit) doesn't changed while more blocks come from heap_blks_hit. > > You could turn on log_statement_stats to get some possibly-more-reliable > numbers from the kernel via getrusage(), but this will only tell you > about totals across the whole backend process, not per-relation ... > Hmm, it's impossible to do researching :( We have so many parameters and almost no reliable stats. How do you believe you did a good choice ? It's common in db worlds to have IO statistics (timings are not important) to research algorithms and implementation. > regards, tom lane > Regards, Oleg _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
"Oleg Bartunov" <oleg@sai.msu.su> writes > From my experiments I feel heap_blks_read is the table blocks read from disk, > well, sort of, and heap_blks_hit - from buffer. At least, I observed when I > repeat the same query several times (heap_blks_read+heap_blks_hit) doesn't > changed while more blocks come from heap_blks_hit. > PG relies on the OS to schedule the *real* IO you mean. So heap_blks_read may come from OS kernel buffer, or from disk. You can minimize/disable your OS file buffers (if it supports this function) or change the mdopen() using O_DIRECT to make the heap_blks_read approaches the *real* IO if you want - but for sure this hurts performance. > It's common in db worlds to have IO statistics (timings are not important) > to research algorithms and implementation. > The *real* IO you mean might be an elegant choice in academic, but is a subjective matter in practice. For example, we can only assume statistically N random IO costs the same time with another N random IO, because disk header has its own scheduling logic controlled by the layout of these blocks, etc, which you can't decide. Not to speak of the IOs in concurrent environment. If so, use heap_blks_read/write is already good enough? Regards, Qingqing
Qingqing, Thanks for the info, I do know all issues you mentioned. I want to know if there is any possibility to research performance problem in PostgreSQL without voodoo techiques, at least at the level of pg itself. At this aspect, I could answer myself: heap_blks_read is the # of blocks 'ordered' to read from disk and heap_blks_hit is the # of blocks read from pg buffers. One mystery remains, why stats show heap_blks_read > 0 for indexed search ? select 1 from foo where id=5 I did pg_stat_reset() before run query. Oleg On Thu, 31 Mar 2005, Qingqing Zhou wrote: > > "Oleg Bartunov" <oleg@sai.msu.su> writes >> From my experiments I feel heap_blks_read is the table blocks read from > disk, >> well, sort of, and heap_blks_hit - from buffer. At least, I observed when > I >> repeat the same query several times (heap_blks_read+heap_blks_hit) doesn't >> changed while more blocks come from heap_blks_hit. >> > > PG relies on the OS to schedule the *real* IO you mean. So heap_blks_read > may come from OS kernel buffer, or from disk. You can minimize/disable your > OS file buffers (if it supports this function) or change the mdopen() using > O_DIRECT to make the heap_blks_read approaches the *real* IO if you want - > but for sure this hurts performance. > >> It's common in db worlds to have IO statistics (timings are not important) >> to research algorithms and implementation. >> > > The *real* IO you mean might be an elegant choice in academic, but is a > subjective matter in practice. For example, we can only assume statistically > N random IO costs the same time with another N random IO, because disk > header has its own scheduling logic controlled by the layout of these > blocks, etc, which you can't decide. Not to speak of the IOs in concurrent > environment. If so, use heap_blks_read/write is already good enough? > > Regards, > Qingqing > > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org > Regards, Oleg _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
"Oleg Bartunov" <oleg@sai.msu.su> writes > One mystery remains, why stats show heap_blks_read > 0 for indexed search ? > select 1 from foo where id=5 > I did pg_stat_reset() before run query. > There is no clustered index in PG so far, so all the data item has to be stay in the heap. In brief, the executor has to check the visibility of each row in the heap pointed by the index. For performance consideration, if one row is invisible("killed"), PG could remember its status it in the index item identifier, so next time it just ignore it. Regards, Qingqing