Thread: NDirectFileRead and Write

NDirectFileRead and Write

From
ITAGAKI Takahiro
Date:
Hello,

I'd like to use NDirectFileRead and NDirectFileWrite statistics counters
for counting reads and writes in BufFile. They are defined, but not used
now. BufFile is used for tuple sorting or materializing, so we could use
NDirectFileRead/Write to retrieve how many I/Os are done in temp tablespace.

We can get the values of NDirectFileRead/Write when we enable
log_statement_stats, log_parser_stats, log_planner_stats or
log_executor_stats. (They show always 0 for NDirectFile R/W now.)

I have a plan to store the values shown by log_xxx_stats into
pg_stat_statements, that is per-query statistics information collector
I'm developing. The combination of NDirectFile R/W and pg_stat_statements
have an advantage to detect which sql uses large tuple sorting.
And after we find such queries, we can turn trace_sort on and examine
how amount of work_mem is needed by such queries. We could save time
for tuning queries and reading server logs.



-- The patch would be very trivial ;-)

Index: src/backend/storage/file/buffile.c
===================================================================
--- src/backend/storage/file/buffile.c    (HEAD)
+++ src/backend/storage/file/buffile.c    (working copy)
@@ -238,6 +238,7 @@        file->nbytes = 0;    file->offsets[file->curFile] += file->nbytes;    /* we choose not to
advancecurOffset here */
 
+    NDirectFileRead++;}/*
@@ -300,6 +301,7 @@        file->offsets[file->curFile] += bytestowrite;        file->curOffset += bytestowrite;
wpos+= bytestowrite;
 
+        NDirectFileWrite++;    }    file->dirty = false;

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



Re: NDirectFileRead and Write

From
ITAGAKI Takahiro
Date:
Here is a patch to user NDirectFileRead/Write counters to get I/O counts
in BufFile module. We can see the counters when log_statement_stats is on.

The information is different from trace_sort; trace_sort shows used blocks
in external sort, and log_statement_stats shows how many I/Os are submitted
during sorts.

I wrote:
> I'd like to use NDirectFileRead and NDirectFileWrite statistics counters
> for counting reads and writes in BufFile. They are defined, but not used
> now. BufFile is used for tuple sorting or materializing, so we could use
> NDirectFileRead/Write to retrieve how many I/Os are done in temp tablespace.

=# SET client_min_messages = log;
=# SET trace_sort = on;
=# SET log_statement_stats = on;
=# EXPLAIN ANALYZE SELECT * FROM generate_series(1, 1000000) AS i ORDER BY i;
LOG:  begin tuple sort: nkeys = 1, workMem = 1024, randomAccess = f
LOG:  switching to external sort with 7 tapes: CPU 0.09s/0.26u sec elapsed 0.35 sec
LOG:  performsort starting: CPU 0.48s/1.68u sec elapsed 2.20 sec
LOG:  finished writing final run 1 to tape 0: CPU 0.48s/1.70u sec elapsed 2.21 sec
LOG:  performsort done: CPU 0.48s/1.70u sec elapsed 2.21 sec
LOG:  external sort ended, 2444 disk blocks used: CPU 0.79s/2.23u sec elapsed 3.06 sec
LOG:  QUERY STATISTICS
DETAIL:  ! system usage stats:
!       3.078000 elapsed 2.234375 user 0.812500 system sec
!       [3.328125 user 1.281250 sys total]
! buffer usage stats:
!       Shared blocks:          0 read,          0 written, buffer hit rate = 0.00%
!       Local  blocks:          0 read,          0 written, buffer hit rate = 0.00%
!       Direct blocks:       5375 read,       5374 written
                                                             QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=62.33..64.83 rows=1000 width=4) (actual time=2221.485..2743.831 rows=1000000 loops=1)
   Sort Key: i
   Sort Method:  external sort  Disk: 19552kB
   ->  Function Scan on generate_series i  (cost=0.00..12.50 rows=1000 width=4) (actual time=349.065..892.907
rows=1000000loops=1) 
 Total runtime: 3087.305 ms
(5 rows)

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


Attachment

Re: NDirectFileRead and Write

From
"Hitoshi Harada"
Date:
2008/8/5 ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp>:
> Here is a patch to user NDirectFileRead/Write counters to get I/O counts
> in BufFile module. We can see the counters when log_statement_stats is on.
>
> The information is different from trace_sort; trace_sort shows used blocks
> in external sort, and log_statement_stats shows how many I/Os are submitted
> during sorts.
>
> I wrote:
>> I'd like to use NDirectFileRead and NDirectFileWrite statistics counters
>> for counting reads and writes in BufFile. They are defined, but not used
>> now. BufFile is used for tuple sorting or materializing, so we could use
>> NDirectFileRead/Write to retrieve how many I/Os are done in temp tablespace.
>
> =# SET client_min_messages = log;
> =# SET trace_sort = on;
> =# SET log_statement_stats = on;
> =# EXPLAIN ANALYZE SELECT * FROM generate_series(1, 1000000) AS i ORDER BY i;
> LOG:  begin tuple sort: nkeys = 1, workMem = 1024, randomAccess = f
> LOG:  switching to external sort with 7 tapes: CPU 0.09s/0.26u sec elapsed 0.35 sec
> LOG:  performsort starting: CPU 0.48s/1.68u sec elapsed 2.20 sec
> LOG:  finished writing final run 1 to tape 0: CPU 0.48s/1.70u sec elapsed 2.21 sec
> LOG:  performsort done: CPU 0.48s/1.70u sec elapsed 2.21 sec
> LOG:  external sort ended, 2444 disk blocks used: CPU 0.79s/2.23u sec elapsed 3.06 sec
> LOG:  QUERY STATISTICS
> DETAIL:  ! system usage stats:
> !       3.078000 elapsed 2.234375 user 0.812500 system sec
> !       [3.328125 user 1.281250 sys total]
> ! buffer usage stats:
> !       Shared blocks:          0 read,          0 written, buffer hit rate = 0.00%
> !       Local  blocks:          0 read,          0 written, buffer hit rate = 0.00%
> !       Direct blocks:       5375 read,       5374 written
>                                                             QUERY PLAN
>
------------------------------------------------------------------------------------------------------------------------------------
>  Sort  (cost=62.33..64.83 rows=1000 width=4) (actual time=2221.485..2743.831 rows=1000000 loops=1)
>   Sort Key: i
>   Sort Method:  external sort  Disk: 19552kB
>   ->  Function Scan on generate_series i  (cost=0.00..12.50 rows=1000 width=4) (actual time=349.065..892.907
rows=1000000loops=1)
 
>  Total runtime: 3087.305 ms
> (5 rows)
>
> Regards,
> ---
> ITAGAKI Takahiro
> NTT Open Source Software Center
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>

Though I'm not quite good around buffers, I reviewed this patch and
found that it seems strange about the declaring position of the two
variables. In ResetBufferUsage() of bufmgr.c it says:

void
ResetBufferUsage(void)
{BufferHitCount = 0;ReadBufferCount = 0;BufferFlushCount = 0;LocalBufferHitCount = 0;ReadLocalBufferCount =
0;LocalBufferFlushCount= 0;NDirectFileRead = 0;NDirectFileWrite = 0;
 
}

so I guess all of these variables should be defined together but
actually you put the two in buffile.h while the others in
buf_iinnternals.h. Is there clear reason for that?

Regards,


-- 
Hitoshi Harada


Re: NDirectFileRead and Write

From
ITAGAKI Takahiro
Date:
"Hitoshi Harada" <umi.tanuki@gmail.com> wrote:

> so I guess all of these variables should be defined together but
> actually you put the two in buffile.h while the others in
> buf_internals.h. Is there clear reason for that?

That's because buffile.c includes buffile.h, but not buf_internals.h .
NDirectFileRead/Writes are counters for BufFile in the patch
and don't depend on bufmgr.

It might be cleaner if we had something like storage/bufstats.h
and declared all counter variables in it, but it seems to be an overkill.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center




Re: NDirectFileRead and Write

From
Tom Lane
Date:
ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
> Here is a patch to user NDirectFileRead/Write counters to get I/O counts
> in BufFile module. We can see the counters when log_statement_stats is on.

Couple thoughts here:

* Let's rename them BufFileReadCount and BufFileWriteCount to reflect
their actual purpose.

* If the extern's are moved to buffile.h, I think the definitions of the
variables themselves should move to buffile.c.  However, that would
imply including buffile.h in bufmgr.c which is a bit ugly from a
modularity standpoint.  In any case I agree that the current arrangement
with execdebug.h declaring variables defined in bufmgr.c is just weird.
        regards, tom lane


Re: NDirectFileRead and Write

From
ITAGAKI Takahiro
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> * Let's rename them BufFileReadCount and BufFileWriteCount to reflect
> their actual purpose.

It looks good. No one uses NDirectFileRead/Write variables now,
so we can rename it freely.

BTW, we have the word 'Direct' in ShowBufferUsage().
Can I keep it as-is?

ShowBufferUsage()
!   Shared blocks: ...
!   Local  blocks: ...
!   Direct blocks: <R> read, <W> written


> * In any case I agree that the current arrangement
> with execdebug.h declaring variables defined in bufmgr.c is just weird.

Ok.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center




Re: NDirectFileRead and Write

From
Tom Lane
Date:
ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
> BTW, we have the word 'Direct' in ShowBufferUsage().
> Can I keep it as-is?

> ShowBufferUsage()
> !   Shared blocks: ...
> !   Local  blocks: ...
> !   Direct blocks: <R> read, <W> written

Good point.  Seems like it should be changed, but I am not sure to what.
Ideas anyone?
        regards, tom lane


Re: NDirectFileRead and Write

From
ITAGAKI Takahiro
Date:
Here is a revised patch to renewal NDirectFileRead/Write.

> Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > * Let's rename them BufFileReadCount and BufFileWriteCount to reflect
> > their actual purpose.
> > * In any case I agree that the current arrangement
> > with execdebug.h declaring variables defined in bufmgr.c is just weird.

- NDirectFile{Read|Write} are renamed to BufFile{Read|Write}Count.
- They are still declared in execdebug.h and buffile.c includes it.


I did not touch messages in ShowBufferUsage() in the patch.
I think it still has meaning because BufFile counters are
kinds of direct block access.

> ShowBufferUsage()
> !   Shared blocks: ...
> !   Local  blocks: ...
> !   Direct blocks: <R> read, <W> written

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


Attachment

Re: NDirectFileRead and Write

From
Heikki Linnakangas
Date:
Tom Lane wrote:
> ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
>> BTW, we have the word 'Direct' in ShowBufferUsage().
>> Can I keep it as-is?
> 
>> ShowBufferUsage()
>> !   Shared blocks: ...
>> !   Local  blocks: ...
>> !   Direct blocks: <R> read, <W> written
> 
> Good point.  Seems like it should be changed, but I am not sure to what.
> Ideas anyone?

How about "Temp blocks"? It might be slightly confusing that access to 
temporary tables shows up in "Local blocks", rather than "Temp blocks", 
though. But then again, temp_tablespaces already controls where the 
files are created, among other things, so there's some precedence to 
calling those things "temp".

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: NDirectFileRead and Write

From
Tom Lane
Date:
ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
> - NDirectFile{Read|Write} are renamed to BufFile{Read|Write}Count.
> - They are still declared in execdebug.h and buffile.c includes it.

After some thought I concluded that the least ugly way to do this
was to declare and define the variables in the same places as the
other counters that are printed by ShowBufferUsage.  It's not any
worse from a modularity standpoint than the other alternatives we
considered, and it seems to satisfy the principle of least
surprise a little better.  Committed it that way.

> I did not touch messages in ShowBufferUsage() in the patch.

I didn't change the message either.  Heikki's suggestion of "temp
blocks" doesn't seem very good to me because of the likelihood
of confusion with temp tables.  I don't have a better alternative
offhand.
        regards, tom lane