Re: Script to compute random page cost - Mailing list pgsql-hackers

From Curt Sampson
Subject Re: Script to compute random page cost
Date
Msg-id Pine.NEB.4.44.0209111548560.24427-100000@angelic.cynic.net
In response to Re: Script to compute random page cost  (Mark Kirkwood <markir@slingshot.co.nz>)
List pgsql-hackers
On Wed, 11 Sep 2002, Mark Kirkwood wrote:

> Yes...and at the risk of being accused of marketing ;-) , that is
> exactly what the 3 programs in my archive do (see previous post for url) :

Hm, it appears we've both been working on something similar. However,
I've just released version 0.2 of randread, which has the following
features:
  - Written in C, uses read(2) and write(2), pretty much like postgres.

  - Reads or writes random blocks from a specified list of files,
    treated as a contiguous range of blocks, again like postgres. This
    allows you to do random reads from the actual postgres data files
    for a table, if you like.

  - You can specify the block size to use, and the number of reads to do.

  - Allows you to specify how many blocks you want to read before you
    start reading again at a new random location. (The default is 1.)
    This allows you to model various sequential and random read mixes.
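
In rough terms, the read pattern amounts to something like this (a
simplified sketch, not the actual randread source; single file,
hard-coded parameters, and one read(2) per block assumed):

    /* randread-like access pattern: seek to a random block, then read
     * a run of consecutive fixed-size blocks with read(2). */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/stat.h>
    #include <time.h>
    #include <unistd.h>

    int
    main(int argc, char **argv)
    {
        const size_t blocksize = 8192;  /* 8 KB, like postgres */
        const long count = 4096;        /* number of random reads */
        const long runlen = 1;          /* consecutive blocks per read */
        struct stat st;
        char *buf;
        long nblocks, i, j;
        int fd;

        if (argc != 2 || (fd = open(argv[1], O_RDONLY)) < 0 ||
            fstat(fd, &st) < 0) {
            fprintf(stderr, "usage: %s file\n", argv[0]);
            return 1;
        }
        nblocks = st.st_size / blocksize;
        if (nblocks < 1 || (buf = malloc(blocksize)) == NULL)
            return 1;
        srandom((unsigned)time(NULL));

        for (i = 0; i < count; i++) {
            /* pick a random starting block and seek there */
            long start = random() % nblocks;
            lseek(fd, (off_t)start * (off_t)blocksize, SEEK_SET);
            /* then read runlen consecutive blocks */
            for (j = 0; j < runlen && start + j < nblocks; j++)
                if (read(fd, buf, blocksize) != (ssize_t)blocksize)
                    break;
        }
        free(buf);
        close(fd);
        return 0;
    }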
 

If you want to do writes, I suggest you create your own set of files to
write, rather than destroying postgresql data. This can easily be done
with something like this Bourne shell script:
   for i in 1 2 3 4; do
       dd if=/dev/zero of=file.$i bs=1m count=1024
   done

However, it doesn't calculate the random vs. sequential ratio for you;
you've got to do that yourself. For example:

$ ./randread -l 512 -c 256 /u/cjs/z?
256 reads of 512 x 8.00 KB blocks (4096.00 KB) totalling 131072 blocks (1024.00 MB) from 524288 blocks (4092.00 MB) in 4 files.
256 reads in 36.101119 sec. (141019 usec/read, 7 reads/sec, 29045.53 KB/sec)

$ ./randread -c 4096 /u/cjs/z?
4096 reads of 1 x 8.00 KB blocks (8.00 KB) totalling 4096 blocks (32.00 MB) from 524288 blocks (4095.99 MB) in 4 files.
4096 reads in 34.274582 sec. (8367 usec/read, 120 reads/sec, 956.04 KB/sec)
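
The ratio is just the quotient of the two throughput figures from those
runs:

    29045.53 KB/sec (512 blocks per read)
    /  956.04 KB/sec (1 block per read)   ~=  30.4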

In this case, across 4 GB in 4 files on my 512 MB, 1.5 GHz Athlon
with an IBM 7200 RPM IDE drive, I read about 30 times faster doing
a full sequential read of the files than I do reading 32 MB randomly
from them. But because the data set is so much larger than RAM,
there's basically no buffer cache involved. If I do this on a single
512 MB file:

$ ./randread -c 4096 /u/cjs/z1:0-65536
4096 reads of 1 x 8.00 KB blocks (8.00 KB) totalling 4096 blocks (32.00 MB) from 65536 blocks (511.99 MB) in 1 files.
4096 reads in 28.064573 sec. (6851 usec/read, 146 reads/sec, 1167.59 KB/sec)

$ ./randread -l 65535 -c 1 /u/cjs/z1:0-65536
1 reads of 65535 x 8.00 KB blocks (524280.00 KB) totalling 65535 blocks (511.99 MB) from 65536 blocks (0.01 MB) in 1 files.
1 reads in 17.107867 sec. (17107867 usec/read, 0 reads/sec, 30645.55 KB/sec)

$ ./randread -c 4096 /u/cjs/z1:0-65536
4096 reads of 1 x 8.00 KB blocks (8.00 KB) totalling 4096 blocks (32.00 MB) from 65536 blocks (511.99 MB) in 1 files.
4096 reads in 19.413738 sec. (4739 usec/read, 215 reads/sec, 1687.88 KB/sec)

Well, there you see some of the buffer cache effect from starting
with about half the file in memory. If you want to see serious buffer
cache action, just use the first 128 MB of my first test file:

$ ./randread -c 4096 /u/cjs/z1:0-16536
4096 reads of 1 x 8.00 KB blocks (8.00 KB) totalling 4096 blocks (32.00 MB) from 16536 blocks (129.18 MB) in 1 files.
4096 reads in 20.220791 sec. (4936 usec/read, 204 reads/sec, 1620.51 KB/sec)

$ ./randread -l 16535 -c 1 /u/cjs/z1:0-16536
1 reads of 16535 x 8.00 KB blocks (132280.00 KB) totalling 16535 blocks (129.18 MB) from 16536 blocks (0.01 MB) in 1 files.
1 reads in 3.469231 sec. (3469231 usec/read, 0 reads/sec, 38129.49 KB/sec)

$  ./randread -l 16535 -c 64 /u/cjs/z1:0-16536
64 reads of 16535 x 8.00 KB blocks (132280.00 KB) totalling 1058240 blocks (8267.50 MB) from 16536 blocks (0.01 MB) in 1 files.
64 reads in 23.643026 sec. (369422 usec/read, 2 reads/sec, 358072.59 KB/sec)

For those last three runs, we're limited almost entirely by the
CPU, as there's not much disk I/O going on at all. The many-block
one is going to be slower because it's got to generate a lot more
random numbers and do a lot more lseek operations.
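
Assuming one read(2) call per 8 KB block (which is what the block
counts in the output suggest), the three runs work out roughly to:

    -c 4096        : 4096 random numbers, 4096 lseeks,    4096 reads,   32 MB
    -l 16535 -c 1  :    1 random number,     1 lseek,    16535 reads,  129 MB
    -l 16535 -c 64 :   64 random numbers,   64 lseeks,  1058240 reads, ~8.1 GB

which gives a feel for where the CPU time goes in each case.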

Anyway, looking at the real difference between truly sequential
and truly random reads on a large data set (30:1 or so), it looks
to me like people measuring much less than that are getting good
work out of their buffer cache. You've got to wonder if there's
some way to auto-tune for this sort of thing....

Anyway, feel free to download and play. If you want to work on the
program, I'm happy to give developer access on sourceforge.
   http://sourceforge.net/project/showfiles.php?group_id=55994

cjs
-- 
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light.  --XTC


