10x, for the reply.
I did read all the developer documentation and it got me close but not
close
enough. It turns out that on linux the io statistics per processes is
not
counted at all. There is however what is called laptop_mode where
you want to find unecessary disk activity to save batteries.
In order to activate the io stats per process you do
echo "1">/proc/sys/vm/block_dump and watch the dmesg.
Now, it's a big mess there so as soon as I have some time I plan to
remove
the messages from dmesg with some switch and instead count statistics
into /proc/<pid>/statio. naturally I want to count the postgres process
which
runs my query. I also need to disable the bgwriter in the conf files and
probably
the other stat which I won't need. btw, when I disabled the bgwriter on
windows,
I am not sure but I think its not completely disabled since I still see
io
writes in a different postgres process (which I must assume is the
bgwriter
since they don't have clear names In the task list in windows).
Anyway, in linux (and probably in windows) there is also a background
writer of the system. In linux 2.4 its called kupdated and it write out
dirtied
buffers it has read from the disk. I will also need to circumvent it
somehow
( I don't know if it can be simply disabled)
with maybe O_SYNC or/and O_DIRECT in the fd.c source of postgreSQL
where they do fileWrite and fileRead.
Also, when you think about it, it is useless for the kernel to second
guess
postgreSQL since the developers know what they want to achieve when
they cache disk blocks. Maybe I should also search for a way to tell
kupdated
to stop caching a given list of processes.
Regards,
tzahi.
> -----Original Message-----
> From: Michael Fuhr [mailto:mike@fuhr.org]
> Sent: Sunday, January 16, 2005 5:44 AM
> To: Tzahi Fadida
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] Is it possible to measure IO costs of
> a query in postgreSQL?
>
>
> On Sat, Jan 08, 2005 at 08:01:51PM +0200, Tzahi Fadida wrote:
>
> > I have been trying for a week now without success to
> discover if you
> > can measure the cost of a query (with my c function).
> EXPLAIN ANALYZE
> > seems to give you the actual time it took it to run but the "cost"
> > seems to be a fixed estimate number and not actual.
>
> "cost" is the planner's estimate of disk page fetches;
> "actual" is elapsed real time. See "Using EXPLAIN" in the
> "Performance Tips" chapter of the documentation.
>
> > I see in the code many times references to pgstats so
> > i figure there must be some kind of accounting.
> > Does anyone have a clue on this? its very important to me
> > to be able to measure scientifically the costs of a query.
>
> Have you looked at the "The Statistics Collector" section in
> the "Monitoring Database Activity" chapter? The statistics
> views won't show statistics for a particular query, but if
> the database is quiescent except for your activity then you
> might be able to measure queries' I/O costs by observing
> changes in the gathered statistics.
>
> --
> Michael Fuhr
> http://www.fuhr.org/~mfuhr/
>
>