On Mon, Feb 15, 2010 at 6:05 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Well there was a 30+ message thread almost a week ago where there
>> seemed to be some contention over the issue of whether the numbers
>> should be averages or totals. But there was no dispute over the
>> idea of printing in memory units instead of blocks.
>
> Hmm.... yeah, I guess it wasn't discussed. I'm still not sure it's an
> improvement. If a query hit one buffer, is that really the same as
> saying it hit 8kB?
Well, you can always convert between them. The only time the
distinction would matter is if you're sure it's random I/O and you're
concerned with the number of IOPS. However, it's impossible to tell
from this output how many of these buffers were read sequentially and
how many randomly. Even if the access is sequential, you don't know
how much was read between interruptions to handle the inner side of a
join, or whether the cached blocks were interspersed throughout the
file or were all at the beginning or end.
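For what it's worth, the conversion itself is trivial. A minimal
sketch (the 1234 is a made-up block count; block_size is the server's
read-only GUC, normally 8192):

    -- convert a block count reported by EXPLAIN (BUFFERS) into bytes
    SELECT 1234 * current_setting('block_size')::bigint AS bytes;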
I think we should provide better tools to measure these things
directly rather than force users to make deductions from buffer
counts. I'm still excited about using DTrace to get real counts of
IOPS, seeks, etc.
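As a sketch of what I have in mind (untested, and assuming a platform
where DTrace's io provider is available), something like this counts
the physical I/O requests the backends actually issue instead of
inferring them from buffer counts:

    # count block I/O requests from postgres backends;
    # io:::start fires once per physical I/O request
    dtrace -n 'io:::start /execname == "postgres"/ { @iops = count(); }'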
> To me, buffers seem like discrete (and unitless)
> entities, and we handle them that way elsewhere in the system (see,
> e.g. pg_stat_database, pg_statio_all_tables). I don't know that it's
> a good idea to display that same information here in a different
> format.
>...
> I definitely do not want to do anything that loses accuracy. This is
> probably accurate enough for most uses, but it's still not as accurate
> as just printing the raw numbers.
I left the XML/JSON output in terms of blocks on the theory that tools
reading this data can look up the block size and convert however they
want. Likewise, the pg_stat* functions are for extracting raw data;
any tool or query that extracts this data can present it in whatever
friendly form it wants.
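For example, a monitoring query could do the conversion itself; a
sketch against pg_statio_user_tables:

    -- present the raw block counters in human-readable units
    SELECT relname,
           pg_size_pretty(heap_blks_read *
                          current_setting('block_size')::bigint) AS read,
           pg_size_pretty(heap_blks_hit *
                          current_setting('block_size')::bigint) AS hit
      FROM pg_statio_user_tables;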
Incidentally, looking at the pg_size_pretty() functions reminds me
that these counters are all 32-bit. That means they'll do funny things
if you have a query which accesses over 16TB of data (a signed 32-bit
counter of 8kB blocks wraps at 2^31 * 8kB = 16TB)... I suspect this
should probably be changed, though I'm feeling lazy about it unless
someone else wants to push me to do it now.
--
greg