On 09/12/2010 06:52 PM, adi hirschtein wrote:
> Hi,
> Using the catalog tables, is there any way to correlate session id/user
> id to which object (i.e. tables, indexes etc) it access and much how
> disk reads or I/O wait has been done against the objects.
> in general, I'd like to see which objects are being accessed by which
> user and the time/amount of I/O wait/reads.
There isn't really anything like that, no. You have pg_stat_activity,
but it's a pretty coarse tool.
The shared buffer cache and the use of things like synchronized
sequential scans means that it wouldn't even be possible to truly track
who's causing load reliably. As I understand it, if Joe and Fred both to
a "SELECT * FROM sometable", it's quite likely that only one of the
backends will actually appear to read anything from disk. Which one is
pretty much luck.
Adding something like this would require instrumenting not only the
backends' disk I/O code, but also their shared buffer access code. It'd
potentially add a lot of overhead, and I'm not sure the results would
mean very much because the caching effects would make "fred's backend
did X I/O operations" less meaningful. Fred's might just have been the
first that ran after Joe's giant seqscan cross join of two big tables
that forced everything else out of shared_buffers.
Even if you did have that instrumentation, you'd need OS-level
instrumentation (dtrace, prof, etc) to track the OS's buffer cache,
which PostgreSQL relies on heavily. Without that you can't tell the
difference between a query that caused I/O calls from postgresql but
could be satisfied by OS buffer cache and one that required expensive
physical disk I/O to satisfy.
Really, I don't know if it's realistic to do what you're suggesting
unless every user's data set is isolated from every other user's, in
which case you can maybe use OS-level tools like prof or DTrace to
achieve it if you set postgresql up to log whichstarts which backend pid.
If there's any significant overlap in the data sets used by the users
(if they use the same databases or schema) I wouldn't think you'd be
able to get any meaningful results.
--
Craig Ringer