Re: EXPLAIN ANALYZE printing logical and hardware I/O per-node - Mailing list pgsql-hackers
From | Neil Conway |
---|---|
Subject | Re: EXPLAIN ANALYZE printing logical and hardware I/O per-node |
Date | |
Msg-id | 1197695435.4932.7.camel@goldbach Whole thread Raw |
In response to | EXPLAIN ANALYZE printing logical and hardware I/O per-node (Gregory Stark <stark@enterprisedb.com>) |
Responses |
Re: EXPLAIN ANALYZE printing logical and hardware I/O per-node
Re: EXPLAIN ANALYZE printing logical and hardware I/O per-node |
List | pgsql-hackers |
On Fri, 2007-12-14 at 15:47 +0000, Gregory Stark wrote: > I've wanted for a long time to have EXPLAIN ANALYZE output per-node I/O usage. > This would be especially useful if we could distinguish hardware versus > logical I/O though. And I always thought that would be very hard. > > My thought in the past was that would could do it on Solaris by having > Postgres use DTrace directly via its (undocumented but existing) programmatic > interface. > > For other operating systems it was tempting to suggest just timing the read(2) > call to see if it took too long to be a logical operation. The problem there > is that gettimeofday would impose far too much overhead to make that practical > (or even be precise enough to work properly). > > But it occurred to me just now that the hardware instruction counter available > on just about every platform would be good enough for a heuristic guess at > whether the read(2) was cached. I'm skeptical that this would be reliable enough to be very useful, especially in the face of concurrent, unpredictable system activity on a busy system. I agree that it would be useful information, though. Perhaps a useful first step would be to teach EXPLAIN ANALYZE to report the number of logical and physical I/Os from Postgres' perspective (i.e. physical I/O just means we need to go to the kernel). > The problem generally with using the hardware > instruction counter is that it's not necessarily in sync between processors > and might therefore run backwards or skip time forwards. This is a problem for > profiling but if all we care about is a boolean guess at whether the request > was satisfied quickly from cache then any such skipping forward or backward > would represent a context switch which we could just toss in the hardware i/o > bucket. It doesn't matter exactly how long the hardware i/o took, only that > there was one. > > To that end I would love to see something like: > > QUERY PLAN > ------------------------------------------------------------------------------------------------------------- > Bitmap Heap Scan on h (cost=8.52..16.45 rows=2 width=512) (actual time=78.926..87.708 rows=2 loops=1 logical-I/O=2 physical-I/O=1) > Recheck Cond: (i = ANY ('{100,1000}'::integer[])) > -> Bitmap Index Scan on hi (cost=0.00..8.52 rows=2 width=0) (actual time=74.539..74.539 rows=2 loops=1 logical-I/O=2physical-I/O=2)) > Index Cond: (i = ANY ('{100,1000}'::integer[])) > Total runtime: 87.820 ms > >
pgsql-hackers by date: