Re: EXPLAIN ANALYZE printing logical and hardware I/O per-node - Mailing list pgsql-hackers

From: Neil Conway
Subject: Re: EXPLAIN ANALYZE printing logical and hardware I/O per-node
Date:
Msg-id: 1197695435.4932.7.camel@goldbach
In response to: EXPLAIN ANALYZE printing logical and hardware I/O per-node  (Gregory Stark <stark@enterprisedb.com>)
Responses: Re: EXPLAIN ANALYZE printing logical and hardware I/O per-node
           Re: EXPLAIN ANALYZE printing logical and hardware I/O per-node
List: pgsql-hackers
On Fri, 2007-12-14 at 15:47 +0000, Gregory Stark wrote:
> I've wanted for a long time to have EXPLAIN ANALYZE output per-node I/O usage.
> This would be especially useful if we could distinguish hardware versus
> logical I/O though. And I always thought that would be very hard.
> 
> My thought in the past was that we could do it on Solaris by having
> Postgres use DTrace directly via its (undocumented but existing) programmatic
> interface.
> 
> For other operating systems it was tempting to suggest just timing the read(2)
> call to see if it took too long to be a logical operation. The problem there
> is that gettimeofday would impose far too much overhead to make that practical
> (or even be precise enough to work properly).
> 
> But it occurred to me just now that the hardware instruction counter available
> on just about every platform would be good enough for a heuristic guess at
> whether the read(2) was cached.

I'm skeptical that this would be reliable enough to be of much use,
especially in the face of concurrent, unpredictable activity on a busy
system. I agree the information itself would be valuable, though.
Perhaps a useful first step would be to teach EXPLAIN ANALYZE to report
the number of logical and physical I/Os from Postgres' perspective (i.e.
physical I/O just means we need to go to the kernel).
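
To sketch what I mean (all of the names here are hypothetical, this is not
the actual bufmgr code): every block request a node makes is a logical I/O,
and only the requests that miss the buffer pool and fall through to read(2)
are physical I/O. Roughly:

    /*
     * Sketch of per-node logical vs. physical I/O counting. The buffer pool
     * is faked with a tiny direct-mapped cache; in the real thing these
     * counters would live in the executor's per-node instrumentation.
     */
    #include <stdbool.h>
    #include <stdio.h>

    #define FAKE_POOL_SIZE 4

    typedef struct NodeIOCounters
    {
        long    logical_ios;    /* every block the node asked for */
        long    physical_ios;   /* blocks we had to ask the kernel for */
    } NodeIOCounters;

    /* toy stand-in for shared_buffers */
    static long fake_pool[FAKE_POOL_SIZE] = {-1, -1, -1, -1};

    static bool
    buffer_pool_lookup(long blockno)
    {
        return fake_pool[blockno % FAKE_POOL_SIZE] == blockno;
    }

    static void
    read_block(NodeIOCounters *io, long blockno)
    {
        io->logical_ios++;          /* logical I/O: the node wanted this block */
        if (!buffer_pool_lookup(blockno))
        {
            io->physical_ios++;     /* physical I/O from our perspective: ask the kernel */
            fake_pool[blockno % FAKE_POOL_SIZE] = blockno;  /* pretend read(2) filled it */
        }
    }

    int
    main(void)
    {
        NodeIOCounters io = {0, 0};
        long    blocks[] = {1, 2, 1, 3, 2, 1};  /* repeats hit the pool, new blocks miss */

        for (int i = 0; i < 6; i++)
            read_block(&io, blocks[i]);
        printf("logical-I/O=%ld physical-I/O=%ld\n", io.logical_ios, io.physical_ios);
        return 0;
    }

The real work is plumbing those two counters through the executor so EXPLAIN
ANALYZE can print them per node, but none of that requires guessing about the
kernel cache.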

>  The problem generally with using the hardware
> instruction counter is that it's not necessarily in sync between processors
> and might therefore run backwards or skip time forwards. This is a problem for
> profiling but if all we care about is a boolean guess at whether the request
> was satisfied quickly from cache then any such skipping forward or backward
> would represent a context switch which we could just toss in the hardware i/o
> bucket. It doesn't matter exactly how long the hardware i/o took, only that
> there was one.
> 
> To that end I would love to see something like:
> 
>                                                  QUERY PLAN
> -------------------------------------------------------------------------------------------------------------
>  Bitmap Heap Scan on h  (cost=8.52..16.45 rows=2 width=512) (actual time=78.926..87.708 rows=2 loops=1 logical-I/O=2 physical-I/O=1)
>    Recheck Cond: (i = ANY ('{100,1000}'::integer[]))
>    ->  Bitmap Index Scan on hi  (cost=0.00..8.52 rows=2 width=0) (actual time=74.539..74.539 rows=2 loops=1 logical-I/O=2 physical-I/O=2)
>          Index Cond: (i = ANY ('{100,1000}'::integer[]))
>  Total runtime: 87.820 ms
> 
> 
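
For the record, here is roughly the heuristic I understand Greg to be
proposing: bracket the read(2) with the timestamp counter and call anything
over some cycle threshold a physical I/O. This is only a sketch (x86-only,
and CACHED_CYCLE_LIMIT is a number I made up), but it does show why TSC skew
isn't fatal for a boolean guess: a counter that jumps or runs backwards just
produces an outsized delta, which falls into the physical bucket anyway.

    /*
     * Sketch: classify each read(2) as "cached" or "physical" by timing it
     * with the TSC. x86-only; the threshold is an arbitrary placeholder.
     */
    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <x86intrin.h>

    #define BLCKSZ 8192
    #define CACHED_CYCLE_LIMIT 100000ULL    /* made-up: above this, assume real I/O */

    static long logical_ios = 0;
    static long physical_ios = 0;

    static ssize_t
    timed_read(int fd, void *buf, size_t len)
    {
        unsigned long long start = __rdtsc();
        ssize_t nread = read(fd, buf, len);
        unsigned long long elapsed = __rdtsc() - start;

        logical_ios++;
        /*
         * A genuinely slow read, a context switch, or a TSC that skipped or
         * ran backwards all produce a large delta; we only want a yes/no
         * guess, so all of them get counted as physical I/O.
         */
        if (elapsed > CACHED_CYCLE_LIMIT)
            physical_ios++;
        return nread;
    }

    int
    main(int argc, char **argv)
    {
        char    buf[BLCKSZ];
        int     fd = open(argc > 1 ? argv[1] : "/etc/hosts", O_RDONLY);

        if (fd < 0)
            return 1;
        while (timed_read(fd, buf, sizeof(buf)) > 0)
            ;
        close(fd);
        printf("logical-I/O=%ld physical-I/O=%ld\n", logical_ios, physical_ios);
        return 0;
    }

That doesn't change my concern about how noisy the threshold would be on a
busy box, but as a heuristic it is at least cheap.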


