Re: EXPLAIN BUFFERS - Mailing list pgsql-hackers

From Takahiro Itagaki
Subject Re: EXPLAIN BUFFERS
Date
Msg-id 20091214120006.8A81.52131E4D@oss.ntt.co.jp
Whole thread Raw
In response to Re: EXPLAIN BUFFERS  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: EXPLAIN BUFFERS  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: EXPLAIN BUFFERS  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> wrote:

> I have a question about the comment in InstrStopNode(), which reads:
> "Adds delta of buffer usage to node's count and resets counter to
> start so that the counters are not double counted by parent nodes."
> It then calls BufferUsageAccumDiff(), but that function doesn't
> actually "reset" anything, so it seems like the comment is wrong.

Oops, it's wrong. It just does "Adds delta of buffer usage to node's count."

> Two other thoughts:
> 
> 1. It doesn't appear that there is any provision to ever zero
> pgBufferUsage.  Shouldn't we do this, say, once per explain, just to
> avoid the possibility of overflowing the counters?

I think the overflowing will not be a problem because we only use
the differences of values. The delta is always corrent unless we use
2^32 buffer accesses during one execution of a node.

> 2. We seem to do all the work associated with pgBufferUsage even when
> the "buffers" option is not passed to explain.  The overhead of
> incrementing the counters is probably negligible (and we were paying
> the equivalent overhead before anyway) but I'm not sure whether saving
> the starting counters and accumulating the deltas might be enough to
> slow down EXPLAIN ANALYZE.  That's sorta slow already so I'd hate to
> whack it any more - have you benchmarked this at all?

There are 5% of overheads in the worst cases. The difference will be
little if we have more complex operations or some disk I/Os.

Adding Instrumentation->count_bufusage flag could reduce the overheads.   if (instr->count_bufusage)
BufferUsageAccumDiff(...)

Should I add countBufferUsage boolean arguments to all places
doInstrument booleans are currently used? This requires several
minor modifications of codes in many places.

[without patch]
=# EXPLAIN (ANALYZE) SELECT * FROM pgbench_accounts;                                                          QUERY
PLAN

--------------------------------------------------------------------------------------------------------------------------------Seq
Scanon pgbench_accounts  (cost=0.00..263935.00 rows=10000000 width=97) (actual time=0.003..571.794 rows=10000000
loops=1)Totalruntime: 899.427 ms
 

[with patch]
=# EXPLAIN (ANALYZE) SELECT * FROM pgbench_accounts;                                                          QUERY
PLAN

--------------------------------------------------------------------------------------------------------------------------------Seq
Scanon pgbench_accounts  (cost=0.00..263935.00 rows=10000000 width=97) (actual time=0.003..585.885 rows=10000000
loops=1)Totalruntime: 955.280 ms
 

- shared_buffers = 1500MB
- pgbench -i -s100
- Read all pages in the pgbench_accounts into shared buffers before runs.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center




pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Streaming replication and non-blocking I/O
Next
From: Tom Lane
Date:
Subject: Re: EXPLAIN BUFFERS