Re: Showing I/O timings spent reading/writing temp buffers in EXPLAIN - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: Showing I/O timings spent reading/writing temp buffers in EXPLAIN
Date
Msg-id 20220407075545.y3quyzy4p6mf6im4@jrouhaud
Whole thread Raw
In response to Re: Showing I/O timings spent reading/writing temp buffers in EXPLAIN  (Julien Rouhaud <rjuju123@gmail.com>)
Responses Re: Showing I/O timings spent reading/writing temp buffers in EXPLAIN  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
On Thu, Apr 07, 2022 at 04:24:54PM +0900, Michael Paquier wrote:
> On Thu, Apr 07, 2022 at 03:14:01PM +0800, Julien Rouhaud wrote:
> > Sure, but gettimeofday() has been implemented in vDSO for quite some time on
> > most platforms, so it shouldn't hurt that much on mainstream platforms
> > especially compared to the cost of whatever operation is actually using that
> > temporary file.
> > 
> > I don't think that having an extra GUC for temp IO is sensible, if that's why
> > you're suggesting?  Or are you just asking to do some benchmarking on some
> > platform where getting the time is known to be slow (Windows?).
> 
> I am asking about the latter, but the former could be one solution if
> the latter proves to be a problem, and this has not been discussed on
> the thread yet.  So, with some kind of worst-case scenario, how much
> worse the performance gets once you add those extra calls when
> compared to HEAD?  I think that we'd better be careful with any
> additions of INSTR_TIME_SET_CURRENT().

I just did a quick test on my linux box, using this data:
CREATE TABLE tt AS select generate_series(1, 10000) id;
VACUUM ANALYZE tt;

and this scenario:
SET work_mem TO '64kB';
SELECT count(*) FROM (SELECT id FROM tt ORDER BY id) s;

which yields this plan:

                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1349.39..1349.40 rows=1 width=8) (actual time=5.417..5.417 rows=1 loops=1)
   ->  Sort  (cost=1199.39..1224.39 rows=10000 width=4) (actual time=2.910..4.422 rows=10000 loops=1)
         Sort Key: tt.id
         Sort Method: external merge  Disk: 144kB
         ->  Seq Scan on tt  (cost=0.00..145.00 rows=10000 width=4) (actual time=0.008..1.239 rows=10000 loops=1)
 Planning Time: 0.405 ms
 Execution Time: 5.524 ms

So maybe not the worst that could be tested, but probably bad enough for this
patch.

I ran that with pgbench, 4 clients (I have 4 cores) for 30 seconds, 3 times.

Comparing master and this patch with track_io_timing activated, I see a 0.95%
overhead, with a 2.6% noise level.



pgsql-hackers by date:

Previous
From: Kyotaro Horiguchi
Date:
Subject: Re: BufferAlloc: don't take two simultaneous locks
Next
From: Kyotaro Horiguchi
Date:
Subject: Re: shared-memory based stats collector - v70