Re: EXPLAIN: showing ReadStream / prefetch stats - Mailing list pgsql-hackers

From Lukas Fittl
Subject Re: EXPLAIN: showing ReadStream / prefetch stats
Date
Msg-id CAP53Pkzd9P=k_DsxU4rGoCEO-26rr9NQcS8co2y4HXB7TyiMbA@mail.gmail.com
Whole thread Raw
In response to Re: EXPLAIN: showing ReadStream / prefetch stats  (Tomas Vondra <tomas@vondra.me>)
Responses Re: EXPLAIN: showing ReadStream / prefetch stats
List pgsql-hackers
On Wed, Mar 18, 2026 at 3:41 PM Tomas Vondra <tomas@vondra.me> wrote:
> The 0003 also changes the EXPLAIN to enable IO by default, just like we
> do for BUFFERS. It seems like a reasonable precedent to me.

One side effect of that is that the tests now fail for me locally,
because the specific values are system-dependent. Attached a patch
(nocfbot-0002) that fixed that for me.

There is one detail maybe calling out specifically on JSON output:
Currently Postgres always emits all fields in JSON output, even if
they are zero. The code that you have in v3 skips the "I/O" group when
the value is zero, which doesn't work well with how current regression
tests are written. I'm definitely not a fan of the unnecessary
verbosity of JSON EXPLAIN output, but I'd suggest we don't break with
the tradition here, and instead always output the "I/O" group in
non-text formats. Also attached a patch for that (nocfbot-0001).

Overall I think the abstraction here seems reasonable if we're
primarily focused on getting the per-node instrumentation taken care
of.

That said, two thoughts on an example EXPLAIN output I just ran:

1) I do wonder if its a bit confusing that we propagate I/O timings up
the EXPLAIN tree, but not the "I/O" information - I realize fixing
that would be a bit involved though, e.g. we'd have to invent
accumulation logic in explain.c. It'd also maybe make people thing
this covers things like temporary file reads/etc.

2) The ordering of "I/O Timings" in relation to "I/O" feels off to me
(since they're not next to each other) - maybe we should re-order I/O
Timings to come before Buffers in show_buffer_usage to address that?

EXPLAIN (ANALYZE) SELECT COUNT(*) FROM t;

                                                       QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=218331.00..218331.01 rows=1 width=8) (actual
time=563.437..563.437 rows=1.00 loops=1)
   Buffers: shared hit=15806 read=41274
   I/O Timings: shared read=1.180
   ->  Seq Scan on t  (cost=0.00..186080.80 rows=12900080 width=0)
(actual time=0.335..306.737 rows=12900005.00 loops=1)
         Prefetch: avg=61.517 max=91 capacity=94
         I/O: stalls=7 size=14.825 inprogress=5.321
         Buffers: shared hit=15806 read=41274
         I/O Timings: shared read=1.180
 Planning Time: 0.101 ms
 Execution Time: 563.471 ms
(10 rows)

Thanks,
Lukas

--
Lukas Fittl

Attachment

pgsql-hackers by date:

Previous
From: Jianghua Yang
Date:
Subject: [PATCH] Fix build failure on macOS 26.2 SDK due to missing nl_langinfo_l declaration
Next
From: Tom Lane
Date:
Subject: Re: [PATCH] Fix fd leak in pg_dump compression backends when dup()+fdopen() fails