Thread: Buffer usage detailed by RelKind in EXPLAIN ANALYZE BUFFERS
Hi hackers!

I was asked to prototype a feature that helps to distinguish shared buffer usage between index reads and heap reads. Practically it looks like this:

# explain (analyze,verbose,buffers) select nextval('s');
                                           QUERY PLAN
------------------------------------------------------------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=8) (actual time=1.159..1.160 rows=1 loops=1)
   Output: nextval('s'::regclass)
   Buffers: shared hit=7(relation 3, index 4) read=6(relation 1, index 4, sequence 1) dirtied=1
 Planning Time: 0.214 ms
 Execution Time: 1.238 ms
(5 rows)

The change is in these parts: "(relation 3, index 4)" and "(relation 1, index 4, sequence 1)". It should probably help DBAs to better understand complex queries. I think cluttering the output with more text is OK as long as "verbose" is requested.

But there are some caveats:
1. Some more increments on hot paths. We have to add this tiny toll to every single buffer hit, but it will seldom be of any use.
2. It's difficult to measure writes caused by a query, and even dirties. The patch adds "evictions" caused by the query, but they have little practical sense too.

All in all, I do not have an opinion on whether this feature is a good tradeoff. What do you think? Does the feature look useful? Do we want a more polished implementation?

Thanks!

Best regards, Andrey Borodin.
Hi,

On 2023-02-13 16:23:30 -0800, Andrey Borodin wrote:
> But there are some caveats:
> 1. Some more increments on hot paths. We have to add this tiny toll to
> every single buffer hit, but it will seldom be of any use.

Additionally, I bet it slows down EXPLAIN (ANALYZE, BUFFERS) noticeably. It's already quite expensive...

> All in all, I do not have an opinion on whether this feature is a good
> tradeoff. What do you think? Does the feature look useful? Do we want a
> more polished implementation?

Unless the above issues could be avoided, I don't think so.

Greetings,

Andres Freund
On Mon, Feb 13, 2023 at 4:29 PM Andres Freund <andres@anarazel.de> wrote:
> > 1. Some more increments on hot paths. We have to add this tiny toll to
> > every single buffer hit, but it will seldom be of any use.
>
> Additionally, I bet it slows down EXPLAIN (ANALYZE, BUFFERS) noticeably. It's
> already quite expensive...

I think the collection of instrumentation is done unconditionally. We always do

    pgBufferUsage.shared_blks_hit++;

when the buffer is in shared_buffers.

Best regards, Andrey Borodin.
Hi,

On 2023-02-13 16:36:25 -0800, Andrey Borodin wrote:
> On Mon, Feb 13, 2023 at 4:29 PM Andres Freund <andres@anarazel.de> wrote:
> > > 1. Some more increments on hot paths. We have to add this tiny toll to
> > > every single buffer hit, but it will seldom be of any use.
> >
> > Additionally, I bet it slows down EXPLAIN (ANALYZE, BUFFERS) noticeably. It's
> > already quite expensive...
>
> I think the collection of instrumentation is done unconditionally.
> We always do pgBufferUsage.shared_blks_hit++; when the buffer is in
> shared_buffers.

The problem I'm talking about is the increased overhead in InstrStopNode(), due to BufferUsageAccumDiff() getting more expensive.

Andres
On Mon, Feb 13, 2023 at 4:39 PM Andres Freund <andres@anarazel.de> wrote:
>
> The problem I'm talking about is the increased overhead in InstrStopNode(),
> due to BufferUsageAccumDiff() getting more expensive.
>

Thanks, now I understand the problem better. According to godbolt.com, my patch doubles the number of instructions in this function, unless we count only tables/indexes/matviews.

Anyway, setting aside the functionality of this particular patch, BufferUsageAccumDiff() does not seem slow to me. It's just a branchless bunch of SIMD instructions. Its performance might matter only when it is called a gazillion times per second.

Best regards, Andrey Borodin.
Hi,

On 2023-02-13 18:14:58 -0800, Andrey Borodin wrote:
> On Mon, Feb 13, 2023 at 4:39 PM Andres Freund <andres@anarazel.de> wrote:
> >
> > The problem I'm talking about is the increased overhead in InstrStopNode(),
> > due to BufferUsageAccumDiff() getting more expensive.
> >
>
> Thanks, now I understand the problem better. According to godbolt.com,
> my patch doubles the number of instructions in this function, unless we
> count only tables/indexes/matviews.
>
> Anyway, setting aside the functionality of this particular patch,
> BufferUsageAccumDiff() does not seem slow to me. It's just a branchless
> bunch of SIMD instructions. Its performance might matter only when it
> is called a gazillion times per second.

It is called gazillions of times per second when you do an EXPLAIN (ANALYZE, BUFFERS). Every invocation of an executor node calls it.

Here's a quick pgbench run, showing today's code, with -O3, without assertions:

298.396  0  SELECT generate_series(1, 10000000) OFFSET 10000000;
397.400  0  EXPLAIN (ANALYZE, TIMING OFF) SELECT generate_series(1, 10000000) OFFSET 10000000;
717.238  0  EXPLAIN (ANALYZE, TIMING ON) SELECT generate_series(1, 10000000) OFFSET 10000000;
419.736  0  EXPLAIN (ANALYZE, BUFFERS, TIMING OFF) SELECT generate_series(1, 10000000) OFFSET 10000000;
761.575  0  EXPLAIN (ANALYZE, BUFFERS, TIMING ON) SELECT generate_series(1, 10000000) OFFSET 10000000;

The effect ends up a lot larger once you add in joins etc., because that adds additional executor nodes that all have their instrumentation started/stopped.

Greetings,

Andres Freund