Re: Summary function for pg_buffercache - Mailing list pgsql-hackers

From Melih Mutlu
Subject Re: Summary function for pg_buffercache
Date
Msg-id CAGPVpCRi6VVN1Ve=BCo1TgF5pxJqC0sGwyRU49mhLBOT4DEo6A@mail.gmail.com
Whole thread Raw
In response to Re: Summary function for pg_buffercache  (Aleksander Alekseev <aleksander@timescale.com>)
Responses Re: Summary function for pg_buffercache
List pgsql-hackers
Hi Aleksander and Nathan,

Thanks for your comments.

Aleksander Alekseev <aleksander@timescale.com>, 9 Eyl 2022 Cum, 17:36 tarihinde şunu yazdı:
However I'm afraid you can't examine BufferDesc's without taking
locks. This is explicitly stated in buf_internals.h:

"""
Buffer header lock (BM_LOCKED flag) must be held to EXAMINE or change
TAG, state or wait_backend_pgprocno fields.
"""

I wasn't aware of this explanation. Thanks for pointing it out.

When somebody modifies relNumber concurrently (e.g. calls
ClearBufferTag()) this will cause an undefined behaviour.

I thought that it wouldn't really be a problem even if relNumber is modified concurrently, since the function does not actually rely on the actual values.
I'm not sure about what undefined behaviour could harm this badly. It seemed to me that it would read an invalid relNumber in the worst case scenario. 
But I'm not actually familiar with buffer related parts of the code, so I might be wrong.
And I'm okay with taking header locks if necessary. 

In the attached patch, I added buffer header locks just before examining tag as follows:

+ buf_state = LockBufHdr(bufHdr);
+
+ /* Invalid RelFileNumber means the buffer is unused */
+ if (RelFileNumberIsValid(BufTagGetRelNumber(&bufHdr->tag)))
+ {
...
+ }
...
+ UnlockBufHdr(bufHdr, buf_state);


> > I suggest we focus on saving the memory first and then think about the
> > performance, if necessary.
>
> +1

I again did the same quick benchmarking, here are the numbers with locks.

postgres=# show shared_buffers;
 shared_buffers
----------------
 16GB
(1 row)

postgres=#  SELECT relfilenode <> 0 AS is_valid, isdirty, count(*) FROM pg_buffercache GROUP BY relfilenode <> 0, isdirty;
 is_valid | isdirty |  count
----------+---------+---------
 t        | f       |     256
          |         | 2096876
 t        | t       |      20
(3 rows)

Time: 1024.456 ms (00:01.024)

postgres=#  select * from pg_buffercache_summary();
 used_buffers | unused_buffers | dirty_buffers | pinned_buffers | avg_usagecount
--------------+----------------+---------------+----------------+----------------
          282 |        2096870 |            20 |              0 |      3.4574468
(1 row)

Time: 33.074 ms

Yes, locks slowed pg_buffercache_summary down. But there is still quite a bit of performance improvement, plus memory saving as you mentioned.
 
Here is the corrected patch.
 
Also thanks for corrections.

Best,
Melih
 
Attachment

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: [RFC] building postgres with meson - v12
Next
From: Zhihong Yu
Date:
Subject: Re: cataloguing NOT NULL constraints