Thread: Add shared buffer hits to pg_stat_io

Add shared buffer hits to pg_stat_io

From
Melanie Plageman
Date:
Hi,

As suggested in [1], the attached patch adds shared buffer hits to
pg_stat_io.

I remember at some point having this in the view and then removing it
but I can't quite remember what the issue was -- nor do I see a
rationale mentioned in the thread [2].

It might have had something to do with the interaction between the
now-removed "rejected" buffers column.

I am looking for input as to the order of this column in the view. I
think it should go after op_bytes since it is not relevant for
non-block-oriented IO. However, I'm not sure what the order of hits,
evictions, and reuses should be (all have to do with buffers).

While adding this, I noticed that I had made all of the IOOP columns
int8 in the view, and I was wondering if this is sufficient for hits (I
imagine you could end up with quite a lot of those).

- Melanie

[1] https://www.postgresql.org/message-id/20230209050319.chyyup4vtq4jzobq%40awork3.anarazel.de
[2]
https://www.postgresql.org/message-id/flat/20230209050319.chyyup4vtq4jzobq%40awork3.anarazel.de#63ff7a97b7a5bb7b86c1a250065be7f9

Attachment

Re: Add shared buffer hits to pg_stat_io

From
"Drouvot, Bertrand"
Date:
Hi,

On 2/25/23 9:16 PM, Melanie Plageman wrote:
> Hi,
> 
> As suggested in [1], the attached patch adds shared buffer hits to
> pg_stat_io.
> 

Thanks for the patch!

  BufferDesc *
  LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
-                                bool *foundPtr, IOContext *io_context)
+                                bool *foundPtr, IOContext io_context)
  {
         BufferTag       newTag;                 /* identity of requested block */
         LocalBufferLookupEnt *hresult;
@@ -128,14 +128,6 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
         hresult = (LocalBufferLookupEnt *)
                 hash_search(LocalBufHash, &newTag, HASH_FIND, NULL);

-       /*
-        * IO Operations on local buffers are only done in IOCONTEXT_NORMAL. Set
-        * io_context here (instead of after a buffer hit would have returned) for
-        * convenience since we don't have to worry about the overhead of calling
-        * IOContextForStrategy().
-        */
-       *io_context = IOCONTEXT_NORMAL;


It looks like that io_context is not used in LocalBufferAlloc() anymore and then can be removed as an argument.

> 
> I am looking for input as to the order of this column in the view. I
> think it should go after op_bytes since it is not relevant for
> non-block-oriented IO. 

Agree.

> However, I'm not sure what the order of hits,
> evictions, and reuses should be (all have to do with buffers).
> 

I'm not sure there is a strong "correct" ordering but the proposed one looks natural to me.

> While adding this, I noticed that I had made all of the IOOP columns
> int8 in the view, and I was wondering if this is sufficient for hits (I
> imagine you could end up with quite a lot of those).
> 

I think that's ok and bigint is what is already used for pg_statio_user_tables.heap_blks_hit for example.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Add shared buffer hits to pg_stat_io

From
Melanie Plageman
Date:
Thanks for the review!

On Tue, Feb 28, 2023 at 7:36 AM Drouvot, Bertrand
<bertranddrouvot.pg@gmail.com> wrote:
>   BufferDesc *
>   LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
> -                                bool *foundPtr, IOContext *io_context)
> +                                bool *foundPtr, IOContext io_context)
>   {
>          BufferTag       newTag;                 /* identity of requested block */
>          LocalBufferLookupEnt *hresult;
> @@ -128,14 +128,6 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
>          hresult = (LocalBufferLookupEnt *)
>                  hash_search(LocalBufHash, &newTag, HASH_FIND, NULL);
>
> -       /*
> -        * IO Operations on local buffers are only done in IOCONTEXT_NORMAL. Set
> -        * io_context here (instead of after a buffer hit would have returned) for
> -        * convenience since we don't have to worry about the overhead of calling
> -        * IOContextForStrategy().
> -        */
> -       *io_context = IOCONTEXT_NORMAL;
>
>
> It looks like that io_context is not used in LocalBufferAlloc() anymore and then can be removed as an argument.

Good catch. Updated patchset attached.

> > While adding this, I noticed that I had made all of the IOOP columns
> > int8 in the view, and I was wondering if this is sufficient for hits (I
> > imagine you could end up with quite a lot of those).
> >
>
> I think that's ok and bigint is what is already used for pg_statio_user_tables.heap_blks_hit for example.

Ah, I was silly and didn't understand that the SQL type int8 is eight
bytes and not 1. That makes a lot of things make more sense :)

https://www.postgresql.org/docs/current/xfunc-c.html#XFUNC-C-TYPE-TABLE

- Melanie

Attachment

Re: Add shared buffer hits to pg_stat_io

From
"Drouvot, Bertrand"
Date:
Hi,

On 3/6/23 4:38 PM, Melanie Plageman wrote:
> Thanks for the review!
> 
> On Tue, Feb 28, 2023 at 7:36 AM Drouvot, Bertrand
> <bertranddrouvot.pg@gmail.com> wrote:
>>    BufferDesc *
>>    LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
>> -                                bool *foundPtr, IOContext *io_context)
>> +                                bool *foundPtr, IOContext io_context)
>>    {
>>           BufferTag       newTag;                 /* identity of requested block */
>>           LocalBufferLookupEnt *hresult;
>> @@ -128,14 +128,6 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
>>           hresult = (LocalBufferLookupEnt *)
>>                   hash_search(LocalBufHash, &newTag, HASH_FIND, NULL);
>>
>> -       /*
>> -        * IO Operations on local buffers are only done in IOCONTEXT_NORMAL. Set
>> -        * io_context here (instead of after a buffer hit would have returned) for
>> -        * convenience since we don't have to worry about the overhead of calling
>> -        * IOContextForStrategy().
>> -        */
>> -       *io_context = IOCONTEXT_NORMAL;
>>
>>
>> It looks like that io_context is not used in LocalBufferAlloc() anymore and then can be removed as an argument.
> 
> Good catch. Updated patchset attached.

Thanks for the update!

> 
>>> While adding this, I noticed that I had made all of the IOOP columns
>>> int8 in the view, and I was wondering if this is sufficient for hits (I
>>> imagine you could end up with quite a lot of those).
>>>
>>
>> I think that's ok and bigint is what is already used for pg_statio_user_tables.heap_blks_hit for example.
> 
> Ah, I was silly and didn't understand that the SQL type int8 is eight
> bytes and not 1. That makes a lot of things make more sense :)

Oh, I see ;-)

I may give it another review but currently V2 looks good to me.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Add shared buffer hits to pg_stat_io

From
Andres Freund
Date:
Hi,

LGTM. The only comment I have is that a small test wouldn't hurt... Compared
to the other things it should be fairly easy...

Greetings,

Andres Freund



Re: Add shared buffer hits to pg_stat_io

From
Melanie Plageman
Date:
On Tue, Mar 7, 2023 at 2:47 PM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> LGTM. The only comment I have is that a small test wouldn't hurt... Compared
> to the other things it should be fairly easy...

So, I have attached an updated patchset which adds a test for hits. Since
there is only one call site where we count hits, I think this single
test is sufficient to protect against regressions.

However, I am concerned that, while unlikely, this could be flakey.
Something could happen to force all of those blocks out of shared
buffers (even though they were just read in) before we hit them.

We could simply check if hits are greater at the end of all of the
pg_stat_io tests than at the beginning and rely on the fact that it is
highly unlikely that every single buffer access will be a miss for all
of the tests. However, is it not technically also possible to have zero
hits?

- Melanie

Attachment

Re: Add shared buffer hits to pg_stat_io

From
Andres Freund
Date:
On 2023-03-08 13:44:32 -0500, Melanie Plageman wrote:
> However, I am concerned that, while unlikely, this could be flakey.
> Something could happen to force all of those blocks out of shared
> buffers (even though they were just read in) before we hit them.

You could make the test query a simple nested loop self-join, that'll prevent
the page being evicted, because it'll still be pinned on the outer side, while
generating hits on the inner side.

Greetings,

Andres Freund



Re: Add shared buffer hits to pg_stat_io

From
Melanie Plageman
Date:
On Wed, Mar 8, 2023 at 2:23 PM Andres Freund <andres@anarazel.de> wrote:
>
> On 2023-03-08 13:44:32 -0500, Melanie Plageman wrote:
> > However, I am concerned that, while unlikely, this could be flakey.
> > Something could happen to force all of those blocks out of shared
> > buffers (even though they were just read in) before we hit them.
>
> You could make the test query a simple nested loop self-join, that'll prevent
> the page being evicted, because it'll still be pinned on the outer side, while
> generating hits on the inner side.

Good idea. v3 attached.

Attachment

Re: Add shared buffer hits to pg_stat_io

From
"Drouvot, Bertrand"
Date:
Hi,

On 3/9/23 2:23 PM, Melanie Plageman wrote:
> On Wed, Mar 8, 2023 at 2:23 PM Andres Freund <andres@anarazel.de> wrote:
>>
>> On 2023-03-08 13:44:32 -0500, Melanie Plageman wrote:
>>> However, I am concerned that, while unlikely, this could be flakey.
>>> Something could happen to force all of those blocks out of shared
>>> buffers (even though they were just read in) before we hit them.
>>
>> You could make the test query a simple nested loop self-join, that'll prevent
>> the page being evicted, because it'll still be pinned on the outer side, while
>> generating hits on the inner side.
> 
> Good idea. v3 attached.

Thanks! The added test looks good to me.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Add shared buffer hits to pg_stat_io

From
Andres Freund
Date:
Hi,

On 2023-03-09 08:23:46 -0500, Melanie Plageman wrote:
> Good idea. v3 attached.

I committed this, after some small regression test changes. I was worried that
the query for testing buffer hits might silently change in the future, so I
added an EXPLAIN for the query. Also removed the need for the explicit RESETs
by using BEGIN; SET LOCAL ...; query; COMMIT;.

Thanks for the patch Melanie and the review Bertrand. I'm excited about
finally being able to compute meaningful cache hit ratios :)

Regards,

Andres