Thread: Using read stream in autoprewarm

Using read stream in autoprewarm

From
Nazir Bilal Yavuz
Date:
Hi,

I am working on using the read stream in autoprewarm. I observed ~10%
performance gain with this change. The patch is attached.

The downside of the read stream approach is that a new read stream
object needs to be created for each database, relation and fork. I was
wondering if this would cause a regression but it did not (at least
depending on results of my testing). Another downside could be the
code getting complicated.

For the testing,
- I created 50 databases with each of them having 50 tables and the
size of the tables are 520KB.
    - patched: 51157 ms
    - master: 56769 ms
- I created 5 databases with each of them having 1 table and the size
of the tables are 3GB.
    - patched: 32679 ms
    - master: 36706 ms

I put debugging message with timing information in
autoprewarm_database_main() function, then run autoprewarm 100 times
(by restarting the server) and cleared the OS cache before each
restart. Also, I ensured that the block number of the buffer returning
from the read stream API is correct. I am not sure if that much
testing is enough for this kind of change.

Any feedback would be appreciated.

--
Regards,
Nazir Bilal Yavuz
Microsoft

Attachment

Re: Using read stream in autoprewarm

From
"Andrey M. Borodin"
Date:

> On 8 Aug 2024, at 11:32, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
>
> Any feedback would be appreciated.

I've took a look into the patch. It seems to me that you add new block numbers to the read stream until you have
buffers.So when there are no more buffers you will still have some queued blocks. 
Maybe can you change the logic so that number of free buffers must be enough to allocate all blocks in look-ahead
distance?

Thanks!


Best regards, Andrey Borodin.


Re: Using read stream in autoprewarm

From
Stepan Neretin
Date:

Dear Nazir,

At first A quick look it looks good. I will take a closer look at it tomorrow. Could you please let me know about the performance tests and graphics?

Best regards, Stepan Neretin!

Re: Using read stream in autoprewarm

From
Nazir Bilal Yavuz
Date:
Hi,

Thanks for looking into this!

On Thu, 31 Oct 2024 at 21:18, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>
> > On 8 Aug 2024, at 11:32, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
> >
> > Any feedback would be appreciated.
>
> I've took a look into the patch. It seems to me that you add new block numbers to the read stream until you have
buffers.So when there are no more buffers you will still have some queued blocks.
 
> Maybe can you change the logic so that number of free buffers must be enough to allocate all blocks in look-ahead
distance?

I see what you mean. When the have_free_buffer() function returns
false in the callback, there are still queued blocks in the stream
although there are no free buffers in the buffer pool. I think the
best way to solve this is to get the number of free buffers in the
buffer pool by 'BufferStrategyControl.lastFreeBuffer -
BufferStrategyControl.firstFreeBuffer' and then compare it with
'stream->pending_read_nblocks'. When the 'stream->pending_read_nblocks
== number_of_free_buffers_in_buffer_pool', end the stream. The problem
with that is stream->pending_read_nblocks isn't public, also I am not
sure whether 'BufferStrategyControl.lastFreeBuffer -
BufferStrategyControl.firstFreeBuffer' is safe to use.

--
Regards,
Nazir Bilal Yavuz
Microsoft



Re: Using read stream in autoprewarm

From
Nazir Bilal Yavuz
Date:
Hi,

Thanks for looking into this!

On Thu, 31 Oct 2024 at 22:02, Stepan Neretin <sndcppg@gmail.com> wrote:
>
> At first A quick look it looks good. I will take a closer look at it tomorrow. Could you please let me know about the
performancetests and graphics?
 

Sorry but I didn't understand what you mean by performance tests and
graphics. Do you want something else than the information in the first
email [1]?

[1] postgr.es/m/CAN55FZ3n8Gd%2BhajbL%3D5UkGzu_aHGRqnn%2BxktXq2fuds%3D1AOR6Q%40mail.gmail.com

-- 
Regards,
Nazir Bilal Yavuz
Microsoft



Re: Using read stream in autoprewarm

From
"Andrey M. Borodin"
Date:

> On 1 Nov 2024, at 12:51, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
>
>  am not
> sure whether 'BufferStrategyControl.lastFreeBuffer -
> BufferStrategyControl.firstFreeBuffer' is safe to use.

Ugh... it will work. But it seems to me too dirty hack. There's no scalable way to know size of a free list.
Let's just comment that we might read some more buffers if database does not fit into memory?
Alternatively we can count size of a free list on the start.


Best regards, Andrey Borodin.


Re: Using read stream in autoprewarm

From
Matheus Alcantara
Date:
Hi,

Newer reviewer here, trying to understand more about the read stream API.

On Tuesday, November 26th, 2024 at 11:07 AM, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:

> Any feedback would be appreciated.

I've executed the same test of 5 databases with each of them having 1 table of
3GB of size and I've got very similar results.

I've also tested using a single database with 4 tables with ~60GB of size and
the results compared with master was more closer but still an improvement. Note
that I've also increased the default shared_buffers to 7GB to see how it works
with large buffer pools.
  - patched: 5.4259 s
  - master: 5.53186 s

Not to much to say about the code, I'm currently learning more about the read
stream API and Postgresql hacking itself. Just some minor points and questions
about the patches.


v2-0002-Count-free-buffers-at-the-start-of-the-autoprewar.patch
--- a/src/backend/storage/buffer/freelist.c
+/*
+ * get_number_of_free_buffers -- a lockless way to get the number of free
+ *                                 buffers in buffer pool.
+ *
+ * Note that result continuosly changes as free buffers are moved out by other
+ * operations.
+ */
+int
+get_number_of_free_buffers(void)

typo on continuosly -> continuously


v2-0001-Use-read-stream-in-autoprewarm.patch
+    bool       *rs_have_free_buffer = per_buffer_data;
+
+
+    *rs_have_free_buffer = true;
+

Not sure if I understand why this variable is needed, it seems that it is only
written and never read? Just as comparison, the block_range_read_stream_cb
callback used on pg_prewarm seems to not use the per_buffer_data parameter.


--
Matheus Alcantara
EDB: https://www.enterprisedb.com



Re: Using read stream in autoprewarm

From
Matheus Alcantara
Date:
On Wednesday, November 27th, 2024 at 11:19 AM, Nazir Bilal Yavuz 
<byavuz81@gmail.com> wrote:
 > > v2-0001-Use-read-stream-in-autoprewarm.patch
 > > + bool *rs_have_free_buffer = per_buffer_data;
 > > +
 > > +
 > > + *rs_have_free_buffer = true;
 > > +
 > >
 > > Not sure if I understand why this variable is needed, it seems that 
it is only
 > > written and never read? Just as comparison, the 
block_range_read_stream_cb
 > > callback used on pg_prewarm seems to not use the per_buffer_data 
parameter.
 >
 >
 > Actually, it is read in the main loop of the
 > autoprewarm_database_main() function:
 >
 > /* There are no free buffers left in shared buffers, break the loop. */
 > else if (!(*rs_have_free_buffer))
 > break;
 >
 > apw_read_stream_next_block() callback function sets
 > rs_have_free_buffer's value to false when there is no free buffer left
 > in the shared buffers. And the code above terminates the main loop in
 > the autoprewarm_database_main() function when it is set to false.
 >
 > block_range_read_stream_cb() callback is used when the callback only
 > needs to loop over the block numbers. However, for the autoprewarm
 > case; the callback function needs to do additional checks so another
 > callback and the use of this variable are required.

Ohh, I see, thanks very much for the explanation.

 > v3 is attached.

Thanks.

I don't know if there is another way that this patch could be tested? 
Looking
forward on other reviews on this.

--
Matheus Alcantara
EDB: https://www.enterprisedb.com




Re: Using read stream in autoprewarm

From
Kirill Reshke
Date:
On Wed, 27 Nov 2024 at 19:20, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
>
> Hi,
>
> Thank you for looking into this!
>
> On Wed, 27 Nov 2024 at 16:50, Matheus Alcantara <mths.dev@pm.me> wrote:
> > I've executed the same test of 5 databases with each of them having 1 table of
> > 3GB of size and I've got very similar results.
> >
> > I've also tested using a single database with 4 tables with ~60GB of size and
> > the results compared with master was more closer but still an improvement. Note
> > that I've also increased the default shared_buffers to 7GB to see how it works
> > with large buffer pools.
> >   - patched: 5.4259 s
> >   - master: 5.53186 s
>
> Thanks for the testing.
>
> > Not to much to say about the code, I'm currently learning more about the read
> > stream API and Postgresql hacking itself. Just some minor points and questions
> > about the patches.
> >
> >
> > v2-0002-Count-free-buffers-at-the-start-of-the-autoprewar.patch
> > --- a/src/backend/storage/buffer/freelist.c
> > +/*
> > + * get_number_of_free_buffers -- a lockless way to get the number of free
> > + *                                                              buffers in buffer pool.
> > + *
> > + * Note that result continuosly changes as free buffers are moved out by other
> > + * operations.
> > + */
> > +int
> > +get_number_of_free_buffers(void)
> >
> > typo on continuosly -> continuously
>
> Done.
>
> > v2-0001-Use-read-stream-in-autoprewarm.patch
> > +       bool       *rs_have_free_buffer = per_buffer_data;
> > +
> > +
> > +       *rs_have_free_buffer = true;
> > +
> >
> > Not sure if I understand why this variable is needed, it seems that it is only
> > written and never read? Just as comparison, the block_range_read_stream_cb
> > callback used on pg_prewarm seems to not use the per_buffer_data parameter.
>
> Actually, it is read in the main loop of the
> autoprewarm_database_main() function:
>
>         /* There are no free buffers left in shared buffers, break the loop. */
>         else if (!(*rs_have_free_buffer))
>             break;
>
> apw_read_stream_next_block() callback function sets
> rs_have_free_buffer's value to false when there is no free buffer left
> in the shared buffers. And the code above terminates the main loop in
> the autoprewarm_database_main() function when it is set to false.
>
> block_range_read_stream_cb() callback is used when the callback only
> needs to loop over the block numbers. However, for the autoprewarm
> case; the callback function needs to do additional checks so another
> callback and the use of this variable are required.
>
> v3 is attached.
>
> --
> Regards,
> Nazir Bilal Yavuz
> Microsoft

Hi!

> + old_blk = &(p->block_info[p->pos - 1]);
> + cur_blk = &(p->block_info[p->pos]);
Should we Assert(p->pos > 0 && p->pos < *something*)

Patch tested with no regression.


-- 
Best regards,
Kirill Reshke



Re: Using read stream in autoprewarm

From
Kirill Reshke
Date:
On Fri, 29 Nov 2024 at 16:19, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:

> v4 is attached.
Hi!

I feel like we are ready to mark this as RFC, WDYT?

-- 
Best regards,
Kirill Reshke



Re: Using read stream in autoprewarm

From
"Andrey M. Borodin"
Date:

> On 2 Dec 2024, at 16:16, Kirill Reshke <reshkekirill@gmail.com> wrote:
> 
> I feel like we are ready to mark this as RFC, WDYT?

+1

Best regards, Andrey Borodin.



Re: Using read stream in autoprewarm

From
Nazir Bilal Yavuz
Date:
Hi,

On Mon, 2 Dec 2024 at 16:30, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>
> > On 2 Dec 2024, at 16:16, Kirill Reshke <reshkekirill@gmail.com> wrote:
> >
> > I feel like we are ready to mark this as RFC, WDYT?
>
> +1

Done.

-- 
Regards,
Nazir Bilal Yavuz
Microsoft