Re: Use streaming read API in ANALYZE - Mailing list pgsql-hackers

From: Melanie Plageman
Subject: Re: Use streaming read API in ANALYZE
Date:
Msg-id: 20240327201523.ub4htr6jr4zhcfjs@liskov
In response to: Re: Use streaming read API in ANALYZE (Nazir Bilal Yavuz <byavuz81@gmail.com>)
Responses: Re: Use streaming read API in ANALYZE
List: pgsql-hackers
On Tue, Mar 26, 2024 at 02:51:27PM +0300, Nazir Bilal Yavuz wrote:
> Hi,
> 
> On Wed, 28 Feb 2024 at 14:42, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
> >
> >
> > The new version of the streaming read API [1] is posted. I updated the
> > streaming read API changes patch (0001); the 'use the streaming read
> > API in ANALYZE' patch (0002) remains the same. This should make it
> > easier to review, as it can be applied on top of master.
> >
> >
> 
> The new version of the streaming read API is posted [1]. I rebased the
> patch on top of master and v9 of the streaming read API.
> 
> There is a minor change in the 'use the streaming read API in ANALYZE'
> patch (0002): I changed STREAMING_READ_FULL to STREAMING_READ_MAINTENANCE
> to keep exactly the same behavior as before. Also, some benchmarking
> results:
> 
> I created a 22 GB table and set shared_buffers to 30 GB; the rest of the
> settings are defaults.
> 
> ╔═══════════════════════════╦═════════════════════╦════════════╗
> ║                           ║  Avg Timings in ms  ║            ║
> ╠═══════════════════════════╬══════════╦══════════╬════════════╣
> ║                           ║  master  ║  patched ║ percentage ║
> ╠═══════════════════════════╬══════════╬══════════╬════════════╣
> ║     Both OS cache and     ║          ║          ║            ║
> ║  shared buffers are clear ║ 513.9247 ║ 463.1019 ║    9.9%    ║
> ╠═══════════════════════════╬══════════╬══════════╬════════════╣
> ║   OS cache is loaded but  ║          ║          ║            ║
> ║  shared buffers are clear ║ 423.1097 ║ 354.3277 ║   16.3%    ║
> ╠═══════════════════════════╬══════════╬══════════╬════════════╣
> ║ Shared buffers are loaded ║          ║          ║            ║
> ║                           ║  89.2846 ║  84.6952 ║    5.1%    ║
> ╚═══════════════════════════╩══════════╩══════════╩════════════╝
> 
> Any kind of feedback would be appreciated.

Thanks for working on this!

A general review comment: I noticed you still have the old streaming read
(pgsr) naming in a few places (including comments), so make sure to update
those everywhere when you rebase onto Thomas' latest version of the read
stream API.

> From c7500cc1b9068ff0b704181442999cd8bed58658 Mon Sep 17 00:00:00 2001
> From: Nazir Bilal Yavuz <byavuz81@gmail.com>
> Date: Mon, 19 Feb 2024 14:30:47 +0300
> Subject: [PATCH v3 2/2] Use streaming read API in ANALYZE
>
> --- a/src/backend/commands/analyze.c
> +++ b/src/backend/commands/analyze.c
> @@ -1102,6 +1102,26 @@ examine_attribute(Relation onerel, int attnum, Node *index_expr)
>      return stats;
>  }
>  
> +/*
> + * Prefetch callback function to get next block number while using
> + * BlockSampling algorithm
> + */
> +static BlockNumber
> +pg_block_sampling_streaming_read_next(StreamingRead *stream,
> +                                      void *user_data,
> +                                      void *per_buffer_data)

I don't think you need the pg_ prefix.

> +{
> +    BlockSamplerData *bs = user_data;
> +    BlockNumber *current_block = per_buffer_data;

Why can't you just do BufferGetBlockNumber() on the buffer returned from
the read stream API instead of allocating per_buffer_data for the block
number? (Rough sketch below, after the rest of this hunk.)

> +
> +    if (BlockSampler_HasMore(bs))
> +        *current_block = BlockSampler_Next(bs);
> +    else
> +        *current_block = InvalidBlockNumber;
> +
> +    return *current_block;
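
Roughly what I have in mind is the untested sketch below; note that
streaming_read_buffer_next() is just my stand-in name for whatever the
"give me the next buffer" call is in the API version you rebase onto:

    static BlockNumber
    block_sampling_streaming_read_next(StreamingRead *stream,
                                       void *user_data,
                                       void *per_buffer_data)
    {
        BlockSamplerData *bs = user_data;

        /* no per-buffer data needed: just return the next sampled block */
        return BlockSampler_HasMore(bs) ? BlockSampler_Next(bs) :
            InvalidBlockNumber;
    }

and then, in the caller, the block number comes from the buffer itself:

    targbuffer = streaming_read_buffer_next(stream, NULL);
    if (BufferIsValid(targbuffer))
        targblock = BufferGetBlockNumber(targbuffer);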


I think we'd like to keep the read stream code in heapam-specific code.
Instead of doing streaming_read_buffer_begin() here, you could put this
in heap_beginscan() or initscan() guarded by
    scan->rs_base.rs_flags & SO_TYPE_ANALYZE

and do the same with streaming_read_buffer_end() in heap_endscan().

You'd also then need to save a reference to the read stream in the
HeapScanDescData (see the sketch after the quoted hunk below).

> +    stream = streaming_read_buffer_begin(STREAMING_READ_MAINTENANCE,
> +                                         vac_strategy,
> +                                         BMR_REL(scan->rs_rd),
> +                                         MAIN_FORKNUM,
> +                                         pg_block_sampling_streaming_read_next,
> +                                         &bs,
> +                                         sizeof(BlockSamplerData));
>  
>      /* Outer loop over blocks to sample */
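
Roughly what I'm imagining is the untested sketch below. rs_read_stream is
a placeholder field name, the setup/teardown function names are whatever
the current API version exposes, and I've hand-waved how the BlockSampler
state and the buffer access strategy get to the stream:

    /* in HeapScanDescData */
    StreamingRead *rs_read_stream;  /* only used for SO_TYPE_ANALYZE scans */

    /* in initscan() (or heap_beginscan()) */
    if (scan->rs_base.rs_flags & SO_TYPE_ANALYZE)
        scan->rs_read_stream =
            streaming_read_buffer_begin(STREAMING_READ_MAINTENANCE,
                                        scan->rs_strategy,
                                        BMR_REL(scan->rs_base.rs_rd),
                                        MAIN_FORKNUM,
                                        block_sampling_streaming_read_next,
                                        NULL,  /* callback state, hand-waved */
                                        0);

    /* in heap_endscan() */
    if (scan->rs_read_stream != NULL)
        streaming_read_buffer_end(scan->rs_read_stream);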

In fact, I think you could use this opportunity to get rid of the block
dependency in acquire_sample_rows() altogether.

Looking at the code now, it seems like you could just invoke
heapam_scan_analyze_next_block() (maybe rename it to
heapam_scan_analyze_next_buffer() or something) from
heapam_scan_analyze_next_tuple() and remove
table_scan_analyze_next_block() entirely.

Then table AMs can figure out how they want to return tuples from
table_scan_analyze_next_tuple().

If you do all this, note that you'll need to update the comments above
acquire_sample_rows() accordingly.
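
That is, something vaguely like this untested sketch (again with
streaming_read_buffer_next() standing in for the real "next buffer" call,
and rs_read_stream being the placeholder field from above):

    static bool
    heapam_scan_analyze_next_buffer(TableScanDesc scan)
    {
        HeapScanDesc hscan = (HeapScanDesc) scan;

        hscan->rs_cbuf = streaming_read_buffer_next(hscan->rs_read_stream,
                                                    NULL);
        if (!BufferIsValid(hscan->rs_cbuf))
            return false;       /* stream exhausted, sampling is done */

        LockBuffer(hscan->rs_cbuf, BUFFER_LOCK_SHARE);
        hscan->rs_cblock = BufferGetBlockNumber(hscan->rs_cbuf);
        hscan->rs_cindex = FirstOffsetNumber;
        return true;
    }

heapam_scan_analyze_next_tuple() would call this itself after it has
unlocked and released the previous page's buffer, so the per-block loop in
acquire_sample_rows() and table_scan_analyze_next_block() could both go
away.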

> -    while (BlockSampler_HasMore(&bs))
> +    while (nblocks)
>      {
>          bool        block_accepted;
> -        BlockNumber targblock = BlockSampler_Next(&bs);
> -#ifdef USE_PREFETCH
> -        BlockNumber prefetch_targblock = InvalidBlockNumber;
> -
> -        /*
> -         * Make sure that every time the main BlockSampler is moved forward
> -         * that our prefetch BlockSampler also gets moved forward, so that we
> -         * always stay out ahead.
> -         */
> -        if (prefetch_maximum && BlockSampler_HasMore(&prefetch_bs))
> -            prefetch_targblock = BlockSampler_Next(&prefetch_bs);
> -#endif
>  
>          vacuum_delay_point();
>  
> -        block_accepted = table_scan_analyze_next_block(scan, targblock, vac_strategy);
> +        block_accepted = table_scan_analyze_next_block(scan, stream);

- Melanie


