Re: Use streaming read API in ANALYZE - Mailing list pgsql-hackers
From: Nazir Bilal Yavuz
Subject: Re: Use streaming read API in ANALYZE
Date:
Msg-id: CAN55FZ1TQc06xPRdHhPfMOS8PWwDhQwKv2--dMuY+Z3Ln+5B9w@mail.gmail.com
In response to: Re: Use streaming read API in ANALYZE (Jakub Wartak <jakub.wartak@enterprisedb.com>)
List: pgsql-hackers
Hi Jakub,

Thank you for looking into this and doing a performance analysis.

On Wed, 3 Apr 2024 at 11:42, Jakub Wartak <jakub.wartak@enterprisedb.com> wrote:
>
> On Tue, Apr 2, 2024 at 9:24 AM Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
> [..]
> > v4 is rebased on top of v14 streaming read API changes.
>
> Hi Nazir, so with the streaming API committed, I gave this patch a try.
> With autovacuum=off and a 30GB table on NVMe (with standard readahead
> of 256kb and ext4, Debian 12, kernel 6.1.0, shared_buffers = 128MB
> default) created using:
>
> create table t as select repeat('a', 100) || i || repeat('b', 500) as
> filler from generate_series(1, 45000000) as i;
>
> on master, the effect of maintenance_io_concurrency [default 10] is as
> follows (when resetting the fs cache after each ANALYZE):
>
> m_io_c = 0:
> Time: 3137.914 ms (00:03.138)
> Time: 3094.540 ms (00:03.095)
> Time: 3452.513 ms (00:03.453)
>
> m_io_c = 1:
> Time: 2972.751 ms (00:02.973)
> Time: 2939.551 ms (00:02.940)
> Time: 2904.428 ms (00:02.904)
>
> m_io_c = 2:
> Time: 1580.260 ms (00:01.580)
> Time: 1572.132 ms (00:01.572)
> Time: 1558.334 ms (00:01.558)
>
> m_io_c = 4:
> Time: 938.304 ms
> Time: 931.772 ms
> Time: 920.044 ms
>
> m_io_c = 8:
> Time: 666.025 ms
> Time: 660.241 ms
> Time: 648.848 ms
>
> m_io_c = 16:
> Time: 542.450 ms
> Time: 561.155 ms
> Time: 539.683 ms
>
> m_io_c = 32:
> Time: 538.487 ms
> Time: 541.705 ms
> Time: 538.101 ms
>
> with the patch applied:
>
> m_io_c = 0:
> Time: 3106.469 ms (00:03.106)
> Time: 3140.343 ms (00:03.140)
> Time: 3044.133 ms (00:03.044)
>
> m_io_c = 1:
> Time: 2959.817 ms (00:02.960)
> Time: 2920.265 ms (00:02.920)
> Time: 2911.745 ms (00:02.912)
>
> m_io_c = 2:
> Time: 1581.912 ms (00:01.582)
> Time: 1561.444 ms (00:01.561)
> Time: 1558.251 ms (00:01.558)
>
> m_io_c = 4:
> Time: 908.116 ms
> Time: 901.245 ms
> Time: 901.071 ms
>
> m_io_c = 8:
> Time: 619.870 ms
> Time: 620.327 ms
> Time: 614.266 ms
>
> m_io_c = 16:
> Time: 529.885 ms
> Time: 526.958 ms
> Time: 528.474 ms
>
> m_io_c = 32:
> Time: 521.185 ms
> Time: 520.713 ms
> Time: 517.729 ms
>
> No difference to me, which seems to be good. I've double-checked and
> the patch used the new way:
>
> acquire_sample_rows -> heapam_scan_analyze_next_block ->
> ReadBufferExtended -> ReadBuffer_common (inlined) -> WaitReadBuffers
> -> mdreadv -> FileReadV -> pg_preadv (inlined)
>
> acquire_sample_rows -> heapam_scan_analyze_next_block ->
> ReadBufferExtended -> ReadBuffer_common (inlined) -> StartReadBuffer
> -> ...
>
> I also gave io_combine_limit = 32 (the max, 256kB) a try and got these
> slightly better results:
>
> [..]
> m_io_c = 16:
> Time: 494.599 ms
> Time: 496.345 ms
> Time: 973.500 ms
>
> m_io_c = 32:
> Time: 461.031 ms
> Time: 449.037 ms
> Time: 443.375 ms
>
> and that (the last one) apparently was able to push it into the
> ~50-60k random IOPS range; the rareq-sz was still ~8 (9.9) kB as
> ANALYZE was still reading randomly, so I assume no merging was done:
>
> Device            r/s     rMB/s   rrqm/s  %rrqm r_await rareq-sz
>     w/s     wMB/s   wrqm/s  %wrqm w_await wareq-sz     d/s     dMB/s
> drqm/s  %drqm d_await dareq-sz     f/s f_await  aqu-sz  %util
> nvme0n1      61212.00    591.82     0.00   0.00    0.10     9.90
>    2.00      0.02     0.00   0.00    0.00    12.00    0.00     0.00
>   0.00   0.00    0.00     0.00    0.00    0.00    0.00    0.00
>   6.28  85.20
>
> So in short it looks good to me.

My results are similar to yours. Also, I realized a bug while working
on your benchmarking cases; I will share the cause and the fix soon.

--
Regards,
Nazir Bilal Yavuz
Microsoft
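For completeness, a minimal sketch of a single timing run as described
above, assuming a psql session with \timing enabled; the fs-cache reset
between runs happens outside the server, both settings are
session-level GUCs, and "t" is the table created upthread:

-- one benchmark iteration (a sketch of the procedure described above)
SET maintenance_io_concurrency = 16;  -- varied over 0, 1, 2, 4, 8, 16, 32
SET io_combine_limit = '256kB';       -- 32 blocks, the maximum tried above
ANALYZE t;                            -- \timing prints the elapsed time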