Thread: pgsql: Use streaming read I/O in VACUUM's third phase
Use streaming read I/O in VACUUM's third phase

Make vacuum's third phase (its second pass over the heap), which reaps
dead items collected in the first phase and marks them as reusable, use
the read stream API. This commit adds a new read stream callback,
vacuum_reap_lp_read_stream_next(), that looks ahead in the TidStore and
returns the next block number to read for vacuum.

Author: Melanie Plageman <melanieplageman@gmail.com>
Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGKN3oy0bN_3yv8hd78a4%2BM1tJC9z7mD8%2Bf%2ByA%2BGeoFUwQ%40mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/c3e775e608f2a6d0bcfba147bf08a506827cc567

Modified Files
--------------
src/backend/access/heap/vacuumlazy.c | 55 ++++++++++++++++++++++++++++++++----
1 file changed, 49 insertions(+), 6 deletions(-)
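For readers unfamiliar with the callback pattern the commit message describes, here is a minimal, self-contained C sketch. It is not the code from vacuumlazy.c and does not use PostgreSQL's real types: BlockNo, DeadBlock, ReapState, reap_next_block() and reap_all() are all hypothetical names invented for illustration, and the callback signature is simplified. It only shows the general shape: a callback that walks a list of blocks with dead items, returns the next block number, and hands per-block details to the consumer through a per-buffer data slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT PostgreSQL's real definitions. */
typedef uint32_t BlockNo;
#define INVALID_BLOCK ((BlockNo) 0xFFFFFFFF)

/* One entry per heap block with dead items (toy stand-in for a TidStore result). */
typedef struct DeadBlock
{
    BlockNo blkno;   /* block that needs a visit */
    int     ndead;   /* number of dead items recorded for it */
} DeadBlock;

/* Private iterator state handed to the callback on every call. */
typedef struct ReapState
{
    const DeadBlock *blocks;
    size_t           nblocks;
    size_t           next;
} ReapState;

/*
 * Callback: return the next block to read, or INVALID_BLOCK when done.
 * The per-block details are copied into per_buffer_data so the consumer
 * receives them together with the block.
 */
static BlockNo
reap_next_block(void *callback_private, void *per_buffer_data)
{
    ReapState *state = callback_private;
    DeadBlock *out = per_buffer_data;

    if (state->next >= state->nblocks)
        return INVALID_BLOCK;

    *out = state->blocks[state->next++];
    return out->blkno;
}

/*
 * Toy consumer: pull blocks from the callback until it is exhausted (or
 * the visited array is full), recording each block number and summing the
 * dead-item counts delivered through the per-buffer slot.
 */
static int
reap_all(ReapState *state, BlockNo *visited, size_t max_visited,
         size_t *nvisited)
{
    DeadBlock per_buffer;
    BlockNo   blkno;
    int       total_dead = 0;

    assert(visited != NULL && nvisited != NULL);
    *nvisited = 0;

    while (*nvisited < max_visited &&
           (blkno = reap_next_block(state, &per_buffer)) != INVALID_BLOCK)
    {
        visited[(*nvisited)++] = blkno;
        total_dead += per_buffer.ndead;
    }
    return total_dead;
}
```

In this toy version the consumer calls the callback directly and owns a single per-buffer slot. A real streaming layer would sit between the two so it can call the callback ahead of the consumer and issue reads early, which means it has to keep one per-buffer slot for each block in flight.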
On Fri, Feb 14, 2025 at 12:59 PM Melanie Plageman <melanieplageman@gmail.com> wrote:
>
> Use streaming read I/O in VACUUM's third phase
>
> Make vacuum's third phase (its second pass over the heap), which reaps
> dead items collected in the first phase and marks them as reusable, use
> the read stream API. This commit adds a new read stream callback,
> vacuum_reap_lp_read_stream_next(), that looks ahead in the TidStore and
> returns the next block number to read for vacuum.
>
> Author: Melanie Plageman <melanieplageman@gmail.com>
> Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
> Discussion: https://postgr.es/m/CA%2BhUKGKN3oy0bN_3yv8hd78a4%2BM1tJC9z7mD8%2Bf%2ByA%2BGeoFUwQ%40mail.gmail.com
>
> Branch
> ------
> master
>
> Details
> -------
> https://git.postgresql.org/pg/commitdiff/c3e775e608f2a6d0bcfba147bf08a506827cc567
>
> Modified Files
> --------------
> src/backend/access/heap/vacuumlazy.c | 55 ++++++++++++++++++++++++++++++++----
> 1 file changed, 49 insertions(+), 6 deletions(-)

I'm looking into the valgrind failures [1].

==1526248== VALGRINDERROR-END
{
   <insert_a_suppression_name_here>
   Memcheck:Addr1
   fun:lazy_scan_heap
   fun:heap_vacuum_rel
   fun:table_relation_vacuum
   fun:vacuum_rel
   fun:vacuum_rel
   fun:vacuum
   fun:ExecVacuum
   fun:standard_ProcessUtility
   fun:ProcessUtility
   fun:PortalRunUtility
   fun:PortalRunMulti
   fun:PortalRun
   fun:exec_simple_query
}
**1526248** Valgrind detected 492 error(s) during execution of "VACUUM FREEZE;
==1526248== VALGRINDERROR-END
{
   <insert_a_suppression_name_here>
   Memcheck:Addr8
   fun:TidStoreGetBlockOffsets
   fun:lazy_vacuum_heap_rel
   fun:lazy_vacuum
   fun:lazy_scan_heap
   fun:heap_vacuum_rel
   fun:table_relation_vacuum
   fun:vacuum_rel
   fun:vacuum
   fun:ExecVacuum
   fun:standard_ProcessUtility
   fun:ProcessUtility
   fun:PortalRunUtility
   fun:PortalRunMulti
}
==152

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2025-02-14%2018%3A00%3A12
On Fri, Feb 14, 2025 at 1:31 PM Melanie Plageman <melanieplageman@gmail.com> wrote:
>
> On Fri, Feb 14, 2025 at 12:59 PM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> >
> > Use streaming read I/O in VACUUM's third phase
> >
> > Make vacuum's third phase (its second pass over the heap), which reaps
> > dead items collected in the first phase and marks them as reusable, use
> > the read stream API. This commit adds a new read stream callback,
> > vacuum_reap_lp_read_stream_next(), that looks ahead in the TidStore and
> > returns the next block number to read for vacuum.
> >
> > Author: Melanie Plageman <melanieplageman@gmail.com>
> > Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
> > Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
> > Discussion: https://postgr.es/m/CA%2BhUKGKN3oy0bN_3yv8hd78a4%2BM1tJC9z7mD8%2Bf%2ByA%2BGeoFUwQ%40mail.gmail.com
> >
> > Branch
> > ------
> > master
> >
> > Details
> > -------
> > https://git.postgresql.org/pg/commitdiff/c3e775e608f2a6d0bcfba147bf08a506827cc567
> >
> > Modified Files
> > --------------
> > src/backend/access/heap/vacuumlazy.c | 55 ++++++++++++++++++++++++++++++++----
> > 1 file changed, 49 insertions(+), 6 deletions(-)
>
> I'm looking into the valgrind failures [1].
>
> ==1526248== VALGRINDERROR-END
> {
>    <insert_a_suppression_name_here>
>    Memcheck:Addr1
>    fun:lazy_scan_heap
>    fun:heap_vacuum_rel
>    fun:table_relation_vacuum
>    fun:vacuum_rel
>    fun:vacuum_rel
>    fun:vacuum
>    fun:ExecVacuum
>    fun:standard_ProcessUtility
>    fun:ProcessUtility
>    fun:PortalRunUtility
>    fun:PortalRunMulti
>    fun:PortalRun
>    fun:exec_simple_query
> }

Looks like there is something wrong with the read stream API. This is
the first read stream user taking advantage of per_buffer_data. Thomas
and I are investigating further. It is trivially reproducible when
running initdb under valgrind.

- Melanie