On Wed, Oct 23, 2024 at 9:32 AM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>
> > On 22 Oct 2024, at 16:42, Melanie Plageman <melanieplageman@gmail.com> wrote:
> >
> > Ah, right, the callback might return InvalidBlockNumber far before
> > we've actually read (and vacuumed) the blocks it is specifying.
>
> I've discussed the issue with Thomas on PGConf.eu and he proposed to use stream reset.
That approach seems promising.
> PFA v3.
Note that you don't check if buf is valid here and break out of the
inner loop when it is invalid.
for (; scanblkno < num_pages; scanblkno++)
{
Buffer buf = read_stream_next_buffer(stream, NULL);
btvacuumpage(&vstate, buf);
...
}
By doing read_stream_reset() before you first invoke
read_stream_next_buffer(), seems like you invalidate the distance set
in read_stream_begin_relation()
if (flags & READ_STREAM_FULL)
stream->distance = Min(max_pinned_buffers, io_combine_limit);
->
stream->distance = 1 in read_stream_reset()
I still don't really like the inner loop using scanblkno:
for (; scanblkno < num_pages; scanblkno++)
{
Buffer buf = read_stream_next_buffer(stream, NULL);
btvacuumpage(&vstate, buf);
if (info->report_progress)
pgstat_progress_update_param(PROGRESS_SCAN_BLOCKS_DONE,
scanblkno);
}
Since you already advance a block number in the callback, I find it
confusing to also use the block number as a loop condition here. I
think it would be clearer to loop on read_stream_next_buffer()
returning a valid buffer (and then move the progress reporting into
btvacuumpage()).
But, I think I would need to study this btree code more to do a more
informed review of the patch.
- Melanie