Thread: Assert("vacrel->eager_scan_remaining_successes > 0")
Hi,

I hit the assertion failure in the subject line with the following script:

    create table t (a int) with (autovacuum_enabled = off);
    insert into t select generate_series(1, 1_000_000);
    vacuum t;
    insert into t select generate_series(1, 1_000_000);
    set vacuum_freeze_min_age to 0;
    vacuum t;

When the success count reaches 0, we disable eager scanning by resetting
the related fields as follows:

    /*
     * If we hit our success cap, permanently disable eager
     * scanning by setting the other eager scan management
     * fields to their disabled values.
     */
    vacrel->eager_scan_remaining_fails = 0;
    vacrel->next_eager_scan_region_start = InvalidBlockNumber;
    vacrel->eager_scan_max_fails_per_region = 0;

However, there is a possibility that we have already eagerly scanned
another page and returned it to the read stream before we freeze the
eagerly-scanned page and disable eager scanning. In this case, the next
block that we retrieve from the read stream could also be an
eagerly-scanned page.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
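For context, here is a rough sketch of where the failing Assert sits relative to the reset of those fields. Only the Assert and the three field resets come from the thread; the surrounding structure (the vm_page_frozen flag and its placement in the per-page freeze path of vacuumlazy.c) is an assumption added for illustration:

    /* sketch: pre-fix counter handling (surrounding details assumed) */
    if (vm_page_frozen)
    {
        /* the assertion that trips when a read-ahead block arrives late */
        Assert(vacrel->eager_scan_remaining_successes > 0);
        vacrel->eager_scan_remaining_successes--;

        if (vacrel->eager_scan_remaining_successes == 0)
        {
            /*
             * If we hit our success cap, permanently disable eager
             * scanning by setting the other eager scan management
             * fields to their disabled values.
             */
            vacrel->eager_scan_remaining_fails = 0;
            vacrel->next_eager_scan_region_start = InvalidBlockNumber;
            vacrel->eager_scan_max_fails_per_region = 0;
        }
    }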
(CC'ed to Melanie)

On Fri, May 2, 2025 at 11:59 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Hi,
>
> I hit the assertion failure in the subject line with the following script:
>
> create table t (a int) with (autovacuum_enabled = off);
> insert into t select generate_series(1, 1_000_000);
> vacuum t;
> insert into t select generate_series(1, 1_000_000);
> set vacuum_freeze_min_age to 0;
> vacuum t;
>
> When the success count reaches 0, we disable eager scanning by resetting
> the related fields as follows:
>
>     /*
>      * If we hit our success cap, permanently disable eager
>      * scanning by setting the other eager scan management
>      * fields to their disabled values.
>      */
>     vacrel->eager_scan_remaining_fails = 0;
>     vacrel->next_eager_scan_region_start = InvalidBlockNumber;
>     vacrel->eager_scan_max_fails_per_region = 0;
>
> However, there is a possibility that we have already eagerly scanned
> another page and returned it to the read stream before we freeze the
> eagerly-scanned page and disable eager scanning. In this case, the next
> block that we retrieve from the read stream could also be an
> eagerly-scanned page.

I've added it to Open Items for v18.

If I understand correctly, there's a potential scenario where we might eagerly scan more pages than permitted by the success and failure caps. One question is: what is the practical magnitude of this excess scanning? If the overflow could be substantial (for instance, eagerly scanning 30% of table pages), we should consider revising our eager scanning mechanism.

One potential solution would be to implement a counter tracking the number of eagerly scanned but unprocessed pages. This counter could then inform the decision in find_next_unskippable_block() about whether to proceed with eager scanning of additional pages (a rough sketch follows below).

Alternatively, if the excess scanning proves negligible in practice, we could adopt a simpler solution by allowing vacrel->eager_scan_remaining_successes to accept negative values and removing the assertion in question.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
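A minimal, purely hypothetical sketch of that counter idea: the field name eager_scanned_pending and the placement of the checks are invented here, while find_next_unskippable_block() and the existing success counter come from the thread:

    /* hypothetical field in LVRelState */
    BlockNumber eager_scanned_pending;  /* eagerly scanned, not yet processed */

    /* in find_next_unskippable_block(), before eagerly scanning one more block */
    if (vacrel->eager_scanned_pending >= vacrel->eager_scan_remaining_successes)
    {
        /*
         * Every remaining success is already claimed by an in-flight block,
         * so treat this block as skippable rather than eagerly scanning it.
         */
    }
    else
        vacrel->eager_scanned_pending++;    /* block is now "in flight" */

    /* in the freeze/fail path, once the block has actually been processed */
    vacrel->eager_scanned_pending--;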
On Tue, May 20, 2025 at 3:22 PM Melanie Plageman <melanieplageman@gmail.com> wrote:
>
> On Tue, May 20, 2025 at 5:23 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>
>> On Fri, May 2, 2025 at 11:59 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>> >
>> > However, there is a possibility that we have already eagerly scanned
>> > another page and returned it to the read stream before we freeze the
>> > eagerly-scanned page and disable eager scanning. In this case, the next
>> > block that we retrieve from the read stream could also be an
>> > eagerly-scanned page.
>>
>> I've added it to Open Items for v18.
>>
>> If I understand correctly, there's a potential scenario where we might
>> eagerly scan more pages than permitted by the success and failure caps.
>> One question is: what is the practical magnitude of this excess
>> scanning? If the overflow could be substantial (for instance, eagerly
>> scanning 30% of table pages), we should consider revising our eager
>> scanning mechanism.
>
> Thanks for reporting this. Sorry I missed it initially.
>
> I need to do some more investigation, but IIUC you are saying that this
> is an interaction between the read stream and eager scan code?

Yes.

> I tried to confirm that was the case by setting io_combine_limit and
> maintenance_io_concurrency to 1, which should be similar behavior to not
> using the read stream WRT read combining etc. But, even doing so, your
> repro tripped the assert. What made you suspect an interaction with the
> read stream?

While I haven't identified how exactly the read stream is related to this issue, what I've observed through debugging is that, during a single read_stream_next_buffer() call, heap_vac_scan_next_block() is invoked multiple times to return blocks to the read stream, and that we continued processing eagerly scanned pages even after the success counter reached zero while processing the previous block. Even when I set both io_combine_limit and maintenance_io_concurrency to 1, the debug logs showed that vacuum still performs eager scanning two blocks ahead of the current block being processed.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
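To illustrate why the callback naturally runs ahead of the block being processed, here is a toy, self-contained model of a pull-style read stream. The names and the fixed look-ahead distance are illustrative only and do not reflect the real read_stream.c implementation:

    #include <stdio.h>

    #define LOOKAHEAD 2             /* even a tiny distance keeps blocks "in flight" */

    typedef long BlockNumber;

    /* stands in for heap_vac_scan_next_block(): pretend every block is eagerly scannable */
    static BlockNumber next_block_cb(void)
    {
        static BlockNumber next = 0;
        return next++;
    }

    int main(void)
    {
        BlockNumber queue[LOOKAHEAD];
        int         queued = 0;

        for (int consumed = 0; consumed < 5; consumed++)
        {
            /* the stream refills its look-ahead queue before handing out a block */
            while (queued < LOOKAHEAD)
                queue[queued++] = next_block_cb();

            BlockNumber blk = queue[0];

            for (int i = 1; i < queued; i++)    /* shift the queue forward */
                queue[i - 1] = queue[i];
            queued--;

            /* by the time the consumer sees blk, the callback has already chosen blk + 1 */
            printf("processing block %ld (block %ld already queued by the callback)\n",
                   blk, blk + 1);
        }
        return 0;
    }

The point is only that whatever the consumer decides while processing block N (for example, disabling eager scanning) cannot retroactively affect blocks the callback has already handed to the stream.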
On Tue, May 20, 2025 at 6:59 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> While I haven't identified how exactly the read stream is related to this
> issue, what I've observed through debugging is that, during a single
> read_stream_next_buffer() call, heap_vac_scan_next_block() is invoked
> multiple times to return blocks to the read stream, and that we continued
> processing eagerly scanned pages even after the success counter reached
> zero while processing the previous block. Even when I set both
> io_combine_limit and maintenance_io_concurrency to 1, the debug logs
> showed that vacuum still performs eager scanning two blocks ahead of the
> current block being processed.
I spent some more time looking into this today. This assert no longer works with streaming vacuum.
Even with the minimums (io_combine_limit set to 1 and maintenance_io_concurrency set to 0), we will still invoke read_stream_get_block() one extra time at the start of the read stream.
To your point about eagerly scanning beyond the success and failure caps, it is possible, but it is hard to get into the situation where you far exceed them.
Though max_pinned_buffers is derived from io_combine_limit * maintenance_io_concurrency, it is limited by the size of the buffer access strategy ring (by default 2 MB for vacuum) and the size of shared buffers. And because we won't prefetch if sequential access is detected (and the prefetch distance has to first ramp up), it is unlikely we will both prefetch the maximum distance of randomly located blocks and combine blocks to the maximum IO size before returning to vacuum code from read stream code. That is to say, the maximum number of extra all-visible blocks scanned should not be very large.
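As a back-of-the-envelope illustration of that bound, the following self-contained snippet applies the limits described above. The GUC values and block size used here are assumptions for the sake of the arithmetic, not necessarily the server defaults:

    #include <stdio.h>

    int main(void)
    {
        int blcksz = 8192;                      /* assumed 8 kB block size */
        int io_combine_limit = 16;              /* assumed: pages per combined IO */
        int maintenance_io_concurrency = 16;    /* assumed: concurrent IOs */
        int ring_size_bytes = 2 * 1024 * 1024;  /* 2 MB vacuum strategy ring */

        int max_pinned = io_combine_limit * maintenance_io_concurrency;
        int ring_buffers = ring_size_bytes / blcksz;

        if (max_pinned > ring_buffers)          /* capped by the strategy ring */
            max_pinned = ring_buffers;

        printf("worst-case blocks queued ahead of the block being vacuumed: %d\n",
               max_pinned);
        return 0;
    }

Under these assumptions the ceiling works out to 256 blocks (2 MB of 8 kB pages), and in practice sequential-access detection and prefetch ramp-up keep the real overrun far smaller, as noted above.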
We ran into a similar concern with the autoprewarm read stream user and decided the overrun would not be large enough to merit special handling.
In this case, because we don't know if we will successfully freeze eagerly scanned blocks until after they have been yielded by the read stream API, there is no way to use the caps to limit the prefetch distance or read combine size. Doing so would entail mixing logic into vacuum code about how the read stream API works -- at least as far as I can tell.
I think the best course of action is to either change the assert to a guard:

    if (vacrel->eager_scan_remaining_successes > 0)
        vacrel->eager_scan_remaining_successes--;

or remove the assert entirely and keep the unconditional decrement, allowing the counter to go negative:

    vacrel->eager_scan_remaining_successes--;
- Melanie
On Wed, May 21, 2025 at 6:11 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> > if (vacrel->eager_scan_remaining_successes > 0)
> >     vacrel->eager_scan_remaining_successes--;
>
> I've attached a patch that uses this idea. Feedback is very welcome.

Thanks for writing the patch!

I actually think we have the same situation with eager_scan_remaining_fails. Since the extra pages that are eagerly scanned could either fail or succeed to be frozen, we probably also need to change the assert in the failure case into a guard as well:

    else
    {
        Assert(vacrel->eager_scan_remaining_fails > 0);
        vacrel->eager_scan_remaining_fails--;
    }

->

    else if (vacrel->eager_scan_remaining_fails > 0)
        vacrel->eager_scan_remaining_fails--;

In the comment you wrote, I would probably just change one thing:

    + /*
    +  * Report only once that we disabled eager scanning. This
    +  * check is required because we might have eagerly read
    +  * more blocks and we could reach here even after
    +  * disabling eager scanning.
    +  */

I would emphasize that we read ahead these blocks before executing the code trying to freeze them. So, I might instead say something like: "Report only once that we disabled eager scanning. We may eagerly read ahead blocks in excess of the success or failure caps before attempting to freeze them, so we could reach here even after disabling additional eager scanning."

And then probably avoid repeating the whole comment above the remaining fails guard.

- Melanie
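Putting the two suggestions together, here is a rough sketch of how the counter handling might look once both asserts become guards. Only the guards, the field resets, and the reworded comment come from the thread; the vm_page_frozen flag, the report-once check, and the overall placement are assumptions for illustration:

    /* sketch: post-fix counter handling with both asserts turned into guards */
    if (vm_page_frozen)
    {
        if (vacrel->eager_scan_remaining_successes > 0)
            vacrel->eager_scan_remaining_successes--;

        if (vacrel->eager_scan_remaining_successes == 0)
        {
            /*
             * Report only once that we disabled eager scanning. We may
             * eagerly read ahead blocks in excess of the success or failure
             * caps before attempting to freeze them, so we could reach here
             * even after disabling additional eager scanning.
             */
            if (vacrel->eager_scan_max_fails_per_region > 0)
                ereport(DEBUG2,
                        (errmsg("disabling eager scanning")));

            /*
             * If we hit our success cap, permanently disable eager
             * scanning by setting the other eager scan management
             * fields to their disabled values.
             */
            vacrel->eager_scan_remaining_fails = 0;
            vacrel->next_eager_scan_region_start = InvalidBlockNumber;
            vacrel->eager_scan_max_fails_per_region = 0;
        }
    }
    else if (vacrel->eager_scan_remaining_fails > 0)
        vacrel->eager_scan_remaining_fails--;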
On Thu, May 22, 2025 at 7:27 AM Melanie Plageman <melanieplageman@gmail.com> wrote:
>
> On Wed, May 21, 2025 at 6:11 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > > if (vacrel->eager_scan_remaining_successes > 0)
> > >     vacrel->eager_scan_remaining_successes--;
> >
> > I've attached a patch that uses this idea. Feedback is very welcome.
>
> Thanks for writing the patch!
>
> I actually think we have the same situation with
> eager_scan_remaining_fails.

Good catch.

> Since the extra pages that are eagerly
> scanned could either fail or succeed to be frozen, we probably also
> need to change the assert in the failure case into a guard as well:
>
>     else
>     {
>         Assert(vacrel->eager_scan_remaining_fails > 0);
>         vacrel->eager_scan_remaining_fails--;
>     }
>
> ->
>
>     else if (vacrel->eager_scan_remaining_fails > 0)
>         vacrel->eager_scan_remaining_fails--;
>
> In the comment you wrote, I would probably just change one thing:
>
>     + /*
>     +  * Report only once that we disabled eager scanning. This
>     +  * check is required because we might have eagerly read
>     +  * more blocks and we could reach here even after
>     +  * disabling eager scanning.
>     +  */
>
> I would emphasize that we read ahead these blocks before executing the
> code trying to freeze them. So, I might instead say something like:
> "Report only once that we disabled eager scanning. We may eagerly read
> ahead blocks in excess of the success or failure caps before
> attempting to freeze them, so we could reach here even after disabling
> additional eager scanning."
>
> And then probably avoid repeating the whole comment above the
> remaining fails guard.

Agreed. I've updated the patch. Does this address your comments?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Thu, May 22, 2025 at 4:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Agreed. I've updated the patch. Does this address your comments?

Yep. LGTM.

I'd probably just remove the parenthetical remark about 20% from the commit message since that only applies to the success cap and referencing both the success and failure caps will make the sentence very long.

From the commit message:

    pages (beyond 20% of the total number of all-visible but not
    all-frozen pages), this does not pose a practical concern as the

Thanks again for writing the patch!

- Melanie
On Thu, May 22, 2025 at 1:19 PM Melanie Plageman <melanieplageman@gmail.com> wrote:
>
> On Thu, May 22, 2025 at 4:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > Agreed. I've updated the patch. Does this address your comments?
>
> Yep. LGTM.
>
> I'd probably just remove the parenthetical remark about 20% from the
> commit message since that only applies to the success cap and
> referencing both the success and failure caps will make the sentence
> very long.
>
> From the commit message:
>
>     pages (beyond 20% of the total number of all-visible but not
>     all-frozen pages), this does not pose a practical concern as the

Agreed. I'll remove that part and push early next week, barring any objections.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Fri, May 23, 2025 at 1:53 PM Melanie Plageman <melanieplageman@gmail.com> wrote:
>
> On Fri, May 23, 2025 at 12:41 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I'll remove that part and push early next week, barring any objections.
>
> Great, thanks so much!

Pushed the fix and closed the open item. Thank you for reviewing the patch!

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com