On Tue, Jul 08, 2025 at 11:43:17PM -0400, Peter Geoghegan wrote:
> On Tue, Jul 8, 2025 at 11:04 PM Noah Misch <noah@leadboat.com> wrote:
> > > -backwards_scan_session: NOTICE: notice triggered for injection point lock-and-validate-new-lastcurrblkno
> > > +ERROR: could not find injection point lock-and-validate-left to wake up
> >
> > Agreed. Perhaps it's getting a different plan type on FreeBSD, so it's not
> > even reaching the INJECTION_POINT() calls? That would be consistent with
> > these output diffs having no ERROR from attach/detach. Some things I'd try:
> >
> > - Add a plain elog(WARNING) before each INJECTION_POINT()
> > - Use debug_print_plan or similar to confirm the plan type
>
> I added a pair of elog(WARNING) traces before each of the new
> INJECTION_POINT() calls.
>
> When I run the test against the FreeBSD CI target with this new
> instrumentation, I see a WARNING that indicates that we've reached the
> top of _bt_lock_and_validate_left as expected. I don't see any second
> WARNING indicating that we've taken _bt_lock_and_validate_left's
> unhappy path, though (and the test still fails). This doesn't look
> like an issue with the planner.
>
> I attach the relevant regression test output, that shows all this.

Looking at .cirrus.tasks.yml, I bet the key factor is that the CI task uses
debug_parallel_query=regress. I bet the leader is attached to the injection
point, but the WARNING is reached in a parallel worker.

If that matches what you see, I'd use a PARALLEL RESTRICTED or PARALLEL UNSAFE
function in your query to ensure the code in question runs in the leader.
(Simply overriding debug_parallel_query is less robust, because test runs
could use other settings that cause selection of a parallel plan.)
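
For illustration, here is a rough, untested sketch of that approach. The
function name, table name, and column name are all made up for this example
and do not come from the patch; plpgsql is used only so that the function
body cannot be inlined away by the planner.

```sql
-- Hypothetical wrapper function; name and body are placeholders.
-- PARALLEL RESTRICTED means the function may run only in the leader, so a
-- scan whose qual calls it should not be placed below a Gather node.
CREATE FUNCTION leader_only(v int) RETURNS int
    LANGUAGE plpgsql STABLE PARALLEL RESTRICTED
    AS $$ BEGIN RETURN v; END $$;

-- Hypothetical test query: a backwards index scan whose qual calls the
-- restricted function, so the scan (and the injection points it reaches)
-- should execute in the leader rather than in a parallel worker.
SELECT col FROM backwards_scan_tbl
WHERE col <= leader_only(100)
ORDER BY col DESC;
```

Marking the function STABLE rather than VOLATILE should keep the qual usable
as an index condition, though the resulting plan would still need to be
confirmed with EXPLAIN.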