Hi,
On 2020-09-08 19:30:40 -0700, Andres Freund wrote:
> It was fairly straight forward to implement the basics of
> INTERRUPTIBLE. However it seems like it might not actually address my
> main concern, i.e. making this reliably testable with isolation
> tester. At least not the way I envisioned it...
>
> My idea for the test was to have one isolationtester session start a
> seqscan, which would then reliably block a concurrent VACUUM (FREEZE,
> INTERRUPTIBLE). That'd then give a stable point to switch back to the
> first session, which would interrupt the VACUUM by doing a LOCK.
>
> But because there's no known waiting-for pid for a buffer pin wait, we
> currently don't detect that we're blocked :(.
>
>
> Now, it'd not be too hard to make pg_isolation_test_session_is_blocked
> also report being blocked when we're waiting for a buffer pin (ignoring
> interesting_pids), similar to the safe snapshot test. However I'm
> worried that that could lead to "false" reports of blocking? But maybe
> that's a small enough risk because there's few unconditional cleanup
> lock acquisitions?
>
> Hacking such a wait condition test into
> pg_isolation_test_session_is_blocked() successfully allows my test to
> work for VACUUM.
Here's a patch series implementing that:
0001: Adds INTERRUPTIBLE to VACUUM ANALYZE
There's quite a few XXX's inside. And it needs some none
isolationtester test.
0002: Treat BufferPin as a waiting condition in isolationtest.
That's the aforementioned workaround.
0003: A test, finally.
But it only tests VACUUM, not yet ANALYZE. Perhaps also a test for
not allowing interruptions, somehow?
Clearly WIP, but good enough for some comments, I hope?
A few comments:
- Right now there can only be one such blocking task, because
PROC_IS_INTERRUPTIBLE is only set by vacuum / analyze, and the lock
levels are self exclusive. But it's generically named now, so it seems
just a matter of time until somebody adds that to other commands. I
think it's ok to not add support for ProcSleep() killing multiple
other processes?
- It's a bit annoying that normal user backends just see a generic
"ERROR: canceling statement due to user request". Do we need something
better?
- The order in which VACUUM parameters are documented & implemented
seems fairly random. Perhaps it'd make sense to reorder
alphabetically?
> Not yet sure about what a suitable way to test this for analyze would
> be, unless we'd also accept VacuumDelay as a wait condition :(. I guess
> one fairly easy way would be to include an expression index, and have
> the expression index acquire an unavailable lock. Which'd allow to
> switch to another session.
But here I've not yet done anything. That just seems too ugly & fragile.
Greetings,
Andres Freund