I wrote:
> Andres Freund <andres@anarazel.de> writes:
>> Hm. I wonder if all that's happening with prairiedog is that the notice
>> is sent a bit later. I think that could e.g. conceivably happen because
>> TCP_NODELAY isn't supported on prairiedog? Or just because the machine
>> is very slow?
> The notices (not notifies) are coming out in the opposite order from
> expected. I haven't really thought hard about what's causing that;
> it seems odd, because isolationtester isn't supposed to give up waiting
> for a session until it's visibly blocked according to pg_locks. Maybe
> it needs to recheck for incoming data once more after seeing that?
Ah-hah, that seems to be the answer. With the attached patch I'm
getting reliable-seeming passes on prairiedog.
regards, tom lane
diff --git a/src/test/isolation/isolationtester.c b/src/test/isolation/isolationtester.c
index 6ab19b1..e97fef1 100644
*** a/src/test/isolation/isolationtester.c
--- b/src/test/isolation/isolationtester.c
*************** try_complete_step(Step *step, int flags)
*** 752,757 ****
--- 752,777 ----
  if (waiting)    /* waiting to acquire a lock */
  {
+     /*
+      * Since it takes time to perform the lock-check query,
+      * some data --- notably, NOTICE messages --- might have
+      * arrived since we looked.  We should do PQconsumeInput
+      * to process any such messages, and we might as well then
+      * check PQisBusy, though it's unlikely to succeed.
+      */
+     if (!PQconsumeInput(conn))
+     {
+         fprintf(stderr, "PQconsumeInput failed: %s\n",
+                 PQerrorMessage(conn));
+         exit(1);
+     }
+     if (!PQisBusy(conn))
+         break;
+
+     /*
+      * conn is still busy, so conclude that the step really is
+      * waiting.
+      */
      if (!(flags & STEP_RETRY))
          printf("step %s: %s <waiting ...>\n",
                 step->name, step->sql);
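
For anyone who wants to see the race outside of isolationtester, here is a
toy model of it. This is not libpq code and not part of the patch: a pipe
stands in for the server connection, and drain_pending() and
bytes_seen_before_waiting() are made-up names used only for illustration.
The point is just that data arriving while the slow lock-check runs is
missed unless we look at the socket once more before declaring the step
to be waiting.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: read whatever is available on fd right now,
 * without blocking, appending to buf.  Returns the new fill level.
 * This plays the role that PQconsumeInput plays in the patch.
 */
static size_t
drain_pending(int fd, char *buf, size_t buflen, size_t used)
{
    struct pollfd pfd;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Zero timeout: only take data that has already arrived. */
    while (used < buflen - 1 &&
           poll(&pfd, 1, 0) > 0 && (pfd.revents & POLLIN))
    {
        ssize_t n = read(fd, buf + used, buflen - 1 - used);

        if (n <= 0)
            break;
        used += (size_t) n;
    }
    buf[used] = '\0';
    return used;
}

/*
 * Hypothetical model of one pass through the wait loop.  Returns how
 * many bytes of the "notice" we have seen at the moment we would print
 * "<waiting ...>".  recheck = 0 models the old behavior, recheck = 1
 * models the behavior with the extra look at the connection.
 */
static size_t
bytes_seen_before_waiting(int recheck)
{
    static const char notice[] = "NOTICE: hello\n";
    int         fds[2];
    char        buf[64];
    size_t      used = 0;
    int         rc;
    ssize_t     written;

    rc = pipe(fds);
    assert(rc == 0);
    (void) rc;

    /* First look at the connection: nothing has arrived yet. */
    used = drain_pending(fds[0], buf, sizeof(buf), used);

    /*
     * The slow lock-check query would run here.  While it runs, the
     * "server" sends a notice on the original connection.
     */
    written = write(fds[1], notice, strlen(notice));
    assert(written == (ssize_t) strlen(notice));
    (void) written;

    /* Patched behavior: look again before concluding we are blocked. */
    if (recheck)
        used = drain_pending(fds[0], buf, sizeof(buf), used);

    close(fds[0]);
    close(fds[1]);
    return used;
}
```

Calling bytes_seen_before_waiting(0) models the unpatched ordering, where
the notice is still sitting unread when "<waiting ...>" gets printed;
bytes_seen_before_waiting(1) models the recheck, where the notice has
already been consumed by then. The real code additionally checks PQisBusy
after consuming input, which this sketch does not try to model.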