Re: pgsql: Reorder EPQ work, to fix rowmark related bugs and improve effici - Mailing list pgsql-committers

From Tom Lane
Subject Re: pgsql: Reorder EPQ work, to fix rowmark related bugs and improve effici
Msg-id 25247.1568074211@sss.pgh.pa.us
In response to Re: pgsql: Reorder EPQ work, to fix rowmark related bugs and improve effici  (Andres Freund <andres@anarazel.de>)
Responses Re: pgsql: Reorder EPQ work, to fix rowmark related bugs and improve effici  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-committers
Andres Freund <andres@anarazel.de> writes:
> On September 9, 2019 11:01:28 PM GMT+01:00, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think the first question is why is only prairiedog showing this.

> We only saw the similar failure on master on a very small number of machines as well. Given the somewhat unusual way
> packets have to go out to be likely to be problematic, that's not too surprising.

Actually, now that I look at the prior discussion, it was *only*
prairiedog that we saw that case on.  However, in the last little
while mandrill and bowerbird have come up with failures more or
less like prairiedog's.  Interestingly, bowerbird passed that
test the first time, so it's intermittent there.

I did just see prairiedog get past eval-plan-qual in master,
though it won't finish that run for a good while yet.

It appears to me that this is indeed a case of notice-reporting
timing problems in isolationtester, since these WARNING messages
are handled the same way as notices.  I want to try to reproduce
manually on prairiedog to confirm that, but it seems like a pretty
likely explanation.

If that is the explanation, I agree that there's no need for a
panic response (ie re-wrapping the beta4 tarball).  I doubt that
very many people are going to be testing beta4 on machines that
are slow enough to observe this issue --- and if there are people
testing on slow machines, they probably aren't masochistic enough
to be doing check-world, anyway.

>>> Tom, do you think we should backpatch both the order fix and the notice improvement, or just the former? And to
>>> which version?

As for that, now that we realize that this applies to more than
just NOTICEs, I think we should back-patch the code change in
30717637c at least to v11, maybe all the way.  I don't see any
WARNINGs in the isolation expected files before v11, but it
hardly seems unlikely that we might back-patch some future test
that expects those to be printed in a consistent way.

The case for back-patching ebd499282 (allow NOTICEs to print)
is weaker, but it still seems like it might be a hazard for
back-patching test cases if we don't do so.

On balance I'm inclined to back-patch both changes.  Thoughts?

            regards, tom lane


