Thread: pg_regress: Treat child process failure as test failure
In the thread about TAP format out in pg_regress, Andres pointed out [0] that we allow a test to pass even if the test child process failed. While its probably pretty rare to have a test pass if the process failed, this brings a risk for false positives (and it seems questionable that any regress test will have a child process failing as part of its intended run). The attached makes child failures an error condition for the test as a belts and suspenders type check. Thoughts? -- Daniel Gustafsson https://vmware.com/ [0] https://postgr.es/m/20221122235636.4frx7hjterq6bmls@awork3.anarazel.de
Attachment
Hi, On 2022-11-26 21:11:39 +0100, Daniel Gustafsson wrote: > In the thread about TAP format out in pg_regress, Andres pointed out [0] that > we allow a test to pass even if the test child process failed. While its > probably pretty rare to have a test pass if the process failed, this brings a > risk for false positives (and it seems questionable that any regress test will > have a child process failing as part of its intended run). > The attached makes child failures an error condition for the test as a belts > and suspenders type check. Thoughts? I wonder if it's the right thing to treat a failed psql that's then also ignored as "failed (ignored)". Perhaps it'd be better to move the statuses[i] != 0 check to before the if (differ)? > - if (differ) > + if (differ || statuses[i] != 0) > { > bool ignore = false; > _stringlist *sl; > @@ -1815,7 +1815,7 @@ run_single_test(const char *test, test_start_function startfunc, > differ |= newdiff; > } > > - if (differ) > + if (differ || exit_status != 0) > { > status(_("FAILED")); > fail_count++; It certainly is a bit confusing that we print a psql failure separately from the if "FAILED" vs "ok" bit. Greetings, Andres Freund
> On 26 Nov 2022, at 21:55, Andres Freund <andres@anarazel.de> wrote: > On 2022-11-26 21:11:39 +0100, Daniel Gustafsson wrote: >> The attached makes child failures an error condition for the test as a belts >> and suspenders type check. Thoughts? > > I wonder if it's the right thing to treat a failed psql that's then also > ignored as "failed (ignored)". Perhaps it'd be better to move the statuses[i] > != 0 check to before the if (differ)? I was thinking about that too, but I think you're right. The "ignore" part is about the test content and not the test run structure. > It certainly is a bit confusing that we print a psql failure separately from > the if "FAILED" vs "ok" bit. I've moved the statuses[i] check before the differ check, such that there is a separate block for this not mixed up with the differs check and printing. It does duplicate things a little bit but also makes it a lot clearer. -- Daniel Gustafsson https://vmware.com/
Attachment
> On 26 Nov 2022, at 22:46, Daniel Gustafsson <daniel@yesql.se> wrote: > I've moved the statuses[i] check before the differ check, such that there is a > separate block for this not mixed up with the differs check and printing. Rebased patch to handle breakage of v2 due to bd8d453e9b. -- Daniel Gustafsson
Attachment
Hi, On 2023-02-22 15:10:11 +0100, Daniel Gustafsson wrote: > > On 26 Nov 2022, at 22:46, Daniel Gustafsson <daniel@yesql.se> wrote: > > > I've moved the statuses[i] check before the differ check, such that there is a > > separate block for this not mixed up with the differs check and printing. > > Rebased patch to handle breakage of v2 due to bd8d453e9b. I think we probably should just apply this? The current behaviour doesn't seem right, and I don't see a downside of the new behaviour? Greetings, Andres Freund
> On 22 Feb 2023, at 21:33, Andres Freund <andres@anarazel.de> wrote: > On 2023-02-22 15:10:11 +0100, Daniel Gustafsson wrote: >> Rebased patch to handle breakage of v2 due to bd8d453e9b. > > I think we probably should just apply this? The current behaviour doesn't seem > right, and I don't see a downside of the new behaviour? Agreed, I can't think of a regression test where we wouldn't want this. My only concern was if any of the ECPG tests were doing something odd that would break from this but I can't see anything. -- Daniel Gustafsson
Daniel Gustafsson <daniel@yesql.se> writes: >> On 22 Feb 2023, at 21:33, Andres Freund <andres@anarazel.de> wrote: >> On 2023-02-22 15:10:11 +0100, Daniel Gustafsson wrote: >>> Rebased patch to handle breakage of v2 due to bd8d453e9b. >> I think we probably should just apply this? The current behaviour doesn't seem >> right, and I don't see a downside of the new behaviour? > Agreed, I can't think of a regression test where we wouldn't want this. My > only concern was if any of the ECPG tests were doing something odd that would > break from this but I can't see anything. +1. I was a bit surprised to realize that we might not count such a case as a failure. regards, tom lane
> On 22 Feb 2023, at 21:55, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Daniel Gustafsson <daniel@yesql.se> writes: >>> On 22 Feb 2023, at 21:33, Andres Freund <andres@anarazel.de> wrote: >>> On 2023-02-22 15:10:11 +0100, Daniel Gustafsson wrote: >>>> Rebased patch to handle breakage of v2 due to bd8d453e9b. > >>> I think we probably should just apply this? The current behaviour doesn't seem >>> right, and I don't see a downside of the new behaviour? > >> Agreed, I can't think of a regression test where we wouldn't want this. My >> only concern was if any of the ECPG tests were doing something odd that would >> break from this but I can't see anything. > > +1. I was a bit surprised to realize that we might not count such > a case as a failure. Done that way, thanks! -- Daniel Gustafsson