On Thu, Jun 11, 2020 at 9:43 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> On Thu, Jun 11, 2020 at 2:13 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > I have in the past scraped the latter results and tried to make sense of
> > them. They are *mighty* noisy, even when considering just one animal
> > that I know to be running on a machine with little else to do. Maybe
> > averaging across the whole buildfarm could reduce the noise level, but
> > I'm not very hopeful. Per-test-script times would likely be even
> > noisier (ISTM anyway, maybe I'm wrong).
>
> I've been doing that in a little database that pulls down the results
> and analyses them with primitive regexes. First I wanted to know the
> pass/fail history for each individual regression, isolation and TAP
> script, then I wanted to build something that could identify tests
that are 'flapping', and work out when they started and stopped
flapping, etc. I soon realised it was all too noisy, but then I
> figured that I could fix that by detecting crashes. So I classify
every top-level build farm run as SUCCESS, FAILURE or CRASH. If the
top-level run was CRASH, then I can disregard the individual
per-script results, because they're all BS.
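
To make that concrete, the classification step amounts to something
like this (a simplified Python sketch; the crash signatures shown are
illustrative, not the exact patterns the scraper matches):

    import re

    # Illustrative crash signatures; the real thing has to match
    # whatever actually shows up in the build farm logs.
    CRASH_PATTERNS = [
        re.compile(r"PANIC:"),
        re.compile(r"TRAP: FailedAssertion"),
        re.compile(r"was terminated by signal \d+"),
        re.compile(r"server closed the connection unexpectedly"),
    ]

    def classify_run(log_text, any_script_failed):
        """Classify a top-level run as SUCCESS, FAILURE or CRASH."""
        if any(p.search(log_text) for p in CRASH_PATTERNS):
            # Per-script pass/fail results from a crashed run are
            # unreliable, so the loader drops them entirely.
            return "CRASH"
        return "FAILURE" if any_script_failed else "SUCCESS"

Anything classified as CRASH simply doesn't contribute per-script rows
to the pass/fail history, which is what cleaned up most of the noise.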
With more coffee I realise that you were talking about noisy timings,
not noisy pass/fail results. But I still want to throw that idea out
there, if we're considering analysing the logs.