Re: Recording test runtimes with the buildfarm - Mailing list pgsql-hackers

From Magnus Hagander
Subject Re: Recording test runtimes with the buildfarm
Date
Msg-id CABUevExDEgsTgEXfOUCWBNteFAmb4JJu0yCbuBZxxUV7LrOeVA@mail.gmail.com
In response to Re: Recording test runtimes with the buildfarm  (Andrew Dunstan <andrew.dunstan@2ndquadrant.com>)
Responses Re: Recording test runtimes with the buildfarm
List pgsql-hackers
On Thu, Jun 11, 2020 at 4:56 PM Andrew Dunstan <andrew.dunstan@2ndquadrant.com> wrote:

On 6/11/20 10:21 AM, Stephen Frost wrote:
> Greetings,
>
> * David Rowley (dgrowleyml@gmail.com) wrote:
>> On Thu, 11 Jun 2020 at 10:02, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Thomas Munro <thomas.munro@gmail.com> writes:
>>>> I've been doing that in a little database that pulls down the results
>>>> and analyses them with primitive regexes.  First I wanted to know the
>>>> pass/fail history for each individual regression, isolation and TAP
>>>> script, then I wanted to build something that could identify tests
>>>> that are 'flapping', and work out when they started and stopped
>>>> flapping etc.  I soon realised it was all too noisy, but then I
>>>> figured that I could fix that by detecting crashes.  So I classify
>>>> every top level build farm run as SUCCESS, FAILURE or CRASH.  If the
>>>> top level run was CRASH, then I can disregard the individual per
>>>> script results, because they're all BS.
>>> If you can pin the crash on a particular test script, it'd be useful
>>> to track that as a kind of failure.  In general, though, both crashes
>>> and non-crash failures tend to cause collateral damage to later test
>>> scripts --- if you can't filter that out then the later scripts will
>>> have high false-positive rates.
>> I guess the fact that you've both needed to do analysis on individual
>> tests shows that there might be a call for this beyond just recording
>> the test's runtime.
>>
>> If we had a table that stored the individual test details, pass/fail
>> and just stored the timing information along with that, then, even if
>> the timing was unstable, it could still be useful for some analysis.
>> I'd be happy enough even if that was only available as a csv file
>> download.  I imagine the buildfarm does not need to provide us with
>> any tools for doing analysis on this. Ideally, there would be some
>> run_id that we could link it back to the test run which would give us
>> the commit SHA, and the animal that it ran on. Joining to details
>> about the animal could be useful too, e.g. perhaps a certain test
>> always fails on 32-bit machines.
>>
>> I suppose that maybe we could modify pg_regress to add a command line
>> option to have it write out a machine-readable file, e.g:
>> testname,result,runtime\n, then just have the buildfarm client ship
>> that off to the buildfarm server to record in the database.
> That seems like it'd be the best approach to me, though I'd defer to
> Andrew on it.
>
> By the way, if you'd like access to the buildfarm archive server where
> all this stuff is stored, that can certainly be arranged, just let me
> know.
>


Yeah, we'll need to work out where to stash the file. The client will
pick up anything in src/regress/log for "make check", but would need
adjusting for other steps that invoke pg_regress. I'm getting close to
cutting a new client release, but I can delay it till we settle this.


On the server side, we could add a table with a key of <animal,
snapshot, branch, step, testname> but we'd need to make sure those test
names were unique. Maybe we need a way of telling pg_regress to prepend
a module name (e.g. btree_gist or plperl) to the test name.
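
The uniqueness concern above can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the table name test_result, the "module/testname" prefix convention, the animal, snapshot and step values, and the test name "init" are hypothetical, and SQLite stands in for the real buildfarm database.

```python
import sqlite3


def qualified_name(module, testname):
    """Prefix a test name with its module (hypothetical 'module/test' convention)."""
    return f"{module}/{testname}" if module else testname


def make_db():
    """Create an in-memory table keyed on <animal, snapshot, branch, step, testname>."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE test_result (
            animal     TEXT NOT NULL,
            snapshot   TEXT NOT NULL,
            branch     TEXT NOT NULL,
            step       TEXT NOT NULL,
            testname   TEXT NOT NULL,
            result     TEXT NOT NULL,
            runtime_ms INTEGER,
            PRIMARY KEY (animal, snapshot, branch, step, testname)
        )
        """
    )
    return conn


def record(conn, animal, snapshot, branch, step, testname, result, runtime_ms):
    """Insert one test result; raises sqlite3.IntegrityError on a duplicate key."""
    conn.execute(
        "INSERT INTO test_result VALUES (?, ?, ?, ?, ?, ?, ?)",
        (animal, snapshot, branch, step, testname, result, runtime_ms),
    )


if __name__ == "__main__":
    conn = make_db()
    # Hypothetical run identifiers, not real buildfarm data.
    run = ("exampleanimal", "2020-06-11 00:00:00", "HEAD", "contrib-check")
    # Two modules with a same-named test no longer collide once prefixed.
    record(conn, *run, qualified_name("btree_gist", "init"), "ok", 12)
    record(conn, *run, qualified_name("plperl", "init"), "ok", 30)
    try:
        record(conn, *run, qualified_name("plperl", "init"), "ok", 31)
    except sqlite3.IntegrityError as exc:
        print("duplicate key rejected:", exc)
```

Without the prefix, the two "init" rows would share one key within the same step, which is the collision the module name is meant to avoid.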

It seems pretty trivial to, for example, get all the steps and their timings out of check.log with a regexp. I just used '^(?:test)?\s+(\S+)\s+\.\.\. ok\s+(\d+) ms$' as the regexp. Running that against a few hundred build runs in the db generally looks fine, though I didn't look into it in detail.

Of course, that only looked at check.log, and more logic would be needed if we want to look into the other areas as well, but as long as it's pg_regress output I think it should be easy?
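
As a rough sketch of the regexp approach described above, the following applies that same pattern line by line and prints rows in the testname,result,runtime shape suggested earlier in the thread. The sample log text and the function name parse_check_log are made up for illustration; real check.log contents may differ, and this pattern only picks up tests reported as "ok".

```python
import re

# The regexp quoted in the message above; it matches only "ok" result lines.
LINE_RE = re.compile(r'^(?:test)?\s+(\S+)\s+\.\.\. ok\s+(\d+) ms$')


def parse_check_log(text):
    """Return a list of (testname, runtime_ms) for each passing test line."""
    results = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            results.append((match.group(1), int(match.group(2))))
    return results


# Illustrative sample only, not taken from a real buildfarm run.
SAMPLE = """\
parallel group (2 tests):  boolean char
     boolean                      ... ok           45 ms
     char                         ... FAILED       12 ms
test tablespace                   ... ok          321 ms
"""

if __name__ == "__main__":
    for name, runtime_ms in parse_check_log(SAMPLE):
        print(f"{name},ok,{runtime_ms}")
```

Failed tests are skipped by this pattern, so recording pass/fail alongside the timing would need a broader regexp or a second pass.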

