Re: Commit fest 2017-11 - Mailing list pgsql-hackers

From Magnus Hagander
Subject Re: Commit fest 2017-11
Date
Msg-id CABUevEyme6XH+S_E37n3uYKrBEmcKHd746c5FDZadec-7BjGmw@mail.gmail.com
In response to Re: Commit fest 2017-11  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-hackers


On Thu, Mar 29, 2018 at 10:38 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
On 29 March 2018 at 09:19, Magnus Hagander <magnus@hagander.net> wrote:
>
>
> On Thu, Mar 29, 2018 at 10:06 AM, Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
>>
>> Hi, Fabien!
>>
>> On Fri, Dec 1, 2017 at 9:10 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>>>>
>>>> And the last 21 patches have been classified as well. Here is the
>>>> final score for this time:
>>>> Committed: 55.
>>>> Moved to next CF: 103.
>>>> Rejected: 1.
>>>> Returned with Feedback: 47.
>>>> Total: 206.
>>>>
>>>> Thanks to all the contributors for this session! The CF is now closed.
>>>
>>>
>>> Thanks for the CF management.
>>>
>>> Attached is a small graph of the end status of patches at the end of each CF.
>>
>>
>> Thank you for the graph!
>> It would be interesting to see statistics not by patch count, but by
>> patch complexity.
>> As a rough measure of complexity we can use the number of affected lines.
>> I expect those statistics would be even more distressing: small patches
>> can be committed faster, while large patches travel from one CF to
>> another for a long time. It would be interesting to check whether that's
>> true...
>>
>
> I think that's very hard to do given that we simply don't have the data
> today. It's not that simple to analyze the patches in the archives, because
> some are single file, some are spread across multiple etc. I fear that
> anything trying to work off that would actually make the stats even more
> inaccurate. This is the pattern I've seen whenever I've tried that before.
>
> I wonder if we should consider adding a field to the CF app *specifically*
> to track things like this. What I'm thinking is a field that's set (or at
> least verified) by the person who flags a patch as committed with choices
> like Trivial/Simple/Medium/Complex (trivial being things like typo fixes
> etc, which today can hugely skew the stats).
>
> Would people who update the CF app be willing to put in that effort (however
> small it is, it has to be done consistently to be of any value) in order to
> be able to track such statistics?
>
> It would only help for the future of course, unless somebody wants to go
> back and backfill existing patches with such information (which they might
> be).

The focus of this is on the Committers, which seems wrong.

I suggest someone does another analysis that shows how many patch
reviews have been conducted by patch authors, so we can highlight
people who are causing the problem yet not helping solve the problem.

I have exactly such an analysis available. The problem with it is that it cannot take complexity into account.

Reviewing a one-letter typo patch is not the same as reviewing MERGE, JIT or parallel query... But right now, we don't have enough proper data to differentiate those.

The idea would be to put the focus not necessarily on the committer, but on the person who marks the patch as committed in the CF app. We can of course have the submitter also enter this as metadata, but in the end the reviewers and committers are the only ones who can judge what the *actual* complexity of a patch is/was.
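
For reference, the "number of affected lines" measure suggested upthread could be sketched roughly as below. This is only an illustration, not anything the CF app does: the function name `count_affected_lines` and the sample diff are made up, and it assumes each patch is available as a single unified diff, which is exactly the assumption that often does not hold for patches in the archives.

```python
import re

# Matches a unified-diff hunk header such as "@@ -10,2 +10,3 @@".
# The line counts are optional and default to 1 when omitted.
HUNK_RE = re.compile(r"^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@")


def count_affected_lines(diff_text):
    """Return (added, removed) line counts for a unified diff.

    Hypothetical helper: hunk headers are used to track how many lines
    each hunk still contains, so file headers ("--- a/x", "+++ b/x")
    are not mistaken for removed or added lines.
    """
    added = removed = 0
    old_left = new_left = 0  # lines still expected in the current hunk
    for line in diff_text.splitlines():
        if old_left <= 0 and new_left <= 0:
            # Outside a hunk: only a hunk header is of interest.
            match = HUNK_RE.match(line)
            if match:
                old_left = int(match.group(1)) if match.group(1) is not None else 1
                new_left = int(match.group(2)) if match.group(2) is not None else 1
            continue
        if line.startswith("+"):
            added += 1
            new_left -= 1
        elif line.startswith("-"):
            removed += 1
            old_left -= 1
        elif line.startswith("\\"):
            continue  # "\ No newline at end of file" marker
        else:
            # Context line: present in both the old and the new file.
            old_left -= 1
            new_left -= 1
    return added, removed


# Made-up two-file patch, for illustration only.
SAMPLE = """\
--- a/README
+++ b/README
@@ -1,3 +1,3 @@
 PostgreSQL
-Databse system
+Database system
 docs
--- a/src/foo.c
+++ b/src/foo.c
@@ -10,2 +10,3 @@
 int x;
+int y;
 int z;
"""

added, removed = count_affected_lines(SAMPLE)
print(f"added={added} removed={removed}")
```

A raw line count like this treats a typo fix and a one-line planner change the same, so it would at best supplement a complexity field set by a reviewer or committer.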
 
--
