Thread: Large Commitfest items

Large Commitfest items

From
Andrew Dunstan
Date:
There has been some discussion around excluding large items from the
current commitfest, for several reasons. However I don't recall ever
seeing a definition of a large item. It seems to be a bit like "I know
it when I see it."   I've been looking at the current commitfest
entries. Based on that I suggest a heuristic that says a commitfest
item with patches greater than 5000 lines is large.

That's around 20 items in the current commitfest. I think we need to
be able to use some discretion to allow items with more lines or
exclude items with less lines that nevertheless have far-reaching
impacts, but this seems like a reasonable starting point.

We could double that number by reducing the 5000 to 2000, but that
seems too restrictive to me.

Thoughts?

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Large Commitfest items

From
Andres Freund
Date:
Hi,

On 2018-07-01 14:46:47 -0400, Andrew Dunstan wrote:
> There has been some discussion around excluding large items from the
> current commitfest, for several reasons. However I don't recall ever
> seeing a definition of a large item. It seems to be a bit like "I know
> it when I see it."   I've been looking at the current commitfest
> entries. Based on that I suggest a heuristic that says a commitfest
> item with patches greater than 5000 lines is large.

FWIW, I personally think the criteria should rather be "old" or "very
small". I.e. for patches that have waited for review being large
shouldn't necessarily be an impediment for getting worked on (depending
on invasiveness maybe not committed), and very small for newer things
should be way below 5kloc.

Greetings,

Andres Freund


Re: Large Commitfest items

From
Craig Ringer
Date:
On 2 July 2018 at 02:50, Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2018-07-01 14:46:47 -0400, Andrew Dunstan wrote:
> There has been some discussion around excluding large items from the
> current commitfest, for several reasons. However I don't recall ever
> seeing a definition of a large item. It seems to be a bit like "I know
> it when I see it."   I've been looking at the current commitfest
> entries. Based on that I suggest a heuristic that says a commitfest
> item with patches greater than 5000 lines is large.

FWIW, I personally think the criteria should rather be "old" or "very
small". I.e. for patches that have waited for review being large
shouldn't necessarily be an impediment for getting worked on (depending
on invasiveness maybe not committed), and very small for newer things
should be way below 5kloc.


I agree. I think the idea is to stop people (um, totally not guilty of this) from dropping big or intrusive patches in late CFs. 

A 10 line patch can be massively intrusive and contentious. A 5000 line patch can be a mechanical change that nobody disagrees with, or a mature patch that just needed a few tweaks and missed commit in the last CF.

If a line limit is used, we'll get people optimising for the line limit. I don't think that's a win.

This benefits from being fuzzy IMO.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Large Commitfest items

From
Andrew Dunstan
Date:
On Sun, Jul 1, 2018 at 9:36 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> On 2 July 2018 at 02:50, Andres Freund <andres@anarazel.de> wrote:
>>
>> Hi,
>>
>> On 2018-07-01 14:46:47 -0400, Andrew Dunstan wrote:
>> > There has been some discussion around excluding large items from the
>> > current commitfest, for several reasons. However I don't recall ever
>> > seeing a definition of a large item. It seems to be a bit like "I know
>> > it when I see it."   I've been looking at the current commitfest
>> > entries. Based on that I suggest a heuristic that says a commitfest
>> > item with patches greater than 5000 lines is large.
>>
>> FWIW, I personally think the criteria should rather be "old" or "very
>> small". I.e. for patches that have waited for review being large
>> shouldn't necessarily be an impediment for getting worked on (depending
>> on invasiveness maybe not committed), and very small for newer things
>> should be way below 5kloc.
>>
>
> I agree. I think the idea is to stop people (um, totally not guilty of this)
> from dropping big or intrusive patches in late CFs.
>
> A 10 line patch can be massively intrusive and contentious. A 5000 line
> patch can be a mechanical change that nobody disagrees with, or a mature
> patch that just needed a few tweaks and missed commit in the last CF.
>
> If a line limit is used, we'll get people optimising for the line limit. I
> don't think that's a win.
>
> This benefits from being fuzzy IMO.
>

I totally agree. I never intended to apply this in anything but a
fuzzy way. I just used a heuristic as a way of breaking up the work a
bit.

Note that the intention is to treat this CF a little differently, as
it's taking place while we're still wrapping up Release 11.

If by "very small" we mean small impact, then the 22,000 line update
to the snowball stemmers is argubly small :-)

The following large items (by my rough heuristic) are marked Ready For
Committer and have been around from 3 to 5 Cfs:

Item 1160 Improve Geometric Types
Item 1294 Custom Compression Methods
Item 1421 Flexible Configuration for FTS

I propose we keep them.

As I mentioned, the update to the Snowball stemmers is fairly low
impact, and we can reasonably keep it.

Discussion seems to be ongoing re Pluggable storage API, and in any
case it could have a huge impact, so I suggest we push it out to
Spetember, even though it's been around for 5 CFs.

The following possibly qualify as "old", and could be kept on that
basis, depending on how large we think their impact might be:

Item 1062 Generic Type Subscripting
Item 1238 Multivariate MVC lists and histograms
Item 1247 push aggregation down to base relations and joins
Item 1348 BRIN Bloom and multi-minmax indexes

I'd like to hear some discussion on these.

That leaves the following "large" items which aren't particularly old:

Item 1471 jsonpath
Item 1472 SQL/JSON functions
Item 1473 JSONTABLE
Item 1553 Advanced partition matching for partition-wise join
Item 1574 Transactions involving multiple postgres foreign servers
Item 1648 Precalculate Stable Functions
Item 1649 Undo Log
Item 1685 Push down aggregates below joins

I suggest we keep jsonpath so we can make some progress on SQL/JSON,
and move the remainder to the September CF.

Before long I will post a similar examination of the next set of
smaller items (2000 to 5000 lines).

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Large Commitfest items

From
Peter Eisentraut
Date:
On 01.07.18 20:46, Andrew Dunstan wrote:
> There has been some discussion around excluding large items from the
> current commitfest, for several reasons.

I'm not sure how useful that would be.  We don't want to deal with the
large stuff late in the cycle, so it would seem to be beneficial to deal
with them early.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services