Thread: [pgadmin-hackers] Patch submissions

[pgadmin-hackers] Patch submissions

From
Dave Page
Date:
All,

I'd like to clarify our patch submission expectations as I think
there's been some confusion recently:

- Typically each new feature or change should be a single patch,
ideally in it's own mail thread for future tracking/searching etc.

- Large patches may be broken up into 2 or more smaller patches to aid
the review process. Typically this might be infrastructure changes,
then the new feature. A good rule of thumb is "is each patch useful in
its own right?".

- If patches are rejected (as is often the case for the first
submission), please do not send back an ever-increasing set of patches
correcting issues in the earlier ones. Please squash the changes down
into a replacement patch.

Patch review is a tedious and difficult job at the best of times -
careful generation and organisation of patches makes a surprising
difference to that process.

Thanks all, and keep 'em coming :-)

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [pgadmin-hackers] Patch submissions

From
Atira Odhner
Date:
Hi Dave,

I'm wondering what pain you are feeling around having multiple patches.

From my perspective it is much easier to deal with smaller commits as it gives us a quicker way to understand each change if we want to look back at history. I agree that each patch should work standalone (tests should be passing, the application should run.) But, I think aiming for a patch size that encompasses an entire feature is often too large.

When we were squashing patches and then having to go back and modify them, we were spending a large portion of our time managing patches and branches rather than working on code. With smaller patches, rebasing is much simpler and easier to understand. So I'd really like to be able to continue to do things this way. I do understand the desire not to have a 'bad' commit on master.

Can you explain how having a large number of patches impacts your process and maybe we can together come up with a process that works better for all of us?
E.g. if you just wanted to review all the changes at once, you could do `git apply *.patch` to review, and then `git am *.patch` when you are ready to commit the changes.

Thanks,

Tira


On Wed, Mar 15, 2017 at 6:15 PM, Dave Page <dpage@pgadmin.org> wrote:
All,

I'd like to clarify our patch submission expectations as I think
there's been some confusion recently:

- Typically each new feature or change should be a single patch,
ideally in it's own mail thread for future tracking/searching etc.

- Large patches may be broken up into 2 or more smaller patches to aid
the review process. Typically this might be infrastructure changes,
then the new feature. A good rule of thumb is "is each patch useful in
its own right?".

- If patches are rejected (as is often the case for the first
submission), please do not send back an ever-increasing set of patches
correcting issues in the earlier ones. Please squash the changes down
into a replacement patch.

Patch review is a tedious and difficult job at the best of times -
careful generation and organisation of patches makes a surprising
difference to that process.

Thanks all, and keep 'em coming :-)

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


--
Sent via pgadmin-hackers mailing list (pgadmin-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgadmin-hackers

Re: [pgadmin-hackers] Patch submissions

From
Dave Page
Date:
Hi

On Thu, Mar 16, 2017 at 5:58 PM, Atira Odhner <aodhner@pivotal.io> wrote:
> Hi Dave,
>
> I'm wondering what pain you are feeling around having multiple patches.
>
> From my perspective it is much easier to deal with smaller commits as it
> gives us a quicker way to understand each change if we want to look back at
> history. I agree that each patch should work standalone (tests should be
> passing, the application should run.) But, I think aiming for a patch size
> that encompasses an entire feature is often too large.
>
> When we were squashing patches and then having to go back and modify them,
> we were spending a large portion of our time managing patches and branches
> rather than working on code. With smaller patches, rebasing is much simpler
> and easier to understand. So I'd really like to be able to continue to do
> things this way. I do understand the desire not to have a 'bad' commit on
> master.
>
> Can you explain how having a large number of patches impacts your process
> and maybe we can together come up with a process that works better for all
> of us?
> E.g. if you just wanted to review all the changes at once, you could do `git
> apply *.patch` to review, and then `git am *.patch` when you are ready to
> commit the changes.

We aim to commit only the final version of the code, which is also
what I'm reviewing. If a patch set contains multiple patches
correcting review comments, and there are also other patches produced
in the middle, it becomes very difficult to separate the unrelated
changes, and review the final version of each individual change.

I was trying not to single out any particular work, but really the
Jasmine changes are a great example of this:

Patch 1 added Jasmine (feature 1)
Patch 2 added centralised translation rendering (feature 2)
Patch 3 refactored clipboard code into it's own file (feature 3)
Patch 4 (original version) removed some dead code in sqleditor.js (feature 4)
Patch 4 fixed quotes in sqleditor.js (arguably part of feature 2)
Patch 5 was rejected early on as it was unnecessary, but was resubmitted
Patch 6 fixed an issue in patch 3 (feature 3)
Patch 7 cleaned up patch 1 (feature 1)
Patch 8 updated packages to exclude unnecessary files introduced in
patch 1 (feature 1)
Patch 9 fixed patch 7 (feature 1)

I can tell you that had those been patches to PostgreSQL, they would
have been rejected out of hand.

What I want to see submitted, is 4 separate patches that evolve with
the feedback given:

Add Jasmine (feature 1)
Centralised translation rendering (feature 2)
Refactor clipboard code (feature 3)
Remove dead code (feature 4)

That enables reviewers to independently review and test each new block
of functionality, and commit the change as one completed unit of work.
With the 10 patches above, that wasn't possible because the
dependencies are intermixed.

I appreciate this puts more work on developers, however, there are far
fewer reviewers than developers, and patch review is not simply a case
of applying the patches and committing; it can take a significant
amount of time.

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [pgadmin-hackers] Patch submissions

From
Atira Odhner
Date:
 
 Patch 4 fixed quotes in sqleditor.js (arguably part of feature 2)

This change isn't related to any of the other changes. Combining it with another patch would make it more difficult to understand the project history.

Patch 5 was rejected early on as it was unnecessary, but was resubmitted

Totally a mistake. Sorry.

Patch 9 fixed patch 7 (feature 1)

This change was to exclude node modules from the dmg build. Yes, it is *related* to adding node modules. I think having it as a separate commit again makes history easier to understand. The commit message on the changes is very clear and specific. The commit improves one simple thing which the other commits work with or without.

Regarding the commits that move around the specs, I understand the desire to rewrite history and "pretend" that the code was right the first time around. There are valid reasons for re-writing history, but it does more harm than good when applied so liberally. It hides the conversations that were had about the code. It's true that those conversations live on in an archived email thread, but why hide them from the code history?
If one day someone wants to move karma.conf back up to the pgAdmin4 directory, having two commits instead of one will provide valuable information to them. Just annotating the file will show them that there has already been a conversation about where karma.conf lives and there is probably some good thought behind it. Having that context will tip them off that they should then go look at the email thread.

There is no final version of code. That's why it's so important to write code that is readily changeable, and that's why it's important to have a commit history that's understandable as well.

Dave, I absolutely appreciate the work that you are doing to review our commits before pulling them in. You've provided lots of great substantive feedback which clearly involved carefully reading the code as well as testing out our changes. 
I'd like to lend a hand in reviewing the patches from your team in order to alleviate some of this process burden from you. Is there a process I can follow beyond replying to a patch email?

Thanks,

Tira
 

On Fri, Mar 17, 2017 at 5:23 AM, Dave Page <dpage@pgadmin.org> wrote:
Hi

On Thu, Mar 16, 2017 at 5:58 PM, Atira Odhner <aodhner@pivotal.io> wrote:
> Hi Dave,
>
> I'm wondering what pain you are feeling around having multiple patches.
>
> From my perspective it is much easier to deal with smaller commits as it
> gives us a quicker way to understand each change if we want to look back at
> history. I agree that each patch should work standalone (tests should be
> passing, the application should run.) But, I think aiming for a patch size
> that encompasses an entire feature is often too large.
>
> When we were squashing patches and then having to go back and modify them,
> we were spending a large portion of our time managing patches and branches
> rather than working on code. With smaller patches, rebasing is much simpler
> and easier to understand. So I'd really like to be able to continue to do
> things this way. I do understand the desire not to have a 'bad' commit on
> master.
>
> Can you explain how having a large number of patches impacts your process
> and maybe we can together come up with a process that works better for all
> of us?
> E.g. if you just wanted to review all the changes at once, you could do `git
> apply *.patch` to review, and then `git am *.patch` when you are ready to
> commit the changes.

We aim to commit only the final version of the code, which is also
what I'm reviewing. If a patch set contains multiple patches
correcting review comments, and there are also other patches produced
in the middle, it becomes very difficult to separate the unrelated
changes, and review the final version of each individual change.

I was trying not to single out any particular work, but really the
Jasmine changes are a great example of this:

Patch 1 added Jasmine (feature 1)
Patch 2 added centralised translation rendering (feature 2)
Patch 3 refactored clipboard code into it's own file (feature 3)
Patch 4 (original version) removed some dead code in sqleditor.js (feature 4)
Patch 4 fixed quotes in sqleditor.js (arguably part of feature 2)
Patch 5 was rejected early on as it was unnecessary, but was resubmitted
Patch 6 fixed an issue in patch 3 (feature 3)
Patch 7 cleaned up patch 1 (feature 1)
Patch 8 updated packages to exclude unnecessary files introduced in
patch 1 (feature 1)
Patch 9 fixed patch 7 (feature 1)

I can tell you that had those been patches to PostgreSQL, they would
have been rejected out of hand.

What I want to see submitted, is 4 separate patches that evolve with
the feedback given:

Add Jasmine (feature 1)
Centralised translation rendering (feature 2)
Refactor clipboard code (feature 3)
Remove dead code (feature 4)

That enables reviewers to independently review and test each new block
of functionality, and commit the change as one completed unit of work.
With the 10 patches above, that wasn't possible because the
dependencies are intermixed.

I appreciate this puts more work on developers, however, there are far
fewer reviewers than developers, and patch review is not simply a case
of applying the patches and committing; it can take a significant
amount of time.

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [pgadmin-hackers] Patch submissions

From
Dave Page
Date:
Hi

On Sat, Mar 18, 2017 at 4:03 PM, Atira Odhner <aodhner@pivotal.io> wrote:
>>
>>  Patch 4 fixed quotes in sqleditor.js (arguably part of feature 2)
>
>
> This change isn't related to any of the other changes. Combining it with
> another patch would make it more difficult to understand the project
> history.

OK.

>> Patch 9 fixed patch 7 (feature 1)
>
>
> This change was to exclude node modules from the dmg build. Yes, it is
> *related* to adding node modules. I think having it as a separate commit
> again makes history easier to understand. The commit message on the changes
> is very clear and specific. The commit improves one simple thing which the
> other commits work with or without.

You could argue that one either way, I agree.

> Regarding the commits that move around the specs, I understand the desire to
> rewrite history and "pretend" that the code was right the first time around.
> There are valid reasons for re-writing history, but it does more harm than
> good when applied so liberally. It hides the conversations that were had
> about the code. It's true that those conversations live on in an archived
> email thread, but why hide them from the code history?
> If one day someone wants to move karma.conf back up to the pgAdmin4
> directory, having two commits instead of one will provide valuable
> information to them. Just annotating the file will show them that there has
> already been a conversation about where karma.conf lives and there is
> probably some good thought behind it. Having that context will tip them off
> that they should then go look at the email thread.

There are at least 3 issues (putting aside the issue of it being a
pain to review incremental patches with intermixed and unrelated
changes):

- You're trying to fundamentally change the way we, and the greater
PostgreSQL project and eco-system have worked very successfully for
nearly 20 years.

- We want the tree to be clean and as close to ready-for-release at
any time and from any commit. Patches that are rejected are usually
done so for reasons that would make it undesirable to release the code
with that change, and/or because those changes will break something,
often resulting in build/test failures in CI.

- If we commit half-baked code, there's no guarantee that the
submitter will actually come back and fix the outstanding problems.
We've had *significant* baggage in pgAdmin 3 in the past when this has
happened.

> There is no final version of code. That's why it's so important to write
> code that is readily changeable, and that's why it's important to have a
> commit history that's understandable as well.

Of course there's no final version - but what we commit, we expect to
be of release quality, i.e. having no known issues.

> Dave, I absolutely appreciate the work that you are doing to review our
> commits before pulling them in. You've provided lots of great substantive
> feedback which clearly involved carefully reading the code as well as
> testing out our changes.
> I'd like to lend a hand in reviewing the patches from your team in order to
> alleviate some of this process burden from you. Is there a process I can
> follow beyond replying to a patch email?

That would certainly be a huge help. There are some review notes in
the docs at https://www.pgadmin.org/docs4/1.x/code_review.html, which
are really just reminders of some specific things to check. The sort
of general topics we're checking when reviewing include:

- Does the change do what is intended?
- Do UI changes follow established patterns?
- Are there any potential UX issues in the design?
- Are there any unexpected or undesirable side effects?
- Does the code follow our standards, and is it designed in a sensible way?
- Will we be able to understand and maintain this code in 10 years time?
- Do regression tests pass?
- Are regression tests added (where appropriate)?
- Has the documentation been updated (where appropriate)?

Many key points from
https://wiki.postgresql.org/wiki/Reviewing_a_Patch are also relevant
with pgAdmin.

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [pgadmin-hackers] Patch submissions

From
Atira Odhner
Date:
- If we commit half-baked code, there's no guarantee that the
submitter will actually come back and fix the outstanding problems.
We've had *significant* baggage in pgAdmin 3 in the past when this has
happened.

I'm not suggesting you apply the patches before you feel the issues are addressed. I'm just suggesting that the patches should not be merged together. 

Tira

On Mon, Mar 20, 2017 at 7:12 AM, Dave Page <dpage@pgadmin.org> wrote:
Hi

On Sat, Mar 18, 2017 at 4:03 PM, Atira Odhner <aodhner@pivotal.io> wrote:
>>
>>  Patch 4 fixed quotes in sqleditor.js (arguably part of feature 2)
>
>
> This change isn't related to any of the other changes. Combining it with
> another patch would make it more difficult to understand the project
> history.

OK.

>> Patch 9 fixed patch 7 (feature 1)
>
>
> This change was to exclude node modules from the dmg build. Yes, it is
> *related* to adding node modules. I think having it as a separate commit
> again makes history easier to understand. The commit message on the changes
> is very clear and specific. The commit improves one simple thing which the
> other commits work with or without.

You could argue that one either way, I agree.

> Regarding the commits that move around the specs, I understand the desire to
> rewrite history and "pretend" that the code was right the first time around.
> There are valid reasons for re-writing history, but it does more harm than
> good when applied so liberally. It hides the conversations that were had
> about the code. It's true that those conversations live on in an archived
> email thread, but why hide them from the code history?
> If one day someone wants to move karma.conf back up to the pgAdmin4
> directory, having two commits instead of one will provide valuable
> information to them. Just annotating the file will show them that there has
> already been a conversation about where karma.conf lives and there is
> probably some good thought behind it. Having that context will tip them off
> that they should then go look at the email thread.

There are at least 3 issues (putting aside the issue of it being a
pain to review incremental patches with intermixed and unrelated
changes):

- You're trying to fundamentally change the way we, and the greater
PostgreSQL project and eco-system have worked very successfully for
nearly 20 years.

- We want the tree to be clean and as close to ready-for-release at
any time and from any commit. Patches that are rejected are usually
done so for reasons that would make it undesirable to release the code
with that change, and/or because those changes will break something,
often resulting in build/test failures in CI.

- If we commit half-baked code, there's no guarantee that the
submitter will actually come back and fix the outstanding problems.
We've had *significant* baggage in pgAdmin 3 in the past when this has
happened.

> There is no final version of code. That's why it's so important to write
> code that is readily changeable, and that's why it's important to have a
> commit history that's understandable as well.

Of course there's no final version - but what we commit, we expect to
be of release quality, i.e. having no known issues.

> Dave, I absolutely appreciate the work that you are doing to review our
> commits before pulling them in. You've provided lots of great substantive
> feedback which clearly involved carefully reading the code as well as
> testing out our changes.
> I'd like to lend a hand in reviewing the patches from your team in order to
> alleviate some of this process burden from you. Is there a process I can
> follow beyond replying to a patch email?

That would certainly be a huge help. There are some review notes in
the docs at https://www.pgadmin.org/docs4/1.x/code_review.html, which
are really just reminders of some specific things to check. The sort
of general topics we're checking when reviewing include:

- Does the change do what is intended?
- Do UI changes follow established patterns?
- Are there any potential UX issues in the design?
- Are there any unexpected or undesirable side effects?
- Does the code follow our standards, and is it designed in a sensible way?
- Will we be able to understand and maintain this code in 10 years time?
- Do regression tests pass?
- Are regression tests added (where appropriate)?
- Has the documentation been updated (where appropriate)?

Many key points from
https://wiki.postgresql.org/wiki/Reviewing_a_Patch are also relevant
with pgAdmin.

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company