Thread: antisocial things you can do in git (but not CVS)
I have some concerns related to the upcoming conversion to git and how we're going to avoid having things get messy as people start using the new repository. git has a lot more flexibility and power than CVS, and I'm worried that it would be easy, even accidentally, to screw up our history. 1. Inability to cleanly and easily (and programatically) identify who committed what. With CVS, the author of a revision is the person who committed it, period. With git, the author string can be set to anything the person typing 'git commit' feels like. I think there is also a committer field, but that doesn't always appear and I'm not clear on how it works. Also, the author field defaults to something dumb if you don't explicitly set it to a value. So I'm worried we could end up with stuff like this in the repository: Author: <rhaas@rhaas-laptop> Author: Robert Haas <robertmhaas@gmail.com> Author: Robert Haas <rhaas@enterprisedb.com> Author: Robert Haas <rhaas@some-place-i-might-hypothetically-work-in-the-future.com> Author: The Guy Who Wrote Some Patch Which Robert Haas Ended Up Committing <somerandomemail@somerandomdomain.whatever> Right now, it's easy to find all the commits by a particular committer, and it's easy to see who committed a particular patch, and the number of distinct committers is pretty small. I'd hate to give that up. git log | grep '^Author' | sort | uniq -c | sort -n | less My preference would be to stick to a style where we identify the committer using the author tag and note the patch author, reviewers, whether the committer made changes, etc. in the commit message. A single author field doesn't feel like enough for our workflow, and having a mix of authors and committers in the author field seems like a mess. 2. Branch and tag management. In CVS, there are branches and tags in only one place: on the server. In git, you can have local branches and tags and remote branches and tags, and you can pull and push tags between servers. If I'm working on a git repository that has branches master, REL9_0_STABLE .. REL7_4_STABLE, inner_join_removal, numeric_2b, and temprelnames, I want to make sure that I don't accidentally push the last three of those to the authoritative server... but I do want to push all the others. Similarly I want to push only the corrects subset of tags (though that should be less of an issue, at least for me, as I don't usually create local tags). I'm not sure how to set this up, though. 3. Merge commits. I believe that we have consensus that commits should always be done as a "squash", so that the history of all of our branches is linear. But it seems to me that someone could accidentally push a merge commit, either because they forgot to squash locally, or because of a conflict between their local git repo's master branch and origin/master. Can we forbid this? 4. History rewriting. Under what circumstances, if any, are we OK with rebasing the master? For example, if we decide not to have merge commits, and somebody does a merge commit anyway, are we going to rebase to get rid of it? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
On Tue, Jul 20, 2010 at 20:34, Robert Haas <robertmhaas@gmail.com> wrote: > I have some concerns related to the upcoming conversion to git and how > we're going to avoid having things get messy as people start using the > new repository. git has a lot more flexibility and power than CVS, > and I'm worried that it would be easy, even accidentally, to screw up > our history. > > 1. Inability to cleanly and easily (and programatically) identify who > committed what. With CVS, the author of a revision is the person who > committed it, period. With git, the author string can be set to > anything the person typing 'git commit' feels like. I think there is > also a committer field, but that doesn't always appear and I'm not > clear on how it works. Also, the author field defaults to something > dumb if you don't explicitly set it to a value. So I'm worried we > could end up with stuff like this in the repository: I'm pretty sure we can enforce this on the server side, refusing commits that don't follow our standard. I haven't done it myself (yet), but I've read about it. > My preference would be to stick to a style where we identify the > committer using the author tag and note the patch author, reviewers, > whether the committer made changes, etc. in the commit message. A > single author field doesn't feel like enough for our workflow, and > having a mix of authors and committers in the author field seems like > a mess. +1. > 2. Branch and tag management. In CVS, there are branches and tags in > only one place: on the server. In git, you can have local branches > and tags and remote branches and tags, and you can pull and push tags > between servers. If I'm working on a git repository that has branches > master, REL9_0_STABLE .. REL7_4_STABLE, inner_join_removal, > numeric_2b, and temprelnames, I want to make sure that I don't > accidentally push the last three of those to the authoritative > server... but I do want to push all the others. Similarly I want to > push only the corrects subset of tags (though that should be less of > an issue, at least for me, as I don't usually create local tags). I'm > not sure how to set this up, though. We could put a safeguard in place on the server that won't let you push a branch and require that additions of new branches be done manually on the server? > 3. Merge commits. I believe that we have consensus that commits > should always be done as a "squash", so that the history of all of our > branches is linear. But it seems to me that someone could > accidentally push a merge commit, either because they forgot to squash > locally, or because of a conflict between their local git repo's > master branch and origin/master. Can we forbid this? Again, I haven't done it, but I've read about it, and I'm almost certain we can enforce this, yes. > 4. History rewriting. Under what circumstances, if any, are we OK > with rebasing the master? For example, if we decide not to have merge > commits, and somebody does a merge commit anyway, are we going to > rebase to get rid of it? That's something we need a good policy for. Merge commits are special. For content commits, I think we should basically *never* do that. If someone commits bad content, we should just make a revert commit which keeps history linear and just removes the changes as a new commit. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On tis, 2010-07-20 at 14:34 -0400, Robert Haas wrote: > Right now, it's easy to find all the commits by a particular > committer, and it's easy to see who committed a particular patch, and > the number of distinct committers is pretty small. I'd hate to give > that up. > > git log | grep '^Author' | sort | uniq -c | sort -n | less git log --format=full | grep '^Commit' | sort | uniq -c | sort -n | less > My preference would be to stick to a style where we identify the > committer using the author tag and note the patch author, reviewers, > whether the committer made changes, etc. in the commit message. A > single author field doesn't feel like enough for our workflow, and > having a mix of authors and committers in the author field seems like > a mess. Well, I had looked forward to actually putting the real author into the author field. > 2. Branch and tag management. In CVS, there are branches and tags in > only one place: on the server. In git, you can have local branches > and tags and remote branches and tags, and you can pull and push tags > between servers. If I'm working on a git repository that has branches > master, REL9_0_STABLE .. REL7_4_STABLE, inner_join_removal, > numeric_2b, and temprelnames, I want to make sure that I don't > accidentally push the last three of those to the authoritative > server... but I do want to push all the others. Similarly I want to > push only the corrects subset of tags (though that should be less of > an issue, at least for me, as I don't usually create local tags). I'm > not sure how to set this up, though. I'm going to use one separate clone for my development and one "pristine" one for the final commits and copy the patches over manually. That also solves the next problem ... > 3. Merge commits. I believe that we have consensus that commits > should always be done as a "squash", so that the history of all of our > branches is linear. But it seems to me that someone could > accidentally push a merge commit, either because they forgot to squash > locally, or because of a conflict between their local git repo's > master branch and origin/master. Can we forbid this?
On Tue, Jul 20, 2010 at 2:42 PM, Magnus Hagander <magnus@hagander.net> wrote: > On Tue, Jul 20, 2010 at 20:34, Robert Haas <robertmhaas@gmail.com> wrote: >> I have some concerns related to the upcoming conversion to git and how >> we're going to avoid having things get messy as people start using the >> new repository. git has a lot more flexibility and power than CVS, >> and I'm worried that it would be easy, even accidentally, to screw up >> our history. >> >> 1. Inability to cleanly and easily (and programatically) identify who >> committed what. With CVS, the author of a revision is the person who >> committed it, period. With git, the author string can be set to >> anything the person typing 'git commit' feels like. I think there is >> also a committer field, but that doesn't always appear and I'm not >> clear on how it works. Also, the author field defaults to something >> dumb if you don't explicitly set it to a value. So I'm worried we >> could end up with stuff like this in the repository: > > I'm pretty sure we can enforce this on the server side, refusing > commits that don't follow our standard. I haven't done it myself > (yet), but I've read about it. +1, though I see downthread that Peter has a contrary opinion. >> My preference would be to stick to a style where we identify the >> committer using the author tag and note the patch author, reviewers, >> whether the committer made changes, etc. in the commit message. A >> single author field doesn't feel like enough for our workflow, and >> having a mix of authors and committers in the author field seems like >> a mess. > > +1. > > >> 2. Branch and tag management. In CVS, there are branches and tags in >> only one place: on the server. In git, you can have local branches >> and tags and remote branches and tags, and you can pull and push tags >> between servers. If I'm working on a git repository that has branches >> master, REL9_0_STABLE .. REL7_4_STABLE, inner_join_removal, >> numeric_2b, and temprelnames, I want to make sure that I don't >> accidentally push the last three of those to the authoritative >> server... but I do want to push all the others. Similarly I want to >> push only the corrects subset of tags (though that should be less of >> an issue, at least for me, as I don't usually create local tags). I'm >> not sure how to set this up, though. > > We could put a safeguard in place on the server that won't let you > push a branch and require that additions of new branches be done > manually on the server? On this one, I'd just like a way to prevent accidents. Is there maybe a config option I can set on my local repository? >> 3. Merge commits. I believe that we have consensus that commits >> should always be done as a "squash", so that the history of all of our >> branches is linear. But it seems to me that someone could >> accidentally push a merge commit, either because they forgot to squash >> locally, or because of a conflict between their local git repo's >> master branch and origin/master. Can we forbid this? > > Again, I haven't done it, but I've read about it, and I'm almost > certain we can enforce this, yes. OK, that sounds good... >> 4. History rewriting. Under what circumstances, if any, are we OK >> with rebasing the master? For example, if we decide not to have merge >> commits, and somebody does a merge commit anyway, are we going to >> rebase to get rid of it? > > That's something we need a good policy for. Merge commits are special. > For content commits, I think we should basically *never* do that. If > someone commits bad content, we should just make a revert commit which > keeps history linear and just removes the changes as a new commit. Yeah, I agree. I'm not sure that merge commits are the ONLY situation where we'd want to do this, but it should be reserved for cases where just reversing out the diff wouldn't be sufficient for some reason and we need to make it as though it never happened. I don't think it's probably necessary to disallow this completely - the default setting of allowing it only with + is probably enough. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
On Tue, Jul 20, 2010 at 3:12 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > Well, I had looked forward to actually putting the real author into the > author field. What if there's more than one? What if you make changes yourself? How will you credit the reviewer? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Robert Haas wrote: > I have some concerns related to the upcoming conversion to git and how > we're going to avoid having things get messy as people start using the > new repository. git has a lot more flexibility and power than CVS, > and I'm worried that it would be easy, even accidentally, to screw up > our history. > > 1. Inability to cleanly and easily (and programatically) identify who > committed what. > Each committer sets their name and email using git config. Doesn't look like a problem. We don't have such a large number of committers that this should be much of an issue. Maybe we can set a pre-receive hook to make sure that it's set appropriately? > 2. Branch and tag management. > [snip] I'm inclined to say that as now committers should not normally push tags. Marc or whoever is managing things should create the various tags. I think our current tagging policy is about right. "git push" doesn't push tags by default, so you'd have to be trying hard to mess this up. > 3. Merge commits. I believe that we have consensus that commits > should always be done as a "squash", so that the history of all of our > branches is linear. But it seems to me that someone could > accidentally push a merge commit, either because they forgot to squash > locally, or because of a conflict between their local git repo's > master branch and origin/master. Can we forbid this? > Again, a pre-receive hook might be able to handle this. See <http://progit.org/book/ch7-4.html> > 4. History rewriting. Under what circumstances, if any, are we OK > with rebasing the master? For example, if we decide not to have merge > commits, and somebody does a merge commit anyway, are we going to > rebase to get rid of it? > > In the end, if we screw up badly enough we can just roll things back. It would be a pain, but not insurmountably so. I think we need to expect that there will be some teething issues. I keep 7 days worth of backups of the CVS repo constantly now, and I'll probably do the same with git, and I'm sure there will be other backups. cheers andrew
Robert Haas wrote: > On Tue, Jul 20, 2010 at 3:12 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > >> Well, I had looked forward to actually putting the real author into the >> author field. >> > > What if there's more than one? What if you make changes yourself? > How will you credit the reviewer? > > I think our current practice is fine. Put it in the commit log. cheers andrew
On Tue, Jul 20, 2010 at 8:12 PM, Peter Eisentraut <peter_e@gmx.net> wrote: >> My preference would be to stick to a style where we identify the >> committer using the author tag and note the patch author, reviewers, >> whether the committer made changes, etc. in the commit message. A >> single author field doesn't feel like enough for our workflow, and >> having a mix of authors and committers in the author field seems like >> a mess. > > Well, I had looked forward to actually putting the real author into the > author field. I hadn't realised that was possible until Guillaume did so on his first commit to the new pgAdmin GIT repo. It seems to work nicely: http://git.postgresql.org/gitweb?p=pgadmin3.git;a=commit;h=08e2826d90129bd4e4b3b7462bab682dd6a703e4 -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com The Enterprise Postgres Company
On Wed, Jul 21, 2010 at 02:28, Andrew Dunstan <andrew@dunslane.net> wrote: > > > Robert Haas wrote: >> >> On Tue, Jul 20, 2010 at 3:12 PM, Peter Eisentraut <peter_e@gmx.net> wrote: >> >>> >>> Well, I had looked forward to actually putting the real author into the >>> author field. >>> >> >> What if there's more than one? What if you make changes yourself? >> How will you credit the reviewer? >> >> > > I think our current practice is fine. Put it in the commit log. If nothing else, I think this definitely falls under the "minimum changes first" policy. Let's start by doing things exactly as we're doing now. We can then consider changing this in the future, but let's not change everything at once. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
At 2010-07-20 14:34:20 -0400, robertmhaas@gmail.com wrote: > > I think there is also a committer field, but that doesn't always > appear and I'm not clear on how it works. There is always a committer field, and it is set sensibly as long as the committer has user.name and user.email set correctly with git-config. It is not displayed by git-log by default, unless it is different from the "author". (As PeterE showed, it's easy to get the list of committers.) > My preference would be to stick to a style where we identify the > committer using the author tag and note the patch author, reviewers, > whether the committer made changes, etc. in the commit message. An aside: as a patch author (and elsewhere, as a committer), it's nice when the log shows the author rather than the committer. Will we really have so many patches with multiple authors or other complications that we can't set the author by default and fall back to explanations in the commit message (e.g. "applied with changes") for more complicated cases? > I want to make sure that I don't accidentally push the last three of > those to the authoritative server... By default (at least with a recent git), "git push" will push branches that are tracking remote branches, but new local branches have to be pushed explicitly to create them on the remote. So don't worry about that. > 3. Merge commits. I believe that we have consensus that commits > should always be done as a "squash", so that the history of all of > our branches is linear. I admit I haven't been paying as much attention as I should, but I did not know there was such a consensus. If anyone could explain the rationale, I would be grateful. > But it seems to me that someone could > accidentally push a merge commit […] > Can we forbid this? Yes, I suppose it's possible, but personally I think it would be a waste of time to try to ban merge commits. > 4. History rewriting. Under what circumstances, if any, are we OK > with rebasing the master? Please, let's never do that. The cure for pulling a rebased branch into an existing clone may seem simple, but it's a huge pain in practice, and it's never really worth it. -- ams
On Wed, Jul 21, 2010 at 6:46 AM, Abhijit Menon-Sen <ams@toroid.org> wrote: >> My preference would be to stick to a style where we identify the >> committer using the author tag and note the patch author, reviewers, >> whether the committer made changes, etc. in the commit message. > > An aside: as a patch author (and elsewhere, as a committer), it's nice > when the log shows the author rather than the committer. Will we really > have so many patches with multiple authors or other complications that > we can't set the author by default and fall back to explanations in the > commit message (e.g. "applied with changes") for more complicated cases? Tom Lane rewrites part of nearly every commit, and even I change maybe 30% of them. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
On Wed, Jul 21, 2010 at 12:46, Abhijit Menon-Sen <ams@toroid.org> wrote: > At 2010-07-20 14:34:20 -0400, robertmhaas@gmail.com wrote: >> I want to make sure that I don't accidentally push the last three of >> those to the authoritative server... > > By default (at least with a recent git), "git push" will push branches > that are tracking remote branches, but new local branches have to be > pushed explicitly to create them on the remote. Yeha, i agree this is probably not a big problem. Plus, if we accidentally push a branch that shouldn't have been pushed, it can easily be removed (as long as it's noticed before anybody relies on it). To the suitable embarrassment of the committer who made the incorrect push, which has a tendency to teach them not to do it next time :-) >> 3. Merge commits. I believe that we have consensus that commits >> should always be done as a "squash", so that the history of all of >> our branches is linear. > > I admit I haven't been paying as much attention as I should, but I did > not know there was such a consensus. If anyone could explain the > rationale, I would be grateful. We are not changing the workflow, just the tool. We may consider changing the workflow sometime in the future (but don't bet on it), but we're definitely not changing both at the same time. This has been discussed many times before, both here on list and in person on at least two instances of the pgcon developer meeting. This is not the time to re-open that discussion. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
At 2010-07-21 12:55:55 +0200, magnus@hagander.net wrote: > > We are not changing the workflow, just the tool. OK, but I don't see why accidental merge commits need to be considered antisocial, and banned or rebased away. Who cares if they exist? They don't change anything you need to do to pull, create, view, or push changes. > This is not the time to re-open that discussion. Sure. I apologise for bringing it up. -- ams
On Wed, Jul 21, 2010 at 13:05, Abhijit Menon-Sen <ams@toroid.org> wrote: > At 2010-07-21 12:55:55 +0200, magnus@hagander.net wrote: >> >> We are not changing the workflow, just the tool. > > OK, but I don't see why accidental merge commits need to be considered > antisocial, and banned or rebased away. Who cares if they exist? They > don't change anything you need to do to pull, create, view, or push > changes. They makes it harder to track how the project has moved along for people who don't really know about the concept. I'm not sure, but I bet they may cause issues for those tracking the project through git-cvs, or any other tool that doesn't deal with nonlinear history. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Tue, 20 Jul 2010 14:34:20 -0400 Robert Haas <robertmhaas@gmail.com> wrote: > I have some concerns related to the upcoming conversion to git and how > we're going to avoid having things get messy as people start using the > new repository. Here's a few responses from the point of view of somebody who has been working with git in the kernel community for some years now. Hopefully it's helpful... > 1. Inability to cleanly and easily (and programatically) identify who > committed what. No, git tracks committer information separately, and it's easily accessible. Dig into the grungy details of git-log and you'll see that you can get out just about anything you need, in any format. IMHO, vandalizing the author field would be a mistake; it's your best way of tracking where the patch came from and for ensuring credit in your changelogs. Why throw away information? > 2. Branch and tag management. In CVS, there are branches and tags in > only one place: on the server. In git, you can have local branches > and tags and remote branches and tags, and you can pull and push tags > between servers. If I'm working on a git repository that has branches > master, REL9_0_STABLE .. REL7_4_STABLE, inner_join_removal, > numeric_2b, and temprelnames, I want to make sure that I don't > accidentally push the last three of those to the authoritative > server... but I do want to push all the others. Similarly I want to > push only the corrects subset of tags (though that should be less of > an issue, at least for me, as I don't usually create local tags). I'm > not sure how to set this up, though. Branch push policy can be tweaked in your local config. I'm less sure about tags. It's worth noting that the kernel community does very little with push in general - things are much more often pulled. That may not be a workflow that's suitable for postgresql, though. > 3. Merge commits. I believe that we have consensus that commits > should always be done as a "squash", so that the history of all of our > branches is linear. But it seems to me that someone could > accidentally push a merge commit, either because they forgot to squash > locally, or because of a conflict between their local git repo's > master branch and origin/master. Can we forbid this? That seems like a terrible idea to me - why would you destroy history? Obviously I've missed a discussion here. But, the first time somebody wants to use bisect to pinpoint a regression-causing patch, you'll wish you had that information there. > 4. History rewriting. Under what circumstances, if any, are we OK > with rebasing the master? For example, if we decide not to have merge > commits, and somebody does a merge commit anyway, are we going to > rebase to get rid of it? A good general rule of thumb is to treat publicly-exposed history as immutable. As soon as you start rebasing trees you create misery for anybody working with those trees. If you're really set on avoiding things like merges, there are ways to set up scripts on the server to enforce policies. jon
On Wed, Jul 21, 2010 at 10:49 AM, Jonathan Corbet <corbet@lwn.net> wrote: >> 1. Inability to cleanly and easily (and programatically) identify who >> committed what. > > No, git tracks committer information separately, and it's easily > accessible. Dig into the grungy details of git-log and you'll see that you > can get out just about anything you need, in any format. > > IMHO, vandalizing the author field would be a mistake; it's your best way > of tracking where the patch came from and for ensuring credit in your > changelogs. Why throw away information? If git had a place to store all the information we care about, that would be fine, but it doesn't. Here's a recent attribution line I used: David Christensen. Reviewed by Steve Singer. Some further changes by me. There's no "reviewer" header, and there's no concept that a patch might have come from the author (or perhaps multiple authors), but then have been adjusted by one or more reviewers and then frobnicated some more by the committer. I'm not sure it's possible to create a system that can effectively store all the ways we give credit and attribution, but a single author line is definitely not it. How much do I have to change a patch before I would use my own name on the author line rather than the patch author's? A single byte? A non-whitespace change? More than 15% of the patch? And, oh by the way, it's the committer who writes the commit message, not the patch author, so at an *absolute minimum* that part of the commit object isn't coming from the original author. >> 2. Branch and tag management. In CVS, there are branches and tags in >> only one place: on the server. In git, you can have local branches >> and tags and remote branches and tags, and you can pull and push tags >> between servers. If I'm working on a git repository that has branches >> master, REL9_0_STABLE .. REL7_4_STABLE, inner_join_removal, >> numeric_2b, and temprelnames, I want to make sure that I don't >> accidentally push the last three of those to the authoritative >> server... but I do want to push all the others. Similarly I want to >> push only the corrects subset of tags (though that should be less of >> an issue, at least for me, as I don't usually create local tags). I'm >> not sure how to set this up, though. > > Branch push policy can be tweaked in your local config. I'm less sure > about tags. It's worth noting that the kernel community does very little > with push in general - things are much more often pulled. That may not be > a workflow that's suitable for postgresql, though. Seems like we've got this one worked out, per discussion upthread. >> 3. Merge commits. I believe that we have consensus that commits >> should always be done as a "squash", so that the history of all of our >> branches is linear. But it seems to me that someone could >> accidentally push a merge commit, either because they forgot to squash >> locally, or because of a conflict between their local git repo's >> master branch and origin/master. Can we forbid this? > > That seems like a terrible idea to me - why would you destroy history? > Obviously I've missed a discussion here. But, the first time somebody > wants to use bisect to pinpoint a regression-causing patch, you'll wish you > had that information there. In any commit pattern, if I use bisect to pinpoint a regression causing patch, I will find the commit that broke it. Whoever made that particular commit is to blame. Full stop. I don't really care where in the development of the patch that was eventually committed the breakage happened, and I do not want to wade through 50 revs of somebody's private development to find the particular place where they made a thinko. I only care that their patch *as committed* is broken.I don't think that non-linear history is an advantagein any situation. It may be an unavoidable necessity if you have lots of cross-merging between different repositories, but we don't, so for us it's just clutter. >> 4. History rewriting. Under what circumstances, if any, are we OK >> with rebasing the master? For example, if we decide not to have merge >> commits, and somebody does a merge commit anyway, are we going to >> rebase to get rid of it? > > A good general rule of thumb is to treat publicly-exposed history as > immutable. As soon as you start rebasing trees you create misery for > anybody working with those trees. > > If you're really set on avoiding things like merges, there are ways to set > up scripts on the server to enforce policies. Yep, Magnus coded it up today. It works great. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Jonathan Corbet wrote: >> 3. Merge commits. I believe that we have consensus that commits >> should always be done as a "squash", so that the history of all of our >> branches is linear. But it seems to me that someone could >> accidentally push a merge commit, either because they forgot to squash >> locally, or because of a conflict between their local git repo's >> master branch and origin/master. Can we forbid this? >> > > That seems like a terrible idea to me - why would you destroy history? > Obviously I've missed a discussion here. But, the first time somebody > wants to use bisect to pinpoint a regression-causing patch, you'll wish you > had that information there. > > We have a clear idea of what should be part of the public history contained in the authoritative repo and what should be history that is private to the developer/tester/committer. We don't want to pollute the former with the latter. The level of granularity of our current CVS commits seems to us to be about right. So when a committer pushes a patch it should add one fast-forward commit to the tree. We want to be able to bisect between these commit objects, but not between all the work product commits that led up to them. Of course, developers, committers and testers can keep what they like privately - we're only talking about what should go in the authoritative repo. cheers andrew
Excerpts from Andrew Dunstan's message of mié jul 21 15:11:41 -0400 2010: > > Jonathan Corbet wrote: > > That seems like a terrible idea to me - why would you destroy history? > > Obviously I've missed a discussion here. But, the first time somebody > > wants to use bisect to pinpoint a regression-causing patch, you'll wish you > > had that information there. > > So when a committer pushes a patch it should add one fast-forward commit > to the tree. We want to be able to bisect between these commit objects, > but not between all the work product commits that led up to them. Of > course, developers, committers and testers can keep what they like > privately - we're only talking about what should go in the authoritative > repo. I don't disagree that we're going to squash commits, but I don't believe that developers will be able to keep "what they like privately". The commit objects for the final patch are going to differ, if only because they have different parents than the ones on the main branch. Of course, they will be able to have a local branch with their local patch, but to Git there will be no relationship between this branch and the final, squashed patch in the authoritative repo.
On Wed, 21 Jul 2010 15:11:41 -0400 Andrew Dunstan <andrew@dunslane.net> wrote: > We have a clear idea of what should be part of the public history > contained in the authoritative repo and what should be history that is > private to the developer/tester/committer. We don't want to pollute the > former with the latter. The thought makes me shudder...you lose the history, the reasons for specific changes, the authorship of changes, and the ability of your testers to pinpoint problematic changes. But...your project, your decision...we'll keep using PostgreSQL regardless...:) Thanks, jon
* Robert Haas: > 1. Inability to cleanly and easily (and programatically) identify who > committed what. With CVS, the author of a revision is the person who > committed it, period. With git, the author string can be set to > anything the person typing 'git commit' feels like. It's even more difficult than that. Git does not record at all who updated a particular branch to include a specific commit. This operation does not leave any metadata behind. It is possible to write a log file for audit purposes, but it's an out-of-band solution. > My preference would be to stick to a style where we identify the > committer using the author tag and note the patch author, reviewers, > whether the committer made changes, etc. in the commit message. A > single author field doesn't feel like enough for our workflow, and > having a mix of authors and committers in the author field seems like > a mess. It would be possible to enforce that on the server side, but it would interfere with merges. > 3. Merge commits. I believe that we have consensus that commits > should always be done as a "squash", so that the history of all of our > branches is linear. But it seems to me that someone could > accidentally push a merge commit, either because they forgot to squash > locally, or because of a conflict between their local git repo's > master branch and origin/master. Can we forbid this? It's possible to do this with some scripting on the server side. -- Florian Weimer <fweimer@bfk.de> BFK edv-consulting GmbH http://www.bfk.de/ Kriegsstraße 100 tel: +49-721-96201-1 D-76133 Karlsruhe fax: +49-721-96201-99
Le 21/07/2010 09:53, Dave Page a écrit : > On Tue, Jul 20, 2010 at 8:12 PM, Peter Eisentraut <peter_e@gmx.net> wrote: >>> My preference would be to stick to a style where we identify the >>> committer using the author tag and note the patch author, reviewers, >>> whether the committer made changes, etc. in the commit message. A >>> single author field doesn't feel like enough for our workflow, and >>> having a mix of authors and committers in the author field seems like >>> a mess. >> >> Well, I had looked forward to actually putting the real author into the >> author field. > > I hadn't realised that was possible until Guillaume did so on his > first commit to the new pgAdmin GIT repo. It seems to work nicely: > > http://git.postgresql.org/gitweb?p=pgadmin3.git;a=commit;h=08e2826d90129bd4e4b3b7462bab682dd6a703e4 > It's one of the nice things with git. So, I'm eager to use it with the pgAdmin repo. -- Guillaumehttp://www.postgresql.frhttp://dalibo.com
Robert Haas wrote: > > If git had a place to store all the information we care about, that > would be fine... > > There's no "reviewer" header, and there's no concept that a patch > might have come from the author (or perhaps multiple authors), but > then have been adjusted by one or more reviewers and then frobnicated > some more by the committer > ... > I don't think that non-linear history is an advantage in any > situation. ISTM we could have the best of both those worlds - linear history and author&reviewer&committer information. Instead of squashing every patch into a single commit, what if it got squashed into a perhaps 3 separate commits -- one as submitted, one as reviewed, and one as re-written by the committer. History stays linear; and you keep the most relevant parts of the history, while dropping all the development brainstorming commits. And ISTM the patch reviewer could be responsible for this squashing so it's not much more work for the committer. For example, instead of commit 351c0b92eca40c1a36934cf83fe75db9dc458cde Author: Robert Haas <robertmhaas@gmail.com> Date: Fri Jul 23 00:43:002010 +0000 Avoid deep recursion when assigning XIDs to multiple levels of subxacts. Andres Freund, with cleanupand adjustment for older branches by me. we'd see Author: Andreas Freund Date: Fri Jul 23 00:43:00 2010 +0000 Avoid deep recursion when assigning XIDs to multiplelevels of subxacts. Path as originally submitted to commit fest Author: [Whomever the reviewer was] Date: Fri Jul 23 00:43:00 2010 +0000 Avoid deep recursion when assigning XIDsto multiple levels of subxacts. Patch marked read for committer by reviewer. Author: Robert Haas <robertmhaas@gmail.com> Date: Fri Jul 23 00:43:00 2010 +0000 Avoid deep recursion when assigningXIDs to multiple levels of subxacts. Patch as rewritten by committer. For a complex enough patch with many authors, the reviewer could choose to keep an extra author commit or two to credit the extra authors.
On lör, 2010-07-24 at 07:02 -0700, Ron Mayer wrote: > Instead of squashing every patch into a single commit, what if it got > squashed into a perhaps 3 separate commits -- one as submitted, one > as reviewed, and one as re-written by the committer. History stays > linear; and you keep the most relevant parts of the history, > while dropping all the development brainstorming commits. But then there would be commits in the main repository that were known not good enough.
Peter Eisentraut wrote: > On lör, 2010-07-24 at 07:02 -0700, Ron Mayer wrote: > >> Instead of squashing every patch into a single commit, what if it got >> squashed into a perhaps 3 separate commits -- one as submitted, one >> as reviewed, and one as re-written by the committer. History stays >> linear; and you keep the most relevant parts of the history, >> while dropping all the development brainstorming commits. >> > > But then there would be commits in the main repository that were known > not good enough. > > Yeah. Also, please bear in mind that our explicit aim here is to make this change with a minimal disruption to existing work flows. So to all those people who want to say "Look, you can now do all these cool things" my answer is "Maybe we'll do them later, but not right now." cheers andrew
On Sat, 2010-07-24 at 13:48 -0400, Andrew Dunstan wrote: > > Yeah. Also, please bear in mind that our explicit aim here is to make > this change with a minimal disruption to existing work flows. So to all > those people who want to say "Look, you can now do all these cool > things" my answer is "Maybe we'll do them later, but not right now." Amen brother. Git is a universe different than CVS/SVN. Let's practice KISS for a while shall we. JD -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579 Consulting, Training, Support, Custom Development, Engineering http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt
On Sat, 2010-07-24 at 13:48 -0400, Andrew Dunstan wrote: > > Yeah. Also, please bear in mind that our explicit aim here is to make > this change with a minimal disruption to existing work flows. So to all > those people who want to say "Look, you can now do all these cool > things" my answer is "Maybe we'll do them later, but not right now." Amen brother. Git is a universe different than CVS/SVN. Let's practice KISS for a while shall we. JD -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579 Consulting, Training, Support, Custom Development, Engineering http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt