antisocial things you can do in git (but not CVS) - Mailing list pgsql-hackers

I have some concerns related to the upcoming conversion to git and how
we're going to avoid having things get messy as people start using the
new repository.  git has a lot more flexibility and power than CVS,
and I'm worried that it would be easy, even accidentally, to screw up
our history.

1. Inability to cleanly and easily (and programmatically) identify who
committed what.  With CVS, the author of a revision is the person who
committed it, period.  With git, the author string can be set to
anything the person typing 'git commit' feels like.  I think there is
also a committer field, but that doesn't always appear and I'm not
clear on how it works.  Also, the author field defaults to something
dumb if you don't explicitly set it to a value.  So I'm worried we
could end up with stuff like this in the repository:

Author: <rhaas@rhaas-laptop>
Author: Robert Haas <robertmhaas@gmail.com>
Author: Robert Haas <rhaas@enterprisedb.com>
Author: Robert Haas <rhaas@some-place-i-might-hypothetically-work-in-the-future.com>
Author: The Guy Who Wrote Some Patch Which Robert Haas Ended Up Committing <somerandomemail@somerandomdomain.whatever>

Right now, it's easy to find all the commits by a particular
committer, and it's easy to see who committed a particular patch, and
the number of distinct committers is pretty small.  I'd hate to give
that up.

git log | grep '^Author' | sort | uniq -c | sort -n | less

My preference would be to stick to a style where we identify the
committer using the author tag and note the patch author, reviewers,
whether the committer made changes, etc. in the commit message.  A
single author field doesn't feel like enough for our workflow, and
having a mix of authors and committers in the author field seems like
a mess.
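
For reference, every git commit carries a committer field in addition to
the author field; plain "git log" output simply does not display it
unless asked (for example with --format=fuller).  A minimal sketch,
assuming a scratch repository and made-up identities ("Demo Committer"
and "Patch Author" are hypothetical), of how the two fields differ and
how commits could be counted per committer rather than per author:

```shell
#!/bin/sh
# Sketch only: throwaway repository, hypothetical names and addresses.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .

# The committer identity comes from the local configuration.
git config user.name "Demo Committer"
git config user.email "committer@example.org"

echo hello > file.txt
git add file.txt
# --author overrides only the author field; the committer is unchanged.
git commit -q -m "Demo patch" --author="Patch Author <author@example.org>"

# %an/%ae = author name/email, %cn/%ce = committer name/email.
fields=$(git log -1 --format='author=%an <%ae>; committer=%cn <%ce>')
echo "$fields"

# Same idea as the Author pipeline above, but keyed on the committer.
git log --format='%cn <%ce>' | sort | uniq -c | sort -n
```

Whether the committer field alone is good enough for our workflow is a
separate question; the sketch only shows that the information is
recorded and can be queried.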

2. Branch and tag management.  In CVS, there are branches and tags in
only one place: on the server.  In git, you can have local branches
and tags and remote branches and tags, and you can pull and push tags
between servers.  If I'm working on a git repository that has branches
master, REL9_0_STABLE .. REL7_4_STABLE, inner_join_removal,
numeric_2b, and temprelnames, I want to make sure that I don't
accidentally push the last three of those to the authoritative
server... but I do want to push all the others.  Similarly I want to
push only the correct subset of tags (though that should be less of
an issue, at least for me, as I don't usually create local tags).  I'm
not sure how to set this up, though.
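
One possible setup, sketched under assumptions: set push.default to
"nothing" so that a bare "git push" refuses to guess, and always name
the branches to publish.  The repository paths below are throwaway
stand-ins for the authoritative server and a developer's clone; only
the branch names come from the example above.

```shell
#!/bin/sh
# Sketch only: a local bare repository plays the authoritative server.
set -eu
base=$(mktemp -d)
git init -q --bare "$base/authoritative.git"
git init -q "$base/work"
cd "$base/work"
git symbolic-ref HEAD refs/heads/master   # use "master" whatever the default is
git config user.name "Demo Committer"     # hypothetical identity
git config user.email "committer@example.org"
git remote add origin "$base/authoritative.git"

echo one > file.txt
git add file.txt
git commit -q -m "Initial commit"
git branch numeric_2b                     # private work branch, must stay local

# With push.default=nothing, "git push" without a refspec is refused.
git config push.default nothing
if git push -q origin 2>/dev/null; then
    bare_push=accepted
else
    bare_push=refused
fi

# Publishing is explicit: name exactly the branches that should go out.
git push -q origin master

pushed=$(git ls-remote --heads origin | awk '{print $2}')
echo "bare push: $bare_push"
echo "branches on server: $pushed"
```

Tags are not sent by a plain "git push" unless --tags or --follow-tags
is given, so local tags stay local under this setup as well.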

3. Merge commits.  I believe that we have consensus that commits
should always be done as a "squash", so that the history of all of our
branches is linear.  But it seems to me that someone could
accidentally push a merge commit, either because they forgot to squash
locally, or because of a conflict between their local git repo's
master branch and origin/master.  Can we forbid this?
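
One way this could be enforced, as a sketch rather than a proposal for
the real server: an "update" hook in the central repository that
rejects any push introducing a merge commit.  The hook body, paths,
and identities below are hypothetical; git runs the update hook once
per pushed ref with the ref name, old commit, and new commit as
arguments, and a non-zero exit rejects that ref.

```shell
#!/bin/sh
# Sketch only: a local bare repository plays the central server.
set -eu
base=$(mktemp -d)
git init -q --bare "$base/central.git"

# Hypothetical update hook: arguments are <refname> <old> <new>.
cat > "$base/central.git/hooks/update" <<'EOF'
#!/bin/sh
refname=$1 old=$2 new=$3
# An all-zero object name means "ref does not exist".
is_zero() { case $1 in *[!0]*) return 1 ;; *) return 0 ;; esac; }
is_zero "$new" && exit 0                  # ref deletion: nothing to inspect
if is_zero "$old"; then
    # New ref: look only at commits not already reachable from other refs.
    merges=$(git rev-list --merges "$new" --not --all)
else
    merges=$(git rev-list --merges "$old..$new")
fi
if [ -n "$merges" ]; then
    echo "rejected: $refname would gain merge commits" >&2
    exit 1
fi
exit 0
EOF
chmod +x "$base/central.git/hooks/update"

git init -q "$base/work"
cd "$base/work"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo Committer"
git config user.email "committer@example.org"
git remote add origin "$base/central.git"

echo one > a.txt; git add a.txt; git commit -q -m "Linear commit"
first=$(git rev-parse HEAD)
git push -q origin master                 # linear history: accepted

# Build a real merge commit locally.
git checkout -q -b topic
echo two > b.txt; git add b.txt; git commit -q -m "Topic work"
git checkout -q master
echo three > c.txt; git add c.txt; git commit -q -m "Mainline work"
git merge -q --no-ff -m "Merge topic" topic

if git push -q origin master 2>/dev/null; then
    merge_push=accepted
else
    merge_push=rejected
fi
echo "merge push: $merge_push"
```

A hook like this only catches the mistake at push time; the committer
still has to squash or rebase locally before trying again.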

4. History rewriting.  Under what circumstances, if any, are we OK
with rebasing the master branch?  For example, if we decide not to have merge
commits, and somebody does a merge commit anyway, are we going to
rebase to get rid of it?
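
Whatever policy is chosen, the server can at least be told to refuse
rewritten history by default.  A minimal sketch, assuming a throwaway
bare repository standing in for the server, using the
receive.denyNonFastForwards and receive.denyDeletes settings (the
paths and identities are hypothetical):

```shell
#!/bin/sh
# Sketch only: a local bare repository plays the central server.
set -eu
base=$(mktemp -d)
git init -q --bare "$base/central.git"
# Refuse pushes that rewrite history or delete refs, even with --force.
git --git-dir="$base/central.git" config receive.denyNonFastForwards true
git --git-dir="$base/central.git" config receive.denyDeletes true

git init -q "$base/work"
cd "$base/work"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo Committer"
git config user.email "committer@example.org"
git remote add origin "$base/central.git"

echo one > a.txt; git add a.txt; git commit -q -m "First commit"
echo two > b.txt; git add b.txt; git commit -q -m "Second commit"
git push -q origin master
published=$(git rev-parse HEAD)

# Rewrite the published tip locally, then try to force it upstream.
git commit -q --amend -m "Rewritten second commit"
if git push -q --force origin master 2>/dev/null; then
    forced_push=accepted
else
    forced_push=rejected
fi
remote_tip=$(git ls-remote origin refs/heads/master | cut -f1)
echo "forced push: $forced_push"
```

With these settings, an intentional rebase of master would require
someone with access to the server to lift the restriction first, which
makes it a deliberate act rather than an accident.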

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

