Re: Multi-branch committing in git, revisited - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Multi-branch committing in git, revisited
Date
Msg-id AANLkTimKFKnCZENPif8t9Kciq0GndR9Z9wPuMn0QCLJ3@mail.gmail.com
In response to Multi-branch committing in git, revisited  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Tue, Sep 21, 2010 at 9:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> So it seems like maybe we still need some more thought about how to
> recognize related commits in different branches.  Or at the very least,
> we need a best-practice document explaining how to manage this --- we
> shouldn't expect every committer to reinvent this wheel for himself.
>
> Comments?

I don't think there's one right way to do this.  In fact, there are
probably at least 50 reasonable ways to do it, depending on your own
workflow preferences.  And you needn't do it the same way every time,
either.  I don't.  git is designed to treat commits the way that
databases treat rows.  They are objects.  You can create them, throw
them out, move them around, replace them with updated versions, etc.
And there are multiple ways of doing each of these things (just as
there's no single right way to design a database schema).  Of course,
in the One True Source Tree, we only ever create them at the heads of
existing branches.  But in your own workspace, you can do all of those
things - and you should, because they make you able to get the same
things done faster.

What I plan to do, I think, is use one clone nearly all the time.  If
I need to back-patch, I'll switch branches and either apply from the
stash or cherry-pick off the master.  But if I need to go back more
than one or two branches and there are merge conflicts, I'll create a
bunch of temporary clones off my main clone and push/pull from there.
Then I'll remove them when I'm done.  But from what I gather, there
are probably going to be at least as many workflows as there are
committers, and maybe more, since I just said I'm going to use two
different approaches depending on the situation.

One option is just to update the date stamp on each commit before you
push.  You could check out each branch you've updated and do something
like:

GIT_EDITOR=/usr/bin/true git commit --amend --date=now

Of course that'll only update the top commit, and you want to be sure
not to do it on branches where the top commit is something that's
already been pushed.  But to reiterate my main point, I think we
should only dictate the contents of the commits that must be pushed
(e.g. time stamps close together, so git_topo_order can match them up)
and not the process by which someone creates those commits (because
the chances of getting more than 1 person to agree on the way to
generate such commits seem near zero, and the fact that I plan to do
it different ways depending on the situation means that even getting 1
person to agree with themselves on how to do it may be out of reach).
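
For illustration, the kind of matching that "time stamps close
together" enables might look like the Python sketch below.  It is not
the git_topo_order implementation; the Commit fields, the
group_related name, and the one-hour window are all assumptions made
up for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Commit:
    # Hypothetical record; a real script would fill this from `git log`.
    sha: str
    branch: str
    author: str
    subject: str
    when: datetime


def group_related(commits, window=timedelta(hours=1)):
    """Group commits that share author and subject and whose timestamps
    are each within `window` of the previous commit in the group."""
    buckets = defaultdict(list)
    for c in commits:
        buckets[(c.author, c.subject)].append(c)

    groups = []
    for same in buckets.values():
        same.sort(key=lambda c: c.when)
        current = [same[0]]
        for c in same[1:]:
            if c.when - current[-1].when <= window:
                current.append(c)
            else:
                # Gap too large: treat as an unrelated commit.
                groups.append(current)
                current = [c]
        groups.append(current)
    return groups
```

The window size is the fragile part: too small and a slow back-patch
gets split into separate groups, too large and unrelated commits with
the same log message get merged.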

People have repeatedly suggested that timestamp/author/log message
matching is a lousy way of matching up commits.  I agree, but we have
14.25 years of history for which that's the only practical method.  We
could decide on a different method going forward, such as embedding a
token (the format of which we can argue about until we're blue in the
face - could be whatever git cherry-pick does or could be a reference
to our issue tracking system if we had one, could be something else)
in each commit in a very specific format which a script can then
recognize.  The advantage of that is that it might feel a bit less ad
hoc than what we're doing now; the disadvantage is that then our
scripts would have to know about both methods, unless of course we go
back and annotate all of the old commits (using git-notes) with
reconstructed information on which commits go together, which we
already have from git_topo_order.  No matter what we pick, it's going
to require non-zero effort to not screw it up; and with the exception
of doing forward merges, which I think is a terrible idea, none of the
methods so far proposed seem likely to take significantly more time
than any of the others.  So if we're going to change anything at all,
we ought to focus on how it's going to improve the overall way that we
manage the project, not on the exact sequence of commands that will be
required to create it, which I'm fairly confident will settle down to
a quite small number as you and everyone else get used to the new tool
(but it won't be the same sequence for everyone).
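
As a rough sketch of the embedded-token idea, the Python below
recognizes the line that "git cherry-pick -x" appends to a log
message, "(cherry picked from commit <sha>)", and groups commits by
the commit they were picked from.  Using that particular line as the
token is only one of the options mentioned above, and the function
names and the sha-to-message dict are assumptions for the example.

```python
import re

# Line appended to the log message by `git cherry-pick -x`.
CHERRY_PICK_RE = re.compile(
    r"^\(cherry picked from commit ([0-9a-f]{40})\)$", re.MULTILINE
)


def cherry_pick_source(message):
    """Return the sha named in the first cherry-pick line, or None."""
    m = CHERRY_PICK_RE.search(message)
    return m.group(1) if m else None


def group_by_origin(messages):
    """messages maps sha -> full log message (hypothetical input).
    Returns origin sha -> list of shas that belong together."""
    groups = {}
    for sha, msg in messages.items():
        # A commit with no token is treated as its own origin.
        origin = cherry_pick_source(msg) or sha
        groups.setdefault(origin, []).append(sha)
    return groups
```

A script like this only works for commits that actually carry the
token, so older history would still need the timestamp/author/log
message method or reconstructed annotations.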

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

