Thread: Managing multiple branches in git

Managing multiple branches in git

From
Tom Lane
Date:
[ it's way past time for a new subject thread ]

Marko Kreen <markokr@gmail.com> writes:
> They cannot be same commits in GIT as the resulting tree is different.

This brings up something that I've been wondering about: my limited
exposure to git hasn't shown me any sane way to work with multiple
release branches.

The way that I have things set up for CVS is that I have a checkout
of HEAD, and also "sticky" checkouts of the back branches:
        pgsql/ ...
        REL8_3/pgsql/ ... (made with -r REL8_3_STABLE)
        REL8_2/pgsql/ ...
        etc

Each of these is configured (using --prefix) to install into a separate
installation tree.  So I can switch my attention to one branch or
another by cd'ing to the right place and adjusting a few environment
variables such as PATH and PGDATA.

The way I prepare a patch that has to be back-patched is first to make
and test the fix in HEAD.  Then apply it (using diff/patch and perhaps
manual adjustments) to the first back branch, and test that.  Repeat for
each back branch as far as I want to go.  Almost always, there is a
certain amount of manual adjustment involved due to renamings,
historical changes of pgindent rules, etc.  Once I have all the versions
tested, I prepare a commit message and commit all the branches.  This
results in one commit message per branch in the pgsql-committers
archives, and just one commit in the cvs2cl representation of the
history --- which is what I want.

I don't see any even-approximately-sane way to handle similar cases
in git.  From what I've learned so far, you can have one checkout
at a time in a git working tree, which would mean N copies of the
entire repository if I want N working trees.  Not to mention the
impossibility of getting it to regard parallel commits as related
in any way whatsoever.

So how is this normally done with git?
        regards, tom lane


Re: Managing multiple branches in git

From
"David E. Wheeler"
Date:
On Jun 2, 2009, at 8:43 AM, Tom Lane wrote:

> Each of these is configured (using --prefix) to install into a  
> separate
> installation tree.  So I can switch my attention to one branch or
> another by cd'ing to the right place and adjusting a few environment
> variables such as PATH and PGDATA.

Yeah, with git, rather than cd'ing to another directory, you'd just do  
`git checkout rel8_3` and work from the same directory.

> So how is this normally done with git?

For better or for worse, because git is project-oriented rather than  
filesystem-oriented, you can't commit to all the branches at once. You  
have to commit to each one independently. You can push them all back  
to the canonical repository at once, and the canonical repository's  
commit hooks can trigger for all of the commits at once (or so I  
gather from getting emails from GitHub with a bunch of commits listed  
in a single message), but each commit is still independent.

It has to do with the fundamentally different way in which Git works:  
snapshots of your code rather than different directories.

Best,

David



Re: Managing multiple branches in git

From
Tom Lane
Date:
"David E. Wheeler" <david@kineticode.com> writes:
> Yeah, with git, rather than cd'ing to another directory, you'd just do  
> `git checkout rel8_3` and work from the same directory.

That's what I'd gathered, and frankly it is not an acceptable answer.
Sure, the "checkout" operation is remarkably fast, but it does nothing
for derived files.  What would really be involved here (if I wanted to
be sure of having a non-broken build) is
        make maintainer-clean
        git checkout rel8_3
        configure
        make
which takes long enough that I'll have plenty of time to consider
how much I hate git.  If there isn't a better way proposed, I'm
going to flip back to voting against this conversion.  I need tools
that work for me not against me.
        regards, tom lane


Re: Managing multiple branches in git

From
"David E. Wheeler"
Date:
On Jun 2, 2009, at 9:03 AM, Tom Lane wrote:

> "David E. Wheeler" <david@kineticode.com> writes:
>> Yeah, with git, rather than cd'ing to another directory, you'd just  
>> do
>> `git checkout rel8_3` and work from the same directory.
>
> That's what I'd gathered, and frankly it is not an acceptable answer.
> Sure, the "checkout" operation is remarkably fast, but it does nothing
> for derived files.  What would really be involved here (if I wanted to
> be sure of having a non-broken build) is
>     make maintainer-clean
>     git checkout rel8_3
>     configure
>     make
> which takes long enough that I'll have plenty of time to consider
> how much I hate git.  If there isn't a better way proposed, I'm
> going to flip back to voting against this conversion.  I need tools
> that work for me not against me.

Well, you can have as many clones of a repository as you like. You can  
keep one with master checked out, another with rel8_3, another with  
rel8_2, etc. You'd just have to write a script to keep them in sync  
(shouldn't be too difficult, each just has all the others as an origin  
-- or maybe you have one that's canonical on your system).
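
A minimal sketch of such a sync script, assuming one local repository acts as the canonical copy and each branch has its own clone of it. The directory names (`canonical`, `master`, `REL8_3`) and the throwaway `mktemp -d` location are hypothetical stand-ins, not part of any real PostgreSQL setup:

```shell
set -e
# Throwaway identity so the demo commits work on any machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

top=$(mktemp -d)
cd "$top"

# Hypothetical canonical copy with a main branch and one back branch.
git init -q canonical
git -C canonical symbolic-ref HEAD refs/heads/master
echo one > canonical/file
git -C canonical add file
git -C canonical commit -q -m 'first commit'
git -C canonical branch rel8_3

# One clone per branch, each with the canonical copy as its origin.
git clone -q canonical master
git clone -q --branch rel8_3 canonical REL8_3

# New work arrives in the canonical copy...
echo two >> canonical/file
git -C canonical commit -q -am 'second commit'

# ...and the "sync script" is just a loop over the clones.
for tree in master REL8_3; do
    git -C "$tree" pull -q --ff-only
done
```

Using `--ff-only` makes the loop refuse to do anything surprising if a clone has local commits that have diverged from its origin.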

Best,

David



Re: Managing multiple branches in git

From
Alvaro Herrera
Date:
David E. Wheeler wrote:

> Well, you can have as many clones of a repository as you like. You can  
> keep one with master checked out, another with rel8_3, another with  
> rel8_2, etc. You'd just have to write a script to keep them in sync  
> (shouldn't be too difficult, each just has all the others as an origin -- 
> or maybe you have one that's canonical on your system).

Hmm, but is there a way to create those clones from a single local
"database"?

(I like the monotone model much better.  This mixing of working copies
and databases as if they were a single thing is silly and uncomfortable
to use.)

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Managing multiple branches in git

From
Greg Stark
Date:
Yeah I was annoyed by the issue with having to reconfigure as well.

There are various tricks you can do though with separate repositories.

You could have the older branch repositories be clones of HEAD branch  
repository so when you push from them the changes just go to that  
repository then you can push all three branches together (not sure if  
you can do it all in one command though)

You can also have the different repositories share data files which I  
think will mean you don't have to pull other people's commits  
repeatedly. (the default is to have local clones use hard links so  
they don't take a lot of space and they're quick to sync anyways.)

There's also an option to make a clone without the full history but  
for local clones they're fast enough to create anyways that there's  
probably no point.


Incidentally I use git-clean -x -d -f instead of make maintainer-clean.
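
A small sketch of what that clean-up does, assuming a throwaway repository whose file names (`main.c`, `config.status`, `autom4te.cache`) merely imitate build leftovers:

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q

# Tracked sources, plus an ignore rule for object files.
echo 'int main(void) { return 0; }' > main.c
echo '*.o' > .gitignore
git add main.c .gitignore
git commit -q -m 'initial commit'

# Simulated derived files: one ignored, one untracked, one untracked directory.
touch main.o config.status
mkdir autom4te.cache
touch autom4te.cache/output.0

# Dry run first: -n only lists what would be removed.
git clean -n -x -d

# -x also removes ignored files, -d removes untracked directories,
# -f is required before git will actually delete anything.
git clean -q -x -d -f
```

Unlike `make maintainer-clean`, this removes every untracked file, so uncommitted new source files are deleted too.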

-- 
Greg


On 2 Jun 2009, at 17:07, "David E. Wheeler" <david@kineticode.com>  
wrote:

> On Jun 2, 2009, at 9:03 AM, Tom Lane wrote:
>
>> "David E. Wheeler" <david@kineticode.com> writes:
>>> Yeah, with git, rather than cd'ing to another directory, you'd  
>>> just do
>>> `git checkout rel8_3` and work from the same directory.
>>
>> That's what I'd gathered, and frankly it is not an acceptable answer.
>> Sure, the "checkout" operation is remarkably fast, but it does  
>> nothing
>> for derived files.  What would really be involved here (if I wanted  
>> to
>> be sure of having a non-broken build) is
>>    make maintainer-clean
>>    git checkout rel8_3
>>    configure
>>    make
>> which takes long enough that I'll have plenty of time to consider
>> how much I hate git.  If there isn't a better way proposed, I'm
>> going to flip back to voting against this conversion.  I need tools
>> that work for me not against me.
>
> Well, you can have as many clones of a repository as you like. You  
> can keep one with master checked out, another with rel8_3, another  
> with rel8_2, etc. You'd just have to write a script to keep them in  
> sync (shouldn't be too difficult, each just has all the others as an  
> origin -- or maybe you have one that's canonical on your system).
>
> Best,
>
> David


Re: Managing multiple branches in git

From
"David E. Wheeler"
Date:
On Jun 2, 2009, at 9:16 AM, Alvaro Herrera wrote:

>> Well, you can have as many clones of a repository as you like. You  
>> can
>> keep one with master checked out, another with rel8_3, another with
>> rel8_2, etc. You'd just have to write a script to keep them in sync
>> (shouldn't be too difficult, each just has all the others as an  
>> origin --
>> or maybe you have one that's canonical on your system).
>
> Hmm, but is there a way to create those clones from a single local
> "database"?

Yeah, that's what I meant by a "canonical copy on your system."

> (I like the monotone model much better.  This mixing of working copies
> and databases as if they were a single thing is silly and  
> uncomfortable
> to use.)

Monotone?

Best,

David



Re: Managing multiple branches in git

From
Dave Page
Date:
On Tue, Jun 2, 2009 at 5:16 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> David E. Wheeler wrote:
>
>> Well, you can have as many clones of a repository as you like. You can
>> keep one with master checked out, another with rel8_3, another with
>> rel8_2, etc. You'd just have to write a script to keep them in sync
>> (shouldn't be too difficult, each just has all the others as an origin --
>> or maybe you have one that's canonical on your system).
>
> Hmm, but is there a way to create those clones from a single local
> "database"?

Just barely paying attention here, but isn't 'git clone --local' what you need?


-- 
Dave Page
EnterpriseDB UK:   http://www.enterprisedb.com


Re: Managing multiple branches in git

From
Aidan Van Dyk
Date:
* David E. Wheeler <david@kineticode.com> [090602 11:56]:
> On Jun 2, 2009, at 8:43 AM, Tom Lane wrote:
>
>> Each of these is configured (using --prefix) to install into a  
>> separate
>> installation tree.  So I can switch my attention to one branch or
>> another by cd'ing to the right place and adjusting a few environment
>> variables such as PATH and PGDATA.
>
> Yeah, with git, rather than cd'ing to another directory, you'd just do  
> `git checkout rel8_3` and work from the same directory.

But that loses his "configured" and "compiled" state...

But git isn't forcing him to change his workflow at all...

He *can* keep completely separate "git repositories" for each release
and work just as before.  This will carry with it a full "separate"
history in each repository, and I think that extra couple hundred MB is
what he's hoping to avoid.

But git has concepts of "object alternates" and "reference
repositories".  To mimic your workflow, I would probably do something
like:
    ## Make my reference repository, cloned from "official" where everyone pushes
    mountie@pumpkin:~/projects/postgresql$ git clone --bare --mirror git://repo.or.cz/PostgreSQL.git PostgreSQL.git

    ## Make my local master development repository
    mountie@pumpkin:~/projects/postgresql$ git clone --reference PostgreSQL.git git://repo.or.cz/PostgreSQL.git master
    Initialized empty Git repository in /home/mountie/projects/postgresql/master/.git/

    ## Make my local REL8_3_STABLE development repository
    mountie@pumpkin:~/projects/postgresql$ git clone --reference PostgreSQL.git git://repo.or.cz/PostgreSQL.git REL8_3_STABLE
    Initialized empty Git repository in /home/mountie/projects/postgresql/REL8_3_STABLE/.git/
    mountie@pumpkin:~/projects/postgresql$ cd REL8_3_STABLE/
    mountie@pumpkin:~/projects/postgresql/REL8_3_STABLE$ git checkout --track -b REL8_3_STABLE origin/REL8_3_STABLE
    Branch REL8_3_STABLE set up to track remote branch refs/remotes/origin/REL8_3_STABLE.
    Switched to a new branch 'REL8_3_STABLE'


Now, the master/REL8_3_STABLE directories are both complete git
repositories, independent of each other, except that they both reference
the "objects" in the PostgreSQL.git repository.  They don't contain the
historical objects in their own object store.  And I would couple that
with a cronjob:
*/15 * * * *    git --git-dir=$HOME/projects/postgresql/PostgreSQL.git fetch --quiet

which will keep my "reference" project up2date (a la rsync-the-CVSROOT,
or cvsup-a-mirror anybody currently has when working with CVS)...

Then Tom can keep working pretty much as he currently does.
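
The same layout can be tried without network access. This is a sketch only: a local repository named `upstream` (hypothetical) stands in for git://repo.or.cz/PostgreSQL.git, and everything lives under a throwaway `mktemp -d` directory:

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

top=$(mktemp -d)

# Hypothetical local stand-in for the official repository.
git init -q "$top/upstream"
cd "$top/upstream"
git symbolic-ref HEAD refs/heads/master
echo 'HEAD sources' > README
git add README
git commit -q -m 'initial import'
git branch REL8_3_STABLE

# Reference repository (--mirror implies --bare).
git clone -q --mirror "$top/upstream" "$top/PostgreSQL.git"

# Development repositories that borrow objects from the reference repository.
git clone -q --reference "$top/PostgreSQL.git" "$top/upstream" "$top/master"
git clone -q --reference "$top/PostgreSQL.git" "$top/upstream" "$top/REL8_3_STABLE"
cd "$top/REL8_3_STABLE"
git checkout -q --track -b REL8_3_STABLE origin/REL8_3_STABLE

# Each clone records the reference repository's object store here.
cat "$top/master/.git/objects/info/alternates"

# What the cron job would run to keep the reference repository current.
git --git-dir="$top/PostgreSQL.git" fetch --quiet
```

The `alternates` file is what ties each development repository to the shared object store, which is why objects in the reference repository must not be pruned while clones still depend on them.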

a.

-- 
Aidan Van Dyk                                             Create like a god,
aidan@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.

Re: Managing multiple branches in git

From
Marko Kreen
Date:
On 6/2/09, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> [ it's way past time for a new subject thread ]
>
>  Marko Kreen <markokr@gmail.com> writes:
>  > They cannot be same commits in GIT as the resulting tree is different.
>
>  This brings up something that I've been wondering about: my limited
>  exposure to git hasn't shown me any sane way to work with multiple
>  release branches.
>
>  The way that I have things set up for CVS is that I have a checkout
>  of HEAD, and also "sticky" checkouts of the back branches:
>         pgsql/ ...
>         REL8_3/pgsql/ ... (made with -r REL8_3_STABLE)
>         REL8_2/pgsql/ ...
>         etc
>
>  Each of these is configured (using --prefix) to install into a separate
>  installation tree.  So I can switch my attention to one branch or
>  another by cd'ing to the right place and adjusting a few environment
>  variables such as PATH and PGDATA.
>
>  The way I prepare a patch that has to be back-patched is first to make
>  and test the fix in HEAD.  Then apply it (using diff/patch and perhaps
>  manual adjustments) to the first back branch, and test that.  Repeat for
>  each back branch as far as I want to go.  Almost always, there is a
>  certain amount of manual adjustment involved due to renamings,
>  historical changes of pgindent rules, etc.  Once I have all the versions
>  tested, I prepare a commit message and commit all the branches.  This
>  results in one commit message per branch in the pgsql-committers
>  archives, and just one commit in the cvs2cl representation of the
>  history --- which is what I want.
>
>  I don't see any even-approximately-sane way to handle similar cases
>  in git.  From what I've learned so far, you can have one checkout
>  at a time in a git working tree, which would mean N copies of the
>  entire repository if I want N working trees.  Not to mention the
>  impossibility of getting it to regard parallel commits as related
>  in any way whatsoever.

Whether you use several branches in one tree or several checked out
trees should be a personal preference, both ways are possible with GIT.

>  So how is this normally done with git?

If you are talking about backbranch fixes, then the "most
version-controlled" way to do it would be to use the lowest branch as
base, commit the fix there and then merge it upwards.

Now whether it succeeds depends on the merge points between branches,
as the VCS takes the nearest merge point as the base to launch merge logic on.

I think that is also the actual thing that Markus is concerned about.

But instead of having random merge points between branches that depend
on when some new file was added, we could simply import all branches
with linear history and later simply say to git that:
 * 7.4 is merged into 8.0
 ..
 * 8.2 is merged into 8.3
 * 8.3 is merged into HEAD

without any file changes.  Logically this would mean that "any changes in
branch N-1 are already in N".

So afterwards when working fully with GIT any upwards merges
work without any fuss as it does not need to consider old history
imported from CVS at all.
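
One way to record such a "merged, but no file changes" point is git's `ours` merge strategy. The sketch below assumes that is what is meant here; the repository, file names, and commit messages are made up for illustration:

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

# Two branches with independent ("linear, imported") histories.
echo base > file.c
git add file.c
git commit -q -m 'common ancestor'
git branch REL8_3_STABLE
echo 'head work' >> file.c
git commit -q -am 'HEAD-only change'
git checkout -q REL8_3_STABLE
echo 'old back-branch fix' > backfix.c
git add backfix.c
git commit -q -m 'back-branch-only change'

# Declare "8.3 is merged into HEAD" without touching any file.
git checkout -q master
before=$(git rev-parse 'HEAD^{tree}')
git merge -q -s ours -m 'Mark REL8_3_STABLE as merged into HEAD' REL8_3_STABLE
after=$(git rev-parse 'HEAD^{tree}')

# A later fix on the back branch now merges upwards on its own.
git checkout -q REL8_3_STABLE
echo 'new fix' > fix.c
git add fix.c
git commit -q -m 'Fix bug'
git checkout -q master
git merge -q -m 'Merge fix from REL8_3_STABLE' REL8_3_STABLE
```

After the second merge, master gains `fix.c` but not the older `backfix.c`, because the marker merge told git that everything before it was already accounted for.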

-- 
marko


Re: Managing multiple branches in git

From
Andres Freund
Date:
On 06/02/2009 05:43 PM, Tom Lane wrote:
> Marko Kreen<markokr@gmail.com>  writes:
>> They cannot be same commits in GIT as the resulting tree is different.
> I don't see any even-approximately-sane way to handle similar cases
> in git.  From what I've learned so far, you can have one checkout
> at a time in a git working tree, which would mean N copies of the
> entire repository if I want N working trees.  Not to mention the
> impossibility of getting it to regard parallel commits as related
> in any way whatsoever.
You can use the "--reference" option to git clone to refer to objects in 
another clone. That way most of the commits will only be stored in there 
- only the local commits will be in the local checkout.


Andres


Re: Managing multiple branches in git

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Hmm, but is there a way to create those clones from a single local
> "database"?

> (I like the monotone model much better.  This mixing of working copies
> and databases as if they were a single thing is silly and uncomfortable
> to use.)

I agree, .git as a subdirectory of the working directory doesn't make
much sense to me.

I wondered for a second about symlinking .git from several checkout
directories to a common master, but AFAICT .git stores both the
"repository" and status information about the current checkout, so
that's not gonna work.

In the one large project that I have a git tree for, .git seems to
eat only about as much disk space as the checkout (so apparently the
compression is pretty effective).  So it wouldn't be totally impractical
to have a separate repository for each branch, but it sure seems like
an ugly and klugy way to do it.  And we'd still end up with the "same"
commit on different branches appearing entirely unrelated.

At the same time, I don't really buy the theory that relating commits on
different branches via merges will work.  In my experience it is very
seldom the case that a patch applies to each back branch with no manual
effort whatever, which is what I gather the merge functionality could
help with.  So maybe there's not much help to be had on this ...
        regards, tom lane


Re: Managing multiple branches in git

From
Andres Freund
Date:
On 06/02/2009 06:33 PM, Tom Lane wrote:
> At the same time, I don't really buy the theory that relating commits on
> different branches via merges will work.  In my experience it is very
> seldom the case that a patch applies to each back branch with no manual
> effort whatever, which is what I gather the merge functionality could
> help with.  So maybe there's not much help to be had on this ...
You can do a merge and change the commit during that - this way you get 
the merge tracking information correct although you did a merge so that 
further merge operations can consider the specific change to be applied 
on both/some/all branches.
This will happen by default if there is a merge conflict or can be 
forced by using the --no-commit option to merge.
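
A sketch of that sequence, assuming a throwaway repository in which the fix is made on the back branch and merged upwards. File names and messages are hypothetical:

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

echo 'old behaviour' > parser.c
git add parser.c
git commit -q -m 'common ancestor'
git branch REL8_3_STABLE

# master moves on independently.
echo 'new feature' > newfeature.c
git add newfeature.c
git commit -q -m 'Add feature on HEAD'

# The fix is committed on the back branch.
git checkout -q REL8_3_STABLE
echo 'fixed behaviour' > parser.c
git commit -q -am 'Fix parser bug'

# Merge upwards, but stop before the merge commit is created.
git checkout -q master
git merge -q --no-commit REL8_3_STABLE

# Manual adjustment for this branch, then conclude the merge.
echo '/* adjusted for HEAD */' >> parser.c
git add parser.c
git commit -q -m 'Merge parser fix from REL8_3_STABLE, adjusted for HEAD'
```

The resulting commit has two parents, so later merges from REL8_3_STABLE treat the fix as already applied even though its content was adjusted.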

Andres


Re: Managing multiple branches in git

From
Marko Kreen
Date:
On 6/2/09, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>  > Hmm, but is there a way to create those clones from a single local
>  > "database"?
>
>  > (I like the monotone model much better.  This mixing of working copies
>  > and databases as if they were a single thing is silly and uncomfortable
>  > to use.)
>
>
> I agree, .git as a subdirectory of the working directory doesn't make
>  much sense to me.
>
>  I wondered for a second about symlinking .git from several checkout
>  directories to a common master, but AFAICT .git stores both the
>  "repository" and status information about the current checkout, so
>  that's not gonna work.

You cannot share .git, but you can share object directory (.git/objects).
Which contains the bulk data.  There are various ways to do it, symlink
should be one of them.

>  In the one large project that I have a git tree for, .git seems to
>  eat only about as much disk space as the checkout (so apparently the
>  compression is pretty effective).  So it wouldn't be totally impractical
>  to have a separate repository for each branch, but it sure seems like
>  an ugly and klugy way to do it.  And we'd still end up with the "same"
>  commit on different branches appearing entirely unrelated.
>
>  At the same time, I don't really buy the theory that relating commits on
>  different branches via merges will work.  In my experience it is very
>  seldom the case that a patch applies to each back branch with no manual
>  effort whatever, which is what I gather the merge functionality could
>  help with.  So maybe there's not much help to be had on this ...

Sure, if branches are different enough, the merge commit would
contain lot of code changes.  But still - you would get single "main"
commit with log message, plus bunch of merge commits, which may be
nicer than several duplicate commits.

-- 
marko


Re: Managing multiple branches in git

From
"David E. Wheeler"
Date:
On Jun 2, 2009, at 9:23 AM, Aidan Van Dyk wrote:

>> Yeah, with git, rather than cd'ing to another directory, you'd just  
>> do
>> `git checkout rel8_3` and work from the same directory.
>
> But that loses his "configured" and "compiled" state...
>
> But git isn't forcing him to change his workflow at all...

I defer to your clearly superior knowledge. Git is simple, but there  
is *so* much to learn!

David


Re: Managing multiple branches in git

From
Ron Mayer
Date:
Tom Lane wrote:
> Marko Kreen <markokr@gmail.com> writes:
>> They cannot be same commits in GIT as the resulting tree is different.
> The way I prepare a patch that has to be back-patched is first to make
> and test the fix in HEAD.  Then apply it (using diff/patch and perhaps
> manual adjustments) to the first back branch, and test that.  Repeat for
> each back branch as far as I want to go.  Almost always, there is a
> certain amount of manual adjustment involved due to renamings,
> historical changes of pgindent rules, etc.  Once I have all the versions
> tested, I prepare a commit message and commit all the branches.  This
> results in one commit message per branch in the pgsql-committers
> archives, and just one commit in the cvs2cl representation of the
> history --- which is what I want.

I think the closest equivalent to what you're doing here is:
 "git cherry-pick -n -x <the commit you want to pull>"

The "git cherry-pick" command does something similar to the diff/patch work.
The "-n" prevents an automatic commit, to allow for manual adjustments.
The "-x" flag adds a note to the commit comment describing the relationship
between the commits.

It seems to me we could make a cvs2cl like script that's aware
of the comments "git-cherry-pick -x" inserts and rolls them up
in a similar way that cvs2cl does.
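
A sketch of the note that "-x" leaves behind and how a script could find it, assuming a throwaway repository with made-up file names. For simplicity it omits "-n", so the cherry-pick commits immediately:

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

echo 'old behaviour' > parser.c
git add parser.c
git commit -q -m 'common ancestor'
git branch REL8_3_STABLE

# Unrelated work on HEAD, then the fix itself.
echo 'new feature' > newfeature.c
git add newfeature.c
git commit -q -m 'Add feature on HEAD'
echo 'fixed behaviour' > parser.c
git commit -q -am 'Fix parser bug'
fix_commit=$(git rev-parse HEAD)

# Back-patch the fix; -x appends a "(cherry picked from commit ...)" line.
git checkout -q REL8_3_STABLE
git cherry-pick -x "$fix_commit" > /dev/null
git log -1 --format=%B

# A cvs2cl-like roll-up could group commits by searching for that line.
related=$(git log --all --format=%H --grep="cherry picked from commit $fix_commit")
echo "back-patched as: $related"
```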




> The way that I have things set up for CVS is that I have a checkout
> of HEAD, and also "sticky" checkouts of the back branches...
> Each of these is configured (using --prefix) to install into a separate
> installation tree. ...

I think the most similar thing here would be for you to have one
normal clone of the "official" repository, and then use
git-clone --local
when you set up the back branch directories.  The --local flag will
use hard-links to avoid wasting space & time of maintaining multiple
copies of histories.
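
A sketch of that layout, assuming a local repository named `pgsql` (hypothetical) plays the role of the normal clone of the official repository:

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

top=$(mktemp -d)

# Stand-in for the one normal clone of the official repository.
git init -q "$top/pgsql"
cd "$top/pgsql"
git symbolic-ref HEAD refs/heads/master
echo 'HEAD sources' > README
git add README
git commit -q -m 'initial import'
git branch REL8_3_STABLE
git branch REL8_2_STABLE

# One directory per back branch; --local hard-links object files
# where the filesystem allows it instead of copying them.
for rel in REL8_3 REL8_2; do
    git clone -q --local --branch "${rel}_STABLE" "$top/pgsql" "$top/$rel"
done

git -C "$top/REL8_3" rev-parse --abbrev-ref HEAD
git -C "$top/REL8_2" rev-parse --abbrev-ref HEAD
```

Each back-branch directory then has the local `pgsql` clone as its origin, so commits made there are pushed to that clone first and from there to the official repository.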

> I don't see any even-approximately-sane way to handle similar cases
> in git.  From what I've learned so far, you can have one checkout
> at a time in a git working tree, which would mean N copies of the
> entire repository if I want N working trees....

git-clone --local avoids that.

> ... Not to mention the
> impossibility of getting it to regard parallel commits as related
> in any way whatsoever.

Well - "related in any way whatsoever" seems possible either
through the comments added in the "-x" flag in git-cherry-pick, or
with the other workflows people described where you fix the bug in
a new branch off some ancestor of all the releases (ideally near
where the bug occurred) and merge them into the branches.


> So how is this normally done with git?




Re: Managing multiple branches in git

From
Aidan Van Dyk
Date:
* Tom Lane <tgl@sss.pgh.pa.us> [090602 12:35]:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > Hmm, but is there a way to create those clones from a single local
> > "database"?
> 
> > (I like the monotone model much better.  This mixing of working copies
> > and databases as if they were a single thing is silly and uncomfortable
> > to use.)
> 
> I agree, .git as a subdirectory of the working directory doesn't make
> much sense to me.

The main reason why git uses this is that the "index" (git equivalent of
the CVS/*) resides in 1 place instead of in each directory.  So, if you
have multiple working directories sharing a single .git, you get them
tromping on each other's "index".

That said, you can symlink almost everything *inside* .git to other
repositories.

For instance, if you had the "Reference" repository I showed last time,
instead of doing the "git clone", you could do:
    # Make a new REL8_2_STABLE working area
    mountie@pumpkin:~/pg-work$ REF=$(pwd)/PostgreSQL.git
    mountie@pumpkin:~/pg-work$ mkdir REL8_2_STABLE
    mountie@pumpkin:~/pg-work$ cd REL8_2_STABLE/
    mountie@pumpkin:~/pg-work/REL8_2_STABLE$ git init

    # And now make everything point back
    mountie@pumpkin:~/pg-work/REL8_2_STABLE$ mkdir .git/refs/remotes && ln -s $REF/refs/heads .git/refs/remotes/origin
    mountie@pumpkin:~/pg-work/REL8_2_STABLE$ rm -Rf .git/objects && ln -s $REF/objects .git/objects
    mountie@pumpkin:~/pg-work/REL8_2_STABLE$ rmdir .git/refs/tags && ln -s $REF/refs/tags .git/refs/tags
    mountie@pumpkin:~/pg-work/REL8_2_STABLE$ rm -Rf .git/info && ln -s $REF/info .git/info
    mountie@pumpkin:~/pg-work/REL8_2_STABLE$ rm -Rf .git/hooks && ln -s $REF/hooks

This will leave you with an independent config, independent index,
independent heads, and independent reflogs, with a shared "remote"
tracking branches, shared "object" store, shared "tags", and shared
hooks.

And make sure you don't purge any unused objects out of any of these
subdirs, because they don't know that the object might be in use in
another subdir...  This warning is the one reason why it's usually
recommended to just use a reference repository, and not have to worry..

a.

-- 
Aidan Van Dyk                                             Create like a god,
aidan@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.

Re: Managing multiple branches in git

From
Mark Mielke
Date:
Tom Lane wrote:
> I agree, .git as a subdirectory of the working directory doesn't make
> much sense to me.
>
> I wondered for a second about symlinking .git from several checkout
> directories to a common master, but AFAICT .git stores both the
> "repository" and status information about the current checkout, so
> that's not gonna work.
>
> In the one large project that I have a git tree for, .git seems to
> eat only about as much disk space as the checkout (so apparently the
> compression is pretty effective).  So it wouldn't be totally impractical
> to have a separate repository for each branch, but it sure seems like
> an ugly and klugy way to do it.  And we'd still end up with the "same"
> commit on different branches appearing entirely unrelated

I am curious about why an end user would really care? CVS and SVN both 
kept local workspace directories containing metadata. If anything, I 
find GIT the least intrusive of these three, as the .git is only in the 
top-level directory, whereas CVS and SVN like to pollute every directory.

Assuming you don't keep binaries under source control, the .git 
containing all history is very often smaller than the "pristine copy" 
kept by CVS or SVN in their metadata directories, so space isn't really 
the issue.

Maybe think of it more like a feature. GIT keeps a local cache of the 
entire repo, whereas SVN and CVS only keep a local cache of the commit 
you are based on. It's a feature that you can review history without 
network connectivity.

Cheers,
mark

-- 
Mark Mielke <mark@mielke.cc>



Re: Managing multiple branches in git

From
Alvaro Herrera
Date:
Mark Mielke wrote:

> I am curious about why an end user would really care? CVS and SVN both  
> kept local workspace directories containing metadata. If anything, I  
> find GIT the least intrusive of these three, as the .git is only in the  
> top-level directory, whereas CVS and SVN like to pollute every directory.

That's not the problem.  The problem is that it is kept in the same
directory as the checked out copy.  It would be a lot more usable if it
was possible to store it elsewhere.

Yes, the .svn directories are a PITA.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Managing multiple branches in git

From
Marko Kreen
Date:
On 6/2/09, Alvaro Herrera <alvherre@commandprompt.com> wrote:
> Mark Mielke wrote:
>
>  > I am curious about why an end user would really care? CVS and SVN both
>  > kept local workspace directories containing metadata. If anything, I
>  > find GIT the least intrusive of these three, as the .git is only in the
>  > top-level directory, whereas CVS and SVN like to pollute every directory.
>
>
> That's not the problem.  The problem is that it is kept in the same
>  directory as the checked out copy.  It would be a lot more usable if it
>  was possible to store it elsewhere.

export GIT_DIR=...
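
A sketch of keeping the repository outside the checked-out copy, assuming GIT_WORK_TREE is set alongside GIT_DIR. The paths `pgsql.git` and `checkout` are hypothetical, and moving an existing `.git` directory is just one convenient way to get there:

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

top=$(mktemp -d)

# Start from an ordinary repository, then move its .git elsewhere.
git init -q "$top/checkout"
mv "$top/checkout/.git" "$top/pgsql.git"

# Tell git where the repository and the working tree now live.
export GIT_DIR="$top/pgsql.git"
export GIT_WORK_TREE="$top/checkout"

# Normal commands keep working; no .git directory sits in the checkout.
cd "$top/checkout"
echo hello > README
git add README
git commit -q -m 'initial commit'
git status --short
```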

-- 
marko


Re: Managing multiple branches in git

From
Heikki Linnakangas
Date:
Andres Freund wrote:
> On 06/02/2009 06:33 PM, Tom Lane wrote:
>> At the same time, I don't really buy the theory that relating commits on
>> different branches via merges will work.  In my experience it is very
>> seldom the case that a patch applies to each back branch with no manual
>> effort whatever, which is what I gather the merge functionality could
>> help with.  So maybe there's not much help to be had on this ...
> You can do a merge and change the commit during that - this way you get 
> the merge tracking information correct although you did a merge so that 
> further merge operations can consider the specific change to be applied 
> on both/some/all branches.
> This will happen by default if there is a merge conflict or can be 
> forced by using the --no-commit option to merge.

Yeah, that should work fine.

However, handling fixes to multiple branches by merging the release 
branches to master seems awkward to me. A merge will merge *all* commits 
in the release branch. Including "stamp 8.3.1" commits, and fixes for 
issues in release branches that are not present in master.

Cherry-picking seems like the best approach.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


Re: Managing multiple branches in git

From
Aidan Van Dyk
Date:
* Alvaro Herrera <alvherre@commandprompt.com> [090602 13:25]:

> That's not the problem.  The problem is that it is kept in the same
> directory as the checked out copy.  It would be a lot more usable if it
> was possible to store it elsewhere.
> 
> Yes, the .svn directories are a PITA.

You can export GIT_DIR to make the .git directory be somewhere else...
and you'll probably want a corresponding GIT_WORK_TREE (or core.worktree
config) set.

If you're careful (i.e. don't make a mistake), you can set GIT_DIR and
GIT_INDEX_FILE AND GIT_WORK_TREE, and use a single "git repository"
among multiple independent "working directories".

That said, is the carefulness needed to work that way worth the < 200KB
you save?

On a "referenced" style development repository:
    mountie@pumpkin:~/pg-work/REL8_3_STABLE$ du -shc .git/*
    4.0K    .git/branches
    4.0K    .git/config
    4.0K    .git/description
    4.0K    .git/HEAD
    48K     .git/hooks
    328K    .git/index
    8.0K    .git/info
    36K     .git/logs
    16K     .git/objects
    4.0K    .git/packed-refs
    32K     .git/refs
    488K    total


488K total in the .git directory, 328K of that is the index.
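
A rough sketch of the GIT_DIR / GIT_WORK_TREE / GIT_INDEX_FILE arrangement described above, assuming a throwaway bare repository under a temporary directory. All paths, the file name and the identity are hypothetical. It also shows where the care is needed: HEAD lives in the one shared repository, so the second directory is only populated from the branch here and is not committed from.

```shell
set -eu
# Hypothetical layout: one repository, two separate working directories.
base=$(mktemp -d)
repo="$base/repo.git"
git init -q --bare "$repo"

# First working directory, with its own index file.
mkdir "$base/head"
cd "$base/head"
export GIT_DIR="$repo" GIT_WORK_TREE="$base/head" GIT_INDEX_FILE="$repo/index.head"
echo "on the main line" > NOTES
git add NOTES
git -c user.name=demo -c user.email=demo@example.invalid commit -q -m "initial commit"
git branch REL8_3_STABLE

# Second working directory: same repository, different index file.
mkdir "$base/rel8_3"
cd "$base/rel8_3"
export GIT_WORK_TREE="$base/rel8_3" GIT_INDEX_FILE="$repo/index.rel8_3"
git read-tree REL8_3_STABLE     # load the branch's tree into this index
git checkout-index -a           # write the files out, no .git created here
# Caution: HEAD is shared, so a commit made from here would land on
# whatever branch HEAD currently names, not necessarily REL8_3_STABLE.
```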

a.

-- 
Aidan Van Dyk                                             Create like a god,
aidan@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.

Re: Managing multiple branches in git

From
Andrew Dunstan
Date:

Tom Lane wrote:
> "David E. Wheeler" <david@kineticode.com> writes:
>   
>> Yeah, with git, rather than cd'ing to another directory, you'd just do  
>> `git checkout rel8_3` and work from the same directory.
>>     
>
> That's what I'd gathered, and frankly it is not an acceptable answer.
> Sure, the "checkout" operation is remarkably fast, but it does nothing
> for derived files.  What would really be involved here (if I wanted to
> be sure of having a non-broken build) is
>     make maintainer-clean
>     git checkout rel8_3
>     configure
>     make
> which takes long enough that I'll have plenty of time to consider
> how much I hate git.  If there isn't a better way proposed, I'm
> going to flip back to voting against this conversion.  I need tools
> that work for me not against me.
>
>     

Hmm.  I confess that I never switch between CVS branches. Instead I keep 
a separate tree for each maintained branch.  And that's what the 
buildfarm does and will continue doing with git. Maybe that's not as 
efficient a way for a developer to work, I don't know.

Of course, your work rate gives you much more weight in this discussion 
than me ;-)

cheers

andrew


Re: Managing multiple branches in git

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> Hmm.  I confess that I never switch between CVS branches. Instead I keep 
> a separate tree for each maintained branch.

Right, exactly, and that's the workflow I want to maintain with git.
Having to rebuild the derived files every time I look at a different
branch is too much overhead.
        regards, tom lane


Re: Managing multiple branches in git

From
Mark Mielke
Date:
Alvaro Herrera wrote:
> Mark Mielke wrote:
>> I am curious about why an end user would really care? CVS and SVN both
>> kept local workspace directories containing metadata. If anything, I
>> find GIT the least intrusive of these three, as the .git is only in the
>> top-level directory, whereas CVS and SVN like to pollute every directory.
>
> That's not the problem.  The problem is that it is kept in the same
> directory as the checked out copy.  It would be a lot more usable if it
> was possible to store it elsewhere.

I'm not following. CVS and SVN both kept such directories "in the
checked out copy." Recall the CSV/*,v files?

As for storing it elsewhere - if you absolutely must, you can. There is a
--git-dir=GIT_DIR and --work-tree=GIT_WORK_TREE option to all git
commands, and GIT_DIR / GIT_WORK_TREE environment variables.

I just don't understand why you care. If the CVS directories didn't bug
you before, why does the single .git directory bug you now? I'm
genuinely interested as I don't get it. :-)

Cheers,
mark

-- 
Mark Mielke <mark@mielke.cc>

Re: Managing multiple branches in git

From
Alvaro Herrera
Date:
Mark Mielke wrote:

> I just don't understand why you care. If the CVS directories didn't bug  
> you before, why does the single .git directory bug you now? I'm  
> genuinely interested as I don't get it. :-)

It doesn't.  What bugs me is that the database (the "pulled" tree if you
will) is stored in it.  It has already been pointed out how to put it
elsewhere, so no need to explain that.

What *really* bugs me is that it's so difficult to have one "pulled"
tree and create a bunch of checked out copies from that.

(In the CVS world, I kept a single rsync'ed copy of the anoncvs
repository, and I could do multiple "cvs checkout" copies from there
using different branches.)

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: Managing multiple branches in git

From
Mark Mielke
Date:
Alvaro Herrera wrote:
> Mark Mielke wrote:
>> I just don't understand why you care. If the CVS directories didn't bug
>> you before, why does the single .git directory bug you now? I'm
>> genuinely interested as I don't get it. :-)
>
> It doesn't.  What bugs me is that the database (the "pulled" tree if you
> will) is stored in it.  It has already been pointed out how to put it
> elsewhere, so no need to explain that.
>
> What *really* bugs me is that it's so difficult to have one "pulled"
> tree and create a bunch of checked out copies from that.
>
> (In the CVS world, I kept a single rsync'ed copy of the anoncvs
> repository, and I could do multiple "cvs checkout" copies from there
> using different branches.)

You say "database", but unless you assume you know what is in it, .git
isn't really different from CVS/ or .svn. It's workspace metadata. Size
might concern you, except that it's generally smaller than CVS/ or .svn.
Content might concern you, until you realize that being able to look
through history without accessing the network is a feature, not a
problem. Time to prepare the workspace might concern you, but I haven't
seen people time the difference between building a cvs checkout vs a git
clone.

You talk about avoiding downloads by rsync'ing the CVS repository. You
can do nearly the exact same thing in GIT:

1) Create a 'git clone --bare' that is kept up-to-date with 'git fetch'.
   This is your equivalent to an rsync'ed copy of the anoncvs repository.
2) Use 'git clone' from your local bare repo, or from the remote using
   the local bare repo as a reference. Either hard links, or as a
   reference no links at all will keep your clone smaller than either a
   CVS or an SVN checkout.

Mainly, I want to point out that the existence of ".git" is not a real
problem - it's certainly no worse than before.

Cheers,
mark

-- 
Mark Mielke <mark@mielke.cc>

Re: Managing multiple branches in git

From
Andres Freund
Date:
On 06/02/2009 09:38 PM, Alvaro Herrera wrote:
> Mark Mielke wrote:
>
>> I just don't understand why you care. If the CVS directories didn't bug
>> you before, why does the single .git directory bug you now? I'm
>> genuinely interested as I don't get it. :-)
>
> It doesn't.  What bugs me is that the database (the "pulled" tree if you
> will) is stored in it.  It has already been pointed out how to put it
> elsewhere, so no need to explain that.
>
> What *really* bugs me is that it's so difficult to have one "pulled"
> tree and create a bunch of checked out copies from that.
I don't see where the difficulty resides?

#Setup a base repository
cd /../master
git clone [--bare] git://git.postgresql.org/whatever .


#Method 1
cd /../child1
git clone --reference /../master/ git://git.postgresql.org/whatever .
cd /../child2
git clone --reference /../master/ git://git.postgresql.org/whatever .

This way you can fetch from the git url without problem, but when an
object is available locally it is not downloaded again.

#Method2
cd /../child3
git clone --shared /../postgresql/ child3
...
This way you only fetch from your "pulled" tree and never possibly from 
the upstream one.
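
As a sketch of Method 1, here is the same sequence run entirely on local disk so it can be tried without network access. The "upstream" directory stands in for git://git.postgresql.org/whatever, and every path, file name and identity below is hypothetical. The child clone keeps the upstream URL as its origin but borrows objects from the local base repository through an alternates file.

```shell
set -eu
base=$(mktemp -d)

# Stand-in for the remote project repository (hypothetical content).
git init -q "$base/upstream"
cd "$base/upstream"
git config user.name "Demo Committer"
git config user.email demo@example.invalid
echo demo > file
git add file
git commit -q -m "initial"

# Base repository: a bare clone that can be refreshed with 'git fetch'.
git clone -q --bare "file://$base/upstream" "$base/master.git"

# Child clone: fetches from upstream, but reuses objects already present
# in the local base repository instead of downloading them again.
git clone -q --reference "$base/master.git" "file://$base/upstream" "$base/child1"

# The borrowing is recorded here:
cat "$base/child1/.git/objects/info/alternates"
```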

Andres


Re: Managing multiple branches in git

From
Andrew Dunstan
Date:

Tom Lane wrote:
> Once I have all the versions
> tested, I prepare a commit message and commit all the branches.  This
> results in one commit message per branch in the pgsql-committers
> archives, and just one commit in the cvs2cl representation of the
> history --- which is what I want.
>
>
>   

I think the 'just one commit' view is going to be the hard piece. Other 
than that, there will probably be some minor annoyances, but that's to 
be expected in any switch, I think.

Of course, it's open source so if someone wants to work on multibranch 
commit to make our life easier ... ;-)

cheers

andrew


Re: Managing multiple branches in git

From
Tom Lane
Date:
Mark Mielke <mark@mark.mielke.cc> writes:
> Alvaro Herrera wrote:
>> That's not the problem.  The problem is that it is kept in the same
>> directory as the checked out copy.  It would be a lot more usable if it
>> was possible to store it elsewhere.

> I'm not following. CVS and SVN both kept such directories "in the 
> checked out copy." Recall the CSV/*,v files?

I can't speak to SVN, but that is *not* how CVS does it.  There's a
small CVS/ directory, but the repository (with all the ,v files)
is somewhere else.  In particular I can have N different checked-out
working copies without duplicating the repository.

> I just don't understand why you care. If the CVS directories didn't bug 
> you before, why does the single .git directory bug you now?

(1) size (ok, not a showstopper)
(2) potential for error

Blowing away your working directory shouldn't result in loss of your
entire project history.
        regards, tom lane


Re: Managing multiple branches in git

From
Robert Haas
Date:
On Tue, Jun 2, 2009 at 3:38 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> What *really* bugs me is that it's so difficult to have one "pulled"
> tree and create a bunch of checked out copies from that.

Yeah.  It basically doesn't work, hacks to the contrary on this thread
notwithstanding, and I'm sympathetic to Tom's pain as I spend a fair
amount of time switching branches, doing git-clean -dfx && configure
&& make check && make install.

Of course in my cases they are usually private branches rather than
back branches, but the problem is the same.

And, unfortunately, I'm not sure there's a good solution.  Tom could
create 1 local repository cloned from the origin and then N-1 copies
cloned with --local from that one, but this sort of defeats the
purpose of using git, because now if he commits a change to one of
them and then wants to apply that change to each back branch, he's got
to fetch that change on each one, cherry-pick it, make his changes,
commit, and then push it back to his main repository.  Some of this
could probably be automated using scripts and post-commit hooks, but
even so it's a nuisance, and if you ever want to reset or rebase
(before pushing to origin, of course) it gets even more annoying.

I wonder whether it would help with this problem if we had a way to
locate the build products outside the tree, and maybe fix things up so
that you can make the build products go to a different location
depending on which branch you're on.  I personally find it incredibly
convenient to be able to check out a different branch without losing
track of "where I am" in the tree.  So if I'm in
$HOME/pgsql-git/src/backend/commands and I switch to a new branch, I'm
still in that same directory, versus having to cd around.  So in
general I find the git way of doing things to be very convenient, but
needing to rebuild all the intermediates sucks.

...Robert


Re: Managing multiple branches in git

From
Robert Haas
Date:
On Tue, Jun 2, 2009 at 3:58 PM, Andres Freund <andres@anarazel.de> wrote:
> On 06/02/2009 09:38 PM, Alvaro Herrera wrote:
>>
>> Mark Mielke wrote:
>>
>>> I just don't understand why you care. If the CVS directories didn't bug
>>> you before, why does the single .git directory bug you now? I'm
>>> genuinely interested as I don't get it. :-)
>>
>> It doesn't.  What bugs me is that the database (the "pulled" tree if you
>> will) is stored in it.  It has already been pointed out how to put it
>> elsewhere, so no need to explain that.
>>
>> What *really* bugs me is that it's so difficult to have one "pulled"
>> tree and create a bunch of checked out copies from that.
>
> I dont see were the difficulty resides?
>
> #Setup a base repository
> cd /../master
> git [--bare] clone git://git.postgresql.org/whatever .
>
>
> #Method 1
> cd /../child1
> git clone --reference /../master/ git://git.postgresql.org/whatever .
> cd /../child2
> git clone --reference /../master/ git://git.postgresql.org/whatever .
>
> This way you can fetch from the git url without problem, but when a object
> is available locally it is not downloaded again.

Yeah but now you have to push and pull commits between your numerous
local working copies.  Boo, hiss.

> #Method2
> cd /../child3
> git clone --shared /../postgresql/ child3
> ...
> This way you only fetch from your "pulled" tree and never possibly from the
> upstream one.

This is so unsafe it's not even worth talking about.  See git-clone(1).

...Robert


Re: Managing multiple branches in git

From
Robert Haas
Date:
On Tue, Jun 2, 2009 at 4:09 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Blowing away your working directory shouldn't result in loss of your
> entire project history.

Such an outcome could not possibly be less likely with any other
system than it is with git.  Every single developer has a copy of your
entire history, as does the origin server and the public mirror of the
origin server.  There are so many copies of the entire project history
that you'd need an asteroid to obliterate it.

The real potential for confusion with git has to do with the need to
explicitly move commits between repositories, which is a problem that
doesn't exist in CVS or SVN where There Can Only Be One.  That is not
really a problem (in fact, it's really nice) when each developer uses
a single repository, but in your situation (1 developer, multiple
repositories) it's potentially quite a nuisance.

...Robert


Re: Managing multiple branches in git

From
Mark Mielke
Date:
Tom Lane wrote:
> Mark Mielke <mark@mark.mielke.cc> writes:
>> I'm not following. CVS and SVN both kept such directories "in the
>> checked out copy." Recall the CSV/*,v files?
>
> I can't speak to SVN, but that is *not* how CVS does it.  There's a
> small CVS/ directory, but the repository (with all the ,v files)
> is somewhere else.  In particular I can have N different checked-out
> working copies without duplicating the repository.

Ah - my mistake. It's been too long since I used CVS. CVS keeps the
metadata describing what you have, but not the 'pristine copy' that SVN
keeps.

>> I just don't understand why you care. If the CVS directories didn't bug
>> you before, why does the single .git directory bug you now?
>
> (1) size (ok, not a showstopper)
> (2) potential for error
>
> Blowing away your working directory shouldn't result in loss of your
> entire project history

Perhaps you could describe the 'blowing away your working directory
shouldn't result in loss of your entire project history'?

Yes, if that's the only copy you have - this is true. But, you would
normally have at least one copy, and everybody else will also have a
copy. Linus has joked about not needing backups, since he can recover
his entire project history from places all over the Internet.

As a "for example", you could have a local repo that you publish from.
Your work spaces could be from that local repo. Others pull from your
local repo.

For a small project I have, I keep the SVN / centralized model. People
upload their changes with 'git push', and pick up updates with 'git
pull' ('cvs update'). Whatever works best for you - but it's all
available. Just because your workspace happens to have a copy of your
entire project history doesn't necessarily mean that blowing away your
working directory results in loss of your entire project history. Think
multiple redundant copies. It's a feature - not a problem. :-)

Cheers,
mark

-- 
Mark Mielke <mark@mielke.cc>

Re: Managing multiple branches in git

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> I wonder whether it would help with this problem if we had a way to
> locate the build products outside the tree, and maybe fix things up so
> that you can make the build products go to a different location
> depending on which branch you're on.

I'm beginning to seriously consider the idea that the git repository
should think each branch is a separate directory subtree --- ie,
completely abandon the notion that git is worth anything at all for
managing multi-branch patches.  If we have HEAD, REL8_3, etc as
separate subtrees then we can easily have a single commit touching
multiple branches in whatever way we want.

The arguments that were put forward for switching to git all had to do
with managing patches against HEAD.  AFAIK hardly anyone but the core
committers deals with back-patching at all, and so a structure like this
isn't going to affect anyone else --- you'd just ignore the back-branch
directory subtrees in your checkout.
        regards, tom lane


Re: Managing multiple branches in git

From
Mark Mielke
Date:
Robert Haas wrote:
> On Tue, Jun 2, 2009 at 3:58 PM, Andres Freund <andres@anarazel.de> wrote:
>> #Method1
>> cd /../child1
>> git clone --reference /../master/ git://git.postgresql.org/whatever .
>> cd /../child2
>> git clone --reference /../master/ git://git.postgresql.org/whatever .
>>
>> This way you can fetch from the git url without problem, but when a object
>> is available locally it is not downloaded again.
>
> Yeah but now you have to push and pull commits between your numerous
> local working copies.  Boo, hiss.

Why? They are only references. They are effectively local caches. Why
push to them at all?

Push to the central repo. The local copy ("caches") will pick up the
changes eventually. If you really find .git getting larger and this is a
problem (never been a problem for me), "git gc" can keep it to a minimum.

>> #Method2
>> cd /../child3
>> git clone --shared /../postgresql/ child3
>> ...
>> This way you only fetch from your "pulled" tree and never possibly from the
>> upstream one.
>
> This is so unsafe it's not even worth talking about.  See git-clone(1)

It's not actually unsafe. There are just things to consider.
Particularly, if history is ever removed from /../postgresql/ then the
child3 can become corrupt. There is an easy solution here - don't remove
history from /../postgresql/.

I use the above to save space in a binary-heavy (each workspace is 150
Mbytes+ without --shared) git repo among three designers. It works fine.
We've never had a problem.

That said, I wouldn't recommend it be used unless you do in fact
understand the problem well.

Cheers,
mark

-- 
Mark Mielke <mark@mielke.cc>
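
To make the --shared caveat concrete, a small sketch using throwaway directories (the names parent and child3, the file and the identity are hypothetical). It only shows the mechanism: the child's object store points at the parent's, which is why removing objects from the parent can break the child.

```shell
set -eu
base=$(mktemp -d)

# Parent repository with one commit (hypothetical content).
git init -q "$base/parent"
cd "$base/parent"
git config user.name "Demo Committer"
git config user.email demo@example.invalid
echo demo > file
git add file
git commit -q -m "initial"

# A --shared clone copies no objects; it records where to find them.
git clone -q --shared "$base/parent" "$base/child3"
cat "$base/child3/.git/objects/info/alternates"

# The child still works because it reads objects through that pointer.
git -C "$base/child3" log --oneline
```

Anything that deletes objects in the parent (pruning after a reset or rebase, for example) can therefore leave the child referring to objects that no longer exist.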

Re: Managing multiple branches in git

From
Alvaro Herrera
Date:
Andres Freund wrote:
> On 06/02/2009 09:38 PM, Alvaro Herrera wrote:

>> What *really* bugs me is that it's so difficult to have one "pulled"
>> tree and create a bunch of checked out copies from that.
> I dont see were the difficulty resides?
>
> #Setup a base repository
> cd /../master
> git [--bare] clone git://git.postgresql.org/whatever .

This is all quite ugly in fact.  What I want is something like this:

# the * below means "pull all branches"
mtn -d /home/repos/postgresql.mtn pull *
cd /home/trees
mkdir REL8_3_STABLE
cd REL8_3_STABLE
mtn checkout -d /home/repos/postgresql.mtn -b REL8_3_STABLE
cd ..
mkdir REL8_2_STABLE
cd REL8_2_STABLE
mtn checkout -d /home/repos/postgresql.mtn -b REL8_2_STABLE

and so on.  The "database" I pull into is common to all the branches,
/home/repos/postgresql.mtn; into that database I commit; and from there
I can push to the project's main database.  Whenever I do "mtn update",
it brings changes from the database (previously pulled into it) into the
working copy.

But this is all wishful thinking ('cause worse is better), so never mind
me.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Managing multiple branches in git

From
Tom Lane
Date:
Mark Mielke <mark@mark.mielke.cc> writes:
> As a "for example", you could have a local repo that you publish from. 
> Your work spaces could be from that local repo.

Yes, exactly.  How do I do that?  My complaint is that git fails to
provide a distinction between a repo and a workspace --- they seem
to be totally tied together.
        regards, tom lane


Re: Managing multiple branches in git

From
Robert Haas
Date:
On Jun 2, 2009, at 4:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Robert Haas <robertmhaas@gmail.com> writes:
>> I wonder whether it would help with this problem if we had a way to
>> locate the build products outside the tree, and maybe fix things up  
>> so
>> that you can make the build products go to a different location
>> depending on which branch you're on.
>
> I'm beginning to seriously consider the idea that the git repository
> should think each branch is a separate directory subtree --- ie,
> completely abandon the notion that git is worth anything at all for
> managing multi-branch patches.  If we have HEAD, REL8_3, etc as
> separate subtrees then we can easily have a single commit touching
> multiple branches in whatever way we want.
>
> The arguments that were put forward for switching to git all had to do
> with managing patches against HEAD.  AFAIK hardly anyone but the core
> committers deals with back-patching at all, and so a structure like  
> this
> isn't going to affect anyone else --- you'd just ignore the back- 
> branch
> directory subtrees in your checkout.

If we're going to do that let's just keep using CVS.  I would consider  
a repository organized that way to be completely unusable; without  
doing anything the system we have now is better than that.

...Robert


Re: Managing multiple branches in git

From
Andres Freund
Date:
On 06/02/2009 10:13 PM, Robert Haas wrote:
> On Tue, Jun 2, 2009 at 3:58 PM, Andres Freund<andres@anarazel.de>  wrote:
>> On 06/02/2009 09:38 PM, Alvaro Herrera wrote:
>>> Mark Mielke wrote:
>>>> I just don't understand why you care. If the CVS directories didn't bug
>>>> you before, why does the single .git directory bug you now? I'm
>>>> genuinely interested as I don't get it. :-)
>>>
>>> It doesn't.  What bugs me is that the database (the "pulled" tree if you
>>> will) is stored in it.  It has already been pointed out how to put it
>>> elsewhere, so no need to explain that.
>>>
>>> What *really* bugs me is that it's so difficult to have one "pulled"
>>> tree and create a bunch of checked out copies from that.
>>
>> I dont see were the difficulty resides?
>>
>> #Setup a base repository
>> cd /../master
>> git [--bare] clone git://git.postgresql.org/whatever .
>>
>>
>> #Method 1
>> cd /../child1
>> git clone --reference /../master/ git://git.postgresql.org/whatever .
>> cd /../child2
>> git clone --reference /../master/ git://git.postgresql.org/whatever .
>>
>> This way you can fetch from the git url without problem, but when a object
>> is available locally it is not downloaded again.
>
> Yeah but now you have to push and pull commits between your numerous
> local working copies.  Boo, hiss.
In the end that's the same with cvs and multiple checkouts?

>> #Method2
>> cd /../child3
>> git clone --shared /../postgresql/ child3
>> ...
>> This way you only fetch from your "pulled" tree and never possibly from the
>> upstream one.
>
> This is so unsafe it's not even worth talking about.  See git-clone(1).
No. It is unsafe if you play around in the master repository. If you're
not doing that, it is safe.

Andres


Re: Managing multiple branches in git

From
Alvaro Herrera
Date:
Andres Freund escribió:
> On 06/02/2009 10:13 PM, Robert Haas wrote:

>> Yeah but now you have to push and pull commits between your numerous
>> local working copies.  Boo, hiss.
> In the end thats the same with cvs and multiple checkouts?

You don't pull and push in CVS, you just commit and update.  Different
thing.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Managing multiple branches in git

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Jun 2, 2009 at 4:09 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Blowing away your working directory shouldn't result in loss of your
>> entire project history.

> Such an outcome could not possibly be less likely with any other
> system than it is with git.  Every single developer has a copy of your
> entire history, as does the origin server and the public mirror of the
> origin server.

If it's a public project, and discounting any private branches you may
have had.  I don't see what's so unfathomable about "I'd like a clear
separation between workspace and repository".
        regards, tom lane


Re: Managing multiple branches in git

From
Andrew Dunstan
Date:

Robert Haas wrote:
>>
>> The arguments that were put forward for switching to git all had to do
>> with managing patches against HEAD.  AFAIK hardly anyone but the core
>> committers deals with back-patching at all, and so a structure like this
>> isn't going to affect anyone else --- you'd just ignore the back-branch
>> directory subtrees in your checkout.
>
> If we're going to do that let's just keep using CVS.  I would consider 
> a repository organized that way to be completely unusable; without 
> doing anything the system we have now is better than that.
>

The only reason Tom sees a single line history is because he uses an 
addon tool for CVS called cvs2cl: see <http://www.red-bean.com/cvs2cl/>. 
It's not part of CVS, and I'm not sure how many others use it. I sure 
don't. It's written in Perl, and we have one or two tolerably competent 
Perl programmers around, so maybe we could produce a git equivalent?

cheers

andrew


Re: Managing multiple branches in git

From
Ron Mayer
Date:
Robert Haas wrote:
> And, unfortunately, I'm not sure there's a good solution.  Tom could
> create 1 local repository cloned from the origin and then N-1 copies
> cloned with --local from that one, but this sort of defeats the
> purpose of using git, because now if he commits a change to one of
> them and then wants to apply that change to each back branch, he's got
> to fetch that change on each one, cherry-pick it, make his changes,
> commit, and then push it back to his main repository.  Some of this

Why has he got to do this pushing back to his main?   How about

 creating 1 local repository from Origin,
 create N-1 cloned with --local from that one
 for each of those "--local" ones, "git-remote add" the main origin

From then ISTM his workflow is very similar to the way he does with CVS,
pulling and pushing from those multiple repositories to the central
origin.  He can create the patches/diffs to apply to each the same
way he does today.

ISTM he'd mostly be unaware that these repositories were ever connected
in some way unless he inspected that some of the files in .git had the
same inodes because they came from hard links.
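
A sketch of those three steps, using a local directory as a stand-in for the central origin. All paths, the file name and the identity are hypothetical, and the remote is named "upstream" here only because a --local clone already has an "origin" pointing at the first local clone.

```shell
set -eu
base=$(mktemp -d)

# Stand-in for the central origin, with one back branch.
git init -q "$base/origin"
cd "$base/origin"
git config user.name "Demo Committer"
git config user.email demo@example.invalid
echo demo > file
git add file
git commit -q -m "initial"
git branch REL8_3_STABLE

# 1 local repository cloned from the origin...
git clone -q "file://$base/origin" "$base/head"

# ...and a further clone made with --local from that one.
git clone -q --local "$base/head" "$base/rel8_3"

# Point the --local clone at the central origin directly.
cd "$base/rel8_3"
git remote add upstream "file://$base/origin"
git fetch -q upstream
git checkout -q -b REL8_3_STABLE upstream/REL8_3_STABLE
```

From here each directory can pull from and push to the central repository on its own, without routing commits through the first local clone.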



Re: Managing multiple branches in git

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> The only reason Tom sees a single line history is because he uses an 
> addon tool for CVS called cvs2cl: see <http://www.red-bean.com/cvs2cl/>. 
> It's not part of CVS, and I'm not sure how many others use it. I sure 
> don't.

FWIW, I believe Bruce uses some version of it as well.  It's our main
tool for dredging up the raw data for release notes.

> It's written in Perl, and we have one or two tolerably competent 
> Perl programmers around, so maybe we could produce a git equivalent?

It's a bit premature to speculate about alternate history tools when
we haven't figured out what the repository is going to look like.  Right
at the moment I'm much more concerned about the question of supporting
a checkout-per-branch workflow.  That's something I use pretty nearly
every day, whereas looking at the history is maybe a once-a-week kind
of need at most.
        regards, tom lane


Re: Managing multiple branches in git

From
Robert Haas
Date:
On Jun 2, 2009, at 5:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Robert Haas <robertmhaas@gmail.com> writes:
>> On Tue, Jun 2, 2009 at 4:09 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Blowing away your working directory shouldn't result in loss of your
>>> entire project history.
>
>> Such an outcome could not possibly be less likely with any other
>> system than it is with git.  Every single developer has a copy of  
>> your
>> entire history, as does the origin server and the public mirror of  
>> the
>> origin server.
>
> If it's a public project, and discounting any private branches you may
> have had.  I don't see what's so unfathomable about "I'd like a clear
> separation between workspace and repository".

Well, nothing.  But, logically, the risk of data loss can't be higher  
just because you have more data cached locally.  The problem isn't  
that caching is bad; it's keeping multiple local caches coherent.

...Robert


Re: Managing multiple branches in git

From
"David E. Wheeler"
Date:
On Jun 2, 2009, at 2:31 PM, Tom Lane wrote:

> It's a bit premature to speculate about alternate history tools when
> we haven't figured out what the repository is going to look like.
> Right
> at the moment I'm much more concerned about the question of supporting
> a checkout-per-branch workflow.  That's something I use pretty nearly
> every day, whereas looking at the history is maybe a once-a-week kind
> of need at most.

Perhaps there's a master repository that corresponds to CVS HEAD, and
then release branches are actually separate git repositories. This
way, those who need to maintain back branches can do so with each
individual repository (and maybe cherry-pick commits to get them to be
the same in each repo; I'm not sure), and everyone else can just use
the master repository (and do their normal local branching and merging
routine as they use it).

Not ideal, but might provide some relief…

Best,

David

Re: Managing multiple branches in git

From
Tom Lane
Date:
"David E. Wheeler" <david@kineticode.com> writes:
> Perhaps there's a master repository that corresonds to CVS HEAD, and  
> then release branches are actually separate git repositories.

Yeah, I was speculating about that one too.  It might be workable.
Just "cp -r" the master whenever we fork a new branch.  However, there'd
be no very easy way to get a change history that includes patches that
applied only to some back branches (something that does happen, a few
times a year perhaps).  Maybe that special log tool Andrew was
speculating about would take the form of a program to aggregate the
change histories of several repositories.
        regards, tom lane


Re: Managing multiple branches in git

From
"David E. Wheeler"
Date:
On Jun 2, 2009, at 3:11 PM, Tom Lane wrote:

> "David E. Wheeler" <david@kineticode.com> writes:
>> Perhaps there's a master repository that corresponds to CVS HEAD, and
>> then release branches are actually separate git repositories.
>
> Yeah, I was speculating about that one too.  It might be workable.
> Just "cp -r" the master whenever we fork a new branch.

Well, you'd clone it, but yes, that's what I meant.

> However, there'd
> be no very easy way to get a change history that includes patches that
> applied only to some back branches (something that does happen, a few
> times a year perhaps).  Maybe that special log tool Andrew was
> speculating about would take the form of a program to aggregate the
> change histories of several repositories.

You mean so that such patches in back branches show up in the
history of master?

Best,

David



Re: Managing multiple branches in git

From
Robert Haas
Date:
On Tue, Jun 2, 2009 at 5:28 PM, Ron Mayer <rm_pg@cheapcomplexdevices.com> wrote:
> Robert Haas wrote:
>> And, unfortunately, I'm not sure there's a good solution.  Tom could
>> create 1 local repository cloned from the origin and then N-1 copies
>> cloned with --local from that one, but this sort of defeats the
>> purpose of using git, because now if he commits a change to one of
>> them and then wants to apply that change to each back branch, he's got
>> to fetch that change on each one, cherry-pick it, make his changes,
>> commit, and then push it back to his main repository.  Some of this
>
> Why has he got to do this pushing back to his main?   How about
>
>  creating 1 local repository from Origin,
>  create N-1 cloned with --local from that one
>  for each of those "--local" ones, "git-remote add" the main origin
>
> From then ISTM his workflow is very similar to the way he does with CVS,
> pulling and pushing from those multiple repositories to the central
> origin.  He can create the patches/diffs to apply to each the same
> way he does today.
>
> ISTM he'd mostly be unaware that these repositories were ever connected
> in some way unless he inspected that some of the files in .git had the
> same inodes because they came from hard links.

Well, that might work, depending on his workflow.  Maybe I'm making
some assumptions here that aren't justified.  Let's assume contrary to
fact that I am a committer and the master VCS for this project is git.
I need to fix something in the master branch and backport it to
REL8_3_STABLE and REL8_2_STABLE.  I would probably do it like this:

git pull
git checkout master
<do my thing>
git commit -a
git checkout REL8_3_STABLE
git cherry-pick -nx master
<adjust patch>
git commit -a
git checkout REL8_2_STABLE
git cherry-pick -nx REL8_3_STABLE
<further adjust patch>
git commit -a
git push

Since I push all of my commits together, it's almost as if it's a
single commit - it is at any rate no worse than CVS, which is
non-atomic by nature.  If I have multiple local git trees, I start
like this:

cd $HOME/pgsql/master
git pull
<do my thing>
git commit -a

...and now what?  If there's any point to git, it's surely that it's
easy to move commits around, so I'd like to type a command here to
attempt to apply that commit to $HOME/pgsql/REL8_3_STABLE.  Assuming
that tree is cloned from the other local tree, I think I need to do
this:

cd $HOME/pgsql/REL8_3_STABLE
git pull
git cherry-pick -nx master
<adjust patch>
git commit -a
git push

cd $HOME/pgsql/REL8_2_STABLE
git pull
git cherry-pick -nx REL8_3_STABLE
<adjust patch further>
git commit -a
git push

cd $HOME/pgsql/master
git push   # all branches upstream for real

Now, maybe if Tom is happy to move around his patches the way he does
now, it doesn't matter: he can just have three clones from upstream
and move the patch around using diff and patch or whatever; then have
a shell script called push-em-all to do just that.  I would not want to
do that, because the ability to move patches around easily between
branches is for me one of the biggest advantages of using git in the
first place, and having multiple local repositories dilutes that.  But
what I want doesn't matter unless it happens to be the same thing Tom
wants.

...Robert
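
For anyone who wants to try the single-repository flow above, here is a
minimal sketch. It is not anyone's actual setup: the throwaway repository,
file name and commit messages are invented for illustration, and it drops
-n so that cherry-pick commits directly instead of pausing for manual
adjustment.

```shell
#!/bin/sh
# Sketch: back-patch one fix from master to a release branch inside a
# single throwaway repository. All names here are made up for the demo.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # fix the branch name regardless of local defaults
git config user.name "Demo Committer"
git config user.email "demo@example.invalid"

# Common history, then fork the back branch.
printf 'line one\n' > analyze.c
git add analyze.c
git commit -q -m "Initial import"
git branch REL8_3_STABLE

# Make the fix on master first.
printf 'line one\nfixed\n' > analyze.c
git commit -q -a -m "Update relpages estimates in stand-alone ANALYZE"

# Carry the same change to the back branch; -x records the source commit id
# in the new commit message.
git checkout -q REL8_3_STABLE
git cherry-pick -x master >/dev/null

backpatch_msg=$(git log -1 --format=%B REL8_3_STABLE)
printf '%s\n' "$backpatch_msg"
```

The "(cherry picked from commit ...)" line that -x appends is the only link
git keeps between the two commits; they remain separate commits with
different ids.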


Re: Managing multiple branches in git

From
Greg Stark
Date:
I think it all makes a lot more sense if you think of your local git  
clone as just a cache. The real repo is still separate in a real repo  
on a server.

In that mental model the equivalent of CVS "commit" is actually git  
push not git commit. And the equivalent of CVS update is actually git  
pull.

git commit is actually just adding another commit to your local cache  
that you can push to the real repo at your leisure.

This is just like the rest of the world has had to do using rsync cvs  
repos except we can actually git commit into our local cache instead  
of having to be careful not to ever commit anything.

-- 
Greg


On 2 Jun 2009, at 22:20, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Robert Haas <robertmhaas@gmail.com> writes:
>> On Tue, Jun 2, 2009 at 4:09 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Blowing away your working directory shouldn't result in loss of your
>>> entire project history.
>
>> Such an outcome could not possibly be less likely with any other
>> system than it is with git.  Every single developer has a copy of  
>> your
>> entire history, as does the origin server and the public mirror of  
>> the
>> origin server.
>
> If it's a public project, and discounting any private branches you may
> have had.  I don't see what's so unfathomable about "I'd like a clear
> separation between workspace and repository".
>
>            regards, tom lane
>
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


Re: Managing multiple branches in git

From
Tom Lane
Date:
"David E. Wheeler" <david@kineticode.com> writes:
> On Jun 2, 2009, at 3:11 PM, Tom Lane wrote:
>> Maybe that special log tool Andrew was
>> speculating about would take the form of a program to aggregate the
>> change histories of several repositories.

> You mean so that such patches in back branches show up in the
> history of master?

No, just so they're available in the actual text we consult when we
are preparing release notes or wondering when some bug was fixed.

I was not aware that so few people are familiar with cvs2cl.  Perhaps
it would help to show some examples of its output.

HEAD-only patch:

2009-05-27 16:42  tgl
* src/: backend/parser/gram.y, bin/pg_dump/pg_dump.c: Ignore RECHECK
in CREATE OPERATOR CLASS, just throwing a NOTICE, instead of throwing
an error as 8.4 had been doing.  The error interfered with porting old
database definitions (particularly for pg_migrator) without really
buying any safety.  Per bug #4817 and subsequent discussion.
 

Backpatched fix:

2009-05-19 04:30  heikki
* src/backend/commands/: analyze.c (REL8_1_STABLE), analyze.c
(REL8_3_STABLE), analyze.c (REL8_2_STABLE), analyze.c: Update relpages
and reltuples estimates in stand-alone ANALYZE, even if there's no
analyzable attributes or indexes. We also used to report 0 live and
dead tuples for such tables, which messed with autovacuum threshold
calculations. This fixes bug #4812 reported by George Su. Backpatch
back to 8.1.
 

A back-branch-only fix would look the same except for not having any
unannotated filenames.  I'm too lazy to go trolling for one just now.

It's also possible to get it to produce histories that include only
the patches on particular branches.

I'm not by any means wedded to the details of this printout format; it's
kinda ugly in fact.  The point that I want to make is that I can look at
the commit history in a summary form that just shows me the commit message,
date/time/committer, affected file(s) and branch(es), and is not picky
about whether the changes were byte-for-byte the same in each branch
(because they hardly ever are).  The project's entire commit history
for, hm, probably the last ten years is specifically designed to be
able to get this type of report out of the repository, and we're going
to be pretty seriously unhappy if git is not able to replicate this
functionality.
        regards, tom lane


Re: Managing multiple branches in git

From
Andres Freund
Date:
On 06/02/2009 10:43 PM, Alvaro Herrera wrote:
> Andres Freund wrote:
>> On 06/02/2009 09:38 PM, Alvaro Herrera wrote:
>
>>> What *really* bugs me is that it's so difficult to have one "pulled"
>>> tree and create a bunch of checked out copies from that.
>> I don't see where the difficulty resides?
>>
>> #Setup a base repository
>> cd /../master
>> git [--bare] clone git://git.postgresql.org/whatever .
>
> This is all quite ugly in fact.  What I want is something like this:
>
> # the * below means "pull all branches"
> mtn -d /home/repos/postgresql.mtn pull *
> cd /home/trees
> mkdir REL8_3_STABLE
> cd REL8_3_STABLE
> mtn checkout -d /home/repos/postgresql.mtn -b REL8_3_STABLE
> cd ..
> mkdir REL8_2_STABLE
> cd REL8_2_STABLE
> mtn checkout -d /home/repos/postgresql.mtn -b REL8_2_STABLE
>
> and so on.  The "database" I pull into is common to all the branches,
> /home/repos/postgresql.mtn; into that database I commit; and from there
> I can push to the project's main database.  Whenever I do "mtn update",
> it brings changes from the database (previously pulled into it) into the
> working copy.
>
> But this is all wishful thinking ('cause worse is better), so never mind
> me.
The contrib command git-new-workdir seems to do exactly that.


Andres


Re: Managing multiple branches in git

From
"David E. Wheeler"
Date:
On Jun 2, 2009, at 3:33 PM, Tom Lane wrote:

> A back-branch-only fix would look the same except for not having any
> unannotated filenames.  I'm too lazy to go trolling for one just now.

God Tom, you're such a bloody slacker. Sheesh!

> It's also possible to get it to produce histories that include only
> the patches on particular branches.
>
> I'm not by any means wedded to the details of this printout format;  
> it's
> kinda ugly in fact.  The point that I want to make is that I can  
> look at
> the commit history in a summary form that just shows me the commit  
> message,
> date/time/committer, affected file(s) and branch(es), and is not picky
> about whether the changes were byte-for-byte the same in each branch
> (because they hardly ever are).  The project's entire commit history
> for, hm, probably the last ten years is specifically designed to be
> able to get this type of report out of the repository, and we're going
> to be pretty seriously unhappy if git is not able to replicate this
> functionality.

I should think that it'd be pretty damned easy to generate such a  
report from a Git repository's log. `git log` is extremely powerful,  
and provides a lot of interfaces for hooking things in and sorting.  
It's eminently do-able.

Best,

David


Re: Managing multiple branches in git

From
Andrew Dunstan
Date:

Mark Mielke wrote:
> Alvaro Herrera wrote:
>> Mark Mielke wrote:
>>   
>>> I am curious about why an end user would really care? CVS and SVN both  
>>> kept local workspace directories containing metadata. If anything, I  
>>> find GIT the least intrusive of these three, as the .git is only in the  
>>> top-level directory, whereas CVS and SVN like to pollute every directory.
>>>     
>>
>> That's not the problem.  The problem is that it is kept in the same
>> directory as the checked out copy.  It would be a lot more usable if it
>> was possible to store it elsewhere.
>>   
>
> I'm not following. CVS and SVN both kept such directories "in the 
> checked out copy." Recall the CVS/*,v files?

Umm, no. there are *no* ,v files in my working copies (I just checked, 
to make sure I wasn't on crack). The repository has them, but the 
working copy does not. SVN does keep the equivalent - that's how you can 
work offline for doing things like 'svn diff'. But it makes the repo 
quite ugly, in fact. Running recursive grep on a subversion working copy 
is quite nasty.


>
> As for storing it elsewhere - if you absolute must, you can. There is 
> a --git-dir=GIT_DIR and --work-tree=GIT_WORK_TREE option to all git 
> commands, and GIT_DIR / GIT_WORK_TREE environment variables.
>
> I just don't understand why you care. If the CVS directories didn't 
> bug you before, why does the single .git directory bug you now? I'm 
> genuinely interested as I don't get it. :-)
>
>

Well, it looks like the extra storage for my current 6 (soon to be 7) 
working copies of postgres over the CVS equivalents would cost something 
over 100Mb each. I know disk space is cheap but that's kinda sad. The 
volume of info kept in CVS metadata files is insignificant. Saying they 
are the same is just not so.

Is it possible for multiple working sets to share the same GIT_DIR?

cheers

andrew
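
The --git-dir / --work-tree mechanism Mark mentions can be tried with a
sketch like the one below, using the equivalent GIT_DIR and GIT_WORK_TREE
environment variables. The directory layout and file name are assumptions
made up for the demo. It only shows one working copy whose repository
lives elsewhere; it does not by itself answer the sharing question, since
the index and HEAD are stored inside GIT_DIR.

```shell
#!/bin/sh
# Sketch: keep the repository data outside the working copy by pointing
# git at both locations through environment variables.
set -eu

base=$(mktemp -d)
GIT_DIR="$base/repo.git"        # where objects, refs, index and HEAD live
GIT_WORK_TREE="$base/work"      # where the checked-out files live
export GIT_DIR GIT_WORK_TREE

mkdir "$GIT_WORK_TREE"
cd "$GIT_WORK_TREE"

git init -q
git config user.name "Demo Committer"
git config user.email "demo@example.invalid"

printf 'hello\n' > README
git add README
git commit -q -m "Initial import"

# The working copy holds only the checked-out file, no .git entry.
ls -A "$GIT_WORK_TREE"
```

A recursive grep over the working copy then never sees repository
metadata, at the cost of having to set both variables (or pass both
options) for every git command.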


Re: Managing multiple branches in git

From
Tom Lane
Date:
"David E. Wheeler" <david@kineticode.com> writes:
> I should think that it'd be pretty damned easy to generate such a  
> report from a Git repository's log. `git log` is extremely powerful,  
> and provides a lot of interfaces for hooking things in and sorting.  
> It's eminently do-able.

Well, it's not like CVS makes it easy ... cvs2cl is about 50K of perl,
and is not very speedy or without bugs :-(.  So maybe we are setting
the goalposts in the wrong place by supposing that the lowest-level git
history needs to be exactly what's wanted for human consumption.
As long as it can be postprocessed into the form I do want to look at,
and someone will volunteer to write that postprocessor, the question
doesn't seem like a showstopper.

Meanwhile, there seem to have been ten different solutions proposed to
the problem of working with multiple branches/checkouts, and I plead
confusion.  Anyone want to try to sort out the pluses and minuses?
        regards, tom lane


Re: Managing multiple branches in git

From
"David E. Wheeler"
Date:
On Jun 2, 2009, at 3:55 PM, Andrew Dunstan wrote:

> Umm, no. there are *no* ,v files in my working copies (I just  
> checked, to make sure I wasn't on crack). The repository has them,  
> but the working copy does not. SVN does keep the equivalent - that's  
> how you can work offline for doing things like 'svn diff'. But it  
> makes the repo quite ugly, in fact. Running recursive grep on a  
> subversion working copy is quite nasty.

`git grep` to avoid this issue with Git.

> Well, it looks like the extra storage for my current 6 (soon to be  
> 7) working copies of postgres over the CVS equivalents would cost  
> something over 100Mb each. I know disk space is cheap but that's  
> kinda sad. The volume of info kept in CVS metadata files is  
> insignificant. Saying they are the same is just not so.
>
> Is it possible for multiple working sets to share the same GIT_DIR?

FWIW, I've found that my Bricolage repository in Git was far smaller  
than it was in Subversion. You can also `git gc` to get the size down.  
I would be surprised if all of the checkouts together were over 100MB,  
especially if you're sharing files between them.

Best,

David



Re: Managing multiple branches in git

From
"David E. Wheeler"
Date:
On Jun 2, 2009, at 3:56 PM, Tom Lane wrote:

> Well, it's not like CVS makes it easy ... cvs2cl is about 50K of perl,
> and is not very speedy or without bugs :-(.  So maybe we are setting
> the goalposts in the wrong place by supposing that the lowest-level  
> git
> history needs to be exactly what's wanted for human consumption.
> As long as it can be postprocessed into the form I do want to look at,
> and someone will volunteer to write that postprocessor, the question
> doesn't seem like a showstopper.

Yes, I think that's the case.

> Meanwhile, there seem to have been ten different solutions proposed to
> the problem of working with multiple branches/checkouts, and I plead
> confusion.  Anyone want to try to sort out the pluses and minuses?

If the whole purpose of you committing all backpatches to CVS in a  
single commit is to get a simpler cvs2cl history, you can easily do  
that with a single clone of the entire history in Git, commit each  
branch separately but with the same commit message, and then, yeah,  
someone will be able to provide a report that filters out the  
duplicate messages appropriately, I have little doubt.

Best,

David
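
A rough sketch of what such a filter could look like, and certainly not a
cvs2cl replacement: commit each branch with an identical message, then
group `git log` output by subject line. The repository, file name and
messages below are invented for the demo, and grouping on the subject
alone is an assumption that real history would strain, since it ignores
dates, authors and affected files.

```shell
#!/bin/sh
# Sketch: list each distinct commit subject once, followed by the branches
# that carry a commit with that subject.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo Committer"
git config user.email "demo@example.invalid"

printf 'a\n' > f.c
git add f.c
git commit -q -m "Initial import"
git branch REL8_3_STABLE

# Same message on both branches, different content (as back-patches often are).
printf 'a\nb\n' > f.c
git commit -q -a -m "Fix bug in ANALYZE"
git checkout -q REL8_3_STABLE
printf 'a\nb2\n' > f.c
git commit -q -a -m "Fix bug in ANALYZE"

# A change that exists only on master.
git checkout -q master
printf 'a\nb\nc\n' > f.c
git commit -q -a -m "HEAD-only cleanup"

# Emit "subject<TAB>branch" for every commit on every branch, sort so equal
# subjects are adjacent, then fold the branch names together.
report=$(
  for b in $(git for-each-ref --format='%(refname:short)' refs/heads); do
    git log --format="%s%x09$b" "$b"
  done | LC_ALL=C sort | awk -F '\t' '
    $1 != subject {
      if (NR > 1) print subject " [" branches "]"
      subject = $1; branches = $2; next
    }
    { branches = branches ", " $2 }
    END { if (NR > 0) print subject " [" branches "]" }')
printf '%s\n' "$report"
```

The output is ordered by subject text rather than by date; a real tool
would want to carry the commit timestamp and committer through as well.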


Re: Managing multiple branches in git

From
Tom Lane
Date:
"David E. Wheeler" <david@kineticode.com> writes:
> On Jun 2, 2009, at 3:55 PM, Andrew Dunstan wrote:
>> Running recursive grep on a  
>> subversion working copy is quite nasty.

> `git grep` to avoid this issue with Git.

One thing that git does do right is that the .git subdirectory exists
only at the top level of your working directory tree, so it's not too
hard to avoid it in recursive searches.  Still, this is one of the
reasons why a separate repository tree would be preferable.
        regards, tom lane


Re: Managing multiple branches in git

From
Tom Lane
Date:
"David E. Wheeler" <david@kineticode.com> writes:
> On Jun 2, 2009, at 3:56 PM, Tom Lane wrote:
>> Meanwhile, there seem to have been ten different solutions proposed to
>> the problem of working with multiple branches/checkouts, and I plead
>> confusion.  Anyone want to try to sort out the pluses and minuses?

> If the whole purpose of you committing all backpatches to CVS in a  
> single commit is to get a simpler cvs2cl history, you can easily do  
> that with a single clone of the entire history in Git, commit each  
> branch separately but with the same commit message, and then, yeah,  
> someone will be able to provide a report that filters out the  
> duplicate messages appropriately, I have little doubt.

I think you missed the part of the discussion about not wishing to share
a single working directory across all the branches.  The time to rebuild
derived files whenever I switch branches is simply too great with that
approach.  I want a working copy per branch, and some
not-impossibly-complicated scheme for managing the pulls/commits/pushes
given that environment.
        regards, tom lane


Re: Managing multiple branches in git

From
"David E. Wheeler"
Date:
On Jun 2, 2009, at 4:13 PM, Tom Lane wrote:

> I think you missed the part of the discussion about not wishing to  
> share
> a single working directory across all the branches.

No, I was just ignoring it for the moment to focus on the commit and  
history issue.

> The time to rebuild
> derived files whenever I switch branches is simply too great with that
> approach.  I want a working copy per branch, and some
> not-impossibly-complicated scheme for managing the pulls/commits/ 
> pushes
> given that environment.

It seems that there are a few approaches, but simplest is probably to  
create a clone of the upstream repository for each branch and work in  
them as if they were separate repositories:

git clone git@git.postgresql.org:postgresql.git master
git clone git@git.postgresql.org:postgresql.git rel8_3
cd rel8_3
git checkout --track -b rel8_3 origin/rel8_3

Then you can make your changes in master and push them back to origin  
when you're done, then backpatch in the rel8_3 checkout and commit  
with the same commit message. The next time you do a `git pull` in  
master, it will also pull down the changes you committed in rel8_3, so  
you'll have a complete history. From there it's just a matter of  
scripting `git log` in a way to make it easy for you to create changes  
files.

Does that make sense?

Best,

David



Re: Managing multiple branches in git

From
Tom Lane
Date:
"David E. Wheeler" <david@kineticode.com> writes:
> Does that make sense?

Maybe, but it still seems messy, brute force, and error-prone.

I can't escape the feeling that we're missing something basic here.
It's allegedly one of git's great strengths that it allows you to easily
and quickly switch your attention among multiple development branches.
Well, so it does, if you haven't got any derived files to rebuild.
But rebuilding the Linux kernel is hardly a zero-cost operation,
so how have Linus and co failed to notice this problem?  There
must be some trick they're using that I haven't heard about, or
they'd not be nearly so pleased with git.
        regards, tom lane


Re: Managing multiple branches in git

From
Andres Freund
Date:
On 06/03/2009 12:56 AM, Tom Lane wrote:
> "David E. Wheeler"<david@kineticode.com>  writes:
>> I should think that it'd be pretty damned easy to generate such a
>> report from a Git repository's log. `git log` is extremely powerful,
>> and provides a lot of interfaces for hooking things in and sorting.
>> It's eminently do-able.
> Well, it's not like CVS makes it easy ... cvs2cl is about 50K of perl,
> and is not very speedy or without bugs :-(.  So maybe we are setting
> the goalposts in the wrong place by supposing that the lowest-level git
> history needs to be exactly what's wanted for human consumption.
> As long as it can be postprocessed into the form I do want to look at,
> and someone will volunteer to write that postprocessor, the question
> doesn't seem like a showstopper.
If the merging would be done from "latest backbranch -> ... -> HEAD" 
(editing the commits included) that should be relatively easy. (My 
guess: Minor scripting < 100 lines)...

> Meanwhile, there seem to have been ten different solutions proposed to
> the problem of working with multiple branches/checkouts, and I plead
> confusion.  Anyone want to try to sort out the pluses and minuses?
I can try:

----
git-new-workdir
+ easy
+ small
+ safe
-+ repositories not completely independent (common commits, i.e. no 
pushing/pulling/etc)
- not included in default install (/contrib directory in the git install)
- no windows support

----

git clone --local
+ safe
+ at least initially small
+- push/fetch needed
- repositories potentially get bigger with time (each repository gets 
hardlinked object from other repositories. When packing history they get 
duplicated in all repositories)
- no windows support

----

git clone --reference common_repo
+ small
+ staying small
+ fast
+ windows supported
+- push/fetch needed
- possibly unsecure if you delete from the master repository - which one 
can easily prevent

----

git clone --shared
Essentially the same as the last above


In all of those solutions you don't need the .git directory to be in the
local checkout. All are initially 77MB big which is the pure file size 
minus around 400kb of metadata in the .git directory.

Is this at least some guidance?

Andres


Re: Managing multiple branches in git

From
Andres Freund
Date:
On 06/03/2009 01:39 AM, Tom Lane wrote:
> "David E. Wheeler"<david@kineticode.com>  writes:
>> Does that make sense?
> I can't escape the feeling that we're missing something basic here.
> It's allegedly one of git's great strengths that it allows you to easily
> and quickly switch your attention among multiple development branches.
> Well, so it does, if you haven't got any derived files to rebuild.
> But rebuilding the Linux kernel is hardly a zero-cost operation,
> so how have Linus and co failed to notice this problem?  There
> must be some trick they're using that I haven't heard about, or
> they'd not be nearly so pleased with git.
Building out of tree and ccache are frequently mentioned.

Andres


Re: Managing multiple branches in git

From
"Greg Sabino Mullane"
Date:


> Umm, no. there are *no* ,v files in my working copies (I just checked,
> to make sure I wasn't on crack). The repository has them, but the
> working copy does not. SVN does keep the equivalent - that's how you can
> work offline for doing things like 'svn diff'. But it makes the repo
> quite ugly, in fact. Running recursive grep on a subversion working copy
> is quite nasty.

grep -r? No need to use that anymore when a tool like ack is available.
It ignores .svn dirs by default, and basically works the way most
people wish grep worked:

http://betterthangrep.com/

I also agree with Tom Lane elsewhere in this thread that a lot of this is
what may turn out to be pointless wheel spinning: certainly other big
projects (esp. Linux) must have encountered and solved these problems?
Anyone know a kernel hacker / git expert?

--
Greg Sabino Mullane greg@turnstep.com




Re: Managing multiple branches in git

From
"David E. Wheeler"
Date:
On Jun 2, 2009, at 4:39 PM, Tom Lane wrote:

> "David E. Wheeler" <david@kineticode.com> writes:
>> Does that make sense?
>
> Maybe, but it still seems messy, brute force, and error-prone.
>
> I can't escape the feeling that we're missing something basic here.
> It's allegedly one of git's great strengths that it allows you to
> easily
> and quickly switch your attention among multiple development branches.
> Well, so it does, if you haven't got any derived files to rebuild.
> But rebuilding the Linux kernel is hardly a zero-cost operation,
> so how have Linus and co failed to notice this problem?  There
> must be some trick they're using that I haven't heard about, or
> they'd not be nearly so pleased with git.

Yeah, it's a good question. Someone must know…

I tried an experiment with .gitignore and derived files in my pgtap
repository. I ran `make` to generate ignored files, then switched to a
different branch. The derived files from master were still there,
which is no good. Perhaps there's a way to have git ignore derived
files but store them for particular branches?

Best,

David



Re: Managing multiple branches in git

From
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> I can't escape the feeling that we're missing something basic here.
> It's allegedly one of git's great strengths that it allows you to easily
> and quickly switch your attention among multiple development branches.
> Well, so it does, if you haven't got any derived files to rebuild.

I hope this isn't anything particularly special because I feel like I've
been doing it forever, but..

==# cvs -z3 co pgsql
==# mkdir pgsql.build
==# cd pgsql.build
==# ../pgsql/configure --my-args-here
==# make
...

Keeps all the build files and everything in pgsql.build and leaves the
pgsql directory pristine..  I've pretty much always done things this way
so I guess I just assumed it was common.  Maybe it's not what you're
looking for though.

Would that help?
Thanks,
    Stephen

Re: Managing multiple branches in git

From
Alvaro Herrera
Date:
Andres Freund escribió:

> git clone --reference common_repo
> + small
> + staying small
> + fast
> + windows supported
> +- push/fetch needed
> - possibly unsecure if you delete from the master repository - which one  
> can easily prevent
>
> git clone --shared
> Essentially the same as the last above

I think these are the two usable options.  They will probably end up
making sense (to me at least).  We only need to make sure we don't
accidentally corrupt the WCs, but we should be safe because we don't
intend to "delete branches" in the upstream repository.  The note in the
docs:
      --shared, -s
          When the repository to clone is on the local machine, instead of
          using hard links, automatically setup .git/objects/info/alternates
          to share the objects with the source repository. The resulting
          repository starts out without any object of its own.

          NOTE: this is a possibly dangerous operation; do not use it unless
          you understand what it does. If you clone your repository using
          this option and then delete branches (or use any other git command
          that makes any existing commit unreferenced) in the source
          repository, some objects may become unreferenced (or dangling).
          These objects may be removed by normal git operations (such as
          git-commit) which automatically call git gc --auto. (See
          git-gc(1).) If these objects are removed and were referenced by the
          cloned repository, then the cloned repository will become corrupt.

      --reference <repository>
          If the reference repository is on the local machine automatically
          setup .git/objects/info/alternates to obtain objects from the
          reference repository. Using an already existing repository as an
          alternate will require fewer objects to be copied from the
          repository being cloned, reducing network and local storage costs.

          NOTE: see NOTE to --shared option.


-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
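
To see the alternates mechanism from the documentation quoted above in
action, a small sketch follows. The directory names (common, rel8_3), the
file and the branch are placeholders invented for the demo, not a proposed
layout for the project.

```shell
#!/bin/sh
# Sketch: create one "common" repository, then a per-branch clone that
# borrows its objects through .git/objects/info/alternates.
set -eu

base=$(mktemp -d)
git init -q "$base/common"
cd "$base/common"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo Committer"
git config user.email "demo@example.invalid"

printf 'x\n' > f.c
git add f.c
git commit -q -m "Initial import"
git branch REL8_3_STABLE

# --shared writes an alternates file instead of copying or hard-linking
# objects; -b checks out the named branch in the new working copy.
cd "$base"
git clone -q --shared -b REL8_3_STABLE common rel8_3

alternates=$(cat rel8_3/.git/objects/info/alternates)
printf '%s\n' "$alternates"
```

The printed path points at the object store of the common repository,
which is why the NOTE applies: anything that prunes objects there can
break the clone.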


Re: Managing multiple branches in git

From
Alvaro Herrera
Date:
Stephen Frost escribió:

> I hope this isn't anything particularly special because I feel like I've
> been doing it forever, but..
> 
> ==# cvs -z3 co pgsql
> ==# mkdir pgsql.build
> ==# cd pgsql.build
> ==# ../pgsql/configure --my-args-here
> ==# make
> ...
> 
> Keeps all the build files and everything in pgsql.build and leaves the
> pgsql directory pristine..  I've pretty much always done things this way
> so I guess I just assumed it was common.  Maybe it's not what you're
> looking for though.

This doesn't fully work because some files are created in the source
directory even when building outside (e.g. src/backend/parser/scan.c)

(I work like this all the time too, except that I keep my trees in
.../pgsql/source/00head and they build to .../pgsql/build/00head)

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Managing multiple branches in git

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 06/03/2009 01:39 AM, Tom Lane wrote:
>> But rebuilding the Linux kernel is hardly a zero-cost operation,
>> so how have Linus and co failed to notice this problem?  There
>> must be some trick they're using that I haven't heard about, or
>> they'd not be nearly so pleased with git.

> Building out of tree and ccache are frequently mentioned.

Yeah, I thought about building out of tree, with a different build tree
for each branch and VPATH pointing at the common source tree (working
copy).  That would probably work if it weren't that switching to branch
B and then back to branch A has to advance the filesystem timestamps on
every file that's different between the two branches.  So it would
defeat whatever intelligence "make" might have.  Even if ccache is not
fooled, that's only a very partial solution.
        regards, tom lane
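
The timestamp effect described above is easy to reproduce. The following
is a small sketch with made-up file and branch names: it switches to
another branch and back, then asks find which source files are newer than
a marker created beforehand.

```shell
#!/bin/sh
# Sketch: show that switching branches and back rewrites (and so
# re-timestamps) files that differ between the branches, while files that
# are identical on both branches are left alone.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo Committer"
git config user.email "demo@example.invalid"

printf 'same\n' > same.c
printf 'version A\n' > differs.c
git add same.c differs.c
git commit -q -m "Branch A contents"

git checkout -q -b branch_b
printf 'version B\n' > differs.c
git commit -q -a -m "Branch B contents"
git checkout -q master

# Marker file with a timestamp strictly between "before" and "after";
# the sleeps allow for one-second timestamp granularity.
sleep 1
touch marker
sleep 1

git checkout -q branch_b
git checkout -q master

# Files modified more recently than the marker.
rewritten=$(find . -name '*.c' -newer marker | sort)
printf '%s\n' "$rewritten"
```

Only the file that differs between the branches should be listed, even
though its content is back to what it was; a timestamp-driven make would
still rebuild everything that depends on it.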


Re: Managing multiple branches in git

From
Aidan Van Dyk
Date:
* Tom Lane <tgl@sss.pgh.pa.us> [090602 20:18]:
> Yeah, I thought about building out of tree, with a different build tree
> for each branch and VPATH pointing at the common source tree (working
> copy).  That would probably work if it weren't that switching to branch
> B and then back to branch A has to advance the filesystem timestamps on
> every file that's different between the two branches.  So it would
> defeat whatever intelligence "make" might have.  Even if ccache is not
> fooled, that's only a very partial solution.

Yes, the linux kernel relies on the build system (for them it's the
kbuild makefile setup) having complete knowledge of all dependencies.
So if you switch branches, "make" knows exactly what files *need* to be
rebuilt, based on complete dependencies (including config) and which
ones don't.

a.


-- 
Aidan Van Dyk                                             Create like a god,
aidan@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.

Re: Managing multiple branches in git

From
Robert Haas
Date:
On Tue, Jun 2, 2009 at 7:54 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> Andres Freund escribió:
>
>> git clone --reference common_repo
>> + small
>> + staying small
>> + fast
>> + windows supported
>> +- push/fetch needed
>> - possibly unsecure if you delete from the master repository - which one
>> can easily prevent
>>
>> git clone --shared
>> Essentially the same as the last above
>
> I think these are the two usable options.  They will probably end up

...wait a minute.   I just went and Googled this git-new-workdir thing
and it looks like it's almost exactly what we need.  According to the
docs, it lets you share the same local repository between multiple
working copies, so all the commits are shared but the index is
separate for each working directory.  Assuming it works, that sounds
just about perfect for Tom's use case, since it would allow
cherry-picking of commits without an intervening push/pull cycle.  Did
you have some reason for passing over that as one of the usable
options?
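
As a rough illustration of the idea (not the actual git-new-workdir script), the following sketch builds a second working directory by hand: the new `.git` directory gets symlinks to the shared parts of the repository but keeps its own HEAD and index. The directory names `main` and `rel8_3`, the branch name, and the exact list of linked entries are assumptions made for this demo.

```shell
#!/bin/sh
# Sketch: two working directories sharing one repository via symlinks.
set -e
work=$(mktemp -d)
cd "$work"
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

# A tiny stand-in repository with a release branch.
git init -q main
cd main
echo "hello" > README
g add README
g commit -q -m "initial"
git branch REL8_3_STABLE
cd ..

# Second working directory: shared objects and refs, private HEAD and index.
mkdir -p rel8_3/.git
for x in config refs objects info hooks packed-refs; do
    if [ -e "main/.git/$x" ]; then
        ln -s "$work/main/.git/$x" "rel8_3/.git/$x"
    fi
done
cp main/.git/HEAD rel8_3/.git/HEAD
cd rel8_3
git checkout -q -f REL8_3_STABLE

# A commit made here is immediately visible from the other directory,
# with no push or pull in between.
echo "back-patched line" >> README
g commit -q -a -m "Back-patched fix"
cd ../main
git log -1 --format=%s REL8_3_STABLE
```

Because both directories see the same objects and refs, a commit made in one can be cherry-picked in the other directly.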

...Robert


Re: Managing multiple branches in git

From
Stephen Frost
Date:
* Alvaro Herrera (alvherre@commandprompt.com) wrote:
> This doesn't fully work because some files are created in the source
> directory even when building outside (e.g. src/backend/parser/scan.c)

Sure, there's a couple files here and there, but those could probably be
handled through gitignore, similar to our .cvsignore files..?  I dunno,
was just a thought towards this specific item that Tom was talking about
(build-generated files in the source tree).
Thanks,
    Stephen

Re: Managing multiple branches in git

From
Mark Mielke
Date:
Tom Lane wrote:
> I can't escape the feeling that we're missing something basic here.
> It's allegedly one of git's great strengths that it allows you to easily
> and quickly switch your attention among multiple development branches.
> Well, so it does, if you haven't got any derived files to rebuild.
> But rebuilding the Linux kernel is hardly a zero-cost operation,
> so how have Linus and co failed to notice this problem?  There
> must be some trick they're using that I haven't heard about, or
> they'd not be nearly so pleased with git.
>   

If git has a real weakness - it's that it offers too many workflows, and 
this just results in confusion and everybody coming up with their own 
way to build the pyramid. :-)
From reading this thread, there are things that you guys do that I am 
not familiar with. Not to say there aren't good reasons for what you do, 
but it means that I can only guess and throw suggestions at you, where 
you might be looking for an authoritative answer. :-)

"git" has a "git stash" command that I've used to accomplish something 
like what you describe above. That is, I find myself in mid-work, I want 
to save the current working copy away and start "fresh" from a different 
context. Here is the beginning of the description for it:

DESCRIPTION
       Use git stash when you want to record the current state of the working
       directory and the index, but want to go back to a clean working
       directory. The command saves your local modifications away and reverts
       the working directory to match the HEAD commit.
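
A minimal sketch of that behaviour, assuming a throwaway repository with one hypothetical file:

```shell
#!/bin/sh
# Sketch: git stash puts uncommitted work aside and restores it later.
set -e
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

echo "committed" > file.c
g add file.c
g commit -q -m "initial"

echo "half-finished work" >> file.c     # uncommitted change
g stash > /dev/null                     # save it away
clean=$(git status --porcelain)         # empty: tree matches HEAD again

g stash pop > /dev/null                 # bring the change back
tail -n 1 file.c
```

Note that this only saves and restores tracked changes; it does nothing for build products left in the tree.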

I believe using a repository per release is a common workflow. If you 
access the Linux git repos, you'll find that Linus has a Linux 2.6 repo 
available. However, I think you are talking about using branches for far 
more than just the release stream you are working towards. Each of your 
sub-systems is in a different branch? That seems a bit insane, and your 
email suggesting these be different directories in the working copy 
seemed a lot more sane to me, but then somebody else responded that this 
was a bad idea, so I pull out of the "is this a good idea or not?" 
debate. :-)

Cheers,
mark

-- 
Mark Mielke <mark@mielke.cc>



Re: Managing multiple branches in git

From
Mark Mielke
Date:
Tom Lane wrote:
> Mark Mielke <mark@mark.mielke.cc> writes:
>> As a "for example", you could have a local repo that you publish from.
>> Your work spaces could be from that local repo.
>
> Yes, exactly.  How do I do that?  My complaint is that git fails to
> provide a distinction between a repo and a workspace --- they seem
> to be totally tied together.

Hehe... my "for example" is a bit ambiguous. I was talking about one
common model I've seen under git where people have private and public
repos. The private repo is where you do your main work. Commits are
"published" by pushing them to your public repo and making them
generally available for others to pull from. Under this model, your
private repo could clone the public repo using --shared to keep the
working copy at minimal size. You could have multiple private repos if
this is required for your workflow. Still, it becomes a multi-step
process to commit. 1) Commit to your private repo, 2) Push to your
public repo, 3) If you use a centralized repo, you need another process
to push or pull the change from your public repo to the centralized
repo.

Another poster referenced "git-new-workdir". It really does look like
what you are looking for:

    http://blog.nuclearsquid.com/writings/git-new-workdir

If it lives up to its advertisement, it gives you a new working copy
with a new index, but linked directly to the shared repo rather than
having its own repo.

Cheers,
mark

-- 
Mark Mielke <mark@mielke.cc>

Re: Managing multiple branches in git

From
Robert Haas
Date:
On Tue, Jun 2, 2009 at 9:46 PM, Mark Mielke <mark@mark.mielke.cc> wrote:
> Tom Lane wrote:
>>
>> I can't escape the feeling that we're missing something basic here.
>> It's allegedly one of git's great strengths that it allows you to easily
>> and quickly switch your attention among multiple development branches.
>> Well, so it does, if you haven't got any derived files to rebuild.
>> But rebuilding the Linux kernel is hardly a zero-cost operation,
>> so how have Linus and co failed to notice this problem?  There
>> must be some trick they're using that I haven't heard about, or
>> they'd not be nearly so pleased with git.
>>
>
> If git has a real weakness - it's that it offers too many workflows, and this
> just results in confusion and everybody coming up with their own way to
> build the pyramid. :-)

True.

> From reading this thread, there are things that you guys do that I am not
> familiar with. Not to say there aren't good reasons for what you do, but it
> means that I can only guess and throw suggestions at you, where you might be
> looking for an authoritative answer. :-)
>
> "git" has a "git stash" command that I've used to accomplish something like
> what you describe above. That is, I find myself in mid-work, I want to save
> the current working copy away and start "fresh" from a different context.
> Here is the beginning of the description for it:

That doesn't really solve Tom's problem with build intermediates...

> I believe using a repository per release is a common workflow. If you access
> the Linux git repos, you'll find that Linus has a Linux 2.6 repo available.
> However, I think you are talking about using branches for far more than just
> the release stream you are working towards. Each of your sub-systems is in a
> different branch? That seems a bit insane, and your email suggesting these
> be different directories in the working copy seemed a lot more sane to me,
> but then somebody else responded that this was a bad idea, so I pull out of
> the "is this a good idea or not?" debate. :-)

No, the subsystems are not different branches.  But the 7.4.x series
of releases is in a branch called REL7_4_STABLE, 8.0.x is
REL8_0_STABLE, etc.   Tom often commits a fix to CVS HEAD and then
backpatches to 1-4 previous releases, to be distributed as part of a
subsequent minor release for that branch.

The problem with making each release a separate directory is that,
just like using separate repositories, it will defeat one of the main
strengths of git, which is the ability to move around commits easily.
git-new-workdir is the only solution to the problem of having multiple
branches checked out simultaneously that seems like it might not
suffer from that weakness.

...Robert


Re: Managing multiple branches in git

From
"Joshua D. Drake"
Date:
On Tue, 2009-06-02 at 20:01 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 06/03/2009 01:39 AM, Tom Lane wrote:
> >> But rebuilding the Linux kernel is hardly a zero-cost operation,
> >> so how have Linus and co failed to notice this problem?  There
> >> must be some trick they're using that I haven't heard about, or
> >> they'd not be nearly so pleased with git.
> 
> > Building out of tree and ccache are frequently mentioned.
> 
> Yeah, I thought about building out of tree, with a different build tree
> for each branch and VPATH pointing at the common source tree (working
> copy).  That would probably work if it weren't that switching to branch
> B and then back to branch A has to advance the filesystem timestamps on
> every file that's different between the two branches.  So it would
> defeat whatever intelligence "make" might have.  Even if ccache is not
> fooled, that's only a very partial solution.

So I bounced on #git and got this:

(05:22:52 PM) mugwump: linuxpoet: great. So, anyway, for that particular
problem you have two possible solutions: git-new-workdir (crack it open,
it's very simple!), or using multiple clones with hooks that copy
revisions between each other when they are committed

The "particular" problem he is referring to is:

http://archives.postgresql.org/pgsql-hackers/2009-06/msg00221.php
http://archives.postgresql.org/pgsql-hackers/2009-06/msg00202.php

Sincerely,

Joshua D. Drake


> 
>             regards, tom lane
> 
-- 
PostgreSQL - XMPP: jdrake@jabber.postgresql.org
   Consulting, Development, Support, Training
   503-667-4564 - http://www.commandprompt.com/
   The PostgreSQL Company, serving since 1997



Re: Managing multiple branches in git

From
"Markus Wanner"
Date:
Hi,

Quoting "Tom Lane" <tgl@sss.pgh.pa.us>:
> I can't escape the feeling that we're missing something basic here.

Perhaps the power (and importance) of merging is still a bit  
underestimated, but otherwise I don't think there's much to miss.

> But rebuilding the Linux kernel is hardly a zero-cost operation,
> so how have Linus and co failed to notice this problem?  There
> must be some trick they're using that I haven't heard about, or
> they'd not be nearly so pleased with git.

Keep in mind that they don't have half as many back branches to  
maintain (taking only 2.4 and 2.6 into account). The minor version  
stable branches are not maintained for such a long time (for example,  
the last fix for 2.6.19 happened 2 years ago, from what I can tell).  
Overall, I think the differences are smaller than between the stable  
branches of Postgres' repository.

Regards

Markus Wanner



Re: Managing multiple branches in git

From
Greg Stark
Date:
On Wed, Jun 3, 2009 at 12:39 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "David E. Wheeler" <david@kineticode.com> writes:
>> Does that make sense?
>
> Maybe, but it still seems messy, brute force, and error-prone.
>
> I can't escape the feeling that we're missing something basic here.
> It's allegedly one of git's great strengths that it allows you to easily
> and quickly switch your attention among multiple development branches.

Well as long as the development branches are all "compatible" in the
sense that you don't need to reconfigure and rebuild everything when
you switch then yes. Doing git-checkout to switch branches will work
fine for branches created to work on new code or review patches.

I really don't see what you mean by messy or brute force or
error-prone with regards to keeping a separate clone for each major
release. It's basically equivalent to having a CVS checkout for each
major release which you do now.

The main difference is that commit becomes a two step process --
commit to your repository, then push to the public repository. That's
three steps if you include the add but I suspect you're going to be
doing git-commit -a most of the time.

There's an advantage that you can commit several changes to your local
repo and push them all to the public repo together. That might be good
if you have, for example, a bug fix which requires an api change
elsewhere. You might want two separate commit messages and the ability
to merge one of them forward or back separately but not want to put
them in the public repo until both are available.
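
The two-step flow, including holding two related commits back and publishing them together, might look like the following sketch. The bare `public.git` repository, the `private` clone, the branch name `devel` and the file names are all stand-ins invented for this demo.

```shell
#!/bin/sh
# Sketch: commit locally twice, then publish both commits with one push.
set -e
work=$(mktemp -d)
cd "$work"
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

git init -q --bare public.git               # stands in for the public repo
git clone -q public.git private 2>/dev/null # the empty-clone warning is expected
cd private
git symbolic-ref HEAD refs/heads/devel      # pick a known branch name

# Step 1, twice: two separate local commits with their own messages.
echo "api v2" > api.h
g add api.h
g commit -q -m "Change the API"
echo "uses api v2" > caller.c
g add caller.c
g commit -q -m "Fix bug using the new API"

# Nothing is public yet; step 2 publishes both commits at once.
git push -q origin devel
git --git-dir="$work/public.git" log --format=%s devel
```

Each commit keeps its own message and can still be cherry-picked or merged separately later.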

-- 
greg


Re: Managing multiple branches in git

From
"Markus Wanner"
Date:
Hi,

Quoting "David E. Wheeler" <david@kineticode.com>:
> Monotone?

..one of the sources of inspiration for Linus to write git. He was not  
satisfied with its speed and he didn't like C++ and SQL. Plus the main  
contributors weren't around at the time Linus was on the mailing list.  
So he turned away and did his own thing, in C and filesystem based.  
(Most ranting stripped).

Regards

Markus Wanner




Re: Managing multiple branches in git

From
Florian Weimer
Date:
* Tom Lane:

> I wondered for a second about symlinking .git from several checkout
> directories to a common master, but AFAICT .git stores both the
> "repository" and status information about the current checkout, so
> that's not gonna work.

"git clone --reference" stores just a reference and does not copy the
history.

It's not going to help in the long run because history accumulating on
the HEAD will be duplicated in your release branches.  This is not a
problem if you never merge stuff into them, but I don't know how much
(recent) history browsing you want to do from your release checkouts.
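
A small sketch of what "stores just a reference" means in practice, assuming three throwaway repositories whose names (`upstream`, `common`, `rel8_3`) are made up for this demo: the new clone records the reference repository's object directory in its alternates file instead of copying the objects.

```shell
#!/bin/sh
# Sketch: git clone --reference borrows objects from an existing local repo.
set -e
work=$(cd "$(mktemp -d)" && pwd -P)
cd "$work"
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

git init -q upstream
( cd upstream && echo hello > README && g add README && g commit -q -m "initial" )

git clone -q upstream common                      # local mirror of the project
git clone -q --reference "$work/common" "$work/upstream" rel8_3

# The new clone points at the mirror's object store.
cat rel8_3/.git/objects/info/alternates
```

Objects created in `rel8_3` afterwards are stored locally, which is where the duplication concern above comes from.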

> At the same time, I don't really buy the theory that relating commits on
> different branches via merges will work.  In my experience it is very
> seldom the case that a patch applies to each back branch with no manual
> effort whatever, which is what I gather the merge functionality could
> help with.  So maybe there's not much help to be had on this ...

Correct.  Merging doesn't work if you pick individual patches.  This
is a difficult problem, and few VCS seem to have tackled it.

Working with a single tree and ccache would be another alternative
(ccache still runs the preprocessor and hashes its output, so it
doesn't care about file modification times).
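
The difference between a timestamp check and a content check can be shown without ccache itself. In this sketch `cksum` over the source file is only a stand-in for the hash ccache computes, and `marker` is a hypothetical "last build" timestamp.

```shell
#!/bin/sh
# Sketch: a newer mtime fools a timestamp comparison, not a content hash.
set -e
work=$(mktemp -d)
cd "$work"

echo 'int answer(void) { return 42; }' > answer.c
before=$(cksum < answer.c)
touch marker                  # pretend a build finished here
sleep 1
touch answer.c                # same content, newer timestamp
after=$(cksum < answer.c)

if [ answer.c -nt marker ]; then
    echo "timestamp check: answer.c looks changed"
fi
if [ "$before" = "$after" ]; then
    echo "content check: answer.c is unchanged"
fi
```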

--
Florian Weimer                <fweimer@bfk.de>
BFK edv-consulting GmbH       http://www.bfk.de/
Kriegsstraße 100              tel: +49-721-96201-1
D-76133 Karlsruhe             fax: +49-721-96201-99


Re: Managing multiple branches in git

From
Andres Freund
Date:
On 06/03/2009 01:48 PM, Florian Weimer wrote:
>> I wondered for a second about symlinking .git from several checkout
>> directories to a common master, but AFAICT .git stores both the
>> "repository" and status information about the current checkout, so
>> that's not gonna work.
> "git clone --reference" stores just a reference and does not copy the
> history.
> It's not going to help in the long run because history accumulating on
> the HEAD will be duplicated in your release branches.  This is not a
> problem if you never merge stuff into them, but I don't know how much
> (recent) history browsing you want to do from your release checkouts.
As the referenced repository would be a mirror from the "official" 
repository it should contain most of what is contained in the release 
checkouts - so repacking the release checkouts should remove duplicate 
objects, right?
The work on the release branches would hopefully get pushed to the 
official repository so I don't see a long term problem of duplicate objects.

I haven't really looked at the repack code, so I may be completely off...

Andres


Re: Managing multiple branches in git

From
Ron Mayer
Date:
Robert Haas wrote:
> The problem with making each release a separate directory is that,
> just like using separate repositories, it will defeat one of the main
> strengths of git, which is the ability to move around commits easily.
> git-new-workdir is the only solution to the problem of having multiple
> branches checked out simultaneously that seems like it might not
> suffer from that weakness.

I agree "git-new-workdir" is best for typical postgres workflows, so I
won't dwell on separate repositories beyond this post, but I think you
overstate the difficulty a bit.


It seems it's not that hard to cherry-pick from a remote repository by
setting up a temporary tracking branch and (optionally) removing it
when you're done with it if you don't think you'll need it often.

From: http://www.sourcemage.org/Git_Guide
$ git checkout --track -b <tmp local branch> origin/<remote branch>
$ git cherry-pick -x <sha1 refspec of commit from other (local or remote) branch>
$ git push origin <tmp local branch>
$ git branch -D <tmp local branch>

And if you know you'll be moving patches between external repositories
like "origin/<remote branch>" often, ISTM you don't have to do the first and
last steps (which create and remove the tracked branch) each time; but rather
leave the local tracking branch there.
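
A self-contained sketch of the cherry-pick step, assuming for simplicity that both branches live in one throwaway repository (the branch and file names are hypothetical). The -x option records the original commit id in the new commit message, which gives some link between the parallel commits.

```shell
#!/bin/sh
# Sketch: back-patch one commit onto a release branch with cherry-pick -x.
set -e
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

echo "line 1" > file.c
g add file.c
g commit -q -m "initial"
git branch -m devel                 # known name for the development branch
git branch REL8_3_STABLE            # release branch forks here

echo "the fix" > fix.c              # the fix lands on devel first
g add fix.c
g commit -q -m "Fix the bug"
fix=$(git rev-parse HEAD)

git checkout -q REL8_3_STABLE
g cherry-pick -x "$fix" > /dev/null
git log -1 --format=%B              # message ends with the source commit id
```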



IMVHO, moving commits around across *different* remote repositories is
also one of the main strengths of moving to a distributed VCS.


Re: Managing multiple branches in git

From
Alvaro Herrera
Date:
Robert Haas escribió:
> On Tue, Jun 2, 2009 at 7:54 PM, Alvaro Herrera
> <alvherre@commandprompt.com> wrote:

> > I think these are the two usable options.  They will probably end up
> 
> ...wait a minute.   I just went and Googled this git-new-workdir thing
> and it looks like it's almost exactly what we need.  According to the
> docs, it lets you share the same local repository between multiple
> working copies, so all the commits are shared but the index is
> separate for each working directory.  Assuming it works, that sounds
> just about perfect for Tom's use case, since it would allow
> cherry-picking of commits without an intervening push/pull cycle.  Did
> you have some reason for passing over that as one of the usable
> options?

Well, it sounds about perfect for my use case too (which is
approximately the same as Tom's), but the description makes it sound
unsupported.  It doesn't work on Windows which doesn't bother me
personally but may be a showstopper more generally.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Managing multiple branches in git

From
Dave Page
Date:
On Wed, Jun 3, 2009 at 4:01 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:

> Well, it sounds about perfect for my use case too (which is
> approximately the same as Tom's), but the description makes it sound
> unsupported.  It doesn't work on Windows which doesn't bother me
> personally but may be a showstopper more generally.

It's not a showstopper for me. Can't speak for Magnus, Andrew or
anyone else working on Windows though. I imagine those two are the
most likely to have issues if they're back-patching - and that should
just be a matter of disk space.


--
Dave Page
EnterpriseDB UK:   http://www.enterprisedb.com


Re: Managing multiple branches in git

From
Tom Lane
Date:
Dave Page <dpage@pgadmin.org> writes:
> On Wed, Jun 3, 2009 at 4:01 PM, Alvaro Herrera
> <alvherre@commandprompt.com> wrote:
>> Well, it sounds about perfect for my use case too (which is
>> approximately the same as Tom's), but the description makes it sound
>> unsupported.  It doesn't work on Windows which doesn't bother me
>> personally but may be a showstopper more generally.

> It's not a showstopper for me. Can't speak for Magnus, Andrew or
> anyone else working on Windows though.

Seems like we'd want all committers to be using a similar work-flow
for back-patching, else we're going to have random variations in what
patch sets look like in the history.

I think the appropriate question is why doesn't it work on Windows,
and is that fixable?  Without having looked, I'm guessing the issue
is that it depends on hardlinks or symlinks --- and we know those are
available, as long as you're using recent Windows with NTFS.  Which
does not sound like an unreasonable baseline requirement for someone
committing from Windows.
        regards, tom lane


Re: Managing multiple branches in git

From
"Joshua D. Drake"
Date:
On Wed, 2009-06-03 at 12:01 -0400, Tom Lane wrote:

> I think the appropriate question is why doesn't it work on Windows,
> and is that fixable?  Without having looked, I'm guessing the issue
> is that it depends on hardlinks or symlinks --- and we know those are
> available, as long as you're using recent Windows with NTFS.  Which
> does not sound like an unreasonable baseline requirement for someone
> committing from Windows.

That was the mention in the channel, that it had to do with symlinks.

Joshua D. Drake


-- 
PostgreSQL - XMPP: jdrake@jabber.postgresql.org
   Consulting, Development, Support, Training
   503-667-4564 - http://www.commandprompt.com/
   The PostgreSQL Company, serving since 1997



Re: Managing multiple branches in git

From
Andrew Dunstan
Date:

Dave Page wrote:
> On Wed, Jun 3, 2009 at 4:01 PM, Alvaro Herrera
> <alvherre@commandprompt.com> wrote:
>
>   
>> Well, it sounds about perfect for my use case too (which is
>> approximately the same as Tom's), but the description makes it sound
>> unsupported.  It doesn't work on Windows which doesn't bother me
>> personally but may be a showstopper more generally.
>>     
>
> It's not a showstopper for me. Can't speak for Magnus, Andrew or
> anyone else working on Windows though. I imagine those two are the
> most likely to have issues if they're back-patching - and that should
> just be a matter of disk space.
>
>   

Yeah, AFAIK Magnus doesn't commit direct from Windows, and neither do I, 
and this should not be a showstopper for anyone who isn't a committer.

cheers

andrew


Re: Managing multiple branches in git

From
Dave Page
Date:
On Wed, Jun 3, 2009 at 5:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Dave Page <dpage@pgadmin.org> writes:
>> On Wed, Jun 3, 2009 at 4:01 PM, Alvaro Herrera
>> <alvherre@commandprompt.com> wrote:
>>> Well, it sounds about perfect for my use case too (which is
>>> approximately the same as Tom's), but the description makes it sound
>>> unsupported.  It doesn't work on Windows which doesn't bother me
>>> personally but may be a showstopper more generally.
>
>> It's not a showstopper for me. Can't speak for Magnus, Andrew or
>> anyone else working on Windows though.
>
> Seems like we'd want all committers to be using a similar work-flow
> for back-patching, else we're going to have random variations in what
> patch sets look like in the history.
>
> I think the appropriate question is why doesn't it work on Windows,
> and is that fixable?  Without having looked, I'm guessing the issue
> is that it depends on hardlinks or symlinks --- and we know those are
> available, as long as you're using recent Windows with NTFS.  Which
> does not sound like an unreasonable baseline requirement for someone
> committing from Windows.

It's a simple perl script that uses symlinks:
http://git.kernel.org/?p=git/git.git;a=blob;f=contrib/workdir/git-new-workdir

But... it doesn't really break the workflow as far as I can see - it
will just mean Windows users need multiple full copies of the repo for
each branch until the script could be hacked up.


--
Dave Page
EnterpriseDB UK:   http://www.enterprisedb.com


Re: Managing multiple branches in git

From
Dave Page
Date:
On Wed, Jun 3, 2009 at 5:14 PM, Dave Page <dpage@pgadmin.org> wrote:

> It's a simple perl script that uses symlinks:
> http://git.kernel.org/?p=git/git.git;a=blob;f=contrib/workdir/git-new-workdir

Err, shell script even.

-- 
Dave Page
EnterpriseDB UK:   http://www.enterprisedb.com


Re: Managing multiple branches in git

From
Andrew Dunstan
Date:

Tom Lane wrote:
> Dave Page <dpage@pgadmin.org> writes:
>   
>> On Wed, Jun 3, 2009 at 4:01 PM, Alvaro Herrera
>> <alvherre@commandprompt.com> wrote:
>>     
>>> Well, it sounds about perfect for my use case too (which is
>>> approximately the same as Tom's), but the description makes it sound
>>> unsupported.  It doesn't work on Windows which doesn't bother me
>>> personally but may be a showstopper more generally.
>>>       
>
>   
>> It's not a showstopper for me. Can't speak for Magnus, Andrew or
>> anyone else working on Windows though.
>>     
>
> Seems like we'd want all committers to be using a similar work-flow
> for back-patching, else we're going to have random variations in what
> patch sets look like in the history.
>
> I think the appropriate question is why doesn't it work on Windows,
> and is that fixable?  Without having looked, I'm guessing the issue
> is that it depends on hardlinks or symlinks --- and we know those are
> available, as long as you're using recent Windows with NTFS.  Which
> does not sound like an unreasonable baseline requirement for someone
> committing from Windows.
>
>             
>   

It's a shell script, IIRC.

I think it could probably be made to work on Windows if really necessary 
(e.g. by translating into perl).

cheers

andrew


Re: Managing multiple branches in git

From
Andres Freund
Date:
On 06/03/2009 06:17 PM, Andrew Dunstan wrote:
> Tom Lane wrote:
>> I think the appropriate question is why doesn't it work on Windows,
>> and is that fixable? Without having looked, I'm guessing the issue
>> is that it depends on hardlinks or symlinks --- and we know those are
>> available, as long as you're using recent Windows with NTFS. Which
>> does not sound like an unreasonable baseline requirement for someone
>> committing from Windows.
> I think it could probably be made to work on Windows if really necessary
> (e.g. by translating into perl).
Is the fact that it's implemented as a shell script the real problem? 
Isn't it more that "symlinks" aka Junction Points are really dangerous 
<= WinXP? (Deleting a symlink recurses to the target and deletes there).

Andres


Re: Managing multiple branches in git

From
Andrew Dunstan
Date:

Andres Freund wrote:
> On 06/03/2009 06:17 PM, Andrew Dunstan wrote:
>> Tom Lane wrote:
>>> I think the appropriate question is why doesn't it work on Windows,
>>> and is that fixable? Without having looked, I'm guessing the issue
>>> is that it depends on hardlinks or symlinks --- and we know those are
>>> available, as long as you're using recent Windows with NTFS. Which
>>> does not sound like an unreasonable baseline requirement for someone
>>> committing from Windows.
>> I think it could probably be made to work on Windows if really necessary
>> (e.g. by translating into perl).
> Is the fact that it's implemented as a shell script the real problem? 
> Isn't it more that "symlinks" aka Junction Points are really dangerous 
> <= WinXP? (Deleting a symlink recurses to the target and deletes there).
>
>

You have carefully left out the first sentence of my reply. Neither of 
the committers who actually do much work on Windows (namely Magnus and 
me) commit direct from *any* version of Windows. And the whole point of 
this was to overcome an issue relating to commits, so it should not 
affect anyone except a committer.

And yes, we know about junction points. I don't think either of us is 
doing any development work on XP. I do most of my Windows work on my 
laptop, which has Vista (and thus mklink as well as junction points).

And yes, the fact that it's a shell script can be a problem if you're 
not using a Unix-like shell environment.

cheers

andrew



Re: Managing multiple branches in git

From
Andres Freund
Date:
On 06/03/2009 06:38 PM, Andrew Dunstan wrote:
> Andres Freund wrote:
>> On 06/03/2009 06:17 PM, Andrew Dunstan wrote:
>>> Tom Lane wrote:
>>>> I think the appropriate question is why doesn't it work on
>>>> Windows, and is that fixable? Without having looked, I'm
>>>> guessing the issue is that it depends on hardlinks or symlinks
>>>> --- and we know those are available, as long as you're using
>>>> recent Windows with NTFS. Which does not sound like an
>>>> unreasonable baseline requirement for someone committing from
>>>> Windows.
>>> I think it could probably be made to work on Windows if really
>>> necessary (e.g. by translating into perl).
>> Is the fact that it's implemented as a shell script the real
>> problem? Isn't it more that "symlinks" aka Junction Points are
>> really dangerous <= WinXP? (Deleting a symlink recurses to the
>> target and deletes there).
> You have carefully left out the first sentence of my reply.
Sorry, I didn't want to imply anything by that.

> And yes, we know about junction points. I don't think either of us is
> doing any development work on XP. I do most of my Windows work on my
> laptop, which has Vista (and thus mklink as well as junction
> points).
Good then.

> And yes, the fact that it's a shell script can be a problem if
> you're not using a Unix-like shell environment.
The git for windows installation includes a functional unix-alike shell
(mingw, not cygwin or such). Some core parts of git are still written in
shell, so it would not work without that anyway.

Andres


Re: Managing multiple branches in git

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> You have carefully left out the first sentence of my reply. Neither of 
> the committers who actually do much work on Windows (namely Magnus and 
> me) commit direct from *any* version of Windows.

Nonetheless, that might not be true in future.  I'd be a bit worried
about establishing a project standard that excluded people from doing
commit work on Windows.

But it sounds like that problem could be dealt with if anyone cared to
put some work into it, so I'm feeling this is not a showstopper issue.


What it seems we need next is for someone to experiment with
git-new-workdir and committing patches that touch multiple branches,
to confirm whether this actually offers a good solution.
        regards, tom lane


Re: Managing multiple branches in git

From
Alvaro Herrera
Date:
Markus Wanner wrote:
> Hi,
>
> Quoting "David E. Wheeler" <david@kineticode.com>:
>> Monotone?
>
> ..one of the sources of inspiration for Linus to write git. He was not  
> satisfied with its speed and he didn't like C++ and SQL. Plus the main  
> contributors weren't around at the time Linus was on the mailing list.  
> So he turned away and did his own thing, in C and filesystem based.  
> (Most ranting stripped).

The only rant I have about the outcome is that Linus did not copy more
of it.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: Managing multiple branches in git

From
Magnus Hagander
Date:
Andrew Dunstan wrote:
> 
> 
> Dave Page wrote:
>> On Wed, Jun 3, 2009 at 4:01 PM, Alvaro Herrera
>> <alvherre@commandprompt.com> wrote:
>>
>>  
>>> Well, it sounds about perfect for my use case too (which is
>>> approximately the same as Tom's), but the description makes it sound
>>> unsupported.  It doesn't work on Windows which doesn't bother me
>>> personally but may be a showstopper more generally.
>>>     
>>
>> It's not a showstopper for me. Can't speak for Magnus, Andrew or
>> anyone else working on Windows though. I imagine those two are the
>> most likely to have issues if they're back-patching - and that should
>> just be a matter of disk space.
>>
>>   
> 
> Yeah, AFAIK Magnus doesn't commit direct from Windows, and neither do I,
> and this should not be a showstopper for anyone who isn't a committer.

Well, partially correct.

My workflow today is that I do the commit on a git repository in my
Windows VM. Which I then "git push" out to my linux box. Where I do a
make to be sure I didn't break things :-), and then just extract the
patch with "git diff" and apply it manually to the cvs tree, and finally
I commit in cvs...

Even if we move to git, I have no desire to push directly from Windows
into the core repository. I'll still stage it through a local one.

-- 
 Magnus Hagander
 Self: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: Managing multiple branches in git

From
Andrew Dunstan
Date:

Andres Freund wrote:
> The git for windows installation includes a functional unix-alike shell
> (mingw, not cygwin or such). Some core parts of git are still written in
> shell, so it would not work without that anyway.
>
>

Ah. Ok. Good to know. Does it contain a builtin "ln" command? And does 
that use junction points?

cheers

andrew


Re: Managing multiple branches in git

From
Andres Freund
Date:
Hi,

On 06/03/2009 07:26 PM, Andrew Dunstan wrote:
> Andres Freund wrote:
>> The git for windows installation includes a functional unix-alike shell
>> (mingw, not cygwin or such). Some core parts of git are still written in
>> shell, so it would not work without that anyway.
> Ah. Ok. Good to know. Does it contain a builtin "ln" command? And does
> that use junction points?
It contains an ln.exe, but I do not know exactly what it does:
http://repo.or.cz/w/msysgit.git?a=tree;f=bin;h=ab9faa176dbed67a93aa223e0d84bff9f950a26d;hb=HEAD

I don't have Windows access for the next few hours, but if nobody 
has answered by then I will try it.

Andres


Re: Managing multiple branches in git

From
Andrew Dunstan
Date:

Magnus Hagander wrote:
> Andrew Dunstan wrote:
>   
>> Dave Page wrote:
>>     
>>> On Wed, Jun 3, 2009 at 4:01 PM, Alvaro Herrera
>>> <alvherre@commandprompt.com> wrote:
>>>
>>>  
>>>       
>>>> Well, it sounds about perfect for my use case too (which is
>>>> approximately the same as Tom's), but the description makes it sound
>>>> unsupported.  It doesn't work on Windows which doesn't bother me
>>>> personally but may be a showstopper more generally.
>>>>     
>>>>         
>>> It's not a showstopper for me. Can't speak for Magnus, Andrew or
>>> anyone else working on Windows though. I imagine those two are the
>>> most likely to have issues if they're back-patching - and that should
>>> just be a matter of disk space.
>>>
>>>   
>>>       
>> Yeah, AFAIK Magnus doesn't commit direct from Windows, and neither do I,
>> and this should not be a showstopper for anyone who isn't a committer.
>>     
>
> Well, partially correct.
>
> My workflow today is that I do the commit on a git repository in my
> Windows VM. Which I then "git push" out to my linux box. Where I do a
> make to be sure I didn't break things :-), and then just extract the
> patch with "git diff" and apply it manually to the cvs tree, and finally
> I commit in cvs...
>
> Even if we move to git, I have no desire to push directly from Windows
> into the core repository. I'll still stage it through a local one.
>   

I see. In that case, though, you probably do need to be able to do things 
atomically across branches, so that you can push a single changeset, no?

Anyway, it sounds like it's not going to be a showstopper.

cheers

andrew


Re: Managing multiple branches in git

From
Aidan Van Dyk
Date:
My last post on the git issue...  If anyone wants to ask specific
questions, feel free to e-mail me directly...  But this thread has
digressed to way too much hand-waving...

If any of you are not familiar with git and want to get an overview of
it, this might be a good place to start:
http://excess.org/article/2008/07/ogre-git-tutorial/

It was a presentation Bart did here in Ottawa for a group of local
ruby enthusiasts, not necessarily aimed at "kernel" or C hackers.

a.
-- 
Aidan Van Dyk                                             Create like a god,
aidan@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.

Re: Managing multiple branches in git

From
Bruce Momjian
Date:
Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
> > The only reason Tom sees a single line history is because he uses an 
> > addon tool for CVS called cvs2cl: see <http://www.red-bean.com/cvs2cl/>. 
> > It's not part of CVS, and I'm not sure how many others use it. I sure 
> > don't.
> 
> FWIW, I believe Bruce uses some version of it as well.  It's our main
> tool for dredging up the raw data for release notes.

I use pgsql/src/tools/pgcvslog because it gives me exactly the
information I need for the release notes.  It can even suppress items
that appeared in backbranch commits (because those changes already
appeared in a backbranch release).

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +


Re: Managing multiple branches in git

From
"Markus Wanner"
Date:
Hi,

Quoting "Alvaro Herrera" <alvherre@commandprompt.com>:
> The only rant I have about the outcome is that Linus did not copy more
> of it.

He he.. nice way of looking at it ;-)

Regards

Markus Wanner




Re: Managing multiple branches in git

From
Peter Eisentraut
Date:
On Wednesday 03 June 2009 01:55:48 Andrew Dunstan wrote:
> Running recursive grep on a subversion working copy is quite nasty.

I suggest

export GREP_OPTIONS='-d skip -I --exclude=*.svn-base --exclude=tags --exclude=*~ --exclude-dir=CVS --exclude-dir=.git --exclude-dir=.svn --exclude=TAGS'
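
The same exclusions can also be wrapped in a shell function, which may be
handy because newer GNU grep releases, as far as I know, no longer honor
GREP_OPTIONS. This is only a sketch: the function name srcgrep is made up, it
assumes GNU grep, and it passes -r explicitly in place of the "-d skip"
default, since the wrapper is meant for recursive searches.

```shell
#!/bin/sh
# Hypothetical wrapper: recursive grep that skips version-control metadata,
# editor backups, tag files and binary files. Assumes GNU grep.
srcgrep() {
    grep -r -I \
        --exclude='*.svn-base' --exclude='*~' \
        --exclude=tags --exclude=TAGS \
        --exclude-dir=CVS --exclude-dir=.git --exclude-dir=.svn \
        "$@"
}
```

Usage is the same as grep itself, for example "srcgrep -n palloc src/backend".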
 



Re: Managing multiple branches in git

From
Greg Smith
Date:
On Sun, 7 Jun 2009, Peter Eisentraut wrote:

> On Wednesday 03 June 2009 01:55:48 Andrew Dunstan wrote:
>> Running recursive grep on a subversion working copy is quite nasty.
>
> I suggest
> export GREP_OPTIONS='-d skip -I --exclude=*.svn-base --exclude=tags 
> --exclude=*~ --exclude-dir=CVS --exclude-dir=.git --exclude-dir=.svn 
> --exclude=TAGS'

Another alternative is to use ack (http://betterthangrep.com/), which has 
better defaults.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD