Thread: documentation for committing with git

documentation for committing with git

From

Robert Haas

Date:

21 July 2010, 13:22:33

At the developer meeting, I promised to do the work of documenting how
committers should use git.  So here's a first version.

http://wiki.postgresql.org/wiki/Committing_with_Git

Note that while anyone is welcome to comment, I mostly care about
whether the document is adequate for our existing committers, rather
than whether someone who is not a committer thinks we should manage
the project differently... that might be an interesting discussion,
but we're theoretically making this switch in about a month, and
getting agreement on changing our current workflow will take about a
decade, so there is not time now to do the latter before we do the
former.  So I would ask everyone to consider postponing those
discussions until after we've made the switch and ironed out the
kinks.  On the other hand, if you have technical corrections, or if
you have suggestions on how to do the same things better (rather than
suggestions on what to do differently), that would be greatly
appreciated.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: documentation for committing with git

From

Peter Eisentraut

Date:

21 July 2010, 16:07:15

On Wed, 2010-07-21 at 12:22 -0400, Robert Haas wrote:
> At the developer meeting, I promised to do the work of documenting how
> committers should use git.  So here's a first version.
> 
> http://wiki.postgresql.org/wiki/Committing_with_Git

Looks good.  Please consolidate this with the Committers page when the
day comes.

Comments:

3. ... your name and email address must match those configured on the
server

==> How do we know what those are?  Who controls that?

6. Finally, you must push your changes back to the server.

git push

This will push changes in all branches you've updated, but only branches
that also exist on the remote side will be pushed; thus, you can have
local working branches that won't be pushed.

==> This is true, but I have found it saner to configure push.default =
tracking, so that only the current branch is pushed.  Some people might
find that useful.
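
For reference, a minimal sketch of the setting mentioned above. It runs in a
throwaway repository (the directory name is a placeholder) so that it does not
touch a real checkout. Later git releases document "tracking" as a deprecated
synonym for "upstream", so the newer spelling may be preferable there.

```shell
set -e
# Throwaway repository so the setting does not affect a real checkout.
demo=$(mktemp -d)
git init -q "$demo/pg"
cd "$demo/pg"

# Push only the current branch to the branch it tracks.
# Add --global to apply the setting to every repository instead.
git config push.default tracking

# Show the value now in effect for this repository.
git config --get push.default
```

With this in place, a bare "git push" no longer sends every matching branch,
which is the behavior described in step 6 of the wiki page.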

Re: documentation for committing with git

From

Magnus Hagander

Date:

21 July 2010, 16:11:44

On Wed, Jul 21, 2010 at 21:07, Peter Eisentraut <peter_e@gmx.net> wrote:
> On Wed, 2010-07-21 at 12:22 -0400, Robert Haas wrote:
>> At the developer meeting, I promised to do the work of documenting how
>> committers should use git.  So here's a first version.
>>
>> http://wiki.postgresql.org/wiki/Committing_with_Git
>
> Looks good.  Please consolidate this with the Committers page when the
> day comes.
>
> Comments:
>
> 3. ... your name and email address must match those configured on the
> server
>
> ==> How do we know what those are?  Who controls that?

sysadmins team. It's set up when committers are added, just like
today's authormap on the git mirror. Before we set up the system,
we'll double check all of them with each committer, of course.


> 6. Finally, you must push your changes back to the server.
>
> git push
>
> This will push changes in all branches you've updated, but only branches
> that also exist on the remote side will be pushed; thus, you can have
> local working branches that won't be pushed.
>
> ==> This is true, but I have found it saner to configure push.default =
> tracking, so that only the current branch is pushed.  Some people might
> find that useful.

Indeed. Why don't I do that more often...

+1 on making that a general recommendation, and have people only not
do that if they really know what they're doing :-)

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: documentation for committing with git

From

Robert Haas

Date:

21 July 2010, 16:20:58

On Wed, Jul 21, 2010 at 3:11 PM, Magnus Hagander <magnus@hagander.net> wrote:
>> 6. Finally, you must push your changes back to the server.
>>
>> git push
>>
>> This will push changes in all branches you've updated, but only branches
>> that also exist on the remote side will be pushed; thus, you can have
>> local working branches that won't be pushed.
>>
>> ==> This is true, but I have found it saner to configure push.default =
>> tracking, so that only the current branch is pushed.  Some people might
>> find that useful.
>
> Indeed. Why don't I do that more often...
>
> +1 on making that a general recommendation, and have people only not
> do that if they really know what they're doing :-)

Hmm, I didn't know about that option.  What makes us think that's the
behavior people will most often want?  Because it doesn't seem like
what I want, just for one example...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: documentation for committing with git

From

Magnus Hagander

Date:

21 July 2010, 16:22:11

On Wed, Jul 21, 2010 at 21:20, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Jul 21, 2010 at 3:11 PM, Magnus Hagander <magnus@hagander.net> wrote:
>>> 6. Finally, you must push your changes back to the server.
>>>
>>> git push
>>>
>>> This will push changes in all branches you've updated, but only branches
>>> that also exist on the remote side will be pushed; thus, you can have
>>> local working branches that won't be pushed.
>>>
>>> ==> This is true, but I have found it saner to configure push.default =
>>> tracking, so that only the current branch is pushed.  Some people might
>>> find that useful.
>>
>> Indeed. Why don't I do that more often...
>>
>> +1 on making that a general recommendation, and have people only not
>> do that if they really know what they're doing :-)
>
> Hmm, I didn't know about that option.  What makes us think that's the
> behavior people will most often want?  Because it doesn't seem like
> what I want, just for one example...

It'd be what I want for everything *except* when doing backpatching.


--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: documentation for committing with git

From

David Christensen

Date:

21 July 2010, 16:23:21

On Jul 21, 2010, at 2:20 PM, Robert Haas wrote:

> On Wed, Jul 21, 2010 at 3:11 PM, Magnus Hagander <magnus@hagander.net> wrote:
>>> 6. Finally, you must push your changes back to the server.
>>>
>>> git push
>>>
>>> This will push changes in all branches you've updated, but only branches
>>> that also exist on the remote side will be pushed; thus, you can have
>>> local working branches that won't be pushed.
>>>
>>> ==> This is true, but I have found it saner to configure push.default =
>>> tracking, so that only the current branch is pushed.  Some people might
>>> find that useful.
>>
>> Indeed. Why don't I do that more often...
>>
>> +1 on making that a general recommendation, and have people only not
>> do that if they really know what they're doing :-)
>
> Hmm, I didn't know about that option.  What makes us think that's the
> behavior people will most often want?  Because it doesn't seem like
> what I want, just for one example...


So you're working on some back branch, and make a WIP commit so you can switch to master to make a quick commit.
Create a push on master.  Bare git push.  WIP commit gets pushed upstream.  Oops.

Regards,

David
--
David Christensen
End Point Corporation
david@endpoint.com

Re: documentation for committing with git

From

Robert Haas

Date:

21 July 2010, 16:27:00

On Wed, Jul 21, 2010 at 3:23 PM, David Christensen <david@endpoint.com> wrote:
>
> On Jul 21, 2010, at 2:20 PM, Robert Haas wrote:
>
>> On Wed, Jul 21, 2010 at 3:11 PM, Magnus Hagander <magnus@hagander.net> wrote:
>>>> 6. Finally, you must push your changes back to the server.
>>>>
>>>> git push
>>>>
>>>> This will push changes in all branches you've updated, but only branches
>>>> that also exist on the remote side will be pushed; thus, you can have
>>>> local working branches that won't be pushed.
>>>>
>>>> ==> This is true, but I have found it saner to configure push.default =
>>>> tracking, so that only the current branch is pushed.  Some people might
>>>> find that useful.
>>>
>>> Indeed. Why don't I do that more often...
>>>
>>> +1 on making that a general recommendation, and have people only not
>>> do that if they really know what they're doing :-)
>>
>> Hmm, I didn't know about that option.  What makes us think that's the
>> behavior people will most often want?  Because it doesn't seem like
>> what I want, just for one example...
>
>
> So you're working on some back branch, and make a WIP commit so you can switch to master to make a quick commit.
> Create a push on master.  Bare git push.  WIP commit gets pushed upstream.  Oops.

Sure, oops, but I would never do that.  I'd stash it or put it on a
topic branch.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: documentation for committing with git

From

Alvaro Herrera

Date:

21 July 2010, 16:31:42

Excerpts from Robert Haas's message of Wed Jul 21 15:26:47 -0400 2010:

> > So you're working on some back branch, and make a WIP commit so you can switch to master to make a quick commit.
> > Create a push on master.  Bare git push.  WIP commit gets pushed upstream.  Oops.
> 
> Sure, oops, but I would never do that.  I'd stash it or put it on a
> topic branch.

Somebody else will.  Please remember you're writing docs that are not
for yourself.

Re: documentation for committing with git

From

Andrew Dunstan

Date:

21 July 2010, 16:37:42

Robert Haas wrote:
> At the developer meeting, I promised to do the work of documenting how
> committers should use git.  So here's a first version.
>
> http://wiki.postgresql.org/wiki/Committing_with_Git
>
> Note that while anyone is welcome to comment, I mostly care about
> whether the document is adequate for our existing committers, rather
> than whether someone who is not a committer thinks we should manage
> the project differently... that might be an interesting discussion,
> but we're theoretically making this switch in about a month, and
> getting agreement on changing our current workflow will take about a
> decade, so there is not time now to do the latter before we do the
> former.  So I would ask everyone to consider postponing those
> discussions until after we've made the switch and ironed out the
> kinks.  On the other hand, if you have technical corrections, or if
> you have suggestions on how to do the same things better (rather than
> suggestions on what to do differently), that would be greatly
> appreciated.
>   

Well, either we have a terminology problem or a statement of policy that 
I'm not sure I agree with, in point 2.  IMNSHO, what we need to forbid 
is commits that are not fast-forward commits, i.e. that do not have the 
current branch head as an ancestor, ideally as the immediate ancestor.

Personally, I have a strong opinion that for everything but totally 
trivial patches, the committer should create a short-lived work branch 
where all the work is done, and then do a squash merge back to the main 
branch, which is then pushed. This pattern is not mentioned at all. In 
my experience, it is essential, especially if you're working on more 
than one thing at a time, as many people often are.

cheers

andrew

Re: documentation for committing with git

From

Magnus Hagander

Date:

21 July 2010, 16:40:08

On Wed, Jul 21, 2010 at 21:37, Andrew Dunstan <andrew@dunslane.net> wrote:
>
>
> Robert Haas wrote:
>>
>> At the developer meeting, I promised to do the work of documenting how
>> committers should use git.  So here's a first version.
>>
>> http://wiki.postgresql.org/wiki/Committing_with_Git
>>
>> Note that while anyone is welcome to comment, I mostly care about
>> whether the document is adequate for our existing committers, rather
>> than whether someone who is not a committer thinks we should manage
>> the project differently... that might be an interesting discussion,
>> but we're theoretically making this switch in about a month, and
>> getting agreement on changing our current workflow will take about a
>> decade, so there is not time now to do the latter before we do the
>> former.  So I would ask everyone to consider postponing those
>> discussions until after we've made the switch and ironed out the
>> kinks.  On the other hand, if you have technical corrections, or if
>> you have suggestions on how to do the same things better (rather than
>> suggestions on what to do differently), that would be greatly
>> appreciated.
>>
>
> Well, either we have a terminology problem or a statement of policy that I'm
> not sure I agree with, in point 2.  IMNSHO, what we need to forbid is
> commits that are not fast-forward commits, i.e. that do not have the current
> branch head as an ancestor, ideally as the immediate ancestor.
>
> Personally, I have a strong opinion that for everything but totally trivial
> patches, the committer should create a short-lived work branch where all the
> work is done, and then do a squash merge back to the main branch, which is
> then pushed. This pattern is not mentioned at all. In my experience, it is
> essential, especially if you're working on more than one thing at a time, as
> many people often are.

Uh, that's going to create an actual merge commit, no? Or you mean
squash-merge-but-only-fast-forward?

I *think* the docs are based off the pattern of the committer having
two repositories - one for his own work, one for committing, much like
I assume all of us have today in cvs.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: documentation for committing with git

From

David Christensen

Date:

21 July 2010, 16:44:21

On Jul 21, 2010, at 2:39 PM, Magnus Hagander wrote:

> On Wed, Jul 21, 2010 at 21:37, Andrew Dunstan <andrew@dunslane.net> wrote:
>>
>>
>> Robert Haas wrote:
>>>
>>> At the developer meeting, I promised to do the work of documenting how
>>> committers should use git.  So here's a first version.
>>>
>>> http://wiki.postgresql.org/wiki/Committing_with_Git
>>>
>>> Note that while anyone is welcome to comment, I mostly care about
>>> whether the document is adequate for our existing committers, rather
>>> than whether someone who is not a committer thinks we should manage
>>> the project differently... that might be an interesting discussion,
>>> but we're theoretically making this switch in about a month, and
>>> getting agreement on changing our current workflow will take about a
>>> decade, so there is not time now to do the latter before we do the
>>> former.  So I would ask everyone to consider postponing those
>>> discussions until after we've made the switch and ironed out the
>>> kinks.  On the other hand, if you have technical corrections, or if
>>> you have suggestions on how to do the same things better (rather than
>>> suggestions on what to do differently), that would be greatly
>>> appreciated.
>>>
>>
>> Well, either we have a terminology problem or a statement of policy that I'm
>> not sure I agree with, in point 2.  IMNSHO, what we need to forbid is
>> commits that are not fast-forward commits, i.e. that do not have the current
>> branch head as an ancestor, ideally as the immediate ancestor.
>>
>> Personally, I have a strong opinion that for everything but totally trivial
>> patches, the committer should create a short-lived work branch where all the
>> work is done, and then do a squash merge back to the main branch, which is
>> then pushed. This pattern is not mentioned at all. In my experience, it is
>> essential, especially if you're working on more than one thing at a time, as
>> many people often are.
>
> Uh, that's going to create an actual merge commit, no? Or you mean
> squash-merge-but-only-fast-forward?
>
> I *think* the docs are based off the pattern of the committer having
> two repositories - one for his own work, one for committing, much like
> I assume all of us have today in cvs.

You can also do a rebase after the merge to remove the local merge commit before pushing.  I tend to do this anytime I
merge a local branch, just to rebase on top of the most recent origin/master.

Regards,

David
--
David Christensen
End Point Corporation
david@endpoint.com

Re: documentation for committing with git

From

Andrew Dunstan

Date:

21 July 2010, 16:59:54

Magnus Hagander wrote:
>> Personally, I have a strong opinion that for everything but totally trivial
>> patches, the committer should create a short-lived work branch where all the
>> work is done, and then do a squash merge back to the main branch, which is
>> then pushed. This pattern is not mentioned at all. In my experience, it is
>> essential, especially if you're working on more than one thing at a time, as
>> many people often are.
>>     
>
> Uh, that's going to create an actual merge commit, no? Or you mean
> squash-merge-but-only-fast-forward?
>   

Yes, exactly that. Something like:
   git checkout -b myworkbranch
   ... work, test, commit, rinse, lather, repeat ...
   git checkout RELn_m_STABLE
   git pull
   git merge --squash myworkbranch
   git commit
   git push
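
A self-contained sketch of the squash-merge pattern, for anyone who wants to
try it safely. Everything here is an assumption made for illustration: a bare
repository in a temporary directory stands in for the shared server, master
stands in for the release branch, and the identity and commit messages are
placeholders. Note the explicit "git commit" step: "git merge --squash" only
stages the combined change.

```shell
set -e
# Throwaway setup: a bare repository stands in for the shared server.
demo=$(mktemp -d)
cd "$demo"
git init -q --bare server.git
git clone -q server.git work 2>/dev/null
cd work
git config user.name "Example Committer"        # placeholder identity
git config user.email "committer@example.org"
git symbolic-ref HEAD refs/heads/master          # fix the branch name for the demo
echo base > file.txt
git add file.txt
git commit -q -m "Initial commit"
git push -q origin master

# Do the work on a short-lived branch, in as many commits as convenient.
git checkout -q -b myworkbranch
echo one >> file.txt; git commit -q -a -m "WIP 1"
echo two >> file.txt; git commit -q -a -m "WIP 2"

# Back on the main branch: update, squash, commit, push.
git checkout -q master
git pull -q origin master
git merge -q --squash myworkbranch   # stages the combined change, no commit yet
git commit -q -m "Add feature as a single commit"
git push -q origin master

# Print the new commit followed by its parents; a single parent means
# the pushed commit is an ordinary commit, not a merge commit.
git rev-list --parents -n 1 HEAD
```

The work branch keeps its individual WIP commits locally; only the single
squashed commit reaches the server.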

> I *think* the docs are based off the pattern of the committer having
> two repositories - one for his own work, one for committing, much like
> I assume all of us have today in cvs.
>
>   

So then what? After you've done your work you'll still need to pull the 
stuff somehow into your commit tree. I don't think this will buy you a 
lot. I usually clone the whole CVS tree for non-trivial work, but I'm 
not sure that's an ideal work pattern.

cheers

andrew

Re: documentation for committing with git

From

Robert Haas

Date:

21 July 2010, 17:48:12

On Wed, Jul 21, 2010 at 3:31 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> Excerpts from Robert Haas's message of Wed Jul 21 15:26:47 -0400 2010:
>
>> > So you're working on some back branch, and make a WIP commit so you can switch to master to make a quick commit.
>> > Create a push on master.  Bare git push.  WIP commit gets pushed upstream.  Oops.
>>
>> Sure, oops, but I would never do that.  I'd stash it or put it on a
>> topic branch.
>
> Somebody else will.  Please remember you're writing docs that are not
> for yourself.

I don't have any problem suggesting it for those who may want it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: documentation for committing with git

From

Robert Haas

Date:

21 July 2010, 18:03:20

On Wed, Jul 21, 2010 at 3:37 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
> Well, either we have a terminology problem or a statement of policy that I'm
> not sure I agree with, in point 2.  IMNSHO, what we need to forbid is
> commits that are not fast-forward commits, i.e. that do not have the current
> branch head as an ancestor, ideally as the immediate ancestor.

There are two separate questions here.  One is whether an update to a
ref is fast-forward or history rewriting, and the other is whether it
is a merge commit or not.  I don't believe that we want either
history-rewriting commits or merge commits to get pushed, but this
paragraph is about merge commits.

> Personally, I have a strong opinion that for everything but totally trivial
> patches, the committer should create a short-lived work branch where all the
> work is done, and then do a squash merge back to the main branch, which is
> then pushed. This pattern is not mentioned at all. In my experience, it is
> essential, especially if you're working on more than one thing at a time, as
> many people often are.

git merge --squash doesn't create a merge commit.  Indeed, the whole
point is to create a commit which essentially encapsulates the same
diff as a merge commit but actually isn't one.  From the man page:

Produce the working tree and index state as if a real merge
happened (except for the merge information), but do not actually
make a commit or move the HEAD, nor record $GIT_DIR/MERGE_HEAD to
cause the next git commit command to create a merge commit.

As for whether to discuss the use of git merge --squash, I could go
either way on that.  Personally, my preferred workflow is to do 'git
rebase -i master' on a topic branch, squash all the commits, and then
switch to the master branch and do 'git merge otherbranch', resulting
in a fast-forward merge with no merge commit.  But there are many
other ways to do it, including 'git merge --squash' and the
already-mentioned 'git commit -a'.  I think there's a risk of this
turning into a complete tutorial on git, which might detract from its
primary purpose of explaining to committers how to get  a basic,
working setup in place.  But we can certainly add whatever you think
is important, or maybe some language indicating that 'git commit -a'
is just an EXAMPLE of how to create a commit...
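
The two questions separated above (fast-forward versus history rewriting, and
merge commit versus ordinary commit) can each be checked locally before
pushing. What follows is only a rough sketch under stated assumptions: the
remote is named origin, the branch is master, and a throwaway bare repository
stands in for the server. The interactive rebase is left out because it needs
an editor; "--ff-only" is used here as one way to make sure the merge cannot
create a merge commit.

```shell
set -e
# Throwaway setup: a bare repository stands in for the shared server.
demo=$(mktemp -d)
cd "$demo"
git init -q --bare server.git
git clone -q server.git work 2>/dev/null
cd work
git config user.name "Example Committer"        # placeholder identity
git config user.email "committer@example.org"
git symbolic-ref HEAD refs/heads/master
echo base > file.txt
git add file.txt
git commit -q -m "Initial commit"
git push -q origin master

# A topic branch holding one (already squashed) commit.
git checkout -q -b otherbranch
echo change >> file.txt
git commit -q -a -m "Squashed change"

# Fast-forward master to the topic branch; --ff-only refuses to proceed
# if a merge commit would be needed.
git checkout -q master
git merge -q --ff-only otherbranch

# Question 1: is the push a fast-forward? It is when origin/master is an
# ancestor of the local branch tip.
if git merge-base --is-ancestor origin/master master; then
    echo "fast-forward: yes"
fi

# Question 2: would any merge commits be pushed? Count them.
merges=$(git rev-list --merges --count origin/master..master)
echo "merge commits to push: $merges"
```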

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: documentation for committing with git

From

Robert Haas

Date:

21 July 2010, 18:41:47

On Wed, Jul 21, 2010 at 5:03 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> working setup in place.  But we can certainly add whatever you think
> is important, or maybe some language indicating that 'git commit -a'
> is just an EXAMPLE of how to create a commit...

I took a crack at this, as well as incorporating some of the other
suggestions that have been made.  I'm sure it's not perfect, but maybe
it's an improvement...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: documentation for committing with git

From

Daniel Farina

Date:

28 July 2010, 20:22:14

On Wed, Jul 21, 2010 at 9:22 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On the other hand, if you have technical corrections, or if
> you have suggestions on how to do the same things better (rather than
> suggestions on what to do differently), that would be greatly
> appreciated.

Somewhere in that wiki page there is some musing about the size of
.git directories being somewhat dissatisfying when one feels compelled
to have multiple check-outs materialized.

There's a facility in git known as "alternates" that lets you fetch
objects from some other pool. This is highly useful if you have the
same project checked out many times, either for personal use or on a
hosting server of some sort.

Because the object pool being referenced has no knowledge of other
repositories referencing it, garbage collection (should you be doing
things that generate garbage, such as deleting branches and tags) of the
underlying pool can cause corruption in referencing repositories in
the case where they reference objects that have since been deleted.

This will never happen if the repository monotonically grows, as is
often the case for an 'authoritative' repository, such as the one at
git.postgresql.org that only has major branches and release tags that
will never go away (save for the rare case of fixing up after a cvs
race condition, which has occurred a few times in the past).

In practice, one can just make a clean clone of a project for the
purposes of such an object pool and then let it sit for months or even
years, as the size of each subsequent .git, even for considerable
amounts of history, is marginal. Once in a while one can perform the
clean-up task of catching up the object pool, if they feel their .git
directories have gotten unacceptably large.

Here's a brief section about it on a git wiki:

https://git.wiki.kernel.org/index.php/GitFaq#How_to_share_objects_between_existing_repositories.3F
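
As a rough illustration of the object-pool arrangement described above: git
clone accepts a --reference option that records another repository as an
alternate. The repository names below are placeholders, and a local throwaway
repository stands in for git.postgresql.org.

```shell
set -e
demo=$(mktemp -d)
cd "$demo"

# Stand-in for the upstream project (placeholder content).
git init -q upstream
cd upstream
git config user.name "Example Committer"        # placeholder identity
git config user.email "committer@example.org"
echo base > file.txt
git add file.txt
git commit -q -m "Initial commit"
cd ..

# A clean bare clone that serves only as the shared object pool.
git clone -q --bare upstream pool.git

# Working clones borrow objects from the pool instead of copying them.
git clone -q --reference "$demo/pool.git" upstream work1
git clone -q --reference "$demo/pool.git" upstream work2

# Each clone records the pool's object directory in its alternates file.
cat work1/.git/objects/info/alternates
```

The caveat from the message still applies: objects must never be pruned from
the pool while other repositories refer to it.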

fdr

Re: documentation for committing with git

From

Heikki Linnakangas

Date:

04 August 2010, 06:22:36

On 21/07/10 18:22, Robert Haas wrote:
> At the developer meeting, I promised to do the work of documenting how
> committers should use git.  So here's a first version.
>
> http://wiki.postgresql.org/wiki/Committing_with_Git
>
> Note that while anyone is welcome to comment, I mostly care about
> whether the document is adequate for our existing committers, rather
> than whether someone who is not a committer thinks we should manage
> the project differently... that might be an interesting discussion,
> but we're theoretically making this switch in about a month, and
> getting agreement on changing our current workflow will take about a
> decade, so there is not time now to do the latter before we do the
> former.  So I would ask everyone to consider postponing those
> discussions until after we've made the switch and ironed out the
> kinks.  On the other hand, if you have technical corrections, or if
> you have suggestions on how to do the same things better (rather than
> suggestions on what to do differently), that would be greatly
> appreciated.

I'm a bit disappointed that the wiki page advises against 
git-new-workdir - that's exactly what I was planning to use. It claims 
there are data loss issues with that; does someone know the details? Is 
there really a risk of data loss if multiple workdirs are used, in our 
situation?

I'm planning to have only one local repository, with one workdir per 
branch. When applying a patch to multiple branches, I could work 
simultaneously on all branches, finish and commit the patches on all 
branches, and finally do one "git push" to push all the changes to the 
PostgreSQL repository in one go.

I'm working like that with the internal EDB repository right now, and 
it seems to work fine. I keep the master branch checked out in the "main" 
workdir, within the repository, and for each backbranch there's an extra 
workdir created with git-new-workdir. Though I've only been doing this 
for a month or so - if there are issues I might not have noticed yet.

PS. I highly recommend always using "git push --dry-run" before the real 
thing, to make sure you're not doing anything funny.
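
A small sketch of the dry run suggested above. With --dry-run, git push
reports what it would send without updating the remote. The throwaway
repositories are stand-ins for the real server, and origin and master are
assumed names.

```shell
set -e
# Throwaway setup: a bare repository stands in for the shared server.
demo=$(mktemp -d)
cd "$demo"
git init -q --bare server.git
git clone -q server.git work 2>/dev/null
cd work
git config user.name "Example Committer"        # placeholder identity
git config user.email "committer@example.org"
git symbolic-ref HEAD refs/heads/master
echo base > file.txt
git add file.txt
git commit -q -m "Initial commit"

# Show what a push would do; nothing is sent to the server.
git push --dry-run origin master

# The server still has no refs at this point.
before=$(git ls-remote origin)
echo "refs on server after dry run: ${before:-none}"

# The real push, once the dry run looks right.
git push -q origin master
git ls-remote origin
```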

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com

Re: documentation for committing with git

From

Robert Haas

Date:

04 August 2010, 07:32:21

On Sun, Aug 1, 2010 at 5:08 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> I'm a bit disappointed that the wiki page advises against git-new-workdir -
> that's exactly what I was planning to use. It claims there are data loss
> issues with that; does someone know the details? Is there really a risk of
> data loss if multiple workdirs are used, in our situation?

As I understand it, there is a risk of corruption if you ever do "git
gc" in the repository that the git-new-workdir was spawned from.  See
also Daniel Farina's email, here:

http://archives.postgresql.org/pgsql-hackers/2010-07/msg01489.php

It's not easy for me to mentally verify that the way I work won't
cause problems with this approach, but you may feel differently, and
that's fine.

> PS. I highly recommend always using "git push --dry-run" before the real
> thing, to make sure you're not doing anything funny.

Ah, that sounds like a good idea.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: documentation for committing with git

From

Robert Haas

Date:

04 August 2010, 10:34:15

On Wed, Aug 4, 2010 at 9:29 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> Hmm, if I understand correctly, Daniel talks about data loss when using
> "alternates", if you e.g. delete a branch and run "git gc" in the parent
> repository, because the child repository referring to the parent via the
> alternate reference can depend on objects in the parent repository that are
> no longer required by the parent repository itself.
>
> I guess that applies to multiple workdirs too, if you have staged but
> uncommitted changes in the staging area of a workdir. This message
> http://kerneltrap.org/mailarchive/git/2007/10/11/335637 agrees. Shawn O
> Pearce's last paragraph says:

You might want to edit your new section so that it refers to some of
the earlier material rather than duplicating it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: documentation for committing with git

From

Heikki Linnakangas

Date:

04 August 2010, 10:35:03

On 04/08/10 13:32, Robert Haas wrote:
> On Sun, Aug 1, 2010 at 5:08 AM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com>  wrote:
>> I'm a bit disappointed that the wiki page advises against git-new-workdir -
>> that's exactly what I was planning to use. It claims there are data loss
>> issues with that; does someone know the details? Is there really a risk of
>> data loss if multiple workdirs are used, in our situation?
>
> As I understand it, there is a risk of corruption if you ever do "git
> gc" in the repository that the git-new-workdir was spawned from.  See
> also Daniel Farina's email, here:
>
> http://archives.postgresql.org/pgsql-hackers/2010-07/msg01489.php
>
> It's not easy for me to mentally verify that the way I work won't
> cause problems with this approach, but you may feel differently, and
> that's fine.

Hmm, if I understand correctly, Daniel talks about data loss when using 
"alternates", if you e.g. delete a branch and run "git gc" in the parent 
repository, because the child repository referring to the parent via the 
alternate reference can depend on objects in the parent repository that 
are no longer required by the parent repository itself.

I guess that applies to multiple workdirs too, if you have staged but 
uncommitted changes in the staging area of a workdir. This message 
http://kerneltrap.org/mailarchive/git/2007/10/11/335637 agrees. Shawn O 
Pearce's last paragraph says:

> Heh.  As you can see it has some "issues" with its use.  Its a very
> powerful tool, but it does give you more than enough room to shoot
> yourself in the foot.  Using it is like tieing a gun to your ankle,
> keeping it aimed at your big toe at all times, with a string tied
> to your wrist and the gun's trigger.  Reach too far and *bam*.
> Which is why its still in contrib status.

All those issues can be avoided if you only run "git gc" when all the 
working directories are in a clean state, with no staged but uncommitted 
changes or other funny things. I can live with that gun tied to my ankle 
;-).

I've added a section describing git-new-workdir the way I'm going to use it.

>> PS. I highly recommend always using "git push --dry-run" before the real
>> thing, to make sure you're not doing anything funny.
>
> Ah, that sounds like a good idea.

I added a note of that to the wiki too.

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com

Re: documentation for committing with git

From

Andrew Dunstan

Date:

04 August 2010, 10:50:57


On 08/04/2010 09:29 AM, Heikki Linnakangas wrote:
>
> All those issues can be avoided if you only run "git gc" when all the 
> working directories are in a clean state, with no staged but 
> uncommitted changes or other funny things. I can live with that gun 
> tied to my ankle ;-).


But to make sure of that I think you need to prevent git commands from 
running gc automatically:
    git config gc.auto 0

or possibly
    git config --global gc.auto 0

And you'll need to make sure you run gc yourself from time to time.

cheers

andrew

Re: documentation for committing with git

From

Daniel Farina

Date:

04 August 2010, 23:08:48

On Wed, Aug 4, 2010 at 6:29 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> All those issues can be avoided if you only run "git gc" when all the
> working directories are in a clean state, with no staged but uncommitted
> changes or other funny things. I can live with that gun tied to my ankle
> ;-).

Does even that open a possibility for data loss?

Use of the alternates feature will, to my knowledge, never write to the
referenced repository: all new objects are held in the referencers.
The only condition as I understand it is not to generate garbage in
the reference repository, and that nominally does not happen in a repo
that exists only to be an object pool (you probably even want to use a
"bare" repository instead of one with checked out files).

I believe this feature is popular with hosting services that serve many
repos of the same project.

The especially paranoid may want to try setting their alternate,
referenced repository to be read-only with respect to the user doing
all the potentially-modifying work, undoing this if and when they feel
like adding more objects to the referenced repository. My guess is one
can do a clean checkout and then ride this strategy for quite a long
time (a year? more? it depends on how space-conscious one is), so that
would not be an incredibly onerous paranoia, if one has it.
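
For concreteness, a minimal sketch of that setup, assuming a POSIX
shell. clone_with_pool is a hypothetical helper name, and the URL and
both paths are placeholders supplied by the caller. It uses
"git clone --bare" for the object pool and "git clone --reference" for
the working clone.

```shell
# clone_with_pool URL POOL DEST   (hypothetical helper; arguments are placeholders)
clone_with_pool() {
    url=$1 pool=$2 dest=$3
    # Create the bare object pool once; it has no checked-out files.
    [ -d "$pool" ] || git clone --quiet --bare "$url" "$pool" || return 1
    # Borrow objects from the pool; new objects are written only to $dest.
    git clone --quiet --reference "$pool" "$url" "$dest" || return 1
    # The link to the pool is recorded here, one object directory per line.
    cat "$dest/.git/objects/info/alternates"
}

# For the especially paranoid, make the pool read-only afterwards:
#   chmod -R a-w "$pool"
# and undo it with "chmod -R u+w" before adding more objects to it.
```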

fdr

Re: documentation for committing with git

From

Andrew Dunstan

Date:

05 August 2010, 00:04:16


On 08/04/2010 10:08 PM, Daniel Farina wrote:
> On Wed, Aug 4, 2010 at 6:29 AM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com>  wrote:
>> All those issues can be avoided if you only run "git gc" when all the
>> working directories are in a clean state, with no staged but uncommitted
>> changes or other funny things. I can live with that gun tied to my ankle
>> ;-).
> Does even that open a possibility for data loss?
>
> Use of the alternates feature will, to my knowledge, never write the
> referenced repository: all new objects are held in the referencers.
> The only condition as I understand it is not to generate garbage in
> the reference repository, and that nominally does not happen in a repo
> that exists only to be an object pool (you probably even want to use a
> "bare" repository instead of one with checked out files).
>
>

AIUI, git-new-workdir, which Heikki is proposing to use, does not work 
with bare clones.

cheers

andrew

Re: documentation for committing with git

From

Heikki Linnakangas

Date:

05 August 2010, 06:36:14

On 04/08/10 16:50, Andrew Dunstan wrote:
> On 08/04/2010 09:29 AM, Heikki Linnakangas wrote:
>>
>> All those issues can be avoided if you only run "git gc" when all the
>> working directories are in a clean state, with no staged but
>> uncommitted changes or other funny things. I can live with that gun
>> tied to my ankle ;-).
>
> But to make sure of that I think you need to prevent git commands from
> running gc automatically:
>
> git config gc.auto 0
>
> or possibly
>
> git config --global gc.auto 0
>
> And you'll need to make sure you run gc yourself from time to time.

Good idea. I'll add that to the wiki. I don't like the automatic garbage 
collection anyway; it always kicks in when I'm doing something, and I 
end up interrupting it.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

Re: documentation for committing with git

From

Heikki Linnakangas

Date:

05 August 2010, 06:43:57

On 05/08/10 05:08, Daniel Farina wrote:
> On Wed, Aug 4, 2010 at 6:29 AM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com>  wrote:
>> All those issues can be avoided if you only run "git gc" when all the
>> working directories are in a clean state, with no staged but uncommitted
>> changes or other funny things. I can live with that gun tied to my ankle
>> ;-).
>
> Does even that open a possibility for data loss?
>
> Use of the alternates feature will, to my knowledge, never write the
> referenced repository: all new objects are held in the referencers.
> The only condition as I understand it is not to generate garbage in
> the reference repository, and that nominally does not happen in a repo
> that exists only to be an object pool (you probably even want to use a
> "bare" repository instead of one with checked out files).
>
> I believe this feature is popular with hosting services that serve many
> repos of the same project.
>
> The especially paranoid may want to try setting their alternate,
> referenced repository to be read-only with respect to the user doing
> all the potentially-modifying work, undoing this if and when they feel
> like adding more objects to the referenced repository. My guess is one
> can do a clean checkout and then ride this strategy for quite a long
> time (a year? more? it depends on how space-conscious one is), so that
> would not be an incredibly onerous paranoia, if one has it.

We're talking about different things again. I was talking about using 
one shared repository with multiple workdirs created with 
git-new-workdir. You're talking about alternates. What you say is 
correct for alternates, and what I said about staged but not committed 
changes is correct for the multiple workdirs approach.

BTW, "git gc" has a grace period, so that it won't delete any garbage 
newer than X days anyway. If I'm reading the git-gc man page correctly, 
that period is 2 weeks by default. That makes the possibility of 
accidentally deleting still-interesting staged but not committed changes 
quite small, even if you run "git gc" at the wrong time. You wouldn't 
normally have staged but not committed changes like that lying in 
backbranches for that long.
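
A small sketch for checking that grace period, assuming a POSIX shell
and that you run it inside the repository. prune_grace is a made-up
helper name; gc.pruneExpire is the git-config setting behind the grace
period, and "2 weeks ago" is the default given in the git
documentation.

```shell
# prune_grace   (hypothetical helper, not part of git)
# Prints the grace period "git gc" would use in the current repository.
prune_grace() {
    # Fall back to git's documented default when nothing is configured.
    git config --get gc.pruneExpire || echo "2 weeks ago"
}

# To lengthen the grace period for this repository (example value only):
#   git config gc.pruneExpire "1 month ago"
```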

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com