Thread: documentation for committing with git
At the developer meeting, I promised to do the work of documenting how committers should use git. So here's a first version. http://wiki.postgresql.org/wiki/Committing_with_Git Note that while anyone is welcome to comment, I mostly care about whether the document is adequate for our existing committers, rather than whether someone who is not a committer thinks we should manage the project differently... that might be an interesting discussion, but we're theoretically making this switch in about a month, and getting agreement on changing our current workflow will take about a decade, so there is not time now to do the latter before we do the former. So I would ask everyone to consider postponing those discussions until after we've made the switch and ironed out the kinks. On the other hand, if you have technical corrections, or if you have suggestions on how to do the same things better (rather than suggestions on what to do differently), that would be greatly appreciated. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
On Wed, 2010-07-21 at 12:22 -0400, Robert Haas wrote: > At the developer meeting, I promised to do the work of documenting how > committers should use git. So here's a first version. > > http://wiki.postgresql.org/wiki/Committing_with_Git Looks good. Please consolidate this with the Committers page when the day comes. Comments: 3. ... your name and email address must match those configured on the server ==> How do we know what those are? Who controls that? 6. Finally, you must push your changes back to the server. git push This will push changes in all branches you've updated, but only branches that also exist on the remote side will be pushed; thus, you can have local working branches that won't be pushed. ==> This is true, but I have found it saner to configure push.default = tracking, so that only the current branch is pushed. Some people might find that useful.
On Wed, Jul 21, 2010 at 21:07, Peter Eisentraut <peter_e@gmx.net> wrote: > On ons, 2010-07-21 at 12:22 -0400, Robert Haas wrote: >> At the developer meeting, I promised to do the work of documenting how >> committers should use git. So here's a first version. >> >> http://wiki.postgresql.org/wiki/Committing_with_Git > > Looks good. Please consolidate this with the Committers page when the > day comes. > > Comments: > > 3. ... your name and email address must match those configured on the > server > > ==> How do we know what those are? Who controls that? sysadmins team. It's set up when committers are added, just like today's authormap on the git mirror. Before we set up the system, we'll double check all of them with each committer, of course. > 6. Finally, you must push your changes back to the server. > > git push > > This will push changes in all branches you've updated, but only branches > that also exist on the remote side will be pushed; thus, you can have > local working branches that won't be pushed. > > ==> This is true, but I have found it saner to configure push.default = > tracking, so that only the current branch is pushes. Some people might > find that useful. Indeed. Why don't I do that more often... +1 on making that a general recommendation, and have people only not do that if they really know what they're doing :-) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Wed, Jul 21, 2010 at 3:11 PM, Magnus Hagander <magnus@hagander.net> wrote: >> 6. Finally, you must push your changes back to the server. >> >> git push >> >> This will push changes in all branches you've updated, but only branches >> that also exist on the remote side will be pushed; thus, you can have >> local working branches that won't be pushed. >> >> ==> This is true, but I have found it saner to configure push.default = >> tracking, so that only the current branch is pushes. Some people might >> find that useful. > > Indeed. Why don't I do that more often... > > +1 on making that a general recommendation, and have people only not > do that if they really know what they're doing :-) Hmm, I didn't know about that option. What makes us think that's the behavior people will most often want? Because it doesn't seem like what I want, just for one example... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
On Wed, Jul 21, 2010 at 21:20, Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Jul 21, 2010 at 3:11 PM, Magnus Hagander <magnus@hagander.net> wrote: >>> 6. Finally, you must push your changes back to the server. >>> >>> git push >>> >>> This will push changes in all branches you've updated, but only branches >>> that also exist on the remote side will be pushed; thus, you can have >>> local working branches that won't be pushed. >>> >>> ==> This is true, but I have found it saner to configure push.default = >>> tracking, so that only the current branch is pushes. Some people might >>> find that useful. >> >> Indeed. Why don't I do that more often... >> >> +1 on making that a general recommendation, and have people only not >> do that if they really know what they're doing :-) > > Hmm, I didn't know about that option. What makes us think that's the > behavior people will most often want? Because it doesn't seem like > what I want, just for one example... It'd be what I want for everything *except* when doing backpatching. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
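The push.default setting discussed above can be tried out in a throwaway sandbox. What follows is only a sketch under stated assumptions: the temporary directories, the committer identity, and the branch name REL9_0_STABLE are illustrative and not taken from the wiki page. It uses the value `upstream`, the current spelling of the setting; `tracking` is the older, deprecated synonym. Git releases from 2.0 onward also changed the built-in default from `matching` to `simple`, so a bare "git push" there already pushes only the current branch.

```shell
set -eu
# Throwaway sandbox: a bare "server" repository and one clone (hypothetical paths).
work=$(mktemp -d)
git init -q --bare "$work/origin.git"
git init -q "$work/clone"
cd "$work/clone"
git symbolic-ref HEAD refs/heads/master        # make sure the first branch is "master"
git config user.name "Example Committer"       # illustrative identity
git config user.email "committer@example.org"
git remote add origin "$work/origin.git"

# Publish master and one back branch so both exist on the remote side.
git commit -q --allow-empty -m "initial commit"
git push -q -u origin master
git checkout -q -b REL9_0_STABLE
git push -q -u origin REL9_0_STABLE

# Only push the branch that is currently checked out.
git config push.default upstream

# The scenario from the thread: a WIP commit on the back branch ...
git commit -q --allow-empty -m "WIP: not ready"
# ... then a quick commit on master followed by a bare "git push".
git checkout -q master
git commit -q --allow-empty -m "quick fix"
git push -q

remote_master=$(git --git-dir="$work/origin.git" rev-parse refs/heads/master)
remote_back=$(git --git-dir="$work/origin.git" rev-parse refs/heads/REL9_0_STABLE)
local_master=$(git rev-parse refs/heads/master)
local_back=$(git rev-parse refs/heads/REL9_0_STABLE)
echo "master pushed:      $remote_master"
echo "back branch remote: $remote_back (local WIP is $local_back)"
```

For back-patching, where several branches are meant to go out together, the branches can still be named explicitly, for example "git push origin master REL9_0_STABLE".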
On Jul 21, 2010, at 2:20 PM, Robert Haas wrote: > On Wed, Jul 21, 2010 at 3:11 PM, Magnus Hagander <magnus@hagander.net> wrote: >>> 6. Finally, you must push your changes back to the server. >>> >>> git push >>> >>> This will push changes in all branches you've updated, but only branches >>> that also exist on the remote side will be pushed; thus, you can have >>> local working branches that won't be pushed. >>> >>> ==> This is true, but I have found it saner to configure push.default = >>> tracking, so that only the current branch is pushed. Some people might >>> find that useful. >> >> Indeed. Why don't I do that more often... >> >> +1 on making that a general recommendation, and have people only not >> do that if they really know what they're doing :-) > > Hmm, I didn't know about that option. What makes us think that's the > behavior people will most often want? Because it doesn't seem like > what I want, just for one example... So you're working on some back branch, and make a WIP commit so you can switch to master to make a quick commit. Create a push on master. Bare git push. WIP commit gets pushed upstream. Oops. Regards, David -- David Christensen End Point Corporation david@endpoint.com
On Wed, Jul 21, 2010 at 3:23 PM, David Christensen <david@endpoint.com> wrote: > > On Jul 21, 2010, at 2:20 PM, Robert Haas wrote: > >> On Wed, Jul 21, 2010 at 3:11 PM, Magnus Hagander <magnus@hagander.net> wrote: >>>> 6. Finally, you must push your changes back to the server. >>>> >>>> git push >>>> >>>> This will push changes in all branches you've updated, but only branches >>>> that also exist on the remote side will be pushed; thus, you can have >>>> local working branches that won't be pushed. >>>> >>>> ==> This is true, but I have found it saner to configure push.default = >>>> tracking, so that only the current branch is pushes. Some people might >>>> find that useful. >>> >>> Indeed. Why don't I do that more often... >>> >>> +1 on making that a general recommendation, and have people only not >>> do that if they really know what they're doing :-) >> >> Hmm, I didn't know about that option. What makes us think that's the >> behavior people will most often want? Because it doesn't seem like >> what I want, just for one example... > > > So you're working on some back branch, and make a WIP commit so you can switch to master to make a quick commit. Createa push on master. Bare git push. WIP commit gets pushed upstream. Oops. Sure, oops, but I would never do that. I'd stash it or put it on a topic branch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Excerpts from Robert Haas's message of Wed Jul 21 15:26:47 -0400 2010: > > So you're working on some back branch, and make a WIP commit so you can switch to master to make a quick commit. Create a push on master. Bare git push. WIP commit gets pushed upstream. Oops. > > Sure, oops, but I would never do that. I'd stash it or put it on a > topic branch. Somebody else will. Please remember you're writing docs that are not for yourself.
Robert Haas wrote: > At the developer meeting, I promised to do the work of documenting how > committers should use git. So here's a first version. > > http://wiki.postgresql.org/wiki/Committing_with_Git > > Note that while anyone is welcome to comment, I mostly care about > whether the document is adequate for our existing committers, rather > than whether someone who is not a committer thinks we should manage > the project differently... that might be an interesting discussion, > but we're theoretically making this switch in about a month, and > getting agreement on changing our current workflow will take about a > decade, so there is not time now to do the latter before we do the > former. So I would ask everyone to consider postponing those > discussions until after we've made the switch and ironed out the > kinks. On the other hand, if you have technical corrections, or if > you have suggestions on how to do the same things better (rather than > suggestions on what to do differently), that would be greatly > appreciated. > Well, either we have a terminology problem or a statement of policy that I'm not sure I agree with, in point 2. IMNSHO, what we need to forbid is commits that are not fast-forward commits, i.e. that do not have the current branch head as an ancestor, ideally as the immediate ancestor. Personally, I have a strong opinion that for everything but totally trivial patches, the committer should create a short-lived work branch where all the work is done, and then do a squash merge back to the main branch, which is then pushed. This pattern is not mentioned at all. In my experience, it is essential, especially if you're working on more than one thing at a time, as many people often are. cheers andrew
On Wed, Jul 21, 2010 at 21:37, Andrew Dunstan <andrew@dunslane.net> wrote: > > > Robert Haas wrote: >> >> At the developer meeting, I promised to do the work of documenting how >> committers should use git. So here's a first version. >> >> http://wiki.postgresql.org/wiki/Committing_with_Git >> >> Note that while anyone is welcome to comment, I mostly care about >> whether the document is adequate for our existing committers, rather >> than whether someone who is not a committer thinks we should manage >> the project differently... that might be an interesting discussion, >> but we're theoretically making this switch in about a month, and >> getting agreement on changing our current workflow will take about a >> decade, so there is not time now to do the latter before we do the >> former. So I would ask everyone to consider postponing those >> discussions until after we've made the switch and ironed out the >> kinks. On the other hand, if you have technical corrections, or if >> you have suggestions on how to do the same things better (rather than >> suggestions on what to do differently), that would be greatly >> appreciated. >> > > Well, either we have a terminology problem or a statement of policy that I'm > not sure I agree with, in point 2. IMNSHO, what we need to forbid is > commits that are not fast-forward commits, i.e. that do not have the current > branch head as an ancestor, ideally as the immediate ancestor. > > Personally, I have a strong opinion that for everything but totally trivial > patches, the committer should create a short-lived work branch where all the > work is done, and then do a squash merge back to the main branch, which is > then pushed. This pattern is not mentioned at all. In my experience, it is > essential, especially if you're working on more than one thing at a time, as > many people often are. Uh, that's going to create an actual merge commit, no? Or you mean squash-merge-but-only-fast-forward? 
I *think* the docs are based on the pattern of the committer having two repositories - one for his own work, one for committing, much like I assume all of us have today in cvs. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Jul 21, 2010, at 2:39 PM, Magnus Hagander wrote: > On Wed, Jul 21, 2010 at 21:37, Andrew Dunstan <andrew@dunslane.net> wrote: >> >> >> Robert Haas wrote: >>> >>> At the developer meeting, I promised to do the work of documenting how >>> committers should use git. So here's a first version. >>> >>> http://wiki.postgresql.org/wiki/Committing_with_Git >>> >>> Note that while anyone is welcome to comment, I mostly care about >>> whether the document is adequate for our existing committers, rather >>> than whether someone who is not a committer thinks we should manage >>> the project differently... that might be an interesting discussion, >>> but we're theoretically making this switch in about a month, and >>> getting agreement on changing our current workflow will take about a >>> decade, so there is not time now to do the latter before we do the >>> former. So I would ask everyone to consider postponing those >>> discussions until after we've made the switch and ironed out the >>> kinks. On the other hand, if you have technical corrections, or if >>> you have suggestions on how to do the same things better (rather than >>> suggestions on what to do differently), that would be greatly >>> appreciated. >>> >> >> Well, either we have a terminology problem or a statement of policy that I'm >> not sure I agree with, in point 2. IMNSHO, what we need to forbid is >> commits that are not fast-forward commits, i.e. that do not have the current >> branch head as an ancestor, ideally as the immediate ancestor. >> >> Personally, I have a strong opinion that for everything but totally trivial >> patches, the committer should create a short-lived work branch where all the >> work is done, and then do a squash merge back to the main branch, which is >> then pushed. This pattern is not mentioned at all. In my experience, it is >> essential, especially if you're working on more than one thing at a time, as >> many people often are. 
> > Uh, that's going to create an actual merge commit, no? Or you mean > squash-merge-but-only-fast-forward? > > I *think* the docs is based off the pattern of the committer having > two repositories - one for his own work, one for comitting, much like > I assume all of us have today in cvs. You can also do a rebase after the merge to remove the local merge commit before pushing. I tend to do this anytime I mergea local branch, just to rebase on top of the most recent origin/master. Regards, David -- David Christensen End Point Corporation david@endpoint.com
Magnus Hagander wrote: >> Personally, I have a strong opinion that for everything but totally trivial >> patches, the committer should create a short-lived work branch where all the >> work is done, and then do a squash merge back to the main branch, which is >> then pushed. This pattern is not mentioned at all. In my experience, it is >> essential, especially if you're working on more than one thing at a time, as >> many people often are. >> > > Uh, that's going to create an actual merge commit, no? Or you mean > squash-merge-but-only-fast-forward? > Yes, exactly that. Something like: git checkout -b myworkbranch ... work, test, commit, rinse, lather repeat ... git checkout RELn_m_STABLE git pull git merge --squash myworkbranch git push > I *think* the docs is based off the pattern of the committer having > two repositories - one for his own work, one for comitting, much like > I assume all of us have today in cvs. > > So then what? After you've done your work you'll still need to pull the stuff somehow into your commit tree. I don't think this will buy you a lot. I usually clone the whole CVS tree for non-trivial work, but I'm not sure that's an ideal work pattern. cheers andrew
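The work-branch pattern described above can be sketched end to end in a scratch repository. This is a sketch under assumptions: the file name, commit messages, identity, and temporary path are made up, and the "git pull" and "git push" steps from the message are left out because the sandbox has no remote. One detail worth spelling out is that "git merge --squash" only stages the combined change, so a separate "git commit" is needed before there is anything to push.

```shell
set -eu
# Scratch repository (hypothetical path and identity).
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Committer"
git config user.email "committer@example.org"
echo base > file.txt
git add file.txt
git commit -q -m "base"

# Short-lived work branch with several small commits.
git checkout -q -b myworkbranch
echo one >> file.txt
git commit -q -am "step one"
echo two >> file.txt
git commit -q -am "step two"

# Back on the main branch: squash-merge, then commit the staged result.
git checkout -q master
git merge -q --squash myworkbranch
git commit -q -m "Add feature as a single commit"

total=$(git rev-list --count HEAD)            # commits reachable from master
merges=$(git rev-list --merges --count HEAD)  # merge commits among them
echo "commits on master: $total, merge commits: $merges"
```

The work branch keeps its detailed history locally and can be deleted with "git branch -D myworkbranch" once the squashed commit has been pushed.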
On Wed, Jul 21, 2010 at 3:31 PM, Alvaro Herrera <alvherre@commandprompt.com> wrote: > Excerpts from Robert Haas's message of Wed Jul 21 15:26:47 -0400 2010: > >> > So you're working on some back branch, and make a WIP commit so you can switch to master to make a quick commit. Create a push on master. Bare git push. WIP commit gets pushed upstream. Oops. >> >> Sure, oops, but I would never do that. I'd stash it or put it on a >> topic branch. > > Somebody else will. Please remember you're writing docs that are not > for yourself. I don't have any problem suggesting it for those who may want it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
On Wed, Jul 21, 2010 at 3:37 PM, Andrew Dunstan <andrew@dunslane.net> wrote: > Well, either we have a terminology problem or a statement of policy that I'm > not sure I agree with, in point 2. IMNSHO, what we need to forbid is > commits that are not fast-forward commits, i.e. that do not have the current > branch head as an ancestor, ideally as the immediate ancestor. There are two separate questions here. One is whether an update to a ref is fast-forward or history rewriting, and the other is whether it is a merge commit or not. I don't believe that we want either history-rewriting commits or merge commits to get pushed, but this paragraph is about merge commits. > Personally, I have a strong opinion that for everything but totally trivial > patches, the committer should create a short-lived work branch where all the > work is done, and then do a squash merge back to the main branch, which is > then pushed. This pattern is not mentioned at all. In my experience, it is > essential, especially if you're working on more than one thing at a time, as > many people often are. git merge --squash doesn't create a merge commit. Indeed, the whole point is to create a commit which essentially encapsulates the same diff as a merge commit but actually isn't one. From the man page: Produce the working tree and index state as if a real merge happened (except for the merge information), but do not actually make a commit or move the HEAD, nor record $GIT_DIR/MERGE_HEAD to cause the next git commit command to create a merge commit. As for whether to discuss the use of git merge --squash, I could go either way on that. Personally, my preferred workflow is to do 'git rebase -i master' on a topic branch, squash all the commits, and then switch to the master branch and do 'git merge otherbranch', resulting in a fast-forward merge with no merge commit. But there are many other ways to do it, including 'git merge --squash' and the already-mentioned 'git commit -a'. 
I think there's a risk of this turning into a complete tutorial on git, which might detract from its primary purpose of explaining to committers how to get a basic, working setup in place. But we can certainly add whatever you think is important, or maybe some language indicating that 'git commit -a' is just an EXAMPLE of how to create a commit... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
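The rebase-then-fast-forward workflow mentioned above might look roughly like the following sketch. It assumes a scratch repository with made-up file names and identity, and it uses a plain "git rebase master" instead of "git rebase -i master", because the interactive squashing step needs an editor and cannot run unattended. The "--ff-only" flag is an addition not mentioned in the thread: it makes git refuse to proceed rather than create a merge commit.

```shell
set -eu
# Scratch repository (hypothetical path and identity).
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Committer"
git config user.email "committer@example.org"
echo base > a.txt
git add a.txt
git commit -q -m "base"

# Work happens on a topic branch.
git checkout -q -b otherbranch
echo topic > b.txt
git add b.txt
git commit -q -m "topic work"

# Meanwhile master moves ahead (standing in for someone else's push).
git checkout -q master
echo upstream > c.txt
git add c.txt
git commit -q -m "unrelated change on master"

# Replay the topic branch on top of the new master, then fast-forward master.
git checkout -q otherbranch
git rebase -q master
git checkout -q master
git merge -q --ff-only otherbranch

merges=$(git rev-list --merges --count HEAD)
echo "merge commits in history: $merges"
```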
On Wed, Jul 21, 2010 at 5:03 PM, Robert Haas <robertmhaas@gmail.com> wrote: > working setup in place. But we can certainly add whatever you think > is important, or maybe some language indicating that 'git commit -a' > is just an EXAMPLE of how to create a commit... I took a crack at this, as well as incorporating some of the other suggestions that have been made. I'm sure it's not perfect, but maybe it's an improvement... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
On Wed, Jul 21, 2010 at 9:22 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On the other hand, if you have technical corrections, or if > you have suggestions on how to do the same things better (rather than > suggestions on what to do differently), that would be greatly > appreciated. Somewhere in that wiki page there is some musing about the size of .git directories being somewhat dissatisfying when one feels compelled to have multiple check-outs materialized. There's a facility in git known as "alternates" that lets you fetch objects from some other pool. This is highly useful if you have the same project checked out many times, either for personal use or on a hosting server of some sort. Because the object pool being referenced has no knowledge of other repositories referencing it, garbage collection (should you be doing things that generate garbage, such as deleting branches and tags) of the underlying pool can cause corruption in referencing repositories in the case where they reference objects that have since been deleted. This will never happen if the repository monotonically grows, as is often the case for an 'authoritative' repository, such as the one at git.postgresql.org that only has major branches and release tags that will never go away (save the rare case of fixing up after a cvs race condition, which has occurred a few times in the past). In practice, one can just make a clean clone of a project for the purposes of such an object pool and then let it sit for months or even years, as the size of each subsequent .git, even for considerable amounts of history, is marginal. Once in a while one can perform the clean-up task of catching up the object pool, if they feel their .git directories have gotten unacceptably large. Here's a brief section about it on a git wiki: https://git.wiki.kernel.org/index.php/GitFaq#How_to_share_objects_between_existing_repositories.3F fdr
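One way to set up such an object pool is the "--reference" option of "git clone", which records the pool in the new clone's objects/info/alternates file. The sketch below is only an illustration: the "upstream" repository is a local stand-in for the real project repository, and every path and name in it is hypothetical.

```shell
set -eu
work=$(mktemp -d)

# Local stand-in for the authoritative project repository.
git init -q "$work/upstream"
cd "$work/upstream"
git config user.name "Example Committer"
git config user.email "committer@example.org"
echo hello > README
git add README
git commit -q -m "initial commit"

# A clean bare clone that serves purely as the shared object pool.
git clone -q --bare "file://$work/upstream" "$work/pool.git"

# Working clones borrow objects from the pool instead of storing their own copies.
git clone -q --reference "$work/pool.git" "file://$work/upstream" "$work/checkout-a"
git clone -q --reference "$work/pool.git" "file://$work/upstream" "$work/checkout-b"

# Each clone records where the borrowed objects live.
alternates_file="$work/checkout-a/.git/objects/info/alternates"
cat "$alternates_file"
```

As the message above warns, the pool must not have objects removed from under the clones that borrow from it, so it is best left alone apart from occasionally fetching new history into it.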
On 21/07/10 18:22, Robert Haas wrote: > At the developer meeting, I promised to do the work of documenting how > committers should use git. So here's a first version. > > http://wiki.postgresql.org/wiki/Committing_with_Git > > Note that while anyone is welcome to comment, I mostly care about > whether the document is adequate for our existing committers, rather > than whether someone who is not a committer thinks we should manage > the project differently... that might be an interesting discussion, > but we're theoretically making this switch in about a month, and > getting agreement on changing our current workflow will take about a > decade, so there is not time now to do the latter before we do the > former. So I would ask everyone to consider postponing those > discussions until after we've made the switch and ironed out the > kinks. On the other hand, if you have technical corrections, or if > you have suggestions on how to do the same things better (rather than > suggestions on what to do differently), that would be greatly > appreciated. I'm a bit disappointed that the wiki page advises against git-new-workdir - that's exactly what I was planning to use. It claims there's data loss issues with that, does someone know the details? Is there really a risk of data loss if multiple workdirs are used, in our situation? I'm planning to have only one local repository, with one workdir per branch. When applying a patch to multiple branches, I could work simultaneously on all branches, finish and commit the patches on all branches, and finally do one "git push" to push all the changes to the PostgreSQL repository in one go. I'm working like that with the internal EDB repository right now, and seems to work fine. I keep the master branch checked out in the "main" workdir, within the repository, and for each backbranch there's an extra workdir created with git-new-workdir. 
Though I've only been doing this for a month or so - if there's issues I might not have noticed yet. PS. I highly recommend always using "git push --dry-run" before the real thing, to make sure you're not doing anything funny. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
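The dry-run suggestion can be checked in a sandbox: "git push --dry-run" reports what would be sent without updating anything on the remote. The sketch below assumes a throwaway bare repository standing in for the server, with made-up paths and identity.

```shell
set -eu
# Throwaway "server" and clone (hypothetical paths and identity).
work=$(mktemp -d)
git init -q --bare "$work/origin.git"
git init -q "$work/clone"
cd "$work/clone"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Committer"
git config user.email "committer@example.org"
git remote add origin "$work/origin.git"
git commit -q --allow-empty -m "initial commit"
git push -q -u origin master

# A new local commit that has not been published yet.
git commit -q --allow-empty -m "change to review before pushing"
local_head=$(git rev-parse HEAD)

# Rehearsal: shows which refs would be updated, sends nothing.
git push --dry-run origin master
after_dry_run=$(git --git-dir="$work/origin.git" rev-parse refs/heads/master)

# The real push.
git push -q origin master
after_push=$(git --git-dir="$work/origin.git" rev-parse refs/heads/master)
```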
On Sun, Aug 1, 2010 at 5:08 AM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > I'm a bit disappointed that the wiki page advises against git-new-workdir - > that's exactly what I was planning to use. It claims there's data loss > issues with that, does someone know the details? Is there really a risk of > data loss if multiple workdirs are used, in our situation? As I understand it, there is a risk of corruption if you ever do "git gc" in the repository that the git-new-workdir was spawned from. See also Daniel Farina's email, here: http://archives.postgresql.org/pgsql-hackers/2010-07/msg01489.php It's not easy for me to mentally verify that the way I work won't cause problems with this approach, but you may feel differently, and that's fine. > PS. I highly recommend always using "git push --dry-run" before the real > thing, to make sure you're not doing anything funny. Ah, that sounds like a good idea. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
On Wed, Aug 4, 2010 at 9:29 AM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > Hmm, if I understand correctly, Daniel talks about data loss when using > "alternates", if you e.g delete a branch and run "git gc" in the parent > repository, because the child repository referring to the parent via the > alternate reference can depend on objects in the parent repository that are > no longer required by the parent repository itself. > > I guess that applies to multiple workdirs too, if you have staged but > uncommitted changes in the staging area of a workdir. This message > http://kerneltrap.org/mailarchive/git/2007/10/11/335637 agrees. Shawn O > Pearce's last paragraph says: You might want to edit your new section so that it refers to some of the earlier material rather than duplicating it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
On 04/08/10 13:32, Robert Haas wrote: > On Sun, Aug 1, 2010 at 5:08 AM, Heikki Linnakangas > <heikki.linnakangas@enterprisedb.com> wrote: >> I'm a bit disappointed that the wiki page advises against git-new-workdir - >> that's exactly what I was planning to use. It claims there's data loss >> issues with that, does someone know the details? Is there really a risk of >> data loss if multiple workdirs are used, in our situation? > > As I understand it, there is a risk of corruption if you ever do "git > gc" in the respository that the get-new-workdir was spawned from. See > also Daniel Farina's email, here: > > http://archives.postgresql.org/pgsql-hackers/2010-07/msg01489.php > > It's not easy for me to mentally verify that the way I work won't > cause problems with this approach, but you may feel differently, and > that's fine. Hmm, if I understand correctly, Daniel talks about data loss when using "alternates", if you e.g delete a branch and run "git gc" in the parent repository, because the child repository referring to the parent via the alternate reference can depend on objects in the parent repository that are no longer required by the parent repository itself. I guess that applies to multiple workdirs too, if you have staged but uncommitted changes in the staging area of a workdir. This message http://kerneltrap.org/mailarchive/git/2007/10/11/335637 agrees. Shawn O Pearce's last paragraph says: > Heh. As you can see it has some "issues" with its use. Its a very > powerful tool, but it does give you more than enough room to shoot > yourself in the foot. Using it is like tieing a gun to your ankle, > keeping it aimed at your big toe at all times, with a string tied > to your wrist and the gun's trigger. Reach too far and *bam*. > Which is why its still in contrib status. All those issues can be avoided if you only run "git gc" when all the working directories are in a clean state, with no staged but uncommitted changes or other funny things. 
I can live with that gun tied to my ankle ;-). I've added a section describing git-new-workdir the way I'm going to use it. >> PS. I highly recommend always using "git push --dry-run" before the real >> thing, to make sure you're not doing anything funny. > > Ah, that sounds like a good idea. I added a note of that to the wiki too. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On 08/04/2010 09:29 AM, Heikki Linnakangas wrote: > > All those issues can be avoided if you only run "git gc" when all the > working directories are in a clean state, with no staged but > uncommitted changes or other funny things. I can live with that gun > tied to my ankle ;-). But to make sure of that I think you need to prevent git commands from running gc automatically: git config gc.auto 0 or possibly git config --global gc.auto 0 And you'll need to make sure you run gc yourself from time to time. cheers andrew
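Spelled out as a sequence, the suggestion above is to switch off automatic garbage collection for the shared repository and then run "git gc" by hand at a moment of your choosing. The sketch below uses a scratch repository at a made-up path; in practice the setting would go into the repository that the extra workdirs share.

```shell
set -eu
# Scratch repository standing in for the shared one (hypothetical path).
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"

# Per-repository setting: git commands no longer trigger gc on their own.
git config gc.auto 0
auto_setting=$(git config --get gc.auto)
echo "gc.auto is now $auto_setting"

# Later, once every workdir is clean (nothing staged but uncommitted),
# collect garbage manually.
git gc -q
```

Using "git config --global gc.auto 0" instead applies the same setting to every repository of the current user.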
On Wed, Aug 4, 2010 at 6:29 AM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > All those issues can be avoided if you only run "git gc" when all the > working directories are in a clean state, with no staged but uncommitted > changes or other funny things. I can live with that gun tied to my ankle > ;-). Does even that open a possibility for data loss? Use of the alternates feature will, to my knowledge, never write to the referenced repository: all new objects are held in the referencers. The only condition as I understand it is not to generate garbage in the reference repository, and that nominally does not happen in a repo that exists only to be an object pool (you probably even want to use a "bare" repository instead of one with checked out files). I believe this feature is popular with hosting serving many repos of the same project. The especially paranoid may want to try setting their alternate, referenced repository to be read-only with respect to the user doing all the potentially-modifying work, undoing this if and when they feel like adding more objects to the referenced repository. My guess is one can do a clean checkout and then ride this strategy for quite a long time (a year? more? it depends on how space-conscious one is), so that would not be an incredibly onerous paranoia, if one has it. fdr
On 08/04/2010 10:08 PM, Daniel Farina wrote: > On Wed, Aug 4, 2010 at 6:29 AM, Heikki Linnakangas > <heikki.linnakangas@enterprisedb.com> wrote: >> All those issues can be avoided if you only run "git gc" when all the >> working directories are in a clean state, with no staged but uncommitted >> changes or other funny things. I can live with that gun tied to my ankle >> ;-). > Does even that open a possibility for data loss? > > Use of the alternates feature will, to my knowledge, never write the > referenced repository: all new objects are held in the referencers. > The only condition as I understand it is not to generate garbage in > the reference repository, and that nominally does not happen in a repo > that exists only to be an object pool (you probably even want to use a > "bare" repository instead of one with checked out files). > > AIUI, git-new-workdir, which Heikki is proposing to use, does not work with bare clones. cheers andrew
On 04/08/10 16:50, Andrew Dunstan wrote: > On 08/04/2010 09:29 AM, Heikki Linnakangas wrote: >> >> All those issues can be avoided if you only run "git gc" when all the >> working directories are in a clean state, with no staged but >> uncommitted changes or other funny things. I can live with that gun >> tied to my ankle ;-). > > But to make sure of that I think you need to prevent git commands from > running gc automatically: > > git config gc.auto 0 > > or possibly > > git config --global gc.auto 0 > > And you'll need to make sure you run gc yourself from time to time. Good idea. I'll add that to the wiki. I don't like the automatic garbage collection anyway, it always kicks in when I'm doing something, and I end up interrupting it anyway. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On 05/08/10 05:08, Daniel Farina wrote: > On Wed, Aug 4, 2010 at 6:29 AM, Heikki Linnakangas > <heikki.linnakangas@enterprisedb.com> wrote: >> All those issues can be avoided if you only run "git gc" when all the >> working directories are in a clean state, with no staged but uncommitted >> changes or other funny things. I can live with that gun tied to my ankle >> ;-). > > Does even that open a possibility for data loss? > > Use of the alternates feature will, to my knowledge, never write to the > referenced repository: all new objects are held in the referencers. > The only condition as I understand it is not to generate garbage in > the reference repository, and that nominally does not happen in a repo > that exists only to be an object pool (you probably even want to use a > "bare" repository instead of one with checked out files). > > I believe this feature is popular with hosting serving many repos of > the same project. > > The especially paranoid may want to try setting their alternate, > referenced repository to be read-only with respect to the user doing > all the potentially-modifying work, undoing this if and when they feel > like adding more objects to the referenced repository. My guess is one > can do a clean checkout and then ride this strategy for quite a long > time (a year? more? it depends on how space-conscious one is), so that > would not be an incredibly onerous paranoia, if one has it. We're talking about different things again. I was talking about using one shared repository with multiple workdirs created with git-new-workdir. You're talking about alternates. What you say is correct for alternates, and what I said about staged but not committed changes is correct for the multiple workdirs approach. BTW, "git gc" has a grace period, so that it won't delete any garbage newer than X days anyway. If I'm reading the git-gc man page correctly, that period is 2 weeks by default. 
That makes the possibility of accidentally deleting still-interesting staged but not committed changes quite small, even if you run "git gc" at a wrong time. You wouldn't normally have staged but not committed changes like that lying in backbranches for that long. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
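The grace period described above is governed by the gc.pruneExpire setting, whose documented default is two weeks. The sketch below tries to make the effect visible in a scratch repository: it writes a blob that nothing references (roughly what a staged but never committed change looks like to a repository that cannot see the workdir's index), runs a default "git gc", and then runs "git gc --prune=now", which skips the grace period. The path and the blob content are made up, and exactly how git stores unreachable objects in between differs across versions, so treat this as an illustration rather than a guarantee.

```shell
set -eu
# Scratch repository (hypothetical path).
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"

# Write an object that no branch, tag, or index entry refers to.
blob=$(echo "staged but never committed" | git hash-object -w --stdin)

# Default gc: unreachable objects younger than the grace period are kept.
git gc -q
if git cat-file -e "$blob" 2>/dev/null; then
  survived_default=yes
else
  survived_default=no
fi

# Explicitly dropping the grace period removes the object.
git gc -q --prune=now
if git cat-file -e "$blob" 2>/dev/null; then
  survived_prune_now=yes
else
  survived_prune_now=no
fi

echo "after default gc: $survived_default, after --prune=now: $survived_prune_now"
```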