Thread: Hacking on PostgreSQL via GIT
Hi Florian,

I am right now running an rsync of the Pg CVS repo to my work machine to get a git import underway. I'm rather keen on seeing your cool PITR Pg project go well, and I have some git+cvs fu I can apply here (being one of the git-cvsimport maintainers) ;-)

For the kind of work you'll be doing (writing patches that you'll want to be rebasing onto the latest HEAD for merging later) git is probably the best tool. That's what I use it for... tracking my experimental / custom branches of projects that use CVS or SVN :-)

Initially, I'll post it on http://git.catalyst.net.nz/ and I can run a daily import for you - once that's in place you can probably get a repo with your work on http://repo.or.cz/

cheers,
martin
-----------------------------------------------------------------------
Martin @ Catalyst .Net .NZ Ltd, PO Box 11-053, Manners St, Wellington
WEB: http://catalyst.net.nz/           PHYS: Level 2, 150-154 Willis St
OFFICE: +64(4)916-7224                 UK: 0845 868 5733 ext 7224
MOB: +64(21)364-017
Make things as simple as possible, but no simpler - Einstein
-----------------------------------------------------------------------
Martin Langhoff wrote:
> Hi Florian,
>
> I am right now running an rsync of the Pg CVS repo to my work machine to get a git import underway. I'm rather keen on seeing your cool PITR Pg project go well and I have some git+cvs fu I can apply here (being one of the git-cvsimport maintainers) ;-)
>
> For the kind of work you'll be doing (writing patches that you'll want to be rebasing onto the latest HEAD for merging later) git is probably the best tool. That's what I use it for... tracking my experimental / custom branches of projects that use CVS or SVN :-)
>
> Initially, I'll post it on http://git.catalyst.net.nz/ and I can run a daily import for you - once that's in place you can probably get a repo with your work on http://repo.or.cz/

Well, now that more than one of us is working with git on PostgreSQL...

I've had a repo conversion running for a while... I only got it to what I consider "stable" last week:
http://repo.or.cz/w/PostgreSQL.git
git://repo.or.cz/PostgreSQL.git

Note that this is a "special" conversion - I intentionally "unmunge" all the $PostgreSQL$ tags in this repo. I hate the keyword expansion; it only serves to turn otherwise automatic merging into a manual process. So I specifically go through and un-munge any keyword-like things before stomping them into GIT.

For those interested in the conversion process, I've used a slightly modified version of fromcvs (a Ruby CVS-to-git/Hg tool), and it runs on all of pgsql in about 20 minutes. I gave up on git-svn (because of both speed and my inability to easily "filter" out keywords, etc.) and git-cvsimport (because cvsps doesn't seem to like pgsql's repo).

I "update" the git repo daily, based on an anonymous rsync of the cvsroot. If the anon-rsync is updated much more frequently, and people think my git conversion should match it, I have no problem having cron run it more often than daily.

Also - note that I give *no* guarantees of its integrity, etc.
I've "diffed" a CVS checkout and a git checkout, and they are *almost* identical. Almost, because it seems like my git repository currently has 3 files that a cvs checkout doesn't:

 backend/parser/gram.c             | 12088 +++++++++++++++++++++++++++
 interfaces/ecpg/preproc/pgc.c     |  2887 ++++++
 interfaces/ecpg/preproc/preproc.c | 16988 ++++++++++++++++++++++++++++++++++

And at this point, I haven't been bothered to see where those files came from (and where they disappear) in CVS, and why my import isn't picking that up... I could probably be pushed to, if others find this repo really useful but those files problematic...
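A comparison like the one described above can be reproduced with a recursive diff that skips each tool's metadata directory. The following is a self-contained toy demo, not Aidan's actual command; the directory and file names are invented:

```shell
# Build two tiny "checkouts" that agree except for one generated file
# that exists only on the git side (mirroring the gram.c situation).
set -e
t=$(mktemp -d)
mkdir -p "$t/cvs-checkout/CVS" "$t/git-checkout/.git"
echo 'same contents' > "$t/cvs-checkout/file.c"
echo 'same contents' > "$t/git-checkout/file.c"
echo 'bison output'  > "$t/git-checkout/gram.c"   # only in the git tree

# --exclude keeps the CVS/ and .git/ metadata out of the comparison;
# diff exits 1 when the trees differ, hence the "|| true".
diff -r --brief --exclude=CVS --exclude=.git \
    "$t/cvs-checkout" "$t/git-checkout" || true
```

With `--brief`, identical files produce no output, so the only line reported is the `gram.c` file present only in the git-side tree.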
* Aidan Van Dyk <aidan@highrise.ca> [070416 14:08]:
> Note that this is a "special" conversion - I intentionally "unmunge" all the $PostgreSQL$ tags in this repo.

Blah - and I just noticed that I actually "missed" the $PostgreSQL$ ones (although I did catch the Date/Modified/From/etc)...

> I hate the keyword expansion; it only serves to turn otherwise automatic merging into a manual process. So I specifically go through and un-munge any keyword-like things before stomping them into GIT.

Expect it to change once more in the next little while ;-)

a.
--
Aidan Van Dyk                                             Create like a god,
aidan@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.
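The unexpansion Aidan describes can be sketched as a small stream filter applied to every file before it goes into git. This is only an illustration; the keyword list and the function name are assumptions, not his actual conversion code:

```shell
# Collapse expanded RCS/CVS keywords such as
#   $PostgreSQL: pgsql/src/backend/parser/gram.y,v 2.5 2007/04/16 ... Exp $
# back to their bare form ($PostgreSQL$) so merges don't conflict on them.
# The keyword list here is illustrative, not exhaustive.
unexpand_keywords() {
  sed -E 's/\$(PostgreSQL|Id|Header|Date|Author|Revision|Source|RCSfile)(:[^$]*)?\$/$\1$/g'
}

echo '$Id: gram.c,v 1.2 2000/01/26 05:58:25 momjian Exp $' | unexpand_keywords
# -> $Id$
```

The filter is idempotent: an already-bare `$Id$` matches the pattern with an empty second group and is rewritten to itself, so it is safe to run on every import.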
Martin Langhoff wrote:
> Hi Florian,
>
> I am right now running an rsync of the Pg CVS repo to my work machine to get a git import underway. I'm rather keen on seeing your cool PITR Pg project go well and I have some git+cvs fu I can apply here (being one of the git-cvsimport maintainers) ;-)

Cool - I'm new to git, so I really appreciate any help that I can get.

> For the kind of work you'll be doing (writing patches that you'll want to be rebasing onto the latest HEAD for merging later) git is probably the best tool. That's what I use it for... tracking my experimental / custom branches of projects that use CVS or SVN :-)

That's how I figured I'd work - though I don't yet understand what the advantage of "rebase" is over "merge".

Currently, I've set up a git repo that pulls in the changes from the SVN repo, and pushes them to my main SoC git repo. On that main repo I have two branches, master and pgsql-head, and I call "cg-merge pgsql-head" if I want to merge with CVS HEAD.

> Initially, I'll post it on http://git.catalyst.net.nz/ and I can run a daily import for you - once that's in place you can probably get a repo with your work on http://repo.or.cz/

Having a git mirror of the pgsql CVS would be great. BTW, I've just checked out repo.or.cz, and noticed that there is already a git mirror of the pgsql CVS: http://repo.or.cz/w/PostgreSQL.git

greetings + thanks,
Florian Pflug
Aidan Van Dyk wrote:
> Well, now that more than one of us is working with git on PostgreSQL...
>
> I've had a repo conversion running for a while... I only got it to what I consider "stable" last week:
> http://repo.or.cz/w/PostgreSQL.git
> git://repo.or.cz/PostgreSQL.git

Ah - that's what I just stumbled over ;-)

> For those interested in the conversion process, I've used a slightly modified version of fromcvs (a Ruby CVS-to-git/Hg tool), and it runs on all of pgsql in about 20 minutes.
>
> I gave up on git-svn (because of both speed and my inability to easily "filter" out keywords, etc.) and git-cvsimport (because cvsps doesn't seem to like pgsql's repo)

Yeah, git-cvsimport didn't work for me either...

> I "update" the git repo daily, based on an anonymous rsync of the cvsroot. If the anon-rsync is updated much more frequently, and people think my git conversion should match it, I have no problem having cron run it more often than daily.
>
> Also - note that I give *no* guarantees of its integrity, etc.
>
> I've "diffed" a CVS checkout and a git checkout, and they are *almost* identical. Almost, because it seems like my git repository currently has 3 files that a cvs checkout doesn't:
>  backend/parser/gram.c             | 12088 +++++++++++++++++++++++++++
>  interfaces/ecpg/preproc/pgc.c     |  2887 ++++++
>  interfaces/ecpg/preproc/preproc.c | 16988 ++++++++++++++++++++++++++++++++++
>
> And at this point, I haven't been bothered to see where those files came from (and where they disappear) in CVS and why my import isn't picking that up... I could probably be pushed to, if others find this repo really useful but those files problematic...

That's interesting - the SVN mirror of the pgsql CVS at http://projects.commandprompt.com/public/pgsql/browser has exactly the same problem with those 3 files, as I found out the hard way ;-)

In the case of pgc.c, I've compared the revision in CVS with the one in SVN.
SVN includes the CVS version 1.5 of this file in trunk, which seems to be the last version of that file in CVS HEAD. Interestingly, http://developer.postgresql.org/cvsweb.cgi/pgsql/src/interfaces/ecpg/preproc/Attic/pgc.c shows no trace of the file being deleted from HEAD either - it just shows that it was removed from WIN32_DEV. But still, a CVS checkout doesn't include that file...

Since three tools (cvsweb, git-cvsimport, and whatever commandprompt uses to create the SVN mirror) all come to the same conclusion regarding this file, I think this is caused by some corruption of the CVS repository - but I don't have the cvs-fu to debug this...

greetings,
Florian Pflug
Aidan Van Dyk wrote:
> I've "diffed" a CVS checkout and a git checkout, and they are *almost* identical. Almost, because it seems like my git repository currently has 3 files that a cvs checkout doesn't:
>  backend/parser/gram.c             | 12088 +++++++++++++++++++++++++++
>  interfaces/ecpg/preproc/pgc.c     |  2887 ++++++
>  interfaces/ecpg/preproc/preproc.c | 16988 ++++++++++++++++++++++++++++++++++
>
> And at this point, I haven't been bothered to see where those files came from (and where they disappear) in CVS and why my import isn't picking that up... I could probably be pushed to, if others find this repo really useful but those files problematic...

These files are generated (from gram.y, pgc.l and preproc.y respectively) and are not present in the CVS repo, though I think they have been at some point.

It's strange that other generated files (that have also been in the repo in the past), like preproc.h, are not showing up.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes:
> These files are generated (from gram.y, pgc.l and preproc.y respectively) and are not present in the CVS repo, though I think they have been at some point.
> It's strange that other generated files (that have also been in the repo in the past) like preproc.h are not showing up.

The weird thing about these files is that the CVS history shows commits on HEAD later than the file-removal commit. I don't recall whether Vadim unintentionally re-added the files before making those commits ... but if he did, you'd think it'd have taken another explicit removal to get rid of them in HEAD. More likely, there was some problem in his local tree that allowed a "cvs commit" to think it should update the repository with copies of the derived files he happened to have.

I think this is a corner case that CVS handles in a particular way and the tools people are using to read the repository handle in a different way. Which would be a bug in those tools, since CVS's interpretation must be right by definition.

regards, tom lane
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> These files are generated (from gram.y, pgc.l and preproc.y respectively) and are not present in the CVS repo, though I think they have been at some point.
>> It's strange that other generated files (that have also been in the repo in the past) like preproc.h are not showing up.
>
> The weird thing about these files is that the CVS history shows commits on HEAD later than the file-removal commit. I don't recall whether Vadim unintentionally re-added the files before making those commits ... but if he did, you'd think it'd have taken another explicit removal to get rid of them in HEAD. More likely, there was some problem in his local tree that allowed a "cvs commit" to think it should update the repository with copies of the derived files he happened to have.
>
> I think this is a corner case that CVS handles in a particular way and the tools people are using to read the repository handle in a different way. Which would be a bug in those tools, since CVS's interpretation must be right by definition.

The question is whether it'd be acceptable to manually remove that last commit from the repository. I guess simply re-adding and then removing the files again should do the trick, though it'd be cleaner to remove the offending commit in the first place. Should postgres ever decide to switch to another version control system (which I don't advocate), that'd be one obstacle less to deal with...

Or is the risk of causing breakage too high?

greetings,
Florian Pflug
fgp@phlo.org ("Florian G. Pflug") writes:
> Martin Langhoff wrote:
>> Hi Florian,
>> I am right now running an rsync of the Pg CVS repo to my work machine to get a git import underway. I'm rather keen on seeing your cool PITR Pg project go well and I have some git+cvs fu I can apply here (being one of the git-cvsimport maintainers) ;-)
> Cool - I'm new to git, so I really appreciate any help that I can get.
>
>> For the kind of work you'll be doing (writing patches that you'll want to be rebasing onto the latest HEAD for merging later) git is probably the best tool. That's what I use it for... tracking my experimental / custom branches of projects that use CVS or SVN :-)
> That's how I figured I'd work - though I don't yet understand what the advantage of "rebase" is over "merge".
>
> Currently, I've set up a git repo that pulls in the changes from the SVN repo, and pushes them to my main SoC git repo. On that main repo I have two branches, master and pgsql-head, and I call "cg-merge pgsql-head" if I want to merge with CVS HEAD.
>
>> Initially, I'll post it on http://git.catalyst.net.nz/ and I can run a daily import for you - once that's in place you can probably get a repo with your work on http://repo.or.cz/
> Having a git mirror of the pgsql CVS would be great.
> BTW, I've just checked out repo.or.cz, and noticed that there is already a git mirror of the pgsql CVS: http://repo.or.cz/w/PostgreSQL.git

This strikes me as being a really super thing, having both Subversion and Git repositories publicly available that are tracking the PostgreSQL sources.

Stepping back to the SCM discussion, people were interested in finding out what merits there were in having these sorts of SCMs, and in finding out what glitches people might discover (e.g. the files where the CVS repository is a bit schizophrenic as to whether they are still there or not...). Having these repositories should allow some of this experimentation to take place now.
I'd be interested in fiddling with a Git repository at some point; I'll happily wait a bit to start drawing from one of these existing ones, to let the dust settle and to let things stabilize a bit.

--
(reverse (concatenate 'string "moc.enworbbc" "@" "enworbbc"))
http://linuxdatabases.info/info/emacs.html
"Support your local medical examiner - die strangely." -- Blake Bowers
aidan@highrise.ca (Aidan Van Dyk) writes:
> I've "diffed" a CVS checkout and a git checkout, and they are *almost* identical. Almost, because it seems like my git repository currently has 3 files that a cvs checkout doesn't:
>  backend/parser/gram.c             | 12088 +++++++++++++++++++++++++++
>  interfaces/ecpg/preproc/pgc.c     |  2887 ++++++
>  interfaces/ecpg/preproc/preproc.c | 16988 ++++++++++++++++++++++++++++++++++
>
> And at this point, I haven't been bothered to see where those files came from (and where they disappear) in CVS and why my import isn't picking that up... I could probably be pushed to, if others find this repo really useful but those files problematic...

Those three files are normally generated by either flex or bison (gram.c depends on gram.y, pgc.c on pgc.l, and preproc.c on preproc.y); I'd suggest removing those three files from your git repository.

--
"cbbrowne","@","acm.org"  http://cbbrowne.com/info/rdbms.html
"They laughed at Columbus, they laughed at Fulton, they laughed at the Wright brothers. But they also laughed at Bozo the Clown." -- Carl Sagan
* Florian G. Pflug <fgp@phlo.org> [070416 16:16]:
> > I think this is a corner case that CVS handles in a particular way and the tools people are using to read the repository handle in a different way. Which would be a bug in those tools, since CVS's interpretation must be right by definition.

;-)

Would anyone know if these were "hand moved" to the Attic? For instance, I *can't* seem to get non-dead files into the Attic, no matter what I try with my cvs (on Debian). But I haven't gone through the last 8 years of CVS's CVS logs to see if they fixed a bug in the cvs server code that would allow a non-dead HEAD rcs file to be in the Attic...

> The question is whether it'd be acceptable to manually remove that last commit from the repository. I guess simply re-adding and then removing the files again should do the trick, though it'd be cleaner to remove the offending commit in the first place. Should postgres ever decide to switch to another version control system (which I don't advocate), that'd be one obstacle less to deal with...
>
> Or is the risk of causing breakage too high?

Well, I've "hand fixed" this in my conversion process, so my git conversion should not have this problem... I'm not a fan of mucking around by hand in CVS. It's only because of the shortcomings of CVS that it's ever necessary to resort to that. So I don't think re-adding/deleting them is worth it...

I've updated the repo.or.cz/PostgreSQL.git again - and this time it should be pretty good. Consider it "usable" to clone off and follow CVS development with... I won't re-convert the whole thing again, and will just provide daily updates to it now. Unless anybody finds issues with it...

Ignore the "public" branch in there - that got in via an errant push, and I don't know how to remove branches on repo.or.cz. I'm now just putting "conversion notes" up in the public branch... It's *not* a PostgreSQL branch.

a.
Aidan Van Dyk <aidan@highrise.ca> writes:
> Would anyone know if these were "hand moved" to Attic?

Seems unlikely, since there's a commit log entry for the removal. But this all happened seven-plus years ago, and I'm sure there's an old CVS bug involved *somewhere*.

I like the idea of re-adding and then re-removing the files on HEAD. Does anyone think that poses any real risk?

regards, tom lane
* Tom Lane <tgl@sss.pgh.pa.us> [070416 19:03]:
> Aidan Van Dyk <aidan@highrise.ca> writes:
> > Would anyone know if these were "hand moved" to Attic?
>
> Seems unlikely, since there's a commit log entry for the removal. But this all happened seven-plus years ago and I'm sure there's an old CVS bug involved *somewhere*.
>
> I like the idea of re-adding and then re-removing the files on HEAD. Does anyone think that poses any real risk?

No - it even fixed the "hand moved" test case I had set up when trying to figure out how they got that way in the first place...

What I did when I converted the repo was just hand-edit those files to have a state of "dead", to match their position in the Attic for those RCS revs. If you "add" them and remove them, I believe my GIT conversion will actually "follow" that correctly... If not, I'll just rm -Rf it and let it go from scratch "one more time"... I'm glad computers are good at that type of repetitive task...

a.
Aidan Van Dyk <aidan@highrise.ca> writes:
> * Tom Lane <tgl@sss.pgh.pa.us> [070416 19:03]:
>> I like the idea of re-adding and then re-removing the files on HEAD. Does anyone think that poses any real risk?

> No - it even fixed the "hand moved" test I had done trying to create an Attic with, when trying to figure out how they got that way in the first place...

Well, it doesn't work :-(. CVS is definitely a bit confused about the status of these files:

$ touch gram.c
$ cvs add gram.c
cvs add: gram.c added independently by second party
$ cvs remove gram.c
cvs remove: file `gram.c' still in working directory
cvs remove: 1 file exists; remove it first
$ rm gram.c
rm: remove regular empty file `gram.c'? y
$ cvs remove gram.c
cvs remove: nothing known about `gram.c'

So there's no way, apparently, to fix the state of these files through the "front door". Shall we try the proposed idea of hand-moving the files out of the Attic subdirectory, whereupon they should appear live and we can cvs remove them again? I have login on cvs.postgresql.org and can try this, but I'd like confirmation from someone that this is unlikely to break things. Is there any hidden state to be fixed in the CVS repository? I don't see any ...

regards, tom lane
I wrote:
> So there's no way, apparently, to fix the state of these files through the "front door".

I take that back: the right sequence involving a "cvs update" got me into a state where it thought the files were "locally modified", and then I could commit and "cvs remove" and commit again. So hopefully it's all cleaned up now --- at least the states of the files look reasonable in cvsweb.

regards, tom lane
Tom Lane wrote:
> Aidan Van Dyk <aidan@highrise.ca> writes:
>> * Tom Lane <tgl@sss.pgh.pa.us> [070416 19:03]:
>>> I like the idea of re-adding and then re-removing the files on HEAD. Does anyone think that poses any real risk?
>
>> No - it even fixed the "hand moved" test I had done trying to create an Attic with, when trying to figure out how they got that way in the first place...
>
> Well, it doesn't work :-(. CVS is definitely a bit confused about the status of these files:
>
> $ touch gram.c
> $ cvs add gram.c
> cvs add: gram.c added independently by second party
> $ cvs remove gram.c
> cvs remove: file `gram.c' still in working directory
> cvs remove: 1 file exists; remove it first
> $ rm gram.c
> rm: remove regular empty file `gram.c'? y
> $ cvs remove gram.c
> cvs remove: nothing known about `gram.c'
>
> So there's no way, apparently, to fix the state of these files through the "front door". Shall we try the proposed idea of hand-moving the files out of the Attic subdirectory, whereupon they should appear live and we can cvs remove them again? I have login on cvs.postgresql.org and can try this, but I'd like confirmation from someone that this is unlikely to break things. Is there any hidden state to be fixed in the CVS repository? I don't see any ...

Forgive my caution, but I'd suggest trying on a copy first.

cheers

andrew
Andrew Dunstan wrote:
> Tom Lane wrote:
>> So there's no way, apparently, to fix the state of these files through the "front door". Shall we try the proposed idea of hand-moving the files out of the Attic subdirectory, whereupon they should appear live and we can cvs remove them again? I have login on cvs.postgresql.org and can try this, but I'd like confirmation from someone that this is unlikely to break things. Is there any hidden state to be fixed in the CVS repository? I don't see any ...
>
> Forgive my caution, but I'd suggest trying on a copy first.

Too late ;-)

FWIW my CVSup copy seems happy with the change; it reported this when I updated it:

$ pgcvsup
Connected to cvsup.postgresql.org
Updating collection repository/cvs
 Edit pgsql/src/backend/parser/gram.c,v -> Attic
 Edit pgsql/src/backend/utils/mb/encnames.c,v
 Edit pgsql/src/bin/pg_dump/pg_dump.c,v
 Edit pgsql/src/bin/psql/common.c,v
 Edit pgsql/src/include/pg_config.h.win32,v
 Edit pgsql/src/interfaces/ecpg/preproc/pgc.c,v -> Attic
 Edit pgsql/src/interfaces/ecpg/preproc/preproc.c,v -> Attic
 Edit pgsql/src/tools/msvc/Solution.pm,v
Rsync sup/repository/checkouts.cvs
Finished successfully

The gram.c,v file looks good -- it has the expected "state dead;" line. A checked-out tree from that updates fine. A "cvs update" to a checked-out tree direct from the main CVS server also updates fine.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
* Tom Lane <tgl@sss.pgh.pa.us> [070416 21:11]:
> I wrote:
> > So there's no way, apparently, to fix the state of these files through the "front door".
>
> I take that back: the right sequence involving a "cvs update" got me into a state where it thought the files were "locally modified", and then I could commit and "cvs remove" and commit again. So hopefully it's all cleaned up now --- at least the states of the files look reasonable in cvsweb.

And my GIT conversion handled that nicely too. Looks good (at least from the GIT PoV).

Now, on my hand-crafted GIT repo you see them come and go with Tom's commits. But any *real* conversion tracking the *actual* RCS cvs states should have them checked out from 1999 to now, in the state they were in after Vadim's last changes; Tom's first commit will "truncate" them (because he checked them in as empty files), and the second commit will remove them again.

So it's still a "gotcha" if you're trying to get a copy of CVS from ages ago via one of the alternative SCM conversions... But my git one works, so I'll let others worry about the others ;-)

a.
Aidan Van Dyk <aidan@highrise.ca> writes:
> Now, on my hand-crafted GIT repo you see them come and go with Tom's commits. But any *real* conversion tracking the *actual* RCS cvs states should have them checked out from 1999 to now, in the state they were in after Vadim's last changes; Tom's first commit will "truncate" them (because he checked them in as empty files), and the second commit will remove them again.
> So it's still a "gotcha" if you're trying to get a copy of CVS from ages ago via one of the alternative SCM conversions...

It shouldn't be a big problem, assuming the checkout preserves the file dates --- they'll look older than the source files, and so a rebuild will happen anyway in such a checkout.

regards, tom lane
Tom Lane wrote:
> It shouldn't be a big problem, assuming the checkout preserves the file dates --- they'll look older than the source files, and so a rebuild will happen anyway in such a checkout.

Actually, this is a problem with at least SVN. An "svn export" will create files with the original repository dates, but an "svn checkout" will use the current time unless you enable a config option for your local svn client.

Kris Jurka
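The client-side option Kris mentions is, presumably, Subversion's `use-commit-times` setting in the per-user runtime configuration:

```ini
# ~/.subversion/config  (per-user Subversion client configuration)
[miscellany]
# Make "svn checkout" / "svn update" set each file's mtime to its
# last-commit time instead of the time of the checkout.
use-commit-times = yes
```

With this enabled, a fresh checkout gets commit timestamps, so the stale generated files would again look older than their sources and trigger a rebuild.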
Chris Browne wrote:
> This strikes me as being a really super thing, having both Subversion and Git repositories publicly available that are tracking the PostgreSQL sources.
>
> Stepping back to the SCM discussion, people were interested in finding out what merits there were in having these sorts of SCMs, and in finding out what glitches people might discover (e.g. the files where the CVS repository is a bit schizophrenic as to whether they are still there or not...). Having these repositories should allow some of this experimentation to take place now.

Yep. It'd be nice to have official GIT and SVN etc. mirrors of the main CVS repository. There's no pressing reason for the PostgreSQL project to switch from CVS, but we could provide alternatives to developers. As long as you can create a diff to send to pgsql-patches, it doesn't matter which version control system you use.

I'm interested in trying GIT or Monotone myself; presumably they would be good for managing unapplied, work-in-progress patches.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Florian G. Pflug wrote:
>> Initially, I'll post it on http://git.catalyst.net.nz/ and I can run a daily import for you - once that's in place you can probably get a repo with your work on http://repo.or.cz/

Ok - you can now clone from http://git.catalyst.net.nz/postgresql.git - viewable from http://git.catalyst.net.nz/gitweb too. It's 24 hrs behind, and I'm sorting out the updating scripts that will run daily.

The HEAD of CVS is renamed to cvshead there. All the other branches and tags are untouched. Please DO check that the tip of cvshead matches a CVS checkout with -kk. I've had limited time to sanity-check the import ;-)

cheers,
m
Florian G. Pflug wrote:
> Cool - I'm new to git, so I really appreciate any help that I can get.

Great - I am a SoC mentor for 2 other projects (git and moodle) so I've got some time set aside for SoC stuff. You might as well take advantage of it :-)

>> For the kind of work you'll be doing (writing patches that you'll want to be rebasing onto the latest HEAD for merging later) git is probably the best tool. That's what I use it for... tracking my experimental / custom branches of projects that use CVS or SVN :-)
>
> That's how I figured I'd work - though I don't yet understand what the advantage of "rebase" is over "merge".

Probably during your development cycle you'll want to merge the changes from cvshead into your dev branch - that's what you seem to be doing. Great. Later, when you are getting things ready for actual merging into CVS, you'll want to prepare a series of patches that apply on top of cvshead. That's where the rebase tools become useful.

> Currently, I've set up a git repo that pulls in the changes from the SVN repo, and pushes them to my main SoC git repo. On that main repo I have two branches, master and pgsql-head, and I call "cg-merge pgsql-head" if I want to merge with CVS HEAD.

You are doing the right thing. If possible, I'd suggest that you use git instead of cogito. Recent git is as user-friendly as cogito. The main difference is that you'll need to learn a bit about the index, and that'll be useful.

>> Initially, I'll post it on http://git.catalyst.net.nz/ and I can run a daily import for you - once that's in place you can probably get a repo with your work on http://repo.or.cz/
>
> Having a git mirror of the pgsql CVS would be great.
> BTW, I've just checked out repo.or.cz, and noticed that there is already a git mirror of the pgsql CVS: http://repo.or.cz/w/PostgreSQL.git

Yes, I've seen it, but I don't know the guy. I can ensure you have a CVS->GIT gateway updated daily or twice daily.
cheers,
martin
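The merge-vs-rebase distinction Martin describes can be seen end-to-end in a throwaway repository. Branch names and file contents below are invented for the demo; this is a sketch of the flow, not Florian's actual setup:

```shell
# Tiny demo: develop on a branch, upstream (cvshead) moves on, then
# rebase the work and export it as a patch series for submission.
set -e
cd "$(mktemp -d)"
git init -q repo && cd repo
git config user.email demo@example.org
git config user.name  "demo"

echo base > file.c
git add file.c && git commit -qm "import of CVS HEAD"
git branch cvshead                   # stands in for the CVS mirror branch

git checkout -qb my-soc-work         # development branch
echo feature >> file.c
git commit -qam "my feature"

git checkout -q cvshead              # meanwhile, upstream commits...
echo upstream > other.c
git add other.c && git commit -qm "upstream change"

git checkout -q my-soc-work
git rebase -q cvshead                # replay "my feature" onto new cvshead
git format-patch -o patches cvshead  # one patch file per commit
```

After the rebase, the branch history is a clean linear series on top of cvshead, so `git format-patch` produces exactly the patches to submit; a merge would instead have left a merge commit that doesn't export as a patch.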
* Martin Langhoff <martin@catalyst.net.nz> [070417 17:32]:
> > Having a git mirror of the pgsql CVS would be great.
> > BTW, I've just checked out repo.or.cz, and noticed that there is already a git mirror of the pgsql CVS: http://repo.or.cz/w/PostgreSQL.git
>
> Yes, I've seen it, but I don't know the guy. I can ensure you have a CVS->GIT gateway updated daily or twice daily.

I'm an unknown here, I know - I've used PostgreSQL for years, but only recently started following the development community... And at this point I'm still pretty much just following, hence my interest in getting a GIT repo of PostgreSQL. GIT is *very* helpful on a "new" code-base.

I have my CVS->GIT conversion running hourly from the anon-rsync of the cvsroot. I don't know the specifics of the PostgreSQL rsync/mirror setup, so I may be pulling it more frequently than it's actually published, but until I hear from someone that tells me I'm taxing too many rsync resources, I'll just leave it that way... The CVS->GIT conversion is cheap - it's the rsync that takes most of the time... I can run it more frequently if people think it would be valuable and the rsync admins don't care...

And remember the warning I gave that my conversion is *not* a direct CVS import - I intentionally *unexpand* all keywords before stuffing them into GIT so that merging and branching can ignore all the keyword conflicts...

a.
Aidan Van Dyk <aidan@highrise.ca> writes: > I have my CVS->GIT conversion running hourly from the anon-rsync of the > cvsroot. I don't know the specifics of the PostgreSQL rsync/mirror > setup, so I may be pulling it more frequently than it's actually > published, but until I hear from someone that tells me I'm taxing to > many rsync resources, I'll just leave it that way... The anoncvs mirror updates once an hour, so you're fine. regards, tom lane
Martin Langhoff wrote: > Aidan Van Dyk wrote: >> And remember the warning I gave that my conversion is *not* a direct CVS >> import - I intentionally *unexpand* all Keywords before stuffing them >> into GIT so that merging and branching can ignore all the Keyword >> conflicts... > > My import is unexpanding those as well to support rebasing and merging > better. > > So - if you are committed to providing your gateway long term to > Florian, I'm happy to drop my gateway in favour of yours. There seem to be other people than me who are interested in a git mirror. Maybe we could declare one of those mirrors the "official" one - I guess things would be easier if all people interested in using git used the same mirror... What do you guys think? > (Florian, before basing your code on either you should get a checkout of > Aidan's and mine and check that the tips of the branches you are working > on match the cvs branches -- the cvsimport code is good but wherever > CVS is involved, there's a lot of interpretation at play, a sanity check > is always good). I actually hoped that I could just take my current git repo, and rebase my branch onto one of those two repos - or does rebase only work from an ancestor to a descendant? greetings, Florian Pflug
Aidan Van Dyk wrote: > I'm an unknown here, I know - I've used PostgreSQL for years, but only > recently started following the development community... And at this I'm probably unknown here as well. Hi everyone ;-) > And remember the warning I gave that my conversion is *not* a direct CVS > import - I intentionally *unexpand* all Keywords before stuffing them > into GIT so that merging and branching can ignore all the Keyword > conflicts... My import is unexpanding those as well to support rebasing and merging better. So - if you are committed to providing your gateway long term to Florian, I'm happy to drop my gateway in favour of yours. (Florian, before basing your code on either you should get a checkout of Aidan's and mine and check that the tips of the branches you are working on match the cvs branches -- the cvsimport code is good but wherever CVS is involved, there's a lot of interpretation at play, a sanity check is always good). cheers, m -- ----------------------------------------------------------------------- Martin @ Catalyst .Net .NZ Ltd, PO Box 11-053, Manners St, Wellington WEB: http://catalyst.net.nz/ PHYS: Level 2, 150-154 Willis St OFFICE: +64(4)916-7224 UK: 0845 868 5733 ext 7224 MOB: +64(21)364-017 Make things as simple as possible, but no simpler- Einstein -----------------------------------------------------------------------
* Florian G. Pflug <fgp@phlo.org> [070417 20:30]: > >So - if you are committed to providing your gateway long term to > >Florian, I'm happy to drop my gateway in favour of yours. > > There seem to be other people than me who are interested in a git > mirror. Maybe we could declare one of those mirrors the > "official" one - I guess things would be easier if all people > interested in using git would use the same mirror... > > What do you guys think? I'll provide that gateway as long as I have access to hardware and connectivity that can keep up with PostgreSQL CVS... Of course, the beauty of a DVCS is that we don't really need an official one... And with GIT, you can even "graft" history in if you want. So you could even "start" your GIT work from a cvs checkout of whenever, and "graft" any commit from any of the CVS->GIT conversion history as a parent to your starting point. a. -- Aidan Van Dyk Create like a god, aidan@highrise.ca command like a king, http://www.highrise.ca/ work like a slave.
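The grafting Aidan mentions works through the `.git/info/grafts` file, whose lines have the form `<commit> <parent>...`. A self-contained sketch (all names and paths are illustrative; `git checkout --orphan` is newer than the git of this thread, and modern git prefers `git replace --graft`, but the grafts file is still honored):

```shell
set -e
# Throwaway repo with two unrelated histories, joined via a graft.
rm -rf /tmp/graft-demo && mkdir /tmp/graft-demo && cd /tmp/graft-demo
git init -q . && git config user.email you@example.org && git config user.name you
echo one > f && git add f && git commit -qm "tip of the CVS->GIT conversion"
import_tip=$(git rev-parse HEAD)
# An unrelated root commit, as if started from a plain CVS checkout:
git checkout -q --orphan mywork
echo two > f && git add f && git commit -qm "my first patch"
work_root=$(git rev-parse HEAD)
# Graft: declare the imported tip to be the parent of our root commit.
echo "$work_root $import_tip" >> .git/info/grafts
git rev-list --count HEAD   # now 2: history runs through the graft
```

Because all history traversals respect the graft, log, merge, and rebase behave as if the CVS->GIT conversion had always been the ancestor of your work.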
Martin Langhoff <martin@catalyst.net.nz> writes: > Aidan Van Dyk wrote: >> And remember the warning I gave that my conversion is *not* a direct CVS >> import - I intentionally *unexpand* all Keywords before stuffing them >> into GIT so that merging and branching can ignore all the Keyword >> conflicts... > My import is unexpanding those as well to support rebasing and merging > better. Um ... why do either of you feel there's an issue there? We switched over to $PostgreSQL$ a few years ago specifically to avoid creating merge problems for downstream repositories. If there are any other keyword expansions left in the source text I'd vote to remove them. If you have a problem with $PostgreSQL$, why? regards, tom lane
* Tom Lane <tgl@sss.pgh.pa.us> [070418 01:33]: > Um ... why do either of you feel there's an issue there? > > We switched over to $PostgreSQL$ a few years ago specifically to avoid > creating merge problems for downstream repositories. If there are any > other keyword expansions left in the source text I'd vote to remove > them. If you have a problem with $PostgreSQL$, why? Mine is only a generic warning. I convert many CVS repos to GIT, all using the same gateway setup, so I haven't done anything "specific" for PostgreSQL. Most other projects are not as disciplined as PostgreSQL, and I regularly see Modified, Date, Id, Log, etc. keywords, as well as project-specific ones like PostgreSQL, OpenBSD, FreeBSD, etc... Un-expansion *may* not be perfect... a. -- Aidan Van Dyk Create like a god, aidan@highrise.ca command like a king, http://www.highrise.ca/ work like a slave.
Tom Lane wrote: > Martin Langhoff <martin@catalyst.net.nz> writes: > > Aidan Van Dyk wrote: > >> And remember the warning I gave that my conversion is *not* a direct CVS > >> import - I intentionally *unexpand* all Keywords before stuffing them > >> into GIT so that merging and branching can ignore all the Keyword > >> conflicts... > > > My import is unexpanding those as well to support rebasing and merging > > better. > > Um ... why do either of you feel there's an issue there? > > We switched over to $PostgreSQL$ a few years ago specifically to avoid > creating merge problems for downstream repositories. If there are any > other keyword expansions left in the source text I'd vote to remove > them. If you have a problem with $PostgreSQL$, why? One weird thing I noticed some time ago is that we have an $Id$ (or was it $Header$? I don't remember) somewhere, which was supposed to be from the upstream repo where we got the file from, but it was being expanded to our local version of the file. We _also_ have the $PostgreSQL$ tag in there, which carries the same info. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Tom Lane wrote: > Um ... why do either of you feel there's an issue there? > > We switched over to $PostgreSQL$ a few years ago specifically to avoid > creating merge problems for downstream repositories. If there are any > other keyword expansions left in the source text I'd vote to remove > them. If you have a problem with $PostgreSQL$, why? I have to accept the blame for not researching the repo in the first place. I didn't know about $PostgreSQL$ - from the looks of it, it acts _just_ like $Id$. So I guess you use PostgreSQL instead of Id. As GIT won't touch them, Florian will probably be just fine with his patches, and I doubt they'll be more than a minor annoyance, if at all. Keyword expansions are generally bad because SCM tools should track _content_ - and keyword expansions _modify_ it to add metadata that is somewhat redundant, obtainable in other ways, and should just not be in the middle of the _data_. Those modifications lead to patches that have bogus hunks and sometimes don't apply, MD5/SHA1 checksums that don't match and a whole lot of uncertainty. You can't just say "the content is the same" by comparing bytes or SHA1 digests if the committer, the path or the history are different. And it is a mighty important ability for an SCM. The argument runs much longer than that - and the flamewars are quite entertaining. If anyone's keen we're having one right now on git@vger.kernel.org. I am sure Pg hackers will find parallels between keyword expansion (as a misfeature everyone is used to) and the SQL travesties that early MySQL is famous for. I've picked my poison... ran away from MySQL to Pg, and from CVS/SVN/Arch to GIT. 
Not looking back :-) cheers m -- ----------------------------------------------------------------------- Martin @ Catalyst .Net .NZ Ltd, PO Box 11-053, Manners St, Wellington WEB: http://catalyst.net.nz/ PHYS: Level 2, 150-154 Willis St OFFICE: +64(4)916-7224 UK: 0845 868 5733 ext 7224 MOB: +64(21)364-017 Make things as simple as possible, but no simpler- Einstein -----------------------------------------------------------------------
On Wed, Apr 18, 2007 at 06:39:34PM +1200, Martin Langhoff wrote: > Keyword expansions are generally bad because SCM tools should track > _content_ - and keyword expansions _modify_ it to add metadata that is > somewhat redundant, obtainable in other ways, and should just not be in > the middle of the _data_. Those modifications lead to patches that have > bogus hunks and sometimes don't apply, MD5/SHA1 checksums that don't > match and a whole lot of uncertainty. Then how do you tell what version a file is if it's outside of a checkout? -- Jim Nasby jim@nasby.net EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
* Jim C. Nasby <jim@nasby.net> [070418 14:39]: > On Wed, Apr 18, 2007 at 06:39:34PM +1200, Martin Langhoff wrote: > > Keyword expansions are generally bad because SCM tools should track > > _content_ - and keyword expansions _modify_ it to add metadata that is > > somewhat redundant, obtainable in other ways, and should just not be in > > the middle of the _data_. Those modifications lead to patches that have > > bogus hunks and sometimes don't apply, MD5/SHA1 checksums that don't > > match and a whole lot of uncertainty. > > Then how do you tell what version a file is if it's outside of a > checkout? That's what all the fun is about ;-) Some would say that "labelling" the file is the job of the release processes. Others say it's the job of the SCM system... Of course I just sit on the fence because in the work I have to do, I'm quite happy that nothing is "outside of a checkout". GIT is good enough that I have it everywhere. I realise not everyone's that lucky.. ;-) a. -- Aidan Van Dyk Create like a god, aidan@highrise.ca command like a king, http://www.highrise.ca/ work like a slave.
* Aidan Van Dyk <aidan@highrise.ca> [070418 15:03]: > > Then how do you tell what version a file is if it's outside of a > > checkout? > > That's what all the fun is about ;-) Some would say that "labelling" the > file is the job of the release processes. Others say it's the job of > the SCM system... Noting that if you take something "outside of a checkout" means you've "released" it from the VCS... -- Aidan Van Dyk Create like a god, aidan@highrise.ca command like a king, http://www.highrise.ca/ work like a slave.
Aidan Van Dyk wrote: > * Aidan Van Dyk <aidan@highrise.ca> [070418 15:03]: > > > > Then how do you tell what version a file is if it's outside of a > > > checkout? > > > > That's what all the fun is about ;-) Some would say that "labelling" the > > file is the job of the release processes. Others say it's the job of > > the SCM system... > > Noting that if you take something "outside of a checkout" means you've > "released" it from the VCS... Which is not always what happens in reality. Consider for example that we borrowed some files from NetBSD, OpenBSD, Tcl, zic and others. It would be nice to know exactly at what point we borrowed the file, so we can go to the upstream repo and check if there's any bug fix that we should also apply to our local copy. And we _also_ modify locally the file of course, so just digesting the file we have to get a SHA1 (or whatever) identifier is not an option. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On Thu, Apr 19, 2007 at 10:07:08AM +1200, Martin Langhoff wrote: > Jim C. Nasby wrote: > > Then how do you tell what version a file is if it's outside of a > > checkout? > > It's trivial for git to answer that - the file will either be pristine, > and then we can just scan for the matching SHA1, or modified, and we can > scan (taking a weee bit more time) which are the "closest matches" in > your history, in what branches and commits. > > The actual scripting for this isn't written just yet -- Linus posted a > proof-of-concept shell implementation along the lines of > > git rev-list --no-merges --full-history v0.5..v0.7 -- > src/widget/widget.c > rev-list > > best_commit=none > best=1000000 > while read commit > do > git cat-file blob "$commit:src/widget/widget.c" > tmpfile > lines=$(diff reference-file tmpfile | wc -l) > if [ "$lines" -lt "$best" ] > then > echo Best so far: $commit $lines > best=$lines > fi > done < rev-list > > and it's fast. One of the good properties of this is that you can ask > for a range of your history (v0.5 to v0.7 in the example) and an exact > path (src/widget/widget.c) but you can also say --all (meaning "in all > branches") and a handwavy "over there", like src. And git will take an > extra second or two on a large repo, but tell you about all the good > candidates across the branches. > > Metadata is metadata, and we can fish it out of the SCM easily - and > data is data, and it's silly to pollute it with metadata that is mostly > incidental. > > If I find time today I'll post to the git list a cleaned up version of > Linus' shell script as > > git-findclosestmatch <head or range or --all> path/to/scan/ \ > randomfile.c Not bad... took you 40 lines to answer my question. Let's see if I can beat that... > > Then how do you tell what version a file is if it's outside of a > > checkout? Answer: you look at the $Id$ (or in this case, $PostgreSQL$) tag. Sorry, tried to get it to 2 lines, but couldn't. 
;) I understand the argument about metadata and all, and largely agree with it. But on the other hand I think a version identifier is a critical piece of information; it's just as critical as the file name when it comes to identifying the information contained in the file. Or does GIT not use filenames, either? :) -- Jim Nasby jim@nasby.net EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
Hi, Alvaro Herrera wrote: > Which is not always what happens in reality. Consider for example that > we borrowed some files from NetBSD, OpenBSD, Tcl, zic and others. It > would be nice to know exactly at what point we borrowed the file, so we > can go to the upstream repo and check if there's any bug fix that we > should also apply to our local copy. And we _also_ modify locally the > file of course, so just digesting the file we have to get a SHA1 (or > whatever) identifier is not an option. I consider such information (i.e. 'where is this file coming from') to be historical information. As such, this information clearly belongs to the VCS sphere and should be tracked and presented by the VCS. Advanced VCSes can import files from other projects and properly track those files or propagate changes on request. Even subversion can do that to some extent. My point here is: given a decent VCS, you don't need such historical information as often as you do with CVS. You can sit back and let the VCS do the job. (Like looking up when the last 'import' of the file from the external project happened, what changed, and merging those changes back into your (locally modified variant of the) file.) And if you really want to dig in the history of your project, you can ask the VCS, which you are going to need anyway for other historic information. Regards Markus
Hi Jim C. Nasby wrote: > I understand the argument about metadata and all, and largely agree with > it. But on the other hand I think a version identifier is a critical > piece of information; it's just as critical as the file name when it > comes to identifying the information contained in the file. If you really want the files in your releases to carry a version identifier, you should let your release process handle that. But often enough, people can't even tell the exact PostgreSQL version they are running. How do you expect them to be able to tell you what version a single file has? For the developers: they have all the history the VCS offers them. There are tags to associate a release with a revision in your repository. And because a decent VCS can handle all the diff'ing, patching and merging you normally need, you shouldn't ever have to process files outside of your repository. So what exactly is the purpose of a version identifier within the file's contents? For whom could such a thing be good? Regards Markus
Jim C. Nasby wrote: > Not bad... took you 40 lines to answer my question. Let's see if I can > beat that... Sure - it'll be 1 line when it's wrapped in a shell script. And then we'll be even. > I understand the argument about metadata and all, and largely agree with > it. But on the other hand I think a version identifier is a critical > piece of information; it's just as critical as the file name when it > comes to identifying the information contained in the file. Surely. It is important, but it's metadata and belongs elsewhere. That the metadata _is_ important doesn't mean you corrupt _data_ with it. Just imagine that MySQL users were used to getting their SQL engine to expand $Oid$ $Tablename$ $PrimaryKey$ in TEXT fields. And that on INSERT/UPDATE those were collapsed. And in comparisons too. Wouldn't you say "that's metadata, can be queried in a thousand ways, does not belong in the middle of the data"? And the _really_ interesting version identifier is usually the "commit" identifier, which gives you a SHA1 of the whole src directory and the history. Projects that use git usually include that SHA1 in their build script, so even if a user compiles off a daily snapshot or a checkout on a random branch of your SCM, you can just ask them "what's the build identifier?" and they'll give you a SHA1. Actually, git can spit out a nicer build identifier that includes the latest tag, so if you see the identifier being v8.2.<sha1>, you know it's not 8.2 "release" but a commit soon after it, identified by that SHA1. 
GIT uses that during its build to insert the version identifier, so: $ git --version git version 1.5.1.gf8ce With that in your hand, you can say # show me what commits on top of the tagged 1.5.1 have I got: $ git log 1.5.1..gf8ce # file src/lib/foo.c at this exact commit $ git show gf8ce:src/lib/foo.c So if you use this identifier (just call `git version`) to - name your tarballs - create a "build-id" file at tarball creation time - tag your builds with a version id And then when you have code out there in the wild, and people report bugs or send you patches, there's a good identifier you can ask for that covers _all_ the files. If it happens that someone reports a bug and says they have 8.2.gg998 and you don't seem to have any gg998 commit after 8.2, you can say with confidence: you are running a patched Pg - please repro with a pristine copy (or show us your code!) :-) cheers, m -- ----------------------------------------------------------------------- Martin @ Catalyst .Net .NZ Ltd, PO Box 11-053, Manners St, Wellington WEB: http://catalyst.net.nz/ PHYS: Level 2, 150-154 Willis St OFFICE: +64(4)916-7224 UK: 0845 868 5733 ext 7224 MOB: +64(21)364-017 Make things as simple as possible, but no simpler- Einstein -----------------------------------------------------------------------
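The tag-relative identifier Martin describes is what `git describe` produces. A self-contained sketch (repo path, tag name, and commit messages are illustrative):

```shell
set -e
# Throwaway repo showing the <tag>-<count>-g<sha1> identifier discussed above.
rm -rf /tmp/describe-demo && mkdir /tmp/describe-demo && cd /tmp/describe-demo
git init -q . && git config user.email you@example.org && git config user.name you
echo a > f && git add f && git commit -qm "the 8.2 release"
git tag -a v8.2 -m "8.2 release"        # annotated tag marking the release
echo b >> f && git commit -qam "post-release fix"
# describe names HEAD relative to the newest reachable annotated tag:
git describe    # prints something like v8.2-1-g1234abc
```

The output reads "one commit past v8.2, at abbreviated SHA1 1234abc", which is exactly the kind of build identifier a user can report back.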
Jim C. Nasby wrote: > Then how do you tell what version a file is if it's outside of a > checkout? It's trivial for git to answer that - the file will either be pristine, and then we can just scan for the matching SHA1, or modified, and we can scan (taking a weee bit more time) for the "closest matches" in your history, in what branches and commits. The actual scripting for this isn't written just yet -- Linus posted a proof-of-concept shell implementation along the lines of

git rev-list --no-merges --full-history v0.5..v0.7 -- src/widget/widget.c > rev-list

best_commit=none
best=1000000
while read commit
do
    git cat-file blob "$commit:src/widget/widget.c" > tmpfile
    lines=$(diff reference-file tmpfile | wc -l)
    if [ "$lines" -lt "$best" ]
    then
        echo Best so far: $commit $lines
        best=$lines
        best_commit=$commit
    fi
done < rev-list

and it's fast. One of the good properties of this is that you can ask for a range of your history (v0.5 to v0.7 in the example) and an exact path (src/widget/widget.c) but you can also say --all (meaning "in all branches") and a handwavy "over there", like src. And git will take an extra second or two on a large repo, but tell you about all the good candidates across the branches. Metadata is metadata, and we can fish it out of the SCM easily - and data is data, and it's silly to pollute it with metadata that is mostly incidental. If I find time today I'll post to the git list a cleaned up version of Linus' shell script as

git-findclosestmatch <head or range or --all> path/to/scan/ \ randomfile.c

cheers, m -- ----------------------------------------------------------------------- Martin @ Catalyst .Net .NZ Ltd, PO Box 11-053, Manners St, Wellington WEB: http://catalyst.net.nz/ PHYS: Level 2, 150-154 Willis St OFFICE: +64(4)916-7224 UK: 0845 868 5733 ext 7224 MOB: +64(21)364-017 Make things as simple as possible, but no simpler- Einstein -----------------------------------------------------------------------
Martin Langhoff wrote: > So - if you are committed to providing your gateway long term to > Florian, I'm happy to drop my gateway in favour of yours. > (Florian, before basing your code on either you should get a checkout of > Aidan's and mine and check that the tips of the branches you are working > on match the cvs branches -- the cvsimport code is good but wherever > CVS is involved, there's a lot of interpretation at play, a sanity check > is always good). Sorry for responding so late - I was rather busy during the last 1 1/2 weeks with university stuff, and had only very little time to spend on SoC. I've tried to switch my repo to both git mirrors, but there seems to be something strange happening. The checkout pulls a _lot_ of objects (a few hundred thousand), and then takes ages to unpack them all, bloating my local repository (Just rm-ing my local repo takes a few minutes after the checkout). It seems as if git pulls all revisions of all files during the pull - which it shouldn't do as far as I understand things - it should only pull those objects referenced by some head, no? The interesting thing is that exactly the same problem occurs with both of your mirrors... Any ideas? Or is this just how things are supposed to work? greetings, Florian Pflug
* Florian G. Pflug <fgp@phlo.org> [070430 08:58]: > It seems as if git pulls all revisions of all files during the pull - > which it shouldn't do as far as I understand things - it should only > pull those objects referenced by some head, no? Git pulls full history to a common ancestor on the clone/pull. So the first pull on a repo *will* necessarily pull in the full object history. So unless you have a recent common ancestor, it will pull lots. Note that because git uses crypto hashes to identify objects, my conversion and Martin's probably do not have a recent common ancestor (because my header munging probably doesn't match Martin's exactly). > The interesting thing is that exactly the same problem occurs with > both if your mirrors... > > Any ideas? Or is this just how things are supposed to work? Until you have a local repository of it, you'll need to go through the full pull/clone. If you're really not interested in history you can "truncate" history with the --depth option to git clone. That will give you a "shallow repository", which you can use, develop, branch, etc in, but won't give you all the history locally. Also - what version of GIT are you using? I *really* recommend using at least 1.5 (1.5.2.X is current stable). Please, do yourself a favour, and don't use 1.4.4. a. -- Aidan Van Dyk Create like a god, aidan@highrise.ca command like a king, http://www.highrise.ca/ work like a slave.
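The `--depth` behaviour Aidan describes can be demonstrated against a local repository instead of the mirrors (paths below are illustrative; the mechanics are the same over git:// or file://):

```shell
set -e
# Build a repo with five commits, then take a shallow clone of it.
rm -rf /tmp/shallow-src /tmp/shallow-clone
mkdir /tmp/shallow-src && cd /tmp/shallow-src
git init -q . && git config user.email you@example.org && git config user.name you
for i in 1 2 3 4 5; do echo $i > f && git add f && git commit -qm "rev $i"; done
# --depth 1 transfers only the newest commit, not the full history:
git clone -q --depth 1 "file:///tmp/shallow-src" /tmp/shallow-clone
git -C /tmp/shallow-clone rev-list --count HEAD   # prints "1"
```

For a repository with the full PostgreSQL CVS history behind it, that difference is hundreds of thousands of objects you never have to unpack.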
Aidan Van Dyk wrote: > * Florian G. Pflug <fgp@phlo.org> [070430 08:58]: > >> It seems as if git pulls all revisions of all files during the pull - >> which it shouldn't do as far as I understand things - it should only >> pull those objects referenced by some head, no? > > Git pulls full history to a common ancestor on the clone/pull. So the > first pull on a repo *will* necessarily pull in the full object history. > So unless you have a recent common ancestor, it will pull lots. Note > that because git uses crypto hashes to identify objects, my conversion > and Martin's probably do not have a recent common ancestor (because my > header munging probably doesn't match Martin's exactly). Ah, OK - that explains things. >> The interesting thing is that exactly the same problem occurs with >> both of your mirrors... >> >> Any ideas? Or is this just how things are supposed to work? > > Until you have a local repository of it, you'll need to go through the > full pull/clone. If you're really not interested in history you can > "truncate" history with the --depth option to git clone. That will give > you a "shallow repository", which you can use, develop, branch, etc in, > but won't give you all the history locally. I'll retry with the "--depth" option - I'm doing development on my powerbook, and OSX seems to cope badly with lots of little files - the initial unpacking took hours - literally.. > Also - what version of GIT are you using? I *really* recommend using at > least 1.5 (1.5.2.X is current stable). Please, do yourself a favour, > and don't use 1.4.4. I'm using 1.5.0 currently - it was the latest stable release when I began to experiment with git. greetings, Florian Pflug