On Wed, Jan 7, 2009 at 1:58 AM, Peter Eisentraut <peter_e@gmx.net> wrote:
>
> Well, if you want to give it a try and then report back about whether there
> were any noticeable effects ...
>
I ran a regular git repack -a -d. This took about 3.5 cpu-intensive
hours, but made object counting *much* (I cannot stress that enough)
faster and made the repository shrink dramatically: 361M to 246M. I
also won't have any more open-file-limit problems (things like git
fsck --full would fail because of too many open files until I raised
ulimit -n). I should also mention that cloning over HTTP seems
completely broken because of the huge number of packs, which is
potentially also an open-file-limit issue.
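
In concrete terms, that amounted to roughly the following (the ulimit
value here is only an example, and the count-objects line is just a
handy way to confirm the result, not part of the repack itself):

    ulimit -n 4096       # raise the open-file limit before touching the old packs
    git repack -a -d     # consolidate everything into one pack, drop redundant packs
    git count-objects -v # optional: confirm pack/loose-object counts afterwards
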
You may want to run 'git repack -a -d' also, but I'd advise waiting
until tomorrow when I write up my full report and compare that with
the much more aggressive packing options. My estimate is that,
starting from the already-repacked repository, finding new deltas will
take about nine hours with extremely aggressive settings. Aggressive
repacking has a higher likelihood of being worthwhile on a project as
large as Postgres, so we'll see.
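
To give a rough idea of what I mean by aggressive settings, it is
something along these lines (the exact window/depth numbers are
illustrative, not necessarily what I will end up using):

    # -f discards the existing deltas and recomputes them from scratch;
    # large --window/--depth values are what make this so expensive
    git repack -a -d -f --window=250 --depth=250
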
After this, I can either solidify the recipe I used so that you can
burn another fifteen or so hours of compute time re-deriving the
result, or I can simply give you the generated pack. You can use 'git
fsck --full' to verify the pack's integrity.
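
For example (the pack file name below is only a placeholder):

    git fsck --full                                     # check object connectivity and integrity
    git verify-pack -v .git/objects/pack/pack-XXXX.idx  # optionally inspect the resulting pack
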
I suggest running 'git repack -a -d' to consolidate packs every once
in a while, maybe monthly or semi-monthly. It's quite cheap when there
aren't too many packs and/or loose objects. Aggressive repacking such
as what I'm doing may only be useful on a yearly basis or even less
often, unless git learns some better ways to build packs. I also hope
you (and everyone else) have git version >= 1.5.3, when the pack
format changed.
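
If it helps, the periodic consolidation could be as simple as a cron
entry along these lines (the path and schedule are only illustrative):

    # run 'git repack -a -d' at 03:00 on the first of every month
    0 3 1 * *  cd /path/to/postgresql.git && git repack -a -d
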
fdr