Re: PostgreSQL GIT mirror status - Mailing list pgsql-www

From Daniel Farina
Subject Re: PostgreSQL GIT mirror status
Date
Msg-id 7b97c5a40901090956n2eb0478x3133a1edddcfc102@mail.gmail.com
In response to Re: PostgreSQL GIT mirror status  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List pgsql-www
On Fri, Jan 9, 2009 at 3:06 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> Wow, that's impressive! How long does a "git gc --aggressive" run take?

Actually, not that long. The main step that takes forever at this
point (starting from scratch) is counting all those objects. The
actual gc --aggressive run is probably a matter of minutes, and
certainly under an hour on a reasonably fast machine.

> That could be because of the duplicated history we had there in December,
> that I then fixed. I reset the branches to just before the screwup, and then
> ran fromcvs to catch up with CVS HEAD again. That duplicated history is
> probably still there, but not reachable from any branches or tags.
>
> Should we run "git prune" to get rid of the garbage?
>

Sounds like a good candidate, but I don't think that alone will do
it. I've had to do something like this before when I temporarily added
some large blobs to my git repository to move them between home and
work.

I have isolated the problem to the reflog, which sounds
about right. The "git reflog" man page says it has ways to delete
and/or expire these entries so they can be pruned, so try that first
(and then tell me if it worked as you expected, and what you did).

If it doesn't (i.e. for some reason it is not pruning properly), and if
you are sure you won't need the reflog, it seems that you can just
delete the 'logs' directory under the git repository (you may notice
that the repository at lolrus.org works fine but has no
'logs' directory). That seems to be the same state as having no reflog
at all, after which a regular 'git gc' will collect most of those
objects.

"But wait, there's more!"

You'll then want to run a 'git prune', as it seems that gc will still
keep some objects around because they're inside the gc grace period,
which I believe to be distinct from the reflog. In this case it seems
that we really want them gone.

Given this information it seems like the right steps are something
like this:
1. Somehow expire and/or delete the reflogs so they register as garbage.
    * By making use of the 'git reflog' expiration/deletion commands
      (preferred, if one can figure out their behavior exactly)
    * Or just deleting $GITREPO/logs. (works for me at the moment)
2. Run 'git gc --aggressive'
3. Run 'git prune'
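
The steps above can be sketched as a small shell function. This is only a
sketch under some assumptions, not a tested recipe for the mirror: the
function name and its argument are made up for illustration, it uses the
'git reflog expire' route for step 1 rather than deleting the 'logs'
directory, and it passes --prune=now to gc (not part of the steps above)
so that gc skips its grace period on Git versions that would otherwise
keep recent unreachable objects around. It is destructive: once the
reflogs are expired, the dropped history cannot be recovered from this
repository.

```shell
#!/bin/sh
# Hypothetical helper: reclaim space held by objects that no branch or tag
# reaches. "$1" is an assumed argument: the path to the repository.
reclaim_unreachable() {
    repo="$1"

    # Step 1: expire every reflog entry right away, so the old history is
    # no longer referenced by anything.
    git -C "$repo" reflog expire --expire=now --expire-unreachable=now --all &&

    # Step 2: repack aggressively. --prune=now is an addition to the steps
    # in the message; it tells gc not to apply its grace period.
    git -C "$repo" gc --aggressive --prune=now --quiet &&

    # Step 3: remove any unreachable loose objects that are still left.
    git -C "$repo" prune
}
```

Something like 'reclaim_unreachable /path/to/repo' would then run all three
steps in order, stopping at the first one that fails. Comparing the output
of 'git count-objects -v' before and after should show whether anything was
actually collected.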

Alternatively, just steal the pack from fdr.lolrus.org, as mentioned
above.

fdr

