Thread: Idle git question: how come so many "objects"?

Idle git question: how come so many "objects"?

From
Tom Lane
Date:
So I just made a commit that touched four files in all six active
branches, and I see:

$ git push   
Counting objects: 172, done.
Compressing objects: 100% (89/89), done.
Writing objects: 100% (89/89), 17.07 KiB, done.
Total 89 (delta 80), reused 0 (delta 0)
To ssh://git@gitmaster.postgresql.org/postgresql.git
   35a3def..8a6eb2e  REL8_1_STABLE -> REL8_1_STABLE
   cfb6ac6..b0e2092  REL8_2_STABLE -> REL8_2_STABLE
   301a822..0d45e8c  REL8_3_STABLE -> REL8_3_STABLE
   61f8618..6bd3753  REL8_4_STABLE -> REL8_4_STABLE
   09425f8..0a85bb2  REL9_0_STABLE -> REL9_0_STABLE
   c0b5fac..225f0aa  master -> master

Now I realize that in addition to the four files there's a "tree" object
and a "commit" object, but that still only adds up to 36 objects that
should be created in this transaction.  How does it get to 172?  And
then where do the 89 and 80 numbers come from?
        regards, tom lane


Re: Idle git question: how come so many "objects"?

From
Martijn van Oosterhout
Date:
On Wed, Dec 01, 2010 at 01:03:26AM -0500, Tom Lane wrote:
> So I just made a commit that touched four files in all six active
> branches, and I see:
>
> $ git push
> Counting objects: 172, done.
> Compressing objects: 100% (89/89), done.
> Writing objects: 100% (89/89), 17.07 KiB, done.
> Total 89 (delta 80), reused 0 (delta 0)
> To ssh://git@gitmaster.postgresql.org/postgresql.git
>    35a3def..8a6eb2e  REL8_1_STABLE -> REL8_1_STABLE
>    cfb6ac6..b0e2092  REL8_2_STABLE -> REL8_2_STABLE
>    301a822..0d45e8c  REL8_3_STABLE -> REL8_3_STABLE
>    61f8618..6bd3753  REL8_4_STABLE -> REL8_4_STABLE
>    09425f8..0a85bb2  REL9_0_STABLE -> REL9_0_STABLE
>    c0b5fac..225f0aa  master -> master
>
> Now I realize that in addition to the four files there's a "tree" object
> and a "commit" object, but that still only adds up to 36 objects that
> should be created in this transaction.  How does it get to 172?  And
> then where do the 89 and 80 numbers come from?

IIRC, each directory also counts as an object. So if you change a file
in a/b/c/d you get 5 commit objects, one for the file and four for the
directories.

What the delta is for I don't know. Perhaps some of the diffs were the
same between branches and these got merged?

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first.
>                                       - Charles de Gaulle

Re: Idle git question: how come so many "objects"?

From
Robert Haas
Date:
On Wed, Dec 1, 2010 at 2:08 AM, Martijn van Oosterhout
<kleptog@svana.org> wrote:
> On Wed, Dec 01, 2010 at 01:03:26AM -0500, Tom Lane wrote:
>> So I just made a commit that touched four files in all six active
>> branches, and I see:
>>
>> $ git push
>> Counting objects: 172, done.
>> Compressing objects: 100% (89/89), done.
>> Writing objects: 100% (89/89), 17.07 KiB, done.
>> Total 89 (delta 80), reused 0 (delta 0)
>> To ssh://git@gitmaster.postgresql.org/postgresql.git
>>    35a3def..8a6eb2e  REL8_1_STABLE -> REL8_1_STABLE
>>    cfb6ac6..b0e2092  REL8_2_STABLE -> REL8_2_STABLE
>>    301a822..0d45e8c  REL8_3_STABLE -> REL8_3_STABLE
>>    61f8618..6bd3753  REL8_4_STABLE -> REL8_4_STABLE
>>    09425f8..0a85bb2  REL9_0_STABLE -> REL9_0_STABLE
>>    c0b5fac..225f0aa  master -> master
>>
>> Now I realize that in addition to the four files there's a "tree" object
>> and a "commit" object, but that still only adds up to 36 objects that
>> should be created in this transaction.  How does it get to 172?  And
>> then where do the 89 and 80 numbers come from?
>
> IIRC, each directory also counts as an object. So if you change a file
> in a/b/c/d you get 5 commit objects, one for the file and four for the
> directories.

No, not 5 commit objects - 5 trees (for the directories, including the
root directory), 1 blob (for the file), and 1 commit.  From git(1):

      The object database contains objects of three main types: blobs,
      which hold file data; trees, which point to blobs and other trees
      to build up directory hierarchies; and commits, which each
      reference a single tree and some number of parent commits.

So in Tom's case I think we can account for the root directory, src,
src/backend, src/backend/executor, src/backend/optimizer,
src/backend/optimizer/util, src/test,  src/test/regress,
src/test/regress/expected, src/test/regress/sql, the 4 files actually
updated, and the commit - 15 objects per branch * 6 branches = 90
objects.  I'm not sure why the actual number is 89, unless perhaps two
of the post-commit regression test files were byte-for-byte identical
and got collapsed into a single object.  I believe that "delta" refers
to the number of those objects that are stored as deltas against an
existing object (in essence, diffs) rather than as completely new
copies.
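
The per-branch tally above can be checked mechanically. The following
Python sketch counts one blob per changed file, one tree per distinct
ancestor directory (including the root), and one commit. The directory
names are taken from the list above; the file names are hypothetical
placeholders, since the thread does not say which four files were
touched, and the sketch assumes no two new objects are identical.

```python
from pathlib import PurePosixPath

# Hypothetical file names; only the directories come from the thread.
CHANGED_FILES = [
    "src/backend/executor/example_exec.c",
    "src/backend/optimizer/util/example_util.c",
    "src/test/regress/expected/example.out",
    "src/test/regress/sql/example.sql",
]
BRANCHES = 6  # REL8_1_STABLE through master


def objects_for_commit(changed_files):
    """Count the new objects a single commit creates, by type."""
    blobs = set(changed_files)
    trees = set()
    for path in changed_files:
        # A changed blob changes the hash of every ancestor tree,
        # all the way up to the root directory (shown as ".").
        for parent in PurePosixPath(path).parents:
            trees.add(str(parent))
    return {"blobs": len(blobs), "trees": len(trees), "commit": 1}


counts = objects_for_commit(CHANGED_FILES)
per_branch = sum(counts.values())
print(counts)                 # {'blobs': 4, 'trees': 10, 'commit': 1}
print(per_branch)             # 15
print(per_branch * BRANCHES)  # 90
```

The result, 90, is one more than the 89 that git reported. Content that
ends up byte-for-byte identical on two branches would be stored as a
single object, which this sketch does not model.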

I have no idea where the 172 number comes from.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Idle git question: how come so many "objects"?

From
Florian Weimer
Date:
* Tom Lane:

> $ git push
> Counting objects: 172, done.
> Compressing objects: 100% (89/89), done.
> Writing objects: 100% (89/89), 17.07 KiB, done.
> Total 89 (delta 80), reused 0 (delta 0)
> To ssh://git@gitmaster.postgresql.org/postgresql.git
>    35a3def..8a6eb2e  REL8_1_STABLE -> REL8_1_STABLE
>    cfb6ac6..b0e2092  REL8_2_STABLE -> REL8_2_STABLE
>    301a822..0d45e8c  REL8_3_STABLE -> REL8_3_STABLE
>    61f8618..6bd3753  REL8_4_STABLE -> REL8_4_STABLE
>    09425f8..0a85bb2  REL9_0_STABLE -> REL9_0_STABLE
>    c0b5fac..225f0aa  master -> master

> How does it get to 172?

This is the number of objects "git push" (actually, git-send-pack, I
think) needs to look at more closely, AFAIUI.  It's a pretty arbitrary
number.  You see it sometimes during pull, too.

> And then where do the 89 and 80 numbers come from?

89 is the number of objects which need to be transmitted.  Of those,
80 were compressed by diffing them to some other object (which might,
in turn, be a diff).
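
To illustrate what "diffing them to some other object (which might, in
turn, be a diff)" means, here is a toy Python sketch of a delta chain.
It is not git's actual pack format: the store layout, the object names
v1 to v3, and the copy/insert operations are assumptions made purely
for illustration.

```python
# Toy object store. Each entry is either full content or a delta
# against a base object. This is an illustration, not git's pack format.
STORE = {
    "v1": ("full", b"line one\nline two\n"),
    # A delta op is ("copy", offset, length) taken from the base,
    # or ("insert", data) for bytes that are new in this object.
    "v2": ("delta", "v1", [("copy", 0, 9), ("insert", b"line 2\n")]),
    # v3 is a delta against v2, which is itself a delta.
    "v3": ("delta", "v2", [("copy", 0, 16), ("insert", b"line three\n")]),
}


def resolve(object_id, store=STORE):
    """Rebuild an object's full content by following its delta chain."""
    entry = store[object_id]
    if entry[0] == "full":
        return entry[1]
    _, base_id, ops = entry
    base = resolve(base_id, store)  # the base may itself be a delta
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += base[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)


total = len(STORE)
deltas = sum(1 for entry in STORE.values() if entry[0] == "delta")
print(f"Total {total} (delta {deltas})")  # Total 3 (delta 2)
print(resolve("v3"))
```

In this toy store two of the three objects are deltas, which is the
same kind of ratio that "Total 89 (delta 80)" reports for the push.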

--
Florian Weimer                <fweimer@bfk.de>
BFK edv-consulting GmbH       http://www.bfk.de/
Kriegsstraße 100              tel: +49-721-96201-1
D-76133 Karlsruhe             fax: +49-721-96201-99