Re: PostgreSQL Developer meeting minutes up - Mailing list pgsql-hackers

From Markus Wanner
Subject Re: PostgreSQL Developer meeting minutes up
Date
Msg-id 20090529170559.99155w414h94hazb@mail.bluegap.ch
In response to Re: PostgreSQL Developer meeting minutes up  (Aidan Van Dyk <aidan@highrise.ca>)
Responses Re: PostgreSQL Developer meeting minutes up  (Aidan Van Dyk <aidan@highrise.ca>)
List pgsql-hackers
Hi,

Quoting "Aidan Van Dyk" <aidan@highrise.ca>:
>> Ok, so seeing the interest in having a "good conversion", I took a stab at
>> parsecvs this afternoon, probably what I consider the leading "static"
>> conversion tool.

Here are some results from a conversion with cvs2git.

>> It takes about 10 minutes to run my old xeon.

The conversion with cvs2git certainly took a bit longer, but I don't
think that matters at all. Anything under a day or two is good enough,
IMO. What counts is the result.

The first step is running cvs2git itself:

cvs2svn Statistics:
------------------
Total CVS Files:              6873
Total CVS Revisions:        140191
Total CVS Branches:          36057
Total CVS Tags:             457515
Total Unique Tags:             171
Total Unique Branches:          21
CVS Repos Size in KB:       377337
Total SVN Commits:           32889
First Revision Date:    Tue Jul  9 08:21:07 1996
Last Revision Date:     Thu May 28 22:02:10 2009

(The number of files matches the result of my own algorithm pretty
well; however, the total number of SVN commits is a bit lower than the
~ 40'000 blobs I got.)

The output of cvs2git can then be imported with git fast-import:

git-fast-import statistics:
---------------------------------------------------------------------
Alloc'd objects:     350000
Total objects:       349405 (     19563 duplicates                  )
      blobs  :       132672 (      3255 duplicates     119032 deltas)
      trees  :       183967 (     16308 duplicates     165582 deltas)
      commits:        32766 (         0 duplicates          0 deltas)
      tags   :            0 (         0 duplicates          0 deltas)
Total branches:         194 (       664 loads     )
      marks:     1073741824 (    168693 unique    )
      atoms:           5280
Memory total:         16532 KiB
       pools:          2860 KiB
     objects:         13671 KiB
---------------------------------------------------------------------
pack_report: getpagesize()            =       4096
pack_report: core.packedGitWindowSize = 1073741824
pack_report: core.packedGitLimit      = 8589934592
pack_report: pack_used_ctr            =     124414
pack_report: pack_mmap_calls          =       3674
pack_report: pack_open_windows        =          1 /          1
pack_report: pack_mapped              =  199500913 /  199500913
---------------------------------------------------------------------
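As a quick sanity check on the fast-import statistics, the per-type
object and duplicate counts should add up to the reported totals. A
minimal sketch in Python, with the numbers copied by hand from the
output above (the variable names are just for illustration):

```python
# Per-type counts copied from the git fast-import statistics:
# object type -> (objects, duplicates)
per_type = {
    "blobs": (132672, 3255),
    "trees": (183967, 16308),
    "commits": (32766, 0),
    "tags": (0, 0),
}

# Totals as reported on the "Total objects" line.
reported_total = 349405
reported_duplicates = 19563

# Sum the per-type columns and compare them with the reported totals.
total = sum(count for count, _ in per_type.values())
duplicates = sum(dups for _, dups in per_type.values())

print("objects add up:   ", total == reported_total)
print("duplicates add up:", duplicates == reported_duplicates)
```

Both sums match the totals that git fast-import reports.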


The resulting repository contains the following branches. The
unlabeled ones contain only 1-2 files each and seem rather irrelevant.
On the next try, I'd disable their creation completely; I just wanted
to check.
  REL2_0B
  REL6_4
  REL6_5_PATCHES
  REL7_0_PATCHES
  REL7_1_STABLE
  REL7_2_STABLE
  REL7_3_STABLE
  REL7_4_STABLE
  REL8_0_0
  REL8_0_STABLE
  REL8_1_STABLE
  REL8_2_STABLE
  REL8_3_STABLE
  Release_1_0_3
  WIN32_DEV
  ecpg_big_bison
* master
  unlabeled-1.44.2   -> from src/backend/commands/tablecmds.c
  unlabeled-1.51.2   -> from src/test/regress/expected/alter_table.out
  unlabeled-1.59.2   -> from src/backend/executor/execTuples.c
  unlabeled-1.87.2   -> from src/backend/executor/nodeAgg.c
  unlabeled-1.90.2   -> from src/backend/parser/parse_target.c and
                             src/backend/access/common/tupdesc.c

Comparison of the head of each branch between git and CVS (modulo CVS
keyword expansion, which I've filtered out):

ecpg_big_bison.diff:      0 files changed
master.diff:              0 files changed
REL2_0B.diff:             0 files changed
REL6_4.diff:              0 files changed
REL6_5_PATCHES.diff:      0 files changed
REL7_0_PATCHES.diff:      0 files changed
REL7_1_STABLE.diff:       0 files changed
REL7_2_STABLE.diff:       0 files changed
REL7_3_STABLE.diff:       0 files changed
REL7_4_STABLE.diff:       0 files changed
REL8_0_0.diff:            0 files changed
REL8_0_STABLE.diff:       0 files changed
REL8_1_STABLE.diff:       0 files changed
REL8_2_STABLE.diff:       0 files changed
REL8_3_STABLE.diff:       0 files changed
Release_1_0_3.diff:       0 files changed
WIN32_DEV.diff:           0 files changed
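
For illustration, filtering out CVS keyword expansion before diffing
could look roughly like the sketch below. This is not the exact filter
used for the comparison above, just one way to do it: it collapses an
expanded keyword such as "$Id: ... $" back to "$Id$". The function name
collapse_keywords and the keyword list are assumptions (in particular,
"PostgreSQL" is assumed to be a project-specific keyword), and the
extra comment lines that $Log$ inserts are not handled.

```python
import re

# Standard RCS/CVS keywords. "PostgreSQL" is an assumed project-specific
# keyword; adjust the list to whatever the repository actually uses.
KEYWORDS = (
    "Author", "Date", "Header", "Id", "Locker", "Log", "Name",
    "RCSfile", "Revision", "Source", "State", "PostgreSQL",
)

# Matches an expanded keyword on a single line, e.g. "$Id: foo.c,v 1.2 ... $".
_EXPANDED = re.compile(r"\$(" + "|".join(KEYWORDS) + r"):[^$\n]*\$")


def collapse_keywords(text: str) -> str:
    """Replace every expanded keyword with its unexpanded form."""
    return _EXPANDED.sub(r"$\1$", text)


# Two checkouts that differ only in keyword expansion compare equal
# once both have been filtered.
cvs_line = "/* $Id: tupdesc.c,v 1.90.2.1 2009/05/28 22:02:10 tgl Exp $ */"
git_line = "/* $Id$ */"
print(collapse_keywords(cvs_line) == collapse_keywords(git_line))
```

Running both trees through such a filter and then diffing them leaves
only real content differences.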

I plan to compare the tags as well and check which branch each of them
is on, but so far cvs2git seems to keep its promises. I'll report back
within the next few days.

Regards

Markus Wanner

