Re: git: uh-oh - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: git: uh-oh
Date	August 25, 2010 02:11:41
Msg-id	12367.1282713087@sss.pgh.pa.us
In response to	Re: git: uh-oh (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: git: uh-oh
List	pgsql-hackers

Robert Haas <robertmhaas@gmail.com> writes:
> 1. The new conversion seems to have stolen the apostrophe from "D'Arcy
> J.M. Cain <darcy@druid.net>", rendering him "DArcy J.M. Cain
> <darcy@druid.net>".

Yeah, I see that too.  It's probably bad input rather than the
converter's fault ;-)

> 2. Any non-ASCII characters in, for example, contributors' names show
> up differently in the two repos.  Generally, the original repo is OK
> and the new repo is garbled; although I found one very old example
> that went the other way.

What it looks like to me is that a Latin1->UTF8 conversion has been
applied to the log text.  Which might be a good idea if it all *was*
Latin1, but a fair-sized percentage isn't.  Applying this conversion to
UTF8 entries results in garbage, of course.  Even if this could be done
reliably, I think this counts as editorializing on the historical
record, and should be switched off if possible.
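
To illustrate the failure mode, here is a minimal sketch of a blind
Latin1->UTF8 conversion. The function name and the sample name "José"
are made up for the example; this is not the converter's actual code.

```python
def latin1_to_utf8(raw: bytes) -> bytes:
    """Blindly assume raw is Latin-1 and re-encode it as UTF-8."""
    return raw.decode("latin-1").encode("utf-8")


# A log entry that really was Latin-1 comes out fine.
was_latin1 = "José".encode("latin-1")
print(latin1_to_utf8(was_latin1).decode("utf-8"))

# A log entry that was already UTF-8 gets encoded a second time:
# each byte of the two-byte sequence is treated as its own character.
was_utf8 = "José".encode("utf-8")
print(latin1_to_utf8(was_utf8).decode("utf-8"))
```

The first call prints the name intact; the second prints the garbled
form, since Latin-1 decoding never fails and so gives no signal that
the input was in some other encoding.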

> There are also a number of commits that differ in order between the
> two repos, and an even larger number where commits are duplicated or
> merged in one repository relative to the other.

I suspect that this is an artifact of the converter trying to merge
nearby commits into one commit, which it more or less *has* to do for
sanity since CVS commits aren't atomic.  I don't have a problem with
the concept, but I notice cases where the converted commit has a
timestamp some minutes later than what the cvs2cl output claims.
I suspect this is what the converter was using as a cutoff time.
Would it be possible to make sure that the converted commit is always
timestamped with the latest individual file update timestamp from the
included CVS commits?
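
A rough sketch of what I mean, assuming the converter groups per-file
revisions by author and log message within some time window. The
FileRevision type, the five-minute window, and both function names are
invented for illustration and are not taken from the converter.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileRevision:
    path: str
    author: str
    log: str
    timestamp: datetime


def group_revisions(revs, window=timedelta(minutes=5)):
    """Merge per-file revisions with the same author and log message
    into one group when each follows the previous one within `window`."""
    groups = []
    open_groups = {}  # (author, log) -> most recent group for that key
    for rev in sorted(revs, key=lambda r: r.timestamp):
        key = (rev.author, rev.log)
        group = open_groups.get(key)
        if group is not None and rev.timestamp - group[-1].timestamp <= window:
            group.append(rev)
        else:
            group = [rev]
            open_groups[key] = group
            groups.append(group)
    return groups


def commit_timestamp(group):
    """Use the latest file update in the group, not the cutoff time."""
    return max(rev.timestamp for rev in group)
```

With this rule the merged commit can never carry a timestamp later
than the last file update it actually contains.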
        regards, tom lane