Thread: Splitting lengthy sgml files

Splitting lengthy sgml files

From
Tatsuo Ishii
Date:
There are very lengthy SGML files (over 10k lines, for example) in the
docs. While working on translating the docs using GitHub, I noticed
that diffs are sometimes not shown in pull requests because of GitHub's
limits, which makes it pretty difficult for me to review PRs. Is there
any chance of splitting those lengthy SGML files into smaller ones?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: Splitting lengthy sgml files

From
Tom Lane
Date:
Tatsuo Ishii <ishii@postgresql.org> writes:
> There are very lengthy SGML files (over 10k lines, for example) in the
> docs. While working on translating the docs using GitHub, I noticed
> that diffs are sometimes not shown in pull requests because of GitHub's
> limits, which makes it pretty difficult for me to review PRs. Is there
> any chance of splitting those lengthy SGML files into smaller ones?

Surely that's a github bug that you should be complaining to them about?

I'm disinclined to split existing files because (a) it would complicate
back-patching and (b) it would be completely destructive to git history.
git claims to understand about file moves but it doesn't do a terribly
good job with that history-wise (try git log or git blame on
recently-moved files such as pgbench).  And I've never heard even
a claim that it understands splits.

There might be reasons to override those disadvantages and do it
anyway ... but this doesn't sound like a very good reason.
        regards, tom lane



Re: Splitting lengthy sgml files

From
Tatsuo Ishii
Date:
> Surely that's a github bug that you should be complaining to them about?

No, it's a known limitation:
https://help.github.com/articles/what-are-the-limits-for-viewing-content-and-diffs-in-my-repository/

> I'm disinclined to split existing files because (a) it would complicate
> back-patching and (b) it would be completely destructive to git history.
> git claims to understand about file moves but it doesn't do a terribly
> good job with that history-wise (try git log or git blame on
> recently-moved files such as pgbench).  And I've never heard even
> a claim that it understands splits.
> 
> There might be reasons to override those disadvantages and do it
> anyway ... but this doesn't sound like a very good reason.

Ok, I will try to find workarounds for this, including forking.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: Splitting lengthy sgml files

From
Robert Haas
Date:
On Mon, Mar 7, 2016 at 10:09 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Tatsuo Ishii <ishii@postgresql.org> writes:
>> There are very lengthy SGML files (over 10k lines, for example) in the
>> docs. While working on translating the docs using GitHub, I noticed
>> that diffs are sometimes not shown in pull requests because of GitHub's
>> limits, which makes it pretty difficult for me to review PRs. Is there
>> any chance of splitting those lengthy SGML files into smaller ones?
>
> Surely that's a github bug that you should be complaining to them about?

Well, I'm sure it's not like they did it for no reason.  At some point
displaying a diff on a giant file is going to result in a page that
takes too long to load.

> I'm disinclined to split existing files because (a) it would complicate
> back-patching and (b) it would be completely destructive to git history.
> git claims to understand about file moves but it doesn't do a terribly
> good job with that history-wise (try git log or git blame on
> recently-moved files such as pgbench).  And I've never heard even
> a claim that it understands splits.

But we've split very long source files in the past, and I don't see
why splitting doc files is any stupider.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company