On Fri, 11 Oct 2024 12:16:50 +0900 (JST)
Tatsuo Ishii <ishii@postgresql.org> wrote:
> > We can check for non-ASCII characters in SGML/XML files by preparing an
> > "allowlist" containing the lines that are allowed to have non-ASCII
> > characters, although this list will need to be maintained whenever those
> > lines are modified.
> > I've attached a patch to add a simple Perl script to do this.
>
> I doubt it really works. For example, nbsp can be used for formatting
> (that's the purpose of the character in the first place). Whenever a
> developer decides whether or not to use nbsp, the "allowlist" needs to be
> maintained. It's too annoying.
My assumption is that non-ASCII characters, including nbsp, are basically
disallowed, so the allowlist would not grow unless there is some special
reason. That said, maintaining the list does carry some cost, so if people
don't think this check is worth adding, I will withdraw the proposal.
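
To make the idea concrete, here is a minimal sketch of the allowlist check.
It is written in Python purely for illustration and is not the attached Perl
script; the function name and the sample lines are hypothetical. It assumes
the allowlist holds complete lines that are matched verbatim.

```python
def find_disallowed_lines(lines, allowlist):
    """Return (line_number, text) pairs for lines that contain non-ASCII
    characters and are not listed verbatim in the allowlist."""
    # Assumption: allowlist entries are whole lines, compared without the
    # trailing newline.
    allowed = {entry.rstrip("\n") for entry in allowlist}
    hits = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if not text.isascii() and text not in allowed:
            hits.append((number, text))
    return hits


# Hypothetical input: one plain line, one allowed non-ASCII line, and one
# line with a stray U+00A0 (nbsp) that is not allowlisted.
sample = [
    "<para>plain ASCII line</para>\n",
    "<para>Jos\u00e9 is a contributor</para>\n",
    "<para>stray\u00a0nbsp here</para>\n",
]
allowlist = ["<para>Jos\u00e9 is a contributor</para>\n"]

for number, text in find_disallowed_lines(sample, allowlist):
    print(f"line {number}: {text!r}")
```

Because entries are matched verbatim, any edit to an allowlisted line requires
updating the list as well, which is the maintenance cost mentioned above.
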
> I think it's better to add non-ASCII character checking to the
> committing checklist and let committers check for non-ASCII characters in
> the patch. Non-ASCII characters are rarely used, so it would not become a
> burden.
> https://wiki.postgresql.org/wiki/Committing_checklist
>
> Maybe we can add to the wiki page something like this?
>
> git diff origin/master | grep -P '[^\x00-\x7f]'
>
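
One thing to keep in mind with the command above is that it also matches
removed and context lines of the diff, not only the lines a patch adds. A
rough sketch of a check restricted to added lines could look like the
following. This is Python for illustration only; the function name and the
sample diff (including its file path) are hypothetical.

```python
def added_non_ascii_lines(diff_text):
    """Return the content of added lines in a unified diff that contain
    non-ASCII characters."""
    hits = []
    for line in diff_text.split("\n"):
        # Skip the "+++ b/file" header. Simplification: an added line whose
        # own content begins with "++" would be skipped as well.
        if line.startswith("+++"):
            continue
        if line.startswith("+") and not line.isascii():
            hits.append(line[1:])
    return hits


# Hypothetical diff: non-ASCII appears in a context line, a removed line,
# and an added line. Only the added line should be reported.
sample_diff = "\n".join([
    "--- a/doc/src/sgml/example.sgml",
    "+++ b/doc/src/sgml/example.sgml",
    "@@ -1,2 +1,2 @@",
    " <para>unchanged caf\u00e9 context</para>",
    "-<para>old\u00a0text</para>",
    "+<para>new\u00a0text</para>",
])

for text in added_non_ascii_lines(sample_diff):
    print(repr(text))
```
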
> > While testing this script, I found that "stylesheet-man.xsl" also has
> > non-ASCII characters. I don't know whether these characters are really
> > necessary, though, since I don't understand this file well.
>
> They are U+201C (double turned comma quotation mark) and U+201D
> (double comma quotation mark).
>
> <l:template name="sect3" text="Section %n, “%t”, in the documentation"/>
>
> I would like to know why they are necessary too.
+1
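
For anyone who wants to confirm which characters are involved, a small
sketch like this lists the distinct non-ASCII code points in a line together
with their current Unicode names. It is Python for illustration only, and
the function name is hypothetical.

```python
import unicodedata


def describe_non_ascii(text):
    """Return (code point, Unicode name) pairs for each distinct non-ASCII
    character in text, in order of first appearance."""
    seen = []
    for ch in text:
        if ord(ch) > 0x7F and ch not in seen:
            seen.append(ch)
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in seen
    ]


# The line quoted above from stylesheet-man.xsl, with the two quotation
# marks written as escapes.
line = (
    '<l:template name="sect3" '
    'text="Section %n, \u201c%t\u201d, in the documentation"/>'
)

for code_point, name in describe_non_ascii(line):
    print(code_point, name)
```
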
Regards,
Yugo Nagata
--
Yugo NAGATA <nagata@sraoss.co.jp>