Thread: UTF-8 for SGML docs?

UTF-8 for SGML docs?

From
Tatsuo Ishii
Date:
Hi,

Is it possible to use UTF-8 for SGML docs?

I would like to enhance the full text search docs, especially table 12-1,
which includes only ASCII and LATIN1 examples. I find that the word,
numword, etc. aliases allow not only Latin characters but also Asian
ones. This fact is not stated in the docs, and I'm afraid this might
discourage Japanese users from using full text search.

To address this, I would like to add Japanese examples to the table,
using UTF-8 encoding for the doc text.
--
Tatsuo Ishii
SRA OSS, Inc. Japan


Re: UTF-8 for SGML docs?

From
Tom Lane
Date:
Tatsuo Ishii <ishii@postgresql.org> writes:
> Is it possible to use UTF-8 for SGML docs?

No :-(.  We've been through this already, see discussions awhile back
about spelling non-English names correctly.  Unless there's a recognized
HTML entity for the character, you can't use it.
        regards, tom lane


Re: UTF-8 for SGML docs?

From
Tatsuo Ishii
Date:
> Tatsuo Ishii <ishii@postgresql.org> writes:
> > Is it possible to use UTF-8 for SGML docs?
> 
> No :-(.  We've been through this already, see discussions awhile back
> about spelling non-English names correctly.  Unless there's a recognized
> HTML entity for the character, you can't use it.

OK. So I will just add a comment that Japanese can also be used for word, etc.
--
Tatsuo Ishii
SRA OSS, Inc. Japan


Re: UTF-8 for SGML docs?

From
Gregory Stark
Date:
"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> Tatsuo Ishii <ishii@postgresql.org> writes:
>> Is it possible to use UTF-8 for SGML docs?
>
> No :-(.  We've been through this already, see discussions awhile back
> about spelling non-English names correctly.  Unless there's a recognized
> HTML entity for the character, you can't use it.

Are entities like &#12353; ok?

--
Gregory Stark
EnterpriseDB          http://www.enterprisedb.com
Ask me about EnterpriseDB's Slony Replication support!
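
For readers unfamiliar with the kind of entity being asked about: a numeric character reference spells a character by its Unicode code point rather than by a named entity. The following is a minimal illustrative sketch, not something taken from the thread; the helper name to_numeric_entity is hypothetical, and it relies only on Python's standard "xmlcharrefreplace" encoding error handler.

```python
def to_numeric_entity(text: str) -> str:
    """Replace every non-ASCII character with a decimal numeric
    character reference of the form &#NNNNN;.

    Hypothetical helper for illustration only; ASCII characters
    pass through unchanged.
    """
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    # HIRAGANA LETTER SMALL A is U+3041, i.e. decimal 12353.
    print(to_numeric_entity("ぁ"))  # prints &#12353;
```

This only shows how such a reference is formed; whether the documentation toolchain accepts it is a separate question, which the reply below addresses.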


Re: UTF-8 for SGML docs?

From
Tom Lane
Date:
Gregory Stark <stark@enterprisedb.com> writes:
>> No :-(.  We've been through this already, see discussions awhile back
>> about spelling non-English names correctly.  Unless there's a recognized
>> HTML entity for the character, you can't use it.

> Are entities like &#12353; ok?

No.  See prior thread.
        regards, tom lane