Thread: INFORMATION_SCHEMA node

INFORMATION_SCHEMA node

From
Tatsuo Ishii
Date:
In the following paragraph in information_schema:

 <term>character encoding form</term>
     <listitem>
      <para>
       An encoding of some character repertoire.  Most older character
       repertoires only use one encoding form, and so there are no
       separate names for them (e.g., <literal>LATIN1</literal> is an
       encoding form applicable to the <literal>LATIN1</literal>
       repertoire).  But for example Unicode has the encoding forms
       <literal>UTF8</literal>, <literal>UTF16</literal>, etc. (not
       all supported by PostgreSQL).  Encoding forms are not exposed
       as an SQL object, but are visible in this view.

This claims that the LATIN1 repertoire only uses one encoding form,
but actually LATIN1 can be encoded in another form: ISO-2022-JP-2 (a
7-bit encoding; see RFC 1554,
https://datatracker.ietf.org/doc/html/rfc1554, for more details).
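
To illustrate the point, here is a minimal sketch in Python (not part
of the patch). It hand-encodes LATIN1 text the way RFC 1554 describes:
ESC . A designates the ISO 8859-1 high part to G2, and ESC N
single-shifts one character from G2. The function name
latin1_to_iso2022jp2 is hypothetical, and the sketch deliberately
ignores every other character set that ISO-2022-JP-2 can carry.

```python
ESC = b"\x1b"
DESIGNATE_LATIN1_G2 = ESC + b".A"  # ESC 2/14 4/1: ISO 8859-1 high part -> G2
SINGLE_SHIFT_2 = ESC + b"N"        # ESC 4/14: next byte is taken from G2


def latin1_to_iso2022jp2(text: str) -> bytes:
    """Encode LATIN1 text as 7-bit ISO-2022-JP-2 (illustrative sketch only)."""
    out = bytearray()
    designated = False
    for byte in text.encode("latin-1"):
        if byte < 0x80:
            # ASCII passes through unchanged.
            out.append(byte)
            if byte == 0x0A:
                # RFC 1554 clears the G2 designation at the start of each line.
                designated = False
        elif byte >= 0xA0:
            if not designated:
                out += DESIGNATE_LATIN1_G2
                designated = True
            out += SINGLE_SHIFT_2
            out.append(byte & 0x7F)  # clear the high bit to get a 7-bit byte
        else:
            # 0x80-0x9F are C1 controls, not part of the 96-character high part.
            raise ValueError("C1 control characters are not representable")
    return bytes(out)


eight_bit = "café".encode("latin-1")     # the usual 8-bit LATIN1 form
seven_bit = latin1_to_iso2022jp2("café")  # the same text in a 7-bit form
print(eight_bit, seven_bit)
```

The same four characters come out as two different byte sequences, one
with the high bit set and one without, which is why LATIN1 is not a
good example of a repertoire with a single encoding form.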

If we still want to give an example of a repertoire that uses only one
encoding form, we could probably use LATIN2 instead, or another one
that is not supported by ISO-2022-JP-2 (ISO-2022-JP-2 supports
ISO 8859-1, i.e. LATIN1, and ISO 8859-7, i.e. Greek).

Attached is the patch that does this.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp



Re: INFORMATION_SCHEMA note

From
Tatsuo Ishii
Date:
(typo in the subject fixed)

> In the following paragraph in information_schema:
> 
>  <term>character encoding form</term>
>      <listitem>
>       <para>
>        An encoding of some character repertoire.  Most older character
>        repertoires only use one encoding form, and so there are no
>        separate names for them (e.g., <literal>LATIN1</literal> is an
>        encoding form applicable to the <literal>LATIN1</literal>
>        repertoire).  But for example Unicode has the encoding forms
>        <literal>UTF8</literal>, <literal>UTF16</literal>, etc. (not
>        all supported by PostgreSQL).  Encoding forms are not exposed
>        as an SQL object, but are visible in this view.
> 
> This claims that the LATIN1 repertoire only uses one encoding form,
> but actually LATIN1 can be encoded in another form: ISO-2022-JP-2 (a
> 7-bit encoding; see RFC 1554,
> https://datatracker.ietf.org/doc/html/rfc1554, for more details).
> 
> If we still want to give an example of a repertoire that uses only one
> encoding form, we could probably use LATIN2 instead, or another one
> that is not supported by ISO-2022-JP-2 (ISO-2022-JP-2 supports
> ISO 8859-1, i.e. LATIN1, and ISO 8859-7, i.e. Greek).
> 
> Attached is the patch that does this.

Any objection?
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp



Re: INFORMATION_SCHEMA note

From
Daniel Gustafsson
Date:
> On 4 Jan 2024, at 13:39, Tatsuo Ishii <ishii@sraoss.co.jp> wrote:

>> Attached is the patch that does this.

I don't think the patch was attached?

> Any objection?

I didn't study the RFC in depth but as expected it seems to back up your change
so the change seems reasonable.

--
Daniel Gustafsson




Re: INFORMATION_SCHEMA note

From
Tatsuo Ishii
Date:
>> On 4 Jan 2024, at 13:39, Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
> 
>>> Attached is the patch that does this.
> 
> I don't think the patch was attached?
> 
>> Any objection?
> 
> I didn't study the RFC in depth but as expected it seems to back up your change
> so the change seems reasonable.

Oops. Sorry. Patch attached.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

diff --git a/doc/src/sgml/information_schema.sgml b/doc/src/sgml/information_schema.sgml
index 0ca7d5a9e0..9e66be4e83 100644
--- a/doc/src/sgml/information_schema.sgml
+++ b/doc/src/sgml/information_schema.sgml
@@ -697,8 +697,8 @@
       <para>
        An encoding of some character repertoire.  Most older character
        repertoires only use one encoding form, and so there are no
-       separate names for them (e.g., <literal>LATIN1</literal> is an
-       encoding form applicable to the <literal>LATIN1</literal>
+       separate names for them (e.g., <literal>LATIN2</literal> is an
+       encoding form applicable to the <literal>LATIN2</literal>
        repertoire).  But for example Unicode has the encoding forms
        <literal>UTF8</literal>, <literal>UTF16</literal>, etc. (not
        all supported by PostgreSQL).  Encoding forms are not exposed

Re: INFORMATION_SCHEMA note

From
Daniel Gustafsson
Date:
> On 9 Jan 2024, at 00:54, Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
>
>>> On 4 Jan 2024, at 13:39, Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
>>
>>>> Attached is the patch that does this.
>>
>> I don't think the patch was attached?
>>
>>> Any objection?
>>
>> I didn't study the RFC in depth but as expected it seems to back up your change
>> so the change seems reasonable.
>
> Oops. Sorry. Patch attached.

That's exactly what I expected it to be, and it LGTM.

--
Daniel Gustafsson




Re: INFORMATION_SCHEMA note

From
Tatsuo Ishii
Date:
>> On 9 Jan 2024, at 00:54, Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
>> 
>>>> On 4 Jan 2024, at 13:39, Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
>>> 
>>>>> Attached is the patch that does this.
>>> 
>>> I don't think the patch was attached?
>>> 
>>>> Any objection?
>>> 
>>> I didn't study the RFC in depth but as expected it seems to back up your change
>>> so the change seems reasonable.
>> 
>> Oops. Sorry. Patch attached.
> 
> That's exactly what I expected it to be, and it LGTM.

Thanks for looking into it. Pushed to all supported branches.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp