RE: [PGdocs] fix description for handling pf non-ASCII characters - Mailing list pgsql-hackers

From Hayato Kuroda (Fujitsu)
Subject RE: [PGdocs] fix description for handling pf non-ASCII characters
Msg-id TYAPR01MB58662BC412E1290FC3348973F525A@TYAPR01MB5866.jpnprd01.prod.outlook.com
In response to Re: [PGdocs] fix description for handling pf non-ASCII characters  (jian he <jian.universality@gmail.com>)
Responses Re: [PGdocs] fix description for handling pf non-ASCII characters
List pgsql-hackers
Dear Jian,

Thank you for checking my patch!

>     
> in your patch:
> > printable ASCII characters will be replaced with a hex escape.
> 
> My wording is not good. I think the result will be: ASCII characters
> will be as is, non-ASCII characters will be replaced with "a hex
> escape".

Yes, your point is right. My patch already says:
"anything other than printable ASCII characters will be replaced with a hex escape"
IIUC, the two statements have the same meaning.

Perhaps you meant that the wording itself was unclear, so I have reworded it to
"non-ASCII characters will be replaced with hexadecimal strings." What do you think?

> set application_name to 'abc漢字Abc';
> SET
> test16=# show application_name;
>         application_name
> --------------------------------
>  abc\xe6\xbc\xa2\xe5\xad\x97Abc
> (1 row)
> 
> I see multi escape, so I am not sure "a hex escape".

I am not sure what you meant, but I could not find the term "hex escape" in the
documentation, so I used "hexadecimal string" instead. Is that acceptable?

> to properly render it back to  'abc漢字Abc'
> here is how i do it:
> select 'abc' || convert_from(decode(' e6bca2e5ad97','hex'), 'UTF8') || 'Abc';

Yes, your approach seems right, but I'm not sure it is related to this patch.
Just to confirm, I am not interested in the method for rendering non-ASCII characters.
My motivation for the patch was to document the incompatibility noted in [1]:

>
Changed the conversion rules when non-ASCII characters are specified for ASCII-only
strings such as parameters application_name and cluster_name. Previously, it was
converted in byte units with a question mark (?), but in PostgreSQL 16, it is
converted to a hexadecimal string.
>
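
To make the quoted change concrete, here is a minimal Python sketch of the two
rules as I understand them. It is only an illustration, not the server's C code:
the function names clean_ascii_pg15 and clean_ascii_pg16 are hypothetical, and it
assumes the value is UTF-8 encoded and that "printable ASCII" means bytes 0x20
to 0x7E.

```python
# Illustrative sketch only: both helpers are hypothetical names and are
# NOT PostgreSQL's actual implementation.

def clean_ascii_pg15(value: str, encoding: str = "utf-8") -> str:
    """Older rule as described in [1]: every byte that is not printable
    ASCII (assumed here to be 0x20..0x7E) becomes a question mark."""
    return "".join(
        chr(b) if 0x20 <= b <= 0x7E else "?"
        for b in value.encode(encoding)
    )


def clean_ascii_pg16(value: str, encoding: str = "utf-8") -> str:
    """PostgreSQL 16 rule: every such byte becomes a two-digit hex escape."""
    return "".join(
        chr(b) if 0x20 <= b <= 0x7E else f"\\x{b:02x}"
        for b in value.encode(encoding)
    )


print(clean_ascii_pg15("abc漢字Abc"))  # abc??????Abc
print(clean_ascii_pg16("abc漢字Abc"))  # abc\xe6\xbc\xa2\xe5\xad\x97Abc
```

The second line of output matches the result of "show application_name" quoted
above: each of the six UTF-8 bytes of the two non-ASCII characters is escaped
separately, which is why several escapes appear in one string.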

> I guess it's still painful if your application_name has non-ASCII chars.

I agree, but no one has recommended using non-ASCII characters there.

[1]:
https://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL16Beta1_New_Features_en_20230528_1.pdf

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
