Thread: BUG #3453: Error on COPY TO/FROM 'non-ascii-path'

BUG #3453: Error on COPY TO/FROM 'non-ascii-path'

From: "ITAGAKI Takahiro"

The following bug has been logged online:

Bug reference:      3453
Logged by:          ITAGAKI Takahiro
Email address:      itagaki.takahiro@oss.ntt.co.jp
PostgreSQL version: 8.2, 8.3dev
Operating system:   independent (especially Windows)
Description:        Error on COPY TO/FROM 'non-ascii-path'
Details:

When the PostgreSQL server encoding differs from the encoding used by the
OS, COPY TO/FROM 'non-ascii-path' cannot open the path. The cause is that we
pass path strings encoded in the server encoding directly to open(), without
regard to the OS's native encoding.

This is worse on East Asian versions of Windows: Postgres doesn't support
their native encodings as server encodings, so we cannot use non-ASCII
filenames there.
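
The mismatch described above can be illustrated with a small Python sketch. It is not PostgreSQL code. The file name is made up, and the two codecs are assumptions chosen to match the case discussed in this thread: "euc_jp" stands in for an EUC_JP server encoding and "cp932" for the Shift_JIS variant used as the code page on Japanese Windows.

```python
# Illustration only: the file name and both encodings are assumptions
# picked to mirror an EUC_JP server running on Japanese Windows.
filename = "データ.csv"

server_bytes = filename.encode("euc_jp")  # bytes as held in the server encoding
os_bytes = filename.encode("cp932")       # bytes the OS would use for this name

print(server_bytes)
print(os_bytes)

# The two byte strings differ, so passing server_bytes straight to open()
# does not name the file the user meant.
assert server_bytes != os_bytes

# Pure-ASCII names encode to identical bytes in both encodings, which is
# why the problem only appears with non-ASCII paths.
assert "data.csv".encode("euc_jp") == "data.csv".encode("cp932")
```
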

Re: BUG #3453: Error on COPY TO/FROM 'non-ascii-path'

From: "Hiroshi Saito"

Hi ITAGAKI-san.

I think this is a usage restriction for now, not a bug.
In Japan, usage is restricted because not every encoding is supported as
a server encoding; in this case the missing one is Shift_JIS.
http://winpg.jp/~saito/pg_bug/copy_to.png
http://winpg.jp/~saito/pg_bug/copy_from.png
The SERVER_ENCODING here is EUC_JP. :-(
This would be resolved if Shift_JIS were supported, as you suggested
before. For now, only the client-side \copy in psql can do it.
It will probably end up as a TODO item.

Is that different from what you mean?
Or do you have another point?

Regards,
Hiroshi Saito

Re: BUG #3453: Error on COPY TO/FROM 'non-ascii-path'

From: ITAGAKI Takahiro

"Hiroshi Saito" <z-saito@guitar.ocn.ne.jp> wrote:

Hi Saito-san,

> I think this is a usage restriction for now, not a bug.

Sure, but I'd like to point out that we pay no attention to mismatches
between the PG and OS encodings.

> In Japan, usage is restricted because not every encoding is supported as
> a server encoding; in this case the missing one is Shift_JIS.

We don't have to support SJIS as a server encoding for this purpose.
We just need a converter function like convert_server_to_os(), used
whenever the path might include non-ASCII characters.
We can skip the function in ordinary file operations, for example opening
a relation file, because paths under $PGDATA consist only of ASCII
characters. That minimizes the performance impact of the conversion.

One problem is that we might not have enough information about OS
encodings, at least we cannot determine it using a portable method.
If we knew the native encoding, the conversion itself would be easy.
For example, we could use convert('path-for-copy', server-encoding, 'SJIS')
on the Japanese version of Windows.
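
A minimal sketch of the proposed helper, written in Python for illustration rather than as backend C code. Only the name convert_server_to_os() comes from the message above. The signature, the encoding arguments, and the "cp932" codec (standing in for SJIS on Japanese Windows) are assumptions. It also assumes both encodings are ASCII-compatible, which is what makes the ASCII fast path safe.

```python
def convert_server_to_os(path: bytes, server_encoding: str, os_encoding: str) -> bytes:
    """Hypothetical helper: re-encode a path from the server encoding
    to the encoding the OS expects for file names."""
    # Fast path: a pure-ASCII path (for example, anything under $PGDATA)
    # has the same bytes in any ASCII-compatible encoding.
    if path.isascii() or server_encoding == os_encoding:
        return path
    # Decode using the server encoding, then encode for the OS.
    # Either step raises UnicodeError if a character cannot be converted.
    return path.decode(server_encoding).encode(os_encoding)


# Usage: a COPY path held in EUC_JP, converted for a Japanese Windows OS.
euc_path = "C:/データ/out.csv".encode("euc_jp")
sjis_path = convert_server_to_os(euc_path, "euc_jp", "cp932")
assert sjis_path == "C:/データ/out.csv".encode("cp932")

# ASCII-only paths are returned unchanged, with no conversion work.
assert convert_server_to_os(b"base/1/1259", "euc_jp", "cp932") == b"base/1/1259"
```

The sketch takes the OS encoding as a parameter because, as noted above, there may be no portable way for the server to discover it.
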

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Re: BUG #3453: Error on COPY TO/FROM 'non-ascii-path'

From: "Hiroshi Saito"

Hi.

>> I think this is a usage restriction for now, not a bug.
>
> Sure, but I'd like to point out that we pay no attention to mismatches
> between the PG and OS encodings.

Ah, yes. I agree that this confuses users.

>
>> In Japan, usage is restricted because not every encoding is supported as
>> a server encoding; in this case the missing one is Shift_JIS.
>
> We don't have to support SJIS as a server encoding for this purpose.
> We just need a converter function like convert_server_to_os(), used
> whenever the path might include non-ASCII characters.
> We can skip the function in ordinary file operations, for example opening
> a relation file, because paths under $PGDATA consist only of ASCII
> characters. That minimizes the performance impact of the conversion.
>
> One problem is that we might not have enough information about OS
> encodings, at least we cannot determine it using a portable method.
> If we knew the native encoding, the conversion itself would be easy.
> For example, we could use convert('path-for-copy', server-encoding, 'SJIS')
> on the Japanese version of Windows.

Hmm, as you suggest, there is surely a way to work around it now.
Do you mean the server side needs some basis for judging the OS encoding?
Sorry, I don't have a good idea at the moment.

Regards,
Hiroshi Saito

Re: BUG #3453: Error on COPY TO/FROM 'non-ascii-path'

From: Tom Lane

ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
> One problem is that we might not have enough information about OS
> encodings, at least we cannot determine it using a portable method.

That's probably because there is no such thing as an "OS encoding".
At least on most platforms, there isn't any such centralized setting.

            regards, tom lane