Thread: BUG #3453: Error on COPY TO/FROM 'non-ascii-path'
The following bug has been logged online:

Bug reference: 3453
Logged by: ITAGAKI Takahiro
Email address: itagaki.takahiro@oss.ntt.co.jp
PostgreSQL version: 8.2, 8.3dev
Operating system: independent (especially Windows)
Description: Error on COPY TO/FROM 'non-ascii-path'
Details:

When postgres is set to a character encoding different from the OS's, COPY TO/FROM 'non-ascii-path' cannot open the path. The cause of the problem is that we pass path strings encoded in the server encoding directly to open(), without regard to the OS's native encoding.

It is worse on East Asian versions of Windows: Postgres doesn't support their encodings as server encodings, so we cannot use non-ascii filenames there.
Hi ITAGAKI-san.

I think this is a usage restriction for now; it is not a bug. In Japan, use is restricted because not every encoding is supported as a server encoding at present. The one in question is Shift_JIS...

http://winpg.jp/~saito/pg_bug/copy_to.png
http://winpg.jp/~saito/pg_bug/copy_from.png

The SERVER_ENCODING here is EUC_JP. :-(

This would be resolved if Shift_JIS were supported, as you suggested before. So for now, only \copy in psql on the client side can do it. Probably the right thing is for this to become a TODO item. Is that wrong? Or do you have another point?

Regards,
Hiroshi Saito

From: "ITAGAKI Takahiro"

> The following bug has been logged online:
>
> Bug reference: 3453
> Logged by: ITAGAKI Takahiro
> Email address: itagaki.takahiro@oss.ntt.co.jp
> PostgreSQL version: 8.2, 8.3dev
> Operating system: independent (especially Windows)
> Description: Error on COPY TO/FROM 'non-ascii-path'
> Details:
>
> When postgres is set to a character encoding different from the OS's,
> COPY TO/FROM 'non-ascii-path' cannot open the path. The cause of the
> problem is that we pass path strings encoded in the server encoding
> directly to open(), without regard to the OS's native encoding.
>
> It is worse on East Asian versions of Windows: Postgres doesn't support
> their encodings as server encodings, so we cannot use non-ascii filenames
> there.
"Hiroshi Saito" <z-saito@guitar.ocn.ne.jp> wrote:

Hi Saito-san,

> I think this is a usage restriction for now; it is not a bug.

Sure, but I'd like to point out that we don't pay attention to mismatches between the PG and OS encodings.

> In Japan, use is restricted because not every encoding is supported
> as a server encoding at present. The one in question is Shift_JIS...

We don't have to support SJIS as a server encoding for this purpose. We just need a converter function like convert_server_to_os(), and to use it when the path might include non-ascii characters. We can skip the function in the usual file operations, for example when opening a relation file, because paths under $PGDATA consist of only ascii characters. That way we can minimize the performance impact of the conversion.

One problem is that we might not have enough information about OS encodings; at least, we cannot determine them using a portable method. If we knew the native encoding, the conversion itself would be easy. For example, we could use convert('path-for-copy', server-encoding, 'SJIS') on the Japanese version of Windows.

> > Bug reference: 3453
> > Description: Error on COPY TO/FROM 'non-ascii-path'
> > When postgres is set to a character encoding different from the OS's,
> > COPY TO/FROM 'non-ascii-path' cannot open the path.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
Hi.

>> I think this is a usage restriction for now; it is not a bug.
>
> Sure, but I'd like to point out that we don't pay attention to
> mismatches between the PG and OS encodings.

Ah, yes. I agree that this confuses users.

>> In Japan, use is restricted because not every encoding is supported
>> as a server encoding at present. The one in question is Shift_JIS...
>
> We don't have to support SJIS as a server encoding for this purpose.
> We just need a converter function like convert_server_to_os(), and to
> use it when the path might include non-ascii characters.
> We can skip the function in the usual file operations, for example when
> opening a relation file, because paths under $PGDATA consist of only
> ascii characters. That way we can minimize the performance impact of
> the conversion.
>
> One problem is that we might not have enough information about OS
> encodings; at least, we cannot determine them using a portable method.
> If we knew the native encoding, the conversion itself would be easy.
> For example, we could use convert('path-for-copy', server-encoding, 'SJIS')
> on the Japanese version of Windows.

Hmm, as you suggest, there surely is a way to work around it for now. Do you mean some information that the server side could use to decide this? Sorry, I don't have a good idea right now.

Regards,
Hiroshi Saito
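[Editor's note] The convert_server_to_os() idea discussed in the messages above can be illustrated with a minimal sketch. It is written in Python rather than PostgreSQL's C, and it is not code from the PostgreSQL tree: only the function name comes from the thread, while the signature and the codec names ("euc_jp", "cp932") are assumptions made for illustration. The sketch takes the OS encoding as a parameter, because how to discover it is exactly the open question in the thread.

```python
def convert_server_to_os(path: bytes, server_encoding: str, os_encoding: str) -> bytes:
    """Re-encode a path from the server encoding to the OS encoding.

    Hypothetical helper: the name comes from the thread, the signature
    is an assumption made for this sketch.
    """
    # Fast path: pure-ASCII paths (e.g. everything under $PGDATA) need no
    # conversion, which keeps the cost out of ordinary file operations.
    if path.isascii():
        return path
    # Slow path: decode from the server encoding, re-encode for the OS.
    return path.decode(server_encoding).encode(os_encoding)


# Example: server encoding EUC_JP, Japanese Windows code page (CP932).
file_name = "\u30c7\u30fc\u30bf.csv"  # katakana "de-ta" + ".csv"
server_path = file_name.encode("euc_jp")
os_path = convert_server_to_os(server_path, "euc_jp", "cp932")
```

The ASCII check mirrors the point made in the thread that the conversion can be skipped for paths that contain only ascii characters.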
ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:

> One problem is that we might not have enough information about OS
> encodings, at least we cannot determine it using a portable method.

That's probably because there is no such thing as an "OS encoding". At least on most platforms, there isn't any such centralized setting.

regards, tom lane
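[Editor's note] As a rough illustration of the point above, the sketch below shows what a single process can typically ask its runtime about encodings. It uses two Python standard-library calls, not anything from PostgreSQL, and the helper name describe_process_encodings is hypothetical. Both values describe the running process and its locale environment, not a system-wide setting, and they may differ between processes on the same machine.

```python
import locale
import sys


def describe_process_encodings() -> dict:
    """Report the encodings this process would use (hypothetical helper)."""
    return {
        # Encoding Python uses to convert file names for this process.
        "filesystem": sys.getfilesystemencoding(),
        # Encoding implied by the process's locale settings; passing False
        # avoids changing the process locale as a side effect.
        "locale_preferred": locale.getpreferredencoding(False),
    }


encodings = describe_process_encodings()
for name, value in encodings.items():
    print(f"{name}: {value}")
```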