Thread: Re: Client encoding conversion for binary data (was Re: GUC and postgresql.conf docs)

Re: Client encoding conversion for binary data (was Re: GUC and postgresql.conf docs)

From: "Zeugswetter Andreas SB SD"
> We could sidestep that issue if binary I/O for text was in server
> encoding in all cases.

I think that would be reasonable, yes. After all, one argument for using
binary mode is speed and efficiency, not its editability with
a text editor.

Andreas


"Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes:
>> We could sidestep that issue if binary I/O for text was in server
>> encoding in all cases.

> I think that would be reasonable, yes. After all one argument for using
> binary mode is speed and efficiency, and not its editability with
> a text editor.

I just realized that there is a comparable issue for plain-text COPY.
It performs client-to-server encoding conversions in all cases ---
including when reading/writing a file in the server's filesystem.

I think it is correct for plain-text COPY to perform such conversions
when doing COPY to/from the client.  I'm much less convinced that it
is sane to apply client_encoding to server-side files.  On the other
hand, there's still the point about dumping a file one way and loading
it back the other.  Also, it's probably unwise to change this behavior
without a really good argument for doing so, since (AFAIR) we've not
had bug reports about it.

Comments anyone?
        regards, tom lane
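
The "dumping a file one way and loading it back the other" hazard can be illustrated with a small Python sketch. This is not PostgreSQL code; the encodings, the sample string, and the helper names `copy_out` and `copy_in` are assumptions chosen only for illustration.

```python
# Hypothetical illustration only; none of this is PostgreSQL code.
SERVER_ENCODING = "utf-8"    # assumed database encoding
CLIENT_ENCODING = "latin-1"  # assumed client_encoding of the session


def copy_out(text: str, file_encoding: str) -> bytes:
    """Bytes a plain-text dump would contain if written in file_encoding."""
    return text.encode(file_encoding)


def copy_in(data: bytes, file_encoding: str) -> str:
    """Text a plain-text load would store if it assumes file_encoding."""
    return data.decode(file_encoding)


original = "M\u00fcller"  # "Mueller" spelled with u-umlaut

# Dump and load under the same assumption: the value survives.
same = copy_in(copy_out(original, CLIENT_ENCODING), CLIENT_ENCODING)

# Dump in server encoding, load as if it were client encoding: the
# two UTF-8 bytes of the umlaut are read as two Latin-1 characters.
mixed = copy_in(copy_out(original, SERVER_ENCODING), CLIENT_ENCODING)
```

Whichever rule is chosen for server-side files, the sketch suggests the dump and the load have to agree on it.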


Re: Client encoding conversion for binary data (was Re:

From: Hannu Krosing
Tom Lane wrote on Thu, 15.05.2003 at 01:44:
> "Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes:
> >> We could sidestep that issue if binary I/O for text was in server
> >> encoding in all cases.
> 
> > I think that would be reasonable, yes. After all one argument for using
> > binary mode is speed and efficiency, and not its editability with
> > a text editor.
> 
> I just realized that there is a comparable issue for plain-text COPY.
> It performs client-to-server encoding conversions in all cases ---
> including when reading/writing a file in the server's filesystem.
> 
> I think it is correct for plain-text COPY to perform such conversions
> when doing COPY to/from the client.  I'm much less convinced that it
> is sane to apply client_encoding to server-side files.

Yes, it seems completely bogus. The whole reason client-side encodings
exist is that each client may have its own (and even the same client
may use several, at least for different connections).

> On the other
> hand, there's still the point about dumping a file one way and loading
> it back the other.  Also, it's probably unwise to change this behavior
> without a really good argument for doing so, since (AFAIR) we've not
> had bug reports about it.

It cuts both ways: the lack of bug reports may also suggest that
nobody is doing it (copy a file to the server, then load the same file
from the client).

> Comments anyone?

I have not been closely following the discussion about FE/BE protocol
changes, but converting binary data this way seems dangerous: if you
insert a .gif file into a bytea column using
decode(encodedgiffile,'base64') but would like to get it out in binary
for performance reasons, it is not good if it gets run through
conversion routines.

-----------------
Hannu



Re: Client encoding conversion for binary data (was Re:

From: "Christopher Kings-Lynne"
> Yes, it seems completely bogus. The whole reason for existence of
> client-side encodings is that each client may have its own (and even the
> same client may use several, at least for different connections).
>
> > On the other
> > hand, there's still the point about dumping a file one way and loading
> > it back the other.  Also, it's probably unwise to change this behavior
> > without a really good argument for doing so, since (AFAIR) we've not
> > had bug reports about it.
>
> It works both ways, i.e. the lack of bug reports may also suggest that
> nobody is doing it (copy file to server, then load the same file from
> client)
>
> > Comments anyone?

Perhaps we should just have a flag in the COPY grammar, 'WITH/OUT CONVERSION',
that specifies whether an encoding conversion is required...

Chris



Hannu Krosing <hannu@tm.ee> writes:
> I have not been closely following the discussion about FE/BE protocol
> changes, but the way converting binary seems dangerous - if you insert a
> .gif file into bytea column using decode(encodedgiffile,'base64'), but
> would like to get it out in binary for performance reasons, it is not
> good if it gets run through conversion routines.

bytea does not get converted in any case.  The issue here is what to do
about text datatypes.
        regards, tom lane
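
Why binary values need to bypass conversion while text does not can be seen in a short Python sketch. The byte string, the encodings, and the `convert` helper are assumptions for illustration; this is not how the backend implements either bytea or encoding conversion.

```python
import base64

# Hypothetical sample: the first bytes of a tiny GIF, including bytes >= 0x80.
gif_bytes = b"GIF89a\x01\x00\x01\x00\x80\x00\x00\xff\xff\xff"

# What a client might send as text, and what a base64 decode gives back.
encoded = base64.b64encode(gif_bytes).decode("ascii")
stored = base64.b64decode(encoded)


def convert(data: bytes, src: str, dst: str) -> bytes:
    """Treat data as text in src encoding and re-encode it in dst."""
    return data.decode(src).encode(dst)


# Running the binary value through a text conversion rewrites every
# byte >= 0x80 into a two-byte UTF-8 sequence, so the image is damaged.
converted = convert(stored, "latin-1", "utf-8")
```

Here `stored` is byte-for-byte identical to `gif_bytes`, while `converted` is four bytes longer. The same bytes interpreted as UTF-8 would not even decode.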


Re: Client encoding conversion for binary data (was Re:

From: Hannu Krosing
Tom Lane wrote on Thu, 15.05.2003 at 15:54:
> Hannu Krosing <hannu@tm.ee> writes:
> > I have not been closely following the discussion about FE/BE protocol
> > changes, but the way converting binary seems dangerous - if you insert a
> > .gif file into bytea column using decode(encodedgiffile,'base64'), but
> > would like to get it out in binary for performance reasons, it is not
> > good if it gets run through conversion routines.
> 
> bytea does not get converted in any case.  The issue here is what to do
> about text datatypes.

For me the logical behaviour is:

1) all text moving between client and server should be converted

2) text staying on server should stay in server encoding.

3) if someone has to move text produced by \copy on server, let him do
it using a PostgreSQL function defined as "read_text(path) returns text"
so the file gets converted using standard mechanisms.

4) for special cases we could add WITH CLIENTENCODING to copy
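
Points 1 to 3 above can be sketched in Python as follows. This is only a model of the proposal: `read_text` stands in for the suggested function and does not exist in PostgreSQL, and the other helper names, the encodings, and the sample string are assumptions.

```python
import os
import tempfile

SERVER_ENCODING = "utf-8"  # assumption: the database encoding


def write_server_file(path: str, text: str) -> None:
    # Point 2: a file that stays on the server is written in server encoding.
    with open(path, "w", encoding=SERVER_ENCODING, newline="") as f:
        f.write(text)


def read_text(path: str) -> str:
    # Point 3: hypothetical read_text(path); decodes the file as server encoding.
    with open(path, "r", encoding=SERVER_ENCODING, newline="") as f:
        return f.read()


def send_to_client(text: str, client_encoding: str) -> bytes:
    # Point 1: text crossing to the client is converted to its encoding.
    return text.encode(client_encoding)


def demo(client_encoding: str = "latin-1"):
    """Write a server-side file, then deliver its text to one client."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "dump.txt")
        write_server_file(path, "T\u00f5nu\n")
        with open(path, "rb") as f:
            on_disk = f.read()
        on_wire = send_to_client(read_text(path), client_encoding)
    return on_disk, on_wire
```

In this model the file on disk never depends on any client's setting; conversion happens only at the moment text is handed to a particular client.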


-------------
Hannu