Thread: Re: [GENERAL] invalid byte sequence ?
On Thursday, 24 August 2006 00:52, Tom Lane wrote:
> A possible solution therefore is to have psql or libpq drive the
> client_encoding off the client's locale environment instead of letting
> it default to equal the server_encoding.

I got started on this and just wanted to post an intermediate patch. I have taken the logic from initdb and placed it into libpq and refined the API a bit. At this point, there should be no behavioral change. It remains to make libpq use this stuff if PGCLIENTENCODING is not set. Unless someone beats me to it, I'll figure that out later.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
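[Editor's sketch, not part of the posted patch.] The idea of deriving an encoding from the client's locale can be illustrated with the POSIX calls setlocale() and nl_langinfo(CODESET). The helper name locale_codeset() and its return convention are assumptions made for this example; mapping the returned OS codeset name onto a PostgreSQL encoding name is a separate step that is not shown.

```c
#include <langinfo.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (name and signature are assumptions): copy into buf
 * the codeset name the OS reports for the locale `ctype`. Passing "" means
 * "whatever the environment says". Returns 0 on success, -1 on failure.
 * The caller's LC_CTYPE setting is restored before returning.
 */
int
locale_codeset(const char *ctype, char *buf, size_t buflen)
{
    const char *cur = setlocale(LC_CTYPE, NULL);
    const char *codeset;
    char       *saved;
    int         rc = -1;

    if (cur == NULL)
        return -1;

    /* setlocale() may reuse its return buffer, so keep a private copy. */
    saved = malloc(strlen(cur) + 1);
    if (saved == NULL)
        return -1;
    strcpy(saved, cur);

    if (setlocale(LC_CTYPE, ctype) != NULL)
    {
        codeset = nl_langinfo(CODESET);
        if (codeset[0] != '\0' && strlen(codeset) < buflen)
        {
            strcpy(buf, codeset);
            rc = 0;
        }
    }

    setlocale(LC_CTYPE, saved);
    free(saved);
    return rc;
}
```

Calling locale_codeset("", buf, sizeof buf) would report the codeset of the user's environment. The spelling of the result is OS-specific, which is why a lookup table from OS names to encoding names is still needed on top of this.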
On Fri, Aug 25, 2006 at 05:07:03PM +0200, Peter Eisentraut wrote:
> I got started on this and just wanted to post an intermediate patch. I have
> taken the logic from initdb and placed it into libpq and refined the API a
> bit. At this point, there should be no behavioral change. It remains to
> make libpq use this stuff if PGCLIENTENCODING is not set. Unless someone
> beats me to it, I'll figure that out later.

Umm, why export all these functions? For starters, does this even need to be in libpq? I wouldn't have thought so the first time round, especially not three functions. The only thing you need is to take a locale name and return the charset you can pass to PQsetClientEncoding.

In fact, the only thing you need is PQsetClientEncodingFromLocale(); anything else is just sugar. Why would the user care about what the OS calls it? We have a "pg_enc" enum, so let's use it.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
On Friday, 25 August 2006 17:30, Martijn van Oosterhout wrote:
> Umm, why export all these functions? For starters, does this even need
> to be in libpq?

Where else would you put it?

> In fact, the only thing you need is PQsetClientEncodingFromLocale();
> anything else is just sugar. Why would the user care about what the OS
> calls it? We have a "pg_enc" enum, so let's use it.

initdb has different requirements. Let me know if you have a different way to refactor it that satisfies initdb.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
On Fri, Aug 25, 2006 at 05:38:20PM +0200, Peter Eisentraut wrote:
> > In fact, the only thing you need is PQsetClientEncodingFromLocale();
> > anything else is just sugar. Why would the user care about what the OS
> > calls it? We have a "pg_enc" enum, so let's use it.
>
> initdb has different requirements. Let me know if you have a different way to
> refactor it that satisfies initdb.

Well, check_encodings_match(pg_enc, ctype) is simply a short way of saying: if (find_matching_encoding(ctype) != pg_enc) { error }. And get_encoding_from_locale() is not used outside of those functions.

So the only thing initdb actually needs is an implementation of find_matching_encoding(ctype) that returns a value of "enum pg_enc". check_encodings_match() stays in initdb, and get_encoding_from_locale() becomes internal to libpq.

How does that sound?

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
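[Editor's sketch, not code from the patch.] The split proposed above could look roughly like this: one shared lookup that returns an enum value, a thin wrapper that stays in initdb, and a locale-parsing helper that is not exported. The enum demo_enc is a stand-in for "pg_enc", the lookup table is illustrative and far from complete, and all signatures are assumptions. The helper here parses the codeset out of the locale name purely for brevity; real code would more likely ask the OS.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the "pg_enc" enum; values are illustrative. */
typedef enum
{
    DEMO_ENC_UNKNOWN = -1,
    DEMO_UTF8,
    DEMO_LATIN1
} demo_enc;

/* Case-insensitive string equality using only ISO C. */
static int
same_name(const char *a, const char *b)
{
    while (*a && *b)
    {
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
            return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/*
 * Internal helper (would not be exported): pull the codeset part out of
 * a locale name such as "de_DE.UTF-8" or "de_DE.ISO-8859-1@euro".
 * Returns NULL if the name carries no codeset.
 */
static const char *
get_encoding_from_locale(const char *ctype, char *buf, size_t buflen)
{
    const char *dot = strchr(ctype, '.');
    size_t      len;

    if (dot == NULL)
        return NULL;
    dot++;
    len = strcspn(dot, "@");    /* stop before a modifier like "@euro" */
    if (len == 0 || len >= buflen)
        return NULL;
    memcpy(buf, dot, len);
    buf[len] = '\0';
    return buf;
}

/* The single shared entry point: locale name in, enum value out. */
demo_enc
find_matching_encoding(const char *ctype)
{
    static const struct
    {
        const char *codeset;
        demo_enc    enc;
    } map[] = {
        {"UTF-8", DEMO_UTF8},
        {"UTF8", DEMO_UTF8},
        {"ISO-8859-1", DEMO_LATIN1},
        {"ISO8859-1", DEMO_LATIN1}
    };
    char        buf[64];
    const char *codeset = get_encoding_from_locale(ctype, buf, sizeof buf);
    size_t      i;

    if (codeset == NULL)
        return DEMO_ENC_UNKNOWN;
    for (i = 0; i < sizeof map / sizeof map[0]; i++)
    {
        if (same_name(codeset, map[i].codeset))
            return map[i].enc;
    }
    return DEMO_ENC_UNKNOWN;
}

/* Stays in initdb: a thin wrapper. Returns 1 on match, 0 on mismatch. */
int
check_encodings_match(demo_enc enc, const char *ctype)
{
    if (find_matching_encoding(ctype) != enc)
    {
        fprintf(stderr, "encoding mismatch for locale \"%s\"\n", ctype);
        return 0;
    }
    return 1;
}
```

With this shape, only find_matching_encoding() has to be shared between initdb and the client library; the other two functions never appear in a public header.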
Peter Eisentraut <peter_e@gmx.net> writes:
> On Friday, 25 August 2006 17:30, Martijn van Oosterhout wrote:
>> Umm, why export all these functions? For starters, does this even need
>> to be in libpq?

> Where else would you put it?
> ...
> initdb has different requirements. Let me know if you have a different way to
> refactor it that satisfies initdb.

Um, but initdb doesn't use libpq, so it's going to need its own copy anyway.

I agree with Martijn that putting these into libpq's API seems like useless clutter.

regards, tom lane
Tom Lane wrote:
> Um, but initdb doesn't use libpq, so it's going to need its own copy
> anyway.

initdb certainly links against libpq.

> I agree with Martijn that putting these into libpq's API
> seems like useless clutter.

Where else to put it? We need it in libpq anyway if we want this behavior in all client applications (by default).

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Peter Eisentraut <peter_e@gmx.net> writes:
> Tom Lane wrote:
>> I agree with Martijn that putting these into libpq's API
>> seems like useless clutter.

> Where else to put it? We need it in libpq anyway if we want this
> behavior in all client applications (by default).

Having the code in libpq doesn't necessarily mean exposing it to the outside world. I can't see a reason for these to be in the API at all.

Possibly we could avoid the duplication-of-source-code issue by putting the code in libpgport, or someplace, whence both initdb and libpq could get at it?

regards, tom lane
On Fri, Aug 25, 2006 at 08:13:39PM +0200, Peter Eisentraut wrote:
> > I agree with Martijn that putting these into libpq's API
> > seems like useless clutter.
>
> Where else to put it? We need it in libpq anyway if we want this
> behavior in all client applications (by default).

Is that so? I thought we were only talking about psql. Even then, I'm wondering if we should alter the current behaviour at all if stdout is not a tty (i.e. run as a pipe).

And as a counter-example: pg_dump should absolutely not use the client locale; it should always dump in the same encoding as the server...

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Martijn van Oosterhout <kleptog@svana.org> writes:
> And as a counter-example: pg_dump should absolutely not use the client
> locale; it should always dump in the same encoding as the server...

Sure, but pg_dump should set that explicitly. I'm prepared to believe that looking at the locale is sane for all normal clients.

It might be worth providing a way to set the client_encoding through a PQconnectdb connection-string keyword, just in case the override-via-PGCLIENTENCODING dodge doesn't suit someone. The priority order would presumably be connection string, then PGCLIENTENCODING, then locale.

regards, tom lane
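[Editor's sketch, not an existing libpq function.] The priority order described above (connection string, then PGCLIENTENCODING, then locale) amounts to a first-non-empty selection. The function name choose_client_encoding() and its parameters are hypothetical; how each candidate value is obtained is left to the caller.

```c
#include <stddef.h>

/*
 * Hypothetical resolver: return the first candidate that is set and
 * non-empty, in the order
 *   1. connection-string keyword
 *   2. PGCLIENTENCODING environment variable
 *   3. encoding derived from the client's locale
 * Returns NULL if none applies, in which case the caller would keep the
 * existing default (client_encoding equal to server_encoding).
 */
const char *
choose_client_encoding(const char *conn_option,
                       const char *env_value,
                       const char *locale_encoding)
{
    if (conn_option != NULL && conn_option[0] != '\0')
        return conn_option;
    if (env_value != NULL && env_value[0] != '\0')
        return env_value;
    if (locale_encoding != NULL && locale_encoding[0] != '\0')
        return locale_encoding;
    return NULL;
}
```

A caller would typically pass getenv("PGCLIENTENCODING") as the second argument. A tool like pg_dump that must not follow the locale would simply set the encoding explicitly, which corresponds to the first, highest-priority slot.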
Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
> > And as a counter-example: pg_dump should absolutely not use the client
> > locale; it should always dump in the same encoding as the server...
>
> Sure, but pg_dump should set that explicitly. I'm prepared to believe
> that looking at the locale is sane for all normal clients.

What are "normal clients"? I would think that programs in PHP or Perl have their own idea of the correct encoding (JDBC already has one).

> It might be worth providing a way to set the client_encoding through a
> PQconnectdb connection-string keyword, just in case the
> override-via-PGCLIENTENCODING dodge doesn't suit someone. The priority
> order would presumably be connection string, then PGCLIENTENCODING, then
> locale.

This sounds like a good idea anyway...

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support