On Mon, Apr 16, 2018 at 12:09 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Keith Fiske <keith.fiske@crunchydata.com> writes: > Running into an issue with helping a client upgrade from 8.3 to 10 (yes, I > know, please keep the out of support comments to a minimum, thanks :).
> The old database was in SQL_ASCII and it needs to stay that way for now > unfortunately. The dump and restore itself works fine, but we're now > running into issues with some data returning encoding errors unless we > specifically set the client_encoding value to SQL_ASCII.
I'm guessing you might be hitting this 9.1 change:
* Have psql set the client encoding from the operating system locale by default (Heikki Linnakangas)
This only happens if the PGCLIENTENCODING environment variable is not set.
I think the previous default was to set client encoding equal to the server encoding. > Looking at the 8.3 database, it has the client_encoding value set to UTF8 > and queries seem to work fine. Is this just a bug in the old 8.3 not > enforcing encoding properly?
Somewhere along the line we made SQL_ASCII -> something else conversions check that the data was valid per the other encoding, even though no actual data change happens. > The other thing I noticed on the 10 instance was that, while the LOCALE was > set to SQL_ASCII,
You mean encoding, I assume. > the COLLATE and CTYPE values for the restored databases > were en_US.UTF-8. Could this be having an affect?
This is not a great idea, no. You could be getting strange misbehaviors in e.g. string comparison, because strcoll() will expect UTF8 data and will likely not cope well with data that isn't valid in that encoding.
If you can't sanitize the encoding of your data, I'd suggest running with lc_collate and lc_ctype set to "C".
regards, tom lane
Thanks to both of you Adrian & Tom.
It is the 9.1 change to the psql client that seems to be causing this.
And pg_controldata was able to show that the CTYPE and COLLATE were UTF8 on the old system. If that's the case, do you still think it's a good idea to set the COLLATE and CTYPE to "C"?