Thread: Problem with client_encoding and PsqlODBC driver.

Problem with client_encoding and PsqlODBC driver.

From
Kadri Raudsepp
Date:
Hello, 

I have some trouble with client_encoding using psqlodbc driver and - being unable to solve it - was hoping you would be able to help me.

I need to import large amount of data to SQL Server from postgres database. I'm using psqlodbc driver version 9.03.04.00, Microsoft SQL Server 2012 and postgres version 9.2. My postgres cluster has the encoding SQL_ASCII and database encoding is LATIN1.

I tried both Unicode(x64) and ANSI(x64) drivers, writing "SET CLIENT_ENCODING TO 'LATIN1'" to Connect Settings textbox. For both drivers, connection is successful and I use openquery to read the data, but my text data is all garbled.

Looking into the driver log I can see that the client_encoding is indeed set to LATIN1 as expected, but after that cc_lookup_characterset is called and with that the encoding is changed to UTF8 with Unicode driver and - I'm not sure what happens with ANSI driver, but the result does not look like desired. I have attached a log file that contains the logs for both driver's connection test.

Is there a setting I've overlooked or an action I could take to have the driver not to override my desired client_encoding? Am I doing something wrong altogether? Or is there maybe an older version of the driver that doesn't behave this way?

Thank you in advance and all the best,
Katie
Attachment

Re: Problem with client_encoding and PsqlODBC driver.

From
Adrian Klaver
Date:
On 03/20/2015 04:38 AM, Kadri Raudsepp wrote:
> Hello,
>
> I have some trouble with client_encoding using psqlodbc driver and -
> being unable to solve it - was hoping you would be able to help me.
>
> I need to import large amount of data to SQL Server from postgres
> database. I'm using psqlodbc driver version 9.03.04.00, Microsoft SQL
> Server 2012 and postgres version 9.2. My postgres cluster has the
> encoding SQL_ASCII and database encoding is LATIN1.
>
> I tried both Unicode(x64) and ANSI(x64) drivers, writing "SET
> CLIENT_ENCODING TO 'LATIN1'" to Connect Settings textbox. For both
> drivers, connection is successful and I use openquery to read the data,
> but my text data is all garbled.

So what is the encoding set for in openquery?

>
> Looking into the driver log I can see that the client_encoding is indeed
> set to LATIN1 as expected, but after that cc_lookup_characterset is
> called and with that the encoding is changed to UTF8 with Unicode driver
> and - I'm not sure what happens with ANSI driver, but the result does
> not look like desired. I have attached a log file that contains the logs
> for both driver's connection test.
>
> Is there a setting I've overlooked or an action I could take to have the
> driver not to override my desired client_encoding? Am I doing something
> wrong altogether? Or is there maybe an older version of the driver that
> doesn't behave this way?

I would not supply the "SET CLIENT_ENCODING TO 'LATIN1'" settings. I
would use the ANSI driver because:

1) From the logs:

Right at the beginning it does:
-----ANSI-----

[6088-0.000]CC_connect: entering...
[6088-0.000]sslmode=require
[6088-0.000]LIBPQ_CC_connect: entering...
[6088-0.000]Driver Version='09.03.0400,Oct 26 2014' linking 1600 dynamic
Multithread library
[6088-0.000]extracted a client_encoding 'LATIN1' from conn_settings

and at the end:
[6088-0.156]CC_lookup_characterset: entering...
[6088-0.156]conn->unicode=0

whereas the Unicode driver does:

6088-63527.931]conn->unicode=1

2) https://odbc.postgresql.org/faq.html#4.1

>
> Thank you in advance and all the best,
> Katie
>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: Problem with client_encoding and PsqlODBC driver.

From
Adrian Klaver
Date:
On 03/20/2015 08:48 AM, Kadri Raudsepp wrote:
> Sorry, didn't see the first question at first - I haven't been able to
> find a way to specify encoding with openquery, every attempt has
> resulted in syntax errors.

Just to be certain you are talking about?:

https://msdn.microsoft.com/en-us/library/ms188427.aspx

Some more questions:

1) Where is your Postgres server and on what OS?

2) Do you have pgAdmin(http://www.pgadmin.org/) or the psql client
installed on the Windows machine?
If so does running a query through them show the correct results?


>
> Katie
>
>

--
Adrian Klaver
adrian.klaver@aklaver.com


Re: Problem with client_encoding and PsqlODBC driver.

From
Adrian Klaver
Date:
On 03/20/2015 09:28 AM, Kadri Raudsepp wrote:
>
>
> On Fri, Mar 20, 2015 at 5:57 PM, Adrian Klaver
> <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
>
>     On 03/20/2015 08:48 AM, Kadri Raudsepp wrote:
>
>         Sorry, didn't see the first question at first - I haven't been
>         able to
>         find a way to specify encoding with openquery, every attempt has
>         resulted in syntax errors.
>
>
>     Just to be certain you are talking about?:
>
>     https://msdn.microsoft.com/en-__us/library/ms188427.aspx
>     <https://msdn.microsoft.com/en-us/library/ms188427.aspx>
>
>
> Yes, this is what I'm using.
>
>
>
>     Some more questions:
>
>     1) Where is your Postgres server and on what OS?
>
>
> The servers in question, SQL server and Postgres server, are in
> different countries, if that's what you meant. Postgres runs on Debian
> 3.16.7 x86_64. I'm the dba for the postgres database in question, but
> not too familiar with MS tools. Data import will be done by MS dba-s in
> their location, but they have had even less luck figuring out the
> encoding issue.
>
>
>     2) Do you have pgAdmin(http://www.pgadmin.__org/
>     <http://www.pgadmin.org/>) or the psql client installed on the
>     Windows machine?
>     If so does running a query through them show the correct results?
>
>
> I did install PgAdmin III in the environment I set up for testing the
> solution, which is as close as possible to the live environments we'll
> be using. Simply running the query there gives the same garbled result,
>   "õ" instead of "õ" etc. Running "set client_encoding to latin1;
> select * from table", however, gives the correct result.
> I have given the instructions for getting the correct data via PgAdmin
> and Import Data to my colleagues, but they are strictly MS people so I'm
> yet to see how this works out.
> I would prefer if there was minimal amount of trouble on their end,
> because we'll be migrating the data while the services running on that
> data have been turned off, which means any delay will cause financial
> loss. That's why I'm looking to get the driver working.

I do not use MS products much and SQL Server not all, so I am working
blind here. That has never stopped me before, so what I could find that
might apply:

1)  This post:

https://social.msdn.microsoft.com/forums/sqlserver/en-US/acb718fd-0943-4e1c-95b0-067361157821/how-to-use-openquery-for-utf8-character

2) Led me to:
https://technet.microsoft.com/en-us/library/ms186839%28v=sql.105%29.aspx

3) Led me to:
https://msdn.microsoft.com/en-us/library/ms144260%28v=sql.105%29.aspx

My guess is the answer lies in the above, but that is requires someone
more SQL Server savvy then I to unravel.

>
>
>
>
>
>         Katie
>
>
>
>     --
>     Adrian Klaver
>     adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>
>
> Katie


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: Problem with client_encoding and PsqlODBC driver.

From
Kadri Raudsepp
Date:
Thank you for your reply.

I get exactly the same result with supplying "SET CLIENT_ENCODING TO 'LATIN1'" setting and without, with both drivers. And I don't see question-marks (?), I see, for example "mõisa" instead of "mõisa".

Katie

On Fri, Mar 20, 2015 at 5:10 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 03/20/2015 04:38 AM, Kadri Raudsepp wrote:
Hello,

I have some trouble with client_encoding using psqlodbc driver and -
being unable to solve it - was hoping you would be able to help me.

I need to import large amount of data to SQL Server from postgres
database. I'm using psqlodbc driver version 9.03.04.00, Microsoft SQL
Server 2012 and postgres version 9.2. My postgres cluster has the
encoding SQL_ASCII and database encoding is LATIN1.

I tried both Unicode(x64) and ANSI(x64) drivers, writing "SET
CLIENT_ENCODING TO 'LATIN1'" to Connect Settings textbox. For both
drivers, connection is successful and I use openquery to read the data,
but my text data is all garbled.

So what is the encoding set for in openquery?


Looking into the driver log I can see that the client_encoding is indeed
set to LATIN1 as expected, but after that cc_lookup_characterset is
called and with that the encoding is changed to UTF8 with Unicode driver
and - I'm not sure what happens with ANSI driver, but the result does
not look like desired. I have attached a log file that contains the logs
for both driver's connection test.

Is there a setting I've overlooked or an action I could take to have the
driver not to override my desired client_encoding? Am I doing something
wrong altogether? Or is there maybe an older version of the driver that
doesn't behave this way?

I would not supply the "SET CLIENT_ENCODING TO 'LATIN1'" settings. I would use the ANSI driver because:

1) From the logs:

Right at the beginning it does:
-----ANSI-----

[6088-0.000]CC_connect: entering...
[6088-0.000]sslmode=require
[6088-0.000]LIBPQ_CC_connect: entering...
[6088-0.000]Driver Version='09.03.0400,Oct 26 2014' linking 1600 dynamic Multithread library
[6088-0.000]extracted a client_encoding 'LATIN1' from conn_settings

and at the end:
[6088-0.156]CC_lookup_characterset: entering...
[6088-0.156]conn->unicode=0

whereas the Unicode driver does:

6088-63527.931]conn->unicode=1

2) https://odbc.postgresql.org/faq.html#4.1



Thank you in advance and all the best,
Katie





--
Adrian Klaver
adrian.klaver@aklaver.com

Re: Problem with client_encoding and PsqlODBC driver.

From
Kadri Raudsepp
Date:
Sorry, didn't see the first question at first - I haven't been able to find a way to specify encoding with openquery, every attempt has resulted in syntax errors.

Katie

On Fri, Mar 20, 2015 at 5:45 PM, Kadri Raudsepp <raudsepp.kadri@gmail.com> wrote:
Thank you for your reply.

I get exactly the same result with supplying "SET CLIENT_ENCODING TO 'LATIN1'" setting and without, with both drivers. And I don't see question-marks (?), I see, for example "mõisa" instead of "mõisa".

Katie

On Fri, Mar 20, 2015 at 5:10 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 03/20/2015 04:38 AM, Kadri Raudsepp wrote:
Hello,

I have some trouble with client_encoding using psqlodbc driver and -
being unable to solve it - was hoping you would be able to help me.

I need to import large amount of data to SQL Server from postgres
database. I'm using psqlodbc driver version 9.03.04.00, Microsoft SQL
Server 2012 and postgres version 9.2. My postgres cluster has the
encoding SQL_ASCII and database encoding is LATIN1.

I tried both Unicode(x64) and ANSI(x64) drivers, writing "SET
CLIENT_ENCODING TO 'LATIN1'" to Connect Settings textbox. For both
drivers, connection is successful and I use openquery to read the data,
but my text data is all garbled.

So what is the encoding set for in openquery?


Looking into the driver log I can see that the client_encoding is indeed
set to LATIN1 as expected, but after that cc_lookup_characterset is
called and with that the encoding is changed to UTF8 with Unicode driver
and - I'm not sure what happens with ANSI driver, but the result does
not look like desired. I have attached a log file that contains the logs
for both driver's connection test.

Is there a setting I've overlooked or an action I could take to have the
driver not to override my desired client_encoding? Am I doing something
wrong altogether? Or is there maybe an older version of the driver that
doesn't behave this way?

I would not supply the "SET CLIENT_ENCODING TO 'LATIN1'" settings. I would use the ANSI driver because:

1) From the logs:

Right at the beginning it does:
-----ANSI-----

[6088-0.000]CC_connect: entering...
[6088-0.000]sslmode=require
[6088-0.000]LIBPQ_CC_connect: entering...
[6088-0.000]Driver Version='09.03.0400,Oct 26 2014' linking 1600 dynamic Multithread library
[6088-0.000]extracted a client_encoding 'LATIN1' from conn_settings

and at the end:
[6088-0.156]CC_lookup_characterset: entering...
[6088-0.156]conn->unicode=0

whereas the Unicode driver does:

6088-63527.931]conn->unicode=1

2) https://odbc.postgresql.org/faq.html#4.1



Thank you in advance and all the best,
Katie





--
Adrian Klaver
adrian.klaver@aklaver.com


Re: Problem with client_encoding and PsqlODBC driver.

From
Kadri Raudsepp
Date:


On Fri, Mar 20, 2015 at 5:57 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 03/20/2015 08:48 AM, Kadri Raudsepp wrote:
Sorry, didn't see the first question at first - I haven't been able to
find a way to specify encoding with openquery, every attempt has
resulted in syntax errors.

Just to be certain you are talking about?:

https://msdn.microsoft.com/en-us/library/ms188427.aspx

Yes, this is what I'm using. 


Some more questions:

1) Where is your Postgres server and on what OS?

The servers in question, SQL server and Postgres server, are in different countries, if that's what you meant. Postgres runs on Debian 3.16.7 x86_64. I'm the dba for the postgres database in question, but not too familiar with MS tools. Data import will be done by MS dba-s in their location, but they have had even less luck figuring out the encoding issue.
 

2) Do you have pgAdmin(http://www.pgadmin.org/) or the psql client installed on the Windows machine?
If so does running a query through them show the correct results?

I did install PgAdmin III in the environment I set up for testing the solution, which is as close as possible to the live environments we'll be using. Simply running the query there gives the same garbled result,  "õ" instead of "õ" etc. Running "set client_encoding to latin1; select * from table", however, gives the correct result.
I have given the instructions for getting the correct data via PgAdmin and Import Data to my colleagues, but they are strictly MS people so I'm yet to see how this works out.
I would prefer if there was minimal amount of trouble on their end, because we'll be migrating the data while the services running on that data have been turned off, which means any delay will cause financial loss. That's why I'm looking to get the driver working.





Katie



--
Adrian Klaver
adrian.klaver@aklaver.com

Katie

Re: Problem with client_encoding and PsqlODBC driver.

From
Adrian Klaver
Date:
On 03/20/2015 11:10 AM, Kadri Raudsepp wrote:
> Thank you for trying to help, Adrian :)
>
> The links you sent I have already seen, no help unfortunately. I have
> dug deep into SQL Server's soul and system tables and system
> configuration and procedures and functions and anything I could think
> of. I'm quite convinced by now that my issue is impossible to solve from
> the SQL Server end and apart from the PgAdmin solution for getting the
> correct characters in my data which requires a lot of additional steps
> and time, the only hope is in the odbc driver itself, getting it somehow
> not to override my latin1 client_encoding.

I don't think it is. From the results you show most of the characters
display correctly. This seems to me that the encoding on the receiving
end is close but not an exact match. The only thing I have left to
suggest is the MS ODBC Test utility:

https://msdn.microsoft.com/en-us/library/ms712676%28v=vs.85%29.aspx

Maybe getting closer to a raw connection will help id the problem.

>
> Thank you again and have a nice weekend!
> Katie
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: Problem with client_encoding and PsqlODBC driver.

From
Kadri Raudsepp
Date:
Thank you for trying to help, Adrian :)

The links you sent I have already seen, no help unfortunately. I have dug deep into SQL Server's soul and system tables and system configuration and procedures and functions and anything I could think of. I'm quite convinced by now that my issue is impossible to solve from the SQL Server end and apart from the PgAdmin solution for getting the correct characters in my data which requires a lot of additional steps and time, the only hope is in the odbc driver itself, getting it somehow not to override my latin1 client_encoding. 

Thank you again and have a nice weekend!
Katie

On Fri, Mar 20, 2015 at 6:39 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 03/20/2015 09:28 AM, Kadri Raudsepp wrote:


On Fri, Mar 20, 2015 at 5:57 PM, Adrian Klaver
<adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:

    On 03/20/2015 08:48 AM, Kadri Raudsepp wrote:

        Sorry, didn't see the first question at first - I haven't been
        able to
        find a way to specify encoding with openquery, every attempt has
        resulted in syntax errors.


    Just to be certain you are talking about?:

    https://msdn.microsoft.com/en-__us/library/ms188427.aspx
    <https://msdn.microsoft.com/en-us/library/ms188427.aspx>


Yes, this is what I'm using.



    Some more questions:

    1) Where is your Postgres server and on what OS?


The servers in question, SQL server and Postgres server, are in
different countries, if that's what you meant. Postgres runs on Debian
3.16.7 x86_64. I'm the dba for the postgres database in question, but
not too familiar with MS tools. Data import will be done by MS dba-s in
their location, but they have had even less luck figuring out the
encoding issue.


    2) Do you have pgAdmin(http://www.pgadmin.__org/
    <http://www.pgadmin.org/>) or the psql client installed on the
    Windows machine?
    If so does running a query through them show the correct results?


I did install PgAdmin III in the environment I set up for testing the
solution, which is as close as possible to the live environments we'll
be using. Simply running the query there gives the same garbled result,
  "õ" instead of "õ" etc. Running "set client_encoding to latin1;
select * from table", however, gives the correct result.
I have given the instructions for getting the correct data via PgAdmin
and Import Data to my colleagues, but they are strictly MS people so I'm
yet to see how this works out.
I would prefer if there was minimal amount of trouble on their end,
because we'll be migrating the data while the services running on that
data have been turned off, which means any delay will cause financial
loss. That's why I'm looking to get the driver working.

I do not use MS products much and SQL Server not all, so I am working blind here. That has never stopped me before, so what I could find that might apply:

1)  This post:
https://social.msdn.microsoft.com/forums/sqlserver/en-US/acb718fd-0943-4e1c-95b0-067361157821/how-to-use-openquery-for-utf8-character

2) Led me to:
https://technet.microsoft.com/en-us/library/ms186839%28v=sql.105%29.aspx

3) Led me to:
https://msdn.microsoft.com/en-us/library/ms144260%28v=sql.105%29.aspx

My guess is the answer lies in the above, but that is requires someone more SQL Server savvy then I to unravel.






        Katie



    --
    Adrian Klaver
    adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>


Katie


--
Adrian Klaver
adrian.klaver@aklaver.com