Re: utf8 encoding problem with plperlu - Mailing list pgsql-general

From Ronald Peterson
Subject Re: utf8 encoding problem with plperlu
Date
Msg-id CAJPRK8YEuGUYkxTmsayjAB9P_QUwy6H9WuWtdzmkerN_rNGS2Q@mail.gmail.com
In response to Re: utf8 encoding problem with plperlu  (Pavel Stehule <pavel.stehule@gmail.com>)
List pgsql-general
Thanks Pavel, this looks promising.  I didn't know about the Data::Peek module - that might help me figure out what is going on.

On Wed, Jul 15, 2015 at 2:28 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:


2015-07-15 20:20 GMT+02:00 Ronald Peterson <ron@hub.yellowbank.com>:
That's interesting.  What I'm really doing, instead of the second elog statement, is this:

$ret = $ldap->modify( $dn,
                      replace => {
                          unicodePwd => $mspass
                      } );

This does work for strings that don't contain consecutive zeroes.  I'm not really passing the string to PostgreSQL, but to Net::LDAP, but it must hit PostgreSQL anyway?  Active Directory requires this encoding, so I'm not sure what to do here.

I had some issues when I used some Perl libraries with UTF-8 strings: some require the UTF-8 flag to be set on the string, some require it not to be. And Postgres did not set this UTF-8 flag correctly.

http://blog.endpoint.com/2014/02/dbdpg-utf-8-perl-postgresql.html

Maybe you have a similar issue on the server side.
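
A minimal sketch of how to inspect that flag with core Perl only, in case it helps narrow things down. The sample string and variable names are made up for illustration; the point is just that decode() yields a character string (flag on) and encode() yields a byte string (flag off), and a library may expect one or the other.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Illustrative input: the UTF-8 bytes for "café" (hypothetical sample data).
my $chars = decode('UTF-8', "caf\xc3\xa9");   # character string, UTF-8 flag on
my $bytes = encode('UTF-8', $chars);          # byte string, UTF-8 flag off

# Report the flag and the length Perl sees for each form.
printf "chars: flag=%d length=%d\n", (utf8::is_utf8($chars) ? 1 : 0), length $chars;
printf "bytes: flag=%d length=%d\n", (utf8::is_utf8($bytes) ? 1 : 0), length $bytes;
```

Running the same two printf lines on the value you hand to the library should show which form it is actually getting.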

Pavel

 


On Wed, Jul 15, 2015 at 11:57 AM, Daniel Verite <daniel@manitou-mail.org> wrote:
        Ronald Peterson wrote:

> # select * from doublezero();
> INFO:  double00
> CONTEXT:  PL/Perl function "doublezero"
> ERROR:  invalid byte sequence for encoding "UTF8": 0x00 at line 8, <DATA>
> line 558.
> CONTEXT:  PL/Perl function "doublezero"
>
> I don't understand this.  I need to pass $mspass to Active Directory, and
> the encoding is exactly as it should be, which is to say, it works for
> strings that don't include two consecutive zeros.  Is this a bug?

When replacing the literal "double00" with "foobar" in your function,
the same error occurs for me:

    test=# select doublezero();
    INFO:  foobar
    CONTEXT:  PL/Perl function "doublezero"
    ERROR:  invalid byte sequence for encoding "UTF8": 0x00 at line 6.
    CONTEXT:  fonction PL/Perl « doublezero »

Anyway it's not clear what you expect. PG doesn't support UTF-16,
and even if it did, it wouldn't accept such strings when the current
encoding is UTF-8.
If Active Directory wants UTF-16LE, you have to do that conversion, but
don't pass the result back to postgres in this format.
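
As a rough sketch of that separation, assuming the usual Active Directory convention of a double-quoted password encoded as UTF-16LE: the helper name ad_password_bytes is hypothetical, and the script runs standalone rather than inside PL/Perl. The encoded bytes would go only to Net::LDAP, while anything sent to elog or returned from the function is plain ASCII, such as a hex dump.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: wrap the password in double quotes and encode it
# as UTF-16LE, the form assumed here for the unicodePwd attribute.
sub ad_password_bytes {
    my ($password) = @_;
    return encode('UTF-16LE', qq("$password"));
}

my $mspass = ad_password_bytes('double00');

# The encoded value contains 0x00 bytes for every ASCII character, so it
# is not valid text for a UTF8 database. For logging, use a hex dump
# (plain ASCII) instead of the raw bytes.
my $hex = unpack('H*', $mspass);
print "unicodePwd bytes (hex): $hex\n";
```

Inside the PL/Perl function, $mspass would then be passed to $ldap->modify as before, and only $hex (or nothing at all) would be given to elog or returned.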


Best regards,
--
Daniel
PostgreSQL-powered mail user agent and storage: http://www.manitou-mail.org



--
-R-







