Thread: Prevent characters not transposable to LATIN9

Prevent characters not transposable to LATIN9

From
Arnaud Lesauvage
Date:
Hi list !

We have a database in UTF8, from which we have to export text files in
LATIN9 encoding (or WIN1252, which is almostthe same I believe).

Records are entered via MSAccess forms (on psqlodbc-linked tables).
The problem is that some of the characters input by the users have no
equivalent in LATIN9.

How could I easily write a CONSTRAINT (or RULE) that would check that
everything entered in the fields have an equivalent in my specific
destination encoding ?

Thanks for any hints on this !

Regards
--
Arnaud Lesauvage

Re: Prevent characters not transposable to LATIN9

From
Arnaud Lesauvage
Date:
Le 30/06/2010 2:42, Howard Rogers a écrit :
> Something I do in Oracle: do a TRANSLATE on whatever string is being
> supplied, converting matching characters to spaces, and measure the length.
> If the length is greater than zero, your supplied string has something in it
> you're not expecting, at which point you can intervene and warn your users
> with some sort of error message. For example:

Hi Howard, sorry for the late reply.
Yes, that could work indeed.
I did not want to take the hassle of having to list all acceptable
characters, but that might be the only way.

Thanks for the hint !



>
>   ims=# select length(translate('MYSTRING','ABCDEFGHIJKLM',' '));
>     ?column?
>    ----------
>            5
>    (1 row)
>
>
> It's '5' because I have no translation specified for 'Y', 'S' or 'T', 'R' or
> 'N'. If I add those in, then I get zero:
>
>   ims=# select length(translate('MYSTRING','ABCDEFGHIJKLMYSTRN',' '));
>     ?column?
>    ----------
>            0
>    (1 row)
>
>
> So, in that second argument to the translate function, specify all the
> characters you're prepared to accept, and test form input for the length
> being zero after the translate function has applied.
>
> Regards
> HJR
>
>
> On Wed, Jun 30, 2010 at 12:52 AM, Arnaud Lesauvage
> <arnaud.listes@codata.eu>wrote:
>
>>  Hi list !
>>
>>  We have a database in UTF8, from which we have to export text files in
>>  LATIN9 encoding (or WIN1252, which is almostthe same I believe).
>>
>>  Records are entered via MSAccess forms (on psqlodbc-linked tables).
>>  The problem is that some of the characters input by the users have no
>>  equivalent in LATIN9.
>>
>>  How could I easily write a CONSTRAINT (or RULE) that would check that
>>  everything entered in the fields have an equivalent in my specific
>>  destination encoding ?
>>
>>  Thanks for any hints on this !
>>
>>  Regards
>>  --
>>  Arnaud Lesauvage
>>
>>  --
>>  Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>>  To make changes to your subscription:
>>  http://www.postgresql.org/mailpref/pgsql-general
>>
>


Re: Prevent characters not transposable to LATIN9

From
Sam Mason
Date:
On Tue, Jun 29, 2010 at 04:52:22PM +0200, Arnaud Lesauvage wrote:
> We have a database in UTF8, from which we have to export text files in
> LATIN9 encoding (or WIN1252, which is almostthe same I believe).
>
> Records are entered via MSAccess forms (on psqlodbc-linked tables).
> The problem is that some of the characters input by the users have no
> equivalent in LATIN9.
>
> How could I easily write a CONSTRAINT (or RULE) that would check that
> everything entered in the fields have an equivalent in my specific
> destination encoding ?

How about using the built in character conversion routines.  Something
like:

  col = convert_from(convert_to(col, 'LATIN9'),'LATIN9')

as the check constraint, or its inverse as the where clause for the
erroneous rows?

--
  Sam  http://samason.me.uk/

Re: Prevent characters not transposable to LATIN9

From
Arnaud Lesauvage
Date:
Le 1/07/2010 16:48, Sam Mason a écrit :
> On Tue, Jun 29, 2010 at 04:52:22PM +0200, Arnaud Lesauvage wrote:
>>  We have a database in UTF8, from which we have to export text files in
>>  LATIN9 encoding (or WIN1252, which is almostthe same I believe).
>>
>>  Records are entered via MSAccess forms (on psqlodbc-linked tables).
>>  The problem is that some of the characters input by the users have no
>>  equivalent in LATIN9.
>>
>>  How could I easily write a CONSTRAINT (or RULE) that would check that
>>  everything entered in the fields have an equivalent in my specific
>>  destination encoding ?
>
> How about using the built in character conversion routines.  Something
> like:
>
>    col = convert_from(convert_to(col, 'LATIN9'),'LATIN9')
>
> as the check constraint, or its inverse as the where clause for the
> erroneous rows?


What happens then for a character that does not have an equivalent in
LATIN9 ?
If an error is raised in the check constraint, does it look like a
normal check error ?

Re: Prevent characters not transposable to LATIN9

From
Sam Mason
Date:
On Thu, Jul 01, 2010 at 04:53:51PM +0200, Arnaud Lesauvage wrote:
> Le 1/07/2010 16:48, Sam Mason a écrit :
>> How about using the built in character conversion routines.  Something
>> like:
>>
>>    col = convert_from(convert_to(col, 'LATIN9'),'LATIN9')
>>
>> as the check constraint, or its inverse as the where clause for the
>> erroneous rows?
>
> What happens then for a character that does not have an equivalent in
> LATIN9 ?
> If an error is raised in the check constraint, does it look like a
> normal check error ?

Yoik, didn't think about how it would actually handle the conversion!
It appears to throw an exception, so you probably want to bundle it up
in a pl/pgsql function that catches it and does the "right thing" for
you.  Maybe something like:

  CREATE FUNCTION isstringrepresentable(str TEXT, charset TEXT) RETURNS BOOLEAN AS $$
    BEGIN
      RETURN str = convert_from(convert_to(str, charset),charset);
    EXCEPTION WHEN OTHERS THEN
      RETURN FALSE;
    END
  $$ LANGUAGE plpgsql IMMUTABLE;

I'm not sure which exception it should be catching, "OTHERS" is a catch
all which is a bit cheesy but should do the right thing most of the
time.

--
  Sam  http://samason.me.uk/

Re: Prevent characters not transposable to LATIN9

From
Arnaud Lesauvage
Date:
Le 1/07/2010 17:12, Sam Mason a écrit :
> On Thu, Jul 01, 2010 at 04:53:51PM +0200, Arnaud Lesauvage wrote:
>>  Le 1/07/2010 16:48, Sam Mason a écrit :
>>>  How about using the built in character conversion routines.  Something
>>>  like:
>>>
>>>     col = convert_from(convert_to(col, 'LATIN9'),'LATIN9')
>>>
>>>  as the check constraint, or its inverse as the where clause for the
>>>  erroneous rows?
>>
>>  What happens then for a character that does not have an equivalent in
>>  LATIN9 ?
>>  If an error is raised in the check constraint, does it look like a
>>  normal check error ?
>
> Yoik, didn't think about how it would actually handle the conversion!
> It appears to throw an exception, so you probably want to bundle it up
> in a pl/pgsql function that catches it and does the "right thing" for
> you.  Maybe something like:
>
>    CREATE FUNCTION isstringrepresentable(str TEXT, charset TEXT) RETURNS BOOLEAN AS $$
>      BEGIN
>        RETURN str = convert_from(convert_to(str, charset),charset);
>      EXCEPTION WHEN OTHERS THEN
>        RETURN FALSE;
>      END
>    $$ LANGUAGE plpgsql IMMUTABLE;

Yep, I'll give it a try as soon as I find some time !

Thanks for the hint.