Thread: Prevent characters not transposable to LATIN9
Hi list ! We have a database in UTF8, from which we have to export text files in LATIN9 encoding (or WIN1252, which is almostthe same I believe). Records are entered via MSAccess forms (on psqlodbc-linked tables). The problem is that some of the characters input by the users have no equivalent in LATIN9. How could I easily write a CONSTRAINT (or RULE) that would check that everything entered in the fields have an equivalent in my specific destination encoding ? Thanks for any hints on this ! Regards -- Arnaud Lesauvage
Le 30/06/2010 2:42, Howard Rogers a écrit : > Something I do in Oracle: do a TRANSLATE on whatever string is being > supplied, converting matching characters to spaces, and measure the length. > If the length is greater than zero, your supplied string has something in it > you're not expecting, at which point you can intervene and warn your users > with some sort of error message. For example: Hi Howard, sorry for the late reply. Yes, that could work indeed. I did not want to take the hassle of having to list all acceptable characters, but that might be the only way. Thanks for the hint ! > > ims=# select length(translate('MYSTRING','ABCDEFGHIJKLM',' ')); > ?column? > ---------- > 5 > (1 row) > > > It's '5' because I have no translation specified for 'Y', 'S' or 'T', 'R' or > 'N'. If I add those in, then I get zero: > > ims=# select length(translate('MYSTRING','ABCDEFGHIJKLMYSTRN',' ')); > ?column? > ---------- > 0 > (1 row) > > > So, in that second argument to the translate function, specify all the > characters you're prepared to accept, and test form input for the length > being zero after the translate function has applied. > > Regards > HJR > > > On Wed, Jun 30, 2010 at 12:52 AM, Arnaud Lesauvage > <arnaud.listes@codata.eu>wrote: > >> Hi list ! >> >> We have a database in UTF8, from which we have to export text files in >> LATIN9 encoding (or WIN1252, which is almostthe same I believe). >> >> Records are entered via MSAccess forms (on psqlodbc-linked tables). >> The problem is that some of the characters input by the users have no >> equivalent in LATIN9. >> >> How could I easily write a CONSTRAINT (or RULE) that would check that >> everything entered in the fields have an equivalent in my specific >> destination encoding ? >> >> Thanks for any hints on this ! >> >> Regards >> -- >> Arnaud Lesauvage >> >> -- >> Sent via pgsql-general mailing list (pgsql-general@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-general >> >
On Tue, Jun 29, 2010 at 04:52:22PM +0200, Arnaud Lesauvage wrote: > We have a database in UTF8, from which we have to export text files in > LATIN9 encoding (or WIN1252, which is almostthe same I believe). > > Records are entered via MSAccess forms (on psqlodbc-linked tables). > The problem is that some of the characters input by the users have no > equivalent in LATIN9. > > How could I easily write a CONSTRAINT (or RULE) that would check that > everything entered in the fields have an equivalent in my specific > destination encoding ? How about using the built in character conversion routines. Something like: col = convert_from(convert_to(col, 'LATIN9'),'LATIN9') as the check constraint, or its inverse as the where clause for the erroneous rows? -- Sam http://samason.me.uk/
Le 1/07/2010 16:48, Sam Mason a écrit : > On Tue, Jun 29, 2010 at 04:52:22PM +0200, Arnaud Lesauvage wrote: >> We have a database in UTF8, from which we have to export text files in >> LATIN9 encoding (or WIN1252, which is almostthe same I believe). >> >> Records are entered via MSAccess forms (on psqlodbc-linked tables). >> The problem is that some of the characters input by the users have no >> equivalent in LATIN9. >> >> How could I easily write a CONSTRAINT (or RULE) that would check that >> everything entered in the fields have an equivalent in my specific >> destination encoding ? > > How about using the built in character conversion routines. Something > like: > > col = convert_from(convert_to(col, 'LATIN9'),'LATIN9') > > as the check constraint, or its inverse as the where clause for the > erroneous rows? What happens then for a character that does not have an equivalent in LATIN9 ? If an error is raised in the check constraint, does it look like a normal check error ?
On Thu, Jul 01, 2010 at 04:53:51PM +0200, Arnaud Lesauvage wrote: > Le 1/07/2010 16:48, Sam Mason a écrit : >> How about using the built in character conversion routines. Something >> like: >> >> col = convert_from(convert_to(col, 'LATIN9'),'LATIN9') >> >> as the check constraint, or its inverse as the where clause for the >> erroneous rows? > > What happens then for a character that does not have an equivalent in > LATIN9 ? > If an error is raised in the check constraint, does it look like a > normal check error ? Yoik, didn't think about how it would actually handle the conversion! It appears to throw an exception, so you probably want to bundle it up in a pl/pgsql function that catches it and does the "right thing" for you. Maybe something like: CREATE FUNCTION isstringrepresentable(str TEXT, charset TEXT) RETURNS BOOLEAN AS $$ BEGIN RETURN str = convert_from(convert_to(str, charset),charset); EXCEPTION WHEN OTHERS THEN RETURN FALSE; END $$ LANGUAGE plpgsql IMMUTABLE; I'm not sure which exception it should be catching, "OTHERS" is a catch all which is a bit cheesy but should do the right thing most of the time. -- Sam http://samason.me.uk/
Le 1/07/2010 17:12, Sam Mason a écrit : > On Thu, Jul 01, 2010 at 04:53:51PM +0200, Arnaud Lesauvage wrote: >> Le 1/07/2010 16:48, Sam Mason a écrit : >>> How about using the built in character conversion routines. Something >>> like: >>> >>> col = convert_from(convert_to(col, 'LATIN9'),'LATIN9') >>> >>> as the check constraint, or its inverse as the where clause for the >>> erroneous rows? >> >> What happens then for a character that does not have an equivalent in >> LATIN9 ? >> If an error is raised in the check constraint, does it look like a >> normal check error ? > > Yoik, didn't think about how it would actually handle the conversion! > It appears to throw an exception, so you probably want to bundle it up > in a pl/pgsql function that catches it and does the "right thing" for > you. Maybe something like: > > CREATE FUNCTION isstringrepresentable(str TEXT, charset TEXT) RETURNS BOOLEAN AS $$ > BEGIN > RETURN str = convert_from(convert_to(str, charset),charset); > EXCEPTION WHEN OTHERS THEN > RETURN FALSE; > END > $$ LANGUAGE plpgsql IMMUTABLE; Yep, I'll give it a try as soon as I find some time ! Thanks for the hint.