Thread: converting curly apostrophes to standard apostrophes
Is there a way to replace all curly apostrophes with standard apostrophes (presumably with replace(x,y,z))? My database is SQL_ASCII and I can't find a character code for curly apostrophes in ASCII here: http://www.lookuptables.com, but nevertheless there appear to be curly apostrophes in the database. Thanks, CSN ____________________________________________________ Start your day with Yahoo! - make it your home page http://www.yahoo.com/r/hs
On Mon, Aug 15, 2005 at 12:14:16PM -0700, CSN wrote: > Is there a way to replace all curly apostrophes with > standard apostrophes (presumably with replace(x,y,z))? > My database is SQL_ASCII and I can't find a character > code for curly apostrophes in ASCII here: > http://www.lookuptables.com, but nevertheless there > appear to be curly apostrophes in the database. The "Extended ASCII Codes" section of that page might not match what your system uses (it doesn't match mine). Have you tried using the ascii() function on the offending data? CREATE TABLE foo (testval text); INSERT INTO foo VALUES ('ABCDEFG'); SELECT substr(testval, 4, 1), ascii(substr(testval, 4, 1)) FROM foo; substr | ascii --------+------- D | 68 (1 row) -- Michael Fuhr
--- Michael Fuhr <mike@fuhr.org> wrote: > On Mon, Aug 15, 2005 at 12:14:16PM -0700, CSN wrote: > > Is there a way to replace all curly apostrophes > with > > standard apostrophes (presumably with > replace(x,y,z))? > > My database is SQL_ASCII and I can't find a > character > > code for curly apostrophes in ASCII here: > > http://www.lookuptables.com, but nevertheless > there > > appear to be curly apostrophes in the database. > > The "Extended ASCII Codes" section of that page > might not match > what your system uses (it doesn't match mine). Have > you tried using > the ascii() function on the offending data? I logged back in after changing putty's charset to UTF-8 and am now able to paste an (curly apostrophe) into psql, however that character still appears as a grey box in psql's select output. I've tried various queries to find which rows and fields contain it, without success (I know it's in some of them): db=>select ascii(''); ascii ------- 226 db=>select id from news where body ilike '%%'; (0 rows) db=>select id from news where body ilike '%' || chr(226) || '%'; db'> db'>^C db=> CSN __________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com
On Mon, Aug 15, 2005 at 01:48:00PM -0700, CSN wrote: > db=>select ascii(''); > ascii > ------- > 226 > > db=>select id from news where body ilike '%%'; > (0 rows) > > db=>select id from news where body ilike '%' || > chr(226) || '%'; > db'> > db'>^C > db=> What's going on with the last query? The prompt change suggests that psql is confused with quoting, and the ^C looks like you hit Control-C to get the regular prompt back. Did you ever run this query? If it produced no rows then you could widen the search. Example: SELECT id FROM news WHERE body ~ '[\200-\377]'; You could use the "string from pattern" variant of substring() to extract characters in a specific range. If you have PL/Perl then it would be trivial to extract all of and only the special characters along with their ASCII codes: CREATE FUNCTION special_chars(text) RETURNS text AS ' return join(" ", map {"$_:" . ord($_)} $_[0] =~ /[\200-\377]/g); ' LANGUAGE plperl IMMUTABLE STRICT; SELECT id, special_chars(body) FROM news WHERE body ~ '[\200-\377]'; -- Michael Fuhr
--- Michael Fuhr <mike@fuhr.org> wrote: > On Mon, Aug 15, 2005 at 01:48:00PM -0700, CSN wrote: > > db=>select ascii(''); > > ascii > > ------- > > 226 > > > > db=>select id from news where body ilike '%%'; > > (0 rows) > > > > db=>select id from news where body ilike '%' || > > chr(226) || '%'; > > db'> > > db'>^C > > db=> > > What's going on with the last query? The prompt > change suggests > that psql is confused with quoting, and the ^C looks > like you hit > Control-C to get the regular prompt back. Did you > ever run this > query? If it produced no rows then you could widen > the search. Hmm, I'm on another computer and I just tried that last query and it worked without psql thinking it needed another single quote. Appears the chr code is 146 not 226 (turns out chr(226) is â - why that doesn't cause problems with iso-8859-1/utf-8 xml and the single/double quotes and dashes do I don't know). Anyhow, I ended up doing this: update news set body=replace(body,chr(146),''''); -- left single quote update news set body=replace(body,chr(145),''''); -- right single quote update news set body=replace(body,chr(147),'"'); -- left double quote update news set body=replace(body,chr(148),'"'); -- right double quote update news set body=replace(body,chr(150),'-'); -- en dash update news set body=replace(body,chr(151),'-'); -- em dash and that seems to do the trick. Most places I found online listed different chars for these codes, but http://www.webopedia.com/quick_ref/asciicode.asp lists them. Jeez, I'm so confused with encodings, charsets, etc. now. :( Thanks, CSN > Example: > > SELECT id FROM news WHERE body ~ '[\200-\377]'; > > You could use the "string from pattern" variant of > substring() to > extract characters in a specific range. If you have > PL/Perl then > it would be trivial to extract all of and only the > special characters > along with their ASCII codes: > > CREATE FUNCTION special_chars(text) RETURNS text AS > ' > return join(" ", map {"$_:" . ord($_)} $_[0] =~ > /[\200-\377]/g); > ' LANGUAGE plperl IMMUTABLE STRICT; > > SELECT id, special_chars(body) FROM news WHERE body > ~ '[\200-\377]'; > > -- > Michael Fuhr > __________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com