Thread: Could not Store French Accent Marks Correctly in Postgres
Hi,
I'm having a problem right now. Some of our French users uploaded some files with file names that had French accent marks, and those file names were inserted into the Postgres database. When I examined the value of those file names, they all had some weird characters (the weird characters were in the same position where the accent marks were entered). I do not know how to handle this kind of situation. Most of my users are US based, but I have been told that there will be more international users in the future.
So my questions are:
(1) What is the best character encoding that would work for most of those languages that have accent marks?
(2) I assume that I also need to do some kind of conversion in the front end (PHP) as well.
I'm running on Linux and Postgres 8.3.8.
Any ideas?
Thanks in advance.
Mary Wang
On 08/20/10 2:10 PM, Wang, Mary Y wrote:
> Hi,
> I'm having a problem right now. Some of our French users uploaded
> some files with file names that had French accent marks, and those
> file names were inserted into the Postgres database. When I examined
> the value of those file names, they all had some weird characters (the
> weird characters were in the same position where the accent marks were
> entered). I do not know how to handle this kind of situation. Most
> of my users are US based, but I have been told that there will be more
> international users in the future.
> So my questions are:
> (1) What is the best character encoding that would work for most
> of those languages that have accent marks?
> (2) I assume that I also need to do some kind of conversion in the
> front end (PHP) as well.

UTF8 is the answer to your questions.
On 20.08.2010 23:10, Wang, Mary Y wrote:
Our solution for storing uploaded files in the database/filesystem with PHP uses UTF-8 for the file names in the database, combined with string replacement for some special characters in PHP. In our case these are the German umlauts and ß (ä, ö, ü, ß); otherwise these characters get translated strangely (PHP uses UTF-8, German Windows uses cp-1252), which made them unusable in download links. You can use the function below; just add your special characters to the $trans array. As a further benefit, the function returns unique file names that can be used for storing the files in a target directory.
<SNIP>
public static function get_unique_file_name($target_dir, $current_file_name){
    // Replace German special characters with ASCII equivalents.
    $trans = array ("ä" => "ae", "ö" => "oe", "ü" => "ue", "ß" => "ss", "Ä" => "Ae", "Ö" => "Oe", "Ü" => "Ue");
    $target_file_name = strtr($current_file_name, $trans);
    $i = 0;
    $old_target_file_name = $target_file_name;
    // Prefix a counter until the name no longer exists in the target directory.
    while(file_exists($target_dir . '/' . $target_file_name)){
        $i++;
        $target_file_name = $i . $old_target_file_name;
    }
    return $target_file_name;
}
</SNIP>
Ludwig
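For the French file names in the original question, the "add your special characters to the $trans array" step might look roughly like the sketch below. The function name ascii_file_name and the choice of replacements are assumptions for illustration, not part of Ludwig's code, and the sketch assumes both the input string and the PHP source file are UTF-8 encoded.

```php
<?php
// Hypothetical helper (not from the thread): transliterate common French
// accented letters to plain ASCII. Assumes UTF-8 input and a UTF-8 source file.
function ascii_file_name($name) {
    $trans = array(
        "à" => "a", "â" => "a", "ç" => "c",
        "é" => "e", "è" => "e", "ê" => "e", "ë" => "e",
        "î" => "i", "ï" => "i", "ô" => "o",
        "ù" => "u", "û" => "u", "ü" => "u", "ÿ" => "y",
        "œ" => "oe", "æ" => "ae",
        "À" => "A", "Â" => "A", "Ç" => "C",
        "É" => "E", "È" => "E", "Ê" => "E", "Ë" => "E",
        "Î" => "I", "Ï" => "I", "Ô" => "O",
        "Ù" => "U", "Û" => "U", "Ü" => "U",
        "Œ" => "OE", "Æ" => "AE",
    );
    // strtr() with an array replaces each key with its value in a single pass.
    return strtr($name, $trans);
}
```

Note that this maps "ü" to "u" (the usual French reading) rather than "ue" as in the German table above, so the two tables should not be merged blindly.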
On Fri Aug 20 05:10 PM, Wang, Mary Y wrote:
>
> So my questions are:
> (1) What is the best character encoding that would work for most of
> those languages that have accent marks?

Store data in PostgreSQL as UTF-8.

> (2) I assume that I also
> need to do some kind of conversion in the front end (PHP) as well.
>
> I'm running on Linux and Postgres 8.3.8.
>

If users are submitting the file names using an HTML form, use:

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
...

If using some other method, the common encoding in Europe and North America is CP1252. In PHP you can convert to UTF-8 using:

mb_convert_encoding($str, 'UTF-8', 'Windows-1252');

http://www.php.net/manual/en/function.mb-convert-encoding.php
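One caveat with the conversion call above: if the string is already UTF-8, converting it again from Windows-1252 would mangle it. A minimal sketch of a guard follows, assuming the mbstring extension is loaded and that anything which is not valid UTF-8 really is Windows-1252. The function name to_utf8 is hypothetical.

```php
<?php
// Hypothetical helper (not from the thread): return $str as UTF-8.
// Assumption: input that is not already valid UTF-8 is Windows-1252.
function to_utf8($str) {
    if (mb_check_encoding($str, 'UTF-8')) {
        return $str; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($str, 'UTF-8', 'Windows-1252');
}
```

The result could then be passed to the INSERT as usual. If the connection comes from PHP's pgsql extension, pg_set_client_encoding($conn, 'UTF8') is one way to tell the server that the client sends UTF-8; whether that call is needed depends on the defaults of your setup.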