Thread: [PATCH] Language selection based on browser preference, use UTF-8, care about direction, etc

Hello,

Attached patch:

* Picks a language based on the actual browser preference.
* Makes all pages default to UTF-8.
* Adds language direction support.
* Makes pages tell the browser what language they're in.
* Fixes language_map.
* Adds a couple of extra handy language functions.

It's in place at our pgweb, but diffed against gborg. Set your browser
language to 'de' (Tools -> Options -> General -> Languages, add
'German [de]' and move it to the top) and you'll see a German news item
(I Google-translated it) at the top of the homepage. Go to
About/Advantages and you should see the German version of the page from
advocacy. Right-clicking on the page and clicking 'Properties' should
show "Text Language: German."

Clicking Arabic or Hebrew at the bottom of the page should switch the
page direction.

Omar
Index: system/handler.php
===================================================================
RCS file: /usr/local/cvsroot/pgweb/portal/system/handler.php,v
retrieving revision 1.4
diff -u -r1.4 handler.php
--- system/handler.php    21 Apr 2004 14:50:44 -0000    1.4
+++ system/handler.php    15 Nov 2004 07:48:53 -0000
@@ -15,7 +15,7 @@
 if (!empty($_GET['lang'])) {
     $language = addslashes($_GET['lang']);
 } else {
-    $language = $_SETTINGS['defaultlanguage'];
+    $language = language_from_accept_language($_SETTINGS['defaultlanguage']);
 }

 if (!empty($_GET['page'])) {
@@ -26,11 +26,17 @@

 $_LANGUAGE = language_set($language);

+$language_direction = language_direction($language);
+
 // Prepare the template
 $tpl =& new HTML_Template_Sigma('../template', '/tmp');
 $tpl->setCallbackFunction('lang', 'gettext', true);
 $tpl->loadTemplateFile('common.html', true, true);
 $tpl->setGlobalVariable('current_page', $page);
+$tpl->setGlobalVariable('current_lang', $language);
+$tpl->setGlobalVariable('current_lang_dir', $language_direction);
+
+header("Content-Language: $language");

 if (preg_match('!^docs/([^/]*)/([a-z]*)/([^/]*)$!', $page, $docsAry)) {
     require './docs.php';
@@ -50,4 +56,4 @@
 ));

 $tpl->show();
-?>
\ No newline at end of file
+?>
Index: system/global/functions.language.php
===================================================================
RCS file: /usr/local/cvsroot/pgweb/portal/system/global/functions.language.php,v
retrieving revision 1.3
diff -u -r1.3 functions.language.php
--- system/global/functions.language.php    16 Mar 2004 22:53:48 -0000    1.3
+++ system/global/functions.language.php    15 Nov 2004 07:48:53 -0000
@@ -37,15 +37,16 @@
  * Checks whether the given language exists
  *
  * @param  string   Language name, from the name of the requested page
+ * @param  bool     Also accept a language whose root (e.g. 'en' for 'en-us') is known?
  * @return bool
  */
-function language_valid($lang)
+function language_valid($lang, $at_root = true)
 {
     global $_LANGUAGES, $_LANGUAGE_ALIASES;

     if (isset($_LANGUAGES[$lang])) {
         return true;
-    } elseif (strpos($lang, '-')) {
+    } elseif ($at_root == true && strpos($lang, '-')) {
         $parts = explode('-', $lang);
         if (isset($_LANGUAGES[$parts[0]])) {
             return true;
@@ -68,7 +69,7 @@
         return $_LANGUAGE_ALIASES[$lang];
     }
     if (false !== strpos($lang, '-')) {
-        $parts = explode('-', $parts);
+        $parts = explode('-', $lang);
         if (isset($_LANGUAGES[$lang])) {
             return $parts[0] . '_' . strtoupper($parts[1]);
         }
@@ -82,4 +83,51 @@
     }
     return $lang;
 }
+
+/**
+ * Return the root of the language, e.g. the root of en-us is en
+ *
+ * @param  string   Language name
+ * @return string   Root of language name
+ */
+function language_root($lang) {
+    $parts = explode('-', $lang);
+    return $parts[0];
+}
+
+
+/**
+ * Return the first valid language from HTTP_ACCEPT_LANGUAGE, else current_lang
+ *
+ * @param  string   Language name, the current language
+ * @return string   Language name, either a matched ACCEPT_LANGUAGE or current_lang
+ */
+function language_from_accept_language($current_lang) {
+    if(isset($_SERVER['HTTP_ACCEPT_LANGUAGE'])) {
+        $accepts_langs = explode(',', $_SERVER['HTTP_ACCEPT_LANGUAGE']);
+        foreach ($accepts_langs as $accepts_lang) {
+            $accepts_lang = explode(';', $accepts_lang);
+            $accepts_lang = trim($accepts_lang[0]);
+            if (language_valid($accepts_lang, false)) {
+                return $accepts_lang;
+            }
+        }
+    }
+    return $current_lang;
+}
+
+/**
+ * Return the direction (ltr, rtl) of the language, else default to ltr
+ *
+ * @param  string   Language name
+ * @return string   Language direction
+ */
+function language_direction($lang) {
+    $root = language_root($lang);
+    if (isset($GLOBALS['_LANGUAGE_DIRECTION'][$root])) {
+        return $GLOBALS['_LANGUAGE_DIRECTION'][$root];
+    }
+    return 'ltr';
+}
+
 ?>
Index: system/global/languages.php
===================================================================
RCS file: /usr/local/cvsroot/pgweb/portal/system/global/languages.php,v
retrieving revision 1.3
diff -u -r1.3 languages.php
--- system/global/languages.php    8 Mar 2004 10:24:23 -0000    1.3
+++ system/global/languages.php    15 Nov 2004 07:48:53 -0000
@@ -8,7 +8,7 @@
 $GLOBALS['_LANGUAGES'] = array(
     'en' => 'English',
 //  "de" => "Deutsch",
-  'ru' => 'Русский'
+  'ru' => '&#1056;&#1091;&#1089;&#1089;&#1082;&#1080;&#1081;'
 );

 // Aliases for languages with different browser and gettext codes
@@ -17,4 +17,10 @@
     'en' => 'en_US',
     'ru' => 'ru_RU'
 );
-?>
\ No newline at end of file
+
+// Language directions
+$GLOBALS['_LANGUAGE_DIRECTION'] = array(
+    'en' => 'ltr',
+    'ru' => 'ltr'
+);
+?>
Index: template/common.html
===================================================================
RCS file: /usr/local/cvsroot/pgweb/portal/template/common.html,v
retrieving revision 1.7
diff -u -r1.7 common.html
--- template/common.html    13 Nov 2004 05:59:03 -0000    1.7
+++ template/common.html    15 Nov 2004 07:48:53 -0000
@@ -1,7 +1,8 @@
 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
-<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
+<html xmlns="http://www.w3.org/1999/xhtml" lang="{current_lang}" xml:lang="{current_lang}" dir="{current_lang_dir}">
 <head>
 <title>PostgreSQL<!-- BEGIN page_title_more -->: {page_title}<!-- END page_title_more --></title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
 <link rel="stylesheet" type="text/css" href="/layout/css/new.css" />
 <link rel="alternate" type="application/rss+xml" title="PostgreSQL news" href="/news.rss" />
 <link rel="alternate" type="application/rss+xml" title="PostgreSQL events" href="/events.rss" />

Re: [PATCH] Language selection based on browser preference,

From: Alexey Borzov
Hi,

Omar Kilani wrote:
> Attached patch:
>
> * Picks a language based on the actual browser preference.
> * Makes all pages default to UTF-8.
> * Adds language direction support.
> * Makes pages tell the browser what language they're in.
> * Fixes language_map.
> * Adds a couple of extra handy language functions.

I have some questions regarding the patch:
1) What's the purpose of adding the encoding to the HTML? It's already
defined in .htaccess, and having it in the HTML as well may lead to
problems.
2) All files are expected to be in UTF-8, so is there a reason to encode
Cyrillic letters as HTML entities?
3) The language_from_accept_language() function does not seem to handle
quality values (q=...) for preferred languages, as described e.g. here:
http://httpd.apache.org/docs/content-negotiation.html
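
To illustrate point 3, here is a minimal sketch of what q-value handling
could look like. The function name pick_language_by_quality() and its
parameters are hypothetical and not part of the patch. It assumes the
supported codes are lowercase, and it ignores wildcards ("*") and
fallback from a regional code such as "en-us" to its root.

```php
<?php
/**
 * Hypothetical sketch: pick the best supported language from an
 * Accept-Language header, honouring q-values (a missing q counts as 1).
 *
 * @param  string   Raw Accept-Language header value
 * @param  array    Supported language codes, lowercase (assumed list)
 * @param  string   Fallback language
 * @return string   Best match, or the fallback
 */
function pick_language_by_quality($header, $supported, $default)
{
    $best   = $default;
    $best_q = 0.0;

    foreach (explode(',', $header) as $range) {
        $params = explode(';', $range);
        $lang   = strtolower(trim($params[0]));
        $q      = 1.0;

        // Look for an optional "q=0.8" parameter after the language code.
        for ($i = 1; $i < count($params); $i++) {
            $param = trim($params[$i]);
            if (strncmp($param, 'q=', 2) === 0) {
                $q = (float) substr($param, 2);
            }
        }

        // Strictly greater, so the first of equally ranked languages wins;
        // q=0 means "not acceptable" and never beats the initial 0.0.
        if ($lang !== '' && $q > $best_q && in_array($lang, $supported, true)) {
            $best   = $lang;
            $best_q = $q;
        }
    }
    return $best;
}

// 'de' is not supported here, and 'ru' outranks 'en'.
echo pick_language_by_quality('de;q=0.5, ru;q=0.9, en;q=0.7', array('en', 'ru'), 'en'), "\n";
// prints "ru"
```

The patch as posted returns the first valid entry in header order, so
for the header above it would pick 'en' only if 'ru' were missing; a
browser that lists languages out of preference order would get the
wrong page.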