Thread: Considerations on a Multi-Lingual Site

Considerations on a Multi-Lingual Site

From
Josh Berkus
Date:
Andreas, folks:

You may have already thought of this, but there's two issues that need to
govern our support of multiple languages on www.postgresql.org:

1) It needs to be possible for translators to translate content without
knowing HTML, XHTML, or CVS.   Right now, we have 17 volunteer translators,
but if you limit translation to people who have HTML skills, that gets cut
down to *three*.   And if you have the translators send you stuff for you to
correct/markup, you will spend several/many hours per week doing this.
    The approach which Robert took with the Advocacy site isn't bad, except for
the issue below and that translation elements need to be groupable for
comprehensibility.

2) It also needs to be possible for the non-English communities to contribute
content in some form to their own site without having it first in English.
I'm thinking of the German & Brazilian communities particularly.

--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: Considerations on a Multi-Lingual Site

From
"Andreas Grabmüller"
Date:
----- Original Message -----
From: "Josh Berkus" <josh@agliodbs.com>
To: pgsql-www@postgresql.org
Date: Monday, November 10, 2003 07:31 PM
Subject: [pgsql-www] Considerations on a Multi-Lingual Site

> Andreas, folks:

Hi Josh,

> You may have already thought of this, but there's two issues that need to
> govern our support of multiple languages on www.postgresql.org:

Well, I must confess I didn't think much about these issues, as at the beginning the only plan was to add
multi-language capabilities to the main site (with just some pages), and the idea to merge the Advocacy site is
something new...

> 1) It needs to be possible for translators to translate content without
> knowing HTML, XHTML, or CVS.   Right now, we have 17 volunteer translators,
> but if you limit translation to people who have HTML skills, that gets cut
> down to *three*.   And if you have the translators send you stuff for you to
> correct/markup, you will spend several/many hours per week doing this.
>     The approach which Robert took with the Advocacy site isn't bad, except for
> the below issue and that translation elements need to be groupable for
> comprehensibility.

It's difficult to create elements like lists without knowing HTML. I think we have two options here: a bbcode parser
that allows some input like the popular bulletin boards do, or a WYSIWYG editor.

The first one is easier to implement, but the second is easier for the translator to handle. The only problem I see is
that it's very difficult to limit a WYSIWYG editor (one of the freely available ones) to our design guidelines and to
convert its output to XML (using our CSS classes). So I would vote for the first option to begin with. Can we expect
our translators to be able to use some bbcodes as long as they are compatible with those of the big boards (vBulletin,
for example)?
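
A minimal sketch of the bbcode option, assuming a small fixed tag set ([b], [i], [url=...] and [list] with [*] items)
modeled on common bulletin-board syntax. The function name bbcode_to_html and the tag set are illustrative
assumptions, not the site's actual parser:

```php
<?php
// Hypothetical helper: converts a small, fixed set of bbcode tags to HTML.
// The supported tags are an assumption for illustration only.
function bbcode_to_html(string $text): string
{
    // Escape everything first so translators cannot inject raw HTML.
    $html = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    // Simple paired tags.
    $html = preg_replace('#\[b\](.*?)\[/b\]#s', '<strong>$1</strong>', $html);
    $html = preg_replace('#\[i\](.*?)\[/i\]#s', '<em>$1</em>', $html);

    // Links: only http(s) targets are accepted.
    $html = preg_replace(
        '#\[url=(https?://[^\]\s]+)\](.*?)\[/url\]#s',
        '<a href="$1">$2</a>',
        $html
    );

    // Lists: each [*] inside [list]...[/list] becomes one <li>.
    $html = preg_replace_callback(
        '#\[list\](.*?)\[/list\]#s',
        function (array $m): string {
            $items = array_filter(array_map('trim', explode('[*]', $m[1])), 'strlen');
            return '<ul><li>' . implode('</li><li>', $items) . '</li></ul>';
        },
        $html
    );

    return $html;
}

echo bbcode_to_html("[b]Features[/b]\n[list][*]MVCC[*]Triggers[/list]"), "\n";
```

Escaping before tag conversion means the translator's input can only ever produce the HTML elements listed here,
which is one way to keep the output within the design guidelines.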

> 2) It also needs to be possible for the non-English communities to contribute
> content in some form to their own site without having it first in English.
> I'm thinking of the German & Brazilian communities particularly.

If a translator creates a page in their own language, should it be unavailable in other languages, or should it appear
in that foreign language for English readers? Should they be able to change the menu (just for their own language or
for all)?

I'm currently thinking about how to rewrite the system to meet the new requirements, so I would be glad to get some
suggestions and criticism...

> --
> Josh Berkus
> Aglio Database Solutions
> San Francisco

Kind regards,
Andreas Grabmüller

--
LetzPlay.de
| Freemail:       http://www.letzplay.de/mail
| Forenhosting: http://www.letzplay.de/foren

Re: Considerations on a Multi-Lingual Site

From
Josh Berkus
Date:
Andreas,

> It's difficult to create elements like lists without knowing HTML. I think
> we have two options here: a bbcode parser that allows some input like the
> popular bulletin boards do or using a WYSIWYG editor.

I vote for the bbcode parser, using bulletin-board or Wiki markup.


> If a translator creates a page in its own language, should it be not
> available to other languages or appear in that foreign language for the
> english people?

Well, if it was possible for it to be available to the other languages for
*translation*, I would say keen.

> Should they be able to change the menu (just for their own language or for
> all)?

I don't think that's necessary.

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: Considerations on a Multi-Lingual Site

From
Michael Glaesemann
Date:
On Tuesday, November 11, 2003, at 08:03 AM, Andreas Grabmüller wrote:
> It's difficult to create elements like lists without knowing HTML.

Couldn't it be done along the lines of building selects? Say you've got
an array of list items:

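// Items to render; in practice these would come from the database.
$list_array = array('apples', 'oranges', 'grapes');

$list_string = '<ul>';
foreach ($list_array as $this_item) {
    // Escape each item so stray markup cannot break the page.
    $list_string .= '<li>' . htmlspecialchars($this_item) . '</li>';
}
$list_string .= '</ul>';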

This doesn't solve the problem of how you get the list into the
database, but that shouldn't be too hard.

>  I think we have two options here: a bbcode parser that allows some
> input like the popular bulletin boards do or using a WYSIWYG editor

A bbcode parser would probably be adequate. Wiki markup isn't too hard
and parsers are easy to come by.

> If a translator creates a page in its own language, should it be not
> available to other languages or appear in that foreign language for
> the english people? Should they be able to change the menu (just for
> their own language or for all)?

I could see a link with something like "Translate this item" if a
translation isn't available in the preferred language of the viewer.
That then brings them to the bbcode page (requiring whatever
authentication you want, if you want to limit the translations to be
done only by authorized translators).

As for what news is displayed, I think it might be a good idea to
display all of the items, with as many translated as translations are
available. For languages that haven't got a full version of everything
translated, you might get some pretty sparse pages, and people wouldn't
necessarily know what they're missing.

It might be nice if you could store a language preference order in the
cookie. So for example, if I wanted English first, then German, then
Japanese, I'd get the translation of the highest preference that was
available.

This may have already been hashed out. I haven't looked closely enough
at that section of the code yet.

Michael


Re: Considerations on a Multi-Lingual Site

From
"Andreas Grabmüller"
Date:
----- Original Message -----
From: "Michael Glaesemann" <grzm@myrealbox.com>
To: pgsql-www@postgresql.org
Date: Tuesday, November 11, 2003 02:23 AM
Subject: [pgsql-www] Considerations on a Multi-Lingual Site

> On Tuesday, November 11, 2003, at 08:03 AM, Andreas Grabmüller wrote:
> > It's difficult to create elements like lists without knowing HTML.
>
> Couldn't it be done along the lines of building selects? Say you've got
> an array of list items:
>
> $list_array = array ('apples','oranges','grapes');
>
> $list_string =  '<ul>';
> foreach ($list_array as $this_item) {
>     $list_string .=  '<li>'.$this_item.'</li>';
> }
> $list_string .= '</ul>';
>
> This doesn't solve the problem of how you get the list into the
> database, but that shouldn't be too hard.

Well, the problem I see is that we need a way for the translators to write the list without knowing HTML...

> >  I think we have two options here: a bbcode parser that allows some
> > input like the popular bulletin boards do or using a WYSIWYG editor
>
> A bbcode parser would probably be adequate. Wiki markup isn't too hard
> and parsers are easy to come by.
>
> > If a translator creates a page in its own language, should it be not
> > available to other languages or appear in that foreign language for
> > the english people? Should they be able to change the menu (just for
> > their own language or for all)?
>
> I could see a link with something like "Translate this item" if a
> translation isn't available in the preferred language of the viewer.
> That then brings them to the bbcode page (requiring whatever
> authentication you want, if you want to limit the translations to be
> done only by authorized translators).

Difficult - unless we want to give everyone access to the translations, it does not make sense to offer this link to
non-translators...

> As for what news is displayed, I think it might be a good idea to
> display all of the items, with as many translated as translations are
> available. For languages that haven't got a full version of everything
> translated, you might get some pretty sparse pages, and people wouldn't
> necessarily know what they're missing.

Currently it's handled so that if there's no translation the English version is used - this won't work any more if we
don't have an English version, so the question is whether the user then gets a "404 - File not found" message or the
content in a different language...

> It might be nice if you could store a language preference order in the
> cookie. So for example, if I wanted English first, then German, then
> Japanese, I'd get the translation of the highest preference that was
> available.

Should be possible. Currently just the one preferred language is stored, but I see no reason why we should not allow
storing more preferences...

> This may have already been hashed out. I haven't looked closely enough
> at that section of the code yet.
>
> Michael

Kind regards,
Andreas Grabmüller

--
LetzPlay.de
| Freemail:       http://www.letzplay.de/mail
| Forenhosting: http://www.letzplay.de/foren

Re: Considerations on a Multi-Lingual Site

From
Michael Glaesemann
Date:
On Tuesday, November 11, 2003, at 05:31 PM, Andreas Grabmüller wrote:
>> This doesn't solve the problem of how you get the list into the
>> database, but that shouldn't be too hard.
>
> Well, the problem I see is that we need a way for the translators to
> write the list without knowing HTML...

After thinking about it a little more, I realized this is what you were
talking about—not marking up stuff from the database, getting stuff
INTO the database.

If a requirement is that the translators don't need to know html, I
guess I'd go for something similar to wiki markup. Pretty comprehensive
for the needs of the articles. And CSS lets us keep all the
presentation markup separate. Just vanilla tags within the proper
sections lets us style them appropriately.

Ultimately, without having an editor do all the markup, some form of
markup is going to have to be employed—either html, wiki, or something
else. An advantage of html is that (obviously) there's no need for a
special parser. An advantage of wiki is that it might give us a little
more control over relative headlines. We can define, for example,
===Title=== to be <h3> or <h2> or whatever. Of course the html can be
parsed and the headlines renumbered relative to whatever baseline we
want. Or just make things really simple, and let <h1> be the headline
of every article, and define the style of h1 (and others) via CSS.
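
A rough sketch of the relative-headline idea, assuming wiki-style "=Title=" markers on their own line. The function
name wiki_headlines_to_html and the $baseline parameter are hypothetical, introduced only to illustrate shifting
headline levels:

```php
<?php
// Hypothetical helper: maps wiki-style headlines such as "===Title===" to
// HTML headings. $baseline is an assumed setting that shifts every level:
// with baseline 1, "===Title===" becomes <h3>; with baseline 0, <h2>.
function wiki_headlines_to_html(string $text, int $baseline = 1): string
{
    $lines = explode("\n", $text);
    foreach ($lines as $i => $line) {
        // One to six "=" on each side, with the same count on both sides.
        if (preg_match('/^(={1,6})\s*(.+?)\s*\1$/', trim($line), $m)) {
            // Clamp to the heading levels HTML actually provides.
            $level = max(1, min(6, strlen($m[1]) + $baseline - 1));
            $title = htmlspecialchars($m[2], ENT_QUOTES, 'UTF-8');
            $lines[$i] = "<h$level>$title</h$level>";
        }
    }
    // Lines that are not headlines pass through unchanged.
    return implode("\n", $lines);
}

echo wiki_headlines_to_html("===Title===\nSome text."), "\n";
```

Because the baseline is applied at render time, the same stored article can be embedded at different depths without
editing its markup.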

>> I could see a link with something like "Translate this item" if a
>> translation isn't available in the preferred language of the viewer.

> Difficult - unless we want to give everyone access to the
> translations, it does not make sense to offer this link to
> non-translators...

Add a flag to translators' cookies that allows them to see the links?
CSS does allow you to display:none some items, and you could do that
via a simple JavaScript style switcher, but I don't know how secure
that'd be, as the links would still be in markup. Might be better just
to leave those links out completely when building the page.
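
A small sketch of leaving the link out at page-build time, assuming the server already knows whether the visitor is
an authorized translator. How that is decided (session, login, signed cookie) is left open; the function name, the
boolean parameter and the /translate URL are all hypothetical:

```php
<?php
// Hypothetical helper: the "Translate this item" link is emitted only for
// authorized translators. The URL scheme is an assumption for illustration.
function render_translate_link(bool $is_translator, int $item_id, string $lang): string
{
    if (!$is_translator) {
        // Nothing is written to the page, so there is no hidden link to find.
        return '';
    }
    $url = '/translate?item=' . $item_id . '&lang=' . rawurlencode($lang);
    return '<a href="' . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . '">Translate this item</a>';
}

echo render_translate_link(true, 42, 'de'), "\n";
```

Unlike display:none, this keeps the link out of the markup entirely; the translation page itself would still need
its own authentication check.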

I'm just trying to think of an easy way for the translators to get to
the articles, rather than have to go through some cms system.

> Currently it's handled so if there's no translation the english
> version is used - this won't work any more if we don't have an english
> version, so the question is if the user then gets a 404 - File not
> found message or the content in a different language...
>
>> It might be nice if you could store a language preference order in the
>> cookie. So for example, if I wanted English first, then German, then
>> Japanese, I'd get the translation of the highest preference that was
>> available.
>
> Should be possible. Currently just the one preferred language is
> stored, but I see no reason why we should not allow to store more
> preferences...

And this would get around the "no translation available" problem. The
article needs to be written in *some* language, and you'd get that if
nothing else is available, or nothing is higher on your language
preference list.
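
A minimal sketch of that fallback, assuming the cookie stores the preference order as a comma-separated string such
as "en,de,ja". The function name pick_language and the array shapes are hypothetical:

```php
<?php
// Hypothetical helper: returns the first language from the visitor's
// preference list that has a translation, falling back to the language the
// article was originally written in.
function pick_language(array $preferences, array $available, string $original): string
{
    foreach ($preferences as $lang) {
        if (in_array($lang, $available, true)) {
            return $lang;
        }
    }
    return $original;
}

// Assumed cookie value holding the preference order.
$preferences = explode(',', 'en,de,ja');
echo pick_language($preferences, array('de', 'pt_BR'), 'pt_BR'), "\n"; // de
echo pick_language($preferences, array('pt_BR'), 'pt_BR'), "\n";       // pt_BR
```

With this approach an article that exists only in Brazilian Portuguese is still shown, rather than producing a 404.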


Re: Considerations on a Multi-Lingual Site

From
Justin Clift
Date:
Hi guys,

Are we really the first Open Source Software project to really need an
easy-to-maintain-and-extend-and-be-multi-lingual site?

Would it be beneficial (all good ideas presented here aside) for us to
contact other large projects that have gone before us (perhaps the KDE
project and others?) and find out what they recommend?  Some of them may
be on their 2nd or more iteration of having done this, whereas we'll
kind of be on our first (not counting lessons learnt from the Advocacy
site).

Regards and best wishes,

Justin Clift



Re: Considerations on a Multi-Lingual Site

From
Michael Glaesemann
Date:
On Wednesday, November 12, 2003, at 02:47 AM, Justin Clift wrote:
> Are we really the first Open Source Software project to really need an
> easy-to-maintain-and-extend-and-be-multi-lingual site?
>
> Would it be beneficial (all good ideas presented here aside) for us to
> contact other large projects that have gone before us (perhaps the KDE
> project and others?) and find out what they recommend?

Where's the fun in that? :)

Sounds like a good idea. Definitely can't hurt to ask. Anyone know
people in these groups? I'd be willing to cold-call, but if someone
knows someone, might get a more efficient response.

Michael


Re: Considerations on a Multi-Lingual Site

From
Josh Berkus
Date:
Guys,

> > Would it be beneficial (all good ideas presented here aside) for us to
> > contact other large projects that have gone before us (perhaps the KDE
> > project and others?) and find out what they recommend?

Well, I can eliminate one:   OpenOffice.org is all done ad-hoc in raw HTML and
is a disaster management-wise.   This is the reason why I'm so strident in
pushing for something more structured, and in reiterating that "CVS is NOT a
content management system."   I've a huge negative example in my past ....

--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: Considerations on a Multi-Lingual Site

From
Justin Clift
Date:
Michael Glaesemann wrote:
>
> On Wednesday, November 12, 2003, at 02:47 AM, Justin Clift wrote:
>
>> Are we really the first Open Source Software project to really need an
>> easy-to-maintain-and-extend-and-be-multi-lingual site?
>>
>> Would it be beneficial (all good ideas presented here aside) for us to
>> contact other large projects that have gone before us (perhaps the KDE
>> project and others?) and find out what they recommend?
>
> Where's the fun in that? :)
>
> Sounds like a good idea. Definitely can't hurt to ask. Anyone know
> people in these groups? I'd be willing to cold-call, but if someone
> knows someone, might get a more efficient response.

Ok, two potential leads spring to mind immediately... Jono Bacon
<jonobacon@yahoo.com> of the KDE project may know who the best people in
the KDE project are to discuss this with, and Dave Shea <dave@mezzoblue.com>
is a web design professional who does stuff with the Mozilla project.

As far as I'm aware the Mozilla project doesn't (yet) have a good
multi-lingual website infrastructure, but Dave Shea looks to have a wide
range of knowledge and may be able to point us towards other projects
that do.

Hope that's helpful.

:-)

Regards and best wishes,

Justin Clift

> Michael



Re: Considerations on a Multi-Lingual Site

From
Euler Taveira de Oliveira
Date:
Hi Justin,

 >
> >> Are we really the first Open Source Software project to really need an
> >> easy-to-maintain-and-extend-and-be-multi-lingual site?
> >>
> >> Would it be beneficial (all good ideas presented here aside) for us to
> >> contact other large projects that have gone before us (perhaps the KDE
> >> project and others?) and find out what they recommend?
> >
> > Where's the fun in that? :)
> >
> > Sounds like a good idea. Definitely can't hurt to ask. Anyone know
> > people in these groups? I'd be willing to cold-call, but if someone
> > knows someone, might get a more efficient response.
>
> Ok, two potential leads spring to mind immediately... Jono Bacon of the
> KDE <jonobacon@yahoo.com> project may know who the best people in the
> KDE project to discuss this with, and Dave Shea <dave@mezzoblue.com> is
> a web design professional that does stuff with the Mozilla project.
>
I got some URLs about the internationalization of websites. Basically, they cover two projects: Debian and KDE.
http://www.debian.org/devel/website/translating
http://i18n.kde.org/
http://i18n.kde.org/translation-howto/

From the last one, we can see how to deal with the docs and how to proceed. As Alvaro said, they are using SGML docs
and extracting a PO file. I think .po files are not too hard to handle. There are some GUIs that handle them easily
(KBabel, ktranslator, gtranslator, etc.).
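
To give a feel for the format, here is a rough sketch that reads single-line msgid/msgstr pairs from .po text. The
function name parse_simple_po is hypothetical, and real PO files also allow multi-line strings, plural forms and
contexts, which this sketch ignores:

```php
<?php
// Hypothetical helper: reads single-line msgid/msgstr pairs from .po text.
// This is a deliberately simplified reader, not a full PO parser.
function parse_simple_po(string $po): array
{
    $catalog = array();
    $msgid = null;
    foreach (explode("\n", $po) as $line) {
        $line = trim($line);
        if (preg_match('/^msgid "(.*)"$/', $line, $m)) {
            $msgid = stripcslashes($m[1]);
        } elseif ($msgid !== null && preg_match('/^msgstr "(.*)"$/', $line, $m)) {
            // An empty msgstr means the entry is still untranslated;
            // an empty msgid is the PO header entry. Both are skipped.
            if ($msgid !== '' && $m[1] !== '') {
                $catalog[$msgid] = stripcslashes($m[1]);
            }
            $msgid = null;
        }
    }
    return $catalog;
}

$po = "msgid \"Download\"\nmsgstr \"Herunterladen\"\n\nmsgid \"News\"\nmsgstr \"\"\n";
print_r(parse_simple_po($po));
```

Untranslated entries are dropped from the catalog, so a caller can fall back to the original string for them.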

Comments?


--
Euler Taveira de Oliveira
euler (at) ufgnet.ufg.br
Web Developer and Systems Administrator
UFGNet - Universidade Federal de Goiás

Re: Considerations on a Multi-Lingual Site

From
Josh Berkus
Date:
Euler,

> From last one, we could see how to deal the docs and how to proceed. As
> Alvaro said, the guys are using SGML docs and extract a PO file. I think this
> is not too hard to handle .po files. There are some GUIs that handle this
> easily. (Kbabel, ktranslator, gtranslator, etc).

I'm not clear on how the KDE solution would work in terms of keeping website
translations updated.   It seems designed for a monumental rip-through of a
stable version of a project. It also requires KDE-based tools, which leaves
out a *lot* of people.

The Debian solution seems much more thorough.   If we could find some GUI
editor so that translation vols would not need to learn WML or HTML, this
would be perfect.  Hmmm, and we'd need to figure out how to do this if not
all translation vols are web CVS committers or if they're running on
platforms where good CVS tools are not available (e.g. Win95).


--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: Considerations on a Multi-Lingual Site

From
"Dave Page"
Date:

> -----Original Message-----
> From: Michael Glaesemann [mailto:grzm@myrealbox.com]
> Sent: 11 November 2003 17:55
> To: Justin Clift
> Cc: webmaster@letzplay.de; pgsql-www@postgresql.org
> Subject: Re: [pgsql-www] Considerations on a Multi-Lingual Site
>
>
> On Wednesday, November 12, 2003, at 02:47 AM, Justin Clift wrote:
> > Are we really the first Open Source Software project to
> really need an
> > easy-to-maintain-and-extend-and-be-multi-lingual site?
> >
> > Would it be beneficial (all good ideas presented here
> aside) for us to
> > contact other large projects that have gone before us
> (perhaps the KDE
> > project and others?) and find out what they recommend?
>
> Where's the fun in that? :)
>
> Sounds like a good idea. Definitely can't hurt to ask. Anyone
> know people in these groups? I'd be willing to cold-call, but
> if someone knows someone, might get a more efficient response.

Don't forget though that Andreas has done most of the work. All that is
being debated here is how to prevent translators having to know their
HTML tags.

Regards, Dave.

Re: Considerations on a Multi-Lingual Site

From
"Andreas Grabmüller"
Date:
----- Original Message -----
From: "Dave Page" <dpage@vale-housing.co.uk>
To: "Michael Glaesemann" <grzm@myrealbox.com>, "Justin Clift" <justin@postgresql.org>
CC: <webmaster@letzplay.de>, <pgsql-www@postgresql.org>
Date: Tuesday, November 11, 2003 11:13 PM
Subject: [pgsql-www] Considerations on a Multi-Lingual Site

> > -----Original Message-----
> > From: Michael Glaesemann [mailto:grzm@myrealbox.com]
> > Sent: 11 November 2003 17:55
> > To: Justin Clift
> > Cc: webmaster@letzplay.de; pgsql-www@postgresql.org
> > Subject: Re: [pgsql-www] Considerations on a Multi-Lingual Site
> >
> >
> > On Wednesday, November 12, 2003, at 02:47 AM, Justin Clift wrote:
> > > Are we really the first Open Source Software project to
> > really need an
> > > easy-to-maintain-and-extend-and-be-multi-lingual site?
> > >
> > > Would it be beneficial (all good ideas presented here
> > aside) for us to
> > > contact other large projects that have gone before us
> > (perhaps the KDE
> > > project and others?) and find out what they recommend?
> >
> > Where's the fun in that? :)
> >
> > Sounds like a good idea. Definitely can't hurt to ask. Anyone
> > know people in these groups? I'd be willing to cold-call, but
> > if someone knows someone, might get a more efficient response.
>
> Don't forget though that Andreas has done most of the work. All that is
> being debated here is how to prevent translators having to know their
> HTML tags.
>
> Regards, Dave.

I think much of the code has to be redone to meet the new requirements we have heard over the last few days
(especially the translation part, which was not designed to handle a site as big as the one we'll get when we merge
the sites), so it might be better to discuss other (maybe better) ways before doing that; otherwise, in 4 weeks we may
find our solution is unwieldy and begin to rewrite the code again...

Kind regards,
Andreas Grabmüller

--
LetzPlay.de
| Freemail:       http://www.letzplay.de/mail
| Forenhosting: http://www.letzplay.de/foren