How do you build a multi-language web site? - database

A friend of mine is now building a web application with J2EE and Struts, and it's going to be prepared to display pages in several languages.
I was told that the best way to support a multi-language site is to use a properties file where you store all the strings of your pages, something like:
welcome.english = "Welcome!"
welcome.spanish = "¡Bienvenido!"
...
This solution is ok, but what happens if your site displays news or something like that (a blog)? I mean, content that is not static, that is updated often... The people that keep the site have to write every new entry in each supported language, and store each version of the entry in the database. The application loads only the entries in the user's chosen language.
How do you design the database to support this kind of implementation?
Thanks.

Warning: I'm not a java hacker, so YMMV but...
The problem with using a list of "properties" is that you need a lot of discipline. Every time you add a string that should be output to the user you will need to open your properties file, look to see if that string (or something roughly equivalent to it) is already in the file, and then go and add the new property if it isn't. On top of this, you'd have to hope the properties file was fairly human readable / editable if you wanted to give it to an external translation team to deal with.
The database based approach is useful for all your database based content. Ideally you want to make it easy to tie pieces of content together with their translations. It only really falls down for all the places you may want to output something that isn't out of a database (error messages etc.).
One fairly old technology which we find still works really well, is to use gettext. Gettext or some variant seems to be available for most languages and platforms. The basic premise is that you wrap your output in a special function call like so:
echo _("Please do not press this button again");
Then running the gettext tools over your source code will extract all the instances wrapped like that into a "po" file. This will contain entries such as:
#: myfolder/my.source:239
msgid "Please do not press this button again"
msgstr ""
And you can add your translation to the appropriate place:
#: myfolder/my.source:239
msgid "Please do not press this button again"
msgstr "s’il vous plaît ne pas appuyer sur le bouton ci-dessous à nouveau"
Subsequent runs of the gettext tools simply update your po files. You don't even need to extract the po file from your source. If you know you may want to translate your site down the line, then you can just use the format shown above (the underscored function) with all your output. If you don't provide a po file it will just return whatever you put in the quotes. gettext is designed to work with locales so the users locale is used to retrieve the appropriate po file. This makes it really easy to add new translations.
Gettext Pros
Doesn't get in your way while coding
Very easy to add translations
PO files can be compiled down for speed
There are libraries available for most languages / platforms
There are good cross platform tools for dealing with translations. It is actually possible to get your translation team set up with a tool such as poEdit to make it very easy for them to manage translation projects
Gettext Cons
Solves your site "furniture" needs, but you would usually still want a database based approach for your database driven content
For more info on gettext see this wikipedia page

They way I have designed the database before is to have an News-table containing basic info like NewsID (int), NewsPubDate (datetime), NewsAuthor (varchar/int) and then have a linked table NewsText that has these columns: NewsID(int), NewsText(text), NewsLanguageID(int). And at last you have a Language-table that has LanguageID(int) and LanguageName(varchar).
Then, when you want to show your users the news-page you do:
SELECT NewsText FROM News INNER JOIN NewsText ON News.NewsID = NewsText.NewsID
WHERE NewsText.NewsLanguageID = <<Session["UserLanguageID"]>>
That Session-bit is a local variable where you store the users language when they log in or enters the site for the first time.

Java web applications support internationalization using the java standard tag library.
You've really got 2 problems. Static content and dynamic content.
for static content you can use jstl. It uses java ResourceBundles to accomplish this. I managed to get a Databased backed bundle working with the help of this site.
The second problem is dynamic content.
To solve this problem you'll need to store the data so that you can retrieve different translations based on the user's Locale. (Locale includes Country and Language).
It's not trivial, but it is something you can do with a little planning up front.

#Auron
thats what we apply it to. Our apps are all PHP, but gettext has a long heritage.
Looks like there is a good Java implementation

Tag libraries are fine if you're using JSP, but you can also achieve I18N using a template-based technology such as FreeMarker.

Related

Interpolating custom data onto a PDF

I am building an Angular test preparation app (with Laravel 5.1 API). One of the requirements is to allow the user to print a certificate of achievement.
The client wants the person's name and credentials interpolated into the document (e.g., highlighted below). Here is a snapshot of the PDF template they sent:
The way I'm handling PDF viewing is simply by storing the file on S3 and giving them a link to that file.
Interpolating information into a PDF doc doesn't seem trivial and I haven't found much information on programmatically allowing this, but there are tools like DocHub, that allow you do edit while viewing the PDF.
I'm interested in learning:
is doing this programmatically trivial?
are there 3rd party tools I'm unaware of?
would I even be able to send this information along to the S3 link to interpolate in the first place?
Using PDF as a format for editing is usually a bad choice. If you have a form with fixed fields, then it's easy. Create a PDF template with an interactive form. In this form, based on AcroForm technology, you'll define fields with fixed coordinates, and a fixed size. You can then add content to these fields.
One major disadvantage with this approach is the lack of flexibility. Did you notice that I used the word "fixed" three times in the previous paragraph? If text doesn't fit the predefined field, you're out of luck. If the field is overdimensioned, you'll end up with plenty of white space. This approach is great if you can predict what the data will be like. A typical use case is a ticket or a voucher. For instance: the empty form is a really nice page, with only a couple of fields where an automated system can put a name, a date, a time, and a seat number.
This isn't the best approach for the example you show in your screen shot. The position of every line of text, every word, every character is known in advance. If you want to replace a short word with a long word (or vice-versa), then all those positions (of each line, of the complete page, possibly of the complete document) need to be recalculated. That's madness. Only people with very poor design skills come up with such an idea.
A better idea, is to store the template as HTML. See for instance chapter 5 of iText's pdfHTML tutorial, where we have this snippet of HTML:
<html>
<head>
<title>Invitation to SXSW 2018</title>
</head>
<body>
<u><b>Re: Invitation</b></u>
<br>
<p>Dear <name>SXSW visitor</name>,
we hope you had a great SXSW film festival experience last year.
And we would like to invite you to the next edition of SXSW Film
that takes place from March 9 until March 17, 2018.</p>
<p>Sincerely,<br>
The SXSW crew<br>
<date>August 4, 2017</date></p>
</body>
</html>
Actually, it's not really HTML, because the <name> tag and the <date> tag don't exist in HTML. All HTML processors (browsers as well as pdfHTML) ignore those tags and treat their content as if the tag was a <span>:
It doesn't make much sense to have such tags in the context of pure HTML, but it does make a lot of sense in the case of pdfHTML. With pdfHTMLL, you can configure custom tags, and have a result that looks like the PDFs shown below:
Look at the document for "John Doe" and compare it with the document for "Bruno Lowagie". The name "John Doe" is much shorter than my name, hence more words fit on that first line. The text flows nicely (we could also have chosen to justify the text on both sides). This "flow" is impossible to achieve with your approach, because you will never get a PDF template to reflow nicely.
OK, I get it, you probably say, but what about the practical aspects? You talk about a Java / .Net library, but I am working with Laravel and Angular.js. First, let me tell you that I don't think you'll find any good PDF tools for Laravel or Angular.js, because of the nature of PDF and those development environments (in my opinion, those technologies don't play well together). Regardless of my opinion, this shouldn't be much of a problem for you because you work in an Amazon environment. AWS supports Java, and the Java code needed to get pdfHTML working is minimal. Most of the code samples I wrote for the pdfHTML tutorial are shorter than 15 lines. So why not try Java and pdfHTML?
If you're already using Amazon services, why not use an amazon lambda function, in combination with iText7 (java), to generate the pdf on demand?
That way, you are guaranteed that the pdf is correct, and has nice layout every time.
Generating the pdf can either be done by:
converting HTML,
programmatically creating your entire document,
filling and flattening an XFA form.
I think for your use-case, either option 1 or 2 are the most sustainable.

How to translate role in Drupal?

If I want to translate the role to other language, how do I do it?
I can change that to other language as the default but I would like to use English so I don't have to deal with UTF8 issue in my code with Asian charactors.
if(in_array("administer nodes", $user->roles))
I have tried to find it from translation module but this seems not translatable as other text in Drupal.
So I'm assuming you've already tried using the t() or st() functions?
If that's so, you may need to try a client-side AJAX translation solution. One way you might do this is to create a vocabulary of terms (corresponding to the English role names), and have the Asian character translation as a secondary field. Then use views to create a view of this vocabulary, and create a lightweight module that:
1) loads a Drupal AJAX script on every page (or every page where role names might be utilized)
2) the script looks for a list of specified containers by id that you know will contain role names
3) searches the view you created for the English pattern, and replaces it in the container with any positive matches
Drupal API's example AJAX module
You could then expand the module/AJAX script to solve other similar translate fails on your site.

Dynamic locale name in multi-language site

I am developing an application in cakephp 2.3.4, Which is multi-language.
Admin can add any number of new languages.
My question is, When admin decides to add a new language, how resulting locale name should be defined.
Can a locale name be any arbitrary name, given by admin or it should be a dropdown containg all languages code according to language.
Unfortunately, your question is a bit 'vague', i.e., will administrators be able to add GNU-locale files (*.po), or are you talking about adding translations inside the database.
In any case, CakePHP uses locales according to the ISO 639-3 standard see here and here for more information. A complete list of those locales can be found inside the I10n class.
Since you probably also want to switch the locale of PHP itself when switching locales, so that, for example, date, money and time-formats will follow the right format for the locale, it's best to stick with those locales and not 'invent' your own locales.
See setlocale(). Be aware though, that PHP may use slightly different locale-codes than CakePHP uses. And it will depend on what locales are installed on your server.
To get a list of locales installed on your server, use locale -a on the command line. See this page for more information: https://wiki.archlinux.org/index.php/Locale
Which techniques to use for localization
A quick summary of techniques to use;
Short messages (interface/UI)
In general, locale files are used for short pieces of text. Locale-files are therefore mostly used for fixed strings,
for example, strings that are used in the interface (like 'are you sure you want to delete this file?' => 'weet u zeker dat u dit bestand wilt verwijderen?).
Longer (fixed) text
For longer pieces of text in your application, that are not part of the 'content' (not the blog-post, but for example a fixed page with a disclaimer),
it's best to use separate views for translated content, for example;
app/Views/MyController/disclaimer_eng.ctp
app/Views/MyController/disclaimer_deu.ctp
app/Views/MyController/disclaimer_fre.ctp
Content
For the content of your website (the part of your website that is managed by the 'user' of the website),
put translations inside the database. This data may be updated frequently and all translations should be updated as well.
How to implement this, is really up to you and depends on your situation. CakePHP offers a Translate behavior that you can use (http://book.cakephp.org/2.0/en/core-libraries/behaviors/translate.html), but in most of my situations that behavior didn't really fit our needs (IMO it is not very efficient, because it stores translations per-field, per-model).

Is it possible for an entry to have two URL in Expression Engine, and translate template names?

I'm currently making a bilingual Expression Engine 2.5.2 website. I'm using this technique to create the two langues, which works perfectly.
I have created a {country_code} global variable in the two index.php files which allows me to detect the current language.
Using this technique, I have no problems to get language-relative data when accessing an entry. My only concern is that I apparently have to privilege a language-specific "clean" URL.
Example entry:
{entry_id} = 123
{title} = My test article
{title_permalink} = my-test-article
{name_fr} = Mon article
{name_en} = My article
If I request http://www.example.com/index.php/en/blog/articles/my-test-article, I expect to to find, in english, "My article" using the template articles in the blog template group.
Everything is fine, but the french translation is accessible when requesting http://www.example.com/index.php/fr/blog/articles/my-test-article. The correct translation of the URL should be http://www.example.com/index.php/fr/blogue/articles/mon-article-test.
Anyone encountered a problem like this? Any solutions via extensions or modules?
I believe the Transcribe module solves this by both providing the ability to translate template group and template names, and having you create a separate entry for each language and piece of content in your site (hence, you have two separate URL titles). But that means buying into their entire methodology for a multi-lingual site.
Myself, I usually just stick to using the entry_id instead of the url_title, and live with the template names being in the primary language.
The best way I found to achieve this is by embedding templates with segment translations, duplicating template groups and duplicating channels.
In the blog/articles template:
{embed="shared/.head" segment_2_translation="blogue" segment_3_translation="articles"}
In the blogue/articles template:
{embed="shared/.head" segment_2_translation="blog" segment_3_translation="articles"}
In shared/.head template:
[...] {if lang == "fr"}English{if:else}Français{/if} [...]
And then you can create a Articles (FR) and a Articles (EN) channels, and each will have their unique URL titles. You can also add a relationship custom field for each channel to associate an entry with it's translation.
It feels messy, but it is the only way I could make it work without modules, plugins or whatnot.

cakephp, i18n .po files, How to use them correctly

I have finally managed to set up a multilingual cakephp site.
Although not finished it is the first time where I can change the DEFAULT_LANGUAGE in the bootstrap and I can see the language to change.
My problem right now is that I cannot understand very well how to use the po files correctly.
According to the tutorials I've used I need to create a folder /app/locale and inside that folder create a folder for each language in the following format: /locale/eng/LC_MESSAGES.
I have done that and I have also managed to extract a default.pot file using cake i18n extract. And it appears that all occurrences of the __() function have been found succesfully.
In my application I'm using 2 languages: eng and gre.
I can see why you would need a seperate folder for each language.
However in my case nothing happens when I edit the po files inside each folder....well almost nothing. If I edit the /app/locale/gre/LC_MESSAGES/default.po I have no language changes. If I edit the /app/locale/eng/LC_MESSAGES/default.po then the language changes to the new value (on the translation field) and it does not switch to the other language.
What am I doing wrong.
I hope I made myself as clear as possible.
There are different ways to go about it. The easiest is to code the app in the primary language and to just wrap all translatable strings in __(). Later you can add .po files for each translation you may want.
The problem with this approach is that if you were to change the text in the original language, you'll also need to change the msgid entry for this string in every .po file you may have. This can become quite cumbersome if you have to support many different languages.
Please disregard the old information above. A properly set up i18n workflow will use xgettext or similar utilities to automatically extract __() wrapped strings from the source code and produce, update and merge .po files. Nothing cumbersome about it.
The alternative is to use a "descriptor" text in the source files and put the actual text in .po files, even for the primary language. I.e:
__('PRODUCT CAPTION');
/eng/.po
msgid "PRODUCT CAPTION"
msgstr "Buy our awesome products!"
/ger/.po
msgid "PRODUCT CAPTION"
msgstr "Kaufen Sie unsere Produkte!"
What works better depends on the project and on you, you'll have to figure it out...
Use
Configure::write('Config.language', 'fr');
to set the language of the user to french.

Resources