Does Watson Language Translator service support variables which shouldn't be translated?

Is there an option in Language Translator to skip translation for certain words based on a regular expression or some other pattern?
For example, we have some text where certain values are substituted based on a variable which we don't want translated:
"Hello $User_Name. How are you doing today?"
$User_Name should not be translated.
Is there a way to do this?
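A service-agnostic workaround is to mask the placeholders before sending the text and restore them afterwards. The sketch below assumes placeholders look like $User_Name. The translate argument is a hypothetical callable standing in for the real translation request, and the __PH0__ token format is also an assumption; a real engine may alter such tokens, so this needs checking against the actual service.

```python
import re

# Assumed placeholder syntax: a dollar sign followed by an identifier.
PLACEHOLDER = re.compile(r"\$[A-Za-z_][A-Za-z0-9_]*")
# Assumed opaque token format used while the text is being translated.
TOKEN = re.compile(r"__PH(\d+)__")


def translate_preserving_placeholders(text, translate):
    """Mask placeholders, call translate(), then restore the placeholders."""
    found = []

    def mask(match):
        found.append(match.group(0))
        return "__PH%d__" % (len(found) - 1)

    masked = PLACEHOLDER.sub(mask, text)
    translated = translate(masked)
    # Put each original placeholder back by its index.
    return TOKEN.sub(lambda m: found[int(m.group(1))], translated)


def fake_translate(s):
    # Hypothetical stand-in for a translation call. It would also mangle
    # "User" if the placeholder were not masked.
    return s.replace("Hello", "Bonjour").replace("User", "Utilisateur")


result = translate_preserving_placeholders(
    "Hello $User_Name. How are you doing today?", fake_translate
)
print(result)
```

Because the stand-in translator only ever sees __PH0__, the variable name comes back unchanged.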

Related

Generate English versions of the same content on Drupal?

My website needs to be able to support multiple languages for multiple countries. For example, the US might have English and Spanish, while the UK might only have English. If two countries use the same language, it DOES NOT mean the content is the same.
For this reason, I decided to use the internationalization module (i18n) and I created language codes as follows:
gb-en - UK English
us-en - US English
us-es - US Spanish
I set this up with no issues, but my problem comes in with creating all the default content. For each content type, I want to:
Set the content type's default language as "English"
Create translated versions of each content type for each language
I know this will mean that the Spanish content would still be in English, but it's the first step towards translating it.
What is the easiest way to create all these "default" content pages?
You could create a module implementing hook_node_insert(). This module would intercept the creation of a new node (stored with the default language) and create as many copies as needed. Each of these copies should have a different value in the language field. The copies could then be stored in the database using the node_save() function.

Xtext with multilingual keywords

Is it possible to create a grammar with multilingual keywords? I'm implementing a DSL with over 100 keywords, which have to be translated into multiple languages. Is there a way to achieve that with Xtext?
Below is an example of the DSL. The first line describes the keyword language. The second and third lines are examples of keywords in English and German.
..language english
.help 'index.html'
.select '1_2'
..language german
.hilfe 'index.html'
.hole '1_2'
The DSL grammar is already defined and there are already files in different languages. Thus I have to create the editor and cannot change the grammar or the keywords.
What you ask for is certainly possible with Xtext. You can use a custom lexer, but you have to make sure that it produces the very same token types as the original lexer does. You may want to check out the org.eclipse.xtext.generator.parser.antlr.ex.ExternalAntlrLexerFragment.

What is the purpose of dcgettext?

The GNU gettext manual describes the dcgettext function as follows:
Both take an additional argument at the first place, which corresponds to the argument of textdomain. The third argument of dcgettext allows to use another locale category but LC_MESSAGES. But I really don’t know where this can be useful. If the domain_name is NULL or category has an value beside the known ones, the result is undefined. It should also be noted that this function is not part of the second known implementation of this function family, the one found in Solaris.
Source: https://www.gnu.org/software/gettext/manual/html_node/Ambiguities.html
Is there any use for providing a different category than the default LC_MESSAGES for message translation? What would it even do? (Does it use the locale setting for that different category rather than the locale setting for LC_MESSAGES? What happens if LANGUAGE is set - wouldn't it override that category anyway, or does it only override LC_MESSAGES?) Since even the documentation writers are struggling to find a purpose for this feature, I really question whether it has any purpose at all. Trying
ls /usr/share/locale/*/LC_[^M]*
turned up no files, so it appears nobody is using this. But can anyone provide insight on what this feature was/is for and whether it's useful?
Apparently dcgettext was provided "for compatibility."
Quoting the GNU C Library's Translation with gettext section, third paragraph from the bottom:
The dcgettext function is only implemented for compatibility with other systems which have gettext functions. There is not really any situation where it is necessary (or useful) to use a different value but LC_MESSAGES in for the category parameter. We are dealing with messages here and any other choice can only be irritating.
(Personally I don't find this a particularly satisfying answer, as it gives no hint regarding which "other systems" they were seeking compatibility with -- but it's the only authoritative explanation I've found so far.)
Edited for better examples:
A friends' birthday reminder widget
This requires dcgettext() or dcngettext(), because the one localized string, "It's %s's birthday today!", should use the LC_TIME category rather than the LC_MESSAGES category: users would expect the LC_TIME environment variable to control the language of the widget, the same way it does for e.g. the date command.
A restaurant bill splitter widget
To make it easier to understand and split bills in other countries (especially countries where you can barely understand the bill), this would use LC_MONETARY category for the bill fields, so that the users can select the currency by changing the LC_MONETARY environment variable.
Let's assume the widget is intended for traveling users, or is perhaps supported by a simple server backend, which stores descriptions and numeric amounts, but no monetary units. Each bill is a simple dataset, containing locale, total amount, description string, and a list of participants, each participant specified by a string and a number. The sum of the numbers should always be at least the total amount; any extra is the tip.
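As a rough illustration of that dataset, here is a minimal sketch. The class name, field names, and the choice to store amounts as integers in minor currency units are all assumptions made for the example, not part of the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Bill:
    # Hypothetical structure: the locale name is stored, but no monetary unit.
    locale: str
    total: int                     # amount in minor units (assumption)
    description: str
    participants: list = field(default_factory=list)  # (name, amount) pairs

    def tip(self) -> int:
        """Return whatever the participants paid beyond the bill total."""
        paid = sum(amount for _name, amount in self.participants)
        if paid < self.total:
            raise ValueError("participants paid less than the bill total")
        return paid - self.total


bill = Bill("fi_FI.utf8", 4500, "Dinner", [("Alice", 2500), ("Bob", 2500)])
print(bill.tip())
```

Formatting the amounts for display would then be left to the LC_MONETARY and LC_NUMERIC locale selected for the bill.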
The user interface (menus, options, etc.) is localized as normal using the LC_MESSAGES category, but each bill's locale overrides the LC_NUMERIC and LC_MONETARY locale categories, and the application-specific strings in the widget -- "total", "tip" and so on -- use the LC_MONETARY category in the localization file. (Therefore the code would have dcgettext(NULL,"Total",LC_MONETARY), dcgettext(NULL,"Tip",LC_MONETARY) and so on.)
When creating a new bill, you can implement the locale selection by simply switching to the desired locale in LC_MONETARY and/or LC_NUMERIC category.
The reason you would want to do this is simple: you could have a user interface that shows the bill according to the local conventions (per restaurant locale), while the rest of the user interface, especially tool tips, hints, help et cetera, is still in the main locale/language (as determined by LC_MESSAGES).
Regardless of whether the widget was a graphical Qt/GTK+ or a command-line one, it could always use the normal environment variables to define its initial locale (LC_MESSAGES for user interface, LC_MONETARY and LC_NUMERIC for the new bill).
Most programmers would likely use a configuration file, a manager, or a registry key to store the locale, but since the environment is trivially available and well standardized, why duplicate the functionality? Moreover, a user could create aliases or shortcuts that simply set a different initial locale (for the two categories), and could have multiple instances of the widget open, using different billing locales, for example for comparison or understanding the bill.
gettext(msgid) is equivalent to dgettext(NULL,msgid) is equivalent to dcgettext(NULL,msgid,LC_MESSAGES).
In fact, in current GNU gettext, gettext(msgid) is a wrapper around dcgettext(NULL,msgid,LC_MESSAGES), and dgettext(domain,msgid) is a wrapper around dcgettext(domain,msgid,LC_MESSAGES).
The category parameter to dcgettext() allows you to select which category is used to determine the locale. For example, if you used dcgettext(NULL, "FOO", LC_MONETARY), then the LC_MONETARY category would be used to determine the actual locale to use. Because the C library provides the category-specific functions like strftime() (uses the LC_TIME category) and strcoll() (uses the LC_COLLATE category), most applications only explicitly use the LC_MESSAGES category. (They do, however, use the other categories via the C library functions.)
The user can control the locale for each category via environment variables.
For the GNU C library, the environment variables are interpreted as follows:
If LC_ALL is not empty, it defines the locale for all categories.
Otherwise:
If LC_CATEGORY is not empty, it defines the locale for category CATEGORY.
Otherwise:
If LANG is not empty, it defines the locale.
Otherwise:
The locale is C/POSIX.
In other words, LANGUAGE is ignored, and LANG is only used if both LC_ALL and the relevant LC_category environment variables are empty or undefined.
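The lookup order listed above can be sketched as a small function. This is only an illustration of the described precedence, not the C library's implementation; the function name and the use of a plain dict in place of the process environment are assumptions.

```python
def effective_locale(category, env):
    """Resolve the locale name for one category, e.g. "LC_COLLATE".

    Follows the order described above: LC_ALL, then the category's own
    variable, then LANG, and finally the C/POSIX default. Empty values
    count as unset, and LANGUAGE is not consulted.
    """
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name, "")
        if value:
            return value
    return "C"


env = {"LC_ALL": "", "LC_COLLATE": "fi_FI.utf8", "LANG": "en_US.UTF-8"}
print(effective_locale("LC_COLLATE", env))
print(effective_locale("LC_TIME", env))
```

With this environment, collation resolves to the Finnish locale while LC_TIME falls through to LANG.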
In my experience, other OSes with gettext or similar localization support follow the same environment variable pattern -- LC_ALL being the override, LC_category being the specific setting, with LANG (and possibly LANGUAGE) as defaults if nothing else is set.
It is very useful to use a mixed-locale environment, where LC_ALL is undefined or empty, some of the LC_ environment variables are defined to a specific locale, with others undefined or empty or C, possibly with a default LANG defined just to be sure.
I personally sometimes use
LC_ALL= \
LC_TIME=C \
LC_NUMERIC=C \
LC_CTYPE=C \
LC_MESSAGES=C \
LC_COLLATE=fi_FI.utf8 ls -laF --color=auto
as an alias for ll. It lists the files and directories in the specified directory, using the C/POSIX locale for everything except string collation (string sort order), which uses Finnish rules. That gives me the output sorted according to typical Finnish rules, but everything is in C/POSIX locale.
I might switch to a LC_TIME locale that used ISO 8601 dates, or perhaps a human-friendly version of ISO 8601 (YYYY-MM-DD HH:MM:SS.ttt TZ). Just haven't yet cared enough to look for one or write one myself.
Questions?

Drupal 7 taxonomy i18n is not working when editing multilingual content; I'm stuck with the default language

I created a taxonomy called colors.
Each of my terms is translatable, in French and English.
I created a content type called product, to which I can associate a color taxonomy term. Note that my product content type is multilingual as well.
My admin default language is French.
When I create a product, the color taxonomy is shown only in French, which is my admin default language. I expected it to be displayed in the language set on the node.
This is a problem because English nodes end up associated with French taxonomy terms.
Does anybody know how I can resolve this issue?
Thanks a lot.
It depends on how your taxonomy was set up.
There are three different modes, besides no multilingual support:
Localize. Terms are common for all languages, but their name and description may be localized.
Translate. Different terms will be allowed for each language and they can be translated.
Fixed Language. Terms will have a global language and they will only show up for pages in that language.
I guess colors are common for all languages, so I would recommend using the first option.
You can then translate the colors via config > translate interface.
When you choose the "localize" option you don't have duplicate colors showing up on node forms.
Update
If the terms are common for all languages but you want to add more fields to the terms, you could use localize and in addition use the entity translation module.
Entity translation allows you to translate the different fields for each term.
There are two drawbacks though:
On top of the entity translation you have to translate the terms via translate interface or else the terms will not be translated on the node-edit-forms.
You have to alter the aliases for the terms yourself via /admin/config/search/path so the terms have different path aliases for different languages.
This is of course not a good solution for user-generated terms but works if the terms are moderated.

Multiple phrases per language in cakephp

I am creating a website using CakePHP that requires translation not only into multiple languages but also multiple phrases per language depending on the type of the logged in user. This will allow the same functionality but with more formal or more friendly language without duplication.
As a very simple example:
Type 1: "Customer", "purchase","shopping cart"
Type 2: "Client", "buy", "basket"
Type 3: "User", "order","invoice"
Each of these types would be available in multiple languages.
I've got the standard localization working in CakePHP (one of the reasons I chose it!) and have the appropriate default.po files in the /Locale/[lang]/LC_MESSAGES/ directory and all is working fine there (thank you to the user who noted on this site that ger needed to be deu to work ;) ).
Before I get too far into the app I'd like to add the phrasing so I can set e.g. the language as French and phrasing as type2. If I was doing this outside of a framework I'd have a matrix look-up to find the correct string based on language and phrase keys but am unsure of how to do this within CakePHP's localization.
Currently I'm using the standard __([string]) convention but as this is early in the development cycle it would be trivial to change if necessary.
I was considering using __d([phrase],[string]) but can't see how to set this without creating my app as a plugin, and then I'm back to the same problem with /Locale/.
I have been unable to find any example of this in my searches on SO or the cakePHP community sites so would be grateful for any suggestions.
Is there a standard way to do this within CakePHP? If not, what would be a good "best practice" way to implement this?
Edit - following the answer below here's how it was implemented:
In /app/Locale/[lang]/LC_MESSAGES/ I created new .po files containing the new phrasing, named phrase1.po, phrase2.po, etc.
Where I set the language, I also set the phrasing; the phrasing name matches the name of the .po file:
Configure::write('Config.language', 'deu');
Configure::write('App.langDomain', 'phrase1');
and all strings were wrapped with:
__d(Configure::read('App.langDomain'), 'string')
And it just works.
Use __d() like this:
__d(Configure::read('App.langDomain'), 'Some string');
In bootstrap.php check the conditions and set App.langDomain based on whatever you need.
__d() has nothing to do with plugins, you can use it everywhere.
An alternative would be to wrap your translations with a custom method, something like
__dd(Configure::read('App.langDomain'), array('foo' => __('String1'), 'bar' => __('String2')));
The array is an array of langDomain => stringForThatDomain mappings. Your __dd() method would take the one that is passed in the first argument.
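The selection logic of such a wrapper is small enough to sketch in a language-neutral way. The Python below is only an illustration of the idea, not CakePHP code; the name dd and the choice to raise an error for an unknown domain are assumptions.

```python
def dd(domain, variants):
    """Return the string registered for the given phrasing domain.

    variants maps a domain name to the already translated string for that
    domain, mirroring the langDomain => string array described above.
    """
    try:
        return variants[domain]
    except KeyError:
        raise KeyError("no string registered for domain %r" % domain)


label = dd("phrase2", {"phrase1": "Customer", "phrase2": "Client", "phrase3": "User"})
print(label)
```

A real __dd() in PHP would follow the same shape, with the domain read from App.langDomain.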
