I'm trying to display some Unicode (Cyrillic, actually) using XmLabel and a server-side XLFD font (-monotype-arial-medium-r-normal--*-90-*-*-p-*-iso10646-1). Whenever I use XmStringCreate() or XmStringCreateLtoR() as an XmString factory, the result meets my expectations.
When I use the XmStringGenerate() factory instead, passing in either XmMULTIBYTE_TEXT for a multi-byte Unicode string or XmWIDECHAR_TEXT for a wide string, garbage is rendered onto the screen, regardless of the font used (I tried both UTF-8 and single-byte Cyrillic server-side fonts).
The result can be seen below (the first two labels are fine; the 3rd through 6th were created with XmStringGenerate() and are obviously not):
The complete code (requires Motif 2.1+ and a C99-compliant compiler) is here.
Can anyone suggest a working XmStringGenerate() example suitable for displaying Unicode characters (not just ISO-8859-1)?
XmMULTIBYTE_TEXT is locale-dependent, as n.m suggested, and, aside from CJK (i.e. for Roman and Slavic languages), can only be used in UTF-8 locales. Core X11 fonts can be specified as either fonts (XmFONT_IS_FONT):
-monotype-arial-medium-r-normal--*-90-*-*-p-*-iso10646-1
or font sets (XmFONT_IS_FONTSET):
-monotype-arial-medium-r-normal--*-90-*-*-p-*-*-*:
Speaking of XmWIDECHAR_TEXT mode, it seems impossible to specify a proper font with an explicit encoding, but setting a font set instead works perfectly for Motif 2.1 through 2.3.
Related
I need to format a date for the Japanese locale, but the output is displayed incorrectly. I also tried changing the browser's locale, but that doesn't help in either Chrome or IE.
app.filter('japan', function() {
  return function(dateString, format) {
    return moment().locale('ja').format('LLLL');
  };
});
Output for the format is 2016蟷エ6譛�20譌・蜊亥燕11譎N蛻� 譛域屆譌・
Required output is 2016年6月20日午前11時30分 月曜日
This isn't an issue with Moment. It's an encoding problem known as mojibake, which happens when text is decoded with a different encoding from the one it was written in, for example when the page's declared encoding doesn't match the bytes actually served. In general, it's preferable to use a neutral encoding like UTF-8 or UTF-16 (UTF-8 is the de facto standard), and from the comments above, it sounds like this did indeed fix your issue.
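A minimal Python sketch of how this kind of garbling arises, assuming the page's UTF-8 bytes were decoded as Shift-JIS (Python's cp932 codec). That particular wrong encoding is only a guess based on the garbled output in the question, and the helper name garble is made up for illustration.

```python
def garble(text, wrong_encoding="cp932"):
    """Encode text as UTF-8, then decode the bytes with the wrong encoding."""
    # errors="replace" substitutes U+FFFD for byte sequences that are
    # invalid in the wrong encoding instead of raising an exception.
    return text.encode("utf-8").decode(wrong_encoding, errors="replace")


original = "年"
broken = garble(original)

# The bytes never changed, only their interpretation did, so in this case
# the damage can be undone by reversing the two steps.
repaired = broken.encode("cp932").decode("utf-8")

print(broken)
print(repaired == original)
```

Reversing the steps only works while no byte was replaced with U+FFFD; the reliable fix is still to serve and declare the page as UTF-8.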
Additionally, it is a good idea to set a lang="" attribute on the element containing your localized content (you can do this as high up as the <html> element), because certain characters can have different appearances depending on the locale.
To take your text as an example, the top-right portion of the character 曜 looks like 羽 with lang="zh", but looks like two side-by-side ヨs with lang="ja".
I am currently writing a simple bitmap font generator using CoreGraphics and CoreText. I am retrieving the kerning table of a font with:
CFDataRef kernTable = CTFontCopyTable(m_ctFontRef, kCTFontTableKern, kCTFontTableOptionNoOptions);
and then parsing it, which works fine. The kerning pairs give me glyph indices (i.e. CGGlyph), and I need to translate them to Unicode (i.e. UniChar), which unfortunately does not seem to be easy. The closest I got was using:
CGFontCopyGlyphNameForGlyph
to retrieve the glyph name of the CGGlyph, but I don't know how to convert the name to Unicode, as the names are really just strings such as quoteleft. Another thing I thought about was parsing the kCTFontTableCmap myself to map from the glyph to the Unicode code point manually, but that seems to be a ton of extra work for the task. Is there any simple way of doing this?
Thanks!
I don't know a direct method to get the Unicode for a given glyph, but you could
build a mapping in the following way:
Get all characters of the font with CTFontCopyCharacterSet().
Map all these Unicode characters to their glyph with CTFontGetGlyphsForCharacters().
For each Unicode character and its glyph, store the mapping glyph -> Unicode
in a dictionary.
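The steps above amount to inverting a character-to-glyph table. Here is a language-neutral sketch in Python, assuming the (character, glyph) pairs have already been collected with the Core Text calls named above. The sample table and the function name are hypothetical, not real font data. Because several characters can share one glyph, each glyph maps to a list of code points rather than a single one.

```python
from collections import defaultdict


def invert_cmap(char_to_glyph):
    """Build glyph index -> sorted code points from code point -> glyph index."""
    glyph_to_chars = defaultdict(list)
    for code_point, glyph in char_to_glyph.items():
        glyph_to_chars[glyph].append(code_point)
    # Sort so the lowest code point comes first for each glyph.
    return {glyph: sorted(cps) for glyph, cps in glyph_to_chars.items()}


# Hypothetical data: U+0020 SPACE and U+00A0 NO-BREAK SPACE share glyph 3.
sample = {0x0020: 3, 0x00A0: 3, 0x2018: 182}
glyph_to_chars = invert_cmap(sample)
print(glyph_to_chars[182])  # code points rendered by glyph 182
```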
I am trying to add a special character (specifically the ndash) to a Model field's help_text. I'm using it in the Form output so I tried what seemed intuitive for the HTML:
help_text='2 &ndash; 30 characters'
Then I tried:
help_text='2 \2013 30 characters'
Still no luck. Thoughts?
Django escapes all HTML by default. Try wrapping your string in mark_safe.
You almost had it on your second try. First you need to declare the string as Unicode by prefacing it with a u. Second, you wrote the codepoint wrong. It needs a preface as well; like \u.
help_text=u'2\u201330 characters'
Now it will work and has the added benefit of not polluting the string with HTML character entities. Remember that field value could be used elsewhere, not just in the Form display output. This tip is universal for using Unicode characters in Python.
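A short Python 3 check of the escapes discussed above (in Python 3 every str literal is already Unicode, so the u prefix is optional). It also shows what the question's '\2013' attempt actually produced, since Python reads \201 as an octal escape. The variable names are illustrative only.

```python
import html
import unicodedata

help_text = "2\u201330 characters"

# \u takes exactly four hex digits, so this is "2", U+2013, "30 characters".
dash = help_text[1]
print(unicodedata.name(dash))

# \U takes eight hex digits and names the same code point.
same_dash = "\U00002013"

# "\2013" is the octal escape \201 (U+0081, a control character) plus "3".
octal_attempt = "\2013"

# The HTML entity decodes to the same character, should you need to convert.
decoded = html.unescape("2 &ndash; 30 characters")
```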
Further reading:
Unicode literals in Python, which mentions other codepoint prefaces (\x and \U)
PEP 263 has simple instructions for using actual raw Unicode characters in a source file.
I've got a WinForms RichTextBox in my application. When I enter the Chinese text "蜜蜜蜜蜜", the control uses the following RTF:
{\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fmodern\fprq6\fcharset134 SimSun;}{\f1\fnil\fcharset0 Microsoft Sans Serif;}}
\viewkind4\uc1\pard\f0\fs17\'c3\'db\'c3\'db\'c3\'db\'c3\'db\f1\par
}
The test string is the same character four times. Its Unicode value is 34588 (0x871C). So how is it that the character is being stored as "\'c3\'db" in the RTF? What kind of encoding is that?
RTF is old, older than Job, and considerably predates Unicode. I think it is using code page 936, a double-byte character set for Simplified Chinese (the \fcharset134 in the font table is the GB2312 charset). Your snippet shows it using c3db for the character, which matches the glyph shown in this table.
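To check this, here is a small Python sketch that decodes RTF \'hh escapes as code page 936 bytes. It assumes the text run contains nothing but \'hh escapes (it is not an RTF parser), and the function name is made up.

```python
import re


def decode_rtf_hex(run, codepage="cp936"):
    r"""Decode a run of RTF \'hh escapes using the given code page."""
    hex_pairs = re.findall(r"\\'([0-9a-fA-F]{2})", run)
    raw = bytes(int(pair, 16) for pair in hex_pairs)
    return raw.decode(codepage)


text = decode_rtf_hex(r"\'c3\'db\'c3\'db\'c3\'db\'c3\'db")
print(text)
print(hex(ord(text[0])))
```

Each pair of escapes is one double-byte character, which is why four characters take eight \'hh groups.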
I'm working on a legacy VB.NET WinForms app and would like to have up and down arrows in my button controls.
I assume I need some sort of escape sequence to get the equivalent of &uarr; and &darr;?
Open up "Character Map" (from Programs->Accessories->System Tools on WinXP). You can find all sorts of interesting characters there.
Sometimes you'll want to use unusual fonts like WebDings or WingDings, but be careful to only use fonts that will be on the users' machines.
You can press ALT and type the Unicode value for the character you want. Consult this table, specifically the "arrows" section, and convert from hex to decimal.
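The hex-to-decimal conversion takes a couple of lines of Python. The code points U+2191 and U+2193 are the plain up and down arrows from the Unicode Arrows block. Whether Alt-code entry accepts them depends on the application and keyboard settings, which this sketch does not cover.

```python
import unicodedata

for hex_code in ("2191", "2193"):
    decimal = int(hex_code, 16)  # value to type on the numeric keypad
    char = chr(decimal)
    print(hex_code, decimal, char, unicodedata.name(char))
```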