Multilingual LaTeX document with math/scientific symbols - multilingual

Hoping for true unicode support as-is in pdfTeX
I'm trying to build a multilingual LaTeX document that can also display math and science symbols.
Installing babel, babel-vietnamese and vntex (Vietnamese), and cjk (Chinese) works for Vietnamese and Chinese in a single document. I'm guessing that TeX is at a stage where there cannot be a unified multilingual platform (say, all served by babel). So, I think I'm beginning to get this multilingual thing (the T1 and T5 encodings are inside my MWE, as well as babel and cjk).
The thing I still have problems with is including math/science symbols.
MWE
\documentclass{book}
% Not needed anymore in TeX Live 2018, but babel still needs to learn to chill on this.
\usepackage[utf8]{inputenc} % Babel gives a warning otherwise.
% Install vntex. t5enc.def is in there.
\usepackage[vietnamese, main=english]{babel}
\usepackage{CJKutf8} % install cjk.
\begin{document}
Testing UTF-8 character here: é
\selectlanguage{vietnamese}
Vietnamese: Tiếng Việt
\selectlanguage{english}
And English again.
\begin{CJK}{UTF8}{gbsn}
Chinese: 中文
\end{CJK}
Temperature: 26°C
\end{document}
Expected and Actual Result
I would expect pdfTeX to "just display the Unicode" (for the degree symbol, U+00B0), but I actually kind of know why it won't.
A few additional questions...
Why wasn't CJK included in babel? Because of multi-byte encoding?
I haven't figured out what to do for scientific symbols yet.
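One possible direction for the degree sign, sketched under the assumption that the failure comes from U+00B0 not being mapped to a text glyph in the loaded encodings. The textcomp package provides \textdegree and, as far as I know, sets up the UTF-8 mapping for ° as well; on recent LaTeX releases its symbols are already part of the kernel, so loading it should be harmless. The math-mode line is a fallback that needs no extra package. This is untested against the full MWE above (babel, vntex and CJKutf8 are left out to keep the sketch small).

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
% Default input encoding since the 2018 LaTeX release; kept for older installations.
\usepackage[utf8]{inputenc}
% Assumption: textcomp is what makes the raw degree character usable in text.
\usepackage{textcomp}
\begin{document}
% Raw UTF-8 degree sign typed directly in the source.
Temperature: 26°C

% Explicit text-mode command from textcomp.
Temperature: 26\textdegree C

% Math-mode fallback that works without textcomp.
Temperature: $26^{\circ}$C
\end{document}
```

For other scientific symbols the same pattern may apply: find the command that produces the glyph, then either type the command or check whether a package maps the Unicode character to it.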

Related

How can I use » in Draftail?

The webpage should have special characters which are not on the keyboard. &#187; is an example.
Draftail should not translate & into "&amp;".
You should use the features provided by your browser and OS for inserting special characters. For example, in Chrome on macOS, you can right-click and select "Emoji & Symbols".
Draftail is not a programmer-facing tool: it's a tool for writers, and it wouldn't make sense for it to interpret the entered text as HTML code, any more than it would make sense for Google Docs or Microsoft Word to do the same.

VSCode 'Go to definition' not working only in big projects

I've been using VSCode for C professionally almost every day for over a year. I've now hit something that is really affecting my productivity.
When I open a big project, the features "Go to definition", "Go to declaration", "Peek", etc. don't work. I don't know how to describe how 'big' the project is: there are source files with over 26k lines, and it can take up to 45 minutes to compile. When I work with a more reasonably sized project, I have no issues, so until now I assumed this was a limitation of the program due to the size of my project and resigned myself to it. Now it is really bothering me and I would like to find a solution.
What strikes me is that searching the whole project (Ctrl + Shift + F) is blazing fast and works brilliantly, so VSCode seems to be capable of 'handling' this big project.
C/C++ extension from Microsoft, latest version v0.28.3
VSCode latest version 1.46.1
Windows 10
Do you think there is a solution for this? Have you used VSCode with massive projects?
Edit: by 'don't work' I mean, it tries to perform the action but stays 'thinking' indefinitely.
Most probably it is not "not working" but just "pretty slow". This is a known problem for C/C++ projects using the C/C++ extension for Visual Studio Code. The IntelliSense indexer needs some time (especially if you are not limiting it via limitSymbolsToIncludedHeaders or something similar). You could try reducing the number of parsed files by using explicit browse paths in your c_cpp_properties.json, like
"browse": {
    "path": [
        "/usr/include/",
        "/usr/local/include/",
        "${workspaceRoot}/../include",
        "${workspaceRoot}/dir1",
        "${workspaceRoot}/dir2",
        "${workspaceRoot}/dir3/src/c++",
        "${workspaceRoot}/dir5",
        "${workspaceRoot}/dir6/src",
        "${workspaceRoot}/dir7/src",
        "${workspaceRoot}/dir4"
    ]
}
and excluding, for example, IDE/SDK files for which you do not need autocompletion, Go to Symbol, or Go to Definition.
For more explanation see: https://github.com/microsoft/vscode-cpptools/issues/1695

Tesseract (tess-two) recognising symbols and retraining dataset for a few characters?

Sorry for bothering you but I'm just a beginner with tess-two.
I have been trying to find out which characters/symbols are included in eng.traineddata, but haven't been able to. Could anyone point me in the right direction?
Also, I am using tess-two for Android and I've built the tess-two library natively. I'm working on a project that needs to recognise the following symbols, and I'm assuming that eng.traineddata has them:
£, ¥, $ etc.
I've been able to recognise the Euro and the dollar symbol, but the recognition fails when I add the other symbols to the whitelist. Is it because the dataset has not been trained on these characters, or is there something wrong with the input image?
Also, please let me know if there are any other traineddata files that contain all these symbols; that would be of great help.
Thanks

Motif programming and UTF-8

I'm new to Motif programming and I want to use UTF-8 encoding.
I've tried XtSetLanguageProc(NULL, NULL, NULL); but when I read a file in Motif (a text editor like the one in Volume 6A, the Motif Programming Manual), I get problems with accented characters.
Do I have to use setlocale()?
thanks!
With Motif, you have to switch to the correct font for the languages that you are using. There is currently no single UTF-8 font that has full support for all languages.
If there is more to your problem you might want to ask it on MotifZone http://www.motifzone.com/forum/unicode-support since Motif is not a commonly used toolkit anymore.
As Michael said, you need a font that supports Unicode. The ones with the broadest support are ISO 10646 fonts. Assuming Linux with X11, launch xfontsel to find them. Select iso10646 from the rgstry drop-down menu. The fmly menu will then list the available fonts with that encoding. Some are very limited, but
-*-fixed-medium-*-*-*-18-*-*-*-*-*-iso10646-*
is a good choice that comes with the X11 installation.
Then you need either to set that font as a fallback in your Motif program or to supply the resource on the command line:
xmprogram -xrm '*fontList: -*-fixed-medium-*-*-*-18-*-*-*-*-*-iso10646-*'
If all worked right, there will be no problems with accented characters anymore.
For a font supporting even more glyphs, consider GNU Unifont.

How to break words into syllables in LaTeX correctly

I am writing my MSc thesis with LaTeX and I have the problem that sometimes my words are hyphenated in the wrong place.
My language is Spanish and I'm using the babel package.
How could I solve it?
For example: propuestos appears as prop-uestos (with uestos on the next line). It should be pro-puestos.
Thanks!!
If you only have a small number of hyphenation errors to correct, you can use \hyphenation to fix them. For instance: \hyphenation{pro-puestos}. This command goes after \documentclass and before \begin{document}.
You can put more than one dash in, if you want to give TeX more line-breaking options: \hyphenation{tele-mun-dos}. You can list many words inside the braces; put spaces between them.
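Put together, a minimal sketch of where these exceptions go might look like the following. It assumes Spanish is the only language passed to babel, so that the exceptions are stored for Spanish, and that the Spanish babel support files are installed. The sample sentence is made up for illustration. \showhyphens is only there for checking: it writes the break points TeX would use to the log.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[spanish]{babel}
% Hyphenation exceptions: after \documentclass, before \begin{document}.
% Assumption: they are recorded for Spanish, the only language loaded here.
\hyphenation{pro-puestos tele-mun-dos}
\begin{document}
% Diagnostic only: reports the permitted break points in the log file.
\showhyphens{propuestos telemundos}

Los cambios propuestos son los siguientes.
\end{document}
```

If the document switches between several languages, babel's own \babelhyphenation command, which takes the language as an optional argument, may be the safer way to attach an exception to a specific language.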
If more than a handful of words are wrong, though, TeX is probably using hyphenation patterns for the wrong language -- and if "propuestos" were an English word, it would be hyphenated after "prop", so that's another point in favor of that theory. Do you get a message like this when you run LaTeX?
Package babel Warning: No hyphenation patterns were loaded for
(babel) the language `Spanish'
(babel) I will use the patterns loaded for \language=0 instead.
If so, you need to reconfigure your TeX installation with Spanish hyphenation turned on. There should be instructions for that in the manuals that came with the installation. Unfortunately, this is one of the places where TeX's age shows through -- you can't just load a package with the proper hyphenation rules (or Babel would do that); you have to do it when compiling the "base format" with INITEX, which is a maintenance operation. Modern TeX installations have nice utilities for that, but they're all different and I don't know which one you're using.

Resources