My Windows C program does not print Japanese characters - c

#include <locale.h>
#include <stdio.h>
#include <wchar.h>
int main() {
    setlocale(LC_ALL, "");
    wchar_t test = L'づ';
    printf("%ls", L"\x3065");
    printf("%lc", test);
    return 0;
}
The expected output is づづ, but neither of these two printf calls prints anything. What can I do to solve this problem?

printf is a narrow string function. Unless you have requested UTF-8 in your manifest and are running on an appropriate version of Windows 10, it is not going to print Unicode correctly in all cases.
Use wprintf to print wide strings. Depending on the C runtime library, you might need to call _setmode(_fileno(stdout), _O_U16TEXT); first before printing.
Even if your program does everything correctly, it might still not work in the console. Using the new Windows Terminal should work. The older console might just display squares; this is a console/font limitation. Copy the squares to the clipboard and paste them into WordPad to see that your program actually worked correctly.
See also:
Myth busting in the console
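A minimal sketch of the approach described in this answer, assuming the Microsoft C runtime on Windows. The helper names `enable_wide_output` and `print_wide_line` are made up for illustration; on other platforms the first helper is a no-op, which is an assumption of this example rather than something the answer covers.

```c
#include <assert.h>
#include <stdio.h>
#include <wchar.h>

#ifdef _WIN32
#include <fcntl.h>
#include <io.h>
#endif

/* Hypothetical helper: prepare a stream for wide-character output.
   Returns 0 on success, -1 on failure. */
int enable_wide_output(FILE *out)
{
    assert(out != NULL);
#ifdef _WIN32
    /* Microsoft CRT: switch the stream to UTF-16 text mode.
       _setmode returns the previous mode, or -1 on error. */
    return _setmode(_fileno(out), _O_U16TEXT) == -1 ? -1 : 0;
#else
    /* Assumption: no mode switch exists or is needed elsewhere. */
    return 0;
#endif
}

/* Hypothetical helper: write a wide string followed by a newline.
   Returns the number of wide characters written, or a negative
   value on error. */
int print_wide_line(FILE *out, const wchar_t *text)
{
    assert(out != NULL && text != NULL);
    return fwprintf(out, L"%ls\n", text);
}
```

From `main`, one would call `enable_wide_output(stdout)` once and then `print_wide_line(stdout, L"\x3065")`. Whether づ actually appears still depends on the console and its font, as noted above.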

Related

Problem with encoding and terminals with Special Characters

I'm writing a program, and somewhere along the way the file's encoding changed to UTF-8, which created a problem for me. I'm Brazilian, and some of my phrases are in Portuguese and use special characters that are in the ASCII table. Revising every printf and every phrase or word to check for a special character is madness in a 700-line program, and I'm short on time. I tried changing the encoding to ISO-8859-1, UNICODE, and WINDOWS-1252, but as soon as I build or save the file it returns to UTF-8. I also tried setlocale(LC_ALL,"pt_BR.utf8") and other values, but nothing happens. I thought the Code::Blocks terminal was broken, so I made a new test file to see whether the WINDOWS-1252 encoding worked. Does anyone have an idea that would help, or will I have to fix it character by character?
I'm using the default Code::Blocks terminal, cb_console_runner.
Isn't UTF-8 an encoding whose bytes are incompatible with special characters? The default in Unicode is 16 bytes, or am I wrong?
EDIT:
#include <stdio.h>
#include <locale.h>
int main(){
    printf("%s", setlocale(LC_ALL,"pt_BR.utf8"));
}
returned: (NULL)
#include <stdio.h>
#include <locale.h>
int main(){
    printf("%s", setlocale(LC_ALL,""));
}
returned: Portuguese_Brazil.1252
I looked through earlier questions on the Portuguese Stack Overflow, but none of them helped. Some say it is the encoding, others say it is the terminal.
So, yesterday I talked to my professor and we both agreed that it was the encoding, but he had an idea: I opened my code in Dev-C++, and we noticed that the special letters in the file were "corrupted". I think this happened when the file was saved as UTF-8.
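On the encoding question raised above: Unicode is not "16 bytes" per character. UTF-8 uses one to four bytes per code point, while Windows-1252 always uses exactly one. The sketch below (the function name `utf8_encode` is invented for this illustration) shows that é, U+00E9, is the single byte 0xE9 in Windows-1252 but the two bytes 0xC3 0xA9 in UTF-8. That difference would explain why accented letters in a UTF-8 file look "corrupted" (for example "Ã©") when the bytes are interpreted as Windows-1252.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point as UTF-8.
   Writes up to 4 bytes into out and returns the byte count,
   or 0 if cp is not a valid Unicode scalar value. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);

    if (cp < 0x80) {                       /* ASCII: 1 byte */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* e.g. é, ã, ç: 2 bytes */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                    /* e.g. づ: 3 bytes */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                      /* surrogates are not encodable */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* supplementary planes: 4 bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                              /* beyond the Unicode range */
}
```

Calling `utf8_encode(0xE9, buf)` returns 2 and fills `buf` with 0xC3 0xA9, so a console expecting code page 1252 sees two characters where the source file meant one.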

What LC_CTYPE locale is used for a Windows Unicode console app?

While converting a multi-byte console application to Unicode, I ran up against a weird problem where _tcprintf and WriteConsole worked fine but _tprintf was printing the wrong characters...
I've traced it back to using setlocale(LC_ALL, "C"), which uses a 1-byte LC_CTYPE according to the MS doc:
The C locale assumes that all char data types are 1 byte and that their value is always less than 256.
However, I want to keep "C" for everything except the LC_CTYPE but I don't know what to use?
I thought the whole point of using UTF16 is that all the characters are available and things would print properly no matter the code page or locale.
Although it also appears setting the console output to UTF-8 (65001) (SetConsoleCP which of course is separate from the locale) in a Unicode app and outputting UTF16 also has problems displaying the correct characters.
Anyway, does anyone know what value I should be using for LC_CTYPE for UTF-16 in a Windows Unicode console application? Maybe it's as easy as setlocale( LC_CTYPE, "" ); ? TIA!!
Use _setmode() to set the file translation mode to _O_U16TEXT:
#include <fcntl.h>
#include <io.h>
#include <stdio.h>
int main(void)
{
    _setmode(_fileno(stdout), _O_U16TEXT);
    wprintf(L"ελληνικά\n");
}
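On the asker's own suggestion of keeping "C" for everything except `LC_CTYPE`: standard `setlocale` does allow categories to be set independently. The sketch below (the helper name `use_c_locale_except_ctype` is hypothetical) shows only that mechanism. Whether it changes what `_tprintf` prints is not confirmed anywhere in this thread, and the answer above relies on `_setmode` instead.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: keep the "C" locale for every category,
   then take only LC_CTYPE from the user's environment.
   Returns 1 if the LC_CTYPE request succeeded, 0 otherwise
   (in which case LC_CTYPE stays "C"). */
int use_c_locale_except_ctype(void)
{
    /* "C" is always available, so this request is expected to succeed. */
    const char *base = setlocale(LC_ALL, "C");
    assert(base != NULL);
    (void)base;

    /* "" asks for the environment's default locale; a NULL result
       means the request failed and nothing was changed. */
    return setlocale(LC_CTYPE, "") != NULL ? 1 : 0;
}
```

After the call, `setlocale(LC_NUMERIC, NULL)` still reports "C", so number formatting in printf is unaffected by the user's regional settings.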

Eclipse; Escape Sequences don't work?

I am doing a basic C tutorial. In an example this code was given to introduce escape sequences:
#include <stdio.h>
int main()
{
    printf("This is a \"sample text\"\n");
    printf("\tMore text\n");
    printf("This is getting overwritten\r");
    printf("By this, another sample text\n");
    printf("The spa \bce is removed.\n");
    return 0;
}
The console output is expected to look like this:
This is a "sample text"
More text
By this, another sample text
The space is removed.
Instead, I get this:
This is a "sample text"
More text
This is getting overwritten
By this, another sample text
The spa ce is removed.
I am using Eclipse Cpp Oxygen on Windows and the Cygwin toolchain to compile and run the code. I don't know what I'm doing wrong and I thought I'd ask here for help.
The console built in to Eclipse does not support the \r, \b (and \f) characters.
There is a long-standing bug, 76936, for this which has been open for 14 years, and it doesn't look like it will be fixed.
On Linux your example works exactly as you expect. On Windows the \r is probably treated like \n.
On a Linux terminal, by contrast, the \r (correctly) puts the cursor on the first character of the row.
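To make the expected output concrete, here is a small sketch of how a terminal that honours `\r` and `\b` treats one line of output. The function `render_line` is invented for this illustration and models only those two control characters on a single line; real terminals do far more.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one terminal line: '\r' moves the cursor to
   column 0, '\b' moves it one column left, and any other character
   overwrites the cell under the cursor. Processing stops at '\n' or
   at the end of the input. The visible text is written to out
   (capacity cap, always NUL-terminated). Returns the line length. */
size_t render_line(const char *in, char *out, size_t cap)
{
    size_t cursor = 0; /* current column */
    size_t len = 0;    /* number of cells written so far */

    assert(in != NULL && out != NULL && cap > 0);

    for (; *in != '\0' && *in != '\n'; ++in) {
        if (*in == '\r') {
            cursor = 0;
        } else if (*in == '\b') {
            if (cursor > 0)
                --cursor;
        } else if (cursor + 1 < cap) {
            out[cursor++] = *in;
            if (cursor > len)
                len = cursor;
        }
    }
    out[len] = '\0';
    return len;
}
```

Feeding it `"The spa \bce is removed.\n"` yields `The space is removed.`, because the `c` lands on the cell that held the space. A console that ignores `\b`, as the Eclipse console is described to do above, leaves the space in place.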

c - _setmode function causing debug error

Ok, so after posting this question I tried to use the solutions provided in the related questions (particularly this) pointed by the community but I had another problem.
When trying to use the _setmode() function to change the Windows console to print UTF characters I get a debug error, just like the one posted on this other question. The debug error is as follows:
Text:
Debug Assertion Failed!
Program:
...kout-Desktop_Qt_5_5_0_MSVC2013_64bit-Debug\debug\Breakout.exe
File: f:\dd\vctools\crt\crtw32\stdio\output.c
Line: 1033
Expression: ((_Stream->_flag & _IOSTRG) || ( fn = _fileno(_Stream), (
(_textmode_safe(fn) == _IOINFO_TM_ANSI) &&
!_tm_unicode_safe(fn))))
For information on how your program can cause an assertion failure, see the Visual C++ documentation on asserts.
(Press Retry to debug the application)
Without the _setmode() function I still can't print characters from the upper ASCII table, like these: "┌──┐". What can I do to solve this problem? The solution to the question with the same problem doesn't work either.
Again, I'm using Qt Creator on Windows, with Qt version 5.5.0 MSVC 64 bits. The compiler is the Microsoft Visual C++ Compiler 12.0 (amd64).
Edit:
Here's a small sample code that causes the error:
#include <stdio.h>
#include <io.h>
#include <fcntl.h>
int main(void)
{
    //Using setmode to force the error
    _setmode(_fileno(stdout), _O_U16TEXT);
    printf("Hello World!\n");
    return 0;
}
Upon execution the error appears.
It seems that if you set the output mode to UTF-16, you must then use wprintf instead of printf.
(Presumably, since you have told the runtime to translate from UTF-16, you have to provide UTF-16.)
This code runs on my machine:
#include <fcntl.h>
#include <io.h>
#include <stdio.h>
int main(void) {
    _setmode(_fileno(stdout), _O_U16TEXT);
    wprintf(L"\x043a\x043e\x0448\x043a\x0430 \x65e5\x672c\x56fd\n");
    return 0;
}
So does
wprintf(L"Hello world!\n");
PS - I'm not sure whether this will solve your underlying problem, which I suspect has to do with the encoding of the source file. Even if using UTF-16 does solve your problem, it probably isn't the best solution.

C Programming - ascii for windows "unknown" characters

I'm programming on Windows, but in my C console some characters (like é, à, ã) are not recognizable. I would like to know how I can make Windows interpret those characters as Unicode or UTF-8 in the console.
I would be glad for some enlightenment.
Thank you very much
By console do you mean cmd.exe? It doesn't handle Unicode well, but you can get it to display "ANSI" characters by changing the display font to Lucida Console and changing the code page from "OEM" to "ANSI." By the choice of characters you seem to be Western European, so try giving this command before running your application:
chcp 1252
If you want to try your luck with UTF-8 output use chcp 65001 instead.
Although I completely agree with Joni's answer, I think a detail can be added:
Since Telmo Vaz asked about how to solve this problem for C programs, we can consider the alternative of adding a system command inside the code:
#include <stdlib.h> // To use the function system();
#include <stdio.h>
int main(void) {
    system("CHCP 1252");
    printf("Now accents are right: áéíüñÇ \n");
    return 0;
}
EDIT It is a good idea to do some experiments with codepages. Check the following table for information (under Windows):
Windows Codepages
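As an alternative to spawning `chcp` through `system()`, the Win32 console API offers `GetConsoleOutputCP` and `SetConsoleOutputCP`. A minimal sketch, assuming a Windows build: the wrapper `with_output_codepage` and the sample callback `print_accents` are hypothetical names, and on other platforms the wrapper just runs the callback without switching anything.

```c
#include <assert.h>
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical wrapper: run body() with the console output code page
   set to codepage, then restore the previous code page.
   Returns 1 if the code page was switched, 0 if it was not
   (including non-Windows builds); body() runs in either case. */
int with_output_codepage(unsigned int codepage, void (*body)(void))
{
    int switched = 0;
#ifdef _WIN32
    UINT previous = GetConsoleOutputCP();
#endif
    assert(body != NULL);
#ifdef _WIN32
    switched = SetConsoleOutputCP(codepage) ? 1 : 0;
#else
    (void)codepage; /* console code pages are a Windows concept */
#endif
    body();
    fflush(stdout); /* emit buffered text before the code page changes back */
#ifdef _WIN32
    if (switched)
        SetConsoleOutputCP(previous);
#endif
    return switched;
}

/* Sample callback: the bytes are written as explicit Windows-1252
   values (á é í) so the result does not depend on how the source
   file itself is encoded. */
void print_accents(void)
{
    puts("Accents: \xE1 \xE9 \xED");
}
```

A call such as `with_output_codepage(1252, print_accents)` avoids leaving the user's console on a different code page after the program exits, which `system("CHCP 1252")` does not undo.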

Resources