Why doesn't putchar() output the copyright symbol while printf() does? - c

So I want to print the copyright symbol, but putchar() just cuts off the most significant byte of the character, which results in an unprintable character.
I am using Ubuntu MATE and the encoding I am using is en_US.UTF-8.
What I know so far is that © is encoded in UTF-8 as the two bytes 0xc2 0xa9. When I try putchar('©' - 0x70), it prints 9, which is 0x39; adding 0x70 back gives 0xa9, the least significant byte of 0xc2a9.
#include <stdio.h>
main()
{
printf("©\n");
putchar('©');
putchar('\n');
}
I expect the output to be:
©
©
rather than:
©
�

The putchar function takes an int argument and casts it to an unsigned char to print it. So you can't pass it a multibyte character.
You need to call putchar twice, once for each byte of the character's UTF-8 encoding.
putchar(0xc2);
putchar(0xa9);

You could try the wide version: putwchar
Edit: That was actually more difficult than I thought. Here's what I needed to make it work:
#include <locale.h>
#include <wchar.h>
#include <stdio.h>
int main() {
setlocale(LC_ALL, "");
putwchar(L'©');
return 0;
}

Related

input hex to string output in C

I am a beginner at C programming. I researched how to get a solution to my problem but I didn't find an answer so I asked here. My problem is:
I want to convert a hex array to a string. For example,
this is my input: uint8_t hex_in[4]={0x10,0x01,0x00,0x11};
and I want string output like this: "10010011"
I tried some solutions, but they give me "101011", dropping the zeros.
How can I obtain an 8-digit string?
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
int main(){
char dene[2];
uint8_t hex_in[4]={0x10,0x01,0x00,0x11};
//sprintf(dene, "%x%*x%x%x", dev[0],dev[1],2,dev[2],dev[3]);
//sprintf(dene, "%02x",hex_in[1]);
printf("dene %s\n",dene);
}
To store the output in a string, the string must be large enough: in this case, 8 digits plus the null terminator, so 9 bytes. An array of 2 holds only 1 digit plus the null terminator.
Then you can print each number with %02x or %02X to get 2 digits. Lower-case x gives lower case abcdef, upper-case X gives ABCDEF - otherwise they are equivalent.
Corrected code:
#include <stdio.h>
#include <stdint.h>
int main(void)
{
char str[9];
uint8_t hex_in[4]={0x10,0x01,0x00,0x11};
/* No "\n" in the format: str has room for 8 digits + null only, and puts() appends the newline. */
sprintf(str,"%02x%02x%02x%02x", hex_in[0],hex_in[1],hex_in[2],hex_in[3]);
puts(str);
}
Though pedantically, you should always print uint8_t and other types from stdint.h using the PRIx8 etc specifiers from inttypes.h:
#include <inttypes.h>
sprintf(str,"%02"PRIx8"%02"PRIx8"%02"PRIx8"%02"PRIx8,
hex_in[0],hex_in[1],hex_in[2],hex_in[3]);

Why Unicode characters are not displayed properly in terminal with GCC?

I've written a small C program:
#include <stdio.h>
#include <stdlib.h>
#include <locale.h>
int main() {
wprintf(L"%s\n", setlocale(LC_ALL, "C.UTF-8"));
wchar_t chr = L'┐';
wprintf(L"%c\n", chr);
}
Why doesn't this print the character ┐ ?
Instead it prints gibberish.
I've checked:
tried compiling without setlocale, same result
the terminal itself can print the character, I can copy-paste it to terminal from text-editor, it's gnome-terminal on Ubuntu
GCC version is 4.8.2
wprintf is a version of printf which takes a wide string as its format string, but otherwise behaves just the same: %c is still treated as char, not wchar_t. So instead you need to use %lc to format a wide character. And since your strings are ASCII you may as well use printf. For example:
#include <locale.h>
#include <stdio.h>
#include <wchar.h>
int main() {
printf("%s\n", setlocale(LC_ALL, "C.UTF-8"));
wchar_t chr = L'┐';
printf("%lc\n", chr);
}

C store and print wchar_t

I want to store a string with characters from the extended ASCII table and print them.
I tried:
wchar_t wp[] = L"Росси́йская Акаде́мия Нау́к ";
printf("%S", wp);
I can compile but when I run it, nothing is actually displayed in my terminal.
Could you help me please?
Edit: In response to this comment:
wprintf(L"%s", wp);
Sorry, I forgot to mention that I can only use write(), as I was only using printf for my first attempts.
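If you want wide chars as output (16 bits each on Windows; wchar_t is usually 32 bits on Linux), use wprintf, as suggested by Michael. In standard C the conversion for a wchar_t string is %ls; Microsoft's runtime also treats %s as a wide string inside wprintf:
wprintf(L"%ls", wp);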
If you need UTF-8 output, you can use iconv() to convert between the two. See question 7469296 as a starting point.
You need to call setlocale() first and use %ls in printf():
#include <stdio.h>
#include <wchar.h>
#include <locale.h>
int main(int argc, char *argv[])
{
setlocale(LC_ALL, "");
// setlocale(LC_ALL, "C.UTF-8"); // this also works
wchar_t wp[] = L"Росси́йская Акаде́мия Нау́к";
printf("%ls\n", wp);
return 0;
}
For more about setlocale(), refer to Displaying wide chars with printf

Output unicode wchar_t character

Just trying to output this unicode character ☒ in C using MinGW. I first put it on a buffer using swprintf, and then write it to the stdout using wprintf.
#include <stdio.h>
int main(int argc, char **argv)
{
wchar_t buffer[50];
wchar_t c = L'☒';
swprintf(buffer, L"The character is: %c.", c);
wprintf(buffer);
return 0;
}
The output under Windows 8 is:
The character is: .
Other characters such as Ɣ don't work either.
What am I doing wrong?
You're using %c, but %c is for char, even when you use it from wprintf(). Use %lc, because the parameter is wchar_t.
swprintf(buffer, L"The character is: %lc.", c);
This kind of error should normally be caught by compiler warnings, but it doesn't always happen. In particular, catching this error is tricky because %c takes an int and %lc takes a wint_t rather than char and wchar_t, so the arguments are plain integer types either way (the difference is how the conversion interprets the value).
To output Unicode (or to be more precise UTF-16LE) to the Windows console, you have to change the file translation mode to _O_U16TEXT or _O_WTEXT. The latter one includes the BOM which isn't of interest in this case.
The file translation mode can be changed with _setmode. But it takes a file descriptor (abbreviated fd) and not a FILE *! You can get the corresponding fd from a FILE * with _fileno.
Here's an example that should work with MinGW and its variants, and also with various Visual Studio versions.
#define _CRT_NON_CONFORMING_SWPRINTFS
#include <stdio.h>
#include <io.h>
#include <fcntl.h>
int
main(void)
{
wchar_t buffer[50];
wchar_t c = L'Ɣ';
_setmode(_fileno(stdout), _O_U16TEXT);
swprintf(buffer, L"The character is: %c.", c);
wprintf(buffer);
return 0;
}
This works for me:
#include <locale.h>
#include <stdio.h>
#include <wchar.h>
int main(int argc, char **argv)
{
wchar_t buffer[50];
wchar_t c = L'☒';
if (!setlocale(LC_CTYPE, "")) {
fprintf(stderr, "Cannot set locale\n");
return 1;
}
swprintf(buffer, sizeof buffer / sizeof buffer[0], L"The character is %lc.", c);
wprintf(buffer);
return 0;
}
What I changed:
I added the wchar.h include required by the use of swprintf
I added the size (in wide characters, not bytes) as the second argument of swprintf, as required by C
I changed the %c conversion specification to %lc
I changed the locale using setlocale
This FAQ explains how to use Unicode / wide characters in MinGW:
https://sourceforge.net/p/mingw-w64/wiki2/Unicode%20apps/

Can't assign wide char into wide char field.

I have been given this school project. I have to alphabetically sort list of items by Czech rules. Before I dig deeper, I have decided to test it on a 16 by 16 matrix so I did this:
typedef struct {
wint_t **field;
}LIST;
...
setlocale(LC_CTYPE,NULL);
....
list->field=(wint_t **)malloc(16*sizeof(wint_t *));
for(int i=0;i<16;i++)
list->field[i]=(wint_t *)malloc(16*sizeof(wint_t));
In another function I am trying to assign a char. Like this:
sorted->field[15][15] = L'C';
wprintf(L"%c\n",sorted->field[15][15]);
Everything is fine. Char is printed. But when I try to change it to
sorted->field[15][15] = L'Č';
It says: Extraneous characters in wide character constant ignored. (Xcode) And the printing part is skipped. The main.c file is in UTF-8. If I try to print this:
printf("ěščřžýááíé\n");
It prints it out as written. I am not sure if I should allocate memory using wint_t or wchar_t, or if I am doing it right. I tested it with both, but neither works.
clang seems to support entering arbitrary character codes into wide strings with the \x notation:
wchar_t c = L'\x2126';
This compiles without warnings.
Edit: Adapting what I found on Wikipedia about wide characters, the following works for me:
#include <stdio.h>
#include <wchar.h>
#include <stdlib.h>
#include <locale.h>
int main(void)
{
setlocale(LC_ALL,"");
wchar_t myChar1 = L'\x2126';
wchar_t myChar2 = 0x2126; // Unicode code point U+2126 (Ω), stored directly as the wchar_t value
wprintf(L"This is char: %lc \n",myChar1);
wprintf(L"This is char: %lc \n",myChar2);
}
and prints nice Ω characters in my terminal. Make sure that your terminal is able to interpret UTF-8 characters.

Resources