How can I get the Unicode code point for a character? Here is what I have tried, but it is not printing the same character. Am I properly understanding how Unicode works?
How can I get the value of a unicode character?
#include <stdio.h>
int main()
{
char *a = "ā";
int n;
while(a[n] != '\0')
{
printf("%x", a[n]);
n+=1;
}
printf("\n \uC481");
return 0;
}
First, a few corrections to your code.
#include <stdio.h>
int main()
{
char *a = "ā";
int n = 0; //Initialize n with zero.
while(a[n] != '\0')
{
printf("%x", (unsigned char)a[n]); //Cast avoids sign extension of bytes above 0x7F.
n+=1;
}
//\uC481 names the code point U+C481, not the UTF-8 bytes C4 81 of "ā" (U+0101).
//To print a hexadecimal value, use the %X conversion.
printf("\n %X\n", 0xC481);
return 0;
}
Here you are printing the hex value of each byte of the string. For any character outside ASCII (above 0x7F), those bytes are its UTF-8 encoding, not its Unicode code point.
unsigned short is commonly used to store Unicode values, although at 16 bits it cannot hold every code point. If you need to store any code point as is, use a 32-bit integer type such as int (where int is 32 bits wide).
The Unicode value of a character is its numeric value when represented in UTF-32. If the encoding is UTF-8 or UTF-16, you have to compute it from the byte (or code unit) sequence.
Related
I am trying to display a wide character in hexadecimal, but I get unexpected results: the output is always a 2-digit hex value. Here is my code:
#include "stdlib.h"
#include "stdio.h"
#include"wchar.h"
#include "locale.h"
int main(){
setlocale(LC_ALL,"");
wchar_t ch;
wscanf (L"%lc",&ch);
wprintf(L"%x \n",ch);
return 0;
}
input : Ω
result: 0xea
expected result : 0xcea9
I changed the setlocale call several times, but the result is always the same.
Note: when the input character fits in one byte, it works as expected.
Note that you should use <...> to include standard headers. The line wprintf(L"%x \n", ch) is invalid and most probably undefined behavior: ch is (possibly) not an unsigned int, so you can't apply %x to it.
You are expecting wide characters to be stored as UTF-8. They are not, and that wouldn't make much sense. Your program reads a sequence of bytes in the multibyte encoding, and that sequence of bytes is then converted (depending on the locale) to the wide character encoding. The wide character encoding (usually) stays the same and should be UTF-32 on Linux. The locale affects the way multibyte characters are converted to wide characters and back, not the representation of wide characters.
The following program:
#include <stdlib.h>
#include <stdio.h>
#include <wchar.h>
#include <locale.h>
int main(){
setlocale(LC_ALL,"");
wchar_t ch;
int cnt = wscanf(L"%lc",&ch);
if (cnt != 1) { /* handle error */ abort(); }
wprintf(L"%x\n", (unsigned int)ch);
return 0;
}
On Linux, when the input is Ω (Greek Capital Letter Omega, U+03A9), the program outputs 3a9. What actually happens is that the input arrives as a UTF-8 encoded character, i.e. the two bytes 0xCE 0xA9, which are converted to UTF-32 and stored in the wide character. You can convert the wide character from the wide character encoding (UTF-32) back to the multibyte character encoding (UTF-8 should be the default, but it depends on the locale) and print the bytes that represent the character in the multibyte encoding:
char tmp[MB_CUR_MAX];
int len = wctomb(tmp, ch); // prefer wcrtomb
if (len < 0) { /* handle error */ abort(); }
for (int i = 0; i < len; ++i) {
wprintf(L"%hhx", (unsigned char)tmp[i]);
}
wprintf(L"\n");
That will output cea9 on my platform.
I was trying to write this int-to-char program. The +'0' in the do-while loop won't convert the int value to ASCII, whereas the +'0' in main does. I have tried many statements, but it won't work in convert().
#include<stdio.h>
#include<string.h>
void convert(int input,char s[]);
void reverse(char s[]);
int main()
{
int input;
char string[5];
//prcharf("enter int\n");
printf("enter int\n");
scanf("%d",&input);
convert(input,string);
printf("Converted Input is : %s\n",string);
int i=54;
printf("%c\n",(i+'0')); //This give ascii char value of int
printf("out\n");
}
void convert(int input,char s[])
{
int sign,i=0;
char d;
if((sign=input)<0)
input=-input;
do
{
s[i++]='0'+input%10;//but this gives int only
} while((input/=10)>0);
if(sign<0)
s[i++]='-';
s[i]=EOF;
reverse(s);
}
void reverse(char s[])
{
int i,j;
char temp;
for(i=0,j=strlen(s)-1;i<j;i++,j--)
{
temp=s[i];
s[i]=s[j];
s[j]=temp;
}
}
Output screenshot
Code screenshot
"The +'0' in the do while loop wont convert the int value to ascii"
Your own screenshot shows otherwise (assuming an ASCII-based terminal).
Your code printed 56, so it printed the bytes 0x35 and 0x36, so string[0] and string[1] contain 0x35 and 0x36 respectively, and 0x35 and 0x36 are the ASCII encodings of 5 and 6 respectively.
You can also verify this by printing the elements of string individually.
for (int i=0; string[i]; ++i)
printf("%02X ", (unsigned char)string[i]);
printf("\n");
I tried your program and it is working for the most part. I get some goofy output because of this line:
s[i]=EOF;
EOF is a negative integer macro that represents "End Of File." Its actual value is implementation defined. It appears what you actually want is a null terminator:
s[i]='\0';
That will remove any goofy characters in the output.
I would also make that string in main a little bigger. No reason we couldn't use something like
char string[12];
I would use a bare minimum of 12 which will cover you to a 32 bit INT_MAX with sign.
EDIT
It appears (based on all the comments) you may be actually trying to make a program that simply outputs characters using numeric ascii values. What the convert function actually does is converts an integer to a string representation of that integer. For example:
int num = 123; /* Integer input */
char str_num[12] = "123"; /* char array output */
convert is basically a manual implementation of itoa.
If you are trying to simply output characters given ascii codes, this is a much simpler program. First, you should understand that this code here is a mis-interpretation of what convert was trying to do:
int i=54;
printf("%c\n",(i+'0'));
The point of adding '0' previously was to convert single-digit integers to their ASCII codes. For reference, use this: asciitable. For example, to convert the integer 4 to the character '4', you add 4 to '0' (ASCII code 48) to get 52, the ASCII code for the character '4'. To print the character represented by an ASCII code, the solution is much more straightforward. As others have stated in the comments, char is essentially a numeric type already. This will give you the desired behavior:
int i = 102; /* The actual ascii value of 'f' */
printf("%c\n", i);
That works as written: %c expects an int argument, which printf converts to unsigned char before printing, so passing an int is not undefined behavior. A cast to char is harmless but redundant, because the value is promoted back to int when passed to printf. With the explicit cast:
printf("%c\n", (char) i);
So you can write the entire program in main since there is no need for the convert function:
#include <stdio.h>
int main(void)
{
    /* Make initialization a habit */
    int input = 0;
    /* Loop through until we get a value between 0-127 */
    do {
        printf("enter int\n");
        /* Stop on EOF or non-numeric input instead of looping forever */
        if (scanf("%d",&input) != 1)
            return 1;
    } while (input < 0 || input > 127);
    printf("Converted Input is : %c\n", (char)input);
    return 0;
}
We don't want anything outside of 0-127. A char is one byte (typically 8 bits, so 256 distinct values); whether plain char is signed is implementation-defined, and where it is signed it typically spans -128 to 127. If you wanted a literal interpretation of higher values, you could use unsigned char (0-255). This is undesirable on a Linux terminal, which is likely expecting UTF-8: values above 127 represent portions of multi-byte characters. If you wanted to support this, you would need a char[] and the code would become a lot more complex.
I am new to C programming and trying to make a program to add up the digits from the input like this:
input = 12345 <= 5 digit
output = 15 <= add up digit
I tried to convert the char at each index to an int, but it doesn't seem to work. Can anyone help?
Here's my code:
#include <stdio.h>
#include <string.h>
int main(){
char nilai[5];
int j,length,nilai_asli=0,i;
printf("nilai: ");
scanf("%s",&nilai);
length = strlen(nilai);
for(i=0; i<length; i++){
int nilai1 = nilai[i];
printf("%d",nilai1);
}
}
Output:
nilai: 12345
4950515253
You have two problems with the code you show.
First, let's talk about the problem you ask about: you display the encoded character value. All characters in C are encoded in one way or another. The most common encoding scheme is ASCII, where the digits are encoded from '0' at 48 up to '9' at 57.
Using this knowledge it should be quite easy to figure out a way to convert a digit character to the integer value of the digit: Subtract the character '0'. As in
int nilai1 = nilai[i] - '0'; // "Convert" digit character to its integer value
Now for the second problem: Strings in C are really called null-terminated byte strings. That null-terminated bit is quite important, and all string functions (like strlen) will look for the terminator to know where the string ends.
When you input five characters for the scanf call, scanf writes the null terminator at the sixth position of the five-element array. That is out of bounds and leads to undefined behavior.
You can solve this by either making the array longer, or by telling scanf not to write more characters into the array than it can actually fit:
scanf("%4s", nilai); // Read at most four characters
// which will fit with the terminator in a five-element array
First of all, your buffer isn't big enough. String input is null-terminated, so if you want to read your 5-digit input 12345, you need a buffer of at least 6 chars:
char nilai[6];
And if your input is bigger than 5 chars, then your buffer has to be bigger, too.
But the problem with adding up the digits is that you're not actually adding up anything. You're just assigning to int nilai1 over and over and discarding the result. Instead, put int nilai1 before the loop and increase it in the loop. Also, to convert from a char to the int it represents, subtract '0'. All in all this part should look like this:
int nilai1 = 0;
for (i = 0; i < length; i++) {
nilai1 += nilai[i] - '0';
}
printf("%d\n", nilai1);
For starters according to the C Standard the function main without parameters shall be declared like
int main( void )
This character array
char nilai[5];
can not contain a string with 5 digits. Declare the array with at least one more character to store the terminating zero of a string.
char nilai[6];
In the call of scanf
scanf("%s",&nilai);
remove the operator & before the name nilai. And such a call is unsafe. You could use for example the standard function fgets.
This call
length = strlen(nilai);
is redundant and moreover the variable length should be declared having the type size_t.
This loop
for(i=0; i<length; i++){
int nilai1 = nilai[i];
printf("%d",nilai1);
}
does not make sense for this task: it prints each character's code instead of summing the digits.
The program can look the following way
#include <stdio.h>
#include <ctype.h>
int main(void)
{
enum { N = 6 };
char nilai[N];
printf( "nilai: ");
fgets( nilai, sizeof( nilai ), stdin );
int nilai1 = 0;
for ( const char *p = nilai; *p != '\0'; ++p )
{
if ( isdigit( ( unsigned char ) *p ) ) nilai1 += *p - '0';
}
printf( "%d\n", nilai1 );
return 0;
}
Its output might look like
nilai: 12345
15
I've been writing a program that converts a decimal number to its binary representation, but every time I run it, the resulting string shows additional characters like p┐. Here is the program:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
void main (void)
{
signed char str[256];
int dec,rest;
int i = -1;
int j;
printf ("write a decimal number : ");
scanf ("%d",&dec);
do
{
rest = dec % 2;
dec/= 2;
for (j=i;j>-1;--j)
str[j+1]=str[j];
str[0]=rest+48;
++i;
} while (dec!=0);
printf ("the binary representation of this number is %s",str);
}
the output :
write a decimal number : 156
the binary representation of this number is 10011100p┐
I don't know if I'm missing something, but I would be grateful for your help.
In C and C++, strings are null-terminated, this means that every valid string must end with a character with code 0. This character tells every function that is dealing with this string that it is in fact over.
In your program you create a string, signed char str[256]; and it is initially filled with random data; this means that you reserved space for 256 characters and they are all garbage, but the system does not know they are invalid. Try printing this string and see what happens.
To actually tell the system that your string is over after, say, 8 characters, the 9th character has to be the NUL character, or simply 0. In your code you can do it in two ways:
after the loop, assign str[i + 1] = 0 (i is the index of the last digit written), or (even simpler)
initialize the string as signed char str[256]={0};, which creates the storage and fills it with nulls; after writing to the string you can be sure that the character after the last one you've written will be a NUL.
At the end of your do {} while () loop, you need to set the character after the last character in your string to 0. This is the array index of the last character you want (i) plus one. This lets printf know where your string ends. (Otherwise, how could it know?)
Initialize the str array to NUL bytes before filling it:
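#include <stdio.h>
#include <string.h>
int main (void)
{
    char str[256];
    int dec,rest;
    int i = -1;
    int j;
    /* Every byte starts as a terminator, so the string is always terminated */
    memset( str, '\0', sizeof(str) );
    printf ("write a decimal number : ");
    if (scanf ("%d",&dec) != 1)
        return 1;
    do
    {
        rest = dec % 2;
        dec/= 2;
        /* Shift the digits written so far one place to the right */
        for (j=i;j>-1;--j)
            str[j+1]=str[j];
        str[0]=rest+48;
        ++i;
    } while (dec!=0);
    printf ("the binary representation of this number is %s",str);
    return 0;
}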
Hello. I'm reading a file through a FILE pointer using fgetc.
The fgetc function returns the int values of my characters in ASCII.
Now I want to print that data as char values.
How do I convert my ASCII numbers to chars?
You most likely don't need any conversion. If the native character set on your system is ASCII (which is the most common) then no conversion is needed. 'A' == 65 etc.
That means to print a character you just print it, with e.g. putchar or printf or any other function that allows you to print characters.
int x = 48;
printf("%c", x);
This will print 0. You can also do this:
int x = 48;
char xx = (char)x;
Specify format as for a char, like so:
printf("%c", number);
printf("%c", 65); // A
#include <stdio.h>
int main()
{
    int i;
    for (i=97; i <=200 ; i++)
    {
        printf("%d %c,\t",i,i);
    }
    return 0;
}
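This will give you the chart from 97 up to 200. Note that values above 127 are outside ASCII, so what they display depends on the terminal's encoding.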