OK, so I'm reading The C Programming Language by Kernighan and Ritchie (second edition), and I'm having trouble understanding how one of the examples works.
#include <stdio.h>
#define MAXLINE 1000
int getline(char line[], int maxline);
void copy(char to[], char from[]);
int main(int argc, char *argv[])
{
int len;
int max;
char line[MAXLINE];
char longest[MAXLINE];
max = 0;
while((len = getline(line, MAXLINE)) > 1)
{
if(len > max)
{
max = len;
copy(longest, line);
}
}
if(max > 0)
printf("%s", longest);
getchar();
getchar();
return 0;
}
int getline(char s[], int lim)
{
int c, i;
for(i = 0; i < lim - 1 && (c = getchar()) != EOF && c != '\n'; ++i)
s[i] = c;
if(c == '\n')
{
s[i] = c;
++i;
}
s[i] = '\0';
return i;
}
void copy(char to[], char from[])
{
int i;
i = 0;
while((to[i] = from[i]) != '\0')
++i;
}
The line in question: for(i = 0; i < lim - 1 && (c = getchar()) != EOF && c != '\n'; ++i)
Where it says c = getchar(), how can an int hold characters typed on the command line? Integers I understand, but how are the characters I type being stored?
Thanks in advance
Unlike some other languages you may have used, chars in C are integers. char is just another integer type, usually 8 bits and smaller than int, but still an integer type.
So, you don't need ord() and chr() functions that exist in other languages you may have used. In C you can convert between char and other integer types using a cast, or just by assigning.
Unless EOF occurs, getchar() is defined to return "an unsigned char converted to an int" (same as fgetc), so if it helps you can imagine that it reads some char, c, then returns (int)(unsigned char)c.
You can convert this back to an unsigned char just by a cast or assignment, and if you're willing to take a slight loss of theoretical portability, you can convert it to a char with a cast or by assigning it to a char.
The getchar() function returns an integer which is the representation of the character entered. If you enter the character A, you will get 'A' or 0x41 returned (upgraded to an int and assuming you're on an ASCII system of course).
The reason it returns an int rather than a char is because it needs to be able to store any character plus the EOF indicator where the input stream is closed.
And, for what it's worth, that's not really a good book for beginners to start with. It's from the days where efficiency mattered more than readability and maintainability.
While it shows how clever the likes of K&R were, you should probably be looking at something more ... newbie-friendly.
In any case, the last edition of it covered C89 and quite a lot has changed since then. We've been through C99 and now have C11 and the book hasn't been updated to reflect either of them, so it's horribly out of date.
The C char type is at least 8 bits wide (exactly 8 on virtually all current platforms), which means it can typically store either -128 to 127 or 0 to 255, depending on whether it is signed (the C standard does not dictate which it is if you do not specify it). Either way that is 256 distinct values; ASCII itself only uses 0 to 127. getchar() returns int, which is at least 16 bits (usually 32 bits on modern machines). This means that it can store the whole range of char, as well as more values.
The reason why the return type is int is because the special value EOF is returned when the end of the input stream is reached. If the return type were char, then there would be no way to signal that the end of the stream was encountered (unless it took a pointer to a variable where this condition was recorded).
Now let's play a game of logic.
char is also an integer type, with a smaller range than int: typically 8 bits, that is, 1 byte. As we all know, integer types come in signed and unsigned versions (whether plain char is signed is up to the implementation). For an 8-bit char, the signed range is -128 to 127 and the unsigned range is 0 to 255. Now we know the type and "capability" of signed and unsigned char.
We humans understand characters, while the computer recognizes only binary sequences. Thus every programming language must provide a model for the conversion from characters to binary sequences. ASCII is the standard mapping used by C on most systems and by many other programming languages. It uses 0 to 127 to encode basic characters such as 0-9, a-z and A-Z, as well as the usual special ones; extended 8-bit character sets use 128 to 255 as well.
You may think that unsigned char is the exact fit. However, the program must know when to stop. The simplest way is to return a special value, and a negative one is a good choice, since larger positive values might be needed for other character sets. In the end C chose a negative int constant, usually -1, which is called EOF.
Now we've got the point. A signed char cannot hold every byte value from 0 to 255 as a non-negative number, while an unsigned char leaves no room for the termination value. We need a larger range to cover both, that is, the int type. Savvy?
Thanks to @cdhowie for the answer; it actually enlightened me.
Every character (including numbers) entered on the command line is read as a character and every character has an integer value based on its ASCII code http://www.asciitable.com/.
Your question has already been answered, but I'll add one more thing.
You are declaring the variable c as an int. If the input consists of the digits 0 to 9, c will hold their ASCII values, 48 to 57.
So to get the numeric value of a digit you can add one more line to the code:
c = c - 48; /* or c = c - '0'; which does not hard-code the ASCII value */
I was trying to write this int-to-char program. The +'0' in the do-while loop won't convert the int value to ASCII, whereas the +'0' in main does. I have tried many statements, but it won't work in convert().
#include<stdio.h>
#include<string.h>
void convert(int input,char s[]);
void reverse(char s[]);
int main()
{
int input;
char string[5];
//prcharf("enter int\n");
printf("enter int\n");
scanf("%d",&input);
convert(input,string);
printf("Converted Input is : %s\n",string);
int i=54;
printf("%c\n",(i+'0')); //This give ascii char value of int
printf("out\n");
}
void convert(int input,char s[])
{
int sign,i=0;
char d;
if((sign=input)<0)
input=-input;
do
{
s[i++]='0'+input%10;//but this gives int only
} while((input/=10)>0);
if(sign<0)
s[i++]='-';
s[i]=EOF;
reverse(s);
}
void reverse(char s[])
{
int i,j;
char temp;
for(i=0,j=strlen(s)-1;i<j;i++,j--)
{
temp=s[i];
s[i]=s[j];
s[j]=temp;
}
}
[Screenshot: program output]
[Screenshot: code]
The +'0' in the do while loop wont convert the int value to ascii
Your own screenshot shows otherwise (assuming an ASCII-based terminal).
Your code printed 56, so it printed the bytes 0x35 and 0x36, so string[0] and string[1] contain 0x35 and 0x36 respectively, and 0x35 and 0x36 are the ASCII encodings of 5 and 6 respectively.
You can also verify this by printing the elements of string individually.
for (int i=0; string[i]; ++i)
printf("%02X ", string[i]);
printf("\n");
I tried your program and it is working for the most part. I get some goofy output because of this line:
s[i]=EOF;
EOF is a negative integer macro that represents "End Of File." Its actual value is implementation defined. It appears what you actually want is a null terminator:
s[i]='\0';
That will remove any goofy characters in the output.
I would also make that string in main a little bigger. No reason we couldn't use something like
char string[12];
I would use a bare minimum of 12 which will cover you to a 32 bit INT_MAX with sign.
EDIT
It appears (based on all the comments) you may be actually trying to make a program that simply outputs characters using numeric ascii values. What the convert function actually does is converts an integer to a string representation of that integer. For example:
int num = 123; /* Integer input */
char str_num[12] = "123"; /* char array output */
convert is basically a manual implementation of itoa.
If you are trying to simply output characters given ascii codes, this is a much simpler program. First, you should understand that this code here is a mis-interpretation of what convert was trying to do:
int i=54;
printf("%c\n",(i+'0'));
The point of adding '0' previously was to convert single-digit integers to their ASCII codes. For reference, use an ASCII table. For example, if you wanted to convert the integer 4 to the character '4', you would add 4 to '0', which is ASCII code 48, to get 52, the ASCII code for the character '4'. To print the character represented by an ASCII code, the solution is much more straightforward. As others have stated in the comments, char is essentially a numeric type already. This will give you the desired behavior:
int i = 102; /* The actual ascii value of 'f' */
printf("%c\n", i);
That works as written: the %c conversion expects an int argument, which printf converts to unsigned char before printing, so passing an int is well defined. A cast to char is therefore redundant (the char is promoted straight back to int in the variadic call), but it does no harm and makes the intent explicit:
printf("%c\n", (char) i);
So you can write the entire program in main since there is no need for the convert function:
#include <stdio.h>
int main(void)
{
/* Make initialization a habit */
int input = 0;
/* Loop through until we get a value between 0-127 */
do {
printf("enter int\n");
if (scanf("%d", &input) != 1)
return 1; /* stop on non-numeric input or end of input */
} while (input < 0 || input > 127);
printf("Converted Input is : %c\n", (char)input);
return 0;
}
We don't want anything outside of 0-127. A char is normally 8 bits (a byte), giving 256 distinct values; if it is signed, it typically spans -128 to 127. If you wanted a literal interpretation of higher values, you could use unsigned char (0-255). This is undesirable on a Linux terminal, which is likely expecting UTF-8. Values above 127 represent portions of multi-byte characters. If you wanted to support this, you would need a char[] and the code would become a lot more complex.
How can I store the ASCII values of a string in an integer array, rather than print them, in C?
For example: I input "ABab" and I have an array a[4]. How can I store a[0]=65, a[1]=66, a[2]=97, a[3]=98?
It's already stored as you mention: a string is an array of char and a char is an integer.
I guess you know that computers work on "binary digits (bits)", so did you already ask yourself how the computer would store a character at all?
The answer is: That's what character encodings like ASCII are meant for: They assign a unique number to every character and the number is what gets stored!
As C should work on virtually any machine/system/OS, it doesn't specify which encoding is used for the characters, it just specifies a few properties this encoding must fulfill. ASCII only consists of 128 different codes (the amount that can be represented by 7 bits), so nowadays, larger encodings than ASCII are used, but with a probability very close to 1, your system uses an encoding that includes ASCII (very often this would be UTF-8).
So, as already commented, when you read characters in an array of char, you already end up with the ASCII values stored there. char is the smallest integer type in C and corresponds to a byte -- and a byte is exactly the amount of bits needed to store the representation of a character. This is almost always 8 bits, but larger values are allowed by the C standard.
If you want to print a number, you have to convert it to decimal digits and convert these to the (ASCII) codes of the characters 0 to 9. Of course, C already has a function doing this for you: printf() with a suitable format string. To print the decimal value of a byte/char, you can simply do:
char c = 'A';
printf("%hhu", (unsigned char)c);
u is the conversion specifier for an unsigned integer, hh modifies the length of the argument to char, and you cast your char to unsigned char because C allows char to be either signed or unsigned.
You can use a loop to copy elements of a string into an array of the type int.
For example
#include <stdio.h>
int main( void )
{
char *s = "ABab";
int a[4];
const size_t N = sizeof(a) / sizeof(*a);
size_t i = 0;
for (; i < N && s[i] != '\0'; i++) a[i] = s[i];
while (i != N) a[i++] = 0;
for (i = 0; i < N; i++) printf("%d ", a[i]);
putchar('\n');
return 0;
}
The program output is
65 66 97 98
If you need to output a character array as integers instead of their symbol representations then you can just write
#include <stdio.h>
int main( void )
{
char *s = "ABab";
for ( size_t i = 0; s[i] != '\0'; i++) printf("%d ", s[i]);
putchar('\n');
return 0;
}
Take into account that the type char can behave either as the type signed char or unsigned char. If you want to store or output characters as unsigned values you should write in the first program
for (; i < N && s[i] != '\0'; i++) a[i] = ( unsigned char )s[i];
and in the second program
for ( size_t i = 0; s[i] != '\0'; i++) printf("%d ", ( unsigned char )s[i]);
That is you need to cast explicitly char to unsigned char.
I'm learning C using Xcode 8, and no code runs after a while or for loop executes. Is this a bug? How can I fix it?
In the example provided below, printf("code executed after while-loop"); never executes.
#include <stdio.h>
int getTheLine(char string[]);
int getTheLine(char string[]) {
char character;
int index;
index = 0;
while ((character = getchar()) >= EOF) {
string[index] = character;
++index;
}
printf("code executed after while-loop");
return index;
}
int main(int argc, const char * argv[]) {
char string[100];
int length = getTheLine(string);
printf("length %d\n", length);
return 0;
}
getchar returns an int not a char, and comparison with EOF should be done with the != operator instead of the >= operator.
...
int character; // int instead of char
int index;
index = 0;
while ((character = getchar()) != EOF) { // != instead of >=
...
It's the >= EOF that makes the condition always true. The reason is that a "valid" result of getchar() is a non-negative integer, while a "non-valid" result such as end-of-file is EOF, which is negative (cf. the documentation of getchar()):
EOF ... integer constant expression of type int and negative value
Hence, any valid result from getchar will be >EOF, while the end-of-file-result will be ==EOF, such that >= EOF will always match.
Write != EOF instead.
Note further that you do not terminate your string by the string-terminating-character '\0', such that using string like a string (e.g. in a printf("%s",string)) will yield undefined behaviour (crash or something else probably unwanted).
So write at least:
while ((character = getchar()) != EOF) {
string[index] = character;
++index;
}
string[index]='\0';
Then there is still the issue that you may write out of bounds, e.g. if one enters 100 or more characters in your example. But checking for this is beyond the actual question, which was about the infinite loop.
The symbolic constant EOF is an integer constant, of type int. It's (usually) defined as a macro as -1.
The problem is that the value -1 as a (32-bit) int has the representation 0xffffffff, while as an (8-bit) char the same value would be 0xff. Those two values are not equal, which in turn means that your loop condition will never be false, leading to an infinite loop.
The solution to this problem is that all standard functions that read characters return them as an int, which means your variable character needs to be of that type too.
Important note: whether plain char is a signed or an unsigned type is a compiler implementation detail. If it is signed, then a comparison to an int leads to sign extension when the char value is promoted for the comparison. That means a signed char with the value 0xff would be extended to the int value 0xffffffff, so if char is signed the comparison would work.
This means that your compiler has char as unsigned char, so the unsigned char value 0xff becomes 0x000000ff after promotion to int.
As for why the value -1 becomes 0xffffffff: that is because of how negative numbers are usually represented on computers, with something called two's complement.
You also have another couple of flaws in your code.
The first is that since the loop is infinite you will go way out of bounds of the string array, leading to undefined behavior (and a possible crash sooner or later). The solution is to add a condition that makes sure index never reaches 100 (the size of your array, which should really be passed as an argument).
The second problem is that if you intend to use the string array as an actual string, you need to terminate it. Strings in C are null-terminated strings. The terminator is the character '\0' (equal to the integer 0), and it needs to be put at the end of every string you want to pass to a standard function handling such strings. Having this terminator means that an array of 100 characters can hold only 99 characters plus the terminator, which has implications for the solution to the first problem. As for how to add the terminator, simply do string[index] = '\0'; after the loop (if index is within bounds, of course).
Below is my code, which converts uppercase letters to lowercase and vice versa.
#if SOL_2
char ch;
char diff = 'A' - 'a';
//int diff = 'A' - 'a';
fputs("input your string : ", stdout);
while ((ch = getchar()) != '\n') {
if (ch >= 'a' && ch <= 'z') {
ch += diff;
}
else if (ch >= 'A' && ch <= 'Z') {
ch -= diff;
}
else {}
printf("%c", ch);
}
#endif
In the code above, I replaced char diff = 'A' - 'a' with int diff = 'A' - 'a' and the result was the same. So I thought that using char saves memory, since char is one byte but int is four bytes. I can't think of other advantages.
I would appreciate it if you could let me know of any other advantages.
And what is the main reason for using char to store character values?
Is it just a matter of memory size?
You should be using int ch and int diff.
getchar() returns int, not char. Therefore ch needs to be int. This is so you can tell the difference between end-of-file and character 0xff, both of which would be -1 in a signed byte.
char might be signed or unsigned. Therefore, you should use int for comparisons so that you know you have room for negative values (int is signed by default).
To answer your specific question, use char when you know you have byte data and, yes, you'll most likely save some memory. Another reason to use char (or wchar_t or other character types) is to make it clear to the reader of your code that you intend this data to be text and not numeric, if indeed that is the case. Another use case for char is to access individual bytes of a file or other data stream.
What is the main reason of using char in order to store character values? It is because of just memory size problem?
The primary use of using char vs. int with arrays and sequences of characters is space (and processing speed on machines with wide architectures). If code uses characters limited to an 8-bit range, excessively large data types slow things down.
With single instances of a type, int is often better as that is typically the "native" type that the processor is optimized for.
Yet optimizing for a single char vs int (assuming both work in the application) is usually not a fruitful use of your time. Worry about larger issues and let the compiler optimize the small stuff.
Note that int getchar() returns values in the range of unsigned char, plus EOF. These typically 257 different values cannot be stored distinctly in a char. Use an int.
C provides isupper(), islower(), toupper() and tolower() in <ctype.h>; they are the robust way to handle simple character case conversion.
if (isupper(ch)) ch = tolower(ch);
Example usage:
int ch;
while ((ch = getchar()) != '\n' && ch != EOF) {
if (isupper(ch)) {
ch = tolower(ch);
}
else if (islower(ch)) {
ch = toupper(ch);
}
printf("%c", ch);
}
fflush(stdout);
With ASCII, EBCDIC and every small character encoding I've encountered, A-Z case conversion can be done by simply toggling a bit. Notice there are no magic numbers.
ch ^= 'A' ^ 'a';
Example usage:
int ch;
while ((ch = getchar()) != '\n' && ch != EOF) {
if (isalpha(ch)) {
ch ^= 'A' ^ 'a';
}
printf("%c", ch);
}
fflush(stdout);
Yes, you pointed it out correctly: a character stored in a char is nothing but a 1-byte binary code, i.e. one of 256 numbers, each mapped to a character (not every number necessarily represents a different character; that depends on which encoding you use). Look at Unicode: don't consider only English, consider characters of other languages as well, such as Chinese or Hindi. Each character in these languages needs to be represented by a number, and those numbers are standardized by Unicode.
So the point is that a single char can only cover a small subset, essentially the English alphabet. When you develop international software that can display different languages, you should use a wider type such as int instead. If your scope is only English, char is the best choice, because an int spends extra bits that stay unused: they are just zero padding with no significance, there only to fill out the width of an int.
Suppose you open a text in Chinese in an editor like Notepad with the character encoding set to ASCII. ASCII has a small character set (English A-Z, a-z, 0-9, space, newline and so on, 128 characters in all), so you will see weird characters in the file, just as with a binary file. To see the actual content you need to change the encoding to UTF-8, which uses the Unicode character set, and then you can read the text.
Please read sections 6.3.1.8 (Usual arithmetic conversions) and 6.3.1.1 (Boolean, characters, and integers) of the C standard.
If an int can represent all values of the original type [...] the value is converted to an int;
In
char c1 = 'A', c2 = 'Z';
c2 - c1; // expression without side effects
the expression above, both c1 and c2 are converted to int before the subtraction is performed.
I am really desperate trying to figure out how can I read char with value -1/255 because for most functions this means EOF. For example if I enter all characters from extended ASCII from low to high (decimal value) I end up with -1/255 which is EOF so I will not get it to array. I created small code to express my problem.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define BUFFERSIZE 1024
int main(void){
char c;
unsigned int *array = (unsigned int*)calloc(BUFFERSIZE,sizeof(unsigned int)), i = 0;
while (1){
c = fgetc(stdin);
if (c == EOF)
break;
array[i] = c;
i++;
}
array[i] = 0;
unsigned char *string = (unsigned char *)malloc(i);
for(int j = 0;j < i;j++)
string[j] = array[j];
free(array);
//working with "string"
return 0;
}
I could modify
if (c == EOF)
break;
like this
c = fgetc(stdin);
array[i] = c;
i++;
if (c == EOF)
break;
but of course the program will then also read the control character that the user enters from the keyboard (for example Ctrl+D on Linux). I tried opening stdin as binary, but I found out that POSIX systems treat all files as binary. I am using Qt, GCC and Ubuntu. I tried fread, read and fgets, but I ended up with the same result. Simply put, I need to read everything I enter on stdin and put it into a char array, stopping only when I enter the control character (Ctrl+D) that ends reading. Any advice is appreciated.
Edit: As noted in the comment by @TonyB, you should not declare c as char, because fgetc() returns int; changing it to int c; makes it possible to store EOF in c.
I didn't see that you declared c as char c;, so all the credit goes to @TonyB.
Original Answer: Although the problem was addressed in the comments, and I added the solution to this answer, I think you are confused: EOF is not a character, it's a special value returned by some I/O functions to indicate the end of a stream.
You should never assume that its value is -1. It often is, but there is a macro for a reason, so you should always rely on the fact that these functions return EOF, not -1.
Since there is no ASCII representation for the value -1, you can't input it as a character; you can, however, parse the input string {'-', '1', '\0'} and convert it to a number if you need to.
Also, do not cast the return value of malloc().