Char Data type variables - c

I'm studying a book called "Learn C on the Mac". It defines the char data type as a 1 byte data type. Does that mean that a variable with char data type can NOT hold an integer such as 5000? I'm confused by this. The book has an example program assigning a variable data type as char, with 5000 in the variable. It is actually a string, 5000 long. Example: (char rating[5000];). I thought char could only hold the ascii set or the numerical value? Sorry I am fairly new to programming.

Does that mean that a variable with char data type can NOT hold an integer such as 5000?
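Not necessarily, although in practice the answer is yes. A char is one byte, and a byte is CHAR_BIT bits wide: at least 8, and exactly 8 on almost every platform you will meet, which is far too narrow for 5000. On a platform where a byte is extremely long, it is theoretically possible for a char to hold the value 5000.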
However, that has nothing to do with the example you read. This:
char rating[5000];
creates an array of 5000 chars. It is not initializing rating with the value 5000. Are you confusing this with the parentheses-initialization syntax of C++? That would be
char rating(5000);
and it does something entirely different. And it wouldn't be valid C at all anyway.

char rating[5000] means an array of 5000 characters. That is, it occupies 5000 * sizeof(char) bytes of memory, and since sizeof(char) is always 1, that is 5000 bytes.

char rating[5000] will create an array with 5000 char elements.

the char data type as a 1 byte data type. Does that mean that a variable with char data type can NOT hold an integer such as 5000?
char is short for character. As you already know, a char variable occupies 1 byte. It can therefore hold exactly one character at a time, such as one digit or one letter. Internally, that character is stored as a small integer code.
So on a typical platform, where a byte is 8 bits, a variable of type char cannot hold an integer such as 5000: the value does not fit in one byte. If you type more than one character (e.g. 5000) when the program below reads its input, only the first character is stored in the char variable.
Use this program to better your understanding.
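#include <stdio.h>

int main(void)
{
    char s;                         /* room for exactly one character */

    if (scanf("%c", &s) == 1) {     /* reads a single character from the input */
        printf("%c", s);            /* prints that one character back */
    }
    return 0;
}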
Try inputting 5000 and observe what the program prints as output.
an example program assigning a variable data type as char, with 5000 in the variable. It is actually a string, 5000 long. Example: (char rating[5000];).
In case you didn't know, a string in C is stored in an array of char (i.e. characters) and ends with a null character '\0'. char rating[5000]; declares such an array; C has no separate string data type. It defines rating as an array of 5000 chars, which can hold a string of up to 4999 characters plus the terminator. However, it does not initialize rating with the value 5000. Wrong interpretation: char rating = '5000'; Rather, it sets the size of rating to 5000.
I thought char could only hold the ascii set or the numerical value.
Yes, you are right about the ASCII part. A char variable can hold any one ASCII value at a time. Letters, digits, and symbols together form the ASCII set, and each one is stored as a small integer code. A numerical value such as 5000 is a different matter: it is a magnitude, which is not necessarily written with only one digit. A single char can hold one digit (a numeral from 0 to 9), not a multi-digit number written out as text.
To sum up, char is a data type that can store one letter or one digit. A string is a sequence of chars. 'a' or '1' can be a char value, but a phrase or a sentence can't. To store a group of characters in a variable, use an array of char. Remember these simple facts to make your life easy.

Related

What happens when we make an array defined using characters instead of integers in C?

This is a code I have used to define an array:
int characters[126];
After that, I wanted to record the frequency of each character read, for which I used a while loop in this form:
while((a=getchar())!=EOF){
characters[a]=characters[a]+1;
}
Then using a for loop I print the values of integers in the array.
How exactly is this working?
Does C assign a specific number for letters ie. a,b,c, etc in the array?
What happens when we make an array defined using characters instead of integers in C?
Let's be sure we are clear: you are using integer values returned by getchar() as indexes into your array. This is not defining the array, it is just accessing its elements.
Does C assign a specific number for letters ie. a,b,c, etc in the array?
There are no letters in the array. There are ints. However, yes, the characters read by getchar() are encoded as integer values, so they are, in principle, suitable array indexes. Thus, this line ...
characters[a]=characters[a]+1;
... reads the int value then stored at index a in array characters, adds 1 to it, and then assigns the result back to element a of the array, provided that the value of a is a valid index into the array.
More generally, it is important to understand that although one of its major uses is to represent characters, type char is an integer type. Its values are numbers. The mapping from characters to numbers is implementation and context dependent, but it is common enough for the mapping to be consistent with the ASCII code that you will often see programs that assume such a mapping.
Indeed, your code makes exactly such an assumption (and others) by allowing only for character codes less than 126.
You should also be aware that if your characters array is declared inside a function then it is not initialized. The code depends on all elements being zero initially. I would recommend this declaration instead (UCHAR_MAX is defined in <limits.h>):
int characters[UCHAR_MAX + 1] = {0};
That upper bound will be sufficient for all the non-EOF values returned by getchar(), and the explicit zero-initialization will ensure the needed initial values regardless of where the array is declared.
I have realized that the character set that can serve as input for getchar() is part of the ASCII table and is represented as an int. I used the following code to find that out:
#include <stdio.h>
int main(){
int a[128];
a['b']=4;
printf("%d",a[98]); //it is 98 as according to the table 'b' is assigned the value of 98
}
When I execute this code, I get the output 4.
I am really new to coding so feel free to correct me.
Character values are represented using some kind of integer encoding - ASCII (very common), EBCDIC (mostly IBM mainframes), UTF-8 (backward-compatible to ASCII), etc.
The character value 'a' maps to some integer value - 97 in ASCII and UTF-8, 129 in EBCDIC. So yes, you can use a character value to index into an array - arr['a']++ would be equivalent to arr[97]++ if you were using ASCII or UTF-8.
The C language does not dictate this - it's determined by the underlying platform.

Difference between 2 vs "\2"

While trying to implement the IKE session key generation algorithms I came across the following code snippets for the following algorithm implementation
Algorithm for generating a certain session key
SKEYID_e = HMAC (SKEYID, SKEYID_a || g^xy || CKY-I || CKY-R || 2)
Implementation of the final concatenation step, which feeds the value 2 into the HMAC:
hmac_update(ctx, (unsigned char *) "\2", 1)
Here hmac_update is the API used to append a buffer to the HMAC input before the digest is finalized, ctx is the HMAC context, "\2" supplies the value 2, and 1 is the size of the buffer.
My question is: what is the difference between an escaped unsigned char * "\2" and a char/uint8_t value 2?
The difference between a char with numeric value 2 and the string "\2" is that the former is a single char, while the latter is a literal representing a character array that contains a char with numeric value 2 followed by a char with numeric value 0. In other words:
(char)2 is a single character. Its type is char. Its value is 2.
"\2" is an array of characters. In C its type is char[2], which decays to char * (const char * in C++). Its first entry is 2 and its second entry is 0.
Since hmac_update expects as its second argument a pointer to the bytes to use in the update, you can't provide 2 or (char)2 as an argument, since doing so would try to convert an integer to a pointer (oops). Using "\2" solves this problem by providing a pointer to the byte in question. You could also do something like this:
const char value = 2;
hmac_update(ctx, &value, 1);
"\2" describes the character with the code 2 (which is a non-printable character, check http://ascii-table.com/info.php?u=x0002);
The digit "2", i.e. the printable character '2', has the hex code 0x32 = 50 decimal.

What is the meaning of assigning a number (not a character) to a char variable?

char a = 2; (without quotes): what is the meaning of this statement?
I know that char a = '2'; stores the ASCII value of 2 in a, but what does it mean without the quotes?
No, it does not mean that the "ASCII value of 2" is stored. It simply stores the small integer 2.
To store the integer that encodes the character 2 in the target character encoding, use char a = '2';.

Differences between int/char arrays/strings

I'm still new to the forum so I apologize in advance for forum - etiquette issues.
I'm having trouble understanding the differences between int arrays and char arrays.
I recently wrote a program for a Project Euler problem that originally used a char array to store a string of numbers, and later called specific characters and tried to use int operations on them to find a product. When I used a char string I got a ridiculously large product, clearly incorrect. Even if I converted what I thought would be compiled as a character (str[n]) to an integer in-line ((int)str[n]) it did the exact same thing. Only when I actually used an integer array did it work.
Code is as follows
for the char string
char str[21] = "73167176531330624919";
This did not work. I got an answer of about 1.5 trillion for an answer that should have been about 40k.
for the int array
int str[] = {7,3,1,6,7,1,7,6,5,3,1,3,3,0,6,2,4,9,1,9};
This is what did work. I took off the in-line type casting too.
Any explanation as to why these things worked/did not work and anything that can lead to a better understanding of these ideas will be appreciated. Links to helpful stuff are as well. I have researched strings and arrays and pointers plenty on my own (I'm self taught as I'm in high school) but the concepts are still confusing.
Side question, are strings in C automatically stored as arrays or is it just possible to do so?
To elaborate on WhozCraig's answer, the trouble you are having does not have to do with strings, but with the individual characters.
Strings in C are stored by and large as arrays of characters (with the caveat that there exists a null terminator at the end).
The characters themselves are encoded in a system called ASCII, which assigns codes from 0 to 127 to the characters used in the English language (only). Thus '7' is not stored as 7 but as the ASCII code for 7, which is 55.
I think now you can see why your product got so large.
One elegant way to fix would be to convert
int num = (int) str[n];
to
int num = str[n] - '0';
// ' ' is used for characters, " " is used for strings
This solution subtracts the ASCII code for '0' from the ASCII code for your character, say '7'. Since the C standard guarantees that the codes for '0' through '9' are consecutive, this works for any single digit. For multi-digit numbers, you should use atoi or strtol from stdlib.h.
Strings are just character arrays with a null terminating byte.
There is no separate string data type in c.
When using a char as an integer, the numeric ASCII value is used. For example, saying something like printf("%d\n", (int)'a'); will result in 97 (the ASCII value of 'a') being printed.
You cannot do numeric calculations directly on the characters of a string of digits; you first have to convert each digit character to its integer value. To convert a digit as a character into its integer form, you can do something like this:
char a = '2';
int a_num = a - '0';
//a_num now stores integer 2
This causes the ASCII value of '0' (48) to be subtracted from the ASCII value of '2' (50), leaving 2.
char str[21] = "73167176531330624919";
This code is equivalent to
char str[21] = {'7','3','1','6','7','1','7','6','5','3',
                '1','3','3','0','6','2','4','9','1','9','\0'};
so what is stored in str is not the numbers themselves but chars, whose codes (their ASCII representation) differ from the digit values.
Answer to the side question: yes and no. Strings are automatically stored as char arrays, but a string has an extra character ('\0') as its last element, whereas a char array in general need not have one.

Looping through an array in C

Just wondering if someone could explain this to me? I have a program that asks a user to input a sentence. The program then reads the user input into an array and changes all of the vowels to a $ sign. My question is how does the for loop work? When initialising char c = 0; does that not mean that the array element is an int? I can't understand how it functions.
#include <stdio.h>
#include <string.h>
int main(void)
{
char words[50];
char c;
printf("Enter any number of words: \n");
fgets(words, 50, stdin);
for(c = 0; words[c] != '\n'; c++)
{
if(words[c] =='a'||words[c]=='e'||words[c]=='i'||words[c]=='o'||words[c]=='u')
{
words[c] = '$';
}
}
printf("%s", words);
return 0;
}
The code treats c as an integer variable (in C, char is basically a very narrow integer). In my view it would be cleaner to declare it as int (perhaps unsigned int). However, given that words is at most 50 characters long, char c works fine.
As to the loop:
c = 0 initializes c to zero.
words[c] != '\n' checks -- right at the start and also after each iteration -- whether the current character (words[c]) is a newline, and stops if it is.
c++ increments c after each iteration.
An array is like a building: you have several floors, each one with a number.
On floor 1 lives John.
On floor 2 lives Michael.
If you want to go to John's apartment you press 1 in the elevator. If you want to go to Michael's you press 2.
It's the same with arrays. Every position in the array stores a value, in this case a letter.
Every position has an index associated with it. The first position is 0.
When you want to access a position of the array you use array[position] where position is the index in the array that you want to access.
The variable c holds the position to be accessed. When you write words[c] you're actually accessing position c in the array and retrieving its value.
Suppose the word is cool:
word[1] results in o,
word[0] results in c
To mark the end of the input, fgets stores the newline character \n that the user typed (when it fits in the array), followed by a terminating \0. The loop stops when it reaches that \n.
Not really, char and int are implicitly converted.
You can look at a char in this case as a smaller int. sizeof(char) == 1, so it's smaller than an int, and that's probably the reason it was used. Programmatically, there's no difference in this case, unless the input string is very long, in which case the char will overflow before an int does.
Number literals (such as 0 in your case) are compatible with variables of type char. In fact, even a character literal enclosed in single quotes (for example '\n') is of type int but is implicitly converted to a char when assigned or compared to another char.
Number literals are interchangeable with character literals, as long as the former do not exceed the range of a character.
The following should result in a compiler warning:
char c = 257;
whereas this will not:
char c = 127;
A char in C is an integer type, as are short, int, long, and long long (and many other types).
It is defined as the smallest addressable unit on the machine you are compiling for and is usually 8 bits. Whether plain char is signed is implementation-defined: a signed 8-bit char can hold values -128 to 127, and an unsigned char can hold values 0 to 255.
It works as a loop index in the code above because the index always stays below 50, and a char can hold values up to at least 127. An int can usually hold values up to 2,147,483,647, but typically takes up 4 times the space of an 8-bit char. An int is only guaranteed to be at least 16 bits in C, which means values between −32,768 and 32,767, or 0 to 65,535 for an unsigned int.
So your loop just accesses the elements of your array one after the other: words[0] at the beginning to look at the first character, then words[1] to look at the next character, and so on. Assuming char is 8 bits and signed on your machine, which is very common, it is enough to hold the loop index until the index goes above 127. If you read in more than 127 characters (instead of just 50) and used a char as the index, you would run into weird problems, since the char can't hold 128 and on most machines wraps around to -128. You would then access words[-128], which is undefined behavior and would most likely result in a segmentation fault.

Resources