Array element as an index to a different array - c

I came across some examples where an array is indexed based on values from a different array.
Example:
char s[] = "aloha";
int l= strlen(s);
int array_count[256];
memset(array_count,0,256*sizeof(int));
for(int i=0;i<l;i++)
{
array_count[s[i]]++;// What exactly happens in this statement ??
}
I understood it as it checking and setting the alphabets in s[] as 1's in the array array_count,which is the alphabet set. Is that right ?

The code is keeping a histogram of how many times a given character appears in the string. Every time a character appears in the string, the array element corresponding to the ASCII value of that character is incremented by one.
The elements in array_count [] are all set to 0 by your memset(). Then your loop iterates through s[]. So in the first iteration:
array_count [s[i]]++ // first evaluate [i]
array_count [s[0]]++ // i is zero
array_count ['a']++ // s[0] is 'a'
array_count [97]++ // promotion of 'a' from char to int; ASCII value of 'a' is 97
array_count [97] is zero because of the memset, so because of the ++ it gets incremented to 1.
Similar magic happens with the rest of the characters in subsequent iterations; when the loop terminates, array_count [97] will be 2 because of the two 'a's in "aloha". The NUL character at the end of "aloha" is never counted, because strlen() does not include it in l, so array_count [0] stays 0; and you can figure out what the rest of the array will be (mostly zeros).

Each char in s[] has an integer value (usually its ASCII code). For plain ASCII text that value is between 0 and 127; in general a char holds one of 256 values, but plain char may be signed, so bytes above 127 should be cast to unsigned char before being used as an index. array_count[] is initialised to all zeros by the memset. Then, by iterating through s[] from start to end with i in the for loop, the value of each char is used to index into array_count[] and increment its value with ++. So you get a count of the char values in s[].

A char can take 256 possible values, which is why the array has 256 elements. See the ASCII table:
http://www.asciitable.com/
for(int i=0;i<l;i++)
{
array_count[s[i]]++; // What exactly happens in this statement ??
// for i=0:
// s[i] = 'a'
// the ASCII value of 'a' is 97
// so it will increment array_count[97] from 0 to 1
}

Related

\0 when initializing a char array with a loop

I need to initialize a char array with a loop and print it. Just like that:
int main( void )
{
char array[ 10 ];
for( int i = 1; i < 10; ++i ) {
array[ i - 1 ] = i;
}
// array[] contains numbers from 1 to 9 and an uninitialized subscript
printf( "%s", array );
}
I want to know if I need to put the '\0' character in array[ 9 ] or if it is already there.
In other words: once I declared char array[ 10 ]; does the last subscript contains '\0' ?
I searched for similar questions, and the best I could find is one where the array is filled with a loop, but right to the end, leaving no space for the terminating character.
Please tell me the truth.
In other words: once I declared char array[ 10 ]; does the last subscript contains '\0' ?
No.
You define a local variable and do not initialize it. Such variables are not initialized by default but hold indeterminate values.
If you want to have a defined value, you need to initialize or assign it:
char array[ 10 ] = "";
This will define an array with 10 elements.
As there is an initializer, the first element will be set to 0 (=='\0') due to the provided string literal.
Furthermore, all other elements will be set to 0 because you provide fewer initializer values than you have elements in your array.
once I declared char array[10]; does the last subscript contains '\0' ?
The answer is NO: when you define the array as an automatic variable (a local variable in a function), it is uninitialized. Hence none of its elements can be assumed to have any specific value. If you initialize the array, even partially, all elements will be initialized, either explicitly from the values provided in the initializer or implicitly to 0 if there are not enough initializers.
0 and '\0' are equivalent, they are int constants representing the value 0. It is idiomatic to use '\0' to represent the null byte at the end of a char array that makes it a C string. Note that '0' is a different thing: it is the character code for the 0 digit. In ASCII, '0' has the value 48 (or 0x30), but some ancient computers used to use different encodings where '0' had a different value. The C standard mandates that the codes for all 10 digits from 0 to 9 must be consecutive, so the digit n has the code '0' + n.
Note that the loop in your code sets the value of 9 elements of the array to non zero values, and leaves the last entry uninitialized so the array is not null terminated, hence it is not a C string.
If you want to use the char array as a C string, you must null terminate it by setting array[9] to '\0'.
Note also that you can print a char array that is not null terminated by specifying the maximum number of bytes to output as a precision field in the conversion specifier: %.9s.
Finally, be aware that array[0] = 1; does not set a valid character in the first position of array, but a control code that might not be printable. array[0] = '0' + 1; sets the character '1'.
#include <stdio.h>
int main(void) {
char array[10];
/* use the element number as the loop index: less error prone */
for (int i = 0; i < 9; ++i) {
array[i] = '0' + i + 1;
}
// array[] contains the digits '1' to '9' and an uninitialized last element
printf("%.9s\n", array); // prints up to 9 bytes from `array`
array[9] = '\0';
// array[] contains the digits '1' to '9' and a null terminator, a valid C string
printf("%s\n", array); // produces the same output.
return 0;
}
once I declared char array[ 10 ]; does the last subscript contains '\0' ?
No. It's uninitialized. It has garbage values in every element from whatever code used the same piece of memory last.
No, the '\0' is not there by itself, so in your code you do need to put it at the end, that is, in array[9].
Why?
Because when a local array (char, int, float) is uninitialized, it contains garbage values.
Only when the array is initialized, partially or fully, do all remaining elements become 0, or '\0' in the case of a char array.
Example:
char array[10];
All elements contain garbage values.
After initialization:
char array[10]={'a' ,'b'}; all other elements automatically become '\0'
This is true for structures as well.

Unexpected output of int datatype

I created a simple program from the book let us c pg no.26 which is an example to illustrate and the code is somewhat like this
#include <stdio.h>
int main() {
char x,y;
int z;
x = 'a';
y = 'b';
z = x + y;
printf("%d", z);
return 0;
}
But the output i expected was the string ab (i know the z is in int but still that was the output i can think of) but instead the output was 195 which shocked me so please help me to figure this out in easy words.
Chars/letters are internally represented as numbers according to some encoding (e.g., ASCII or Unicode). ASCII is a popular standard for representing the most common symbols and letters. Here is the ASCII table. This table shows that every common symbol or letter is essentially a number between 0 and 255 (ASCII has two parts: 0 to 127 is standard ASCII; the upper range of 128 to 255 is defined by extended ASCII, of which many variants are in use).
To put it into the context of your code, here is what happened.
// The letter/char 'a' is internally saved as 97 in the memory
// The letter/char 'b' is internally saved as 98 in the memory
x = 'a'; // this will copy 97 to x
y = 'b'; // this will copy 98 to y
z = x +y ; // 97+98=195 -> z
If you want to print "ab", you must have two chars next to each other. Here is what you should do
char z[3];
z[0]='a'; //move 'a' or 97 to the first element of z (recall that in C, the index is zero-based)
z[1]='b';//move 'b' or 98 to the second element of z
z[2]=0; //In C, a string is null-terminated. That is, the last element must be a null (i.e., 0).
printf("%s\n",z); // you will get "ab"
Alternatively, you can get "ab" in the following way based on the Ascii table:
char z[3];
z[0]=97; //move 97 to the first element of z, which is 'a' based on the ASCII table
z[1]=98;//move 98 to the second element of z, which is 'b'
z[2]=0; //In C, a string is null-terminated. That is, the last element must be a null (i.e., 0).
printf("%s\n",z); // you will get "ab"
Edit/comment:
Considering this comment:
"Chars are signed on x86, so the range is -128 ... 127 and not 0 ... 255 as you state ".
Note that nowhere did I mention that the char type in C has a range of 0 ... 255. I refer to [0 ... 255 ] only in the context of the ASCII standard.
You added 97 and 98, hence the 195.
Assigning the sum of two chars to an int promotes those chars to int, then stores the result.
If you wanted that to be printed as a string, you cannot simply printf("%s\n", z);, because z is an int, not a pointer to a char array. Printing with %d interprets the variable as a signed decimal integer.
Don't print it as a string, because %s keeps reading until it finds a terminator, and you don't know where that would be.
Char arrays in C, for many functions such as printf, don't end where their size ends, but where the terminator char (0x00 or 0 or '\0') marks the end.

how do char pointers work with while loop

In the while loop when *s is mentioned it means the value at the address contained in s, so in the first case, the value will be 'a',
My question is how the while loop checks it: does it check the ASCII value of the character to decide whether the condition is true or false, or is there some other way?
main()
{
char str[] = "abcd" ;
char *s = str;
while(*s)
printf ("%c",*s++) ;
}
When you declare a string variable like
char str[] = "abcd";
it's like declaring char str[5] = {'a', 'b', 'c', 'd', '\0'};
So, in your while loop, it first checks the value of *s, which is 'a', which translates to 97 in the ASCII table. Then you print the character that s points to, and then increase the pointer by 1, which leads to the next character. When you reach the \0, the loop exits, because \0 is equal to 0.
while(condition) in C/C++ code will execute as long as condition != 0.
Since it is a dereferenced char*, this means it is a 1 byte value, which ranges from 0 to 255 (or from -128 to 127 where char is signed).
The first value is 'a' (0x61, i.e. 97), which is nonzero, so the loop body runs. Note that *s++ increments the pointer, not the character, so the loop prints 'a', 'b', 'c' and 'd' in turn; when s reaches the terminating '\0' (the NUL character, value 0), while(condition) evaluates to false and the loop ends.

Meaning of a C statement involving char arrays

I am working on an algorithm for a project and I ran across some code that I think may be helpful. However, as I try to read the code, I am having some difficulty understanding a statement in the code. Here is the code.
int firstWord[MAX_WORD_SIZE] = {0}, c = 0;
while (word1[c] != '\0') //word1 is char[] sent as a parameter
{
firstWord[word1[c]-'a']++;
c++;
}
I understand (I hope correctly) that the first line is setting up an integer array of my max size, and initializing elements to zero along with making the initial counter value "c" zero.
I understand that the while loop is looping through all of the characters of the word1[] array until it hits the final char '\0'
I am confused on the line firstWord[word1[c]-'a']++;
word1[c] should give a char, so what does doing the -'a' do? Does this cast the char to an integer which would allow you to access an element of the firstWord[] array and increment using ++? If so, which element or integer is given by word1[c]-'a'
word1[c]-'a' means the difference between the character in the cth position of word1 and the integer value of 'a'. Used as an index, it lets the code count the number of occurrences of each letter in a word.
So if word1[c] is 'b', then the value of word1[c]-'a' will be ('b' - 'a') = 1. So the number of occurrences of 'b' in the word will be incremented by 1.
This is a program that counts the number of letters a to z from a word. The key point here is, 'a' - 'a' has a value of 0, and 'b' - 'a' has a value of 1, etc.
For instance, if word1[c] is the letter 'd', then 'd' - 'a' is 3, so it would increment firstWord[3]. When the word has been iterated character by character, firstWord[3] contains the number of letter 'd' in the word.
It seems this code is doing a letter count.
1. So what does doing the -'a' do?
If word1[c] is 'a', then word1[c]-'a' is 0.
2. Does this cast the char to an integer which would allow you to access an element of the firstWord[] array and increment using ++?
Yes, it is integer promotion.
3. If so, which element or integer is given by word1[c]-'a'?
The letter's offset from 'a': if word1[c] is 'a' then word1[c]-'a' is 0, 'b' gives 1, and so on.

Undefined behavior of a program

While writing a program i am filling the entries of a char array with digits. After doing so the length calculated for an array having no zero is correct but for an array starting with zero is zero!
Why is this result coming so!I am not able to interpret my mistake!?
int main()
{
int number_of_terms,no,j,i;
char arr[100];
char c;
scanf("%d",&number_of_terms);
for(i=0;i<number_of_terms;i++)
{
j=0;
while(c!='\n')
{
scanf("%d",&arr[j]);
if(c=getchar()=='\n')
break;
j++;
}
printf("Length is:%d\n",strlen(arr));
}
return 0;
}
For example, if I input my array elements as 4 5
length is 2
and if my array elements are 0 5
length is 0.
You are using "%d" as your format specifier, which makes scanf store a whole int, but you are passing in the address of a single element of a character array. This is, exactly like your title says, undefined behaviour. In particular, the value zero will take up sizeof(int) cells of your string (typically 4), and will write zero to all of those. Since the character with value zero is the end marker, you get a zero-length string. However, on another architecture, the second character would probably cause a crash...
If you want to store integers in an array, you should use int arr[...];. If you want to store characters, use "%c".
You are copying the value 0 into the array. This equals the character '\0', which is used to terminate strings. What you want is to copy the character '0' (which has the value 48, see an ASCII table).
Change %d to %c to interpret the input as a character instead of a decimal number.
scanf("%c",&arr[j]);
Also your "string" in arr is not zero terminated. After all the characters of your string, you have to end the string with the value 0 (here a decimal is correct). strlen needs it, because it determines the length of the string by traversing the array and counting up until it finds a 0.
