Length of a fully filled array - c

#include <stdio.h>
#include <string.h>
int main(void)
{
char cp [5];
cp[0]=1;
cp[1]=1;
cp[2]=1;
cp[3]=1;
cp[4]=1;
printf("%lu\n", strlen(cp));
}
If I run this program, it displays 8 (wrong).
But if I change the array size to more than [5] (for example [6] or [7]) with the same number of assigned elements, it displays 5 (right).

strlen returns the number of characters from the pointer you give it up to a null character (which is not counted). In your case you do not put a null character in your array, so strlen reads past the end of the array, which is undefined behavior.
For instance, do:
char cp [6];
cp[0]=1;
cp[1]=1;
cp[2]=1;
cp[3]=1;
cp[4]=1;
cp[5] = 0;
printf("%zu\n", strlen(cp));
Also notice I changed the format: strlen returns a size_t, and the right format for it is %zu.
Apart from that, you put characters with the code 1 into your array; that character is generally not printable. Did you want '1' rather than 1?

Related

Why one string writes two char in C? [duplicate]

I'm currently learning C and I'm confused with differences between char array and string, as well as how they work.
Question 1:
Why is there a difference in the outcomes of source code 1 and source code 2?
Source code 1:
#include <stdio.h>
#include <string.h>
int main(void)
{
char c[2]="Hi";
printf("%d\n", strlen(c)); //returns 3 (not 2!?)
return 0;
}
Source code 2:
#include <stdio.h>
#include <string.h>
int main(void)
{
char c[3]="Hi";
printf("%d\n", strlen(c)); //returns 2 (not 3!?)
return 0;
}
Question 2:
How is a string variable different from a char array? How to declare them with the minimum required index numbers allowing \0 to be stored if any (please read the codes below)?
char name[index] = "Mick"; //should index be 4 or 5?
char name[index] = {'M', 'i', 'c', 'k'}; //should index be 4 or 5?
#define name "Mick" //what is the size? Is there a \0?
Question 3:
Does the terminating NUL ONLY follow strings but not char arrays? So the actual value of the string "Hi" is [H][i][\0] and the actual value of the char array "Hi" is [H][i]?
Question 4:
Suppose c[2] is going to store "Hi" followed by a \0 (not sure how this is done, using gets(c) maybe?). So where is the \0 stored? Is it stored "somewhere" after c[2] to become [H][i]\0 or will c[2] be appended with a \0 to become c[3] which is [H][i][\0]?
It is quite confusing that sometimes there is a \0 following the string/char array and causes trouble when I compare two variables by if (c1==c2) as it most likely returns FALSE (0).
Detailed answers are appreciated. But keeping your answer brief helps my understanding :)
Thank you in advance!
Answer 1: In code 1 you have a char array that is not a string; in code 2 you have a char array that is also a string.
Answer 2: A string is a char array in which (at least) one element has the value 0; if you leave the size part empty, the compiler will automatically fill it with the minimum possible value.
char astring[] = "foobar"; /* compiler automagically uses 7 for size */
printf("%d\n", (int)sizeof astring);
Answer 3: a char array in which one of the elements is NUL is a string; a char array where no elements are NUL is not a string.
Answer 4: an array defined to hold two elements (char c[2];) cannot hold three elements. If it is going to be a string it can only be the empty string or a string with 1 character.
Question 1:
Why is there a difference in the outcomes of source code 1 and source
code 2?
Source code 1:
#include <stdio.h>
#include <string.h>
int main()
{
char c[2]="Hi";
printf("%d", strlen(c)); //returns 3 (not 2!?)
getchar();
}
Source code 2:
#include <stdio.h>
#include <string.h>
int main()
{
char c[3]="Hi";
printf("%d", strlen(c)); //returns 2 (not 3!?)
getchar();
}
answer:
Because in the first case, c[] is only holding "Hi". strlen looks for a zero at the end, and, depending on exactly what is behind c[] finds one sooner or later, or crashes. We can't say without knowing exactly what is in the memory behind the c[] array.
Question 2:
How is a string variable different from a char array? How to declare
them with the minimum required index numbers allowing \0 to be stored
if any (please read the codes below)?
char name[index] = "Mick"; //should index be 4 or 5?
char name[index] = {'M', 'i', 'c', 'k'}; //should index be 4 or 5?
answer
Really depends on what you want to do. Probably 5 if you want to actually use the content as a string. But there's nothing saying you can't store "Mick" in a 4 character array - you just can't use strlen to find out how long it is, because strlen will continue to 5 and quite possibly (much) further to find the length, and if there is no zero in the next several memory locations, it could lead to a crash, because eventually, there won't be valid memory addresses to read.
#define name "Mick" //what is the size? Is there a \0?
This has absolutely no size at all, until you use name somewhere. #defines are not part of what the compiler sees - the pre-processor will replace name with "Mick" if you use name anywhere - and hopefully, that's in a place the compiler can make sense of. And then the same rules apply as in the previous answer - it depends on how you want to use the array of characters. For correct operation with strlen, strcpy, and nearly all other str... functions, you need a zero at the end.
Question 3:
Does the terminating null ONLY follow strings but not char arrays? So
the actual value of the string "Hi" is [H][i][\0] and the actual value
of the char array "Hi" is [H][i]?
Yes, no, maybe. It all depends on how you USE the "Hi" string literal (that's the technical name for 'something within double quotes'). If the compiler is "allowed", it will put a zero at the end. But if you initialize an array to a given size, it will stuff the bytes in there, and if there isn't room for a zero, that's your problem, not the compiler's.
Question 4:
Suppose c[2] is going to store "Hi" followed by a \0 (not sure how
this is done, using gets(c) maybe?). So where is the \0 stored? Is it
stored "somewhere" after c[2] to become [H][i]\0 or will c[2] be
appended with a \0 to become c[3] which is [H][i][\0]?
In c[2], beyond the 'H' and 'i', there is no telling what is stored [technically, it could well be "the end of the earth" - in computer terms, "memory that can't be read" - in which case strlen on that WILL crash your program, because strlen reads beyond the end of the earth]. But it could also be a zero, a one, the letter 'a', the number 42, or any other 8-bit [1] value.
It is quite confusing that sometimes there is a \0 following the
string/char array and causes trouble when I compare two variables by
if (c1==c2) as it most likely returns FALSE (0).
If c1 and c2 are char arrays, that will ALWAYS be false, since c1 and c2 are never going to have the same address, and when you use an array in C in that way, it becomes "the address in memory of the first element in the array". So no matter what the contents of c1 and c2 are, their addresses can never be the same [because they are two different variables, and two variables cannot have the same location in memory - that's like trying to park two cars in a parking space large enough only for one car - and no, crushing either car is not allowed in our thought experiment].
[1] Char isn't guaranteed to be 8 bits. But let's ignore that for now.
Running source code one is undefined behavior because strlen() requires a NUL-terminated string, which c[2] = "Hi"; /* = { 'H', 'i' } */ is not. A string differs from a char array in that a string is a char array with at least one NUL byte somewhere in the array.
The remaining answers should follow easily from this fact.
To autosize a char array to match the size of a string literal at initialization, simply specify no array size:
char c[] = "This will automatically size the c array (including the NUL).";
Note that you cannot compare char arrays with the == operator. You have to use
if (strcmp(c1, c2) == 0) {
/* Equal. */
} else {
/* Not equal. */
}
strlen() counts characters up to the terminating \0, and in C every string must be \0-terminated. You provided only 2 elements for the 2 characters H and i, so there is no room for the \0. Hence you get undefined behavior in strlen().
In the case of char c[3] = "Hi"; there is a \0 in the third place, and strlen() will calculate the actual length.
How to declare them with the minimum required index numbers allowing \0 to be stored if any?
When you are not sure about the size of the char array, do it like this:
char c1[] = "Mike"; // strlen = 4
char c2[] = "Omkant"; // strlen = 6
NOTE:
EDIT: In the above case, where no size is given explicitly, do not confuse sizeof with strlen().
strlen() returns only the number of characters before the terminator.
sizeof gives the size of the whole array, which here is the number of characters plus one more (for the \0 character).
So for an array sized by its initializer like this, sizeof gives exactly 1 more than the number returned by strlen().

CS50 IDE: printf returns extra characters

I am having problems with the printf function in the CS50 IDE. When I am using printf to print out a string (salt in this code), extra characters are being output that were not present in the original argument (argv).
Posted below is my code. Any help would be appreciated. Thank you.
#include <cs50.h>
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#include <stdlib.h>
int main(int argc, string argv[])
{
// ensuring that only 1 command-line argument is inputted
if (argc != 2)
{
return 1;
}
char salt[2];
for (int i = 0; i < 2; i++)
{
char c = argv[1][i];
salt[i] = c;
}
printf("the first 2 characters of the argument is %s\n", salt);
}
You are missing a string terminator in salt.
Somehow the computer needs to know where your string ends in memory. It does so by reading until it encounters a NUL byte, which is a byte with value zero.
Your array salt has exactly 2 bytes of space, and after them, random garbage exists which just happens to be next in memory after your array. Since you don't have a string terminator, the computer will read this garbage as well until it encounters a NUL byte.
All you need to do is include such a byte in your array, like so:
char salt[3] = {0};
This will make salt one byte longer, and the {0} is a shorthand for {0, 0, 0}, which will initialize the contents of the array with all zeros. (Alternatively, you could use char salt[3]; and later manually set the last byte to zero using salt[2] = 0;.)
In your case, salt is one element short of being a string: unless argv[1] is only one character long, salt does not contain a null terminator.
You need to allocate space to hold the null-terminator and actually put one there to be able to use salt as string, as expected for the argument to %s conversion specifier in case of printf().
Otherwise, the string related functions and operations, which essentially rely on the fact that there will be a null terminator to mark the end of the char array (i.e., mark the end of valid memory that can be accessed), will try to access past the valid memory which causes undefined behavior. Once you hit UB, nothing is guaranteed.
So, considering the fact that you want to use
"....the first 2 characters of the argument....."
you need to make salt a 3-element char array, and make sure that salt[2] contains a null-terminator, like '\0'.

Declare a whole string with a single character

When declaring an array in C I can write int array[100]={0} and it assigns 0 to every element. Is there a way to do the same with a string? When I write char string[100]={'A'} it only assigns 'A' to the first element, and printing the string displays "A" instead of the "AAAAA.....A" (99 times) that I want. I don't want to use a loop to assign 'A' to every element. What can I do?
It's not in the standard, but some compilers (including gcc and clang, which I tested it with on my system) allow this:
#include <stdio.h>
int main(){
char str[100] = {[0 ... 98] = 'A'};
str[99]='\0'; // this line is really not necessary, since default is 0 and I just wrote it for clarity
printf("%s\n", str);
return 0;
}
Standard C does not have a mechanism for what you want. In particular, note that the two cases you describe are the same: that int array[100]={0} causes the array to be initialized with all zeroes is not because the specified 0 is applied to all elements, but rather that the specified 0 is applied to the zeroth element, and all otherwise uninitialized elements are initialized with a default value, which coincidentally is specified to be 0. Initialization of the char array follows the same rule.
If you want to initialize 99 elements of an array to 'A', then the initializer must provide 99 'A's. If you want to avoid typing (and counting) them all, then you might use a macro to assist:
#define REPEAT11(x) x x x x x x x x x x x
char string[100] = REPEAT11("AAAAAAAAA");
That makes use of compile-time concatenation of adjacent string literals to form the wanted initializer.
You could also use memset() from <string.h> to fill the first n bytes of your array with 'A' characters. Here is an example:
#include <stdio.h>
#include <string.h>
#define SIZE 100
int main(void) {
/* initializes all elements to 0. */
/* Will not work with 'A' as the default values of this array are 0 */
char array[SIZE] = {0};
/* fills first 99 bytes with 'A', and leaves last byte as '\0' */
memset(array, 'A', sizeof(array)-1);
printf("array = %s\n", array);
printf("length of array = %zu\n", strlen(array));
/* Outputs:
array = AAAAAAAAA...
length of array = 99
*/
return 0;
}

The char array in C. How to find actual length of valid input?

Suppose I have an array of characters, say char x[100].
Now I take input from the user and store it in the char array. The user input is less than 100 characters. If I want to do some operation on the valid values, how do I find how many valid values there are in the char array? Is there a C function or some other way to find the actual length of the valid values, which will be less than 100 in this case?
Yes, C has the function strlen() (from string.h), which gives you the number of characters in a char array. How does it know this? By definition, every C "string" must end with the null character. If it does not, you have no way of knowing how long the string is or, in other words, which elements of the array are actually "useful" and which are just garbage. Note that sizeof(your_string) returns the size of the array (in bytes) and NOT the length of the string.
Luckily, most C library string functions that create "strings" or read input and store it into a char array will automatically attach a null character at the end to terminate the "string". Some do not in every case (for example strncpy(), which leaves the destination unterminated when the source is at least as long as the count). Be sure to read their descriptions carefully.
Also, take notice that this means that the buffer supplied must be at least one character longer than the specified input length. So, in your case, you must actually supply a char array of length 101 to read in 100 characters (the extra byte is for the null character).
Example usage:
#include <stdio.h>
#include <string.h>
int main(void)
{
char *string = "Hello World";
printf("%lu\n", (unsigned long)strlen(string));
return 0;
}
A simple implementation of strlen() could look like this (real libraries are usually more optimized):
size_t strlen(const char * str)
{
const char *s;
for (s = str; *s; ++s);
return(s - str);
}
As you see, the end of a string is found by searching for the first null character in the array.
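That depends entirely on where you got the input. Most likely strlen will do the trick.
Whenever input is stored as a string in the array, it ends with a null character. You just have to find where the null character is in the array.
You can do this manually while reading; otherwise, strlen() will solve your problem.
int ch;
int len = 0;
/* count characters until the end of the line (or end of input) */
while ((ch = getchar()) != '\n' && ch != EOF)
{
len++;
}
or use strlen after reading the input as a string, for example with scanf's %s conversion (give it a field width such as %99s so it cannot overflow the array).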

Resources