Sizeof vs Strlen - c

#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[]) {
char string[] = "october"; // 7 letters
strcpy(string, "september"); // 9 letters
printf("the size of %s is %d and the length is %d\n\n", string,
sizeof(string), strlen(string));
return 0;
}
Output:
$ ./a.out
the size of september is 8 and the length is 9
Is there something wrong with my syntax or what?

sizeof and strlen() do different things. In this case, your declaration
char string[] = "october";
is the same as
char string[8] = "october";
so the compiler can tell that the size of string is 8. It does this at compilation time.
However, strlen() counts the number of characters in the string at run time. So, after you call strcpy(), string now contains "september". strlen() counts the characters and finds 9 of them. Note that you have not allocated enough space for string to hold "september": it needs 10 bytes (9 characters plus the terminating \0), but the array has only 8, so strcpy() writes past the end of the array. This is undefined behaviour.

The output is consistent with what the code does:
First statement: the compiler sized string at compile time as 7 + 1 bytes ("october" is 7 bytes, plus 1 byte for the null terminator).
Second statement: you are copying "september" (9 characters plus the terminator, 10 bytes in total) into that 8-byte array.
Therefore sizeof(string) is still 8. strcpy() did write a null terminator, but 2 bytes past the end of the array, so the strlen() result of 9 cannot be relied on: the overflow is undefined behaviour.

Your destination array is 8 bytes (length of "october" plus \0) and you want to put 9 chars plus a terminating \0 into that array.
man strcpy says:
If the destination string of a strcpy() is not large enough, then anything might happen.
Please tell me what you are really trying to do, because this smells bad from a long way off.

You must eliminate the buffer overflow in this example. One way to do this is to use strncpy, which here truncates the copy to the 7 characters that fit ("septemb"):
memset(string, 0, sizeof(string));
strncpy(string, "september", sizeof(string)-1);

Related

CS50 IDE: printf returns extra characters

I am having problems with the printf function in the CS50 IDE. When I am using printf to print out a string (salt in this code), extra characters are being output that were not present in the original argument (argv).
Posted below is my code. Any help would be appreciated. Thank you.
#include <cs50.h>
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#include <stdlib.h>
int main(int argc, string argv[])
{
// ensuring that only 1 command-line argument is inputted
if (argc != 2)
{
return 1;
}
char salt[2];
for (int i = 0; i < 2; i++)
{
char c = argv[1][i];
salt[i] = c;
}
printf("the first 2 characters of the argument is %s\n", salt);
}
You are missing a string terminator in salt.
Somehow the computer needs to know where your string ends in memory. It does so by reading until it encounters a NUL byte, which is a byte with value zero.
Your array salt has exactly 2 bytes of space, and after them, random garbage exists which just happens to be next in memory after your array. Since you don't have a string terminator, the computer will read this garbage as well until it encounters a NUL byte.
All you need to do is include such a byte in your array, like so:
char salt[3] = {0};
This will make salt one byte longer, and the {0} is shorthand for {0, 0, 0}, which initializes the contents of the array with all zeros. (Alternatively, you could use char salt[3]; and later manually set the last byte to zero using salt[2] = 0;.)
In your case, salt is one element short of being a string: both of its elements hold characters copied from argv[1], so there is no room left for a null terminator. (The only exception is when argv[1] is a single character long, in which case salt[1] receives argv[1]'s own terminator.)
You need to allocate space to hold the null-terminator and actually put one there to be able to use salt as string, as expected for the argument to %s conversion specifier in case of printf().
Otherwise, the string related functions and operations, which essentially rely on the fact that there will be a null terminator to mark the end of the char array (i.e., mark the end of valid memory that can be accessed), will try to access past the valid memory which causes undefined behavior. Once you hit UB, nothing is guaranteed.
So, considering the fact that you want to use
"....the first 2 characters of the argument....."
you need to make salt a 3-element char array, and make sure that salt[2] contains a null-terminator, like '\0'.

C memory allocations

I was wondering how the C compiler allocated memory for a character array you initialize yourself, for instance:
char example[] = "An example string";
If it was a single character it would be 8 byte, so would the example be 17 bytes or does it have more because it needs the \0 to finish it off?
Or does it overestimate how much memory it needs?
This code:
#include <stdio.h>
int main(void)
{
char example[] = "An example string";
printf("%zu", sizeof(example));
}
Compiled with:
gcc -std=c99 -o proof proof.c
Returns:
18 (bytes, not bits)
Because of the \0 character at the end of string
The size of a null (\0) terminated string of strlen n is n+1. As you say, one extra for the null character.
If string literals weren't terminated, standard calls like strlen could not work, as they look for the terminating \0.
The size of the string will be 18 bytes, including the '\0' character, but the length will be 17.
eg:
char arr[]="";
Here an empty string is assigned, but the array still has a size of 1 byte because of the null character '\0'. The length of the string "" is zero.
A single character should use one byte.
A string literal automatically gets the zero terminator and needs one byte more than the number of characters in it. If you use double quotes (" "), the zero terminator is always added.
You can try to work with strlen and strnlen.
char cSample = 'T'; // one byte
char caSample[] = "T"; // two bytes

Size definition of strcat() function

The question is: why should I define the size of a string (why should string[] be string[some-number])?
When the program is as follows, it gives me Abort trap: 6:
#include <stdio.h>
#include <string.h>
int main(void)
{
char buffer1[] = "computer";
char string[]="program";
strcat( buffer1, string );
printf( "buffer1 = %s\n", buffer1 );
}
This is the program from http://www.tutorialspoint.com/cprogramming/c_data_types.htm it works fine:
#include <stdio.h>
#include <string.h>
int main ()
{
char str1[12] = "Hello";
char str2[12] = "World";
char str3[12];
int len ;
/* copy str1 into str3 */
strcpy(str3, str1);
printf("strcpy( str3, str1) : %s\n", str3 );
/* concatenates str1 and str2 */
strcat( str1, str2);
printf("strcat( str1, str2): %s\n", str1 );
/* total length of str1 after concatenation */
len = strlen(str1);
printf("strlen(str1) : %d\n", len );
return 0;
}
What is the mistake? Even if I define all of the sizes of strings in my program, my code still gives Abort trap:6?
From the man page of strcat:
DESCRIPTION
The strcat() function appends the src string to the dest string, overwriting the terminating
null byte ('\0') at the end of dest, and then adds a terminating null byte. The
strings may not overlap, and the dest string must have enough space for the result. If
dest is not large enough, program behavior is unpredictable; buffer overruns are a
favorite avenue for attacking secure programs.
When you declare your strings, the compiler allocates exactly the size of each initializer: 9 bytes for buffer1 and 8 bytes for string (including the '\0').
Thus, strcat needs 9 - 1 + 8 (i.e. 16) bytes in buffer1, but only 9 are available.
Your strcat overflows buffer1, which can hold only strlen("computer") + 1 bytes. Omitting the array size does not mean a "dynamic" array! When you specify the size of the array, you reserve as many bytes as you want; you still need to avoid a buffer overflow, of course.
So,
strcpy(str3, str1);
and
strcat( str1, str2);
are OK since str3 is big enough for str1, and str1 is big enough for strlen(str1) + strlen(str2) + 1, i.e. exactly 11: 5 (Hello) + 5 (World) + 1 (terminator). The magic number 12 was chosen for a reason: it is big enough to hold both strings and a terminator.
About C strings
C strings are arrays of chars where the last one is "null", '\0', i.e. arrays of chars where the last element is 0. This terminator is needed so that string-related functions can tell where the string ends.
If it happens that a null byte is found in the middle of a string, from the point of view of C string functions, the string will end at that point. E.g.
char buffer1[] = "computer\0program";
// array: { 'c', 'o', ... '\0', 'p', 'r', 'o', .., 'm', '\0' }
// ...
printf("%s\n", buffer1);
will print computer only. But the buffer itself is big enough to hold computer, the embedded '\0', program, and the final terminator, since the compiler computed the size of the char array from the whole literal, which syntactically ends at the second ".
But for all C-string functions, the string contained in buffer1 is computer. Note also that sizeof buffer1 will give the correct size of the buffer, i.e. 17, opposed to the result of strlen(buffer1) which is just 8.
The first parameter of strcat is used to store the result, so it must have enough space for the concatenated string.
In your code:
char buffer1[] = "computer";
is equivalent to:
char buffer1[9] = "computer";
This defines a char array with just enough space for the string "computer", but not enough space for the result.
char buffer1[] = "computer";
This creates a buffer big enough to hold 9 characters (strlen("computer") + 1 byte for \0). If you write any more data to it, what you end up with is undefined behavior (UB). This is what happens when you do a strcat.
UB means the program might crash or show literally any behavior. You are rather lucky when a program with UB crashes: it does not need to, but if it does, at least there is an indication that something is wrong with it. Most of the time programs with UB will continue running, seemingly correctly, and crash when you least expect or want them to.

printing improper number of characters in a string

#include <stdio.h>
int main()
{
char * name = "bob";
int x = sizeof(name);
printf("%s is %d characters\n",name,x);
}
I have the above code. I want to print the number of characters in this string. It keeps printing 8 instead of 3. Why?
sizeof() returns a size in bytes. Specifically, it gives the number of bytes required to store an object of the type of the operand. In this case sizeof() is returning the size of a pointer to char, which on your computer is 8 bytes, or 64 bits.
strlen() is what you are looking for:
#include <stdio.h>
#include <string.h> // include string.h header to use strlen()
int main()
{
char * name = "bob";
int x = strlen(name); // use strlen() here
printf("%s is %d characters\n",name,x);
}
Use strlen to find the length of a string.
Each char is exactly 1 byte wide. It prints 8 because sizeof is applied to a pointer to "bob" and, on your machine, a pointer is 8 bytes wide.
strlen gives you the number of characters in a string. sizeof gives you the size of the object in bytes. On your system, an object of type char * is apparently 8 bytes wide.

fgets and strlen in c

#include<stdio.h>
#include<stdlib.h>
#include<string.h>
main()
{
char str[5]={'\0'};
printf("Initial length before passing = %ld\n",strlen(str));
input(str);
printf("Received string = %s\n",str);
printf("Length after getting input(calling function) = %ld\n",sizeof(str));
}
input(char * buffer)
{
puts("Enter something: ");
printf("Initial length after passing = %ld\n",strlen(buffer));
if ( fgets(buffer, sizeof(buffer), stdin) == NULL )
return -1;
else
{
printf("Length after getting input(called function)= %ld\n",strlen(buffer));
return 0;
}
}
Output 1
Initial length before passing = 0
Enter something:
Initial length after passing = 0
hello
Length after getting input(called function)= 6
Received string = hello
Length after getting input(calling function) = 5
Output 2
Initial length before passing = 0
Enter something:
Initial length after passing = 0
helloooooo
Length after getting input(called function)= 7
Received string = hellooo
Length after getting input(calling function) = 5
Why is it printing different lengths when I give different input?
In output 1 & 2 why the initial length is 6 when I allocated space for only 5 characters?
Why is the length of string different before passing and after passing in both output 1 and output 2?
In output 2 why "Length after getting input(called function)= 7" when I allocated only less space?
strlen is meant to work with C strings, that is, char arrays that end with \0. When you define:
char str[5];
str contains garbage; you are lucky that calling strlen didn't cause a segmentation fault.
1) In output 1 & 2 why the initial length is 6 when i allocated space for only 5 characters?
Your first strlen(str) call isn't really even defined, since you declared char str[5] but didn't put anything into it. So the contents are who-knows-what. The fact that strlen(str) returned 6 just means that there happened to be 6 non-zero characters in memory, starting at address str, before it encountered a 0.
2) why the length of string is different before passing and after passing in both output 1 and output 2?
After getting a length of 6 for random memory contents, you loaded something into the string buffer and zero terminated it. So the length changed to something real.
3) In output 2 why "Length after getting input(called function)= 7" when i allocated only less space?
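Because you actually overran your allocated space with a longer string (of length 7). Inside input(), buffer is a char *, so sizeof(buffer) is the size of a pointer (evidently 8 bytes on your machine, given that 7 characters were read), not the 5 bytes of the array. fgets was therefore allowed to store up to 7 characters plus the terminator. You were lucky the program didn't crash in that case.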
When you declare a buffer in C, such as:
char str[5];
All it does is tell the compiler you're reserving 5 bytes of space to do something and it entitles you to use that space and only that space for str. It doesn't necessarily have anything in the buffer to start unless you put something there, and it doesn't prevent you from writing more than you declared.
Note that str[5] isn't big enough to hold "hello" since strings in C are zero-terminated. So you need a character buffer of size N+1 to hold a string of size N. When you overflow buffers in C, your results will become erratic and unpredictable.
char str[5];
printf("Initial length before passing = %ld\n",strlen(str));
strlen returns the length of specified string, not the size of the buffer. You didn't initialize str to anything, so it's full of garbage. strlen will return the "length" of that garbage, which could be anything.
