C - strcpy() function restrictions

I am incredibly new in C programming and I'm having a hard time understanding some aspects of it, including the strcpy() function.
I am doing some quizzes and passed over the following question:
To assure the correctness of the following strcpy(d,s) call, which of the following conditions must always be met:
a. sizeof(d) >= strlen(s) + 1
b. sizeof(d) >= sizeof(s)
c. sizeof(d) >= strlen(s)
d. strlen(d) >= strlen(s)
e. strlen(d) >= strlen(s) + 1
After doing some research, I found that the size of the destination string should be large enough to store the copied string. Source here. This led me to answers either b or d.
However the correct answer is 'a' and I cannot understand why, and cannot find any documentation. Could someone please explain in more details what the restrictions of strcpy() are?

It sort of depends on how the variables are declared and/or defined even, however judging by the fact that the answer is a, I'm positive that this is how they were declared and defined-
// Assume SIZE_0 and SIZE_1 are some integer values
char d[SIZE_0];
char s[SIZE_1];
// Assign a bunch of characters to `s` here and null terminate it
// Assume `s` now has `LEN` number of characters + 1 for the null terminator, for a total of `LEN + 1`
// Of course, `LEN + 1` is either less than, or equal to, `SIZE_1`
Now let's get the values cleared up real quick-
strlen(s) -> Returns LEN, as it counts the number of characters until the null terminator
sizeof(s) -> Returns SIZE_1
sizeof(d) -> Returns SIZE_0
strlen(d) -> Not valid yet. strlen needs a null terminator, and d is currently uninitialized, so calling strlen on it is undefined behavior. d has no length, not even 0, unless you set d[0] to '\0' yourself
So, it's evident that for strcpy(d, s) to work, sizeof(d) (which is the only valid call) MUST BE more than, or equal to, LEN + 1 (LEN for all the characters from s and +1 for the null terminator).
Of course, strlen on s will return LEN, so we'll need strlen(s) + 1.
And that is why you need sizeof(d) to be more than, or equal to, strlen(s) + 1.
I must say though, if strlen did return the capacity (aka size) of the string, instead of the current length, your assumption would work.
At the end of the day, always remember the difference between capacity (size) and length, especially in C.
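As a minimal sketch of the rule above, the condition from option a can be checked before calling strcpy. The helper name copy_if_fits is made up for illustration and is not a standard function; it assumes the caller passes the real capacity of the destination.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst only if dst_size bytes can hold
   every character of src plus the null terminator (option a). */
bool copy_if_fits(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);
    if (dst_size < strlen(src) + 1) {
        return false;   /* strcpy would write past the end of dst */
    }
    strcpy(dst, src);   /* copies strlen(src) characters and the '\0' */
    return true;
}
```

With char d[4];, the call copy_if_fits(d, sizeof d, "abc") succeeds and "abcd" is rejected. A source declared as char s[20] = "abc"; also fits, even though sizeof(s) is larger than sizeof(d), which is why option b is not a requirement.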

It is copying a sequence of character bytes with a null terminator. For example the string "ABC" is represented 65,66,67,0.
strlen calculates the length of the string excluding the null terminator (e.g. 3 for "ABC"). sizeof (for an array) gives you the amount of memory set aside for the array. The string it contains may be shorter.
char s[20] = "ABC"; would give you a string of length 3, using 4 bytes including the terminator, in a reserved memory space of 20 bytes.
To safely copy from s to d, you need to have enough space in d to receive the string and its terminator (without the terminator you won't have a valid string). Hence strlen(s)+1.
The size of s is irrelevant as not all of the reserved space will be copied. strlen(d) is irrelevant also - any existing string in d will be overwritten.
So option a is correct.
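A small sketch of the byte counting described here, assuming an ASCII system. The function names bytes_copied and dump_bytes are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Number of bytes strcpy writes for s: every character plus the terminator. */
size_t bytes_copied(const char *s)
{
    assert(s != NULL);
    return strlen(s) + 1;
}

/* Prints the numeric value of each byte strcpy would copy, terminator included. */
void dump_bytes(const char *s)
{
    size_t n = bytes_copied(s);
    for (size_t i = 0; i < n; i++) {
        printf("%d ", (unsigned char)s[i]);
    }
    printf("\n");
}
```

With char s[20] = "ABC";, bytes_copied(s) is 4 while sizeof s stays 20, and dump_bytes(s) prints the values 65 66 67 0 on an ASCII system.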

All character strings in C (the kind used by functions such as strcpy) must have a nul-terminator (a zero-value char signalling the end of the string). So, to store a string like "abc", you will need any char[] array to have at least four elements: one for each of the letters plus one for the nul-terminator.
The strlen() function returns the number of characters in the given string not including that nul-terminator; but the strcpy() function copies all characters including the terminator, so the destination buffer must be at least one char greater than the strlen of the source.
Also note that the sizeof(d) calculation will only work if d is declared as an array of char (e.g. char d[42]); if that array is passed to a function, it will 'decay' to a pointer, and the array's size will not be (implicitly) known to that function; see this discussion: How to find the 'sizeof' (a pointer pointing to an array)?.
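To illustrate the decay, here is a rough sketch. Both function names are hypothetical; the second one shows the usual workaround of passing the capacity as a separate argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* d is a pointer here, so sizeof d is the size of a pointer,
   not the size of the caller's array. */
size_t size_seen_by_callee(char *d)
{
    return sizeof d;
}

/* Hypothetical helper: the caller passes the real capacity explicitly. */
bool fits(size_t dst_size, const char *src)
{
    assert(src != NULL);
    return dst_size >= strlen(src) + 1;
}
```

For char d[42];, sizeof d is 42 at the point of declaration, but size_seen_by_callee(d) returns sizeof(char *). Calling fits(sizeof d, s) from the scope where d is still an array keeps the check meaningful.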

Related

Char string length not getting initialized properly despite literally putting in the integer size I want it to be?

I'm working with char arrays in C. I'm setting the size in a previous step. When I print it out it clearly shows the num_digits as 1.
But then when I put it in to set the size of a char array to make it a char array of size num_digits, its setting the size of the array as 6.
In the next step when I print strlen(number_array), it prints 6. Printing it out I get something with a lot of question marks. Does anyone know why this is happening?
int num_digits = get_num_digits(number);
printf("Num digits are %d\n", num_digits);
char number_array[num_digits];
printf("String len of array: %d\n", strlen(number_array));
You need to null terminate your array.
char number_array[num_digits + 1];
number_array[num_digits] = '\0';
Without this null terminator, C has no way of knowing when you've reached the end of the string.
just use 'sizeof' instead of 'strlen'
printf("String len of array: %zu\n", sizeof(number_array));
There are a couple possible issues I see here:
As noted in Michael Bianconi's answer, C character arrays (often called strings) require null terminators. You would explicitly set this with something like:
number_array[num_digits] = '\0'; /* See below for why the array needs num_digits + 1 elements */
Rather than just setting the last element to null, pre-initializing the entire character array to nulls might be helpful. Some compilers may do this for you, but if not you'll need to do this explicitly with something like:
for (int i = 0; i < num_digits + 1; i ++) number_array[i] = '\0';
Note that with gcc I had to use C99 mode using -std=c99 to get this to compile, as the compiler didn't like the initialization within the for statement.
Also, the code presented sets the length of the character array to be the same as the number of digits in number. We don't know what get_num_digits returns, but if it returns the actual number of significant digits in an integer, this will come up one short (see above and other answer), as you need an extra character for the null terminator. An example: if the number is 123456 and get_num_digits returns 6, you would need to set the length of number_array to 7, instead of 6 (i.e. num_digits + 1).
char number_array[num_digits]; allocates some space for a string. It's an array of num_digits characters. Strings in C are represented as an array of characters, with a null byte at the end. (A null byte has the value zero, not to be confused with the digit character '0'.) So this array has room for a string of up to num_digits - 1 characters.
sizeof(number_array) gives you the array storage size. That's the total amount of space you have for a string plus its null terminator. At any given time, the array can contain a string of any length up to sizeof(number_array) - 1, or it might not contain a string at all if the array doesn't contain a null terminator.
strlen(number_array) gives you the length of the string contained in the array. If the array doesn't contain a null terminator, this call may return a garbage value or crash your program (or make demons fly out of your nose, but most computers fortunately lack the requisite hardware).
Since you haven't initialized number_array, it contains whatever happened to be there in memory before. Depending on how your system works, this may or may not vary from one execution of the program to the next, and this certainly does vary depending on what the program has been doing and on the compiler and operating system.
What you need to do is:
Give the array enough room for the null terminator.
Initialize the array to an empty string by setting the first character to zero.
Optionally, initialize the whole array to zero. This is not necessary, but it may simplify further work with the array.
Use %zu rather than %d to print a size. %d is for an int, but sizeof and strlen return a size_t, which depending on your system may or may not be the same size as an int.
char number_array[num_digits + 1];
number_array[0] = 0; // or memset(number_array, 0, sizeof(number_array));
printf("Storage size of array: %zu\n", sizeof(number_array));
printf("The array contains an empty string: length=%zu\n", strlen(number_array));

Last value of char array unknown - C

I'm making a simple program in C, which checks the length of some char array and if it's less than 8, I want to fill a new array with zeroes and add it to the former array. Here comes the problem. I don't know why the last values are some signs(see the photo).
char* hexadecimalno = decToHex(decimal,hexadecimal);
printf("Hexadecimal: %s\n", hexadecimalno);
char zeroes [8 - strlen(hexadecimalno)];
if(strlen(hexadecimalno) < 8){
for(i = 0; i < (8-strlen(hexadecimalno)); i++){
zeroes[i]='0';
}
}
printf("zeroes: %s\n",zeroes);
strcat(zeroes,hexadecimalno);
printf("zeroes: %s\n",zeroes);
[Image: result]
In C, strings (which are, as you are aware, arrays of characters) do not have any special metadata that tells you their length. Instead, the convention is that the string stops at the first character whose char value is 0. This is called "null-termination". The way your code is initializing zeroes does not put any null character at the end of the array. (Do not confuse the '0' characters you are putting in with NUL characters -- they have char value 48, not 0.)
All of the string manipulation functions assume this convention, so when you call strcat, it is looking for that 0 character to decide the point at which to start adding the hexadecimal values.
C also does not automatically allocate memory for you. It assumes you know exactly what you are doing. So, your code is using a C99 feature (a variable-length array) to create an array zeroes that has exactly as many elements as the number of '0' characters you need for padding. You aren't allocating an extra byte for a terminating NUL character, and strcat is also going to assume that you have allocated space for the contents of hexadecimalno, which you have not. In C, this does not trigger a bounds check error. It just writes over memory that you shouldn't actually write over. So, you need to be very careful that you do allocate enough memory, and that you only write to memory you have actually allocated.
In this case, you want hexadecimalno to always be 8 digits long, left-padding it with zeroes. That means you need an array with 8 char values, plus one for the NUL terminator. So, zeroes needs to be a char[9].
After your loop that sets zeroes[i] = '0' for the correct number of zeroes, you need to set the next element to char value 0. The fact that you are zero-padding confuses things, but again, remember that '0' and 0 are two different things.
Provided you allocate enough space (at least 9 characters, assuming that hexadecimalno will never be longer than 8 characters), and then that you null terminate the array when putting the zeroes into it for padding, you should get the expected result.
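One way the fix could look, as a sketch only. The name pad_hex and the HEX_WIDTH constant are made up for this example, and it assumes the caller provides a buffer of at least HEX_WIDTH + 1 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HEX_WIDTH 8

/* Hypothetical helper: left-pads hex with '0' digits up to HEX_WIDTH
   characters. out must have room for HEX_WIDTH + 1 bytes.
   Returns 0 on success, -1 if hex is longer than HEX_WIDTH. */
int pad_hex(char *out, const char *hex)
{
    assert(out != NULL && hex != NULL);
    size_t len = strlen(hex);
    if (len > HEX_WIDTH) {
        return -1;
    }
    size_t pad = HEX_WIDTH - len;
    memset(out, '0', pad);            /* the digit '0', not the value 0 */
    memcpy(out + pad, hex, len + 1);  /* len characters plus the '\0' */
    return 0;
}
```

With char padded[HEX_WIDTH + 1];, calling pad_hex(padded, "1A3F") leaves "00001A3F" in padded. Copying the terminator together with the digits means the array is always a valid string afterwards.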

What happens when I write `char str[80];`?

What happens behind the scenes when I write: char str[80];?
I notice that I can now set str = "hello"; and also str = "hello world"; right afterwards. First time strlen(str) is 5, and second time it is 11;
But why? I thought that after str = "hello";, the char at index 5 becomes null (str[5] becomes '\0'). Doesn't that mean that str's size is now 6 and I shouldn't be able to set it to "hello world"?
And if not, then how does strlen and sizeof calculate the correct values every time?
I think you're getting confused between two different concepts: the allocated length of the array (how much total space is available), and the logical length of the string (how much space is being used).
When you write char str[80], you're getting storage space for 80 characters. You might not end up using all of that space, but regardless of what string you try storing in it, you're always going to have 80 slots into which you can place characters.
If you store the string "hello" into str, then the first six characters of str will be set to h, e, l, l, o, and a null terminating character. This doesn't change the allocated length, though - you still have 74 other slots that you can work with. If you then change it to "hello, world", you're using an extra seven characters, which fits just fine because you easily have enough allocated space to hold things. You've just changed the logical length, how much of that space is being used for meaningful data, but not the allocated length, how much space there is available.
Think of it this way. When you say char str[80], you're buying a plot of land that's, say, 80 acres. If you then put "hello" into it, you're using six acres of that available 80 acres. The rest of the land is still yours - you can build whatever you'd like there - so if you decide to tear everything down and build a longer string that uses up more acres of land, that's fine. No one is going to object.
The strlen function gives back the logical length of the string - how many characters are in the string that you're storing. It works by counting up characters until it finds a null terminator indicating the logical end of the string. The sizeof operator returns the allocated length of the array, how many slots you have. It works at compile-time and doesn't care what the array contents are.
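A short sketch of the two lengths side by side. Note that plain array assignment such as str = "hello"; does not compile in C, so this example uses strcpy instead. The struct and function names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sizes {
    size_t capacity;    /* allocated length, from sizeof */
    size_t first_len;   /* logical length after storing "hello" */
    size_t second_len;  /* logical length after storing "hello, world" */
};

struct sizes measure(void)
{
    char str[80];
    struct sizes r;

    r.capacity = sizeof str;
    /* The longer string and its terminator must fit in the array. */
    assert(sizeof str >= strlen("hello, world") + 1);

    strcpy(str, "hello");
    r.first_len = strlen(str);

    strcpy(str, "hello, world");   /* overwrites the earlier contents */
    r.second_len = strlen(str);
    return r;
}
```

Calling measure() gives a capacity of 80 and logical lengths of 5 and 12: the allocated length never changes, only the position of the null terminator does.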
When you declare a variable as char str[80], space for an 80 character array is allocated on the stack. This memory will be automatically released when that particular stack frame is out of scope.
When you assign it to the string literal "hello", it is copying each character into the array, then putting a null terminator at the end of the string (str[5] == '\0'). String length and array size are two different things, which is why you can reassign it to "hello world". String length is simply how many consecutive characters there are before the null terminator. If you instead declared str as char str[5], copying "hello world" into it would write past the end of the array, which is undefined behavior and may well crash. It may be helpful to view a simple implementation of strlen:
size_t strlen(const char *str)
{
    size_t return_val = 0;
    while (str[return_val] != '\0') return_val++;
    return return_val;
}
Of course, if there is no null terminating character, the above naive implementation will keep reading past the end of the array, which is undefined behavior and may crash.
I am assuming that you are working in C. When you compile "char str[80];" basically an 80 character long space is allocated for you. sizeof(str) should always tell you that it is an 80 byte long chunk of memory. strlen(str) will count the non-zero characters starting at str[0] and stop at the first zero byte. This is why "Hello" is 5 and "Hello world" is 11.
I would suggest that you learn to use functions like strnlen, strncpy, strncmp, snprintf ..., this way you can prevent reading/writing beyond the end of the array, for example: strnlen(str,sizeof(str)).
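As one possible illustration of the bounded functions, here is a sketch built on snprintf, which never writes more than the given size and always terminates the result when the size is non-zero. (strncpy, by contrast, leaves the destination unterminated when the source does not fit.) The name bounded_copy is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes src into dst without exceeding dst_size bytes.
   Returns 1 if the whole string fit, 0 if it was truncated. */
int bounded_copy(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    /* snprintf returns the length the full output would have had. */
    int needed = snprintf(dst, dst_size, "%s", src);
    return needed >= 0 && (size_t)needed < dst_size;
}
```

With char buf[8];, copying "hello" succeeds, while copying "hello world" returns 0 and leaves the truncated but still terminated string "hello w" in buf.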
Also start working through online tutorials and find an introductory C/C++ book to learn from.
When you declare an array like char str[80]; 80 chars of space are reserved on the stack for you, but they are not initialized - they get whatever was already in memory at the time. It's your job as the programmer to initialize the array.
strlen does something along these lines:
size_t strlen(const char *s)
{
    size_t len = 0;
    while (*s++) len++;   /* stops at the first '\0' */
    return len;
}
In other words, it returns the length of a null-terminated string in a character array, even if the length is less than the size of the total array.
sizeof returns the size of a type or expression. If your array is 80 chars long, and a char is a byte long, it will return 80, even if none of the values in the array have been initialized. If you had an array of 5 ints, and an int was 4 bytes long, sizeof would produce 20.

difference between sizeof and strlen in C linux

The first printf statement is giving output 3 and second giving 20.
Can anybody please explain what's the difference between the two here?
char frame[20],str[20];
printf("\nstrlen(frame)= %d",strlen(frame));
printf("\nsizeof(frame) = %d",sizeof(frame));
Thanks :)
sizeof is a compile-time operator and determines the size in bytes that a type consumes. In the case of frame (char[20]) that is 20 bytes.
strlen is a run-time function and scans a given pointer until the first occurrence of a nul terminator '\0' returning the amount of characters until then.
Because the contents of frame are not initialized, it is not a valid C string, so strlen(frame) could return any value, or crash. Strictly speaking, its behavior is undefined in this case.
Because frame is an array of 20 characters, sizeof(frame) will return 20 * sizeof(char), which will always be 20 (sizeof(char) always equals 1).
strlen actually gives you the length of the string, whereas sizeof gives you the size of the allocated memory in bytes. It is in fact quite nicely explained here: http://www.cplusplus.com/reference/cstring/strlen/ (extract given below).
The length of a C string is determined by the terminating null-character: A C string is as long as the number of characters between the beginning of the string and the terminating null character (without including the terminating null character itself).
This should not be confused with the size of the array that holds the string. For example:
char mystr[100]="test string";
defines an array of characters with a size of 100 chars, but the C string with which mystr has been initialized has a length of only 11 characters. Therefore, while sizeof(mystr) evaluates to 100, strlen(mystr) returns 11.
And yes as per the other comments, you are trying to get length for uninitialized strings and that leads to undefined behaviour, it can be 3 or anything else, depending on whatever garbage is present in the memory that got allocated for your string.

Subtracting from pointer to get length

I wanted to find the length of a part of a string after searching for it within a bigger string.
I cannot use strlen since I am dealing with binary data.
char *temp= "this is some random text";
char *temp1 = strstr(temp,"some");
int len = strlen(temp);
int len1 =0;
len1 = temp+len - temp1;
to get length of "some random text"
len1 returns negative value (even the positive value of it is wrong)
If your data is not NULL-terminated, then you cannot call strstr() on it for the same reason you can't call strlen(). If you do that, you can end up scanning past the end of your data. If you find a match there (which is quite possible; reading past the end of arrays is not guaranteed to crash the program), then your pointer arithmetic is going to give you a negative value, because you're subtracting a larger address from a smaller one.
On the other hand, if your data is actually properly NULL-terminated, then your problem is probably that strstr() doesn't find the substring and thus returns NULL. Are you checking for NULL? Otherwise, what you end up doing is:
len1 = temp + len - (char*)NULL;
Final answer:
You're looking for len - (temp1 - temp). The length of the first part is temp1 - temp. Subtract it from the length of the entire string to get the length of the remaining part.
Longer answer:
Since strlen (which is what you have used in your example, even if it only works for proper text messages) goes until it finds a \0 character you can simply use strlen(temp1) for the length of the last part of the input. If you are really concerned that calling strlen twice will harm your performance (really?) then you can use len - (temp1 - temp).
You only need to do pointer subtraction if you are interested in the length of the first part of the input.
If you want to work with binary arrays which contain \0 in them at non-terminal positions, you cannot use strlen at all in your code. However, you have to have a way to specify the length of the entire input. Either you have this in an integer variable or you have a specific delimiter and a length-computing function.
If you have the integer variable for length then, since the length of the first part of the input is obtained by pointer subtraction, you only have to do len - (temp1 - temp).
If you have a length-computing function, simply call it with temp1 as argument.
PS: Don't forget to check if strstr returns NULL (by the way, you cannot use strstr if you have binary data with \0 inside the buffer)
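The two cases above could be sketched roughly as follows. Both function names are made up for this example: tail_length covers properly terminated text, and find_bytes is a simple search for binary buffers whose length is known separately, since strstr cannot be used there.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Text case: length from the first occurrence of needle to the end of text,
   or -1 if needle does not occur. */
ptrdiff_t tail_length(const char *text, const char *needle)
{
    assert(text != NULL && needle != NULL);
    const char *hit = strstr(text, needle);
    if (hit == NULL) {
        return -1;                       /* never subtract from NULL */
    }
    return (ptrdiff_t)strlen(text) - (hit - text);
}

/* Binary case: finds pat inside buf using explicit lengths, so embedded
   zero bytes are handled. Returns NULL if there is no match. */
const unsigned char *find_bytes(const unsigned char *buf, size_t buf_len,
                                const unsigned char *pat, size_t pat_len)
{
    assert(buf != NULL && pat != NULL);
    if (pat_len == 0 || pat_len > buf_len) {
        return NULL;
    }
    for (size_t i = 0; i + pat_len <= buf_len; i++) {
        if (memcmp(buf + i, pat, pat_len) == 0) {
            return buf + i;
        }
    }
    return NULL;
}
```

For the string in the question, tail_length("this is some random text", "some") is 16. For binary data, the remaining length after a match hit is buf_len - (hit - buf), the same subtraction with the length taken from a variable instead of strlen.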
