char arrays in c end char - c

I'm reading from a socket into a char array and I want to know when to stop reading. The terminating char sequence is '\r\n\r\n'. If what I read in is smaller than the array size I don't want to loop around anymore. My question is really if I load into the array say 10 characters and it has length 20, what is the array[20] index set to?
Thanks
edit:
Sorry, I did mean array[19]. Is setting the last index to NULL, as suggested, an appropriate solution? To give some more detail: I need to know when all the data has been read from the socket. I don't know the size of the data to be sent, only that it terminates with '\r\n\r\n'.

If it has length 20, then array[20] is outside your array and shouldn't be accessed like that (unless you want to do some sort of wizardry and hacking beyond your explanation).
EDIT: If you meant array[19], then no. You need to write the NUL character at the index equal to the number of characters received. Note that the ASCII NUL character '\0' is not the C NULL constant: NULL is a null pointer constant, and assigning it to a char element is a type mismatch that compilers typically warn about.

My question is really if I load into the array say 10 characters and it has length 20, what is the array[20] index set to?
It's not set to anything. Feel free to set it to something yourself (for instance, a null terminator).

Generally in the name of efficiency C does not initialize an array to any known value, so you'll get whatever was leftover in memory.
You can explicitly initialize the array to fix this. A common initialization for a sequence of bytes is zero, which won't match your search string and will act as an end-of-string marker if you try to process the array as a string.
char array[20] = {0}; /* the extra elements are always initialized to 0 as well */
char array2[20];
memset(array2, 0, sizeof(array2));
I'll presume you had a typo and meant array[19] instead of array[20].
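As a minimal sketch of the zero-initialization approach (zeroed_buffer_length is a hypothetical name, and the 10-character payload merely stands in for data read from the socket), the following shows that the untouched elements stay 0, so the array is already a valid string:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical demo: copy 10 characters into a zero-initialized
 * 20-element array and report the length that strlen sees. */
size_t zeroed_buffer_length(void)
{
    char array[20] = {0};             /* all 20 elements start as 0 */

    memcpy(array, "0123456789", 10);  /* fills array[0]..array[9] only */

    /* array[10]..array[19] are still 0, so array[10] acts as the
     * string terminator and strlen stops there. */
    return strlen(array);
}
```

Without the = {0} initializer the same call to strlen would read indeterminate bytes and could run past the end of the array.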

In C, when the array is allocated with malloc, it contains whatever was left over in that chunk of memory. If you copy several chars into the array and want them to be read as a string, you have to set the element right after the last char to '\0'.

Since you know when to stop reading, you could set the next char in your array to '\0' to mark the end of the string.
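One way to put this together is to track how many bytes are stored, terminate at that index after every read, and then search for the end marker. The sketch below is only an illustration: append_chunk is a hypothetical helper, the chunk would in practice come from a call such as recv() (whose return value is the byte count), and it assumes the data contains no embedded '\0' bytes, because strstr stops at the first one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append n bytes of chunk to buf (capacity cap,
 * current length *len), keep buf NUL-terminated, and report whether the
 * "\r\n\r\n" end marker has arrived.
 * Returns 1 if the marker is present, 0 if more data is needed,
 * -1 if the chunk does not fit (nothing is copied in that case). */
int append_chunk(char *buf, size_t cap, size_t *len,
                 const char *chunk, size_t n)
{
    assert(cap > 0 && *len < cap);

    if (n > cap - 1 - *len)
        return -1;                /* would overflow: caller must handle */

    memcpy(buf + *len, chunk, n);
    *len += n;
    buf[*len] = '\0';             /* index == number of bytes stored */

    return strstr(buf, "\r\n\r\n") != NULL;
}
```

A read loop would keep calling the helper with each received chunk and stop as soon as it returns 1 (or fails with -1).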

To the best of my knowledge, the ANSI C standard does not specify what value an uninitialized local array holds. Consider it to be garbage and assume that nothing can be said about it. That said, I have mostly observed such arrays to be 0 (using gcc). This behavior may vary across compilers.
Also, this value could depend on the previous steps which have modified array[19] (as mOskitO pointed out, array[20] is out of bounds).

Related

Does specifying array size for a user input string in C matter?

I am writing a code to take a user's input from the terminal as a string. I've read online that the correct way to instantiate a string in C is to use an array of characters. My question is if I instantiate an array of size [10], is that 10 indexes? 10 bits? 10 bytes? See the code below:
#include <stdio.h>
int main(int argc, char **argv){
char str[10] = "Jessica";
scanf("%s", &str);
printf("%c\n", str[15]);
}
In this example "str" is initialized to size 10 and I am able to print out str[15], assuming that when the user inputs a string it goes up to that index.
My questions are:
Does the size of the "str" array increase after taking a value from scanf?
At what amount of string characters will my original array have overflow?
When you declare an array of char as you have done:
char str[10] = "Jessica";
then you are telling the compiler that the array will hold up to 10 values of the type char (generally - maybe even always - this is an 8-bit character). When you then try to access a 'member' of that array with an index that goes beyond the allocated size, you will get what is known as Undefined Behaviour, which means that absolutely anything may happen: your program may crash; you may get what looks like a 'sensible' value; you may find that your hard disk is entirely erased! The behaviour is undefined. So, make sure you stick within the limits you set in the declaration: for str[n] in your case, the behaviour is undefined if n < 0 or n > 9 (array indexes start at ZERO). Your code:
printf("%c\n", str[15]);
does just what I have described - it goes beyond the 'bounds' of your str array and, thus, will cause the described undefined behaviour (UB).
Your scanf("%s", &str); may also cause such UB, if the user enters a string of characters longer than 9 (one must be reserved for a terminating nul character)! You can prevent this by telling the scanf function to accept a maximum number of characters:
scanf("%9s", str);
where the integer given after the % is the maximum number of characters to store (any further characters are not stored; they stay in the input stream for the next read). Also, as str is defined as an array, you don't need the explicit "address of" operator (&) in scanf: an array name used this way already decays to a pointer to its first element!
Hope this helps! Feel free to ask for further clarification and/or explanation.
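To see the field width at work without typing at a terminal, here is a small sketch that uses sscanf, which parses a string with the same conversion rules as scanf. The helper name read_word is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: parse one whitespace-delimited word from input
 * into a 10-char buffer. sscanf stands in for scanf here so that the
 * example needs no terminal input. Returns the sscanf result
 * (1 when a word was stored). */
int read_word(const char *input, char out[10])
{
    /* %9s stores at most 9 characters plus the terminating '\0',
     * which is exactly what a char[10] can hold. */
    return sscanf(input, "%9s", out);
}
```

With an input longer than nine characters, only the first nine are stored and out[9] holds the terminator, so the array is never overrun.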
One of C's funny little foibles is that in almost all cases it does not check to make sure you are not overflowing your arrays.
It's your job to make sure you don't access outside the bounds of your arrays, and if you accidentally do, almost anything can happen. (Formally, it's undefined behavior.)
About the only thing that can't happen is that you get a nice error message
Error: array out-of-bounds access at line 23
(Well, theoretically that could happen, but in practice, virtually no C implementation checks for array bounds violations or issues messages like that.)
See also this answer to a similar question.
An array declares the given number of whatever you are declaring. So in the case of:
char str[10]
You are declaring an array of ten chars.
Does the size of the "str" array increase after taking a value from scanf?
No, the size does not change.
At what amount of string characters will my original array have overflow?
An array of 10 chars will hold nine characters and the null terminator. So, technically, it limits the string to nine characters.
printf("%c\n", str[15]);
This code references the 16th character in your array. Because your array only holds ten characters, you are accessing memory outside of the array. It's anyone's guess as to if your program even owns that memory and, if it does, you are referencing memory that is part of another variable. This is a recipe for disaster.
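Since C will not check the index for you, one option is to check it yourself before reading. This is only a sketch; checked_get is a hypothetical helper and not part of the standard library.

```c
#include <stddef.h>

/* Hypothetical helper: return arr[i] if i is inside the array,
 * otherwise return fallback instead of reading out of bounds. */
char checked_get(const char *arr, size_t size, size_t i, char fallback)
{
    return i < size ? arr[i] : fallback;
}
```

For char str[10] = "Jessica";, sizeof str is 10 before and after any call to scanf, and checked_get(str, sizeof str, 15, '?') returns '?' rather than touching memory past the array.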

C char array and \0

In C, if I initialize a char array like this:
char lines[5];
memcpy((char *)line,"Hello",5)
Then if I execute the following expression:
line[6]='\0';
Would this cause buffer overflow? Thanks?
Many problems. For one, why cast to char * when that is what the array already decays to? Second, array indexes are zero-based, not one-based: the first element of array a is a[0], not a[1].
Also, you should have set the buffer size to 6, not 5, to make room for the terminator.
Then if I execute the following expression:
line[6]='\0';
Would this cause buffer overflow?
Yes. Because lines contains five characters and you are overwriting the seventh one.
Would the comp[il]er assign 8 bytes for 'lines'?
No.
It might put 3 bytes of padding after lines, in which case it's still a buffer overflow because lines is still 5 bytes long.
You are definitely writing outside the bounds of the array, which leads to undefined behavior. The result could be any of the following:
a runtime error (segfault);
corrupted data (overwriting part of another object);
behaving exactly as expected
Most platforms have alignment requirements such that there may be some unused bytes between the end of the array and the next object in memory (assuming the array size isn't a multiple of 2 or 4 bytes, anyway), and writing one or two bytes past the end of the array isn't much of an issue. But that's not the same thing as the compiler allocating "extra space" for the array.
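For comparison, a version that stays in bounds might look like the sketch below. The helper store_hello and its size check are illustrative assumptions, not code from the question.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the 5-character word "Hello" into dst and
 * terminate it. Returns 0 on success, -1 if dst is too small
 * (5 characters plus the terminator need 6 bytes). */
int store_hello(char *dst, size_t cap)
{
    if (cap < 6)
        return -1;

    memcpy(dst, "Hello", 5);   /* fills dst[0]..dst[4] */
    dst[5] = '\0';             /* index 5 is the sixth, last element of a char[6] */
    return 0;
}
```

Called with a char[6] it succeeds; called with the original char[5] it refuses instead of writing past the end.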

Last value of char array unknown - C

I'm making a simple program in C, which checks the length of some char array and if it's less than 8, I want to fill a new array with zeroes and add it to the former array. Here comes the problem. I don't know why the last values are some signs(see the photo).
char* hexadecimalno = decToHex(decimal,hexadecimal);
printf("Hexadecimal: %s\n", hexadecimalno);
char zeroes [8 - strlen(hexadecimalno)];
if(strlen(hexadecimalno) < 8){
for(i = 0; i < (8-strlen(hexadecimalno)); i++){
zeroes[i]='0';
}
}
printf("zeroes: %s\n",zeroes);
strcat(zeroes,hexadecimalno);
printf("zeroes: %s\n",zeroes);
In C, strings (which are, as you are aware, arrays of characters) do not have any special metadata that tells you their length. Instead, the convention is that the string stops at the first character whose char value is 0. This is called "null-termination". The way your code is initializing zeroes does not put any null character at the end of the array. (Do not confuse the '0' characters you are putting in with NUL characters -- they have char value 48, not 0.)
All of the string manipulation functions assume this convention, so when you call strcat, it is looking for that 0 character to decide the point at which to start adding the hexadecimal values.
C also does not automatically allocate memory for you. It assumes you know exactly what you are doing. Your code uses a C99 feature (a variable-length array) to give zeroes exactly as many elements as the number of '0' characters you need for padding. You aren't allocating an extra byte for a terminating NUL character, and strcat is also going to assume that you have allocated space for the contents of hexadecimalno, which you have not. In C, this does not trigger a bounds check error. It just writes over memory that you shouldn't write over. So you need to be very careful that you allocate enough memory, and that you only write to memory you have actually allocated.
In this case, you want hexadecimalno to always be 8 digits long, left-padding it with zeroes. That means you need an array with 8 char values, plus one for the NUL terminator. So, zeroes needs to be a char[9].
After your loop that sets zeroes[i] = '0' for the correct number of zeroes, you need to set the next element to char value 0. The fact that you are zero-padding confuses things, but again, remember that '0' and 0 are two different things.
Provided you allocate enough space (at least 9 characters, assuming that hexadecimalno will never be longer than 8 characters), and then that you null terminate the array when putting the zeroes into it for padding, you should get the expected result.
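The advice above can be sketched as follows. pad_hex8 is a hypothetical helper name, and it assumes the input is an ordinary NUL-terminated string of at most 8 characters.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: left-pad hex with '0' characters to 8 digits,
 * writing the result into out, which must hold 9 chars
 * (8 digits plus the NUL terminator).
 * Returns 0 on success, -1 if hex is longer than 8 characters. */
int pad_hex8(const char *hex, char out[9])
{
    size_t len = strlen(hex);
    if (len > 8)
        return -1;

    size_t pad = 8 - len;
    memset(out, '0', pad);            /* the character '0' (value 48) */
    memcpy(out + pad, hex, len + 1);  /* the digits plus the '\0' (value 0) */
    return 0;
}
```

Copying len + 1 bytes brings the terminator along with the digits, so no separate step is needed to end the string.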

Array fill in C

I have this problem with a lot of arrays in my program, and I can't understand why. I think I'm missing something about array theory.
"Someone" adds some sort of characters such as ?^)(&% at the end of my arrays. For example, if I have an array of length 5 with "hello", so it's full, sometimes it prints hello?()/&%%. I can understand how this could occur if the array has 10 elements and I use only 5, so the other 5 elements might get some random values, but if it's full, where do those strange values come from?
I partially solved it by manually adding the character '\0' at the end.
For example, this problem occurs, sometimes, when I try to fill an array from another array (I read a line from a test file with fgets, then I have to extract single words):
...
for(x=0;fgets(c,500,fileb);x++) { // read old local file
int l=strlen(c);
i=0;
for (k=0;k<(l-34);k++) {
if(c[k+33]!='\n') {
userDatabaseLocalPath[k]=c[k+33];
}
}
Thanks
Strings in C are terminated by a character with the value 0, usually written as the character literal '\0'.
A character array of size 5 can not hold the string hello, since the terminator doesn't fit. Functions expecting a terminator will be confused.
To declare an array holding a string, the best syntax to use is:
char greeting[] = "hello";
This way, you don't need to specify the length (count the characters), since the compiler does that for you. And you also don't need to include the terminator, it's added automatically so the above will create this, in memory:
          +-+-+-+-+-+--+
greeting: |h|e|l|l|o|\0|
          +-+-+-+-+-+--+
You say that you have problems "filling an array from another longer array". This sounds like the operation usually referred to as string copying. Since strings are just arrays with terminators, you can't blindly copy a longer string over a shorter one unless you know that there is extra space.
Given the above, this code would invoke undefined behavior:
strcpy(greeting, "hi there!");
since the string being copied into the greeting array is longer than what the array has space for.
This is typically avoided by using "known to be large enough" buffers, or adding checks that manually keep track of the space used. There is a function called strncpy() which sort of does this, but I would not recommend using it since its exact semantics are fairly odd.
You are running past the bounds of the array. If the array is of size 5, there is no guarantee that the sixth location, where the \0 would go, is safe, because that memory is not reserved for your array. If something else later writes to that memory, you lose the \0, and the string is read as something like "helloI accessed this space", which is what you are getting.
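One way to add the kind of check described above is to let snprintf do the bounded copy, since it never writes more than the given size and always terminates the result when the size is nonzero. This is a sketch under that assumption; copy_string is a hypothetical helper.

```c
#include <stdio.h>

/* Hypothetical helper: copy src into dst (capacity cap), always
 * NUL-terminating the result when cap > 0.
 * Returns 1 if the whole string fit, 0 if it was truncated. */
int copy_string(char *dst, size_t cap, const char *src)
{
    /* snprintf returns the length the full result would have had,
     * so a value >= cap means the copy was cut short. */
    int n = snprintf(dst, cap, "%s", src);
    return n >= 0 && (size_t)n < cap;
}
```

With char greeting[6], copying "hello" fits exactly, while copying "hi there!" is truncated to five characters and reported as such instead of overrunning the array.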

second memcpy() attaches previous memcpy() array to it

I have a little problem here with memcpy()
When I write this
char ipA[15], ipB[15];
size_t b = 15;
memcpy(ipA,line+15,b);
It copies b bytes from array line starting at 15th element (fine, this is what i want)
memcpy(ipB,line+31,b);
This copies b bytes from line starting at the 31st element, but it also attaches the result of the previous command, i.e. ipA, to it.
Why? ipB's size is 15, so it shouldn't have enough space to hold anything else. What's happening here?
result for ipA is 192.168.123.123
result for ipB becomes 205.123.123.122 192.168.123.123
Where am I wrong? I don't actually know a lot about memory allocation in C.
It looks like you're not null-terminating the string in ipA. The compiler has put the two variables next to one another in memory, so string operations assume that the first null terminator is sometime after the second array (whenever the next 0 occurs in memory).
Try:
char ipA[16], ipB[16];
size_t b = 15;
memcpy(ipA,line+15,b);
ipA[15] = '\0';
memcpy(ipB,line+31,b);
ipB[15] = '\0';
printf("ipA: %s\nipB: %s\n", ipA, ipB);
This should confirm whether this is the problem. Obviously you could make the code a bit more elegant than my test code above. As an alternative to manually terminating, you could use printf("%.*s\n", (int)b, ipA); or similar to force printf to print the correct number of characters (the argument for .* must be an int, hence the cast).
Are you checking the content of the arrays by doing printf("%s", ipA)? If so, you'll end up with the described effect, since your array is interpreted as a C string that is not null-terminated. Do this instead: printf("%.*s", (int)sizeof(ipA), ipA);
Character strings in C require a terminating mark. It is the char value 0.
As your two character strings are contiguous in memory, if you don't terminate the first character string, then when reading it, you will continue until memory contains the end-of-string character.
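The fix discussed in the answers above can be wrapped in a small helper. This is a sketch: extract_field is a hypothetical name, and the caller is assumed to guarantee that line holds at least offset + width bytes and that out has room for width + 1.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy a fixed-width field of width bytes that
 * starts at line + offset into out, then terminate it.
 * out must have room for width + 1 bytes. */
void extract_field(const char *line, size_t offset, size_t width, char *out)
{
    memcpy(out, line + offset, width);
    out[width] = '\0';   /* without this, "%s" keeps reading into whatever follows */
}
```

With char ipA[16], ipB[16]; the two calls extract_field(line, 15, 15, ipA) and extract_field(line, 31, 15, ipB) leave each array holding exactly one address, regardless of how the compiler places the arrays in memory.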

Resources