memmove() in C prints the result twice

I was playing around with memmove and I understand how it works. But whenever the end result contains more than the original source size, it prints out a bunch of random numbers. For example:
char str[] = "abcdefgh";
memmove(str + 6, str + 3, 4);
printf("%s\n", str);
gives me the output abcdefdefgbdefggh when it should give me
abcdefdefg. Why are the other characters being added to str?

void *memmove(void *destination, const void *source, size_t bytesToCopy)
The extra characters come from memory beyond the end of your declared char str[]. Your memmove call writes past the end of the buffer and overwrites the terminating '\0'. So when you call printf, it keeps printing the bytes that follow your array until it happens to encounter a '\0'.

The memory for str looks like this:
'a','b','c','d','e','f','g','h',0x0, ? , ? , ?
                                 ^
                                 End of buffer (terminates the string)
You copy 4 bytes from index 3 to index 6, which gives
'a','b','c','d','e','f','d','e','f','g', ? , ?
                                 ^
                                 End of buffer
So you have
a) overwritten the string termination (0x0) by 'f'
b) written outside the buffer (i.e. 'g') which is really bad
Due to a) you'll get strange results when printing str as the string termination is gone.
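A minimal sketch of one way to avoid both problems, assuming the goal really is the string abcdefdefg: give the buffer room for the result and write the terminator explicitly. The helper name shift_copy and its cap parameter are illustrative and not part of the question.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy n bytes from buf+src_off to buf+dst_off inside a
 * buffer of cap bytes, then terminate the string right after the copied bytes.
 * Returns 0 on success, -1 if the copy or the terminator would not fit. */
int shift_copy(char *buf, size_t cap, size_t dst_off, size_t src_off, size_t n)
{
    if (src_off > cap || n > cap - src_off)
        return -1;                  /* source range lies outside the buffer */
    if (dst_off >= cap || n >= cap - dst_off)
        return -1;                  /* no room for n bytes plus the '\0' */
    memmove(buf + dst_off, buf + src_off, n);   /* the ranges may overlap */
    buf[dst_off + n] = '\0';
    return 0;
}
```

With char str[16] = "abcdefgh", the call shift_copy(str, sizeof str, 6, 3, 4) leaves abcdefdefg in str. With the original char str[] = "abcdefgh" (9 bytes) it returns -1 instead of writing past the end.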

Related

Problem reading two strings with getchar() and then printing those strings in C

This is my code for two functions in C:
// Begin
void readTrain(Train_t *train){
    printf("Name des Zugs:");
    char name[STR];
    getlinee(name, STR);
    strcpy(train->name, name);
    printf("Name des Drivers:");
    char namedriver[STR];
    getlinee(namedriver, STR);
    strcpy(train->driver, namedriver);
}
void getlinee(char *str, long num){
    char c;
    int i = 0;
    while(((c=getchar())!='\n') && (i<num)){
        *str = c;
        str++;
        i++;
    }
    printf("i is %d\n", i);
    *str = '\0';
    fflush(stdin);
}
// End
So, with the void getlinee(char *str, long num) function I want to read user input into a first string, char name[STR], and a second one, char namedriver[STR]. The maximum string size is STR (30 characters). If I enter more than 30 characters for the first string ("Name des Zugs"), which is stored in name[STR], then enter the second string, which is stored in namedriver, and then print the FIRST string, I do not get just the first 30 characters of my input: the second string is "attached" to it as well. I simply do not know why. Otherwise it works fine, as long as the 30-character limit is respected for the first string.
Here is my output when the input for the first string is longer than 30 characters. The problem is in line 5 ("Zugname"): why do I also get the second string when I am printing just the first one?
Name des Zugs:aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
i is 30
Name des Drivers:xxxxxxxx
i is 8
Zugname: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaxxxxxxxx
Drivername: xxxxxxxx
I think your issue is that your train->name is not properly terminated with '\0', as a consequence when you call printf("%s", train->name) the function keeps reading memory until it finds '\0'. In your case I guess your structure looks like:
struct Train_t {
    //...
    char name[STR];
    char driver[STR];
    //...
};
In the getlinee() function, you write '\0' after the last character. In particular, if the input is more than 30 characters long, you copy the first 30 characters, then add '\0' as the 31st character (name[30]). This is the first buffer overflow.
So where is this '\0' actually written? Well, at name[30], even though you're not supposed to write there. Then, if you have the structure above, strcpy(train->name, name); actually copies a 31-byte string: 30 chars into train->name, and the '\0' overflows into train->driver[0]. This is the second buffer overflow.
After this, you overwrite the train->driver buffer, so the '\0' disappears and your data in memory basically looks like:
train->name = "aaa...aaa" // no '\0' at the end so printf won't stop reading here
train->driver = "xxx\0" // but there
You have an off-by-one error on your array sizes -- you have arrays of STR chars, and you read up to STR characters into them, but then you store a NUL terminator, requiring (up to) STR + 1 bytes total. So whenever you have a max size input, you run off the end of your array(s) and get undefined behavior.
Pass STR - 1 as the second argument to getlinee for the easiest fix.
Key issues
Size test in wrong order and off-by-one. ((c=getchar())!='\n') && (i<num) --> (i+1<num) && ((c=getchar())!='\n'). Else no room for the null character. Bad form to consume an excess character here.
getlinee() should be declared before first use. Tip: Enable all compiler warnings to save time.
Other
Use int c; not char c; to properly distinguish the typical 257 different possible results from getchar().
fflush(stdin); is undefined behavior. Better code would consume excess characters in a line with other code.
void getlinee(char *str, long num) better with size_t num. size_t is the right size type for array sizing and indexing.
int i should be the same type as num.
Better code would also test for EOF.
while((i+1<num) && ((c=getchar())!='\n') && (c != EOF)){
A better design would return something from getlinee() to indicate success and identify troubles like end-of-file with nothing read, input error, too long a line and parameter trouble like str == NULL, num <= 0.
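The points above can be combined into a rough sketch. This is only one possible design, not the asker's or the answerer's code: the name read_line, the FILE * parameter, and the choice to discard the rest of an over-long line (in place of fflush(stdin)) are all assumptions made for illustration.

```c
#include <stdio.h>
#include <stddef.h>

/* Sketch of a bounded line reader (read_line is an illustrative name).
 * Stores at most size-1 characters plus '\0' in buf, reads and discards the
 * rest of an over-long line, and returns the number of characters stored.
 * Returns -1 on bad arguments, or when end-of-file or an error occurs
 * before any character was read. */
long read_line(FILE *in, char *buf, size_t size)
{
    if (in == NULL || buf == NULL || size == 0)
        return -1;

    size_t stored = 0;
    size_t seen = 0;
    int c;                               /* int, so EOF stays distinguishable */
    while ((c = getc(in)) != '\n' && c != EOF) {
        seen++;
        if (stored + 1 < size)
            buf[stored++] = (char)c;     /* keep only what fits */
        /* otherwise the excess character is dropped */
    }
    buf[stored] = '\0';                  /* always inside the array */
    if (c == EOF && seen == 0)
        return -1;
    return (long)stored;
}
```

A caller would write read_line(stdin, name, sizeof name), so the array size is passed unchanged and the function itself reserves the byte for the terminator.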
I believe you have a struct similar to this:
typedef struct train_s
{
    //...
    char name[STR];
    char driver[STR];
    //...
} Train_t;
When you attempt to write a '\0' to a string that is longer than STR (30 in this case), you actually write a '\0' to name[STR], which you don't have, since the last element of name with length STR has an index of STR-1 (29 in this case), so you are trying to write a '\0' outside your array.
And, since two strings in this struct are stored one after another, you are writing a '\0' to driver[0], which you immediately overwrite, hence when printing out name, printf doesn't find a '\0' until it reaches the end of driver, so it prints both.
Fixing this should be easy.
Just change:
while(((c=getchar())!='\n') && (i<num))
to:
while(((c=getchar())!='\n') && (i<num - 1))
Or, as I would do it, add 1 to array size:
char name[STR + 1];
char driver[STR + 1];

Weird behaviour while copying an address into a string

#include<stdio.h>
#include<malloc.h>
#include<string.h>
int main()
{
    int *p = (int *)malloc(4 * sizeof(int));
    char str1[20] ;
    char str2[20] ;
    sprintf(str1,"%20.20p",p);
    sprintf(str2,"%20.20p",p);
    printf("%d\t%20.20s\n",strlen(str1),str1);
    printf("%d\t%20.20s\n",strlen(str2),str2);
    if(strcmp(str1,str2) == 0)
        printf("SAME\n");
    else
        printf("DIFFERENT\n");
    free(p);
    return 0;
}
OUTPUT:
42 0x000000000000083bc0
22 0x000000000000083bc0
DIFFERENT
The string lengths differ on every compiler I tried, even though the pointer is the same in both calls. I am not sure why. Because the lengths differ, the strings do not match either.
It looks like you are printing a 22 character string (23 counting the trailing \0) into a pair of 20 character buffers. This means that the program is free to overwrite the end of your strings at any point, since that memory is not reserved. You will see even stranger results if you replace %20.20s with plain %s in your print statements. The fix is to declare str1[23]; str2[23]; (don't forget the trailing \0), and don't restrict the print output: use plain %s.
You got off lucky in that your program prints an output without crashing. Not allocating enough memory can cause two problems:
You overwrite something important without realizing it.
Something else overwrites your data. This can cause things like your strings suddenly having a length of thousands of characters if the terminator is overwritten.
That's because sprintf(str1,"%20.20p",p); writes more than 20 characters into str1, so your code has a buffer overflow: the first sprintf writes into str1 and spills into part of str2, and the second sprintf then overwrites that part of str2.
It actually produces something like 0x0000000000000173d010, which takes 23 bytes, including the null character at the end.
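One way to catch this kind of overflow, sketched under the assumption that C99 snprintf is available: pass the buffer size and check the return value, which reports how many characters the full text needs. The helper name format_ptr is made up for this example, and it uses plain %p because the exact text %p produces is implementation-defined.

```c
#include <stdio.h>
#include <stddef.h>

/* Illustrative helper: format a pointer into buf without ever writing more
 * than size bytes. Returns the length of the text on success, or -1 if the
 * text did not fit (buf then holds a truncated but terminated string). */
int format_ptr(char *buf, size_t size, void *p)
{
    int n = snprintf(buf, size, "%p", p);
    if (n < 0 || (size_t)n >= size)
        return -1;               /* output error, or the text was truncated */
    return n;
}
```

Calling format_ptr(str1, sizeof str1, p) and format_ptr(str2, sizeof str2, p) then either yields two identical strings or reports that the buffers are too small, instead of silently writing into neighbouring memory.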

String Initialization in c

I am quite new to C programming, so feel free to correct me, I insist. My basic understanding of strings in C is that when we initialize a string, a null character is automatically placed at the end of the string, and that this null character cannot be read or written but is used internally only.
So when I create a string of size 4 as char str[3], assign a word to it, say "RED", and print it using the puts function or printf("%s",str), I get an unusual output: RED(SMILEY FACE)
I then reduce the size of the string to char str[2], assign RED to it, compile it, and again receive an odd output: RE(SMILEY FACE)
If someone can explain this to me I will be thankful. The C code is posted below.
int main()
{
    char s1[3]="RED";
    char s2[]="RED";
    puts(s1);
    puts(s2);
    printf("%s",s1);
    return 0;
}
char s1[3] = "RED";
Is a valid statement. It copies 3 characters from the constant string literal "RED" (which is 4 characters long) into the character array s1. There is no terminating '\0' in s1, because there is no room for it.
Note the copy, because s1 is mutable, while "RED" is not. This makes the statement different from e.g. const char *s1 = "RED";, where the string is not copied.
The behavior of both puts(s1) and printf("%s", s1) is undefined. There is no terminating '\0' in s1, and treating it as a string that has one can lead to arbitrary behavior.
char s2[] = "RED";
Here, sizeof(s2) == 4, because "RED" has four characters: you need to count the trailing '\0' when calculating space.
The null character takes one extra character (byte). So you need one extra byte in addition to the number of characters in the word you are initializing with.
char s1[4]="RED"; //3 for RED and 1 for the null character
On the other hand
char s2[3]="RED";
there is no space for null character. "RED" is in there but you would encounter I/O problems when printing it as there is no null character stored at the end. Your data is stored fine but it can't be recognized properly by the printf as there is no null character.
char s2[]="RED";
This would work as memory of 4 (bytes) is automatically assigned which includes space for the terminating null character.
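To check which of these arrays actually holds a string, one option is to search for the terminator within the array bounds only. The sketch below uses the standard memchr for that; the function name holds_string is an assumption made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the size-byte array a contains a '\0' (so it holds a string),
 * 0 otherwise. memchr never reads past a[size - 1], so this is safe even
 * when the array has no terminator. */
int holds_string(const char *a, size_t size)
{
    return memchr(a, '\0', size) != NULL;
}
```

For char s1[3] = "RED" the call holds_string(s1, sizeof s1) gives 0, while for char s2[] = "RED" or char s1[4] = "RED" it gives 1, which matches the sizes discussed above (3 bytes versus 4 bytes).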

Size definition of strcat() function

The question is: why should I define the size of the string (string[] should be string[some-number])?
When the program is as follows, it gives me Abort trap: 6:
#include <stdio.h>
#include <string.h>
int main(void)
{
    char buffer1[] = "computer";
    char string[]="program";
    strcat( buffer1, string );
    printf( "buffer1 = %s\n", buffer1 );
}
This is the program from http://www.tutorialspoint.com/cprogramming/c_data_types.htm it works fine:
#include <stdio.h>
#include <string.h>
int main ()
{
    char str1[12] = "Hello";
    char str2[12] = "World";
    char str3[12];
    int len ;
    /* copy str1 into str3 */
    strcpy(str3, str1);
    printf("strcpy( str3, str1) : %s\n", str3 );
    /* concatenates str1 and str2 */
    strcat( str1, str2);
    printf("strcat( str1, str2): %s\n", str1 );
    /* total length of str1 after concatenation */
    len = strlen(str1);
    printf("strlen(str1) : %d\n", len );
    return 0;
}
What is the mistake? Even if I define all of the sizes of strings in my program, my code still gives Abort trap:6?
From the man page of strcat:
DESCRIPTION
The strcat() function appends the src string to the dest string, overwriting the terminating
null byte ('\0') at the end of dest, and then adds a terminating null byte. The
strings may not overlap, and the dest string must have enough space for the result. If
dest is not large enough, program behavior is unpredictable; buffer overruns are a
favorite avenue for attacking secure programs.
When you declare your strings, the compiler allocates exactly the size of each initializer: 9 bytes for buffer1 and 8 bytes for string (including the '\0').
Thus, the strcat result needs 9 - 1 + 8 bytes (i.e. 16 bytes), but only 9 are available.
Your strcat overflows buffer1, which can hold only strlen("computer")+1 bytes. Omitting the array size does not make the array "dynamic"! When you specify the size of the array, you reserve as many bytes as you want; you still need to avoid overflowing the buffer, of course.
So,
strcpy(str3, str1);
and
strcat( str1, str2);
are OK since str3 is large enough for str1, and str1 is large enough for strlen(str1) + strlen(str2) + 1 bytes, i.e. exactly 11: 5 (Hello) + 5 (World) + 1 (terminator). The magic number 12 was chosen for a reason: it is big enough to hold both strings and a terminator.
About C strings
C strings are arrays of chars in which the last element is the null character, '\0', i.e. a char with the value 0. This terminator is needed so that string functions can tell where the string ends.
If it happens that a null byte is found in the middle of a string, from the point of view of C string functions, the string will end at that point. E.g.
char buffer1[] = "computer\0program";
// array: { 'c', 'o', ... '\0', 'p', 'r', 'o', .., 'm', '\0' }
// ...
printf("%s\n", buffer1);
will print computer only. But at this point the buffer will be big enough to hold computer and program, a terminator (and another extra byte), since the compiler computed the size of the char array considering the literal sequence of characters which syntactically ends at the second ".
But for all C-string functions, the string contained in buffer1 is computer. Note also that sizeof buffer1 will give the correct size of the buffer, i.e. 17, opposed to the result of strlen(buffer1) which is just 8.
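The difference between the two measurements can be checked directly. The sketch below wraps them in small functions; the function names are invented for this example, and the array is made const and file-scope only so that the functions can refer to it.

```c
#include <stddef.h>
#include <string.h>

/* The literal contains an embedded '\0'. The array is sized from the whole
 * literal, not from the first terminator. */
static const char buffer1[] = "computer\0program";

/* Size of the array in bytes: 8 + 1 + 7 + 1. */
size_t buffer1_size(void) { return sizeof buffer1; }

/* Length as seen by the C string functions: stops at the first '\0'. */
size_t buffer1_length(void) { return strlen(buffer1); }

/* The bytes after the first terminator are still there and form a second
 * string that starts one past the first '\0'. */
const char *buffer1_tail(void) { return buffer1 + strlen(buffer1) + 1; }
```

Here buffer1_size() yields 17 and buffer1_length() yields 8, and buffer1_tail() points at "program".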
The first parameter of strcat is used to store the result, so it must have enough space for the concatenated string.
In your code:
char buffer1[] = "computer";
is equivalent to:
char buffer1[9] = "computer";
This defines a char array with just enough space for the string "computer", but not enough space for the result.
char buffer1[] = "computer";
Creates a buffer big enough to hold 9 characters (strlen("computer") + 1 byte for \0). If you write any more data to it, what you end up with is undefined behavior (UB). This is what happens when you do a strcat.
UB means the program might crash or show literally any behavior. You are rather lucky when a program with UB crashes, because it does not have to; if it does, at least there is an indication that something is wrong. Most of the time, programs with UB will continue running, seemingly correctly, and crash when you least expect or want them to.
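A hedged sketch of a concatenation that refuses to overflow: the caller passes the capacity of the destination array, and the helper checks that the result plus its terminator fits before copying. The name append and its return convention are assumptions for this example, not a standard library function.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative helper: append src to the string in dest, where dest is an
 * array of cap bytes that already holds a terminated string. Returns 0 on
 * success, or -1 (leaving dest unchanged) if the result plus its '\0' would
 * not fit. */
int append(char *dest, size_t cap, const char *src)
{
    size_t used = strlen(dest);
    size_t add = strlen(src);
    if (used >= cap || add >= cap - used)
        return -1;
    memcpy(dest + used, src, add + 1);   /* + 1 copies the terminator too */
    return 0;
}
```

With char buffer1[] = "computer" the call append(buffer1, sizeof buffer1, "program") returns -1, because 16 bytes are needed and only 9 exist. With char buffer1[16] = "computer" it succeeds and buffer1 holds computerprogram.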

Why does printf concatenate two variables when outputing, but only if the length of the string is not specified?

Anyone know why printf concatenates these two variables when outputting, but only if the length of the string is not specified?
#include <stdio.h>
int main(){
    char myname[3] = "tim";
    char myage[3] = "ten";
    printf("myname is:%s \n", myname);
    printf("myage is:%s \n", myage);
}
myname is:tim
myage is:tentim
...But when I don't specify the length of the strings it seems to work as I had expected, without printing both variables.
#include <stdio.h>
int main(){
    char myname[] = "tim";
    char myage[] = "ten";
    printf("myname is:%s \n", myname);
    printf("myage is:%s \n", myage);
}
myname is:tim
myage is:ten
You declare the array to have size 3 but you try to store 4 elements in it. Since there is only enough memory for 3 elements, there is no room for the last element (the string's null terminator \0), which leaves your character array without a null terminator.
Note that character arrays in C are expected to be null terminated so that you can print them using printf. This is because printf simply walks through the character array until it encounters a \0. In your first example the array was never \0 terminated, so what you get is undefined behavior. (In practice, printf will keep printing until it encounters a \0, reading beyond the bounds of the memory allocated to the array in the process.)
In the second case, since you do not specify the size yourself, an appropriate size is chosen from the number of elements in the string literal, i.e. 4, and the \0 terminator is in place.
You are not leaving enough room in your array for the null terminator. In C, when you initialize a char array with a string of the exact same length, the null terminator is dropped.
char myname[3] = "tim"; // equivalent to char myname[3] = {'t','i','m'};
char myage[3] = "ten"; // equivalent to char myage[3] = {'t','e','n'};
Without the null terminator, the printf function doesn't know when to stop printing your string, so it keeps going to the next memory location after your myage array, which just happens to be the storage for your myname array. The stack probably looks like this:
t <- beginning of myage
e
n
t <- beginning of myname
i
m
\0 <- a null terminator, by coincidence.
The fact that you don't get other garbage after the name is just a coincidence. Anything might be stored after your myname array, but in your case it was a null character, so printf stopped printing.
If you don't specify a size for your array, then a size is chosen that is one greater than the length of the string so that the null terminator can be stored:
char myname[] = "tim"; // equivalent to char myname[4] = {'t','i','m','\0'};
char myage[] = "ten"; // equivalent to char myage[4] = {'t','e','n','\0'};
Now your null terminators are put in place explicitly, and your stack looks like this:
t <- beginning of myage
e
n
\0 <- explicit null terminator
t <- beginning of myname
i
m
\0 <- explicit null terminator.
Now the printf function knows exactly when to stop printing.
The %s directive corresponds to an argument that points to a string. A string is a sequence of characters that ends at the first '\0'. However, you aren't giving the arrays in the first example enough space for a '\0', so those arrays don't contain strings.
printf thinks that a string exists, and continues printing characters until it comes to that '\0' character which belongs at the end of a string. As previously stated, there is no '\0' character because there isn't space for one. Your code causes printf to access bytes outside of the bounds of your arrays, which is undefined behaviour.
myname[3] and myage[3] are supposed to have room for the terminating \0. Thus, each array can actually hold a string of only 2 characters.
In the second case the compiler automatically sets the size to 4, which is enough to store the strings.
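If an array without a terminator has to be printed anyway, a precision on %s limits how many bytes are read, so the array does not need a '\0'. The sketch below uses that with snprintf to build a proper string; the helper name array_to_string is made up for this example.

```c
#include <stdio.h>
#include <stddef.h>

/* Illustrative helper: copy the n bytes of a possibly unterminated array
 * into out as a proper string. "%.*s" reads at most n bytes from arr, so
 * arr does not need a '\0'. Returns the resulting length, or -1 if out is
 * too small to hold the text plus its terminator. */
int array_to_string(char *out, size_t out_size, const char *arr, size_t n)
{
    int len = snprintf(out, out_size, "%.*s", (int)n, arr);
    if (len < 0 || (size_t)len >= out_size)
        return -1;
    return len;
}
```

The same idea works directly in the question's code: printf("myage is:%.3s \n", myage); reads only the three bytes of myage and stops there.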
