printf prints a weird character after malloc [duplicate] - c

I'm trying to append to a string, so I start with malloc and then append to that buffer.
char * loc,*X,*Y;
X = "4";
Y = "8";
loc = (char *)malloc(strlen(X)+strlen(Y)+6); //+2 for "->", +2 for "()", +1 for the comma between, and +1 for '\0'.
strcat(loc,"->");
strcat(loc,"(");
strcat(loc,X);
strcat(loc,",");
strcat(loc,Y);
strcat(loc,")");
printf("%s\n", loc);
So when I run it I expect to see:
->(4,8)
Instead there is a weird character in the beginning of the string and I see this:
└->(4,8)
If I clean the string after malloc with strcpy(loc,"") it's not there.
Why does this
└
appear in the first place??

The malloc function doesn't initialize the memory it allocates. Its contents are indeterminate, so you don't know where, or even whether, there is a string terminator in that memory.
The strcpy function doesn't care about the existing contents, and will write a terminator. The strcat function on the other hand relies on finding a string terminator to know where it should start writing, but as we already established there might not even be a terminator in the memory.
So you have four choices:
Use strcpy as the first call, instead of strcat.
Explicitly set the first element to a terminator, as in loc[0] = '\0'
Use calloc which initializes the memory to zero, which just happens to be the same as the string terminator.
Use snprintf instead.
I recommend choice four.
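A minimal sketch of choice four, assuming the same inputs as in the question. The helper name make_pair_string is hypothetical and only serves the illustration:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the question): builds "->(X,Y)" in a
   freshly allocated buffer. The caller must free() the result. */
char *make_pair_string(const char *x, const char *y)
{
    /* 5 literal characters ("->", "(", ",", ")") plus 1 for '\0' */
    size_t size = strlen(x) + strlen(y) + 6;
    char *loc = malloc(size);

    if (loc == NULL)
        return NULL;

    /* snprintf starts writing at the beginning of the buffer, so the
       indeterminate contents left by malloc never matter. It writes at
       most size bytes and terminates the result when size > 0. */
    snprintf(loc, size, "->(%s,%s)", x, y);
    return loc;
}
```

Calling make_pair_string("4", "8") should yield the string ->(4,8) with no stray leading character, because nothing ever has to search the buffer for an existing terminator.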

As it is now, your code accesses uninitialized memory, invoking undefined behavior. There could be anything in that buffer.
You need to initialize the memory. Since you're treating the buffer as a C string, it will suffice to simply set the first byte to the NUL terminator, \0.
*loc = '\0';
Now you have a valid string of length zero.

Related

When we call (char*)malloc(sizeof(char)) to allocate memory for a string, is it initialized? How to initialize?

char* str = (char*)malloc(100*sizeof(char));
strcpy(str, ""); //Does this line initialize str to an empty string?
After calling line 1, does the allocated memory contain garbage? What about after calling line 2?
After calling line 1, does the allocated memory contain garbage?
It can contain anything, since the standard doesn't require malloc to initialize the memory, and most implementations therefore don't. It will most likely just contain whatever data the previous "user" of that memory put there.
What about after calling line 2?
With that instruction you're copying a \0 character with "byte value" 0 to the first byte of the allocated memory. Everything else is still untouched. You could as well have done str[0] = '\0' or even *str = '\0'. All of these options make str point at an "empty string".
Note also that, since you tagged the question C and not C++, casting the return value from malloc is redundant.
malloc just reserves a block of memory and returns a pointer to it. Initialization will not happen. You might get junk values if the same memory was occupied by other data before.
Yes, malloc returns uninitialized memory, and yes, your second line of code initializes your memory to an empty string.
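As a small sketch of the point above: only the first byte needs to be set for the buffer to hold a valid empty string. The helper name new_empty_string is an assumption made for this example:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a buffer of `size` bytes that holds an
   empty string, or NULL if size is 0 or the allocation fails.
   The caller must free() the result. */
char *new_empty_string(size_t size)
{
    if (size == 0)
        return NULL;            /* no room even for the terminator */

    char *str = malloc(size);   /* contents are indeterminate here */
    if (str == NULL)
        return NULL;

    str[0] = '\0';              /* same effect as strcpy(str, "") */
    return str;                 /* bytes 1..size-1 are still untouched */
}
```

After this call, strlen on the result is 0 and functions such as strcat can safely append to it, as long as the total stays within size bytes.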

C - strcpy with malloc size less than argument's size [duplicate]

char* init_array()
{
const int size = 5;
char *p = (char*) malloc(size * sizeof(char));
strcpy(p, "Hello, world! How are you?");
return p;
}
with size = 5, malloc should get 5 free chars from memory, but given string does not fit into 5 chars, yet it works.
My question is why? At first I thought the result would get truncated, but p is the full string, not just "Hello" or "Hell\0".
I'm using GCC on Linux. Is it related to the compiler or it is standard stuff?
It's called undefined behavior, and because it's undefined, it sometimes appears to work. Yes, you can write past the end of a memory block in C, but doing so is invalid because it invokes undefined behavior: the result is not predictable and your program might or might not work.
What you expect from strcpy() doesn't happen because strcpy() copies every character up to and including the terminating '\0' byte. It doesn't care whether the destination buffer is large enough; that is your responsibility.
If you want to copy an exact number of bytes (let's say 5) you can use
memcpy(p, "Hello, this string is very large but it doesn't matter", 5);
but beware that p is not a valid string after that, because it has no terminating '\0'.
Your code also shows two other bad practices that are common among new C programmers:
You don't need to cast the return value from malloc().
You don't need to use sizeof(char) because it's 1 by definition.
So,
p = malloc(size);
should be enough to allocate space for a string of size - 1 characters.
What you are experiencing is a buffer overflow
In short, you write to invalid memory addresses invoking Undefined Behavior. Anything can happen.
It seems that it worked but in fact your code invokes undefined behavior. You are writing data to unallocated memory location.
You should note that in
strcpy(str1, str2);
strcpy has no way to check whether the string pointed to by str2 will fit into the character array str1. In your case it will continue to copy the characters from "Hello, world! How are you?" past the end of the array p points to.
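If truncation is what you actually want, you have to do it yourself. The following is only a sketch under that assumption; the helper name copy_truncated is made up for this example:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocates `size` bytes and copies at most
   size - 1 characters of src into them, always adding the terminator.
   Returns NULL if size is 0 or the allocation fails. */
char *copy_truncated(const char *src, size_t size)
{
    if (size == 0)
        return NULL;

    char *p = malloc(size);
    if (p == NULL)
        return NULL;

    size_t n = strlen(src);
    if (n > size - 1)
        n = size - 1;     /* leave one byte for '\0' */

    memcpy(p, src, n);    /* copy only what fits */
    p[n] = '\0';          /* memcpy does not terminate for us */
    return p;
}
```

With size 5, only four characters of the long greeting fit, followed by the terminator, so nothing is written outside the allocated block.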

Is this code correct? If yes, then malloc is already assigning the addresses to the name[i] variable, so why is strcpy used?

Following is the piece of code
char str[20];
char *name[5];
for(i=0;i<5;i++){
printf("Enter a string");
gets(str);
name[i]=(char *)malloc(strlen(str));
strcpy(name[i],str);
}
When in line 5 address of each string(denoted by str variable) is stored in name[i] array, then why this code is copying each address into name[i] using strcpy()?
is this code correct?
Sorry, No. Please follow the below mentioned points.
Point 1
Please do not cast the return value of malloc() and family in C.
Point 2
malloc() is to allocate memory to the pointer. strcpy() is to fill the allocated memory. If you compare the code,
name[i]=malloc(<size>);
allocates size bytes of memory and stores its address in the name[i] pointer, but the contents of that memory are uninitialized (garbage).
strcpy(name[i],str);
copies the contents of str to the memory name[i] points to. After this, name[i] contains the same string as str.
Note:
That said, to strcpy() a string str, you need to malloc() for strlen(str) + 1 bytes, to have space for terminating null. Otherwise, you'll end up overrunning the allocated memory area which in turn invokes undefined behaviour.
Also, you should (IMHO, MUST) consider using fgets() over gets().
The strcpy() call copies the characters, not the pointer.
Also, you are under-allocating since you fail to include space for the terminating '\0' character. Thus, your code has undefined behavior.
So no, it's not correct (but the problem is not that it uses strcpy(), that's fine).
And perhaps it's not surprising that I too think that you should not cast the return value of malloc() in C.
Finally, you should never use gets(), it's very dangerous. Use fgets() instead, with a proper buffer size argument of course.
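Putting the points from these answers together, one possible shape for the loop body is sketched below. The helper name read_line_copy is hypothetical, and stripping the newline that fgets keeps is an added assumption about what the caller wants:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads one line from `in` and returns a heap copy
   without the trailing newline. Returns NULL on end of input, read
   error, or allocation failure. The caller must free() the result. */
char *read_line_copy(FILE *in)
{
    char str[20];

    /* fgets stores at most sizeof str - 1 characters plus '\0',
       so it cannot overflow str the way gets() can */
    if (fgets(str, sizeof str, in) == NULL)
        return NULL;

    str[strcspn(str, "\n")] = '\0';        /* drop the newline, if any */

    char *copy = malloc(strlen(str) + 1);  /* +1 for the terminator */
    if (copy == NULL)
        return NULL;

    strcpy(copy, str);                     /* copies characters, not the pointer */
    return copy;
}
```

In the original loop this would be used as name[i] = read_line_copy(stdin);. Note that a line longer than 19 characters is split across successive calls in this sketch.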
When you use malloc, you create a space in memory that equals the size of the string, but is an empty space, you have only an address.
You have to copy the value on the string to the name[i] array.
An analogy is, you have a pot with water, you can create another pot, but you only will have water on it, if you transfer from one to another.
the creation of the pot is the malloc function and the transfer of the contents is the strcpy.
char str[7] = "abcdef"; //create space for 6 characters plus the
//terminating null character and fill it
char *name[1]; //create an array of one pointer to where
//the copy will be stored; does not
//allocate any space for characters
name[0]=malloc(strlen(str)+1); //name[0] = _ _ _ _ _ _ _
//allocate space for a char array with
//size equal to the length of str plus 1
strcpy(name[0],str); //name[0] = a b c d e f \0
//copy the letters from one char
//array to the other
The character array holds 6 characters plus a null character to indicate the end of the string.
Line 5 merely allocates space. The memory allocated by malloc() has unspecified values.
TL;DR;
malloc assigns memory for the process to use.
strcpy copies the required content into the malloced address space.

having memcpy problem

char *a=NULL;
char *s=NULL;
a=(char *)calloc(1,(sizeof(char)));
s=(char *)calloc(1,(sizeof(char)));
a="DATA";
memcpy(s,a,(strlen(a)));
printf("%s",s);
Can you plz tell me why its printing DATA½½½½½½½½■ε■????How to print only DATA?? Thanks
Strings in C are terminated by a zero character value (nul).
strlen returns the number of characters before the zero.
So you are not copying the zero.
printf keeps going, printing whatever is in the memory after s until it hits a zero.
You also are only creating a buffer of size 1, so you are writing data over whatever is after s, and you leak the memory calloc'd to a before you set a to be a literal.
Allocate the memory for s after finding the length of the string, allocating one more byte to include the nul terminator, then copy a into s. You don't need to allocate anything for a as the C runtime looks after storing the literal "DATA".
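The steps just described can be sketched as follows. The helper name copy_with_terminator is an assumption for illustration only:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of the string a, including
   its terminator, or NULL if the allocation fails.
   The caller must free() the result. */
char *copy_with_terminator(const char *a)
{
    size_t len = strlen(a);      /* characters before the '\0' */
    char *s = malloc(len + 1);   /* one extra byte for the '\0' */

    if (s == NULL)
        return NULL;

    memcpy(s, a, len + 1);       /* copy the characters and the '\0' */
    return s;
}
```

Here a can simply point at the literal, as in const char *a = "DATA";, with no allocation for a at all. Printing the result with printf("%s", s) then stops right after DATA.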
strlen does only count the chars without the terminator '\0'.
Without this terminator printf does not know where the string ends.
Solution:
memcpy(s,a,(strlen(a)+1));
You are first allocating memory, then throwing that memory away by re-assigning the pointer using a string literal. Your arguments to calloc() also look very wrong.
Also, memcpy() is not a string copying function: it copies exactly the number of bytes you give it, so with strlen(a) the terminator is not included. You should use strcpy().
The best way to print only DATA would seem to be
puts("DATA");
You need to be more clear on what you want to do, to get help with the pointers/allocations/copying.
Your
a="DATA";
trashes the pointer to the allocated memory. It does not copy "DATA" into the memory. Which however would be not enough to store it, since
a=(char *)calloc(1,(sizeof(char)));
allocates a single char. While
memcpy(s,a,(strlen(a)));
copies what is pointed now by a (string literal "DATA") to the memory which is pointed by s. But again, s points to a single char allocated, and copying more than 1 char will overwrite something and results in a bug.
strlen(a) gives you 4 (the length of "DATA") and memcpy copies exactly 4 char. But to know where a string ends, C uses the convention to put a final "null" char ('\0') to its end. So indeed "DATA" is, in memory, 'D' 'A' 'T' 'A' '\0'.
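All string-related functions expect the null byte; printing functions such as printf with %s don't stop until they find it.
To copy strings, use strcpy instead: it copies the string together with its final null byte. (strcpy can overflow the destination buffer if it is too small, and strncpy, the bounded alternative, does not add the terminator when the source is at least as long as the limit, so either way the destination must be large enough.)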
But the biggest problem I can see here is that you reserve a single char only for a (and you trash it then) and s, so DATA\0 won't fit anywhere.
You are reserving space for 1 character so you are actually using the memory of some other variable when you are writing "DATA" (which is 4 characters + the trailing \0 to mark the end of the string).
a=(char *)calloc(1,(sizeof(char)));
For this example you would need 5 characters or more:
a=(char *)calloc(5, (sizeof(char)));
You need to store a terminating \0 after that DATA string so printf() will know to stop printing.
You could replace memcpy with strcat:
strcat(s, a);
should do it.
Note, however, that there's a bug earlier on:
calloc(1,sizeof(char))
will only allocate a single byte! That's certainly not enough! Depending on the implementation, your program may or may not crash.

I'm new to C, can someone explain why the size of this string can change?

I have never really done much C but am starting to play around with it. I am writing little snippets like the one below to try to understand the usage and behaviour of key constructs/functions in C. I wrote the one below to understand the difference between char* string and char string[] and how the lengths of strings work. Furthermore, I wanted to see if sprintf could be used to concatenate two strings and store the result in a third string.
What I discovered was that the third string I was using to store the concatenation of the other two had to be set with char string[] syntax or the binary would die with SIGSEGV (Address boundary error). Setting it using the array syntax required a size so I initially started by setting it to the combined size of the other two strings. This seemed to let me perform the concatenation well enough.
Out of curiosity, though, I tried expanding the "concatenated" string to be longer than the size I had allocated. Much to my surprise, it still worked and the string size increased and could be printf'd fine.
My question is: Why does this happen, is it invalid or have risks/drawbacks? Furthermore, why is char str3[length3] valid but char str3[7] causes "SIGABRT (Abort)" when sprintf line tries to execute?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void main() {
char* str1 = "Sup";
char* str2 = "Dood";
int length1 = strlen(str1);
int length2 = strlen(str2);
int length3 = length1 + length2;
char str3[length3];
//char str3[7];
printf("%s (length %d)\n", str1, length1); // Sup (length 3)
printf("%s (length %d)\n", str2, length2); // Dood (length 4)
printf("total length: %d\n", length3); // total length: 7
printf("str3 length: %d\n", (int)strlen(str3)); // str3 length: 6
sprintf(str3, "%s<-------------------->%s", str1, str2);
printf("%s\n", str3); // Sup<-------------------->Dood
printf("str3 length after sprintf: %d\n", // str3 length after sprintf: 29
(int)strlen(str3));
}
This line is wrong:
char str3[length3];
You're not taking the terminating zero into account. It should be:
char str3[length3+1];
You're also trying to get the length of str3, while it hasn't been set yet.
In addition, this line:
sprintf(str3, "%s<-------------------->%s", str1, str2);
will overflow the buffer you allocated for str3. Make sure you allocate enough space to hold the complete string, including the terminating zero.
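One way to make sure the buffer is large enough is to let snprintf measure the formatted result first (standard since C99: called with a NULL buffer and size 0, it returns the length it would have written). This is a sketch, with join_with_arrow and JOIN_FMT as hypothetical names:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Same format string as in the question */
#define JOIN_FMT "%s<-------------------->%s"

/* Hypothetical helper: formats str1 and str2 into a buffer that is
   sized from the actual formatted length. The caller must free() it. */
char *join_with_arrow(const char *str1, const char *str2)
{
    /* First pass: nothing is written, only the length is returned
       (not counting the terminating '\0') */
    int needed = snprintf(NULL, 0, JOIN_FMT, str1, str2);
    if (needed < 0)
        return NULL;

    char *str3 = malloc((size_t)needed + 1);   /* +1 for '\0' */
    if (str3 == NULL)
        return NULL;

    /* Second pass: write into a buffer known to be big enough */
    snprintf(str3, (size_t)needed + 1, JOIN_FMT, str1, str2);
    return str3;
}
```

This avoids having to count the characters of the separator by hand, which is exactly where the original length calculation went wrong.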
void main() {
char* str1 = "Sup"; // a pointer to the statically allocated sequence of characters {'S', 'u', 'p', '\0' }
char* str2 = "Dood"; // a pointer to the statically allocated sequence of characters {'D', 'o', 'o', 'd', '\0' }
int length1 = strlen(str1); // the length of str1 without the terminating \0 == 3
int length2 = strlen(str2); // the length of str2 without the terminating \0 == 4
int length3 = length1 + length2;
char str3[length3]; // declare an array of 7 characters, uninitialized
So far so good. Now:
printf("str3 length: %d\n", (int)strlen(str3)); // What is the length of str3? str3 is uninitialized!
C is a primitive language. It doesn't have strings. What it does have is arrays and pointers. A string is a convention, not a datatype. By convention, people agree that "an array of chars is a string, and the string ends at the first null character". All the C string functions follow this convention, but it is a convention. It is simply assumed that you follow it, or the string functions will break.
So str3 is not a 7-character string. It is an array of 7 characters. If you pass it to a function which expects a string, then that function will look for a '\0' to find the end of the string. str3 was never initialized, so it contains random garbage. In your case, apparently, there was a '\0' after the 6th character so strlen returns 6, but that's not guaranteed. If it hadn't been there, then it would have read past the end of the array.
sprintf(str3, "%s<-------------------->%s", str1, str2);
And here it goes wrong again. You are trying to copy the string "Sup<-------------------->Dood\0" into an array of 7 characters. That won't fit. Of course the C function doesn't know this, it just copies past the end of the array. Undefined behavior, and will probably crash.
printf("%s\n", str3); // Sup<-------------------->Dood
And here you try to print the string stored at str3. printf is a string function. It doesn't care (or know) about the size of your array. It is given a string, and, like all other string functions, determines the length of the string by looking for a '\0'.
Instead of trying to learn C by trial and error, I suggest that you go to your local bookshop and buy an "introduction to C programming" book. You'll end up knowing the language a lot better that way.
There is nothing more dangerous than a programmer who half understands C!
What you have to understand is that C doesn't actually have strings, it has character arrays. Moreover, the character arrays don't have associated length information -- instead, string length is determined by iterating over the characters until a null byte is encountered. This implies, that every char array should be at least strlen + 1 characters in length.
C doesn't perform array bounds checking. This means that the functions you call blindly trust you to have allocated enough space for your strings. When that isn't the case, you may end up writing beyond the bounds of the memory you allocated for your string. For a stack allocated char array, you'll overwrite the values of local variables. For heap-allocated char arrays, you may write beyond the memory area of your application. In either case, the best case is you'll error out immediately, and the worst case is that things appear to be working, but actually aren't.
As for the assignment, you can't write something like this:
char *str;
sprintf(str, ...);
and expect it to work -- str is an uninitialized pointer, so the value is "not defined", which in practice means "garbage". Pointers are memory addresses, so an attempt to write to an uninitialized pointer is an attempt to write to a random memory location. Not a good idea. Instead, what you want to do is something like:
char *str = malloc(sizeof(char) * (string_length + 1));
which allocates string_length + 1 characters' worth of storage (string_length stands for the length of the string you want to store) and stores the pointer to that storage in str. Of course, to be safe, you should check whether or not malloc returns null. And when you're done, you need to call free(str).
The reason your code works with the array syntax is because the array, being a local variable, is automatically allocated, so there's actually a free slice of memory there. That's (usually) not the case with an uninitialized pointer.
As for the question of how the size of a string can change, once you understand the bit about null bytes, it becomes obvious: all you need to do to change the size of a string is futz with the null byte. For example:
char str[] = "Foo bar";
str[1] = (char)0; // I'd use the character literal, but this editor won't let me
At this point, the length of the string as reported by strlen will be exactly 1. Or:
char str[] = "Foo bar";
str[7] = '!';
after which strlen will probably crash, because it will keep trying to read more bytes from beyond the array boundary. It might encounter a null byte and then stop (and of course, return the wrong string length), or it might crash.
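The safe half of this idea, shortening a string by writing a null byte inside the array, can be sketched like this. The helper name truncate_string is made up for the example:

```c
#include <string.h>

/* Hypothetical helper: shortens the string stored in str to at most n
   characters by writing a null byte, and returns the new length.
   The array that holds the string keeps its original size. */
size_t truncate_string(char *str, size_t n)
{
    if (n < strlen(str))
        str[n] = '\0';   /* the string now ends here */

    return strlen(str);
}
```

For char str[] = "Foo bar"; the array always occupies 8 bytes, but after truncate_string(str, 3) the string functions see only "Foo". The bytes after the new terminator are still there; they are just no longer part of the string.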
I've written all of one C program, so expect this answer to be inaccurate and incomplete in a number of ways, which will undoubtedly be pointed out in the comments. ;-)
Your str3 is too short - you need to add extra byte for null-terminator and the length of "<-------------------->" string literal.
Out of curiosity, though, I tried
expanding the "concatenated" string to
be longer than the size I had
allocated. Much to my surprise, it
still worked and the string size
increased and could be printf'd fine.
The behaviour is undefined so it may or may not segfault.
strlen returns the length of the string without the trailing null byte (\0, 0x00), but when you create a variable to hold the combined strings you need to add that 1 character.
char str3[length3 + 1];
…and you should be all set.
C strings are '\0'-terminated and require an extra byte for that, so at the very least you need
char str3[length3 + 1]
for the two strings alone to fit.
In sprintf() you are writing beyond the space allocated for str3. This may cause any type of undefined behavior (if you are lucky, it will crash). strlen() just searches for a null character starting from the memory location you specified, and it happens to find one at the 29th position. It could just as well be the 129th, i.e. it will behave very erratically.
A few important points:
Just because it works doesn't mean it's safe. Going past the end of a buffer is always unsafe, and even if it works on your computer, it may fail under a different OS, different compiler, or even a second run.
I suggest you think of a char array as a container and a string as an object that is stored inside the container. In this case, the container must be 1 character longer than the object it holds, since a "null character" is required to indicate the end of the object. The container is a fixed size, and the object can change size (by moving the null character).
The first null character in the array indicates the end of the string. The remainder of the array is unused.
You can store different things in a char array (such as a sequence of numbers). It just depends on how you use it. But string functions such as printf() or strcat() assume that there is a null-terminated string to be found there.
