Assignments to malloced string behaving weird

Here is the code (it's C). Please also explain what value = "string" produces.
char * p = (char*) malloc(sizeof(char) * 100);
p = "hello";
*(p+1) = '1';
printf("%s", p);
free(p);

Notice that p = "hello"; does not copy any string, but just sets the pointer p to become the address of the 6 bytes literal string "hello". To copy a string use strncpy or strcpy (read strncpy(3) ...) but be scared of buffer overflows.
So you do
char * p = malloc(100);
which allocates a memory zone capable of holding 100 bytes. Let's pretend that malloc succeeded (read malloc(3)...) and returned the address 0x123456 for example (often, that concrete address is not reproducible from one run to the next, e.g. because of ASLR).
Then you assign p = "hello"; so you forgot the address 0x123456 (you've got now a memory leak), and you put in p the address of the 6 bytes literal string "hello" (let's imagine it is 0x2468a).
Later the machine executes the code for *(p+1) = '1'; so you are trying to replace the e character (at address 0x2468b) inside literal "hello" by 1. You get a segmentation violation (or some other undefined behavior), since that literal string sits in constant read only memory (e.g. the text segment of your executable, at least on Linux).
A better code might be:
#define MY_BUFFER_LEN 100
char *p = malloc(MY_BUFFER_LEN);
if (!p) { perror("malloc for p"); exit(EXIT_FAILURE); }
strncpy(p, "hello", MY_BUFFER_LEN); /* strcpy(p, "hello") would also do, since it fits */
*(p+1) = '1';
printf("%s", p);
free(p);
and if malloc succeeds, that would eventually output h1llo (stdout is buffered, so the output appears only when it gets flushed, e.g. by some later call to fflush or at normal program exit). So don't forget to call
fflush(NULL);
after the previous code chunk. Read perror(3), fflush(3).
General advice: read the documentation of every function that you are using (even printf(3) ...).
Regarding printf and related functions, since stdout is often line buffered, in practice I strongly recommend ending every format control string with \n, e.g. printf("%s\n", p); in your code. When you don't do that (there are some cases where you don't want to...), think twice and perhaps comment your code.
Don't forget to compile with all warnings and debug info (e.g. gcc -Wall -Wextra -g), then learn how to use the debugger (e.g. gdb).

Your code doesn't do what you think it does.
"hello" is behind the scenes a pointer to a static array of six chars, most likely write protected.
When you assign it to p, the pointer that malloc returned is lost, instead p now contains a pointer to that static array of six chars.
The assignment through p+1 may crash, or may not crash, but whatever it does, it is undefined behaviour and will cause trouble.
free (p) tries to free a static array of six chars. That isn't going to work. Again, undefined behaviour, and an immediate crash if you are lucky.

There are problems in nearly each line:
1) In the first line you assign to p the address of newly allocated memory, which is OK.
2) In the next line you overwrite it with the address of a static string, which is bad: the allocated memory is "lost", causing a memory leak.
3) In the third line you try to overwrite something inside the static string, which might be read-only, which is bad.
4) In the last line you try to free the memory at the string's location, which was never allocated by malloc, so this is undefined behavior.

You need to understand what pointers are.
In C, there is no variable type that can store a string. Instead, strings are stored in arrays of chars. To handle such an array, you use a variable (called a pointer) that stores the memory address of the first element of the array. Strings also end with a special character (the nul character) that indicates where they stop, but that's not relevant here.
In the code you posted, you allocate memory to store 100 chars (this is always 100 bytes, since sizeof(char) is 1) and get a pointer, called p, to the first element of this memory region (for further reading: Do I cast the result of malloc?).
A string literal is stored in its own memory region, set aside when the program is built. When you assign it to p, p points to the first element of that region (so you've lost the memory allocated by malloc: a memory leak). Next, you try to modify the string pointed at by p. This won't work, as string literals are stored as constant strings. After that, you try to free the memory of this string, which is not possible. These mistakes are undefined behavior that typically shows up as runtime errors, and they are not very easy to detect and fix. To achieve what you want, use a function like strcpy to copy the string from the read-only memory region into the memory your pointer refers to:
strcpy(p, "string");

Strings in C are arrays, and you can't use = to assign values to arrays, only to indexed elements of arrays. For copying string data, use strcpy() or strncpy(). Your code could look like:
char *p = (char*) malloc(sizeof(char) * 100);
strcpy(p, "hello");
*(p+1) = '1';
printf("%s", p);
free(p);
Try that. The difference is that p = "hello" will set the pointer p to point to the constant string "hello", overwriting the pointer to the 100-byte memory block you just allocated. The free(p); call below will fail because you're passing a pointer not returned by an allocation function, even if the system you're using doesn't give a write fault on attempting to modify constant data.
If you ever do need to point a pointer at a constant string (it happens), be sure you declare the pointer as const char *. This is required in C++. C does not require it, because string literals there have type array of char, but it's a Real Good Idea in any case.

1) In C it is unnecessary to cast the return value of malloc() (in C++ the cast is required).
2) Once the memory is allocated, you do not put a string into it by assigning to p with =.
strcpy(p, "hello"); is better.
3) Once the string is stored correctly with strcpy(), *(p+1) = '1' is the same as p[1] = '1', and will result in "h1llo".
By the way, using the line *(p+1) = '1' AFTER using p = "hello" to assign the value hello (instead of strcpy()) will cause problems, as described by @gnasher, @Basile and others.

Related

What happens with memory when we reassign value to char pointer?

I wonder what happens inside the memory when we do something like this
char *s;
s = "Text";
s = "Another Text";
If I'm getting it right, assigning a string to a char pointer dynamically allocates memory. So according to my understanding the assignment expression
s = "Text";
equals to
s = (char *) malloc(5); // "Text" + '\0'
strcpy(s, "Text");
Well, this way we can easily free memory by using
free(s);
But... After reassigning same pointer to another value, it allocates new memory segment to store that value.
s = "Text";
printf("\n(%p) s = \"%s\"", s, s);
s = "Another Text";
printf("\n(%p) s = \"%s\"", s, s);
Output:
(0x400614) s = "Text"
(0x400628) s = "Another Text"
That means the address of the old value is no longer accessible to us and we can't free it any more. Another call to free(s); will probably deallocate only the last memory segment used by that pointer.
My question is: If we reassign same char pointer over and over again, does it consume more and more program memory during run-time or that garbage somehow gets automatically freed?
I hope that was enough to demonstrate my problem, couldn't think better example. If something's not clear enough please ask for additional clarification.

Your understanding is wrong. The assignment is just an assignment; it does not allocate any memory. In your example you assign the addresses of string literals to the pointer. String literals are created at compile time and typically placed in read-only memory.
You do not allocate any memory by assigning to the pointer.

It's not equal to doing a malloc. What's happening is that the string literal is stored in a read only part of memory. And it's not the assignment of a pointer that does the allocation. All string literals in a program are already allocated from start.
It might be worth mentioning that they are not strictly speaking stored in read only memory, but they might be and writing to a string literal is undefined behavior.
You cannot and should not call free on a string literal. Well, you can, but the program will likely crash.

Without optimization, the compiler will reserve two distinct memory areas for the two string literals, say "text1" and "text2".
If the two assignments are consecutive as in your question and nothing uses the pointer after the first one, then with optimization enabled the compiler will most probably neither reserve space for the first string literal nor emit any instructions for the first assignment.

What is the difference between using strcpy and equating the addresses of strings?

I am not able to understand the difference between using the strcpy function and assigning the address of a string to a pointer. The code given below should make my issue clearer. Any help would be appreciated.
//code to take input of strings in an array of pointers
#include <stdio.h>
#include <stdlib.h> //for malloc
#include <string.h> //for strlen and strcpy
int main()
{
//suppose the array of pointers is of 10 elements
char *strings[10],string[50],*p;
int i,length;
//proper method to take inputs:
for(i=0;i<10;i++)
{
scanf(" %49[^\n]",string);
length = strlen(string);
p = (char *)malloc(length+1);
strcpy(p,string);//why use strcpy here instead of p = string
strings[i] = p; //why use this long way instead of writing directly strcpy(strings[i],string) by first defining malloc for strings[i]
}
return 0;
}

A short introduction into the magic of pointers:
char *strings[10],string[50],*p;
These are three variables with distinct types:
char *strings[10]; // an array of 10 pointers to char
char string[50]; // an array of 50 char
char *p; // a pointer to char
Then the following is done (10 times):
scanf(" %49[^\n]",string);
Read C string from input and store it into string considering that a 0 terminator must fit in also.
length = strlen(string);
Count non-0 characters until 0 terminator is found and store in length.
p = (char *)malloc(length+1);
Allocate memory on heap with length + 1 (for 0 terminator) and store address of that memory in p. (malloc() might fail. A check if (p != NULL) wouldn't hurt.)
strcpy(p,string);//why use strcpy here instead of p = string
Copy C string in string to memory pointed in p. strcpy() copies until (inclusive) 0 terminator is found in source.
strings[i] = p;
Assign p (the pointer to memory) to strings[i]. (After the assignment, strings[i] points to the same memory as p. The assignment is a pointer assignment, not an assignment of the value pointed to.)
Why strcpy(p,string); instead of p = string:
The latter would assign address of string (the local variable, probably stored on stack) to p.
The address of allocated memory (with malloc()) would have been lost. (This introduces a memory leak - memory in heap which cannot be addressed by any pointer in code.)
p would now point to the local variable string (in every iteration of the for loop). Hence afterwards, all entries of strings would end up pointing to string.

char *strings[10]; ------------> 1.
strcpy(strings[i],string); ----> 2.
strings[i] = string; ----------> 3.
p = (char *)malloc(length+1); -|
strcpy(p,string);              |-> 4.
strings[i] = p; ---------------|
1. strings is an array of pointers; each pointer must point to valid memory.
2. Leads to undefined behavior, since strings[i] is not pointing to valid memory.
3. Works, but every pointer in strings will point to the same location, so each will have the same contents.
4. So create the new memory first, copy the contents into it, and assign that memory to strings[i].

strcpy copies a particular string into allocated memory. Assigning pointers doesn't actually copy the string, just sets the second pointer variable to the same value as the first.
char *strcpy(char *destination, const char *source);
copies from source to destination up to and including the terminating '\0'. It does no bounds checking on destination, so it is unsafe whenever the source might not fit; consider strncpy or strlcpy instead (strlcpy is not part of standard C). You can find useful information about these two functions at https://linux.die.net/man/3/strncpy - check where your code is going to run in order to help you choose the best option.
In your code block you have this declaration
char *strings[10],string[50],*p;
This declares three variables, but they are quite different. p is an ordinary pointer and must be made to point at valid memory (e.g. via malloc) before you can use it. string is not a pointer but an array of 50 chars (50 bytes). It is allocated on the function stack directly, so you can use it right away, though its contents are indeterminate until you write to it. Finally, strings is an array of 10 pointers: the array itself is already allocated, but each element (strings[0], strings[9], etc.) must be pointed at allocated memory before use.
The only one of those you can copy characters into immediately is string, because the space is already allocated. Each of them can be indexed with subscripts, but in each case you must ensure that you do not walk off the end; otherwise you may incur a SIGSEGV "segmentation violation" and your program will crash. Or at least it should, but you might instead merely get weird results.
Finally, memory obtained from malloc must be freed manually, otherwise you'll have memory leaks. Items allocated on the stack (string) do not need to be freed because the compiler handles that for you when the function ends.

When we call (char*)malloc(sizeof(char)) to allocate memory for a string, is it initialized? How to initialize?

char* str = (char*)malloc(100*sizeof(char));
strcpy(str, ""); //Does this line initialize str to an empty string?
After calling line 1, does the allocated memory contain garbage? What about after calling line 2?

After calling line 1, does the allocated memory contain garbage?
It can contain anything, since the standard does not require malloc to initialize the memory, and hence most implementations don't. It will most likely just contain whatever data the previous "user" of that memory put there.
What about after calling line 2?
With that instruction you're copying a \0 character with "byte value" 0 to the first byte of the allocated memory. Everything else is still untouched. You could as well have done str[0] = '\0' or even *str = '\0'. All of these options make str point at an "empty string".
Note also that, since you tagged the question C and not C++, casting the return value from malloc is redundant.

malloc just reserves a memory location and returns a pointer to it. No initialization happens. You might get junk values if the same memory was used for other data before.

Yes, malloc returns uninitialized memory, and yes, your second line of code initializes your memory to an empty string.

Why is this code giving a run-time exception when freeing the pointer?

I have simple code,
#include "stdafx.h"
#include <malloc.h>
int main()
{
char *p = (char*) malloc(10);
p = "Hello";
free(p);
return 0;
}
This code gives a run-time exception on termination. Below is the error snippet:
Microsoft Visual C++ Debug Library
Debug Assertion Failed!
Program: ...\my documents\visual studio 2010\Projects\samC\Debug\samC.exe
File: f:\dd\vctools\crt_bld\self_x86\crt\src\dbgheap.c
Line: 1322
Expression: _CrtIsValidHeapPointer(pUserData)
For information on how your program can cause an assertion
failure, see the Visual C++ documentation on asserts.
(Press Retry to debug the application)
Abort Retry Ignore

p = "Hello"; makes p point to a string literal and discards the previously assigned value. You can't free a string literal. You can't modify it.
If you want p to hold that string, just use
char* p = "Hello";
or
char p[] = "Hello";
if you plan on modifying it.
Neither requires free.

This is how you write a string in the memory allocated by malloc to a char pointer.
strcpy(p, "Hello");
Replace the line
p = "Hello";
with the strcpy one & your program will work fine.
You also need to
#include <string.h>
malloc returns a pointer to allocated memory. Say the address is 95000 (just a random number I pulled out).
So after the malloc - p will hold the address 95000
The p containing 95000 is the memory address which needs to be passed to free when you are done with the memory.
However, the next line p = "Hello"; puts the address of the string literal "Hello" (which say exists at address 25000) into p.
So when you execute free(p) you are trying to free 25000, which was not allocated by malloc.
OTOH, when you strcpy, you copy the string "Hello" into the address starting at p (i.e. 95000). p remains 95000 after the strcpy.
And free(p) frees the right memory.
You can also avoid the malloc and use
char *p = "Hello";
However, in this method, you cannot modify the string.
i.e. if after this you do *p = 'B' to change the string to Bello, that is undefined behavior. This is not the case with the malloc approach.
If instead, you use
char p[] = "Hello";
or
char p[10] = "Hello";
you get a modifiable string which need not be freed.

p = "Hello";
free(p);
Since Hello is statically allocated, you cannot free it. I'm not sure why you allocate some memory just to throw the pointer away by changing it to another pointer, but that has no effect. If you do this:
int i = 1;
i = 2;
i has no memory that it once held a 1, it holds a 2 now. Similarly, p has no memory that it once held a pointer to some memory you allocated. It holds a pointer to an immutable constant now.

This is a nice one.
The char sequence "hello" is constant and therefore placed neither on the heap nor on the stack, but in a static data segment of the executable (typically a read-only one such as .rodata). When you do p="hello" you make p point to the address of the string hello in that segment instead of the memory you allocated on the heap using malloc. When you go to free p, it tries to free memory in that segment, and naturally fails.
What you probably want is something like strcpy(p,"hello"); which goes over every char in "hello" and places it in the memory pointed to by p, essentially creating a copy of the string "hello" at memory address p.

If you want to copy the contents of the string "Hello" to the memory you allocated, you need to use strcpy:
strcpy(p, "Hello");
The line
p = "Hello";
assigns the address of the string literal "Hello" to the pointer p, overwriting the pointer value that was returned from malloc, hence the crash when you call free.

Memory allocated in char * var; declaration

In C, declaring a char pointer like this
char* p="Hello";
allocates some memory for a string literal Hello\0. When I do this afterwards
p="FTW";
what happens to the memory allocated to Hello\0? Is the address p points to changed?

There is no dynamic memory allocation in either statement.
Those strings are stored in your executable, loaded in a (likely read-only) section of memory that will live as long as your process does.
The second assignment only changes what p points to. Nothing else happens.

The memory remains occupied by "Hello". It is lost (unless you have other references to it).
The address p is pointing to (the value of p) is changed of course.

In this case, "Hello" is created at compile time and is part of the binary. In most situations "Hello" is stored in read-only memory. "FTW" is also part of the binary. The second assignment will only change the pointer.

In addition, "Hello" and "FTW" have static storage duration, as Met has pointed out.

It creates a string constant that cannot be modified and should be used as it is.
If you try doing
p[0]='m';
it is undefined behavior and will likely give a segmentation fault, since p points at a string literal and not at memory you allocated, which you could write to and read back.

What if
p = getbuffer();
char *getbuffer()
{
return buf = malloc(size);
}
How can I free this memory before assigning newly allocated memory to p? Imagine that p has to be set from getbuffer() many times.