Regarding initializing a string as an array - c

I'm told that when initializing a string like so
char str[] = "Hello world!";
The compiler will allocate an area in constant memory (read-only for the program) and then copy the string to the array, which resides on the stack. My question is, can I read or point to the original string after modifying the copy I'm given, and how? And if not, why does the string even exist outside of the stack in the first place?

It's done this way for space efficiency. When you write:
char str[] = "Hello world!";
it's compiled effectively as if you'd written:
static char str_init[] = "Hello world!";
char str[13];
strncpy(str, str_init, 13);
An alternative way to implement this might be equivalent to:
char str[13];
str[0] = 'H';
str[1] = 'e';
...
str[11] = '!';
str[12] = 0;
But for long strings, this is very inefficient. Instead of 1 byte of static data for each character of the string, it will use a full word of instruction (probably 4 bytes, but maybe more on some architectures) for each character. This will quadruple the size of the initialization data unnecessarily.

Because the program has to remember the string somewhere, i.e., your so-called "constant memory". Otherwise how can it know what values to assign when allocating the variable? Think about a variable with a given initial value. The variable is not allocated until declared. But the initial value must be stored somewhere else.

When this statement is compiled
char str[] = "Hello world!";
the compiler does not keep the string literal as an object you can refer to. It is used only to initialize the array.
If you want to keep the string literal, write it this way instead:
char *s = "Hello world!";
char str[13];
strcpy( str, s );
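A minimal sketch of that idea, assuming you want both a named pointer to the literal and a separate writable copy. The names hello_literal and make_modified_copy are made up for illustration; they are not part of the question.

```c
#include <string.h>

/* Hypothetical name: a pointer to the literal, which exists for the
   whole run of the program and must not be written through. */
const char *const hello_literal = "Hello world!";

/* Hypothetical helper: copies the literal into a caller-supplied buffer
   and modifies only the copy. Returns buf, or NULL if buf is too small. */
char *make_modified_copy(char *buf, size_t size)
{
    if (size < strlen(hello_literal) + 1)
        return NULL;
    strcpy(buf, hello_literal);   /* buf now holds an independent copy */
    buf[0] = 'J';                 /* changes the copy, not the literal */
    return buf;
}
```

After calling make_modified_copy on a 13-byte array, the array holds the modified text while hello_literal still reads "Hello world!".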

When the program runs, "Hello world" will be stored in the constant part of the memory as a string literal, after that, the program will reserve enough space in the stack and copy character by character from the constant part of the memory. Unfortunately, you don't have access to the constant part that stores the string literal because you are telling the program that you want the values to be modifiable (string stored in stack), so it gives what you asked.

Most of your question has been addressed in the other answers. However, I did not see anyone address this one specifically:
Regarding your question: ...can I read or point to the original string after modifying the copy I'm given, and how?
The following sequence demonstrates how you can read the original after modifying a copy:
char str[] = "hello world"; //creates original (stack memory)
char *str2 = 0;//create a pointer (pointer created, no memory allocated)
str2 = StrDup(str); //populate pointer with a copy of the original (memory allocated on heap); StrDup is not standard C, strdup() is the POSIX equivalent
str2[5]=0; //edit copy: results in "hello" (i.e. modified) (modifying a location on the heap)
str; //still contains "hello world" (viewing value on the stack)
EDIT (answering comment question)
The answer above only addressed the specific question about accessing an original string after a copy has been modified. I just showed one possible set of steps to address that. You can edit the original string too:
char str[] = "Hello world!"; //creates location in stack memory called "str",
//and assigns space enough for literal string:
//"Hello world!", 13 spaces in all (including the \0)
strcpy(str, "new string"); //replaces original contents with "new string"
//old contents are no longer available.
So, using these steps, the original values in the variable str are changed, and are no longer available.
The method I outline in my original answer, (at top) shows a way whereby you can make an editable copy, while maintaining the original variable.
In your comment question, you are referring to things such as system memory and constant memory. Normally, system memory refers to RAM implementations on a system (i.e. how much physical memory). By constant memory, my guess is that you are referring to memory used by variables created on the stack. (read on)
First, in a development or run-time environment, there is stack memory. Its size usually defaults to some maximum value, such as 250,000 bytes perhaps. It is a pre-build settable value in most development environments, and is available for use by any variable you create on the stack. Example:
int x[10]; //creates a variable on the stack
//using enough memory space for 10 integers.
int y = 1; //same here, except uses memory for only 1 integer value
Second, there is also what is referred to as heap memory. The amount of heap memory is system dependent: the more physical memory your system has available, the more heap memory you can use for variable memory space in your application. Heap memory is used when you dynamically allocate memory, for example using malloc(), calloc(), realloc().
int *x=0; //creates a pointer, no memory allocation yet...
x = malloc(10 * sizeof *x); //allocates enough memory for 10 integers, but the
//memory allocated is from the _heap_
//and must be freed for use by the system
//when you are done with it.
free(x);
I have marked the original post (above) with indications showing what type of memory each variable is using. I hope this helps.
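Since StrDup is not available everywhere, here is a rough standard-C stand-in for the copy step. The name dup_string is hypothetical; on POSIX systems strdup() does the same job.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for StrDup/strdup using only standard C.
   Returns a heap copy of src, or NULL if the allocation fails.
   The caller is responsible for calling free() on the result. */
char *dup_string(const char *src)
{
    size_t len = strlen(src) + 1;   /* include the terminating '\0' */
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, src, len);
    return copy;
}
```

With this, str2 = dup_string(str); followed by str2[5] = 0; shortens only the heap copy, and str on the stack is left as it was.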

Related

memory allocation of string literal strcpy

int main()
{
char *s;
strcpy(s,"here");
return 0;
}
In the code above, I guess the memory for the string literal is assigned in a global space. Which section does it actually go to, and when? Does the compiler go through and assign it in the program space? Also, if I initialise another string with the same string literal, i.e. (char *k = "here";), will it point to the same memory location?
I am trying to think: since I cannot free this location, do I run into any trouble if I have a lot of string initialisations in my code? I guess the only thing I should be worried about is the compiler output being too big, since there is no run-time memory allocation in this case?
The exact location depends on the object file format (PE vs. ELF vs. COFF) and any command-line options (some may allow string literals to be stored in a writable memory segment). ELF will store it in the .rodata section, which, as the name implies, is read-only.
Multiple instances of the same string literal may map to the same location, but it's not required AFAIK (I'm not aware of any compiler that creates multiple instances of the same literal, but my experience isn't that broad).
Things that are certain:
Space for string literals is allocated at program startup (usually when the program is loaded into memory) and held until the program terminates;
Attempting to modify the contents of a string literal invokes undefined behavior - your code may segfault, or it may work as intended, or it may reformat your hard drive, or it may trigger the zombie apocalypse.
Note that your code has a bug - you never assign a meaningful address to s, so the strcpy is essentially trying to write the string "here" to a random location, which again is undefined behavior. You may have intended to write
s = "here";
which sets s to point to the literal. If not, then s will either have to be an array large enough to hold the string:
char s[sizeof "here"]; // sizeof evaluated at compile time
or you'll have to allocate that space dynamically:
char *s = malloc( strlen( "here" ) + 1 );
if ( s )
strcpy( s, "here" );
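If you are curious whether your compiler merges identical literals, something like the following sketch can tell you. Both function names are made up for illustration, and the result may legitimately differ between compilers and optimization settings.

```c
/* Hypothetical helper: returns 1 if both pointers refer to the same
   storage, 0 otherwise. */
int same_storage(const char *a, const char *b)
{
    return a == b;
}

/* Two literals with identical contents. Whether they share storage is
   up to the implementation, so this may return 0 or 1. */
int literals_share_storage(void)
{
    const char *s = "here";
    const char *k = "here";
    return same_storage(s, k);
}
```

Either result is fine; code should not depend on it.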

How do I free up the memory consumed by a string literal?

How do I free up all the memory used by a char* after it's no longer useful?
I have some struct
struct information
{
/* code */
char * fileName;
};
I'm obviously going to save a file name in that char*, but after using it some time afterwards, I want to free up the memory it used to take, how do I do this?
E: I didn't mean to free the pointer, but the space pointed by fileName, which will most likely be a string literal.
There are multiple string "types" fileName may point to:
Space returned by malloc, calloc, or realloc. In this case, use free.
A string literal. If you assign info.fileName = "some string", there is no way to free it. The string literal is written into the executable itself and is usually stored together with the program's code. There is a reason a string literal should be accessed by const char* only, and C++ only allows const char*s to point to them.
A string on the stack, like char str[] = "some string";. Use curly braces to confine its scope and lifetime, like this:
struct information info;
{
char str[] = "some string";
info.fileName = str;
}
printf("%s\n", info.fileName);
The printf call results in undefined behavior since str has already gone out of scope, so the string has already been deallocated.
You could use foo.fileName = malloc(howmanychars); and free(foo.fileName);.
You cannot free the memory if you initialize fileName from a string literal or other non-dynamically allocated way.
But then, freeing a handful of bytes is next to pointless, unless you need a large number of such structs/fileNames. The OS will likely not return the freed memory to other processes; the returned memory may be available for future memory allocations of your process.
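One way to make free always valid is to let the struct own a heap copy of whatever name it is given. This is only a sketch: the helper names information_set_name and information_clear are invented here, and it assumes fileName is initialized to NULL before first use.

```c
#include <stdlib.h>
#include <string.h>

struct information {
    char *fileName;   /* owned heap copy, or NULL */
};

/* Hypothetical helper: stores a heap copy of name, so free() is always
   valid later. Assumes info->fileName is NULL or a previous heap copy.
   Returns 0 on success, -1 if the allocation fails. */
int information_set_name(struct information *info, const char *name)
{
    size_t len = strlen(name) + 1;
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, name, len);
    free(info->fileName);     /* free(NULL) is a no-op */
    info->fileName = copy;
    return 0;
}

/* Hypothetical helper: releases the name and resets the pointer. */
void information_clear(struct information *info)
{
    free(info->fileName);
    info->fileName = NULL;
}
```

Passing a string literal to information_set_name is fine, because only the copy is ever freed.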

Where the char* is pointing to?

I have some very basic questions regarding pointers and memory allocation.
In below code where does pointer c actually point to? In other words where does the string "xyz" get stored in memory (stack/heap etc.)?
What will happen to memory location allocated for a as I am not using it anymore?
Code seems to work well if I un-comment the commented section. What's happening with the memory in that scenario?
#include <stdio.h>
main()
{
char *c;
//c = (char *)malloc(2);
c = "a";
c = "xyz" ;
printf("%s",c);
return 0;
}
Output:
xyz
Edit:
After reading a few of the answers and the first comment, another question came to mind:
In below case, where do the strings get stored? Can I alter them later on?
char *c[] = {"a","xyz"};
The specific details are implementation dependent, but in most common implementations, literal strings like "a" and "xyz" are stored in the text section of the program, like the machine code that implements the program. Assigning c = "xyz"; sets c to point to that location in memory.
The memory for "a" is unaffected. However, an optimizing compiler may notice that c was never used between that assignment and being reassigned, so it could simply ignore the first assignment, and never allocate any space for "a" at all.
The memory you allocated with malloc() stays allocated until the program ends. Allocating memory without freeing it is called a memory leak, and you should try to avoid it.
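For comparison, here is a sketch of the question's program with the allocation actually used and released before c is pointed elsewhere. The function name run_without_leak is made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical rework of the question's code without the leak.
   Returns a pointer to the literal "xyz", or NULL if malloc failed. */
const char *run_without_leak(void)
{
    char *c = malloc(2);      /* room for "a" plus the '\0' */
    if (c == NULL)
        return NULL;
    strcpy(c, "a");           /* writes into the heap block */
    free(c);                  /* release it before losing the address */

    const char *p = "xyz";    /* now just points at a literal */
    return p;                 /* valid: literals have static storage */
}
```

The key difference from the original is that the heap block is freed while its address is still known.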
1. In the code below, where does pointer 'c' actually point? In other words, where does the string "xyz" get stored in memory (stack/heap etc.)?
The compiler places "xyz" in a read-only part of memory as an unnamed array. c is a variable of type pointer-to-char in automatic storage (often called the stack), and the assignment sets it to the location of the first character of that unnamed, read-only array. So c itself lives on the stack, but it points to memory in the text section.
2. What will happen to the memory location allocated for "a", as I am not using it anymore?
Your code does not allocate any memory; it only assigns the addresses of string literals to c. Only memory that you allocate yourself needs to be freed.
3. The code seems to work well if I uncomment the commented section. What's happening with memory in that scenario?
If you uncomment the allocation statement, c still ends up pointing to the string literal. The allocated block is never freed, and once c is overwritten its address is lost, so you need to free it before reassigning c.
Now for case
char *c[] = {"a","xyz"};
When you declare and initialize c, the array is allocated on the stack like any other automatic array, and its elements are set to point to the beginning of each string constant. Altering these strings through the pointers is undefined behavior.
"xyz" and "a" are string literals, which are usually stored in a string table.
"xyz" is printed because it is recently assigned to that pointer.
To a place in the read-only part of memory where the string is stored, which is neither the heap nor the stack. When a string literal is assigned directly to a pointer, nothing is searched for or created at run time: the compiler and linker place the string in the program image, and they may merge identical literals so that all uses point to one copy. Either way it's read-only, so treat it the same as a const char* and don't change it (writing to it through any pointer is undefined behavior).
Nothing, it'll stay unaffected.
What's happening is that malloc returns a pointer and you then overwrite it: c = "a" makes c point to a different address, the one holding the literal "a". This is not the same as strcpy(c, "a");, which would copy into the allocated block. Because you discard the pointer to the allocated memory, you can no longer free it. Generally, nothing bad happens if you don't free the memory (at the end of the program it is reclaimed automatically by the OS), but it does occupy memory while the program runs, so if I allocated, let's say, 1000000 bytes (assuming it succeeded), allocating more heap memory could become a problem :P
About the other question: you can't alter them through that pointer. Trying to is undefined behavior, and on most systems the program will crash partway through.
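To make the char *c[] case concrete, here is a small sketch contrasting an array of pointers to literals with an array of writable character arrays. The function name demo_pointer_array is invented for illustration.

```c
#include <string.h>

/* Hypothetical demo: returns 1 if both modifications behaved as the
   comments describe. */
int demo_pointer_array(void)
{
    const char *c[] = { "a", "xyz" };   /* pointers to read-only literals */
    c[0] = "b";                         /* fine: re-points the element */

    char m[][4] = { "a", "xyz" };       /* each row is a writable copy */
    m[1][0] = 'X';                      /* fine: modifies the copy only */

    return strcmp(c[0], "b") == 0 && strcmp(m[1], "Xyz") == 0;
}
```

Declaring the pointers as const char * lets the compiler reject attempts such as c[1][0] = 'X' instead of leaving them to fail at run time.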

If referencing constant character strings with pointers, is memory permanently occupied?

I'm trying to understand where things are stored in memory (stack/heap, are there others?) when running a C program. Compiling this gives warning: function returns address of local variable:
char *giveString (void)
{
char string[] = "Test";
return string;
}
int main (void)
{
char *string = giveString ();
printf ("%s\n", string);
}
Running gives various results; it just prints gibberish. I gather from this that the char array called string in giveString() is stored in the stack frame of the giveString() function while it is running. But if I change the type of string in giveString() from char array to char pointer:
char *string = "Test";
I get no warnings, and the program prints out "Test". So does this mean that the character string "Test" is now located on the heap? It certainly doesn't seem to be in the stack frame of giveString() anymore. What exactly is going on in each of these two cases? And if this character string is located on the heap, so all parts of the program can access it through a pointer, will it never be deallocated before the program terminates? Or would the memory space be freed up if there was no pointers pointing to it, like if I hadn't returned the pointer to main? (But that is only possible with a garbage collector like in Java, right?) Is this a special case of heap allocation that is only applicable to pointers to constant character strings (hardcoded strings)?
You seem to be confused about what the following statements do.
char string[] = "Test";
This code means: create an array in the local stack frame of sufficient size and copy the contents of constant string "Test" into it.
char *string = "Test";
This code means: set the pointer to point to constant string "Test".
In both cases, "Test" is in the const or cstring segment of your binary, where non-modifiable data exists. It is neither in the heap nor stack. In the former case, you're making a copy of "Test" that you can modify, but that copy disappears once your function returns. In the latter case, you are merely pointing to it, so you can use it once your function returns, but you can never modify it.
You can think of the actual string "Test" as being global and always there in memory, but the concept of allocation and deallocation is not generally applicable to const data.
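As a rough illustration of the two cases, here are two ways to hand a string back to the caller without returning the address of a local array. The names give_literal and give_copy are hypothetical.

```c
#include <string.h>

/* Safe: the literal has static storage duration, so the pointer stays
   valid after the function returns. The caller must not modify it. */
const char *give_literal(void)
{
    return "Test";
}

/* Safe: the caller owns the buffer, so the copy outlives this call.
   Returns buf, or NULL if buf is too small. */
char *give_copy(char *buf, size_t size)
{
    static const char text[] = "Test";
    if (size < sizeof text)
        return NULL;
    memcpy(buf, text, sizeof text);   /* includes the '\0' */
    return buf;
}
```

The second form gives a modifiable string, at the cost of the caller having to provide the storage.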
No. The string "Test" is not on the heap, and it is not on the stack either. It's in the data portion of the program, which basically gets set up before the program runs. It's there, but you can think of it kind of like "global" data.
The following may clear it up a tad for you:
char string[] = "Test"; // declare a local array, and copy "Test" into it
char* string = "Test"; // declare a local pointer and point it at the "Test"
// string in the data section of the program
It's because in the second case you are creating a constant string :
char *string = "Test";
The value pointed to by string is a constant and must never be changed, so it's allocated at compile time like a static variable (it is in neither the stack nor the heap).

memory leak during char* manipulation

I do something like this in a loop:
char* test = "\0";
test = strcat(test, somestr);
...
char* tmp = strstr(test, 0, len);
free(test);
test = tmp;
And I get a memory leak. What am I doing wrong?
You don't actually have a memory leak (in the code you posted anyway), but you do several things wrong.
char* test = "\0";
This declares a pointer named test and initializes it to point to a literal array of two bytes { 0, 0 }.
test = strcat(test, somestr);
This tries to append something to the end of that string literal (and since as a C string it is empty it would be like a string copy). Literal values often are stored in memory that is not writable, so copying something into this memory would likely cause an error (segmentation fault or SIGSEGV on many operating systems). Additionally you only have two bytes of storage pointed to by test, which means that unless somestr refers to a string whose strlen is less than or equal to 1 you would end up attempting to write over some other memory (whatever happens to be after the "\0" that test points to).
char* tmp = strstr(test, 0, len);
I don't know what is going on here since strstr only takes 2 arguments (both of them const char *).
free(test);
Here you are attempting to free non-heap allocated memory. The heap is where malloc, realloc, and calloc get the memory they allocate. Calling free with a memory location that was not returned by one of these functions (and a few other functions on some systems) is an error because free does not know what to do with them.
You should probably keep in mind that memory is often just a huge array of bytes, and that the pointers you use are like array indexes. The system you are using may be able to distinguish between some areas of this array and determine how you can access them (readable, writable, and/or executable). But it is still just an array of bytes. When you have a string (such as "foo"), that means that somewhere in RAM there are four bytes (3 letters + the \0 terminator byte), and you can access this area by knowing its index within the array of bytes that is RAM. There are likely other things stored adjacent to your string (such as { ..., 4, 2, 'f', 'o', 'o', 0, 99, 3, 2, ...}), so you have to make sure you stay within the space of that memory without wandering into the adjacent data.
There are a couple of problems:
strcat will append a string to the destination buffer. You need the first parameter to be a buffer, not a pointer to a string literal. Here is an example of a char buffer, also called an array of chars: char test[1024];
The return value of strcat is a pointer to the destination buffer, it is not a newly allocated string on the heap. So you shouldn't call free on the return value.
You can't strcat to test because you are initially pointing it to a constant char *. You need to assign memory for it. strcat won't do it.
Change your code to something like:
char* test = (char*)malloc(20*sizeof(char));
test[0] = '\0'; // nothing on this string to begin with
strcat(test, "something");
free(test);
Also, this won't work:
char* tmp = strstr(test, 0, len);
Since there is no strstr function with three parameters.
Remember. 99.9% of the time there will be a free call for each malloc allocation.
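If the goal is to keep appending inside a loop without guessing a fixed size such as 20, one possible approach is to grow the buffer with realloc. This is a sketch, not the asker's code; the name append_string is invented for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: appends suffix to the heap string *dest (which
   may be NULL), growing it with realloc. Returns 0 on success, or -1 on
   allocation failure, in which case *dest is left unchanged. */
int append_string(char **dest, const char *suffix)
{
    size_t old_len = (*dest != NULL) ? strlen(*dest) : 0;
    size_t add_len = strlen(suffix);
    char *grown = realloc(*dest, old_len + add_len + 1);
    if (grown == NULL)
        return -1;
    memcpy(grown + old_len, suffix, add_len + 1);   /* copies the '\0' too */
    *dest = grown;
    return 0;
}
```

Start with char *test = NULL;, call append_string(&test, somestr) in the loop, and call free(test) exactly once when done.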

Resources