C program trying to modify a location in text segment - c

#include<stdio.h>
#include<string.h>
int main()
{
char str[]="somethingisbetterthannothing";
memset(str,'-',6);
puts(str);
return 0;
}
I was expecting a segmentation fault when this program is executed.
But it printed
------ingisbetterthannothing
Does this indicate that the string literal is not stored in read only text segment?

char str[]="somethingisbetterthannothing";
The string literal in the above line is used only as an initializer for a char array.
The array str is a separate, modifiable object that receives a copy of the characters.
char* str = "somethingisbetterthannothing";
That would be a pointer to a string-literal.
And there is no guarantee what happens when you try to modify a string literal.
It is literally and explicitly Undefined Behavior (BTW: The example in the accepted answer is modifying a string-literal).
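To make the distinction concrete, here is a minimal sketch of the array case next to the pointer case. The helper names dash_prefix and array_copy_demo are made up for illustration; only the array is ever written to.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: overwrite at most the first n characters of a
   writable, NUL-terminated buffer with '-'. */
void dash_prefix(char *buf, size_t n)
{
    size_t len = strlen(buf);
    if (n > len)
        n = len;              /* never write past the terminator */
    memset(buf, '-', n);
}

/* Returns 1 if the array copy was changed and the literal was left alone. */
int array_copy_demo(void)
{
    char str[] = "somethingisbetterthannothing";       /* writable copy */
    const char *lit = "somethingisbetterthannothing";  /* points at the literal */

    dash_prefix(str, 6);      /* fine: str is an ordinary array */
    puts(str);                /* ------ingisbetterthannothing */

    /* dash_prefix((char *)lit, 6) would be undefined behavior. */
    return strcmp(str, "------ingisbetterthannothing") == 0 && lit[0] == 's';
}
```

Declaring the pointer as const char * is a deliberate choice here: it lets the compiler reject accidental writes through it.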

When strings are declared as character arrays, they are stored like other types of arrays in C.
For example, if str[] is an auto variable, the string is stored in the stack segment; if it is a global or static variable, it is stored in the data segment.
With a character pointer, strings can be stored in two ways:
---> Read-only string in a shared segment.
When a string value is directly assigned to a pointer, most compilers store it in a read-only block (generally in the data segment) that is shared among functions.
char *str = "vinay";
"vinay" is stored in a shared read-only location, but the pointer str is stored in read-write memory.
---> Dynamic allocation using malloc.
If you try to modify a string literal or constant, you will typically get a segmentation fault, since writing to a read-only section is not allowed (strictly speaking, the behavior is undefined). But in your case you are changing a writable section, i.e. the stack, so there is obviously no error.
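The malloc route mentioned above could look roughly like this. The function name copy_literal is an assumption made for this sketch; the caller owns the returned buffer and must free it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a heap-allocated, writable copy of a string,
   or NULL if the allocation fails. The caller must free() the result. */
char *copy_literal(const char *literal)
{
    size_t size = strlen(literal) + 1;   /* +1 for the terminating NUL */
    char *copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, literal, size);
    return copy;
}
```

A call such as char *p = copy_literal("vinay"); gives a buffer that may be modified (for example p[0] = 'V';), unlike the literal itself.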

Related

memory allocation of string literal strcpy

int main()
{
char *s;
strcpy(s,"here");
return 0;
}
In the code above I guess the memory for the string literal is assigned in a global space. Which section does it actually go to, and when? Does the compiler go through and assign it in the program space? Also, if I initialise another string with the same string literal, i.e. (char *k = "here";), will it be pointing to the same memory location?
I am trying to think: since I cannot free this location, do I run into any trouble if I have a lot of string initialisations in my code? I guess the only thing I should be worried about is the compiler output being too big, since there is no run-time memory allocation in this case?
The exact location depends on the object file format (PE vs. ELF vs. COFF) and any command-line options (some may allow string literals to be stored in a writable memory segment). On ELF targets it typically goes in the .rodata section, which, as the name implies, is read-only.
Multiple instances of the same string literal may map to the same location, but it's not required AFAIK (I'm not aware of any compiler that creates multiple instances of the same literal, but my experience isn't that broad).
Things that are certain:
Space for string literals is allocated at program startup (usually when the program is loaded into memory) and held until the program terminates;
Attempting to modify the contents of a string literal invokes undefined behavior - your code may segfault, or it may work as intended, or it may reformat your hard drive, or it may trigger the zombie apocalypse.
Note that your code has a bug - you never assign a meaningful address to s, so the strcpy is essentially trying to write the string "here" to a random location, which again is undefined behavior. You may have intended to write
s = "here";
which sets s to point to the literal. If not, then s will either have to be an array large enough to hold the string:
char s[sizeof "here"]; // sizeof evaluated at compile time
or you'll have to allocate that space dynamically:
char *s = malloc( strlen( "here" ) + 1 );
if ( s )
strcpy( s, "here" );
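Whether two identical literals share storage can be checked with a small sketch like the one below. The function names are invented for illustration, and the result of the comparison is unspecified, so either answer is valid on a given compiler.

```c
/* Returns 1 if the two identical literals share one address, 0 otherwise.
   Either result is allowed; the C standard leaves it unspecified. */
int literals_share_storage(void)
{
    const char *s = "here";
    const char *k = "here";
    return s == k;
}

/* The literal has static storage duration, so the pointer stays valid
   after this function returns. There is nothing to free. */
const char *get_here(void)
{
    return "here";
}
```

Because the literal lives for the whole run of the program, many literals cost executable and image size rather than run-time allocations.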

Is declaring a string literal with a pointer more memory efficient than declaring constant array?

If I want to store a constant string,
const char array[] = "Some string literal.";
The C primer plus book says
then the quoted string is stored in a data segment that is part of the executable file. Memory for the array is allocated only after the program begins running. At that time, the quoted string is copied into the array.
Does this mean memory is allocated twice for the string literal?
On the other hand when declared with pointer, it only sets aside the storage for the pointer variable and stores the address of string literal into it.
const char *pt = "Some string literal.";
Which means there is only one copy of the string literal and declaring pointer with string literal is more memory efficient than array?
In the first case the data for the string is stored in the executable file, and it's stored in memory once the program is loaded. So yes it is "allocated twice" but in very different storage mediums (disk and memory).
However, the same is true for the second case as well. The string literal needs to be stored once in the executable file on disk, and once in memory when the program is running.
The difference is an implementation detail, namely that in the first case the string in memory is stored either on the stack or in some global modifiable data memory segment. In the second case the string is usually stored in read-only memory, often alongside the code.
So if you only have one instance of the string in the first case, there is no difference in "memory efficiency".
The answer depends on whether the definition appears at global or local scope.
At global scope:
The first option defines an initialized constant array. Reading bytes from it with array[0] will result in code that reads single bytes from a global memory location, usually a single instruction.
The second option defines a modifiable pointer initialized to point to a constant array of characters. Reading bytes from it with pt[0] will result in code that loads the pointer value and reads the element pointed to by the pointer, usually at least 2 instructions.
If you do not need to change which string the pointer refers to, it would probably be preferable to use the first option.
At local scope (automatic storage)
The first option defines an array initialized with a string. If this array was not constant and was modified inside the function, the code generated would be substantially similar to this:
char array[sizeof "Some string literal."];
memcpy(array, "Some string literal.", sizeof(array));
But since it is defined as const, the compiler could optimize the code and generate references to array as references to the string literal in static storage. To avoid potential code generation such as the above, you could use this definition at local scope:
static const char array[] = "Some string literal.";
Conversely, the second option defines a local pointer initialized to point to a string literal, itself most likely stored in static storage such as a data segment or even the code segment. Provided the function uses the pointer, the definition could generate a store to initialize the pointer and more or less code to read characters from it, depending on the specific code inside the function and how efficient the compiler is.
At global scope it seems more efficient to use the first approach.
At local scope, it depends a lot on the actual code and compiler used, but defining a static const char array might be the most efficient.
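The storage each definition asks for can be compared with sizeof. This is only a sketch: the accessor functions array_bytes and pointer_bytes are hypothetical, and the pointer size depends on the platform.

```c
#include <stddef.h>

const char array[] = "Some string literal.";   /* the array is the string */
const char *pt = "Some string literal.";       /* a pointer plus the literal */

/* 20 characters plus the terminating NUL. */
size_t array_bytes(void)
{
    return sizeof array;
}

/* Size of the pointer object only; the literal it points to is extra. */
size_t pointer_bytes(void)
{
    return sizeof pt;
}
```

So the pointer version stores the string once plus one pointer, while the global array version stores just the string.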

What is the difference of these array declarations? [duplicate]

#include <stdio.h>
#include <string.h>
int main(void){
char s1[30]="abcdefghijklmnopqrstuvwxyz";
printf("%s\n",s1);
printf("%s",memset(s1,'b',7));
getch();
return 0;
}
Above code works but when I create s1 array like this,
char *s1="abcdefghijklmnopqrstuvwxyz";
it does not give any errors in compile time but fails to run in runtime.
I am using Visual Studio 2012.
Do you know why?
I found prototype of memset is:
void *memset( void *s, int c, size_t n );
char s1[30] allocates writable memory to store the contents of the array; char *s1="abcdefghijklmnopqrstuvwxyz"; doesn't. The latter only sets a pointer to the address of a string constant, which the compiler will typically place in a read-only section of the object code.
String literals get space in the "read-only data" section, which is mapped into the process address space as read-only (so you can't change them).
char s1[30]="abcdefghijklmnopqrstuvwxyz";
This declares s1 as an array of char and initializes it.
char *s1="abcdefghijklmnopqrstuvwxyz";
This places "abcdefghijklmnopqrstuvwxyz" in a read-only part of memory and makes s1 a pointer to it.
Modifying that string through s1 with memset, however, is undefined behavior.
A very good question!
If you make gcc output the assembly and compare the two versions, you can find the answer. Here is why:
char s1[30]="abcdef";
When defined in a function, this defines an array of char, and s1 is the name of the array. The program allocates its memory on the stack.
When defined globally, it defines an object in the program, and that object is not read-only data.
char* s2 = "abcdef"; only defines a pointer to char, which points to constant characters stored in .rodata, the read-only data of the program.
To make the program run efficiently and make process management easier, the compiler generates different sections for the code. Constant strings, like the one in char* s2 = "abcdef"; and the printf format string, are stored in the .rodata section. After the OS loader loads the program into main memory, this section is marked read-only. That is why using memset to modify the memory that s2 points to fails with a segmentation fault.
Here is an explanation: Difference between char* and char[]
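One way to keep the literal read-only and still get a modified string is to copy it into a caller-supplied buffer first. The sketch below assumes a hypothetical helper fill_prefix; declaring the pointer as const char * also makes the compiler complain if it is passed straight to memset.

```c
#include <stddef.h>
#include <string.h>

/* const-qualified so the compiler diagnoses attempts to write through it. */
const char *alphabet = "abcdefghijklmnopqrstuvwxyz";

/* Hypothetical helper: copy the alphabet into dst, then overwrite its first
   n characters with fill. Returns dst, or NULL if dst is too small. */
char *fill_prefix(char *dst, size_t dst_size, char fill, size_t n)
{
    size_t len = strlen(alphabet);
    if (dst_size < len + 1)
        return NULL;
    memcpy(dst, alphabet, len + 1);   /* writable copy, including the NUL */
    if (n > len)
        n = len;
    memset(dst, fill, n);
    return dst;
}
```

With char buf[30]; a call to fill_prefix(buf, sizeof buf, 'b', 7) leaves "bbbbbbbhijklmnopqrstuvwxyz" in buf.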

If referencing constant character strings with pointers, is memory permanently occupied?

I'm trying to understand where things are stored in memory (stack/heap, are there others?) when running a C program. Compiling this gives warning: function returns address of local variable:
char *giveString (void)
{
char string[] = "Test";
return string;
}
int main (void)
{
char *string = giveString ();
printf ("%s\n", string);
}
Running gives various results, it just prints gibberish. I gather from this that the char array called string in giveString() is stored in the stack frame of the giveString() function while it is running. But if I change the type of string in giveString() from char array to char pointer:
char *string = "Test";
I get no warnings, and the program prints out "Test". So does this mean that the character string "Test" is now located on the heap? It certainly doesn't seem to be in the stack frame of giveString() anymore. What exactly is going on in each of these two cases? And if this character string is located on the heap, so all parts of the program can access it through a pointer, will it never be deallocated before the program terminates? Or would the memory space be freed up if there were no pointers pointing to it, like if I hadn't returned the pointer to main? (But that is only possible with a garbage collector like in Java, right?) Is this a special case of heap allocation that is only applicable to pointers to constant character strings (hardcoded strings)?
You seem to be confused about what the following statements do.
char string[] = "Test";
This code means: create an array in the local stack frame of sufficient size and copy the contents of constant string "Test" into it.
char *string = "Test";
This code means: set the pointer to point to constant string "Test".
In both cases, "Test" is in the const or cstring segment of your binary, where non-modifiable data exists. It is neither in the heap nor stack. In the former case, you're making a copy of "Test" that you can modify, but that copy disappears once your function returns. In the latter case, you are merely pointing to it, so you can use it once your function returns, but you can never modify it.
You can think of the actual string "Test" as being global and always there in memory, but the concept of allocation and deallocation is not generally applicable to const data.
No. The string "Test" is in neither the heap nor the stack; it is in a static data section of the program, which gets set up before the program runs. It's there, but you can think of it kind of like "global" data.
The following may clear it up a tad for you:
char string[] = "Test"; // declare a local array, and copy "Test" into it
char* string = "Test"; // declare a local pointer and point it at the "Test"
// string in the static data section of the program
It's because in the second case you are creating a constant string:
char *string = "Test";
The value pointed to by string is a constant and can never change, so it's allocated at compile time like a static variable (it is in static storage, neither on the stack nor on the heap).
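Two ways to hand a string back to the caller without returning the address of a local array are sketched here. The function names give_literal and give_copy are illustrative and not part of the question.

```c
#include <stddef.h>
#include <string.h>

/* Returns a pointer to the literal itself. It has static storage duration,
   so the pointer stays valid, but the characters must not be modified. */
const char *give_literal(void)
{
    return "Test";
}

/* Copies the text into a buffer owned by the caller, who may then modify it.
   Returns buf, or NULL if the buffer is too small. */
char *give_copy(char *buf, size_t size)
{
    static const char text[] = "Test";
    if (size < sizeof text)
        return NULL;
    memcpy(buf, text, sizeof text);   /* includes the terminating NUL */
    return buf;
}
```

The first form suits read-only use; the second keeps the lifetime decision with the caller, which avoids the dangling pointer from the original giveString().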

bus error when trying to access character on a string in C

I have used this line of code many times (update: when string was a parameter to the function!), however when I try to do it now I get a bus error (both with gcc and clang). I am reproducing the simplest possible code;
char *string = "this is a string";
char *p = string;
p++;
*p='x'; //this line will cause the Bus error
printf("string is %s\n",string);
Why am I unable to change the second character of the string using the p pointer?
You are trying to modify read only memory (where that string literal is stored). You can use a char array instead if you need to modify that memory.
char str[] = "This is a string";
str[0] = 'S'; /* works */
I have used this line of code many times..
I sure hope not. At best you would get a segfault (I say "at best" because attempting to modify a string literal is undefined behavior, in which case anything can happen, and a crash is the best thing that can happen).
When you declare a pointer to a string literal it points to read only memory in the data segment (look at the assembly output if you like). Declaring your type as a char[] will copy that literal onto the function's stack, which will in turn allow it to be modified if needed.
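The pointer arithmetic from the question works as intended once the pointer refers to writable storage. A minimal sketch, assuming a hypothetical helper replace_second_char that the caller passes a modifiable array to:

```c
#include <stddef.h>

/* Replaces the second character of a writable string with c.
   Returns 1 on success, 0 if s is NULL or shorter than two characters. */
int replace_second_char(char *s, char c)
{
    if (s == NULL || s[0] == '\0' || s[1] == '\0')
        return 0;
    char *p = s;
    p++;        /* now points at the second character */
    *p = c;     /* safe only because the caller supplied writable memory */
    return 1;
}
```

Called with char string[] = "this is a string"; the array becomes "txis is a string". This also matches the update in the question: it worked when the string was a function parameter because the argument was a writable buffer.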

Resources