C language and some doubts about pointers - c

#include<stdio.h>
#include<stdlib.h>
char* re()
{
char *p = "hello";
return p;
}
int main()
{
char* tem = re();
printf("%s", tem);
return 0;
}
My compiler is Dev-C++.
I think that when the function re completes, the pointer p is destroyed, and the stack space that p points to is released as well. So the pointer tem should not be able to access the memory that p pointed to.
In my opinion, this code should be buggy. Why isn't it?
This problem has puzzled me for a long time. If you can tell me the reason, I would appreciate it.

p does not point to stack space. It points to the string literal "hello". Since string literals are guaranteed to remain valid for the whole run of the program, your program is OK.
(I don't know about Dev-C++, but with most compilers, string literals are placed in read-only memory when the program is loaded, and stay there until it ends.)
Edit: note that even if the string were on the stack and the code really were buggy, nothing in the language guarantees that it will not appear to work. Invalid memory can (but does not have to) still contain the value it held before it became invalid.

The string "hello" is not stack-allocated (but the pointer char *p is). It is in the 'data segment' because it's a constant value (read-only memory).
From C FAQ: http://c-faq.com/decl/strlitinit.html

Related

Is putting string direct to the pointer (without declare static or dynamic space in RAM) good practice?

char* load_text(char* text, int how_many)
{
char* result = 0;
char* here = 0;
result = fgets(text, how_many, stdin);
if (result)
{
here = strchr(text, '\n');
if (here)
*here = '\0';
else
while (getchar() != '\n')
continue;
}
return result;
}
This function is simple: it reads a string from the keyboard into the buffer that text points to, then searches the string for a newline '\n' and replaces it with the null character '\0'.
If no newline was found, it discards the rest of the input line, up to and including the newline produced by pressing ENTER.
My question is:
Is it good to put a string directly into the pointer, from the function fgets() to the pointer *result?
Or is it bad practice?
I have searched for a lot of information about this problem,
and many people told me:
Never put a string directly into a pointer that points to a random address in RAM without reserving static space or dynamically allocating memory!
I am learning C from Stephen Prata's book "C Primer Plus, 6th Edition", and I found this example in that book.
From my point of view, putting strings directly at a random address in RAM, without statically declared space for the string or dynamic allocation, is a very bad idea.
Correct me if I am wrong.
I think this example really makes a mess in RAM.
I want to know whether my point of view is correct.
There is a notion that memory is associated with pointers. For example, in int *p = malloc(3 * sizeof *p);, some people say that memory is “allocated to p”. This is not correct. Memory is allocated, the address of that memory is returned, and that address is assigned to p, meaning that the value of p is set to that address. No ownership of the memory is associated with p, nor is any enduring relationship between p and the memory formed, other than that p currently holds the address of the memory.
You could follow this with int *q = p; and int *r = p;, and then p, q, and r would have the same address, and none of them would have any privileged status over the others. They are simply variables that hold values.
In result = fgets(text, how_many, stdin);, the value returned by fgets is assigned to result. There is no need to allocate any memory for result to point to before this. In fact, doing so would be counterproductive, because the address assigned to result would be overwritten by result = fgets(text, how_many, stdin);.
Is it good to put a string directly into the pointer, from the function fgets() to the pointer *result?
It is fine.
And many people told me: Never put a string directly into a pointer that points to a random address in RAM without reserving static space or dynamically allocating memory!
You should never use a pointer before it has been assigned an appropriate address. For example, you should never use *result, as in *result = x; or x = *result;, before result has been set to point to an appropriate address.
result = fgets(text, how_many, stdin); sets result to an appropriate address, so it is fine.
There is no simple answer; it all depends. Your function depends on the caller supplying a valid pointer and a valid length, but C coding is like that: you are at the mercy of your caller. Look at the fgets function: it has to depend on its caller providing valid arguments too.
This function has no responsibility to allocate memory; that has been pushed to the caller. If the caller does not allocate correctly before calling, this function will fail, but through no fault of its own.
Since you check the result of fgets and return the error indication if there is one, it all works out.
When you say this:
Never put a string directly into a pointer that points to a random address in RAM without reserving static space or dynamically allocating memory!
I think what you mean is "never write through an uninitialized pointer", which is true. It does not mean "never put a string directly into a pointer", as that doesn't mean anything.
For example, if you had this:
char* x;
load_text(x, 100);
That is undefined behaviour, no allocation was made, but it's not the fault of the load_text function itself. This is just buggy code that needs to be fixed.

Do I misunderstand this example about scope of string literals?

I was reading up on common C pitfalls and came up to this article on some famous Uni website. (It is the 2nd link that comes up on google).
The last example on that page is,
// Memory allocation on the stack
void b(char **p) {
char * str="print this string";
*p = str;
}
int main(void) {
char * s;
b(&s);
s[0]='j'; //crash, since the memory for str is allocated on the stack,
//and the call to b has already returned, the memory pointed to by str
//is no longer valid.
return 0;
}
That explanation in the comment got me thinking: isn't the memory for string literals static?
Isn't the actual error there then that you are not supposed to modify string literals, because it is undefined behavior? Or are the comments there correct and my understanding of that example is wrong?
Upon searching further, I saw this question: referencing a char that went out of scope and I understood from that question that, the following is valid code.
#include <malloc.h>
char* a = NULL;
{
char* b = "stackoverflow";
a = b;
}
int main() {
puts(a);
}
Also this question agrees with the other stackoverflow question and my thinking, but opposes the comment from that website's code.
To test it, I tried the following,
#include <stdio.h>
#include <malloc.h>
void b(char **p)
{
char * str = "print this string";
*p = str;
}
int main(void)
{
char * s;
b(&s);
// s[0]='j'; //crash, since the memory for str is allocated on the stack,
//and the call to b has already returned, the memory pointed to by str is no longer valid.
printf("%s \n", s);
return 0;
}
which as expected does not give a segmentation fault.
The standard says (emphasis mine):
6.4.5 String literals
[...] The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. [...]
[...] If the program attempts to modify such an array, the behavior is undefined. [...]
No, you misunderstand the reason for crash. String literals have static duration, meaning that they exist for the lifetime of the program. Since your pointer points to the literal, you can use it anytime.
The reason for the crash is the fact that string literals are read-only. In fact char* x = "" is an error in C++, as it should be const char* x = "". They are read-only from language perspective, and any attempt to modify them would lead to undefined behavior.
In practical terms, they are often put in the read-only segment, so any attempt at modification triggers a GPF - general protection fault. Usual response to GPF is a program termination - and this is what you are witnessing with your application.
String literals are generally placed in a read-only section of the executable (.rodata in an ELF file), and under Linux, Windows, or macOS they end up in a memory region that generates a fault when written to (the OS configures this through the MMU or MPU when it loads the program).

If referencing constant character strings with pointers, is memory permanently occupied?

I'm trying to understand where things are stored in memory (stack/heap, are there others?) when running a C program. Compiling this gives the warning "function returns address of local variable":
char *giveString (void)
{
char string[] = "Test";
return string;
}
int main (void)
{
char *string = giveString ();
printf ("%s\n", string);
}
Running it gives varying results; it just prints gibberish. I gather from this that the char array called string in giveString() is stored in the stack frame of the giveString() function while it is running. But if I change the type of string in giveString() from char array to char pointer:
char *string = "Test";
I get no warnings, and the program prints out "Test". So does this mean that the character string "Test" is now located on the heap? It certainly doesn't seem to be in the stack frame of giveString() anymore. What exactly is going on in each of these two cases? And if this character string is located on the heap, so all parts of the program can access it through a pointer, will it never be deallocated before the program terminates? Or would the memory space be freed up if there were no pointers pointing to it, like if I hadn't returned the pointer to main? (But that is only possible with a garbage collector like in Java, right?) Is this a special case of heap allocation that is only applicable to pointers to constant character strings (hardcoded strings)?
You seem to be confused about what the following statements do.
char string[] = "Test";
This code means: create an array in the local stack frame of sufficient size and copy the contents of constant string "Test" into it.
char *string = "Test";
This code means: set the pointer to point to constant string "Test".
In both cases, "Test" is in the const or cstring segment of your binary, where non-modifiable data exists. It is neither in the heap nor stack. In the former case, you're making a copy of "Test" that you can modify, but that copy disappears once your function returns. In the latter case, you are merely pointing to it, so you can use it once your function returns, but you can never modify it.
You can think of the actual string "Test" as being global and always there in memory, but the concept of allocation and deallocation is not generally applicable to const data.
No. The string "Test" is not on the heap, and it is not on the stack either; it is in the data portion of the program, which basically gets set up before the program runs. It's there, and you can think of it kind of like "global" data.
The following may clear it up a tad for you:
char string[] = "Test"; // declare a local array, and copy "Test" into it
char* string = "Test"; // declare a local pointer and point it at the "Test"
// string in the data section of the program
It's because in the second case you are creating a pointer to a constant string:
char *string = "Test";
The value pointed to by string is a constant and must never be changed, so it is allocated at compile time like a static variable (it is in neither the stack nor the heap).

Char* p, and scanf

I have been trying to look for a reason why the following code is failing, and I couldn't find one.
So please, excuse my ignorance and let me know what's happening here.
#include<stdio.h>
int main(void){
char* p="Hi, this is not going to work";
scanf("%s",p);
return 0;
}
As far as I understood, I created a pointer p to a contiguous area in the memory of the size 29 + 1(for the \0).
Why can't I use scanf to change the contents of that?
P.S Please correct me If I said something wrong about char*.
char* p="Hi, this is not going to work";
This does not allocate memory for you to write to.
It creates a string literal, and trying to change its contents results in undefined behaviour.
To use p as a buffer for scanf, do something like
char * p = malloc(sizeof(char) * 128); // 128 is an Example
OR
you could as well do:
char p[]="Hi, this is not going to work";
Which I guess is what you really wanted to do.
Keep in mind that this can still end up being UB, because scanf() with a plain %s does not check whether the input fits in the memory you gave it.
Remember:
char *p = "..." makes p point to a string literal, which should not be modified.
char p[] = "..." allocates enough memory to hold the string inside the "..." and its contents may be changed.
Edit :
A nice trick to avoid UB is to limit the field width:
char * p = malloc(sizeof(char) * 128);
scanf("%127s", p);
p points to a constant literal, which may in fact reside in a read-only memory area (implementation dependent). At any rate, trying to overwrite that is undefined behaviour. I.e. it might result in nothing, or an immediate crash, or a hidden memory corruption which causes mysterious problems much later. Don't ever do that.
It is crashing because no writable memory has been allocated for p. Allocate memory for p and it should be OK. What you have is a constant memory area pointed to by p. When you attempt to write something into this data segment, the runtime environment will raise a trap, which leads to a crash.
Hope this answers your question
scanf() parses data entered from stdin (normally, the keyboard). I think you want sscanf().
However, the purpose of scanf() is to parse input according to predefined conversion specifiers, which your test string doesn't contain. So that makes it a little unclear exactly what you are trying to do.
Note that sscanf() takes an additional argument as the first argument, which specifies the string being parsed.

Is modifying a string pointed to by a pointer valid?

Here's a simple example of a program that concatenates two strings.
#include <stdio.h>
void strcat(char *s, char *t);
void strcat(char *s, char *t) {
while (*s++ != '\0');
s--;
while ((*s++ = *t++) != '\0');
}
int main() {
char *s = "hello";
strcat(s, " world");
while (*s != '\0') {
putchar(*s++);
}
return 0;
}
I'm wondering why it works. In main(), I have a pointer to the string "hello". According to the K&R book, modifying a string like that is undefined behavior. So why is the program able to modify it by appending " world"? Or is appending not considered as modifying?
Undefined behavior means a compiler can emit code that does anything. Working is a subset of undefined.
I +1'd MSN, but as for why it works, it's because nothing has come along to fill the space behind your string yet. Declare a few more variables, add some complexity, and you'll start to see some wackiness.
Perhaps surprisingly, your compiler has allocated the literal "hello" into read/write initialized data instead of read-only initialized data. Your assignment clobbers whatever is adjacent to that spot, but your program is small and simple enough that you don't see the effects. (Put it in a for loop and see if you are clobbering the " world" literal.)
It fails on Ubuntu x64 because gcc puts string literals in read-only data, and when you try to write, the hardware MMU objects.
You were lucky this time.
Especially in debug mode some compilers will put spare memory (often filled with some obvious value) around declarations so you can find code like this.
It also depends on how the pointer is declared. For example, this lets you change both ptr and the characters ptr points to:
char * ptr;
Can change the characters ptr points to, but not ptr itself:
char * const ptr;
Can change ptr, but not the characters ptr points to (char const * ptr; means the same thing):
const char * ptr;
Can't change anything:
const char * const ptr;
According to the C99 specification (C99: TC3, 6.4.5, §5), string literals are
[...] used to initialize an array of static storage duration and length just sufficient to contain the sequence. [...]
which means they have the type char [], i.e. modification is possible in principle. Why you shouldn't do it is explained in §6:
It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.
Different string literals with the same contents may - but don't have to - be mapped to the same memory location. As the behaviour is undefined, compilers are free to put them in read-only sections in order to cleanly fail instead of introducing possibly hard to detect error sources.
I'm wondering why it works
It doesn't. It causes a Segmentation Fault on Ubuntu x64; for code to work it shouldn't just work on your machine.
Moving the modified data to the stack gets around the data area protection in Linux:
int main() {
char b[] = "hello";
char c[] = " ";
char *s = b;
strcat(s, " world");
puts(b);
puts(c);
return 0;
}
Though you are then only safe as long as " world" fits in the unused space between stack data; change b to "hello to" and Linux detects the stack corruption:
*** stack smashing detected ***: bin/clobber terminated
The compiler is allowing you to modify s because you have improperly marked it as non-const -- a pointer to a static string like that should be
const char *s = "hello";
With the const modifier missing, you've basically disabled the safety that prevents you from writing into memory that you shouldn't write into. C does very little to keep you from shooting yourself in the foot. In this case you got lucky and only grazed your pinky toe.
s points to a bit of memory that holds "hello", but was not intended to contain more than that. This means that it is very likely that you will be overwriting something else. That is very dangerous, even though it may seem to work.
Two observations:
The * in *s-- is not necessary. s-- would suffice, because you only want to decrement the value.
You don't need to write strcat yourself. It already exists (you probably knew that, but I'm telling you anyway:-)).

Resources