How are char[] and char* different in this case? - c

When we run this piece of code, it works normally and prints string constant on the screen:
#include <stdio.h>
char *someFun(){
    char *temp = "string constant";
    return temp;
}
int main(){
    puts(someFun());
}
But when we run the following similar code, it doesn't work and prints some garbage on the screen:
#include <stdio.h>
char *someFun1(){
    char temp[ ] = "string";
    return temp; /* the problem under discussion: address of a local array */
}
int main(){
    puts(someFun1());
}
What is the reason behind it? Essentially, both functions do similar things (i.e. return a "string"), but still they behave differently. Why is that?

char *temp = "string constant";
The string literal "string constant" has static storage duration: it typically resides in a read-only segment and exists until program termination. So a pointer to it remains valid after the function returns.
char temp[ ] = "string";
Here "string" is copied into temp, an array that resides on the stack. When the function returns, the stack is unwound and the variables in the function's scope are deallocated. You are returning a pointer to an array that no longer exists, which is undefined behavior, and hence you get garbage. Sometimes you may still get the correct result, but you should not rely on it.

In the first case, the pointer temp will point to a global constant storing "string constant". Therefore, when you return the pointer, it's valid.
In the second case, temp is just a char array on the stack holding a copy of "string", and it dies when you return from the function.

Related

Why doesn't the strcat implementation cause a segmentation fault?

I'm trying to learn C and I tried the exercise in the book "The C Programming Language" in which I implemented the strcat() function as below:
char *my_strcat(char *s, const char *t)
{
    char *dest = s;
    while (*s) s++;
    while (*s++ = *t++);
    return dest;
}
And I'm calling this like:
#include <stdio.h>
int main(int argc, char const *argv[])
{
    char x[] = "Hello, ";
    char y[] = "World!\n";
    my_strcat(x, y);
    puts(x);
    return 0;
}
My problem is, I don't fully understand my own implementation.
My question is this: by the time the line while (*s) s++; completes, s holds the address of the memory location that contains \0, which is the last element of array x.
Then in the line while (*s++ = *t++);, I'm setting s to the address of the next block of memory which is outside array x and copy the content of t to this new location. How is it allowed to write the content at the location of t to the location pointed by s when it was not part of the storage I had requested when I initialized x?
If I called my_strcat like below, I get a segmentation fault:
int main(int argc, char const *argv[])
{
    char *x = "Hello, ";
    char *y = "World!\n";
    my_strcat(x, y);
    puts(x);
    return 0;
}
which kind of makes sense. My understanding is char *x = "foo" and char x[] = "foo" are the same with the difference that in the latter case, storage allocated to x is fixed while the first is not. So, I feel like the segmentation fault should happen in the latter case, rather the former?
Thank you for clarification.
This is undefined behavior. Anything can happen.
The real, practical reason that the first program works is that the arrays in the first program are stored on the stack, which is writable. The copy overwrites whatever follows x on the stack, which is undefined behavior, but it "just works", which is unfortunate.
The second program doesn't "work" because the strings are string literals, typically stored in read-only memory. Any attempt to write to these strings is undefined behavior (in practice, often a segfault).
Your implementation of strcat is valid; you just need to allocate adequate space for the string that you're appending to.
So, to recap:
How is it allowed to write the content at the location of t to the location pointed by s when it was not part of the storage I had requested when I initialized x?
It's not. This is undefined behavior that just happens to "work".

Why can I return `int` after a function but not `char *`?

I'm a newbie to C. I had extended the question from the previous question: Strange behavior when returning "string" with C (Thanks for all who answered or commented that question, by the way.)
Pretty straightforward:
Why can this work:
#include <stdio.h>
int func() {
    int i = 5;
    return i;
}
int main() {
    printf("%d",func());
}
But not this:
#include <stdio.h>
char * func() {
    char c[] = "Hey there!";
    return c;
}
int main() {
    printf("%s",func());
}
From the previous question, logically the int i should not exist either because the function has returned, so why can it still be returned while char c[] cannot?
(It seems to be duplicated from "Pointers and memory scope" but I would like to know more about what is the difference between returning an int and a char *.)
The problem is not returning a char *; it is returning a pointer to something that is allocated on the stack.
If you allocate memory for your string rather than pointing into the function's stack frame, there is no problem. Something like this:
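#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char * func() {
    char c[] = "Hey there!";
    return strdup(c); /* heap copy (POSIX, C23); the caller must free it */
}
int main() {
    char* str = func();
    if (str == NULL) {
        return 1; /* strdup failed to allocate */
    }
    printf("%s", str);
    free(str);
}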
It is important to mention that in both cases you are copying a value, and in both cases the copied value is correct, but the meaning of the copied value differs.
In the first case, you are copying an int value, and after you return from the function you use that int value, which is valid. But in the second case, even though the pointer value is copied correctly, it refers to an invalid memory address: the stack frame of the called function.
Based on suggestions in the comments, I decided to add another, better practice for allocating memory in this code:
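#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* Copies the string into buf if len is large enough; always returns the size needed. */
int func(char *buf, int len) {
    char c[] = "Hey there!";
    int size = strlen(c) + 1;
    if (len >= size) {
        strcpy(buf, c);
    }
    return size;
}
int main() {
    int size = func(NULL, 0); /* the first call only queries the required size */
    char *buf = calloc(size, sizeof(*buf));
    if (buf == NULL) {
        return 1;
    }
    func(buf, size);
    printf("%s", buf);
    free(buf);
    return 0;
}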
A similar approach is used in a lot of Windows API functions. This approach is better because the owner of the pointer is more obvious (main here).
In the first example the return value is copied. In your second example you're returning a pointer, which will point to a memory location which no longer exists.
In the first case, you return the int value 5 from the function. You can then print that value.
In the second case however, you return a value of type char *. That value points to an array that is local to the function func. After that function returns the array goes out of scope, so the pointer points to invalid memory.
The difference between these two cases is a value that you use directly, versus a pointer value that no longer points to valid memory. Had you returned a pointer to memory allocated by malloc, then the pointer would point to valid memory.
You are trying to return a pointer to a local array, which is very bad. If you want to return a pointer to an array, allocate it dynamically using malloc inside your func().
Then you must call free() on the caller's side to release the memory when you no longer need it.
In the first example, you return an int, and the second you return a pointer to a char. They both return in exactly the same manner, it is just a matter of understanding the stack and how values are returned.
Even though i was declared in the function and is allocated on the stack, when the function returns it returns the value of i (which is basically copied, so when i falls off the stack the value of i is still returned.)
This is the exact same thing that happens to the char * in the second example. It will still be a pointer to a char, and it returns the 'copied' value of c. However, since it was allocated on the stack, the address it points to is effectively invalid. The pointer value itself has not changed, but what it points to has.
You would have to dynamically allocate this to avoid this situation.
The return value of function is returned by copy. In the first example, you get a copy of the integer variable from the function. In the second you get a copy of the char pointer, not a copy of the string.
The pointer references string data that has automatic storage, so it is no longer valid after the function returns. The space becomes available for use by other code and may be modified; any attempt to access it has undefined behaviour.
The point is, it is a pointer that is returned, not a string; in C, strings (and more generally arrays) are not first-class data types.
Depending on your needs there are a number of valid ways of returning the string data; for example the following is valid:
char* func()
{
    static char c[] = "Hey there!";
    return c;
}
because here although the local variable goes out of scope the static data is not destroyed or de-allocated, and any reference to it remains valid.
Another alternative is to embed the string in a struct which is a first-class data type:
typedef struct
{
    char content[256] ;
} sString ;
sString func()
{
    sString c = {"Hey there!"};
    return c;
}
Or more conventionally to copy the data to a caller buffer:
char* func( char* buffer )
{
    char c[] = "Hey there!";
    strcpy( buffer, c ) ;
    return buffer ;
}
For clarity, I have omitted code that mitigates the possibility of a buffer overrun in this last example; such code is advised.

Can I use values of array declared as local variable outside of its scope?

Suppose the following simple code:
int main(void){
    char *p;
    int i = 1;
    while(i){
        char str[] = "string";
        p = str;
        i = 0;
    }
    /* Can I use above string by using `p` in here? */
    return 0;
}
I declared a string (char array) as a local variable that is only valid inside while{}. But I saved its address in the pointer p, which is also valid outside while{}. Is it okay to use the string outside while{} by using p?
If it is okay, why does it work?
Is it okay to use the string outside while{} by using p?
No. Within the scope of the while, p points to the first character of str. Outside the while, str no longer exists, so p does not point to a valid memory location.
You can achieve the desired effect by using the static storage class specifier in the declaration of str:
static char str[] = "string";
Outside the while you can then use p:
printf("%s\n", p);
This is because a static variable declared within a block resides at the same storage location throughout program execution.
No, this is not OK. This is what is known as a dangling pointer. If you manage to use the memory before the program attempts to reuse this address, you may still get your expected result. However, if the memory is used again, you will get unexpected results and bugs that are very hard to locate.
Try this as an example. It may not work with all compilers, but in Xcode on OS X I get this result:
pa: apples
pa: orange
po: orange
#include <stdio.h>
char *pa;
char *po;
void apples(void)
{
    char sa[]="apples";
    pa = sa;
    printf("pa: %s\n", pa);
}
void orange(void)
{
    char so[]="orange";
    po = so;
    printf("pa: %s\n", pa);
    printf("po: %s\n", po);
}
int main(void)
{
    apples();
    orange();
    return 0;
}
Each function call to apples() and orange() subsequently grows and then shrinks the stack and you can see how the string (either apples or orange) ends up in the same memory location on the stack. But there are zero guarantees that string will be there once it is out of scope. This trick "works" here, but it is very dangerous.
As an additional exercise, try printing the contents of po in main() after the call to orange(). In most cases the string will be gone, as the call to printf() overwrites that space on the stack.
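The literal "string" itself lives in the read-only (data) area of the process, but str is an array with automatic storage that holds a copy of it, so p points to that copy and becomes invalid once the block ends. Had str been declared as a pointer to the literal (const char *str = "string";), you could still read the data through p after the loop. However, attempting to write to that location would typically result in a segmentation fault.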

What is the point of initializing a string pointer in C

Here is my question. In C, I saw code like this:
char *s = "this is a string";
but then, s is not actually pointing to an actual memory right?
and if you try to use s to modify the string, the result is undefined.
my question is, what is the point of assigning a string to the pointer then?
thanks.
char *s = "this is a string";
This is a string literal. The string is typically stored in a read-only location, and the address of that memory is assigned to s. So when you try to write to it, the behavior is undefined and you might see a crash.
Q1:s is not actually pointing to an actual memory right?
No, that is wrong: s holds the memory address where this string is stored.
Q2:what is the point of assigning a string to the pointer then?
http://en.wikipedia.org/wiki/String_literal
When you do a char *s = "this is a string";, the memory is automatically allocated and populated with this string and a pointer to that memory is returned back to the caller (you). So, you do not need to explicitly put the string to some memory.
s is not actually pointing to an actual memory right?
Wrong, it does point to actual memory, whose allocation is hidden from you. This memory typically lies in a read-only section, so it must not be modified. Hence the advice to use the keyword const for such pointers: string literals are constants, even though in C their type is a plain char array.
if you try to use s to modify the string, the result is undefined.
Because you are trying to modify memory which is marked as read-only.
what is the point of assigning a string to the pointer then?
Another way to achieve the same is,
char temp[260] = {0} ;
char *s ;
strcpy (temp, "this is a string");
s = temp ;
Here the memory temp is managed by you.
char *s = "this is a string" ;
Here the memory has static storage duration and is set up for you by the compiler and loader.
By using const char * instead of char [], the string can be stored in read-only memory. This allows the compiler to eliminate string duplication.
Try running this program:
#include <stdio.h>
int main()
{
    const char *s1 = "This is a string";
    const char *s2 = "This is a string";
    if (s1 == s2) {
        puts("s1 == s2");
    } else {
        puts("s1 != s2");
    }
}
For me it outputs s1 == s2, which means that both pointers point to the same memory location (the compiler is allowed, but not required, to merge identical literals).
Now try replacing const char * with char []:
#include <stdio.h>
int main()
{
    const char s1[] = "This is a string";
    const char s2[] = "This is a string";
    if (s1 == s2) {
        puts("s1 == s2");
    } else {
        puts("s1 != s2");
    }
}
This outputs s1 != s2: each array is a distinct object with its own copy of the string, so the compiler had to duplicate the string memory.
By using char * instead of char [] the compiler can do these optimizations, which decrease the size of your executable file.
Also note that you should not use char *s = "string". You should use const char *s = "string" instead. Pointing a plain char * at a literal is legal C but unsafe (in C++ the conversion was deprecated and later removed). By using const char * you avoid the mistake of passing the string to a function that tries to modify it.
s is not actually pointing to an actual memory right?
Technically, it is pointing to read-only memory. But the compiler is allowed to do whatever it wants as long as it follows the as-if rule. For example, if you never use s, it can be removed from your code completely.
Since it is read-only, any attempt to modify it is undefined behaviour. You can and should use const to indicate that the target of a pointer is immutable:
const char* s = "Hello const";
my question is, what is the point of assigning a string to the pointer then?
Just like storing a constant to any other type. You don't always need to modify strings. But you may want to pass a pointer to a string around to functions that don't care whether they point to a literal or to an array you own:
#include <stdio.h>
void foo(const char* str) {
    // I won't modify the target of str. I don't care who owns it.
    printf("foo: %s", str);
}
void bar(const char* str) {}
int main(void) {
    const char* a = "Hello, this is a literal";
    char b[] = "Hello, this is a char array and I own it";
    foo(a);
    bar(a);
    foo(b);
    return 0;
}
Look at this code:
char *s = "this is a string";
printf("%s",s);
As you can see, I used "assigning a string to the pointer". Is that clear?
And know that s is pointing to actual memory, but it is read-only.
If you assign like this,
char *s = "this is a string";
the string will be stored in read-only memory, and s will point to some address in that read-only area. That is the reason for the undefined behavior when you try to modify it.
If you print the pointer like this, you can see that memory address:
printf("%p", (void *)s);
If instead you allocate memory and copy the string into it, you can access and modify it through the pointer like an array.
Everybody else has told you about the read-only memory and potential for undefined behavior if you attempt to modify the string, so I'll skip that part and answer the question, "what is the point of assigning a string to a pointer then?".
There are two reasons why:
1) Just for brevity. After assigning the string to the pointer you can refer to the string as s instead of repeatedly typing "this is a string". This assumes of course that you intend to use the string in multiple function calls.
2) Because you may want to change the string that the pointer references. For example, in the following code, s is initialized assuming that the code will succeed, and is subsequently changed if there's a failure. At the end, the string that s points to is printed.
const char *s = "Yay, it worked!!!";
if ( openTheFile() == FAILED )
    s = "Dang, couldn't open the file";
else if ( readTheFile() == FAILED )
    s = "Oops, there's nothing in the file";
printf( "%s\n", s );
Note that const char * means that the string that s points to cannot be changed. It doesn't mean that s itself can't be changed.
In your case the pointer s is just pointing to the location that holds the first character of the string. So if we want to change the string, it creates confusion, as the pointer s is still pointing to the previous location. And if you want to change a string using a pointer, you should take care of the string's terminator (i.e. the null character '\0').

Unexpected String behaviour during printing

I am experiencing some unpredictable behaviour with strings. Here it goes:
int main()
{
    char *str = charfun();
    printf("%s",str); // This is printing garbage values
    printf("%c%c%c%c",str[0],str[1],str[2],str[3]); /* if I am printing
    like this it is printing the result "Helo" why is it so ?
    and str[4] is '\0' (checked its ASCII value)*/
    return 0;
}
char* charfun()
{
    char a[10]="Helo";
    return a;
}
EDIT -
The thing I am concerned about is not the local address which I am returning. I know it can land me in trouble. But I want to understand how the two printf calls work and why they give different results.
It is because a in charfun() is a local array. When charfun() returns, a's address is assigned to str, but the array it points to is already invalid.
The issue here is that a local variable is allocated on the stack and is therefore unavailable once the function finishes execution. The string a is local to the function, so you can't return a pointer to it; using that pointer is undefined behavior. The preferable way is to use malloc() to reserve non-local memory, so that the string is allocated on the heap instead of the stack:
#include <stdlib.h>
#include <string.h>
char *charfun(){
    char *a = malloc(sizeof(char)*10);
    if (a != NULL)
        strcpy(a,"Helo");
    return a; /* the caller must free() this */
}
