Why does strcpy trigger a segmentation fault with global variables? - c

So I've got some C code:
#include <stdio.h>
#include <string.h>
/* putting one of the "char*"s here causes a segfault */
void main() {
char* path = "/temp";
char* temp;
strcpy(temp, path);
}
This compiles, runs, and behaves as it looks. However, if one or both of the character pointers is declared as global variable, strcpy results in a segmentation fault. Why does this happen? Evidently there's an error in my understanding of scope.

As other posters mentioned, the root of the problem is that temp is uninitialized. When declared as an automatic variable on the stack it will contain whatever garbage happens to be in that memory location. Apparently for the compiler+CPU+OS you are running, the garbage at that location is a valid pointer. The strcpy "succeeds" in that it does not segfault, but really it copied a string to some arbitrary location elsewhere in memory. This kind of memory corruption problem strikes fear into the hearts of C programmers everywhere as it is extraordinarily difficult to debug.
When you move the temp variable declaration to global scope, it is placed in the BSS section and automatically zeroed. Attempts to dereference temp (now a null pointer) then result in a segfault.
When you move path to global scope, temp moves up one location on the stack. The garbage at that location is apparently not a valid pointer, and so dereferencing temp results in a segfault.

The temp variable doesn't point to any storage (memory) and it is uninitialized.
If temp is declared as char temp[32]; then the code would work no matter where it is declared. There are other problems with declaring temp with a fixed size like that, but that is a question for another day.
Now, why does it crash when declared globally and not locally? Luck...
When declared locally, the value of temp is whatever happens to be on the stack at that time. It is luck that it points to an address that doesn't cause a crash. However, it is trashing memory used by someone else.
When declared globally, temp has static storage duration, which C guarantees is zero-initialized; on most systems such variables are stored in data segments backed by demand-zero pages. Thus char *temp behaves as if it was declared char *temp = 0.
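The zero-initialization of globals can be checked directly. This is only a minimal sketch; the names global_temp and check_global_is_null are made up for illustration. It deliberately inspects only the global, because reading an uninitialized local pointer is itself undefined behavior.

```c
#include <assert.h>
#include <stddef.h>

/* File-scope pointer with no initializer: it has static storage duration,
   so the C standard guarantees it starts out as a null pointer. */
static char *global_temp;

/* Hypothetical helper: aborts if the global is not NULL. */
void check_global_is_null(void) {
    assert(global_temp == NULL);
}
```

A null pointer is why the global version of the question's program fails right away instead of silently corrupting memory.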

You forgot to allocate and initialize temp:
temp = (char *)malloc(TEMP_SIZE);
Just make sure TEMP_SIZE is big enough. You can also calculate the size at run-time; it must be at least strlen(path) + 1, to leave room for the terminating '\0'.
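For what it's worth, computing the size at run-time could look roughly like this. It is a sketch, not the only way to do it, and copy_string is a hypothetical helper name (on POSIX systems strdup does much the same job).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd copy of src, or NULL on failure.
   The caller owns the result and must free() it. */
char *copy_string(const char *src) {
    assert(src != NULL);
    size_t size = strlen(src) + 1;   /* +1 for the terminating '\0' */
    char *dst = malloc(size);
    if (dst != NULL) {
        memcpy(dst, src, size);      /* copies the terminator too */
    }
    return dst;
}
```

In the question's program this would be used as char *temp = copy_string(path); followed by a NULL check and, eventually, free(temp);.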

As mentioned above, you forgot to allocate space for temp.
I prefer strdup to malloc+strcpy. It does what you want to do.

No - this doesn't work regardless of the variables - it just looks like it did because you got (un)lucky. You need to allocate space to store the contents of the string, rather than leave the variable uninitialised.
Uninitialised variables on the stack are going to be pointing at pretty much random locations of memory. If these addresses happen to be valid, your code will trample all over whatever was there, but you won't get an error (but may get nasty memory corruption related bugs elsewhere in your code).
Globals consistently fail because they are zero-initialized, so a global pointer starts out as a null pointer, which on typical systems refers to unmapped memory. Attempting to dereference it gives you a segfault immediately (which is better: leaving it to later makes the bug very hard to track down).

I'd like to rewrite Adam's first fragment as
// Make temp a static array of 256 chars
char temp[256];
strncpy(temp, path, sizeof(temp));
temp[sizeof(temp)-1] = '\0';
That way you:
1. don't have magic numbers laced through the code, and
2. you guarantee that your string is null terminated.
The second point comes at the cost of losing the tail of your source string if it is >= 256 characters long.
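The same idea can be wrapped in a small function that also reports truncation. This is a sketch under my own assumptions; bounded_copy is a made-up name, not a standard function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst (capacity dst_size bytes) and
   always terminates dst. Returns 1 if src had to be truncated, else 0. */
int bounded_copy(char *dst, size_t dst_size, const char *src) {
    assert(dst != NULL && src != NULL && dst_size > 0);
    strncpy(dst, src, dst_size - 1);  /* leave room for the terminator */
    dst[dst_size - 1] = '\0';         /* strncpy adds none when src is too long */
    return strlen(src) >= dst_size;
}
```

Called as bounded_copy(temp, sizeof(temp), path), it keeps the size in one place and lets the caller decide what to do about a truncated path.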

The important part to note:
destination string dest must be large enough to receive the copy.
In your situation temp has no memory allocated to copy into.
Copied from the man page of strcpy:
DESCRIPTION
The strcpy() function copies the string pointed to by src (including
the terminating '\0' character) to the array pointed to by dest. The
strings may not overlap, and the destination string dest must be large
enough to receive the copy.

You're invoking undefined behavior, since you're not initializing the temp variable. It points to a random location in memory, so your program may work, but most likely it will segfault. You need to have your destination string be an array, or have it point to dynamic memory:
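// Make temp a static array of 256 chars
char temp[256];
strncpy(temp, path, sizeof(temp) - 1);
temp[sizeof(temp) - 1] = '\0';
// Or, use dynamic memory
char *temp = (char *)malloc(256);
strncpy(temp, path, 255);
temp[255] = '\0';
Also, use strncpy() instead of strcpy() to avoid buffer overruns, and terminate the destination yourself, because strncpy() does not add a '\0' when the source fills the buffer.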

Related

Overwriting global char array vs global char pointer in a loop in C

I have an infinite while loop, I am not sure if I should use a char array or char pointer. The value keeps getting overwritten and used in other functions. With a char pointer, I understand there could be a memory leak, so is it preferred to use an array?
char *recv_data = NULL;
int main(){
.....
while(1){
.....
recv_data = cJSON_PrintUnformatted(root);
.....
}
}
or
char recv[256] = {0};
int main(){
.....
while(1){
.....
strcpy(recv, cJSON_PrintUnformatted(root));
.....
}
}
The first version should be preferred.
It doesn't have a limit on the size of the returned string.
You can use free(recv_data) to fix the memory leak.
The second version has these misfeatures:
The memory returned from the function can't be freed, because you never assigned it to a variable that you can pass to free().
It's a little less efficient, since it performs an unnecessary copy.
Based on how you used it, cJSON_PrintUnformatted returns a pointer to a char array. Since you don't pass it a buffer to write into, it probably allocates the memory dynamically inside the function. You probably have to free that memory, so you need the returned pointer in order to deallocate it yourself.
The second option discards that returned pointer, so you lose your only way to free the allocated memory. Hence it will remain allocated: a memory leak.
But of course this all depends on how the function is implemented. Maybe it just manipulates a global array and returns a pointer to it, so there is no need to free it.
Indeed, the second version has a memory leak, as @Barmar points out.
However, even if you were to fix the memory leak, you still can't really use the second version of your code: with the second version, you have to decide at compile time what the maximum length of the string returned by cJSON_PrintUnformatted() is. Now,
If you choose a value that's too low, strcpy() will write past the end of the array and corrupt whatever memory follows it.
If you choose a value that's so high as to be safe, you reserve a large amount of memory that is almost never needed (and if the array were a local variable, you might exceed the space available for your program's stack, causing a Stack Overflow; yes, like the name of this site). You could cap the copy with strncpy(), giving the maximum size, and then what you'd have is a truncated string.
So you really don't have much choice other than to use whatever memory is pointed to by cJSON_PrintUnformatted()'s return value (it's probably heap-allocated memory). Plus, why make a copy of it when it's already there for you to use? Be lazy :-)
PS - What should really happen is for the cJSON_PrintUnformatted() to take a buffer and a buffer size as parameters, giving its caller more control over memory allocation and resource limits.
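To make the ownership pattern from these answers concrete, here is a sketch of the "keep the pointer, use it, free it" loop. Note that make_message is a hypothetical stand-in I wrote for a function like cJSON_PrintUnformatted(); it is not part of cJSON, and I'm only assuming the real function hands back heap memory in a similar way.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a function such as cJSON_PrintUnformatted():
   returns a freshly malloc'd string that the caller must free. */
char *make_message(int n) {
    int len = snprintf(NULL, 0, "{\"n\":%d}", n);  /* measure first */
    assert(len >= 0);
    char *s = malloc((size_t)len + 1);
    if (s != NULL) {
        snprintf(s, (size_t)len + 1, "{\"n\":%d}", n);
    }
    return s;
}

/* Runs `iterations` rounds and returns the total number of bytes produced. */
size_t run_loop(int iterations) {
    size_t total = 0;
    for (int i = 0; i < iterations; i++) {
        char *recv_data = make_message(i);  /* keep the returned pointer */
        if (recv_data == NULL) {
            break;                          /* allocation failed */
        }
        total += strlen(recv_data);         /* use the data here */
        free(recv_data);                    /* release it before the next round */
    }
    return total;
}
```

Each iteration frees what it received, so memory use stays flat no matter how long the loop runs, and no fixed-size array is needed.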

Where the char* is pointing to?

I have some very basic questions regarding pointers and memory allocation.
In below code where does pointer c actually point to? In other words where does the string "xyz" get stored in memory (stack/heap etc.)?
What will happen to memory location allocated for a as I am not using it anymore?
Code seems to work well if I un-comment the commented section. What's happening with the memory in that scenario?
#include <stdio.h>
main()
{
char *c;
//c = (char *)malloc(2);
c = "a";
c = "xyz" ;
printf("%s",c);
return 0;
}
Output:
xyz
Edit:
After reading few of the answer and first comment another question came up in my mind:
In below case, where do the strings get stored? Can I alter them later on?
char *c[] = {"a","xyz"};
The specific details are implementation dependent, but in most common implementations, literal strings like "a" and "xyz" are stored in the text section of the program, like the machine code that implements the program. Assigning c = "xyz"; sets c to point to that location in memory.
The memory for "a" is unaffected. However, an optimizing compiler may notice that c was never used between that assignment and being reassigned, so it could simply ignore the first assignment, and never allocate any space for "a" at all.
The memory you allocated with malloc() stays allocated until the program ends. Allocating memory without freeing it is called a memory leak, and you should try to avoid it.
1. In the code below, where is the pointer c actually pointing? In other words, where does the string "xyz" get stored in memory (stack/heap etc.)?
The compiler places "xyz" in a read-only part of memory as an unnamed array. c is a variable of type pointer-to-char that lives in automatic storage (often called the stack), and the assignment makes it hold the location of the first character of that array. Here c just points to memory that is typically in the text section.
2. What will happen to the memory location allocated for "a", as I am not using it anymore?
In your code you have not allocated any memory; you only assign string literals to the pointer. If you do allocate memory, you need to free it.
3. The code seems to work well if I un-comment the commented section. What's happening with memory in that scenario?
If you un-comment the allocation statement, c still ends up pointing to your string literal, but the allocated memory needed to be freed first; once c is reassigned, the only pointer to it is lost.
Now for case
char *c[] = {"a","xyz"};
when you declare and initialize c, the array itself is allocated like any other array (on the stack if it is a local variable), and its elements are set to point to the beginning of each string constant. Attempting to alter these strings is undefined behavior.
"xyz" and "a" are string literals, which are usually stored in a string table.
"xyz" is printed because it is the value most recently assigned to that pointer.
To a place in a READ-ONLY part of memory (neither the heap nor the stack) where the string is stored. The literal is put there when the program is built, and the compiler is allowed to merge identical literals so that they share one copy. Either way it's read only, so treat it as a const char*: you can't change it, and trying to do so through any pointer is undefined behavior.
Nothing, it'll stay unaffected.
What's happening is that malloc returns a pointer which you then overwrite: c is redirected to the address of the literal "a", which is not the same thing as strcpy(c, "a");. The allocated memory is ignored and its pointer is lost, so you never free it. Generally nothing bad happens if you don't free the memory (at the end of the program it is reclaimed automatically by the OS), but it DOES take up memory while the program runs, so if I'd allocated, let's say, 1000000 bytes (assuming it succeeded), allocating more heap memory later could become a problem :P
About the other question... You can't alter them through that pointer. Trying is undefined behavior; on most systems the program will crash with a segmentation fault.
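If you do need a string you can modify, initialize an array from the literal instead of pointing at the literal. A small sketch of the difference (modified_copy_of_xyz is just a name I picked for the demo):

```c
#include <assert.h>
#include <string.h>

/* Hypothetical demo: fills out (at least 4 bytes) with a modified copy of "xyz". */
void modified_copy_of_xyz(char *out) {
    assert(out != NULL);
    const char *literal = "xyz";  /* points at the literal itself: read-only */
    char copy[] = "xyz";          /* a writable array initialized from the literal */

    copy[0] = 'X';                /* fine: changes the array, not the literal */
    /* literal[0] = 'X';             rejected by the compiler because of const */

    assert(strcmp(literal, "xyz") == 0);  /* the literal is untouched */
    memcpy(out, copy, sizeof copy);       /* 4 bytes, including the '\0' */
}
```

Declaring pointers to literals as const char * is a cheap way to have the compiler catch accidental writes.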

Concatenating strings - need clarification

char * a = (char *) malloc(10);
strcpy(a,"string1");
char * x = "string2";
strcat(a,x);
printf("\n%s",a);
Here, I allocated only 10 bytes to a, but after concatenating a and x (the combined string needs 15 bytes, including the terminator), C still prints the answer without any problem.
But if I do this:
char * a = "string1";
char * x = "string2";
strcat(a,x);
printf("\n%s",a);
Then I get a segfault. Why is this? Why does the first one work despite lower memory allocation? Does strcat reallocate memory for me? If yes, why does the second one not work? Is it because a & x declared that way are unmodifiable string literals?
In your first example, a is allocated in the heap. So when you're concatenating the other string, something in the heap will be overwritten, but there is no write-protection.
In your second example, a points to a region of the memory that contains constants, and is readonly. Hence the seg fault.
The first one doesn't always work; it has already caused an overflow. In the second one, a is a pointer to a constant string, which is stored in the data section, in a read-only page.
In the 2nd case what you have is a pointer to an unmodifiable string literal.
In the 1st case, you are writing past the end of a heap allocation and then printing it; the behavior is undefined, so you cannot guarantee that it will work every time.
(Maybe run it in a very large loop; you may see this undefined behavior.)
Your code is writing beyond the buffer that it's permitted, which causes undefined behavior. This can work and it can fail, and worse: it can look like it worked but cause seemingly unrelated failures later. The language allows you to do things like this because you're supposed to know what you're doing, but it's not recommended practice.
In your first case, of having used malloc to allocate buffers, you're actually being helped but not in a manner you should ever rely on. The malloc function allocates at least as much space as you've requested, but in practice it typically rounds up to a multiple of 16... so your malloc(10); probably got a 16 byte buffer. This is implementation specific and it's never a good idea to rely on something like that.
In your second case, it's likely that the memory pointed to by your a (and x) variable(s) is non-writable, which is why you've encountered a segfault.
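A version that stays within its allocation has to size the buffer for both strings plus the terminator. One possible sketch, with concat_alloc being a hypothetical helper name:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd string holding a followed by b,
   or NULL if allocation fails. The caller must free() the result. */
char *concat_alloc(const char *a, const char *b) {
    assert(a != NULL && b != NULL);
    size_t len_a = strlen(a);
    size_t len_b = strlen(b);
    char *out = malloc(len_a + len_b + 1);  /* +1 for the terminating '\0' */
    if (out != NULL) {
        memcpy(out, a, len_a);
        memcpy(out + len_a, b, len_b + 1);  /* copies b's terminator as well */
    }
    return out;
}
```

For "string1" and "string2" this allocates 15 bytes, which is what the 10-byte buffer in the question was missing.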

Freeing memory, all?

Maybe a bad topic, but given the following code, do i need to free(player->name) too?
#include <stdio.h>
struct Player
{
char *name;
int level;
};
int main()
{
struct Player *player;
player->name = malloc(sizeof(player->name)*256);
player->name = "John";
printf(player->name);
free(player);
printf("\n\n\n");
system("PAUSE");
return 0;
}
Oh boy, where to start? You really need a good book. Sigh. Let's start at the top of main():
This
struct Player *player;
defines a pointer to a struct Player, but it doesn't initialize it. It has thus a more or less random value, pointing somewhere into memory. This
player->name = malloc(sizeof(player->name)*256);
now writes into parts of that random location the address of a piece of memory obtained by malloc(). Writing to memory through an uninitialized pointer invokes Undefined Behavior. After that, all bets are off. No need to look further down your program. You are unlucky that, by accident, you write to a piece of memory that is owned by your process, so it doesn't crash immediately, making you aware of the problem.
There are two ways for you to improve that. Either stick to the pointer and have it point to a piece of memory allocated for a Player object, which you could obtain by calling malloc(sizeof(struct Player)).
Or just use a local, automatic (aka stack-based) object:
struct Player player;
The compiler will generate the code to allocate memory on the stack for it and will release it automatically. This is the easiest, and should certainly be your default.
However, your code has more problems than that.
This
player->name = malloc(sizeof(player->name)*256);
allocates consecutive memory on the heap to store 256 pointers to characters, and assigns the address of the first pointer (the address of a char*, thus a char**) to player->name (a char*). Frankly, I'm surprised that even compiles, but then I'm more used to C++'s stricter type enforcement (in C, the void* returned by malloc() converts implicitly to any object pointer type). Anyway, what you probably want instead is to allocate memory for 256 characters:
player->name = malloc(sizeof(char)*256);
(Since sizeof(char) is, by definition, 1, you will often see this as malloc(256).)
However, there's more to this: Why 256? What if I pass a string 1000 chars long? No, simply allocating space for a longer string is not the way to deal with this, because I could pass it a string longer still. So either 1) fix the maximum string length (just declare the name member of struct Player to be a char array of that length, instead of a char*) and enforce this limit in your code, or 2) find out the length needed dynamically, at run-time, and allocate exactly the memory needed (string length plus 1, for the terminating '\0' char).
But it gets worse. This
player->name = "John";
then assigns the address of a string literal to player->name, overwriting the address of the memory allocated by malloc() in the only variable you store it in, making you lose and leak the memory.
But strings are not first-class citizens in C, so you cannot assign them. They are arrays of characters, by convention terminated with a '\0'. To copy them, you have to copy them character by character, either in a loop or, preferably, by calling strcpy().
To add insult to injury, you later pass free() a pointer that never came from malloc() (player itself was never allocated, and player->name now points at a string literal)
free(player);
thereby very likely seriously scrambling the heap manager's data structures. Again, you seem to be unlucky that this does not cause an immediate crash, but code seemingly working as intended is one of the worst ways for Undefined Behavior to manifest itself. If all bets hadn't been off before, they now thoroughly would be.
I'm sorry if this sounds condemning, it really wasn't meant that way, but you certainly and seriously fucked up this one. To wrap this up:
You need a good C book. Right now. Here is a list of good books assembled by C programmers on Stack Overflow. (I'm a C++ programmer by heart, so I won't comment on their judgment, but K&R is certainly a good choice.)
You should initialize all pointers immediately, either with the address of an existing valid object, or with the address of a piece of memory allocated to hold an object of the right type, or with NULL (which you can easily check for later). In particular, you must not attempt to read from or write to a piece of memory that has not been allocated (dynamically on the heap or automatically on the stack) to you.
You need to free(), exactly once, every piece of memory that was obtained by calling malloc().
You must not attempt to free() any other memory.
I'm sure there is more to that code, but I'll stop here. And did I mention you need a good C book? Because you do.
You have to free() everything that you malloc() and you must malloc() everything that is not allocated at compile time.
So:
You must malloc player, and you must free both player->name and player.
Ok, so your variable player is a pointer, which you have not initialized, and therefore points to a random memory location.
You first need to allocate the memory for player the way you have done for player->name, and then allocate for player->name.
Any memory allocated with malloc() needs to be freed with free().
This is awful code. Why? First, you allocate memory for player->name; malloc returns a pointer to the allocated memory. In the next step you lose this pointer value because you reassign player->name to point to the static "John" string. Maybe you want to use the strdup or sprintf functions?
The other big mistake is using an uninitialized pointer to the player struct. It can point to a random memory location. So it is a good idea to allocate memory for your structure with malloc, or to not use a pointer at all and use a real structure variable instead.
player doesn't need to be freed because it was never malloc'd, it's simply a local stack variable. player->name does need to be freed since it was allocated dynamically.
int main()
{
// Declares local variable which is a pointer to a Player struct
// but doesn't actually point to a Player because it wasn't initialised
struct Player *player;
// Allocates space for name (in an odd way...)
player->name = malloc(sizeof(player->name)*256);
// At this point, player->name is a pointer to a dynamically allocated array of 256*sizeof(char *) bytes
// Replaces the name field of player with a string literal
player->name = "John";
// At this point, the pointer returned by malloc is essentially lost...
printf(player->name);
// ?!?!
free(player);
printf("\n\n\n");
system("PAUSE");
return 0;
}
I guess you wanted to do something like this:
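int main(void) {
struct Player player;
player.name = malloc(256);
if (player.name == NULL) return 1; // allocation failed
// Populate the allocated memory, for example by copying a string into it
strcpy(player.name, "John");
printf("%s", player.name);
free(player.name);
return 0;
}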
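If you would rather keep player as a pointer, both the struct and the name need their own allocation and their own free(). A possible sketch; player_create and player_destroy are names I invented for illustration, and the struct is the one from the question.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct Player {
    char *name;
    int level;
};

/* Hypothetical helper: allocates a Player and a private copy of name.
   Returns NULL if either allocation fails. */
struct Player *player_create(const char *name, int level) {
    assert(name != NULL);
    struct Player *p = malloc(sizeof *p);
    if (p == NULL) {
        return NULL;
    }
    p->name = malloc(strlen(name) + 1);  /* exact size, +1 for '\0' */
    if (p->name == NULL) {
        free(p);
        return NULL;
    }
    strcpy(p->name, name);               /* safe: the buffer was sized for name */
    p->level = level;
    return p;
}

/* Frees the name first, then the struct itself. */
void player_destroy(struct Player *p) {
    if (p != NULL) {
        free(p->name);
        free(p);
    }
}
```

Every malloc() here has exactly one matching free(), and nothing that was not allocated is ever passed to free().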

Segmentation fault while using strcpy()?

I have a global structure:
struct thread_data{
char *incall[10];
int syscall_arg_no;
int client_socket;
};
and in main()
char buffer[256];
char *incall[10];
struct thread_data arg_to_thread;
strcpy(incall[0],buffer); /*works fine*/
strcpy(arg_to_thread.incall[0],buffer); /*causes segmentation fault*/
Why does this happen and Please suggest a way out.
thanks
A segfault means that something is wrong. But no segfault does not mean that something isn't wrong. If two situations are basically the same, and one segfaults and the other does not, it usually means that they are both wrong, but only one of them happens to be triggering the segfault.
Looking at the line char* incall[10], what that means is you have an array of 10 pointers to a char. By default, these pointers will be pointing at random places. Therefore, strcpying into incall[0] will be copying the string to a random location. This is most likely going to segfault! You need to initialise incall[0] first (using malloc).
So a bigger question is why doesn't the first line segfault? I would imagine the reason is that it just so happens that whatever was in memory before was a valid pointer. Therefore, the strcpy doesn't segfault, it just overwrites something else which will later cause completely unexpected behaviour. So you must fix both lines of code.
Another issue (once you have fixed that) is that strcpy itself is highly dangerous: since it copies strings until it finds a 0 byte and then stops, you can never be sure exactly how much it's going to copy (unless you use strlen to allocate the destination memory). So you should use strncpy instead, to limit the number of bytes copied to the size of the buffer, and write the terminating 0 byte yourself, since strncpy leaves it out when the source is too long.
You've not initialized the pointer incall[0], so goodness only knows where the first strcpy() writes to. You are unlucky that your program does not crash immediately.
You've not initialized the pointer arg_to_thread.incall[0], so goodness only knows where the second strcpy() writes to. You are lucky that your program crashes now, rather than later.
In neither case is it the compiler's fault; you must always ensure you initialize your pointers.
Make sure you have enough memory allocated for your string buffers.
Stay away from strcpy. Use strncpy instead. strcpy is a notorious source of buffer overflow vulnerabilities - a security and maintenance nightmare for which there really isn't an excuse.
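Putting the advice from these answers together, each incall slot needs real storage before anything is copied into it. A rough sketch, assuming the struct from the question minus the field that doesn't matter here; set_incall is a hypothetical helper, and it expects the struct to have been zero-initialized (for example with = {0}) so that unused slots are NULL.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct thread_data {
    char *incall[10];
    int client_socket;
};

/* Hypothetical helper: stores a malloc'd copy of src in slot idx.
   Returns 0 on success, -1 on a bad index or allocation failure. */
int set_incall(struct thread_data *td, size_t idx, const char *src) {
    assert(td != NULL && src != NULL);
    if (idx >= sizeof td->incall / sizeof td->incall[0]) {
        return -1;
    }
    size_t size = strlen(src) + 1;   /* +1 for the terminating '\0' */
    char *copy = malloc(size);
    if (copy == NULL) {
        return -1;
    }
    memcpy(copy, src, size);
    free(td->incall[idx]);           /* free(NULL) is a no-op */
    td->incall[idx] = copy;
    return 0;
}
```

Because the copy is sized from strlen(src), there is no fixed buffer to overrun; the caller is responsible for freeing each slot when the thread data is no longer needed.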
