Consider this pointless program:
/* main.c */
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(int argc, char **argv) {
    int i;
    for (i = 0; i < 1024; i++) {
        int pid = fork();
        int status;
        if (pid) {
            wait(&status);
        }
        else {
            char *ptr = (char *)malloc(1024 * sizeof(char)); /* allocated, never freed */
            char *args[] = {"echo", "Hello, world!", NULL};  /* args[0] is conventionally the program name */
            execve("/bin/echo", args, NULL);
        }
    }
}
Would not freeing ptr constitute a memory leak for either main.c or the other program, or is it going to be freed anyway when execve is called?
No.
This is not a memory leak. exec*() copies the string data out of the args array into the new process image, then blows away the child's old memory image and overlays it with the one for /bin/echo. Essentially all that survives the exec() is the pid.
Edit:
User318904 brought up the case of exec() returning -1 (i.e., failure). In that case, the child process that has forked but failed to exec does indeed, technically, have a memory leak, but since the usual response to a failed exec is simply to exit the child process anyway, the memory will be reclaimed by the OS. Still, freeing it is probably a good habit to get into, if for no other reason than that it will keep you from wondering about it later on.
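For illustration, here is a hedged sketch of that habit, based on the question's code: the free() only runs on the failure path, because a successful execve never returns.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void) {
    char *ptr = malloc(1024);                       /* heap data owned by the child */
    char *args[] = {"echo", "Hello, world!", NULL};
    execve("/bin/echo", args, NULL);
    /* Only reached if execve failed; on success the old image is gone. */
    perror("execve");
    free(ptr);   /* optional tidiness: _exit would let the OS reclaim it anyway */
    _exit(127);  /* conventional exit status for a failed exec */
}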
When execve returns -1, yes. Otherwise, maybe.
The allocated memory is discarded along with the rest of the old image when exec succeeds. After the call completes you can't access it anyway.
Related
I am reading the Operating Systems: Three Easy Pieces book, Chapter 5.
It says:
The fork() system call is strange; its partner in crime, exec(), is
not so normal either. What it does: given the name of an executable
(e.g., wc), and some arguments (e.g., p3.c), it loads code (and static
data) from that executable and overwrites its current code segment
(and current static data) with it; the heap and stack and other parts
of the memory space of the program are re-initialized.
Then I have a question about this sample code from the chapter:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <fcntl.h>
#include <assert.h>
#include <sys/wait.h>

int main(int argc, char *argv[])
{
    int rc = fork();
    if (rc < 0) {
        // fork failed; exit
        fprintf(stderr, "fork failed\n");
        exit(1);
    } else if (rc == 0) {
        // child: redirect standard output to a file
        close(STDOUT_FILENO);
        open("./p4.output", O_CREAT|O_WRONLY|O_TRUNC, S_IRWXU);
        // now exec "wc"...
        char *myargs[3];
        myargs[0] = strdup("wc");   // program: "wc" (word count)
        myargs[1] = strdup("p4.c"); // argument: file to count
        myargs[2] = NULL;           // marks end of array
        execvp(myargs[0], myargs);  // runs word count
    } else {
        // parent goes down this path (original process)
        int wc = wait(NULL);
        assert(wc >= 0);
    }
    return 0;
}
According to man strdup, myargs[0] and myargs[1] are created with malloc on the heap. So when execvp reinitializes the heap and stack of the child's memory space, won't they be cleared or destroyed, so that using myargs[0] and myargs[1] would be undefined behaviour?
The exec machinery copies the arguments out of the myargs array before the old process memory is zapped, precisely so that there is no problem with memory accesses.
The POSIX specification for execvp() et al says:
The arguments specified by a program with one of the exec functions shall be passed on to the new process image in the corresponding main() arguments.
That page specifies a lot of other key behaviours of the exec() family of functions. You'll probably find that the Linux equivalent page specifies even more things that are affected (or not affected) by the exec() family of functions.
Note that if a function from the exec() family succeeds, it does not return. If it returns, it failed. There's no need to check the return value (because it will always be -1). But there is usually a need to report that the execution failed and most often, a process exits with a non-zero status after a failed exec().
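As a small demonstration (not from the book), the child below passes heap-allocated strings to execvp(); the new image receives copies of them in its own argv, so nothing the child allocated needs to survive the exec:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    if (fork() == 0) {
        /* Heap-allocated argument strings, as in the book's example. */
        char *myargs[3];
        myargs[0] = strdup("echo");
        myargs[1] = strdup("copied into the new image");
        myargs[2] = NULL;
        execvp(myargs[0], myargs); /* on success this never returns */
        perror("execvp");          /* reached only if the exec failed */
        _exit(1);
    }
    wait(NULL);
    return 0;
}

The string echo prints came from heap memory that the exec destroyed, which is only possible because the argument strings are copied into the new image first.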
There are similar questions all over, and the one that comes closest is from this Stack Exchange site. But even though I learned a lot reading them, none of them exactly answers my question.
Here is the question: say I have this program (a very simplified version):
inline void execute(char *cmd)
{
    int exitstat = 0;
    // lots of other things happen
    exitstat = execve(cmd, cmdarg, environ);
    free(cmd), free(other_passed_addresses); // DO I NEED THIS, since I am already about to exit?
    _exit(exitstat);
}
int main(void)
{
    int childstat;
    char *str = malloc(6); // I also have many more allocated blocks on the heap in the parent process
    strcpy(str, "Hello");
    pid_t childid = fork(); // create a child process, which also gets a copy of all the heap memory (even though it is CoW)
    if (childid < 0)
        exit(-1);
    if (childid == 0)
    {
        execute(str);
    }
    else
    {
        wait(&childstat);
        free(str);
    }
    // do something else with str, other functions, and the rest of the program
}
In this program, the parent process does its thing for a while, allocates a lot of memory on the heap, frees some, keeps the rest for later, and at some point wants to execute some command; but it doesn't want to terminate, so it creates a child to do its bidding.
The child then calls another function, which does some work and in the end uses execve to execute the command passed to it. If this succeeds there is no problem, since the executed program takes over the whole memory image; but if it fails, the child exits with a status code. The parent waits for the child to answer, and when it does, moves on to its next routine. The problem here is that when the child fails to exec, all the heap data remains allocated. But does that matter, since the child is about to exit on the next line?
Even though I am just getting started with this, I have learned that a new process is created during fork, so the leak shouldn't affect the parent, since the child is about to die.
If my assumption is correct and these leaks don't actually matter, why does valgrind lament about them?
Would there be a better way to free all the memory in the child's heap without actually passing every allocation (str and the others in this example) to the execute function and laboriously calling free each time? Does kill() have a mechanism for this?
Edit
Here is working code
#include <unistd.h>
#include <sys/wait.h>
#include <stdlib.h>
#include <string.h>

inline void execute(char *cmd)
{
    extern char **environ;
    int exitstat = 0;
    char *cmdarg[2];

    cmdarg[0] = cmd;
    cmdarg[1] = NULL;
    exitstat = execve(cmd, cmdarg, environ);
    free(cmd);
    _exit(exitstat);
}

int main(void)
{
    int childstat;
    char *str = malloc(6);
    pid_t childid;

    strcpy(str, "Hello");
    childid = fork();
    if (childid < 0)
        exit(-1);
    if (childid == 0)
    {
        execute(str);
    }
    else
    {
        wait(&childstat);
        free(str);
    }
    return (0);
}
When the free is present inside the child process, valgrind reports nothing leaked. When it is run with that free commented out (the free(cmd) in the execute function), valgrind reports the unfreed block.
As you might see, there is no overall error, since this program is short-lived. But in my real program, which has an infinite loop, every problem in the child matters. So I am asking: should I worry about this memory leaked in the child process just before it exits, and hunt it down, or just leave it as it is, in another virtual address space that is cleaned up when its process exits? (But in that case, why is valgrind still complaining about it, if the corresponding process cleans it up during exit?)
exitstat = execve(cmd, cmdarg, environ);
free(cmd), free(other_passed_addresses);
If successful, execve does not return: no code after it executes, the current program is terminated, and the cmd program is started in its place. All memory held by the current process is freed by the operating system anyway. So the line free(cmd), free(other_passed_addresses); will not execute if the execve call is successful.
The only case where those lines could be relevant is when execve fails. And in the case where execve returns with an error, yes, you should free that memory. I doubt that most programs with execve actually do that; I believe they would just call abort() after a failed execve call.
inline void execute(char *cmd)
Please read about inline. It is tricky; I recommend not using it (forget it exists). I doubt the intention here was to write an inline function; there is no other definition for the compiler to choose from anyway. After fixing the typos in the code, I could not compile it because of an undefined reference to execute; the inline had to be removed.
The problem here is that when the child fails to execute all the heap data remains allocated, but does that matter?
Ultimately it depends on what you want to do after the child fails. If you just want to call exit and you do not care about memory leaks in that case, then don't care, and it will not matter.
I see that glibc's exec*.c functions try to avoid dynamic allocation as much as possible, using variable-length arrays or alloca().
If my assumption is correct and this leaks doesn't actually matter, why does valgrind lament about them?
It doesn't. If you correct all the mistakes in your code, then valgrind will not "lament" about memory leaks before calling execve (well, unless the execve call fails).
Would there be a better way to free all the memory in the child's heap without actually passing every allocation (str and the others in this example) to the execute function and laboriously calling free each time?
"Better" will tend to be opinion based. Subjectively: no. But writing or using your own wrapper around dynamic allocation, with a garbage collector that frees everything at once, is also an option (a sketch follows below).
does kill() have a mechanism for this?
No, sending a signal seems unrelated to freeing memory.
There is no simple solution.
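Sketching the garbage-collector-style wrapper mentioned above (the names xmalloc and free_all are invented for illustration; error checks omitted):

#include <stdlib.h>

/* Hypothetical tracking allocator: every block is recorded in a list
 * so one call can release everything, e.g. in a forked child just
 * before _exit. */
static struct node { void *p; struct node *next; } *allocs;

void *xmalloc(size_t n) {
    struct node *m = malloc(sizeof *m); /* bookkeeping node */
    m->p = malloc(n);
    m->next = allocs;
    allocs = m;
    return m->p;
}

void free_all(void) {
    while (allocs) {
        struct node *next = allocs->next;
        free(allocs->p);   /* the user's block */
        free(allocs);      /* the bookkeeping node */
        allocs = next;
    }
}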
On termination, the OS will free all memory of a process, so you don't have to free everything. However, memory that is no longer needed is usually best freed as soon as possible. That leaves things like buffers that could potentially be used right up to the last moment before termination.
Reasons to free memory.
There's only one that I can think of: if you have a large number of blocks still in use at exit, it can be difficult to sort the wheat from the chaff. Either you end up with enormous logs that no one looks at, or you have to maintain a suppression file. Either way, you risk overlooking genuine issues.
Reasons not to free memory
A small performance gain.
It can sometimes be quite difficult to free everything. For instance, if you use putenv(), then knowing whether the string was added (and needs freeing) or replaced (and must not be freed) is a bit tricky.
If you need to use atexit() to free things, note that there is a limit on the number of functions you can register with atexit(). That might be a problem if many things are being freed in many modules.
My advice is to try to free most things, but don't sweat it, or you'll get bitten by the law of diminishing returns. Use a suppression file judiciously for the few percent that are tricky to free.
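For reference, a Memcheck suppression entry looks roughly like this; the name and the fun: frames below are placeholders (valgrind's --gen-suppressions=all option prints ready-made entries for the leaks it actually finds):

{
   putenv_string_we_cannot_free
   Memcheck:Leak
   match-leak-kinds: definite
   fun:malloc
   fun:main
}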
I am trying to use clone() to create a child process to exec() some programs. I know that exec() replaces the calling process image, so the process that calls it effectively ends with it, which is why I call exec() from a child process. However, for some reason, after the exec() my parent process crashes as well. Could someone tell me why this is happening? (If I replace clone with fork or vfork, it works.)
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>
#include <sched.h>

void delNL(char *arry){ // a function to delete the newline at the end of the array
    char *position;
    position = strchr(arry, '\n');
    *position = '\0';
}

int mySysC(char *command){ // clone function
    char *cmd[10] = {" "};
    int nb = 0;
    int cnb = 0;
    while(command[nb] != '\0'){
        char coa[10];
        int cici = 0;
        while(command[nb] != ' ' && command[nb] != '\0'){
            coa[cici] = command[nb];
            nb++;
            cici++;
        }
        coa[cici] = '\0';
        char *nad = (char *)malloc(10);
        strcpy(nad, coa);
        cmd[cnb] = nad;
        cnb++;
        if(command[nb] == ' '){
            nb++;
        }
    }
    cmd[cnb] = NULL;
    execvp(cmd[0], cmd);
    exit(0);
}

void my_system_c(char *command){ // clone version
    void *stack = (void *)malloc(10000);
    void *stackTop = stack + 100000;
    pid_t pid = clone((void *)mySysC(command), stackTop, CLONE_THREAD, NULL); // clone
    waitpid(pid, NULL, 0);
}

int main(){
    char commdd[100];
    char ex[10] = "os_exit";
    while(1){
        printf("Please enter your command or enter \"os_exit\" to exit:\n");
        fgets(commdd, 100, stdin);
        delNL(commdd);
        if(strlen(commdd) > 0 && strcmp(commdd, ex) != 0){
            my_system_c(commdd); // select version
        }
        else if(strcmp(commdd, ex) == 0) break;
        else printf("Empty command\n");
    }
    return 0;
}
execvp(cmd[0], cmd); is what crashes the whole program. I added two prints, one before it and one after; the latter never runs. I don't understand, because I thought clone works just like fork and creates a new process, so the end of the child process shouldn't affect the parent?
Thanks!!!
clone(…CLONE_THREAD…) is not the clone() you're looking for. It creates a new process in the same thread group as the parent process, and:
If any of the threads in a thread group performs an execve(2), then all threads other than the thread group leader are terminated, and the new program is executed in the thread group leader.
If you are looking for a way to start a process without using fork(), consider using posix_spawn() instead.
Additionally, the stack pointer you are passing to clone() is invalid. The stack you allocate is 10,000 bytes large, but the stack pointer is 100,000 bytes beyond the start of the stack -- and 90,000 bytes beyond its end.
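If you go the posix_spawn() route, a minimal sketch looks like this (the echo command is just an illustration; error handling kept short):

#include <spawn.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>

extern char **environ;

int main(void) {
    char *args[] = {"echo", "hello from posix_spawn", NULL};
    pid_t pid;
    /* posix_spawnp searches PATH for args[0], like execvp does. */
    int err = posix_spawnp(&pid, args[0], NULL, NULL, args, environ);
    if (err != 0) {
        fprintf(stderr, "posix_spawnp: %s\n", strerror(err));
        return 1;
    }
    waitpid(pid, NULL, 0); /* reap the child, as with fork+exec */
    return 0;
}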
This happens because you never actually call clone. This part:
clone((void *)mySysC(command), ...);
is equivalent to:
int result = mySysC(command);
void* first = (void*) result;
clone(first, ...);
so it calls your function before it ever calls clone. You need to pass it as a function pointer instead.
In addition to that, you should remove one zero from your stackTop so that it matches the malloc, and avoid passing CLONE_THREAD since you want a new process (passing SIGCHLD as the termination signal lets the parent waitpid() for it):
void my_system_c(char *command){ // clone version
    void *stack = malloc(10000);
    void *stackTop = stack + 10000; // stacks grow downward on most architectures
    pid_t pid = clone((int (*)(void *))mySysC, stackTop, SIGCHLD, command); // pass the function itself, not its result
    waitpid(pid, NULL, 0);
}
I am having an issue with passing a pid_t by reference as a void pointer, and typecasting it back to a pid_t. My code is as follows:
#include <stdio.h>
#include <signal.h>
#include <unistd.h>
#include <sys/types.h>

typedef void *ProcessHandle_t;

void createProcess( ProcessHandle_t *processHandle )
{
    pid_t newTask = fork();
    if( newTask != 0 )
    {
        /* Parent process */
        /* Return a process handle for the main task to use */
        *processHandle = &newTask;
        printf("pid_t output 1: %d\n", *((pid_t *)*processHandle));
    } else
    {
        while(1){
            printf("Child running\n");
        }
    }
}

void deleteProcess( ProcessHandle_t processHandle )
{
    pid_t deleteTask = *((pid_t *)processHandle);
    printf("pid_t output 3: %d\n", deleteTask);
    kill(deleteTask, SIGKILL);
}

int main( )
{
    ProcessHandle_t processHandle;
    createProcess( &processHandle );
    printf("pid_t output 2: %d\n", *((pid_t *)processHandle));
    deleteProcess( processHandle );
    printf("Parent exiting\n");
}
And my output is:
pid_t output 1: 19876
pid_t output 2: 19876
pid_t output 3: 493972479
Parent exiting
But I have no idea why. If I do the same kind of dereferencing with ints it works, but I get a really strange value when I do it with pid_t.
Is there a specific reason why this does not work with pid_t but works with other variable types?
Remember that local variables go out of scope once the function they were defined in returns.
In the createProcess function you have
*processHandle = &newTask;
Here you make *processHandle point to the local variable newTask.
When createProcess has returned, the memory previously occupied by the newTask variable no longer "exists" (actually it will be reused by the next function call), leaving you with a stray pointer.
Dereferencing this pointer will lead to undefined behavior!
If you want to copy the contents of newTask using pointers, then you need to allocate memory for the copied value, and actually copy the value into the newly allocated memory. And if you allocate memory then you of course have to free it as well.
A simpler solution is to not use pointers at all. Avoiding pointers is usually a very good way to avoid undefined behaviors and crashes in general.
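A sketch of both fixes, reusing the question's hypothetical ProcessHandle_t API: heap-allocate the handle so it outlives createProcess (and free it in deleteProcess), or skip the indirection entirely and return the pid_t by value.

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

typedef void *ProcessHandle_t;

/* Fix with pointers: give the pid storage that outlives createProcess. */
void createProcess(ProcessHandle_t *processHandle)
{
    pid_t *newTask = malloc(sizeof *newTask); /* heap storage survives the return */
    *newTask = fork();
    if (*newTask != 0) {
        *processHandle = newTask;             /* parent: hand back the heap pointer */
    } else {
        while (1) {
            printf("Child running\n");
        }
    }
}

void deleteProcess(ProcessHandle_t processHandle)
{
    kill(*(pid_t *)processHandle, SIGKILL);
    free(processHandle);                      /* release the handle's storage */
}

/* Fix without pointers (simpler): return the pid by value, e.g.
 *     pid_t createProcess(void) { return fork(); }
 */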
I am trying to understand the fork() concept and there's is one thing I can't seem to understand.
In the following code, why does the parent still print i = 0 even when the child process changes it to 5?
The wait(NULL) blocks parent process until child finishes first.
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main(int argc, char *argv[]) {
    int i = 0;
    if (fork() == 0) {
        i = 5;
    } else {
        wait(NULL);
        printf("i = %d\n", i);
    }
    return 0;
}
Can somebody explain why my assumption is incorrect?
Variables are not shared between processes. After the call to fork, there are two completely separate processes. fork returns 0 in the child, where the local variable is set to 5. In the parent, where fork returns the process ID of the child, the value of i is not changed; it still has the value 0 set before fork was called. It's the same behavior as if you had two programs run separately:
int main(int argc, char *argv[]) {
    int i = 0;
    printf("i = %d\n", i);
    return 0;
}
and
int main(int argc, char *argv[]) {
    int i = 0;
    i = 5;
    return 0;
}
Processes are not threads! When you fork, you create a fully cloned process with an independent memory allocation that simply contains the same values (except for the result of the fork call) at the time of the fork.
If you want a child to update some data in the parent process, you will need to use a thread. A thread shares all statically and dynamically allocated memory with the other threads of its process and simply has independent automatic variables. But even then, you should use static allocation for the i variable:
int i = 0;

int main(int argc, char *argv[]) {
    ...
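To make that concrete, here is a minimal POSIX-threads sketch (not from the question): the assignment made in the thread is visible to main because both share one address space.

#include <pthread.h>
#include <stdio.h>

int i = 0; /* static allocation: shared by all threads in the process */

void *child(void *arg)
{
    (void)arg;
    i = 5;    /* this write is visible to main after the join */
    return NULL;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, child, NULL); /* build with -pthread */
    pthread_join(t, NULL);                 /* like wait(), but for a thread */
    printf("i = %d\n", i);                 /* prints i = 5 */
    return 0;
}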
When you fork, the child process gets a copy of the parent's address space; they don't share it, so when the child changes i the parent won't see the change.
This copying of the address space is typically done with copy-on-write, to avoid allocating new memory for pages that never change.