child process changing memory image


I was reading about processes and I came across this:
Usually, the child process then executes execve or a similar system call to change its memory image
what I can derive from this is this pseudocode:
if (child_created_successfully)
{
    do_ABC_and_ignore_the_part_of_the_parent's_control_flow // is this what is meant by "change its memory image"?
}
(Question asked in the pseudocode's comment)
I completely don't understand this other part:
For example, when a user types a command, say, sort, to the shell, the
shell forks off a child process and the child executes sort. The reason for this two-step
process is to allow the child to manipulate its file descriptors after the fork but
before the execve in order to accomplish redirection of standard input, standard
output, and standard error.

Regarding the first part
Usually, the child process then executes execve or a similar system
call to change its memory image
This means that execve replaces the child's code, data, heap and stack with those of the new program, so the child stops being a copy of its parent. Until then the copy is mostly virtual: a process forked at time T is, when it starts to run at time T + 1, practically identical to its parent in memory, so the kernel uses an optimization called copy-on-write and only duplicates a page when one of the two processes writes to it.
Regarding the second part
For example, when a user types a command, say, sort, to the shell, the
shell forks off a child process and the child executes sort. The
reason for this two-step process is to allow the child to manipulate
its file descriptors after the fork but before the execve in order to
accomplish redirection of standard input, standard output, and
standard error.
Simply put, this means that when you run a command from the shell (like ls, ps, grep, nstat...), the shell forks itself and the command is executed by the new child process. An easy way to understand this is to run ps | grep ps: the shell first forks to create a new process, then this part comes into play
this two-step process is to allow the child to manipulate its file
descriptors after the fork but before the execve
and the child's standard output is connected to the pipe before it executes ps. The shell forks a second child in the same way for grep ps, with its standard input connected to the other end of the pipe. Since both children run at the same time, you will usually see both ps and grep ps listed in the output.
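The two-step sequence described above can be sketched in C roughly as follows. This is only a minimal illustration of the idea, not how any particular shell is written: the function name run_echo_redirected and the choice of echo as the command are my own assumptions.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs `echo hello` with its standard output
 * redirected to out_path, roughly what a shell does for
 * `echo hello > out_path`.
 * Returns the child's exit status, or -1 on error. */
int run_echo_redirected(const char *out_path)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                       /* fork failed, no child exists */

    if (pid == 0) {
        /* Child: still running the parent's code, so it can rearrange
         * its own file descriptors before the new program starts. */
        int fd = open(out_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            _exit(126);
        if (dup2(fd, STDOUT_FILENO) < 0) /* fd 1 now refers to the file */
            _exit(126);
        close(fd);

        /* Replace the memory image; echo simply writes to fd 1. */
        execlp("echo", "echo", "hello", (char *)NULL);
        _exit(127);                      /* reached only if exec failed */
    }

    /* Parent: its own standard output is untouched. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The new program never needs to know about the redirection: it inherits file descriptor 1 already pointing at the file, which is the reason the redirection has to happen after the fork but before the exec.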

Related

execlp sleep doesn't work [duplicate]

The man page says that "The exec() family of functions replaces the current process image with a new process image." but I do not quite understand the meaning of "replaces the current process image with a new process image". For example, if exec succeeds, perror would not be reached
execl("/bin/ls", /* Remaining items sent to ls*/ "/bin/ls", ".", (char *) NULL);
perror("exec failed");

Correct. If the exec works, the perror will not be called, simply because the call to perror no longer exists.
I find it's sometimes easier when educating newcomers to these concepts, to think of the UNIX execution model as being comprised of processes, programs and program instances.
Programs are executable files such as /bin/ls or /sbin/fdisk (note that this doesn't include things like bash or Python scripts since, in that case, the actual executable would be the bash or python interpreter, not the script).
Program instances are programs that have been loaded into memory and are basically running. While there is only one program like /bin/ls, there may be multiple instances of it running at any given time if, for example, both you and I run it concurrently.
That "loaded into memory" phrase is where processes come into the picture. Processes are just "containers" in which instances of programs can run.
So, when you fork a process, you end up with two distinct processes but they're still each running distinct instances of the same program. The fork call is often referred to as one which one process calls but two processes return from.
Likewise, exec will not have an effect on the process itself but it will discard the current program instance in that process and start a new instance of the requested program.
This discard in a successful exec call is what dictates that the code following it (perror in this case) will not be called.
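To make the "code after exec only runs on failure" point concrete, here is a small sketch. The helper name exec_in_child and the exit code 127 for a failed exec are my own choices (127 mirrors a common shell convention), not something the man page prescribes.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks, then execs `path` in the child with no
 * extra arguments. Returns the program's exit status; 127 means the
 * exec itself failed. Returns -1 if fork or waitpid failed. */
int exec_in_child(const char *path)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        execl(path, path, (char *)NULL);
        /* If execl succeeded, these two lines no longer exist in this
         * process, so they can only run when it failed. */
        perror("exec failed");
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with a path that does not exist takes the failure branch: perror prints a message and the function returns 127. With a valid program the perror line is never reached.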

It means your current process becomes the new program instead of what it was. You stop doing what you're doing and start doing, really being, something else instead, never again to become what that process once was.
Instead of starting a whole new process, however, your current pid and environment carry over to the new program. That lets you set things up the way the new program will need them before doing the exec.

You are correct. perror will not be called unless the execl fails. The exec functions are the means of starting new programs in a POSIX-compliant OS (typically combined with a fork call). Maybe an example will help. Suppose your program, call it programX, is running. It then calls one of the exec functions like the one you have above. programX will no longer exist as a running program. Instead, ls will be running. It will have exactly the same PID as programX, but otherwise it is pretty much a whole new process.
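The "same PID" claim can be checked with a short experiment. In this sketch (the function name pid_survives_exec is made up, and it assumes /bin/sh exists) the child execs a shell that prints its own PID, and the parent compares that with the value fork() returned.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the program started by exec reports the same PID that
 * fork() returned for the child, 0 if not, -1 on error. */
int pid_survives_exec(void)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);   /* the shell's output goes to the pipe */
        close(fds[1]);
        /* $$ expands to the PID of the shell that exec starts here. */
        execl("/bin/sh", "sh", "-c", "echo $$", (char *)NULL);
        _exit(127);
    }

    close(fds[1]);
    char buf[32] = {0};
    ssize_t n = read(fds[0], buf, sizeof buf - 1);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    if (n <= 0)
        return -1;
    return atol(buf) == (long)pid;
}
```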

Know if process has been called by exec()

Is there any way to know if a process has started to run from a call of exec() or has started from the terminal by the user?

Helpful to you: child and parent process id;
getppid() returns the process ID of the parent of the calling
process. This will be either the ID of the process that created this
process using fork(), or, (!!!CARE!!!) if that process has already terminated, the
ID of the process to which this process has been reparented;
I would also consider adding an additional program argument.
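A minimal sketch of the getppid() behaviour quoted above, assuming the parent is still alive when the child asks (the helper name child_sees_parent_pid is hypothetical):

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the child's getppid() matches the parent's getpid(),
 * 0 if it does not, -1 on error. The parent waits for the child, so it
 * cannot have terminated (and the child cannot have been reparented)
 * before the check runs. */
int child_sees_parent_pid(void)
{
    pid_t parent = getpid();

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(getppid() == parent ? 0 : 1);

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Note that this tells you who your parent is, not whether you were started by a user or by another program: in both cases some process called fork and exec.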

All programs are started by a call to exec family of functions.
When you type a command in the terminal, for example, the shell searches for the binary executable, forks, and calls exec in the child process. This replaces the binary image of the calling process (the forked copy of the shell) with the binary image of the new program. The program then executes while the shell process waits.
There is this absolutely awesome answer by paxdiablo on the question Please explain exec() function and its family that will surely help you understand how exec works.

In Unix, all processes are created by using the fork system call, optionally followed by the exec system call, even those started by a user (they are fork/exec'd by the user's shell).
Depending on what you really want to do, the library function isatty() will tell you if stdin, stdout or stderr are file descriptors of a tty device. i.e. input comes from a terminal, output goes to a terminal or errors go to a terminal. However, a command like
myprog < somefile 1>someotherfile 2>errorfile
will fool code using isatty. But maybe that is what you want. If you want to take different actions based on whether there is a user typing input from a keyboard or input is coming from a file, isatty is what you need.
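For reference, a check based on isatty() might look like the sketch below. The enum and the function name classify_fd are illustrative names of mine; only isatty() itself is the real library call.

```c
#include <unistd.h>

enum input_kind { INPUT_TERMINAL, INPUT_REDIRECTED };

/* Classifies a file descriptor as a terminal or as anything else
 * (a regular file, a pipe, or an invalid descriptor). */
enum input_kind classify_fd(int fd)
{
    /* isatty() returns 1 for a terminal and 0 otherwise. */
    return isatty(fd) ? INPUT_TERMINAL : INPUT_REDIRECTED;
}
```

A program would typically call classify_fd(STDIN_FILENO) at startup and, for example, only print a prompt when the result is INPUT_TERMINAL.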

Linux C code to start another process asynchronously

I am looking for C code to use on a Linux based system to start another process asynchronously. The second process should continue, even if the first ends. I've looked through the "fork" and "system" and "exec" options, but don't see anything that will spawn a peer process that's not communicating with or a child of the original process.
Can this be done?

Certainly you can. In the parent fork() a child, and in that child first call daemon() (which is an easy way to avoid setsid etc.), then call something from the exec family.

In Linux (and Unix), every process is created by an existing process. You may be able to create a process using fork and then, kill the parent process. This way, the child will be an orphan but still, it gets adopted by init. If you want to create a process that is not inherited by another, I am afraid that may not be possible.

You do a fork (man 2 fork) followed by an execl (man 3 execl).
fork creates a new process with the same image as the calling process (so a perfect twin), whereas execl replaces the image of one of the twins with a new one.
If you search Google for "fork execl" you will find many textbook examples, including how to use fork() and exec() correctly.
With the most common fork-execl pattern, the new process is still associated with the terminal. To create a true background process you need to create what is called a daemon process; a template for that can be found in the answer to Creating a daemon in Linux.
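One common way to get a process that outlives the original and is not its child is the double fork with setsid(). The sketch below is a simplified version of that idea under my own assumptions (the name spawn_detached, running the command through /bin/sh -c); a full daemon would usually also change directory and redirect the standard descriptors to /dev/null.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: starts `command` (through /bin/sh -c) detached
 * from the caller. Returns 0 once the short-lived helper child has
 * handed off the work, -1 on error. The caller never waits for the
 * command itself. */
int spawn_detached(const char *command)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* First child: leave the caller's session and terminal. */
        if (setsid() < 0)
            _exit(1);

        pid_t grandchild = fork();
        if (grandchild < 0)
            _exit(1);
        if (grandchild > 0)
            _exit(0);   /* first child exits; the grandchild is reparented */

        /* Grandchild: its parent is gone, so it is adopted by init
         * (or a subreaper) and is no longer tied to the caller. */
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127);
    }

    /* Reap the first child so it does not linger as a zombie. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

After spawn_detached("sleep 60") returns, the caller can exit and the sleep keeps running.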

Passing the shell to a child before aborting

Current scenario, I launch a process that forks, and after a while it aborts().
The thing is that both the fork and the original process print to the shell, but after the original one dies, the shell "returns" to the prompt.
I'd like to avoid the shell returning to the prompt and keep as if the process didn't die, having the child handle the situation there.
I'm trying to figure out how to do it but nothing yet, my first guess goes somewhere around tty handling, but not sure how that works.
I forgot to mention, the shell takeover for the child could be done on fork-time, if that makes it easier, via fd replication or some redirection.

I think you'll probably have to go with a third process that handles user interaction, communicating with the "parent" and "child" through pipes.
You can even make it a fairly lightweight wrapper, just passing data back and forth to the parent and terminal until the parent dies, and then switching to passing to/from the child.
To add a little further, as well, I think the fundamental problem you're going to run into is that the execution of a command by the shell just doesn't work that way. The shell is doing the equivalent of calling system() -- it's going to wait for the process it just spawned to die, and once it does, it's going to present the user with a prompt again. It's not really a tty issue, it's how the shell works.

bash (and I believe other shells) has the wait command:
wait: wait [n]
Wait for the specified process and report its termination status. If
N is not given, all currently active child processes are waited for,
and the return code is zero. N may be a process ID or a job
specification; if a job spec is given, all processes in the job's
pipeline are waited for.

Have you considered inverting the parent child relationship?
If the order in which the new processes will die is predictable, run the code that will abort in the "child" and the code that will continue in the parent.

What is the purpose of fork()?

In many programs and man pages of Linux, I have seen code using fork(). Why do we need to use fork() and what is its purpose?

fork() is how you create new processes in Unix. When you call fork, you're creating a copy of your own process that has its own address space. This allows multiple tasks to run independently of one another as though they each had the full memory of the machine to themselves.
Here are some example usages of fork:
Your shell uses fork to run the programs you invoke from the command line.
Web servers like apache use fork to create multiple server processes, each of which handles requests in its own address space. If one dies or leaks memory, others are unaffected, so it functions as a mechanism for fault tolerance.
Google Chrome uses fork to handle each page within a separate process. This will prevent client-side code on one page from bringing your whole browser down.
fork is used to spawn processes in some parallel programs (like those written using MPI). Note this is different from using threads, which don't have their own address space and exist within a process.
Scripting languages use fork indirectly to start child processes. For example, every time you use a command like subprocess.Popen in Python, you fork a child process and read its output. This enables programs to work together.
Typical usage of fork in a shell might look something like this:
pid_t child_process_id = fork();
if (child_process_id < 0) {
    // fork failed; no child was created
    perror("fork");
} else if (child_process_id > 0) {
    // fork returns the child's pid in the parent process. Parent executes this.
    // wait for the child process to complete
    int status;
    waitpid(child_process_id, &status, 0);
    // child process finished!
} else {
    // fork returns 0 in the child process. Child executes this.
    // new argv array for the child process; argv[0] is the program name
    char *argv[] = {"program", "arg1", "arg2", "arg3", NULL};
    // now start executing some other program
    execv("/path/to/a/program", argv);
    // execv only returns if it failed
    perror("execv");
    _exit(127);
}
The shell spawns a child process using fork and exec and waits for it to complete, then continues with its own execution. Note that you don't have to use fork this way. You can always spawn off lots of child processes, as a parallel program might do, and each might run a program concurrently. Basically, any time you're creating new processes in a Unix system, you're using fork(). For the Windows equivalent, take a look at CreateProcess.
If you want more examples and a longer explanation, Wikipedia has a decent summary. And here are some slides on how processes, threads, and concurrency work in modern operating systems.

fork() is how Unix creates new processes. At the point you call fork(), your process is cloned, and two different processes continue the execution from there. One of them, the child, will have fork() return 0. The other, the parent, will have fork() return the PID (process ID) of the child.
For example, if you type the following in a shell, the shell program will call fork(), and then execute the command you passed (telnetd, in this case) in the child, while the parent will display the prompt again, as well as a message indicating the PID of the background process.
$ telnetd &
As for the reason you create new processes, that's how your operating system can do many things at the same time. It's why you can run a program and, while it is running, switch to another window and do something else.
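What the shell does for a command ending in & can be approximated like this. It is a rough sketch with made-up helper names (start_in_background, finish_background), and it runs the command through /bin/sh -c rather than parsing it the way a real shell would.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Starts `command` without waiting for it, the way a shell handles
 * `command &`. Returns the child's PID, or -1 on error. */
pid_t start_in_background(const char *command)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127);                 /* reached only if exec failed */
    }
    return pid;                     /* parent returns immediately */
}

/* Blocks until the background child ends and returns its exit status,
 * or -1 if it did not exit normally. */
int finish_background(pid_t pid)
{
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Between the two calls the parent is free to do other work, such as printing the prompt again.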

fork() is used to create a child process. When fork() is called, a new process is spawned and the call returns a different value in the child and in the parent.
If the return value is 0, you know you're the child process, and if the return value is a positive number (which is the child's process ID), you know you're the parent. (If it's a negative number, the fork failed and no child process was created.)
http://www.yolinux.com/TUTORIALS/ForkExecProcesses.html

fork() is basically used to create a child process of the process that calls it. In the child, fork() returns zero; in the parent, it returns the child's process ID.
pid_t pid = fork();
if (pid == 0) {
    // this is the child process
} else if (pid > 0) {
    // this is the parent process
} else {
    // fork failed; no child was created
}
In this way you can provide different actions for the parent and the child and have them run concurrently as separate processes (this is multiprocessing, not multithreading).

fork() will create a new child process identical to the parent. So everything you run in the code after that will be run by both processes — very useful if you have for instance a server, and you want to handle multiple requests.

System call fork() is used to create processes. It takes no arguments and returns a process ID. The purpose of fork() is to create a new process, which becomes the child process of the caller. After a new child process is created, both processes will execute the next instruction following the fork() system call. Therefore, we have to distinguish the parent from the child. This can be done by testing the returned value of fork():
If fork() returns a negative value, the creation of a child process was unsuccessful.
fork() returns a zero to the newly created child process.
fork() returns a positive value, the process ID of the child process, to the parent. The returned process ID is of type pid_t defined in sys/types.h. Normally, the process ID is an integer. Moreover, a process can use function getpid() to retrieve the process ID assigned to this process.
Therefore, after the system call to fork(), a simple test can tell which process is the child. Please note that Unix will make an exact copy of the parent's address space and give it to the child. Therefore, the parent and child processes have separate address spaces.
Let us understand it with an example to make the above points clear. This example does not distinguish parent and the child processes.
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>
#define MAX_COUNT 200
#define BUF_SIZE 100
int main(void)
{
    pid_t pid;
    int i;
    char buf[BUF_SIZE];
    fork(); /* from here on, parent and child both run this code */
    pid = getpid();
    for (i = 1; i <= MAX_COUNT; i++) {
        snprintf(buf, sizeof buf, "This line is from pid %d, value = %d\n", (int)pid, i);
        write(1, buf, strlen(buf));
    }
    return 0;
}
Suppose the above program executes up to the point of the call to fork().
If the call to fork() is executed successfully, Unix will make two identical copies of address spaces, one for the parent and the other for the child.
Both processes will start their execution at the next statement following the fork() call. In this case, both processes will start their execution at the assignment
pid = .....;
Both processes start their execution right after the system call fork(). Since both processes have identical but separate address spaces, those variables initialized before the fork() call have the same values in both address spaces. Since every process has its own address space, any modifications will be independent of the others. In other words, if the parent changes the value of its variable, the modification will only affect the variable in the parent process's address space. Other address spaces created by fork() calls will not be affected even though they have identical variable names.
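The independence of the two address spaces can be demonstrated with a few lines. This is just an illustrative sketch; the function name parent_value_after_child_write is mine.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the parent's value of `counter` after the child has changed
 * its own copy, or -1 on error. */
int parent_value_after_child_write(void)
{
    int counter = 1;        /* initialized before fork, so both copies start at 1 */

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        counter = 100;      /* modifies only the child's copy */
        _exit(counter == 100 ? 0 : 1);
    }

    if (waitpid(pid, NULL, 0) < 0)
        return -1;
    return counter;         /* the parent's copy was never touched */
}
```

The function returns 1: the assignment in the child happens in the child's address space and is invisible to the parent.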
Why use write rather than printf? Because printf() is "buffered," meaning printf() groups the output of a process together before sending it. While the parent process's output is being buffered, the child may also use printf to print some information, which will also be buffered. As a result, since the output is not sent to the screen immediately, you may not get the lines in the expected order. Worse, the output from the two processes may be mixed in strange ways. To avoid this problem, consider using the "unbuffered" write.
If you run this program, you might see the following on the screen:
................
This line is from pid 3456, value = 13
This line is from pid 3456, value = 14
................
This line is from pid 3456, value = 20
This line is from pid 4617, value = 100
This line is from pid 4617, value = 101
................
This line is from pid 3456, value = 21
This line is from pid 3456, value = 22
................
Process ID 3456 may be the one assigned to the parent or the child. Because these processes run concurrently, their output lines are intermixed in a rather unpredictable way. Moreover, the order of these lines is determined by the CPU scheduler. Hence, if you run this program again, you may get a totally different result.

You probably don't need to use fork in day-to-day programming if you are writing applications.
Even if you do want your program to start another program to do some task, there are other simpler interfaces which use fork behind the scenes, such as "system" in C and perl.
For example, if you wanted your application to launch another program such as bc to do some calculation for you, you might use 'system' to run it. System does a 'fork' to create a new process, then an 'exec' to turn that process into bc. Once bc completes, system returns control to your program.
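As a small illustration of the system() route, here is a wrapper that runs a command and decodes its exit status. The wrapper name run_and_wait is my own; system() and the W* macros are the standard ones.

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Runs `command` through system() and returns its exit status,
 * or -1 if it could not be run or did not exit normally. */
int run_and_wait(const char *command)
{
    /* system() does the fork, the exec of the shell and the wait
     * in a single call. */
    int status = system(command);
    if (status == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For the bc case, something like run_and_wait("echo '2+3' | bc") would print the result on standard output, assuming bc is installed; capturing that output inside your program needs popen() or a pipe instead.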
You can also run other programs asynchronously, but I can't remember how.
If you are writing servers, shells, viruses or operating systems, you are more likely to want to use fork.

Multiprocessing is central to computing. For example, your IE or Firefox can create a process to download a file for you while you are still browsing the internet. Or, while you are printing out a document in a word processor, you can still look at different pages and still do some editing with it.

Fork creates new processes. Without fork you would have a unix system that could only run init.

fork() is used to create new processes, as everybody has written.
Here is my code that creates processes in the form of a binary tree. It asks for the number of levels up to which you want to create processes in the binary tree.
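#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
int main(void)
{
    pid_t t1 = 0, t2 = 0;
    int i, n, ab;
    printf("enter the number of levels\n"); fflush(stdout);
    if (scanf("%d", &n) != 1)
        return 1;
    printf("root %d\n", (int)getpid()); fflush(stdout);
    for (i = 1; i < n; i++)
    {
        t1 = fork();            /* left child */
        if (t1 != 0)
            t2 = fork();        /* right child, created by the parent only */
        if (t1 != 0 && t2 != 0)
            break;              /* the parent stops; each child goes one level deeper */
        printf("child pid %d parent pid %d\n", (int)getpid(), (int)getppid()); fflush(stdout);
    }
    while (wait(&ab) > 0)       /* wait for this process's own children */
        ;
    return 0;
}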
OUTPUT
enter the number of levels
3
root 20665
child pid 20670 parent pid 20665
child pid 20669 parent pid 20665
child pid 20672 parent pid 20670
child pid 20671 parent pid 20670
child pid 20674 parent pid 20669
child pid 20673 parent pid 20669

First one needs to understand what the fork() system call is. Let me explain.
The fork() system call creates an exact duplicate of the parent process. It duplicates the parent's stack, heap, initialized data and uninitialized data, and shares the code with the parent process in read-only mode.
The fork system call copies memory on a copy-on-write basis, which means a virtual memory page is only copied for the child when one of the processes writes to it.
Now the purpose of fork():
fork() can be used wherever work has to be divided. For example, a server has to handle multiple clients, so the parent keeps accepting connections and forks a child for each client to perform the reads and writes.
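The fork-per-client pattern can be outlined without any real networking. In this sketch the "clients" are just integer IDs and handle_client is a hypothetical stand-in for the socket work; a real server would call accept() in the loop and pass the connected socket to the worker.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_WORKERS 16   /* arbitrary cap for this sketch */

/* Hypothetical per-client work; a real server would read from and
 * write to the client's socket here. Returns 0 on success. */
static int handle_client(int client_id)
{
    return client_id >= 0 ? 0 : 1;
}

/* Forks one worker per simulated client and returns how many workers
 * finished successfully. */
int serve_clients(int client_count)
{
    pid_t workers[MAX_WORKERS];
    int started = 0;

    for (int id = 0; id < client_count && started < MAX_WORKERS; id++) {
        pid_t pid = fork();
        if (pid < 0)
            break;                      /* could not start more workers */
        if (pid == 0)
            _exit(handle_client(id));   /* worker serves one client, then exits */
        workers[started++] = pid;       /* parent goes straight back to "accepting" */
    }

    int succeeded = 0;
    for (int i = 0; i < started; i++) {
        int status;
        if (waitpid(workers[i], &status, 0) > 0 &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            succeeded++;
    }
    return succeeded;
}
```

A long-running server would normally reap finished workers as it goes (for example from a SIGCHLD handler) rather than waiting for all of them at the end.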

fork() is used to spawn a child process. Typically it's used in similar sorts of situations as threading, but there are differences. Unlike threads, fork() creates whole separate processes, which means that the child and the parent, while they are direct copies of each other at the point that fork() is called, are completely separate; neither can access the other's memory space (without going to the normal troubles you go to access another program's memory).
fork() is still used by some server applications, mostly ones that run as root on a *NIX machine that drop permissions before processing user requests. There are some other usecases still, but mostly people have moved to multithreading now.

The rationale behind fork() versus just having an exec() function to initiate a new process is explained in an answer to a similar question on the unix stack exchange.
Essentially, since fork copies the current process, all of the various possible options for a process are established by default, so the programmer does not have to supply them.
In the Windows operating system, by contrast, programmers have to use the CreateProcess function which is MUCH more complicated and requires populating a multifarious structure to define the parameters of the new process.
So, to sum up, the reason for forking (versus exec'ing) is simplicity in creating new processes.

The fork() system call is used to create a child process. The child is an exact duplicate of the parent process: fork copies the stack section, heap section, data section, environment variables and command-line arguments from the parent.
refer: http://man7.org/linux/man-pages/man2/fork.2.html

Fork() was created as a way to create another process with a copy of the parent's memory state. It works the way it does because it was the most minimal change possible to get good threading capabilities in time-slicing mainframe systems that previously lacked this capability. Additionally, programs needed remarkably little modification to become multi-process: fork() could simply be added in the appropriate locations, which is rather elegant. Basically, fork() was the path of least resistance.
Originally it actually had to copy the entire parent process' memory space. With the advent of virtual memory, it has been changed to be more efficient, with copy-on-write mechanisms avoiding the need to actually copy any memory until one of the processes writes to it.
However, modern systems now allow the creation of actual threads, which simply share the parent process' actual heap. With modern multi-threading programming paradigms and more advanced languages, it's questionable whether fork() provides any real benefit, since fork() actually prevents processes from communicating through memory directly, and forces them to use slower message passing mechanisms.
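For comparison, this is roughly what the message passing mentioned above looks like with a pipe between a forked child and its parent. The function name sum_in_child is made up, and using -1 as the error value is a simplification (it cannot be told apart from a genuine sum of -1).

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Computes a + b in a child process and sends the result back to the
 * parent through a pipe. Returns the sum, or -1 on error. */
int sum_in_child(int a, int b)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        close(fds[0]);
        int sum = a + b;
        /* The child cannot store the result in the parent's memory,
         * so it has to send it explicitly. */
        ssize_t w = write(fds[1], &sum, sizeof sum);
        _exit(w == (ssize_t)sizeof sum ? 0 : 1);
    }

    close(fds[1]);
    int sum = -1;
    ssize_t n = read(fds[0], &sum, sizeof sum);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n == (ssize_t)sizeof sum ? sum : -1;
}
```

With threads the worker could simply write the result into a shared variable; with fork() the extra pipe is the price of the isolation.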
