The following code never ends. Why is that?
#include <sys/types.h>
#include <stdio.h>
#include <unistd.h>
#define SIZE 5
int nums[SIZE] = {0, 1, 2, 3, 4};
int main()
{
int i;
pid_t pid;
pid = vfork();
if(pid == 0){ /* Child process */
for(i = 0; i < SIZE; i++){
nums[i] *= -i;
printf(”CHILD: %d “, nums[i]); /* LINE X */
}
}
else if (pid > 0){ /* Parent process */
wait(NULL);
for(i = 0; i < SIZE; i++)
printf(”PARENT: %d “, nums[i]); /* LINE Y */
}
return 0;
}
Update:
This code is just to illustrate some of the confusions I have regarding to vfork(). It seems like when I use vfork(), the child process doesn't copy the address space of the parent. Instead, it shares the address space. In that case, I would expect the nums array get updated by both of the processes, my question is in what order? How the OS synchronizes between the two?
As for why the code never ends, it is probably because I don't have any _exit() or exec() statement explicitly for exit. Am I right?
UPDATE2:
I just read: 56. Difference between the fork() and vfork() system call?
and I think this article helps me with my first confusion.
The child process from vfork() system call executes in the parent’s
address space (this can overwrite the parent’s data and stack ) which
suspends the parent process until the child process exits.
To quote from the vfork(2) man page:
The vfork() function has the same effect as fork(), except that the behaviour is undefined if the process created by vfork() either modifies any data other than a variable of type pid_t used to store the return value from vfork(), or returns from the function in which vfork() was called, or calls any other function before successfully calling _exit() or one of the exec family of functions.
You're doing a whole bunch of those things, so you shouldn't expect it to work. I think the real question here is: why you're using vfork() rather than fork()?
Don't use vfork. That's the simplest advice you can get. The only thing that vfork gives you is suspending the parent until the child either calls exec* or _exit. The part about sharing the address space is incorrect, some operating systems do it, other choose not to because it's very unsafe and has caused serious bugs.
Last time I looked at how applications use vfork in reality the absolute majority did it wrong. It was so bad that I threw away the 6 character change that enabled address space sharing on the operating system I was working on at that time. Almost everyone who uses vfork at least leaks memory if not worse.
If you really want to use vfork, don't do anything other than immediately call _exit or execve after it returns in the child process. Anything else and you're entering undefined territory. And I really mean "anything". You start parsing your strings to make arguments for your exec call and you're pretty much guaranteed that something will touch something it's not supposed to touch. And I also mean execve, not some other function from the exec family. Many libc out there do things in execvp, execl, execle, etc. that are unsafe in a vfork context.
What is specifically happening in your example:
If your operating system shares address space the child returning from main means that your environment cleans things up (flush stdout since you called printf, free memory that was allocated by printf and such things). This means that there are other functions called that will overwrite the stack frame the parent was stuck in. vfork returning in the parent returns to a stack frame that has been overwritten and anything can happen, it might not even have a return address on the stack to return to anymore. You first entered undefined behavior country by calling printf, then the return from main brought you into undefined behavior continent and the cleanup run after the return from main made you travel to undefined behavior planet.
From the official specification:
the behavior is undefined if the process created by vfork() either modifies any data other than a variable of type pid_t used to store the return value from vfork(),
In your program you modify data other than the pid variable, meaning the behavior is undefined.
You also have to call _exit to end the process, or call one of the exec family of functions.
The child must _exit rather than returning from main. If the child returns from main, then the stack frame does not exist for the parent when it returns from vfork.
just call the _exit instead of calling return or insert _exit(0) to the last line in "child process". return 0 calls exit(0) while close the stdout, so when another printf follows, the program crashes.
Related
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
int main(void) {
for (int i = 1; i < 4; i++) {
printf("%d", i);
int id = fork();
if (id == 0) {
printf("Hello\n");
exit(0);
} else {
exit(0);
}
}
return 0;
}
For this code, it prints 11Hello on my computer. It seems counter-intuitive to me because "1" is printed twice but it's before the fork() is called.
The fork() system call forks a new process and executes the instruction that follows it in each process parallelly. After your child process prints the value of i to the stdout, it gets buffered which then prints the value of 'i' again because stdout was not flushed.
Use the fflush(stdout); so that 'i' gets printed only once per fork.
Alternately, you could also use printf("%d\n", i); where the new line character at the end does the job.
Where does the process start to execute after fork()
fork() duplicates the image of the process and it's context. It will run the next line of code pointed by the instruction pointer.
It seems counter-intuitive to me because "1" is printed twice but it's before the fork() is called.
Read printf anomaly after "fork()"
To begin with, the for loop is superfluous in your example.
Recall that the child copies the caller's memory(that of its parent) (code, globals, heap and stack), registers, and open files. To be performant or there may be some other reason, the printf call may not flush the buffer and put the things passed to that except for some cases such as appending new-line-terminator.
Before forking, the parent(main process) is on the way.
Let's assume we're on a single core system and the child first preempts the core.
1 is in the buffer because its parent put it into that before forking. Then, the child reaches second print statement, a caveat here is that the child can be orphaned at that time(no matter for this moment), passing "Hello\n" string including new-line character giving rise to dump the buffer/cache(whatever you call.) Since it sees \n character, it flushes the buffer including prior 1 added by its parent, that is 11Hello.
Let's assume the parent preempts the core at first,
It surrenders after calling exit statement, bringing on the child to be orphaned, causing memory leak. After that point, the boss(init possessing process id as 1) whose newly name I forget(it may be sys-something) should handle this case. However, nothing is changed as to the printing-steps. So you run into again 11Hello except if not the buffer is flushed automagically.
I don't have much working experience with them but university class(I failed at the course 4 times). However, I can advise you whenever possible use stderr while coping with these tings since it is not buffered, in lieu of stdout or there is some magical way(I forget it again, you call it at the beginning in main()) you can opt for to disable buffering for stdout as well.
To be more competent over these topics, you should glance at The Linux Programming Interface of Michael Kerrisk and the topics related to William Pursell,
Jonathan Leffler,
WhozCraig,
John Bollinger, and
Nominal Animal. I have learnt a plethora of information from them even if the information almost wholly is useless in Turkey borders.
*Magic means needing a lot of details to explain.
I do not know, if this is ok, but it compiles:
typedef struct
{
int fd;
char *str;
int c;
} ARG;
void *ww(void *arg){
ARG *a = (ARG *)arg;
write(a->fd,a->str,a->c);
return NULL;
}
int main (void) {
int fd = open("./smf", O_CREAT|O_WRONLY|O_TRUNC, S_IRWXU);
int ch = fork();
if (ch==0){
ARG *arg; pthread_t p1;
arg->fd = fd;
arg->str = malloc(6);
strcpy(arg->str, "child");
arg->c = 6;
pthread_create( &p1, NULL, ww, arg);
} else {
write(fd, "parent\0", 7);
wait(NULL);
}
return 0;
}
I am wait()int in parent, but I do not know if I should also pthread_join to merge threads or it is implicitly by wait(). However is it even safe to write to the same file in two threads? I run few times and sometimes output was 1) parentchild but sometimes only 2) parent, no other cases - I do not know why child did not write as well when parent wait()s for it. Can someone please explain why these outputs?
You need to call pthread_join() in the child process to avoid potential race conditions during the child process’s exit sequence (for example the child process can otherwise exit before its thread gets a chance to write to the file). Calling pthread_join() in the parent process won’t help,
As for the file, having both processes write to it is safe in the sense that it won’t cause a crash, but the order in which the data is written to the file will be indeterminate since the two processes are executing concurrently.
I do not know, if this is ok, but it compiles:
Without even any warnings? Really? I suppose the code you are compiling must include all the needed headers (else you should have loads of warnings), but if your compiler cannot be persuaded to spot
buggy.c:30:15: warning: ‘arg’ may be used uninitialized in this
function [-Wmaybe-uninitialized]
arg->fd = fd;
^
then it's not worth its salt. Indeed, variable arg is used uninitialized, and your program therefore exhibits undefined behavior.
But even if you fix that, after which the program can be made to compile without warnings, it still is not ok.
I am wait()int in parent, but I do not know if I should also
pthread_join to merge threads or it is implicitly by wait().
The parent process is calling wait(). This waits for a child process to terminate, if there are any. Period. It has no implications for the behavior of the child prior to its termination.
Moreover, in a pthreads program, the main thread is special: when it terminates, the whole program terminates, including all other threads. Your child process therefore suffers from a race condition: the main thread terminates immediately after creating a second thread, without ensuring that the other thread terminates first, so it is undefined what, if any, of the behavior of the second thread is actually performed. To avoid this issue, yes, in the child process, the main thread should join the other one before itself terminating.
However
is it even safe to write to the same file in two threads?
It depends -- both on the circumstances and on what you mean by "safe". POSIX requires the write() function to be thread-safe, but that does not mean that multiple threads or processes writing to the same file cannot still interfere with each other by overwriting each other's output.
Yours is a somewhat special case, however, in that parent and child are writing via the same open file description in the kernel, the child having inherited an association with that from its parent. According to POSIX, then, you should see both processes' output (if any; see above) in the file. POSIX provides no way to predict the order in which those outputs will appear, however.
I run few
times and sometimes output was 1) parentchild but sometimes only 2)
parent, no other cases - I do not know why child did not write as well
when parent wait()s for it. Can someone please explain why these
outputs?
The child process can terminate before its second thread performs its write. In this case you will see only the parent's output, not the child's.
Can someone help me understand how the system handles variables that are set before a process makes a fork() call. Below is a small test program I wrote to try understanding what is going on behind the scenes.
I understand that the current state of a process is "cloned", variables included, at the time of the forking. My thought was, that if I malloc'd a 2D array before calling fork, I would need to free the array both in the parent and the child processes.
As you can see from the results below the sample code, the two values act as if they are totally separate from each other, yet they have the exact same address space. I expected that my final result for tmp would be -4 no matter which process completed first.
I am newer to C and Linux, so could someone explain how this is possible? Perhaps the variable tmp becomes a pointer to a pointer which is distinct in each process? Thanks so much.
#include <sys/types.h>
#include <unistd.h>
#include <stdlib.h>
int main()
{
int tmp = 1;
pid_t forkReturn= fork();
if(!forkReturn) {
/*child*/
tmp=tmp+5;
printf("Value for child %d\n",tmp);
printf("Address for child %p\n",&tmp);
}
else if(forkReturn > 0){
/*parent*/
tmp=tmp-10;
printf("Value for parent %d\n",tmp);
printf("Address for parent %p\n",&tmp);
}
else {
/*Error calling fork*/
printf("Error calling fork);
}
return 0;
}
RESULTS of standard out:
Value for child 6
Address for child 0xbfb478d8
Value for parent -9
Address for parent 0xbfb478d8
It did indeed copy the entire address space, and changing memory in the child process does not affect the parent. The key to understanding this is to remember that a pointer can only point to something in your own process, and the copy happens at a lower level.
However, you should not call malloc() or free() at all in the child of fork. This can deadlock (another thread was in malloc() when you called fork()). The only functions safe to call in the child are the ones also listed as safe for signal handlers. I used to be able to claim this was true only if you wrote multithreaded code; however Apple was kind enough to spawn a background thread in the standard library, so the deadlock is real all the time. The child of fork should never be allowed to drop out of the if block. Call _exit to make sure it doesn't.
when i try the following snippet i am getting an error called stack smashing detected. what could be the reason for this potential bug? Can some one explain?
#include<stdio.h>
#include<unistd.h>
#include<sys/types.h>
int glob=88;
int main()
{
int loc=2;
pid_t pid=vfork();
if(pid==0)
{
printf("Inside child");
glob++;
loc++;
printf("%d %d" ,glob,loc);
}
else
{
printf("Inside parent");
glob++;
loc++;
printf("%d %d",glob,loc);
}
}
and the output when I run this code is like that
user018#sshell ~ $ gcc one.c
user018#sshell ~ $ ./a.out
Inside child89 3Inside parent90 945733057*** stack smashing detected ***: a.out
- terminated
a.out: stack smashing attack in function <unknown> - terminated
KILLED
From the Linux man page (and POSIX):
The vfork() function has the same effect as fork(2), except that the behavior is undefined if the process created by vfork() either modifies any data other than a variable of type pid_t used to store the return value from vfork(), or returns from the function in which vfork() was called, or calls any other function before successfully calling _exit(2) or one of the exec(3) family of functions.
You're modifying data and returning from the function in which vfork was invoked - both of these lead to undefined behavior. vfork is not equivalent to fork, the number of things you can do in a vforkd child are very, very limited. It should only be used in very specific circumstances, essentially when the only thing you need to do in the child is exec something else.
See your operating system's man page for the full details.
vfork() is used to create new processes without copying the page tables of the parent process. So you can't modify the variables in the child process because they are not there anymore. Use fork() instead.
One more thing, it's better to add a \n to the end of printf() because stdout is line buffered by default.
1) I would definitely add a "return 0", since you declared "int main()".
2) If you wanted to disable the warning, use -fno-stack-protector in your compile line.
3) If you wanted to debug where the error is coming from, use "-g" in your compile line, and run the program from gdb (instead of running ./a.out).
My closest to "what's wrong" is this man page about vfork():
http://linux.die.net/man/3/vfork
The vfork() function shall be equivalent to fork(), except that the
behavior is undefined if the process created by vfork() either
modifies any data other than a variable of type pid_t used to store
the return value from vfork(), or returns from the function in which
vfork() was called, or calls any other function before successfully
calling _exit() or one of the exec family of functions.
For Linux, just use "fork()", and I think you'll be happy :)
#include<stdio.h>
#include<sys/types.h>
#include<stdlib.h>
int turn;
int flag[2];
int main(void)
{
int pid,parent=1;
printf("before vfork\n");
if((pid=vfork())<0)
{
perror("vfork error\n");
return 1;
}
while(1)
{
if(pid==0)
{
while(parent==1)
{
sleep(2);
}
parent=0;
flag[0]=1;
turn=1;
while(flag[1]&&turn==1);
printf("This is critical section:parent process\n");
flag[0]=0;
}
else
{
parent=2;
printf("This is parent");
flag[1]=1;
turn=0;
while(flag[0]&&turn==0);
printf("This is critical section:child process %d \n",pid);
flag[1]=0;
}
}
}
This is my code. Can anyone tell why control is not coming to my parent process.
man 2 vfork says:
(From POSIX.1) The vfork() function has the same effect as fork(2),
except that the behavior is undefined if the process created by vfork()
either modifies any data other than a variable of type pid_t used to
store the return value from vfork(), or returns from the function in
which vfork() was called, or calls any other function before success-
fully calling _exit(2) or one of the exec(3) family of functions.
Because you are modifying data in child process, vfork() behaviour is undefined, so whatever it does is correct (here by "correct" I mean "complies with the spec").
Because, the entire virtual address space of the parent is replicated in the child.
So two processes has separate address spaces. And flag[1] never become 1 in parent process.
vfork() was designed for fast creation of processes with execve(): vfork() + immediate execve(). And in cases when children modifies any data of parent process, behavior is undefined. Check man about details.
In my case, that behavior was such that parent reach control only after close of children.
Here is the important bit from the Linux man page. Keywords: "parent is suspended until the child ..."
vfork() differs from fork(2) in that the parent is suspended until the
child makes a call to execve(2) or _exit(2). The child shares all mem-
ory with its parent, including the stack, until execve(2) is issued by
the child. The child must not return from the current function or call
exit(3), but may call _exit(2).
Your parent process has been suspended. Consider using clone.
This code doesn't guarantee fairness; it only prevents deadlock.
There's nothing stopping the child process from executing over and over again.
If you want the lock to be fair, you'll have to make it so.