is it safe to write to a file in another thread? - c

I do not know, if this is ok, but it compiles:
typedef struct
{
int fd;
char *str;
int c;
} ARG;
void *ww(void *arg){
ARG *a = (ARG *)arg;
write(a->fd,a->str,a->c);
return NULL;
}
int main (void) {
int fd = open("./smf", O_CREAT|O_WRONLY|O_TRUNC, S_IRWXU);
int ch = fork();
if (ch==0){
ARG *arg; pthread_t p1;
arg->fd = fd;
arg->str = malloc(6);
strcpy(arg->str, "child");
arg->c = 6;
pthread_create( &p1, NULL, ww, arg);
} else {
write(fd, "parent\0", 7);
wait(NULL);
}
return 0;
}
I am wait()ing in the parent, but I do not know whether I should also call pthread_join to merge the threads or whether wait() does that implicitly. However, is it even safe to write to the same file in two threads? I ran it a few times and sometimes the output was 1) parentchild but sometimes only 2) parent, with no other cases. I do not know why the child did not write as well when the parent wait()s for it. Can someone please explain these outputs?

You need to call pthread_join() in the child process to avoid potential race conditions during the child process’s exit sequence (for example, the child process can otherwise exit before its thread gets a chance to write to the file). Calling pthread_join() in the parent process won’t help, because the thread exists only in the child.
As for the file, having both processes write to it is safe in the sense that it won’t cause a crash, but the order in which the data is written to the file will be indeterminate since the two processes are executing concurrently.

I do not know, if this is ok, but it compiles:
Without even any warnings? Really? I suppose the code you are compiling must include all the needed headers (else you should have loads of warnings), but if your compiler cannot be persuaded to spot
buggy.c:30:15: warning: ‘arg’ may be used uninitialized in this
function [-Wmaybe-uninitialized]
arg->fd = fd;
^
then it's not worth its salt. Indeed, variable arg is used uninitialized, and your program therefore exhibits undefined behavior.
But even if you fix that, after which the program can be made to compile without warnings, it still is not ok.
I am wait()ing in the parent, but I do not know whether I should also
call pthread_join to merge the threads or whether wait() does that implicitly.
The parent process is calling wait(). This waits for a child process to terminate, if there are any. Period. It has no implications for the behavior of the child prior to its termination.
Moreover, in a pthreads program, the main thread is special: when it returns from main() (which has the same effect as calling exit()), the whole process terminates, including all other threads. Your child process therefore suffers from a race condition: the main thread terminates immediately after creating a second thread, without ensuring that the other thread finishes first, so it is unpredictable what, if any, of the second thread's work is actually performed. To avoid this issue, yes, in the child process, the main thread should join the other one before itself terminating.
However
is it even safe to write to the same file in two threads?
It depends -- both on the circumstances and on what you mean by "safe". POSIX requires the write() function to be thread-safe, but that does not mean that multiple threads or processes writing to the same file cannot still interfere with each other by overwriting each other's output.
Yours is a somewhat special case, however, in that parent and child are writing via the same open file description in the kernel, the child having inherited an association with that from its parent. According to POSIX, then, you should see both processes' output (if any; see above) in the file. POSIX provides no way to predict the order in which those outputs will appear, however.
I ran it a few
times and sometimes the output was 1) parentchild but sometimes only 2)
parent, with no other cases. I do not know why the child did not write as well
when the parent wait()s for it. Can someone please explain these
outputs?
The child process can terminate before its second thread performs its write. In this case you will see only the parent's output, not the child's.
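A minimal sketch of the fix described above, assuming a POSIX system. The helper name parent_and_child_write and the write_job struct are hypothetical, introduced only for illustration. The child joins its writer thread before exiting, and both processes write through the inherited descriptor, so both outputs should end up in the file (in either order).

```c
#include <fcntl.h>
#include <pthread.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Data handed to the writer thread (hypothetical type for this sketch). */
typedef struct {
    int fd;
    const char *str;
    size_t len;
    int ok;              /* set by the thread: 1 if the full write succeeded */
} write_job;

static void *write_worker(void *p)
{
    write_job *job = p;
    ssize_t n = write(job->fd, job->str, job->len);
    job->ok = (n == (ssize_t)job->len);
    return NULL;
}

/* Returns the final file size in bytes, or -1 on any error. */
long parent_and_child_write(const char *path)
{
    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: the job is a real object, not an uninitialized pointer. */
        write_job job = { fd, "child", strlen("child"), 0 };
        pthread_t tid;
        if (pthread_create(&tid, NULL, write_worker, &job) != 0)
            _exit(1);
        /* Join before exiting; otherwise the process may end first. */
        pthread_join(tid, NULL);
        _exit(job.ok ? 0 : 1);
    }

    /* Parent: writes through the same open file description. */
    int ok = (write(fd, "parent", strlen("parent")) == (ssize_t)strlen("parent"));
    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status) ||
        WEXITSTATUS(status) != 0)
        ok = 0;

    struct stat st;
    long size = (ok && fstat(fd, &st) == 0) ? (long)st.st_size : -1;
    close(fd);
    return size;
}
```

Calling parent_and_child_write("./smf") should leave either parentchild or childparent in the file; on older systems the program may need to be linked with -pthread.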

Related

How am I supposed to use *status in waitpid(pid_t pid, int *status, int options)?

I don't understand what *status is supposed to do, or what it is doing.
There's an example below; could you explain what stat_cliente is doing?
for(int i = 0; i < Config.CLIENTES; i++){
int stat_cliente;
waitpid(Ind.pid_clientes[i], &stat_cliente,0);
if(WIFEXITED(stat_cliente)){ // WIFEXITED returns true if the child terminated normally
int status = WEXITSTATUS(stat_cliente); // WEXITSTATUS returns the 8 least
if(status < Config.SERVICOS){ // significant bits of the status passed to
// exit by the child
Ind.servicos_recebidos_por_clientes[status]++;
}
}
}
As widely documented, waitpid() is one of the functions that make a parent process wait for a child process status change after a fork(). It is mainly used to reap terminated child processes so that their resources are released.
There's no need to replicate the complete manual page explanation. Let's just say that in
pid_t waitpid(pid_t pid, int *status, int options);
pid is the process ID of the child process to wait for. This parameter distinguishes the function from its "sisters" because it allows waiting for a specific process. It also allows waiting for groups of processes; read the manual for further clarification.
status is a pointer to an integer in which waitpid() will store the new status.
options specifies the behavior of the function (allowing, for example, non-blocking mode). See the manual for a deeper explanation.
About status parameter
So, to answer your question: what is the meaning of the status parameter, and why is it a pointer to an integer? Passing a pointer to a variable is a common way for a function to return an additional output.
So this function actually has two outputs: the pid of the child whose state changed (through the return value) and the new status, written to the address you provided (unless the address is NULL).
Code analysis
There's a loop on a known number of clients defined within Config.CLIENTES
The program waits for the status change of each client. The stat_cliente variable is filled with the new state of that process (its address, an int *, is passed to waitpid()). Warning: there's no check on the return value of waitpid(), which can return -1 in case of error. In that case the value of stat_cliente would be meaningless!
The program checks whether the child process exited normally through the WIFEXITED() macro
In case of normal termination, the WEXITSTATUS() macro can be called in order to retrieve the exit status
For a limited subset of the exit statuses (those that identify services, I suppose, since the check against Config.SERVICOS is performed), the program increments the counter that corresponds to that status.
In conclusion, this program portion performs two tasks:
It makes sure that all child processes terminate gracefully through waitpid().
It updates a statistic table for a subset of them, called "services", in order to trace along the history the occurrence of all their possible termination causes for each of them.
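The checks discussed above can be sketched as a small helper. The name wait_for_exit_status and its return convention are assumptions made for this example; only waitpid(), WIFEXITED() and WEXITSTATUS() come from the original code. Unlike the loop in the question, it checks the return value of waitpid() before looking at the status.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper:
 *   returns the child's exit status (0..255) on normal termination,
 *   -1 if waitpid() itself failed (status would be meaningless),
 *   -2 if the child ended some other way (for example, killed by a signal).
 */
int wait_for_exit_status(pid_t pid)
{
    int status;
    pid_t r;

    /* Retry if the call was merely interrupted by a signal handler. */
    do {
        r = waitpid(pid, &status, 0);
    } while (r == -1 && errno == EINTR);

    if (r == -1)
        return -1;                      /* do not inspect status here */
    if (WIFEXITED(status))
        return WEXITSTATUS(status);     /* low 8 bits passed to exit() */
    return -2;
}
```

A caller in the style of the question could then do something like `int s = wait_for_exit_status(pid); if (s >= 0 && s < limit) counters[s]++;`, where limit and counters stand in for the question's Config.SERVICOS and counter array.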

C unnamed pipes and fork for calculation

So I'm trying to create a program that accepts user input (a price, for example 50). The first child passes it to the second, the second one adds 10 (price is now 60), the third one adds 50 (price is now 110), and the fourth one just prints/returns the final price. I have fork in a loop and I'm creating pipes, but the price is always the same: only 10 is added in each child. What is wrong, and how can I fix it so that it works the way I want?
My code:
int main(int argc,char *argv[])
{
int anon_pipe[2];
int n,N=4;
char value_price[100];
if(argc>1)
{
int price=atoi(argv[1]);
printf("%d\n",price);
if(pipe(anon_pipe)==-1){
perror("Error opening pipe");
return -1;
}
for(n = 0; n < N; n++){
switch(fork()){
case -1:
perror("Problem calling fork");
return -1;
case 0:
close(anon_pipe[1]);
read(anon_pipe[0],value_price,100);
price+=10;
sprintf(value_price,"%d \n",price);
printf("Price: %d\n",atoi(value_price));
write(anon_pipe[1],value_price,sizeof(value_price));
_exit(0);
}
}
close(anon_pipe[0]);
sleep(1);
close(anon_pipe[1]);
}
return 0;
}
You seem to think that forking makes the child start from the beginning of the program. This is not the case: the child continues from the point where fork() returned, with a copy of the parent's variables as they were at that moment.
For instance look at this code here:
read(anon_pipe[0],value_price,100);
price+=10;
sprintf(value_price,"%d \n",price);
printf("Price: %d\n",atoi(value_price));
You increase the value of price, but price is never set from what you read from the pipe. Each child therefore adds 10 to its own copy of the original price, and every child writes the same original-plus-10 value.
You should check the return values of your function calls for error codes. Had you done so, you would have detected the error arising from this combination of calls:
close(anon_pipe[1]);
// ...
write(anon_pipe[1],value_price,sizeof(value_price));
Very likely, you would also have detected that many of these calls ...
read(anon_pipe[0],value_price,100);
... signal end-of-file without reading anything. At the very least, you need read()'s return value to determine where to place the needed string terminator (which you fail to place before using the buffer as a string).
As a general rule, it is mandatory to handle the return values of read() and write(), for in addition to the possibility of errors / EOF, these functions may perform short data transfers instead of full ones. The return value tells you how many bytes were transferred, which you need to know to determine whether to loop to attempt to transfer more bytes.
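A minimal sketch of what handling those return values can look like. The helper names write_all and read_string are hypothetical; the point is only that both loop until the transfer is complete, stop on error or end-of-file, and that the reader places the string terminator based on the byte count it actually received.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes exactly len bytes unless an error occurs. Returns 0 or -1. */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n == -1) {
            if (errno == EINTR)
                continue;               /* interrupted: just retry */
            return -1;
        }
        p += n;                         /* short write: advance and loop */
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Reads until end-of-file or until cap - 1 bytes are stored, then
 * NUL-terminates. Returns the number of bytes read, or -1 on error.
 */
ssize_t read_string(int fd, char *buf, size_t cap)
{
    size_t used = 0;
    if (cap == 0)
        return -1;
    while (used < cap - 1) {
        ssize_t n = read(fd, buf + used, cap - 1 - used);
        if (n == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            break;                      /* end-of-file: writer closed */
        used += (size_t)n;
    }
    buf[used] = '\0';                   /* terminator goes after real data */
    return (ssize_t)used;
}
```

Note that read_string only returns once the writing end has been closed (or the buffer is full), which is one more reason to close unused pipe ends promptly.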
Moreover, you have all of your processes using the same pipe to communicate with each other. You might luck into that working, but it is probable that at least sometimes you'll end up with garbled communication. You really ought to create a separate pipe for each pair of communicating processes (including the parent process).
Furthermore, do not use sleep() to synchronize processes. It doesn't work reliably. Instead, the parent should wait() or waitpid() for each of its child processes, but only after starting them all and performing all needed pipe-end handling. Waiting for the child processes also prevents them from remaining zombies for any significant time after they exit. That doesn't much matter when the main process exits instead of proceeding to any other work, as in this case, but otherwise it constitutes a resource leak (process-table entries). You should form the good habit of waiting for your child processes.
Of course, all of that is moot if you don't actually write the data you mean to write; @SanchkeDellowar explains in his answer how you fail to do that.
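Putting the advice from both answers together, here is one possible sketch, not the asker's program. It assumes the value is passed as a raw int rather than as text, and run_price_chain, run_stage and MAX_STAGES are hypothetical names. Each pair of neighbouring processes gets its own pipe, every child uses the value it read from its input pipe, and the parent waits for all children instead of sleeping.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_STAGES 16   /* arbitrary limit chosen for this sketch */

/* Child body: read an int, add delta, pass it on, then _exit. */
static void run_stage(int in_fd, int out_fd, int delta)
{
    int value;
    if (read(in_fd, &value, sizeof value) != (ssize_t)sizeof value)
        _exit(1);                       /* error, EOF or short read */
    value += delta;                     /* uses the value that was read */
    if (write(out_fd, &value, sizeof value) != (ssize_t)sizeof value)
        _exit(1);
    _exit(0);
}

/* Returns 0 and stores the final price in *result, or -1 on failure. */
int run_price_chain(int price, const int *deltas, int count, int *result)
{
    pid_t pids[MAX_STAGES];
    int first[2];
    if (count < 0 || count > MAX_STAGES || pipe(first) == -1)
        return -1;

    int feed_fd = first[1];             /* parent -> first child */
    int prev_read = first[0];           /* input of the next stage */
    int started = 0, ok = 1;

    for (int i = 0; i < count; i++) {
        int next[2];
        if (pipe(next) == -1) { ok = 0; break; }
        pid_t pid = fork();
        if (pid == -1) { close(next[0]); close(next[1]); ok = 0; break; }
        if (pid == 0) {
            close(feed_fd);             /* child never feeds the chain */
            close(next[0]);             /* child only writes to next */
            run_stage(prev_read, next[1], deltas[i]);
        }
        pids[started++] = pid;
        close(prev_read);               /* parent drops ends it handed over */
        close(next[1]);
        prev_read = next[0];
    }

    if (ok && write(feed_fd, &price, sizeof price) != (ssize_t)sizeof price)
        ok = 0;
    close(feed_fd);
    if (ok && read(prev_read, result, sizeof *result) != (ssize_t)sizeof *result)
        ok = 0;
    close(prev_read);

    /* Reap every child that was started; no sleep() needed. */
    for (int i = 0; i < started; i++) {
        int status;
        if (waitpid(pids[i], &status, 0) == -1 || !WIFEXITED(status) ||
            WEXITSTATUS(status) != 0)
            ok = 0;
    }
    return ok ? 0 : -1;
}
```

With deltas of {0, 10, 50, 0} and a starting price of 50, the four stages reproduce the sequence from the question: pass through, add 10, add 50, pass through.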

Forks and Pointers in C

Can someone help me understand how the system handles variables that are set before a process makes a fork() call? Below is a small test program I wrote to try to understand what is going on behind the scenes.
I understand that the current state of a process is "cloned", variables included, at the time of the forking. My thought was, that if I malloc'd a 2D array before calling fork, I would need to free the array both in the parent and the child processes.
As you can see from the results below the sample code, the two values act as if they are totally separate from each other, yet they have the exact same address space. I expected that my final result for tmp would be -4 no matter which process completed first.
I am newer to C and Linux, so could someone explain how this is possible? Perhaps the variable tmp becomes a pointer to a pointer which is distinct in each process? Thanks so much.
#include <sys/types.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
int main()
{
int tmp = 1;
pid_t forkReturn= fork();
if(!forkReturn) {
/*child*/
tmp=tmp+5;
printf("Value for child %d\n",tmp);
printf("Address for child %p\n",&tmp);
}
else if(forkReturn > 0){
/*parent*/
tmp=tmp-10;
printf("Value for parent %d\n",tmp);
printf("Address for parent %p\n",&tmp);
}
else {
/*Error calling fork*/
printf("Error calling fork\n");
}
return 0;
}
RESULTS of standard out:
Value for child 6
Address for child 0xbfb478d8
Value for parent -9
Address for parent 0xbfb478d8
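fork() did indeed copy the entire address space, and changing memory in the child process does not affect the parent. The addresses you print are virtual addresses: after the fork, the same virtual address refers to separate memory in each process, which is why &tmp looks identical while the values differ. The key to understanding this is to remember that a pointer can only point to something in your own process, and the copy happens at a lower level.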
However, you should not call malloc() or free() at all in the child of fork(). This can deadlock (for example, if another thread was inside malloc() when you called fork()). The only functions safe to call in the child are the ones listed as safe for signal handlers (async-signal-safe). I used to be able to claim this was true only if you wrote multithreaded code; however, Apple was kind enough to spawn a background thread in the standard library, so the deadlock is real all the time. The child of fork() should never be allowed to drop out of the if block. Call _exit() to make sure it doesn't.
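A small sketch of the same experiment, written so that the child never leaves its branch and calls only _exit(). The helper name fork_value_demo is hypothetical, and reporting the child's value through its exit status is just a convenience for this example (it only works for values in 0..255).

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 0 on success and stores what each process ended up with. */
int fork_value_demo(int *parent_value, int *child_value)
{
    int tmp = 1;

    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: modifies its own copy of tmp only. */
        tmp = tmp + 5;
        _exit(tmp);         /* never falls out of this block */
    }

    /* Parent: its tmp is still 1 here, whatever the child did. */
    tmp = tmp - 10;

    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;

    *parent_value = tmp;                    /* 1 - 10 */
    *child_value = WEXITSTATUS(status);     /* 1 + 5  */
    return 0;
}
```

The two results match the output in the question (6 for the child, -9 for the parent), regardless of which process runs first.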

main process -> pthread -> fork + execvp

I am seeing a strange issue.
Sometimes when i run my program long enough i see that there are two copies of my program running. The second is a child process of the first since i see that the parent PID of the second one is that of the first one.
I realized that i have a fork in my code and its only because of this that i can have two copies running -- i can otherwise never have two copies of my program running.
This happens very rarely but it does happen.
The architecture is as follows:
The main program gets an event and spawns a pthread. In that thread i do some processing and based on some result i do a fork immediately followed by an execvp.
I realize that it's not best to call fork from a pthread, but in my design the main process gets many events and the only way to work on all those events in parallel was to use pthreads. Each pthread does some processing and in certain cases it needs to call a different program (for which i use execvp). Since i had to call a different program i had to use fork.
I am wondering whether, because i am eventually calling fork from a thread context, it is possible that multiple threads call fork + execvp in parallel and this "somehow" results in two copies being created.
If this is indeed happening, would it help if i protect the code that does fork + execvp with a mutex, since that would result in only one thread at a time calling fork + execvp?
However, if i take a mutex before fork + execvp then i don't know when to release it.
Any help here would be appreciated.
thread code that does fork + execvp -- in case you guys can spot an issue there:
In main.c
status = pthread_create(&worker_thread, tattr,
do_some_useful_work, some_pointer);
[clipped]
void *do_some_useful_work (void * arg)
{
/* Do some processing and fill pArguments array */
child_pid = fork();
if (child_pid == 0)
{
char *temp_log_file;
temp_log_file = (void *) malloc (strlen(FORK_LOG_FILE_LOCATION) +
strlen("/logfile.") + 8);
sprintf (temp_log_file, "%s/logfile.%d%c", FORK_LOG_FILE_LOCATION, getpid(),'\0');
/* Open log file */
int log = creat(temp_log_file, 0777);
/* Redirect stdout to log file */
close(1);
dup(log);
/* Redirect stderr to log file */
close(2);
dup(log);
syslog(LOG_ERR, "Opening up log file %s\n", temp_log_file);
free (temp_log_file);
close (server_sockets_that_parent_is_listening_on);
execvp ("jazzy_program", pArguments);
}
pthread_exit (NULL);
return NULL;
}
I looked through this code and i see no reason why i would do a fork and not do an execvp -- so the only scenario that comes to my mind is that multiple threads get executed and they all call fork + execvp. This sometimes causes two copies of my main program to run.
In the case where execvp fails for any reason (perhaps too many processes, out of memory, etc.), you fail to handle the error; instead the forked copy of the thread keeps running, which is how you end up with a second copy of your main program. Calling pthread_exit (or any other non-async-signal-safe function) in this process has undefined behavior, so it might not exit properly but hang or do something unexpected. You should always check for exec failure and immediately _exit(1) or similar when this happens. Also, while this probably isn't your problem, it's unsafe to call malloc after forking in a multithreaded process since it's non-async-signal-safe.
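A minimal sketch of that check. The helper name spawn_and_wait is hypothetical, the exit code 127 is only a common shell convention for "command could not be run", and unlike the code in the question this version also waits for the child so that the result is visible to the caller.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Runs argv[0] with the given argument vector and waits for it.
 * Returns the child's exit status, or -1 if fork/waitpid failed or the
 * child did not exit normally.
 */
int spawn_and_wait(char *const argv[])
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        execvp(argv[0], argv);
        /*
         * Only reached if execvp() failed. Leave immediately with
         * _exit(): no pthread_exit(), no return into the thread code.
         */
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this shape, a failed exec shows up as exit status 127 in the parent instead of as a stray second copy of the main program.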

vfork never ends

The following code never ends. Why is that?
#include <sys/types.h>
#include <stdio.h>
#include <unistd.h>
#define SIZE 5
int nums[SIZE] = {0, 1, 2, 3, 4};
int main()
{
int i;
pid_t pid;
pid = vfork();
if(pid == 0){ /* Child process */
for(i = 0; i < SIZE; i++){
nums[i] *= -i;
printf("CHILD: %d ", nums[i]); /* LINE X */
}
}
else if (pid > 0){ /* Parent process */
wait(NULL);
for(i = 0; i < SIZE; i++)
printf("PARENT: %d ", nums[i]); /* LINE Y */
}
return 0;
}
Update:
This code is just to illustrate some of the confusions I have regarding to vfork(). It seems like when I use vfork(), the child process doesn't copy the address space of the parent. Instead, it shares the address space. In that case, I would expect the nums array get updated by both of the processes, my question is in what order? How the OS synchronizes between the two?
As for why the code never ends, it is probably because I don't have any _exit() or exec() statement explicitly for exit. Am I right?
UPDATE2:
I just read: 56. Difference between the fork() and vfork() system call?
and I think this article helps me with my first confusion.
The child process from vfork() system call executes in the parent’s
address space (this can overwrite the parent’s data and stack ) which
suspends the parent process until the child process exits.
To quote from the vfork(2) man page:
The vfork() function has the same effect as fork(), except that the behaviour is undefined if the process created by vfork() either modifies any data other than a variable of type pid_t used to store the return value from vfork(), or returns from the function in which vfork() was called, or calls any other function before successfully calling _exit() or one of the exec family of functions.
You're doing a whole bunch of those things, so you shouldn't expect it to work. I think the real question here is: why you're using vfork() rather than fork()?
Don't use vfork. That's the simplest advice you can get. The only thing that vfork gives you is suspending the parent until the child either calls exec* or _exit. The part about sharing the address space is not guaranteed: some operating systems do it, others choose not to because it's very unsafe and has caused serious bugs.
Last time I looked at how applications use vfork in reality the absolute majority did it wrong. It was so bad that I threw away the 6 character change that enabled address space sharing on the operating system I was working on at that time. Almost everyone who uses vfork at least leaks memory if not worse.
If you really want to use vfork, don't do anything other than immediately call _exit or execve after it returns in the child process. Anything else and you're entering undefined territory. And I really mean "anything". You start parsing your strings to make arguments for your exec call and you're pretty much guaranteed that something will touch something it's not supposed to touch. And I also mean execve, not some other function from the exec family. Many libc out there do things in execvp, execl, execle, etc. that are unsafe in a vfork context.
What is specifically happening in your example:
If your operating system shares address space the child returning from main means that your environment cleans things up (flush stdout since you called printf, free memory that was allocated by printf and such things). This means that there are other functions called that will overwrite the stack frame the parent was stuck in. vfork returning in the parent returns to a stack frame that has been overwritten and anything can happen, it might not even have a return address on the stack to return to anymore. You first entered undefined behavior country by calling printf, then the return from main brought you into undefined behavior continent and the cleanup run after the return from main made you travel to undefined behavior planet.
From the official specification:
the behavior is undefined if the process created by vfork() either modifies any data other than a variable of type pid_t used to store the return value from vfork(),
In your program you modify data other than the pid variable, meaning the behavior is undefined.
You also have to call _exit to end the process, or call one of the exec family of functions.
The child must _exit rather than returning from main. If the child returns from main, then the stack frame does not exist for the parent when it returns from vfork.
Just call _exit() instead of returning: insert _exit(0) as the last line of the child branch. return 0 from main calls exit(0), which flushes and closes stdout, so when the parent's printf follows, the program misbehaves.
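For completeness, a sketch of the only usage pattern the answers above consider acceptable, assuming a system where vfork() is still available. The helper name vfork_exec_and_wait is hypothetical. The child does nothing except call execve() and, if that fails, _exit(); everything else happens in the parent.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/*
 * Runs the program at path with argv and waits for it.
 * Returns its exit status, or -1 on failure.
 */
int vfork_exec_and_wait(const char *path, char *const argv[])
{
    pid_t pid = vfork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: no printf, no data changes, no return from here. */
        execve(path, argv, environ);
        _exit(127);         /* reached only if execve() failed */
    }

    /* Parent resumes once the child has called execve() or _exit(). */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Replacing vfork() with fork() in this function should give the same observable result, which is part of why the answers recommend plain fork() unless there is a measured reason not to.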

Resources