I am working on an implementation where multiple processes read a regular file A. While this is happening, a new process P starts up and copies the contents of A to a regular file B. All processes starting up after this should read file B instead.
To make the switch from A to B, process P creates a temporary file T once B is fully written. All processes check whether T exists to decide where to read data from (i.e., read from A if T does NOT exist, and read from B if T exists).
Since T is just an indicator here, is it better to use a memory-mapped file rather than a regular file for faster performance?
Using a temporary file for synchronization is not safe: checking whether the file exists and then reading the data file is not atomic. One process could switch the files just after another process has completed the check and is about to read.
If you are developing in C and are allowed to use the IPC APIs, you could keep the flag in shared memory and guard it with a semaphore.
The reading processes should also signal when they have finished reading, so the switcher knows when A is no longer in use.
I have a task in which I have to write a program in C that manages access to, and reading/writing of, a file.
When the program starts it should create two processes (using fork()).
- The first process will be responsible for the initial write to the file (the file is a text file with 2000 random characters from a to z).
- The second process will be responsible for reading from the file, after the first process has finished writing.
My question is:
How can I synchronize the execution order using semaphores (the sem_* system calls) to ensure that the first process always starts first and the second process starts only after the first process has finished writing?
I can recommend using binary semaphores:
https://www.freertos.org/xSemaphoreCreateBinary.html
https://controllerstech.com/how-to-use-binary-semaphore-in-stm32/
If you are working in an embedded context I would recommend using task notifications, since they are less RAM-hungry and therefore may be a better fit on a less powerful system.
https://www.freertos.org/RTOS-task-notifications.html
I need to solve a concurrency assignment for my operating systems class. I don't want the solution here, but I am lacking one part.
We should write a process that writes to a file, reads from it, and then deletes it. We run this process twice in two different shells; no fork() here, for simplicity. Process A should write, process B should then read, and then the file should be deleted. Afterwards they switch roles.
I understand that you can achieve atomicity easily by locking, and with while loops around the read and write sections etc. you can get further control. But when I run process A and then process B, process B spins before the write section until it acquires the lock, and does not go into reading when process A releases the lock. So my best guess is to have a read lock and a write lock. That information must be shared somehow between the processes. The only way I can think of is some global variable, but since both processes hold their own copies of the variables, I think this is not possible. Another way would be to have a read-lock file and a write-lock file, but that seems overly complicated to me.
Is there a better way?
You can use semaphores to ensure the writer and deleter wait for the previous process to finish its job (see man sem_init for details).
When semaphores are used across multiple processes, they should be created in shared memory (man shm_open for details).
You will need as many semaphores as there are stages in this pipeline.
You can use a file as a lock. Two processes try to create a file with a previously agreed-upon name using the O_EXCL flag; only one will succeed, and the one that succeeds gets access to the resource. So in this case process A should try to create a file named, say, foo, with O_EXCL, and if successful, it should go ahead and write its information to the shared file. After its work is complete, process A should unlink foo. Process B should likewise try to create foo with O_EXCL, and if successful, read the file created by process A, then unlink foo when done. That way only one process accesses the shared file at any time.
Your problem (with files and alternating roles in the creation/deletion of files) looks like a candidate for the O_EXCL flag when opening/creating the file. This flag makes the open(2) system call succeed in creating the file only if the file doesn't already exist, so the file itself acts as a semaphore. Either process can release the lock, and whichever does so simply makes the owner role available again.
You will see that both processes try to take one of the roles, but if both try to take the owner role, one of them will succeed and the other will fail.
Just install a SIGINT signal handler in the owning process so it can delete the file if it gets signalled; otherwise you will leave the file behind, and after that no process will be able to assume the owner role (at least until you delete the file manually).
This was the first form of locking in Unix, long before semaphores, shared memory or other ways to block processes existed. It relies on the atomicity of the open(2) call: the kernel will not create the same file twice.
Here are processes a and b, both of which are multithreaded.
a forks, and the child immediately execs a new program, b;
a dups and freopens stderr to the log file (a is, de facto, Apache's httpd 2.22);
b inherits the opened stderr from a (I am adapting Apache httpd; b is my program), and b uses fprintf(stderr, ...) for logging;
so a and b share the same file for logging,
and there is no lock mechanism for a and b when writing the log.
I found that some log messages interleave, and a few messages get lost.
Can the two writers to the same file implicitly lock each other out?
The more important question is: if we use fprintf only within one single multithreaded process, is fprintf thread-safe, i.e. will one fprintf call never interleave with another fprintf call in another thread? Many articles say so, but this is not easy to verify myself, so I am asking for help here.
A: the code for duplicating the fd is like this:
......
rv = apr_file_dup2(stderr_log, s_main->error_log, stderr_p); // dup stderr to the log file
apr_file_close(s_main->error_log); // here, two fds point to the same file description, so close one of them
then
B: Apache itself logs in this manner:
......
if (rv != APR_SUCCESS) {
    ap_log_error(APLOG_MARK, APLOG_CRIT, rv, s_main, ".........");
C: for convenience, I log this way:
fprintf(stderr, ".....\n");
I am quite sure Apache and I use the same fd for file writing.
If you're using a single FILE object to perform output on an open file, then whole fprintf calls on that FILE will be atomic, i.e. a lock is held on the FILE for the duration of the fprintf call. Since a FILE is local to a single process's address space, this setup is only possible in multi-threaded applications; it does not apply to multi-process setups where several different processes access separate FILE objects referring to the same underlying open file. Even though you're using fprintf in both processes here, each process has its own FILE that it can lock and unlock without the others seeing the changes, so writes can end up interleaved. There are several ways to prevent this from happening:
Allocate a synchronization object (e.g. a process-shared semaphore or mutex) in shared memory and make each process obtain the lock before writing to the file (so only one process can write at a time); OR
Use filesystem-level advisory locking, e.g. fcntl locks or the (non-POSIX) BSD flock interface; OR
Instead of writing directly to the log file, write to a pipe that another process drains into the log file. Writes to a pipe are guaranteed (by POSIX) to be atomic as long as they are at most PIPE_BUF bytes long. You cannot use fprintf in this case (since it might perform multiple underlying write operations), but you could use snprintf into a PIPE_BUF-sized buffer followed by a single write.
I have a multithreaded server program where each thread is required to read the contents of a file to retrieve data requested by the client.
I am using pthreads in C to accomplish creating a thread and passing the function the thread which the thread will execute.
In the function, if I assign to a new FILE pointer with fopen() and subsequently read the contents of the file with fgets(), will each thread have its own file offset? That is, if thread 1 is reading from the file, and it is on line 5 of the file when thread 2 reads for the first time, does thread 2 start reading at line 5 or is it independent of where thread 1 is in the file?
Each open FILE has only one file pointer. That has one associated FD, and one file position (file offset as you put it).
But you can fopen the file twice (from two different threads or for that matter from the same thread) - as your edit now implies you are doing. That means you'll have two associated FDs and two separate file positions.
I.e., this has nothing to do with threads per se; it's just that if you want separate file positions, you need two FDs, which (with stdio) means two FILEs.
I have different processes concurrently accessing a named pipe in Linux and I want to make this access mutually exclusive.
I know is possible to achieve that using a mutex placed in a shared memory area, but being this a sort of homework assignment I have some restrictions.
Thus, what I thought of is to use file-locking primitives to achieve mutual exclusion; I made some attempts but I can't make it work.
This is what I tried:
flock(lock_file, LOCK_EX);
// critical section
flock(lock_file, LOCK_UN);
The different processes will use different file descriptors, but all referring to the same file.
Is it possible to achieve something like that? Can you provide an example?
The standard lock-file technique uses options such as O_EXCL on the open() call to try to create the file. You store the PID of the process holding the lock in the file, so other processes can determine whether that process still exists (using kill(pid, 0) to test). You have to worry about concurrency - a lot.
Steps:
Determine name of lock file based on name of FIFO
Open lock file if it exists
Check whether process using it exists
If other process exists, it has control (exit with error, or wait for it to exit)
If other process is absent, remove lock file
At this point, lock file did not exist when last checked.
Try to create it with open() and O_EXCL amongst the other options.
If that works, your process created the file - you have permission to go ahead.
Write your PID to the file; close it.
Open the FIFO - use it.
When done (atexit()?) remove the lock file.
Worry about what happens if you open the lock file and read no PID...is it that another process just created it and hasn't yet written its PID into it, or did it die before doing so? Probably best to back off - close the file and try again (possibly after a randomized nanosleep()). If you get the empty file multiple times (say 3 in a row) assume that the process is dead and remove the lock file.
You could consider having the process that owns the file maintain an advisory lock on the file while it has the FIFO open. If the lock is absent, the process has died. There is still a TOCTOU (time of check, time of use) window of vulnerability between opening the file and applying the lock.
Take a good look at the open() man page on your system to see whether there are any other options to help you. Sometimes, processes use directories (mkdir()) instead of files because even root can't create a second instance of a given directory name, but then you have issues with how to know the PID of the process with the resource open, etc.
I'd definitely recommend using an actual mutex (as has been suggested in the comments); for example, the pthread library provides an implementation. But if you want to do it yourself using a file for educational purposes, I'd suggest taking a look at this answer I posted a while ago which describes a method for doing so in Python. Translated to C, it should look something like this (Warning: untested code, use at your own risk; also my C is rusty):
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* each instance of the process should have a different filename here */
const char *process_lockfile = "/path/to/hostname.pid.lock";
/* all processes should have the same filename here */
const char *global_lockfile = "/path/to/lockfile";

/* create the file if necessary (only once, at the beginning of each process) */
FILE *f = fopen(process_lockfile, "w");
fprintf(f, "\n"); /* or maybe write the hostname and pid */
fclose(f);

/* now, each time you have to lock the file: */
int lock_acquired = 0;
while (!lock_acquired) {
    /* link() fails if global_lockfile already exists */
    int r = link(process_lockfile, global_lockfile);
    if (r == 0) {
        lock_acquired = 1;
    }
    else {
        /* link() can report an error even when it succeeded (e.g. over
           NFS); the link count on our own file tells us whether we
           actually got the lock */
        struct stat buf;
        stat(process_lockfile, &buf);
        lock_acquired = (buf.st_nlink == 2);
    }
}
/* do your writing */
unlink(global_lockfile);
lock_acquired = 0;
Your example is as good as you're going to get using flock(2) (which is, after all, merely an "advisory" lock, which is to say not a lock at all, really). The man page for it on my Mac OS X system has a couple of possibly important provisos:
Locks are on files, not file descriptors. That is, file descriptors duplicated through dup(2) or fork(2) do not result in multiple instances of a lock, but rather multiple references to a single lock. If a process holding a lock on a file forks and the child explicitly unlocks the file, the parent will lose its lock.
and
Processes blocked awaiting a lock may be awakened by signals.
both of which suggest ways it could fail.
// would have been a comment, but I wanted to quote the man page at some length