In Linux and other modern operating systems, each process's memory is protected, so that a wild write in one process does not crash any other process. Now assume we have memory shared between process A and process B, and that, due to a soft error, process A unintentionally writes something to that memory area. Is there any way to protect against this, given that both process A and process B have full write access to that memory?
When you call shm_open you can pass it the O_RDONLY flag in the oflag parameter.
Alternatively you can use mprotect to mark specific pages as (e.g.) read-only. You'll need cooperation and trust between the two processes to do this; there is no way for B to force A not to write to the memory using mprotect.
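For instance, a minimal sketch of the mprotect approach, assuming another process has already created and sized a segment named "/myshm" (an illustrative name):

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    int fd = shm_open("/myshm", O_RDWR, 0);
    if (fd == -1) { perror("shm_open"); return 1; }

    size_t len = 4096;
    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) { perror("mmap"); return 1; }

    /* ... any initialization that needs write access goes here ... */

    /* From now on, a stray write in *this* process faults instead of
       silently corrupting the shared data; it does not constrain the peer. */
    if (mprotect(addr, len, PROT_READ) == -1) { perror("mprotect"); return 1; }

    return 0;
}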
If you really want to be sure that the other process can't interfere then communicating via pipes or sockets of some description might be a sensible idea.
You could also use mmap to map a file (e.g. in /dev/shm) whose permissions make it impossible for one of the two processes to write to it, provided the processes run as separate UIDs. For example, if /dev/shm/myprocess is owned by user producer and group consumer, and the file permissions are set to 0640 before it is mapped, then a process running with the consumer GID but not the producer UID can only map it read-only, which prevents that second process from writing to it.
You may use a simple checksum on each write. Then, when a process detects a wrong checksum upon a read operation, it is a sign that the other process has failed.
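A rough sketch of the checksum idea (the record layout, the simple_sum helper, and the sizes are all illustrative; a real implementation would also need to make the update atomic with respect to the reader):

#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct shared_record {
    uint32_t checksum;        /* checksum of `data` */
    unsigned char data[256];
};

static uint32_t simple_sum(const unsigned char *p, size_t n)
{
    uint32_t s = 0;
    while (n--) s = s * 31 + *p++;
    return s;
}

/* Writer: update the data, then the checksum. */
static void record_write(struct shared_record *r, const void *src, size_t n)
{
    memcpy(r->data, src, n);
    r->checksum = simple_sum(r->data, sizeof r->data);
}

/* Reader: returns 0 if the checksum matches, -1 if corruption is suspected. */
static int record_check(const struct shared_record *r)
{
    return simple_sum(r->data, sizeof r->data) == r->checksum ? 0 : -1;
}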
I need to solve a concurrency assignment for my operating systems class. I don't want the solution here, but I am lacking one part.
We should write a process that writes to a file, reads from it, and then deletes it. We should run this process twice, in two different shells. No fork here, for simplicity. Process A should write, Process B should then read, and then the file should be deleted. Afterwards they switch roles.
I understand that you can achieve atomicity easily by locking, and with while loops around the read and write sections you can get further control. But when I run process A and then process B, process B will spin before the write section until it acquires the lock, and will not go into reading when process A releases the lock. So my best guess is to have a read lock and a write lock. This information must be shared somehow between the processes. The only way I can think of is some global variable, but since both processes hold their own copies of the variables, I think this is not possible. Another way would be to have a read lock file and a write lock file, but that seems overly complicated to me.
Is there a better way?
You can use semaphores to ensure the writer and deleter wait for the previous process to finish its job (see man sem_init for details).
When semaphores are used across multiple processes, they should be created in shared memory (see man shm_open for more details).
You will need as many semaphores as there are stages in the pipeline.
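A minimal sketch of a pair of process-shared semaphores placed in POSIX shared memory (the name "/assignment_sems" and the struct layout are illustrative, and only one of the two processes should perform the sem_init calls):

#include <fcntl.h>
#include <semaphore.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

struct sems {
    sem_t can_write;   /* the writer may produce the file */
    sem_t can_read;    /* the reader may consume it */
};

int main(void)
{
    int fd = shm_open("/assignment_sems", O_CREAT | O_RDWR, 0600);
    if (fd == -1) { perror("shm_open"); return 1; }
    if (ftruncate(fd, sizeof(struct sems)) == -1) { perror("ftruncate"); return 1; }

    struct sems *s = mmap(NULL, sizeof *s, PROT_READ | PROT_WRITE,
                          MAP_SHARED, fd, 0);
    if (s == MAP_FAILED) { perror("mmap"); return 1; }

    /* Only the process that created the object should initialize;
       pshared = 1 makes the semaphores usable across processes. */
    sem_init(&s->can_write, 1, 1);
    sem_init(&s->can_read, 1, 0);

    /* Writer side: sem_wait(&s->can_write); ...write the file...;
       sem_post(&s->can_read);  The reader does the mirror image. */
    return 0;
}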
You can use a file as a lock. Two processes try to create a file with a previously agreed-upon name using the O_EXCL flag. Only one will succeed. The one that succeeds gets access to the resource. So in this case process A should try to create a file with an agreed-upon name, say foo, with the O_EXCL flag and, if successful, it should go ahead and write the information to the file. After its work is complete, Process A should unlink foo. Process B should try to create the file foo with the O_EXCL flag, and if successful, try to read the file created by Process A. After its attempt is over, Process B should unlink the file foo. That way only one process will be accessing the file at any time.
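A minimal sketch of that lock protocol (error handling omitted; the helper names are illustrative):

#include <fcntl.h>
#include <unistd.h>

/* Returns 0 if we now "own" the resource, -1 if another process does. */
static int take_lock(void)
{
    int fd = open("foo", O_CREAT | O_EXCL | O_WRONLY, 0644);
    if (fd == -1)
        return -1;       /* the file already exists: someone else holds it */
    close(fd);           /* the file's existence is the lock, the fd isn't needed */
    return 0;
}

static void release_lock(void)
{
    unlink("foo");
}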
Your problem (with files and alternating roles in the creation/deletion of files) seems to be a candidate for the O_EXCL flag when opening/creating the file. This flag makes the open(2) system call succeed in creating a file only if the file doesn't exist, so the file itself acts as a semaphore. Either process (A or B) can release the lock, and whichever one does simply releases the lock and makes the owner role available again.
You will see that both processes try to use one of the roles, but if they both try to use the owner role, one of them will succeed, and the other will fail.
Just install a SIGINT signal handler in the owning process, to allow it to delete the file in case it gets signalled; otherwise you will leave the file behind and after that no process will be able to assume the owner role (at least until you delete it manually). A sketch of such a handler follows.
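A sketch of that cleanup handler, assuming the agreed-upon lock file is named foo:

#include <signal.h>
#include <string.h>
#include <unistd.h>

#define LOCKFILE "foo"   /* the agreed-upon lock file name */

static void on_sigint(int sig)
{
    (void)sig;
    unlink(LOCKFILE);    /* release the "lock" so another process can take the owner role */
    _exit(1);
}

int main(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGINT, &sa, NULL);

    /* ... create the lock file with O_CREAT|O_EXCL, do the work, unlink ... */
    return 0;
}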
This was the first form of locking in Unix, long before semaphores, shared memory, or other ways to block processes existed. It relies on the atomicity of the system call: the existence check and the creation happen inside a single open(2) call, so two processes cannot both succeed in creating the same file.
I am writing a rudimentary shell program in C which uses a parent process to handle shell events and fork() to create child processes that call execv on another executable (also C).
I am trying to keep a process counter on the parent process. And as such I thought of the possibility of creating a pointer to a variable that keeps track of how many processes are running.
However, that seems to be impossible, since the arguments that execv (and the program executed by it) takes are of type char *const argv[].
I have tried to keep track of the amount of processes using mmap for shared memory between processes, but couldn't get that to work since after the execv call the process simply dies and doesn't let me update the process counter.
In summary, my question is: Is there a way for me to pass a pointer to an integer on an execv call to another program?
Thank you in advance.
You cannot meaningfully pass a pointer from one process to another because the pointer is meaningless in the other process. Each process has its own memory, and the address is relative to that memory space. In other words, the virtual memory manager lets every process pretend it has the entire machine's memory; other processes are simply invisible.
However, you do have a few options for setting up communications between related processes. The most obvious one is a pipe, which you've presumably already encountered. That's more work, though, because you need to make sure that some process is always listening for pipe communications.
Another simple possibility is to just leave a file descriptor open when you fork and exec (file descriptors stay open across exec unless the close-on-exec flag is set on them); although memory mappings are not preserved across exec, you can remap the memory from the open fd in the exec'd program, as sketched below. If you don't want to pass the fd, you can mmap the memory to a temporary file, and use an environment variable to record the name of the temporary file.
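As a sketch of that idea (the COUNTER_FD environment variable and the ./child program name are illustrative, not a fixed convention):

#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    /* Back the counter with a temporary file; mkstemp leaves the
       descriptor open, and it is inherited across exec by default. */
    char path[] = "/tmp/counterXXXXXX";
    int fd = mkstemp(path);
    if (fd == -1) { perror("mkstemp"); return 1; }
    ftruncate(fd, sizeof(int));

    int *counter = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                        MAP_SHARED, fd, 0);
    if (counter == MAP_FAILED) { perror("mmap"); return 1; }
    *counter = 0;

    if (fork() == 0) {
        char buf[16];
        snprintf(buf, sizeof buf, "%d", fd);
        setenv("COUNTER_FD", buf, 1);      /* tell the exec'd program which fd to remap */
        execl("./child", "child", (char *)NULL);
        _exit(127);
    }

    /* The exec'd program ("./child", illustrative) would then do:
         int fd = atoi(getenv("COUNTER_FD"));
         int *counter = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                             MAP_SHARED, fd, 0);
       after which both processes see the same counter (synchronize the
       increments, as noted below). */
    return 0;
}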
Another possibility is POSIX shared memory. Again, you might want to communicate the shm name through an environment variable, rather than hard-coding it into the application.
Note that neither shared mmaps nor shared memory are atomic. If you're incrementing a counter, you'll need to use some locking mechanism to avoid race conditions.
For possibly a lot more information than you really wanted, you can read ESR's overview of interprocess communication techniques in Chapter 7 of The Art of Unix Programming.
As the parent process is using a huge amount of memory, fork may fail with an errno of ENOMEM under some configurations of the kernel overcommit policy, even though the child process may only exec a low memory-consuming program like ls.
To clarify the problem: when /proc/sys/vm/overcommit_memory is configured to be 2, allocation of (virtual) memory is limited to SWAP + MEMORY * ratio (default 50%).
When a process forks, virtual memory is not copied, thanks to COW. But the kernel still needs to allocate virtual memory space. As an analogy, fork is like malloc(virtual memory space size): it does not allocate physical memory, and writing to a shared page causes the page to be copied, at which point physical memory is allocated. When overcommit_memory is configured to be 2, fork may fail due to this virtual memory space allocation.
Is it possible to fork a process without inheriting the virtual memory space of the parent process, in the following conditions?
if the child process calls exec after fork
if the child process doesn't call exec and will not use any global or static variables from the parent process. For example, the child process just does some logging and then quits.
As Basile Starynkevitch answered, it's not possible.
There is, however, a very simple and common solution used for this, that does not rely on Linux-specific behaviour or memory overcommit control: use an early-forked slave process to do the fork and exec.
Have the large parent process create a Unix domain socket and fork a slave process as early as possible, closing all other descriptors in the slave (and reopening STDIN_FILENO, STDOUT_FILENO, and STDERR_FILENO to /dev/null). I prefer a datagram socket for its simplicity and guarantees, although a stream socket will also work.
In some rare cases it is useful to have the slave process execute a separate, dedicated small helper program; in most instances this is not necessary, but it makes security design much easier. (In Linux, you can include SCM_CREDENTIALS ancillary messages when passing data over a Unix domain socket, and use the process ID therein to verify the identity/executable of the peer via the /proc/PID/exe pseudo-file.)
In any case, the slave process will block in reading from the socket. When the other end closes the socket, the read/receive will return 0, and the slave process will exit.
Each datagram the slave process receives describes a command to execute. (Using a datagram allows using C strings, delimited with NUL characters, without any escaping etc.; using a Unix stream socket typically requires you to delimit the "command" somehow, which in turn means escaping the delimiters in the command component strings.)
The slave process creates one or more pipes, and forks a child process. This child process closes the original Unix socket, replaces the standard streams with the respective pipe ends (closing the other ends), and executes the desired command. I personally prefer to use an extra close-on-exec socket in Linux to detect successful execution; in an error case, the errno code is written to the socket, so that the slave-parent can reliably detect the failure and the exact reason, too. On success, the slave-parent closes the unnecessary pipe ends, replies to the original process about the success, with the other pipe ends as SCM_RIGHTS ancillary data. After sending the message, it closes the rest of the pipe ends, and waits for a new message. A simplified sketch of the slave loop follows.
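Here is a much-simplified sketch of the idea, omitting the SCM_RIGHTS pipe passing, the close-on-exec error reporting, and all robustness; it just forks the slave early and lets it run one shell command per datagram (the use of /bin/sh -c is a simplification of the NUL-delimited argument scheme described above):

#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int spawn_slave(void)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) == -1)
        return -1;

    pid_t pid = fork();              /* done early, while the parent is still small */
    if (pid == -1)
        return -1;

    if (pid == 0) {                  /* slave: keeps a tiny footprint from now on */
        close(sv[0]);
        char cmd[4096];
        ssize_t n;
        /* each datagram carries one command; recv returns 0 when the
           parent closes its end, and the slave exits */
        while ((n = recv(sv[1], cmd, sizeof cmd - 1, 0)) > 0) {
            cmd[n] = '\0';
            pid_t child = fork();    /* cheap: the slave's address space is small */
            if (child == 0) {
                execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
                _exit(127);
            }
            waitpid(child, NULL, 0);
        }
        _exit(0);
    }

    close(sv[1]);
    return sv[0];                    /* the parent sends commands through this end */
}

int main(void)
{
    int fd = spawn_slave();
    /* ... the large parent allocates its huge memory here ... */
    const char *cmd = "ls -l /";
    send(fd, cmd, strlen(cmd), 0);   /* ask the slave to run a command */
    close(fd);                       /* the slave drains the queue, sees EOF, and exits */
    return 0;
}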
On the original process side, this procedure is sequential; only one thread may start executing an external process at a time. (You simply serialize the access with a mutex.) Several executed processes can run at the same time; it is only the request to and the response from the slave helper that are serialized.
If that is an issue -- it should not be in typical cases -- you can for example multiplex the connections, by prefixing each message with an ID number (assigned by the parent process, monotonically increasing). In that case, you'll probably use a dedicated thread on the parent end to manage the communications with the slave, as you certainly cannot have multiple threads reading from the same socket at the same time, and expect deterministic results.
Further improvements to the scheme include things like using a dedicated process group for the executed processes, setting limits to them (by setting limits to the slave process), and executing the commands as dedicated users and groups by using a privileged slave.
The privileged slave case is where it is most useful to have the parent execute a separate helper process for it. In Linux, both sides can use SCM_CREDENTIALS ancillary messages via Unix domain sockets to verify the identity (PID, and via the PID, the executable) of the peer, making it rather straightforward to implement robust security. (But note that /proc/PID/exe has to be checked more than once, to catch the attacks where a message is sent by a nefarious program that quickly executes the appropriate program, but with command-line arguments that cause it to exit soon, making it occasionally look like the correct executable made the request, while a copy of the descriptor -- and thus the entire communications channel -- was under the control of a nefarious user.)
In summary, the original problem can be solved, although the answer to the posed question is no. If the executions are security-sensitive, for example change privileges (user accounts) or capabilities (in Linux), then the design has to be carefully considered, but in normal cases the implementation is quite straightforward.
I'd be happy to elaborate if necessary.
No, it is not possible. You might be interested in vfork(2), which I don't recommend. Look also into mmap(2) and its MAP_NORESERVE flag. But copy-on-write techniques are used by the kernel, so you practically won't double the RAM consumption.
My suggestion is to have enough swap space so as not to be concerned by such an issue. So set up your computer to have more available swap space than the largest running process. You can always create a temporary swap file (e.g. with dd if=/dev/zero of=/var/tmp/swapfile bs=1M count=32768 then mkswap /var/tmp/swapfile), add it as a temporary swap area (swapon /var/tmp/swapfile), and remove it (swapoff /var/tmp/swapfile and rm /var/tmp/swapfile) when you don't need it anymore.
You probably don't want to swap on a tmpfs file system like /tmp/ often is, since tmpfs file systems are themselves backed by swap space.
I dislike memory overcommitment and I disable it (through proc(5)). YMMV.
I'm not aware of any way to do (2), but for (1) you could try to use vfork, which forks a new process without copying the page tables of the parent process. This generally isn't recommended for a number of reasons, including that it causes the parent to block until the child performs an execve or terminates.
This is possible on Linux. Use the clone syscall without the flag CLONE_THREAD and with the flag CLONE_VM. The parent and child processes will use the same mappings, much like a thread would; there is no COW or page table copying.
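A minimal sketch of that approach (CLONE_VFORK is added here as an extra assumption, so that the parent is suspended until the child execs and the two never run in the shared address space at the same time):

#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* The child entry point: since the address space is shared, do as little
   as possible here and exec right away. */
static int child_fn(void *arg)
{
    (void)arg;
    execlp("ls", "ls", (char *)NULL);
    _exit(127);
}

int main(void)
{
    size_t stack_size = 64 * 1024;
    char *stack = malloc(stack_size);
    if (!stack) return 1;

    /* CLONE_VM shares the address space: no COW, no page-table copy. */
    pid_t pid = clone(child_fn, stack + stack_size,
                      CLONE_VM | CLONE_VFORK | SIGCHLD, NULL);
    if (pid == -1) return 1;

    waitpid(pid, NULL, 0);
    free(stack);
    return 0;
}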
madvise(addr, size, MADV_DONTFORK)
Alternatively, you can call munmap() after fork() to remove the virtual addresses inherited from the parent process.
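A sketch of the madvise approach from above (the mapping size and the executed program are illustrative):

#define _GNU_SOURCE
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    size_t len = 1UL << 30;   /* a large region, say 1 GiB */
    void *big = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (big == MAP_FAILED) return 1;

    /* The child created by fork() will simply not have this mapping,
       so it should not count against the child's commit charge. */
    madvise(big, len, MADV_DONTFORK);

    pid_t pid = fork();
    if (pid == 0) {
        /* touching `big` here would fault: the mapping does not exist
           in the child */
        execlp("ls", "ls", (char *)NULL);
        _exit(127);
    }
    waitpid(pid, NULL, 0);
    return 0;
}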
My question is regarding initializing memory obtained from using shm_open() and mmap(). One common piece of advice I have seen in several places is to call shm_open() with the flags O_CREAT|O_EXCL: if that succeeds then we are the first user of the shared memory and can initialize it; otherwise we are not the first and the shared memory has already been initialized by another process.
However, from what I understand about shm_open and from the testing that I did on Linux, this wouldn't work: the shared memory objects get left over in the system even after the last user of the shared memory object has unmapped and closed it. A simple test program which calls shm_open with O_CREAT|O_EXCL, then closes the descriptor and exits, will succeed on the first run, but will still fail on the second run, even though nobody else is using the shared memory at that time.
It actually seems to me that (at least on the system that I tested) the behavior of shm_open is pretty much identical to open(): if I modify my simple test program to write something to the shared memory (through the pointer obtained by mmap) and exit, then the shared memory object will keep its contents persistently (I can run another simple program to read back the data I wrote previously).
So is the advice about using shm_open with O_CREAT|O_EXCL just wrong, or am I missing something?
I do know that the shared memory object can be removed with shm_unlink(), but it seems that will only cause more problems:
If a process dies before calling shm_unlink() then we are back to the problem described above.
If one process calls shm_unlink() while some other processes are still mapped into the same shared memory, these other processes will still continue using it as usual. Now, if another process comes and calls shm_open() with the same name and O_CREAT specified, it will actually succeed in creating new shared memory object with the same name, which is totally unrelated to the old shared memory object the other processes are still using. Now we have a process trying to communicate with other processes via the shared memory and totally unaware that it is using a wrong channel.
I'm used to Windows semantics, where a shared memory object exists only as long as at least one handle to it is open, so this POSIX stuff is very confusing.
Since you use the O_EXCL flag I will assume that you have a set of processes gathered around one master (the creator of the segment).
Then, your master process will create the shared memory segment using a call to shm_open:
/* an access mode (here O_RDWR) is required in addition to O_CREAT|O_EXCL */
int shmid = shm_open("/insert/name/here", O_CREAT | O_EXCL | O_RDWR, 0644);
if (-1 == shmid) {
    printf("Oops ..\n");
}
From here on, the slaves are ready to use the segment. Since the master HAS to create the segment, there is no need to use the O_CREAT flag in the slaves' calls. You'll just have to handle possible errors if the slave call is performed when the segment is not created yet or already destroyed.
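For example, a slave-side counterpart to the master snippet above (what to do on error is up to you):

int shmid = shm_open("/insert/name/here", O_RDWR, 0);
if (-1 == shmid) {
    /* the master has not created the segment yet, or has already
       unlinked it: retry, wait, or give up */
}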
When any of your processes is done with the segment, it should call shm_unlink(). In this kind of architecture, the master is usually feeding the slaves. When it has nothing more to say, it just shuts up. The slaves then have the responsibility of handling the corresponding errors gracefully.
As you stated, if a process dies before calling the shm_unlink procedure, then the segment will continue to live thereafter. To avoid this in some cases, you could define your own signal handlers in order to perform the operation when signals such as SIGINT are received. Anyway, you won't be able to clean up if SIGKILL is sent to your process.
EDIT :
To be more specific, the use of O_CREAT | O_EXCL is wrong when it is unnecessary. With the little example above, you can see that it is required for the master to create the segment, thus those flags are needed there. On the other hand, none of the slave processes would ever have to create it. Thus, you should forbid the use of O_CREAT in their calls.
Now, if another process calls shm_open(..., O_CREAT, ...) when the segment is already created, it will just retrieve a file descriptor related to this very segment. It will thus be on the right channel (if it has the rights to do so, see the mode argument)
You can do the following: put int test = shmget(key, size, 0); at the start of each process. The zero flag here tries to open an existing shared memory segment; if it has not been created yet, test will equal -1, so you can check after this statement: if test is -1, go and create the shared memory, otherwise you just got an id to an existing shared memory segment. I hope this helps.
Linux kernel version 3.2 and later has a capability called cross memory attach.
Here is the link to it. I was not able to get a lot of help in that regard.
http://man7.org/linux/man-pages/man2/process_vm_readv.2.html
In the syntax we need the address of the remote memory that we want to write to or read from. My question is: how do I get the address of this remote memory if I am using fork()?
Suppose I am sending something from parent process to child process using cross memory attach. How do I send the address of the remote memory to the parent process from the child process?
The system calls process_vm_readv and process_vm_writev are meant for fast data transfer between processes. They are supposed to be used in addition to some traditional way of interprocess communication.
For example, you may use a regular pipe or fifo to transfer the required addresses between your processes. Then you may use those addresses to establish faster process_vm_ communication. The simplest way to transfer something between forked processes is probably the pipe() function (man 2 pipe has a good example of its usage). There are many other ways to do so of course, like using sockets or messages. You can even write an address to a file and let the other process read it.
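A sketch of that combination, assuming a parent writing into a child created by fork() (buffer names and sizes are illustrative, the sleep is a crude stand-in for real synchronization, and the usual ptrace access rules apply):

#define _GNU_SOURCE
#include <stdio.h>
#include <sys/uio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int pfd[2];
    if (pipe(pfd) == -1) return 1;

    pid_t pid = fork();
    if (pid == 0) {                         /* child */
        char buf[64] = "not yet written";
        void *addr = buf;
        write(pfd[1], &addr, sizeof addr);  /* tell the parent where buf lives */
        sleep(1);                           /* crude: wait for the parent to write */
        printf("child sees: %s\n", buf);
        return 0;
    }

    void *remote_addr;
    read(pfd[0], &remote_addr, sizeof remote_addr);

    char msg[] = "hello from the parent";
    struct iovec local  = { .iov_base = msg,         .iov_len = sizeof msg };
    struct iovec remote = { .iov_base = remote_addr, .iov_len = sizeof msg };
    if (process_vm_writev(pid, &local, 1, &remote, 1, 0) == -1)
        perror("process_vm_writev");

    waitpid(pid, NULL, 0);
    return 0;
}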