Where to store data to be used by all processes? (C)

I am working through several projects that involve modifying the Linux kernel.
One of the projects is to write a system call that receives the name of a program and prevents that program from being run through execv (note that several programs can be blocked, so we need a list of blocked programs).
I have figured out what to do for most of the exercise. For example, one challenge is to log all attempts by a given process to execute any of the blocked programs; I decided to store that log in memory allocated with kmalloc().
However, I am debating where to store the "list of blocked programs": no matter which process is running, the execv path must have access to this list. Would it make sense to store this list inside the heap of the init process, or is there some "general" memory location that is shared between all processes? (I have never heard of one, but I was wondering whether one might exist.)
If the answer is indeed the heap of init, how do I allocate memory there from whichever process is currently running?
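Since a system call always executes in kernel mode, one option (a hedged sketch, not taken from the original thread) is to keep the list in the kernel's own global data, which is reachable from any process while it runs kernel code; no process heap, init's or otherwise, is needed. A minimal sketch with a fixed-size table and made-up names:

    /* Sketch: a kernel-global blocklist reachable from any process's
     * syscalls. All names (blocked_lock, block_program, ...) are
     * hypothetical, not from the original question. */
    #include <linux/errno.h>
    #include <linux/spinlock.h>
    #include <linux/string.h>
    #include <linux/types.h>

    #define MAX_BLOCKED 32
    #define NAME_LEN    64

    static char blocked_names[MAX_BLOCKED][NAME_LEN];
    static int  blocked_count;
    static DEFINE_SPINLOCK(blocked_lock);

    /* Called from the new syscall to add a program name. */
    int block_program(const char *name)
    {
            int ret = -ENOSPC;

            spin_lock(&blocked_lock);
            if (blocked_count < MAX_BLOCKED) {
                    strncpy(blocked_names[blocked_count], name, NAME_LEN - 1);
                    blocked_names[blocked_count][NAME_LEN - 1] = '\0';
                    blocked_count++;
                    ret = 0;
            }
            spin_unlock(&blocked_lock);
            return ret;
    }

    /* Called from the execv path to test a candidate program. */
    bool program_is_blocked(const char *name)
    {
            bool found = false;
            int i;

            spin_lock(&blocked_lock);
            for (i = 0; i < blocked_count; i++) {
                    if (strcmp(blocked_names[i], name) == 0) {
                            found = true;
                            break;
                    }
            }
            spin_unlock(&blocked_lock);
            return found;
    }

The spinlock matters because the list can be read and modified concurrently from syscalls issued by different processes.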

Related

What happens if a thread crashes? Which is better, a thread or a process?

I am writing a server application that handles one connection at a time. I receive a TCP request containing the symbol name of a function and the name of a shared library.
My server needs to load the shared library with dlopen() and resolve the function with dlsym(), then call the function using the symbol name received.
Right now I load the shared library and execute the function in a separate thread. My question is: if the thread crashes due to a segmentation fault or some other signal, will my process be affected?
Which is better: running this in a separate thread or in a separate process?
Please ask if my question is not clear.
A crash in a thread takes down the whole process. And you probably wouldn't want it any other way, since a crash signal (like SIGSEGV, SIGBUS, or SIGABRT) means that you have lost control over the behavior of the process and anything could have happened to its memory.
So if you want to isolate things, spawning separate processes is definitely better. Of course, if someone can make your process crash it's pretty close to them owning your computer anyway. I sure hope that you don't intend to expose this to untrusted users.
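To make the isolation concrete, here is a hedged sketch (mine, not part of the original answer) of running the risky dlopen()/dlsym() call in a forked child: a crash only kills the child, and the parent learns about it from waitpid(). handle_request() and the zero-argument function signature are illustrative.

    /* Sketch: isolate an untrusted library call in a child process. */
    #include <dlfcn.h>
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    static void handle_request(const char *libname, const char *symname)
    {
        pid_t pid = fork();

        if (pid == 0) {
            /* Child: a crash in here cannot take down the server. */
            void *lib = dlopen(libname, RTLD_NOW);
            if (!lib) {
                fprintf(stderr, "dlopen: %s\n", dlerror());
                _exit(1);
            }
            void (*fn)(void) = (void (*)(void))dlsym(lib, symname);
            if (!fn) {
                fprintf(stderr, "dlsym: %s\n", dlerror());
                _exit(1);
            }
            fn();
            _exit(0);
        }

        /* Parent: see whether the child exited or was killed by a signal. */
        int status;
        waitpid(pid, &status, 0);
        if (WIFSIGNALED(status))
            fprintf(stderr, "worker killed by signal %d\n", WTERMSIG(status));
    }

Link with -ldl on older glibc. A fork() per request costs more than a thread; that is the price of the isolation.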

When a process writes to a file

Generally, when a process writes to a file, e.g. a Python script running open('file', 'w').write('text'), what are the exact events that occur? By that I mean something along the lines of "process A loads the file from hard disk to RAM, process B changes the content, then ...". I've read about IPC, and now I'm trying to dig deeper and understand more about processes. I couldn't find a thorough explanation of the subject, so if you could point to one or explain it, I'd really appreciate it.
The example of "a python script running open('file', 'w').write('text')" is heavily OS-dependent. The only process necessarily involved is the one running the Python interpreter, which, e.g. on Linux, sometimes executes in user space and sometimes in kernel space, plus possibly some kernel-only processes; any IPC, if required, happens inside the kernel. There is no requirement in principle that anything other than the user's process, running in kernel mode, handle everything down to the disk I/O itself, but in practice other processes may be involved. This is OS- and even driver-specific behavior.
In this particular example (which isn't great, because it relies on CPython automatically closing the file when the object goes out of scope), the Python process makes one system call to open the file, one to write it, and one to close it. These are all blocking; that is, they do not return until the results are ready. While the process is blocked, it is put on a queue, waiting for some event that will make it ready to run again.
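For illustration (my addition, not part of the original answer), the same three blocking system calls written directly in C:

    /* Sketch: the open/write/close sequence underlying
     * open('file', 'w').write('text') in Python. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        /* Each call blocks until the kernel has a result for us. */
        int fd = open("file", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) {
            perror("open");
            return 1;
        }
        if (write(fd, "text", 4) < 0)
            perror("write");
        close(fd);
        return 0;
    }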
The opposite of this is asynchronous I/O, which can be performed by polling, by callbacks, or by the select() system call, which can block until any one of a number of events has occurred.
But when most people talk about IPC, they are not usually talking about communication between or with kernel processes. Rather, they are talking about communication between multiple user processes and/or threads, using semaphores, mutexes, named pipes, etc. A good introduction to these sorts of things would be any tutorial information you can find on using pthreads, or even the Python threads and multiprocessing modules. There are examples there for several simple cases.
The primary difference between processes and threads on Linux is that threads share an address space and processes each have their own address space. Python itself adds the wrinkle of the GIL, which limits the utility of threads in Python somewhat.
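A tiny demonstration of that difference (my own sketch, not from the answer): a global variable written by a pthread is visible to the whole process, while a fork()ed child only changes its private copy.

    /* Sketch: threads share one address space; forked processes do not. */
    #include <pthread.h>
    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    static int counter = 0;

    static void *bump(void *arg)
    {
        counter = 42;               /* visible to the whole process */
        return NULL;
    }

    int main(void)
    {
        pthread_t t;
        pthread_create(&t, NULL, bump, NULL);
        pthread_join(t, NULL);
        printf("after thread: %d\n", counter);  /* prints 42 */

        if (fork() == 0) {
            counter = 99;           /* changes the child's copy only */
            _exit(0);
        }
        wait(NULL);
        printf("after fork: %d\n", counter);    /* still 42 */
        return 0;
    }

Compile with -pthread.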

Watchpoints in shared memory?

I'm debugging an issue in a patch to PostgreSQL where a word in shared memory seems to get overwritten unintentionally.
Valgrind isn't any help as it can't keep track of interactions in shared memory between multiple processes.
The address that gets overwritten is fairly stable but not totally fixed, though it's always identified by a pointer in a global struct initialized early in startup by each process.
I'm trying to find a way to get a stack trace whenever any process writes to the address of interest, but it's proving rather harder than I would've expected.
gdb watchpoints aren't any help, as gdb can't follow fork() and establish the same watch on child processes. Manually doing it with multiple gdb processes is very cumbersome due to the number of child processes PostgreSQL uses and the timing issues involved in setting it up by hand.
perf userspace probes looked promising, but they seem to attach only to functions; there is no apparent way to trap a write to a memory address.
So is there any way to grab stack traces for each writer to a given shared memory address across multiple processes?
"gdb can't follow fork() and establish the same watch on child processes"
A sufficiently recent GDB can do that; see the GDB documentation on fork handling (follow-fork-mode and detach-on-fork).
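For reference, a hedged sketch of the settings involved in a current GDB session (the watched expression is a placeholder for the pointer in your global struct):

    # Keep debugging both sides of every fork() instead of detaching.
    (gdb) set detach-on-fork off
    (gdb) set follow-fork-mode child
    # Hardware watchpoint on the shared-memory word of interest.
    (gdb) watch -l *shared_word_ptr

watch -l sets the watchpoint on the address the expression evaluates to, which is what you want for a pointer into shared memory.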

How can I delete unused shared memory and semaphores?

Similar to: Delete all shared memory and semaphores on Linux. However, I want to do so in C, not with a script.
My specific problem: on Linux and Mac, when I debug a program and terminate it mid-run, the shared resources (memory and semaphores) aren't released. My program does some client/server work where the server is the first process to acquire the shared resources. Therefore, after a termination without detaching, when I restart the program, it assumes that it is the client, when in fact there is no server (because the resource was created and never released).
Currently I am using Qt to manage the shared resources, but Qt does not appear to have a way to deal with this situation (the error code that create returns only says that the resource has already been created). Therefore, I'm looking for a more OS-specific way to do this. NOTE: Windows does not have this problem because the shared resource is released automatically on termination.
Check man ipcrm.
ipcrm - remove a message queue, semaphore set or shared memory id
Does the server terminate normally? If so, you can have it call shmdt() before exiting.
If it is crashing, then that's a little harder. One thing is to have it use shmctl() to see how many processes have the shm attached; if it's 0, then you are obviously not the client.
There's also a flag you can set on shm segments, IPC_RMID, although the usage seems a little ambiguous.
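A hedged sketch of both ideas in C (the key and size are placeholders): shmctl(IPC_STAT) reports the attach count in shm_nattch, and shmctl(IPC_RMID) marks the segment for removal once the last process detaches.

    /* Sketch: detect a stale System V segment and remove it.
     * The key and size are illustrative, not from the question. */
    #include <stdio.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>

    int main(void)
    {
        int shmid = shmget((key_t)0x1234, 4096, 0666);
        if (shmid < 0) {
            perror("shmget");
            return 1;
        }

        struct shmid_ds ds;
        if (shmctl(shmid, IPC_STAT, &ds) < 0) {
            perror("shmctl(IPC_STAT)");
            return 1;
        }
        printf("attached processes: %lu\n", (unsigned long)ds.shm_nattch);

        /* Nobody attached: the old server died without cleaning up. */
        if (ds.shm_nattch == 0 && shmctl(shmid, IPC_RMID, NULL) == 0)
            printf("stale segment removed\n");
        return 0;
    }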

Any possible solution to capture process entry/exit?

I would like to capture process entry and exit and maintain a log for the entire system (probably via a daemon process).
One approach is to read the /proc filesystem periodically and maintain the list, since I do not see a way to register inotify watches on /proc. For desktop applications I could get the help of D-Bus and capture whenever a client registers with the desktop.
But for non-desktop applications, I don't know how to proceed apart from reading /proc periodically.
Kindly provide suggestions.
You mentioned /proc, so I'm going to assume you've got a Linux system there.
Install the acct package. The lastcomm command shows all processes executed and their run duration, which is what you're asking for. Have your program "tail" /var/log/account/pacct (you'll find its structure described in acct(5)) and voila. It's just notification on termination, though. To detect start-ups, you'll need to dig through the system process table periodically, if that's what you really need.
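A hedged sketch of reading that file in C (the path is the Debian default; note also that newer kernels can be configured to write v3 records, so check acct(5) on your system before trusting the record layout):

    /* Sketch: decode classic acct(5) records and print the command name
     * of each process that terminated. */
    #include <stdio.h>
    #include <string.h>
    #include <sys/acct.h>

    int main(void)
    {
        FILE *f = fopen("/var/log/account/pacct", "rb");
        if (!f) {
            perror("fopen");
            return 1;
        }

        struct acct rec;
        while (fread(&rec, sizeof rec, 1, f) == 1) {
            char comm[ACCT_COMM + 1];
            memcpy(comm, rec.ac_comm, ACCT_COMM);
            comm[ACCT_COMM] = '\0';
            printf("exited: %s (uid %u)\n", comm, (unsigned)rec.ac_uid);
        }
        fclose(f);
        return 0;
    }

A real "tail" would keep the file open and sleep/retry at EOF instead of exiting.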
Maybe a safer way to go is to create a supervisor process that acts as a parent and forks the children. Every time a child process stops, the parent can find out about it. That is just a thought, in case that architecture fits your needs.
Of course, if the parent-process approach is not doable, then you must go to the kernel.
If you want to log really all process entries and exits, you'll need to hook into the kernel, which means modifying the kernel or at least writing a kernel module. The Linux Security Modules framework will certainly allow hooking into entry, but I am not sure whether it's possible to hook into exit.
If you can live with an occasional exit slipping past (if the binary is linked statically or somehow avoids your environment setting), there is a simple option: preloading a library.
The Linux dynamic linker has a feature whereby, if the environment variable LD_PRELOAD names a shared library, it will force-load that library into every starting process. So you can create a library whose static initialization tells the daemon that a process has started, and arrange things so the daemon will also find out when the process exits.
Static initialization is easiest done by creating a global object with a constructor in C++. The dynamic linker ensures the static constructor runs when the library is loaded.
It will also try to run the corresponding destructor when the process exits, so you could simply log the process in the constructor and the destructor. But that won't work if the process dies from signal 9 (SIGKILL), and I am not sure what other signals will do.
So instead you should have a daemon: in the constructor, tell the daemon about the process start, and make sure the daemon will notice when the process exits on its own. One option that comes to mind is opening a unix-domain socket to the daemon and leaving it open; the kernel will close it when the process dies, and the daemon will notice. You should take some precautions to use a high descriptor number for the socket, since some processes assume the low descriptor numbers (3, 4, 5) are free and dup2() onto them. And don't forget to allow more file descriptors for the daemon and for the system in general.
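A hedged sketch of such a preload library in C, using __attribute__((constructor)) instead of a C++ global (the socket path, message format, and descriptor number are all made up for illustration):

    /* Sketch: announce process start to a daemon over a unix-domain
     * socket; the daemon sees the socket close when the process dies. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <unistd.h>

    static int notify_fd = -1;

    __attribute__((constructor))
    static void report_start(void)
    {
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);
        if (fd < 0)
            return;

        struct sockaddr_un addr = { .sun_family = AF_UNIX };
        strncpy(addr.sun_path, "/run/procwatch.sock",
                sizeof(addr.sun_path) - 1);
        if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
            close(fd);
            return;
        }

        /* Park the socket at a high descriptor so programs that dup2()
         * onto 3, 4, 5 don't clobber it; close-on-exec so a process that
         * execs re-registers through its own constructor. */
        notify_fd = fcntl(fd, F_DUPFD_CLOEXEC, 800);
        close(fd);
        if (notify_fd < 0)
            return;

        char msg[64];
        int n = snprintf(msg, sizeof msg, "start %d\n", (int)getpid());
        (void)write(notify_fd, msg, (size_t)n);
    }

Build it with something like gcc -shared -fPIC -o libprocwatch.so procwatch.c and activate it with LD_PRELOAD=/path/to/libprocwatch.so (names hypothetical); the daemon treats each accepted connection as one live process.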
Note that by just polling the /proc filesystem you would probably miss the great number of processes that live for only a split second; there are really a lot of those on Unix.
Here is an outline of the solution that we came up with.
We created a program that read a configuration file listing all the applications the system was able to monitor. Through a command-line interface you were able to start or stop programs. The program itself stored a table in shared memory that marked applications as running or not, and an interface that anybody could access reported the status of these programs. The program also had an alarm system that could email or page someone, or set off an alarm.
This solution requires no changes to the kernel and is therefore less painful.
Hope this helps.
