Linux Kernel - Read/Write to a File - c

I'm working on an LKM which needs to retrieve and write a certain set of information to files. I looked up common ways to do so, but could not find one that works on Linux 4.x. I also found out that it is possible to retrieve system calls from memory and effectively call them.
Since I currently see no better way, I'd be interested to know whether it is feasible to find the system call table and call open, read/write, and close this way.

This is strongly discouraged in most situations.
https://www.linuxjournal.com/article/8110 was a really good read for me the first time I thought I had to do this as well.
Reading configuration data out of a file from within the Linux kernel, however, is considered forbidden, because of the many problems it can cause.
It is indeed possible to invoke system calls from within the kernel, but the practice is also generally discouraged. They're designed as interfaces for userspace applications to ask things of the kernel, not for the kernel to get itself to do work.
What kind of files do you want to manipulate from within the kernel? If the file is provided by the proc, sysfs, or dev filesystems, you can modify its contents from within the kernel (since the kernel itself provides these files to userspace) -- and this should be done NOT with file-manipulation calls. If it's a normal userspace file, you almost never want the kernel to modify it.
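For the proc case, here's a minimal sketch of exposing a value through procfs on a 4.x-era kernel (the entry name example_config and the value it publishes are made up for illustration; note that kernels 5.6+ replace the file_operations argument to proc_create with struct proc_ops):

```c
#include <linux/module.h>
#include <linux/proc_fs.h>
#include <linux/seq_file.h>

/* hypothetical configuration value exposed to userspace */
static int example_value = 42;

static int example_show(struct seq_file *m, void *v)
{
	seq_printf(m, "%d\n", example_value);
	return 0;
}

static int example_open(struct inode *inode, struct file *file)
{
	return single_open(file, example_show, NULL);
}

static const struct file_operations example_fops = {
	.owner   = THIS_MODULE,
	.open    = example_open,
	.read    = seq_read,
	.llseek  = seq_lseek,
	.release = single_release,
};

static int __init example_init(void)
{
	/* appears as /proc/example_config */
	proc_create("example_config", 0444, NULL, &example_fops);
	return 0;
}

static void __exit example_exit(void)
{
	remove_proc_entry("example_config", NULL);
}

module_init(example_init);
module_exit(example_exit);
MODULE_LICENSE("GPL");
```

Userspace then reads /proc/example_config normally, and the kernel side never has to open a file at all.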
If you provide more specifics I'd be interested to hear them, but this is usually a bad idea.

Related

What is the best method to detect which files are used/modified/created/deleted by a process?

I want to write software that will detect all used/created/modified/deleted files during the execution of a process (and its child processes). The process has not yet run - the user provides a command line which will later be run as a subprocess via bash, so we can do things before and after execution, and control the environment the command is run in.
I have thought of four methods so far that could be useful:
Parse the command line to identify files and directories mentioned. Assume all files explicitly mentioned are used. Check directories before/after for created/deleted files. MD5 existing files before/after to see if any are modified. This works on all operating systems and environments, but obviously has serious limitations (it doesn't work when the command is "./script.sh").
Run the process via another process like strace (dtruss for OS X; there are equivalent Windows programs), which listens for system calls. Parse the output file to find files used/modified/deleted/created. Pros: it's more sensitive than the MD5 method and can deal with script.sh. Cons: it's very OS-specific (dtruss requires root privileges even if the process being run does not, and the outputs of the various tools are all different). It could also create huge log files if there are a lot of read/write operations, and will certainly slow things down.
Integrate something similar to the above into the kernel. Obviously still OS-specific, but at least now we are calling the shots and can create a common output format for all OSes. It wouldn't create huge log files, and could even stop hooking syscalls to, say, read() after the process has requested the first read() of a file. I think this is what the tool inotify does, but I'm not familiar with it at all, nor with kernel programming!
Run the process using the LD_PRELOAD trick (called DYLD_INSERT_LIBRARIES on OS X; I'm not sure if it exists on Windows), which basically overrides any call to open() by the process with our own version of open() that logs what we're opening; same for write, read, etc. (see the sketch below). It's very simple to do, and very performant since you're essentially teaching the process to log itself. The downside is that it only works for dynamically linked processes, and I have no idea of the prevalence of dynamically vs. statically linked programs. I don't even know if it is possible, before execution, to tell whether a process is dynamically or statically linked (with the intention of using this method by default, but falling back to a less performant method if it's not possible).
I need help choosing the optimal path to go down. I have already implemented the first method because it was simple and gave me a way to work on the logging backend (http://ac.gt/log), but really I need to upgrade to one of the other methods. Your advice would be invaluable :)
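For reference, here's the kind of interposer I mean for option 4 (the file name logopen.c and the log format are made up; a real version would also have to wrap open64(), fopen(), creat(), and friends):

```c
/* logopen.c -- build: gcc -shared -fPIC -o logopen.so logopen.c -ldl
 * run:   LD_PRELOAD=./logopen.so somecommand */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdarg.h>
#include <dlfcn.h>
#include <fcntl.h>

int open(const char *path, int flags, ...)
{
	/* look up the real open() the first time we are called */
	static int (*real_open)(const char *, int, ...);
	if (!real_open)
		real_open = (int (*)(const char *, int, ...))dlsym(RTLD_NEXT, "open");

	fprintf(stderr, "open(%s, 0x%x)\n", path, flags);

	/* open() takes a third (mode) argument only when O_CREAT is set */
	if (flags & O_CREAT) {
		va_list ap;
		va_start(ap, flags);
		mode_t mode = va_arg(ap, mode_t); /* mode_t is unsigned int on glibc */
		va_end(ap);
		return real_open(path, flags, mode);
	}
	return real_open(path, flags);
}
```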
Take a look at the source code of strace (and its -f option to trace children). It does basically what you are trying to do: it captures all the system calls of the process (and its children), so you can grep for operations like "open", etc.
The following link provides some examples of implementing your own strace by using the ptrace system call:
https://blog.nelhage.com/2010/08/write-yourself-an-strace-in-70-lines-of-code/
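The core of that post's approach boils down to something like this sketch (x86-64 specific, and deliberately simplified: each syscall is printed twice, once on entry and once on exit, and signal stops are not handled):

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/ptrace.h>
#include <sys/user.h>
#include <sys/wait.h>

int main(int argc, char **argv)
{
	if (argc < 2) {
		fprintf(stderr, "usage: %s command [args...]\n", argv[0]);
		return 1;
	}

	pid_t child = fork();
	if (child == 0) {
		ptrace(PTRACE_TRACEME, 0, NULL, NULL); /* let the parent trace us */
		execvp(argv[1], &argv[1]);
		return 1; /* only reached if exec failed */
	}

	int status;
	waitpid(child, &status, 0); /* initial stop after exec */

	for (;;) {
		ptrace(PTRACE_SYSCALL, child, NULL, NULL); /* run to next syscall boundary */
		waitpid(child, &status, 0);
		if (WIFEXITED(status))
			break;

		struct user_regs_struct regs;
		ptrace(PTRACE_GETREGS, child, NULL, &regs);
		/* on x86-64 the syscall number lives in orig_rax */
		printf("syscall %llu\n", (unsigned long long)regs.orig_rax);
	}
	return 0;
}
```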

How to test a file system implementation?

I am going to implement a file system in C, and I'm wondering how I can test it without installing it in the kernel or using the FUSE API. Ideally, what I'd like to do is use the dd command to create a virtual hard drive and interact with it using Linux system calls like write and read (the idea is to not write drivers). Is that possible?
(I'm sorry if I misspelled words, but English isn't my first language. Also, I'm sorry if this is off-topic; it's my first question.)
Thanks.
If you are really implementing a file system, you can test it in a virtual machine.
Otherwise, you can implement your file system inside a file that exists in a real file system, and implement functions like read/write/etc. on top of it.
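That second suggestion is exactly the dd idea from the question: create the image once (e.g. dd if=/dev/zero of=disk.img bs=1M count=64), then treat it as a block device in your code. A rough sketch, where disk.img and BLOCK_SIZE are illustrative choices:

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define BLOCK_SIZE 4096

/* read block number blkno of the image into buf */
static ssize_t read_block(int fd, unsigned blkno, void *buf)
{
	return pread(fd, buf, BLOCK_SIZE, (off_t)blkno * BLOCK_SIZE);
}

/* write buf to block number blkno of the image */
static ssize_t write_block(int fd, unsigned blkno, const void *buf)
{
	return pwrite(fd, buf, BLOCK_SIZE, (off_t)blkno * BLOCK_SIZE);
}

int main(void)
{
	int fd = open("disk.img", O_RDWR);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	char block[BLOCK_SIZE];
	memset(block, 0, sizeof(block));
	strcpy(block, "superblock goes here");

	write_block(fd, 0, block); /* pretend block 0 is the superblock */
	memset(block, 0, sizeof(block));
	read_block(fd, 0, block);  /* read it back */
	printf("%s\n", block);

	close(fd);
	return 0;
}
```

If your filesystem code only ever calls read_block()/write_block(), switching from the image file to a real device later is just a matter of changing the path.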
A virtual hard drive and a virtual filesystem are somewhat different things: you write different functions and handle different requests when implementing them. Given that you are implementing a filesystem, your best bet on Linux is to expose it via FUSE for testing, then write tests that access your FUSE-based filesystem to perform various tasks.
Unfortunately, testing a filesystem is hard and requires writing many tests. Manual testing with different software (e.g. file managers) is also required.

Is there a way to get battery info (status, plugged in, etc) without reading a proc/sys file on linux?

I want to get information about the battery in C on Linux. I don't want to read or parse any file! Is there any low-level interface to ACPI/the kernel or any other module to get the information I want?
I have already searched the web, but every question results in the answer "parse /proc/foo/bar". I really don't want to do this, because I think low-level interfaces won't change as quickly as files do.
The /proc filesystem does not exist on disk; the kernel creates it in memory, and its files are generated on demand when accessed. As such, your concern is misplaced: the /proc files will change as quickly as the kernel becomes aware of changes.
Check this for more info about the /proc file system.
In any case, I don't believe there's any alternative interface.
You might be looking for UPower: http://upower.freedesktop.org/
This is a common need for both desktop environments and mobile devices, so there have been many solutions over time. For example, one of the oldest ones was acpid, which is pretty much obsolete now.
While I'd recommend using a light-weight abstraction like UPower for code clarity reasons, the files in /proc and (to some extent) /sys are considered part of the Linux kernel ABI, which means that changing them is generally frowned upon.
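To make that concrete: since the power_supply class under /sys is part of that stable interface, reading it directly is itself a supported approach. A minimal sketch (BAT0 is an assumption; real code should enumerate /sys/class/power_supply/ instead of hard-coding a name):

```c
#include <stdio.h>

int main(void)
{
	/* "capacity" is the charge level as a percentage */
	FILE *f = fopen("/sys/class/power_supply/BAT0/capacity", "r");
	if (!f) {
		perror("fopen");
		return 1;
	}

	int capacity;
	if (fscanf(f, "%d", &capacity) == 1)
		printf("battery: %d%%\n", capacity);

	fclose(f);
	return 0;
}
```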

File in both LKM and user space

I remember reading about this concept somewhere, but I do not remember where.
I have a file, say file.c, which I compile along with some other files into a library for use by applications.
Now suppose I compile the same file and build it into a kernel module. The same file object would then exist in both user space and kernel space, which would let me access kernel data structures without invoking a system call. That is, I could have APIs in the library by which applications access kernel data structures without system calls. I am not sure whether I could write anything into the kernel this way (which I think is impossible), but would reading some data structures from the kernel like this be fine?
Can anyone give me more details about this approach? I could not find anything on Google about it.
I believe this is a conceptually flawed approach, unless I misunderstand what you're talking about.
If I understand you correctly, you want to take the same file and compile it twice: once as a module and once as a userspace program. Then you want to run both of them, so that they can share memory.
So, the obvious problem with that is that even though the programs come from the same source code, they would still exist as separate executables. The module won't be its own process: it only gets invoked when the kernel gets going (i.e., on system calls). So by itself, it doesn't let you escape the system call nonsense.
A better solution depends on what your goal is: do you simply want to access kernel data structures because you need something that you can't normally get at? Or, are you concerned about performance and want to access these structures faster than a system call?
For (1), you can create a character device or a procfs file. Both of these allow your userspace programs to reach their dirty little fingers into the kernel.
For (2), you are in a tough spot, and the problem gets a lot nastier (and more interesting). Solving the speed issue depends a lot on what exact data you're trying to extract.
Does this help?
There are two ways to do this, the most common being what's called a Character Device, and the other being a Block Device (i.e. something "disk-like").
Here's a guide on how to create drivers that register chardevs.
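As a rough illustration of the character-device route (the device name example_dev and the data it returns are invented for the example; the module_misc_device() helper needs kernel 4.9 or later):

```c
#include <linux/module.h>
#include <linux/miscdevice.h>
#include <linux/fs.h>
#include <linux/kernel.h>
#include <linux/jiffies.h>

static ssize_t example_read(struct file *file, char __user *buf,
			    size_t count, loff_t *ppos)
{
	char msg[64];
	int len = scnprintf(msg, sizeof(msg), "jiffies: %lu\n", jiffies);

	/* simple_read_from_buffer handles the offset/count bookkeeping */
	return simple_read_from_buffer(buf, count, ppos, msg, len);
}

static const struct file_operations example_fops = {
	.owner = THIS_MODULE,
	.read  = example_read,
};

static struct miscdevice example_dev = {
	.minor = MISC_DYNAMIC_MINOR,
	.name  = "example_dev", /* shows up as /dev/example_dev */
	.fops  = &example_fops,
};

module_misc_device(example_dev);
MODULE_LICENSE("GPL");
```

A userspace program then just open()s and read()s /dev/example_dev, which is the standard way to reach into the kernel without sharing object code between the two sides.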

Making a new filesystem

I'm looking to make a custom filesystem for a project I'm working on. Currently I am looking at writing it in Python combined with fusepy, but it got me wondering how a compiled, non-userspace filesystem is made in Linux. Are there specific libraries you need to work with, or functions you need to implement, for the mount command to work properly? Overall I'm not sure how the entire process works.
Yup, you'd be programming to the kernel interfaces, specifically the VFS layer at a minimum.
'Full' documentation is in the kernel tree: http://www.mjmwired.net/kernel/Documentation/filesystems/vfs.txt. Of course, the fuse kernel module is programmed to exactly the same interface.
This, however, is not what you'd call a library. It is a kernel component that is intrinsically there, so the kernel doesn't have to know how a filesystem is implemented in order to work with one.
If you'd like to write it in Python, fuse is a good option. There are lots of tutorials for this, such as the one here: http://sourceforge.net/apps/mediawiki/fuse/index.php?title=FUSE_Python_tutorial
In short: Linux is a monolithic kernel with some module-loading capabilities. That means that every kernel feature (filesystems, scheduler, drivers, memory management, etc.) is part of the one big program called Linux. Loadable modules are just a specialized way of run-time linking, which allows the user to pick those features as needed, but they're all still developed mostly as a single program.
So, to create a new filesystem, you add new C source files to the kernel code defining the operations your filesystem has to perform, then write an initialization function that fills a file_system_type structure with the appropriate function pointers and registers it with the VFS.
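A skeleton of that registration step might look like this on a 4.x-era kernel (examplefs is a made-up name, the mount callback is stubbed out, and newer kernels are migrating to the fs_context API):

```c
#include <linux/module.h>
#include <linux/fs.h>
#include <linux/err.h>

static struct dentry *examplefs_mount(struct file_system_type *fs_type,
				      int flags, const char *dev_name,
				      void *data)
{
	/* a real filesystem would call mount_bdev()/mount_nodev() here with
	 * a fill_super callback that builds the superblock and root inode */
	return ERR_PTR(-ENOSYS);
}

static struct file_system_type examplefs_type = {
	.owner   = THIS_MODULE,
	.name    = "examplefs", /* the name passed to mount -t */
	.mount   = examplefs_mount,
	.kill_sb = kill_litter_super,
};

static int __init examplefs_init(void)
{
	return register_filesystem(&examplefs_type);
}

static void __exit examplefs_exit(void)
{
	unregister_filesystem(&examplefs_type);
}

module_init(examplefs_init);
module_exit(examplefs_exit);
MODULE_LICENSE("GPL");
```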
Note that FUSE is nothing more than a userlevel accessible API to do the same, so the FUSE hooks correspond (roughly) to the VFS operations.
