Modifying the bio structure in linux - c

I am attempting to modify the bio structure (in blk_types.h) for linux-3.2.0 (running Ubuntu). The only thing I need to do to this structure is to add an integer field (it is for a tainting algorithm). However, adding a single line such as "int id;" to the structure halts the boot sequence of the OS.
It compiles, but when booting it gives the following error:
>Gave up waiting for root device. Common problems:
>Boot args
>check rootdelay= ...
>check root= ...
>missing modules (cat /proc/modules; ls /dev)
>ALERT! /dev/disk/by-uuid/15448888-84a0-4ccf-a02a-0feb3f150a84 does not exist. Dropping to a shell!
>BusyBox Built In Shell ...
>(initramfs)
I took a look around using the given shell and could not find the desired file system by uuid or otherwise (no /dev/sda). Any ideas what might be going on?
Thanks,
-Misiu

I assume you are modifying the Linux kernel header bio.h, not its userland "friend" bui.h.
That said, I must warn you that many places in the kernel use sizeof(), which is the more portable choice, but some other implementation or API may expect a fixed size. If the latter is true, you will have problems, because you have changed the size of struct bio.
This is a guess; I have not investigated bio in detail. But when patching the Linux kernel, you must check for possible side effects and take the whole scenario into account, especially when modifying low-level code.
The bio helper functions do many low-level operations on struct bio; take a look at bio_integrity.c, for example.
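To make the size concern concrete, here is a minimal userland sketch. The two structs are hypothetical, heavily simplified stand-ins (not the real struct bio); they only show that adding a field changes both the size of the struct and the offsets of the fields after it, so any code compiled against the old header (for example, modules that were not rebuilt) would read the wrong bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified layout that old code was compiled against. */
struct bio_before {
    unsigned long flags;
    void *private_data;
};

/* The same layout after adding one field in the middle. */
struct bio_after {
    unsigned long flags;
    int id;               /* the newly added field */
    void *private_data;
};

/* Adding a field grows the struct; this is checked at compile time. */
static_assert(sizeof(struct bio_after) > sizeof(struct bio_before),
              "adding a field should grow the struct");

/* Print the size and the offset of private_data for both layouts. */
void print_bio_layouts(void)
{
    printf("before: size=%zu, private_data at offset %zu\n",
           sizeof(struct bio_before),
           offsetof(struct bio_before, private_data));
    printf("after:  size=%zu, private_data at offset %zu\n",
           sizeof(struct bio_after),
           offsetof(struct bio_after, private_data));
}
```

Calling print_bio_layouts() from main() prints the two layouts for your platform. The exact numbers depend on the ABI, which is why the sketch prints them instead of assuming them.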

I managed to fix the problem with your help Caf. Though re-building/installing the modules did not seem to help immediately, I was able to get the system to boot by building the SATA drivers into the kernel, as advised by this forum thread: https://unix.stackexchange.com/questions/8405/kernel-cant-find-dev-sda-file-during-boot.
Thanks for your help,
-Misiu

Related

How can I get a filesystem label from sysfs?

How can I get the label of a filesystem using /sys? I know I can get much of the info about a block device by going to /sys/class/block/<device>, e.g. /sys/class/block/sr1 for a cd that I know has the filesystem label config. I hunted through each item, found everything but the label.
I did dig through the lsblk source code, which, in turn, depends on calling udev_device_new_from_subsystem_sysname in libudev, so I went through that. It does appear to populate the property ID_FS_LABEL_ENC, but I cannot figure out where it takes it from in the tree, unless it is tracking it elsewhere?
I would just use libudev, but need to access outside of a C program.
I think that the problem here is that you seem to think that the label of a volume is a kernel thing, as is the size or the free space.
But AFAIK it is not: the kernel doesn't care at all about volume labels. The label is just something that goes from the on-disk format to user-land, and there is no kernel API to get that information. If you need it, you just open the raw binary volume and read the data from there.
But then, there is the big issue that every filesystem is different, so you need special code to manage every single partition type there is. Fortunately, somebody has done the hard work, and you have blkid, part of util-linux available in most Linux distributions. If you need it, you can call the program directly, or link to the library libblkid that does the hard work.
Naturally, to use blkid/libblkid you need read access to the block device, that is, root access. If you think that root access should not be needed to read a label, the people from udev think the same, and that is why there is a udev rule that copies the label when the filesystem is first detected (running blkid of course). This is the ID_FS_LABEL_ENC you already know about.
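To illustrate the "open the raw volume and read the data" route for a single format, here is a minimal sketch for ISO 9660, which is what a CD such as sr1 most likely carries. It assumes the primary volume descriptor is the first descriptor at sector 16 (byte offset 32768) and ignores Joliet and other supplementary descriptors, which blkid handles properly. The function name read_iso9660_label is made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define ISO_PVD_OFFSET   32768L  /* sector 16 * 2048 bytes per sector */
#define ISO_LABEL_OFFSET 40      /* volume identifier inside the descriptor */
#define ISO_LABEL_LEN    32

/* Read the volume label from an open device or image.
 * label must have room for ISO_LABEL_LEN + 1 bytes.
 * Returns 0 on success, -1 if this does not look like ISO 9660. */
int read_iso9660_label(FILE *f, char *label)
{
    unsigned char pvd[ISO_LABEL_OFFSET + ISO_LABEL_LEN];

    assert(f != NULL && label != NULL);

    if (fseek(f, ISO_PVD_OFFSET, SEEK_SET) != 0)
        return -1;
    if (fread(pvd, 1, sizeof pvd, f) != sizeof pvd)
        return -1;

    /* Byte 0 is the descriptor type (1 = primary), bytes 1..5 are "CD001". */
    if (pvd[0] != 1 || memcmp(pvd + 1, "CD001", 5) != 0)
        return -1;

    memcpy(label, pvd + ISO_LABEL_OFFSET, ISO_LABEL_LEN);
    label[ISO_LABEL_LEN] = '\0';

    /* The field is padded with spaces; strip them. */
    for (int i = ISO_LABEL_LEN - 1; i >= 0 && label[i] == ' '; i--)
        label[i] = '\0';
    return 0;
}
```

You would call it with a stream opened on the device, e.g. fopen("/dev/sr1", "rb"), which needs read permission on that device. Every other filesystem needs its own parser, which is exactly the work libblkid already does.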

How to add (and use) binary data to compiled executable?

There are several questions dealing with some aspects of this problem, but none seems to answer it wholly. The whole problem can be summarized as follows:
You have an already compiled executable (obviously expecting the use of this technique).
You want to add an arbitrarily sized binary data to it (not necessarily by itself which would be another nasty problem to deal with).
You want the already compiled executable to be able to access this added binary data.
My particular use-case would be an interpreter, where I would like to make the user able to produce a single file executable out of an interpreter binary and the code he supplies (the interpreter binary being the executable which would have to be patched with the user supplied code as binary data).
A similar case is self-extracting archives, where a program (the archiving utility, such as zip) is capable of constructing such an executable, which contains a pre-built decompressor (the already compiled executable) and user-supplied data (the contents of the archive). Obviously no compiler or linker is involved in this process (Thanks, Mathias for the note and pointing out 7-zip).
Using existing questions a particular path of solution shows along the following examples:
appending data to an exe - This deals with the aspect of adding arbitrary data to arbitrary exes, without covering how to actually access it (basically simple append usually works, also true with Unix's ELF format).
Finding current executable's path without /proc/self/exe - In companion with the above, this would allow getting a file name to use for opening the exe, to access the added data. There are many more questions of this kind, but none focuses specifically on getting a path suitable for actually opening the binary as a file (a goal which alone might (?) be easier to accomplish - truly you don't even need the path, just the binary opened for reading).
There also may be other, probably more elegant ways around this problem than padding the binary and opening the file for reading it in. For example could the executable be made so that it becomes rather trivial to patch it later with the arbitrarily sized data so it appears "within" it being in some proper data segment? (I couldn't really find anything on this, for fixed size data it should be trivial though unless the executable has some hash)
Can this be done reasonably well with as little deviation from standard C as possible? Even more or less cross-platform? (At least from maintenance standpoint) Note that it would be preferred if the program performing the adding of the binary data didn't rely on compiler tools to do it (which the user might not have), but solutions necessitating those might also be useful.
Note the already compiled executable criteria (the first point in the above list), which requires a completely different approach than solutions described in questions like C/C++ with GCC: Statically add resource files to executable/library or SDL embed image inside program executable , which ask for embedding data compile-time.
Additional notes:
The problems with the obvious approach outlined above and suggested in some comments, which is to just append to the binary and use that, are as follows:
Opening the currently running program's binary doesn't seem something trivial (opening the executable for reading is, but not finding the path to supply to the file open call, at least not in a reasonably cross-platform manner).
The method of acquiring the path may provide an attack surface which probably wouldn't exist otherwise. This means that a potential attacker could trick the program into seeing different binary data (provided by him) than what the executable actually has, exposing any vulnerability which might reside in the parser of the data.
It depends on how you want other systems to see your binary.
Digitally signed in Windows
The exe format allows for verifying that the file has not been modified since publishing. This would allow you to:
Compile your file
Add your data packet
Sign your file and publish it.
The advantage of following this system is that "everybody" agrees your file has not been modified since signing.
The easiest way to achieve this scheme is to use a resource. Windows resources can be added post-linking. They are protected by the Authenticode digital signature, and your program can extract the resource data from itself.
It used to be possible to increase the signature to include binary data. Unfortunately this has been banned. There were binaries which used data in the signature section. Unfortunately this was used maliciously. Some details here msdn blog
Breaking the signature
If re-signing is not an option, then the result would be treated as insecure. It is worth noting here, that appended data is insecure, and can be modified without people being able to tell, but so is the code in your binary.
Appending data to a binary does break the digital signature, and also means the end-user can't tell if the code has been modified.
This means that any self-protection you add to your code to ensure the data blob is still secure, would not prevent your code from being modified to remove the check.
Running module
Windows GetModuleFileName allows the running path to be found.
Linux offers /proc/self/exe or /proc/<pid>/exe.
Unix does not seem to have a method which is reliable.
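For the Linux case, a minimal sketch (Linux only, and it assumes /proc is mounted) could look like the following. The function names are made up for this example. Opening /proc/self/exe directly gives you the running binary without going through a path at all, which fits the remark in the question that only the opened file is really needed.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Linux only: open the running executable without knowing its path. */
FILE *open_self_exe(void)
{
    return fopen("/proc/self/exe", "rb");
}

/* Linux only: copy the executable's path into buf.
 * Returns the path length, or -1 on failure. */
long self_exe_path(char *buf, size_t size)
{
    assert(buf != NULL && size > 1);

    ssize_t n = readlink("/proc/self/exe", buf, size - 1);
    if (n < 0)
        return -1;
    buf[n] = '\0';   /* readlink() does not terminate the string */
    return (long)n;
}
```

If the buffer is too small, readlink() silently truncates the path, so pass a generously sized buffer (PATH_MAX or larger) if you need the path itself.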
Data reading
The approach of the zip format is to have a directory written at the end of the file. This means the data can be located from the end of the file by searching backwards for the start of the data. The advantage here is that the data blob is signposted from the end of the file, rather than from its natural start.
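A minimal sketch of the "signposted from the end" idea follows. It is not the zip format: the trailer layout (an 8-byte marker followed by an 8-byte little-endian payload length), the marker value and the function name are all assumptions made up for this example. The tool that appends the data would write the payload followed by this trailer; the executable then finds the payload by reading backwards from the end of its own file.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define TRAILER_MAGIC "PAYLOAD1"  /* hypothetical 8-byte marker */
#define TRAILER_SIZE  16          /* marker + 8-byte little-endian length */

/* File layout assumed: [executable][payload][marker][length].
 * On success returns 0 and stores the payload's offset and length. */
int find_payload(FILE *f, long *offset, long *length)
{
    unsigned char t[TRAILER_SIZE];
    uint64_t len = 0;
    long end;

    assert(f != NULL && offset != NULL && length != NULL);

    if (fseek(f, 0, SEEK_END) != 0 || (end = ftell(f)) < TRAILER_SIZE)
        return -1;
    if (fseek(f, end - TRAILER_SIZE, SEEK_SET) != 0 ||
        fread(t, 1, sizeof t, f) != sizeof t)
        return -1;
    if (memcmp(t, TRAILER_MAGIC, 8) != 0)
        return -1;                    /* nothing was appended */

    for (int i = 7; i >= 0; i--)      /* decode little-endian length */
        len = (len << 8) | t[8 + i];
    if (len > (uint64_t)(end - TRAILER_SIZE))
        return -1;                    /* length cannot be right */

    *length = (long)len;
    *offset = end - TRAILER_SIZE - (long)len;
    return 0;
}
```

After a successful call, fseek(f, offset, SEEK_SET) followed by fread() of length bytes yields the appended data. Note that this does nothing about the signature and tampering concerns discussed above.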

How feasible is it to virtualise the FILE* interfaces of C?

I have often noticed that I would have been able to solve practical problems in C elegantly if there had been a way of creating a ‘virtual FILE’ and attaching the necessary callbacks for events such as buffer full, input requested, close, flush. It should then be possible to use a large part of the stdio.h functions, e.g. fprintf unchanged. Is there a framework enabling one to do this? If not, is it feasible with a moderate amount of effort, on at least some platforms?
Possible applications would be:
To write to or read from a dynamic or static region of memory.
To write to multiple files in parallel.
To read from a thread or co-routine generating data.
To apply a filter to another (virtual or real) FILE.
Support for file formats with indirection (like #include).
A C pre-processor(?).
I am less interested in solutions for specific cases than in a framework to let you roll your own FILE. I am also not looking for a virtual filesystem, but rather virtual FILE*s that I can pass to the CRT.
To my disappointment I have never seen anything of the sort; as far as I can see C11 considers FILE entirely up to the language implementer, which is perhaps reasonable if one wishes to keep the language (+library) specifications small but sad if you compare it with Java I/O streams.
I feel sure that virtual FILEs must be possible with any (fully) open source implementation of the C run-time, but I imagine there might be a large number of details making it trickier than it seems, and if it has already been done it would be a shame to reduplicate the effort. It would also be greatly preferable not to have to modify the CRT code. Without open source one might be able to reverse engineer the functions supplied, but I fear the result would be far too vulnerable to changes in unsupported features, unless there were a commitment to a set of interfaces. I suppose too that any system for which one can write a device driver would allow one to create a virtual device, but I suspect that of being unnecessarily low-level and of requiring one to write privileged code.
I have to admit that while I have code that would have benefited from virtual FILEs, I have no current requirement for it; nonetheless it is something I have often wondered about and that I imagine could be of interest to others.
This is somewhat similar to a-reader-interface-that-consumes-files-and-char-in-c, but there the questioner did not hope to return a virtual FILE; the answer, however, using fmemopen, did.
There is no standard C interface for creating virtual FILE*s, but both the GNU and the BSD standard libraries include one. On Linux (glibc), you can use fopencookie; on most *BSD systems (including Mac OS X), funopen. (See Note 1)
The two interfaces are similar but slightly different in some details. However, it is usually very simple to adapt code written for one interface to the other.
These are not complete virtualizations. They associate the FILE* with four callbacks and a void* context (the "cookie" in fopencookie). The callbacks are read, write, seek and close; there are no callbacks for flush or tell operations. Still, this is sufficient for many simple FILE* adaptors.
For a simple example, see the two answers to Write simultaneousely to two streams.
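As a rough illustration of the glibc interface, here is a minimal sketch of a write-only "tee" stream that copies everything to two existing FILE*s. It is glibc-specific (fopencookie), and the names tee_cookie, tee_write, tee_close and tee_open are made up for this example; a funopen version would look similar but with different callback signatures.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Context passed to every callback: the two real streams. */
struct tee_cookie {
    FILE *a;
    FILE *b;
};

/* Write callback: forward the bytes to both streams.
 * fopencookie expects the byte count on success and 0 on error. */
static ssize_t tee_write(void *cookie, const char *buf, size_t size)
{
    struct tee_cookie *t = cookie;

    if (fwrite(buf, 1, size, t->a) != size)
        return 0;
    if (fwrite(buf, 1, size, t->b) != size)
        return 0;
    return (ssize_t)size;
}

/* Close callback: free the context; the caller still owns a and b. */
static int tee_close(void *cookie)
{
    free(cookie);
    return 0;
}

/* Create a virtual FILE* that writes to both a and b. */
FILE *tee_open(FILE *a, FILE *b)
{
    assert(a != NULL && b != NULL);

    struct tee_cookie *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->a = a;
    t->b = b;

    cookie_io_functions_t io = {
        .read = NULL, .write = tee_write, .seek = NULL, .close = tee_close
    };
    FILE *f = fopencookie(t, "w", io);
    if (f == NULL)
        free(t);
    return f;
}
```

The returned stream works with fprintf, fputs and the rest of stdio. Because it is buffered like any other FILE*, the data reaches a and b only when the virtual stream is flushed or closed.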
Notes:
funopen is derived from "functional open", not from "file unopen".

what's the difference between switch_root and run_init?

What's the difference between switch_root and run_init, besides switch_root being made by busybox while run_init is from klibc?
Thanks very much
They both perform exactly the same function, which is to switch to the "real" root and execv(3) the "real" init(8) program from an initramfs. They both assume that the filesystem that should become the root has been mounted on some directory, which they take as an argument.
(An initramfs is a (usually) temporary in-memory filesystem loaded by the bootloader. Its purpose is to do any setup that might be required before mounting the real root and switching to the real init program.)
Recent source code for run-init can be found here. run_init() is the entry point (called from run-init.c, which parses the arguments).
Recent source code for switch_root can be found here. switch_root_main() is the entry point.
The code is short for both implementations (though a bit tricky), which makes it easy to compare them by eye. The only difference seems to be that they perform slightly different sanity checks, and that recent versions of run-init have an extra option to drop selected capabilities(7) before execv()'ing the new init.
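For orientation, the core sequence that both tools share can be sketched roughly as follows. This is a simplified illustration, not the source of either program: it leaves out the sanity checks, the deletion of the initramfs contents, console handling and capability dropping, and the function name is made up. It needs root privileges and only makes sense when run from an initramfs.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/mount.h>
#include <unistd.h>

/* Switch to the filesystem mounted at newroot and exec init there.
 * Only returns on failure, in which case it returns -1. */
int switch_to_new_root(const char *newroot, const char *init,
                       char *const argv[])
{
    assert(newroot != NULL && init != NULL && argv != NULL);

    /* Enter the directory where the real root is mounted. */
    if (chdir(newroot) != 0) {
        perror("chdir to new root");
        return -1;
    }
    /* Move that mount on top of "/". */
    if (mount(".", "/", NULL, MS_MOVE, NULL) != 0) {
        perror("mount --move");
        return -1;
    }
    /* Make it the root of this process and go to its top. */
    if (chroot(".") != 0 || chdir("/") != 0) {
        perror("chroot");
        return -1;
    }
    /* Replace this process with the real init. */
    execv(init, argv);
    perror("execv");
    return -1;
}
```

The order matters: the chdir happens first so that "." still refers to the new root after the mount has been moved.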

Questions about register_chrdev_region() in linux device driver

I'm learning about the registration of a kernel module using register_chrdev_region(dev_t from, unsigned count, const char * name);.
I notice that with or without this function, my kernel module worked as expected. The code I used for testing:
first = MKDEV(MAJOR_NUM, MINOR_NUM);
register_chrdev_region(first, count, DEVICE_NAME); // <--- with and without
mycdev = cdev_alloc();
mycdev->ops = &fops;
mycdev->owner = THIS_MODULE;
if (cdev_add(mycdev, first, count) == 0) {
    printk(KERN_ALERT "driver loaded\n");
}
I commented out the line register_chrdev_region(first, count, DEVICE_NAME);, and the printk message still appeared. I tried to communicate with the driver with or without this from user space, and both are successful.
So my question is, is this function register_chrdev_region() only used to make my driver a good kernel citizen, just like telling the others that "I'm using up the major number, please don't use"?
I tried to have a look in the kernel source char_dev.c to understand the function, but I find it too difficult to understand, anyone that's familiar with this?
Thanks!
That will work because it's not actually necessary to allocate your device numbers up front. In fact, it's considered preferable by many kernel developers to use the dynamic (on-the-fly, as-needed) allocation function alloc_chrdev_region.
Whether you do it statically up front or dynamically as needed, it is something you should do to avoid conflict with other device drivers which may have played by the rules and been allocated the numbers you're trying to use. Even if your driver works perfectly well without it, that won't necessarily be true on every machine or at any time in the future.
The rules are there for a reason and, especially with low-level stuff, you are well advised to follow them.
See here for more details on the set-up process.
If the major number for your device clashes with one already in use by another device, the allocation will fail.
If you have already checked which major number is free and used it, it will generally not raise an error, and you will have no problem when you load the driver.
But if you run the driver on various systems and the major number is already taken by another driver on one of them, loading your driver can fail.
It is always better to use dynamic allocation!
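If you do want to check from user space whether a character major number is taken on a given machine, /proc/devices lists the registered majors. The following is a minimal sketch that parses that format; the function name is made up for this example, and the check is only advisory, since another driver can register the number between your check and your module load. That race is one more reason to prefer dynamic allocation.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Scan text in /proc/devices format and report whether the given
 * major number appears in the "Character devices:" section.
 * Returns 1 if it is in use, 0 otherwise. */
int char_major_in_use(FILE *f, unsigned int major)
{
    char line[256];
    int in_char_section = 0;

    assert(f != NULL);

    while (fgets(line, sizeof line, f) != NULL) {
        unsigned int num;

        if (strncmp(line, "Character devices:", 18) == 0) {
            in_char_section = 1;
            continue;
        }
        if (strncmp(line, "Block devices:", 14) == 0) {
            in_char_section = 0;
            continue;
        }
        /* Entries look like "  4 tty": a number, then the driver name. */
        if (in_char_section && sscanf(line, "%u", &num) == 1 && num == major)
            return 1;
    }
    return 0;
}
```

On a real system you would pass a stream from fopen("/proc/devices", "r").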
