Print all files on a filesystem using system call - c

I am working in the kernel and I am trying to make a system call that takes a partition as input (i.e. /dev/sda1) and then prints every file on the filesystem using printk().
I enter a partition (i.e. /dev/sda1) and I put a printk() inside this system call to print.
First, I tried to do this with a process, because if I am right each process is represented by a task_struct and I tried to access the files with the files_struct. But the problem is that I only have the file descriptors of the opened files and not all the files.
So, what I want to do is that I pass the name of the partition and I printk() the names of all the files.
For example:
I enter the path /dev/sda1 as an argument and let's suppose I have the file a.txt and b.txt inside this partition , so the system call should print a.txt and b.txt.
The signature will be like this:
asmlinkage long sys_acall(char *partition_name);

There is a few things that needs to be discussed.
The partition_name parameter of your syscall should have the __user tag.
If you want to, strictly speaking, read files from a partition you will have to implement filesystem recognition (is that partition ext3, reiserfs, ntfs, ...?) and then implement the driver for that kind of filesystem. As Christ pointed out, partitions doesn't contain files but filesystems does. Another option is use the drivers already implemented for the filesystem on that partition. This option is just horrible.
If you want to read files from a filesystem your work gets easier, you can use the VFS interface to access it, but you will need that filesystem to be mounted (you can do it on-the-fly though).
My final opinion, I would change "implement a system call that prints every file in a partition" for "implement a system call that prints every file in a directory". The signature for that system call would be:
asmlinkage long sys_crazyness(__user const char *dir);
We don't care if the directory passed is the root of a filesystem or just a folder in any depth-level of a filesystem.
If you can change your problem to this one it would be much easier ;)

Related

What is the result of a stat() call on a file being written to?

Having a little problem with the stat() system call on Solaris 10. I'm doing FTP and at the same time, calling stat (to check on file size) on files that are being written to concurrently through FTP.
Let's assume that files are being written to a directory while a stat() command/call is being called (in parallel). Then would the result of st_size in the struct be 0?
Or would the stat call reflect the current size of the file while FTP is happening?
Is FTP as transactional as I think it is?
The stat()-call would show you the same as ls, since ls uses stat() (or a similar function from this family) to show the file size and attributes.
So, for all common filesystems, stat() would return the current filesize, which will usually constantly grow during the ftp put transaction.
However, an FTP-Server (or even an FTP-client) might choose to create an empty file of the requested target name, write the actual data to a temporary file and rename this file to the real file name after the transfer completed. In this case, stat() would return size 0. But this is not the usual way it happens.

Get path from file descriptor when path is longer than PATH_MAX

I receive filesystem events from fanotify. Sometimes I want to get an absolute path to a file that's being accessed.
Usually, it's not a problem - fanotify_event_metadata contains a file descriptor fd, so I can call readlink on /proc/self/fd/<fd> and get my path.
However, if a path exceeds PATH_MAX readlink can no longer be used - it fails with ENAMETOOLONG. I'm wondering if there's a way to get a file path in this case.
Obviously, I can fstat the descriptor I get from a fanotify and traverse the entire filesystem looking for files with identical device ID and inode number. But this approach is not feasible for me performance-wise (even if I optimize it to ignore paths shorter than PATH_MAX).
I've tried getting a parent directory by reopening fd with O_PATH and calling openat(fd, "..", ...). Obviously, that failed because fd doesn't refer to a directory. I've also tried examining contents of a buffer after a failed readlink call (hoping it contains partial path). That didn't work either.
So far I've managed to get long paths for files inside the working directory of a process that opened them (fanotify events contain a pid of a target process, so I can read /proc/<pid>/cwd and get the path to the root from there). But that is a partial solution.
Is there a way to get an absolute path from a file descriptor without traversing the whole filesystem? Preferably the one that will work with kernel 2.6.32/glibc 2.11.
Update: For the curious. I've figured out why calling readlink("/proc/self/fd/<fd>", ... with a buffer large enough to store the entire path doesn't work.
Look at the implementation of do_proc_readlink. Notice that it doesn't use provided buffer directly. Instead, it allocates a single page and uses it as a temporary buffer when it calls d_path. In other words, no matter how large is buffer, d_path will always be limited to a size of a page. Which is 4096 bytes on amd64. Same as PATH_MAX! The -ENAMETOOLONG itself is returned by prepend when it runs out of mentioned page.
readlink can be used with a link target that's longer than PATH_MAX. There are two restrictions: the name of the link itself must be shorter than PATH_MAX (check, "/proc/self/fd/<fd>" is about 20 characters) and the provided output buffer must be large enough. You might want to call lstat first to figure out how big the output buffer should be, or just call readlink repeatedly with growing buffers.
the limitation of PATH_MAX births from the fact that the unix (or linux, from now) needs to bind the size of parameters passed to the kernel. There's no limit on how deep a file hierarchy can grow, and always there's the possibility to access all files, independent on how deep they are in the filesystem hierarchy. What is actually limited is the lenght of the string you can pass or receive from the kernel representing a file name. This means you cannot create (because you have to pass the target path) a symlink longer than this length, but you can have easily paths far longer this limit.
When you pass a filename to the kernel, you can do that for two reasons, to name a file (or device, or socket, or fifo, or whatever), to open it, etc. YOu do this and your filename goes first to a routine that converts that path into an inode (which is what the kernel manages actually). That routine begins scanning from two possible point in the filesystem hierarchi. Those points are the inode reference of the root inode and the inode reference of the curren working diretory of a process. The selection of which inode to use as departure inode depends on the presence of a leading / character at the begining of the path. From this point, up to PATH_MAX characters will be processed each time, but that can lead us deep enough that we cannot get to the root in one step only...
Suppose you use the path to change your current directory, and do a chdir A/B/C/D/E/.../Z. Once there, you create new directories and do the same thing, chdir AA/AB/AC/AD/AE/.../AZ, then chdir BA/BB/BC/BD/... and so on... there's nothing in the system that forbids you to get so deep in the filesystem (you can try that yourself, I have done and tested before) You can grow to a map that is by far larger than PATH_MAX. But this only mean that you cannot get there directly from the filesystem root. You can go there in steps, as much as the system allows you, and depending on where you fix you root directory (by means of the chroot(2) syscall) or your current directory (by means of the chdir(2) syscall)
probably you have notice (or not) that there's no system call to get your curren working directory path from root... There are several reasons for this:
root inode and curren working inode are two local-to-process concepts. Two processes in the same system can have different working directories, and also different root directories, up to the point that they are able to share nothing in common and no way from one's directory to reach the other.
inode path can be ambiguous. Well, this is not true for a directory, as it is not allowed two hard links to point to the same directory inode (this was possible in older unices, where directories had to be created with the mknod(2) system call, if you have access to some hp-ux v6 or old Unix SysV R4 you can create directories with a ... entry ---pointing to the granparent of a directory or similar things, just being root and knowing how to use the mknod(2) syscall) The idea is that when two links point to the same inode, which (or both) of then goes to the root, which one is the right path from the root inode to the current dir?
curren inode and root can be separated by a path far enough to not fit in the PATH_MAX limit.
there can be several different filesystems (and filesystem types) involved in getting to the root. So this is not something that can be obtained only knowing the stored data in the disks, you must know the mounting table.
For these reasons, there's no direct support in the kernel to know the root path to a file. And also there's no way to get the path (and this is what the pwd(1) command does) than to follow the .. entry and get to the parent directory and search there a link that gets to the inode number of the current dir... and repeat this until the parent inode is the same as the last inode visited. Only then you'll be in the root directory (your root directory, that is different in general of other processes root directories)
Just try this exercise:
i=0
while [ "$i" -lt 10000 ]
do
mkdir dir-$i
cd dir-$i
i=$(expr "$i" + 1)
done
and see how far you can go from the root directory in your hierarchy.
NOTE 1
Another reason to be impossible to get the path to a file from an open descriptor is that you have access only to the inode (the path you used to open(2) it can have no relationship to the actual root path, as you can use symlinks and relative to the working directory, or changed root dir in between the open call and the time you want to access the path, it can even not exist, as you can have unlink(2)d it) The inode information has no reference to the path to the inode, as there can be multiple (even millions) paths to a file. In the inode you have only a ref count, which means the number of paths that actually finish on that inode.

Copying a directory using sockets

I'm writing a program in C that sends files across the network using sockets. This works fine for files - they are read into a buffer and then written onto the socket. They are picked up at the other end by reversing this process.
However, how can this apply to directories? I also want to copy directories, keeping the permissions the same (so I don't think mkdir will work). At the moment when I try to run this on a directory, it says the size is -1. How is a directory represented?
To be clear, for example, if I want my program to copy /tmp across the network, it will do this:
/tmp/1.txt - OK
/tmp/2.txt - OK
/tmp/dir/ - Skip
/tmp/dir/3.txt - Can't write to path
There are several possibilities. It would fit fairly will with what you have already to tar the directory to transfer, send the resulting archive across the network, and untar on the other side.
Alternatively, you can walk the directory tree recursively. For each directory you need transfer only the name and whichever attributes you want to preserve, but then you must list the directory contents (probably via readdir()) and transfer each member.
By the way, don't neglect to think about how you're going to handle links, both symbolic ones and hard ones. And if you want your program to be really robust then consider also what to do with special files such as device files and FIFOs.
I guess it is homework, otherwise why not use FTP, scp, rsync, unison etc.
To test if a file path is a plain file, a device, a directory, etc etc... use
stat(2)
To read a directory, use opendir(3) then loop on readdir(3) (then of course closedir). You don't need to know how a directory is represented.
You probably should be interested in nftw(3) to recursively traverse a file tree.
To make one directory, use mkdir(2)
You should read Advanced Linux Programming
BTW, this answer contains useful information too...

Safely writing to and reading from the same file with multiple processes on Linux and Mac OS X

I have three processes designed to run constantly in both Linux and Mac OS X environments. One process (the Downloader) downloads and stores a local copy of a large XML file every 30 seconds. Two other processes (the Workers) use the stored XML file for input. Each Worker starts and runs at random times. Since the XML file is big, it takes a long time to download. The Workers also take a long time to read and parse it.
What is the safest way to setup the processes so the Downloader doesn't clobber the stored file while the Workers are trying to read it?
For Linux and Mac OS X machines that use inode based file systems, use temporary files to store the data while its being downloaded (and is an incomplete state). Once the download is complete, move the temporary file into its final location with an atomic action.
For a little more detail, there are two main things to watch out for when one process (e.g. Downloader) writes a file that's actively read by other processes (e.g. Workers):
Make sure the Workers don't try to read the file before the Downloader has finished writing it.
Make sure the Downloader doesn't alter the file while the Workers are reading it.
Using temporary files accommodates both of these points.
For a more specific example, when the Downloader is actively pulling the XML file, have it write to a temporary location (e.g. 'data-storage.tmp') on the same device/disk* where the final file will be stored. Once the file is completely downloaded and written, have the Downloader move it to its final location (e.g. 'data-storage.xml') via an atomic (aka linearizable) rename command like bash's mv.
* Note that the reason the temporary file needs to be on the same device as the final file location is to ensure the inode number stays the same and the rename can be done atomically.
This methodology ensures that while the file is being downloaded/written the Workers won't see it since it's in the .tmp location. Because of the way renaming works with inodes, it also make sure that any Worker that opened the file continues to see the old content even if a new version of the data-storage file is put in place.
Downloader will point 'data-storage.xml' to a new inode number when it does the rename, but the Worker will continue to access 'data-storage.xml' from the previous inode number thereby continuing to work with the file in that state. At the same time, any Worker that opens a new copy 'data-storage.xml' after Downloader has done the rename will see contents from the new inode number since it's now what is referenced directly in the file system. So, two Workers can be reading from the same filename (data-storage.xml) but each will see a different (and complete) version of the contents of the file based on which inode the filename was pointed to when the file was first opened.
To see this in action, I created a simple set of example scripts that demonstrate this functionality on github. They can also be used to test/verify that using a temporary file solution works in your environment.
An important note is that it's the file system on the particular device that matters. If you are using a Linux or Mac machine but working with a FAT file system (for example, a usb thumb drive), this method won't work.

How to obtain a file name from the standard FILE structure?

What I want:
void printFname(FILE * f)
{
char buf[255];
MagicFunction(f,buf);
printf("File name: %s",buf);
}
So, all I need is "MagicFunction", but unfortunatelly I haven't found such ...
Is there any way to implement using an OS library? (windows.h , cocoa.h, posix.h etc.)
There is no such function. There may be no filename, or more than one filename that correspond with the FILE *. On Unix, a program can continue to have a reference to a file after it has been renamed or deleted, which could mean that you have a FILE * with no name. Or more hard links may be made to the file, which means a file can have multiple names; which one would you choose? To further confuse things, a file can be temporarily hidden, by mounting a filesystem over a directory containing that file. The file will still be on disk, at its original pathname, but the file will be inaccessible at that path because the mount is obscuring it.
It's also possible that the FILE * never corresponded to a file on the filesystem at all; while they usually do, you can create one from any file descriptor using fdopen(), and that file descriptor may be a pipe, socket, or other file-like object that has never had a path on the disk. In some versions of the C library, you can open a string stream (for instance, fmemopen() in glibc), so the FILE * actually just corresponds to a memory buffer.
If you care about the name, it's best to just keep track of what it was named when you opened the file.
There are some hacky ways to approximate getting the filename; if you're just using this for debugging or informational purposes, then they may be sufficient. Most of these will require operating on the file descriptor rather than the FILE *, as the file descriptor is the lower level way of referring to a file. To get the file descriptor, run fileno() on the FILE *, and remember to check for errors in case there is no file descriptor associated with that FILE *.
On Linux, you can do readlink() on "/proc/self/fd/fileno" where fileno is the file descriptor. That will show you what filename the file had when the file was opened, or a string indicating what other kind of file descriptor it is, like a socket or inotify handle. FreeBSD and NetBSD have Linux emulation layers, which include emulation of Linux-style procfs; you may be able to do this on those if you mount a Linux-compatible procfs, though I don't have them available for testing.
On Mac OS X, you don't have /proc/self/fd. If you don't care about finding the original filename, but some other filename that refers to the file would work (such that you could pass it to another program), you can construct one: /.vol/deviceid/inode. For example, /.vol/234881030/281363. To get those values, run fstat() on the file descriptor, and use st_dev and st_ino on the resulting struct stat.
On Windows, files and the filesystem work quite differently than Unix. Apparently it's possible to map a file back to its name on Windows. As of Windows Vista, you can simply call GetFinalPathNameByHandle(). This takes a HANDLE; to get the HANDLE from the file descriptor, call _get_osfhandle(). Prior to Windows Vista, you need to do a little more work, as described in this article. Note that on Windows fileno() is named _fileno(), though the former may work with a warning.
Going even further into hacky territory, there are a few more techniques that you could use. You could shell out to lsof, or you could extract the code it uses to resolve pathnames. lsof actually looks directly in kernel memory, extracting information from the kernel's name cache. This has several limitations, outlined in the lsof FAQ. And of course, you need root or equivalent privileges to do this, either directly or with an suid/sgid binary.
And finally, for a portable but slow solution for finding one or more filenames matching an open file, you could find the device and inode number using fstat() on the file descriptor, and then recursively traverse the filesystem stat()ing every file, until you find a file with matching device and inode number. Remember the caveats I mention above; you may find no matching files, more than one matching file, and even if you don't find any matching files, the file might still be there, but hidden by a mount point. And of course, there may be race conditions; something may rename the file in such a way that you never see it while traversing the hierarchy.
There is no such standard function.
Do you fopen() yourself? If then, maintain FILE * to filename hash table yourself.
Otherwise, it's not possible in general.
I don't think that there is such function even at windows.h,coca.h or unistd.h.
Most probably you write it yourself. Just make a
struct myFile {
FILE *fh;
char *filename;
}
and hold such structures into array of struct myFile and in MagicFunction(f,b) walk on the array looking for the address equal to f.

Resources