Linux function to get mount points - c

Is there a function (or interface: ioctl, netlink, etc.) in the standard Linux libraries that returns the current mounts directly from the kernel without parsing /proc? Running strace on the mount command shows that it parses files in /proc.

Please see the clarification at the bottom of the answer for the reasoning being used in this answer.
Is there any reason that you would not use the getmntent libc library call? I do realize that it's not the same as an 'all in one' system call, but it should allow you to get the relevant information.
#include <stdio.h>
#include <stdlib.h>
#include <mntent.h>

int main(void)
{
    struct mntent *ent;
    FILE *aFile;

    /* Open the kernel's mount table through the mntent API. */
    aFile = setmntent("/proc/mounts", "r");
    if (aFile == NULL) {
        perror("setmntent");
        exit(1);
    }
    /* Each call returns the next entry, or NULL at the end of the table. */
    while (NULL != (ent = getmntent(aFile))) {
        printf("%s %s\n", ent->mnt_fsname, ent->mnt_dir);
    }
    endmntent(aFile);
    return 0;
}
Clarification
Considering that the OP clarified about trying to do this without having /proc mounted, I'm going to clarify:
There is no facility outside of /proc for getting the fully qualified list of mounted file systems from the Linux kernel. There is no system call, there is no ioctl. The /proc interface is the agreed-upon interface.
With that said, if you don't have /proc mounted, you will have to parse the /etc/mtab file: pass in /etc/mtab instead of /proc/mounts to the initial setmntent call.
It is an agreed-upon convention that the mount and umount commands maintain a list of currently mounted filesystems in the file /etc/mtab. This is detailed in almost all Linux/Unix/BSD manual pages for these commands. So if you don't have /proc you can more or less rely on the contents of this file. It's not guaranteed to be a source of truth, but conventions are conventions for these things.
So, if you don't have /proc, you would pass /etc/mtab to the setmntent call in the code above and read the list of file systems with getmntent; otherwise you could read /proc/mounts or /proc/self/mountinfo (which is recommended nowadays over /proc/mounts).
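Note that /proc/self/mountinfo uses a different line layout from /proc/mounts, so it needs its own parser. The following is only a rough sketch of one: the names mount_entry, parse_mountinfo_line and print_mounts are made up for illustration, and the field order is the one described in the proc man pages (mount ID, parent ID, major:minor, root, mount point, options, optional fields, a lone "-", then filesystem type and source).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record type for this sketch; not a libc structure. */
struct mount_entry {
    char mount_point[256];
    char fs_type[64];
    char source[256];
};

/*
 * Parse one line of /proc/self/mountinfo.
 * Returns 0 on success, -1 if the line does not match the expected layout.
 */
int parse_mountinfo_line(const char *line, struct mount_entry *out)
{
    const char *sep;

    /* Skip the first four fields; the fifth is the mount point. */
    if (sscanf(line, "%*d %*d %*s %*s %255s", out->mount_point) != 1)
        return -1;

    /* The optional fields end at a lone "-"; type and source follow it. */
    sep = strstr(line, " - ");
    if (sep == NULL)
        return -1;
    if (sscanf(sep + 3, "%63s %255s", out->fs_type, out->source) != 2)
        return -1;
    return 0;
}

/* Print source, mount point and type for every line of a mountinfo file. */
int print_mounts(const char *path)
{
    char line[4096];   /* assumption: no line is longer than this */
    struct mount_entry e;
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        if (parse_mountinfo_line(line, &e) == 0)
            printf("%s %s %s\n", e.source, e.mount_point, e.fs_type);
    }
    fclose(f);
    return 0;
}
```

Calling print_mounts("/proc/self/mountinfo") from main would list the mounts visible to the current process. Spaces inside paths appear escaped as \040 in this file, and the sketch leaves them escaped.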

There is no syscall to list this information; instead, you can find it in the file /etc/mtab.

Related

How exactly _fsopen() works?

How exactly does _fsopen() work? Does Linux also have a similar way of opening files that prepares the file for subsequent shared reading or writing based on shflag?
Referred article here.
"How exactly does _fsopen() work?"
You've linked to the docs. It does what they say it does. If you're asking how it is implemented then we cannot answer because that information is proprietary.
"Does Linux also have a similar way of opening files that prepares the file for subsequent shared reading or writing based on shflag?"
Linux does not have share modes. That's a Windows quirk. Under Linux or other Unix-like operating systems such as macOS, you don't need special flags or modes to share files between processes.
Overall, _fsopen() is an MS-specific variant of the C standard library's fopen() function. In addition to the share-mode flag, which is not relevant to other operating systems, it performs parameter validation in the manner of various other MS extension functions. On Linux, one takes responsibility for validating one's own arguments and simply uses fopen().
On Windows, files are opened using the CreateFileW function, which uses the NtCreateFile system call.
The dwShareMode argument specifies the file sharing policy and contains a combination of the flags FILE_SHARE_DELETE, FILE_SHARE_READ and FILE_SHARE_WRITE, which the shflag argument of _fsopen is mapped to.
If you want to know what a possible implementation of the function could look like, first keep in mind that MSVCRT tries to support an equivalent of the POSIX file descriptor API. Then check the following functions:
_open_osfhandle allows you to convert NT HANDLE to POSIX-like file descriptor
_fdopen allows you to get a FILE * from a file descriptor (equivalent of POSIX fdopen function).
So the possible implementation can look like this (in pseudo code):
FILE *_fsopen(...)
{
    HANDLE hFile = CreateFileW(...);
    int fd = _open_osfhandle(hFile, ...);
    return _fdopen(fd, ...);
}
Linux has no file sharing policy of this kind, so there is no direct equivalent of the shflag argument.
PS: Another related function is _wsopen - combines CreateFileW and _open_osfhandle.
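For comparison, the same two-step shape can be written with POSIX calls on Linux. This is only a sketch: open_stream is a hypothetical helper name, not a libc function, and it has no share-mode parameter because there is nothing for one to map to.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper mirroring the pseudo-code above with POSIX pieces:
 * open() plays the role of CreateFileW(), and fdopen() wraps the resulting
 * descriptor in a FILE *. Returns NULL on failure.
 */
FILE *open_stream(const char *path, int flags, mode_t perm, const char *mode)
{
    FILE *fp;
    int fd = open(path, flags, perm);

    if (fd < 0)
        return NULL;
    fp = fdopen(fd, mode);
    if (fp == NULL)
        close(fd);   /* fdopen failed, so the descriptor is still ours */
    return fp;
}
```

For example, open_stream("log.txt", O_WRONLY | O_CREAT | O_APPEND, 0644, "a") would give a stream for appending. The stdio mode string has to be compatible with the flags passed to open().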

Using `read` system call on a directory

I was looking at an example in K&R 2 (8.6 Example - Listing Directories). It is a stripped-down version of the Linux command ls or Windows' dir. The example shows an implementation of functions like opendir and readdir. I've typed in the code word for word but it still doesn't work. All it does is print a dot (for the current directory) and exit.
One interesting thing I found in the code (in the implementation of readdir) was that it was calling the system calls like open and read on directory. Something like -
int fd, n;
char buf[1000], *bufp;
bufp = buf;
fd = open("dirname", O_RDONLY, 0);
n = read(fd, bufp, 1000);
write(fd, bufp, n);
When I run this code I get no output even when the folder name "dirname" has some files in it.
Also, the book says, that the implementation is for Version 7 and System V UNIX systems. Is that the reason why it is not working on Linux?
Here is the code- http://ideone.com/tw8ouX.
So does Linux not allow read system calls on directories? Or something else is causing this?
In Version 7 UNIX, there was only one unix filesystem, and its directories had a simple on-disk format: array of struct direct. Reading it and interpreting the result was trivial. A syscall would have been redundant.
In modern times there are many kinds of filesystems that can be mounted by Linux and other unix-like systems (ext4, ZFS, NTFS!), some of which have complex directory formats. You can't do anything sensible with the raw bytes of an arbitrary directory. So the kernel has taken on the responsibility of providing a generic interface to directories as abstract objects. readdir is the central piece of that interface.
Some modern unices still allow read() on a directory, because it's part of their history. Linux history began in the 90's, when it was already obvious that read() on a directory was never going to be useful, so Linux has never allowed it.
Linux does provide a readdir syscall, but it's not used very much anymore, because something better has come along: getdents. readdir only returns one directory entry at a time, so if you use the readdir syscall in a loop to get a list of files in a directory, you enter the kernel on every loop iteration. getdents returns multiple entries into a buffer.
readdir is, however, the standard interface, so glibc provides a readdir function that calls the getdents syscall instead of the readdir syscall. In an ordinary program you'll see readdir in the source code, but getdents in the strace. The C library is helping performance by buffering, just like it does in stdio for regular files when you call getchar() and it does a read() of a few kilobytes at a time instead of a bunch of single-byte read()s.
You'll never use the original unbuffered readdir syscall on a modern Linux system unless you run an executable that was compiled a long time ago, or go out of your way to bypass the C library.
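To make the portable route concrete, here is a minimal sketch that lists a directory through the C library's opendir/readdir interface instead of calling read() on it. The helper name list_dir is made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: print each entry of `path` except "." and "..",
 * and return how many were printed (-1 if the directory cannot be opened).
 */
int list_dir(const char *path)
{
    struct dirent *ent;
    int count = 0;
    DIR *dir = opendir(path);

    if (dir == NULL)
        return -1;
    /* Each readdir() call hands back one entry from the library's buffer. */
    while ((ent = readdir(dir)) != NULL) {
        if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
            continue;
        printf("%s\n", ent->d_name);
        count++;
    }
    closedir(dir);
    return count;
}
```

Running a program that calls list_dir(".") under strace should show getdents (or getdents64) calls rather than one system call per entry, in line with the buffering described above.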
In fact, Linux doesn't allow read() on directories. See the man page and search for the errno EISDIR. You will find:
The read() and pread() functions shall fail if ...
The fildes argument refers to a directory and the implementation does not allow the directory to be read using read() or pread(). The readdir() function should be used instead.
Other UNIXes allow it nevertheless.

reading seq_file from kernel

Could you post some examples of how to read a list of values from /proc files?
list_head* get_from_proc_file()
{
struct file* file = fopen("example","r");
seq_open(file, &seq_ops);
struct seq_file *p = file->private_data;
READ LIST OF DATA?????
}
You can't use fopen because it is a libc function, which is not available inside the kernel. The example below shows how to read a file from the kernel.
http://www.wasm.ru/forum/viewtopic.php?pid=467952#p467952
You probably don't need to read a /proc file from within the kernel. The /proc interface is how the kernel exports information to user space, so that information already exists in the kernel, either in some list of structs or in other global containers. The proper way is probably to get at the global list or container by calling a kernel API, or to use it directly if it is exported.

How do I open a directory at kernel level using the file descriptor for that directory?

I'm working on a project where I must open a directory and read the files/directories inside at kernel level. I'm basically trying to find out how ls is implemented at kernel level.
Right now I've figured out how to get a file descriptor for a directory using sys_open() and the O_DIRECTORY flag, but I don't know how to read the fd that I receive. If anyone has any tips or other suggestions I'd appreciate it. (Keep in mind this has to be done at kernel level).
Edit: To make a long story short, I am implementing file/directory attributes for a school project. I'm storing the attributes in a hidden folder at the same level as the file with a given attribute. (So a file in Desktop/MyFolder has an attributes folder called Desktop/MyFolder/.filename_attr.) Trust me, I don't care to mess around in the kernel for funsies. The reason I need to read a directory at kernel level is that it's part of the project specs.
To add to caf's answer mentioning vfs_readdir(): reading and writing files from within the kernel is considered unsafe (except for /proc, which acts as an interface to internal data structures in the kernel).
The reasons are well described in this linuxjournal article, although they also provide a hack to access files. I don't think their method could be easily modified to work for directories. A more correct approach is accessing the kernel's filesystem inode entries, which is what vfs_readdir does.
Inodes are filesystem objects such as regular files, directories, FIFOs and other
beasts. They live either on the disc (for block device filesystems)
or in the memory (for pseudo filesystems).
Notice that vfs_readdir() expects a file * parameter. To obtain a file structure pointer from a user space file descriptor, you should utilize the kernel's file descriptor table.
The kernel.org files documentation says the following on doing so safely:
To look up the file structure given an fd, a reader
must use either fcheck() or fcheck_files() APIs. These
take care of barrier requirements due to lock-free lookup.
An example :
rcu_read_lock();
file = fcheck_files(files, fd);
if (file) {
    // Handling of the file structures is special.
    // Since the look-up of the fd (fget() / fget_light())
    // are lock-free, it is possible that look-up may race with
    // the last put() operation on the file structure.
    // This is avoided using atomic_long_inc_not_zero() on ->f_count
    if (atomic_long_inc_not_zero(&file->f_count))
        *fput_needed = 1;
    else
        /* Didn't get the reference, someone's freed */
        file = NULL;
}
rcu_read_unlock();
....
return file;
atomic_long_inc_not_zero() detects if refcounts is already zero or
goes to zero during increment. If it does, we fail fget() / fget_light().
Finally, take a look at filldir_t, the second parameter type.
You probably want vfs_readdir() from fs/readdir.c.
In general though kernel code does not read directories, user code does.

Alternatives to using stat() to get file type?

Are there any alternatives to stat (which is found on most Unix systems) which can determine the file type? The manpage says that a call to stat is expensive, and I need to call it quite often in my app.
The alternative is fstat() if you already have the file open (so you have a file descriptor for it). Or lstat() if you want to find out about symbolic links rather than the file the symlink points to.
I think the man page is exaggerating the cost; it is not much worse than any other system call that has to resolve the name of the file into an inode. It is more costly than getpid(); it is less costly than open().
The "file type" that stat() gives you is whether the file is a regular file or something like a device file or directory, among other things like its size and inode number. If that's what you need to know, then you must use stat().
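As a minimal sketch of reading that file type out of the stat structure: the helper name file_type_name is invented for this example, and it uses lstat() so that a symbolic link is reported as a link rather than as whatever it points to.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: describe what kind of file `path` names.
 * Returns NULL if lstat() fails (for example, if the path does not exist).
 */
const char *file_type_name(const char *path)
{
    struct stat st;

    if (lstat(path, &st) != 0)
        return NULL;
    if (S_ISREG(st.st_mode))  return "regular file";
    if (S_ISDIR(st.st_mode))  return "directory";
    if (S_ISLNK(st.st_mode))  return "symbolic link";
    if (S_ISCHR(st.st_mode))  return "character device";
    if (S_ISBLK(st.st_mode))  return "block device";
    if (S_ISFIFO(st.st_mode)) return "FIFO";
    if (S_ISSOCK(st.st_mode)) return "socket";
    return "unknown";
}
```

Swapping lstat() for stat() would follow symbolic links instead, and fstat() would do the same job for a descriptor that is already open.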
If what you actually need to know is the type of the file's contents -- e.g. text file, JPEG image, MP3 audio -- then you have two options. You can guess based on the filename extension (if it ends in ".mp3", the file probably contains MP3 audio), or you can use libmagic, which actually opens the file and reads some of its contents to figure out what it is. The libmagic approach is more expensive (if you're trying to avoid stat(), you probably want to avoid open() too), but less prone to error (in case that ".mp3" file is actually a JPEG image, for example).
Under Linux, with some filesystems, the file type (regular, char device, block device, directory, pipe, symlink, ...) is stored in the linux_dirent struct, which is the form in which the kernel supplies directory entries to applications via the getdents system call. If the only thing you need from the stat structure is the file type, and you need it for all or many entries of a directory, you could use getdents directly (rather than readdir) and attempt to get the file type out of that, only using stat if you find an invalid file type in linux_dirent. Depending on your application's filesystem usage pattern this could be faster than using stat if you are using Linux, but stat should be fast in many cases.
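The sketch below illustrates the same idea with one substitution: instead of calling getdents directly, it reads the d_type field that glibc's readdir() exposes in struct dirent, and falls back to lstat() only when the filesystem reports DT_UNKNOWN. The helper name count_subdirs is hypothetical.

```c
#define _DEFAULT_SOURCE 1   /* exposes d_type and the DT_* constants in glibc */
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: count the subdirectories of `path`, trusting d_type
 * where it is filled in and calling lstat() only for DT_UNKNOWN entries.
 * Returns -1 if the directory cannot be opened.
 */
int count_subdirs(const char *path)
{
    struct dirent *ent;
    struct stat st;
    char full[PATH_MAX];
    int n, count = 0;
    DIR *dir = opendir(path);

    if (dir == NULL)
        return -1;
    while ((ent = readdir(dir)) != NULL) {
        if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
            continue;
        if (ent->d_type == DT_DIR) {
            count++;
        } else if (ent->d_type == DT_UNKNOWN) {
            /* The filesystem did not supply a type; ask lstat() instead. */
            n = snprintf(full, sizeof full, "%s/%s", path, ent->d_name);
            if (n < 0 || (size_t)n >= sizeof full)
                continue;   /* path too long for the buffer; skip it */
            if (lstat(full, &st) == 0 && S_ISDIR(st.st_mode))
                count++;
        }
    }
    closedir(dir);
    return count;
}
```

On filesystems that always fill in d_type, this loop never calls lstat() at all, which is where any saving over stat-ing every entry would come from.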
Stat's speed has mostly to do with locating the data that is being asked for on disk. If you are traversing a directory recursively stat-ing all of the files then each stat should end up being fairly quick overall because most of the work getting the data stat needs ends up cached before you ask the kernel for it by a previous call to stat. If on the other hand you stat the same number of files randomly distributed around the system then the kernel will likely have to read from disk several directories for each file you are going to call stat on.
fstat should always be very fast: the kernel already holds the data you are asking for in RAM, because it needs that data to keep the file open. It also does not have to walk the path of the filename, checking whether each component is in RAM or on disk and possibly reading a directory from disk, before it finds the data.
That being said, calling stat on an open file should be faster than calling it on an unopened file.
Are you aware of the "magic" file on *nix systems? By querying a file from the command line with something like file myfile.ext you can get the real file type.
This is done by reading the contents of the file rather than looking at its extension, and is widely used on *nix (Linux, Unix, ...) systems.
If your application is expected to run on Linux systems, why don't you try inotify(7)? It is definitely faster than stat-ing many files.
