K&R interface for reading directories: superfluous DIR structure?

In the 2nd edition of "The C Programming Language" by Kernighan and Ritchie, the authors implement a simplified version of the UNIX command ls (section 8.6 "Example - Listing Directories", p. 179). For this purpose they create the following interface, which provides system-independent access to the name and inode number of the files stored in a directory.
#define NAME_MAX 14 /* longest filename component; */
                    /* system dependent */
typedef struct { /* portable directory entry */
    long ino; /* inode number */
    char name[NAME_MAX+1]; /* name + '\0' terminator */
} Dirent;
typedef struct { /* minimal DIR: no buffering, etc. */
    int fd; /* file descriptor for directory */
    Dirent d; /* the directory entry */
} DIR;
DIR *opendir(char *dirname);
Dirent *readdir(DIR *dfd);
void closedir(DIR *dfd);
Then they implement this interface for Version 7 and System V UNIX systems.
opendir() basically uses the system call open() to open a directory and malloc() to allocate space for a DIR structure. The file descriptor returned by open() is then stored in the variable fd of that DIR. Nothing is stored in the Dirent component.

readdir() uses the system call read() to get the next (system-dependent) directory entry of an opened directory and copies the so obtained inode number and filename into a static Dirent structure (to which a pointer is returned). The only information needed by readdir() is the file descriptor stored in the DIR structure.
Now to my question: What is the point of having a DIR structure? If my understanding of this program is correct, the Dirent component of DIR is never used, so why not replace the whole structure with a file descriptor and directly use open() and close()?
Thanks.
PS: I am aware that on modern UNIX systems read() can no longer be used on directories (I have tried out this program on Ubuntu 10.04), but I still want to make sure that I have not overlooked something important in this example.

From K&R:
Regrettably, the format and precise contents of a directory are not the same on all versions of the system. So we will divide the task into two pieces to try to isolate the non-portable parts. The outer level defines a structure called a Dirent and three routines opendir, readdir, and closedir to provide system-independent access to the name and inode number in a directory entry.
So the reason is portability. They want to define an interface that can survive on systems that have different stat structs or nonstandard open() and close(). They go on to build a bunch of reusable tools around it, which don't even care if they're on a Unix-like system. That's the point of wrappers.
Maybe the Dirent member is not used because they started out by defining their data structures (with a Dirent inside DIR) but ended up not needing it. Keeping data structures grouped like that is good design.

It is so they don't have to allocate memory for the Dirent structure that is returned by readdir. This way they can reuse the Dirent between subsequent calls to readdir.
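As a rough illustration of the design these answers describe, here is a minimal sketch of a handle that bundles the system-level state with a reusable entry. It is not the K&R code: since read() no longer works on directories, it is layered on POSIX <dirent.h>, and the names SkDir, SkDirent, SK_NAME_MAX, sk_opendir, sk_readdir and sk_closedir are hypothetical, chosen only to avoid clashing with the system's own DIR.

```c
#include <dirent.h>   /* POSIX opendir, readdir, closedir */
#include <stdlib.h>   /* malloc, free */
#include <string.h>   /* strncpy */

#define SK_NAME_MAX 255          /* assumed limit for this sketch only */

typedef struct {                 /* portable directory entry */
    long ino;                    /* inode number */
    char name[SK_NAME_MAX + 1];  /* name + '\0' terminator */
} SkDirent;

typedef struct {                 /* handle: system state plus entry storage */
    DIR *sys;                    /* underlying POSIX directory stream */
    SkDirent d;                  /* overwritten by every sk_readdir call */
} SkDir;

/* Open a directory; returns NULL on failure. */
SkDir *sk_opendir(const char *dirname)
{
    SkDir *dp = malloc(sizeof *dp);
    if (dp == NULL)
        return NULL;
    dp->sys = opendir(dirname);
    if (dp->sys == NULL) {
        free(dp);
        return NULL;
    }
    return dp;
}

/* Copy the next system entry into the handle's own SkDirent and return it. */
SkDirent *sk_readdir(SkDir *dp)
{
    struct dirent *e = readdir(dp->sys);
    if (e == NULL)
        return NULL;             /* end of directory or error */
    dp->d.ino = (long)e->d_ino;
    strncpy(dp->d.name, e->d_name, SK_NAME_MAX);
    dp->d.name[SK_NAME_MAX] = '\0';
    return &dp->d;
}

/* Release the system stream and the handle. */
void sk_closedir(SkDir *dp)
{
    if (dp != NULL) {
        closedir(dp->sys);
        free(dp);
    }
}
```

With the entry stored inside the handle, callers never see the system-dependent representation, and two directories opened at the same time do not overwrite each other's entries, which a single static Dirent would not guarantee.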


how to change file timestamp including nanoseconds

I am making a program to copy files from a source to a destination directory and would like to change the destination file timestamps so they match the source file timestamps.
So far I have discovered the utime() function and have manipulated the utimbuf struct with the times I would like to use.
However, the times do not take into account the nanoseconds.
For example:
If I want to copy "file1" and it has a timestamp of 123.213241, my copy will have 123.000000 when running my current program. I would like to include the nanoseconds .213241 etc.
Here is my code so far:
struct stat buf;
struct utimbuf time;
stat(filename, &buf); // get metadata of file "filename" and then store in buf
time.actime = buf.st_atim.tv_sec; // set times in time struct
time.modtime = buf.st_mtim.tv_sec;
utime(filename_copy, &time); // load file copy with time struct
How can I include nanoseconds in my file timestamps?
According to POSIX, the function you need is utimensat() (or its close relative, futimens()). Both of these take a pair of struct timespec values in an array, which allows you to specify a time to nanoseconds. The first element is the access time; the second is the modification time.
Not all file systems support nanosecond timestamps. Not all systems actually support nanosecond resolution — they might round to the nearest microsecond.
Note that modern versions of the stat() function return a structure with elements st_atim, st_ctim, and st_mtim. These are also struct timespec values. The <sys/stat.h> header defines some backwards-compatibility macros:
For compatibility with earlier versions of this standard, the st_atime macro shall be defined with the value st_atim.tv_sec. Similarly, st_ctime and st_mtime shall be defined as macros with the values st_ctim.tv_sec and st_mtim.tv_sec, respectively.
For Linux, see utimensat(2). However, the documentation for stat(2) only mentions subsecond times in the Notes section near the bottom. Be cautious.
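The approach described in this answer might look roughly like the following sketch. It assumes a Linux/glibc-style system where struct stat exposes st_atim and st_mtim; the function name copy_times is made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>      /* AT_FDCWD */
#include <sys/stat.h>   /* stat, utimensat, st_atim, st_mtim */
#include <time.h>       /* struct timespec */

/*
 * Copy the access and modification times of src onto dst, keeping the
 * nanosecond part. Returns 0 on success, -1 on failure (errno is set).
 */
int copy_times(const char *src, const char *dst)
{
    struct stat sb;
    struct timespec ts[2];

    if (stat(src, &sb) == -1)
        return -1;

    ts[0] = sb.st_atim;   /* first element: access time */
    ts[1] = sb.st_mtim;   /* second element: modification time */

    /* AT_FDCWD resolves a relative dst against the current directory. */
    return utimensat(AT_FDCWD, dst, ts, 0);
}
```

A call such as copy_times(filename, filename_copy) would replace the utime() call in the question. Whether the nanoseconds survive still depends on the destination file system.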

What's the meaning of "utsname" in Linux?

In Linux, there is a header file called <sys/utsname.h>. It declares a utsname structure, which shall contain the members below:
char sysname[]   Name of this implementation of the operating system.
char nodename[]  Name of this node within the communications network to which this node is attached, if any.
char release[]   Current release level of this implementation.
char version[]   Current version level of this release.
char machine[]   Name of the hardware type on which the system is running.
I'm curious about the letters "u", "t", and "s". What do they mean?
I think it comes from the ancestor operating system where this syscall was already defined, so utsname means "Unix Time-Sharing System Name".
The full name for UNIX can be seen here:
http://cva.stanford.edu/classes/cs99s/papers/ritchie-thompson-unix-time-sharing-system.pdf
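For readers who have not used the structure, a small sketch of how its members are typically filled in and read through the POSIX uname() call follows. The helper name print_uname is hypothetical.

```c
#include <stdio.h>
#include <sys/utsname.h>   /* struct utsname, uname */

/* Print the five POSIX utsname members to out; returns 0 on success. */
int print_uname(FILE *out)
{
    struct utsname u;

    if (uname(&u) < 0)      /* uname() returns -1 on failure */
        return -1;

    fprintf(out, "sysname:  %s\n", u.sysname);
    fprintf(out, "nodename: %s\n", u.nodename);
    fprintf(out, "release:  %s\n", u.release);
    fprintf(out, "version:  %s\n", u.version);
    fprintf(out, "machine:  %s\n", u.machine);
    return 0;
}
```

Calling print_uname(stdout) prints roughly the same information as the uname -a command.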

linux kernel: how to remove a file in kernel space

I know this is strongly discouraged, but is it possible to do this in kernel space?
Given the file path, can we remove the corresponding file in kernel space?
Maybe it's too late, but I'll try to reply. As Tsyvarev said in his comment, you are probably looking for the vfs_unlink function, which you can find in namei.c.
There is a description just before the implementation, but a simple example could be this one:
/*
 * fcheck's prototype is in linux/fdtable.h; it returns a file pointer
 * given a file descriptor.
 * These internal kernel APIs change between versions, so check the
 * signatures in your own tree.
 */
struct file *filp = fcheck(fd);
struct inode *parent_inode = filp->f_path.dentry->d_parent->d_inode;
inode_lock(parent_inode);
vfs_unlink(parent_inode, filp->f_path.dentry, NULL);
inode_unlock(parent_inode);
I hope it can be useful to someone.

How to find development information about struct timespec

I'm writing a C program using nftw() to walk a filesystem and retrieve file modification times for each file.
nftw() calls a supplied function pointer and provides a struct stat as an argument.
man stat(2) states that the time modification fields are:
struct timespec st_atim; /* time of last access */
struct timespec st_mtim; /* time of last modification */
struct timespec st_ctim; /* time of last status change */
However, whilst man stat(2) provides an example of how to print the time fields, it doesn't tell me how to find information about struct timespec, nor how to query/manipulate the time modification fields.
How should I go about finding that information on my computer alone, without resorting to Google?
Typically one of the man pages describes what these structures contain. If you tell us your platform I can give further details. Otherwise, open up the header /usr/include/time.h to see what struct timespec is defined as.
$ apropos timespec
clock_gettime (2) - Return the current timespec value of tp for the specified clock
$ man 2 clock_gettime
Usually, when I need information on data types or functions, if not included in the man pages, I issue a command like the following:
grep -r "timespec" /usr/include/
which searches the directory where the header files are located.
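Once the header has been found, struct timespec turns out to have two members: tv_sec (whole seconds, a time_t) and tv_nsec (nanoseconds, a long). A minimal sketch of querying and formatting a file's modification time follows; the helper names file_mtime and format_timespec are invented here, and st_mtim is assumed to be available as on Linux/glibc.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>      /* snprintf */
#include <sys/stat.h>   /* stat, st_mtim */
#include <time.h>       /* struct timespec */

/* Store the modification time of path in *out; returns 0 or -1. */
int file_mtime(const char *path, struct timespec *out)
{
    struct stat sb;

    if (stat(path, &sb) == -1)
        return -1;
    *out = sb.st_mtim;
    return 0;
}

/* Write ts as "seconds.nanoseconds" with nine fractional digits. */
int format_timespec(const struct timespec *ts, char *buf, size_t n)
{
    return snprintf(buf, n, "%lld.%09ld",
                    (long long)ts->tv_sec, ts->tv_nsec);
}
```

Inside an nftw() callback the same fields can be read directly from the struct stat pointer that is passed in, for example sb->st_mtim.tv_sec.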

What exactly is the FILE keyword in C?

I've started learning some C as a hobby and have blindly used FILE as a declaration for file pointers for quite some time, and I've been wondering. Is this a keyword or special data type for C to handle files with? Does it contain a stream to the file within and other data? Why is it defined as a pointer?
An example to show what I mean to make it a little more clear:
FILE* fp; //<-- this
fp = fopen("datum.txt", "r");
while(!feof(fp)) {
// etc.
}
Is this a keyword or special data type for C to handle files with?
What you are referring to is a typedef'd structure used by the standard I/O library to hold the data that fopen and its family of functions need.
Why is it defined as a pointer?
With a pointer to a struct, you can pass it as a parameter to a function. This is, for example, what fgets or fgetc will accept, in the form of function(FILE* fp).
The fopen function returns a pointer to a newly created FILE struct. Assigning this new pointer to your unused one makes both point to the same thing.
Does it contain a stream to the file within and other data?
The structure definition seems a little more elusive than its description. This is taken directly from my stdio.h, from MinGW32 5.1.4:
typedef struct _iobuf
{
    char* _ptr;
    int _cnt;
    char* _base;
    int _flag;
    int _file;
    int _charbuf;
    int _bufsiz;
    char* _tmpfname;
} FILE;
Which includes the lovely comment before it:
Some believe that nobody in their right mind should make use of the
internals of this structure.
The contents of this structure appear to vary greatly between implementations. The glibc sources usually have some form of commenting, but their structure for this is buried under a lot of code.
It would make sense to heed the aforementioned warning and just not worry what it does. :)
FILE is an identifier used as a typedef name, usually for a struct.
The stdio library usually has something like
typedef struct {
...
} FILE;
somewhere. All stdio functions dealing with FILE pointers know the contents of ... and can access the structure members. C programmers must use functions like fopen, feof, ferror, ungetc, etc. to create and operate on FILE structures. Such types are called opaque (i.e. you can't peek inside them but must use accessor functions).
Why is it defined as a pointer?
It isn't. It's a struct to which your code declares a pointer. Note the asterisk in your
FILE* fp;
which is another example of why the asterisk should go with the variable identifier, not the type name:
FILE *fp;
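To make the "opaque type" point concrete, here is a small sketch that uses a FILE only through library functions and never touches its members. The function name count_lines is hypothetical.

```c
#include <stdio.h>

/*
 * Count the newline characters in a text file.
 * Returns the count, or -1 if the file cannot be opened or read.
 */
long count_lines(const char *path)
{
    FILE *fp = fopen(path, "r");   /* the library allocates the FILE */
    long lines = 0;
    int c;                         /* int, so that EOF can be told apart */
    int failed;

    if (fp == NULL)
        return -1;

    /* Test the return value of fgetc rather than calling feof first. */
    while ((c = fgetc(fp)) != EOF) {
        if (c == '\n')
            lines++;
    }

    failed = ferror(fp);           /* tell a read error from end of file */
    fclose(fp);                    /* the library releases the FILE */
    return failed ? -1 : lines;
}
```

The caller only ever holds a FILE pointer, so the same code compiles unchanged whatever the structure looks like inside a given C library.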
It's not a keyword, it's a data type defined in the ANSI C standard to operate with files. It usually points to an internal structure that describes the file and its current state to the library functions.
It's a special data type. It contains a file handle as well as various flags used internally by the various stdio calls. You'll never need to actually know what's in it, just that it's a data type that you can pass around.
http://www.cplusplus.com/reference/clibrary/cstdio/FILE/
However if you're interested, here's what it looks like:
http://en.allexperts.com/q/C-1587/2008/5/FILE-Structure.htm
