Reading a directory


I'm trying to solve an exercise from K&R; it's about reading directories. The task is system dependent because it uses system calls. The authors say that their example is written for Version 7 and System V UNIX systems, and that it uses the directory information in the header <sys/dir.h>, which looks like this:
#ifndef DIRSIZ
#define DIRSIZ 14
#endif
struct direct { /* directory entry */
ino_t d_ino; /* inode number */
char d_name[DIRSIZ]; /* long name does not have '\0' */
};
On this system they use 'struct direct' combined with the 'read' function to retrieve a directory entry, which consists of a file name and an inode number.
.....
struct direct dirbuf; /* local directory structure */
while (read(dp->fd, (char *) &dirbuf, sizeof(dirbuf))
        == sizeof(dirbuf)) {
.....
}
.....
I suppose this works fine on UNIX and Linux systems, but what I want to do is modify this so it works on Windows XP.
Is there some structure in Windows like 'struct direct' so I can use it
with 'read' function and if there is what is the header name where it is
defined?
Or maybe Windows requires completely different approach?

There's nothing like that in Windows. If you want to enumerate a directory in Windows, you must use the FindFirstFile/FindNextFile API.

Yes, this works on Linux/Unix only. However if you are just playing around you may use Cygwin to build programs on Windows that use this Unix API.

It is interesting to note that you are using K&R 1st Edition, from 1978, and not the second edition. The second edition has different structures, etc, at that point in the book.
That code from the first edition does not work on many Unix-like systems any more. There are very few Unix machines left with file systems that restrict file names to the 14-character limit, and that code only works on those systems. Specifically, it does not work on MacOS X (10.6.2) or Solaris (10) or Linux (SuSE Linux Enterprise Edition 10, kernel 2.6.16.60-0.21-smp). With the test code shown below, the result is:
read failed: (21: Is a directory)
Current editions of POSIX explicitly permit the implementation to limit what you can do with a file descriptor opened on a directory. Basically, it can be used in the 'fchdir()' system calls and maybe a few relatives, but that is all.
To read the contents of the directory, you have to use the opendir() family of functions, and then readdir() etc. The second edition of K&R goes on to use these system calls instead of raw open() and read() etc.
All-in-all, it is not dreadfully surprising that code from over 30 years ago doesn't quite work the same as it used to.
On Windows, you can either use the POSIX sub-system or an emulation of that such as Cygwin or MinGW, and in either case you will need to use the opendir() and readdir() family of function calls, not direct open() and read() on the directory file descriptor.
Or you can use the native Windows API that BillyONeal references, FindFirstFile and relatives.
Test code:
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <errno.h>
int main(void)
{
    int fd = open(".", O_RDONLY);
    if (fd != -1)
    {
        char buffer[256];
        ssize_t n = read(fd, buffer, sizeof(buffer));
        if (n < 0)
        {
            int errnum = errno;
            printf("read failed: (%d: %s)\n", errnum, strerror(errnum));
        }
        else
            /* buffer holds raw bytes, not a string, so print only the count */
            printf("read OK: %d bytes\n", (int)n);
        close(fd);
    }
    return(0);
}

boost::filesystem::directory_iterator provides a portable equivalent of Windows FindFirstFile/FindNextFile API and POSIX readdir_r() API. See this tutorial.
Note that this is C++ not plain C.

Related

How to append "ifconfig" in a file .txt in C language?

I'm trying to get my own IP address with C.
The idea is to get the output of "ifconfig", put it in a .txt file and extract the inet and the inet6 values.
I'm stuck trying to write the ifconfig output to the .txt file:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main ()
{
    char command[1000];
    strcpy(command, "ifconfig");
    system(command);
    FILE *fp = fopen("textFile.txt", "ab+");
    //... how to write the output of 'system(command)' in the textFile.txt?
    fclose(fp);
    //... how to scrape a .txt file in C?
    return 0;
}
Thank you for any help and suggestion,
Stefano

You actually want to use system calls to accomplish this - rather than running the ifconfig command.
There's a similar question here for Linux: using C code to get same info as ifconfig
(Since ifconfig is a Linux command, I'm assuming you're asking about Linux).
The general gist was to use the ioctl() system call.
Otherwise, you'll be forking your process to split it into two, creating a pipe from the output of the child to the input of the parent, and calling exec on the child in order to replace it with "ifconfig". Then, you'll have to parse the string if you want to get anything useful out of it.
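The fork/pipe/exec sequence described above can be sketched roughly as follows. This is only an illustration under a few assumptions: capture_output is a made-up helper name, the command is found through PATH, and anything that does not fit in the caller's buffer is read and discarded.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with the NULL-terminated argument list
 * argv and copy up to cap-1 bytes of its standard output into out, which is
 * always NUL-terminated. Returns the child's exit status, or -1 on error. */
int capture_output(char *const argv[], char *out, size_t cap)
{
    int fds[2];
    pid_t pid;
    char chunk[256];
    size_t used = 0;
    ssize_t n;
    int status;

    if (cap == 0 || pipe(fds) == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: make the write end of the pipe our stdout, then exec. */
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
        execvp(argv[0], argv);
        _exit(127);                 /* only reached if exec failed */
    }

    /* Parent: read until the child closes its end of the pipe. */
    close(fds[1]);
    while ((n = read(fds[0], chunk, sizeof chunk)) > 0) {
        size_t room = cap - 1 - used;
        size_t take = (size_t)n < room ? (size_t)n : room;
        memcpy(out + used, chunk, take);   /* excess output is dropped */
        used += take;
    }
    out[used] = '\0';
    close(fds[0]);

    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with an argument list such as { "ifconfig", NULL } leaves the command's text in the buffer, which still has to be parsed afterwards.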

On Linux, man 7 netdevice describes the interface you can use, as Anish Goyal already answered.
This is very simple and robust, and uses very few resources (since it is just a few syscalls). Unfortunately, it is Linux-specific, and makes the code nonportable.
It is possible to do this portably. I describe the portable option here, because the Linux-specific one is rather trivial. Although the approach is quite complicated for just obtaining the local host IP addresses, the pattern is surprisingly often useful, because it can hide system-specific quirks, while easily allowing system administrators to customize the behaviour.
The idea for a portable solution is that you use a small helper program or shell script to obtain the information, and have it output the information in some easy-to-parse format to your main program. If your application is named yourapp, it is common to install such helpers in /usr/lib/yourapp/, say /usr/lib/yourapp/local-ip-addresses.
Personally, I'd recommend using a shell script (so that system admins can trivially edit the helpers if they need to customize the behaviour), and an output format where each interface is on its own line, fields separated by spaces, perhaps
inet interface-name ipv4-address [ hostname ]*
inet6 interface-name ipv6-address [ hostname ]*
i.e. first token specifies the address family, second token the interface name, third the address, optionally followed by the hostnames or aliases corresponding to that address.
As to the helper program/shell script itself, there are two basic approaches:
One-shot
For example, parsing LANG=C LC_ALL=C ip address output in Linux.
The program/script will exit after the addresses have been printed.
Continuous
The program/script will print the ip address information, but instead of exiting, it will run as long as the pipe stays open, providing updates if interfaces are taken down or come up.
In Linux, a program/script could use DBUS or NetworkManager to wait for such events; it would not need to poll (that is, repeatedly check the command output).
A shell script has the extra benefit that it can support multiple distributions, even operating systems (across POSIX systems at least), at the same time. Such scripts often have a similar outline:
#!/bin/sh
export LANG=C LC_ALL=C
case "$(uname -s)" in
    Linux)
        if [ -x /bin/ip ]; then
            /bin/ip -o address | awk \
                '/^[0-9]*:/ {
                    addr = $4
                    sub(/\/.*$/, "", addr)
                    printf "%s %s %s\n", $3, $2, addr
                }'
            exit 0
        elif [ -x /sbin/ifconfig ]; then
            /sbin/ifconfig | awk \
                'BEGIN {
                    RS = "[\t\v\f\r ]*\n"
                    FS = "[\t\v\f ]+"
                }
                /^[0-9A-Za-z]/ {
                    iface = $1
                }
                /^[\t\v\f ]/ {
                    if (length(iface) > 0)
                        for (i = 1; i < NF-1; i++)
                            if ($i == "inet") {
                                addr = $(i+1)
                                sub(/^addr:/, "", addr)
                                printf "inet %s %s\n", iface, addr
                            } else
                            if ($i == "inet6") {
                                addr = $(i+2)
                                sub(/\/.*$/, "", addr)
                                printf "inet6 %s %s\n", iface, addr
                            }
                }'
            exit 0
        fi
        ;;
    # Other systems?
esac
printf 'Cannot determine local IP addresses!\n'
exit 1
The script sets the locale to C/POSIX, so that the output of external commands will be in the default locale (in English and so on). uname -s provides the kernel name (Linux for Linux), and further checks can be done using e.g. the [ shell command.
I only implemented the scriptlet for Linux, because that's the machine I'm on right now. (Both ip and ifconfig alternatives work on my machine, and provide the same output -- although you cannot expect to get the interfaces in any specific order.)
The situations where a sysadmin might need to edit this particular helper script include as-yet-unsupported systems, systems with new core tools, and systems that have interfaces that should be excluded from normal interface lists (say, those connected to an internal sub-network that are reserved for privileged purposes like DNS, file servers, backups, LDAP, and so on).
In the C application, you simply execute the external program or shell script using popen("/usr/lib/yourapp/local-ip-addresses", "r"), which provides you a FILE * that you can read as if it was a file (except you cannot seek or rewind it). After you have read everything from the pipe, pclose() the handle:
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>
/* ... */
FILE *in;
char *line_ptr = NULL;
size_t line_size = 0;
ssize_t line_len;
int status;
in = popen("/usr/lib/yourapp/local-ip-addresses", "r");
if (!in) {
    fprintf(stderr, "Cannot determine local IP addresses: %s.\n", strerror(errno));
    exit(EXIT_FAILURE);
}
while (1) {
    line_len = getline(&line_ptr, &line_size, in);
    if (line_len < 1)
        break;
    /* Parse line_ptr */
}
free(line_ptr);
line_ptr = NULL;
line_size = 0;
if (ferror(in) || !feof(in)) {
    pclose(in);
    fprintf(stderr, "Read error while obtaining local IP addresses.\n");
    exit(EXIT_FAILURE);
}
status = pclose(in);
if (!WIFEXITED(status) || WEXITSTATUS(status)) {
    fprintf(stderr, "Helper utility /usr/lib/yourapp/local-ip-addresses failed to obtain local IP addresses.\n");
    exit(EXIT_FAILURE);
}
I omitted the parsing code, because there are so many alternatives -- strtok(), sscanf(), or even a custom function that splits the line into tokens (and populates an array of pointers) -- and my own preferred option is the least popular one (last one, a custom function).
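To give an idea of the last option, here is a minimal sketch of such a splitting function. split_tokens is a hypothetical name, and the sketch assumes the line may be modified in place and that tokens are separated by spaces, tabs, or line endings.

```c
#include <stddef.h>
#include <string.h>

/* Split line in place on spaces, tabs, CR and LF. Stores pointers to at
 * most max tokens in tok and returns how many were stored. Text after the
 * last stored token is ignored. */
size_t split_tokens(char *line, char **tok, size_t max)
{
    static const char sep[] = " \t\r\n";
    size_t n = 0;
    char *p = line;

    while (n < max) {
        p += strspn(p, sep);        /* skip leading separators */
        if (*p == '\0')
            break;
        tok[n++] = p;               /* start of the next token */
        p += strcspn(p, sep);       /* find the end of the token */
        if (*p == '\0')
            break;
        *p++ = '\0';                /* terminate it and move on */
    }
    return n;
}
```

With the format suggested earlier, tok[0] is then the address family, tok[1] the interface name, tok[2] the address, and any remaining tokens are hostnames.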
While the code needed for just this task is almost not worth the while, applications that use this approach tend to use several of such helpers. Then, of course, it makes sense to choose the piped data format in a way that allows easy parsing, but supports all use cases. The amortized cost is then much easier to accept, too.

open a windows file directory for reading/writing in c

I'm trying to write the contents of a Windows directory to a file using C. For example, if I had a directory of jpegs (i.e. a directory that contains multiple jpegs) and wanted to convert them to a .raw file, I have something like this:
#include <stdio.h>
#include <stdint.h>
#include <sys/stat.h>
#include <unistd.h>
#include <stdlib.h>
typedef uint8_t BYTE;
#define BLOCK (512*sizeof(BYTE))
int main(void)
{
    FILE * fd = fopen("C:\\jpegs", "r");
    if (fd == NULL) {
        fprintf(stderr, "Error opening device file.\n");
        return EXIT_FAILURE;
    }
    int block = BLOCK;
    FILE * fn = fopen("new.raw", "w+");
    void * buff = malloc(block);
    while(feof(fd) == 0) {
        fread(buff,block,1,fd);
        fwrite(buff,block,1,fn);
    }
    free(buff);
    fclose(fd);
    fclose(fn);
    return 0;
}
The problem is I don't think Windows directories are terminated with EOF. Does anyone have any ideas about how to solve this?

On Unix systems, although you can open a directory for reading, you can't really read from it unless you use the opendir(), readdir(), closedir() family of calls. You can't write to a directory on Unix; even superuser (root) can't do that. (The main reason for opening a directory, more usually with open() than fopen(), is so that you can use chdir() followed by fchdir() to get back to where you started, or use the various *at() functions, such as openat(), to reference the directory.)
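The chdir()/fchdir() use mentioned above might look roughly like this. It is only a sketch: peek_into_directory is a made-up helper, and calling getcwd() while inside the other directory stands in for whatever work you actually want to do there.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: step into dir, record the working directory seen
 * there in seen (cap bytes), then return to where we started.
 * Returns 0 on success, -1 on failure. */
int peek_into_directory(const char *dir, char *seen, size_t cap)
{
    int result = 0;
    int home = open(".", O_RDONLY);    /* bookmark the starting directory */

    if (home == -1) {
        fprintf(stderr, "open .: %s\n", strerror(errno));
        return -1;
    }
    if (chdir(dir) == -1) {
        fprintf(stderr, "chdir %s: %s\n", dir, strerror(errno));
        close(home);
        return -1;
    }
    if (getcwd(seen, cap) == NULL)     /* the "work" done inside dir */
        result = -1;

    /* The descriptor is never read from; it only lets us get back. */
    if (fchdir(home) == -1) {
        fprintf(stderr, "fchdir: %s\n", strerror(errno));
        result = -1;
    }
    close(home);
    return result;
}
```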
On Windows, you'd at minimum need to use "rb" mode, but frankly, I'd not expect you to be able to do much with it. There are probably analogues to the Unix opendir() functions in the Windows API, and you should use those instead.
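For what the question is actually after, one possible shape is to walk the directory and append each regular file to the output. The sketch below uses the POSIX opendir() family rather than the native Windows API, so on Windows it assumes a dirent.h is available (for example through Cygwin or MinGW). concat_directory is a hypothetical name, the order of the files is whatever readdir() returns, and the output file must not live inside the directory being read.

```c
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>

/* Append the contents of every regular file in dir to out_path.
 * Returns the number of files copied, or -1 on error. */
long concat_directory(const char *dir, const char *out_path)
{
    DIR *dp;
    struct dirent *ep;
    FILE *out;
    long copied = 0;

    dp = opendir(dir);
    if (dp == NULL)
        return -1;
    out = fopen(out_path, "wb");
    if (out == NULL) {
        closedir(dp);
        return -1;
    }

    while ((ep = readdir(dp)) != NULL) {
        char path[4096];
        char buf[512];
        struct stat st;
        FILE *in;
        size_t n;
        int len = snprintf(path, sizeof path, "%s/%s", dir, ep->d_name);

        if (len < 0 || (size_t)len >= sizeof path)
            continue;                       /* name too long, skip it */
        if (stat(path, &st) == -1 || !S_ISREG(st.st_mode))
            continue;                       /* skip ".", "..", subdirs */
        in = fopen(path, "rb");
        if (in == NULL)
            continue;
        /* Loop on what fread() returned instead of testing feof(). */
        while ((n = fread(buf, 1, sizeof buf, in)) > 0)
            fwrite(buf, 1, n, out);
        fclose(in);
        copied++;
    }
    closedir(dp);
    if (fclose(out) != 0)
        return -1;
    return copied;
}
```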

Reading a directory file in C

I'm trying to write a small program to show me the internal representation of a directory in Linux (Debian, specifically). The idea was a small C program using open(".", O_RDONLY), but this seems to give no output. The program is the following:
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
int main(int argc, char** argv)
{
    int fd = open(argv[1],O_RDONLY,0 );
    char buf;
    printf("%i\n",fd);
    while(read(fd, &buf, 1) > 0)
        printf("%x ", buf);
    putchar('\n');
}
When I run it on regular files it works as expected, but on a directory such as ".", it gives no output. The value of fd is 3 (as expected) but the call to read returns -1.
Why isn't this working, and how can I read the internal representation?
Thanks!

For handling directories, you need to use opendir/readdir/closedir. Read the corresponding man pages for more information.
To check whether a filename corresponds to a directory, you first need to call stat for the filename and check whether it's a directory (S_ISDIR(myStatStruc.st_mode)).
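The stat() check can be wrapped in a small function. A minimal sketch, where is_directory is just an illustrative name:

```c
#include <sys/stat.h>

/* Returns 1 if path names a directory, 0 if it names something else,
 * and -1 if stat() fails (for example because path does not exist). */
int is_directory(const char *path)
{
    struct stat st;

    if (stat(path, &st) == -1)
        return -1;
    return S_ISDIR(st.st_mode) ? 1 : 0;
}
```

A program like the one in the question could call this on argv[1] first and switch to opendir()/readdir() when it returns 1.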

Directories are a filesystem-specific representation and are part of the file system. On extfs, they are a table of string/inode pairs, unlike files, which have blocks of data (that you read using your code above).
To read directory-specific information in C, you need to use dirent.h.
Look at this page for more information
http://pubs.opengroup.org/onlinepubs/7908799/xsh/dirent.h.html
On POSIX systems, the system call "stat" would give you all the information about an inode on the filesystem (file/directory/etc.)

How to increase the limit of "maximum open files" in C on Mac OS X

The default limit for the max open files on Mac OS X is 256 (ulimit -n) and my application needs about 400 file handles.
I tried to change the limit with setrlimit() but even if the function executes correctly, I'm still limited to 256.
Here is the test program I use:
#include <stdio.h>
#include <sys/resource.h>
main()
{
    struct rlimit rlp;
    FILE *fp[10000];
    int i;
    getrlimit(RLIMIT_NOFILE, &rlp);
    printf("before %d %d\n", rlp.rlim_cur, rlp.rlim_max);
    rlp.rlim_cur = 10000;
    setrlimit(RLIMIT_NOFILE, &rlp);
    getrlimit(RLIMIT_NOFILE, &rlp);
    printf("after %d %d\n", rlp.rlim_cur, rlp.rlim_max);
    for(i=0;i<10000;i++) {
        fp[i] = fopen("a.out", "r");
        if(fp[i]==0) { printf("failed after %d\n", i); break; }
    }
}
and the output is:
before 256 -1
after 10000 -1
failed after 253
I cannot ask the people who use my application to poke inside a /etc file or something. I need the application to do it by itself.

rlp.rlim_cur = 10000;
Two things.
1st. LOL. Apparently you have found a bug in Mac OS X's stdio. If I fix up your program, add error handling, and replace fopen() with the open() syscall, I can easily reach the limit of 10000 (which is 240 fds below my 10.6.3 system's OPEN_MAX limit of 10240).
2nd. RTFM: man setrlimit. The max-open-files case has to be handled specially with respect to OPEN_MAX.
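As I read the Mac OS X man page, the soft limit for RLIMIT_NOFILE should not exceed min(OPEN_MAX, rlim_max). A minimal sketch of that clamping follows; raise_nofile_limit is a hypothetical helper, and the OPEN_MAX check is wrapped in #ifdef because not every system defines it.

```c
#include <limits.h>          /* OPEN_MAX, where the platform defines it */
#include <sys/resource.h>

/* Try to raise the soft RLIMIT_NOFILE limit to at least wanted, clamped to
 * the hard limit and to OPEN_MAX. Stores the resulting soft limit in
 * *result. Returns 0 on success, -1 on failure. */
int raise_nofile_limit(rlim_t wanted, rlim_t *result)
{
    struct rlimit rl;
    rlim_t target = wanted;

    if (getrlimit(RLIMIT_NOFILE, &rl) == -1)
        return -1;

    if (rl.rlim_max != RLIM_INFINITY && target > rl.rlim_max)
        target = rl.rlim_max;
#ifdef OPEN_MAX
    if (target > (rlim_t)OPEN_MAX)
        target = (rlim_t)OPEN_MAX;
#endif

    /* Only call setrlimit() if it would actually raise the soft limit. */
    if (rl.rlim_cur != RLIM_INFINITY && rl.rlim_cur < target) {
        rl.rlim_cur = target;
        if (setrlimit(RLIMIT_NOFILE, &rl) == -1)
            return -1;
        if (getrlimit(RLIMIT_NOFILE, &rl) == -1)
            return -1;
    }
    *result = rl.rlim_cur;
    return 0;
}
```

When printing the values, cast them, e.g. printf("%llu\n", (unsigned long long)limit), since rlim_t is usually wider than int; that is where the -1 in the question's output comes from.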

etresoft found the answer on the apple discussion board:
The whole problem here is your printf() function. When you call printf(), you are initializing internal data structures to a certain size. Then, you call setrlimit() to try to adjust those sizes. That function fails because you have already been using those internal structures with your printf(). If you use two rlimit structures (one for before and one for after), and don't print them until after calling setrlimit, you will find that you can change the limits of the current process even in a command line program. The maximum value is 10240.

For some reason (perhaps binary compatibility), you have to define _DARWIN_UNLIMITED_STREAMS before including <stdio.h>:
#define _DARWIN_UNLIMITED_STREAMS
#include <stdio.h>
#include <sys/resource.h>
main()
{
    struct rlimit rlp;
    FILE *fp[10000];
    int i;
    getrlimit(RLIMIT_NOFILE, &rlp);
    printf("before %d %d\n", rlp.rlim_cur, rlp.rlim_max);
    rlp.rlim_cur = 10000;
    setrlimit(RLIMIT_NOFILE, &rlp);
    getrlimit(RLIMIT_NOFILE, &rlp);
    printf("after %d %d\n", rlp.rlim_cur, rlp.rlim_max);
    for(i=0;i<10000;i++) {
        fp[i] = fopen("a.out", "r");
        if(fp[i]==0) { printf("failed after %d\n", i); break; }
    }
}
prints
before 256 -1
after 10000 -1
failed after 9997
This feature appears to have been introduced in Mac OS X 10.6.

This may be a hard limitation of your libc. Some versions of Solaris have a similar limitation because they store the fd as an unsigned char in the FILE struct. If this is the case for your libc as well, you may not be able to do what you want.
As far as I know, things like setrlimit only affect how many files you can open with open (fopen is almost certainly implemented in terms of open). So if this limitation is at the libc level, you will need an alternate solution.
Of course you could always not use fopen and instead use the open system call available on just about every variant of Unix.
The downside is that you have to use write and read instead of fwrite and fread, which don't do things like buffering (that's all done in your libc, not by the OS itself). So it could end up being a performance bottleneck.
Can you describe the scenario that requires 400 files open simultaneously? I am not saying that there is no case where that is needed. But, if you describe your use case more clearly, then perhaps we can recommend a better solution.

I know it sounds like a silly question, but do you really need 400 files open at the same time?
By the way, are you running this code as root?

Mac OS doesn't let us change the limit as easily as many other Unix-based operating systems do. We have to create two files
/Library/LaunchDaemons/limit.maxfiles.plist
/Library/LaunchDaemons/limit.maxproc.plist
describing the max proc and max file limits. The ownership of the files needs to be changed to 'root:wheel'.
This alone doesn't solve the problem: by default, recent versions of OS X enable System Integrity Protection, which is controlled with 'csrutil', and we need to disable it. To do that, reboot the Mac in recovery mode and disable it from the terminal using csrutil.
Now we can change the max open file handle limit from the terminal itself (even in normal boot mode).
This method is explained in detail in the following link. http://blog.dekstroza.io/ulimit-shenanigans-on-osx-el-capitan/
It works for OS X El Capitan and macOS Sierra.

How do you get a directory listing in C?

How do you scan a directory for folders and files in C? It needs to be cross-platform.

The following POSIX program will print the names of the files in the current directory:
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <sys/types.h>
#include <dirent.h>
int main (void)
{
    DIR *dp;
    struct dirent *ep;
    dp = opendir ("./");
    if (dp != NULL)
    {
        while ((ep = readdir (dp)) != NULL)
            puts (ep->d_name);
        (void) closedir (dp);
        return 0;
    }
    else
    {
        perror ("Couldn't open the directory");
        return -1;
    }
}
Credit: http://www.gnu.org/software/libtool/manual/libc/Simple-Directory-Lister.html
Tested in Ubuntu 16.04.

The strict answer is "you can't", as the very concept of a folder is not truly cross-platform.
On MS platforms you can use _findfirst, _findnext and _findclose for a 'c' sort of feel, and FindFirstFile and FindNextFile for the underlying Win32 calls.
Here's the C-FAQ answer:
http://c-faq.com/osdep/readdir.html

I've created an open source (BSD) C header that deals with this problem. It currently supports POSIX and Windows. Please check it out:
https://github.com/cxong/tinydir
tinydir_dir dir;
tinydir_open(&dir, "/path/to/dir");
while (dir.has_next)
{
    tinydir_file file;
    tinydir_readfile(&dir, &file);
    printf("%s", file.name);
    if (file.is_dir)
    {
        printf("/");
    }
    printf("\n");
    tinydir_next(&dir);
}
tinydir_close(&dir);

There is no standard C (or C++) way to enumerate files in a directory.
Under Windows you can use the FindFirstFile/FindNextFile functions to enumerate all entries in a directory. Under Linux/OSX use the opendir/readdir/closedir functions.
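One way to combine the two is to pick the API at compile time. The following is a rough sketch, not a complete portability layer: list_directory is a made-up name, the Windows branch uses the ANSI (A) variants of the Win32 calls, and it assumes the path fits in MAX_PATH.

```c
#include <stdio.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>
#else
#include <dirent.h>
#endif

/* Print every entry of dir except "." and "..".
 * Returns the number of entries printed, or -1 on error. */
int list_directory(const char *dir)
{
    int count = 0;
#ifdef _WIN32
    char pattern[MAX_PATH];
    WIN32_FIND_DATAA data;
    HANDLE h;
    int len = snprintf(pattern, sizeof pattern, "%s\\*", dir);

    if (len < 0 || (size_t)len >= sizeof pattern)
        return -1;
    h = FindFirstFileA(pattern, &data);
    if (h == INVALID_HANDLE_VALUE)
        return -1;
    do {
        if (strcmp(data.cFileName, ".") != 0 &&
            strcmp(data.cFileName, "..") != 0) {
            puts(data.cFileName);
            count++;
        }
    } while (FindNextFileA(h, &data));
    FindClose(h);
#else
    DIR *dp = opendir(dir);
    struct dirent *ep;

    if (dp == NULL)
        return -1;
    while ((ep = readdir(dp)) != NULL) {
        if (strcmp(ep->d_name, ".") != 0 &&
            strcmp(ep->d_name, "..") != 0) {
            puts(ep->d_name);
            count++;
        }
    }
    closedir(dp);
#endif
    return count;
}
```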

GLib is a portability/utility library for C which forms the basis of the GTK+ graphical toolkit. It can be used as a standalone library.
It contains portable wrappers for managing directories. See Glib File Utilities documentation for details.
Personally, I wouldn't even consider writing large amounts of C-code without something like GLib behind me. Portability is one thing, but it's also nice to get data structures, thread helpers, events, mainloops etc. for free
Jikes, I'm almost starting to sound like a sales guy :) (don't worry, glib is open source (LGPL) and I'm not affiliated with it in any way)

opendir/readdir are POSIX. If POSIX is not enough for the portability you want to achieve, check Apache Portable Runtime

Directory listing varies greatly according to the OS/platform under consideration. This is because different operating systems use their own internal system calls to achieve this.
A solution to this problem would be to look for a library that masks the differences and is portable. Unfortunately, there is no solution that works on all platforms flawlessly.
On POSIX-compatible systems, you could use the library to achieve this using the code posted by Clayton (which is referenced originally from the Advanced Programming in the UNIX Environment book by W. Richard Stevens). This solution will work under *NIX systems and would also work on Windows if you have Cygwin installed.
Alternatively, you could write code to detect the underlying OS and then call the appropriate directory listing function, which would hold the 'proper' way of listing the directory structure under that OS.

The most similar method to readdir is probably using the little-known _find family of functions.

You can find the sample code on the wikibooks link
/**************************************************************
 * A simpler and shorter implementation of ls(1)
 * ls(1) is very similar to the DIR command on DOS and Windows.
 **************************************************************/
#include <stdio.h>
#include <dirent.h>
int listdir(const char *path)
{
    struct dirent *entry;
    DIR *dp;
    dp = opendir(path);
    if (dp == NULL)
    {
        perror("opendir");
        return -1;
    }
    while((entry = readdir(dp)))
        puts(entry->d_name);
    closedir(dp);
    return 0;
}
int main(int argc, char **argv) {
    int counter = 1;
    if (argc == 1)
        listdir(".");
    while (++counter <= argc) {
        printf("\nListing %s...\n", argv[counter-1]);
        listdir(argv[counter-1]);
    }
    return 0;
}