How to check if a symbolic link refers to a directory - c

I am currently reimplementing the "ls" command as a learning exercise. While walking through files, I sometimes get an error when I try to open, as a "folder", the path that a symbolic link points to, because the target is not a directory (I thought all symbolic links pointed to folders).
How can I check whether it points to a directory? (I have read the manual pages for stat, dir ..)

I thought all symbolic links pointed to folders
Nope. A symbolic link is an indirect reference to another path. That other path can refer to any kind of file that can be represented in any mounted file system, or to no file at all (i.e. it can be a broken link).
How to check that it points to a directory?
You mention the stat() function, but for reimplementing ls you should mostly be using lstat(), instead. The difference is that when the specified path refers to a symbolic link, stat returns information about the link's target path, whereas lstat returns information about the link itself (including information about the file type, from which you can tell that it is a link).
In the event that you encounter a symbolic link, you can simply check the same path again with stat() to find out what kind of file it points to. stat() resolves symbolic links recursively and reports on the ultimate target; if the link is broken, stat() fails instead (typically with errno set to ENOENT). Either way, you don't need to distinguish between a broken link and any other form of non-directory for your particular purpose.
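As a rough sketch of the lstat()-then-stat() check described above: the function name is_symlink_to_dir is made up for illustration, and every failure (including a broken link) is collapsed into a plain false, which is an assumption that suits the "ls" use case rather than a general rule.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical helper: true only if `path` is itself a symbolic link
 * and the file it ultimately resolves to is a directory. */
bool is_symlink_to_dir(const char *path)
{
    struct stat st;

    assert(path != NULL);

    /* lstat() describes the link itself, so S_ISLNK can be true here. */
    if (lstat(path, &st) != 0 || !S_ISLNK(st.st_mode))
        return false;

    /* stat() follows the whole link chain; it fails for a broken link. */
    if (stat(path, &st) != 0)
        return false;

    return S_ISDIR(st.st_mode);
}
```

In an "ls" clone you would typically call lstat() once per entry anyway, and only make the second stat() call when that first result says the entry is a link.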

I just ran into the same problem, and here is my solution (C++):
#include <string>
#include <sys/stat.h>

bool IsDir(const char *path)
{
    std::string tmp = path;
    tmp += '/';  // the trailing slash makes lstat() resolve a symlink to its target
    struct stat statbuf;
    return (lstat(tmp.c_str(), &statbuf) >= 0) && S_ISDIR(statbuf.st_mode);
}
The key is the trailing / in the path.
However, I have no idea whether it's portable.

Related

stat alternative for long file paths

I'm writing a program that iterates through a directory tree depth first (similar to the GNU find program) by recursively constructing paths to each file in the tree and stores the relative paths of encountered files. It also collects some statistics about these files. For this purpose I'm using the stat function.
I've noticed that this fails for very deep directory hierarchies, i.e. long file paths, in accordance with stat's documentation.
Now my question is: what alternative approach could I use here that is guaranteed to work for paths of any length? (I don't need working code, just a rough outline would be sufficient).
As you are traversing, open each directory you traverse.
You can then get information about a file in that directory using fstatat. The fstatat function takes an additional parameter, dirfd. If you pass a handle to an open directory in that parameter, the path is interpreted as relative to that directory.
int fstatat(int dirfd, const char *pathname, struct stat *buf, int flags);
The basic usage is:
int dirfd = open("directory path", O_RDONLY);
struct stat st;
int r = fstatat(dirfd, "relative file path", &st, 0);
You can, of course, also use openat instead of open, as you recurse. And the special value AT_FDCWD can be passed as dirfd to refer to the current working directory.
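Putting openat and fstatat together, a traversal could look roughly like the sketch below. The function name count_files_at and the choice to count regular files are just for illustration; passing AT_SYMLINK_NOFOLLOW is my own assumption, made so that symbolic links are never followed into directories.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: counts regular files below directory `name`,
 * which is resolved relative to `parentfd`. Returns -1 on error. */
long count_files_at(int parentfd, const char *name)
{
    assert(name != NULL);

    int fd = openat(parentfd, name, O_RDONLY | O_DIRECTORY);
    if (fd < 0)
        return -1;

    DIR *dir = fdopendir(fd);            /* dir now owns fd */
    if (dir == NULL) {
        close(fd);
        return -1;
    }

    long total = 0;
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
            continue;

        struct stat st;
        /* Only the entry name is passed, so the path never grows. */
        if (fstatat(dirfd(dir), ent->d_name, &st, AT_SYMLINK_NOFOLLOW) != 0)
            continue;

        if (S_ISDIR(st.st_mode)) {
            long sub = count_files_at(dirfd(dir), ent->d_name);
            if (sub > 0)
                total += sub;
        } else if (S_ISREG(st.st_mode)) {
            total++;
        }
    }
    closedir(dir);                       /* also closes fd */
    return total;
}
```

Start it with count_files_at(AT_FDCWD, "some/dir"). Note that this keeps one descriptor open per directory level, so an extremely deep tree can run into the per-process file descriptor limit instead of the path length limit.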
Caveats
It is easy to get into symlink loops and recurse forever. It is not uncommon to find symlink loops in practice. On my system, /usr/bin/X11 is a symlink to /usr/bin.
Alternatives
There are easier ways to traverse file hierarchies. Use ftw or fts instead, if you can.

How to know the path of current binary file? [duplicate]

Is there a way in C/C++ to find the location (full path) of the current executed program?
(The problem with argv[0] is that it does not give the full path.)
To summarize:
On Unixes with /proc, a really straightforward and reliable way is:
readlink("/proc/self/exe", buf, bufsize) (Linux)
readlink("/proc/curproc/file", buf, bufsize) (FreeBSD)
readlink("/proc/self/path/a.out", buf, bufsize) (Solaris)
On Unixes without /proc (i.e. if the above fails):
If argv[0] starts with "/" (absolute path), this is the path.
Otherwise, if argv[0] contains "/" (relative path), append it to the cwd
(assuming it hasn't been changed yet).
Otherwise, search the directories in $PATH for an executable named argv[0].
Afterwards it may be reasonable to check whether the executable is actually a symlink.
If it is, resolve it relative to the symlink's directory.
This step is not necessary with the /proc method (at least on Linux).
There the proc symlink points directly to the executable.
Note that it is up to the calling process to set argv[0] correctly.
It is right most of the time; however, there are occasions when the calling process cannot be trusted (e.g. a setuid executable).
On Windows: use GetModuleFileName(NULL, buf, bufsize)
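For the Linux /proc variant, one detail that is easy to miss is that readlink() does not NUL-terminate the buffer. A minimal sketch, assuming Linux with /proc mounted (the function name exe_path and the treat-a-full-buffer-as-truncated policy are my own choices):

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, Linux only: copies the running executable's
 * path into buf. Returns 0 on success, -1 on failure or truncation. */
int exe_path(char *buf, size_t size)
{
    assert(buf != NULL);

    if (size == 0)
        return -1;

    ssize_t n = readlink("/proc/self/exe", buf, size - 1);
    if (n < 0)
        return -1;

    /* A completely filled buffer may mean the path was cut off. */
    if ((size_t)n == size - 1)
        return -1;

    buf[n] = '\0';   /* readlink() does not add the terminator */
    return 0;
}
```

Call it with a generously sized buffer, e.g. char buf[4096]; and retry with a larger one if it returns -1.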
Use GetModuleFileName() function if you are using Windows.
Please note that the following comments are unix-only.
The pedantic answer to this question is that there is no general way to answer this question correctly in all cases. As you've discovered, argv[0] can be set to anything at all by the parent process, and so need have no relation whatsoever to the actual name of the program or its location in the file system.
However, the following heuristic often works:
If argv[0] is an absolute path, assume this is the full path to the executable.
If argv[0] is a relative path, i.e. it contains a /, determine the current working directory with getcwd() and then append argv[0] to it.
If argv[0] is a plain word, search $PATH looking for argv[0], and append argv[0] to whatever directory you find it in.
Note that all of these can be circumvented by the process which invoked the program in question. Finally, you can use linux-specific techniques, such as mentioned by emg-2. There are probably equivalent techniques on other operating systems.
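The three-step heuristic above can be sketched as follows. This is only an illustration: guess_exe_path is a made-up name, using access() with X_OK as the "is it executable" test is my assumption, and the relative-path branch is only right if the working directory has not changed since startup.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: best-effort guess at the executable's path
 * from argv[0]. Returns 0 on success, -1 if nothing was found. */
int guess_exe_path(const char *argv0, char *out, size_t size)
{
    int n;

    assert(argv0 != NULL && out != NULL);

    if (argv0[0] == '/') {                    /* 1. absolute path */
        n = snprintf(out, size, "%s", argv0);
        return (n >= 0 && (size_t)n < size) ? 0 : -1;
    }

    if (strchr(argv0, '/') != NULL) {         /* 2. relative path */
        char cwd[4096];                       /* arbitrary buffer size */
        if (getcwd(cwd, sizeof cwd) == NULL)
            return -1;
        n = snprintf(out, size, "%s/%s", cwd, argv0);
        return (n >= 0 && (size_t)n < size) ? 0 : -1;
    }

    const char *path = getenv("PATH");        /* 3. plain word */
    if (path == NULL)
        return -1;

    while (*path != '\0') {
        size_t len = strcspn(path, ":");
        if (len == 0)                         /* empty entry: current dir */
            n = snprintf(out, size, "./%s", argv0);
        else
            n = snprintf(out, size, "%.*s/%s", (int)len, path, argv0);

        if (n >= 0 && (size_t)n < size && access(out, X_OK) == 0)
            return 0;

        path += len;
        if (*path == ':')
            path++;
    }
    return -1;
}
```

From main() you would call guess_exe_path(argv[0], buf, sizeof buf) as early as possible, before anything calls chdir().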
Even supposing that the steps above give you a valid path name, you still might not have the path name you actually want (since I suspect that what you actually want to do is find a configuration file somewhere). The presence of hard links means that you can have the following situation:
-- assume /app/bin/foo is the actual program
$ mkdir /some/where/else
$ ln /app/bin/foo /some/where/else/foo # create a hard link to foo
$ /some/where/else/foo
Now, the approach above (including, I suspect, /proc/$pid/exe) will give /some/where/else/foo as the real path to the program. And, in fact, it is a real path to the program, just not the one you wanted. Note that this problem doesn't occur with symbolic links which are much more common in practice than hard links.
In spite of the fact that this approach is in principle unreliable, it works well enough in practice for most purposes.
Not an answer actually, but just a note to keep in mind.
As we could see, the problem of finding the location of running executable is quite tricky and platform-specific in Linux and Unix. One should think twice before doing that.
If you need your executable location for discovering some configuration or resource files, maybe you should follow the Unix way of placing files in the system: put configs in /etc or /usr/local/etc or in the current user's home directory, and /usr/share is a good place to put your resource files.
On many systems that provide /proc you can check the symlink located at /proc/PID/exe. A few examples:
# file /proc/*/exe
/proc/1001/exe: symbolic link to /usr/bin/distccd
/proc/1023/exe: symbolic link to /usr/sbin/sendmail.sendmail
/proc/1043/exe: symbolic link to /usr/sbin/crond
Remember that on Unix systems the binary may have been removed since it was started. It's perfectly legal and safe on Unix. Last I checked Windows will not allow you to remove a running binary.
/proc/self/exe will still be readable, but it will not be a working symlink really. It will be... odd.
On Mac OS X, use _NSGetExecutablePath.
See man 3 dyld and this answer to a similar question.
For Linux you can find the /proc/self/exe way of doing things bundled up in a nice library called binreloc, you can find the library at:
http://autopackage.org/docs/binreloc/
I would
1) Use the dirname() function (documented alongside basename()) on the program's path: http://linux.die.net/man/3/basename
2) chdir() to that directory
3) Use getcwd() to get the current directory
That way you'll get the directory in a neat, full form, instead of ./ or ../bin/.
You may want to save and restore the current directory, if that is important for your program.

Obtaining absolute path of files in C

I need a method to obtain the absolute path of a file in the C programming language for an implementation of the UNIX 'cp' command. The objective is to show an error when the source path and destination path are the same.
There are multiple possibilities, for example:
cp file . // show error
cp ../file .
cp file file // show error
I haven't found a good method to solve this problem.
Converting comments into an answer.
Look up realpath() to get the 'real name' of a path, but it really isn't necessary. You can use stat() to see if the device and inode number are the same for two names.
Also note that if you have two files linked (for example, /home/user1/name1 and /home/user2/name2), the names might be different but still refer to the same file (and the links could be 'hard' or symbolic). You can detect their equivalence with stat() but not with realpath() — at least, not with realpath() if the link is a hard link.
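The device-and-inode comparison could be sketched like this (same_file is a made-up name, and returning -1 when either stat() fails is my own convention):

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical helper: 1 if both paths name the same file, 0 if they
 * name different files, -1 if either stat() call fails. */
int same_file(const char *a, const char *b)
{
    struct stat sa, sb;

    assert(a != NULL && b != NULL);

    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;

    /* An inode number is only unique within one device, so compare both. */
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

For 'cp', a destination that does not exist yet makes stat() fail, which simply means the two cannot be the same file. When the destination is a directory (as in cp file .), you would compare the source against the file name inside that directory, not against the directory itself.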

FindFirstFile and Junctions

I use this to get the content of directory foo: FindFirstFile(L"foo\\*", &findData). It works great when foo is a regular directory. However, when foo is a junction pointing to another directory (created with mklink /j foo C:\gah), FindFirstFile fails.
The docs have this to say: "If the path points to a symbolic link, the WIN32_FIND_DATA buffer contains information about the symbolic link, not the target." But when I run it in the debugger I just get an INVALID_HANDLE_VALUE and findData remains untouched.
So, how do I work around this?
Raymond Chen has an answer for you.
Functions like GetFileAttributes and FindFirstFile, when asked to
provide information about a symbolic link, returns information about
the link itself and not the link destination. If you use the
FindFirstFile function, you can tell that you have a symbolic link
because the file attributes will have the
FILE_ATTRIBUTE_REPARSE_POINT flag set, and the dwReserved0 member
will contain the special value IO_REPARSE_TAG_SYMLINK.
Okay, great, so now I know I have a symbolic link, but what if I want
information about the link target? For example, I want to know the
size of the link target, its last-modified time, and its name.
To do this, you open the symbolic link. The I/O manager dereferences
the symbolic link and gives you a handle to the link destination. You
can then call functions like GetFileSize,
GetFileInformationByHandleEx, or GetFinalPathNameByHandle to obtain
information about the symbolic link target.

How to find out if a file or directory exists?

I am trying to make a simple program that handles files and directories, but I have two major problems:
how can I check whether a file or directory exists or not, and
how do I know if it is a file, directory, symbolic link, device, named pipe etc.? Mainly file and directories matter for now, but I'd like to know the others too.
EDIT: To all of those who are suggesting to use stat() or a similar function, I have already looked into that, and while it might answer my first question, I can't figure out how it would answer the second...
Since you're inquiring about named pipes/symlinks etc., you're probably on *nix, so use the lstat() function:
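#include <errno.h>
#include <sys/stat.h>

/* Wrapper function added for illustration; classifies `name` without
 * following symbolic links. */
void classify(const char *name)
{
    struct stat info;
    if (lstat(name, &info) != 0) {
        if (errno == ENOENT) {
            // doesn't exist
        } else if (errno == EACCES) {
            // we don't have permission to know if
            // the path/file exists.. impossible to tell
        } else {
            // general error handling
        }
        return;
    }
    // so, it exists.
    if (S_ISDIR(info.st_mode)) {
        // it's a directory
    } else if (S_ISFIFO(info.st_mode)) {
        // it's a named pipe
    } else if (S_ISLNK(info.st_mode)) {
        // it's a symbolic link (visible only because lstat() was used)
    } else if (S_ISREG(info.st_mode)) {
        // it's a regular file
    } else {
        // device or socket: test with S_ISBLK, S_ISCHR, S_ISSOCK
    }
}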
See the docs here for the S_ISXXX macros you can use.
The stat() function should give you everything you are looking for (or more specifically lstat() since stat() will follow the link).
Use stat (or if you wish to get information about a symbolic link instead of following it and getting information about the destination, lstat)
NAME
stat - get file status
SYNOPSIS
#include <sys/stat.h>
int stat(const char *restrict path, struct stat *restrict buf);
DESCRIPTION
The stat() function shall obtain information about the named file and write it to the area pointed to by the buf argument. The path argument points to a pathname naming a file. Read, write, or execute permission of the named file is not required. An implementation that provides additional or alternate file access control mechanisms may, under implementation-defined conditions, cause stat() to fail. In particular, the system may deny the existence of the file specified by path.
If the named file is a symbolic link, the stat() function shall continue pathname resolution using the contents of the symbolic link, and shall return information pertaining to the resulting file if the file exists.
The buf argument is a pointer to a stat structure, as defined in the <sys/stat.h> header, into which information is placed concerning the file.

Resources