How to retrieve a file path relative to a given directory in C

I'm looking for an efficient way to convert an absolute file path to a path relative to a specific directory.
Let's say we have the following structure:
D:\main\test1\blah.txt
D:\test2\foo.txt
With "D:\main" being the reference directory, the result would be:
blah.txt => "test1\blah.txt"
foo.txt => "..\test2\foo.txt"
Any clue?
Notes for the record:
It seems that:
there is no unified (cross-platform) API function for this
this question has been asked several times for other languages (though most answers rely on the PathRelativePathTo function):
How to get relative path from absolute path
Getting a file path relative to a particular directory
How do I get a relative path from one path to another in C#

Your example uses Windows paths. So, if it is acceptable for you to use WinAPI functions, you can use PathRelativePathTo (declared in shlwapi.h).

Here is the shortest solution I could figure out.
The algorithm is actually quite simple:
Given 1) a reference path (the path the result will be relative to); and 2) an absolute path (the full path of a file):
while the path parts are equal: skip them
when we come across a difference:
add a ".." for each remaining part of the reference path
add the remaining parts of the absolute path
The only limitation under Windows is the case of distinct volumes (the drive letters differ), in which case we have no choice but to return the original absolute path.
Cross-platform C source:
#include <stdio.h>  /* FILENAME_MAX */
#include <stdlib.h> /* free */
#include <string.h> /* strdup, strchr, strcmp, strcat, strlen */
#if (defined(WIN32) || defined(_WIN32) || defined(__WIN32)) && !defined(__CYGWIN__)
const char* path_separator = "\\";
#else
const char* path_separator = "/";
#endif
// returns a static buffer (not thread-safe, no bounds checking),
// or absolute_path itself when no relative path exists
char* get_relative_path(char* reference_path, char* absolute_path) {
static char relative_path[FILENAME_MAX];
// init result string
relative_path[0] = '\0';
// check first char (under windows, if differs, we return absolute path)
if(absolute_path[0] != reference_path[0]) {
return absolute_path;
}
// make copies to prevent altering original strings
char* path_a = strdup(absolute_path);
char* path_r = strdup(reference_path);
int inc;
int size_a = strlen(path_a)+1;
int size_r = strlen(path_r)+1;
// skip the leading parts that are equal in both paths
for(inc = 0; inc < size_a && inc < size_r; inc += strlen(path_a+inc)+1) {
char* token_a = strchr(path_a+inc, path_separator[0]);
char* token_r = strchr(path_r+inc, path_separator[0]);
if(token_a) token_a[0] = '\0';
if(token_r) token_r[0] = '\0';
if(strcmp(path_a+inc, path_r+inc) != 0) break;
}
// the copies were only needed for the comparison
free(path_a);
free(path_r);
// add a ".." for each remaining part of the reference path
for(int inc_r = inc; inc_r < size_r && reference_path[inc_r] != '\0'; ) {
strcat(relative_path, "..");
strcat(relative_path, path_separator);
const char* next = strchr(reference_path+inc_r, path_separator[0]);
if( !next ) break;
inc_r = (int)(next - reference_path) + 1;
}
// add the remaining parts of the absolute path
if(inc < size_a) strcat(relative_path, absolute_path+inc);
return relative_path;
}

First, you need to identify the file path separator, as stated here.
const char kPathSeparator =
#ifdef _WIN32
'\\';
#else
'/';
#endif
Then, you need to write a function that computes a canonical absolute file path. You will have to use #ifdef _WIN32 again because Windows needs specific treatment (add the current drive at the beginning of the path if none is present).
After that, remove every . component from the path, and remove every .. component together with the directory that precedes it.
Once this function is written, use it twice to get the canonical absolute paths of the origin and the target. Then, as @Weather Vane explains, identify the part the two paths have in common, emit one .. for each remaining component of the origin path, and append the remaining part of the target canonical path.
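The "remove . and .." step described above can be sketched as follows. This is only a minimal sketch under stated assumptions: it handles '/'-separated absolute paths, works purely on the string (it does not resolve symbolic links or prepend a Windows drive), and normalize_path is a hypothetical name, not a standard function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: lexically normalizes an absolute '/'-separated path.
   Returns 0 on success, -1 if the input is not absolute or out is too small. */
int normalize_path(const char *in, char *out, size_t outsz)
{
    size_t len = 0;                  /* length of the result built so far */

    if (in == NULL || out == NULL || in[0] != '/' || outsz < 2)
        return -1;

    while (*in != '\0') {
        while (*in == '/')           /* skip one or more separators */
            in++;
        size_t n = strcspn(in, "/"); /* length of the next component */
        if (n == 0)
            break;                   /* only trailing separators were left */

        if (n == 1 && in[0] == '.') {
            /* "." : stay in the current directory */
        } else if (n == 2 && in[0] == '.' && in[1] == '.') {
            /* ".." : drop the last component, never going above "/" */
            while (len > 0 && out[--len] != '/')
                ;
        } else {
            if (len + 1 + n + 1 > outsz)
                return -1;           /* result would not fit */
            out[len++] = '/';
            memcpy(out + len, in, n);
            len += n;
        }
        in += n;
    }

    if (len == 0)
        out[len++] = '/';            /* everything cancelled out: root */
    out[len] = '\0';
    return 0;
}
```

Calling it on both the origin and the target gives two strings that can then be compared component by component.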

Related

fopen function returns NULL when given an existing path

When trying to open a file with fopen(path, "2"); I get NULL on an existing path.
I've tried entering only the file name and it works, but I want the program to write the file in the given path...
Yes, I write the path with double backslashes "\\" when it's necessary.
Yes, the path exists without a doubt.
FILE* log;
char directory_path[PATH_LEN] = { 0 };
char directory_file[PATH_LEN] = { 0 };
//directory_path is the directory, entered by the user
//LOG_NAME is the files name without the path - "log.txt"
//#define PATH_LEN 100
printf("Folder to scan: ");
fgets(directory_path, PATH_LEN, stdin);
directory_path[strlen(directory_path) - 1] = 0;
//this section connects the path with the file name.
strcpy(directory_file, directory_path);
strcat(directory_file, "\\");
strcat(directory_file, LOG_NAME);
if ((log = fopen(directory_file, "w")) == NULL)
{
printf("Error");
}
My program worked until I tried to write into a file in order to create a log file. This means that the path is correct without doubt.
Can anyone tell me what the problem is here?
You have several issues in your code:
For one, fopen(path, "2"); is not valid.
The mode argument must begin with one of r, w, or a, and can optionally include b or +.
As another thing, directory_path[strlen(directory_path) - 1] = 0; assumes the last character is the newline stored by fgets. If the input is PATH_LEN - 1 characters or longer, fgets stores no newline and this line cuts off the last character of the path instead.
There is also a possible buffer overflow, because you copy a string into a buffer of the same size and then concatenate two other strings to it. Therefore, you should change this line:
char directory_file[PATH_LEN] = { 0 };
to this:
char directory_file[PATH_LEN+sizeof(LOG_NAME)+1] = { 0 };
To debug this issue, you should print the string entered and ask for confirmation before using it (wrap this in #ifdef DEBUG).
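The points above can be put together in a small sketch. It assumes hypothetical helper names (strip_newline, join_path, open_log) that are not part of the original program: the newline is removed only if it is actually there, the path is assembled with a bounds check, and a failed fopen reports its reason.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: removes a trailing newline, if there is one. */
void strip_newline(char *s)
{
    s[strcspn(s, "\n")] = '\0';
}

/* Hypothetical helper: writes dir + sep + file into out.
   Returns 0 on success, -1 if the result did not fit. */
int join_path(char *out, size_t outsz, const char *dir, char sep, const char *file)
{
    int n = snprintf(out, outsz, "%s%c%s", dir, sep, file);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}

/* Hypothetical helper: opens the log for writing and reports why it failed. */
FILE *open_log(const char *path)
{
    FILE *fp = fopen(path, "w");     /* "w" is a valid mode, "2" is not */
    if (fp == NULL)
        perror(path);                /* prints the reason reported by the system */
    return fp;
}
```

In the question's code, strip_newline(directory_path) would replace the strlen-based line, and join_path(directory_file, sizeof directory_file, directory_path, '\\', LOG_NAME) would replace the strcpy/strcat sequence.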

Linux C: how to convert two full paths to a relative path (updated) [duplicate]

Given two absolute paths, e.g.
/a/path/to/a
/a/path/to/somewhere/else
How can I get a relative path from one to the other, ../a?
In a sense, the opposite of what realpath does.
Find the longest common path (in this case, /a/path/to) and delete it from both absolute paths. That would give:
/a
/somewhere/else
Now, replace each path component in the starting path with ../ and prepend the result to the destination path. If you want to go from directory else to directory a, that would give you:
../../a
If you want to go the other way, you'd instead have:
../somewhere/else
I answered a similar question here: Resolving a relative path without referencing the current directory on Windows.
There is no standard function for this. There is a function in vi-like-emacs for this purpose. A quick check of apropos relative shows me a few other programs which likely implement this (revpath, for example).
It could be done as a string-manipulation (no need to compute working directories):
start by finding the longest common prefix which ends with a path-separator.
if there is no common prefix, you are done
strip the common prefix from (a copy of...) the current and target strings
replace each directory-name in the current string with ".."
add that (with a path-separator) in front of the target string
return that combined string
The "done" in the second step presumes that you want to use a relative path to shorten the result. On the other hand, you might want to use a relative pathname regardless of the length. In that case, just skip the step (the result will be longer, but relative).
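The string-manipulation steps listed above can be sketched as follows. This is a minimal sketch, not a reference implementation: it assumes both inputs are canonical absolute '/'-separated paths (no trailing '/', no "." or ".." components), it always produces a relative result (the variant that skips the second step), and the names append and make_relative are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends s to out, tracking the length in *len.
   Returns -1 if the result would not fit. */
static int append(char *out, size_t outsz, size_t *len, const char *s)
{
    size_t n = strlen(s);
    if (*len + n + 1 > outsz)
        return -1;
    memcpy(out + *len, s, n + 1);    /* copies the terminating '\0' too */
    *len += n;
    return 0;
}

/* Sketch: writes into out the path of `to` relative to the directory `from`. */
int make_relative(const char *from, const char *to, char *out, size_t outsz)
{
    size_t i = 0, common = 0, len = 0;

    if (from == NULL || to == NULL || out == NULL || outsz == 0)
        return -1;
    if (from[0] != '/' || to[0] != '/')
        return -1;
    out[0] = '\0';

    /* 1. longest common prefix that ends on a component boundary */
    while (from[i] != '\0' && from[i] == to[i]) {
        if (from[i] == '/')
            common = i;
        i++;
    }
    if ((from[i] == '\0' || from[i] == '/') && (to[i] == '\0' || to[i] == '/'))
        common = i;                  /* one path is a whole-component prefix */

    /* 2. one ".." for each component left in `from` */
    for (const char *p = from + common; *p != '\0'; p++) {
        if (p[0] == '/' && p[1] != '\0') {
            if (len > 0 && append(out, outsz, &len, "/") != 0)
                return -1;
            if (append(out, outsz, &len, "..") != 0)
                return -1;
        }
    }

    /* 3. then whatever is left of `to` below the common directory */
    const char *rest = to + common;
    if (*rest == '/')
        rest++;
    if (*rest != '\0') {
        if (len > 0 && append(out, outsz, &len, "/") != 0)
            return -1;
        if (append(out, outsz, &len, rest) != 0)
            return -1;
    }

    if (len == 0)                    /* both paths name the same directory */
        return append(out, outsz, &len, ".");
    return 0;
}
```

Stopping the prefix only at a separator or at the end of a string matters: without that check, /a/b and /a/bc would be treated as sharing the component "b".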
Build a tree with the first absolute path, then add the second path to that tree, and then walk from one leaf to the other: a step from one node to its parent is translated to a "../" sequence, and a step from a node to one of its children is translated to the name of that child. Notice that there might be more than one solution. For example:
1) /a/path/to/a
And
2) /a/path/to/a/new/one
The obvious path from (1) to (2) is new/one, but ../../../a/path/to/a/new/one is also valid. When you write the algorithm that walks your tree, you have to be aware of this.
Using cwalk you can use cwk_path_get_relative, which even works cross-platform:
#include <cwalk.h>
#include <stdio.h>
#include <stddef.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
char buffer[FILENAME_MAX];
cwk_path_get_relative("/hello/there/", "/hello/world", buffer, sizeof(buffer));
printf("The relative path is: %s", buffer);
return EXIT_SUCCESS;
}
Output:
The relative path is: ../world
This is implemented in the "ln" command, part of the GNU Coreutils package (for ln -r). Obviously there are many ways to go about this, and one could even derive some benefit from coming up with a solution oneself, without looking at existing code. Personally I find the code in Coreutils to be rather instructive.
If I had to convert absolute paths to a relative path in a C project, I would just copy "relpath.c" from Coreutils. It has a few dependencies on other utility functions in the package; these would need to be brought in in some form as well. Here is the main "relpath()" function. Note that it works on canonicalized pathnames; for example, it doesn't like paths that contain things like "//" or "/.". Basically, it finds the common prefix of the two paths, then it handles the four cases that arise depending on whether the remaining part of each path is empty or not.
/* Output the relative representation if possible.
If BUF is non-NULL, write to that buffer rather than to stdout. */
bool
relpath (const char *can_fname, const char *can_reldir, char *buf, size_t len)
{
bool buf_err = false;
/* Skip the prefix common to --relative-to and path. */
int common_index = path_common_prefix (can_reldir, can_fname);
if (!common_index)
return false;
const char *relto_suffix = can_reldir + common_index;
const char *fname_suffix = can_fname + common_index;
/* Skip over extraneous '/'. */
if (*relto_suffix == '/')
relto_suffix++;
if (*fname_suffix == '/')
fname_suffix++;
/* Replace remaining components of --relative-to with '..', to get
to a common directory. Then output the remainder of fname. */
if (*relto_suffix)
{
buf_err |= buffer_or_output ("..", &buf, &len);
for (; *relto_suffix; ++relto_suffix)
{
if (*relto_suffix == '/')
buf_err |= buffer_or_output ("/..", &buf, &len);
}
if (*fname_suffix)
{
buf_err |= buffer_or_output ("/", &buf, &len);
buf_err |= buffer_or_output (fname_suffix, &buf, &len);
}
}
else
{
buf_err |= buffer_or_output (*fname_suffix ? fname_suffix : ".",
&buf, &len);
}
if (buf_err)
error (0, ENAMETOOLONG, "%s", _("generating relative path"));
return !buf_err;
}
You can view the rest of "relpath.c" here. There is a higher-level wrapper function in "ln.c" which canonicalizes its path name arguments before calling relpath; it is named convert_abs_rel(). That's probably what you want to be calling most of the time.

Check if a file is a specific type in C

I'm writing my first C program, though I come from a C++ background.
I need to iterate through a directory of files and check to see if the file is a header file, and then return the count.
My code is as follows, it's pretty rudimentary I think:
static int CountHeaders( const char* dirname ) {
int header_count = 0;
DIR* dir_ptr;
struct dirent* entry;
dir_ptr = opendir( dirname );
while( ( entry = readdir( dir_ptr ) ) )
{
if ( entry->d_type == DT_REG )
{
//second if statement to verify the file is a header file should be???
++header_count;
}
}
closedir( dir_ptr );
return header_count;
}
What would be a good if statement to check to see if the file is a header?
Simply check if the file extension is .h, something like:
const char *ext = strrchr (entry->d_name, '.');
if ((ext != NULL) && (!strcmp (ext+1, "h"))) {
// header file
}
Of course, note that this assumes all your header files have a .h extension, which may or may not be true: the C standard does not mandate that header files have a .h extension.
Each dirent structure has a d_name containing the name of the file, so I'd be looking to see if that followed some pattern, like ending in .h or .hpp.
That would be code along the lines of:
int len = strlen (entry->d_name);
if ((len >= 2) && (strcmp (&(entry->d_name[len - 2]), ".h") == 0))
header_count++;
if ((len >= 4) && (strcmp (&(entry->d_name[len - 4]), ".hpp") == 0))
header_count++;
Of course, that won't stop truly evil people from calling their executables ha_ha_fooled_you.hpp but thankfully they're in the minority.
You may even want to consider an endsWith() function to make your life easier:
int endsWith (char *str, char *end) {
size_t slen = strlen (str);
size_t elen = strlen (end);
if (slen < elen)
return 0;
return (strcmp (&(str[slen-elen]), end) == 0);
}
:
if (endsWith (entry->d_name, ".h")) header_count++;
if (endsWith (entry->d_name, ".hpp")) header_count++;
There are some much better methods than checking the file extension.
Wikipedia has a good article here and here. The latter idea is called the magic number database, which essentially means that if a file contains a given byte sequence then it is of the matching type listed in the database. Sometimes the sequence must appear at a fixed location and sometimes it doesn't. This method is, IMO, more accurate than file extension detection, albeit slower.
But then again, for something as simple as checking to see if it's a header, this may be a bit of overkill XD
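To make the magic number idea concrete, here is a minimal sketch that tests a buffer against one well-known signature, the 8 bytes that start every PNG file. The helper name has_magic is hypothetical, and the sketch only illustrates the technique: plain-text files such as C headers have no signature of this kind, so it would not help with the original question.

```c
#include <stddef.h>
#include <string.h>

/* The 8-byte signature found at the start of every PNG file. */
static const unsigned char PNG_MAGIC[8] = {
    0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A
};

/* Hypothetical helper: returns 1 if the first bytes of data match the
   magic sequence, 0 otherwise (including when data is too short). */
int has_magic(const unsigned char *data, size_t len,
              const unsigned char *magic, size_t magic_len)
{
    return len >= magic_len && memcmp(data, magic, magic_len) == 0;
}
```

In practice the data buffer would hold the first few bytes read from the file with fread.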
You could check if the last few characters are one of the header-file extensions, .h, .hpp, etc. Use the dirent struct's d_name for the name of the file.
Or, you could run the 'file' command and parse its result.
You probably just want to check the file extension. Using dirent, you would want to look at d_name.
That's up to you.
The easiest way is to just look at the filename (d_name), and check whether it ends with something like ".h" or ".hpp" or whatever.
Opening the file and actually reading it to see if it's valid c/c++, on the other hand, will be A LOT more complex... you could run it through a compiler, but not every header works on its own, so that test will give you a lot of false negatives.
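The filename check suggested in the answers above can also be written against a small table of extensions. This is a sketch under assumptions: is_header_name is a hypothetical helper, and the extension list is only an example to adapt.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if name ends in one of the listed header
   extensions, 0 otherwise. The list is an assumption; extend it as needed. */
int is_header_name(const char *name)
{
    static const char *const exts[] = { ".h", ".hpp", ".hh", ".hxx" };
    const char *dot = strrchr(name, '.');   /* last dot in the name */

    if (dot == NULL || dot == name)
        return 0;        /* no extension, or a dot-file such as ".h" */

    for (size_t i = 0; i < sizeof exts / sizeof exts[0]; i++) {
        if (strcmp(dot, exts[i]) == 0)
            return 1;
    }
    return 0;
}
```

Inside the readdir loop, the second condition would then read if ( is_header_name( entry->d_name ) ), so each file is counted at most once.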

a recursive function to manipulate a given path

I am working on modifying the didactic OS xv6 (written in C) to support symbolic links (AKA shortcuts).
A symbolic link is a file of type T_SYM that contains a path to its destination.
To do that, I wrote a recursive function that takes a path and a buffer and fills the buffer with the "real" path (i.e. if the path contains a link, it should be replaced by the real path, and a link can occur at any level in the path).
Basically, if I have a path a/b/c/d, and a link from f to a/b, the following operations should be equivalent:
cd a/b/c/d
cd f/c/d
Now, the code is written, but the problem I am trying to solve is that of a path starting with "/" (meaning that the path is absolute and not relative).
Right now, if I run it with a path named /dir1, it treats it like dir1 (relative instead of absolute).
This is the main function; it calls the recursive function.
pathname is the given path, buf will contain the real path.
int readlink(char *pathname, char *buf, size_t bufsize){
char name[DIRSIZ];
char realpathname[100];
memset(realpathname,0,100);
realpathname[0] = '/';
if(get_real_path(pathname, name, realpathname, 0, 0)){
memmove(buf, realpathname, strlen(realpathname));
return strlen(realpathname);
}
return -1;
}
This is the recursive part.
The function returns an inode structure (which represents a file or directory in the system). It builds the real path inside realpath.
ilock and iunlock are used to access the inode safely.
struct inode* get_real_path(char *path, char *name, char* realpath, int position){
struct inode *ip, *next;
char buf[100];
char newpath[100];
if(*path == '/')
ip = iget(ROOTDEV, ROOTINO);// ip gets the root directory
else
ip = idup(proc->cwd); // ip gets the current working directory
while((path = skipelem(path, name)) != 0){ // name will get the next directory in the path, path will get the rest of the directories
ilock(ip);
if(ip->type != T_DIR){ // if ip is not a directory
realpath[position-1] = '\0';
iunlockput(ip);
return 0;
}
if((next = dirlookup(ip, name, 0)) == 0){//next will get the inode of the next directory
realpath[position-1] = '\0';
iunlockput(ip);
return 0;
}
iunlock(ip);
ilock(next);
if (next->type == T_SYM){ //if next is a symbolic link
readi(next, buf, 0, next->size); //buf contains the path inside the symbolic link (which is a path)
buf[next->size] = 0;
iunlockput(next);
next = get_real_path(buf, name, newpath, 0);//call it recursively (might still be a symbolic link)
if(next == 0){
realpath[position-1] = '\0';
iput(ip);
return 0;
}
name = newpath;
position = 0;
}
else
iunlock(next);
memmove(realpath + position, name, strlen(name));
position += strlen(name);
realpath[position++]='/';
realpath[position] = '\0';
iput(ip);
ip = next;
}
realpath[position-1] = '\0';
return ip;
}
I have tried many ways to do it right, but with no success. If anyone sees the problem, I'd be happy to hear the solution.
Thanks,
Eyal
I think it's clear that after running get_real_path(pathname, name, realpathname, 0, 0) the realpathname cannot possibly start with a slash.
Provided the function executes successfully, the memmove(realpath + position, name, strlen(name)) ensures that realpath starts with name, as the position variable always contains zero at the first invocation of memmove.
I'd suggest something like
if(*path == '/') {
ip = iget(ROOTDEV, ROOTINO); // ip gets the root
realpath[position++] = '/';
} else
ip = idup(proc->cwd); // ip gets the current working directory
P.S. I'm not sure why you put a slash into the realpathname before executing the get_real_path, since at this point you don't really know whether the path provided is an absolute one.
OK, found the problem...
The problem was deeper than I thought...
Somehow the realpath was sometimes changed for no visible reason... but the reason was the line:
name = newpath;
The solution was to change that line to
strcpy(name,newpath);
The previous line made a binding between name and realpath, which would be OK if we were not dealing with soft links. When dereferencing a subpath, this binding ruined everything.
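The difference between the two lines can be shown in isolation. The following is not xv6 code, only a minimal sketch with a hypothetical function name_survives: assigning the pointer makes name share newpath's buffer, so a later write to newpath is visible through name, while strcpy gives name its own copy of the bytes.

```c
#include <string.h>

/* Returns 1 if name still reads "a/b" after newpath is overwritten.
   copy_first != 0 copies the bytes (strcpy); 0 only aliases the pointer. */
int name_survives(int copy_first)
{
    char newpath[16] = "a/b";
    char storage[16] = "";
    char *name = storage;

    if (copy_first)
        strcpy(name, newpath);   /* name keeps its own copy of the bytes */
    else
        name = newpath;          /* name now points at newpath's buffer */

    strcpy(newpath, "x/y/z");    /* a later write to newpath */
    return strcmp(name, "a/b") == 0;
}
```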
Thanks for the attempts
