How to test existence of 'file' that cannot be accessed? - c

I have a full path stored in char full_path[1000] and I'd like to know whether anything exists at that location.
(The next thing my code would do after this check is create something at that location, but I want it to count as an error if there is already something there instead of clearing the spot with the equivalent of rm -rf)
The spot might be occupied by a file or a directory or even a link to some no-longer-existing-target e.g.:
lrwxrwxrwx 1 user grp 4 Jun 16 20:02 a-link -> non-existent-thing
With an invalid link, access(full_path, F_OK) is going to tell me I can't access full_path and that could be because (1) nothing exists there or because (2) the link is invalid.
Given that, what's the best way to determine if anything exists at full_path?

There is no obvious cross-platform way to do that. stat() and fopen() will not work here, because both follow the link. You can, though, use the OS API. On Windows, for example, you could use WinAPI like this:
int doesFileExist(TCHAR* path)
{
WIN32_FIND_DATA FindFileData;
HANDLE handle = FindFirstFile(path, &FindFileData) ;
int found = (handle != INVALID_HANDLE_VALUE);
if(found)
{
FindClose(handle);
}
return found;
}
If you just want to check whether anything exists, you can also walk the path one directory at a time (again using the Windows API): check that each directory exists and return an error as soon as one does not. Once you reach the last directory, check for the file in the way shown above.
Say you have C:/Dir1/Dir2/Dir3/file.txt. You would start at C:, then check whether Dir1 exists; if it does, go into it, and if it does not, return an error. Do the same for Dir2 and so on until you reach the last directory, then check for the file. If you are looking for any item rather than a specific file, use the functions described on MSDN for finding the first file or first directory.

Since the next thing we plan to do is create something at that location, and since we want to treat it as an error if something already exists there, then let's not bother checking. Just attempt the create and exit with an error if it fails. The create step uses symlink so if it fails we can use strerror(errno) for the explanation of any failure.
To create a file in general (vs just a symlink), EOF points out in comments that open(path, O_CREAT|O_EXCL, mode) will return failure (-1) if the file already exists, and create the file otherwise. This is atomic and safe (unlike trying to stat) and guaranteed to work atomically on POSIX (with the possible exception of NFS).
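A minimal sketch of both ideas for POSIX systems follows. The helper names entry_exists and create_exclusive are made up for illustration. The first uses lstat(), which does not follow the final symlink and so also reports a dangling link as "something is there"; the second is the O_CREAT|O_EXCL create described above.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if any directory entry exists at path
 * (including a dangling symlink), 0 if nothing is there, and -1 if that
 * cannot be determined (for example, a permission error on the way). */
int entry_exists(const char *path)
{
    struct stat st;

    if (lstat(path, &st) == 0)
        return 1;               /* lstat() does not follow the final link */
    return (errno == ENOENT || errno == ENOTDIR) ? 0 : -1;
}

/* Hypothetical helper: atomically creates a new regular file. Fails with
 * errno == EEXIST if anything, even a dangling symlink, occupies path.
 * Returns the new file descriptor, or -1 with errno set. */
int create_exclusive(const char *path, mode_t mode)
{
    return open(path, O_WRONLY | O_CREAT | O_EXCL, mode);
}
```

If the next step is to create something anyway, create_exclusive alone is enough; entry_exists is only useful for reporting, since its answer can be stale by the time you act on it.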

Related

How to solve process control reported by Fortify

I'm working on below process control issue reported by fortify which is described in https://vulncat.fortify.com/en/detail?id=desc.dataflow.abap.process_control#C%2FC%2B%2B
The function load() in filename.c calls dlopen() on line 3. The call loads a library without specifying an absolute path.
It could result in the program using a malicious library supplied by an attacker.
I have below function which is getting invoked in different places of code.
void* load(char* name)
{
void* handle;
handle = dlopen(name, RTLD_LAZY | RTLD_GLOBAL);
return(handle);
}
void somefunc()
{
void *login_module_handle = load("/home/myuser/load_this_shared_lib.so");
}
Here I'm already using absolute path, but don't understand why Fortify still reports the error.
The possible recommendation from Fortify is as shown below.
Whenever possible, libraries should be controlled by the application and executed using an absolute path.
In cases where the path is not known at compile time, such as for cross-platform applications, an absolute path should be constructed from known values during execution.
Any suggestion would be helpful.
The load function could in theory be called someplace else that doesn't pass a full path or, even worse, a variable populated by the user that isn't properly checked.
load might better be implemented as a macro in this case:
#define load(name) dlopen(name, RTLD_LAZY | RTLD_GLOBAL)
Then when the substitution happens, dlopen is actually given a string constant specifying an absolute path that Fortify should be able to see.
There are two related but separate problems here:
How to load dynamic libraries in a secure manner at run time
How to make Fortify see security is properly taken care of
The common approach that Fortify recommends, is to define a fixed path to a directory, in which plugins must be located. It is assumed that only those with sufficient privileges have access to that directory. The pattern is similar to
#define _GNU_SOURCE
#include <dlfcn.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#ifndef PLUGIN_PATH
#define PLUGIN_PATH "/usr/lib/thisapp/plugins"
#endif
#ifndef PLUGIN_SUFFIX
#define PLUGIN_SUFFIX ".so"
#endif
// Try to load dynamic library 'name' under PLUGIN_PATH.
// Returns NULL if failed, with errno set.
// If this returns NULL and errno==0, use dlerror() to
// get the error message.
void *load(const char *name)
{
// Empty or NULL name is not allowed, and it may not start with '.'.
if (!name || !*name || *name == '.') {
errno = EINVAL;
return NULL;
}
// Name may not contain '/'.
if (strchr(name, '/')) {
errno = EINVAL;
return NULL;
}
char *path = NULL;
int pathlen = asprintf(&path, "%s/%s%s", PLUGIN_PATH, name, PLUGIN_SUFFIX);
if (pathlen < 1 || !path) {
errno = ENOMEM;
return NULL;
}
void *handle = dlopen(path, RTLD_LAZY | RTLD_GLOBAL);
free(path);
errno = 0; /* If handle is NULL, look at dlerror() */
return handle;
}
In general, a fixed path named after the application in the /usr/lib tree, say /usr/lib/application/plugins/, can be considered secure, assuming it is installed so that the directory and all its parent directories are owned by the superuser and a privileged group, with no write access for others (i.e., drwxrwxr-x or drwxr-xr-x).
If the user can specify full path to the library, each directory along the path must be vetted, to verify if they may be modified by unprivileged users. If any are suspect, then the library is not secure.
The library file itself can be opened read-only, and a read lease (see fcntl()) placed on it, if the process has sufficient privileges (CAP_LEASE capability, or run as a privileged user). This will fail if any process has the file open for writing, since that would mean another process is able to modify it while we use it. If the lease is granted, any other process trying to open the file for writing blocks, and this process is notified by a signal. This process can delay (but not fully block) the other process from opening the library for up to lease-break-time, typically 45 seconds. This way, a secure process can detect if someone (even privileged) tries to modify the plugin while it is being used.
With a valid read lease, it is time to examine the file ownership and mode (via fstat() using the file descriptor used to open it read-only), to see if it is owned and only modifiable by privileged users. If it is modifiable by unprivileged users, it is not secure.
After all the above checks, given the file descriptor number FD, the path provided to dlopen() is /proc/self/fd/FD. This reuses the same file descriptor, ensuring we do not have a TOCTTOU race window (time of check to time of use).
Whether Fortify recognizes the above measures, or just complains whenever it sees dlopen() used at all, I have no idea: I do not use it myself.

Determine if path is inside directory

In my application, I am trying to check if a path is inside of a specific directory. E.g., I want the files in the path /x/y/z not to be accessible by parts of my application. I cannot use traditional file permissions, as other parts of the application should be able to access these files.
Several Internet resources suggest the use of realpath to first canonicalize paths, i.e., resolving all symlinks and instances of .. (e.g. 1, 2).
However, it does not seem possible to perform path resolution followed by an open without a race condition (TOCTOU).
char *resolved = realpath("/my/potentially/dangerous/path.txt", NULL);
// someone changes any part of the path to a symlink to something else <--- race condition
if (check_path(resolved)) {
// <--- race condition
int fd = open(resolved, O_RDONLY);
}
Am I overlooking something or does POSIX (and Linux) not provide any way to do something like this without a race condition?
What about 'openat2' (Linux only)?
And once you have a file descriptor, see man open
Description:
A file descriptor is a reference to an open file description; this
reference is unaffected if pathname is subsequently removed or
modified to refer to a different file.

Using File Descriptors with readlink()

I have a situation where I need to get a file name so that I can call the readlink() function. All I have is an integer that was originally stored as a file descriptor via an open() command. Problem is, I don't have access to the function where the open() command executed (if I did, then I wouldn't be posting this). The return value from open() was stored in a struct that I do have access to.
char buf[PATH_MAX];
char tempFD[2]; //file descriptor number of the temporary file created
tempFD[0] = fi->fh + '0';
tempFD[1] = '\0';
char parentFD[2]; //file descriptor number of the original file
parentFD[0] = (fi->fh - 1) + '0';
parentFD[1] = '\0';
if (readlink(tempFD, buf, sizeof(buf)) < 0) {
log_msg("\treadlink() error\n");
perror("readlink() error");
} else
log_msg("readlink() returned '%s' for '%s'\n", buf, tempFD);
This is part of the FUSE file system. The struct is called fi, and the file descriptor is stored in fh, which is of type uint64_t. Because of the way this program executes, I know that the two linked files have file descriptor numbers that are always 1 apart. At least that's my working assumption, which I am trying to verify with this code.
This compiles, but when I run it, my log file shows a readlink error every time. My file descriptors have the correct integer values stored in them, but it's not working.
Does anyone know how I can get the file name from these integer values? Thanks!
If it's acceptable that your code becomes non portable and is tied to being run on a somewhat modern version of Linux, then you can use /proc/<pid>/fd/<fd>. However, I would recommend against adding '0' to the fd as a means to get the string representing the number, because it uses the assumption that fd < 10.
However it would be best if you were able to just pick up the filename instead of relying on /proc. At the very least, you can replace calls to the library's function with a wrapper function using a linker flag. Example of usage is gcc program.c -Wl,-wrap,theFunctionToBeOverriden -o program, all calls to the library function will be linked against __wrap_theFunctionToBeOverriden; the original function is accessible under the name __real_theFunctionToBeOverriden. See this answer https://stackoverflow.com/a/617606/111160 for details.
But, back to the answer not involving linkage rerouting: you can do it something like
char fd_path[100];
snprintf(fd_path, sizeof(fd_path), "/proc/%d/fd/%d", (int)getpid(), (int)fi->fh);
You should now use this /proc/... path (it is a softlink) rather than using the path it links to.
You can call readlink to find the actual path in the filesystem. However, doing so introduces a security vulnerability and I suggest against using the path readlink returns.
When the file the descriptor points at is deleted (unlinked), you can still access it through the /proc/... path. However, when you call readlink on it, you get the original pathname with the text ' (deleted)' appended.
If your file was /tmp/a.txt and it gets deleted, readlink on the /proc/... path returns /tmp/a.txt (deleted). If a file with exactly that name exists, you will be able to open it, even though you wanted a different file (the original /tmp/a.txt). An attacker may be able to provide hostile contents in the /tmp/a.txt (deleted) file.
On the other hand, if you just access the file through the /proc/... path, you will access the correct (unlinked but still alive) file, even if the path claims to be a link to something else.

How to determine in Windows whether file exists, does not exist or this can't be known (using c)

I have to clean up from a list of files the ones that do not exist any more. The ones whose status is indeterminable should be given a warning about but left on the list. Sounds simple enough. However, the c functions I tried to solve this with don't seem to give a reliable answer between whether the file really does not exist or it e.g. resides on a network share that is at the moment inaccessible (e.g. due to network problems).
stat function sets errno to ENOENT if the file can't be reached, so that is indistinguishable from the file not actually existing.
FindFirstFile in some cases sets last error (obtainable with GetLastError()) to ERROR_PATH_NOT_FOUND when the network share can't be reached.
Yes, I know FindFirstFile is for reading directories, but I thought I could deduce what I need to know by the error code it sets.
Also GetFileAttributes seems to in some cases set last error to ERROR_PATH_NOT_FOUND in case the network drive is unreachable.
CreateFile does set the last error to 0x35 (ERROR_BAD_NETPATH, "The network path was not found") if the network share is not available, and to 0x2 (ERROR_FILE_NOT_FOUND, "The system cannot find the file specified") if the share is available but the file does not exist.
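Building on that observation, one possible way to get the three-way answer the question asks for is to classify the GetLastError() value conservatively. This is only a sketch: the enum and function names are made up, and the two error values are defined by hand off Windows purely so the snippet compiles anywhere. Only ERROR_FILE_NOT_FOUND is taken as proof of absence; ERROR_PATH_NOT_FOUND and ERROR_BAD_NETPATH stay "unknown" because the question reports them for unreachable shares too.

```c
#ifdef _WIN32
#include <windows.h>
#else
/* Documented Win32 error values, repeated here only so that the sketch
 * also compiles on non-Windows systems. */
#define ERROR_SUCCESS        0UL
#define ERROR_FILE_NOT_FOUND 2UL
#endif

/* Hypothetical three-way result for the cleanup pass over the file list. */
enum file_status { FILE_PRESENT, FILE_MISSING, FILE_UNKNOWN };

/* Hypothetical helper: maps the error code seen after trying to open the
 * file (0 meaning the open succeeded) to present / missing / unknown.
 * Anything that is not a clear "file not found" is reported as unknown,
 * so the entry stays on the list and only triggers a warning. */
enum file_status classify_win32_error(unsigned long err)
{
    switch (err) {
    case ERROR_SUCCESS:        return FILE_PRESENT;
    case ERROR_FILE_NOT_FOUND: return FILE_MISSING;
    default:                   return FILE_UNKNOWN;
    }
}
```

On Windows the caller would attempt CreateFile with OPEN_EXISTING, pass 0 on success or GetLastError() on failure, and remove the entry only for FILE_MISSING. The cost of this caution is that a file whose parent directory was deleted is also reported as unknown.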
if((f = fopen(file, "r")) == NULL){
//File does not exist or can not be read
}else{
//File exists
fclose(f);
}
Drawbacks:
You don't know if a file is nonexistent or just can't be read (privileges etc),
On the other hand, it is 100% portable.

How to check if a file is already open by another process in C?

I see that standard C has no way of telling if a file is already opened in another process. So the answer should contain several examples for each platform. I need that check for Visual C++ / Windows though.
Windows: Try to open the file in exclusive mode. If it works, no one else has opened the file and will not be able to open the file
HANDLE fh;
fh = CreateFile(filename, GENERIC_READ, 0 /* no sharing! exclusive */, NULL, OPEN_EXISTING, 0, NULL);
if ((fh != NULL) && (fh != INVALID_HANDLE_VALUE))
{
// the only open file to filename should be fh.
// do something
CloseHandle(fh);
}
MS says: dwShareMode
The sharing mode of an object, which can be read, write, both, delete, all of these, or none (refer to the following table).
If this parameter is zero and CreateFile succeeds, the object cannot be shared and cannot be opened again until the handle is closed.
You cannot request a sharing mode that conflicts with the access mode that is specified in an open request that has an open handle, because that would result in the following sharing violation: ERROR_SHARING_VIOLATION.
http://msdn.microsoft.com/en-us/library/windows/desktop/aa363858%28v=vs.85%29.aspx
Extension:
How do you delete a (non-read-only) file that no one has open for reading or writing?
Use the access right FILE_READ_ATTRIBUTES, not DELETE. DELETE can cause problems on SMB shares (to MS Windows servers): CreateFile leaves behind a still-open file handle /Device/Mup:xxx filename, for whatever reason and whatever this Mup is. This does not happen with the access right FILE_READ_ATTRIBUTES.
Use FILE_FLAG_OPEN_REPARSE_POINT so that filename itself is deleted. Otherwise you will delete the target of a symbolic link, which is usually not what you want.
HANDLE fh;
fh = CreateFile(filename, FILE_READ_ATTRIBUTES, FILE_SHARE_DELETE /* no RW sharing! */, NULL, OPEN_EXISTING, FILE_FLAG_OPEN_REPARSE_POINT|FILE_FLAG_DELETE_ON_CLOSE, NULL);
if ((fh != NULL) && (fh != INVALID_HANDLE_VALUE))
{
DeleteFile(filename); /* looks stupid?
* but FILE_FLAG_DELETE_ON_CLOSE will not work on some smb shares (e.g. samba)!
* FILE_SHARE_DELETE should allow this DeleteFile() and so the problem could be solved by additional DeleteFile()
*/
CloseHandle(fh); /* a file that no one currently has open for RW is deleted NOW */
}
What to do with an open file? If the file is open and you are allowed to unlink it, you will be left with a file for which subsequent opens fail with ACCESS_DENIED.
If you have a temporary folder, it could be a good idea to rename(filename, tempdir/filename.delete) and then delete tempdir/filename.delete.
There's no way to tell, unless the other process explicitly forbids access to the file. In MSVC, you'd do so with _fsopen(), specifying _SH_DENYRD for the shflag argument. The notion of being interested in whether a file is opened that isn't otherwise locked is deeply flawed on a multitasking operating system. It might be opened a microsecond after you found that it wasn't. That's also the reason that Windows doesn't have an IsFileLocked() function.
If you need synchronized access to files, you'll need to add it yourself with a named mutex; use CreateMutex().
Getting the open_files information is DIFFICULT, it's like pulling teeth, and if you don't have an immediate need for it you shouldn't be asking for "several examples for each platform" just for the hell of it. Just my opinion, of course.
Linux and many Unix systems have a system utility called lsof which finds open file handles and the like. On some Unix systems it does so by accessing /dev/kmem, a pseudo-file containing a copy of "live" kernel memory, i.e. the working storage of the operating system kernel; on Linux it reads the /proc filesystem instead. There are tables of open files in there, naturally, and the structures are open-source and documented, so it's just a matter of a lot of busywork for lsof to go in there, find the information and format it for the user.
Documentation for the deep innards of Windows, on the other hand, is practically nonexistent, and I'm not aware that the data structures are somehow exposed to the outside. I'm no Windows expert, but unless the Windows API explicitly offers this kind of information it may simply not be available.
Whatever is available is probably being used by Mark Russinovich's SysInternals utilities; the first one that comes to mind is FileMon. Looking at those may give you some clues. Update: I've just been informed that SysInternals Handles.exe is even closer to what you want.
If you manage to figure that out, good; otherwise you may be interested in catching file open/close operations as they happen: The Windows API offers a generous handful of so-called Hooks: http://msdn.microsoft.com/en-us/library/ms997537.aspx. Hooks allow you to request notification when certain things happen in the system. I believe there's one that will tell you when a program –systemwide– opens a file. So you can make your own list of files opened for the duration you're listening to your hooks. I don't know for sure but I suspect this may be what FileMon does.
The Windows API, including the hook functions, can be accessed from C. Systemwide hooks will require you to create a DLL to be loaded alongside your program.
Hope these hints help you get started.
On Windows, this code (Java, not C) also works:
boolean isClosed(File f) { return f.renameTo(f); }
An open file cannot be renamed, and renaming a file to its own name does not cause any other error. So if the rename succeeds, without actually having changed anything, you know the file is not open.
Any such check would be inherently racy. Another process could always open the file between the point where you did the check and the point where you accessed the file.
The answers so far should tell you that finding out the information you've asked for is tricky, non-portable, and often inherently unreliable. So, from my perspective, the real answer is don't do that. Try to find a way to think about your real problem so that this question doesn't arise.
this can't be that hard guys.
do this:
try{
File fileout = new File(path + ".xls");
FileOutputStream out = new FileOutputStream(fileout);
}
catch(FileNotFoundException e1){
// if a MS Windows process is already using the file, this exception will be thrown
}
catch(Exception e){
}
You can use something like this. It is not a proper solution. But it works,
bool IsFileDownloadComplete(const std::wstring& dir, const std::wstring& fileName)
{
std::wstring originalFileName = dir + fileName;
std::wstring tempFileName = dir + L"temp";
while(true)
{
int ret = rename(convertWstringToString(originalFileName).c_str(), convertWstringToString(tempFileName).c_str());
if(ret == 0)
break;
Sleep(10);
}
/** File is not open. Rename to original. */
int ret = rename(convertWstringToString(tempFileName).c_str(), convertWstringToString(originalFileName).c_str());
if(ret != 0)
throw std::exception("File rename failed");
return true;
}
