Linux kernel module: file close not quite correct (C)

I have a small problem with this code. I can't figure out why it is not working.
static int test(const char *path)
{
    struct file *filp;

    filp = filp_open(path, O_RDONLY, 0);
    if (IS_ERR(filp))
        return PTR_ERR(filp);   /* the function returns int, not a pointer */
    // some code (only read from filp (like inode and stuff))
    filp_close(filp, NULL);
    return 0;
}
When I use this snippet once or twice or even a thousand times it works, but after approximately 63,000 times I run into error -23 (ENFILE, file table overflow), and after that I can't open a single file. I looked up the syscalls for open and close, and these use filp_open/filp_close, and I just can't figure out what is wrong with this code. It must be something with the file descriptors not being deallocated, but why?

Try this:
mm_segment_t st_old_fs;
struct file *filp;

st_old_fs = get_fs();
set_fs(get_ds());
filp = filp_open(path, O_RDONLY, 0);   /* filp_open, not file_open */
if (IS_ERR(filp))
{
    set_fs(st_old_fs);
    return PTR_ERR(filp);
}
// some code (only read from filp (like inode and stuff))
filp_close(filp, NULL);
set_fs(st_old_fs);
The kernel normally checks that buffers passed to file operations come from user space. get_fs()/set_fs() save and change that address-limit setting: set_fs(get_ds()) widens it to kernel space so kernel buffers are accepted, and you need to save the old setting and restore it after you are done with the file.

File objects, like plenty of other kernel objects, are freed only after an RCU grace period finishes. I suspect your kernel is not preemptible and you never context switch, so the grace period never completes and the objects accumulate. You can learn more about RCU here: https://lwn.net/Articles/262464/
The real question is what you are doing. Kernel-level work is not a good fit for programming beginners. If this is a college assignment, it is extremely likely you are going about it wrong, or the assignment itself is just bad.
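If that diagnosis fits, one thing to try (a sketch; the loop and variable names are hypothetical, not from the original module) is to insert voluntary scheduling points so grace periods can complete:

    /* Hypothetical stress loop. On a non-preemptible kernel,
     * cond_resched() provides a quiescent state so RCU can finish its
     * grace periods and reclaim the freed struct file objects. */
    int i, err;

    for (i = 0; i < 100000; i++) {
        err = test(path);
        if (err)
            break;
        cond_resched();  /* voluntary context switch point */
    }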

Related

stat(), fstat(), lstat(), and fopen(); how to write TOCTOU-protected, system-independent code

I've been dealing with a problem for a few weeks now, updating 20-year-old code that needs to be system-independent (work on both Linux and Windows). It involves Time-of-Check, Time-of-Use (TOCTOU) issues. I made a thread here, but it didn't go very far, and after ruminating on it for a while and digging deeper into the problem, I think I understand my question a bit better. Maybe I can ask it a bit better too...
From what I've read, the code needs to check if the file exists, if it is accessible, open the file, do some operations and finally close the file. It seems the best way to do this is a call to lstat(), a call to fopen(), a call to fstat() (to rule out the TOCTOU), and then the operations and closing the file.
However, I've been led to believe that lstat() and fstat() are POSIX-defined, not C-Standard-defined, ruling out their use for a system-agnostic program, much in the same way open() shouldn't be used for cross-compatibility. How would you implement this?
If you look at my first post, you can see the developer from 20 years ago used the C preprocessor to cut the code into cross-compatible parts, but even if I did that, I wouldn't know what to replace lstat() or fstat() with (their Windows counterparts).
Edit: Added abbreviated code to this post; if something is unclear please go to the original post.
#ifdef WIN32
struct _stat buf;
#else
struct stat buf;
#endif //WIN32
FILE *fp;
char data[2560];

// Make sure file exists and is readable
#ifdef WIN32
if (_access(file.c_str(), R_OK) == -1) {   // NB: MSVC's <io.h> does not define R_OK; the read mode is 4
#else
if (access(file.c_str(), R_OK) == -1) {
#endif //WIN32
    char message[2560];
    sprintf(message, "File '%s' Not Found or Not Readable", file.c_str());
    throw message;
}
// Get the file status information
#ifdef WIN32
if (_stat(file.c_str(), &buf) != 0) {
#else
if (stat(file.c_str(), &buf) != 0) {
#endif //WIN32
    char message[2560];
    sprintf(message, "File '%s' No Status Available", file.c_str());
    throw message;
}
// Open the file for reading
fp = fopen(file.c_str(), "r");
if (fp == NULL) {
    char message[2560];
    sprintf(message, "File '%s' Could Not Be Opened", file.c_str());
    throw message;
}
// Read the file
MvString s, ss;
while (fgets(data, sizeof(data), fp) != (char *)0) {
    s = data;
    s.trimBoth();
    if (s.compare(0, 5, "GROUP") == 0) {
        //size_t t = s.find_last_of( ":" );
        size_t t = s.find(":");
        if (t != string::npos) {
            ss = s.substr(t + 1).c_str();
            ss.trimBoth();
            ss = ss.substr(1, ss.length() - 3).c_str();
            group_list.push_back(ss);
        }
    }
}
// Close the file
fclose(fp);
}
The reliable way to check whether the file exists and can be opened is to try opening it. If it was opened, all was OK. If it was not opened, you can think about spending time to analyze what went wrong.
The access() function formally asks a different question from what you think; it asks 'can the real user ID or the real group ID access the file', but the program will use the effective user ID or the effective group ID to access the file. If your program is not running SUID or SGID, and was not launched from a program running SUID or SGID — and that's the normal case — then there's no difference. But the question is different.
The use of stat() or lstat() doesn't seem helpful. In particular, lstat() only tells you whether you start at a symlink, but the code doesn't care about that.
Both the access() and the stat() calls provide you with TOCTOU windows of vulnerability; the file could be removed after they reported it was present, or created after they reported it was absent.
You should simply call fopen() and see whether it works; the code will be simpler and more resistant to TOCTOU problems. You might need to consider whether to use open() with all its extra controls (O_EXCL, etc), and then convert the file descriptor to a file pointer (fdopen()).
All of this applies to the Unix side.
The details will be different, but on the Windows side, you will still be best off trying to open the file and reacting appropriately to failure.
In both systems, make sure the options provided to the open function are appropriate.
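A minimal sketch of that open-and-check approach (the helper name and error reporting are mine, not from the question's code):

    #include <stdio.h>
    #include <string.h>
    #include <errno.h>

    /* Open-and-check: no access()/stat() pre-flight, so there is no
     * TOCTOU window between the check and the open. Uses only the
     * standard C library, so it works on both Linux and Windows. */
    FILE *open_for_reading(const char *name)
    {
        FILE *fp = fopen(name, "r");
        if (fp == NULL)
            fprintf(stderr, "File '%s' could not be opened: %s\n",
                    name, strerror(errno));
        return fp;
    }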

Write to file in Kernel Module - have fd, have pointer to write sys call

I know there have been questions like this before, but I'm hoping that I can get some help. As an academic exercise, I'm trying to write to a file from a kernel module. I have saved the original write entry from the system call table in a function pointer (sys_write_orig) and have replaced it with my own function. That all works fine.
In my new sys_write function, if I use sys_write_orig with the original buffer passed in from userland - it works fine. But when I try to create a new buffer - the issues begin. I understand the separation of kernel memory and user memory - but I thought there was a way to do all this. Any ideas? Here's kind of what I'm trying to do:
char *kernbuf = "foo";
/* GFP_USER only influences how the allocator behaves; the returned
 * pointer is still a kernel address, so casting it to __user does not
 * make it user memory. */
char __user *userbuf = (char __user *) kmalloc(3 * sizeof(char), GFP_USER);
/* copy_to_user() returns the number of bytes that could NOT be copied */
int n = copy_to_user(userbuf, kernbuf, 3);
printk("%d bytes copied to user space (I think).\n", n);
n = sys_write_orig(fd, userbuf, 3);
printk("%d is the result from the write.\n", n);
I'm kind of new to kernel-land. So any help is appreciated. Thanks!
I think this does it. I did not want to use VFS because I have a pointer to the sys_write function (I hooked it with my own function and saved the original), so I might as well use it. But either way, I still needed to get kernel-space data into user space. This seems to do the trick. Thanks for pointing me back to the VFS post - it had the info that got me to this solution.
void append_file(unsigned int fd)
{
    // http://www.linuxjournal.com/node/8110/print
    mm_segment_t old_fs;

    old_fs = get_fs();
    set_fs(KERNEL_DS);   /* let sys_write accept a kernel-space buffer */
    sys_write_orig(fd, (char *)APPEND_TEXT, strlen(APPEND_TEXT));
    set_fs(old_fs);
    printk(KERN_INFO "Appended \"%s\" to fd:%d.\n", APPEND_TEXT, fd);
}

Asynchronous I/O in C using the Windows API: which method to use, and why does my code execute synchronously?

I have a C application which generates a lot of output and for which speed is critical. The program is basically a loop over a large (8-12 GB) binary input file which must be read sequentially. In each iteration the read bytes are processed and output is generated and written to multiple files, but never to multiple files at the same time. So if you are at the point where output is generated and there are 4 output files, you write to either file 0 or 1 or 2 or 3. At the end of the iteration I now write the output using fwrite(), thereby waiting for the write operation to finish. The total number of output operations is large, up to 4 million per file, and output sizes range from 100 MB to 3.5 GB per file. The program runs on a basic multicore processor.
I want to write output in a separate thread and I know this can be done with
Asyncronous I/O
Creating threads
I/O completion ports
I have two types of questions, namely conceptual and code-specific.
Conceptual Question
What would be the best approach? Note that the application should be portable to Linux; however, I don't see how that would be very important for my choice among 1-3, since I would write a wrapper around anything kernel/API-specific. For me the most important criterion is speed. I have read that option 1 is not that likely to increase the performance of the program, and that the kernel in any case creates new threads for the I/O operation, so why not use option (2) immediately, with the advantage that it seems easier to program (also since I did not succeed with option (1); see code issues below).
Note that I read https://stackoverflow.com/questions/3689759/how-can-i-run-a-specific-function-of-thread-asynchronously-in-c-c, but I don't see a motivation for what to use based on the nature of the application. So I hope somebody could provide me with some advice on what would be best in my situation. Also, from the book "Windows System Programming" by Johnson M. Hart, I know that the recommendation is using threads, mainly because of the simplicity. However, will it also be the fastest?
Code Question
This question involves the attempts I made so far to make asynchronous I/O work. I understand that it's a big piece of code, so it's not that easy to look into. In any case, I would really appreciate any attempt.
To decrease execution time I try to write the output by means of a new thread using the WINAPI, via CreateFile() with FILE_FLAG_OVERLAPPED and an overlapped structure. I have created a sample program in which I try to get this to work. However, I encountered 2 problems:
The file is only opened in overlapped mode when I delete an already existing file (I have tried using CreateFile in different modes (CREATE_ALWAYS, CREATE_NEW, OPEN_EXISTING), but this does not help).
Only the first WriteFile is executed asynchronously; the remaining WriteFile calls are synchronous. For this problem I already consulted http://support.microsoft.com/kb/156932. It seems that the problem I have is related to the fact that "any write operation to a file that extends its length will be synchronous". I've already tried to solve this by increasing the file size/valid data size (commented region in the code), but I still do not get it to work. I'm aware that to get the most out of asynchronous I/O I should use CreateFile with FILE_FLAG_NO_BUFFERING, but I cannot get this to work either.
Please note that the program creates a file of about 120 MB in the path of execution. Also note that the "not ok" print statements are not desirable; I would like to see "can do work in background" appear on my screen... What goes wrong here?
#include <windows.h>
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define ASYNC // remove this definition to run synchronously (i.e. using fwrite)

#ifdef ASYNC
OVERLAPPED *pOverlapped;
HANDLE pFile;               /* CreateFile returns a HANDLE, not a HANDLE* */
#else
FILE *pFile;
#endif

#define DIM_X 100
#define DIM_Y 150000

#define _PRINTERROR(msgs)\
    {printf("file: %s, line: %d, %s",__FILE__,__LINE__,msgs);\
    fflush(stdout);\
    return 0;}

#define _PRINTF(msgs)\
    {printf(msgs);\
    fflush(stdout);}

#define _START_TIMER \
    time_t time1,time2; \
    clock_t clock1; \
    time(&time1); \
    printf("start time: %s",ctime(&time1)); \
    fflush(stdout);

#define _END_TIMER\
    time(&time2);\
    clock1 = clock();\
    printf("end time: %s",ctime(&time2));\
    printf("elapsed processor time: %.2f\n",(((float)clock1)/CLOCKS_PER_SEC));\
    fflush(stdout);

double aio_dat[DIM_Y] = {0};

double do_compute(double A, double B, int arr_len);

int main()
{
    _START_TIMER;
    const char *pName = "test1.bin";
    DWORD dwBytesToWrite;
    BOOL bErrorFlag = FALSE;
    int j = 0;
    int i = 0;
    int fOverlapped = 0;

#ifdef ASYNC
    // create / open the file
    pFile = CreateFile(pName,
                       GENERIC_WRITE,          // open for writing
                       0,                      // no sharing
                       NULL,                   // default security
                       CREATE_ALWAYS,          // create new/overwrite existing
                       FILE_FLAG_OVERLAPPED,   // | FILE_FLAG_NO_BUFFERING, // overlapped file
                       NULL);                  // no attr. template
    // check whether file opening was ok
    if (pFile == INVALID_HANDLE_VALUE) {
        printf("%lx\n", GetLastError());
        _PRINTERROR("file not opened properly\n");
    }
    // make the overlapped structure
    pOverlapped = calloc(1, sizeof(OVERLAPPED));
    pOverlapped->Offset = 0;
    pOverlapped->OffsetHigh = 0;
    // put event handle in overlapped structure
    if (!(pOverlapped->hEvent = CreateEvent(NULL, TRUE, FALSE, NULL))) {
        printf("%lx\n", GetLastError());
        _PRINTERROR("error in createevent\n");
    }
#else
    pFile = fopen(pName, "wb");
#endif

    // create some output
    for (j = 0; j < DIM_Y; j++) {
        aio_dat[j] = do_compute(i, j, DIM_X);
    }
    // determine how many bytes should be written
    dwBytesToWrite = (DWORD)sizeof(aio_dat);

    for (i = 0; i < DIM_X; i++) { // do this DIM_X times
#ifdef ASYNC
        //if(i>0){
        //    SetFilePointer(pFile,dwBytesToWrite,NULL,FILE_CURRENT);
        //    if(!(SetEndOfFile(pFile))){
        //        printf("%i\n",pFile);
        //        _PRINTERROR("error in set end of file\n");
        //    }
        //    SetFilePointer(pFile,-dwBytesToWrite,NULL,FILE_CURRENT);
        //}
        // write the bytes
        if (!(bErrorFlag = WriteFile(pFile, aio_dat, dwBytesToWrite, NULL, pOverlapped))) {
            // check whether io pending or some other error
            if (GetLastError() != ERROR_IO_PENDING) {
                printf("lasterror: %lx\n", GetLastError());
                _PRINTERROR("error while writing file\n");
            }
            else {
                fOverlapped = 1;
            }
        }
        else {
            // if you get here output got immediately written; bad!
            fOverlapped = 0;
        }
        if (fOverlapped) {
            // do background work; this msg is what I want to see
            for (j = 0; j < DIM_Y; j++) {
                aio_dat[j] = do_compute(i, j, DIM_X);
            }
            for (j = 0; j < DIM_Y; j++) {
                aio_dat[j] = do_compute(i, j, DIM_X);
            }
            _PRINTF("can do work in background\n");
        }
        else {
            // not overlapped, this message is bad
            _PRINTF("not ok\n");
        }
        // wait to continue
        if ((WaitForSingleObject(pOverlapped->hEvent, INFINITE)) != WAIT_OBJECT_0) {
            _PRINTERROR("waiting did not succeed\n");
        }
        // reset event structure
        if (!(ResetEvent(pOverlapped->hEvent))) {
            printf("%lx\n", GetLastError());
            _PRINTERROR("error in resetevent\n");
        }
        pOverlapped->Offset += dwBytesToWrite;
#else
        fwrite(aio_dat, sizeof(double), DIM_Y, pFile);
        for (j = 0; j < DIM_Y; j++) {
            aio_dat[j] = do_compute(i, j, DIM_X);
        }
        for (j = 0; j < DIM_Y; j++) {
            aio_dat[j] = do_compute(i, j, DIM_X);
        }
#endif
    }

#ifdef ASYNC
    CloseHandle(pOverlapped->hEvent);
    CloseHandle(pFile);
    free(pOverlapped);
#else
    fclose(pFile);
#endif
    _END_TIMER;
    return 1;
}

double do_compute(double A, double B, int arr_len)
{
    int i;
    double res = 0;
    double *xA = malloc(arr_len * sizeof(double));
    double *xB = malloc(arr_len * sizeof(double));
    if (!xA || !xB)
        abort();
    for (i = 0; i < arr_len; i++) {
        xA[i] = sin(A);
        xB[i] = cos(B);
        res = res + xA[i] * xA[i];
    }
    free(xA);
    free(xB);
    return res;
}
Useful links
http://software.intel.com/sites/products/documentation/studio/composer/en-us/2011/compiler_c/cref_cls/common/cppref_asynchioC_aio_read_write_eg.htm
http://www.ibm.com/developerworks/linux/library/l-async/?ca=dgr-lnxw02aUsingPOISIXAIOAPI
http://www.flounder.com/asynchexplorer.htm#Asynchronous%20I/O
I know this is a big question and I would like to thank everybody in advance who takes the trouble reading it and perhaps even respond!
You should be able to get this to work using the OVERLAPPED structure.
You're on the right track: the system is preventing you from writing asynchronously because every WriteFile extends the length of the file. However, you're doing the file-size extension wrong: extending the file with SetEndOfFile alone is not enough, because writes into the region between the valid data length and the new end of file are still performed synchronously. Use the SetFileValidData function as well. This marks the allocated clusters as valid without writing them (note that they will then contain whatever garbage the disk had there, and that the call requires the SE_MANAGE_VOLUME_NAME privilege), and you should be able to execute WriteFile and your computation in parallel.
I would stay away from FILE_FLAG_NO_BUFFERING. You're after more performance with parallelism, I presume? Don't prevent the cache from doing its job.
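A sketch of that pre-extension, assuming the handle was opened with FILE_FLAG_OVERLAPPED, the final size is known up front, and the process holds the SE_MANAGE_VOLUME_NAME privilege (the helper name is mine):

    #include <windows.h>
    #include <stdio.h>

    /* Pre-extend an overlapped file so later WriteFile calls never
     * extend the valid data length (extending writes are forced to be
     * synchronous). 'totalSize' is the final file size in bytes. */
    BOOL preextend(HANDLE hFile, LONGLONG totalSize)
    {
        LARGE_INTEGER size, zero = {0};
        size.QuadPart = totalSize;

        if (!SetFilePointerEx(hFile, size, NULL, FILE_BEGIN) ||
            !SetEndOfFile(hFile) ||
            !SetFileValidData(hFile, totalSize)) {  /* needs SE_MANAGE_VOLUME_NAME */
            printf("pre-extension failed: %lx\n", GetLastError());
            return FALSE;
        }
        /* rewind before the first write */
        return SetFilePointerEx(hFile, zero, NULL, FILE_BEGIN);
    }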
Another option that you did not consider is a memory mapped file. Those are available on Windows and Linux. There is a handy Boost abstraction that you could use.
With a memory mapped file, every thread in your process could write its output to the file on its own time, assuming that the record sizes are known and each thread has its own output area.
The operating system will take care of writing the mapped pages to disk when needed, when it gets around to it, or when you close the file; some operating systems may require that you call msync to guarantee it.
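A minimal POSIX sketch of the idea (the function name and sizing are illustrative; on Windows the equivalents are CreateFileMapping/MapViewOfFile):

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /* Map a pre-sized output file; each thread can then write into its
     * own region and the OS writes pages back lazily. The caller
     * munmap()s (and optionally msync()s) when done. */
    double *map_output(const char *name, size_t bytes)
    {
        int fd = open(name, O_RDWR | O_CREAT, 0644);
        if (fd == -1)
            return NULL;
        if (ftruncate(fd, (off_t)bytes) == -1) {   /* size the file first */
            close(fd);
            return NULL;
        }
        void *p = mmap(NULL, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        close(fd);                                 /* mapping keeps the file alive */
        return p == MAP_FAILED ? NULL : (double *)p;
    }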
I don't see why you would want to write asynchronously. Doing things in parallel does not make them faster in all cases. If you write two files at the same time to the same (spinning) disk, it will almost always be a lot slower; in that case, just write them one after another.
If you have some fancy drive like an SSD or a virtual RAM drive, parallel writing could be faster. You then have to create the file at full size first and do your parallel magic.
Asynchronous writing is nice, but is done by any OS anyway. The potential gain for you is that you can do other things than writing to disk, like displaying a progress bar. This is where multi-threading can help you.
So imho you should use serial writing, or parallel writing to multiple disks.
hth

Retrieve filename from file descriptor in C

Is it possible to get the filename of a file descriptor (Linux) in C?
You can use readlink on /proc/self/fd/NNN where NNN is the file descriptor. This will give you the name of the file as it was when it was opened — however, if the file was moved or deleted since then, it may no longer be accurate (although Linux can track renames in some cases). To verify, stat the filename given and fstat the fd you have, and make sure st_dev and st_ino are the same.
Of course, not all file descriptors refer to files, and for those you'll see some odd text strings, such as pipe:[1538488]. Since all of the real filenames will be absolute paths, you can determine which these are easily enough. Further, as others have noted, files can have multiple hardlinks pointing to them - this will only report the one it was opened with. If you want to find all names for a given file, you'll just have to traverse the entire filesystem.
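A small sketch of that (the helper name is mine; error handling is minimal):

    #include <stdio.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* Resolve fd via /proc, then confirm the name still refers to the
     * same file by comparing device and inode numbers, as described. */
    int fd_to_name(int fd, char *buf, size_t len)
    {
        char link[64];
        struct stat st_fd, st_name;
        ssize_t n;

        snprintf(link, sizeof(link), "/proc/self/fd/%d", fd);
        n = readlink(link, buf, len - 1);
        if (n == -1)
            return -1;
        buf[n] = '\0';          /* readlink does not NUL-terminate */

        if (fstat(fd, &st_fd) == -1 || stat(buf, &st_name) == -1 ||
            st_fd.st_dev != st_name.st_dev || st_fd.st_ino != st_name.st_ino)
            return -1;          /* moved, deleted, or replaced since open */
        return 0;
    }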
I had this problem on Mac OS X. We don't have a /proc virtual file system, so the accepted solution cannot work.
We do, instead, have a F_GETPATH command for fcntl:
F_GETPATH    Get the path of the file descriptor fildes. The argument
             must be a buffer of size MAXPATHLEN or greater.
So to get the file associated to a file descriptor, you can use this snippet:
#include <sys/syslimits.h>
#include <fcntl.h>

char filePath[PATH_MAX];
if (fcntl(fd, F_GETPATH, filePath) != -1)
{
    // do something with the file path
}
Since I never remember where MAXPATHLEN is defined, I thought PATH_MAX from syslimits would be fine.
On Windows, you can retrieve the file name with GetFileInformationByHandleEx, passing the FileNameInfo class.
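A sketch of that (buffer sizing via MAX_PATH is an assumption; note the returned name is volume-relative and not NUL-terminated):

    #include <windows.h>
    #include <stdio.h>

    /* Print the volume-relative name of an open handle. */
    void print_handle_name(HANDLE h)
    {
        union {
            FILE_NAME_INFO info;
            char pad[sizeof(FILE_NAME_INFO) + MAX_PATH * sizeof(WCHAR)];
        } u;

        if (GetFileInformationByHandleEx(h, FileNameInfo, &u.info, sizeof(u))) {
            /* FileNameLength is in bytes; the name is not NUL-terminated */
            wprintf(L"%.*s\n", (int)(u.info.FileNameLength / sizeof(WCHAR)),
                    u.info.FileName);
        }
    }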
As Tyler points out, there's no way to do what you require "directly and reliably", since a given FD may correspond to 0 filenames (in various cases) or > 1 (multiple "hard links" is how the latter situation is generally described). If you do still need the functionality with all the limitations (on speed AND on the possibility of getting 0, 2, ... results rather than 1), here's how you can do it: first, fstat the FD -- this tells you, in the resulting struct stat, what device the file lives on, how many hard links it has, whether it's a special file, etc. This may already answer your question -- e.g. if 0 hard links you will KNOW there is in fact no corresponding filename on disk.
If the stats give you hope, then you have to "walk the tree" of directories on the relevant device until you find all the hard links (or just the first one, if you don't need more than one and any one will do). For that purpose, you use readdir (and opendir &c of course) recursively opening subdirectories until you find in a struct dirent thus received the same inode number you had in the original struct stat (at which time if you want the whole path, rather than just the name, you'll need to walk the chain of directories backwards to reconstruct it).
If this general approach is acceptable, but you need more detailed C code, let us know, it won't be hard to write (though I'd rather not write it if it's useless, i.e. you cannot withstand the inevitably slow performance or the possibility of getting != 1 result for the purposes of your application;-).
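If it helps, here is a rough sketch of that walk; it substitutes nftw() for hand-rolled opendir()/readdir() recursion, stops at the first match, and the helper names are illustrative:

    #define _XOPEN_SOURCE 500
    #include <ftw.h>
    #include <limits.h>
    #include <stdio.h>
    #include <sys/stat.h>

    static dev_t target_dev;
    static ino_t target_ino;
    static char found[PATH_MAX];

    static int check(const char *path, const struct stat *sb,
                     int type, struct FTW *ftwbuf)
    {
        (void)type; (void)ftwbuf;
        if (sb->st_dev == target_dev && sb->st_ino == target_ino) {
            snprintf(found, sizeof(found), "%s", path);
            return 1;           /* nonzero return stops the walk */
        }
        return 0;
    }

    /* Search 'root' for the first name matching the open descriptor. */
    const char *find_name_of_fd(int fd, const char *root)
    {
        struct stat st;
        if (fstat(fd, &st) == -1 || st.st_nlink == 0)
            return NULL;        /* no name on disk at all */
        target_dev = st.st_dev;
        target_ino = st.st_ino;
        found[0] = '\0';
        /* FTW_PHYS: do not follow symlinks while walking */
        return nftw(root, check, 16, FTW_PHYS) == 1 ? found : NULL;
    }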
Before writing this off as impossible I suggest you look at the source code of the lsof command.
There may be restrictions, but lsof seems capable of determining the file descriptor and file name. This information exists in the /proc filesystem, so it should be possible to get at it from your program.
You can use fstat() to get the file's inode by struct stat. Then, using readdir() you can compare the inode you found with those that exist (struct dirent) in a directory (assuming that you know the directory, otherwise you'll have to search the whole filesystem) and find the corresponding file name.
Nasty?
There is no official API to do this on OpenBSD. With some very convoluted workarounds it is still possible, using the following code; note you need to link with -lkvm and -lc. The code using FTS to traverse the filesystem is from this answer.
#include <string>
#include <vector>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <sys/stat.h>
#include <fts.h>
#include <sys/sysctl.h>
#include <kvm.h>

using std::string;
using std::vector;

string pidfd2path(int pid, int fd) {
  string path;
  char errbuf[_POSIX2_LINE_MAX];
  kvm_t *kd = nullptr;
  kinfo_file *kif = nullptr;
  int cntp = 0;
  kd = kvm_openfiles(nullptr, nullptr, nullptr, KVM_NO_FILES, errbuf);
  if (!kd) return "";
  if ((kif = kvm_getfiles(kd, KERN_FILE_BYPID, pid, sizeof(struct kinfo_file), &cntp))) {
    for (int i = 0; i < cntp; i++) {
      if (kif[i].fd_fd == fd) {
        FTS *file_system = nullptr;
        FTSENT *child = nullptr;
        FTSENT *parent = nullptr;
        char buffer[2];
        strcpy(buffer, "/");
        // fts_open() expects a NULL-terminated argument vector.
        vector<char *> root;
        root.push_back(buffer);
        root.push_back(nullptr);
        file_system = fts_open(&root[0], FTS_COMFOLLOW | FTS_NOCHDIR, nullptr);
        if (file_system) {
          while ((parent = fts_read(file_system))) {
            child = fts_children(file_system, 0);
            while (child && child->fts_link) {
              child = child->fts_link;
              if (!S_ISSOCK(child->fts_statp->st_mode) &&
                  child->fts_statp->st_dev == kif[i].va_fsid &&
                  child->fts_statp->st_ino == kif[i].va_fileid) {
                path = child->fts_path + string(child->fts_name);
                goto finish;
              }
            }
          }
        finish:
          fts_close(file_system);
        }
      }
    }
  }
  kvm_close(kd);
  return path;
}

int main(int argc, char **argv) {
  if (argc == 3) {
    printf("%s\n", pidfd2path((int)strtoul(argv[1], nullptr, 10),
                              (int)strtoul(argv[2], nullptr, 10)).c_str());
  } else {
    printf("usage: \"%s\" <pid> <fd>\n", argv[0]);
  }
  return 0;
}
If the function fails to find the file (for example, because it no longer exists), it will return an empty string. If the file was moved, in my experience (when moving the file to the trash), the new location of the file is returned instead, provided that location wasn't already searched by FTS. It will be slower for filesystems that have more files.
The deeper the search goes in the directory tree of your entire filesystem without finding the file, the more likely you are to hit a race condition, though still very unlikely given how performant this is. I'm aware my OpenBSD solution is C++ and not C. Feel free to change it to C; most of the code logic will be the same. If I have time I'll try to rewrite this in C, hopefully soon. Like macOS, this solution gets a hard link at random (citation needed), for portability with Windows and other platforms which can only get one hard link. You could remove the goto in the inner loop and return a vector if you don't care about being cross-platform and want to get all the hard links. DragonFly BSD and NetBSD have the same solution (the exact same code) as the macOS solution in the current question, which I verified manually. If a macOS user wishes to get a path from a file descriptor opened by any process, by plugging in a process ID, and not be limited to just the calling one, while also potentially getting all hard links rather than a random one, see this answer. It should be a lot more performant than traversing your entire filesystem, similar to how fast it is on Linux and other solutions that are more straightforward and to the point. FreeBSD users can get what they are looking for in this question, because the OS-level bug mentioned there has since been resolved for newer OS versions.
Here's a more generic solution which can only retrieve the path of a file descriptor opened by the calling process. It should work for most Unix-likes out of the box, with all the same concerns as the former solution regarding hard links and race conditions, although it performs slightly faster due to fewer branches and loops:
#include <string>
#include <vector>
#include <cstring>
#include <sys/stat.h>
#include <fts.h>

using std::string;
using std::vector;

string fd2path(int fd) {
  string path;
  FTS *file_system = nullptr;
  FTSENT *child = nullptr;
  FTSENT *parent = nullptr;
  char buffer[2];
  strcpy(buffer, "/");
  // fts_open() expects a NULL-terminated argument vector.
  vector<char *> root;
  root.push_back(buffer);
  root.push_back(nullptr);
  file_system = fts_open(&root[0], FTS_COMFOLLOW | FTS_NOCHDIR, nullptr);
  if (file_system) {
    while ((parent = fts_read(file_system))) {
      child = fts_children(file_system, 0);
      while (child && child->fts_link) {
        child = child->fts_link;
        struct stat info = {0};
        if (!S_ISSOCK(child->fts_statp->st_mode) &&
            !fstat(fd, &info) && !S_ISSOCK(info.st_mode) &&
            child->fts_statp->st_dev == info.st_dev &&
            child->fts_statp->st_ino == info.st_ino) {
          path = child->fts_path + string(child->fts_name);
          goto finish;
        }
      }
    }
  finish:
    fts_close(file_system);
  }
  return path;
}
An even quicker solution, also limited to the calling process but somewhat more performant: wrap all your calls to fopen() and open() with a helper function that stores, in whatever C equivalent there is to a std::unordered_map, the file descriptor paired with the absolute-path version of what was passed to your fopen()/open() wrappers (and the Windows-only equivalents, which won't work on UWP, like _wopen_s() and all that nonsense to support UTF-8). The absolute path can be obtained with realpath() on Unix-likes, or GetFullPathNameW() (the *W form for UTF-8 support) on Windows. realpath() will resolve symbolic links (which aren't nearly as commonly used on Windows), and realpath()/GetFullPathNameW() will also convert a file opened via a relative path into an absolute path. With the file descriptor and absolute path stored in a C equivalent of a std::unordered_map (which you will likely have to write yourself using malloc()'d and eventually free()'d int and C-string arrays), this will again be faster than any solution that dynamically searches your filesystem. But it has a different and unappealing limitation: it will not notice files that were moved around on your filesystem (though at least you can test for existence yourself to detect deletion), nor will it notice whether the file was replaced since you opened it and stored the path, thus potentially giving you outdated results. Let me know if you would like to see a code example of this, though due to files changing location I do not recommend this solution.
Impossible. A file descriptor may have multiple names in the filesystem, or it may have no name at all.
Edit: Assuming you are talking about a plain old POSIX system, without any OS-specific APIs, since you didn't specify an OS.

Linux threads and fopen(), fclose(), fgets()

I'm looking at some legacy Linux code which uses pthreads.
In one thread a file is read via fgets(). The FILE variable is a global variable shared across all threads. (Hey, I didn't write this...)
In another thread every now and again the FILE is closed and reopened with another filename.
For several seconds after this has happened, fgets() in the reading thread acts as if it were still reading the last record it read from the previous file: almost as if there was an error, but fgets() was not returning NULL. Then it sorts itself out and starts reading from the new file.
The code looks a bit like this (snipped for brevity so I hope it's still intelligible):
In one thread:
while (gRunState != S_EXIT) {
    nanosleep(&timer_delay, 0);
    flag = fgets(buff, sizeof(buff), gFile);
    if (flag != NULL) {
        // do something with buff...
    }
}
In the other thread:
fclose(gFile);
gFile = fopen(newFileName,"r");
There's no lock to make sure that the fgets() is not called at the same time as the fclose()/fopen().
Any thoughts as to failure modes which might cause fgets() to fail but not return NULL?
How the described code goes wrong
The stdio library buffers data, allocating memory to store the buffered data. The GNU C library dynamically allocates file structures (some libraries, notably on Solaris, use pointers to statically allocated file structures, but the buffer is still dynamically allocated unless you set the buffering otherwise).
If your thread works with a copy of a pointer to the global file pointer (because you passed the file pointer to the function as an argument), then it is conceivable that the code would continue to access the data structure that was originally allocated (even though it was freed by the close), and would read data from the buffer that was already present. It would only be when you exit the function, or read beyond the contents of the buffer, that things start going wrong - or when the space that was previously allocated to the file structure is reallocated for a new use.
FILE *global_fp;

void somefunc(FILE *fp, ...)
{
    ...
    while (fgets(buffer, sizeof(buffer), fp) != 0)
        ...
}

void another_function(...)
{
    ...
    /* Pass global file pointer by value */
    somefunc(global_fp, ...);
    ...
}
Proof of Concept Code
Tested on MacOS X 10.5.8 (Leopard) with GCC 4.0.1:
#include <stdio.h>
#include <stdlib.h>

FILE *global_fp;
const char etc_passwd[] = "/etc/passwd";

static void error(const char *fmt, const char *str)
{
    fprintf(stderr, fmt, str);
    exit(1);
}

static void abuse(FILE *fp, const char *filename)
{
    char buffer1[1024];
    char buffer2[1024];
    if (fgets(buffer1, sizeof(buffer1), fp) == 0)
        error("Failed to read buffer1 from %s\n", filename);
    printf("buffer1: %s", buffer1);
    /* Dangerous!!! */
    fclose(global_fp);
    if ((global_fp = fopen(etc_passwd, "r")) == 0)
        error("Failed to open file %s\n", etc_passwd);
    if (fgets(buffer2, sizeof(buffer2), fp) == 0)
        error("Failed to read buffer2 from %s\n", filename);
    printf("buffer2: %s", buffer2);
}

int main(int argc, char **argv)
{
    if (argc != 2)
        error("Usage: %s file\n", argv[0]);
    if ((global_fp = fopen(argv[1], "r")) == 0)
        error("Failed to open file %s\n", argv[1]);
    abuse(global_fp, argv[1]);
    return(0);
}
When run on its own source code, the output was:
Osiris JL: ./xx xx.c
buffer1: #include <stdio.h>
buffer2: ##
Osiris JL:
So, empirical proof that on some systems, the scenario I outlined can occur.
How to fix the code
The fix to the code is discussed well in other answers. If you avoid the problem I illustrated (for example, by avoiding global file pointers), that is simplest. Assuming that is not possible, it may be sufficient to compile with the appropriate flags (on many Unix-like systems, the compiler flag '-D_REENTRANT' does the job), and you will end up using thread-safe versions of the basic standard I/O functions. Failing that, you may need to put explicit thread-safe management policies around the access to the file pointers; a mutex or something similar (and modify the code to ensure that the threads use the mutex before using the corresponding file pointer).
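For the last option, a minimal sketch of that mutex policy around the question's shared stream (gFileLock and the helper functions are hypothetical names, not in the original code):

    #include <pthread.h>
    #include <stdio.h>

    FILE *gFile;
    pthread_mutex_t gFileLock = PTHREAD_MUTEX_INITIALIZER;

    /* Reader thread: hold the lock across each fgets() call. */
    int read_line(char *buff, size_t len)
    {
        char *flag;
        pthread_mutex_lock(&gFileLock);
        flag = fgets(buff, (int)len, gFile);
        pthread_mutex_unlock(&gFileLock);
        return flag != NULL;
    }

    /* Switching thread: hold the lock across the close/reopen pair so
     * the reader never sees a stale or half-replaced FILE pointer. */
    void switch_file(const char *newFileName)
    {
        pthread_mutex_lock(&gFileLock);
        fclose(gFile);
        gFile = fopen(newFileName, "r");
        pthread_mutex_unlock(&gFileLock);
    }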
A FILE * is just a pointer to the various resources. If fclose() does not zero out those resources, it's possible that the values still make enough sense that fgets() does not immediately notice.
That said, until you add some locking, I would consider this code completely broken.
Umm, you really need to control access to the FILE stream with a mutex, at the minimum. You aren't looking at some clever implementation of lock-free methods; you are looking at really bad (and dusty) code.
Using thread-local FILE streams is the obvious and most elegant fix; just use locks appropriately to ensure no two threads operate on the same offset of the same file at once. Or, more simply, ensure that threads block (or do other work) while waiting for the file lock to clear. POSIX advisory locks would be best for this, or you're dealing with dynamically growing a tree of mutexes... or initializing a file-lock mutex per thread and making each thread check the other's lock (yuck!) (since files can be renamed).
I think you are staring down the barrel of some major fixes... unfortunately (from what you have indicated) there is no choice but to make them. In this case, it's actually easier to debug a threaded program written in this manner than it would be to debug something using forks; consider yourself lucky :)
You can also use a condition wait (pthread_cond_wait) instead of the nanosleep; the reading thread then gets signalled when intended, e.g. when a new file has been fopen()ed.
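A sketch of that, again with hypothetical names (gFileCond, gFileReady) not present in the original code:

    #include <pthread.h>
    #include <stdio.h>

    FILE *gFile;
    pthread_mutex_t gFileLock = PTHREAD_MUTEX_INITIALIZER;
    pthread_cond_t gFileCond = PTHREAD_COND_INITIALIZER;
    int gFileReady = 0;

    /* Switching thread: reopen, then wake the reader. */
    void reopen_and_signal(const char *newFileName)
    {
        pthread_mutex_lock(&gFileLock);
        if (gFile)
            fclose(gFile);
        gFile = fopen(newFileName, "r");
        gFileReady = 1;
        pthread_cond_signal(&gFileCond);
        pthread_mutex_unlock(&gFileLock);
    }

    /* Reader thread: block until a new file is available instead of
     * polling with nanosleep(). */
    void wait_for_file(void)
    {
        pthread_mutex_lock(&gFileLock);
        while (!gFileReady)
            pthread_cond_wait(&gFileCond, &gFileLock);
        gFileReady = 0;
        /* ... fgets() from gFile while still holding the lock ... */
        pthread_mutex_unlock(&gFileLock);
    }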
