readdir(3) strange behavior : finding non existing files in /dev/


I'm using opendir / readdir / closedir to write a program similar to ls. It went pretty well until I tried it on "/dev/". When it reaches "/dev/fd/" with the recursive option, it finds more files than actually exist. These are not hidden files (I mean files whose names start with '.').
The real ls gives me:
"/dev/fd/ :"
"0 1 2 3"
Mine too.
But in gdb, it finds 3 more files: 4, 5 and 6. I heard that gdb creates its own environment, so let's set that aside.
When I run ls "/dev/fd/" -R, the real ls stops listing immediately, while my program gives:
"/dev/fd/3:"
"/dev/fd/3/3/"
"/dev/fd/3/3/......../10"
stat returns -1 after at least 40 files, but execution continues and ends in a segmentation fault.
On my computer, "/dev/fd/3/" and the like are symbolic links. The macro "S_ISDIR" returns 0 for the existing files, but for non-existing files like "/dev/fd/6/" it returns 1...
I want to know why my program goes wrong while the real ls doesn't. I noticed that ls uses stat64 on my computer, but when I use it too it still goes wrong. ls also uses fstat64, futex and other syscalls that I don't know.
I can show samples of my code or give more detail; it's really hard for me to explain, sorry about that.
Thank you.
PS : I don't get that statement in the readdir manpage : "The data returned by readdir may be overwritten by subsequent calls to readdir for the same directory stream"

PS : I don't get that statement in the readdir manpage : "The data
returned by readdir may be overwritten by subsequent calls to readdir
for the same directory stream"
What they are basically saying is that the function is not re-entrant, and that the pointer returned by readdir shouldn't simply be cached as a unique value, as the underlying data that is being pointed to will change the next time you call the readdir function. Basically they are allowing for implementations to either define statically allocated data that can be recycled by the function, or dynamic memory managed by the OS, so that the caller of readdir does not have to worry about managing the memory pointed to by the return value of readdir. For instance, for a sample function like:
int* my_sample_increment()
{
static int val = 0;
val++;
return &val;
}
if you were to do something like
int* int_ptr_1 = my_sample_increment();
int* int_ptr_2 = my_sample_increment();
Then both int_ptr_1 and int_ptr_2 will point to the same static variable, and after these two calls its value will be 2. Each pointer won't be pointing to a unique integer value.
So the same is true with readdir. You cannot simply call readdir, store the returned pointer, and expect to use it later: any subsequent call to readdir on the same directory stream between the time you saved the pointer and the time you use it may modify the data it points to. If you need the data to survive, copy it out before the next call. The re-entrant version, readdir_r, was historically offered for this, though recent glibc versions mark it as deprecated.
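As a minimal sketch of the copy-before-the-next-call approach (collect_names is a hypothetical helper, not part of any library), each d_name is duplicated with strdup() so that later readdir() calls cannot overwrite it:

```c
#define _POSIX_C_SOURCE 200809L /* strdup() and dirent under strict -std modes */
#include <assert.h>
#include <dirent.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: store up to `max` entry names from `path` in `names`.
 * Returns the number of names stored, or -1 if the directory cannot be
 * opened. If strdup() fails, the names collected so far are returned. */
int collect_names(const char *path, char **names, int max)
{
    assert(path != NULL && names != NULL && max >= 0);

    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;
    while (count < max && (entry = readdir(dir)) != NULL) {
        /* Skip "." and ".." so a recursive caller never loops on them. */
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0)
            continue;

        /* Copy the name now: `entry` may be overwritten by the next readdir(). */
        char *copy = strdup(entry->d_name);
        if (copy == NULL)
            break;
        names[count++] = copy;
    }

    closedir(dir);
    return count;
}
```

The caller owns the copies and has to free() each one. For a recursive lister like the one in the question, it would probably also help to call lstat() on each entry and check its return value before testing S_ISDIR: the contents of struct stat are unspecified when the call fails, and lstat() reports a symbolic link as a link instead of following it.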

Related

Why is this C program doing nothing in Ubuntu?

My very simple C program just hangs and I don’t know why.
I am trying to make a simple executable to handle multiple monotonous actions for me every time I start a new programming session.
So I started with something simple (below), yet every time I run it the app just hangs and never returns, so I have to Ctrl-C out of it. I have added printf commands to see if it goes anywhere, but those never appear.
My build command returns no error messages:
gcc -o tail tail.c
Just curious what I am missing.
#include <stdio.h>
#include <unistd.h>
int main() {
chdir("\\var\\www");
return 0;
}

There are at least two problems with the source code:
It is unlikely that you have a sub-directory called \var\www in your current directory — Ubuntu uses / and not \ for path separators.
Even if there was a sub-directory with the right name, your program would change directory to it but that wouldn't affect the calling program.
You should check the return value from chdir() — at minimum:
if (chdir("/var/www") != 0)
{
perror("chdir");
exit(EXIT_FAILURE);
}
And, as Max pointed out, calling your program by the name of a well-known utility such as tail is likely to lead to confusion. Use a different name.
Incidentally, don't use test as a program name either. That, too, will lead to confusion as it is a shell built-in as well as an executable in either /bin or /usr/bin. There is also a program /bin/cd or /usr/bin/cd on your machine — it will check that it can change directory, but won't affect the current directory of your shell. You have to invoke it explicitly by the full pathname to get it to run at all because cd is another shell built-in.

Two things:
First, that's not what Linux paths look like
Second, check the return value from chdir()
ie
if (chdir("/var/www") != 0)
printf("failed to change directory");
Finally, the effect of chdir() lasts for the duration of the program. It will not change the current directory of your shell once this program finishes.
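To make both points concrete, here is a small sketch (enter_dir is a made-up helper name) that checks the result of chdir() and then asks getcwd() where the process ended up. It only ever reports the working directory of the process that calls it:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: change directory, then store the resulting working
 * directory of THIS process in `cwd`. Returns 0 on success, -1 on failure. */
int enter_dir(const char *path, char *cwd, size_t cwd_size)
{
    assert(path != NULL && cwd != NULL);

    if (chdir(path) != 0) {
        perror("chdir");   /* e.g. "chdir: No such file or directory" */
        return -1;
    }
    if (getcwd(cwd, cwd_size) == NULL) {
        perror("getcwd");
        return -1;
    }
    return 0;
}
```

Calling enter_dir("\\var\\www", ...) fails unless the current directory really contains an entry whose name includes the backslashes, while enter_dir("/var/www", ...) succeeds if that directory exists. Either way, the shell that started the program stays where it was.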

The other answers adequately cover the issues in your C code. However, the reason you are seeing it hang is because you chose the name tail for your program.
In Linux, tail is a command in /usr/bin in most setups, and if you just type tail at the command line, the shell searches the $PATH first, and runs this. Without any parameters, it waits for input on its stdin. You can end it by pressing control-d to mark the end of file.
You can bypass the $PATH lookup by typing ./tail instead.
$ tail
[system tail]
$ ./tail
[tail in your current directory]
It is a good idea to use ./ as a habit, but you can also avoid confusion by not naming your program the same as common commands. Another name to avoid is test which is a shell built-in for testing various aspects of files, but appears to do nothing as it reports results in its system return code.

Weird behavior with filename in /proc/pid/stat

I'm seeing weird behavior with the value of filename in /proc/pid/stat.
My program's name is "test_dev", and when I execute it with "./test_dev" and look at /proc/pid/stat, I see "pid (test) ....".
Same in /proc/pid/status.
I changed "test_dev" to "testdev" to see whether the underscore was the culprit, but the same thing happened again.
I printed argv[0], and I correctly see "test_dev" (or "testdev").
I wonder how the field in stat is set and why it's incomplete, because the proc man page says it is the filename of the executable.
After thinking about it a little, I wonder whether Eclipse could be the culprit.
This IDE has surprised me before, and I wouldn't be surprised if that were the case again, even though it really bugs me that argv[0] and /proc/pid/stat don't have the same value.
Does anybody have an explanation?
Thanks.

However, the real filename without "_dev" is 15 character long, and
the "comm" field seem to be limited by the kernel to 15 character long
... So, the filename is truncated in /proc/pid/stat, … Where can I
find this kind of documentation ?
Unfortunately, this seems to be not well documented where one would expect it, but there are some hints here and there on the proc man page.
/proc/[pid]/comm (since Linux 2.6.33)
… Strings longer than TASK_COMM_LEN (16) characters are silently truncated.
/proc/[pid]/stat
Status information about the process. This is used by ps(1).
It is defined in the kernel source file fs/proc/array.c.
TASK_COMM_LEN and the task_struct are defined in include/linux/sched.h:
#define TASK_COMM_LEN 16
…
struct task_struct {
…
char comm[TASK_COMM_LEN];
The difference between the 15 characters you observe and the 16 here is due to the terminating '\0'.
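To see the limit from user space on Linux, here is a small sketch that sets and reads back the calling thread's comm with prctl(PR_SET_NAME) and prctl(PR_GET_NAME). This is only one way to observe the same 16-byte buffer; it does not reproduce how the kernel fills comm from the executable's file name at exec time. roundtrip_comm is a made-up name.

```c
#include <assert.h>
#include <string.h>
#include <sys/prctl.h>

/* Linux-only sketch. Sets the calling thread's comm to `name` and reads it
 * back into `out`, which must hold at least 16 bytes (TASK_COMM_LEN).
 * Returns the length of the name the kernel actually kept, or -1 on error. */
int roundtrip_comm(const char *name, char out[16])
{
    assert(name != NULL && out != NULL);

    if (prctl(PR_SET_NAME, (unsigned long)name, 0UL, 0UL, 0UL) != 0)
        return -1;
    if (prctl(PR_GET_NAME, (unsigned long)out, 0UL, 0UL, 0UL) != 0)
        return -1;

    /* The kernel silently truncates to 15 characters plus the '\0'. */
    return (int)strlen(out);
}
```

With a name longer than 15 characters, the value read back is cut to its first 15 characters, which matches what /proc/[pid]/comm and the second field of /proc/[pid]/stat show for the main thread.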

The field in /proc/pid/stat you mention does not show argv[0], but the executed command. Eclipse probably compiles the file to an executable named "test" and executes that.
This example:
execl("foo", "bar", "baz", (char *)NULL);
has a comm of "foo" (the value shown in /proc/pid/stat), an argv[0] of "bar", and "baz" as argv[1].
To double-check, try naming your program "xxxxx" and see whether it is still called "test" in /proc/pid/stat.

How to test existence of 'file' that cannot be accessed?

I have a full path stored in char full_path[1000] and I'd like to know whether anything exists at that location.
(The next thing my code would do after this check is create something at that location, but I want it to count as an error if there is already something there instead of clearing the spot with the equivalent of rm -rf)
The spot might be occupied by a file or a directory or even a link to some no-longer-existing-target e.g.:
lrwxrwxrwx 1 user grp 4 Jun 16 20:02 a-link -> non-existent-thing
With an invalid link, access(full_path, F_OK) is going to tell me I can't access full_path and that could be because (1) nothing exists there or because (2) the link is invalid.
Given that, what's the best way to determine if anything exists at full_path?

There is no obvious cross-platform way to do that. stat() and fopen() would not work, because both follow the link and fail when its target is missing. You can, though, use the OS API; for example, on Windows you could use WinAPI with code like this:
int doesFileExist(TCHAR* path)
{
WIN32_FIND_DATA FindFileData;
HANDLE handle = FindFirstFile(path, &FindFileData) ;
int found = (handle != INVALID_HANDLE_VALUE);
if(found)
{
FindClose(handle);
}
return found;
}
If you just want to check whether anything exists, you can also walk the path one directory at a time (again using the Windows API) and return an error as soon as one component is missing. Once you reach the last directory, check for the file itself in the way shown above.
Say you have C:/Dir1/Dir2/Dir3/file.txt: you'd start at C:, check whether Dir1 exists, enter it if it does and return an error if it doesn't. Do the same for Dir2 and so on until you reach the last directory, then check for the file. If you are looking for any item rather than a specific file, try the functions documented on MSDN for finding the first file or first directory.

Since the next thing we plan to do is create something at that location, and since we want to treat it as an error if something already exists there, then let's not bother checking. Just attempt the create and exit with an error if it fails. The create step uses symlink so if it fails we can use strerror(errno) for the explanation of any failure.
To create a file in general (vs just a symlink), EOF points out in comments that open(path, O_CREAT|O_EXCL, mode) will return failure (-1) if the file already exists, and create the file otherwise. This is atomic and safe (unlike trying to stat) and guaranteed to work atomically on POSIX (with the possible exception of NFS).
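A minimal POSIX sketch of both ideas follows; the helper names are made up for illustration. path_is_occupied() uses lstat(), which does not follow a symbolic link in the last path component and so still reports a dangling link as present. create_exclusive() is the O_CREAT|O_EXCL call described above, which also refuses to create through a symbolic link.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if something occupies `path` (including a dangling symlink),
 * 0 if no such entry exists, -1 on any other error such as EACCES. */
int path_is_occupied(const char *path)
{
    struct stat st;
    assert(path != NULL);

    if (lstat(path, &st) == 0)
        return 1;
    return (errno == ENOENT) ? 0 : -1;
}

/* Atomically create a new regular file. Returns an open descriptor, or -1
 * with errno set (EEXIST if the name is already taken). */
int create_exclusive(const char *path)
{
    assert(path != NULL);
    return open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
}
```

Checking first and creating afterwards leaves a window in which another process can take the name, so the check is best kept for error messages while the exclusive create does the real work.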

Using File Descriptors with readlink()

I have a situation where I need to get a file name so that I can call the readlink() function. All I have is an integer that was originally stored as a file descriptor via an open() command. Problem is, I don't have access to the function where the open() command executed (if I did, then I wouldn't be posting this). The return value from open() was stored in a struct that I do have access to.
char buf[PATH_MAX];
char tempFD[2]; //file descriptor number of the temporary file created
tempFD[0] = fi->fh + '0';
tempFD[1] = '\0';
char parentFD[2]; //file descriptor number of the original file
parentFD[0] = (fi->fh - 1) + '0';
parentFD[1] = '\0';
if (readlink(tempFD, buf, sizeof(buf)) < 0) {
log_msg("\treadlink() error\n");
perror("readlink() error");
} else
log_msg("readlink() returned '%s' for '%s'\n", buf, tempFD);
This is part of the FUSE file system. The struct is called fi, and the file descriptor is stored in fh, which is of type uint64_t. Because of the way this program executes, I know that the two linked files have file descriptor numbers that are always 1 apart. At least that's my working assumption, which I am trying to verify with this code.
This compiles, but when I run it, my log file shows a readlink error every time. My file descriptors have the correct integer values stored in them, but it's not working.
Does anyone know how I can get the file name from these integer values? Thanks!

If it's acceptable that your code becomes non portable and is tied to being run on a somewhat modern version of Linux, then you can use /proc/<pid>/fd/<fd>. However, I would recommend against adding '0' to the fd as a means to get the string representing the number, because it uses the assumption that fd < 10.
However it would be best if you were able to just pick up the filename instead of relying on /proc. At the very least, you can replace calls to the library's function with a wrapper function using a linker flag. Example of usage is gcc program.c -Wl,-wrap,theFunctionToBeOverriden -o program, all calls to the library function will be linked against __wrap_theFunctionToBeOverriden; the original function is accessible under the name __real_theFunctionToBeOverriden. See this answer https://stackoverflow.com/a/617606/111160 for details.
But, back to the answer not involving linkage rerouting: you can do it something like
char fd_path[100];
snprintf(fd_path, sizeof(fd_path), "/proc/%d/fd/%d", (int)getpid(), (int)fi->fh);
You should now use this /proc/... path (it is a softlink) rather than using the path it links to.
You can call readlink to find the actual path in the filesystem. However, doing so introduces a security vulnerability and I advise against using the path readlink returns.
When the file the descriptor points at is deleted (unlinked), you can still access it through the /proc/... path. However, when you call readlink on it, you get the original pathname with the text ' (deleted)' appended.
If your file was /tmp/a.txt and it gets deleted, readlink on the /proc/... path returns "/tmp/a.txt (deleted)". If a file with that exact name exists, opening the returned path gives you that file rather than the one you wanted (the original /tmp/a.txt). An attacker may be able to place hostile contents in a file named "/tmp/a.txt (deleted)".
On the other hand, if you just access the file through the /proc/... path, you will access the correct (unlinked but still alive) file, even if the path claims to be a link to something else.
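For completeness, here is a Linux-only sketch of the lookup itself. It uses /proc/self/fd/<fd>, which refers to the same entries as /proc/<pid>/fd/<fd> for the calling process, and fd_to_path is a made-up helper name. Note two details that the code in the question misses: readlink() has to be given the /proc path rather than the bare number, and it does not null-terminate its result.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Linux-only sketch: copy the path recorded for an open descriptor into
 * `buf`. Returns the path length, or -1 on error or if the result may have
 * been truncated. */
ssize_t fd_to_path(int fd, char *buf, size_t size)
{
    char link[64];
    assert(buf != NULL && size > 0);

    /* Format the whole number; adding '0' to the fd only works for 0..9. */
    snprintf(link, sizeof(link), "/proc/self/fd/%d", fd);

    ssize_t len = readlink(link, buf, size - 1);
    if (len < 0)
        return -1;
    if ((size_t)len == size - 1)
        return -1;          /* buffer filled completely: possibly truncated */

    buf[len] = '\0';        /* readlink() does not add a terminator */
    return len;
}
```

The returned string is best treated as informational (for logging, say), for the reasons given above; to actually reach the file, open the /proc/... path itself.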

Why does FUSE readdir returns Input/output error?

I am seeing a strange issue while implementing the readdir() functionality in fuse. Basically when I do ls on any directory in fuse, I get an error such as:
# ls
ls: reading directory .: Input/output error
file1.c file2.c
But the strange thing is, readdir() is doing exactly what it is supposed to do. In the sense that in that particular directory, I have two files named file1.c and file2.c and it is able to read it correctly.
While debugging the issue I noticed that fuse filler function (fuse_fill_dir_t passed as an argument to readdir() ) is what may be causing this error.
This is because if I simply print the contents of the directory using a debug printf without returning the contents using the filler function, I do not see the error.
But as soon as I start using the filler function to return the contents, I start seeing this error.
I have two questions related to this:
1) Anybody have any idea as to why the filler function might be causing this problem?
2) How do I look for the definition of the code for the fuse_fill_dir_t function? I have looked through most of the fuse functions with that kind of arguments but have had no luck until now.
Any help is appreciated!
Cheers,
Vinay

Such messages may be caused by failed calls to other (possibly unimplemented) FUSE callbacks such as getxattr(). readdir() is then called and returns correct results.
You can debug a FUSE filesystem by running its executable with the -d option (debug mode), which keeps the process from daemonizing and prints detailed debug output about FUSE calls.
Also, it would help to know your platform (Linux, OS X, etc.).
