Where does fopen() search for File to read? - c

The question is self-descriptive. I just want to know where fopen() searches on:
a) Windows
b) Unix-like systems such as macOS and Linux
when asked to open a file for reading, for reading and writing, or just for writing, with a relative path such as "File.txt". I need an answer covering both text and binary files (if they differ at all in this regard).
Does it scan only the current directory, or does it scan particular folders?
(Scanning the full disk would be painfully slow, right?)
Edit:
Why the downvotes? Because y'all simply don't know?

fopen() doesn't scan at all.
It just opens the file you tell it to open.
The path is either absolute, or relative to the current directory.
The behaviour is pretty much the same across platforms.
Of course in Windows paths look a bit different (drive letters, backslashes instead of slashes).
One relevant difference I can think of:
If the path starts with a drive letter and a colon, it refers to that drive rather than the current one.
If there is no backslash after the drive letter and colon, the location is relative to that drive's current working directory (Windows remembers a current directory per drive letter).
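As a minimal sketch of the point above for POSIX systems (the helper name opens_relative_to is hypothetical, not part of any library): the same relative name succeeds or fails depending only on the current working directory, because fopen() never looks anywhere else.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: change into dir, then try to open the relative
 * path name for reading. Returns 1 if fopen() succeeded, 0 otherwise. */
int opens_relative_to(const char *dir, const char *name)
{
    assert(dir != NULL && name != NULL);
    if (chdir(dir) != 0)
        return 0;
    FILE *fp = fopen(name, "r");   /* resolved against the new cwd only */
    if (fp == NULL)
        return 0;                  /* no other directory is tried */
    fclose(fp);
    return 1;
}
```

If File.txt exists only in /tmp, then opens_relative_to("/tmp", "File.txt") returns 1 while opens_relative_to("/", "File.txt") returns 0. Text versus binary mode ("r" versus "rb") plays no part in how the path is resolved.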

Related

Copying a directory using sockets

I'm writing a program in C that sends files across the network using sockets. This works fine for files - they are read into a buffer and then written onto the socket. They are picked up at the other end by reversing this process.
However, how can this apply to directories? I also want to copy directories, keeping the permissions the same (so I don't think mkdir will work). At the moment when I try to run this on a directory, it says the size is -1. How is a directory represented?
To be clear, for example, if I want my program to copy /tmp across the network, it will do this:
/tmp/1.txt - OK
/tmp/2.txt - OK
/tmp/dir/ - Skip
/tmp/dir/3.txt - Can't write to path
There are several possibilities. It would fit fairly well with what you already have to tar the directory, send the resulting archive across the network, and untar it on the other side.
Alternatively, you can walk the directory tree recursively. For each directory you need transfer only the name and whichever attributes you want to preserve, but then you must list the directory contents (probably via readdir()) and transfer each member.
By the way, don't neglect to think about how you're going to handle links, both symbolic ones and hard ones. And if you want your program to be really robust then consider also what to do with special files such as device files and FIFOs.
I guess it is homework, otherwise why not use FTP, scp, rsync, unison etc.
To test whether a file path is a plain file, a device, a directory, etc., use stat(2).
To read a directory, use opendir(3) then loop on readdir(3) (then of course closedir). You don't need to know how a directory is represented.
You probably should be interested in nftw(3) to recursively traverse a file tree.
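One possible shape for such a traversal, as a sketch only: walk_tree, visit, n_dirs and n_files are hypothetical names, and the printf calls stand in for whatever the program would actually send over the socket. Without the FTW_DEPTH flag, nftw() reports a directory before its contents, so a receiver could create it before any file inside it arrives.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <ftw.h>
#include <stdio.h>
#include <sys/stat.h>

static int n_dirs, n_files;   /* counters filled in by the callback */

/* Called by nftw() once per entry. The permission bits are in st->st_mode. */
static int visit(const char *path, const struct stat *st, int type,
                 struct FTW *ftw)
{
    (void)ftw;
    if (type == FTW_D) {            /* a directory, seen before its contents */
        printf("DIR  %s mode %o\n", path, (unsigned)(st->st_mode & 07777));
        n_dirs++;
    } else if (type == FTW_F) {     /* a non-directory, non-symlink entry */
        printf("FILE %s size %lld\n", path, (long long)st->st_size);
        n_files++;
    }
    return 0;                       /* 0 tells nftw() to keep walking */
}

/* Walks root without following symbolic links (FTW_PHYS).
 * Returns 0 on success, like nftw() itself. */
int walk_tree(const char *root)
{
    assert(root != NULL);
    n_dirs = n_files = 0;
    return nftw(root, visit, 16, FTW_PHYS);
}
```

The third argument to nftw() (16 here) is only the maximum number of directory file descriptors it may hold open at once; it does not limit the depth of the walk.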
To make one directory, use mkdir(2)
You should read Advanced Linux Programming
BTW, this answer contains useful information too...

Convert windows device filename to drive letter

I am trying to get the filename associated with a process handle in C, and since my code needs to run on Windows XP I'm using GetProcessImageFileName (rather than QueryFullProcessImageName).
However, GetProcessImageFileName returns the path in device form, e.g. \device\harddiskvolume0\ - how can I convert this to a drive letter?
I was going to suggest GetModuleFileNameEx like Luke did in comments.
QueryDosDevice() on all drive letters (you can find all the drive letters with GetLogicalDrives()) would be another bet, though it's theoretically possible that you could get a path without a drive letter, or symbolic links could screw up a straightforward string compare.
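The comparison step can be sketched in portable C, leaving the Windows calls out. Everything named here (struct drive_map, device_to_drive, has_prefix_ci) is hypothetical; on Windows the table would presumably be filled by calling QueryDosDevice() for each letter reported by GetLogicalDrives(). The match is case-insensitive and must end at a backslash, so that HarddiskVolume1 is not mistaken for a prefix of HarddiskVolume10.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* One table row: a device name and the drive letter that maps to it. */
struct drive_map { const char *device; const char *drive; };

/* Case-insensitive test that s begins with prefix. */
static int has_prefix_ci(const char *s, const char *prefix)
{
    for (; *prefix; s++, prefix++)
        if (tolower((unsigned char)*s) != tolower((unsigned char)*prefix))
            return 0;
    return 1;
}

/* Rewrites nt_path into drive-letter form using the table.
 * Returns 1 and fills out on success, 0 if no entry matches or out is
 * too small. */
int device_to_drive(const char *nt_path, const struct drive_map *map,
                    size_t n, char *out, size_t out_size)
{
    assert(nt_path != NULL && map != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(map[i].device);
        if (has_prefix_ci(nt_path, map[i].device) &&
            (nt_path[len] == '\\' || nt_path[len] == '\0')) {
            int w = snprintf(out, out_size, "%s%s",
                             map[i].drive, nt_path + len);
            return w >= 0 && (size_t)w < out_size;
        }
    }
    return 0;   /* e.g. a network path with no drive letter */
}
```

A return value of 0 covers the case mentioned above where a path has no drive letter at all.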
But how about this: you should be able to prefix the NT path with \\?\GLOBALROOT (this is from memory, so it might not be exactly that) and then use it in functions like CreateFileW(). (AFAIK it has to be the Unicode versions of the file APIs.)
You can try convert the drive letter form to device form instead, and here is what I did, hope it helps:
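TCHAR szTemp[MAX_PATH] = {0};       /* holds just the drive part, e.g. "c:" */
TCHAR szImageFile[MAX_PATH] = {0};  /* receives the device-form path */
_tcsncpy(szTemp, lpszImageFile, 2);
/* QueryDosDevice returns 0 on failure; append the rest of the path only on success */
if (QueryDosDevice(szTemp, szImageFile, MAX_PATH) != 0)
    _tcsncat(szImageFile, lpszImageFile + 2, MAX_PATH - _tcslen(szImageFile) - 1);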
In this code, lpszImageFile is the full path name of the process, e.g. c:\program files\test.exe.

Spaces in the filename

Do any file systems care about spaces at the beginning or end of a file or directory name?
Do they convert "/ directory /" to "/directory/" when creating a file?
English is not my native language, so I apologize for any mistakes.
Yes they do care.
For instance in Linux Ext3 / Ext4:
touch "file1"
touch " file1"
touch "file1 "
This will create three different files: one without spaces, one with a leading space, and one with a trailing space.
It works just the same with directories, as Linux follows the Unix principle that everything is a file.
Windows' file-naming rules advise against using trailing spaces in file or directory names, even though the underlying filesystem may support them.
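The same experiment can be repeated from C. This is a sketch assuming a POSIX system; touch_file and same_file are hypothetical helper names. Two names that differ only by a space refer to different files if their device and inode numbers differ.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Creates path if it does not exist yet, roughly like touch(1).
 * Returns 1 on success, 0 on failure. */
int touch_file(const char *path)
{
    assert(path != NULL);
    FILE *f = fopen(path, "a");
    if (f == NULL)
        return 0;
    fclose(f);
    return 1;
}

/* Returns 1 if both paths exist and name the same file, 0 otherwise. */
int same_file(const char *a, const char *b)
{
    struct stat sa, sb;
    assert(a != NULL && b != NULL);
    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return 0;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

After touch_file("file1") and touch_file("file1 "), same_file("file1", "file1 ") would be expected to return 0 on a filesystem that preserves the trailing space, such as ext4.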

Tcl determine file name from browser upload

I have run into a problem in one of my Tcl scripts where I am uploading a file from a Windows computer to a Unix server. I would like to get just the original file name from the Windows file and save the new file with the same name. The problem is that [file tail windows_file_name] does not work, it returns the whole file name like "c:\temp\dog.jpg" instead of just "dog.jpg". File tail works correctly on a Unix file name "/usr/tmp/dog.jpg", so for some reason it is not detecting that the file is in Windows format. However Tcl on my Windows computer works correctly for either name format. I am using Tcl 8.4.18, so maybe it is too old? Is there another trick to get it to split correctly?
Thanks
The problem here is that on Windows, both \ and / are valid path separators as far as the Windows API is concerned (even though only \ is deemed "official" on Windows). In POSIX, on the other hand, the only valid path separator is /, and the only two bytes that can't appear in a pathname component are / and \0 (a byte with value 0).
Hence, on a POSIX system, "C:\foo\bar.baz" is a perfectly valid short filename, and running
file normalize {C:\foo\bar.baz}
would yield /path/to/current/dir/C:\foo\bar.baz. By the same logic, [file tail $short_filename] is the same as $short_filename.
The solution is to either do what Glenn Jackman proposed or to somehow pass the short name from the browser via some other means (some JS bound to an appropriate file entry?). Also you could attempt to detect the user's OS from the User-Agent header.
To make Glenn's idea more agnostic to user's platform, you could go like this:
Scan the file name for "/".
If none is found, do set fname [string map {\\ /} $fname], then go to the next step.
Use [file tail $fname] to extract the tail name.
It's not very bullet-proof, but supposedly better than nothing.
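For comparison, the same heuristic sketched in C rather than Tcl (upload_tail is a hypothetical name): if the name contains a "/", it is treated as a POSIX-style path; otherwise backslashes are taken to be the separators.

```c
#include <assert.h>
#include <string.h>

/* Returns a pointer to the last component of name, using '/' as the
 * separator if the name contains one and '\\' otherwise. The result points
 * into name itself; nothing is copied or modified. */
const char *upload_tail(const char *name)
{
    assert(name != NULL);
    char sep = strchr(name, '/') != NULL ? '/' : '\\';
    const char *last = strrchr(name, sep);
    return last != NULL ? last + 1 : name;
}
```

Like the Tcl steps, this guesses wrong for a POSIX file whose name legitimately contains a backslash and no slash, so it shares the "better than nothing" caveat.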
You could always do [lindex [split $windows_file_name \\] end]

What corner cases must we consider when parsing $PATH on Linux?

I'm working on a C application that has to walk $PATH to find full pathnames for binaries, and the only allowed dependency is glibc (i.e. no calling external programs like which). In the normal case, this just entails splitting getenv("PATH") by colons and checking each directory one by one, but I want to be sure I cover all of the possible corner cases. What gotchas should I look out for? In particular, are relative paths, paths starting with ~ meant to be expanded to $HOME, or paths containing the : char allowed?
One thing that once surprised me is that the empty string in PATH means the current directory. Two adjacent colons or a colon at the end or beginning of PATH means the current directory is included. This is documented in man bash for instance.
It also is in the POSIX specification.
So
PATH=:/bin
PATH=/bin:
PATH=/bin::/usr/bin
all mean that the current directory is in PATH.
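A splitter therefore must not skip empty fields, which rules out a plain strtok() loop since strtok() collapses adjacent delimiters. Here is a minimal sketch; path_entry is a hypothetical helper that copies the entry at a given index and substitutes "." for an empty one:

```c
#include <assert.h>
#include <string.h>

/* Copies the index-th colon-separated entry of path into out, replacing an
 * empty entry with ".". Returns 1 on success, 0 if there is no such entry
 * or it does not fit in out. */
int path_entry(const char *path, size_t index, char *out, size_t out_size)
{
    assert(path != NULL && out != NULL && out_size > 0);
    const char *start = path;
    for (;;) {
        const char *end = strchr(start, ':');
        size_t len = end ? (size_t)(end - start) : strlen(start);
        if (index == 0) {
            if (len == 0) {          /* empty field: current directory */
                start = ".";
                len = 1;
            }
            if (len >= out_size)
                return 0;
            memcpy(out, start, len);
            out[len] = '\0';
            return 1;
        }
        if (end == NULL)             /* ran out of entries */
            return 0;
        start = end + 1;
        index--;
    }
}
```

For "/bin::/usr/bin" this yields "/bin", ".", "/usr/bin" for indexes 0, 1 and 2, and reports no entry at index 3.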
I'm not sure this is a problem with Linux in general, but make sure your code works if PATH contains non-ASCII (for example, UTF-8 encoded) directory names. I suspect this might depend on the filesystem encoding.
I remember working on a bug report from a Russian user who had non-ASCII letters in his user name (and hence in his home directory name, which appeared in PATH).
This is minor, but I'll add it since it hasn't been mentioned yet. $PATH can include both absolute and relative paths. If you're crawling the path list by chdir(2)ing into each directory, you need to keep track of the original working directory (getcwd(3)) and chdir(2) back to it at each iteration of the crawl.
The existing answers cover most of it, but it's worth covering the parts of the question that haven't been answered yet:
$ and ~ are not special in the value of $PATH.
If $PATH is not set at all, execvp() will use a default value.
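Putting the pieces together, here is one way the lookup could be sketched without calling chdir(2) at all: each candidate is built as a string and tested in place, so relative entries are resolved against the unchanged working directory. The name find_in_path is hypothetical, and the regular-file check is an extra precaution of this sketch because access(2) with X_OK also succeeds for searchable directories.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Searches the colon-separated list path for an executable regular file
 * called name. On success writes the candidate into out and returns 1;
 * returns 0 if nothing matches. "~" and "$" are used literally. */
int find_in_path(const char *name, const char *path, char *out, size_t out_size)
{
    assert(name != NULL && path != NULL && out != NULL);
    if (strchr(name, '/') != NULL)
        return 0;                   /* names with a slash are not searched */
    for (const char *p = path;;) {
        size_t len = strcspn(p, ":");
        /* an empty entry means the current directory */
        int w = len == 0
            ? snprintf(out, out_size, "./%s", name)
            : snprintf(out, out_size, "%.*s/%s", (int)len, p, name);
        struct stat st;
        if (w > 0 && (size_t)w < out_size &&
            stat(out, &st) == 0 && S_ISREG(st.st_mode) &&
            access(out, X_OK) == 0)
            return 1;
        if (p[len] == '\0')
            return 0;               /* last entry tried */
        p += len + 1;
    }
}
```

The caller would normally pass getenv("PATH"). When that returns NULL, a fallback list has to be chosen; what execvp() itself uses is implementation-defined, so any specific default (for instance the value reported by confstr() for _CS_PATH) is an assumption of the caller rather than something this sketch decides.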
