How to read all .txt files in a directory in C

I currently have a short program to read and sort a text file in C.
If I want to read many files, is there a substitute for:
FILE *f;
f = fopen("*.txt", "rw");
Thanks in advance.

f = fopen("*.txt", "rw"); won't work in any case: fopen() does not expand wildcards, and "rw" is not a valid mode string (use "r" to read, or "r+" to read and write).
The usual way to do this probably depends on your operating system. On Unix-like systems, the simple way is to invoke your program with a command line like "my_pgm *.txt" and let the shell find the matching files. (You'll get multiple arguments, each one being a file name.) I understand that Microsoft operating systems would require the program to find the files itself.
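A minimal sketch of the command-line approach, assuming the shell has already expanded *.txt into separate arguments. The names count_lines and process_files are made up for illustration, and counting lines merely stands in for whatever per-file work the program does:

```c
#include <stdio.h>

/* Hypothetical helper: counts newline characters in an open stream. */
static long count_lines(FILE *f)
{
    long lines = 0;
    int c;
    while ((c = fgetc(f)) != EOF)
        if (c == '\n')
            lines++;
    return lines;
}

/* Opens each of names[0..count-1] in turn; returns how many could be opened. */
static int process_files(int count, char **names)
{
    int opened = 0;
    for (int i = 0; i < count; i++) {
        FILE *f = fopen(names[i], "r");
        if (f == NULL) {
            perror(names[i]);   /* report the failure and move on */
            continue;
        }
        printf("%s: %ld lines\n", names[i], count_lines(f));
        fclose(f);
        opened++;
    }
    return opened;
}
```

From main(int argc, char **argv) this would be called as process_files(argc - 1, argv + 1), so that the program name in argv[0] is skipped.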
To do that more or less portably, I'd probably use opendir() and readdir() to examine directory entries and see whether they matched the desired pattern.
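Something along these lines might work for the opendir()/readdir() route on a POSIX system. The helper names has_suffix and list_matching are invented for this sketch, and the match is a plain suffix comparison rather than real wildcard matching:

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if name ends with suffix (for example ".txt"), otherwise 0. */
static int has_suffix(const char *name, const char *suffix)
{
    size_t n = strlen(name), s = strlen(suffix);
    return n >= s && strcmp(name + n - s, suffix) == 0;
}

/* Prints every entry of dirpath whose name ends in suffix.
   Returns the number of matches, or -1 if the directory cannot be opened. */
static int list_matching(const char *dirpath, const char *suffix)
{
    DIR *dir = opendir(dirpath);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (has_suffix(entry->d_name, suffix)) {
            printf("%s/%s\n", dirpath, entry->d_name);
            count++;
        }
    }
    closedir(dir);
    return count;
}
```

Calling list_matching(".", ".txt") lists the candidates in the current directory; each printed path could then be passed to fopen(). The sketch does not check whether an entry is a regular file, so a directory named foo.txt would also be listed.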

Related

check if file exists, case sensitive in C

What is a good way to check that a file exists with case sensitivity in C on Windows?
I have got this to work by comparing the filename with all the file entries in the directory of the filename. Is there a more efficient method in C?
Use this:
WIN32_FIND_DATAA FindFileData;
HANDLE h = FindFirstFileA(filenametocheck, &FindFileData);
Now FindFileData.cFileName contains the filename as it is stored in NTFS.
All you need to do is compare filenametocheck with FindFileData.cFileName.
Don't forget to do error checking (FindFirstFileA returns INVALID_HANDLE_VALUE on failure) and to close the h handle with FindClose(h).
This works only for checking in the current directory, because cFileName holds only the file name without the directory part. If filenametocheck contains a path (e.g. ..\somefile.txt or C:\Somedir\Somefile.txt), you need to do some more work.
For further details read the documentation of FindFirstFile and possibly look into this sample.
Be aware that depending on what exactly you're trying to achieve, this may cause a TOCTOU bug as mentioned in a comment.

How can I detect if a file exists?

I'm programming in C and trying to write portable code.
My question is, how can I tell if a file exists and is readable?
I am currently using the code:
f = fopen(filename, "r");
if (f) printf("File exists!");
In my program, filename is set by the user, and should be considered as untrusted input (e.g. filename could be maliciously crafted).
My issue is that the above code is not robust. For example, when used on Windows with the filename "PRN" it will print "File exists!" even though no such file exists on the filesystem.
I know I could filter out the reserved filenames on Windows, as there are only about a dozen of them, but that feels like a hack. Also, I only know what I know. Maybe there are other "reserved" or "special" names that I don't know about.
Is there any simple and portable way to determine if a file exists in C?
Alternatively, if I have to use an OS API, which function should I use?

File opening with only base name in C

In C, how can I open a file by giving only the base name of the file? For example, the suffix part of the name may be anything, but the base name is always the same: Unit_123, Unit_245, Unit_658.
I want to give only the base name, irrespective of a suffix such as 123, 245 or 658, and have the file open.
In a Linux shell script, this can be achieved by giving the file name followed by an asterisk (*); for example, if we give Unit*, it matches the file name irrespective of the suffix. How can I achieve this in C?
There is no standard way in C to do this. It is operating system dependent.
You need to iterate over the files in a directory with wildcards. The C standard doesn't provide any function for this, but there are, of course, platform-dependent solutions:
On Linux or other POSIX systems, you can use glob(3), which can take wildcards like those understood in the shell.
On Windows, there are FindFirstFile and FindNextFile, which take at least asterisks and question marks as wildcards.
To achieve what you want in C, the usual way on POSIX systems involves getting a directory listing for the directory holding the files of interest. If the files are located in a single directory, then the function scandir will fill an array of pointers to struct dirent with the filenames from the directory. scandir takes as its 3rd argument a filter function of the type:
int (*filter)(const struct dirent *)
This allows you to match only the filenames that satisfy the criteria you provide in the filter function.
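As a rough sketch of the scandir approach, assuming a POSIX system and the Unit_ prefix from the question. The names unit_filter and list_units are hypothetical:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* asks glibc to declare scandir and alphasort */
#endif
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Filter for scandir: keep only entries whose name starts with "Unit_". */
static int unit_filter(const struct dirent *entry)
{
    return strncmp(entry->d_name, "Unit_", 5) == 0;
}

/* Prints the matching names in dirpath in sorted order.
   Returns the number of matches, or -1 if scandir fails. */
static int list_units(const char *dirpath)
{
    struct dirent **names;
    int n = scandir(dirpath, &names, unit_filter, alphasort);
    if (n < 0)
        return -1;

    for (int i = 0; i < n; i++) {
        printf("%s\n", names[i]->d_name);
        free(names[i]);   /* scandir allocates each entry with malloc */
    }
    free(names);          /* and the array that holds them */
    return n;
}
```

The caller owns the memory scandir returns, which is why both the entries and the array are freed here.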
If you need to search a directory tree for files/sub-directories, then the functions you want are ftw and nftw. Both walk the tree and call a function you supply for each file and/or sub-directory they find (nftw also takes FLAGS that adjust the walk), and that function can then pick out the matching files. Take a look at all and decide what will fit your needs the best.
None of these functions represent the only way to obtain and parse file listings in C. They are simply the general functions that come to mind to do what you describe.
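For the directory-tree case, a sketch with nftw could look like the following. It assumes a POSIX system, reuses the Unit_ prefix from the question, and the names visit, find_units and match_count are made up for illustration:

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* asks glibc to declare nftw in <ftw.h> */
#endif
#include <ftw.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* nftw callbacks receive no user pointer, so the count lives in a static. */
static int match_count;

/* Called by nftw for every entry; counts regular files named "Unit_*". */
static int visit(const char *path, const struct stat *sb, int type,
                 struct FTW *ftwbuf)
{
    (void)sb;                                 /* unused in this sketch */
    const char *base = path + ftwbuf->base;   /* name without directories */
    if (type == FTW_F && strncmp(base, "Unit_", 5) == 0) {
        printf("%s\n", path);
        match_count++;
    }
    return 0;   /* returning zero tells nftw to keep walking */
}

/* Walks the tree under root; returns the number of matches, or -1 on error. */
static int find_units(const char *root)
{
    match_count = 0;
    if (nftw(root, visit, 16, FTW_PHYS) != 0)   /* 16 = max open descriptors */
        return -1;
    return match_count;
}
```

FTW_PHYS makes the walk skip following symbolic links, which avoids loops; whether that is the right choice depends on how the files are laid out.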

C - Reading multiple files

I just had a general question about how to approach a certain problem I'm facing. I'm fairly new to C, so bear with me here. Say I have a folder with 1000+ text files; the files are not named in any kind of numbered order, but they are alphabetical. For my problem I have files of stock data, and each file is named after the company's respective ticker. I want to write a program that will open each file, read the data, find the historical low, compare it to the current price, calculate the percent change, and then print it. Searching and calculating are not a problem; the problem is getting the program to go through and open each file. The only way I can see to attack this is to create a text file containing all of the ticker symbols, have the program read that into an array, and then run a loop that opens the first filename in the array, performs the calculations, prints the output, closes the file, and then loops back around, moving to the second element (the next ticker symbol) in the array. This would be fairly simple to set up (I think) but I'd really like to avoid typing out over a thousand file names into a text file. Is there a better way to approach this? Not really asking for code (unless there is some amazing function in C that will do this for me ;) ), just some advice from more experienced C programmers.
Thanks :)
Edit: This is on Linux, sorry I forgot to mention that!
Under Linux/Unix (BSD, OS X, POSIX, etc.) you can use opendir / readdir to go through the directory structure. No need to generate static files that need to be updated, when the file system has the information you want. If you only want a sub-set of stocks at a given time, then using glob would be quicker; there is also scandir.
I don't know what Win32 (Windows / Platform SDK) functions are called, if you are developing using Visual C++ as your C compiler. Searching MSDN Library should help you.
Assuming you're running on linux...
ls /path/to/text/files > names.txt
is exactly what you want.
opendir() on Linux:
http://linux.die.net/man/3/opendir
Example:
http://snippets.dzone.com/posts/show/5734
In pseudo code it would look like this, I cannot define the code as I'm not 100% sure if this is the correct approach...
for each directory entry
    scan the filename
    extract the ticker name from the filename
    open the file
    read the data
    create a record consisting of the filename, data.....
    close the file
    add the record to a list/array...
sort the list/array into alphabetical order based on the ticker name in the filename...
You could vary it slightly if you wish: scan the filenames in the directory entries and sort them first, building a record for each filename, then go back to the start of the list/array and open each file individually, reading the data and putting it into the record.
Hope this helps,
best regards,
Tom.
There are no functions in standard C that have any notion of a "directory". You will need to use some kind of platform-specific function to do this. For some examples, take a look at this post from Cprogramming.com.
Personally, I prefer using the opendir()/readdir() approach as shown in the second example. It works natively under Linux and also on Windows if you are using Cygwin.
Approach 1) I would just have a specific directory in which I have ONLY these files containing the ticker data and nothing else. I would then use the C readdir API to list all files in the directory and iterate over each one performing the data processing that you require. Which ticker the file applies to is determined only by the filename.
Pros: Easy to code
Cons: It really depends where the files are stored and where they come from.
Approach 2) Change the file format so the ticker files start with a magic code identifying that this is a ticker file, and a string containing the name. As before, use readdir to iterate through all files in the folder and open each file, ensure that the magic number is set, read the ticker name from the file, and process the data as before.
Pros: More flexible than before. Filename needn't reflect name of ticker
Cons: Harder to code, file format may be fixed.
"but I'd really like to avoid typing out over a thousand file names into a text file. Is there a better way to approach this?"
I have solved the exact same problem a while back, albeit for personal uses :)
What I did was to use the OS shell commands to generate a list of those files and redirected the output to a text file and had my program run through them.
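On UNIX, there's the handy glob function:
#include <glob.h>   /* glob, globfree */
#include <stdio.h>
#include <string.h> /* memset */

glob_t results;
size_t i;
memset(&results, 0, sizeof(results));
if (glob("*.txt", 0, NULL, &results) == 0) {   /* 0 means at least one match */
    for (i = 0; i < results.gl_pathc; i++)
        printf("%s\n", results.gl_pathv[i]);
}
globfree(&results);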
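To go from listing the matches to actually reading them, one possible sketch wraps glob in a function that opens each match in turn. The name for_each_match is made up, and the per-file work is left as a comment because it depends on the data format:

```c
#include <glob.h>
#include <stdio.h>

/* Opens every file matching pattern. Returns how many could be opened,
   0 if nothing matched, or -1 if glob itself failed. */
static int for_each_match(const char *pattern)
{
    glob_t results = {0};
    int rc = glob(pattern, 0, NULL, &results);
    if (rc != 0) {
        globfree(&results);
        return rc == GLOB_NOMATCH ? 0 : -1;
    }

    int opened = 0;
    for (size_t i = 0; i < results.gl_pathc; i++) {
        FILE *f = fopen(results.gl_pathv[i], "r");
        if (f == NULL) {
            perror(results.gl_pathv[i]);   /* skip files that cannot be read */
            continue;
        }
        /* per-file processing (reading and sorting, finding a low, ...) goes here */
        fclose(f);
        opened++;
    }
    globfree(&results);
    return opened;
}
```

For the stock-data question this might be called as for_each_match("/path/to/text/files/*"), with the path being a placeholder for wherever the ticker files live.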
On Linux or a related system, you could use the fts library. It's designed for traversing file hierarchies: see man fts.
Or even something as simple as readdir.
If on Windows, you can use their Directory Management APIs. More specifically, the FindFirstFile function, used with wildcards, in conjunction with FindNextFile.

How to check whether two file names point to the same physical file

I have a program that accepts two file names as arguments: it reads the first file in order to create the second file. How can I ensure that the program won't overwrite the first file?
Restrictions:
The method must keep working when the file system supports (soft or hard) links
File permissions are fixed and it is only required that the first file is readable and the second file writeable
It should preferably be platform-neutral (although Linux is the primary target)
On Linux, open both files, and use fstat to check whether st_ino and st_dev are the same. open will follow symbolic links. Don't use stat directly, to prevent race conditions.
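A minimal sketch of that check, assuming a POSIX system. The function names same_file_fd and same_file are invented here, and both paths are opened read-only purely for the comparison:

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if both descriptors refer to the same file, 0 if they do not,
   and -1 if fstat fails on either of them. */
static int same_file_fd(int fd1, int fd2)
{
    struct stat a, b;
    if (fstat(fd1, &a) != 0 || fstat(fd2, &b) != 0)
        return -1;
    /* An inode number is only unique within one device, so compare both. */
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}

/* Opens both paths read-only and compares them.
   Returns -1 if either path cannot be opened. */
static int same_file(const char *path1, const char *path2)
{
    int fd1 = open(path1, O_RDONLY);
    if (fd1 < 0)
        return -1;
    int fd2 = open(path2, O_RDONLY);
    if (fd2 < 0) {
        close(fd1);
        return -1;
    }
    int result = same_file_fd(fd1, fd2);
    close(fd1);
    close(fd2);
    return result;
}
```

In the real program the output file may not exist yet, in which case it cannot be the input file. It is probably better to call same_file_fd on the descriptors the program actually reads from and writes to, and to open the output without O_TRUNC until the check has passed.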
The best bet is not to use filenames as identities. Instead, when you open the file for reading, lock it, using whatever mechanism your OS supports. When you then also open the file for writing, also lock it - if the lock fails, report an error.
If possible, open the first file read-only (O_RDONLY) on Linux. Be aware that this does not by itself make a second open for writing fail, since Linux file locks are advisory; you still need a check such as comparing inode numbers.
You can use stat to get the file status, and check if the inode numbers are the same.
Maybe you could use the system() function in order to invoke some shell commands?
In bash, you would simply call:
stat -c %i filename
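This displays the inode number of a file. You can compare two files this way and if their inodes are identical, it means they are hard links to the same file (assuming both are on the same filesystem, since inode numbers are only unique per device). The following call: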
stat -c %N filename
will display the file's name and if it's a symbolic link, it'll print the file name it links to as well. It prints out only one name, even if the file it points to has hard links, so checking the symbolic link would require comparing inode numbers for the 2nd file and the file the symbolic links links to in order to make sure.
You could redirect stat output to a text file and then parse the file in your program.
If you mean the same inode, in bash, you could do
[ FILE1 -ef FILE2 ] && echo equal || echo difference
Combined with realpath/readlink, that should handle the soft-links as well.
