Reading multiple text files in C

What is the correct way to read and extract data from text files when you know that there will be many in a directory? I know that you can use fopen() to get a pointer to the file, and then do something like while (fgets(...) != NULL) {} to read the entire file, but how would I then read from another file? I want to loop through every file in the directory.

Sam, you can use opendir/readdir as in the following little function.
#include <stdio.h>
#include <dirent.h>
static void scan_dir(const char *dir)
{
    struct dirent *entry;
    DIR *d = opendir(dir);
    if (d == NULL) {
        perror("opendir");
        return;
    }
    while ((entry = readdir(d)) != NULL) {
        printf("%s\n", entry->d_name);
        /* read your file here */
    }
    closedir(d);
}
int main(int argc, char **argv)
{
    /* Without this check, argv[1] is NULL when no directory is given. */
    if (argc < 2) {
        fprintf(stderr, "usage: %s directory\n", argv[0]);
        return 1;
    }
    scan_dir(argv[1]);
    return 0;
}
This just opens a directory named on the command line and prints the names of all files it contains. But instead of printing the names, you can process the files as you like...
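As a minimal sketch of the "read your file here" step, the code below opens every regular file in the directory and counts its lines. The names count_lines() and read_all_files() and the 4096-byte path buffer are assumptions made for illustration. Note that readdir() returns bare names, so the directory has to be prepended before calling fopen().

```c
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical processing step: count the newlines in one file.
 * Returns -1 if the file cannot be opened. */
long count_lines(const char *path)
{
    FILE *fp = fopen(path, "r");
    long lines = 0;
    int c;

    if (fp == NULL)
        return -1;
    while ((c = getc(fp)) != EOF)
        if (c == '\n')
            lines++;
    fclose(fp);
    return lines;
}

/* Process every regular file in dir.
 * Returns the number of files read, or -1 if dir cannot be opened. */
int read_all_files(const char *dir)
{
    char path[4096];            /* assumed upper bound on path length */
    struct dirent *entry;
    struct stat st;
    int processed = 0;
    DIR *d = opendir(dir);

    if (d == NULL) {
        perror(dir);
        return -1;
    }
    while ((entry = readdir(d)) != NULL) {
        /* readdir() yields bare names, so build "dir/name". */
        int n = snprintf(path, sizeof path, "%s/%s", dir, entry->d_name);
        if (n < 0 || (size_t)n >= sizeof path)
            continue;           /* path too long, skip it */
        if (stat(path, &st) != 0 || !S_ISREG(st.st_mode))
            continue;           /* skips ".", ".." and subdirectories */
        long lines = count_lines(path);
        if (lines >= 0) {
            printf("%s: %ld lines\n", path, lines);
            processed++;
        }
    }
    closedir(d);
    return processed;
}
```

Calling read_all_files(argv[1]) in place of scan_dir(argv[1]) would process each file instead of only printing its name.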

Typically a list of files is provided to your program on the command line, so the names are available in the array of pointers passed as the second parameter to main(). That is, the invoking shell is used to find all the files in the directory, and your program just iterates through argv[] to open, process, and close each one.
See p. 162 in "The C Programming Language", Kernighan and Ritchie, 2nd edition, for an almost complete template for the code you could use. Substitute your own processing for the filecopy() function in that example.
If you really need to read a directory (or directories) directly from your program, then you'll want to read up on the opendir(3) and related functions in libc. Some systems also offer a library function called ftw(3) or fts(3) that can be quite handy too.
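The argv[] approach might look roughly like the sketch below. It is not the K&R listing; count_bytes() is a hypothetical stand-in for whatever processing you need, and process_files() is a name chosen here for illustration.

```c
#include <stdio.h>

/* Placeholder processing step: count the bytes in an open stream. */
long count_bytes(FILE *fp)
{
    long n = 0;

    while (getc(fp) != EOF)
        n++;
    return n;
}

/* Open, process and close each named file in turn.
 * Returns how many of the files could not be opened. */
int process_files(int count, char *names[])
{
    int failures = 0;

    for (int i = 0; i < count; i++) {
        FILE *fp = fopen(names[i], "r");
        if (fp == NULL) {
            perror(names[i]);   /* report and carry on with the next file */
            failures++;
            continue;
        }
        printf("%s: %ld bytes\n", names[i], count_bytes(fp));
        fclose(fp);
    }
    return failures;
}
```

From main() this would be called as process_files(argc - 1, argv + 1), and the program run as something like ./prog somedir/*.txt so that the shell expands the file list.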

Related

How do you check to see if two different file reference "strings" refer to the same file?

We have a function in straight C that is intended to tell us if two file references refer to the same file. Right now it first checks to see if the passed in parameters are actually the same pointer, and if not, it looks at the characters to see if the strings of characters are the same. But that doesn't account for the possibility of different ways to refer to the same file. For example...
"/d1/d2/d3/theFile" compared to "../d3/theFile"
Those could be the same file or not, depending on the directory structure and where the current point of reference is. I'd like to improve the following function to be able to check to see if the string reference to a file refers to the same file as another string...
static bool is_same_file (const char *f1, const char *f2) {
if (f1==f2)
return true;
return (0==strcmp(f1, f2));
}
I imagine that it might be possible to try opening both files and use that in some way. But I don't know how to check to see if the opened files are the same physical file on the drive, and not just coincidentally the same file that happen to exist in different directories. In C#, there's a FileInfo class that can be used to find a file based on that reference string, and then you can compare complete directory information, file name, and so on. Is there a way to do something similar in C?
On a POSIX system you can use stat() to check whether both paths have the same inode number on the same device (filesystem).
There is a race condition if the file is deleted and a new file gets its inode number between the two calls. Some systems (the BSDs and macOS, for example) provide a generation number, st_gen, that you can also compare to handle this. It is not part of POSIX and is not available on Linux, so it appears only as a comment in the code below.
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>
static bool is_same_file (const char *f1, const char *f2) {
    struct stat s1, s2;
    if (stat(f1, &s1) < 0) {
        perror("stat f1");
        return false;
    }
    if (stat(f2, &s2) < 0) {
        perror("stat f2");
        return false;
    }
    /* Where struct stat has st_gen, also require s1.st_gen == s2.st_gen. */
    return s1.st_dev == s2.st_dev && s1.st_ino == s2.st_ino;
}
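Another way to avoid the race described above, without relying on a generation number, is to hold both files open while comparing them: an inode cannot be reused while a descriptor still refers to it. The following is a sketch under that assumption; is_same_open_file() is a hypothetical name, and unlike stat() this variant needs read permission on both files.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns true only if both paths can be opened and refer to the same file. */
bool is_same_open_file(const char *f1, const char *f2)
{
    struct stat s1, s2;
    bool same = false;
    int fd1 = open(f1, O_RDONLY);
    int fd2 = open(f2, O_RDONLY);

    /* While both descriptors are open, neither inode can be recycled. */
    if (fd1 >= 0 && fd2 >= 0 &&
        fstat(fd1, &s1) == 0 && fstat(fd2, &s2) == 0)
        same = s1.st_dev == s2.st_dev && s1.st_ino == s2.st_ino;

    if (fd1 >= 0)
        close(fd1);
    if (fd2 >= 0)
        close(fd2);
    return same;
}
```

With this, "/d1/d2/d3/theFile" and "../d3/theFile" compare equal whenever they resolve to the same file, including through hard links and symbolic links.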

'fopen' in C can't open existing file in current directory on Unix

I am using fopen(3) in C to read a file and process it. The file is present in the current working directory, where the binary also resides, but I am unable to read it (Linux / Cygwin environment).
Here is the sample code:
C code:
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
FILE *inFile;
static char fileName[255];
int process_file(FILE *inFile)
{
char ch;
inFile = fopen(fileName,"r");
if (inFile == NULL)
{
perror(fileName);
exit(1);
}
else
{
// Process file
}
fclose(inFile);
return 0;
}
int main(int argc, char *argv[])
{
printf("Enter filename to process \n");
scanf("%s", fileName);
process_file(inFile);
getchar();
return 0;
}
I have file permissions set to 777 in the current directory. The resulting binary as well as my source code reside in this directory, where the input file exists. Why is the file not opened?
Update:
This question was written a few years back, and the code could be improved a lot.
1. process_file should accept a char * or char array instead of a file pointer.
2. Unused variables can be removed.
3. Unused libraries or include files can be removed.
4. argv can be used to accept the filename, with its path, from the command line.
5. process_file should return instead of calling exit, with a proper return code instead of always returning 0.
I should have asked this question in a little more detail.
I had three functions to process the same file, process_file1(), process_file2() and process_file3(), and I called fclose() in all three. Somehow the file handle was not closed properly, or the file pointer pointed to EOF, or there was some undefined behavior, and it did not work.
When I used a single process_file function together with rewind(), it worked fine.
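The improvements listed in the update could be sketched roughly as follows. This is one possible shape, not the asker's final code: process_file() takes the path, reports failure through its return value instead of calling exit(), and the per-line work (here just echoing the line) is a placeholder.

```c
#include <stdio.h>

/* Returns 0 on success, -1 if the file cannot be opened. */
int process_file(const char *path)
{
    char line[1024];
    FILE *fp = fopen(path, "r");

    if (fp == NULL) {
        perror(path);           /* say why, and let the caller decide */
        return -1;
    }
    while (fgets(line, sizeof line, fp) != NULL)
        fputs(line, stdout);    /* placeholder for the real processing */
    fclose(fp);
    return 0;
}
```

main() would then check that argc is at least 2, call process_file(argv[1]), and turn a nonzero result into a failing exit status.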
Be sure to enter the file name with its extension; leaving it out may be why the file cannot be read.
If you know the extension of the file, you can enter only the name and have the program add the extension. After scanf("%s", fileName); add strcat(fileName, ".txt"); if you want to enter only the name and the file you read has the extension .txt.
Your inFile and fileName variables are global, so process_file() does not need any parameters; any function can access those variables.
You can also change int process_file(); to void process_file(); and delete return 0; since the return value is not needed.
You have declared the inFile and fileName as global. You should change your function prototype from
int process_file(FILE *inFile)
to
int process_file()
This would at least make your program clearer. Now, regarding your problem: you are almost certainly entering the file name incorrectly (for example, leaving out the file extension). Remember that you need to pass the complete file name, including the extension, which some systems such as Windows hide by default. Otherwise, the logic looks correct to me, and it should work fine.

How to access a particular file in a folder through file handling in C

Suppose I have two text files, abc.txt and def.txt, in a folder named "my". I want a program that goes to that folder and searches for a particular file. If the file is found, how do I access its contents?
I know how to read and write files in C, but I have no idea how to search for a particular file and then read it to match a particular string in it.
All of this should be done through file handling in C.
If anyone has a solution, I would be thankful. An example would be the best way to understand.
Thanks in advance.
To get a listing of the files in a directory in Linux, you can use the 'opendir', 'readdir' and 'closedir' functions from 'dirent.h'. For example:
#include <dirent.h>
#include <stdio.h>
int ListDir(const char *pDirName)
{
    DIR *pDir;
    struct dirent *pEntry;
    pDir = opendir(pDirName);
    if (!pDir)
    {
        perror("opendir");
        return -1;
    }
    while ((pEntry = readdir(pDir)) != NULL)
    {
        printf("%s\n", pEntry->d_name);
    }
    closedir(pDir);
    return 0;
}
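To go from listing to what the question asks for, one option is to compare each entry name with strcmp() and then search the file line by line with strstr(). The sketch below assumes that lines are shorter than the 1024-byte buffer; dir_has_file() and find_in_file() are hypothetical names.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if dir has an entry called name, 0 if not, -1 on error. */
int dir_has_file(const char *dir, const char *name)
{
    struct dirent *entry;
    int found = 0;
    DIR *d = opendir(dir);

    if (d == NULL)
        return -1;
    while (!found && (entry = readdir(d)) != NULL)
        found = strcmp(entry->d_name, name) == 0;
    closedir(d);
    return found;
}

/* Returns the 1-based number of the first line of dir/name that contains
 * needle, 0 if no line matches, -1 if the file cannot be opened. */
long find_in_file(const char *dir, const char *name, const char *needle)
{
    char path[4096];
    char line[1024];            /* assumes lines fit in this buffer */
    long lineno = 0;
    FILE *fp;
    int n = snprintf(path, sizeof path, "%s/%s", dir, name);

    if (n < 0 || (size_t)n >= sizeof path)
        return -1;
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    while (fgets(line, sizeof line, fp) != NULL) {
        lineno++;
        if (strstr(line, needle) != NULL) {
            fclose(fp);
            return lineno;
        }
    }
    fclose(fp);
    return 0;
}
```

For the folder in the question, dir_has_file("my", "abc.txt") would report whether the file is there, and find_in_file("my", "abc.txt", "some text") would report where the string first occurs.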

C++ / C: Move Directory to Another Location

I want to move the contents of one directory to another. I specify the source and destination directories via command line arguments. Here's the code:
#include <stdlib.h>
#include <stdio.h>
void move_dir(FILE *src, FILE *dest) {
int c = getc(src);
while(getc(src)!=EOF) {
putc(c,dest);
}
}
int main(int argc, char* argv[])
{
FILE *src=fopen(argv[1]);
FILE *dest=fopen(argv[2]);
while(--argc>0) {
if(src!=NULL && dest!=NULL) {
move_dir(src,dest);
}
}
fclose(src);
fclose(dest);
return 0;
}
For example:
./a.out /Folder1/Folder2/Source /Folder1
This will move the folder called Source inside of Folder1. However when I execute this code it doesn't work. It compiles just fine with g++ and no errors when running but it just doesn't move anything at all. Any ideas on what could be wrong?
Edit: This is referring to the original post, which read FILE * src = opendir( argv[1] );.
The function opendir() returns a DIR *, which is quite different from a FILE * (and cannot be used as a parameter to getc() / putc()).
You have to read directory entries from that DIR * using readdir(), which will yield a filename, and then copy that file using that information.
Edit: This is referring to the updated post.
You don't use file functions (fopen(), getc() etc.) on directories. The way to go is opendir() followed by readdir(), then acting on the yielded filenames.
I don't really know why fopen() on a directory actually returns a non-null pointer. Personally, I consider this a design flaw, as the operations possible on FILE * are not defined for directories. I would stay well clear of this construct.
Generally speaking, you should read the documentation (man page) of the functions you are using, not (wrongly) assuming things about them. And while you are at it, check return values, too - they might tell you why things don't work as expected.
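The opendir()/readdir() approach described above might be sketched as follows. It is deliberately limited: it handles only regular files directly inside the source directory, expects the destination directory to exist already, and does not preserve permissions or timestamps. copy_file() and move_dir_files() are hypothetical names.

```c
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>

/* Copy one file byte for byte; returns 0 on success, -1 on failure. */
int copy_file(const char *from, const char *to)
{
    char buf[8192];
    size_t n;
    int status = 0;
    FILE *in = fopen(from, "rb");
    FILE *out;

    if (in == NULL)
        return -1;
    out = fopen(to, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            status = -1;
            break;
        }
    }
    if (ferror(in))
        status = -1;
    fclose(in);
    if (fclose(out) != 0)       /* a failed flush also counts as failure */
        status = -1;
    return status;
}

/* Move the regular files in src_dir into the existing dst_dir.
 * Returns the number of files moved, or -1 if src_dir cannot be opened. */
int move_dir_files(const char *src_dir, const char *dst_dir)
{
    char from[4096], to[4096];
    struct dirent *entry;
    struct stat st;
    int moved = 0;
    DIR *d = opendir(src_dir);

    if (d == NULL)
        return -1;
    while ((entry = readdir(d)) != NULL) {
        int a = snprintf(from, sizeof from, "%s/%s", src_dir, entry->d_name);
        int b = snprintf(to, sizeof to, "%s/%s", dst_dir, entry->d_name);
        if (a < 0 || b < 0 ||
            (size_t)a >= sizeof from || (size_t)b >= sizeof to)
            continue;           /* path too long, skip it */
        if (stat(from, &st) != 0 || !S_ISREG(st.st_mode))
            continue;           /* skip ".", ".." and subdirectories */
        /* Delete the original only after the copy succeeded. */
        if (copy_file(from, to) == 0 && remove(from) == 0)
            moved++;
    }
    closedir(d);
    return moved;
}
```

When source and destination are on the same filesystem, a single rename() call on the directory itself is usually simpler; copying is only needed when the move crosses filesystems.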

Traversing a file system from a given root using threads in C on Unix

I want to traverse the file system using threads and processes. My program has to assume that the first parameter is either "-p", which selects a multi-process run, or "-t", which selects a multi-threaded run. The second parameter is the pathname of a file or directory. If my program gets the path of a file, it should print the size of the file in bytes. If it gets the path of a directory, it should print the directory name and then process all the entries in the directory except the directory itself and the parent directory, so that the entire hierarchy rooted at the specified directory is displayed. I wrote something but got stuck and cannot improve my code. Please help me.
My code is as follows:
include
include
include
include
include
include
include
int funcThread(DIR *D);
int main(int argc, char * argv[])
{
pthread_t thread[100];
DIR *dirPointer;
struct stat object_file;
struct dirent *object_dir;
int counter;
if(opendir(argv[1])==NULL)
{
printf("\n\nERROR !\n\n Please enter -p or -t \n\n");
return 0;
}
if((dirPointer=opendir(argv[1]))=="-t")
{
if ((object_dir = opendir(argv[2])) == NULL)
{
printf("\n\nERROR !\n\nPlease enter the third argument\n\n");
return 0;.
}
else
{
counter=0;
while ((object_dir = readdir(object_dir)) != NULL)
{
pthread_create(&thread[counter],NULL,funcThread,(void *) object_dir);
counter++;
}
}
}
return 0;
}
int funcThread(DIR *dPtr)
{
DIR *ptr;
struct stat oFile;
struct dirent *oDir;
int num;
if(ptr=readdir(dPtr)==NULL)
rewinddir(ptr);
if(S_ISDIR(oFile.st_mode))
{
ptr=readdir(dPtr);
printf("\t%s\n",ptr);
return funcThread(ptr);
}
else
{
while(ptr=readdir(dPtr)!=NULL)
{
printf("\n%s\n",oDir->d_name);
stat(oDir->d_name,&oFile);
printf("\n%f\n",oFile.st_size);
}
rewinddir(ptr);
}
}
This line:
if((dirPointer=opendir(argv[1]))=="-t")
dirPointer is a pointer DIR* so how can it be equal to a literal string pointer?
I spotted a few errors:
Why are you using opendir() to check your arguments? You should use something like strcmp() for that.
You're passing a struct dirent * to funcThread(), but funcThread() takes a DIR *.
You're using oFile in funcThread() before you initialize it (by calling stat()).
What is the purpose of calling rewinddir()? I guess you're blindly trying to get readdir() to work with a struct dirent *.
You're using oDir, but it's never initialized.
You're calling printf() from multiple threads with no means of synchronizing the output, so it may come out completely out of order or garbled.
I suggest you read and understand the documentation of all those functions before using them (google "posix function_name") and get familiar with the basics of C. Before bringing threads into the equation, try to get it working as a single-threaded program. Also, you won't see an improvement in performance from using that many threads unless you have close to that many cores; it will actually decrease performance and increase resource usage.
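if(ptr=readdir(dPtr)==NULL){}
The = operator has lower precedence than ==, so this is parsed as ptr = (readdir(dPtr) == NULL). Write if ((ptr = readdir(dPtr)) == NULL) instead.
[this error is repeated several times]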
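Following the advice to get a single-threaded version working first, a sketch might look like this. It checks the mode argument with strcmp(), parenthesizes the readdir() assignment, and calls stat() before looking at st_mode. parse_mode() and walk() are hypothetical names, and because stat() follows symbolic links, a link that points back up the tree would make this version recurse until it fails.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Returns 'p' for "-p", 't' for "-t", and 0 for anything else. */
int parse_mode(const char *arg)
{
    if (strcmp(arg, "-p") == 0)
        return 'p';
    if (strcmp(arg, "-t") == 0)
        return 't';
    return 0;
}

/* Print the hierarchy rooted at path, indented by depth.
 * Returns the number of regular files seen. */
long walk(const char *path, int depth)
{
    struct stat st;
    struct dirent *entry;
    long files = 0;
    DIR *d;

    if (stat(path, &st) != 0) {         /* fill st before testing st_mode */
        perror(path);
        return 0;
    }
    if (!S_ISDIR(st.st_mode)) {
        printf("%*s%s: %lld bytes\n", depth * 2, "", path,
               (long long)st.st_size);
        return S_ISREG(st.st_mode) ? 1 : 0;
    }
    printf("%*s%s/\n", depth * 2, "", path);
    d = opendir(path);
    if (d == NULL) {
        perror(path);
        return 0;
    }
    while ((entry = readdir(d)) != NULL) {      /* parentheses matter here */
        char child[4096];
        int n;

        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;                   /* skip the directory and its parent */
        n = snprintf(child, sizeof child, "%s/%s", path, entry->d_name);
        if (n < 0 || (size_t)n >= sizeof child)
            continue;
        files += walk(child, depth + 1);
    }
    closedir(d);
    return files;
}
```

main() would call parse_mode(argv[1]) and then walk(argv[2], 0). Once that works, threads or processes can be introduced at the point where walk() recurses into a subdirectory.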

Resources