C: Providing regular expression as argument to open()?

Is there a way to open a file with open() without knowing its full name?
The Linux shell provides an easy way to do that (in some sense), by accepting regular expressions as input.
For example, if you have a folder containing the files:
a.out file1 file2 file3 file4 file.txt test
and you want to list only the files with the prefix file you can do so by:
$ ls file*
file1 file2 file3 file4 file.txt
Or:
$ ls file[1-9]
file1 file2 file3 file4
To list only numbered files and so on...
I need to open the same file whenever my program launches.
The problem is, the file it needs to open is of the form: X*Y, meaning it starts with an X and ends with Y, but it could be anything in between.
For example, it could be X-Toshiba_12.45y9-Y, or it might be X-Dell-5.44s-Y.
I want to be able to open this file without having to consider the model.
The file may reside with some other files in that folder, but the X prefix and Y postfix are unique.
I could iterate the files in that folder and try to find my file by matching strings, but I'd rather avoid it.
Is there a way to provide open() with a regular expression somehow?

These are not regular expressions! You are talking about glob patterns.
You can use the POSIX.1-2001 glob() function to expand a glob pattern (like *.* or foo-*.?a* or *.[a-z]* and so on) to an array of filenames/pathnames that match the given pattern (starting at the current working directory, unless the pattern specifies an absolute path). This is basically what most shells use when they expand file name patterns.
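For the X*Y case in the question, a minimal sketch with glob() might look like this. The helper name open_first_match is made up for illustration. It simply takes the first path that glob() reports (results are sorted by default), so it assumes the prefix and postfix really are unique.

```c
#include <fcntl.h>
#include <glob.h>
#include <stddef.h>

/* Open the first path matching a glob pattern such as "X*Y".
 * Returns a file descriptor, or -1 if nothing matches or open() fails.
 * open_first_match is a hypothetical helper name. */
int open_first_match(const char *pattern, int flags)
{
    glob_t results;
    int fd = -1;

    /* glob() returns 0 on success and GLOB_NOMATCH when nothing matches */
    if (glob(pattern, 0, NULL, &results) == 0) {
        if (results.gl_pathc > 0)
            fd = open(results.gl_pathv[0], flags);
        globfree(&results);
    }
    return fd;
}
```

A call such as open_first_match("X*Y", O_RDONLY) searches the current working directory; prefix the pattern with a directory to search somewhere else.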
If you were hell-bent on using regular expressions to specify file names (say, you need find-type behaviour, but with regular expressions), use the SUSv4 nftw() function to traverse a directory tree (it even handles the corner cases, like fewer descriptors than tree depth, or files modified, renamed, or moved during tree traversal), and the POSIX regex functions to filter on the file names. Note: regcomp(), regexec() etc. are built into POSIX.1-2001-supporting C libraries, and that includes just about all current C library implementations for Linux. No external libraries are needed at all.
It makes me very sad to see example code using opendir()/readdir() to traverse a directory tree, when nftw() is available and much smarter and more robust. Just define _XOPEN_SOURCE 700 and _POSIX_C_SOURCE 200809L to get all these nice features in Linux and many *BSD variants, too.
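The regex half of that can be sketched on its own. The helper below is hypothetical (name_matches_regex is not a library function). It compiles the expression on every call to keep the example short; real code would compile once and reuse the regex_t, for example from inside an nftw() callback.

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if name matches the POSIX extended regular expression,
 * 0 if it does not, and -1 if the expression fails to compile.
 * name_matches_regex is a hypothetical helper name. */
int name_matches_regex(const char *name, const char *expression)
{
    regex_t re;
    int rc;

    /* REG_NOSUB: we only need a yes/no answer, not match offsets */
    if (regcomp(&re, expression, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, name, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

For the files in the question, the expression "^X.*Y$" plays the role of the glob pattern X*Y.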

Check this example
#include <stdio.h>
#include <string.h>
#include <limits.h> /* PATH_MAX */
#include <sys/stat.h>
#include <dirent.h>
/* Return 1 if haystack begins with needle, 0 otherwise */
int
startswith(const char *const haystack, const char *const needle)
{
    size_t haystackLength;
    size_t needleLength;
    if ((haystack == NULL) || (needle == NULL))
        return 0;
    haystackLength = strlen(haystack);
    needleLength = strlen(needle);
    if (haystackLength < needleLength)
        return 0;
    return (memcmp(haystack, needle, needleLength) == 0);
}
/* Return 1 if haystack ends with needle, 0 otherwise */
int
endswith(const char *const haystack, const char *const needle)
{
    size_t haystackLength;
    size_t needleLength;
    if ((haystack == NULL) || (needle == NULL))
        return 0;
    haystackLength = strlen(haystack);
    needleLength = strlen(needle);
    if (haystackLength < needleLength)
        return 0;
    return (memcmp(haystack + haystackLength - needleLength, needle, needleLength) == 0);
}
/* Print every file below directory whose name starts with starts and ends with ends */
void
searchdir(const char *const directory, const char *const starts, const char *const ends)
{
    DIR *dir;
    struct dirent *entry;
    dir = opendir(directory);
    if (dir == NULL)
        return;
    while ((entry = readdir(dir)) != NULL)
    {
        struct stat statbuf;
        char filepath[PATH_MAX];
        int length;
        const char *name;
        name = entry->d_name;
        if ((strcmp(name, ".") == 0) || (strcmp(name, "..") == 0))
            continue;
        length = snprintf(filepath, sizeof(filepath), "%s/%s", directory, name);
        if ((length < 0) || ((size_t) length >= sizeof(filepath)))
        {
            fprintf(stderr, "unexpected error\n");
            closedir(dir);
            return;
        }
        if (stat(filepath, &statbuf) == -1)
        {
            fprintf(stderr, "cannot stat `%s'\n", filepath);
            continue;
        }
        /* if the entry is a directory, recurse into it */
        if (S_ISDIR(statbuf.st_mode) != 0)
            searchdir(filepath, starts, ends);
        /* or just continue? */
        /* The file name does not match */
        if ((startswith(name, starts) == 0) || (endswith(name, ends) == 0))
            continue;
        /* Do whatever you want with the file */
        fprintf(stdout, "%s\n", filepath);
    }
    closedir(dir);
}
int
main(int argc, char **argv)
{
    if (argc < 4)
    {
        fprintf(stderr, "usage: %s directory startpattern endpattern\n", argv[0]);
        fprintf(stderr, "\tex. %s /home/${USER} X Y\n", argv[0]);
        return -1;
    }
    searchdir(argv[1], argv[2], argv[3]);
    return 0;
}
What you do with the file is up to you: anything from pushing the path into a char * array to passing a function pointer to searchdir() and calling that function on each matching file path.
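The function-pointer variant could look roughly like this. Everything here is a sketch: the file_callback type, for_each_match and remember_path are invented names, the 4096-byte buffers are an assumption, and the walk is non-recursive to keep it short.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback type: receives the full path and a user pointer. */
typedef void (*file_callback)(const char *filepath, void *context);

/* Call callback for each entry in directory whose name starts with starts
 * and ends with ends. Returns the number of matches, or -1 if the
 * directory cannot be opened. Does not recurse. */
int for_each_match(const char *directory, const char *starts,
                   const char *ends, file_callback callback, void *context)
{
    DIR *dir = opendir(directory);
    struct dirent *entry;
    size_t startsLength = strlen(starts);
    size_t endsLength = strlen(ends);
    int matches = 0;

    if (dir == NULL)
        return -1;
    while ((entry = readdir(dir)) != NULL) {
        const char *name = entry->d_name;
        size_t length = strlen(name);
        char filepath[4096];
        int written;

        if (strcmp(name, ".") == 0 || strcmp(name, "..") == 0)
            continue;
        if (length < startsLength || length < endsLength)
            continue;
        if (memcmp(name, starts, startsLength) != 0)
            continue;
        if (memcmp(name + length - endsLength, ends, endsLength) != 0)
            continue;
        written = snprintf(filepath, sizeof(filepath), "%s/%s", directory, name);
        if (written < 0 || (size_t)written >= sizeof(filepath))
            continue; /* path too long for the buffer, skip it */
        callback(filepath, context);
        matches++;
    }
    closedir(dir);
    return matches;
}

/* Example callback: copy the path into a caller-supplied buffer,
 * which is assumed to hold at least 4096 bytes. */
void remember_path(const char *filepath, void *context)
{
    char *buffer = context;
    snprintf(buffer, 4096, "%s", filepath);
}
```

With a char found[4096] buffer, for_each_match(".", "X", "Y", remember_path, found) leaves the last matching path in found and returns how many files matched.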

Related

How to find file in directory in c

I'm trying to create a function in C which scans a given directory for files of a given format (for example, _sort.txt) and checks whether the directory contains a file of that format.
If the directory contains a file of that format, it should return -1, else 0.
But I am stuck on how to scan the directory. Can anyone help me with this?
I am fairly new to C, so please bear with me.
Operating system: Linux
You can use strstr to find the pattern in the file name.
Don't forget to include these headers:
#include <dirent.h>
#include <stdio.h>
#include <string.h>
/* Returns 0 if a regular file whose name contains pattern is found, -1 otherwise */
int find_file(const char *pattern, DIR *dr) {
    struct dirent *de;
    int found = -1;
    while ( (de = readdir(dr)) != NULL) {
        // DT_REG is a regular file
        if (de->d_type == DT_REG && strstr(de->d_name, pattern) != NULL) {
            found = 0;
            break;
        }
    }
    return found;
}
Wherever you use this function, pass it a DIR * that was opened with opendir. Note that it returns 0 when a matching file is found and -1 otherwise.
int main(void) {
    DIR *d = opendir(".");
    const char *pattern = "_sort.txt";
    if (d == NULL) {
        perror("opendir");
        return 1;
    }
    if (find_file(pattern, d) == 0)
        printf("found it\n");
    else
        printf("not found\n");
    closedir(d);
    return 0;
}
For further information about the dirent struct:
man readdir
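One caveat: strstr matches the pattern anywhere in the name, so a file like data_sort.txt.bak would also count. If the format has to be at the end of the name, a suffix check along these lines may fit better. has_suffix is a made-up helper name, not a library function.

```c
#include <string.h>

/* Return 1 if name ends with suffix, 0 otherwise.
 * has_suffix is a hypothetical helper name. */
int has_suffix(const char *name, const char *suffix)
{
    size_t nameLength = strlen(name);
    size_t suffixLength = strlen(suffix);

    if (nameLength < suffixLength)
        return 0;
    /* compare only the tail of name against suffix */
    return strcmp(name + nameLength - suffixLength, suffix) == 0;
}
```

Inside the readdir loop you would then test has_suffix(de->d_name, pattern) instead of the strstr call.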

Program to print directories using c not working, issues with recursive call

I need to create a program that basically acts like the list utility on Linux. I've been trying to get this to work and I'm pretty close, but now I've gotten stuck. Essentially it will print whatever files and sub-directories are contained within a directory (i.e. if I run ./project3, it lists whatever is in that directory). However, once I try to get the recursive call working it spits out something like:
sh: 1: /home/RageKage/Documents/Project3/dir1: Permission denied
That's where I'm stuck, I'm not exactly sure what to do from here. I'm getting the path of the directory to explore using realpath and that works fine, but the recursive call just isn't working and I'm not exactly sure what I'm doing wrong. Any help would be appreciated as I'm relatively new to this.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <dirent.h>
#include <string.h>
#include <limits.h>
int main (int argc, char *argv[])
{
DIR *dir;
struct dirent *sd;
const char *direct;
char buf[PATH_MAX + 1];
if (argc < 2)
{
direct = ".";
}else{
direct = argv[1];
//printf("Hey this is argv[1]: %s\n", argv[1]);
}
dir = opendir(direct);
if (dir == NULL)
{
printf("ERROR! NO DIRECTORY TO OPEN!\n");
exit(1);
}
while( (sd=readdir(dir)) != NULL )
{
if (!strcmp(sd->d_name, ".") || !strcmp(sd->d_name, ".."))
{
}else{
printf("\n>> %s\n", sd->d_name);
}
if (!strcmp(sd->d_name, "..") || !strcmp(sd->d_name, "."))
{
}else if (sd->d_type == 4){
printf("Attempting to Run!\n");
realpath(sd->d_name, buf);
printf("[%s]\n", buf);
system(("./project3 %s", buf));
printf("\n");
}
}
closedir(dir);
return 0;
}
system(("./project3 %s", buf));
Are you recursively calling the program itself again? That sounds a bit inefficient, and hard to do since you'd need to know where the executable file is. In general it could be just about anywhere (starting with /bin, /usr/bin etc.), and all you are likely to get in argv[0] is the filename part, not the whole path.
Also, as said in the comments, func((this, that)) is the same as func(that), not func(this, that), since the parentheses make the comma act as the comma operator, not as an argument separator. And system() only takes one argument anyway, so you'd need to use sprintf() to build the command line. (Or perhaps use the exec() functions to actually give separate arguments without invoking a shell, but then you need to do the fork(), too.)
I'd suggest scrapping that idea, putting the directory tree walking into a function of its own, and calling that recursively:
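void walkpath(void)
{
    DIR *dir = opendir(".");
    struct dirent *sd;
    if (dir == NULL)
        return;
    while ((sd = readdir(dir)) != NULL) {
        /* ... skip "." and "..", print the entry ... */
        if (sd->d_type == DT_DIR) {
            /* descend only if chdir succeeded, then come back up */
            if (chdir(sd->d_name) == 0) {
                walkpath();
                chdir("..");
            }
        }
    }
    closedir(dir);
}
int main(...)
{
    /* ... */
    chdir(directory);
    walkpath();
}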
I used chdir here to change the process's working directory along with the walk. If you need to track the full directory name, then you'll need to add that.
Also, now you have the test for . and .. twice. Use continue to end that iteration of the loop so you don't need to test the same thing again.
if (strcmp(sd->d_name, ".") == 0 || strcmp(sd->d_name, "..") == 0) {
continue;
}
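If you do need the full directory name, one option is to skip chdir and build each child path as you go. This is only a sketch: walk_tree is a made-up name, the 4096-byte path buffer is an assumption, and it uses lstat() instead of d_type so that it does not depend on the filesystem filling in d_type.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Recursively print every entry below path and return how many
 * non-directory entries were seen. walk_tree is a hypothetical name. */
int walk_tree(const char *path)
{
    DIR *dir = opendir(path);
    struct dirent *sd;
    int count = 0;

    if (dir == NULL) {
        perror(path);
        return 0;
    }
    while ((sd = readdir(dir)) != NULL) {
        char child[4096];
        struct stat st;
        int written;

        if (strcmp(sd->d_name, ".") == 0 || strcmp(sd->d_name, "..") == 0)
            continue;
        written = snprintf(child, sizeof(child), "%s/%s", path, sd->d_name);
        if (written < 0 || (size_t)written >= sizeof(child))
            continue; /* path too long for the buffer */
        if (lstat(child, &st) == -1)
            continue;
        printf("%s\n", child);
        if (S_ISDIR(st.st_mode))
            count += walk_tree(child); /* lstat: symlinks are not followed */
        else
            count++;
    }
    closedir(dir);
    return count;
}
```

Calling walk_tree(".") from main replaces both the chdir(directory) and the walkpath() call.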

How to code my own version of mv (rename/move) unix command in C language?

I want to write my own version of the Unix move (mv) command. I am completely new to C and lost on how to fix my code. I want it to rename a file if both inputs are file names. If dest_folder is a directory, I would like to move the file into that directory.
But I am unable to fix the code for this problem, as I am not very familiar with directories or with C in particular. The program takes two inputs, source and destination, and then performs the necessary operations. I am able to rename files, but for some reason I cannot move a file into a particular folder.
I need help with moving a file into a particular directory.
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <dirent.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#define SBUF 256
#define DBUF 256
int main(int ac, char *argv[])
{
DIR* dir_ptr; // the directory
struct dirent* direntp;
if( ac == 1 )
{
printf("Usage: %s MOVE\n", argv[0] );
exit(0);
}
if(ac>1 && ac<3)
{
printf("Error! few arguments provided " );
exit(0);
}
char src_folder[SBUF];
char dest_folder[DBUF];
strcpy(src_folder, argv[1]);
strcpy(dest_folder, argv[2]);
dir_ptr = opendir("."); //open directory
if ( dir_ptr == NULL )
{
perror( "." );
exit( 1 );
}
while( (direntp = readdir( dir_ptr )) != NULL )
{
if ( strcmp(direntp->d_name, dest_folder) !=0) //search file or directory
{
printf("found the file %s", dest_folder);
break;
}else
printf("not found");
break;
}
rename(src_folder, dest_folder);
closedir( dir_ptr );
return 0;
}
rename(3) does not work the way you want it to work (I don't know why, ask the committee). You cannot do a rename(some_file, some_directory), just as the man-page says.
Just use stat(2) (or lstat(2) if necessary) and check what you have been given. Here is a short, runnable sketch.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>
#include <errno.h>
// check if it is the same inode on the same device
#define SAME_INODE(a, b) ((a).st_ino == (b).st_ino && (a).st_dev == (b).st_dev)
// ALL CHECKS OMITTED!
int main(int argc, char **argv)
{
struct stat statbuf_src, statbuf_dest;
char *src, *dest, *new_src, *new_dest;
char *current_directory;
if (argc != 3) {
fprintf(stderr, "usage: %s src dest\n", argv[0]);
exit(EXIT_FAILURE);
}
// work on copy
src = malloc(strlen(argv[1]) + 1);
dest = malloc(strlen(argv[2]) + 1);
strcpy(src, argv[1]);
strcpy(dest, argv[2]);
stat(src, &statbuf_src);
stat(dest, &statbuf_dest);
// there are many more, of course
printf("\"%s\" is ", src);
if (S_ISREG(statbuf_src.st_mode)) {
puts("a regular file");
}
if (S_ISDIR(statbuf_src.st_mode)) {
puts("a directory");
}
printf("\"%s\" is ", dest);
if (S_ISREG(statbuf_dest.st_mode)) {
puts("a regular file");
}
if (S_ISDIR(statbuf_dest.st_mode)) {
puts("a directory");
}
if (SAME_INODE(statbuf_dest, statbuf_src)) {
printf("%s and %s are identical\n", src, dest);
}
// if that is not set you have to do it by hand:
// climb up the tree, concatenating names until the inodes are the same
current_directory = getenv("PWD");
printf("current directory is \"%s\"\n", current_directory);
// I'm pretty sure it can be done in a much more elegant way
new_src = malloc(strlen(src) + 1 + strlen(current_directory) + 1);
strcpy(new_src,current_directory);
strcat(new_src,"/");
strcat(new_src,src);
printf("new_src = %s\n",new_src);
new_dest = malloc(strlen(dest) + 1 + strlen(current_directory) + 1 + strlen(src) + 1);
strcpy(new_dest,current_directory);
strcat(new_dest,"/");
strcat(new_dest,dest);
strcat(new_dest,"/");
strcat(new_dest,src);
printf("new_dest = %s\n",new_dest);
if(rename(new_src,new_dest) != 0){
fprintf(stderr,"rename failed with error %s\n",strerror(errno));
}
free(new_src);
free(new_dest);
free(src);
free(dest);
exit(EXIT_SUCCESS);
}
Edit: added code for the description below.
At the end you have the path of the current directory, and you know whether each argument is a directory or a regular file. If the source is a regular file and the destination is a directory, you build two paths: the current path joined with the name of the regular file, and the current path joined with the name of the directory and the name of the regular file (your source).
Out of
Path = /home/foo
src = bar
dest = coffee
build
new_src = /home/foo/bar
new_dest = /home/foo/coffee/bar
Such that the call to rename() is
rename(new_src, new_dest);
That way you rename a regular file to a regular file which rename() accepts.
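Please be aware that rename() does not work across filesystems: if the source and the destination are on different mounted filesystems it fails with EXDEV, and the file has to be copied and the original removed instead.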
As you know, mv is implemented with rename. rename is an atomic system call that can rename a file to a file, a directory onto an existing empty directory, or a directory to a new directory name (the destination must not exist). So there are the following situations to deal with:
mv file1 file2 - use the rename function
mv dir1 dir2 (nonexistent or empty) - use the rename function
mv dir1 dir2 (not empty) - rename dir1 to dir2/dir1
mv file dir (exists) - rename file to dir/file
mv dir file - illegal operation
Does that make sense?

In C, is there a way to get the size of multiple files with names matching a string?

I have files something like this:
file1_a_etc.txt,
file1_b_etc.txt
file2_a_z.txt
file2_b_z.txt
I want to get the size of files with "a" i.e. file2_a_z.txt & file1_a_etc.txt
I have a large number of files named this way, so I can't specify each name individually.
I am a beginner at C.
I know how to read the size of a single file, and I am working on Windows.
#include <stdio.h>
#include <sys/stat.h> // For struct stat and stat()
struct stat attr;
void main()
{
if(stat("filename.txt", &attr) == 0)
{
float x;
x=(attr.st_size)/1048576.0; //1MB=1048576 bytes
printf("Filesize: %.2f MB", x);
}
else
{
// couldn't open the file
printf("Couldn't get file attributes...");
}
}
For the Windows console there is the function _findfirst. As the first parameter, pass *a*.txt.
You need to iterate over the files in a given directory while searching for the substring in each file name.
This answer, under the section (Unix/Linux), specifies how to iterate over each filename while comparing for an exact match, you can modify the strcmp function call to strstr to look for a substring.
You could make an array of strings to store all the file names. Then you can use the strchr function to test whether an 'a' or some other character is in the name. The use of this function is explained e.g. at http://www.tutorialspoint.com/ansi_c/c_strchr.htm
Reading directories programmatically can be done with readdir.
You could do something like this:
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
static void lookup(const char *dir)
{
    DIR *dirp;
    struct dirent *dp;
    if ((dirp = opendir(dir)) == NULL) {
        perror(dir);
        return;
    }
    do {
        /* readdir returns NULL both at end of directory and on error;
           clearing errno first lets us tell the two apart */
        errno = 0;
        if ((dp = readdir(dirp)) != NULL) {
            if (strstr(dp->d_name, "_a_") == NULL)
                continue;
            (void) printf("found %s\n", dp->d_name);
            // Add code to handle the file
        }
    } while (dp != NULL);
    if (errno != 0)
        perror("error reading directory");
    (void) closedir(dirp);
}
readdir is part of POSIX.1-2001, which is supported by Unix/Linux-type systems (including OS X) but only by some Windows compilers. If you are programming on Windows you may have to use another solution.
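To get from "found" to the sizes the question asks about, the readdir loop can be combined with stat(). A rough sketch for POSIX systems only; total_size_matching is a made-up name and the 4096-byte path buffer is an assumption.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Sum the sizes in bytes of regular files in dir whose names contain needle.
 * Returns -1 if the directory cannot be opened.
 * total_size_matching is a hypothetical helper name. */
long long total_size_matching(const char *dir, const char *needle)
{
    DIR *dirp = opendir(dir);
    struct dirent *dp;
    long long total = 0;

    if (dirp == NULL)
        return -1;
    while ((dp = readdir(dirp)) != NULL) {
        char path[4096];
        struct stat attr;
        int written;

        if (strstr(dp->d_name, needle) == NULL)
            continue;
        /* d_name is only the file name, so prepend the directory */
        written = snprintf(path, sizeof(path), "%s/%s", dir, dp->d_name);
        if (written < 0 || (size_t)written >= sizeof(path))
            continue;
        if (stat(path, &attr) == 0 && S_ISREG(attr.st_mode))
            total += (long long)attr.st_size;
    }
    closedir(dirp);
    return total;
}
```

Dividing the result of total_size_matching(".", "_a_") by 1048576.0 gives the size in MB, as in the question's code.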

Using an array of filenames stored as strings

My program iterates through a single directory (non-recursively) and stores the names of all the files in that directory inside an array. Then, it uses that array in the second part of my program and returns some information about each file. I can iterate through the directory, and I can process a single file, but I'm having trouble combining the two parts of the program. Here is my code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <dirent.h>
int getArraySize(char* arr[]);
int getArraySize(char* arr[]) {
return sizeof(&arr);
}
char *filesArray[200];
int main (int argc, char* argv[])
{
DIR *dir;
struct dirent *ent;
int filesCtr = 0;
if ((dir = opendir ("/home/dshah/Documents/CECS 420/Project 3")) != NULL) {
while ((ent = readdir (dir)) != NULL) { /* print all the files and directories within directory */
if (strcmp(ent->d_name, ".") == 0) {
continue;
} else if (strcmp(ent->d_name, "..") == 0) {
continue;
} else if (ent->d_type == 4) { // if a directory
continue;
} else {
filesArray[filesCtr] = ent->d_name;
printf("%s\n", filesArray[filesCtr]);
filesCtr++;
}
}
closedir (dir);
} else { /* could not open directory */
perror ("Could not open directory");
}
int i;
for (i = 0; i < getArraySize(filesArray); i++) {
char* filename = filesArray[i];
FILE *file = fopen (filename, "r");
if (file != NULL) {
char line [128]; /* or other suitable maximum line size */
int ctr = 1;
while (fgets(line, sizeof line, file) != NULL) { /* read a line */
if (strstr(line, "is") != NULL) {
printf("%s:%d:%s", filename, ctr, line);
}
ctr++;
}
fclose (file);
} else {
perror (filename); /* why didn't the file open? */
}
}
return 0;
}
The line I am having trouble with is:
char* filename = filesArray[i];
Is this line of code correct? It works when I set filename to a string like "file.txt", so shouldn't this also work when I do printf("n %s\n", filesArray[i]);? Is filesArray[i] in this line of code a string?
EDIT:
Thanks, that fixed the problem. One more quick question: I'm trying to append the full path on
FILE *file = fopen (filename, "r");
line by changing it to
FILE *file = fopen (strcat("/home/dshah/Documents/CECS 420/Project 3/", filename), "r");
but it gives me a segmentation fault. Shouldn't this work, since I'm just specifying the path?
When you pass an array to a function, it decays to a pointer, so when you do e.g. &arr you actually get a pointer to that pointer, and the size of a pointer is most likely not the size of the original array. If (and I mean really if) the array is actually a string, you can use strlen to get the length of the string (not including the string terminator character).
In your case, you don't actually need the getArraySize function, as you already have a counter telling you how many strings there are in the filesArray array: the filesCtr variable.
Also, when using a function such as readdir the d_name field of the returned entry may actually be pointing to a static array so you can't really just copy the pointer, you have to copy the complete string. This is done with the strdup function:
filesArray[filesCtr] = strdup(ent->d_name);
Remember that when done you have to free this string.
Oh, and avoid using "magic numbers" in your code, for example when checking if the directory entry is a sub-directory (ent->d_type == 4). Use the macros available to you (ent->d_type == DT_DIR).
And a final thing: the d_name field of the readdir entry only contains the actual file name, not the full path. So if you want the full path you have to join the directory path and the file name yourself.
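That also explains the segmentation fault in the edit: strcat writes into its first argument, and here that is a string literal, which must not be modified and has no room for the extra characters anyway. A small sketch of building the path into a separate buffer instead; join_path is a made-up helper name.

```c
#include <stdio.h>

/* Write "dir/name" into out. Returns 0 on success, -1 if the result
 * does not fit in outSize bytes. join_path is a hypothetical helper name. */
int join_path(char *out, size_t outSize, const char *dir, const char *name)
{
    int written = snprintf(out, outSize, "%s/%s", dir, name);

    /* snprintf reports the length it wanted to write; a value of
     * outSize or more means the output was truncated */
    if (written < 0 || (size_t)written >= outSize)
        return -1;
    return 0;
}
```

In the loop you could declare char fullpath[4096]; and call join_path(fullpath, sizeof fullpath, "/home/dshah/Documents/CECS 420/Project 3", filename) before passing fullpath to fopen.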
