Looping through directories in proc - c

I have a function which loops through the directories in the proc file system. This function then greps a process name to find its PID and returns this PID to the calling function.
The function seems to work fine but fails in one or two cases when opening a directory (corresponding to a process).
This is what I am doing.
dr = readdir(dp);
Loop through dr
Check dr type for directory and process name
compare the process name with a string.
Return PID in case of a match
dr = readdir(dp);
end loop
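#include <dirent.h>
#include <stdio.h>

int main(void) {
    DIR *d;
    struct dirent *e;
    /* No malloc() needed: readdir() returns a pointer owned by the stream. */
    d = opendir("/proc");
    if (d == NULL) {
        perror("opendir");
        return 1;
    }
    while ((e = readdir(d)) != NULL) {
        printf("%d %s\n", e->d_type, e->d_name);
    }
    closedir(d);
    return 0;
}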

Presumably the problem is that directories are disappearing before you get to check out the files inside. This would mean that a process that was running when you got the directory listing is no longer running when you go to read its process information. This is normal and something you'll have to handle (ideally silently) in your application.
Also, the code snippet you provided definitely does not do what you described above it. Presumably you edited it for simplicity, but in doing so you removed any clues as to what you might be doing wrong.
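A minimal sketch of handling a vanished process silently, assuming Linux and that matching against /proc/<pid>/comm is acceptable (the kernel truncates that name to 15 characters). The helper name find_pid_by_name is hypothetical and not taken from the question:

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: return the PID of the first process whose
 * /proc/<pid>/comm equals name, or -1 if none is found. */
pid_t find_pid_by_name(const char *name)
{
    assert(name != NULL);

    DIR *d = opendir("/proc");
    if (d == NULL)
        return -1;

    struct dirent *e;
    pid_t found = -1;
    while (found == -1 && (e = readdir(d)) != NULL) {
        /* Only entries whose whole name is a number are processes. */
        char *end;
        long pid = strtol(e->d_name, &end, 10);
        if (end == e->d_name || *end != '\0')
            continue;

        char path[288];
        snprintf(path, sizeof path, "/proc/%s/comm", e->d_name);

        FILE *f = fopen(path, "r");
        if (f == NULL)
            continue;   /* process exited after readdir(): skip silently */

        char comm[64];
        if (fgets(comm, sizeof comm, f) != NULL) {
            comm[strcspn(comm, "\n")] = '\0';   /* strip trailing newline */
            if (strcmp(comm, name) == 0)
                found = (pid_t)pid;
        }
        fclose(f);
    }
    closedir(d);
    return found;
}
```

The key point is the `continue` after a failed fopen(): a process that exits between readdir() and fopen() is treated as "not a match" rather than as an error.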

Related

C: Temp file deleted immediately after opening

I'm trying to make a temp file, to which I want write a bunch of stuff, and then print out upon receiving a signal. However, after some diagnostics with lsof it looks like the temp file is deleted immediately after opening it. Take the following snippet:
FILE *tmp;
int main(int argc, char *argv[]) {
if ((tmp = tmpfile()) == NULL)
err_sys("tmpfile error");
sleep(60);
Now if I go do a ps aux, get the pid of my process, and then do a lsof -p <pid>, I see the following:
10.06 1159 daniel 3u REG 0,1 0 10696049115128289 /tmp/tmpfCrM7Jn (deleted)
This is a bit of a head-scratcher for me. Considering that it's really only a single built in function call, which is not causing an error when being called, I'm not sure what the problem is.
From the man page:
The created file is unlinked before tmpfile() returns, causing the
file to be automatically deleted when the last reference to it is
closed.
The output from lsof simply indicates that the path pointing to the inode was removed. However, the current file handle FILE *tmp should still be valid, until the file is closed, or the program exits.
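To illustrate that the handle stays usable even though the name is gone, here is a small sketch. The function name tmpfile_roundtrip is hypothetical; it writes a string to a tmpfile(), rewinds, and reads it back:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical demo: write text to an anonymous temp file, rewind,
 * and read the first line back into out. Returns 0 on success, -1 on error. */
int tmpfile_roundtrip(const char *text, char *out, size_t outsz)
{
    assert(text != NULL && out != NULL && outsz > 0);

    FILE *tmp = tmpfile();      /* the name is already unlinked on return */
    if (tmp == NULL)
        return -1;

    int rc = -1;
    if (fputs(text, tmp) != EOF) {
        rewind(tmp);            /* reposition before switching to reading */
        if (fgets(out, (int)outsz, tmp) != NULL)
            rc = 0;
    }
    fclose(tmp);                /* last reference closed: the file goes away */
    return rc;
}
```

For the signal use case in the question, the same rewind-then-read step would happen when the program decides to print the accumulated contents.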

Need help to know how to use threads while reading a single directory

I need to read a single directory containing 100K files. Looping over it with readdir takes a long time.
Can someone suggest how to read a single directory using multiple threads? Assume the directory has no subdirectories, only files.
Below is what I am trying to make work, but it takes ~5 minutes per invocation:
void dirwalk(char *dir, void (*fcn)(char *))
{
char name[MAX_PATH];
Dirent *dp;
DIR *dfd;
if ((dfd = opendir(dir)) == NULL) {
fprintf(stderr, "dirwalk: can't open %s\n", dir);
return;
}
while ((dp = readdir(dfd)) != NULL) {
if (strcmp(dp->name, ".") == 0
|| strcmp(dp->name, "..") == 0)
continue; /* skip self and parent */
if (strlen(dir)+strlen(dp->name)+2 > sizeof(name))
fprintf(stderr, "dirwalk: name %s %s too long\n",
dir, dp->name);
else {
sprintf(name, "%s/%s", dir, dp->name);
(*fcn)(name);
}
}
closedir(dfd);
}
You can try the following in the order below to see if it improves the performance:
1. Spawn a separate thread with pthread_create() to perform the action in fcn(), so that an expensive callback does not hold up the directory scan. Depending on your needs, the threads can be joinable or detached. If this does not help, try 2 below.
2. Write a modified dirwalk() as a thread routine and create several threads (with pthread_create()) that all run it. The threads run until they reach the end of the directory stream. Remember that the directory stream is shared and readdir() is not a reentrant function, so use readdir_r() instead. Also use a pthread mutex to lock the directory stream: lock before and unlock after the readdir_r() call, so that the rest of the work is done outside the critical section.
Locking will cost some performance, but it takes care of the concurrency issues and you can't avoid it. However, I think Linux (I hope you are running Linux) would give dirwalk() a little more opportunity to run with more threads, but I am not sure the gain would be as substantial as you might expect.
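The second suggestion could be sketched roughly as follows. This is an illustration under assumptions, not a tuned implementation: the names struct walk, walk_worker, dirwalk_mt, print_path and the NWORKERS value are all hypothetical, and it uses plain readdir() under the mutex rather than readdir_r(), since glibc has deprecated readdir_r() since version 2.24.

```c
#include <assert.h>
#include <dirent.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define NWORKERS 4      /* assumed thread count; tune for your machine */

struct walk {
    DIR *dfd;                       /* shared directory stream */
    pthread_mutex_t lock;           /* protects dfd and processed */
    void (*fcn)(const char *);      /* per-file callback, run unlocked */
    const char *dir;
    long processed;
};

static void *walk_worker(void *arg)
{
    struct walk *w = arg;
    char name[4096];

    for (;;) {
        pthread_mutex_lock(&w->lock);
        struct dirent *dp = readdir(w->dfd);
        if (dp == NULL) {                   /* end of stream (or error) */
            pthread_mutex_unlock(&w->lock);
            break;
        }
        int skip = strcmp(dp->d_name, ".") == 0 ||
                   strcmp(dp->d_name, "..") == 0;
        if (!skip) {
            /* Copy the name while locked: the next readdir() may overwrite it. */
            snprintf(name, sizeof name, "%s/%s", w->dir, dp->d_name);
            w->processed++;
        }
        pthread_mutex_unlock(&w->lock);
        if (!skip)
            w->fcn(name);                   /* the real work, outside the lock */
    }
    return NULL;
}

/* Returns the number of entries handed to fcn, or -1 on failure. */
long dirwalk_mt(const char *dir, void (*fcn)(const char *))
{
    assert(dir != NULL && fcn != NULL);

    struct walk w = { .fcn = fcn, .dir = dir, .processed = 0 };
    pthread_t tid[NWORKERS];
    int started = 0;

    if ((w.dfd = opendir(dir)) == NULL)
        return -1;
    pthread_mutex_init(&w.lock, NULL);

    for (int i = 0; i < NWORKERS; i++)
        if (pthread_create(&tid[started], NULL, walk_worker, &w) == 0)
            started++;
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    pthread_mutex_destroy(&w.lock);
    closedir(w.dfd);
    return started > 0 ? w.processed : -1;
}

/* Example callback: just print the path. */
void print_path(const char *path)
{
    puts(path);
}
```

A call such as dirwalk_mt("/some/dir", print_path) would fan the entries out over the workers. Because the readdir() calls themselves are serialized by the mutex, any speedup can only come from the callback doing enough work per file to overlap.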

Joining threads confusion

I'm doing my homework. I have to count the directories and files in a given directory, and each subdirectory I find should be counted as well, by another thread of my process. This is what I have so far:
void *dirCounter(void *param){
queue<pthread_t> queue;
dir_ptr dir = (dir_ptr)param;
dir->countDir = 0;
DIR* dirName = dir->name;
struct dirent *curr;
off_t dsp;
dsp= telldir(dirName);
while(dsp!= -1){
curr = readdir(dirName);
if(curr == NULL){
break;
}
if(!strcmp(curr->d_name,".")|!strcmp(curr->d_name,"..")) { //To avoid counting . and ..
dsp = telldir(dirName); //Actual position asociated to the stream
continue; //Executes the beginning of the while
}
if(curr->d_type == DT_DIR){
dir->countDir++; //counts directories in the first level
//For each directory found, create another thread and add it to the queue:
pthread_attr_t attr1;
pthread_t tid1;
pthread_attr_init(&attr1);
dir_ptr par1 = (dir_ptr)malloc(sizeof(directorio));
par1->name = opendir(curr->d_name);
par1->countDir = par1->countFile = 0;
pthread_create(&tid1,&attr1, dirCounter, par1);
//queue.push(tid1);
}
if(curr->d_type == DT_REG){
dir->countFile++; //Counts files
}
dsp = telldir(dirName);
}
//pthread_join(tid1, NULL);
//while(!queue.empty()){
//pthread_join(queue.front(), NULL);
// queue.pop();
//}
printf("Dirs: %d Files: %d\n", dir->countDir, dir->countFile);
pthread_exit(NULL);
}
So far, with the join commented out, the code counts the files and dirs of the "first level" and then gives a segmentation fault. With the line uncommented, it prints a single output line and then dies with the segmentation fault.
The idea was to create a thread whenever I found a directory and then join all them at the end creating a semi-recursive routine.
Modifications:
char str[256];
strcpy(str, "./");
strcat(str, curr->d_name);
//strcat(str, "\"");
puts(str);
par1->name = opendir(str);
par1->countDir = par1->countFile = 0;
pthread_create(&tid1,&attr1, dirCounter, par1);
queue.push(tid1);
What it does after the modification:
It prints ALL the directories, but it still gives a segmentation fault and some threads do not complete their task.
The proximate cause of your problem is that dir->name is NULL in the additional threads created, because opendir(curr->d_name); is failing. This is because the directory curr->d_name is not an absolute pathname - opendir() will look in the current working directory for the directory you're trying to open, but that directory is actually within the directory you're currently working on.
I suggest that instead of passing the DIR * value to the thread, you instead simply pass the pathname of the directory, and let the thread do the opendir() itself. It should then test the return value, and only proceed to call readdir() if opendir() returned non-NULL.
When you find a directory entry that is a directory, you need to construct a pathname to pass to the new thread by concatenating "/" and curr->d_name onto the pathname of the directory being processed.
Note that you do not need the dsp variable and the calls to telldir() at all. If you have a valid DIR *dir, you can loop over it simply with:
while (curr = readdir(dir)) {
/* Do something with curr */
}
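Putting these suggestions together, a sketch might look like the following. The names struct counts, dir_counter, count_tree and the MAX_CHILDREN cap are hypothetical, and it assumes (as the question's code does) that the filesystem fills in d_type. Each thread receives a malloc'd pathname, opens the directory itself, checks the result, and joins its children before adding its counts to a shared total:

```c
#define _DEFAULT_SOURCE     /* for DT_DIR / DT_REG on glibc */
#include <assert.h>
#include <dirent.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CHILDREN 64     /* assumed cap on threads spawned per directory */

struct counts { long dirs, files; };

static pthread_mutex_t totals_lock = PTHREAD_MUTEX_INITIALIZER;
static struct counts totals;    /* shared result, guarded by totals_lock */

/* Thread routine. arg is a malloc'd pathname that this call frees. */
static void *dir_counter(void *arg)
{
    char *path = arg;
    pthread_t kids[MAX_CHILDREN];
    int nkids = 0;
    struct counts local = {0, 0};
    struct dirent *curr;
    DIR *dir = opendir(path);

    if (dir == NULL) {          /* test the result before any readdir() */
        free(path);
        return NULL;
    }
    while ((curr = readdir(dir)) != NULL) {
        if (strcmp(curr->d_name, ".") == 0 || strcmp(curr->d_name, "..") == 0)
            continue;
        if (curr->d_type == DT_REG) {
            local.files++;
        } else if (curr->d_type == DT_DIR) {
            /* Build "<path>/<name>" so the child does not depend on the cwd. */
            size_t len = strlen(path) + strlen(curr->d_name) + 2;
            char *child = malloc(len);
            local.dirs++;
            if (child == NULL)
                continue;
            snprintf(child, len, "%s/%s", path, curr->d_name);
            if (nkids < MAX_CHILDREN &&
                pthread_create(&kids[nkids], NULL, dir_counter, child) == 0)
                nkids++;
            else
                dir_counter(child);     /* no thread available: recurse inline */
        }
    }
    closedir(dir);
    for (int i = 0; i < nkids; i++)
        pthread_join(kids[i], NULL);

    pthread_mutex_lock(&totals_lock);
    totals.dirs += local.dirs;
    totals.files += local.files;
    pthread_mutex_unlock(&totals_lock);
    free(path);
    return NULL;
}

/* Count directories and regular files below root. */
struct counts count_tree(const char *root)
{
    assert(root != NULL);
    char *copy = malloc(strlen(root) + 1);

    totals.dirs = totals.files = 0;
    if (copy != NULL) {
        strcpy(copy, root);
        dir_counter(copy);      /* the root runs in the calling thread */
    }
    return totals;
}
```

One thread per directory is what the homework asks for, but it does not scale to large trees; the inline fallback when pthread_create() fails is one way to keep the sketch from losing directories in that case.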
I see a few bugs. I'm not sure if this explains your crash.
You allocated an instance of "directorio" for each directory and corresponding thread. But you never free it. Memory leak.
Is it the intent to print the total number of directories and files of the whole file system? Or just a individual directory and file count for each directory? If the former, you aren't adding the results back up. I would even suggest having all threads share the same integer pointers for dirCount and fileCount. (And use a lock to serialize access or just use __sync_add_and_fetch). You could also just use a set of global variables for the integer dir and file counts.
If the latter case (each thread prints it's own summation of child files), just pass a directory name (string) as the thread parameter, and let the thread use local variables off the stack for the counters. (The thread would call opendir on the string passed in. It would still need to free the allocated string passed in.)
You don't need to pass a pthread_attr_t instance into pthread_create. You can pass NULL as the second parameter and get the same effect.
You aren't checking the return value of pthread_create. If it were to fail (unlikely), then tid1 could be a garbage value.
Hope this helps.

Custom shell glob problem

I have to write a shell program in C that doesn't use the system() function. One of the features is that we have to be able to use wildcards. I can't seem to find a good example of how to use the glob or fnmatch functions that I keep running into, so I have been messing around and so far I have a somewhat working glob feature (depending on how I have arranged my code).
If I have a glob variable declared as a global then the function partially works. However, any command afterwards produces an error. Example:
ls *.c
produce correct results
ls -l //no glob required
null passed through
so I tried making it a local variable. This is my code right now:
int runCommand(commandStruct * command1) {
if(!globbing)
execvp(command1->cmd_path, command1->argv);
else{
glob_t globbuf;
printf("globChar: %s\n", globChar);
glob(globChar, GLOB_DOOFFS, NULL, &globbuf);
//printf("globbuf.gl_pathv[0]: %s\n", &globbuf.gl_pathv[0]);
execvp(command1->cmd_path, &globbuf.gl_pathv[0]);
//globfree(&globbuf);
globbing = 0;
}
return 1;
}
When doing this with the globbuf as a local, it produces a null for globbuf.gl_pathv[0]. I can't seem to figure out why. Does anyone who knows how glob works know what might be the cause? I can post more code if necessary, but this is where the problem lies.
this works for me:
...
glob_t glob_buffer;
const char * pattern = "/tmp/*";
int i;
int match_count;
glob( pattern , 0 , NULL , &glob_buffer );
match_count = glob_buffer.gl_pathc;
printf("Number of matches: %d \n", match_count);
for (i=0; i < match_count; i++)
printf("match[%d] = %s \n",i,glob_buffer.gl_pathv[i]);
globfree( &glob_buffer );
...
Observe that the execvp function expects the argument list to end with a NULL pointer. I think the easiest approach is to create your own char ** argv copy with all the elements from glob_buffer.gl_pathv[] and a NULL pointer at the end.
You are asking for GLOB_DOOFFS but you did not specify any number in globbuf.gl_offs saying how many slots to reserve.
Presumably as a global variable it gets initialized to 0.
Also this: &globbuf.gl_pathv[0] can simply be globbuf.gl_pathv.
And don't forget to run globfree(&globbuf).
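As a sketch of how GLOB_DOOFFS and gl_offs fit together, the helper below (glob_argv is a hypothetical name) reserves one leading slot for the command name, so the result can be passed straight to execvp(). It relies on glob() terminating gl_pathv with a null pointer, which POSIX specifies:

```c
#include <assert.h>
#include <glob.h>
#include <string.h>

/* Hypothetical helper: expand pattern and build an argv of the form
 * { cmd, match1, ..., matchN, NULL } inside globbuf.
 * Returns the argv on success, or NULL on failure or when nothing matches.
 * On success the caller should globfree(globbuf) when done with the argv. */
char **glob_argv(const char *cmd, const char *pattern, glob_t *globbuf)
{
    assert(cmd != NULL && pattern != NULL && globbuf != NULL);

    memset(globbuf, 0, sizeof *globbuf);
    globbuf->gl_offs = 1;   /* reserve one leading slot for the command name */

    if (glob(pattern, GLOB_DOOFFS, NULL, globbuf) != 0) {
        globfree(globbuf);  /* release whatever glob() allocated */
        return NULL;
    }
    globbuf->gl_pathv[0] = (char *)cmd;     /* fill the reserved slot */
    return globbuf->gl_pathv;
}
```

In the child process of a shell this would be used roughly as `char **argv = glob_argv("ls", "*.c", &g); if (argv) execvp(argv[0], argv);`. Extra fixed arguments such as -l would need a larger gl_offs and one assignment per reserved slot.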
I suggest running your program under valgrind because it probably has a number of memory leaks, and/or access to uninitialized memory.
If you don't have to use * style wildcards I've always found it simpler to use opendir(), readdir() and strcasestr(). opendir() opens a directory (can be ".") like a file, readdir() reads an entry from it, returns NULL at the end. So use it like
struct dirent *de = NULL;
DIR *dirp = opendir(".");
while ((de = readdir(dirp)) != NULL) {
if (strcasestr(de->d_name, ".jpg") != NULL) {
// do something with your JPEG
}
}
Just remember to closedir() what you opendir(). A struct dirent has the d_type field if you want to use it, most files are type DT_REG (not dirs, pipes, symlinks, sockets, etc.).
It doesn't make a list like glob does, the directory is the list, you just use criteria to control what you select from it.

Traversing file system according to a given root place by using threads by using C for unix

I want to traverse the file system using threads and processes. My program has to assume the first parameter is either "-p", which selects a multi-process run, or "-t", which runs in a multi-threaded way. The second parameter is the pathname of a file or directory. If my program gets the path of a file, it should print out the size of the file in bytes. If it gets the path of a directory, it should print out the directory name, then process all the entries in the directory except the directory itself and the parent directory, so that the entire hierarchy rooted at the specified directory is displayed. I wrote something but I got stuck and cannot improve my code. Please help me.
My code is as following:
include
include
include
include
include
include
include
int funcThread(DIR *D);
int main(int argc, char * argv[])
{
pthread_t thread[100];
DIR *dirPointer;
struct stat object_file;
struct dirent *object_dir;
int counter;
if(opendir(argv[1])==NULL)
{
printf("\n\nERROR !\n\n Please enter -p or -t \n\n");
return 0;
}
if((dirPointer=opendir(argv[1]))=="-t")
{
if ((object_dir = opendir(argv[2])) == NULL)
{
printf("\n\nERROR !\n\nPlease enter the third argument\n\n");
return 0;.
}
else
{
counter=0;
while ((object_dir = readdir(object_dir)) != NULL)
{
pthread_create(&thread[counter],NULL,funcThread,(void *) object_dir);
counter++;
}
}
}
return 0;
}
int funcThread(DIR *dPtr)
{
DIR *ptr;
struct stat oFile;
struct dirent *oDir;
int num;
if(ptr=readdir(dPtr)==NULL)
rewinddir(ptr);
if(S_ISDIR(oFile.st_mode))
{
ptr=readdir(dPtr);
printf("\t%s\n",ptr);
return funcThread(ptr);
}
else
{
while(ptr=readdir(dPtr)!=NULL)
{
printf("\n%s\n",oDir->d_name);
stat(oDir->d_name,&oFile);
printf("\n%f\n",oFile.st_size);
}
rewinddir(ptr);
}
}
This line:
if((dirPointer=opendir(argv[1]))=="-t")
dirPointer is a pointer DIR* so how can it be equal to a literal string pointer?
I spotted a few errors:
Why are you using opendir() to check your arguments? You should use something like strcmp for that.
You're passing struct dirent* to funcThread() but funcThread() takes a DIR*.
You're using oFile on funcThread() before you initialize it (by calling stat()).
What is the purpose of calling rewinddir()? I guess you're blindly trying to get readdir() to work with a struct dirent*.
You're using oDir but it's never initialized.
You're calling printf() from multiple threads with no means to synchronize the output, so it would be completely out of order or garbled.
I suggest you read and understand the documentation of all those functions before using them (google "posix function_name") and get familiar with the basics of C. Before bringing threads into the equation, try to get it working as a single-threaded program. Also, you won't see an improvement in performance by using that many threads unless you have close to that many cores; it will actually decrease performance and increase resource usage.
if(ptr=readdir(dPtr)==NULL){}
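The = operator has lower precedence than ==, so this assigns the result of the comparison readdir(dPtr)==NULL to ptr. Write (ptr=readdir(dPtr))==NULL instead.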
[this error is repeated several times]
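As a starting point for the single-threaded version suggested above, here is a sketch that addresses the points raised: strcmp() for the option check, a parenthesised assignment in the readdir() loop, and stat information fetched before it is used. The names is_thread_mode and walk are hypothetical, and the output format is only an assumption about what the assignment wants:

```c
#define _DEFAULT_SOURCE     /* for lstat() under strict -std= modes on glibc */
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Compare option text with strcmp(), never with == on pointers. */
int is_thread_mode(const char *arg)
{
    assert(arg != NULL);
    return strcmp(arg, "-t") == 0;
}

/* Print path; for a non-directory also print its size, for a directory
 * recurse into its entries. Returns the number of entries visited,
 * or -1 if path cannot be examined. */
long walk(const char *path, int depth)
{
    struct stat st;
    if (lstat(path, &st) == -1)     /* fill st before looking at st_mode */
        return -1;

    if (!S_ISDIR(st.st_mode)) {
        printf("%*s%s (%lld bytes)\n", depth * 2, "", path,
               (long long)st.st_size);
        return 1;
    }
    printf("%*s%s/\n", depth * 2, "", path);

    DIR *d = opendir(path);
    if (d == NULL)
        return 1;                   /* e.g. permission denied: do not descend */

    long visited = 1;
    struct dirent *e;
    /* Parenthesise the assignment: = binds more loosely than !=. */
    while ((e = readdir(d)) != NULL) {
        if (strcmp(e->d_name, ".") == 0 || strcmp(e->d_name, "..") == 0)
            continue;
        char child[4096];
        int len = snprintf(child, sizeof child, "%s/%s", path, e->d_name);
        if (len < 0 || (size_t)len >= sizeof child)
            continue;               /* path too long: skip this entry */
        long n = walk(child, depth + 1);
        if (n > 0)
            visited += n;
    }
    closedir(d);
    return visited;
}
```

Once this works, the "-t" and "-p" variants can replace the recursive walk() call on a subdirectory with a pthread_create() or fork(), passing the child pathname rather than a DIR * or struct dirent *.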

Resources