Custom shell glob problem - c

I have to write a shell program in C that doesn't use the system() function. One of the features is that we have to be able to use wildcards. I can't seem to find a good example of how to use the glob or fnmatch functions that I keep running into, so I have been experimenting, and so far I have a somewhat working glob feature (depending on how I have arranged my code).
If I have a glob variable declared as a global, then the function partially works. However, any command afterwards produces an error. Example:
ls *.c
produces correct results
ls -l //no glob required
null passed through
So I tried making it a local variable. This is my code right now:
int runCommand(commandStruct * command1) {
if(!globbing)
execvp(command1->cmd_path, command1->argv);
else{
glob_t globbuf;
printf("globChar: %s\n", globChar);
glob(globChar, GLOB_DOOFFS, NULL, &globbuf);
//printf("globbuf.gl_pathv[0]: %s\n", &globbuf.gl_pathv[0]);
execvp(command1->cmd_path, &globbuf.gl_pathv[0]);
//globfree(&globbuf);
globbing = 0;
}
return 1;
}
When doing this with globbuf as a local, it produces a null for globbuf.gl_pathv[0]. I can't seem to figure out why. Does anyone who knows how glob works know what might be the cause? I can post more code if necessary, but this is where the problem lies.

this works for me:
...
glob_t glob_buffer;
const char * pattern = "/tmp/*";
int i;
int match_count;
glob( pattern , 0 , NULL , &glob_buffer );
match_count = glob_buffer.gl_pathc;
printf("Number of matches: %d \n", match_count);
for (i=0; i < match_count; i++)
printf("match[%d] = %s \n",i,glob_buffer.gl_pathv[i]);
globfree( &glob_buffer );
...
Note that the execvp function expects the argument list to end with a NULL pointer and to start with the command name in argv[0]. I think the easiest approach is to create your own char **argv copy with the command name first, then all the elements from glob_buffer.gl_pathv[], and a NULL pointer at the end.
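A minimal sketch of that copy, in case it helps. The names build_argv and copy_string are made up for illustration, and GLOB_NOCHECK is my own choice (it hands back the pattern itself when nothing matches, which is what most shells do):

```c
#include <glob.h>
#include <stdlib.h>
#include <string.h>

/* Duplicates s with malloc; returns NULL on allocation failure. */
static char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

/* Hypothetical helper: returns a malloc'd, NULL-terminated argv holding
 * cmd followed by every match of pattern, or NULL on failure.
 * Failed string copies are not checked here; a real shell should,
 * because a NULL in the middle would cut the argument list short. */
char **build_argv(const char *cmd, const char *pattern)
{
    glob_t g;
    memset(&g, 0, sizeof g);
    if (glob(pattern, GLOB_NOCHECK, NULL, &g) != 0) {
        globfree(&g);
        return NULL;
    }

    /* one slot for cmd, one per match, one for the terminating NULL */
    char **argv = malloc((g.gl_pathc + 2) * sizeof *argv);
    if (argv != NULL) {
        argv[0] = copy_string(cmd);
        for (size_t i = 0; i < g.gl_pathc; i++)
            argv[i + 1] = copy_string(g.gl_pathv[i]);
        argv[g.gl_pathc + 1] = NULL;
    }
    globfree(&g);   /* safe: argv owns its own copies */
    return argv;
}
```

The result can be handed straight to execvp(argv[0], argv). Because the strings are copies, the glob buffer can be freed before the exec.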

You are asking for GLOB_DOOFFS but you did not specify any number in globbuf.gl_offs saying how many slots to reserve.
Presumably as a global variable it gets initialized to 0.
Also this: &globbuf.gl_pathv[0] can simply be globbuf.gl_pathv.
And don't forget to run globfree(&globbuf).
I suggest running your program under valgrind because it probably has a number of memory leaks, and/or access to uninitialized memory.
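For what it's worth, here is a sketch of how GLOB_DOOFFS is meant to be used, along the lines of the example in the glob man page. The helper name glob_as_argv is made up, and GLOB_NOCHECK is an assumption on my part so that a pattern with no matches is passed through unchanged:

```c
#include <glob.h>
#include <stddef.h>

/* Hypothetical helper: fills *g so that g->gl_pathv is a ready-made argv:
 * slot 0 holds cmd, then come the matches, then the NULL that glob()
 * stores after the last match.
 * Returns 0 on success; the caller calls globfree(g) when done. */
int glob_as_argv(const char *cmd, const char *pattern, glob_t *g)
{
    g->gl_offs = 1;                  /* reserve one leading slot */
    int rc = glob(pattern, GLOB_DOOFFS | GLOB_NOCHECK, NULL, g);
    if (rc != 0)
        return rc;
    g->gl_pathv[0] = (char *)cmd;    /* fill the reserved slot */
    return 0;
}
```

After a successful call, execvp(cmd, g.gl_pathv) sees the command name followed by the expanded file names. globfree() only frees the matches, not the reserved slot, so cmd can be a string literal.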

If you don't have to use * style wildcards I've always found it simpler to use opendir(), readdir() and strcasestr(). opendir() opens a directory (can be ".") like a file, readdir() reads an entry from it, returns NULL at the end. So use it like
struct dirent *de = NULL;
DIR *dirp = opendir(".");
while ((de = readdir(dirp)) != NULL) {
    if (strcasestr(de->d_name, ".jpg") != NULL) {
        // do something with your JPEG
    }
}
Just remember to closedir() what you opendir(). On many systems a struct dirent also has a d_type field if you want to use it; most files are type DT_REG (not dirs, pipes, symlinks, sockets, etc.).
It doesn't make a list like glob does; the directory is the list, and you just use criteria to control what you select from it.
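If you do want * style wildcards with this approach, the same loop can call fnmatch() on each name instead of strcasestr(). A rough sketch, where list_matching is a made-up name and the choice of FNM_PERIOD is mine:

```c
#include <dirent.h>
#include <fnmatch.h>
#include <stdio.h>

/* Hypothetical helper: prints every entry of dir whose name matches the
 * shell-style pattern and returns the number of matches, or -1 if the
 * directory cannot be opened. */
int list_matching(const char *dir, const char *pattern)
{
    DIR *dirp = opendir(dir);
    if (dirp == NULL)
        return -1;

    int count = 0;
    struct dirent *de;
    while ((de = readdir(dirp)) != NULL) {
        /* FNM_PERIOD: a leading '.' must be matched explicitly, as in a shell */
        if (fnmatch(pattern, de->d_name, FNM_PERIOD) == 0) {
            puts(de->d_name);
            count++;
        }
    }
    closedir(dirp);
    return count;
}
```

For example, list_matching(".", "*.c") prints the C files in the current directory. Unlike glob(), it handles only a single directory and does not sort the names.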

Related

Problems grabbing file names using SDL_strdup and similar

I'm trying to create a program with SDL2.
In a certain part of the code, I'm writing functions to grab names of all present files in a given directory path (and keep them in memory) so that, in another function, I can check if a specified file was present the last moment the directory was checked.
I'm using dirent.h to suit my needs but I'm running into a few problems:
All the files are properly captured by readdir() (no exception), however they aren't always properly copied into memory after using SDL_strdup() (code is below).
I'm using SDL_malloc()/SDL_realloc()/SDL_strdup() to be as cross-platform as possible to avoid having problems when porting code (as I've read that strdup isn't C standard).
Here's my code:
typedef struct FileList {
char **files;
size_t num;
} FileList;
FileList *GetFileList(const char *path){
struct dirent *dp = NULL;
DIR *dir = NULL;
size_t i = 0;
FileList *filelist = SDL_malloc(sizeof(FileList)); /* changing this to a calloc doesn't help */
/* Check if filelist == NULL */
filelist->files = NULL;
dir = opendir(path);
/* Check if dir == NULL */
while ((dp = readdir(dir))){
if (dp->d_name[0] == '.'){
continue; /* skip self, parent and all files starting with . */
}
printf("Copying: %s\n", dp->d_name); /* Always show the name of each file */
filelist->files = SDL_realloc(filelist->files, ++i);
filelist->files[i-1] = SDL_strdup(dp->d_name);
printf("Copied: %s\n\n", filelist->files[i-1]); /* Varies: either shows the file's name, either gives me plain gibberish or just nothing */
}
filelist->num = i;
closedir(dir);
return filelist;
}
Output varies. When it doesn't crash, I either get all filenames correctly copied, or I get most of them copied and some contain nothing or plain gibberish (as commented); if it does crash, sometimes I get a Segfault while using SDL_strdup(), other times I get a Segfault when using closedir().
I've even considered exchanging the SDL_realloc() scenario with an initial memory allocation of filelist->files by giving it the number of files (thanks to another function) but I get the same problem.
Any suggestion to change my coding style to a more defensive one (since I do believe this one is rather dangerous) will be appreciated, although I've tried all I could for this case. I'm currently working on a Mac OS X using built-in gcc Apple LLVM 6.0 (clang-600.0.56).
You need space for pointers, and sizeof(char *) != 1 so
filelist->files = (char**) SDL_realloc(filelist->files, ++i);
needs to be
filelist->files = SDL_realloc(filelist->files, ++i * sizeof(char *));
but that's actually a bad idea, because SDL_realloc could return NULL, in which case you will lose the reference to the original pointer, so a good way of doing it is
void *ptr;
ptr = SDL_realloc(filelist->files, ++i * sizeof(char *));
if (ptr == NULL)
    handleThisErrorAndDoNotContinue();
filelist->files = ptr;
Always check whether allocator functions returned NULL: you have no control over the size of the data you are reading, and you can run out of memory, at least in theory, so make your code safe by checking that these calls succeeded.
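Putting both points together, an append step might look roughly like this. It is only a sketch: NameList and namelist_append are made-up names, and it uses the standard malloc/realloc rather than the SDL_ wrappers, which should be interchangeable here:

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char  **files;
    size_t  num;
} NameList;   /* hypothetical stand-in for the question's FileList */

/* Appends a copy of name to list. Returns 0 on success, -1 on allocation
 * failure, in which case list is left unchanged and still valid. */
int namelist_append(NameList *list, const char *name)
{
    size_t len = strlen(name) + 1;          /* include the '\0' */
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, name, len);

    /* The size is a number of pointers times sizeof(char *), not a count. */
    char **grown = realloc(list->files, (list->num + 1) * sizeof *grown);
    if (grown == NULL) {
        free(copy);                         /* list->files is still intact */
        return -1;
    }
    grown[list->num++] = copy;
    list->files = grown;
    return 0;
}
```

Start from NameList l = { NULL, 0 }; and call namelist_append(&l, dp->d_name) inside the readdir() loop, stopping if it returns -1.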

C - Unlink/Remove produces error for filenames with spaces

I am trying to make a function in C to erase all the contents of a temp folder and to erase the folder.
Whilst I already have successfully created the code to cycle through the files and to erase the folder (it is pretty much straight forward) I am having trouble erasing the files using unlink.
Here is the code that I am using:
int delete_folder(char *foldername) {
DIR *dp;
struct dirent *ep;
dp=opendir(foldername);
if (dp!=NULL) {
readdir(dp); readdir(dp);
while (ep=readdir(dp)) {
char* cell = concatenate(concatenate(foldername, "\\"), "Bayesian Estimation.xlsx");//ep->d_name);
printf("%s\n", cell);
remove(cell);
printf("%s\n", strerror(errno));
}
closedir(dp);
}
if (!rmdir(foldername)) {return(0);} else {return(-1);}
}
The code that I wrote works for all files except those with spaces in the filename. After some testing, I can confirm that unlink removes all files in the folder (even those with special characters in the filename) but fails if the filename includes a space (for the same file, if I remove the spaces, the function works again).
Has anyone else encountered this problem? And, more importantly, can it be solved or circumvented?
(The problem remains even if I introduce the space escape sequences directly)
The error reported by unlink is "No such file or directory" (ENOENT). Mind you, the file is indeed at the referred location (as can be verified from the code outputting the correct filename in the variable cell), and this error also occurs if I use the function remove instead of unlink.
PS: The function concatenate is a function of my own making which outputs the concatenation of the two input strings.
Edit:
The code was written in Codeblocks, in Windows.
Here's the code for the concatenate function:
char* concatenate(char *str1, char *str2) {
int a1 = strlen(str1), a2 = strlen(str2); char* str3[a1+a2+1];
snprintf(str3, a1+a2+2, "%s%s", str1, str2);
return(str3);
}
Whilst you are right in saying that it is a possible (and easy) memory leak, the functions' inputs and outputs are code generated and only for personal use and therefore there is no great reason to worry about it (no real need for foolproofing the code.)
You say "using unlink()" but the code is using remove(). Which platform are you on? Is there any danger that your platform implements remove() by running an external command which doesn't handle spaces in file names properly? On most systems, that won't be a problem.
What is a problem is that you don't check the return value from remove() before printing the error. You should only print the error if the function indicates that it generated an error. No function in the Standard C (or POSIX) library sets errno to zero. Also, errors should be reported on standard error; that's what the standard error stream is for.
if (remove(cell) != 0)
    fprintf(stderr, "Failed to remove %s (%d: %s)\n", cell, errno, strerror(errno));
else
    printf("%s removed OK\n", cell);
I regard the else clause as a temporary measure while you're getting the code working.
It also looks like you're leaking memory like a proverbial sieve. You capture the result of a double concatenate operation in cell, but you never free it. Indeed, if the nested calls both allocate memory, then you've got a leak even if you add free(cell); at the end of the loop (inside the loop, after the second printf(), the one I deconstructed). If concatenate() doesn't allocate new memory each time (it returns a pointer to statically allocated memory), then I think concatenating a string with the output of concatenate() is also dangerous, probably invoking undefined behaviour as you copy a string over itself. You need to look hard at the code for concatenate(), and/or present it for analysis.
Thank you very much for all your input, after reviewing your comments and making a few experiments myself, I figured out that remove/unlink was not working because the filename was only temporarily saved at variable cell (it was there long enough for it to be printed correctly to console, hence my confusion). After appropriately storing my filename before usage, my problem has been completely solved.
Here's the code (I have already checked it with filenames as complex as I could make them):
int delete_folder(char* foldername) {
DIR *dp;
struct dirent *ep;
dp=opendir(foldername);
if (dp!=NULL) {
readdir(dp); readdir(dp);
while (ep=readdir(dp)) {
char cell[strlen(foldername)+1+strlen(ep->d_name)+1];
strcpy(cell, concatenate(concatenate(foldername, "\\"), ep->d_name));
unlink(cell);
printf("File \"%s\": %s\n", ep->d_name, strerror(errno));
}
closedir(dp);
}
if (!rmdir(foldername)) {return(0);} else {return(-1);}
}
I realize it was kind of a noob mistake, resulting from my being out of practice in C programming for a while, so... Thank you very much for all your help!
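One thing worth noting for later readers: concatenate() as posted returns the address of a local array (declared as an array of char * rather than char), so the pointer is no longer valid once the function returns. That is presumably why the name was only "temporarily" in cell. A sketch of an alternative that writes into a buffer owned by the caller, where join_path is a made-up name:

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical replacement for concatenate(): writes "dir<sep>name" into
 * out, which has room for cap bytes. Returns 0 on success, -1 if the
 * result would not fit. */
int join_path(char *out, size_t cap, const char *dir, char sep, const char *name)
{
    int n = snprintf(out, cap, "%s%c%s", dir, sep, name);
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}
```

Inside the loop this would be used as join_path(cell, sizeof cell, foldername, '\\', ep->d_name) with cell declared exactly as in the final code above, so no pointer to a dead stack frame is ever involved.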

Forking with command line arguments

I am building a Linux Shell, and my current headache is passing command line arguments to forked/exec'ed programs and system functions.
Currently all input is tokenized on spaces and new lines, in a global variable char * parsed_arguments. For example, the input dir /usa/folderb would be tokenized as:
parsed_arguments[0] = dir
parsed_arguments[1] = /usa/folderb
parsed_arguments tokenizes everything perfectly. My issue now is that I wish to take only a subset of parsed_arguments, which excludes the command (the first argument, the path to the executable to run in the shell), and store it in a new array called passed_arguments.
so in the previous example dir /usa/folderb
parsed_arguments[0] = dir
parsed_arguments[1] = /usa/folderb
passed_arguments[0] = /usa/folderb
passed_arguments[1] = etc....
Currently I am not having any luck with this so I'm hoping someone could help me with this. Here is some code of what I have working so far:
How I'm trying to copy arguments:
void command_Line()
{
int i = 1;
for(i;parsed_arguments[i]!=NULL;i++)
printf("%s",parsed_arguments[i]);
}
Function to read commands:
void readCommand(char newcommand[]){
printf("readCommand: %s\n", newcommand);
//parsed_arguments = (char* malloc(MAX_ARGS));
// strcpy(newcommand,inputstring);
parsed = parsed_arguments;
*parsed++ = strtok(newcommand,SEPARATORS); // tokenize input
while ((*parsed++ = strtok(NULL,SEPARATORS)))
//printf("test1\n"); // last entry will be NULL
//passed_arguments=parsed_arguments[1];
if(parsed[0]){
char *initial_command =parsed[0];
parsed= parsed_arguments;
while (*parsed) fprintf(stdout,"%s\n ",*parsed++);
// free (parsed);
// free(parsed_arguments);
}//end of if
command_Line();
}//end of ReadCommand
Forking function:
else if(strstr(parsed_arguments[0],"./")!=NULL)
{
int pid;
switch(pid=fork()){
case -1:
printf("Fork error, aborting\n");
abort();
case 0:
execv(parsed_arguments[0],passed_arguments);
}
}
This is what my shell currently outputs. The first time I run it, it outputs something close to what I want, but every subsequent call breaks the program. In addition, each additional call appends the parsed arguments to the output.
This is what the original shell produces. Again it's close to what I want, but not quite. I want to omit the command (i.e. "./testline").
Your testline program is a sensible one to have in your toolbox; I have a similar program that I call al (for Argument List) that prints its arguments, one per line. It doesn't print argv[0] though (I know it is called al). You can easily arrange for your testline to skip argv[0] too. Note that Unix convention is that argv[0] is the name of the program; you should not try to change that (you'll be fighting against the entire system).
#include <stdio.h>
int main(int argc, char **argv)
{
    while (*++argv != 0)
        puts(*argv);
    return 0;
}
Your function command_Line() is also reasonable except that it relies unnecessarily on global variables. Think of global variables as a nasty smell (H2S, for example); avoid them when you can. It should be more like:
void command_Line(char *argv[])
{
    for (int i = 1; argv[i] != NULL; i++)
        printf("<<%s>>\n", argv[i]);
}
If you're stuck with C89, you'll need to declare int i; outside the loop and use just for (i = 1; ...) in the loop control. Note that the printing here separates each argument on a line on its own, and encloses it in marker characters (<< and >> — change to suit your whims and prejudices). It would be fine to skip the newline in the loop (maybe use a space instead), and then add a newline after the loop (putchar('\n');). This makes a better, more nearly general purpose debug routine. (When I write a 'dump' function, I usually use void dump_argv(FILE *fp, const char *tag, char *argv[]) so that I can print to standard error or standard output, and include a tag string to identify where the dump is written.)
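A possible shape for the dump function mentioned in the parenthesis, as a sketch only. The signature is the one given above except that this version returns the number of arguments printed, which is an addition made here for illustration:

```c
#include <stdio.h>

/* Prints tag followed by every argument in the NULL-terminated argv on
 * one line, each wrapped in << >>, and returns how many were printed. */
int dump_argv(FILE *fp, const char *tag, char *argv[])
{
    int n = 0;
    fprintf(fp, "%s:", tag);
    for (; argv[n] != NULL; n++)
        fprintf(fp, " <<%s>>", argv[n]);
    putc('\n', fp);
    return n;
}
```

Calling dump_argv(stderr, "before exec", parsed_arguments) just before the fork shows exactly what the child is about to receive, including argv[0].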
Unfortunately, given the fragmentary nature of your readCommand() function, it is not possible to coherently critique it. The commented out lines are enough to elicit concern, but without the actual code you're running, we can't guess what problems or mistakes you're making. As shown, it is equivalent to:
void readCommand(char newcommand[])
{
    printf("readCommand: %s\n", newcommand);
    parsed = parsed_arguments;
    *parsed++ = strtok(newcommand, SEPARATORS);
    while ((*parsed++ = strtok(NULL, SEPARATORS)) != 0)
    {
        if (parsed[0])
        {
            char *initial_command = parsed[0];
            parsed = parsed_arguments;
            while (*parsed)
                fprintf(stdout, "%s\n ", *parsed++);
        }
    }
    command_Line();
}
The variables parsed and parsed_arguments are both globals and the variable initial_command is set but not used (aka 'pointless'). The if (parsed[0]) test is not safe; you incremented the pointer in the previous line, so it is pointing at indeterminate memory.
Superficially, judging from the screen shots, you are not resetting the parsed_arguments[] and/or passed_arguments[] arrays correctly on the second use; it might be an index that is not being set to zero. Without knowing how the data is allocated, it is hard to know what you might be doing wrong.
I recommend closing this question, going back to your system and producing a minimal SSCCE. It should be under about 100 lines; it need not do the execv() (or fork()), but should print the commands to be executed using a variant of the command_Line() function above. If this answer prevents you deleting (closing) this question, then edit it with your SSCCE code, and notify me with a comment to this answer so I get to see you've done that.
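As a starting point for such an SSCCE, the tokenizing step could be reduced to something like the sketch below, which fills an array supplied by the caller instead of globals. MAX_ARGS and the separator set are assumed values, not taken from the question:

```c
#include <string.h>

#define MAX_ARGS   64        /* assumed limit, not from the original post */
#define SEPARATORS " \t\n"   /* assumed separator set */

/* Splits line in place and stores the tokens in argv, which must have room
 * for MAX_ARGS pointers. Returns the token count; argv[count] is NULL. */
int tokenize(char *line, char *argv[])
{
    int argc = 0;
    for (char *tok = strtok(line, SEPARATORS);
         tok != NULL && argc < MAX_ARGS - 1;
         tok = strtok(NULL, SEPARATORS))
        argv[argc++] = tok;
    argv[argc] = NULL;       /* execv() needs the terminating NULL */
    return argc;
}
```

Because the array is rebuilt from index 0 on every call, nothing is carried over from the previous command. The whole array goes to execv(argv[0], argv); if the arguments without the command are ever needed, they are simply argv + 1, so no second array is required.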

malloc on (char**)

Well, I'm trying to write a shell for linux using C. Using the functions fork() and execl(), I can execute each command, but now I'm stuck trying to read the arguments:
char * command;
char ** c_args = NULL;
bytes_read = getline (&command, &nbytes, stdin);
command = strtok(command, "\n ");
int arg = 0;
c_arg = strtok(NULL, "\n ");
while( c_arg != NULL ) {
if( c_args == NULL ) {
c_args = (char**) malloc(sizeof(char*));
}
else {
c_args = (char**) realloc( c_args, sizeof(char*) * (arg + 1) );
}
c_args[arg] = (char*) malloc( sizeof(char)*1024 );
strcpy( c_args[arg], c_arg );
c_arg = strtok(NULL, "\n ");
arg++;
}
...
pid_t pid = fork()
...
...
execl( <path>, command, c_args, NULL)
...
...
That way I get errors from the command when I try to pass arguments, for example:
ls -l
Gives me:
ls: cannot access p��: No such file or directory
I know that the problem is the c_args allocation. What's wrong with it?
Cheers.
You can't use execl() for a variable list of arguments; you need to use execv() or one of its variants (execve(), execvp(), etc). You can only use execl() when you know all the arguments that will be present at compile time. In most cases, a general shell won't know that. An exception is when you do something like:
execl("/bin/sh", "/bin/sh", "-c", command_line, (char *)0);
Here, you're invoking the shell to run a single string as the command line (with no other arguments). However, when you're dealing with what people type at the keyboard in a full shell, you won't have the luxury of knowing how many arguments they typed at compile time.
At its simplest, you should be using:
execvp(c_args[0], c_args);
The zeroth argument, the command name, should be what you pass to execvp(). If that's a simple file name (no /), then it will look for the command in directories on your $PATH environment variable. If the command name contains a slash, then it will look for the (relative or absolute) file name specified and execute that if it exists, and fail if it does not. The other arguments should all be in the null-terminated list c_args.
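In context, the fork/exec/wait sequence around that call might look like the following sketch. The name run_command and the exit code 127 for a failed exec (borrowed from shell convention) are choices made here, not something from the question:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv[0] with the NULL-terminated argv and
 * returns its exit status, or -1 if fork/wait failed or the child was
 * killed by a signal. */
int run_command(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);              /* only reached if execvp failed */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child uses _exit() rather than exit() so that it does not flush stdio buffers inherited from the parent a second time.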
Now, there may also be other memory allocation issues; I've not scrutinized the code. You could check them, though, by diagnostic printing of the argument list:
char **pargs = c_args;
while (*pargs != 0)
    puts(*pargs++);
That prints each argument on a separate line. Note that it doesn't stop until it encounters a null pointer; it is crucial that you null terminate your list of pointers to the argument strings.
This bit of your code:
c_args[arg] = (char*) malloc( sizeof(char)*1024 );
strcpy( c_args[arg], c_arg );
looks like overkill in the usual case, and an inadequate memory allocation in the extreme cases. When you're copying strings around, allocate enough length. I see that you're using strtok() to bust apart a string — it'll do for the early incarnations of a shell, but when you get to process command lines like ls -l>$tmp, you will find strtok()'s penchant for trampling over your delimiter before you get to read it becomes a major liability. However, while you're using it, you probably don't have to copy the arguments like that; you can just set c_args[arg++] = result_from_strtok;. When you do need to copy, you should probably use strdup(); it doesn't forget to allocate enough space for the trailing '\0', for example, and neither over-allocates nor under-allocates.
Jonathan has a great answer, I just wanted to add a few more things.
You could use popen or system to just execute by the shell directly. They are usually frowned upon since they can be injected so easily, but if you are writing an open shell, I don't see the harm in using them.
If you are going for a limited shell (which accepts a sh-like syntax), take a look into wordexp. It does a lot, but in my experience it does too much, especially if you are trying to write a moderately secure interpreter (it does silly things like tilde expansion and variable substitution).
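For reference, a minimal sketch of calling wordexp(). The helper name print_words is made up, and passing WRDE_NOCMD (which rejects command substitution) is my own choice in the spirit of the security remark above:

```c
#include <stdio.h>
#include <wordexp.h>

/* Hypothetical helper: expands line with wordexp() and prints each word.
 * Returns the word count, or -1 on a syntax error or other failure. */
int print_words(const char *line)
{
    wordexp_t we;
    /* WRDE_NOCMD makes command substitution such as $(...) an error */
    if (wordexp(line, &we, WRDE_NOCMD) != 0)
        return -1;
    for (size_t i = 0; i < we.we_wordc; i++)
        puts(we.we_wordv[i]);
    int count = (int)we.we_wordc;
    wordfree(&we);
    return count;
}
```

In a shell you would skip the printing and pass we.we_wordv, which is NULL-terminated, to execvp() before calling wordfree(). Tilde and variable expansion still happen with this flag.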

Traversing file system according to a given root place by using threads by using C for unix

I want to traverse the file system using threads and processes. My program has to assume the first parameter is either "-p", which selects a multi-process application, or "-t", which runs in a multi-threaded way. The second parameter is the pathname of a file or directory. If my program gets the path of a file, it should print out the size of the file in bytes. If my program gets the path of a directory, it should, in the same way, print out the directory name, then process all the entries in the directory except the directory itself and the parent directory. If my program is given a directory, it must display the entire hierarchy rooted at the specified directory. I wrote something but I got stuck and cannot improve my code. Please help me.
My code is as following:
include
include
include
include
include
include
include
int funcThread(DIR *D);
int main(int argc, char * argv[])
{
pthread_t thread[100];
DIR *dirPointer;
struct stat object_file;
struct dirent *object_dir;
int counter;
if(opendir(argv[1])==NULL)
{
printf("\n\nERROR !\n\n Please enter -p or -t \n\n");
return 0;
}
if((dirPointer=opendir(argv[1]))=="-t")
{
if ((object_dir = opendir(argv[2])) == NULL)
{
printf("\n\nERROR !\n\nPlease enter the third argument\n\n");
return 0;.
}
else
{
counter=0;
while ((object_dir = readdir(object_dir)) != NULL)
{
pthread_create(&thread[counter],NULL,funcThread,(void *) object_dir);
counter++;
}
}
}
return 0;
}
int funcThread(DIR *dPtr)
{
DIR *ptr;
struct stat oFile;
struct dirent *oDir;
int num;
if(ptr=readdir(dPtr)==NULL)
rewinddir(ptr);
if(S_ISDIR(oFile.st_mode))
{
ptr=readdir(dPtr);
printf("\t%s\n",ptr);
return funcThread(ptr);
}
else
{
while(ptr=readdir(dPtr)!=NULL)
{
printf("\n%s\n",oDir->d_name);
stat(oDir->d_name,&oFile);
printf("\n%f\n",oFile.st_size);
}
rewinddir(ptr);
}
}
This line:
if((dirPointer=opendir(argv[1]))=="-t")
dirPointer is a pointer DIR* so how can it be equal to a literal string pointer?
I spotted a few errors:
Why are you using opendir() to check your arguments? You should use something like strcmp for that.
You're passing struct dirent* to funcThread() but funcThread() takes a DIR*.
You're using oFile in funcThread() before you initialize it (by calling stat()).
What is the purpose of calling rewinddir()? I guess you're blindly trying to get readdir() to work with a struct dirent*.
You're using oDir but it's never initialized.
You're calling printf() from multiple threads with no means to synchronize the output so it would be completely out of order or garbled.
I suggest you read and understand the documentation of all those functions before using them (google "posix function_name") and get familiar with the basics of C. And before bringing threads into the equation try to get it working on a single threaded program. Also you won't see an improvement in performance by using that many threads unless you have close to that many cores, it will actually decrease performance and increase resource usage.
if(ptr=readdir(dPtr)==NULL){}
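The = operator has lower precedence than ==, so this parses as ptr = (readdir(dPtr) == NULL). The assignment needs its own parentheses: (ptr = readdir(dPtr)) == NULL.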
[this error is repeated several times]
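To get a single-threaded version working first, as suggested above, the traversal could look roughly like this sketch. The function name walk, the 4096-byte path buffer and the use of lstat() (so that symbolic links are reported rather than followed) are choices made here, not requirements from the question:

```c
#define _POSIX_C_SOURCE 200809L   /* for lstat() under strict compiler modes */
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical single-threaded walker: prints path (with its size if it is
 * not a directory) and recurses into directories. Returns the number of
 * non-directory entries seen, or -1 if path cannot be examined. */
long walk(const char *path)
{
    struct stat st;
    if (lstat(path, &st) != 0)
        return -1;
    if (!S_ISDIR(st.st_mode)) {
        printf("%s\t%lld bytes\n", path, (long long)st.st_size);
        return 1;
    }

    printf("%s/\n", path);
    DIR *dirp = opendir(path);
    if (dirp == NULL)
        return -1;

    long files = 0;
    struct dirent *de;
    while ((de = readdir(dirp)) != NULL) {   /* assignment parenthesized */
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;                        /* skip self and parent */
        char child[4096];
        int n = snprintf(child, sizeof child, "%s/%s", path, de->d_name);
        if (n < 0 || (size_t)n >= sizeof child)
            continue;                        /* path too long: skip it */
        long sub = walk(child);
        if (sub > 0)
            files += sub;
    }
    closedir(dirp);
    return files;
}
```

In main(), the mode flag would be checked with strcmp(argv[1], "-t") == 0 rather than with opendir(), and argv[2] passed to walk(). Once this works, each recursive call is a natural unit to hand to a thread or a child process.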
