I'm trying to use the md5sum command in a C program. Right now I'm using dirent.h to get all the files in a folder, and now I want to get the MD5 of each of those files. I am doing this:
#include <sys/types.h>
#include <sys/stat.h>
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <errno.h>
#include <unistd.h>
#include <syslog.h>
#include <string.h>
#include <dirent.h>
int main(void){
char *word = ".gz";
int i=0;
char *word2 = ".";
char *word3 = "..";
unsigned int md5;
DIR *d;
struct dirent *dir;
d = opendir(".");
if (d) {
while ((dir = readdir(d)) != NULL)
{
if((strstr(dir->d_name, word) == NULL) && (strcmp(dir->d_name, word2) != 0) && (strcmp(dir->d_name, word3)!= 0)) {
md5 = system("md5sum dir->d_name");
printf("The md5 of %s is %d\n", dir->d_name, md5);
}
}
}
return(0);
}
but when I run it, it says, for example:
md5sum: dir-: No such file or directory
The md5 of ej1_signal.c is 256
md5sum: dir-: No such file or directory
The md5 of pipeL.c is 256
Could you please explain why this is happening? Thanks!
The system function doesn't return what you think. system launches a command, and when that command finishes it (generally) exits with an exit code. That is the value you caught: system returns the wait status, and 256 is the wait status for exit code 1, which is what md5sum exits with when it cannot find the file.
What you need is the output of the command not its return value. So what you need is popen which lets you launch some external command and read/write to it through a pipe. See http://pubs.opengroup.org/onlinepubs/009695399/functions/popen.html for example.
system does not return the output of a command. To get the output of a command, you need to create a process and tie the standard output stream to a file descriptor you can read data off in the other process. For an example on how to do that, you can refer to the pipe man page (section 2).
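The process-plus-pipe approach described above can be sketched roughly as follows. This is only a minimal sketch, not a definitive implementation: capture_output is a hypothetical helper name, and it assumes the output you care about fits in the buffer you pass in.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run the program argv[0] with the NULL-terminated
 * argument list argv, store up to outsz - 1 bytes of its standard output
 * in out (always NUL-terminated), and return the child's exit code.
 * Returns -1 if the child could not be created or did not exit normally.
 */
int capture_output(char *const argv[], char *out, size_t outsz)
{
    int fds[2];
    pid_t pid;
    size_t used = 0;
    ssize_t n;
    int status;

    assert(argv != NULL && argv[0] != NULL && out != NULL && outsz > 0);

    if (pipe(fds) == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: make the write end of the pipe its standard output. */
        close(fds[0]);
        if (dup2(fds[1], STDOUT_FILENO) == -1)
            _exit(127);
        close(fds[1]);
        execvp(argv[0], argv);
        _exit(127); /* only reached if execvp failed */
    }

    /* Parent: read until the buffer is full or the child closes the pipe. */
    close(fds[1]);
    while (used < outsz - 1 &&
           (n = read(fds[0], out + used, outsz - 1 - used)) > 0)
        used += (size_t)n;
    out[used] = '\0';
    close(fds[0]);

    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For the question's case you would call it with something like { "md5sum", dir->d_name, NULL }. Because no shell is involved, the file name needs no quoting. The digest is normally the first 32 characters of the captured line.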
Another option is to use a library that provides an MD5 implementation (eg. OpenSSL). The man page of EVP_DigestInit (section 3) provides an example for that.
Another problem is that your code tries to calculate the digest of a file literally named dir->d_name, not of the file whose name is stored in dir->d_name. You could build the command with sprintf or strncat and a suitably sized buffer, i.e. the length of the static string part ("md5sum ") plus the maximum size of the file name (usually 256 bytes, may vary between library implementations and file systems) plus another byte for safely terminating the string (as some implementations may report an unterminated string in dir->d_name). Please note that this does not apply if you use a library for digest calculation, as the library either takes the file name or needs the file contents passed to a library function (e.g. EVP_DigestUpdate).
The first problem is that you launch a new shell process executing "md5sum dir->d_name", meaning it does a md5 on the "file" named dir->d_name, instead of using the value you get from readdir.
So you could add a temp variable, and prepare the command in it prior to running system.
linux/limits.h is Linux-specific; adjust the include if necessary to get the maximum length of a path (on other POSIX systems PATH_MAX typically comes from limits.h).
...
#include <linux/limits.h>
char temp[PATH_MAX];
then instead of
md5 = system("md5sum dir->d_name");
add
strcpy(temp, "md5sum ");
strcat(temp, dir->d_name);
system(temp);
As for the other problem (system will not return the md5 string): md5sum itself prints the digest of each file, so you can just remove the printf.
There is no function in standard C that returns the output of an external command, but POSIX provides popen, which opens a command as a FILE * so you can read its output. This is how you can do it; the steps are explained in the code comments.
#include <sys/types.h>
#include <sys/stat.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <dirent.h>
int main(void)
{
DIR *d;
struct dirent *dir;
d = opendir(".");
if (d == NULL)
return -1;
while ((dir = readdir(d)) != NULL)
{
char command[sizeof dir->d_name + 10];
struct stat st;
FILE *pipe;
if (stat(dir->d_name, &st) == -1)
continue;
/* check if the entry is a directory, md5sum does not work with them */
if (S_ISDIR(st.st_mode) != 0)
continue;
/*
* md5sum dir->d_name will pass `dir->d_name` as the argument to the md5sum command,
* we need to build the command string, I like snprintf in this case
*/
snprintf(command, sizeof command, "md5sum \"%s\"", dir->d_name);
/*
* Open the pipe, it will execute the new command in a new process (fork)
* and create a pipe for communication with the current process
*/
pipe = popen(command, "r");
if (pipe != NULL)
{
char md5[33];
size_t count;
/* read the md5 digest string from the command output */
count = fread(md5, 1, sizeof md5 - 1, pipe);
/* append a null terminator after the bytes actually read */
md5[count] = '\0';
if (count == sizeof md5 - 1)
printf("The md5 of %s is %s\n", dir->d_name, md5);
/* close the pipe, which is only valid if popen() succeeded */
pclose(pipe);
}
}
/* you should always call closedir() if opendir() succeeded */
closedir(d);
return 0;
}
I've found code on Google that was over 50 lines long, and that's completely unnecessary for what I'm trying to do.
I want to make a very simple cp implementation in C.
Just so I can play with the buffer sizes and see how it affects performance.
I want to use only Linux API calls like read() and write() but I'm having no luck.
I want a buffer that is defined as a certain size so data from file1 can be read into buffer and then written to file2 and that continues until file1 has reached EOF.
Here is what I tried but it doesn't do anything
#include <stdio.h>
#include <sys/types.h>
#define BUFSIZE 1024
int main(int argc, char* argv[]){
FILE fp1, fp2;
char buf[1024];
int pos;
fp1 = open(argv[1], "r");
fp2 = open(argv[2], "w");
while((pos=read(fp1, &buf, 1024)) != 0)
{
write(fp2, &buf, 1024);
}
return 0;
}
The way it would work is ./mycopy file1.txt file2.txt
This code has an important problem, the fact that you always write 1024 bytes regardless of how many you read.
Also:
You don't check the number of command line arguments.
You don't check if the source file exists (if it opens).
You don't check that the destination file opens (permission issues).
You pass the address of the array, which has a different type than a pointer to the first element of the array.
The type of fp1 is wrong, as well as that of fp2.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
int main(int argc, char **argv)
{
char buffer[1024];
int files[2];
ssize_t count;
/* Check for insufficient parameters */
if (argc < 3)
return -1;
files[0] = open(argv[1], O_RDONLY);
if (files[0] == -1) /* Check if file opened */
return -1;
/* The permission bits are the third argument of open(), not part of the flags */
files[1] = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
if (files[1] == -1) /* Check if file opened (permissions problems ...) */
{
close(files[0]);
return -1;
}
while ((count = read(files[0], buffer, sizeof(buffer))) > 0)
write(files[1], buffer, count);
close(files[0]);
close(files[1]);
return count == -1 ? -1 : 0; /* read() returns -1 on error */
}
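Since the point of the exercise is to play with buffer sizes, the read/write loop could be pulled into a function that takes the size as a parameter. The following is a rough sketch under stated assumptions: copy_fd and write_all are hypothetical names, and the sketch also retries after short writes and EINTR, which the answer above does not do.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Write all len bytes of buf, retrying after short writes. Returns 0 or -1. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n == -1) {
            if (errno == EINTR)
                continue; /* interrupted before writing anything: retry */
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Hypothetical helper: copy everything from descriptor in to descriptor out
 * through a heap buffer of bufsize bytes. Returns 0 on success, -1 on error.
 */
int copy_fd(int in, int out, size_t bufsize)
{
    char *buf;
    ssize_t n;
    int rc = 0;

    assert(bufsize > 0);

    buf = malloc(bufsize);
    if (buf == NULL)
        return -1;

    for (;;) {
        n = read(in, buf, bufsize);
        if (n == 0)
            break; /* end of file */
        if (n == -1) {
            if (errno == EINTR)
                continue;
            rc = -1;
            break;
        }
        /* Write only the bytes that were actually read. */
        if (write_all(out, buf, (size_t)n) == -1) {
            rc = -1;
            break;
        }
    }

    free(buf);
    return rc;
}
```

You would open the two files as in the answer above and then call, for example, copy_fd(files[0], files[1], 4096), changing the last argument between runs while timing the program.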
Go to section 8.3 of K&R's "The C Programming Language". There you will see an example of what you want to accomplish. Try using different buffer sizes and you will eventually find the point where the performance peaks.
#include <stdio.h>
int cpy(char *, char *);
int main(int argc, char *argv[])
{
char *fn1 = argv[1];
char *fn2 = argv[2];
if (cpy(fn2, fn1) == -1) {
perror("cpy");
return 1;
}
return 0;
}
int cpy(char *fnDest, char *fnSrc)
{
FILE *fpDest, *fpSrc;
int c;
if ((fpDest = fopen(fnDest, "w")) && (fpSrc = fopen(fnSrc, "r"))) {
while ((c = getc(fpSrc)) != EOF)
putc(c, fpDest);
fclose(fpDest);
fclose(fpSrc);
return 0;
}
return -1;
}
First, we get the two file names from the command line (argv[1] and argv[2]). The reason we don't start from *argv, is that it contains the program name.
We then call our cpy function, which copies the contents of the source file (its second argument) to the destination file (its first argument).
Within cpy, we declare two file pointers: fpDest, the destination file pointer, and fpSrc, the source file pointer. We also declare c, the character that will be read. It is of type int, because EOF does not fit in a char.
If we could open the files successfully (that is, if fopen did not return NULL), we get characters from fpSrc and copy them onto fpDest, as long as the character we have read is not EOF. Once we have seen EOF, we close our file pointers and return 0, the success indicator. If we could not open the files, -1 is returned. The caller can check the return value for -1 and, if it is, print an error message.
Good question. Related to another good question:
How can I copy a file on Unix using C?
There are two approaches to the "simplest" implementation of cp. One approach uses a file copying system call function of some kind - the closest thing we get to a C function version of the Unix cp command. The other approach uses a buffer and read/write system call functions, either directly, or using a FILE wrapper.
It's likely the file copying system calls that take place solely in kernel-owned memory are faster than the system calls that take place in both kernel- and user-owned memory, especially in a network filesystem setting (copying between machines). But that would require testing (e.g. with Unix command time) and will be dependent on the hardware where the code is compiled and executed.
It's also likely that someone with an OS that doesn't have the standard Unix library will want to use your code. Then you'd want to use the buffer read/write version, since it only depends on <stdlib.h> and <stdio.h> (and friends).
<unistd.h>
Here's an example that uses the function copy_file_range, declared in <unistd.h> on Linux, to copy a source file to a (possibly non-existent) destination file. The copy takes place in kernel space.
/* copy.c
*
* Defines function copy:
*
* Copy source file to destination file on the same filesystem (possibly NFS).
* If the destination file does not exist, it is created. If the destination
* file does exist, the old data is truncated to zero and replaced by the
* source data. The copy takes place in the kernel space.
*
* Compile with:
*
* gcc copy.c -o copy -Wall -g
*/
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <unistd.h>
/* On versions of glibc < 2.27, need to use syscall.
*
* glibc reports its own version through the __GLIBC__ and __GLIBC_MINOR__
* macros, which are defined once any glibc header has been included.
* (The __GNUC__ macros describe the compiler, not the C library.)
*
*/
#if defined(__GLIBC__) && (__GLIBC__ < 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ < 27))
static loff_t copy_file_range(int in, loff_t* off_in, int out,
loff_t* off_out, size_t s, unsigned int flags)
{
return syscall(__NR_copy_file_range, in, off_in, out, off_out, s,
flags);
}
#endif
/* The copy function.
*/
int copy(const char* src, const char* dst){
int in, out;
struct stat stat;
loff_t s, n;
if(0>(in = open(src, O_RDONLY))){
perror("open(src, ...)");
exit(EXIT_FAILURE);
}
if(fstat(in, &stat)){
perror("fstat(in, ...)");
exit(EXIT_FAILURE);
}
s = stat.st_size;
if(0>(out = open(dst, O_CREAT|O_WRONLY|O_TRUNC, 0644))){
perror("open(dst, ...)");
exit(EXIT_FAILURE);
}
do{
/* copy_file_range returns -1 on error and 0 at end of file */
n = copy_file_range(in, NULL, out, NULL, s, 0);
if(n == -1){
perror("copy_file_range(...)");
exit(EXIT_FAILURE);
}
s-=n;
}while(0<s && 0<n);
close(in);
close(out);
return EXIT_SUCCESS;
}
/* Test it out.
*
* BASH:
*
* gcc copy.c -o copy -Wall -g
* echo 'Hello, world!' > src.txt
* ./copy src.txt dst.txt
* [ -z "$(diff src.txt dst.txt)" ]
*
*/
int main(int argc, char* argv[argc]){
if(argc!=3){
printf("Usage: %s <SOURCE> <DESTINATION>\n", argv[0]);
exit(EXIT_FAILURE);
}
copy(argv[1], argv[2]);
return EXIT_SUCCESS;
}
It's based on the example in my Ubuntu 20.x Linux distribution's man page for copy_file_range. Check your man pages for it with:
> man copy_file_range
Then hit j or Enter until you get to the example section. Or search by typing /example.
<stdio.h>/<stdlib.h> only
Here's an example that only uses stdlib/stdio. The downside is it uses an intermediate buffer in user-space.
/* copy.c
*
* Compile with:
*
* gcc copy.c -o copy -Wall -g
*
* Defines function copy:
*
* Copy a source file to a destination file. If the destination file already
* exists, this clobbers it. If the destination file does not exist, it is
* created.
*
* Uses a buffer in user-space, so may not perform as well as
* copy_file_range, which copies in kernel-space.
*
*/
#include <stdlib.h>
#include <stdio.h>
#define BUF_SIZE 65536 //2^16
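int copy(const char* in_path, const char* out_path){
size_t n;
int status = EXIT_FAILURE;
FILE* in=NULL, * out=NULL;
char* buf = calloc(BUF_SIZE, 1);
if(buf && (in = fopen(in_path, "rb")) && (out = fopen(out_path, "wb"))){
while((n = fread(buf, 1, BUF_SIZE, in)) && fwrite(buf, 1, n, out) == n);
/* succeed only if neither stream reported an error */
if(!ferror(in) && !ferror(out)) status = EXIT_SUCCESS;
}
free(buf);
if(in) fclose(in);
if(out && fclose(out) != 0) status = EXIT_FAILURE;
return status;
}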
/* Test it out.
*
* BASH:
*
* gcc copy.c -o copy -Wall -g
* echo 'Hello, world!' > src.txt
* ./copy src.txt dst.txt
* [ -z "$(diff src.txt dst.txt)" ]
*
*/
int main(int argc, char* argv[argc]){
if(argc!=3){
printf("Usage: %s <SOURCE> <DESTINATION>\n", argv[0]);
exit(EXIT_FAILURE);
}
return copy(argv[1], argv[2]);
}
Another way to ensure portability in general while still working with a Unix-like C API is to develop with the GNOME libraries (e.g. GLib, GIO):
https://docs.gtk.org/glib/
https://docs.gtk.org/gio/
I want to create a directory whose name ends with the process ID (to make it unique) and then store the new files that I just wrote inside that directory.
What I want:
1) Create a new directory named : mydirectory.1923 (example of the process id number)
2) Store a file that I just created using
FILE * fPointer = fopen("new.txt",w+)
into mydirectory.1923
What I have so far is this:
int bufSize = 20;
int pid = getpid();
char *fileName = malloc(bufSize);
char *prefix = "that.rooms.";
snprintf(fileName, bufSize,"%s%d", prefix, pid);
printf("%s\n",fileName);
struct stat st = {0};
if (stat(fileName, &st) == -1) {
mkdir(fileName, 0755);
}
DIR *dir = opendir (fileName);
if (dir != NULL) {
FILE *fLib = fopen("library.txt" , "w+");
fclose(fLib);
}
closedir(fileName);
return 0;
My Question:
This code doesn't work, apparently it says error on the DIR part.
Is this the right thing to do if I want to create directory, create file and store that file directly to the new directory?
Is there any suggestion or advice to do it better than this? Thank you.
Some comments:
As commented above, you should really allocate more space for your dir's name: snprintf silently truncates a name that does not fit in the buffer. There are the macros PATH_MAX and NAME_MAX in limits.h.
You don't need to open the directory for anything. This is usually used to iterate over items in directories. You only want to create a file in it.
Even if you don't really need it, the closedir function receives a DIR * argument, which I suppose would be dir in your code.
To create the file inside the new directory you should include your dir in the path on creation, or at least chdir to it before the fopen call. I don't recommend the latter as it affects the process wide working directory.
Below is a quick and dirty patched version of your code that creates the directory and the new file inside it taking into account the above:
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/stat.h>
#include <limits.h>
int main(int argc, char **argv){
int pid = getpid();
char dirName[NAME_MAX+1];
char *prefix = "that.rooms.";
snprintf(dirName, NAME_MAX + 1,"%s%d", prefix, pid);
printf("%s\n",dirName);
struct stat st = {0};
if (stat(dirName, &st) == -1) {
if(mkdir(dirName, 0755) != -1){
char libPath[PATH_MAX+1];
snprintf(libPath, PATH_MAX + 1, "%s/library.txt", dirName);
FILE *fLib = fopen(libPath , "w+");
if(fLib != NULL){
fclose(fLib);
}else{
perror("fopen");
}
}else{
perror("mkdir");
}
}
return 0;
}
Changed the variable names a bit so it's clearer:
dirName is used to hold the directory name. It uses NAME_MAX as this is the system limit for the length of a file name
libPath is used to hold the path to the library.txt file you are creating. If your pid was 3123, libPath would read that.rooms.3123/library.txt after the snprintf call. It uses PATH_MAX as this is the system limit for the length of a file's path.
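If the pid is only there to make the directory name unique, POSIX mkdtemp is an alternative worth knowing. To be clear, this is a swapped-in technique and not what the code above does. The sketch below assumes a template ending in XXXXXX instead of a pid suffix, and make_unique_dir_with_file is a hypothetical helper name.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700 /* asks the headers to declare mkdtemp */
#endif
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: create a unique directory from tmpl, which must end
 * in "XXXXXX" and is modified in place to hold the real name, then create
 * an empty file called name inside it. The path of the new file is written
 * to path. Returns 0 on success, -1 on failure.
 */
int make_unique_dir_with_file(char *tmpl, const char *name,
                              char *path, size_t pathsz)
{
    FILE *fp;
    int len;

    assert(tmpl != NULL && name != NULL && path != NULL && pathsz > 0);

    /* mkdtemp replaces the XXXXXX and creates the directory with mode 0700 */
    if (mkdtemp(tmpl) == NULL)
        return -1;

    /* Build "<directory>/<name>" and refuse to continue if it was truncated */
    len = snprintf(path, pathsz, "%s/%s", tmpl, name);
    if (len < 0 || (size_t)len >= pathsz)
        return -1;

    fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    return fclose(fp) == 0 ? 0 : -1;
}
```

A call could look like make_unique_dir_with_file(tmpl, "library.txt", path, sizeof path) with char tmpl[] = "that.rooms.XXXXXX". The template has to be a writable array, not a string literal.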
I am trying to 1) Find all files in a directory and display them, 2) Open all found files and read data from them (characters) 3) Output the read data to the screen or a new file.
This is done in C Language and you will see below my current code. The problem that I am running into is that: I can find all the files in my directory and print them to the screen just fine (point 1 above), but when I try to open the found files and read data (characters) from them (point 2 above), I get a segmentation fault.
If I comment out the fscanf(entry_file, "%s", files); line below, but leave the entry_file = fopen(in_file->d_name, "r"); line, it compiles okay and writes the files to the screen. I also tried indexing the fscanf line with the int i (not shown below) and produced the same segmentation fault.
So, how can I read data from these found files? Thanks!
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <dirent.h>
#include <unistd.h>
#include <errno.h>
int main()
{
DIR* dir;
FILE *entry_file;
struct dirent *in_file;
char files[1000];
int i;
dir = opendir("/Users/tcn/data");
if(dir==NULL){
printf("Error! Unable to read directory");
exit(1);
}
while( (in_file=readdir(dir)) != NULL) {
if (!strcmp (in_file->d_name, "."))
continue;
if (!strcmp (in_file->d_name, ".."))
continue;
printf("%s\n", in_file->d_name);
entry_file = fopen(in_file->d_name, "r");
fscanf(entry_file, "%s", files);
}
closedir(dir);
fclose(entry_file);
return 0;
}
Seeing as you are correctly checking for NULL against dir and in_file before using them, the only other thing that could possibly be causing this is entry_file being null. Check it before using it:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <dirent.h>
#include <unistd.h>
#include <errno.h>
int main()
{
DIR* dir;
FILE *entry_file;
struct dirent *in_file;
char files[1000];
int i;
dir = opendir("/Users/tcn/data");
if(dir==NULL) {
printf("Error! Unable to read directory");
exit(1);
}
while((in_file=readdir(dir)) != NULL) {
if (!strcmp (in_file->d_name, "."))
continue;
if (!strcmp (in_file->d_name, ".."))
continue;
printf("%s\n", in_file->d_name);
entry_file = fopen(in_file->d_name, "r");
if (entry_file != NULL) {
fscanf(entry_file, "%s", files);
/* whatever you want to do with files */
fclose(entry_file);
}
}
closedir(dir);
return 0;
}
Note also that, as multiple other users have commented, you should close entry_file within the loop.
The two most likely causes of the crash are not checking the return value of fopen – then either the fscanf or the fclose may crash when attempting to use entry_file when it's NULL – and the potential overflow of files.
Another problem which does not cause a crash is that the in_file->d_name does not contain the full path, but only the name of the file. So if you are testing the code inside /Users/tcn/data then it will appear to work, but it will fail elsewhere. Either prefix the filename with /Users/tcn/data/ or operate only on the current directory (.).
Fixes:
if ((entry_file = fopen(in_file->d_name, "r"))) {
(void) printf("%s\n", in_file->d_name);
if (fgets(files, sizeof files, entry_file)) { // or `while`?
// do something with `files`, it will be overwritten for next file
}
(void) fclose(entry_file);
}
And remove the other fclose(entry_file) from the end of the code.
Also note that if you use this code with an arbitrary directory, it might contain pipes and/or device nodes that will hang forever when you attempt to read them.
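Prefixing the directory onto d_name, as suggested above, could look roughly like this. It is a minimal sketch: join_path is a hypothetical helper name, and it assumes a '/' path separator.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write "dir/name" into out.
 * Returns 0 on success, or -1 if the result did not fit in outsz bytes.
 */
int join_path(char *out, size_t outsz, const char *dir, const char *name)
{
    size_t dlen;
    const char *sep;
    int len;

    assert(out != NULL && outsz > 0 && dir != NULL && name != NULL);

    /* Avoid a doubled separator when dir already ends with '/' */
    dlen = strlen(dir);
    sep = (dlen > 0 && dir[dlen - 1] == '/') ? "" : "/";

    /* snprintf returns the length it wanted to write, so truncation is detectable */
    len = snprintf(out, outsz, "%s%s%s", dir, sep, name);
    return (len < 0 || (size_t)len >= outsz) ? -1 : 0;
}
```

Inside the readdir loop you would then call join_path(full, sizeof full, "/Users/tcn/data", in_file->d_name) and pass full to fopen instead of in_file->d_name.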
You will need a function with a loop using fread() to replace the fscanf line, and do a hex dump. For one thing, you don't know if the files are text files or binary files. For another, the segfault could be coming from reading a binary file that contains no whitespace in its first 1000 bytes into char files[1000], since %s keeps reading until it finds whitespace. And even if the files are all text files, you cannot predict that your "generous" 1000 length is enough to hold the first whitespace-delimited token.
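A loop of that kind might look like the following sketch. hex_dump is a hypothetical name, and the 16-bytes-per-line layout is just an assumption for illustration.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: read in with fread in fixed-size chunks and print
 * each byte to out as two hex digits, at most 16 bytes per line.
 * Returns 0 on success, -1 if either stream reported an error.
 */
int hex_dump(FILE *in, FILE *out)
{
    unsigned char buf[16];
    size_t n, i;

    assert(in != NULL && out != NULL);

    /* fread reports how many bytes it got, so the buffer can never overflow */
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        for (i = 0; i < n; i++)
            fprintf(out, "%02x%c", buf[i], i + 1 < n ? ' ' : '\n');
    }
    return (ferror(in) || ferror(out)) ? -1 : 0;
}
```

In the question's loop this would replace the fscanf call, e.g. hex_dump(entry_file, stdout), after checking that fopen did not return NULL.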
My first post :) I am starting out with C as a basic learning step into the programming arena. I am using the following code, which reads a string from a text file, makes a directory with that string as its name, and opens a file for writing in the created directory. But I am not able to create a file inside the directory I made. Here is my code:
#include <stdio.h>
#include <stdlib.h>
#include <direct.h>
#include <string.h>
int main()
{
char file_name[25], cwd[100];
FILE *fp, *op;
fp = fopen("myfile.txt", "r");
if (fp == NULL)
{
perror("Error while opening the file.\n");
exit(EXIT_FAILURE);
}
fgets(file_name, 25, fp);
_mkdir(file_name);
if (_getcwd(cwd,sizeof(cwd)) != 0)
{
fprintf(stdout, "Your dir name: %s\\%s\n", cwd,file_name);
op = fopen("cwd\\file_name\\mynewfile.txt","w");
fclose(op);
}
fclose(fp);
return 0;
}
What you need is to store the file name (with the path) in a c-string before opening. What you are opening is cwd\file_name\mynewfile.txt. I doubt that your directory is named cwd.
A sample could be:
char file_path[150];
sprintf(file_path, "%s\\%s\\mynewfile.txt", cwd, file_name);
op = fopen(file_path,"w");
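One more thing to watch for: fgets keeps the trailing newline, so file_name probably still ends in '\n' when it is passed to _mkdir and used in the path. A hedged sketch of stripping it and building the path with a bounds check follows. build_file_path is a hypothetical helper name, and the backslash separator is taken from the question's Windows-style code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: strip the line ending that fgets leaves in dir_line
 * (modifying it in place), then write "cwd\dir_line\file" into out.
 * Returns 0 on success, or -1 if the path did not fit in outsz bytes.
 */
int build_file_path(char *out, size_t outsz, const char *cwd,
                    char *dir_line, const char *file)
{
    int len;

    assert(out != NULL && outsz > 0);
    assert(cwd != NULL && dir_line != NULL && file != NULL);

    /* Cut the string at the first CR or LF, if there is one */
    dir_line[strcspn(dir_line, "\r\n")] = '\0';

    len = snprintf(out, outsz, "%s\\%s\\%s", cwd, dir_line, file);
    return (len < 0 || (size_t)len >= outsz) ? -1 : 0;
}
```

The same strcspn line would need to run on file_name before the _mkdir call, otherwise the directory name itself contains the newline.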
use
#include <sys/stat.h>
#include <sys/types.h>
instead of
#include <direct.h>
and modify
op = fopen("cwd\\file_name\\mynewfile.txt","w");
I see you are using the return values. That is a good start for a beginner. You can refine your error messages by including "errno.h". Instead of printing your own error messages call
printf("%s", strerror(errno));
You get more precise error messages that way.
op = fopen("cwd\\file_name\\mynewfile.txt","w");
You’re actually passing the string literals “cwd” and “file_name” as part of the path of the file, when I think you actually mean to put the contents of the variables with those names in there. You will probably have to piece together a string for the path. Try looking into strcat()
http://www.cplusplus.com/reference/cstring/strcat/
I'm writing a UNIX minishell on ubuntu, and am trying to add built-in commands at this point. When it's not a built-in command I fork and then the child executes it, however for built-in commands I'll just execute it in the current process.
So, I need a way to see if the files exist(if they do it's not a built-in command), however execvp uses the environment PATH variable to automatically look for them, so I have no idea how I would manually check beforehand.
So, do you guys know how I could test an argument to see if it's a built-in command simply by supplying the name?
Thanks guys.
I have tested the answer by Tom
It contained a number of problems. I have fixed them here and provided a test program.
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/stat.h>
int is_file(const char* path) {
struct stat buf;
/* if stat fails, buf is not filled in, so treat the path as "not a file" */
if (stat(path, &buf) != 0)
return 0;
return S_ISREG(buf.st_mode);
}
/*
* returns non-zero if the file is a file in the system path, and executable
*/
int is_executable_in_path(char *name)
{
char *path = getenv("PATH");
char *item = NULL;
int found = 0;
if (!path)
return 0;
path = strdup(path);
char real_path[4096]; // or PATH_MAX or something smarter
for (item = strtok(path, ":"); (!found) && item; item = strtok(NULL, ":"))
{
snprintf(real_path, sizeof real_path, "%s/%s", item, name);
// printf("Testing %s\n", real_path);
if ( is_file(real_path) && !(
access(real_path, F_OK)
|| access(real_path, X_OK))) // check if the file exists and is executable
{
found = 1;
}
}
free(path);
return found;
}
int main()
{
if (is_executable_in_path("."))
puts(". is executable");
if (is_executable_in_path("echo"))
puts("echo is executable");
}
Notes
the test for access return value was reversed
the second strtok call had the wrong delimiter
strtok changed the path argument. My sample uses a copy
there was nothing to guarantee a proper path separator char in the concatenated real_path
there was no check whether the matched file was actually a file (directories can be 'executable' too). This leads to strange things like . being recognized as an external binary
What you can do is you can change the path to the particular directory and then use #include<dirent.h> header file and its readdir and scandir functions to walk through the directory or stat structure to see if the file exists in the directory or not.
You can iterate through the PATH directories yourself: split PATH on : (probably using strtok) and, for each entry, append the name of the command to the directory. Once you have built this path, check whether the file exists and is executable using access.
int is_built_in(char *path, char *name)
{
char *item = strtok(path, ":");
do {
char real_path[4096]; // you would normally alloc exactly the size needed but lets stick to it for the sake of the example
snprintf(real_path, sizeof real_path, "%s/%s", item, name); // needs <stdio.h>
if (!access(real_path, F_OK) && !access(real_path, X_OK)) // check if the file exists and is executable
return 0;
} while ((item = strtok(NULL, ":")) != NULL);
return 1;
}
Why do you want to test before calling execvp? That's the wrong approach. Just call execvp and it will tell you if the program does not exist.
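Put together, a minishell could check the name against its own table of built-ins first and otherwise let execvp do the PATH search. The following is only a sketch of that idea: the builtins table and the names is_builtin and run_external are hypothetical, and exit code 127 for a command that could not be run is borrowed from shell convention.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical table: whichever commands this shell implements itself */
static const char *const builtins[] = { "cd", "exit" };

/* Returns 1 if name is one of the shell's own built-ins, 0 otherwise. */
int is_builtin(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof builtins / sizeof builtins[0]; i++)
        if (strcmp(name, builtins[i]) == 0)
            return 1;
    return 0;
}

/*
 * Run an external command and wait for it. Returns its exit code,
 * 127 if execvp could not run it, or -1 if fork or waitpid failed
 * or the child did not exit normally.
 */
int run_external(char *const argv[])
{
    pid_t pid;
    int status;

    assert(argv != NULL && argv[0] != NULL);

    pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        execvp(argv[0], argv);
        /* execvp only returns on failure; errno says why (e.g. ENOENT) */
        fprintf(stderr, "%s: %s\n", argv[0], strerror(errno));
        _exit(127);
    }

    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The shell's main loop would then call is_builtin(args[0]) first and only fall back to run_external(args) when it returns 0, so no manual PATH lookup is needed.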