I have a program that I'm doing for class where I need to take the content of one file, reverse it, and write that reversed content to another file. I have written a program that successfully does this (after much googling as I am new to the C programming language). The problem however is that my professor wants us to submit the program in a certain way with a couple supporting .h and .c files (which I understand is good practice). So I was hoping someone could help me understand exactly how I can take my already existing program and make it into one that is to his specifications, which are as follows:
he would like a file named "file_utils.h" that has function signatures and guards for the following two functions
int read_file( char* filename, char **buffer );
int write_file( char* filename, char *buffer, int size);
thus far I have created this file to try and accomplish this.
#ifndef UTILS_H
#define UTILS_H
int read_file(char* filename, char **buffer);
int write_file(char* filename, char *buffer, int size);
#endif
he would like a file named "file_utils.c" that has the implemented code for the previous two functions
he would like a file named "reverse.c" that accepts command arguments, includes a main function, and calls the functions from the previous two files.
Now, I understand how this is supposed to work, but looking at the program I wrote my own way, I'm unsure how to achieve the same result while adhering to the specifications above.
Below is the program that successfully accomplishes the desired functionality
#include<stdlib.h>
#include<stdio.h>
#include<fcntl.h>
#include<string.h>
#include<sys/stat.h>
#include<unistd.h>
int main(int argc, char *argv[]) {
int file1, file2, char_count, x, k;
char buffer;
// if the number of parameters passed are not correct, exit
//
if (argc != 3) {
fprintf(stderr, "usage %s <file1> <file2>", argv[0]);
exit(EXIT_FAILURE);
}
// if the origin file cannot be opened for whatever reason, exit
// O_RDONLY opens the file for reading only (S_IRUSR is a permission bit, not an open flag)
//
if ((file1 = open(argv[1], O_RDONLY)) < 0) {
fprintf(stderr, "The origin-file is inaccessible");
exit(EXIT_FAILURE);
}
// if the destination-file cannot be opened for whatever reason, exit
// S_IWUSR specifies that this file is to be written to by only the file owner
//
if ((file2 = creat(argv[2], S_IWUSR)) < 0) {
fprintf(stderr, "The destination-file is inaccessible");
exit(EXIT_FAILURE);
}
// SEEK_END is used to place the read/write pointer at the end of the file
//
char_count = lseek(file1, (off_t) 0, SEEK_END);
printf("origin-file size is %d\n", char_count - 1);
for (k = char_count - 1; k >= 0; k--) {
lseek(file1, (off_t) k, SEEK_SET);
x = read(file1, &buffer, 1);
if (x != 1) {
fprintf(stderr, "can't read 1 byte");
exit(-1);
}
x = write(file2, &buffer, 1);
if (x != 1) {
fprintf(stderr, "can't write 1 byte");
exit(-1);
}
}
write(STDOUT_FILENO, "Reversal & Transfer Complete\n", 29);
close(file1);
close(file2);
return 0;
}
any insight as to how I can accomplish this "re-factoring" of sorts would be much appreciated, thanks!
The assignment demands a different architecture than your program. Unfortunately, this will not be a refactoring but a rewrite.
You have most of the pieces of read_file and write_file already: opening the file, determining its length, error handling. Those can be copy-pasted into the new functions.
But read_file should call malloc and read the file into memory, which is different.
You should create a new function in reverse.c, called by main, to reverse the bytes in a memory buffer.
After that function runs, write_file should attempt to open the file, and only do its error checking at that point.
Your simple program is superior because it validates the output file before any I/O, and it requires less memory. Its behavior satisfies the assignment, but its form does not.
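A minimal sketch of what the pieces described above could look like, using the two signatures from the question. The return conventions (byte count on success, -1 on failure) and the helper name reverse_buffer are assumptions; the assignment text as quoted does not specify them.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* file_utils.c: read the whole file into a malloc'd buffer stored in *buffer.
 * Returns the number of bytes read, or -1 on failure (assumed convention).
 * The caller must free(*buffer). */
int read_file(char *filename, char **buffer)
{
    struct stat st;
    int fd = open(filename, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    int size = (int)st.st_size;
    char *data = malloc(size > 0 ? (size_t)size : 1);
    if (data == NULL) {
        close(fd);
        return -1;
    }
    int total = 0;
    while (total < size) {
        ssize_t n = read(fd, data + total, (size_t)(size - total));
        if (n < 0) {
            free(data);
            close(fd);
            return -1;
        }
        if (n == 0)
            break;              /* file is shorter than expected */
        total += (int)n;
    }
    close(fd);
    *buffer = data;
    return total;
}

/* file_utils.c: write size bytes to filename, creating or truncating it.
 * Returns the number of bytes written, or -1 on failure. */
int write_file(char *filename, char *buffer, int size)
{
    int fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd < 0)
        return -1;
    int total = 0;
    while (total < size) {
        ssize_t n = write(fd, buffer + total, (size_t)(size - total));
        if (n <= 0) {
            close(fd);
            return -1;
        }
        total += (int)n;
    }
    if (close(fd) < 0)
        return -1;
    return total;
}

/* reverse.c: reverse the bytes of buffer in place (hypothetical helper). */
void reverse_buffer(char *buffer, int size)
{
    for (int i = 0, j = size - 1; i < j; i++, j--) {
        char tmp = buffer[i];
        buffer[i] = buffer[j];
        buffer[j] = tmp;
    }
}
```

With these in place, main in reverse.c would check argc, call read_file(argv[1], &buf), pass the buffer and the returned length to reverse_buffer, call write_file(argv[2], buf, n), and finally free(buf).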
EDIT 3
THE FILE CONTAINS BYTES - I guess I have to sort the bytes, the task doesn't say more - it says that I pass an argument - the name of a binary file that contains bytes - that's it. And I am trying to work with low-level funcs.
I am trying to sort a binary file using qsort but I got stuck: I don't know how to read the content of a file into a buffer so that I can pass it to qsort.
What I have done:
int main(int argc, char*argv[]){
int fd1;
if((fd1=open(argv[1], O_RDONLY))==-1){
printf("Error occurred while opening the file");
exit(-1);
}
int size;
char c;
while(read(fd1, &c, 1)){
size=size+1;
}
size=size+1;
close(fd1);
fd1=open(argv[1], O_RDONLY);
if(fd1==-1){
printf("Error occured while opening the file");
}
char*buffer;
buffer=malloc(size);
setbuf(fd1, buffer);
//EDIT I TRIED THIS AND IT STILL DOES NOT WORK
int i=0;
while(read(fd1, &c, 1)){
buffer[i]=c;
i++;
}
for(int i=0; i<size;i++){
printf("lele %s", buffer[i]);
}
//EDIT 2: after making buffer[i]=c I get this error Segmentation fault
}
setbuf does not work this way. How can I make this work? Note that I am trying to use only functions like open, close, read, write, etc.
Your algorithm for reading a file into a buffer is good:
Open the file
Count bytes in file
Close the file
Allocate the buffer
Open the file
Read the file
Close the file
A bit inefficient, because you read the file twice, but that's fine. You just have to implement it properly; any small mistake will make it look like it doesn't work. Use a debugger to check each step.
Here is my try. I didn't debug, to not deny you the "fun" of debugging. I put comments instead.
int main(int argc, char*argv[])
{
// 1. Open the file
int fd1;
if((fd1=open(argv[1], O_RDONLY))==-1){
printf("Error occurred while opening the file");
exit(-1);
}
// 2. Count bytes in file
int size = 0;
char c;
while(read(fd1, &c, 1))
size=size+1;
// To check that this part is good, print the size here!
// 3. Close the file
close(fd1);
// 4. Allocate the buffer
char *buffer;
buffer = malloc(size);
// Might want to print the buffer here, to make sure it's not NULL
// 5. Open the file
fd1=open(argv[1], O_RDONLY);
if(fd1==-1){
printf("Error occurred while opening the file");
exit(-1);
}
// 6. Read the file
for (int index = 0; index < size; ++index)
read(fd1, &buffer[index], 1);
// Might want to print what "read" returns in each iteration, to make sure it's successful
// 7. Close the file
close(fd1);
}
As noted by Eric Postpischil, the algorithm is actually not good.
The size of the file at one time does not guarantee the size at another time.
If you want to do that correctly, you must read the file only once. This will make the allocation harder: you cannot calculate the required buffer size, so you have to "guess" an initial size and use realloc.
However, in this small example, this is clearly not the requirement - you can probably ignore the possibility of the file changing asynchronously.
There is another possible problem - I/O error on the file when you read it the second time. This is easy to check, so maybe you should add it.
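A rough sketch of the single-pass approach with realloc, including the I/O error check. The function name read_all, the initial guess of 4096 bytes, and the doubling strategy are assumptions chosen for illustration.

```c
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: reads everything from fd in one pass.
 * On success stores a malloc'd buffer in *out and returns its length;
 * on a read or allocation failure returns -1. */
ssize_t read_all(int fd, char **out)
{
    size_t cap = 4096;          /* initial guess at the size */
    size_t len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return -1;
    for (;;) {
        if (len == cap) {       /* buffer full: double it */
            char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL) {
                free(buf);
                return -1;
            }
            buf = bigger;
            cap *= 2;
        }
        ssize_t n = read(fd, buf + len, cap - len);
        if (n < 0) {            /* I/O error */
            free(buf);
            return -1;
        }
        if (n == 0)             /* end of file */
            break;
        len += (size_t)n;
    }
    *out = buf;
    return (ssize_t)len;
}
```

The buffer and the returned length can then be handed to qsort with an element size of 1 and a comparison function for single bytes.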
I'm quite new to C. I faced a problem while studying the last chapter of K&R.
I'm trying to implement fopen() and fillbuf() function by using system calls, open and read.
I exactly copied the source code from the book but repeatedly get segmentation error after I compile.
fp->fd = fd;
fp->cnt = 0;
fp->base = NULL;
fp->flag = (*mode=='r')? _READ : _WRITE;
Why does the error occur? Here is my complete code.
#include<fcntl.h>
#include<unistd.h>
#include<stdlib.h>
#define PERM 0644
#define EOF (-1)
#define BUFSIZE 1024
#define OPEN_MAX 20
typedef struct _iobuf{
int cnt;
char *ptr;
char *base;
int flag;
int fd;
} myFILE;
enum _flags {
_READ = 01,
_WRITE = 02,
_UNBUF = 04,
_EOF = 010,
_ERR = 020
};
myFILE _iob[OPEN_MAX]={
{0, (char *) 0, (char *) 0, _READ, 0 },
{0, (char *) 0, (char *) 0, _WRITE, 1 },
{0, (char *) 0, (char *) 0, _WRITE | _UNBUF, 2 }
};
#define stdin (&_iob[0])
#define stdout (&_iob[1])
#define stderr (&_iob[2])
#define getc(p) ( --(p)->cnt>=0 ? (unsigned char) *(p)->ptr++ : _fillbuf(p) )
int _fillbuf(myFILE *fp)
{
int bufsize;
if((fp->flag & (_READ|_EOF|_ERR))!=_READ)
return EOF;
bufsize=(fp->flag & _UNBUF)? 1 : BUFSIZE;
if(fp->base==NULL)
if((fp->base=(char *)malloc(bufsize))==NULL)
return EOF;
fp->ptr=fp->base;
fp->cnt=read(fp->fd, fp->ptr, bufsize);
if(--fp->cnt<0){
if(fp->cnt == -1)
fp->flag |= _EOF;
else
fp->flag |= _ERR;
return EOF;
}
return (unsigned char) *fp->ptr++;
}
myFILE *myfopen(char *name, char *mode)
{
int fd;
myFILE *fp;
if(*mode!='r' && *mode!='w' && *mode!='a')
return NULL;
for(fp=_iob; fp<_iob+OPEN_MAX; fp++)
if((fp->flag & (_READ | _WRITE))==0)
break;
if(fp>=_iob+OPEN_MAX)
return NULL;
if(*mode=='w')
fd=creat(name, PERM);
else if(*mode=='a'){
if((fd=open(name, O_WRONLY, 0))==-1)
fd=creat(name, PERM);
lseek(fd, 0L, 2);
} else
fd=open(name, O_RDONLY, 0);
if(fd==-1)
return NULL;
fp->fd = fd;
fp->cnt = 0;
fp->base = NULL;
fp->flag = (*mode=='r')? _READ : _WRITE;
return fp;
}
int main(int argc, char *argv[])
{
myFILE *fp;
int c;
if((fp=myfopen(argv[1], "r"))!=NULL)
write(1, "opened\n", sizeof("opened\n"));
while((c=getc(fp))!=EOF)
write(1, &c, sizeof(c));
return 0;
}
EDIT: Please see Jonathan Leffler's answer. It is more accurate and provides a better diagnosis. My answer works, but there is a better way to do things.
I see the problem.
myFILE *fp;
if(*mode!='r' && *mode!='w' && *mode!='a')
return NULL;
for(fp=_iob; fp<_iob+OPEN_MAX; fp++)
if((fp->flag & (_READ | _WRITE))==0) // marked line
break;
When you reach the marked line, you try to dereference the fp pointer. Since it is (likely, but not certainly) initialized to zero (but I should say NULL), you are dereferencing a null pointer. Boom. Segfault.
Here's what you need to change.
myFILE *fp = (myFILE *)malloc(sizeof(myFILE));
malloc is declared in <stdlib.h>, which the code in the question already includes.
Also your close function should later free() your myFILE to prevent memory leaks.
A different analysis of the code in the question
The code shown in the question consists of parts, but not all, of the code from K&R "The C Programming Language, 2nd Edition" (1988; my copy is marked 'Based on Draft Proposed ANSI C'), pages 176-178, plus a sample main program that is not from the book at all. The name of the type was changed from FILE to myFILE too, and fopen() was renamed to myfopen(). I note that the expressions in the code in the question have many fewer spaces than the original code in K&R. The compiler doesn't mind; human readers generally prefer spaces around operators.
As stated in another (later) question and answer, the diagnosis given by Mark Yisri in the currently accepted answer is incorrect — the problem is not a null pointer in the for loop. The prescribed remedy works (as long as the program is invoked correctly), but the memory allocation is not necessary. Fortunately for all concerned, the fclose() function was not included in the implementations, so it wasn't possible to close a file once it was opened.
In particular, the loop:
for (fp = _iob; fp < _iob + OPEN_MAX; fp++)
if ((fp->flag & (_READ | _WRITE)) == 0)
break;
is perfectly OK because the array _iob is defined as:
FILE _iob[OPEN_MAX] = {
…initializers for stdin, stdout, stderr…
};
This is an array of structures, not structure pointers. The first three elements are initialized explicitly; the remaining elements are implicitly initialized to all zeros. Consequently, there is no chance of there being a null pointer in fp as it steps through the array. The loop might also be written as:
for (fp = &_iob[0]; fp < &_iob[OPEN_MAX]; fp++)
if ((fp->flag & (_READ | _WRITE)) == 0)
break;
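The implicit zero-initialization can be checked with a small stand-alone sketch. The names here (struct slot, table, first_free, IN_USE_READ, IN_USE_WRITE) are hypothetical stand-ins for myFILE, _iob and the flags; this is not the K&R code.

```c
#include <stddef.h>

#define TABLE_MAX 20

enum { IN_USE_READ = 01, IN_USE_WRITE = 02 };

struct slot {
    char *base;
    int flag;
    int fd;
};

/* Only three elements have initializers; the other 17 are implicitly
 * zero-initialized (null base, flag == 0, fd == 0). */
static struct slot table[TABLE_MAX] = {
    { NULL, IN_USE_READ, 0 },
    { NULL, IN_USE_WRITE, 1 },
    { NULL, IN_USE_WRITE, 2 }
};

/* Returns the index of the first unused slot, or -1 if the table is full.
 * sp always points at a real array element, so it is never null. */
int first_free(void)
{
    struct slot *sp;
    for (sp = table; sp < table + TABLE_MAX; sp++)
        if ((sp->flag & (IN_USE_READ | IN_USE_WRITE)) == 0)
            return (int)(sp - table);
    return -1;
}
```

Here first_free() finds index 3, the first element without an explicit initializer, because its flag field is zero.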
Empirically, if the code shown in the question (including the main(), which was not — repeat not — written by K&R) is invoked correctly, it works without crashing. However, the code in the main() program does not protect itself from:
Being invoked without a non-null argv[1].
Being invoked with a non-existent or non-readable file name in argv[1].
These are very common problems, and with the main program as written, either could cause the program to crash.
Although it is hard to be sure 16 months later, it seems likely to me that the problem was in the way that the program was invoked rather than anything else. If the main program is written more-or-less appropriately, you end up with code similar to this (you also need to add #include <string.h> to the list of included headers):
int main(int argc, char *argv[])
{
myFILE *fp;
int c;
if (argc != 2)
{
static const char usage[] = "Usage: mystdio filename\n";
write(2, usage, sizeof(usage) - 1);
return 1;
}
if ((fp = myfopen(argv[1], "r")) == NULL)
{
static const char filenotopened[] = "mystdio: failed to open file ";
write(2, filenotopened, sizeof(filenotopened) - 1);
write(2, argv[1], strlen(argv[1]));
write(2, "\n", 1);
return 1;
}
write(1, "opened\n", sizeof("opened\n") - 1);
while ((c = getc(fp)) != EOF)
{
char ch = (char)c; /* write one byte, not all sizeof(int) bytes of c */
write(1, &ch, 1);
}
return 0;
}
This can't use fprintf() etc because the surrogate implementation of the standard I/O library is not complete. Writing the errors direct to file descriptor 2 (standard error) with write() is fiddly, if not painful. It also means that I've taken shortcuts like assuming that the program is called mystdio rather than actually using argv[0] in the error messages. However, if it is invoked without any file name (or if more than one file name is given), or if the named file cannot be opened for reading, then it produces a more or less appropriate error message — and does not crash.
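One way to make the write() calls less fiddly is a small string-writing helper, which also makes it easy to use argv[0] in the messages. This is only a sketch; put_str and report_open_failure are hypothetical names, not part of K&R or of the code above.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: writes the NUL-terminated string s to fd.
 * Returns 0 on success, -1 if write() fails. */
int put_str(int fd, const char *s)
{
    size_t len = strlen(s);
    while (len > 0) {
        ssize_t n = write(fd, s, len);
        if (n <= 0)
            return -1;
        s += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Reports a failure to open name on standard error (descriptor 2),
 * prefixed with the program name. */
void report_open_failure(const char *prog, const char *name)
{
    put_str(2, prog);
    put_str(2, ": failed to open file ");
    put_str(2, name);
    put_str(2, "\n");
}
```

In main this would be called as report_open_failure(argv[0], argv[1]) when myfopen returns NULL.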
Leading underscores
Note that the C standard reserves identifiers starting with underscores.
You should not create function, variable or macro names that start with an underscore, in general. C11 §7.1.3 Reserved identifiers says (in part):
All identifiers that begin with an underscore and either an uppercase letter or another underscore are always reserved for any use.
All identifiers that begin with an underscore are always reserved for use as identifiers with file scope in both the ordinary and tag name spaces.
See also What does double underscore (__const) mean in C?
In fairness, K&R were essentially describing the standard implementation of the standard I/O library at the time when the 1st Edition was written (1978), modernized sufficiently to be using function prototype notation in the 2nd Edition. The original code was on pages 165-168 of the 1st Edition.
Even today, if you are implementing the standard library, you would use names starting with underscores precisely because they are reserved for use 'by the implementation'. If you are not implementing the standard library, you do not use names starting with underscores because that uses the namespace reserved for the implementation. Most people, most of the time, are not writing the standard library — most people should not be using leading underscores.
I am supposed to create a program that takes a given file and creates a file with the text reversed. Is there a way I can start the read() from the end of the file and copy it to the first byte of the created file if I don't know the exact size of the file?
I have also googled this and came across many examples with fread, fopen, etc. However, I can't use those for this project; I can only use read, open, lseek, write, and close.
Here is my code so far. It's not much, but it is included for reference:
#include<stdio.h>
#include<unistd.h>
int main (int argc, char *argv[])
{
if(argc != 2)/*argc should be 2 for correct execution*/
{
printf("usage: %s filename",argv[0[]);}
}
else
{
int file1 = open(argv[1], O_RDWR);
if(file1 == -1){
printf("\nfailed to open file.");
return 1;
}
int reversefile = open(argv[2], O_RDWR | O_CREAT);
int size = lseek(argv[1], 0, SEEK_END);
char *file2[size+1];
int count=size;
int i = 0
while(read(file1, file2[count], 0) != 0)
{
file2[i]=*read(file1, file2[count], 0);
write(reversefile, file2[i], size+1);
count--;
i++;
lseek(argv[2], i, SEEK_SET);
}
I doubt that most filesystems are designed to support this operation effectively. Chances are, you'd have to read the whole file to get to the end. For the same reasons, most languages probably don't include any special feature for reading a file backwards.
Just come up with something. Try to read the whole file in memory. If it is too big, dump the beginning, reversed, into a temporary file and keep reading... In the end combine all temporary files into one. Also, you could probably do something smart with manual low-level manipulation of disk sectors, or at least with low-level programming directly against the file system. Looks like this is not what you are after, though.
Why don't you try fseek to navigate inside the file? This function is contained in stdio.h, just like fopen and fclose.
Another idea would be to implement a simple stack...
This has no error checking == really bad
get file size using stat
create a buffer with malloc
fread the file into the buffer
set a pointer to the end of the file
print each character going backwards thru the buffer.
If you get creative with google you can get several examples just like this.
IMO the assistance you are getting so far is not really even good hints.
This appears to be schoolwork, so beware of copying. Do some reading about the calls used here. stat (fstat) fread (read)
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
int main(int argc, char **argv)
{
struct stat st;
char *buf;
char *p;
FILE *in=fopen(argv[1],"r");
fstat(fileno(in), &st); // get file size in bytes
buf=malloc(st.st_size +2); // buffer for file
memset(buf, 0x0, st.st_size +2 );
fread(buf, st.st_size, 1, in); // fill the buffer
p=buf;
for(p+=st.st_size;p>buf; ) // print traversing backwards, starting at the last byte read
printf("%c", *--p);
fclose(in);
return 0;
}
Dear respected programmers. Please could you help me (again) on how to put the following code into functions for my program.
I have read on-line and understand how functions work but when I do it myself it all goes pear shaped/wrong(I am such a noob).
Please could you help with how to for example to write the code below into functions.(like opening the input file).
My initial code looks like:
main (int argc, char **argv)
{
int bytes_read, bytes_written;
struct stat inode;
int input_fd, output_fd;
char buffer[64];
int eof = 0;
int i;
/* Check the command line arguments */
if (argc != 3)
{
printf("syntax is: %s <fromfile> <tofile>\n", argv[0]);
exit (1);
}
/* Check the input file exists and is a file */
if ((stat(argv[1], &inode) == -1) || (!S_ISREG(inode.st_mode)))
{
printf("%s is not a file\n", argv[1]);
exit(2);
}
/* Check that the output file doesnt exist */
if (stat(argv[2], &inode) != -1)
{
printf("Warning: The file %s already exists. Not going to overwrite\n", argv[2]);
exit(2);
}
/* Open the input file for reading */
input_fd = open(argv[1], O_RDONLY, 0);
if (input_fd == -1)
{
printf("%s cannot be opened\n", argv[1]);
exit(3);
}
output_fd = open(argv[2], O_CREAT | O_WRONLY | O_EXCL , S_IRUSR|S_IWUSR);
if (output_fd == -1)
{
printf("%s cannot be opened\n", argv[2]);
exit(3);
}
/* Begin processing the input file here */
while (!eof)
{
bytes_read = read(input_fd, buffer, sizeof(buffer));
if (bytes_read == -1)
{
printf("%s cannot be read\n", argv[1]);
exit(4);
}
if (bytes_read > 0)
{
bytes_written = write(output_fd, buffer, bytes_read);
if (bytes_written == -1)
{
printf("There was an error writing to the file %s\n",argv[2]);
exit(4);
}
if (bytes_written != bytes_read)
{
printf("Devistating failure! Bytes have either magically appeared and been written or dissapeard and been skipped. Data is inconsistant!\n");
exit(101);
}
}
else
{
eof = 1;
}
}
close(input_fd);
close(output_fd);
}
My attempt at opening an output file:
void outputFile(int argc, char **argv)
{
/* Check that the output file doesnt exist */
if (stat(argv[argc-1], &inode) != -1)
{
printf("Warning: The file %s already exists. Not going to overwrite\n", argv[argc-1]);
return -1;
}
/*Opening ouput files*/
file_desc_out = open(argv[i],O_CREAT | O_WRONLY | O_EXCL , S_IRUSR|S_IWUSR);
if(file_desc_out == -1)
{
printf("Error: %s cannot be opened. \n",argv[i]); //insted of argv[2] have pointer i.
return -1;
}
}
Any help on how I would now reference to this in my program is appreciated thank you.
I tried:
ouputfile (but I cant figure out what goes here and why either).
Maybe the most useful function for you is:
#include <stdio.h>
#include <stdarg.h>
#include <stdlib.h> /* for exit() */
extern void error_exit(int rc, const char *format, ...); /* In a header */
void error_exit(int rc, const char *format, ...)
{
va_list args;
va_start(args, format);
vfprintf(stderr, format, args);
va_end(args);
exit(rc);
}
You can then write:
if (stat(argv[2], &inode) != -1)
error_exit(2, "Warning: The file %s exists. Not going to overwrite\n",
argv[2]);
Which has the merit of brevity.
You write functions to do sub-tasks. Deciding where to break up your code into functions is tricky - as much art as science. Your code is not so big that it is completely awful to leave it as it is - one function (though the error handling can be simplified as above).
If you want to practice writing functions, consider splitting it up:
open_input_file()
open_output_file()
checked_read()
checked_write()
checked_close()
These functions would allow your main code to be written as:
int main(int argc, char **argv)
{
int bytes_read;
int input_fd, output_fd;
char buffer[64];
if (argc != 3)
error_exit(1, "Usage: %s <fromfile> <tofile>\n", argv[0]);
input_fd = open_input_file(argv[1]);
output_fd = open_output_file(argv[2]);
while ((bytes_read = checked_read(input_fd, buffer, sizeof(buffer))) > 0)
checked_write(output_fd, buffer, bytes_read);
checked_close(input_fd);
checked_close(output_fd);
return 0;
}
Because you've tucked the error handling out of sight, it is now much easier to see the structure of the program. If you don't have enough functions yet, you can bury the loop into a function void file_copy(int fd_in, int fd_out). That removes more clutter from main() and leaves you with very simple code.
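As a sketch of how the checked functions and file_copy might be written: the bodies below are assumptions (the exit codes and messages are invented), and a compact copy of error_exit is included only to keep the example self-contained.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Compact copy of error_exit so the sketch stands alone. */
static void error_exit(int rc, const char *format, ...)
{
    va_list args;
    va_start(args, format);
    vfprintf(stderr, format, args);
    va_end(args);
    exit(rc);
}

/* Returns the number of bytes read (0 at end of file); exits on error. */
ssize_t checked_read(int fd, void *buffer, size_t size)
{
    ssize_t n = read(fd, buffer, size);
    if (n == -1)
        error_exit(4, "read failed\n");
    return n;
}

/* Writes all size bytes, retrying after short writes; exits on error. */
void checked_write(int fd, const void *buffer, size_t size)
{
    const char *p = buffer;
    while (size > 0) {
        ssize_t n = write(fd, p, size);
        if (n <= 0)
            error_exit(4, "write failed\n");
        p += n;
        size -= (size_t)n;
    }
}

/* Copies everything from fd_in to fd_out. */
void file_copy(int fd_in, int fd_out)
{
    char buffer[64];
    ssize_t n;
    while ((n = checked_read(fd_in, buffer, sizeof(buffer))) > 0)
        checked_write(fd_out, buffer, (size_t)n);
}
```

With file_copy in place, the body of main reduces to the argument check, the two open calls, file_copy(input_fd, output_fd), and the two close calls.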
Given an initial attempt at a function to open the output file:
void outputFile(int argc, char **argv)
{
/* Check that the output file doesnt exist */
if (stat(argv[argc-1], &inode) != -1)
{
printf("Warning: The file %s already exists. Not going to overwrite\n", argv[argc-1]);
return -1;
}
/*Opening ouput files*/
file_desc_out = open(argv[i],O_CREAT | O_WRONLY | O_EXCL , S_IRUSR|S_IWUSR);
if(file_desc_out == -1)
{
printf("Error: %s cannot be opened. \n",argv[i]); //insted of argv[2] have pointer i.
return -1;
}
}
Critique:
You have to define the variables used by the function in the function (you will want to avoid global variables as much as possible, and there is no call for any global variable in this code).
You have to define the return type. You are opening a file - how is the file descriptor going to be returned to the calling code? So, the return type should be int.
You pass only the information needed to the function - a simple form of 'information hiding'. In this case, you only need to pass the name of the file; the information about file modes and the like is implicit in the name of the function.
In general, you have to decide how to handle errors. Unless you have directives otherwise from your homework setter, it is reasonable to exit on error with an appropriate message. If you return an error indicator, then the calling code has to test for it, and decide what to do about the error.
Errors and warnings should be written to stderr, not to stdout. The main program output (if any) goes to stdout.
Your code is confused about whether argv[i] or argv[argc-1] is the name of the output file. In a sense, this criticism is irrelevant once you pass just the filename to the function. However, consistency is a major virtue in programming, and using the same expression to identify the same thing is usually a good idea.
Consistency of layout is also important. Don't use both if( and if ( in your programs; use the canonical if ( notation as used by the language's founding fathers, K&R.
Similarly, be consistent with no spaces before commas, a space after a comma, and be consistent with spaces around operators such as '|'. Consistency makes your code easier to read, and you'll be reading your code a lot more often than you write it (at least, once you've finished your course, you will do more reading than writing).
You cannot have return -1; inside a function that returns no value.
When you are splitting up code into functions, you need to copy/move the paragraphs of code that you are extracting, leaving behind a call to the new function. You also need to copy the relevant local variables from the calling function into the new function - possibly eliminating the variables in the calling function if they are no longer used there. You do compile with most warnings enabled, don't you? You want to know about unused variables etc.
When you create the new function, one of the most important parts is working out what the correct signature of the function is. Does it return a value? If so, which value, and what is its type? If not, how does it handle errors? In this case, you probably want the function to bail out (terminate the program) if it runs into an error. In bigger systems, you might need to consistently return an error indicator (0 implies success, negative implies failure, different negatives indicating different errors). When you work with functions that return an error indicator, it is almost always crucial that you check the error indicators in the calling code. For big programs, big swathes of the code can be all about error handling. Similarly, you need to work out which values are passed into the function.
I'm omitting advice about things such as 'be const correct' as overkill for your stage in learning to program in C.
You seem to understand how to make a function; it really isn't that hard. First, you need to understand that a function has a type. In other words, just as argc has type int and argv has type char **, your function (currently) has return type void. void means it has no value, which means that when you return, you return nothing.
However, if you look at your code, you do return -1. It looks like you want to return an integer, so you should change the top from void outputFile(...) to int outputFile(...).
Next, your function must return a value on every path. A compiler will typically warn if a non-void function can reach its end without a return, and using the missing value is undefined behavior. Since you're no longer using void as the return type, you must return something before the end of the function, so I suggest putting a return 1; there to show that everything went well.
There are several things.
The function return type isn't what you want. You either want to return a file descriptor or an error code. IIRC, the file descriptor is a nonnegative int, so you can use a return type of int rather than void. You also need to return something on either path, either -1 or file_desc_out.
You probably don't want to pass in the command-line arguments as a whole, but rather something like argv[argc - 1]. In that case, the argument should be something like char * filename rather than the argc/argv it has now. (Note that the argv[i] you've got in the last printf is almost certainly wrong.)
This means it would be called something like
int file_desc_out = outputFile(argv[argc - 1]);
You need to have all variables declared in the function, specifically inode and file_desc_out.
Finally, put an extra level of indentation on the code inside the { and } of the function itself.
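Putting those points together, the function might end up looking like the sketch below. The name open_output_file, the use of perror for the message, and the const qualifier on the parameter are choices made for illustration.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>

/* Opens filename for writing, refusing to overwrite an existing file.
 * Returns a file descriptor (>= 0) on success, or -1 on failure. */
int open_output_file(const char *filename)
{
    /* O_CREAT | O_EXCL makes open() fail if the file already exists,
     * so no separate stat() call is needed. */
    int fd = open(filename, O_CREAT | O_WRONLY | O_EXCL, S_IRUSR | S_IWUSR);
    if (fd == -1)
        perror(filename);       /* message goes to stderr */
    return fd;
}
```

It would be called as int file_desc_out = open_output_file(argv[argc - 1]); and the caller checks for -1. Relying on O_EXCL rather than a preceding stat also closes the window in which another process could create the file between the check and the open.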
This looks like a simple question, but I didn't find anything similar here.
Since there is no file copy function in C, we have to implement file copying ourselves, but I don't like reinventing the wheel even for trivial stuff like that, so I'd like to ask the cloud:
What code would you recommend for file copying using fopen()/fread()/fwrite()?
What code would you recommend for file copying using open()/read()/write()?
This code should be portable (windows/mac/linux/bsd/qnx/younameit), stable, time tested, fast, memory efficient and etc. Getting into specific system's internals to squeeze some more performance is welcomed (like getting filesystem cluster size).
This seems like a trivial question but, for example, source code for CP command isn't 10 lines of C code.
This is the function I use when I need to copy from one file to another - with test harness:
/*
@(#)File: $RCSfile: fcopy.c,v $
@(#)Version: $Revision: 1.11 $
@(#)Last changed: $Date: 2008/02/11 07:28:06 $
@(#)Purpose: Copy the rest of file1 to file2
@(#)Author: J Leffler
@(#)Modified: 1991,1997,2000,2003,2005,2008
*/
/*TABSTOP=4*/
#include "jlss.h"
#include "stderr.h"
#ifndef lint
/* Prevent over-aggressive optimizers from eliminating ID string */
const char jlss_id_fcopy_c[] = "@(#)$Id: fcopy.c,v 1.11 2008/02/11 07:28:06 jleffler Exp $";
#endif /* lint */
void fcopy(FILE *f1, FILE *f2)
{
char buffer[BUFSIZ];
size_t n;
while ((n = fread(buffer, sizeof(char), sizeof(buffer), f1)) > 0)
{
if (fwrite(buffer, sizeof(char), n, f2) != n)
err_syserr("write failed\n");
}
}
#ifdef TEST
int main(int argc, char **argv)
{
FILE *fp1;
FILE *fp2;
err_setarg0(argv[0]);
if (argc != 3)
err_usage("from to");
if ((fp1 = fopen(argv[1], "rb")) == 0)
err_syserr("cannot open file %s for reading\n", argv[1]);
if ((fp2 = fopen(argv[2], "wb")) == 0)
err_syserr("cannot open file %s for writing\n", argv[2]);
fcopy(fp1, fp2);
return(0);
}
#endif /* TEST */
Clearly, this version uses file pointers from standard I/O and not file descriptors, but it is reasonably efficient and about as portable as it can be.
Well, except the error function - that's peculiar to me. As long as you handle errors cleanly, you should be OK. The "jlss.h" header declares fcopy(); the "stderr.h" header declares err_syserr() amongst many other similar error reporting functions. A simple version of the function follows - the real one adds the program name and does some other stuff.
#include "stderr.h"
#include <stdio.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
void err_syserr(const char *fmt, ...)
{
int errnum = errno;
va_list args;
va_start(args, fmt);
vfprintf(stderr, fmt, args);
va_end(args);
if (errnum != 0)
fprintf(stderr, "(%d: %s)\n", errnum, strerror(errnum));
exit(1);
}
The code above may be treated as having a modern BSD license or GPL v3 at your choice.
As far as the actual I/O goes, the code I've written a million times in various guises for copying data from one stream to another goes something like this. It returns 0 on success, or -1 with errno set on error (in which case any number of bytes might have been copied).
Note that for copying regular files, you can skip the EAGAIN stuff, since regular files are always blocking I/O. But inevitably if you write this code, someone will use it on other types of file descriptors, so consider it a freebie.
There's a file-specific optimisation that GNU cp does, which I haven't bothered with here, that for long blocks of 0 bytes instead of writing you just extend the output file by seeking off the end.
void block(int fd, int event) {
struct pollfd topoll;
topoll.fd = fd;
topoll.events = event;
poll(&topoll, 1, -1);
// no need to check errors - if the stream is bust then the
// next read/write will tell us
}
int copy_data_buffer(int fdin, int fdout, void *buf, size_t bufsize) {
for(;;) {
char *pos; // char * so that the pointer arithmetic below is standard C
// read data to buffer
ssize_t bytestowrite = read(fdin, buf, bufsize);
if (bytestowrite == 0) break; // end of input
if (bytestowrite == -1) {
if (errno == EINTR) continue; // signal handled
if (errno == EAGAIN) {
block(fdin, POLLIN);
continue;
}
return -1; // error
}
// write data from buffer
pos = buf;
while (bytestowrite > 0) {
ssize_t bytes_written = write(fdout, pos, bytestowrite);
if (bytes_written == -1) {
if (errno == EINTR) continue; // signal handled
if (errno == EAGAIN) {
block(fdout, POLLOUT);
continue;
}
return -1; // error
}
bytestowrite -= bytes_written;
pos += bytes_written;
}
}
return 0; // success
}
// Default value. I think it will get close to maximum speed on most
// systems, short of using mmap etc. But porters / integrators
// might want to set it smaller, if the system is very memory
// constrained and they don't want this routine to starve
// concurrent ops of memory. And they might want to set it larger
// if I'm completely wrong and larger buffers improve performance.
// It's worth trying several MB at least once, although with huge
// allocations you have to watch for the linux
// "crash on access instead of returning 0" behaviour for failed malloc.
#ifndef FILECOPY_BUFFER_SIZE
#define FILECOPY_BUFFER_SIZE (64*1024)
#endif
int copy_data(int fdin, int fdout) {
// optional exercise for reader: take the file size as a parameter,
// and don't use a buffer any bigger than that. This prevents
// memory-hogging if FILECOPY_BUFFER_SIZE is very large and the file
// is small.
for (size_t bufsize = FILECOPY_BUFFER_SIZE; bufsize >= 256; bufsize /= 2) {
void *buffer = malloc(bufsize);
if (buffer != NULL) {
int result = copy_data_buffer(fdin, fdout, buffer, bufsize);
free(buffer);
return result;
}
}
// could use a stack buffer here instead of failing, if desired.
// 128 bytes ought to fit on any stack worth having, but again
// this could be made configurable.
return -1; // errno is ENOMEM
}
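To open the input file (O_BINARY only exists on some platforms, such as Windows; on POSIX systems leave it out or define it as 0):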
int fdin = open(infile, O_RDONLY|O_BINARY, 0);
if (fdin == -1) return -1;
Opening the output file is tricksy. As a basis, you want the following (0x1ff is octal 0777, i.e. all permission bits, which the umask then trims):
int fdout = open(outfile, O_WRONLY|O_BINARY|O_CREAT|O_TRUNC, 0x1ff);
if (fdout == -1) {
close(fdin);
return -1;
}
But there are confounding factors:
you need to special-case when the files are the same, and I can't remember how to do that portably.
if the output filename is a directory, you might want to copy the file into the directory.
if the output file already exists (open with O_EXCL to determine this and check for EEXIST on error), you might want to do something different, as cp -i does.
you might want the permissions of the output file to reflect those of the input file.
you might want other platform-specific meta-data to be copied.
you may or may not wish to unlink the output file on error.
Obviously the answers to all these questions could be "do the same as cp". In which case the answer to the original question is "ignore everything I or anyone else has said, and use the source of cp".
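For the "same file" case, one approach on POSIX systems is to compare device and inode numbers before opening the output with O_TRUNC. This is a sketch under that assumption only (it is not portable beyond POSIX, and same_file is a made-up name). There is also an unavoidable race between the check and the later open.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if both paths name the same file,
   0 if they differ or the output does not exist yet,
   -1 (errno set) on any other error. */
int same_file(const char *inpath, const char *outpath) {
    struct stat in, out;
    if (stat(inpath, &in) == -1)
        return -1;                      /* the input must exist */
    if (stat(outpath, &out) == -1)
        return errno == ENOENT ? 0 : -1; /* a missing output is fine */
    /* Same device and same inode means same underlying file,
       even through hard links or symlinks. */
    return in.st_dev == out.st_dev && in.st_ino == out.st_ino;
}
```

Calling this before the open with O_TRUNC matters: once the output has been truncated, an input that was the same file is already gone.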
Btw, getting the filesystem's cluster size is next to useless. You'll almost always see speed increasing with buffer size long after you've passed the size of a disk block.
The size of each read needs to be a multiple of 512 (the sector size); 4096 is a good choice.
Here is a very easy and clear example: Copy a file. Since it is written in ANSI-C without any particular function calls I think this one would be pretty much portable.
Depending on what you mean by copying a file, it is certainly far from trivial. If you mean copying the content only, then there is almost nothing to do. But generally, you need to copy the metadata of the file, and that's surely platform dependent. I don't know of any C library which does what you want in a portable manner. Just handling the filename by itself is no trivial matter if you care about portability.
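As an illustration of how platform-specific the metadata side is, here is a minimal POSIX-only sketch that copies just the permission bits from one open file to another. The name copy_mode is made up for this example, and it deliberately ignores ownership, timestamps, ACLs and extended attributes, which need further platform-dependent calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/stat.h>

/* Hypothetical helper: give fdout the same permission bits as fdin.
   Returns 0 on success, -1 with errno set on error. */
int copy_mode(int fdin, int fdout) {
    struct stat st;
    if (fstat(fdin, &st) == -1)
        return -1;
    /* Mask off the file-type bits; keep permissions, setuid/setgid/sticky. */
    return fchmod(fdout, st.st_mode & 07777);
}
```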
In C++, there is the Filesystem library in Boost.
One thing I found when implementing my own file copy, and it seems obvious but it's not: I/O operations are slow. You can pretty much predict your copy's speed from how many of them you do. So clearly you need to do as few of them as possible.
The best results I found were when I got myself a ginormous buffer, read the entire source file into it in one I/O, then wrote the entire buffer back out in one I/O. If I had to do it in even 10 batches, it got way slow. Trying to read and write out each byte, like a naive coder might try first, was just painful.
The accepted answer, written by Steve Jessop, does not address the first part of the question. Jonathan Leffler's answer does, but gets it wrong: the code should be written as
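The "one big buffer" approach might look something like this on a POSIX system. The read_file signature is borrowed from the question's file_utils.h; the body is only a sketch, and it assumes a regular file small enough that its size fits in an int.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Reads the whole file into a malloc'd buffer stored in *buffer.
   Returns the number of bytes read, or -1 on error.
   The caller is responsible for calling free(*buffer). */
int read_file(char *filename, char **buffer) {
    struct stat st;
    int fd = open(filename, O_RDONLY);
    if (fd == -1) return -1;
    if (fstat(fd, &st) == -1 || st.st_size > INT_MAX) {
        close(fd);
        return -1;
    }
    size_t size = (size_t)st.st_size;
    char *buf = malloc(size ? size : 1); /* avoid malloc(0) */
    if (buf == NULL) { close(fd); return -1; }

    size_t done = 0;
    while (done < size) {
        /* Ask for everything that is left; read() may still return less. */
        ssize_t n = read(fd, buf + done, size - done);
        if (n == -1) {
            if (errno == EINTR) continue; /* signal handled, retry */
            free(buf);
            close(fd);
            return -1;
        }
        if (n == 0) break; /* file shrank while we were reading */
        done += (size_t)n;
    }
    close(fd);
    *buffer = buf;
    return (int)done;
}
```

Even when the buffer is big enough for the whole file, a single read() is allowed to return fewer bytes than requested, which is why the loop is still there.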
while ((n = fread(buffer, 1, sizeof(buffer), f1)) > 0) {
    if (fwrite(buffer, n, 1, f2) != 1) {
        /* we got a write error here: handle it */
        break;
    }
}
/* after the loop, test ferror(f1) for read errors */
Explanation:
sizeof(char) is 1 by definition, always. It does not matter how many bits a char has: 8 (in most cases), 9, 11 or 32 (on some DSPs, for example), the size of char is one. Note that using sizeof(char) is not an error, just redundant code.
The fwrite function writes up to nmemb (third argument) elements of the specified size (second argument); it is not required to write exactly nmemb elements. To fix this you must either write the rest of the data that was read, or write a single element of size n and let fwrite do all the work. (It is debatable whether fwrite is obliged to write all the data, but in my version a short write is impossible unless an error occurs.)
You should test for read errors too: just test ferror(f1) after the loop.
Note that you probably need to disable buffering on both the input and output files to prevent triple buffering: once in f1's buffer on read, once in our own buffer, and once in f2's buffer on write:
setvbuf(f1, NULL, _IONBF, 0);
setvbuf(f2, NULL, _IONBF, 0);
(Internal buffers should, probably, be of size BUFSIZ.)
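Putting those points together, a complete version of the loop could look like the sketch below. The function name copy_stream is made up for this example, the streams are assumed to be already open, and the setvbuf calls are left to the caller since they must come before any other I/O on a stream.

```c
#include <stdio.h>

/* Hypothetical helper: copy everything from in to out.
   Returns 0 on success, -1 on a read or write error. */
int copy_stream(FILE *in, FILE *out) {
    char buffer[BUFSIZ];
    size_t n;

    while ((n = fread(buffer, 1, sizeof buffer, in)) > 0) {
        /* Write the chunk as one element of size n: the call reports
           either 1 (all of it written) or 0 (error). */
        if (fwrite(buffer, n, 1, out) != 1)
            return -1;
    }
    /* fread returns 0 on both end of file and error, so tell them apart. */
    if (ferror(in))
        return -1;
    /* Push any data still held in out's buffer to the file. */
    return fflush(out) == EOF ? -1 : 0;
}
```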