C: writing the following code into functions

Dear respected programmers, please could you help me (again) with how to put the following code into functions for my program?
I have read online and understand how functions work, but when I do it myself it all goes pear-shaped (I am such a noob).
Please could you show me how to write the code below as functions (for example, opening the input file)?
My initial code looks like:
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char **argv)
{
int bytes_read, bytes_written;
struct stat inode;
int input_fd, output_fd;
char buffer[64];
int eof = 0;
int i;
/* Check the command line arguments */
if (argc != 3)
{
printf("syntax is: %s \n", <fromfile> <tofile>\n", argv[0]);
exit (1);
}
/* Check the input file exists and is a file */
if ((stat(argv[1], &inode) == -1) || (!S_ISREG(inode.st_mode)))
{
printf("%s is not a file\n", argv[1]);
exit(2);
}
/* Check that the output file doesnt exist */
if (stat(argv[2], &inode) != -1)
{
printf("Warning: The file %s already exists. Not going to overwrite\n", argv[2]);
exit(2);
}
/* Open the input file for reading */
input_fd = open(argv[1], O_RDONLY, 0);
if (input_fd == -1)
{
printf("%s cannot be opened\n", argv[1]);
exit(3);
}
output_fd = open(argv[2], O_CREAT | O_WRONLY | O_EXCL , S_IRUSR|S_IWUSR);
if (output_fd == -1)
{
printf("%s cannot be opened\n", argv[2]);
exit(3);
}
/* Begin processing the input file here */
while (!eof)
{
bytes_read = read(input_fd, buffer, sizeof(buffer));
if (bytes_read == -1)
{
printf("%s cannot be read\n", argv[1]);
exit(4);
}
if (bytes_read > 0)
{
bytes_written = write(output_fd, buffer, bytes_read);
if (bytes_written == -1)
{
printf("There was an error writing to the file %s\n",argv[2]);
exit(4);
}
if (bytes_written != bytes_read)
{
printf("Devistating failure! Bytes have either magically appeared and been written or dissapeard and been skipped. Data is inconsistant!\n");
exit(101);
}
}
else
{
eof = 1;
}
}
close(input_fd);
close(output_fd);
}
My attempt at opening an output file:
void outputFile(int argc, char **argv)
{
/* Check that the output file doesnt exist */
if (stat(argv[argc-1], &inode) != -1)
{
printf("Warning: The file %s already exists. Not going to overwrite\n", argv[argc-1]);
return -1;
}
/*Opening ouput files*/
file_desc_out = open(argv[i],O_CREAT | O_WRONLY | O_EXCL , S_IRUSR|S_IWUSR);
if(file_desc_out == -1)
{
printf("Error: %s cannot be opened. \n",argv[i]); //insted of argv[2] have pointer i.
return -1;
}
}
Any help on how I would now reference this in my program is appreciated, thank you.
I tried:
outputFile(...) (but I can't figure out what goes here, or why, either).

Maybe the most useful function for you is:
#include <stdio.h>
#include <stdarg.h>
extern void error_exit(int rc, const char *format, ...); /* In a header */
void error_exit(int rc, const char *format, ...)
{
va_list args;
va_start(args, format);
vfprintf(stderr, format, args);
va_end(args);
exit(rc);
}
You can then write:
if (stat(argv[2], &inode) != -1)
error_exit(2, "Warning: The file %s exists. Not going to overwrite\n",
argv[2]);
Which has the merit of brevity.
You write functions to do sub-tasks. Deciding where to break up your code into functions is tricky - as much art as science. Your code is not so big that it is completely awful to leave it as it is - one function (though the error handling can be simplified as above).
If you want to practice writing functions, consider splitting it up:
open_input_file()
open_output_file()
checked_read()
checked_write()
checked_close()
These functions would allow your main code to be written as:
int main(int argc, char **argv)
{
int bytes_read;
int input_fd, output_fd;
char buffer[64];
if (argc != 3)
error_exit(1, "Usage: %s <fromfile> <tofile>\n", argv[0]);
input_fd = open_input_file(argv[1]);
output_fd = open_output_file(argv[2]);
while ((bytes_read = checked_read(input_fd, buffer, sizeof(buffer))) > 0)
checked_write(output_fd, buffer, bytes_read);
checked_close(input_fd);
checked_close(output_fd);
return 0;
}
Because you've tucked the error handling out of sight, it is now much easier to see the structure of the program. If you don't have enough functions yet, you can bury the loop into a function void file_copy(int fd_in, int fd_out). That removes more clutter from main() and leaves you with very simple code.
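For illustration, here is a minimal sketch of the checked I/O helpers and the file_copy() function mentioned above, built on top of error_exit() (the exit codes and messages are arbitrary choices):
#include <unistd.h>

int checked_read(int fd, char *buffer, size_t size)
{
    int n = read(fd, buffer, size);
    if (n == -1)
        error_exit(4, "read failed\n");
    return n;
}

void checked_write(int fd, const char *buffer, int size)
{
    if (write(fd, buffer, size) != size)
        error_exit(4, "write failed or was incomplete\n");
}

void checked_close(int fd)
{
    if (close(fd) == -1)
        error_exit(5, "close failed\n");
}

void file_copy(int fd_in, int fd_out)
{
    char buffer[64];
    int bytes_read;

    while ((bytes_read = checked_read(fd_in, buffer, sizeof(buffer))) > 0)
        checked_write(fd_out, buffer, bytes_read);
}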
Given an initial attempt at a function to open the output file:
void outputFile(int argc, char **argv)
{
/* Check that the output file doesnt exist */
if (stat(argv[argc-1], &inode) != -1)
{
printf("Warning: The file %s already exists. Not going to overwrite\n", argv[argc-1]);
return -1;
}
/*Opening ouput files*/
file_desc_out = open(argv[i],O_CREAT | O_WRONLY | O_EXCL , S_IRUSR|S_IWUSR);
if(file_desc_out == -1)
{
printf("Error: %s cannot be opened. \n",argv[i]); //insted of argv[2] have pointer i.
return -1;
}
}
Critique:
You have to define the variables used by the function in the function (you will want to avoid global variables as much as possible, and there is no call for any global variable in this code).
You have to define the return type. You are opening a file - how is the file descriptor going to be returned to the calling code? So, the return type should be int.
You pass only the information needed to the function - a simple form of 'information hiding'. In this case, you only need to pass the name of the file; the information about file modes and the like is implicit in the name of the function.
In general, you have to decide how to handle errors. Unless you have directives otherwise from your homework setter, it is reasonable to exit on error with an appropriate message. If you return an error indicator, then the calling code has to test for it, and decide what to do about the error.
Errors and warnings should be written to stderr, not to stdout. The main program output (if any) goes to stdout.
Your code is confused about whether argv[i] or argv[argc-1] is the name of the output file. In a sense, this criticism is irrelevant once you pass just the filename to the function. However, consistency is a major virtue in programming, and using the same expression to identify the same thing is usually a good idea.
Consistency of layout is also important. Don't use both if( and if ( in your programs; use the canonical if ( notation as used by the language's founding fathers, K&R.
Similarly, be consistent with no spaces before commas, a space after a comma, and be consistent with spaces around operators such as '|'. Consistency makes your code easier to read, and you'll be reading your code a lot more often than you write it (at least, once you've finished your course, you will do more reading than writing).
You cannot have return -1; inside a function that returns no value.
When you are splitting up code into functions, you need to copy/move the paragraphs of code that you are extracting, leaving behind a call to the new function. You also need to copy the relevant local variables from the calling function into the new function - possibly eliminating the variables in the calling function if they are no longer used there. You do compile with most warnings enabled, don't you? You want to know about unused variables etc.
When you create the new function, one of the most important parts is working out the correct signature for the function. Does it return a value? If so, which value, and what is its type? If not, how does it handle errors? In this case, you probably want the function to bail out (terminate the program) if it runs into an error. In bigger systems, you might need to consistently return an error indicator (0 implies success, negative implies failure, different negatives indicating different errors). When you work with functions that return an error indicator, it is almost always crucial that you check the error indicators in the calling code. For big programs, big swathes of the code can be all about error handling. Similarly, you need to work out which values are passed into the function.
I'm omitting advice about things such as 'be const correct' as overkill for your stage in learning to program in C.
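Putting the points of the critique together, one possible corrected version of the function looks like this (a sketch; the exit codes are arbitrary, and error_exit() from above could replace the fprintf()/exit() pairs):
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

int open_output_file(const char *filename)
{
    struct stat inode;
    int fd;

    /* Refuse to overwrite an existing file */
    if (stat(filename, &inode) != -1)
    {
        fprintf(stderr, "Warning: The file %s already exists. Not going to overwrite\n", filename);
        exit(2);
    }
    fd = open(filename, O_CREAT | O_WRONLY | O_EXCL, S_IRUSR | S_IWUSR);
    if (fd == -1)
    {
        fprintf(stderr, "Error: %s cannot be opened\n", filename);
        exit(3);
    }
    return fd;    /* hand the descriptor back to the caller */
}
The calling code in main() then reads output_fd = open_output_file(argv[2]);.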

You seem to actually understand how to make a function; making a function really isn't that hard. First, you need to understand that a function has a type. In other words, argc has type int, argv has type char **, and your function (currently) has type void. void means it has no value, which means that when you return, you return nothing.
However, if you look at your code, you do return -1. It looks like you want to return an integer, so you should change the top from void outputFile(...) to int outputFile(...).
Next, your function must return a value on every path; the compiler will complain if there is a circumstance where it doesn't (infinite loops aside). So at the very bottom, if no errors happen, execution will reach the end, and since you're no longer using void as the return type, you must return something before the end of the function. I suggest putting a return 1; there to show that everything went great.

There are several things.
The function return type isn't what you want. You either want to return a file descriptor or an error code. IIRC, the file descriptor is a nonnegative int, so you can use a return type of int rather than void. You also need to return something on either path, either -1 or file_desc_out.
You probably don't want to pass in the command-line arguments as a whole, but rather something like argv[argc - 1]. In that case, the argument should be something like char * filename rather than the argc/argv it has now. (Note that the argv[i] you've got in the last printf is almost certainly wrong.)
This means it would be called something like
int file_desc_out = outputFile(argv[argc - 1]);
You need to have all variables declared in the function, specifically inode and file_desc_out.
Finally, put an extra level of indentation on the code inside the { and } of the function itself.

Related

Sending exec output from function to main method

I have a function, called from the main method, that executes ls -l on a certain directory. I want it to execute and send the result back to the main method as a string.
My current flawed code:
char *lsl(){
char *stringts=malloc(1024);
chdir("/Users/file/path");
char * lsargs[] = { "/bin/ls" , "-l", NULL};
stringts="The result of ls-l in the created directory is:"+ execv(lsargs[0], lsargs);
return stringts;
}
Currently I am only getting the exec output on the screen. I understand why this is happening (exec is called before the return point is reached), but I don't know how I could do what I want, or whether it is actually doable.
I was thinking of using pipes and dup2() so that the exec'd program doesn't write to stdout, but I don't know whether it would be possible to put the output into a string.
As Jonathan Leffler already pointed out in comments, there is no '+' operator for concatenating strings in C.
A possibility to dynamically extends strings is to use realloc together with strcat.
Each time you read some bytes from the pipe, you could check the remaining capacity of the originally allocated memory for the string and, if it is not enough, reallocate twice the size.
You have to keep track of the size of the current string yourself. You could do this with a variable of type size_t.
If you combine this with the popen handling, it could look something like this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
FILE *fp;
if ((fp = popen("ls -l", "r")) == NULL) {
perror("popen failed");
return EXIT_FAILURE;
}
size_t str_size = 1024;
char *stringts = malloc(str_size);
if (!stringts) {
perror("stringts allocation failed");
return EXIT_FAILURE;
}
stringts[0] = '\0';
char buf[128];
size_t n;
while ((n = fread(buf, 1, sizeof(buf) - 1, fp)) > 0) {
buf[n] = '\0';
size_t capacity = str_size - strlen(stringts) - 1;
while (n > capacity) {
str_size *= 2;
stringts = realloc(stringts, str_size);
if (!stringts) {
perror("stringts realloation failed");
return EXIT_FAILURE;
}
capacity = str_size - strlen(stringts) - 1;
}
strcat(stringts, buf);
}
printf("%s\n", stringts);
free(stringts);
if (pclose(fp) != 0) {
perror("pclose failed");
return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}
You have several flaws in your code:
char *lsl(){
char *stringts=malloc(1024);
chdir("/Users/file/path");
char * lsargs[] = { "/bin/ls" , "-l", NULL};
stringts="The result of ls-l in the created directory is:"+ execv(lsargs[0], lsargs);
return stringts;
}
You malloc(3) a 1024 byte buffer into the stringts pointer, but then you assign a different value to the pointer, so your buffer is lost in the immensity of your RAM.
When you make the execv(2) call, all the memory of your process is discarded by the kernel and replaced with an execution of the command ls -l; you get the output on the standard output of the process, and then you get the shell prompt back. This makes the rest of your program useless: once you exec, there's no way back, and your program is unloaded and freed.
The + does not concatenate strings; it does pointer arithmetic on the address of the string literal "The result of ls -l...". If execv fails it returns -1, so you end up with a pointer to the character just before that string, which is a valid expression in C but makes your program behave erratically (undefined behaviour). Use strcpy(3), strcat(3), or snprintf(3), depending on the exact text you want to copy into the buffer you allocated.
You return an invalid address as a result. The problem here is that if execv(2) works, it doesn't return; only if it fails do you get back a pointer you cannot use (for the reason above), and in that case ls -l has not been executed. You don't say what output you got, so it is difficult for me to guess whether you actually exec()d the program or not.
On the other hand, you have the popen(3) library function, which allows you to execute a subprogram and read its output from a stream. (I recommend you not chdir gratuitously in your program, as that is a global change to your program's environment; IMHO it is better to pass ls(1) the directory you want to list as a parameter.)
#include <stdio.h>
FILE *lsl() {
/* the call creates a FILE * descriptor that you can use as input and
* read the output of the ls command. It's bad resources use to try to
* read all in a string and return the string instead. Better read as
* much as you can/need and then pclose() the descriptor. */
return popen("/bin/ls -l /Users/file/path|", "rt");
}
and then you can read from it (the output can be very long, and you probably don't have enough buffer space to hold it all in memory if you have a huge directory):
FILE *dir = lsl();
if (dir) {
char buffer[1024];
while (fgets(buffer, sizeof buffer, dir)) {
process_line_of_lsl(buffer);
}
pclose(dir); /* you have to use pclose(3) with popen(3) */
}
If you don't want to use popen(3), then you cannot use execv(2) alone: you have to fork(2) first to create a new process, and exec() in the child process (after setting up the redirection yourself). Read a good introduction to fork()/exec() and how to redirect I/O between fork() and exec(); it is too long and detailed to reproduce here (again).
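For orientation, the overall shape of the fork()/pipe()/dup2()/execv() approach looks roughly like this (a sketch only: error handling is minimal, and the directory parameter and buffer sizes are illustrative):
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run "ls -l dir" and return its output in a malloc'd string, or NULL on error. */
char *lsl(const char *dir)
{
    int pipefd[2];
    if (pipe(pipefd) == -1)
        return NULL;

    pid_t pid = fork();
    if (pid == -1)
        return NULL;

    if (pid == 0) {                          /* child: the pipe's write end becomes stdout */
        char *lsargs[] = { "/bin/ls", "-l", (char *)dir, NULL };
        close(pipefd[0]);
        dup2(pipefd[1], STDOUT_FILENO);
        close(pipefd[1]);
        execv(lsargs[0], lsargs);
        _exit(127);                          /* only reached if execv fails */
    }

    /* parent: read the child's output from the pipe into a growing buffer */
    close(pipefd[1]);
    size_t size = 1024, len = 0;
    char *out = malloc(size);
    ssize_t n;
    while (out && (n = read(pipefd[0], out + len, size - len - 1)) > 0) {
        len += (size_t)n;
        if (size - len < 2) {                /* grow when nearly full */
            char *tmp = realloc(out, size *= 2);
            if (tmp == NULL) { free(out); out = NULL; break; }
            out = tmp;
        }
    }
    if (out)
        out[len] = '\0';
    close(pipefd[0]);
    waitpid(pid, NULL, 0);
    return out;
}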

Chmod in C assigning wrong permissions

The following is my code for a function that copies a file, given the path to the file and a directory provided as the destination. The copy works perfectly fine; however, my chmod call assigns the wrong permissions to the copied file in the destination. If the permissions on the source are 644, the copied file ends up with permissions of 170 or 120.
I have been attempting to debug this for hours and it's driving me slightly crazy so any help is greatly appreciated.
void copy_file(char* src, char* dest) {
char a;
//extract file name through a duplicate ptr
char* fname = strdup(src);
char* dname = basename(fname);
//open read and write streams
FILE* read;
FILE* write;
read = fopen(src, "r");
chdir(dest);
write = fopen(dname, "w");
//error checking
if (read == NULL) //|| (write == NULL))
{
perror("Read Error: ");
exit(0);
}
else if (write == NULL)
{
perror("Write Error: ");
exit(0);
}
//write from src to dest char by char
while (1){
a = fgetc(read);
if (a == EOF)
{
break;
}
fputc(a, write);
}
//close files
fclose(read);
fclose(write);
// this is where I attempt to assign source file permissions
//and it goes horribly wrong
struct stat src_st;
if(stat(src, &src_st)){
perror("stat: ");
}
chmod(dname, src_st.st_mode);
printf("%o\n", src_st.st_mode & 0777);
}
You fopen(src, "r"), then you chdir(dest). This means that when you later call stat(src, &src_st), there is no reason to think that stat will access the same file as fopen did, or indeed that stat will access any file at all.
If stat fails, you proceed to call chmod anyway, so you pass whatever random junk was in src_st.st_mode to chmod.
You should use fstat(fileno(read), &src_st) before calling fclose(read), instead of calling stat(src, &src_st).
The basic problem is you have to check your system calls like fopen, chdir, and stat immediately.
For example, the first thing I tried was copy_file( "test.data", "test2.data" ), not realizing it expected a destination directory.
char* fname = strdup(src);
char* dname = basename(fname);
dname is now test.data, same as the source.
read = fopen(src, "r"); // succeeds
chdir(dest); // fails
write = fopen(dname, "w"); // blows away test.data, the source
You do eventually check read and write, but after the damage has been done.
Blowing away your source file is really bad. It's important that your code deals with failed system calls. If you don't, it will sail along causing confusion and destruction.
Most system calls in C return 0 for success. This is an anti-pattern where the return value is an error flag: 0 (false) means success, and a non-zero value means failure (stat doesn't encode which error in the return value; it reports that through errno).
When it fails, stat returns -1, which is true, so a bare if (stat(...)) test is easy to read the wrong way around, and here it only prints a warning and carries on.
struct stat src_st;
if(stat(src, &src_st)){
perror("stat: ");
}
Instead, compare explicitly against zero and bail out on failure.
struct stat src_st;
if(stat(src, &src_st) != 0 ){
// Note that I don't use perror, it doesn't provide enough information.
fprintf(stderr, "Could not stat %s: %s\n", src, strerror(errno));
exit(1);
}
As you can guess this gets tedious in the extreme, and you're going to forget, or do it slightly different each time. You'll want to write wrappers around those functions to do the error handling for you.
FILE *fopen_checked( const char *file, const char *mode ) {
FILE *fp = fopen(file, mode);
if( fp == NULL ) {
fprintf(stderr, "Could not open '%s' for '%s': %s", file, mode, strerror(errno));
exit(1);
}
return fp;
}
It's not the best error handling, but it will at least ensure your code appropriately halts and catches fire.
A note about chdir: if you can avoid it don't use it. chdir affects the global state of the program, the current working directory, and globals add complexity to everything. It's very, very easy for a function to change directory and not change back, as yours does. Now your process is in a weird state.
For example, if one did copy_file( "somefile", "foo" ) this leaves the program in foo/. If they then did copy_file( "otherfile", "foo" ) they'd be trying to copy foo/otherfile to foo/foo/otherfile.
And, as @robmayoff pointed out, your stat fails because the process is now in a different directory. So even the function doing the chdir is confused by it.
Ensuring that your functions always chdir back to the original directory in a language like C is very difficult and greatly complicates error handling. Instead, stay in your original directory and use functions like basename to join paths together.
Finally, avoid mixing your file operations. Use filenames or use file descriptors, but try not to use both. That means if you're using fopen, use fstat and fchmod. You might have to use fileno to get a file descriptor out of the FILE pointer.
This avoids having to carry around and keep in sync two pieces of data, the file descriptor and the filename. It also avoids issues with chdir or the file being renamed or even deleted, the file descriptor will still work so long as it remains open.
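Putting those pieces together, the relevant parts of copy_file() might look something like the sketch below. It builds the destination path instead of calling chdir, and copies the permission bits through the descriptors with fstat()/fchmod(); the buffer handling and error handling are simplified:
#include <errno.h>
#include <libgen.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

void copy_file(const char *src, const char *dest_dir)
{
    /* Build the destination path instead of chdir()ing into dest_dir */
    char *tmp = strdup(src);
    char dest_path[4096];
    snprintf(dest_path, sizeof(dest_path), "%s/%s", dest_dir, basename(tmp));

    FILE *in = fopen(src, "r");
    if (in == NULL) {
        fprintf(stderr, "Could not open %s: %s\n", src, strerror(errno));
        exit(1);
    }
    FILE *out = fopen(dest_path, "w");
    if (out == NULL) {
        fprintf(stderr, "Could not open %s: %s\n", dest_path, strerror(errno));
        exit(1);
    }

    int c;                                   /* int, so EOF is detected correctly */
    while ((c = fgetc(in)) != EOF)
        fputc(c, out);

    /* Copy the permission bits using the descriptors, not the names */
    struct stat src_st;
    if (fstat(fileno(in), &src_st) == 0)
        fchmod(fileno(out), src_st.st_mode & 07777);

    fclose(in);
    fclose(out);
    free(tmp);
}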
This is also a problem:
char a;
...
while (1){
a = fgetc(read);
if (a == EOF)
{
break;
}
fputc(a, write);
}
fgetc() returns int, not char. Per the C Standard, 7.21.7.1 The fgetc function:
7.21.7.1 The fgetc function
Synopsis
#include <stdio.h>
int fgetc(FILE *stream);
Assuming sizeof( int ) > sizeof( char ), char values are signed, 2s-complement integers, and EOF is an int defined to be -1 (all very common values), reading a file with char a = fgetc( stream ); will fail upon reading a valid 0xFF character value. And if your implementation's default char value is unsigned char, char a = fgetc( stream ); will never produce a value that matches EOF.
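The fix is to declare the variable as int and only treat it as a character after the EOF test, e.g.:
int a;                  /* int, not char, so EOF can be distinguished from data */
while ((a = fgetc(read)) != EOF)
    fputc(a, write);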

retrofitting a .h & .c file to my already working .c program

I have a program that I'm doing for class where I need to take the content of one file, reverse it, and write that reversed content to another file. I have written a program that successfully does this (after much googling, as I am new to the C programming language). The problem, however, is that my professor wants us to submit the program in a certain way, with a couple of supporting .h and .c files (which I understand is good practice). So I was hoping someone could help me understand exactly how I can take my already existing program and turn it into one that meets his specifications, which are as follows:
he would like a file named "file_utils.h" that has function signatures and guards for the following two functions
int read_file( char* filename, char **buffer );
int write_file( char* filename, char *buffer, int size);
Thus far I have created this file to try to accomplish this:
#ifndef UTILS_H
#define UTILS_H
int read_file(char* filename, char **buffer);
int write_file(char* filename, char *buffer, int size);
#endif
he would like a file named "file_utils.c" that has the implemented code for the previous two functions
he would like a file named "reverse.c" that accepts command arguments, includes a main function, and calls the functions from the previous two files.
Now, I understand how this is supposed to work, but as I look at the program I wrote my way, I'm unsure how to actually accomplish the same result while adhering to the previously mentioned specifications.
Below is the program that successfully accomplishes the desired functionality
#include<stdlib.h>
#include<stdio.h>
#include<fcntl.h>
#include<string.h>
#include<sys/stat.h>
#include<unistd.h>
int main(int argc, char *argv[]) {
int file1, file2, char_count, x, k;
char buffer;
// if the number of parameters passed are not correct, exit
//
if (argc != 3) {
fprintf(stderr, "usage %s <file1> <file2>", argv[0]);
exit(EXIT_FAILURE);
}
// if the origin file cannot be opened for whatever reason, exit
// S_IRUSR specifies that this file is to be read by only the file owner
//
if ((file1 = open(argv[1], S_IRUSR)) < 0) {
fprintf(stderr, "The origin-file is inaccessible");
exit(EXIT_FAILURE);
}
// if the destination-file cannot be opened for whatever reason, exit
// S_IWUSR specifies that this file is to be written to by only the file owner
//
if ((file2 = creat(argv[2], S_IWUSR)) < 0) {
fprintf(stderr, "The destination-file is inaccessible");
exit(EXIT_FAILURE);
}
// SEEK_END is used to place the read/write pointer at the end of the file
//
char_count = lseek(file1, (off_t) 0, SEEK_END);
printf("origin-file size is %d\n", char_count - 1);
for (k = char_count - 1; k >= 0; k--) {
lseek(file1, (off_t) k, SEEK_SET);
x = read(file1, &buffer, 1);
if (x != 1) {
fprintf(stderr, "can't read 1 byte");
exit(-1);
}
x = write(file2, &buffer, 1);
if (x != 1) {
fprintf(stderr, "can't write 1 byte");
exit(-1);
}
}
write(STDOUT_FILENO, "Reversal & Transfer Complete\n", 5);
close(file1);
close(file2);
return 0;
}
Any insight as to how I can accomplish this "refactoring" of sorts would be much appreciated, thanks!
The assignment demands a different architecture than your program. Unfortunately, this will not be a refactoring but a rewrite.
You have most of the pieces of read_file and write_file already: opening the file, determining its length, error handling. Those can be copy-pasted into the new functions.
But read_file should call malloc and read the file into memory, which is different.
You should create a new function in reverse.c, called by main, to reverse the bytes in a memory buffer.
After that function runs, write_file should attempt to open the file, and only do its error checking at that point.
Your simple program is superior because it validates the output file before any I/O, and it requires less memory. Its behavior satisfies the assignment, but its form does not.
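To make the required shape concrete, file_utils.c might look roughly like the sketch below. The return conventions (byte count on success, -1 on failure) are an assumption here, so follow whatever your professor specified:
/* file_utils.c -- one possible implementation of the two required functions */
#include <stdio.h>
#include <stdlib.h>
#include "file_utils.h"

int read_file(char *filename, char **buffer)
{
    FILE *fp = fopen(filename, "rb");
    long size;

    if (fp == NULL)
        return -1;
    fseek(fp, 0, SEEK_END);           /* find the length, much like lseek(SEEK_END) above */
    size = ftell(fp);
    rewind(fp);
    *buffer = malloc(size);
    if (*buffer == NULL || fread(*buffer, 1, size, fp) != (size_t)size) {
        free(*buffer);
        fclose(fp);
        return -1;
    }
    fclose(fp);
    return (int)size;
}

int write_file(char *filename, char *buffer, int size)
{
    FILE *fp = fopen(filename, "wb");

    if (fp == NULL)
        return -1;
    if (fwrite(buffer, 1, size, fp) != (size_t)size) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    return size;
}
reverse.c would then call read_file(), reverse the buffer in place, and pass the result to write_file().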

how can i cause a segmentation fault on this program? (to help me find a buffer overflow exploit)

I see that the code below uses memcpy, which I can use to exploit this program and cause a buffer overflow, but I can't seem to make it crash. No matter what character argument I pass to it I just get "Error opening packet file". Any ideas how?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <fcntl.h>
#define MAX_ADDR_LEN 128
#define ADDR_LENGTH_OFFSET 4
#define ADDR_OFFSET 8
typedef unsigned char shsize_t;
typedef struct{
char addr[MAX_ADDR_LEN];
shsize_t len;
} arp_addr;
void
print_address(char *packet)
{
arp_addr hwaddr;
int i;
hwaddr.len = (shsize_t) *(packet + ADDR_LENGTH_OFFSET);
memcpy(hwaddr.addr, packet + ADDR_OFFSET, hwaddr.len);
printf("Sender hardware address: ");
for (i = 0; i < hwaddr.len - 1; i ++)
printf("%02hhx::", hwaddr.addr[i]);
printf("%02hhx\n", hwaddr.addr[hwaddr.len - 1]);
return;
}
int main(int argc, char *argv[])
{
struct stat sbuf;
char *packet;
int fd;
if (argc != 2){
printf("Usage: %s <packet file>\n", argv[0]);
return EXIT_FAILURE;
}
if ((stat(argv[1], &sbuf)) < 0){
printf("Error opening packet file\n");
return EXIT_FAILURE;
}
if ((fd = open(argv[1], O_RDONLY)) < 0){
printf("Error opening packet file\n");
return EXIT_FAILURE;
}
if ((packet = (char *)malloc(sbuf.st_size * sizeof(char))) == NULL){
printf("Error allocating memory\n");
return EXIT_FAILURE;
}
if (read(fd, packet, sbuf.st_size) < 0){
printf("Error reading packet from file\n");
return EXIT_FAILURE;
}
close(fd);
print_address(packet);
free(packet);
return EXIT_SUCCESS;
}
When you do something like write past the end of a buffer there is no guarantee that the program will crash. This is called undefined behavior, which literally means that you can make no reasonable assumptions as to what will happen.
The program itself appears relatively well behaved. As long as len is calculated properly I don't see any way for you to cause an overrun via input. Just because a program uses memcpy doesn't mean that it is vulnerable to attack. The only attack vector I see is if you pass it a carefully crafted file such that the length is calculated incorrectly:
hwaddr.len = (shsize_t) *(packet + ADDR_LENGTH_OFFSET)
In this line the program reads the byte at offset ADDR_LENGTH_OFFSET in packet to get the address length. Obviously that is problematic if you craft a file with an erroneous value for the data length in the header (i.e., a data length > MAX_ADDR_LEN).
BTW, the argument is a file name, not a character. You won't be able to do anything by passing it nonsense input, because stat will fail and the program exits before reading anything.
"No matter what character argument I pass to it I just get 'error opening packet file'."
You need to pass a valid file name as an argument, not random characters.
As others have indicated, the memcpy() isn't the security problem. The problem is that the length parameter passed to memcpy() comes from user input (the file you specified). If you specify a file whose length byte is, say, 255 (the largest value the one-byte length field can hold, and well over MAX_ADDR_LEN), you will probably see a crash (and, yes, 'crash' is accepted vernacular).
Since there is rather limited checking on the size of the packet, you can pass it the name of an empty or very short file and the print_address() code will mess around out of bounds.
Also, since the code reads a length from the data read from the file, you can place an arbitrary number at the relevant position and make the code wander around most places in memory.
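To see this in practice, a file crafted along the following lines (a sketch; the layout follows the #defines in the program, and the file name and size are arbitrary) gives print_address() a length byte of 255:
/* Writes a 16-byte "packet" whose length byte (offset ADDR_LENGTH_OFFSET = 4)
 * claims 255 bytes of address data, far more than MAX_ADDR_LEN (128). */
#include <stdio.h>

int main(void)
{
    unsigned char packet[16] = {0};
    packet[4] = 0xff;                  /* bogus address length */

    FILE *fp = fopen("crafted.pkt", "wb");
    if (fp == NULL)
        return 1;
    fwrite(packet, 1, sizeof(packet), fp);
    fclose(fp);
    return 0;
}
Feeding that file to the program makes memcpy() copy 255 bytes into the 128-byte hwaddr.addr while also reading past the end of the 16-byte heap buffer; whether it actually crashes is, as noted above, undefined behaviour.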

Tried and true simple file copying code in C?

This looks like a simple question, but I didn't find anything similar here.
Since there is no file copy function in C, we have to implement file copying ourselves, but I don't like reinventing the wheel even for trivial stuff like that, so I'd like to ask the crowd:
What code would you recommend for file copying using fopen()/fread()/fwrite()?
What code would you recommend for file copying using open()/read()/write()?
This code should be portable (Windows/Mac/Linux/BSD/QNX/you name it), stable, time-tested, fast, memory efficient, etc. Getting into a specific system's internals to squeeze out some more performance is welcome (like getting the filesystem cluster size).
This seems like a trivial question, but, for example, the source code for the cp command isn't 10 lines of C code.
This is the function I use when I need to copy from one file to another - with test harness:
/*
#(#)File: $RCSfile: fcopy.c,v $
#(#)Version: $Revision: 1.11 $
#(#)Last changed: $Date: 2008/02/11 07:28:06 $
#(#)Purpose: Copy the rest of file1 to file2
#(#)Author: J Leffler
#(#)Modified: 1991,1997,2000,2003,2005,2008
*/
/*TABSTOP=4*/
#include "jlss.h"
#include "stderr.h"
#ifndef lint
/* Prevent over-aggressive optimizers from eliminating ID string */
const char jlss_id_fcopy_c[] = "#(#)$Id: fcopy.c,v 1.11 2008/02/11 07:28:06 jleffler Exp $";
#endif /* lint */
void fcopy(FILE *f1, FILE *f2)
{
char buffer[BUFSIZ];
size_t n;
while ((n = fread(buffer, sizeof(char), sizeof(buffer), f1)) > 0)
{
if (fwrite(buffer, sizeof(char), n, f2) != n)
err_syserr("write failed\n");
}
}
#ifdef TEST
int main(int argc, char **argv)
{
FILE *fp1;
FILE *fp2;
err_setarg0(argv[0]);
if (argc != 3)
err_usage("from to");
if ((fp1 = fopen(argv[1], "rb")) == 0)
err_syserr("cannot open file %s for reading\n", argv[1]);
if ((fp2 = fopen(argv[2], "wb")) == 0)
err_syserr("cannot open file %s for writing\n", argv[2]);
fcopy(fp1, fp2);
return(0);
}
#endif /* TEST */
Clearly, this version uses file pointers from standard I/O and not file descriptors, but it is reasonably efficient and about as portable as it can be.
Well, except the error function - that's peculiar to me. As long as you handle errors cleanly, you should be OK. The "jlss.h" header declares fcopy(); the "stderr.h" header declares err_syserr() amongst many other similar error reporting functions. A simple version of the function follows - the real one adds the program name and does some other stuff.
#include "stderr.h"
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
void err_syserr(const char *fmt, ...)
{
int errnum = errno;
va_list args;
va_start(args, fmt);
vfprintf(stderr, fmt, args);
va_end(args);
if (errnum != 0)
fprintf(stderr, "(%d: %s)\n", errnum, strerror(errnum));
exit(1);
}
The code above may be treated as having a modern BSD license or GPL v3 at your choice.
As far as the actual I/O goes, the code I've written a million times in various guises for copying data from one stream to another goes something like this. It returns 0 on success, or -1 with errno set on error (in which case any number of bytes might have been copied).
Note that for copying regular files, you can skip the EAGAIN stuff, since regular files are always blocking I/O. But inevitably if you write this code, someone will use it on other types of file descriptors, so consider it a freebie.
There's a file-specific optimisation that GNU cp does, which I haven't bothered with here: for long blocks of 0 bytes, instead of writing you just extend the output file by seeking off the end (a rough sketch of the idea appears at the end of this answer).
#include <errno.h>
#include <poll.h>
#include <stdlib.h>
#include <unistd.h>

void block(int fd, int event) {
struct pollfd topoll;
topoll.fd = fd;
topoll.events = event;
poll(&topoll, 1, -1);
// no need to check errors - if the stream is bust then the
// next read/write will tell us
}
int copy_data_buffer(int fdin, int fdout, void *buf, size_t bufsize) {
for(;;) {
char *pos; // char * (not void *) so the pointer arithmetic below is standard C
// read data to buffer
ssize_t bytestowrite = read(fdin, buf, bufsize);
if (bytestowrite == 0) break; // end of input
if (bytestowrite == -1) {
if (errno == EINTR) continue; // signal handled
if (errno == EAGAIN) {
block(fdin, POLLIN);
continue;
}
return -1; // error
}
// write data from buffer
pos = buf;
while (bytestowrite > 0) {
ssize_t bytes_written = write(fdout, pos, bytestowrite);
if (bytes_written == -1) {
if (errno == EINTR) continue; // signal handled
if (errno == EAGAIN) {
block(fdout, POLLOUT);
continue;
}
return -1; // error
}
bytestowrite -= bytes_written;
pos += bytes_written;
}
}
return 0; // success
}
// Default value. I think it will get close to maximum speed on most
// systems, short of using mmap etc. But porters / integrators
// might want to set it smaller, if the system is very memory
// constrained and they don't want this routine to starve
// concurrent ops of memory. And they might want to set it larger
// if I'm completely wrong and larger buffers improve performance.
// It's worth trying several MB at least once, although with huge
// allocations you have to watch for the linux
// "crash on access instead of returning 0" behaviour for failed malloc.
#ifndef FILECOPY_BUFFER_SIZE
#define FILECOPY_BUFFER_SIZE (64*1024)
#endif
int copy_data(int fdin, int fdout) {
// optional exercise for reader: take the file size as a parameter,
// and don't use a buffer any bigger than that. This prevents
// memory-hogging if FILECOPY_BUFFER_SIZE is very large and the file
// is small.
for (size_t bufsize = FILECOPY_BUFFER_SIZE; bufsize >= 256; bufsize /= 2) {
void *buffer = malloc(bufsize);
if (buffer != NULL) {
int result = copy_data_buffer(fdin, fdout, buffer, bufsize);
free(buffer);
return result;
}
}
// could use a stack buffer here instead of failing, if desired.
// 128 bytes ought to fit on any stack worth having, but again
// this could be made configurable.
return -1; // errno is ENOMEM
}
To open the input file:
int fdin = open(infile, O_RDONLY|O_BINARY, 0);
if (fdin == -1) return -1;
Opening the output file is tricksy. As a basis, you want:
int fdout = open(outfile, O_WRONLY|O_BINARY|O_CREAT|O_TRUNC, 0x1ff);
if (fdout == -1) {
close(fdin);
return -1;
}
But there are confounding factors:
you need to special-case when the files are the same, and I can't remember how to do that portably.
if the output filename is a directory, you might want to copy the file into the directory.
if the output file already exists (open with O_EXCL to determine this and check for EEXIST on error), you might want to do something different, as cp -i does.
you might want the permissions of the output file to reflect those of the input file.
you might want other platform-specific meta-data to be copied.
you may or may not wish to unlink the output file on error.
Obviously the answers to all these questions could be "do the same as cp". In which case the answer to the original question is "ignore everything I or anyone else has said, and use the source of cp".
Btw, getting the filesystem's cluster size is next to useless. You'll almost always see speed increasing with buffer size long after you've passed the size of a disk block.
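For completeness, the sparse-file optimisation mentioned earlier in this answer (skipping writes for all-zero blocks) might be sketched like this; the helper names are illustrative and real cp is considerably more careful:
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

// Returns non-zero if the whole buffer is zero bytes.
static int buffer_is_zero(const char *buf, size_t n) {
    return n > 0 && buf[0] == 0 && memcmp(buf, buf + 1, n - 1) == 0;
}

// Write one block, seeking instead of writing when the block is all zeros
// so the filesystem can leave a hole. Track the logical offset in *offset;
// after the copy loop, ftruncate(fdout, *offset) pins the final file size
// even if the last block was skipped.
int copy_block_sparse(int fdout, const char *buf, size_t n, off_t *offset) {
    if (buffer_is_zero(buf, n)) {
        if (lseek(fdout, (off_t)n, SEEK_CUR) == (off_t)-1)
            return -1;
    } else {
        if (write(fdout, buf, n) != (ssize_t)n)
            return -1;
    }
    *offset += n;
    return 0;
}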
The size of each read should be a multiple of 512 (the sector size); 4096 is a good choice.
Here is a very easy and clear example: Copy a file. Since it is written in ANSI C without any platform-specific function calls, I think this one would be pretty much portable.
Depending on what you mean by copying a file, it is certainly far from trivial. If you mean copying the content only, then there is almost nothing to do. But generally, you need to copy the metadata of the file, and that's surely platform dependent. I don't know of any C library which does what you want in a portable manner. Just handling the filename by itself is no trivial matter if you care about portability.
In C++, there is the filesystem library in Boost.
One thing I found when implementing my own file copy, and it seems obvious but it's not: I/Os are slow. You can pretty much gauge your copy's speed by how many of them you do. So clearly you need to do as few of them as possible.
The best results I found were when I got myself a ginormous buffer, read the entire source file into it in one I/O, then wrote the entire buffer back out in one I/O. If I even had to do it in 10 batches, it got way slow. Trying to read and write each byte, like a naive coder might try first, was just painful.
The accepted answer, written by Steve Jessop, does not address the first part of the question; Jonathan Leffler's answer does, but does it wrong: the code should be written as
while ((n = fread(buffer, 1, sizeof(buffer), f1)) > 0)
if (fwrite(buffer, n, 1, f2) != 1)
/* we got write error here */
/* test ferror(f1) for read errors */
Explanation:
sizeof(char) is 1 by definition, always: it does not matter how many bits are in it, 8 (in most cases), 9, 11 or 32 (on some DSPs, for example); the size of char is one. Note that this is not an error here, just extra code.
The fwrite function writes up to nmemb (third argument) elements of the specified size (second argument); it is not required to write exactly nmemb elements. To fix this you must either write the rest of the data that was read, or just write one element of size n and let fwrite do all the work. (This item is debatable: must fwrite write all the data or not? In my version, a short write is impossible unless an error occurs.)
You should test for read errors too: just test ferror(f1) at the end of the loop.
Note that you probably want to disable buffering on both input and output files to prevent triple buffering: first on read into the f1 buffer, second in our code, third on write into the f2 buffer:
setvbuf(f1, NULL, _IONBF, 0);
setvbuf(f2, NULL, _IONBF, 0);
(Internal buffers should, probably, be of size BUFSIZ.)
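Putting those points together, a corrected copy loop in the style of the fcopy() shown earlier might read like this (a sketch; err_syserr() is the error reporter from above):
void fcopy(FILE *f1, FILE *f2)
{
    char buffer[BUFSIZ];
    size_t n;

    while ((n = fread(buffer, 1, sizeof(buffer), f1)) > 0)
    {
        if (fwrite(buffer, n, 1, f2) != 1)      /* one element of size n: no short writes */
            err_syserr("write failed\n");
    }
    if (ferror(f1))                             /* distinguish EOF from a read error */
        err_syserr("read failed\n");
}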
