I'm not an expert in C and I'm looking for some advice to make my program more robust and reliable. Just to give some context: I've written a program to do some scientific computation that takes quite a long time (about 20h). I'm executing it on a large university HPC Linux cluster using a SLURM scheduling system and NFS-mounted file systems. What seems to happen is that at some point during the 20h the connection to the file system goes stale (on the entire machine; independent of my program), the first attempt to open & write a file takes a really long time, and that results in a segfault (core dumped) that I have so far not been able to precisely track down.
Below is a minimal file that at least conceptually reproduces the error: the program starts, opens a file, and everything works. The program does some long computation (simulated by sleep()), tries to open & write to the same file again, and it fails.
What are some conventions to make my code more robust so that it reliably writes my results to file without crashing?
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv) {
    // Declare variables
    FILE *outfile;
    char outname[150] = "result.csv";

    // Open file for writing
    printf("CHECKING if output file '%s' is writable?", outname);
    outfile = fopen(outname, "w");
    if (outfile == NULL) {
        perror("Failed");   // perror() appends ": <error string>" itself
        exit(EXIT_FAILURE);
    }
    fclose(outfile);
    printf(" PASSED.\n");

    // Do some computation that takes really long (around 19h)
    sleep(3);

    // Open file again and write results
    printf("Writing results to %s ...", outname);
    outfile = fopen(outname, "w");
    if (outfile == NULL) {
        perror("Failed writing in tabulate_vector_new");
        exit(EXIT_FAILURE);
    }
    fprintf(outfile, "This is the important result.\n");
    fclose(outfile);
    printf(" DONE.\n");
    return 0;
}
It seems odd that your program would segfault due to an NFS issue. I would expect it to hang indefinitely, not crash. That having been said, I would suggest forking a new process to check whether the NFS mount is working. That way, your important code won't be directly involved in testing the problematic file system. Something like the following approach may be useful:
#include <sys/wait.h>   /* wait() */
#include <unistd.h>     /* fork(), execlp(), _exit() */

pid_t pid = fork();
if (pid == -1)
{
    // error, failed to fork(). should probably give up now. something is really wrong.
}
else if (pid > 0)
{
    // if the child exits, it has successfully interacted with the NFS file system
    wait(NULL);
    // proceed with attempting to write important data
}
else
{
    // we are the child; exec df in order to test the NFS file system
    execlp("df", "df", "/mnt", (char *)NULL);
    // the child has been replaced by df, which will try to statfs(2) /mnt for us
    _exit(127);   // only reached if execlp() itself failed
}
The general concept here is that we utilize the df command to check whether the NFS file system (which I assume is at /mnt) is working. If it's temporarily not working, df should hang until it starts working again, and then exit, returning control to your program. If you suspect df might hang forever, you could enhance my example by using alarm(2) to wait a certain period of time, probably at least a few minutes, after which you could retry running df. Note that this could result in zombie df processes sticking around.
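For instance, the timeout could look something like this (a minimal sketch; the wait_for_df name and the 180-second timeout are my own inventions, and this version also kills and reaps a stuck df so no zombie is left behind):
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static void on_alarm(int sig) { (void)sig; /* just interrupt waitpid() */ }

/* Returns 0 if the child running df exits within `timeout` seconds,
 * -1 if we gave up waiting. */
static int wait_for_df(pid_t pid, unsigned timeout)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_alarm;      /* deliberately no SA_RESTART, so */
    sigaction(SIGALRM, &sa, NULL); /* waitpid() fails with EINTR     */

    alarm(timeout);
    if (waitpid(pid, NULL, 0) == pid) {
        alarm(0);                  /* cancel the pending alarm          */
        return 0;                  /* df exited: the NFS mount answered */
    }
    kill(pid, SIGKILL);            /* df is stuck on the stale mount... */
    waitpid(pid, NULL, 0);         /* ...kill and reap it (no zombie)   */
    return -1;                     /* caller can retry or give up       */
}
In the parent branch above, the plain wait(NULL) would then become a call like wait_for_df(pid, 180), retrying with a fresh fork() if it returns -1.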
In the end, the correct solution is to try to get a more reliable NFS server, but until you can do that, I hope this is helpful.
How would I go about printing the contents of a file that I've appended to, using only low-level I/O functions?
The closest I can get is printing the text that I'm appending.
Example:
file1.txt = dog
file2.txt = cat
I want file2.txt, which is now "catdog", to be printed out. How would I do that?
As said before, I can only get "dog" to print. I'm also successfully appending to the file. I know it's probably a really simple solution, but I've been scratching my head for hours.
My code
while (1) {
    if ((bufchar = read(fdin1, buf, sizeof(buf))) > 0) {
        bp = buf;                           // Pointer to next byte to write.
        while (bufchar > 0) {
            if ((wrchar = write(fdin2, bp, bufchar)) < 0) {
                perror("Write failed");
                break;                      // Don't update counters with -1.
            }
            bufchar -= wrchar;              // Update.
            bp += wrchar;
        }
    }
    else if (bufchar == 0) {                // EOF reached.
        break;
    }
    else {
        perror("Read failed");
        break;                              // Don't retry a failed read forever.
    }
}
Just a heads-up: if you are appending to file2.txt, it would then be "catdog", not the other way around. If you are only able to get "dog" to write out, physically go into the file to ensure you are actually appending and not simply overwriting it.
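Whether the writes append or overwrite is decided when the file is opened, not when write() is called. A one-line sketch with the file name from the example:
#include <fcntl.h>   /* open, O_RDWR, O_APPEND */

/* O_APPEND sends every write() to the current end of file2.txt;
 * without it, writing starts at offset 0 and clobbers "cat". */
int fdin2 = open("file2.txt", O_RDWR | O_APPEND);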
Here is some reading for the specifics of low-level file I/O. Read the top two links for opening and closing files and for primitive I/O operations. Without seeing any of your code it is hard to help you, though it is possible you are not properly closing the file, and so your appended line is not saved...
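To actually print the combined file, you can rewind the descriptor and copy it to standard output with the same primitives. A minimal sketch, assuming the fdin2 descriptor from the question was opened with O_RDWR | O_APPEND (with O_WRONLY the reads would fail); it needs <stdio.h> for perror and <unistd.h> for read/write/lseek:
if (lseek(fdin2, 0, SEEK_SET) == -1)     /* rewind to the start of "catdog" */
    perror("lseek failed");

char dumpbuf[4096];
ssize_t got;
while ((got = read(fdin2, dumpbuf, sizeof(dumpbuf))) > 0)
    write(STDOUT_FILENO, dumpbuf, got);  /* short writes ignored in this sketch */
if (got < 0)
    perror("Read failed");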
I am working on an assignment that only allows use of low-level I/O (read(), write(), lseek()) as well as perror().
I have been able to open the necessary in and out files with correct permissions, but when I write the output I get an infinite loop of the in file's contents written to out. See the snippet below...
void *buf = malloc(1024);

while ((n = read(in, buf, 1024)) > 0) {
    if (lseek(in, n, SEEK_CUR) == -1) {
        perror("in file not seekable");
        exit(-1);
    }
    while ((m = write(out, buf, n)) > 0) {
        if (lseek(out, m, SEEK_CUR) == -1) {
            perror("out file not seekable");
            exit(-1);
        }
    }
    if (m == -1) { perror("error writing out"); exit(-1); }
}
if (n == -1) { perror("error reading in"); exit(-1); }
I have removed some error trapping from my code, and you can assume the variables are initialized and the include statements are there.
Problem is the inner loop:
while((m = write(out, buf, n)) > 0){
should really be
if((m = write(out, buf, n)) > 0){
You only want buf to be written once, not infinitely many times. What you also need to handle is short writes, that is, when write returns with m < n && m > 0.
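A sketch of that short-write handling (the copy_fd wrapper name is mine; I've folded in the read loop so it stands alone):
#include <stdio.h>    /* perror */
#include <stdlib.h>   /* exit */
#include <unistd.h>   /* read, write */

/* Copy from `in` to `out`, retrying after short writes. */
void copy_fd(int in, int out)
{
    char buf[1024];
    ssize_t n, m;

    while ((n = read(in, buf, sizeof(buf))) > 0) {
        char *p = buf;              /* next byte still to be written */
        while (n > 0) {
            if ((m = write(out, p, n)) < 0) {
                perror("error writing out");
                exit(-1);
            }
            n -= m;                 /* a short write just shrinks n */
            p += m;
        }
    }
    if (n == -1) { perror("error reading in"); exit(-1); }
}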
Also, the lseek() calls are wrong, but they are not what causes the loop. read() and write() already advance the current file offset. You do not need to manually advance it, unless you want to skip bytes in the input or output file (note that in the output file case, on UNIX, skipping bytes may lead to so-called "holes" in files: regions which read as zero but don't occupy disk space).
Why are you seeking on the input file after your read? Since you will, at most, read 1024 bytes (meaning n will be somewhere between 0 and 1024), you will be continuously seeking to somewhere beyond where you've left the input file pointer so that you'll lose data in the transfer (including probably beyond the end of the file when you get near the end).
This may be one cause why you have an infinite loop but the far more insidious one is the use of while for the write. Since this will return values greater than zero on success, you will continuously write the first chunk to the file over and over. At least until you run out of disk space or other resources.
You also don't need the seek on the write either. The read and write calls do what they have to do and advance the file pointer correctly for the next read or write - it's not something you have to do manually.
You can probably simplify the whole thing to:
while ((n = read(in, buf, 1024)) > 0) {
    if ((m = write(out, buf, n)) != n) {
        perror("error writing out");
        exit(-1);
    }
}
which has the advantages of:
getting rid of the seek calls;
removing the 'infinite' loop;
checking that you've written all the bytes requested.
Context for this is that the program is basically reading through a filestream, 4K chunks at a time, looking for a certain pattern. It starts by reading in 4K, and if it doesn't find the pattern there, it starts a loop which reads in the next 4K chunk (rinse and repeat until EOF or the pattern is found).
On many files the code is working properly, but some files are getting errors.
The code below is obviously highly redacted, which I know might be annoying, but it includes ALL lines that reference the file descriptor or the file itself. I know you don't want to take my word for it, since I'm the one with the problem...
Having done a LITTLE homework before crying for help, I've found:
The file descriptor always happens to be 6 (it's also 6 for the files that are working), and that number isn't getting changed through the life of execution. Don't know if that's useful info or not.
By inserting print statements after every operation that accesses the file descriptor, I've also found that successful files go through the following cycle "open-read-close-close" (i.e. the pattern was found in the first 4K)
Unsuccessful files go "open-read-read ERROR (Bad File Descriptor)-close." So no premature close, and it's getting in the first read successfully, but the second read causes the Bad File Descriptor error.
int function(char *file)
{
    int len, fd, go = 0;
    char buf[4096];

    if ((fd = open(file, O_RDONLY)) <= 0)
    {
        my_error("Error opening file %s: %s", file, strerror(errno));
        return NULL;
    }

    //first read
    if ((len = read(fd, buf, 4096)) <= 0)
    {
        my_error("Error reading from file %s: %s", file, strerror(errno));
        close(fd); return NULL;
    }

    //pattern-searching
    if (/*conditions*/)
    {
        /* we found it, no need to keep looking */
        close(fd);
    }
    else
    {
        //reading loop
        while (!go)
        {
            if (/*conditions*/)
            {
                my_error("cannot locate pattern in file %s", file);
                close(fd); return NULL;
            }
            //next read
            if ((len = read(fd, buf, 4096)) <= 0) /**** FAILS HERE *****/
            {
                my_error("Error reading from file, possible bad message %s: %s",
                         file, strerror(errno));
                close(fd); return NULL;
            }
            if (/*conditions*/)
            {
                close(fd);
                break;
            }
            //pattern searching
            if (/*conditions*/)
            {
                /* found the pattern */
                go++; //break us out of the while loop
                //stuff
                close(fd);
            }
            else
            {
                //stuff, and we will loop again for the next chunk
            }
        } /*end while loop*/
    } /*end else statement*/
    close(fd);
}
Try not to worry about the pattern-reading logic - all operations are done on the char buffer, not on the file, so it ought to have no impact on this problem.
read() returns 0 at EOF (which falls into your if ... <= 0 test), but it does not set errno, which may still hold an out-of-date code from an earlier call.
Try testing for 0 (EOF) and negative (error, -1) return values separately.
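For example, splitting the failing check in two (a sketch reusing the names and the my_error helper from your function; needs <errno.h> and <string.h>):
len = read(fd, buf, 4096);
if (len == 0)
{
    /* genuine EOF: no error occurred, and errno was NOT updated */
    my_error("cannot locate pattern in file %s", file);
    close(fd); return NULL;
}
else if (len < 0)
{
    /* a real read error: errno/strerror(errno) are meaningful here */
    my_error("Error reading from file %s: %s", file, strerror(errno));
    close(fd); return NULL;
}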
Regarding "strace": I've used it a little at home, and in previous jobs. Unfortunately, it's not installed in my current work environment. It is a useful tool, when it's available. Here, I took the "let's read the fine manual" (man read) approach with the questioner :-)
This looks like a simple question, but I didn't find anything similar here.
Since there is no file copy function in C, we have to implement file copying ourselves, but I don't like reinventing the wheel even for trivial stuff like that, so I'd like to ask the crowd:
What code would you recommend for file copying using fopen()/fread()/fwrite()?
What code would you recommend for file copying using open()/read()/write()?
This code should be portable (Windows/Mac/Linux/BSD/QNX/you name it), stable, time tested, fast, memory efficient and so on. Getting into a specific system's internals to squeeze out some more performance is welcome (like getting the filesystem cluster size).
This seems like a trivial question but, for example, the source code for the cp command isn't 10 lines of C code.
This is the function I use when I need to copy from one file to another - with test harness:
/*
@(#)File:           $RCSfile: fcopy.c,v $
@(#)Version:        $Revision: 1.11 $
@(#)Last changed:   $Date: 2008/02/11 07:28:06 $
@(#)Purpose:        Copy the rest of file1 to file2
@(#)Author:         J Leffler
@(#)Modified:       1991,1997,2000,2003,2005,2008
*/
/*TABSTOP=4*/

#include "jlss.h"
#include "stderr.h"

#ifndef lint
/* Prevent over-aggressive optimizers from eliminating ID string */
const char jlss_id_fcopy_c[] = "@(#)$Id: fcopy.c,v 1.11 2008/02/11 07:28:06 jleffler Exp $";
#endif /* lint */

void fcopy(FILE *f1, FILE *f2)
{
    char buffer[BUFSIZ];
    size_t n;

    while ((n = fread(buffer, sizeof(char), sizeof(buffer), f1)) > 0)
    {
        if (fwrite(buffer, sizeof(char), n, f2) != n)
            err_syserr("write failed\n");
    }
}

#ifdef TEST
int main(int argc, char **argv)
{
    FILE *fp1;
    FILE *fp2;

    err_setarg0(argv[0]);
    if (argc != 3)
        err_usage("from to");
    if ((fp1 = fopen(argv[1], "rb")) == 0)
        err_syserr("cannot open file %s for reading\n", argv[1]);
    if ((fp2 = fopen(argv[2], "wb")) == 0)
        err_syserr("cannot open file %s for writing\n", argv[2]);
    fcopy(fp1, fp2);
    return(0);
}
#endif /* TEST */
Clearly, this version uses file pointers from standard I/O and not file descriptors, but it is reasonably efficient and about as portable as it can be.
Well, except the error function - that's peculiar to me. As long as you handle errors cleanly, you should be OK. The "jlss.h" header declares fcopy(); the "stderr.h" header declares err_syserr() amongst many other similar error reporting functions. A simple version of the function follows - the real one adds the program name and does some other stuff.
#include "stderr.h"
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
void err_syserr(const char *fmt, ...)
{
int errnum = errno;
va_list args;
va_start(args, fmt);
vfprintf(stderr, fmt, args);
va_end(args);
if (errnum != 0)
fprintf(stderr, "(%d: %s)\n", errnum, strerror(errnum));
exit(1);
}
The code above may be treated as having a modern BSD license or GPL v3 at your choice.
As far as the actual I/O goes, the code I've written a million times in various guises for copying data from one stream to another goes something like this. It returns 0 on success, or -1 with errno set on error (in which case any number of bytes might have been copied).
Note that for copying regular files, you can skip the EAGAIN stuff, since regular files are always blocking I/O. But inevitably if you write this code, someone will use it on other types of file descriptors, so consider it a freebie.
There's a file-specific optimisation that GNU cp does, which I haven't bothered with here: for long blocks of zero bytes, instead of writing them you just extend the output file by seeking past them (a sketch of the idea appears after copy_data below).
#include <errno.h>
#include <poll.h>
#include <unistd.h>

void block(int fd, int event) {
    struct pollfd topoll;
    topoll.fd = fd;
    topoll.events = event;
    poll(&topoll, 1, -1);
    // no need to check errors - if the stream is bust then the
    // next read/write will tell us
}

int copy_data_buffer(int fdin, int fdout, void *buf, size_t bufsize) {
    for (;;) {
        char *pos;
        // read data to buffer
        ssize_t bytestowrite = read(fdin, buf, bufsize);
        if (bytestowrite == 0) break; // end of input
        if (bytestowrite == -1) {
            if (errno == EINTR) continue; // signal handled
            if (errno == EAGAIN) {
                block(fdin, POLLIN);
                continue;
            }
            return -1; // error
        }

        // write data from buffer
        pos = buf;
        while (bytestowrite > 0) {
            ssize_t bytes_written = write(fdout, pos, bytestowrite);
            if (bytes_written == -1) {
                if (errno == EINTR) continue; // signal handled
                if (errno == EAGAIN) {
                    block(fdout, POLLOUT);
                    continue;
                }
                return -1; // error
            }
            bytestowrite -= bytes_written;
            pos += bytes_written;
        }
    }
    return 0; // success
}
// Default value. I think it will get close to maximum speed on most
// systems, short of using mmap etc. But porters / integrators
// might want to set it smaller, if the system is very memory
// constrained and they don't want this routine to starve
// concurrent ops of memory. And they might want to set it larger
// if I'm completely wrong and larger buffers improve performance.
// It's worth trying several MB at least once, although with huge
// allocations you have to watch for the linux
// "crash on access instead of returning 0" behaviour for failed malloc.
#ifndef FILECOPY_BUFFER_SIZE
#define FILECOPY_BUFFER_SIZE (64*1024)
#endif
int copy_data(int fdin, int fdout) {
    // optional exercise for reader: take the file size as a parameter,
    // and don't use a buffer any bigger than that. This prevents
    // memory-hogging if FILECOPY_BUFFER_SIZE is very large and the
    // file is small.
    for (size_t bufsize = FILECOPY_BUFFER_SIZE; bufsize >= 256; bufsize /= 2) {
        void *buffer = malloc(bufsize);
        if (buffer != NULL) {
            int result = copy_data_buffer(fdin, fdout, buffer, bufsize);
            free(buffer);
            return result;
        }
    }
    // could use a stack buffer here instead of failing, if desired.
    // 128 bytes ought to fit on any stack worth having, but again
    // this could be made configurable.
    return -1; // errno is ENOMEM
}
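As for the zero-block optimisation mentioned above, here is a sketch of the idea. The write_or_seek helper is hypothetical and not wired into copy_data_buffer; a real version must also ftruncate() the output to its final length at the end, or a trailing hole is silently lost:
#include <string.h>
#include <unistd.h>

/* If a block is all zero bytes, seek over it instead of writing it,
 * leaving a "hole" in the output file. */
static int write_or_seek(int fdout, const char *buf, size_t len)
{
    static const char zeros[4096];      /* static: zero-initialised */

    if (len <= sizeof(zeros) && memcmp(buf, zeros, len) == 0)
        return lseek(fdout, (off_t)len, SEEK_CUR) == -1 ? -1 : 0;
    return write(fdout, buf, len) == (ssize_t)len ? 0 : -1;
}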
To open the input file:
int fdin = open(infile, O_RDONLY|O_BINARY, 0);
if (fdin == -1) return -1;
Opening the output file is tricksy. As a basis, you want:
int fdout = open(outfile, O_WRONLY|O_BINARY|O_CREAT|O_TRUNC, 0x1ff);
if (fdout == -1) {
close(fdin);
return -1;
}
But there are confounding factors:
you need to special-case when the files are the same, and I can't remember how to do that portably.
if the output filename is a directory, you might want to copy the file into the directory.
if the output file already exists (open with O_EXCL to determine this and check for EEXIST on error), you might want to do something different, as cp -i does (see the sketch after this list).
you might want the permissions of the output file to reflect those of the input file.
you might want other platform-specific meta-data to be copied.
you may or may not wish to unlink the output file on error.
Obviously the answers to all these questions could be "do the same as cp". In which case the answer to the original question is "ignore everything I or anyone else has said, and use the source of cp".
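For the already-exists case in that list, the probe could look like this (a sketch with outfile as above; error handling trimmed):
#include <errno.h>
#include <fcntl.h>

/* Refuse to clobber an existing output file; roughly the situation
 * in which cp -i would prompt the user. */
int fdout = open(outfile, O_WRONLY | O_CREAT | O_EXCL, 0666);
if (fdout == -1 && errno == EEXIST) {
    /* outfile already exists: prompt, rename, or bail out here */
}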
Btw, getting the filesystem's cluster size is next to useless. You'll almost always see speed increasing with buffer size long after you've passed the size of a disk block.
The size of each read should be a multiple of 512 (the sector size); 4096 is a good one.
Here is a very easy and clear example: Copy a file. Since it is written in ANSI C without any particular function calls, I think this one is pretty much portable.
Depending on what you mean by copying a file, it is certainly far from trivial. If you mean copying the content only, then there is almost nothing to do. But generally, you need to copy the metadata of the file, and that's surely platform dependent. I don't know of any C library which does what you want in a portable manner. Just handling the filename by itself is no trivial matter if you care about portability.
In C++, there is the filesystem library in Boost.
One thing I found when implementing my own file copy, and it seems obvious but it's not: I/O is slow. You can pretty much time your copy's speed by how many I/O operations you do. So clearly you need to do as few of them as possible.
The best results I found were when I got myself a ginormous buffer, read the entire source file into it in one I/O, then wrote the entire buffer back out of it in one I/O. If I even had to do it in 10 batches, it got way slow. Trying to read and write out each byte, like a naive coder might try first, was just painful.
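A sketch of that whole-file-in-one-I/O approach (the copy_in_one_io name is mine; it assumes the file fits in memory and that fseek()/ftell() report its size, which holds for ordinary binary-mode files on common platforms):
#include <stdio.h>
#include <stdlib.h>

int copy_in_one_io(FILE *src, FILE *dst)
{
    long size;
    char *big;

    if (fseek(src, 0, SEEK_END) != 0 || (size = ftell(src)) < 0)
        return -1;
    rewind(src);

    if (size == 0)
        return 0;                       /* nothing to copy */
    if ((big = malloc((size_t)size)) == NULL)
        return -1;

    /* one read, one write */
    if (fread(big, 1, (size_t)size, src) != (size_t)size ||
        fwrite(big, 1, (size_t)size, dst) != (size_t)size) {
        free(big);
        return -1;
    }
    free(big);
    return 0;
}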
The accepted answer written by Steve Jessop does not answer the first part of the question; Jonathan Leffler's does, but gets it wrong: the code should be written as
while ((n = fread(buffer, 1, sizeof(buffer), f1)) > 0)
{
    if (fwrite(buffer, n, 1, f2) != 1)
        break;  /* we got a write error here */
}
/* test ferror(f1) for read errors afterwards */
Explanation:
sizeof(char) == 1 by definition, always: it does not matter how many bits are in a char, be it 8 (the usual case), 9, 11 or 32 (on some DSPs, for example); the size of char is one. Using sizeof(char) is not an error, just redundant code.
The fwrite function writes up to nmemb (second argument) elements of the specified size (third argument); it is not required to write exactly nmemb elements. To fix this you must either write the rest of the data that was read, or simply write one element of size n and let fwrite do all the work. (Whether fwrite must write all the data is debatable, but in my version a short write is impossible unless a real error occurs.)
You should test for read errors too: just test ferror(f1) at the end of the loop.
Note, you probably need to disable buffering on both input and output files to prevent triple buffering: first into f1's buffer on read, second in our own buffer, third into f2's buffer on write:
setvbuf(f1, NULL, _IONBF, 0);
setvbuf(f2, NULL, _IONBF, 0);
(Our own buffer should then, probably, be of size BUFSIZ.)
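Putting those points together, the loop would look something like this (a sketch with the same interface as the fcopy above; error handling left deliberately minimal):
#include <stdio.h>

void fcopy(FILE *f1, FILE *f2)
{
    char buffer[BUFSIZ];
    size_t n;

    while ((n = fread(buffer, 1, sizeof(buffer), f1)) > 0)
    {
        if (fwrite(buffer, n, 1, f2) != 1)
        {
            /* write error: inspect ferror(f2)/errno */
            break;
        }
    }
    if (ferror(f1))
    {
        /* read error: report it here */
    }
}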