C fwrite error handling

When fwrite returns a count that doesn't match the requested count (and therefore signals an error), what is the correct way to deal with the error?
clearerr(File); //Clear the error
fflush(File); //Empty the buffer of its contents
Or:
fflush(File); //Other way around, empty buffer first then reset
clearerr(File);
Or just:
clearerr(File); //Contains fflush implicitly?
Or something else?

There isn't really anything you can do if you encounter a write error. You can flush the buffer, but your last write was still broken, so the file doesn't contain what you want. You could close the file, reopen it for writing (with "truncate") and write it anew, but that only works if you still have the entire file content in memory.
Alternatively, you could reopen and see how much data has been written, but that doesn't help you if there's an external reason why you can't write to the file, so there's really no graceful way to recover.
So in short, you don't "handle" the error at the file site; rather, your program must handle the larger error condition that the write just failed and react at an appropriate point.
You should probably consider "atomic writes", which means you first write your file to a temporary, and only if you succeed do you delete the original and rename the temporary to the original file name (on POSIX systems, rename() replaces the original in one step, so no separate delete is needed). That way the file itself is always in a consistent state.
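The write-to-a-temporary idea could look roughly like the sketch below. The function name write_file_atomically and the caller-supplied tmp_path are made up for illustration. It assumes POSIX rename() semantics, where the rename replaces an existing destination; on other platforms (Windows, for example) rename() may fail if the destination already exists.

```c
#include <stdio.h>

/* Hypothetical helper: write `len` bytes to `path` by way of `tmp_path`.
   Returns 0 on success, -1 on failure. On failure the original file at
   `path` is left untouched. */
int write_file_atomically(const char *path, const char *tmp_path,
                          const void *data, size_t len)
{
    FILE *fp = fopen(tmp_path, "wb");
    if (fp == NULL)
        return -1;

    /* A short count from fwrite means the write failed. */
    int ok = fwrite(data, 1, len, fp) == len;

    /* Buffered data can still fail to reach the file here. */
    if (fflush(fp) != 0)
        ok = 0;
    if (fclose(fp) != 0)
        ok = 0;

    /* Only replace the original once the temporary is known to be good. */
    if (!ok || rename(tmp_path, path) != 0) {
        remove(tmp_path);
        return -1;
    }
    return 0;
}
```

Note that the error is not "handled" on the stream at all: the function just reports failure and the caller decides what to do, which matches the advice above.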

Related

Why isn't it a good idea to open the file before each writing and close it right after each writing?

Suppose you are writing a list of names in a file, each name being written in a write command. Why isn't it a good idea to open the file before each writing and close it right after each writing?
Intuitively, I would say that this approach is a lot more time-consuming than writing to a buffer and later writing to the file. But I'm sure there is a better explanation than that. Can someone enlighten me?
Let's sum it up:
Opening and closing a file for every single write operation, when many such operations are planned, is a bad idea because:
It is terribly inefficient.
It requires an extra seek to the end of file in order to append.
It forfeits atomicity, meaning that the file may be renamed, moved, deleted, written to, or locked by someone else between the write operations.
Think of all the possible reasons why fopen (and related) might fail when you call it even once: the file doesn't exist, your account doesn't have permission to access or create it, another program is using the file exclusively, etc.
If you are repeatedly opening and closing the file for every write operation, this chance of failure increases quite a bit.
Also, there is an overhead associated with acquiring and releasing resources (e.g. files). You'd observe it more if you were acquiring and releasing write access to the file every single time you needed to write.
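For comparison, here is a minimal sketch of the open-once approach: one fopen, many writes, one fclose. The function name write_names and its parameters are assumptions for the example, not something from the question.

```c
#include <stdio.h>

/* Hypothetical helper: write each name on its own line.
   Returns 0 if every name was written, -1 otherwise. */
int write_names(const char *path, const char *const names[], size_t count)
{
    FILE *fp = fopen(path, "w");   /* opened once, so it can fail only once */
    if (fp == NULL)
        return -1;

    int ok = 1;
    for (size_t i = 0; i < count && ok; i++) {
        /* fprintf returns a negative value on an output error */
        if (fprintf(fp, "%s\n", names[i]) < 0)
            ok = 0;
    }

    /* fclose flushes the stdio buffer; check it as well */
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

The stdio buffer batches the small writes, so the number of actual system calls is typically far lower than the number of names.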

confused about using ftell() to check if the file is empty

I want to add a structure to a binary file, but first I need to check whether the file has previous data stored in it. If not, I can add the structure; otherwise I'll have to read all the stored data and insert the structure in its correct place. But I got confused about how to check if the file is empty. I thought about trying something like this:
long size = 0;
if(fp!=NULL)
{
fseek (fp, 0, SEEK_END);
size = ftell (fp);
rewind(fp);
}
if (size==0)
{
// print your error message here
}
But if the file is empty or not yet created, how can the file pointer not be NULL? What's the point of using ftell() if I can simply do something like this:
if(fp==NULL){fp=fopen("data.bin","wb");
fwrite(&record,sizeof(record),1,fp);
fclose(fp);}
I know that NULL can be returned in other cases, such as protected files, but I still can't understand how using ftell() is effective when the file pointer will always be NULL if the file is empty. Any help will be appreciated :)
i need to check whether the file has previous data stored in it
There might be no portable and robust way to do that (that file might change during the check, because other processes are using it). For example, on Unix or Linux, that file might be opened by another process writing into it while your own program is running (and that might even happen between your ftell and your rewind). And your program might be running in several processes.
You could use operating system specific functions. For POSIX (including Linux and many Unixes like MacOSX or Android), you might use stat(2) to query the file status (including its size with st_size). But after that, some other process might still write data into that file.
You might consider advisory locking, e.g. with flock(2), but then you adopt the system-wide convention that every program using that file would lock it.
You could use some database with ACID properties. Look into SQLite or into RDBMS systems like PostgreSQL or MariaDB. Or an indexed file library like gdbm.
You can continue coding with the implicit assumption (but be aware of it) that only your program is using that file, and that your program has at most one process running it.
if the file is empty [...] how can the file pointer not be NULL ?
As Increasingly Idiotic answered, fopen can fail, but it usually doesn't fail on empty files. Of course, you need to handle fopen failure (see also this). So most of the time, your fp would be valid, and your code chunk (assuming no other process is changing that file simultaneously) using ftell and rewind is an approximate way to check that the file is empty. BTW, if you read (e.g. with fread or fgetc) something from that file, that read would fail if your file was empty, so you probably don't need to check its emptiness beforehand.
A POSIX specific way to query the status (including size) of some fopen-ed file is to use fileno(3) and fstat(2) together like fstat(fileno(fp), &mystat) after having declared struct stat mystat;
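A small sketch of that fileno plus fstat combination, assuming a POSIX system. The helper name stream_size is made up for the example.

```c
#define _POSIX_C_SOURCE 200809L   /* for fileno() under strict compiler modes */
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: size in bytes of the file behind fp, or -1 on error.
   Data still sitting in fp's stdio buffer is not counted, and another
   process may change the size right after this returns. */
long long stream_size(FILE *fp)
{
    struct stat st;
    if (fp == NULL || fstat(fileno(fp), &st) != 0)
        return -1;
    return (long long)st.st_size;
}
```

A result of 0 then means "empty at the moment of the call", with the same caveat as above about other processes writing to the file.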
fopen() does not return NULL for empty files.
From the documentation:
If successful, returns a pointer to the object that controls the opened file stream ... On error, returns a null pointer.
NULL is returned only when the file could not be opened. The file could fail to open due to any number of reasons such as:
The file doesn't exist
You don't have permissions to read the file
The file cannot be opened multiple times simultaneously.
More possible reasons in this SO answer
In your case, if fp == NULL you'll need to figure out why fopen failed and handle each case accordingly. In most cases, an empty file will open just fine and return a non NULL file pointer.

Reading a file in C with File Descriptor

I want to read from a file by using its file descriptor. I can't use its name because of assignment rules.
I obtain it by calling open and it works fine. At this moment I know that I have to use the read() function in order to read from it. My problem is that read() function requires as an argument the number of bytes to read, and I want to read a whole line from the file each time, so I don't know how many bytes to read.
If i use for example fscanf(), it works fine with a simple string and I take back the whole line as I want. So my question is:
Is there any function like fscanf() which can be called with file descriptor and not with a file pointer?
When you say "have to use read()" I can't tell if that's your understanding of the situation given a file descriptor from open() or a restriction on some kind of assignment.
If you have a file descriptor but you're more comfortable with fscanf() and friends, use fdopen() to get a FILE * from your fd and proceed to use stdio.
Internally it uses functions like read() into a buffer and then processes those buffers as you read them with fscanf() and friends.
What you could do is read one character at a time, until you've read the entire line, and detect a '\n'. As this is homework, I won't write it for you.
A few things to be warned of, however.
You need to check for EOF, otherwise, you might end up in an infinite loop.
You should declare a buffer, read one character at a time, and copy each character into the buffer. Not knowing what your input is, I can't suggest a size, other than to say that for a homework assignment, [256] would probably be sufficient.
You need to make sure you don't overfill your buffer in the event that a line runs over its length.
Keep reading until you find a '\n' character. Then process the line that you have created, and start the next one.

C file pointers, multiple reads on stdin

I have an existing program where a message (for example, an email, or some other kind of message) will be coming into a program on stdin.
I know stdin is a FILE* but I'm somewhat confused as to what other special characteristics it has. I'm currently trying to add a check to the program, and handle the message differently if it contains a particular line (say, the word "hello"). The problem is, I need to search through the file for that word, but I still need stdin to point to its original location later in the program. An outline of the structure is below:
Currently:
//actual message body is coming in on stdin
read_message(char type)
{
//checks and setup
if(type == 'm')
{
//when it reaches this point, nothing has touched stdin
open_and_read(); //it will read from stdin
}
//else, never open the message
}
I want to add another check, but where I have to search the message body.
Like so:
//actual message body is coming in on stdin
read_message(char type)
{
//checks and setup
//new check
if(message_contains_hello()) //some function that reads through the message looking for the word hello
{
other_functionality();
}
if(type == 'm')
{
//when it reaches this point, my new check may have modified stdin
open_and_read(); //it will read from stdin
}
//else, never open the message
}
The problem with this is that to search the message body, I have to touch the file pointer stdin. But, if I still need to open and read the message in the second if statement (if type == 'm'), stdin needs to point to the same place it was pointing at the start of the program. I tried creating a copy of the pointer but was only successful in creating a copy that would also modify stdin if modified itself.
I don't have a choice about how to pass the message - it has to stay on stdin. How can I access the actual body of a message coming in on stdin without modifying stdin itself?
Basically, how can I read from it, and then have another function be able to read from the beginning of the message as well?
The short answer is that you can't. Once you read data from standard input, it's gone.
As such, your only real choice is to save what you read, and do the later processing on that rather than reading directly from standard input. If your later processing demands reading from a file, one possibility would be to structure this as two separate programs, with one acting as a filter for the other.
In general, you can only read bytes from stdin once. There is no fseek() functionality. To solve this problem, you can read the bytes into a buffer in your program, look at the bytes, and then pass the buffer off to another function that actually does something with the rest of the data.
Depending on your program, you may need to only read some of the data on stdin, or you may need to read all of it into that buffer. Either way, you will probably have to modify the existing code in the program in some way.
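One way to do the "read it into a buffer first" part is sketched below. The name read_all and the starting capacity are assumptions for the example; the existing open_and_read() would then need to be changed to work on the buffer instead of on stdin.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read everything remaining on `in` into a malloc'd,
   NUL-terminated buffer and store the byte count in *len_out.
   Returns NULL on error. The caller frees the result. */
char *read_all(FILE *in, size_t *len_out)
{
    size_t cap = 4096, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        /* Always keep one byte spare for the terminating NUL. */
        len += fread(buf + len, 1, cap - 1 - len, in);
        if (ferror(in)) {
            free(buf);
            return NULL;
        }
        if (len < cap - 1)
            break;                /* short read without error: end of input */

        char *bigger = realloc(buf, cap * 2);
        if (bigger == NULL) {
            free(buf);
            return NULL;
        }
        buf = bigger;
        cap *= 2;
    }
    buf[len] = '\0';
    *len_out = len;
    return buf;
}
```

With the message in memory, a check like strstr(msg, "hello") can run first, and the later processing can read the same bytes again from the start. This assumes the message is text; strstr stops at the first embedded NUL byte.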
I know stdin is a FILE* but I'm somewhat confused as to what other special characteristics it has.
Well, it's opened for reading. But it's not guaranteed to be seekable, so you'll want to read in its contents entirely, then handle the resulting string (or list of strings, or whatever).
You should use and take advantage of buffering (<stdio.h> provides buffered I/O, but see setbuf).
My suggestion is to read your stdin line by line, e.g. using getline. Once you've read an entire line, you can do some minimal look-ahead inside.
Perhaps you might read more about parsing techniques.

Is it ‘safe’ to remove() open file?

I am thinking about adding the possibility of using the same filename for both the input and output file in my program, so that it will replace the input file.
As the processed file may be quite large, I think the best solution would be to first open the file, then remove it and create a new one, i.e. like this:
/* input == output in this case */
FILE *inf = fopen(input, "r");
remove(output);
FILE *outf = fopen(output, "w");
(of course, with error handling added)
I am aware that not all systems are going to allow me to remove open file and that's acceptable as long as remove() is going to fail in that case.
I am worried, though, whether there is any system which will allow me to remove that open file and then fail to read its contents.
C99 standard specifies behavior in that case as ‘implementation-defined’; SUS doesn't even mention the case.
What is your opinion/experience? Do I have to worry? Should I avoid such solutions?
EDIT: Please note this isn't supposed to be some mainline feature but rather ‘last resort’ in the case user specifies same filename as both input and output file.
EDIT: Ok, one more question then: is it possible that in this particular case the solution proposed by me is able to do more evil than just opening the output file write-only (i.e. like above but without the remove() call).
No, it's not safe. It may work on your file system, but fail on others. Or it may intermittently fail. It really depends on your operating system AND file system. For an in depth look at Solaris, see this article on file rotation.
Take a look at GNU sed's '--in-place' option. This option works by writing the output to a temporary file, and then copying over the original. This is the only safe, compatible method.
You should also consider that your program could fail at any time, due to a power outage or the process being killed. If this occurs, then your original file will be lost. Additionally, for file systems which do have reference counting, you're not saving any space over the temp file solution, as both files have to exist on disk until the input file is closed.
If the files are huge, and space is at a premium, and developer time is cheap, you may be able to open a single file for read/write, and ensure that your write pointer does not advance beyond your read pointer.
All systems that I'm aware of that let you remove open files implement some form of reference-counting for file nodes. So, removing a file removes the directory entry, but the file node itself still has one reference from open file handle. In such an implementation, removing a file obviously won't affect the ability to keep reading it, and I find it hard to imagine any other reasonable way to implement this behavior.
I've always got this to work on Linux/Unix. Never on Windows, OS/2, or (shudder) DOS. Any other platforms you are concerned about?
This behaviour actually is useful in using temporary diskspace - open the file for read/write, and immediately delete it. It gets cleaned up automatically on program exit (for any reason, including power-outage), and makes it much harder (but not impossible) for others to monitor it (/proc can give clues, if you have read access to that process).
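A sketch of that open-then-delete trick, assuming a POSIX-like system where removing an open file only removes its directory entry. The helper name open_scratch is made up, and the path is whatever scratch location the caller picks (any existing file there is truncated). Standard C also offers tmpfile(), which covers a similar need without the caller choosing a name.

```c
#include <stdio.h>

/* Hypothetical helper: open a scratch file for read/write and remove its
   name right away. Returns NULL if the file cannot be opened, or if the
   system refuses to remove an open file (as Windows typically does). */
FILE *open_scratch(const char *path)
{
    FILE *fp = fopen(path, "w+b");
    if (fp == NULL)
        return NULL;

    if (remove(path) != 0) {
        /* Could not unlink while open: close first, then clean up. */
        fclose(fp);
        remove(path);
        return NULL;
    }
    return fp;   /* storage is reclaimed once fp is closed or the process exits */
}
```

The returned stream can be written, rewound, and read back as usual, even though the name no longer exists in the directory.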
