Writing to file using setvbuf, conditionally discard buffer contents - c

I would like to write a simple API which
allows the user to open a file,
lets the user write data to the file,
tracks the write calls and sanity-checks the written data after each write call, and
prevents the data from being written to disk if it is not valid -> discard(file).
As a starting point I wrote the test program below, which opens a file in fully buffered "wb+" mode using fopen and setvbuf.
The stream is opened in fully buffered mode for the following reason:
http://www.cplusplus.com/reference/cstdio/setvbuf/
mode
Specifies a mode for file buffering.
Three special macro constants [...]:
_IOFBF Full buffering: On output, data is written once the buffer is full (or flushed). On Input, the buffer is filled when an input
operation is requested and the buffer is empty.
My test program contains comments where a validity check could be placed and where the buffer contents should be discarded.
My question is: how do I accomplish the discard(file) operation, that is, the step of getting rid of invalid buffer contents?
The idea behind this is to assemble some data in the buffer, do a validity check after each write operation or after several of them, and write the data to disk only if the data is valid.
Therefore I would need to discard the buffer if the validity check fails.
When the validity check passes, the whole buffer contents should be written to the file.
My code draft looks like the following. This is a simplified example:
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>

/* data_in_buffer_not_valid() and discard() are placeholders;
   discard() is the operation this question asks about. */
int main(void)
{
    static uint8_t buffer[10000];

    /* The following would be part of mylib_init */
    FILE *file = fopen("test", "wb+");
    if (file == NULL) {
        fprintf(stderr, "open error!\n");
        exit(-1);
    }
    /* setvbuf expects a char *, so the uint8_t array needs a cast */
    if (0 != setvbuf(file, (char *)buffer, _IOFBF, sizeof(buffer))) {
        fprintf(stderr, "Could not set buffer!\n");
        fclose(file);
        exit(-2);
    }

    /* The following would be part of mylib_write_data.
       Each write and check resembles one func call */
    // Pretend the user writes some data into the file
    // ...
    // fwrite(x)
    if (data_in_buffer_not_valid(buffer)) {
        discard(file);
    }
    // ...
    // fwrite(y)
    //
    if (data_in_buffer_not_valid(buffer)) {
        discard(file);
    }
    // ...
    // fwrite(z)
    // ...

    // The following would be part of mylib_exit
    // Cleanup stuff
    fclose(file);
    return 0;
}

If you want something like a "scratch" temporary file that you write your data into and retrieve later, the portable interface is tmpfile(); it is an interface created just for that. Write to that file, and when you're ready, rewind it and read from it block by block into another file.
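The tmpfile() approach described above can be sketched as follows. This is a minimal sketch, not a definitive implementation: the helper name commit_scratch and the 4096-byte block size are assumptions made for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: copy everything written to the scratch stream
 * (obtained from tmpfile()) into the file at `path`.
 * Returns 0 on success, -1 on error. */
int commit_scratch(FILE *scratch, const char *path)
{
    char block[4096];
    size_t n;
    FILE *out;

    if (fflush(scratch) != 0)
        return -1;
    rewind(scratch);                 /* go back to the start of the scratch data */

    out = fopen(path, "wb");
    if (out == NULL)
        return -1;

    /* Copy block by block, as suggested in the answer. */
    while ((n = fread(block, 1, sizeof block, scratch)) > 0) {
        if (fwrite(block, 1, n, out) != n) {
            fclose(out);
            return -1;
        }
    }
    if (ferror(scratch)) {
        fclose(out);
        return -1;
    }
    return fclose(out) == 0 ? 0 : -1;
}
```

With this sketch, the library would write user data to a stream returned by tmpfile(). If the validity check fails, closing the scratch stream discards the data; if it passes, commit_scratch(scratch, "test") copies it to the real file.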
On Linux you can use fmemopen and fopencookie to write to a buffer via FILE*; these functions are not available on Windows.
I would also strongly consider just creating your own interface that stores the result in memory. Writing an interface like struct mystream; mystream_init(struct mystream *); mystream_printf(struct mystream *, const char *fmt, ...); etc. is one of the tasks you sometimes do in C when fopencookie is not available. Also consider writing the interface for storing data so that, instead of calling fwrite, you call a function that checks the data, writes it and processes it along the way.
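A minimal sketch of such a home-grown interface is shown below. The struct layout and the mystream_write, mystream_discard, mystream_commit and mystream_free functions are assumptions for illustration; only the mystream name and mystream_init come from the answer.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct mystream {
    char  *data;  /* pending bytes that have not been committed yet */
    size_t len;   /* number of pending bytes */
    size_t cap;   /* allocated size of data */
};

void mystream_init(struct mystream *s)
{
    s->data = NULL;
    s->len = 0;
    s->cap = 0;
}

/* Append n bytes to the pending data. Returns 0 on success, -1 on failure. */
int mystream_write(struct mystream *s, const void *src, size_t n)
{
    if (n == 0)
        return 0;
    if (n > s->cap - s->len) {
        size_t cap = s->cap ? s->cap : 64;
        while (cap - s->len < n)
            cap *= 2;                /* sketch: no overflow check */
        char *p = realloc(s->data, cap);
        if (p == NULL)
            return -1;
        s->data = p;
        s->cap = cap;
    }
    memcpy(s->data + s->len, src, n);
    s->len += n;
    return 0;
}

/* Drop the pending bytes, e.g. after a failed validity check. */
void mystream_discard(struct mystream *s)
{
    s->len = 0;
}

/* Write the pending bytes to out and clear them. Returns 0 on success. */
int mystream_commit(struct mystream *s, FILE *out)
{
    if (s->len > 0 && fwrite(s->data, 1, s->len, out) != s->len)
        return -1;
    s->len = 0;
    return 0;
}

void mystream_free(struct mystream *s)
{
    free(s->data);
    mystream_init(s);
}
```

Because the pending data lives in memory owned by the library, discarding it is just a reset of len, and nothing reaches the FILE until mystream_commit is called.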
As for setvbuf, note the standard. From C11 7.21.3p3:
When a stream is unbuffered, characters are intended to appear from the source or at the destination as soon as possible. Otherwise characters may be accumulated and transmitted to or from the host environment as a block. When a stream is fully buffered, [...]. When a stream is line buffered, [...] Support for these characteristics is implementation-defined, and may be affected via the setbuf and setvbuf functions.
So these buffering modes may not be supported at all. And from C11 7.21.5.6:
The setvbuf function may be used only after the stream pointed to by stream has been associated with an open file and before any other operation (other than an unsuccessful call to setvbuf) is performed on the stream. [...] The contents of the array at any time are indeterminate.
You cannot rely on the contents of the buffer at any point. Do not expect to find your data there.

Related

fileno for closed file

I have a function like this which aims to read a file:
int foo(FILE* f)
I want to use flock in order to prevent TOCTTOU. flock requires a file descriptor as an integer. I can get this using fileno(file). The implementation of foo therefore might look like this:
int foo(FILE* f) {
    if(!f) return -1;
    int fd = fileno(f);
    if(fd < 0) return -1;
    flock(fd, LOCK_EX);
    //do all the reading stuff and so on.
}
However, the evil user might do something like this:
FILE* test;
test = fopen("someexistingfile.txt", "r");
fclose(test);
foo(test);
Then I have a problem because fileno will do invalid reads according to valgrind because it assumes that the file is open.
Any ideas on how to check whether the file is closed?
C11 n1570 7.21.3p4
A file may be disassociated from a controlling stream by closing the file. Output streams are flushed (any unwritten buffer contents are transmitted to the host environment) before the stream is disassociated from the file. The value of a pointer to a FILE object is indeterminate after the associated file is closed (including the standard text streams). Whether a file of zero length (on which no characters have been written by an output stream) actually exists is implementation-defined.
After fclose the use of the value of a FILE * in library functions leads to undefined behaviour. The value of the pointer cannot be used safely for anything at all until reassigned.
In other words, you really cannot do anything at all to discern whether the FILE * value you've been given refers to a valid open file or not... well, except for testing against NULL: if the value of the pointer is NULL, it certainly cannot point to an open stream.
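Since NULL is the only value that can be tested, one convention is for callers to reset their pointer when closing. The helper below is a hypothetical sketch (the name close_and_clear is an assumption). It only protects the one variable passed in; any copies of the pointer remain indeterminate after the close.

```c
#include <stdio.h>

/* Hypothetical helper: close *fp if it is non-NULL and reset it to NULL,
 * so that a later NULL check can catch reuse of this particular variable.
 * Returns the result of fclose(), or 0 if there was nothing to close. */
int close_and_clear(FILE **fp)
{
    int rc = 0;
    if (fp != NULL && *fp != NULL) {
        rc = fclose(*fp);
        *fp = NULL;
    }
    return rc;
}
```

A caller that uses close_and_clear(&test) instead of fclose(test) would then hit the if(!f) return -1; check in foo.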

Logging fails if program stopped using ctrl + c

I have written a C program in which I am logging the results to a file. There is an infinite while loop - this is a requirement. To debug the code, I need to look at the log file, but as the program is running, I don't see anything written there. Closing the program forcibly using ctrl+C does not help either. I see nothing written on the file.
I am using the simple fopen and fprintf functions to open the file in write mode and write to it.
FILE *fp = fopen("filename.txt", "w");
fprintf(fp, "this wants itself to be written the moment this statement is executed\n");
PS: There is no bug in the code. If I put a terminating condition in while loop and program exits gracefully, I do see things written in the log file.
A difference between printing to a console and printing to a file is that streams are line buffered by default when attached to the console, but block buffered when attached to a file. Change your code to:
FILE *fp = fopen("filename.txt", "w");
setvbuf(fp,0,_IOLBF,0);
fprintf(fp, "this wants itself to be written the moment this statement is executed\n");
and your output will be line buffered even though the stream is attached to a file. You can also do unbuffered streams.
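The unbuffered variant mentioned above could look like the following sketch. The function name open_unbuffered_log is an assumption for illustration, and how promptly data reaches the file is still up to the implementation.

```c
#include <stdio.h>

/* Open a log file for writing and request unbuffered output with _IONBF.
 * Returns NULL if the file cannot be opened or the request is refused. */
FILE *open_unbuffered_log(const char *path)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return NULL;
    /* setvbuf must be called before any other operation on the stream. */
    if (setvbuf(fp, NULL, _IONBF, 0) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}
```

Unbuffered output costs one write per fprintf call, so it is usually reserved for low-volume logs where losing the last lines on Ctrl+C matters.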
[EDIT: ]
Ref C11 7.21.5.6:
Synopsis
#include <stdio.h>
int setvbuf(FILE * restrict stream,
char * restrict buf,
int mode, size_t size);
Description
The setvbuf function may be used only after the stream pointed to by
stream has been associated with an open file and before any other
operation (other than an unsuccessful call to setvbuf) is performed on
the stream. The argument mode determines how stream will be buffered,
as follows: _IOFBF causes input/output to be fully buffered; _IOLBF
causes input/output to be line buffered; _IONBF causes input/output to
be unbuffered. If buf is not a null pointer, the array it points to
may be used instead of a buffer allocated by the setvbuf function
and the argument size specifies the size of the array; otherwise, size
may determine the size of a buffer allocated by the setvbuf function.
The contents of the array at any time are indeterminate.
Returns
The setvbuf function returns zero on success, or nonzero if an invalid
value is given for mode or if the request cannot be honored.
You should look at the fopen() function: if you open a file in "w" mode and the file already exists, it is cleared before anything is written. I think you should use "a+" mode to append data at the end.

How create a FILE* from char * without create a temporary file

I'm creating a program that uses lex and yacc to parse text, and I need to parse content that comes from various sources. I don't want to use stdin; if I use FILE *yyin to specify the input, I can change the source. I need to be able to call a function from the parser library (generated from the lex and yacc files) to parse this content and receive a result.
/**
 * I don't know whether this is possible: receive a char * and return a FILE *.
 */
FILE *function_parse_to_file(char *text) {
    FILE *fp = NULL;
    /**
     * Is it really necessary to create a temporary file containing text?
     */
    return fp;
}
/**
 * I need to call this from another library or application.
 */
char *function_parse_from_lex(char *text) {
    yyin = function_parse_to_file(text);
    init();
    yyparse();
    fclose(yyin);
}
On a POSIX-2008-compliant system (and on Linux), you can use fmemopen to get a FILE* handle on an in-memory buffer.
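A minimal sketch of the question's function_parse_to_file using fmemopen, assuming a POSIX.1-2008 system. Note that some implementations reject a zero-length buffer, so an empty string may need special handling.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>

/* Return a read-only stream over the bytes of `text`.
 * The stream reads directly from `text`, so the string must stay valid
 * and unmodified until the stream is closed. Returns NULL on failure. */
FILE *function_parse_to_file(char *text)
{
    return fmemopen(text, strlen(text), "r");
}
```

The returned stream can be assigned to yyin and closed with fclose after yyparse returns; no temporary file is involved.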
You can define the YY_INPUT macro with three arguments: buffer, result, max_size, where:
buffer - the buffer into which the input data is to be read,
result - output, set to the number of bytes read,
max_size - the size of the buffer.
Just include the macro definition in your Lex file, via a header or inline, and it will be used instead of fread(...).
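The macro described above might be sketched like this in plain C. The names parse_text, parse_pos, set_parse_text and read_parse_text are hypothetical; in a real scanner the definitions would go in the definitions section of the Lex file, and a result of 0 is what signals end of input.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical input state; set it before calling yyparse(). */
static const char *parse_text = "";
static size_t parse_pos = 0;

void set_parse_text(const char *text)
{
    parse_text = text;
    parse_pos = 0;
}

/* Copy at most max_size bytes of the remaining text into buf.
 * Returns the number of bytes copied; 0 means end of input. */
size_t read_parse_text(char *buf, size_t max_size)
{
    size_t left = strlen(parse_text + parse_pos);
    size_t n = left < max_size ? left : max_size;
    memcpy(buf, parse_text + parse_pos, n);
    parse_pos += n;
    return n;
}

/* Replacement reader: fills buf and stores the byte count in result. */
#define YY_INPUT(buf, result, max_size) \
    ((result) = read_parse_text((buf), (size_t)(max_size)))
```

This keeps the scanner reading straight from the string, so neither a FILE * nor a temporary file is needed.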
You really haven't stated your question clearly, but I am going to assume you want to create a FILE * which will return the contents of the string pointed to by the char * when data is read from it. You could simply create a pipe and then invoke fdopen on the read side. It is a bit dangerous to just write the data into the write side, since the write might block and lead to a deadlock, but you can certainly fork a child and have the child write the data into the pipe.
On the other hand, there's no real reason not to create a temporary file. Assuming you are going to unlink the file after you read it, there's very little chance of the data ever going to disk (the OS will keep it in memory). If you're really concerned, you can use a path on a RAM disk.
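The pipe-and-fork variant described in this answer can be sketched as follows, assuming a POSIX system. The function name stream_from_string is hypothetical, and error handling in the child is kept to a minimum.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Return a read-only stream that yields the bytes of `text`, fed by a
 * child process so the parent cannot deadlock on a full pipe.
 * The child's pid is stored in *child; the caller should fclose() the
 * stream and then reap the child with waitpid(). */
FILE *stream_from_string(const char *text, pid_t *child)
{
    int fd[2];
    if (pipe(fd) != 0)
        return NULL;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd[0]);
        close(fd[1]);
        return NULL;
    }
    if (pid == 0) {                  /* child: write the text, then exit */
        close(fd[0]);
        size_t left = strlen(text);
        while (left > 0) {
            ssize_t n = write(fd[1], text, left);
            if (n <= 0)
                break;               /* sketch: give up on any error */
            text += n;
            left -= (size_t)n;
        }
        close(fd[1]);
        _exit(0);
    }

    close(fd[1]);                    /* parent keeps only the read end */
    FILE *fp = fdopen(fd[0], "r");
    if (fp == NULL)
        close(fd[0]);
    *child = pid;
    return fp;
}
```

The child exits with _exit so that it does not flush stdio buffers inherited from the parent.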

Multiple processes accessing the same file

Is it alright for multiple processes to access (write) to the same file at the same time? Using the following code, it seems to work, but I have my doubts.
The use case in this instance is an executable that gets called every time an email is received and logs its output to a central file.
if (freopen(console_logfile, "a+", stdout) == NULL || freopen(error_logfile, "a+", stderr) == NULL) {
perror("freopen");
}
printf("Hello World!");
This is running on CentOS and compiled as C.
Using the C standard IO facility introduces a new layer of complexity; the file is modified solely via write(2)-family of system calls (or memory mappings, but that's not used in this case) -- the C standard IO wrappers may postpone writing to the file for a while and may not submit complete requests in one system call.
The write(2) call itself should behave well:
[...] If the file was
open(2)ed with O_APPEND, the file offset is first set to the
end of the file before writing. The adjustment of the file
offset and the write operation are performed as an atomic
step.
POSIX requires that a read(2) which can be proved to occur
after a write() has returned returns the new data. Note that
not all file systems are POSIX conforming.
Thus your underlying write(2) calls will behave properly.
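A minimal sketch of relying on that guarantee directly: open the log with O_APPEND and emit each record with a single write(2) call. The function name append_record and the 0644 mode are assumptions for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Append one complete log record to `path` with a single write(2) call.
 * Returns 0 on success, -1 on error or on a short write. */
int append_record(const char *path, const char *record)
{
    int fd = open(path, O_WRONLY | O_APPEND | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    size_t len = strlen(record);
    ssize_t n = write(fd, record, len);   /* offset move + write are atomic */
    int rc = (n == (ssize_t)len) ? 0 : -1;

    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

Formatting the whole record into one buffer first (for example with snprintf) and then calling append_record keeps each process's record in one piece.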
For the higher-level C standard IO streams, you'll also need to take care of the buffering. The setvbuf(3) function can be used to request unbuffered output, line-buffered output, or block-buffered output. The default behavior changes from stream to stream -- if standard output and standard error are writing to the terminal, then they are line-buffered and unbuffered by default, respectively. Otherwise, block-buffering is the default.
You might wish to manually select line-buffered if your data is naturally line-oriented, to prevent interleaved data. If your data is not line-oriented, you might wish to use un-buffered or leave it block-buffered but manually flush the data whenever you've accumulated a single "unit" of output.
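The "flush after each unit" option could be sketched like this. The helper name write_unit is hypothetical; also note that if a unit is larger than the stream's buffer, stdio may still split it across several write(2) calls.

```c
#include <stdio.h>

/* Write one complete unit of output and flush it immediately, so the
 * stdio buffer never holds a partial unit between calls.
 * Returns 0 on success, -1 on error. */
int write_unit(FILE *fp, const void *unit, size_t len)
{
    if (fwrite(unit, 1, len, fp) != len)
        return -1;
    return fflush(fp) == 0 ? 0 : -1;
}
```

This leaves the stream block-buffered, so many small fprintf-style pieces can still be assembled cheaply before the flush.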
If you are writing more than BUFSIZ bytes at a time, your writes might become interleaved. The setvbuf(3) function can help prevent the interleaving.
It might be premature to talk about performance, but line-buffering is going to be slower than block buffering. If you're logging near the speed of the disk, you might wish to take another approach entirely to ensure your writes aren't interleaved.
This answer was incorrect. It does work:
So the race condition would be:
process 1 opens it for append, then
later process 2 opens it for append, then
later still 1 writes and closes, then
finally 2 writes and closes.
I'd be impressed if that 'worked' because it isn't clear to me what
working should mean. I assume 'working' means all of the bytes written
by the two processes are in the log file? I'd expect that they both
write starting at the same byte offset, so one will replace the other's
bytes. It will all be okay up to and including step 3, and only show up
as a problem at step 4. Seems like an easy test to write: open getchar
... write close.
Is it critical that they can have the file open simultaneously? A
more obvious solution if the write is quick, is to open exclusive.
For a quick check on your system, try:
/* write the first command line argument to a file called foo
* stackoverflow topic 9880935
*/
#include <stdio.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
int main (int argc, const char * argv[]) {
    if (argc <2) {
        fprintf(stderr, "Error: need some text to write to the file Foo\n");
        exit(1);
    }
    FILE* fp = freopen("foo", "a+", stdout);
    if (fp == NULL) {
        perror("Error failed to open file\n");
        exit(1);
    }
    fprintf(stderr, "Press a key to continue\n");
    (void) getchar(); /* Yes, I really mean to ignore the character */
    if (printf("%s\n", argv[1]) < 0) {
        perror("Error failed to write to file: ");
        exit(1);
    }
    fclose(fp);
    return 0;
}

Creating a FILE * stream that results in a string

I'm looking for a way to pass in a FILE * to some function so that the function can write to it with fprintf. This is easy if I want the output to turn up in an actual file on disk, say. But what I'd like instead is to get all the output as a string (char *). The kind of API I'd like is:
/** Create a FILE object that will direct writes into an in-memory buffer. */
FILE *open_string_buffer(void);
/** Get the combined string contents of a FILE created with open_string_buffer
(result will be allocated using malloc). */
char *get_string_buffer(FILE *buf);
/* Sample usage. */
FILE *buf;
buf = open_string_buffer();
do_some_stuff(buf); /* do_some_stuff will use fprintf to write to buf */
char *str = get_string_buffer(buf);
fclose(buf);
free(str);
The glibc headers seem to indicate that a FILE can be set up with hook functions to perform the actual reading and writing. In my case I think I want the write hook to append a copy of the string to a linked list, and for there to be a get_string_buffer function that figures out the total length of the list, allocates memory for it, and then copies each item into it in the correct place.
I'm aiming for something that can be passed to a function such as do_some_stuff without that function needing to know anything other than that it's got a FILE * it can write to.
Is there an existing implementation of something like this? It seems like a useful and C-friendly thing to do -- assuming I'm right about the FILE extensibility.
If portability is not important for you, you can take a look at fmemopen and open_memstream. They started out as GNU extensions available only on glibc systems, but both are now part of POSIX.1-2008 (fmemopen and open_memstream).
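A minimal sketch of the requested API built on open_memstream, assuming a POSIX.1-2008 system. The do_some_stuff body here is a stand-in for the question's function, and capture_output is a hypothetical name.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the question's do_some_stuff(): it only knows it has a FILE *. */
static void do_some_stuff(FILE *out)
{
    fprintf(out, "value=%d\n", 42);
}

/* Run do_some_stuff() against an in-memory stream and return everything it
 * wrote as a malloc'd string (the caller frees it), or NULL on failure. */
char *capture_output(void)
{
    char *str = NULL;
    size_t len = 0;
    FILE *buf = open_memstream(&str, &len);
    if (buf == NULL)
        return NULL;

    do_some_stuff(buf);

    /* str and len are only guaranteed to be up to date after fflush/fclose. */
    if (fclose(buf) != 0) {
        free(str);
        return NULL;
    }
    return str;
}
```

Unlike the API in the question, the string is handed over when the stream is closed rather than through a separate get_string_buffer call.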
I'm not sure if it's possible to non-portably extend FILE objects, but if you are looking for something a little bit more POSIX friendly, you can use pipe and fdopen.
It's not exactly the same as having a FILE* that returns bytes from a buffer, but it certainly is a FILE* with programmatically determined contents.
int fd[2];
FILE *in_pipe;
if (pipe(fd))
{
/* TODO: handle error */
}
in_pipe = fdopen(fd[0], "r");
if (!in_pipe)
{
/* TODO: handle error */
}
From there you will want to write your buffer into fd[1] using write(). Careful with this step, though, because write() may block if the pipe's buffer is full (i.e. someone needs to read the other end), and you might get EINTR if your process gets a signal while writing. Also watch out for SIGPIPE, which happens when the other end closes the pipe. Maybe for your use you might want to do the write of the buffer in a separate thread to avoid blocking and make sure you handle SIGPIPE.
Of course, this won't create a seekable FILE*...
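The write step described above, with retries for EINTR and short writes, might look like the following sketch. The name write_all is an assumption. It still blocks while the pipe is full, and SIGPIPE has to be dealt with separately, for example by ignoring the signal so that write fails with EPIPE instead.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write all len bytes of data to fd, retrying after EINTR and after
 * short writes. Returns 0 on success, -1 on error (errno set by write). */
int write_all(int fd, const char *data, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, data, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;            /* interrupted by a signal: try again */
            return -1;
        }
        data += n;
        len -= (size_t)n;
    }
    return 0;
}
```

For small buffers that fit in the pipe, calling write_all(fd[1], text, strlen(text)) followed by close(fd[1]) is enough for the reader to see the data and then end of file.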
I'm not sure I understand why you want to mess with FILE *. Couldn't you simply write to a file and then load it into a string?
char *get_file_in_buf(char *filename) {
char *buffer;
... get file size with fseek or fstat ...
... allocate buffer ...
... read buffer from file ...
return buffer;
}
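One way to fill in the outline above, as a sketch: it uses the fseek/ftell option, assumes a regular seekable file, and takes a const char * for the file name.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read the whole file into a malloc'd, NUL-terminated buffer.
 * The caller frees the result. Returns NULL on any error. */
char *get_file_in_buf(const char *filename)
{
    FILE *fp = fopen(filename, "rb");
    if (fp == NULL)
        return NULL;

    char *buffer = NULL;
    long size = 0;

    /* Get the file size with fseek/ftell, then go back to the start. */
    if (fseek(fp, 0, SEEK_END) == 0 && (size = ftell(fp)) >= 0
            && fseek(fp, 0, SEEK_SET) == 0) {
        buffer = malloc((size_t)size + 1);
        if (buffer != NULL) {
            size_t got = fread(buffer, 1, (size_t)size, fp);
            if (got == (size_t)size) {
                buffer[got] = '\0';
            } else {                 /* short read: treat as failure */
                free(buffer);
                buffer = NULL;
            }
        }
    }
    fclose(fp);
    return buffer;
}
```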
If you only want to "write" formatted text into a string, another option could be to handle an extensible buffer using snprintf() (see the answers to this SO question for a suggestion on how to handle this: Resuming [vf]?nprintf after reaching the limit).
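For the formatted-text case, a simple sketch measures the output with a first vsnprintf pass and then allocates exactly enough. The name format_alloc is hypothetical, and this is a simpler technique than the resumable approach discussed in the linked question.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Format into a freshly malloc'd string whose size is found by a first
 * measuring pass. Returns NULL on failure; the caller frees the result. */
char *format_alloc(const char *fmt, ...)
{
    va_list ap, ap2;
    va_start(ap, fmt);
    va_copy(ap2, ap);                /* the arguments are consumed twice */

    int need = vsnprintf(NULL, 0, fmt, ap);   /* length without the NUL */
    va_end(ap);

    char *out = NULL;
    if (need >= 0 && (out = malloc((size_t)need + 1)) != NULL)
        vsnprintf(out, (size_t)need + 1, fmt, ap2);
    va_end(ap2);
    return out;
}
```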
If, instead, you want to create a type that can be passed transparently to any function taking a FILE * to make them act on string buffers, it's a much more complex matter ...
