Creating a FILE * stream that results in a string - c

I'm looking for a way to pass in a FILE * to some function so that the function can write to it with fprintf. This is easy if I want the output to turn up in an actual file on disk, say. But what I'd like instead is to get all the output as a string (char *). The kind of API I'd like is:
/** Create a FILE object that will direct writes into an in-memory buffer. */
FILE *open_string_buffer(void);
/** Get the combined string contents of a FILE created with open_string_buffer
(result will be allocated using malloc). */
char *get_string_buffer(FILE *buf);
/* Sample usage. */
FILE *buf;
buf = open_string_buffer();
do_some_stuff(buf); /* do_some_stuff will use fprintf to write to buf */
char *str = get_string_buffer(buf);
fclose(buf);
free(str);
The glibc headers seem to indicate that a FILE can be set up with hook functions to perform the actual reading and writing. In my case I think I want the write hook to append a copy of the string to a linked list, and for there to be a get_string_buffer function that figures out the total length of the list, allocates memory for it, and then copies each item into it in the correct place.
I'm aiming for something that can be passed to a function such as do_some_stuff without that function needing to know anything other than that it's got a FILE * it can write to.
Is there an existing implementation of something like this? It seems like a useful and C-friendly thing to do -- assuming I'm right about the FILE extensibility.

If portability is not a primary concern, take a look at fmemopen and open_memstream. They started out as GNU extensions in glibc, but both are now part of POSIX.1-2008 (fmemopen and open_memstream), so they are also available on other systems that implement that standard.
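A minimal sketch of the open_memstream route, assuming a POSIX.1-2008 system. The do_some_stuff body and the capture_output name are made up for illustration; only open_memstream, fprintf, fputs, and fclose are real library calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the question's do_some_stuff: it only knows it has a FILE *. */
void do_some_stuff(FILE *out)
{
    fprintf(out, "answer=%d", 42);
    fputs(", done", out);
}

/* Runs do_some_stuff against an in-memory stream and returns the collected
 * text as a malloc'd string (the caller frees it), or NULL on failure. */
char *capture_output(void)
{
    char *str = NULL;
    size_t len = 0;
    FILE *buf = open_memstream(&str, &len);

    if (buf == NULL)
        return NULL;
    do_some_stuff(buf);
    /* str and len are only guaranteed to be current after fflush or fclose. */
    if (fclose(buf) != 0) {
        free(str);
        return NULL;
    }
    return str;
}
```

Unlike the API proposed in the question, the string is obtained by closing (or flushing) the stream rather than through a separate getter, and it is still released with free.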

I'm not sure if it's possible to non-portably extend FILE objects, but if you are looking for something a little bit more POSIX friendly, you can use pipe and fdopen.
It's not exactly the same as having a FILE* that returns bytes from a buffer, but it certainly is a FILE* with programmatically determined contents.
int fd[2];
FILE *in_pipe;
if (pipe(fd))
{
/* TODO: handle error */
}
in_pipe = fdopen(fd[0], "r");
if (!in_pipe)
{
/* TODO: handle error */
}
From there you will want to write your buffer into fd[1] using write(). Careful with this step, though, because write() may block if the pipe's buffer is full (i.e. someone needs to read the other end), and you might get EINTR if your process gets a signal while writing. Also watch out for SIGPIPE, which happens when the other end closes the pipe. Maybe for your use you might want to do the write of the buffer in a separate thread to avoid blocking and make sure you handle SIGPIPE.
Of course, this won't create a seekable FILE*...
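For small payloads the blocking problem can be sidestepped by writing everything before anyone reads. The sketch below is one way to do that under stated assumptions: small_buffer_to_file is a hypothetical name, and the size limit assumes that a write of at most _POSIX_PIPE_BUF bytes into an empty pipe does not block.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <unistd.h>

/* Returns a read-only FILE * that yields the len bytes at data, or NULL on
 * failure. Refuses payloads larger than _POSIX_PIPE_BUF, because all data is
 * written before anything is read and a full pipe would block forever. */
FILE *small_buffer_to_file(const char *data, size_t len)
{
    int fd[2];
    size_t off = 0;
    FILE *in_pipe;

    if (len > _POSIX_PIPE_BUF || pipe(fd) != 0)
        return NULL;

    while (off < len) {
        ssize_t n = write(fd[1], data + off, len - off);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            close(fd[0]);
            close(fd[1]);
            return NULL;
        }
        off += (size_t)n;
    }
    close(fd[1]);                   /* the reader sees EOF after the data */

    in_pipe = fdopen(fd[0], "r");
    if (in_pipe == NULL)
        close(fd[0]);
    return in_pipe;
}
```

SIGPIPE cannot occur here because the read end stays open in the same process while the data is written. Larger buffers still need a separate writer thread or process, as described above.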

I'm not sure I understand why you want to mess around with FILE *. Couldn't you simply write to a file and then load it into a string?
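#include <stdio.h>
#include <stdlib.h>
char *get_file_in_buf(const char *filename) {
    char *buffer = NULL;
    long size;
    FILE *fp = fopen(filename, "rb");
    if (!fp) return NULL;
    /* get the file size with fseek/ftell, allocate the buffer, read the file into it */
    if (fseek(fp, 0, SEEK_END) == 0 && (size = ftell(fp)) >= 0 && fseek(fp, 0, SEEK_SET) == 0
            && (buffer = malloc((size_t)size + 1)) != NULL)
        buffer[fread(buffer, 1, (size_t)size, fp)] = '\0';
    fclose(fp);
    return buffer; /* NULL on failure; the caller frees it */
}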
If you only want to "write" formatted text into a string, another option could be to handle an extensible buffer using snprintf() (see the answers to this SO question for a suggestion on how to handle this: Resuming [vf]?nprintf after reaching the limit).
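A rough sketch of such an extensible buffer, assuming a C99 vsnprintf (which reports the length it would have written when given a NULL buffer of size 0). The struct strbuf type and the strbuf_printf name are hypothetical, not from any library.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical growable string; initialise it with {NULL, 0, 0}. */
struct strbuf {
    char *data;  /* NUL-terminated once something has been appended */
    size_t len;  /* bytes used, excluding the terminator */
    size_t cap;  /* bytes allocated */
};

/* Appends printf-style output to sb. Returns 0 on success, -1 on failure
 * (in which case the existing contents of sb are left as they were). */
int strbuf_printf(struct strbuf *sb, const char *fmt, ...)
{
    va_list ap;
    int needed;

    /* First pass: ask vsnprintf how many characters the output needs. */
    va_start(ap, fmt);
    needed = vsnprintf(NULL, 0, fmt, ap);
    va_end(ap);
    if (needed < 0)
        return -1;

    /* Grow the buffer geometrically until the new text and its NUL fit. */
    if (sb->len + (size_t)needed + 1 > sb->cap) {
        size_t new_cap = sb->cap ? sb->cap * 2 : 64;
        char *p;
        while (new_cap < sb->len + (size_t)needed + 1)
            new_cap *= 2;
        p = realloc(sb->data, new_cap);
        if (p == NULL)
            return -1;
        sb->data = p;
        sb->cap = new_cap;
    }

    /* Second pass: format directly at the end of the existing text. */
    va_start(ap, fmt);
    vsnprintf(sb->data + sb->len, sb->cap - sb->len, fmt, ap);
    va_end(ap);
    sb->len += (size_t)needed;
    return 0;
}
```

The catch is the one the answer points out: this only helps code that can be changed to call strbuf_printf, not existing functions that insist on a FILE *.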
If, instead, you want to create a type that can be passed transparently to any function taking a FILE * to make them act on string buffers, it's a much more complex matter ...

Related

Writing to file using setvbuf, conditionally discard buffer contents

I would like to write a simple API which:
allows the user to open a file,
lets the user write data to the file,
tracks the write calls and sanity-checks the written data after each write call,
prevents the data from being written to disk if it is not valid -> discard(file).
As a starting point I wrote the test program below, which opens a file in "wb+" mode using fopen and makes the stream fully buffered with setvbuf.
The stream is opened in fully buffered mode for the following reason:
http://www.cplusplus.com/reference/cstdio/setvbuf/
mode
Specifies a mode for file buffering.
Three special macro constants [...]:
_IOFBF Full buffering: On output, data is written once the buffer is full (or flushed). On Input, the buffer is filled when an input operation is requested and the buffer is empty.
My test program contains comments where a validity check could be placed and where the buffer contents should be discarded.
My question is: how do I accomplish the discard(file) operation, that is, the step of getting rid of invalid buffer contents?
The idea is to assemble some data in the buffer, run a validity check after each write operation (or after several), and write the data to disk only if it is valid.
Therefore I need to discard the buffer if the validity check fails.
When the validity check passes, the whole buffer contents should be written to the file.
My code draft looks like the following. This is a simplified example:
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
int main(void)
{
static uint8_t buffer[10000];
/* The following would be part of mylib_init */
FILE *file = fopen("test", "wb+");
if (file == NULL){
fprintf(stderr, "open error!\n");
exit(-1);
}
if ( 0 != setvbuf(file , (char *)buffer, _IOFBF , sizeof(buffer) ) ){
fprintf(stderr, "Could not set buffer!\n");
fclose(file);
exit (-2);
}
/* The following would be part of mylib_write_data.
Each write and check resembles one func call */
// Pretend the user writes some data into the file
// ...
// fwrite(x)
if (data_in_buffer_not_valid(buffer)){
discard(file);
}
// ...
// fwrite(y)
//
if (data_in_buffer_not_valid(buffer)){
discard(file);
}
// ...
// fwrite(z)
// ...
// The following would be part of mylib_exit
// Cleanup stuff
fclose(file);
return 0;
}
If you want a "scratch" temporary file that you can write your data into and retrieve later, the portable interface is tmpfile(), which was created for exactly that. Write to that file and, when you are ready, rewind it and read from it block by block into another file.
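As an illustration, the "commit" half of that idea might look like the sketch below. The commit_staged name is an assumption; the user's writes go to a stream obtained from tmpfile(), and only data that passed validation is copied to the real file.

```c
#include <stdio.h>

/* Copies everything written so far to staged (a stream from tmpfile())
 * into dest. Returns 0 on success, -1 on an I/O error. */
int commit_staged(FILE *staged, FILE *dest)
{
    char block[4096];
    size_t n;

    /* Flush pending writes and go back to the start before reading. */
    if (fflush(staged) != 0 || fseek(staged, 0L, SEEK_SET) != 0)
        return -1;

    while ((n = fread(block, 1, sizeof block, staged)) > 0) {
        if (fwrite(block, 1, n, dest) != n)
            return -1;
    }
    return ferror(staged) ? -1 : 0;
}
```

With this layout, discarding invalid data is simply fclose(staged) without calling commit_staged, since a file created by tmpfile is removed when it is closed.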
On Linux you may use fmemopen and fopencookie to write to a buffer via FILE*; these functions are not available on Windows.
I would also strongly consider just creating your own interface that stores the result in memory. Writing an interface like struct mystream; mystream_init(struct mystream *); mystream_printf(struct mystream *, const char *fmt, ...); etc. is one of those tasks you sometimes end up doing in C when fopencookie is not available. Also consider designing the interface around storing data, so that instead of calling fwrite you call a function that checks the data, processes it, and writes it along the way.
As for setvbuf, note the standard. From C11 7.21.3p3:
When a stream is unbuffered, characters are intended to appear from the source or at the destination as soon as possible. Otherwise characters may be accumulated and transmitted to or from the host environment as a block. When a stream is fully buffered, [...]. When a stream is line buffered, [...] Support for these characteristics is implementation-defined, and may be affected via the setbuf and setvbuf functions.
So these buffering modes may not be supported at all. And from C11 7.21.5.6:
The setvbuf function may be used only after the stream pointed to by stream has been associated with an open file and before any other operation (other than an unsuccessful call to setvbuf) is performed on the stream. [...] The contents of the array at any time are indeterminate.
You cannot rely on what the buffer contains at any point. Do not expect to find any particular data there.

Reading from stdin after reading from file

I am trying to read each line from stdin after I finished reading from given file, or if given file name does not exist. Currently I am using below format.
while (fgets(buf, sizeof(buf), fp)!=NULL){
/* main process... */
}
while (fgets(buf, sizeof(buf), stdin)!=NULL){
/* main process... */
}
This format does work as I intended.
However, the main process is quite a chunk of code. Is there a way to shorten this so that I write the while loop only once? Thank you.
If your problem is that 'main process' consists of a lot of lines of code that you do not want to duplicate, the most straightforward solution is to make a function that implements main process.
Since the while loops are identical, save for the file pointer, you could also include the while loop in the function, with the file pointer as a parameter (as in David's remark).
Then you should add a function like this:
void process_input(FILE *input_handle) {
char buf[1024];
while (fgets(buf, sizeof(buf), input_handle) != NULL) {
/* main process... */
}
}
And your original code then should be replaced with:
process_input(fp);
process_input(stdin);
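A variation on the same idea, sketched under assumptions: put the streams in an array and loop over them, skipping any entry that is NULL because fopen failed. The process_inputs and line_handler names are hypothetical, and count_chars is only a sample stand-in for the "main process".

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical callback type: receives each chunk that fgets returns. */
typedef void (*line_handler)(const char *line, void *ctx);

/* Sample handler: adds the length of each chunk to a size_t counter. */
void count_chars(const char *line, void *ctx)
{
    *(size_t *)ctx += strlen(line);
}

/* Runs handler over every line of every non-NULL stream, in order.
 * Returns the number of fgets chunks that were handled. */
size_t process_inputs(FILE *const inputs[], size_t count,
                      line_handler handler, void *ctx)
{
    char buf[1024];
    size_t handled = 0;

    for (size_t i = 0; i < count; i++) {
        if (inputs[i] == NULL)      /* e.g. fopen failed: try the next one */
            continue;
        while (fgets(buf, sizeof(buf), inputs[i]) != NULL) {
            handler(buf, ctx);      /* the "main process" goes here */
            handled++;
        }
    }
    return handled;
}
```

For the case in the question, the array would hold fp followed by stdin, so a missing file simply falls through to standard input.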
would there be a way to shorten this, so that I can write while loop only once?
There isn't.
You can of course abstract the code into a function which takes a FILE* as a parameter, or extend the stdio interfaces yourself (example), but the long and short of it is that neither standard C nor any popular libc implementation has anything like the ARGV file handle from Perl, or anything that lets you open a list of files as a single stream.

open a temporary C FILE* for input

I have a legacy function in a library that accepts a FILE* pointer. The content I would like to parse is actually in memory, not on disk.
So I came up with the following steps to work around this issue:
the data is in memory at this point
fopen a temporary file (using tmpnam or tmpfile) on disk for writing
fclose the file
fopen the same file again for reading - guaranteed to exist
change the buffer using setvbuf(buffer, size)
do the legacy FILE* stuff
close the file
remove the temporary file
the data can be discarded
On Windows, it looks like this:
int bufferSize;
char buffer[bufferSize];
// set up the buffer here
// temporary file name
char tempName [L_tmpnam_s];
tmpnam_s(tempName, L_tmpnam_s);
// open/close/reopen
fopen_s(&fp, tempName,"wb");
fclose(fp);
freopen_s(&fp, tempName,"rb", fp);
// replace the internal buffer
setvbuf(fp, buffer, _IONBF, bufferSize);
fp->_ptr = buffer;
fp->_cnt = bufferSize;
// do the FILE* reading here
// close and remove tmp file
fclose(fp);
remove(tempName);
It works, but it is quite cumbersome. The main problems, aside from the backwardness of this approach, are:
the temporary name needs to be determined
the temporary file is actually written to disk
the temporary file needs to be removed afterwards
I'd like to keep things portable, so using Windows memory-mapped functions or boost's facilities is not an option. The problem is mainly that, while it is possible to convert a FILE* to an std::fstream, the reverse seems to be impossible, or at least not supported on C++99.
All suggestions welcome!
Update 1
Using a pipe/fdopen/setvbuf as suggested by Speed8ump, plus a bit of twiddling, seems to work. It no longer creates files on disk, nor does it consume extra memory. One step closer, except that, for some reason, setvbuf is not working as expected. Manually fixing it up is possible, but of course not portable.
// create a pipe for reading, do not allocate memory
int pipefd[2];
_pipe(pipefd, 0, _O_RDONLY | _O_BINARY);
// open the read pipe for binary reading as a file
fp = _fdopen(pipefd[0], "rb");
// try to switch the buffer ptr and size to our buffer, (no buffering)
setvbuf(fp, buffer, _IONBF, bufferSize);
// for some reason, setvbuf does not set the correct ptr/sizes
fp->_ptr = buffer;
fp->_charbuf = fp->_bufsiz = fp->_cnt = bufferSize;
Update 2
Wow. So it seems that unless I dive into the MS-specific implementation CreateNamedPipe / CreateFileMapping, POSIX portability costs us an entire memcopy (of any size!), be it to file or into a pipe. Hopefully the compiler understands that this is just a temporary and optimizes this. Hopefully.
Still, we eliminated the silly device writing intermediate. Yay!
int pipefd[2];
pipe(pipefd, bufferSize, _O_BINARY); // setting internal buffer size
FILE* in = fdopen(pipefd[0], "rb");
FILE* out = fdopen(pipefd[1], "wb");
// the actual copy
fwrite(buffer, 1, bufferSize, out);
fclose(out);
// fread(in), fseek(in), etc..
fclose(in);
You might try using a pipe and fdopen. That seems to be portable and in-memory, and you might still be able to do the setvbuf trick you are using.
Your setvbuf hack is a nice idea, but not portable. C11 (n1570):
7.21.5.6 The setvbuf function
Synopsis
#include <stdio.h>
int setvbuf(FILE * restrict stream,
char * restrict buf,
int mode, size_t size);
Description
[...] If buf is not a null pointer, the array it points to may be used instead of a buffer allocated by the setvbuf function [...] and the argument size specifies the size of the array; otherwise, size may determine the size of a buffer allocated by the setvbuf function. The contents of the array at any time are indeterminate.
There is neither a guarantee that the provided buffer is used at all, nor about what it contains at any point after the setvbuf call until the file is closed or setvbuf is called again (POSIX doesn't give more guarantees).
The easiest portable solution, I think, is to use tmpfile: fwrite the data into that file, fseek back to the beginning, and pass the FILE pointer to the function. (I'm not sure whether temporary files are guaranteed to be seekable; on my Linux system they appear to be, and I'd expect the same elsewhere.) This still requires copying in memory, but I guess it usually involves no writing of the data to disk (POSIX, unfortunately, implicitly requires a real file to exist). A file obtained by tmpfile is deleted when it is closed.
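The tmpfile route could be sketched as follows, using only standard C. The tmpfile_from_buffer name is made up for this example.

```c
#include <stdio.h>

/* Copies size bytes from data into a fresh temporary file and returns it
 * positioned at the start, ready for reading. Returns NULL on failure. */
FILE *tmpfile_from_buffer(const void *data, size_t size)
{
    FILE *fp = tmpfile();   /* opened in "wb+" mode, removed when closed */

    if (fp == NULL)
        return NULL;
    if (fwrite(data, 1, size, fp) != size || fseek(fp, 0L, SEEK_SET) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}
```

The returned stream can be handed to the legacy function as is, and a single fclose afterwards takes care of removing the file, so no temporary name has to be generated or cleaned up.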

How to create a FILE* from a char * without creating a temporary file

I'm creating a program that uses lex and yacc to parse text, but I need to parse content from various sources. I don't want to use stdin; if I use FILE *yyin to specify the input, I can change the source. I need to be able to call a function from the parser library (generated from the lex and yacc files) to parse this content and receive a result.
/**
* I don't know whether this is possible: receive a char * and return a FILE *.
*/
FILE *function_parse_to_file(char* text){
FILE *fp = NULL;
/**
* Is it really necessary to create a temporary file holding the text?
*/
return fp;
}
/**
* I need to call this from another library or application.
*/
char *function_parse_from_lex(char* text){
yyin = function_parse_to_file(text);
init();
yyparse();
fclose(yyin);
}
On a POSIX-2008-compliant system (and on Linux), you can use fmemopen to get a FILE* handle on an in-memory buffer.
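A minimal sketch of that, assuming a POSIX.1-2008 system. The string_to_file wrapper is a hypothetical helper; fmemopen itself is the real call.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>

/* Returns a read-only stream over text, or NULL on failure. The string must
 * stay valid and unmodified until the stream is closed. An empty string may
 * be rejected by some implementations. */
FILE *string_to_file(const char *text)
{
    /* fmemopen takes a non-const pointer, but mode "r" never writes to it. */
    return fmemopen((void *)text, strlen(text), "r");
}
```

In the question's code, function_parse_to_file could return the result of such a call, and the existing fclose(yyin) would then release the stream without touching the string.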
You can define the YY_INPUT macro, which takes three arguments: buffer, result, max_size, where:
buffer - the character array that the macro must fill with input data,
result - an output that the macro sets to the number of bytes read,
max_size - the maximum number of bytes that may be stored in buffer.
Just include the macro definition in your Lex file, either via a header or inline, and it will be used instead of fread(...).
You really haven't stated your question clearly, but I am going to assume you want to create a FILE * which will return the contents of the string pointed to by the char * when data is read from it. You could simply create a pipe and then invoke fdopen on the read side. It is a bit dangerous to just write the data into the write side, since the write might block and lead to a deadlock, but you can certainly fork a child and have the child write the data into the pipe.
On the other hand, there's no real reason not to create a temporary file. Assuming you unlink the file after you read it, there's very little chance of the data ever going to disk (the OS will keep it in memory). If you're really concerned, you can use a path on a RAM disk.

File IO does not appear to be reading correctly

Disclaimer: this is for an assignment. I am not asking for explicit code. Rather, I only ask for enough help that I may understand my problem and correct it myself.
I am attempting to recreate the Unix ar utility as per a homework assignment. The majority of this assignment deals with file IO in C, and other parts deal with system calls, etc..
In this instance, I intend to create a simple listing of all the files within the archive. I have not gotten far, as you may notice. The plan is relatively simple: read each file header from an archive file and print only the value held in ar_hdr.ar_name. The rest of the fields will be skipped over via fseek(), including the file data, until another file is reached, at which point the process begins again. If EOF is reached, the function simply terminates.
I have little experience with file IO, so I am already at a disadvantage with this assignment. I have done my best to research proper ways of achieving my goals, and I believe I have implemented them to the best of my ability. That said, there appears to be something wrong with my implementation. The data from the archive file does not seem to be read, or at least stored as a variable. Here's my code:
struct ar_hdr
{
char ar_name[16]; /* name */
char ar_date[12]; /* modification time */
char ar_uid[6]; /* user id */
char ar_gid[6]; /* group id */
char ar_mode[8]; /* octal file permissions */
char ar_size[10]; /* size in bytes */
};
void table()
{
FILE *stream;
char str[sizeof(struct ar_hdr)];
struct ar_hdr temp;
stream = fopen("archive.txt", "r");
if (stream == 0)
{
perror("error");
exit(0);
}
while (fgets(str, sizeof(str), stream) != NULL)
{
fscanf(stream, "%[^\t]", temp.ar_name);
printf("%s\n", temp.ar_name);
}
if (feof(stream))
{
// hit end of file
printf("End of file reached\n");
}
else
{
// other error interrupted the read
printf("Error: feed interrupted unexpectedly\n");
}
fclose(stream);
}
At this point, I only want to be able to read the data correctly. I will work on seeking the next file after that has been finished. I would like to reiterate my point, however, that I'm not asking for explicit code - I need to learn this stuff and having someone provide me with working code won't do that.
You've defined a char buffer named str to hold your data, but you then try to access the data through a separate ar_hdr structure named temp, which is never filled from str. In addition, you are reading binary data as a string, which will break because of embedded nulls.
You need to read the file as binary data and either make temp a pointer to str or read directly into temp, using something like:
ret=fread(&temp,sizeof(temp),1,stream);
(look at the doco for fread - my C is too rusty to be sure of that). Make sure you check and use the return value.
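To make the fread suggestion concrete, here is a small sketch that uses the struct exactly as the question declares it. The field_to_string and read_header_name helpers are hypothetical. A real ar archive also starts with the 8-byte "!<arch>\n" magic, and each real header ends with a 2-byte ar_fmag field that the question's struct omits, so treat this as an illustration of the reading technique rather than a complete ar parser.

```c
#include <stdio.h>
#include <string.h>

/* Same layout as the struct in the question (only char arrays, so the
 * compiler inserts no padding between the fields). */
struct ar_hdr {
    char ar_name[16];
    char ar_date[12];
    char ar_uid[6];
    char ar_gid[6];
    char ar_mode[8];
    char ar_size[10];
};

/* Copies a fixed-width, space-padded field, which is not NUL-terminated,
 * into out as a C string. out must have room for width + 1 bytes. */
void field_to_string(const char *field, size_t width, char *out)
{
    while (width > 0 && field[width - 1] == ' ')
        width--;                    /* drop the trailing padding */
    memcpy(out, field, width);
    out[width] = '\0';
}

/* Reads one header from stream and extracts its name.
 * Returns 1 on success, 0 on end of file or a short read. */
int read_header_name(FILE *stream, char name[17])
{
    struct ar_hdr hdr;

    if (fread(&hdr, sizeof hdr, 1, stream) != 1)
        return 0;
    field_to_string(hdr.ar_name, sizeof hdr.ar_name, name);
    return 1;
}
```

The important detail is that ar_name cannot be printed with %s directly, because nothing guarantees a terminating NUL inside the 16-byte field.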

Resources