How do you write to a buffer then to a file - c

I'm trying to write some STL data out of MATLAB, and I'm doing this by writing a MEX file (a MATLAB DLL written in C). At the moment I have a loop that just goes through my data, writing out the STL syntax with the floats.
...
for (m = 0; m < colLen; m++)
{
    res = m % 3;
    if (res == 0)
    {
        fprintf(fp, "\tfacet normal %f %f %f \n",
                normalValues[(x*nvcolLen)+0], normalValues[(x*nvcolLen)+1], normalValues[(x*nvcolLen)+2]);
        fprintf(fp, "\t\touter loop\n");
        flag = 0;
        x++;
    }
    fprintf(fp, "\t\t\tvertex ");
    for (n = 0; n < rowLen; n++)
    {
        fprintf(fp, "%f ", xValues[m*rowLen+n]);
    }
    fprintf(fp, "\n");
    flag++;
    if (flag == 3)
    {
        fprintf(fp, "\t\tendloop\n\tendfacet\n");
        flag = 0;
    }
}
...
The main reason why I want to do this in a MEX file is because things are way quicker since it's compiled. I was reading a C++ book, "Sams Teach Yourself C++ in One Hour a Day", and on page 645 they talk about using a buffer to speed up writing to the disk. Once the buffer fills up, write the data, flush it, and do it again. They don't really show any code on how to do that, and it's with C++'s streams.
How would I approach this in C? Would I just make a char* buffer with a fixed size, and then somehow check when it's full, write it to a file with fwrite(), flush it, and start over?

Basically, if you want to do it yourself, you'd do it almost exactly as you wrote: make a char* buffer, track the number of chars in it (by counting the chars you put in it), and if it is full (or nearly full), flush it to the file.
However, this should really not be an issue with C streams, as they usually do buffering. You can even control this buffering with the function setbuf et al.
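For illustration, here is a minimal sketch of the stdio-based approach using setvbuf (a sibling of setbuf); the file name and buffer size are arbitrary placeholders, not taken from the question:
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *fp = fopen("out.stl", "w");   /* placeholder output file */
    if (fp == NULL)
        return EXIT_FAILURE;

    /* Give the stream a larger, fully buffered user-space buffer
       (64 KiB is an arbitrary example size). setvbuf must be called
       before any other operation on the stream. */
    static char buf[64 * 1024];
    if (setvbuf(fp, buf, _IOFBF, sizeof buf) != 0)
        return EXIT_FAILURE;

    /* ... the fprintf loop from the question goes here; the library
       collects the text in buf and writes it out in large chunks ... */

    fclose(fp);   /* flushes whatever is still buffered */
    return EXIT_SUCCESS;
}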

fprintf does buffered output automatically for you. If there is a problem, show us the code that opens the file (fp).

Related

How often should a file be opened and closed when constantly writing data to a file (SD card)?

I am developing a CAN-BUS logger with an ESP32. The data is written to an SD card with fprintf.
I know I have to use fopen() to open the file and later fclose() to close the file again.
My question is: how often should I open and close the file? Open it just once and then close it maybe an hour later? Or open it, write 100 values, close it, and open it again?
I don't want to lose a lot of data. The ESP32 will be on when the motorcycle with the CAN bus is running, and if the ignition is turned off the ESP32 will have no power anymore. I don't mind if data from the last 5 seconds is lost, but I don't want to lose, say, 10 minutes of data.
I also saw the fflush() function. Should I use that regularly, e.g. every 10 seconds? And then maybe it's no problem if the file is never closed?
More info: I could design the device to make sure the power stays on long enough that fclose() is executed (no power failure before that). But I guess this is not really necessary if I don't mind that the last few seconds of data are lost.
I put this question on Stack Overflow and not Electrical Engineering because it is about writing the code for this project.
I searched and found similar questions here but I did not really find an answer to this question.
The Linux manpage for fflush includes a really important warning:
NOTES
Note that fflush() flushes only the user-space buffers provided by the C library. To ensure that the data is physically stored on disk the kernel buffers must be flushed too, for example, with sync(2) or fsync(2).
sync and fsync are part of the POSIX interface, which might or might not be similar to the underlying file I/O interfaces on your ESP32. But the warning is probably still worth heeding.
The C standard has this wording for fflush, which makes it clear that all that is guaranteed is that fflush flushes out the buffers maintained by the C library, similar to the wording in the Linux manpage:
… the fflush function causes any unwritten data for that stream to be delivered to the host environment to be written to the file…
So if you want data to be actually committed to disk, fflush is not generally sufficient.
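As a minimal sketch, assuming a POSIX-like layer that provides fileno() and fsync() (whether that maps to your SD card driver is something to verify), flushing all the way down could look like this:
#include <stdio.h>
#include <unistd.h>   /* fsync(), on POSIX-like systems */

/* Flush stdio's user-space buffer, then ask the OS/filesystem layer
   to push its own buffers toward the storage medium. */
static int flush_to_disk(FILE *f)
{
    if (fflush(f) != 0)          /* C library buffer -> kernel/VFS */
        return -1;
    if (fsync(fileno(f)) != 0)   /* kernel/VFS buffers -> SD card  */
        return -1;
    return 0;
}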
I made some tests to answer my question.
First I used this code:
ESP_LOGI(TAG, "Opening file");
FILE* f = fopen(fileName1, "a");
if (f == NULL) {
ESP_LOGE(TAG, "Failed to open file for append");
return;
}
int64_t t = 0;
for (unsigned long i = 1; i <= 10009000; i++)
{
t = esp_timer_get_time();
fprintf(f, "%010" PRId64, t);
fprintf(f, " ");
fprintf(f, "%10lu\n", i);
if (0 == i % 10000){
ESP_LOGI(TAG, "Flush file");
fflush(f);
}
}
//fclose(f);
ESP_LOGI(TAG, "File written");
Please notice that the file is never closed.
When I run that code on my ESP32 the file is created but nothing is written to the file. So at least with my configuration fclose() is necessary.
When I change the code to this:
for (unsigned long i = 1; i <= 109000; i++)
{
    t = esp_timer_get_time();
    fprintf(f, "%010" PRId64, t);
    fprintf(f, " ");
    fprintf(f, "%10lu\n", i);
    if (0 == i % 10000){
        //fflush(f);
        ESP_LOGI(TAG, "Close and Open file");
        fclose(f);
        f = fopen(fileName1, "a");
    }
}
//fclose(f);
ESP_LOGI(TAG, "File written");
then the data is written to the file. In this case the last line in the file is
0013521402 100000
So as expected the values after 100,000 (the last time when the file was closed) and before 109,000 (i <= 109000) are not written to the file.
My conclusion is: I have to use fclose(). The program does not write anything if I just use fflush() regularly. How often I have to close and open the file depends on how much data I am prepared to lose. For my application I will decide this later after some testing. Maybe I will close and open the file every 5 seconds, or after writing 1000 lines, or something like that.
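A minimal sketch of that close-and-reopen pattern (the interval constant, file handling, and helper name are placeholders I made up):
#include <stdio.h>

#define REOPEN_EVERY 1000   /* placeholder: lines between close/reopen */

/* Append one line; every REOPEN_EVERY lines close and reopen the file
   so the data written so far survives a sudden power loss. */
static FILE *log_line(FILE *f, const char *path, const char *line)
{
    static unsigned long count = 0;

    fprintf(f, "%s\n", line);
    if (++count % REOPEN_EVERY == 0) {
        fclose(f);              /* pushes the data through the filesystem */
        f = fopen(path, "a");   /* caller must check the result for NULL  */
    }
    return f;
}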

Function for loading data from binary files to array of structs

This is my code for loading data from a binary file into an array of structs, but I get an error that the file is not found. The file is there, and it works with other functions.
void ispis2(void) {
    float data_num = brojanje();
    int N = data_num / 3;
    ARTIKL artikl[120];
    FILE *fp = NULL;
    fp = fopen("artikli.bin", "rb");
    if (fp == NULL) {
        fprintf(stderr, "Pogreska: %s\n", strerror(errno));
    }
    else {
        for (int i = 0; i < N; i++) {
            fread(&artikl[i].ime, sizeof(artikl), 1, fp);
            fread(&artikl[i].cijena, sizeof(artikl), 1, fp);
            fread(&artikl[i].ID, sizeof(artikl), 1, fp);
        }
        for (int i = 0; i < N; i++) {
            printf("Ime: %s", artikl[i].ime);
            printf("Cijena: %f", artikl[i].cijena);
            printf("ID: %d", artikl[i].ID);
        }
    }
}
You are probably not running your program in the environment you expect (notably, check the current working directory).
Of course, your fread-s smell bad. Read the documentation of the standard C input/output functions again, in particular fread. The sizeof(artikl) argument is surely wrong (but since your question doesn't explain what an ARTIKL really is, we cannot help more).
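For illustration only, here is the shape such reads could take; the ARTIKL layout below is an assumption (the question never shows it), and each fread uses the size of the member it fills rather than sizeof(artikl):
/* Hypothetical ARTIKL layout, only to make the sizes concrete. */
typedef struct {
    char  ime[30];
    float cijena;
    int   ID;
} ARTIKL;

/* Read one record field by field, using the size of each member. */
size_t read_artikl(ARTIKL *a, FILE *fp)
{
    size_t ok = 0;
    ok += fread(a->ime,     sizeof a->ime,    1, fp);
    ok += fread(&a->cijena, sizeof a->cijena, 1, fp);
    ok += fread(&a->ID,     sizeof a->ID,     1, fp);
    return ok;   /* 3 on success */
}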
The notion of a directory is unknown to the C11 standard (n1570). But if you are on a POSIX system, you can query the current working directory with getcwd(3). Other operating systems have different ways of querying it (e.g. GetCurrentDirectory on Windows). Read Operating Systems: Three Easy Pieces to understand more about operating systems, and also read a good C programming book.
So you could perhaps improve the error handling:
if (fp == NULL) {
    fprintf(stderr, "Pogreska: %s\n", strerror(errno));
    char wdbuf[128];
    memset(wdbuf, 0, sizeof(wdbuf));
    fprintf(stderr, "working directory: %s\n",
            getcwd(wdbuf, sizeof(wdbuf)));
}
(The above code doesn't handle failure of getcwd; a robust program should handle that.)
BTW, I am not sure that handling binary files like you do is worthwhile (a binary format is very brittle, and you need to document its format very precisely; remember that most of the time, data is more valuable than the application processing it). You might consider instead storing the data in some textual format (like JSON, YAML, ...) or in an sqlite database. Consider using appropriate libraries (e.g. jansson for JSON, libsqlite for sqlite).
Don't forget to compile your code with all warnings and debug info (so gcc -Wall -Wextra -g with GCC). Read the documentation of your compiler (e.g. how to invoke it) and of your debugger. Use the debugger to understand the behavior of your program.

Write a program that creates a new data file that includes only the numbers from the first file that are greater than 60

I am writing a program that creates a new data file that will take the numbers from the first file that are greater than 60 and save them to a new file.
I have begun with my code, but for some reason it is not saving the numbers greater than 60. I am new to programming and still learning, so any help will be very much appreciated. What am I doing wrong?
#include <stdio.h>
main()
{
    int y;
    FILE *DATA;
    DATA = fopen("RN.txt","r");
    fscanf(DATA, &y);
    if (y > 60)
    {
        DATA = fopen("RN.txt","w");
        fprintf(DATA, y, "");
    }
    printf("Finished saving file RN.txt \n");
    return 0;
}
There are several severe problems in your code.
My suggestions are as below:
Compile your program (preferably with -Wall, enabling all warnings). You'll get a bunch of errors and warnings. Fix them carefully, one by one.
Learn How to debug small programs?
Throw what you have into the bin and read one of the good C books. Then restart from scratch.
A few insights:
You're using fscanf() and fprintf() wrongly:
fscanf(DATA, "%d", &y);
fprintf(DATA, "%d", y);
Opening the same file with two different handles to read and write at the same time will quickly run you into trouble. Close it after reading and before writing, or use another file for writing:
fclose(DATA);
OR
FILE *OUT = fopen("RN.out.txt", "w");
To repeat a specific process, you need a loop:
while (fscanf(DATA, "%d", &y) > 0) {
// process here
}
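Putting those points together, a minimal corrected sketch might look like this (the output file name RN.out.txt is an arbitrary choice, and error handling is kept deliberately simple):
#include <stdio.h>

int main(void)
{
    int y;
    FILE *in  = fopen("RN.txt", "r");       /* input file from the question */
    FILE *out = fopen("RN.out.txt", "w");   /* separate output file */

    if (in == NULL || out == NULL) {
        printf("Could not open input or output file\n");
        return 1;
    }

    /* Copy every number greater than 60 to the new file. */
    while (fscanf(in, "%d", &y) == 1) {
        if (y > 60)
            fprintf(out, "%d\n", y);
    }

    fclose(in);
    fclose(out);
    printf("Finished saving file RN.out.txt\n");
    return 0;
}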

updating text files in C (is it possible?)

Do any of you guys know if it's possible to update a text file (e.g. something.txt) in C?
I was expecting to find a function with syntax similar to update_txt(something.txt), but I haven't found anything while browsing the internet for the last 2 hours...
The thing is that I would like some data to be stored and displayed in real time in an already opened text file. I can store the data, but I am unable to find a way to display it without manually closing the text file and then opening it again...
Does someone know how to solve this issue? Or do you have another way to solve it? I have read something about transferring data to a new text document and then renaming it, but I am quite sure that this wouldn't solve my problem. I have also read something about macros that could detect changes in the document and then somehow refresh it. I have never worked with macros and I have absolutely no idea how they are implemented...
But please tell me if it is a fact that it is impossible to update an already opened text document.
I am thankful for any suggestions or tutorials that you guys may provide! :)
That's outside the scope of C; it will require some system-specific filesystem monitoring mechanism. For example, on Linux, inotify offers this functionality.
First off, you can use the rewind(), fseek(), ftell() or fgetpos() and fsetpos() functions to locate the read pointer in a file. If you record the start position where the updated record was written (the start offset) using ftell() or fgetpos(), you could jump back to that position later with fseek() or fsetpos() and read in the changed data.
The other gotcha lurking here is that in general, you can't simply 'update' a text file. Specifically, if the replacement text is not the same length as the original text, you have problems. You either need to expand or contract the file. This is normally done by making a copy with the desired edit in the correct location, and then copying or moving the modified copy of the file over the original file.
Detecting when some other process modifies the file is harder still. There are different mechanisms in different operating systems. For Linux, it is the inotify system, for example.
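To make the copy-and-replace idea concrete, here is a rough sketch; the file names and the line-by-line processing are placeholders, and on POSIX systems rename() atomically replaces the original:
#include <stdio.h>

/* Rewrite "data.txt" through a temporary file, then replace the
   original with rename(). The actual edit is left as a placeholder. */
int rewrite_file(void)
{
    FILE *in  = fopen("data.txt", "r");
    FILE *out = fopen("data.txt.tmp", "w");
    char line[256];

    if (in == NULL || out == NULL) {
        if (in)  fclose(in);
        if (out) fclose(out);
        return -1;
    }

    while (fgets(line, sizeof line, in) != NULL) {
        /* ... modify 'line' here, possibly changing its length ... */
        fputs(line, out);
    }

    fclose(in);
    fclose(out);
    return rename("data.txt.tmp", "data.txt");
}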
Based on your statement that you "can't display it without manually closing the text file and opening it again", it may be a buffering issue. When using the C standard library calls (fopen, fread, fwrite, fclose, etc.) the data you write may be buffered in user space until the buffer is full or the file is closed.
To force the C library to flush the buffer, use the fflush(fp) call where fp is your file pointer.
Regarding "But please tell me if it is a fact that it is impossible to update an already opened text document?": it is not possible unless you own the handle to the file (i.e. FILE *fp = fopen("someFilePath", "w+");).
Regarding: if it's possible to update a text file(e.g something.txt) in C?
Yes. If you know the location of the file, (someFileLocation, eg. "c:\dev\somefile.txt"), then open it and write to it.
A simple function that uses FILE *fp = fopen(someFileLocation, "a+"); (open the file for reading and appending) and fclose(fp); will do that. Here is an example that I use for logging:
(Note: you will have to comment out or create the other functions this one refers to, but the general concept is shown.)
int WriteToLog(char* str)
{
    FILE* log;
    char *tmStr;
    ssize_t size;
    char pn[MAX_PATHNAME_LEN];
    char path[MAX_PATHNAME_LEN], base[50], ext[5];
    char LocationKeep[MAX_PATHNAME_LEN];
    static unsigned long long index = 0;
    if(str)
    {
        if(FileExists(LOGFILE, &size))
        {
            strcpy(pn,LOGFILE);
            ManageLogs(pn, LOGSIZE);
            tmStr = calloc(25, sizeof(char));
            log = fopen(LOGFILE, "a+");
            if (log == NULL)
            {
                free(tmStr);
                return -1;
            }
            //fprintf(log, "%10llu %s: %s - %d\n", index++, GetTimeString(tmStr), str, GetClockCycles());
            fprintf(log, "%s: %s - %d\n", GetTimeString(tmStr), str, GetClockCycles());
            //fprintf(log, "%s: %s\n", GetTimeString(tmStr), str);
            fclose(log);
            free(tmStr);
        }
        else
        {
            strcpy(LocationKeep, LOGFILE);
            GetFileParts(LocationKeep, path, base, ext);
            CheckAndOrCreateDirectories(path);
            tmStr = calloc(25, sizeof(char));
            log = fopen(LOGFILE, "a+");
            if (log == NULL)
            {
                free(tmStr);
                return -1;
            }
            fprintf(log, "%s: %s - %d\n", GetTimeString(tmStr), str, GetClockCycles());
            //fprintf(log, "%s: %s\n", GetTimeString(tmStr), str);
            fclose(log);
            free(tmStr);
        }
    }
    return 0;
}
Regarding "browsing the internet for the last 2 hours": next time, try searching Google for "tutorial on writing to a file in C"; it lists lots of links on the topic.

Buffering of standard I/O library

In the book Advanced Programming in the UNIX Environment (2nd edition), the author writes in Section 5.5 (stream operations of the standard I/O library) that:
When a file is opened for reading and writing (the plus sign in the type), the following restrictions apply.
Output cannot be directly followed by input without an intervening fflush, fseek, fsetpos, or rewind.
Input cannot be directly followed by output without an intervening fseek, fsetpos, or rewind, or an input operation that encounters an end of file.
I got confused about this. Could anyone explain a little about this? For example, in what situation the input and output function calls violating the above restrictions will cause unexpected behavior of the program? I guess the reason for the restrictions may be related to the buffering in the library, but I'm not so clear.
You aren't allowed to intersperse input and output operations. For example, you can't use formatted input to seek to a particular point in the file, then start writing bytes starting at that point. This allows the implementation to assume that at any time, the sole I/O buffer will only contain either data to be read (to you) or written (to the OS), without doing any safety checks.
f = fopen( "myfile", "rw" ); /* open for read and write */
fscanf( f, "hello, world\n" ); /* scan past file header */
fprintf( f, "daturghhhf\n" ); /* write some data - illegal */
This is OK, though, if you do an fseek( f, 0, SEEK_CUR ); between the fscanf and the fprintf because that changes the mode of the I/O buffer without repositioning it.
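For concreteness, a legal version of the same sequence could look like this (the header token and file name are just placeholders):
#include <stdio.h>

int main(void)
{
    FILE *f = fopen("myfile", "r+");   /* open for read and write */
    if (f == NULL)
        return 1;

    char header[32];
    fscanf(f, "%31s", header);   /* read past a header token */
    fseek(f, 0, SEEK_CUR);       /* switch the stream from input to output */
    fprintf(f, "daturghhhf\n");  /* now writing is allowed */

    fclose(f);
    return 0;
}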
Why is it done this way? As far as I can tell, because OS vendors often want to support automatic mode switching, but fail. The stdio spec allows a buggy implementation to be compliant, and a working implementation of automatic mode switching simply implements a compatible extension.
It's not clear what you're asking.
Your basic question is "Why does the book say I can't do this?" Well, the book says you can't do it because the POSIX/SUS/etc. standard says it's undefined behavior in the fopen specification, which it does to align with the ISO C standard (N1124 working draft, because the final version is not free), 7.19.5.3.
Then you ask, "in what situation the input and output function calls violating the above restrictions will cause unexpected behavior of the program?"
Undefined behavior will always cause unexpected behavior, because the whole point is that you're not allowed to expect anything. (See 3.4.3 and 4 in the C standard linked above.)
But on top of that, it's not even clear what they could have specified that would make any sense. Look at this:
int main(int argc, char *argv[]) {
    FILE *fp = fopen("foo", "r+");
    fseek(fp, 0, SEEK_SET);
    fwrite("foo", 1, 3, fp);
    fseek(fp, 0, SEEK_SET);
    fwrite("bar", 1, 3, fp);
    char buf[4] = { 0 };
    size_t ret = fread(buf, 1, 3, fp);
    printf("%d %s\n", (int)ret, buf);
}
So, should this print out 3 foo because that's what's on disk, or 3 bar because that's what's in the "conceptual file", or 0 because there's nothing after what's been written so you're reading at EOF? And if you think there's an obvious answer, consider the fact that it's possible that bar has been flushed already—or even that it's been partially flushed, so the disk file now contains boo.
If you're asking the more practical question "Can I get away with it in some circumstances?", well, I believe on most Unix platforms, the above code will give you an occasional segfault, but 3 xyz (either 3 uninitialized characters, or in more complicated cases 3 characters that happened to be in the buffer before it got overwritten) the rest of the time. So, no, you can't get away with it.
Finally, you say, "I guess the reason for the restrictions may be related to the buffering in the library, but I'm not so clear." This sounds like you're asking about the rationale.
You're right that it's about buffering. As I pointed out above, there really is no intuitive right thing to do here—but also, think about the implementation. Remember that the Unix way has always been "if the simplest and most efficient code is good enough, do that".
There are a few ways you could implement something like stdio:
1. Use a shared buffer for read and write, and write code to switch contexts as needed. This is going to be a bit complicated, and will flush buffers more often than you'd ideally like.
2. Use two separate buffers, and cache-style code to determine when one operation needs to copy from and/or invalidate the other buffer. This is even more complicated, and makes a FILE object take twice as much memory.
3. Use a shared buffer, and just don't allow interleaving reads and writes without explicit flushes in between. This is dead simple, and as efficient as possible.
4. Use a shared buffer, and implicitly flush between interleaved reads and writes. This is almost as simple, and almost as efficient, and a lot safer, but not really any better in any way other than safety.
So, Unix went with #3, and documented it, and SUS, POSIX, C89, etc. standardized that behavior.
You might say, "Come on, it can't be that inefficient." Well, you have to remember that Unix was designed for low-end 1970s systems, and the basic philosophy that it's not worth trading off even a little efficiency unless there's some actual benefit. But, most importantly, consider that stdio has to handle trivial functions like getc and putc, not just fancy stuff like fscanf and fprintf, and adding anything to those functions (or macros) that makes them 5x as slow would make a huge difference in a lot of real-world code.
If you look at modern implementations from, e.g., *BSD, glibc, Darwin, MSVCRT, etc. (most of which are open source, or at least commercial-but-shared-source), most of them do things the same way. A few add safety checks, but they generally give you an error for interleaving rather than implicitly flushing—after all, if your code is wrong, it's better to tell you that your code is wrong than to try to DWIM.
For example, look at early Darwin (OS X) fopen, fread, and fwrite (chosen because it's nice and simple, and has easily-linkable code that's syntax-colored but also copy-pastable). All that fread has to do is copy bytes out of the buffer, and refill the buffer if it runs out. You can't get any simpler than that.
Reason 1: find the real file position to start from.
Due to stdio's buffered implementation, the stdio stream position may differ from the OS file position. When you read 1 byte, stdio marks the stream position as 1, but because of buffering it may have read 4096 bytes from the underlying file, so the OS records its file position as 4096. When you switch to output, you really need to choose which position you want to use.
Reason 2: find the right buffer cursor to start from.
tl;dr: if an underlying implementation only uses a single shared buffer for both read and write, you have to flush the buffer when changing I/O direction.
Take the glibc used in Chromium OS as a demonstration of how fwrite, fseek, and fflush handle the single shared buffer.
The buffer-filling part of fwrite:
fill_buffer:
  while (to_write > 0)
    {
      register size_t n = to_write;
      if (n > buffer_space)
        n = buffer_space;
      buffer_space -= n;
      written += n;
      to_write -= n;
      if (n < 20)
        while (n-- > 0)
          *stream->__bufp++ = *p++;
      else
        {
          memcpy ((void *) stream->__bufp, (void *) p, n);
          stream->__bufp += n;
          p += n;
        }
      if (to_write == 0)
        /* Done writing. */
        break;
      else if (buffer_space == 0)
        {
          /* We have filled the buffer, so flush it. */
          if (fflush (stream) == EOF)
            break;
From this code snippet we can see that if the buffer is full, it is flushed.
Let's take a look at fflush:
int
fflush (stream)
     register FILE *stream;
{
  if (stream == NULL) {...}
  if (!__validfp (stream) || !stream->__mode.__write)
    {
      __set_errno (EINVAL);
      return EOF;
    }
  return __flshfp (stream, EOF);
}
It uses __flshfp:
/* Flush the buffer for FP and also write C if FLUSH_ONLY is nonzero.
   This is the function used by putc and fflush. */
int
__flshfp (fp, c)
     register FILE *fp;
     int c;
{
  /* Make room in the buffer. */
  (*fp->__room_funcs.__output) (fp, flush_only ? EOF : (unsigned char) c);
}
The __room_funcs.__output hook is flushbuf by default, which ends up doing:
/* Write out the buffered data. */
wrote = (*fp->__io_funcs.__write) (fp->__cookie, fp->__buffer,
                                   to_write);
Now we are close. What is __write? Tracing the default settings mentioned above, it is __stdio_write:
int
__stdio_write (cookie, buf, n)
     void *cookie;
     register const char *buf;
     register size_t n;
{
  const int fd = (int) cookie;
  register size_t written = 0;
  while (n > 0)
    {
      int count = __write (fd, buf, (int) n);
      if (count > 0)
        {
          buf += count;
          written += count;
          n -= count;
        }
      else if (count < 0
#if defined (EINTR) && defined (EINTR_REPEAT)
               && errno != EINTR
#endif
               )
        /* Write error. */
        return -1;
    }
  return (int) written;
}
__write is the wrapper for the write(2) system call.
As we can see, fwrite uses only one single buffer. If you change direction, it can still hold the previously written contents; as shown above, you can call fflush to empty the buffer.
The same applies to fseek:
/* Move the file position of STREAM to OFFSET
   bytes from the beginning of the file if WHENCE
   is SEEK_SET, the end of the file if it is SEEK_END,
   or the current position if it is SEEK_CUR. */
int
fseek (stream, offset, whence)
     register FILE *stream;
     long int offset;
     int whence;
{
  ...
  if (stream->__mode.__write && __flshfp (stream, EOF) == EOF)
    return EOF;
  ...
  /* O is now an absolute position, the new target. */
  stream->__target = o;
  /* Set bufp and both end pointers to the beginning of the buffer.
     The next i/o will force a call to the input/output room function. */
  stream->__bufp
    = stream->__get_limit = stream->__put_limit = stream->__buffer;
  ...
}
It will soft-flush (reset) the buffer at the end, which means the read buffer will be emptied after this call.
This obeys the C99 rationale:
A change of input/output direction on an update file is only allowed following a successful fsetpos, fseek, rewind, or fflush operation, since these are precisely the functions which assure that the I/O buffer has been flushed.
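To tie that rationale back to code, here is a minimal sketch of the permitted write-then-read pattern on an update stream (file name and contents are placeholders):
#include <stdio.h>

int main(void)
{
    FILE *f = fopen("update.txt", "r+");   /* placeholder update-mode file */
    if (f == NULL)
        return 1;

    fprintf(f, "new header");   /* output ...                            */
    fflush(f);                  /* ... flushed before changing direction */
    rewind(f);                  /* reposition to read what was written   */

    char buf[16] = { 0 };
    fread(buf, 1, 10, f);       /* input is now allowed again */
    printf("%s\n", buf);

    fclose(f);
    return 0;
}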
