Should I call fopen - fclose every fwrite action

Should I call fopen - fclose every fwrite action - c

I am making something similar to commit log in database system. The system is able to handle ~ 20,000 events / sec. Each event occupies ~16 bytes. Roughly, the system will write to commit log at a speed of ~312.5 kB / sec. Each commit log file will contain at most of 500,000 events.
I have a question that: Should I call fopen - fwrite - fclose for each event, OR should I call fopen once when creating a new file, then a series of fwrite and finally fclose?

In such cases, it might be even better to revert to open/write/close and get rid of C buffered output completely. Log files are typically consisting of a high volume of nearly identical (size-wise) writes and do not really gain much from C buffering. Low level, unbuffered I/O would also relieve you of calling fflush() and can guarantee to write every single log entry as atomic entity.
Given the volume you mentioned, you should probably still not close and re-open the file between writes.

fopen/fwrite/fclose 20k times per second looks pretty expensive.
Consider calling fflush as an alternative.
If you are looking to use it in order to record database transactions for possible recovery, you may need to rethink it. The f family of functions use buffering so in the event of a crash the final buffer may or may not have actually made it to disk.

You're not obliged to, no... and in fact it would be a much better idea to call fflush as suggested by EvilTeach's answer.
However, better yet, if you can avoid calling fflush that would be ideal since the C standard library might (probably will) implement system-specific caching to unite smaller physical writes into larger physical writes, making your 20k writes per second more optimal.
Calling fopen/fwrite/fclose as you suggested, or fflush as EvilTeach suggested would elude that caching, which will probably degrade performance.

Related

Why fread does have thread safe requirements which slows down its call

I am writing a function to read binary files that are organized as a succession of (key, value) pairs where keys are small ASCII strings and value are int or double stored in binary format.
If implemented naively, this function makes a lot of call to fread to read very small amount of data (usually no more than 10 bytes). Even though fread internally uses a buffer to read the file, I have implemented my own buffer and I have observed speed up by a factor of 10 on both Linux and Windows. The buffer size used by fread is large enough and the function call cannot be responsible for such a slowdown. So I went and dug into the GNU implementation of fread and discovered some lock on the file, and many other things such as verifying that the file is open with read access and so on. No wonder why fread is so slow.
But what is the rationale behind fread being thread-safe where it seems that multiple thread can call fread on the same file which is mind boggling to me. These requirements make it slow as hell. What are the advantages?

Imagine you have a file where each 5 bytes can be processed in parallel (let's say, pixel by pixel in an image):
123456789A
One thread needs to pick 5 bytes "12345", the next one the next 5 bytes "6789A".
If it was not thread-safe different threads could pick-up wrong chunks. For example: "12367" and "4589A" or even worst (unexpected behaviour, repeated bytes or worst).

As suggested by nemequ:
Note that if you're on glibc you can use the _unlocked variants (*e.g., fread_unlocked). On Windows you can define _CRT_DISABLE_PERFCRIT_LOCKS

Stream I/O is already as slow as molasses. Programmers think that a read from main memory (1000x longer than a CPU cycle) is ages. A read from the physical disk or a network may as well be eternity.
I don't know if that's the #1 reason why the library implementers were ok with adding the lock overhead, but I guarantee it played a significant part.
Yes, it slows it down, but as you discovered, you can manually buffer the read and use your own handling to increase the speed when performance really matters. (That's the key--when you absolutely must read the data as fast as possible. Don't bother manually buffering in the general case.)
That's a rationalization. I'm sure you could think of more!

Understanding low level file routines

I am going through Mark Burgess's "The GNU C Programming Tutorial". I have come across the following information:
Even though low-level fle routines do not use buﬀering, and once you call write, your data can be read from the file immediately, it may take up to a minute before your data is physically written to disk. (Page:142)
Firstly, is "it may take up to a minute(some time) before your data is written to disk" true?
Secondly, when low level file routines are not using buffering why will the delay take place?

There are two places where I/O buffering can occur (at least — it could be more than just two).
One is in the application; the standard I/O functions using FILE * use buffered I/O unless you use setvbuf() to prevent it.
The other is in the kernel. Disk I/O normally goes into the kernel buffer pool, and eventually gets written by the kernel to disk. There are ways around that (O_DIRECT on Linux; raw devices on classic Unix; etc). The key point is that the write() system call normally writes to he kernel buffer pool. The kernel takes responsibility for ensuring that the data is written to disk safely and correctly (journalling, …).
The kernel doesn't write everything to disk immediately because (a) you may add more changes to the data, (b) other people may need to read or write the data, (c) the disk drive may be busy writing something else at the other end of its 1 TiB of storage and it will take time to get the write head in position to take your data, and it would be better for the overall performance of the system if it scheduled other work before writing your changed buffer to disk. It will get written to disk. It is just not defined when, and it could be fractions of a second or multiple seconds or longer, though most often it will not take minutes for the data to be written to disk.
These days, there could also be buffering in the RAID controllers, and maybe in the individual disks inside the RAID setup, and maybe there's network buffering too if it is a remotely-mounted file system. Those add extra levels of buffering.
The read() and write() and related low-level I/O functions do not have any client-side (application) buffering — unlike the standard C I/O functions.

A file is said to be buffered, when its contents are not outputted or inputted directly. Instead, the file's bytes are written to a temporary buffer in memory.
For example, if you are reading from a file, you are reading from the buffer. Once you have read all the characters in the buffer, it is replenished with new bytes from the file. The reason for this indirectness, is that a memory read is much faster than a hard disk read.
The calls read and write are low-level, and do not perform buffering. The stdio.h calls like getc and putc, do use buffering. These higher-level APIs only call the low level ones, when the buffer must be replenished.

Writing to the hard drive is much slower than writing to RAM. When you write to a drive it writes to memory, but doesn't always write to the disk immediately. The data might not be written to disk until that part of memory needs to be overwritten to make room for something else. This is called a Write-Back cache.

Is write/fwrite guaranteed to be sequential?

Is the data written via write (or fwrite) guaranteed to be persisted to the disk in a sequence manner? In particular in relation to fault tolerance. If the system should fail during the write, will it behave as though first bytes were written first and writing stopped mid-stream (as opposed to random blocks written).
Also, are sequential calls to write/fwrite guaranteed to be sequential? According to POSIX I find only that a call to read is guaranteed to consider a previous write.
I'm asking as I'm creating a fault tolerant data store that persists to disks. My logical order of writing is such that faults won't ruin the data, but if the logical order isn't being obeyed I have a problem.
Note: I'm not asking if persistence is guaranteed. Only that if my calls to write do eventually persist they obey the order in which I actually write.

The POSIX docs for write() state that "If the O_DSYNC bit has been set, write I/O operations on the file descriptor shall complete as defined by synchronized I/O data integrity completion". Presumably, if the O_DSYNC bit isn't set, then the synchronization of I/O data integrity completion is unspecified. POSIX also says that "This volume of POSIX.1-2008 is also silent about any effects of application-level caching (such as that done by stdio)", so I think there is no guarantee for fwrite().

I am not an expert, but I might know enough to point you in the right direction:
The most disastrous case is if you lose power, so that is the only one worth considering.
Start with a file with X bytes of meaningful content, and a header that indicates it.
Write Y bytes of meaningful content somewhere that won't invalidate X.
Call fsync (slow!).
Update the header (probably has to be less than your disk's block size).
I don't know if changing the length of a file is safe. I don't know how much depends on the filesystem mount mode, except that any "safe" mode is probably completely unusable for systems need to have even a slight level of performance.
Keep in mind that on some systems the fsync call lies and just returns without doing anything safely. You can tell because it returns quickly. For this reason, you need to make pretty large transactions (i.e. much larger than application-level transactions).
Keep in mind that the kind of people who solve this problem in the real world get paid high 6 figures at least. The best answer for the rest of us is either "just send the data to postgres and let it deal with it." or "accept that we might have to lose data and revert to an hourly backup."

No, in general as far as POSIX and reality are concerned, filesystems do not provide such guarantee. Order of persistence (in which disk makes them permanent on the platter) is not dictated by order in which syscalls were made, or position within the file, or order of sectors on disk. Filesystems retain the data to be written in memory for several seconds, hoarding as much as possible, and later send them to disk in batches, in whatever order they seem fit. And regardless of how kernel sends it to disk, the disk itself can reorder writes on its own due to NCQ.
Filesystems have means of ensuring some ordering for safety. Barriers were used in the past, and explicit flushes and FUA requests nowadays. There is a good article on LWN about that. But those are used by filesystems, not applications.
I strongly suggest reading the article about Application Level Consistency. Not sure how relevant to you but it shows many behaviors that developers wrongly assumed in the past.
Answer from o11c is a good way to do it.

Yes, so long as we are not talking about adding in the complexity of multithreading. It will be in the same order on disk, for what makes it to disk. It buffers to memory and dumps that memory to disk when it fills up, or you close the file.

refresh stream buffer while reading /proc

I'm reading from /proc/pid/task/stat to keep track of cpu usage in a thread.
fopen on /proc/pic/task/stat
fget a string from the stream
sscanf on the string
I am having issues however getting the streams buffer to update.
If I fget 1024 characters if regreshes but if I fget 128 characters then it never updates and I always get the same stats.
I rewind the stream before the read and have tried fsync.
I do this very frequently so I'd rather not reopen to file each time.
What is the right way to do this?

Not every program benefits from the use of buffered I/O.
In your case, I think I would just use read(2)1. This way, you:
eliminate all stale buffer2 issues
probably run faster via the elimination of double buffering
probably use less memory
definitely simplify the implementation
For a case like you describe, the efficiency gain may not matter on today's remarkably powerful CPUs. But I will point out that programs like cp(2) and other heavy-duty data movers don't use buffered I/O packages.
1. That is, open(2), read(2), lseek(2), and close(2).
2. And perhaps to intercept an argument, on questions related to this one someone usually offers a "helpful" suggestion along the lines of fflush(stdin), and then another someone comes along to accurately point out that fflush() is defined by C99 on output streams only, and that it's usually unwise to depend on implementation-specific behavior.

Read a line of input faster than fgets?

I'm writing a program where performance is quite important, but not critical. Currently I am reading in text from a FILE* line by line and I use fgets to obtain each line. After using some performance tools, I've found that 20% to 30% of the time my application is running, it is inside fgets.
Are there faster ways to get a line of text? My application is single-threaded with no intentions to use multiple threads. Input could be from stdin or from a file. Thanks in advance.

You don't say which platform you are on, but if it is UNIX-like, then you may want to try the read() system call, which does not perform the extra layer of buffering that fgets() et al do. This may speed things up slightly, on the other hand it may well slow things down - the only way to find out is to try it and see.

Use fgets_unlocked(), but read carefully what it does first
Get the data with fgetc() or fgetc_unlocked() instead of fgets(). With fgets(), your data is copied into memory twice, first by the C runtime library from a file to an internal buffer (stream I/O is buffered), then from that internal buffer to an array in your program

Read the whole file in one go into a buffer.
Process the lines from that buffer.
That's the fastest possible solution.

You might try minimizing the amount of time you spend reading from the disk by reading large amounts of data into RAM then working on that. Reading from disk is slow, so minimize the amount of time you spend doing that by reading (ideally) the entire file once, then working on it.
Sorta like the way CPU cache minimizes the time the CPU actually goes back to RAM, you could use RAM to minimize the number of times you actually go to disk.

Depending on your environment, using setvbuf() to increase the size of the internal buffer used by file streams may or may not improve performance.
This is the syntax -
setvbuf (InputFile, NULL, _IOFBF, BUFFER_SIZE);
Where InputFile is a FILE* to a file just opened using fopen() and BUFFER_SIZE is the size of the buffer (which is allocated by this call for you).
You can try various buffer sizes to see if any have positive influence. Note that this is entirely optional, and your runtime may do absolutely nothing with this call.

If the data is coming from disk, you could be IO bound.
If that is the case, get a faster disk (but first check that you're getting the most out of your existing one...some Linux distributions don't optimize disk access out of the box (hdparm)), stage the data into memory (say by copying it to a RAM disk) ahead of time, or be prepared to wait.
If you are not IO bound, you could be wasting a lot of time copying. You could benefit from so-called zero-copy methods. Something like memory map the file and only access it through pointers.
That is a bit beyond my expertise, so you should do some reading or wait for more knowledgeable help.
BTW-- You might be getting into more work than the problem is worth; maybe a faster machine would solve all your problems...
NB-- It is not clear that you can memory map the standard input either...

If the OS supports it, you can try asynchronous file reading, that is, the file is read into memory whilst the CPU is busy doing something else. So, the code goes something like:

start asynchronous read
loop:
wait for asynchronous read to complete
if end of file goto exit
start asynchronous read
do stuff with data read from file
goto loop
exit:
If you have more than one CPU then one CPU reads the file and parses the data into lines, the other CPU takes each line and processes it.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight