I want to write a C program that will sample something every second (an extension to screen). I can't do it in a loop since screen waits for the program to terminate every time, and I have to access the previous sample in every execution. Is saving the value in a file really my best bet?
You could use a named pipe (if available), which might allow the data to remain "in flight", i.e. not actually hit disk. Still, the code isn't any simpler, and hitting disk twice a second won't break the bank.
You could also use a named shared memory region (again, if available). That might result in simpler code.
You're losing some portability either way.
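For the shared-memory route, a minimal sketch using POSIX shm_open()/mmap() might look like this; the segment name "/sampler_shm" is a placeholder, and on older glibc you may need to link with -lrt:

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    /* Create (or open) a named shared memory segment big enough for one sample. */
    int fd = shm_open("/sampler_shm", O_CREAT | O_RDWR, 0600);
    if (fd < 0) { perror("shm_open"); return 1; }
    if (ftruncate(fd, sizeof(double)) < 0) { perror("ftruncate"); return 1; }

    double *sample = mmap(NULL, sizeof(double), PROT_READ | PROT_WRITE,
                          MAP_SHARED, fd, 0);
    if (sample == MAP_FAILED) { perror("mmap"); return 1; }

    double previous = *sample;   /* whatever the last run left behind (0 on the first run) */
    *sample = previous + 1.0;    /* stand-in for taking the real measurement */
    printf("previous=%f current=%f\n", previous, *sample);

    munmap(sample, sizeof(double));
    close(fd);
    return 0;
}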
Is saving the value in a file really my best bet?
Unless you want to write some complicated client/server model communicating with another instance of the program just for the heck of it, reading and writing a file is the preferred method.
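A minimal sketch of the file approach, with "sample.dat" as a placeholder path:

#include <stdio.h>

/* Load the previous sample if the file exists, otherwise start from 0. */
double load_previous(void)
{
    double v = 0.0;
    FILE *f = fopen("sample.dat", "r");
    if (f) {
        if (fscanf(f, "%lf", &v) != 1)
            v = 0.0;
        fclose(f);
    }
    return v;
}

/* Overwrite the file with the new sample for the next run to pick up. */
void store_sample(double v)
{
    FILE *f = fopen("sample.dat", "w");
    if (f) {
        fprintf(f, "%f\n", v);
        fclose(f);
    }
}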
I created a program that monitors for events.
I want to log these events "in the right way".
Currently I have a string array, log[500][100].
Each line is a string of characters (up to 100) that report something about the event.
I have it set up so that only the last 500 events are saved in the array.
After that, new events overwrite the oldest events.
Currently I just keep revolving through the array until the program terminates, then I write the array to a file.
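In outline, the revolving array works like this (a simplified sketch, not my exact code):

#include <stdio.h>

#define LOG_LINES 500
#define LOG_LEN   100

static char logbuf[LOG_LINES][LOG_LEN];
static int  next_slot = 0;   /* where the next event will be written */

/* Record an event, overwriting the oldest one once the buffer is full. */
void log_event(const char *msg)
{
    snprintf(logbuf[next_slot], LOG_LEN, "%s", msg);
    next_slot = (next_slot + 1) % LOG_LINES;
}

/* Write the buffer out, oldest event first. */
void dump_log(FILE *out)
{
    for (int i = 0; i < LOG_LINES; i++) {
        int idx = (next_slot + i) % LOG_LINES;
        if (logbuf[idx][0] != '\0')
            fprintf(out, "%s\n", logbuf[idx]);
    }
}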
Going forward I would like to view the log in real time, any time I wish, without disturbing the event processing and logging process.
I considered opening the file for "appending" but here are my concerns:
(1) The program is running on a Raspberry Pi which has a flash memory as a "disk drive". I believe flash memories have a limited number of write cycles before problems can occur. This program runs 24/7 "forever" so I am afraid the "disk drive" will "wear out".
(2) I am using pretty much all the CPU capacity of the RPi so I don't want to add a lot of overhead/CPU cycles.
How would experienced programmers attack this problem?
Please go easy on me, this is my first C program.
[EDIT]
I began reviewing all the information and I became intrigued by Mark A's suggestion for tmpfs. I looked into it more and I am sure this answers my question. It allows the creation of files in RAM, not on the SD card. They are lost on power down, but I don't care.
In order to keep the files from growing too large I created a double-buffer approach. First I write 500 events to file A, then switch to file B. When 500 events have been written to file B, I close and reopen file A (to delete the contents and start at 0 events) and switch to writing to file A. I found I needed to fflush(file...) after each write or else the file was empty until fclose.
Normally that would be OK, but right now I am fighting a nasty segmentation fault, so I want as much insight as possible into what is going on. When I hit the fault, I never get to my fclose statements.
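In outline, the double-buffer scheme looks like this (a simplified sketch, not my exact code; the /tmp paths stand in for my tmpfs files):

#include <stdio.h>

#define EVENTS_PER_FILE 500

static const char *paths[2] = { "/tmp/eventlog_a.txt", "/tmp/eventlog_b.txt" };
static FILE *logf = NULL;
static int   current = 1;   /* start so the first switch selects file A */
static int   count = 0;     /* events written to the active file so far */

void log_event(const char *msg)
{
    /* Time to switch: close the full file and truncate-reopen the other one. */
    if (logf == NULL || count >= EVENTS_PER_FILE) {
        if (logf)
            fclose(logf);
        current = !current;
        logf = fopen(paths[current], "w");   /* "w" starts the file over at 0 events */
        count = 0;
        if (logf == NULL)
            return;   /* can't log; drop the event */
    }
    fprintf(logf, "%s\n", msg);
    fflush(logf);   /* without this the file looks empty until fclose */
    count++;
}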
Welcome to Stack Overflow and to C programming! A wonderful world of possibilities awaits you.
I have several thoughts in response to your situation.
The very short summary is to use stdout and delegate the output-file management to the shell.
The longer, rambling answer full of my personal musing is as follows:
1 : A very typical thing for C programs to do is not be in charge of how outputs are kept. You might have heard of the "built in" file handles, stdin, stdout, and stderr. These file handles are (under normal circumstances) always available to your program for input (from stdin) and output (stdout and stderr). As you might guess from their names stdout is customarily used for regular output and stderr is customarily used for error / exception output. It is exceedingly typical for a C program to simply read from stdin and output to stdout and stderr, and let something else (e.g., the shell) take care of what those actually are.
For example, reading from stdin means that your program can be used for keyboard entry and for file reading, without needing to change your program's code. The same goes for stdout and stderr; simply output to those file handles, and let the user decide whether those should go to the screen or be redirected to a file. And, because stdout and stderr are separate file handles, the user can have them go to separate 'destinations'.
In your case, to implement this, drop the array entirely, and simply
fprintf(stdout, "event notice : %s\n", eventdetailstring);
(or similar) every time your program has something to say. Take a look at fflush(), too, because of potential output buffering.
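Putting those pieces together, the whole reporting routine can be as small as this sketch:

#include <stdio.h>

void report_event(const char *eventdetailstring)
{
    fprintf(stdout, "event notice : %s\n", eventdetailstring);
    fflush(stdout);   /* push the line out now instead of waiting for the buffer to fill */
}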
2a : This gets you continuous output. This itself can help with your concern about memory wear on the Pi's flash disk. If you do something like:
eventmonitor > logfile
then logfile will be appended to for the lifetime of your program, and those writes will tend to land on new parts of the flash disk. Of course, if you only ever append, you will eventually run out of space on the disk, so you might set up a cron job to kill the currently running eventmonitor and restart it every day at midnight. Done with the above command, that would cause it to overwrite logfile once per day. This prevents endless growth, and it might even use a new physical area of the flash drive for the new file (even though it's the same name; underneath, it's a different file, with a different inode, etc.) But even if it reuses the exact same area of the flash drive, now you are down to worrying about whether this will last more than 10,000 days, instead of 10,000 writes. I'm betting that within 10,000 days, new options will be available -- worst case, you buy a new Pi every 27 years or so!
There are other possible variations on this theme, as well. e.g., you could have a sophisticated script kicked off by cron every day at midnight that kills any currently running eventmonitor, deletes output files older than a week, and starts a new eventmonitor outputting to a file whose filename is based partly on the date so that past days' files aren't overwritten. But all of this is in the realm of using your program. You can make your program easier to use by writing it to use stdin, stdout, and stderr.
2b : Or, you can just have stdout go to the screen, which is typically how it already is when a program is started from an interactive shell / terminal window. I imagine you could have the Pi running headless most of the time, and when you want to see what your program is outputting, hook up a monitor. Generally, things will stay running between disconnecting and reconnecting your monitor. This avoids affecting the flash drive at all.
3 : Another approach is to have your event monitoring program send its output somewhere off-system. This is getting into more advanced programming territory, so you might want to save this for a later enhancement, after you've mastered more of the basics. But, your program could establish a network connection to, say, a JSON API and send event information there. This would let you separate the functions of event monitoring from event reporting.
You will discover as you learn more programming that this idea of separation of concerns is an important concept, and applies at various levels of a program or a system of interoperating programs. In this case, the Pi is a good fit for the data monitoring aspect because it is a lightweight solution, and some other system with more capacity and more stable storage can cover the data collection aspect.
I want to implement a C program in Linux (Ubuntu distro) that mimics tail -f. Note that I do not want to actually call tail -f from my C code, rather implement its behaviour. At the moment I can think of two ways to implement it.
When the program is called, I seek to the end of file. Afterwards, I would read to the end of file periodically and print whatever I read if it is not empty.
The second method, which can potentially be more efficient, is again to seek to the end of the file. But this time I would "somehow" listen for changes to that file and read to the end of the file only if it has changed.
With that being said, my question is how to implement the second approach, and whether it is worth the effort. Also, are these the only two options?
NOTE: Thanks for the comments; the question has been changed based on them.
There is no standardized mechanism for monitoring changes to a file, so you'll need to implement a "polling" solution anyway (that is, when you hit the end of file, wait a short amount of time and try again.)
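In sketch form, such a polling loop can be as simple as this (the 500 ms interval is an arbitrary choice):

#include <stdio.h>
#include <unistd.h>

/* Print anything appended to f, sleeping whenever we reach end of file. */
void follow(FILE *f)
{
    char line[1024];
    fseek(f, 0, SEEK_END);   /* start at the current end, like tail -f */
    for (;;) {
        if (fgets(line, sizeof line, f)) {
            fputs(line, stdout);
            fflush(stdout);
        } else {
            clearerr(f);          /* clear the EOF flag so the next fgets retries */
            usleep(500 * 1000);   /* arbitrary 500 ms poll interval */
        }
    }
}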
On Linux, you can use the inotify family of system calls, but be aware that it won't always work. It doesn't work for special files or remote filesystems, for example, and it may not work for some local filesystems. It is complicated in the case of symlinks. And so on. There is a Windows equivalent, but I believe it suffers from some of the same issues.
So even if you use a notification system, you'll need the polling solution as a backup, and since OS notifications are not guaranteed to be reliable (that is, if the system is under load, notifications might be dropped), you'll need to poll on timeout even if you are using a notification system.
You might want to take a look at the implementation of the GNU tail utility (http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/tail.c) to see how the special cases are handled.
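To give a feel for the inotify route, here is a bare-bones sketch; it omits the polling fallback discussed above and the truncation/rename handling a real tail needs:

#include <stdio.h>
#include <sys/inotify.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s FILE\n", argv[0]);
        return 1;
    }

    FILE *f = fopen(argv[1], "r");
    if (!f) { perror("fopen"); return 1; }
    fseek(f, 0, SEEK_END);   /* start at the end, like tail -f */

    int ifd = inotify_init();
    if (ifd < 0) { perror("inotify_init"); return 1; }
    if (inotify_add_watch(ifd, argv[1], IN_MODIFY) < 0) {
        perror("inotify_add_watch");
        return 1;
    }

    char evbuf[4096], line[1024];
    for (;;) {
        /* Block until the kernel reports a modification; we only use the
         * wakeup, so the event payload itself is discarded. */
        if (read(ifd, evbuf, sizeof evbuf) < 0)
            break;
        clearerr(f);   /* clear EOF and drain whatever was appended */
        while (fgets(line, sizeof line, f))
            fputs(line, stdout);
        fflush(stdout);
    }
    return 0;
}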
You can implement the requirement with the following steps:
1) fopen the file in "a+" mode;
2) select() on the opened file descriptor (you need to convert the FILE * to a file descriptor with fileno()) and then do the read.
So I want to check whether a file contains some data. My program is multi-threaded, so this won't work as-is: the file can't be accessed at the same time, and it also gives an error. Is it possible to load it into a string array and check if that array contains the text I want?
If I check it from 5-10 different threads at exactly the same time, will it matter?
And how can I write text to a file from all these threads at the same time? It should check whether the file is being used, wait, and then write, so that no error occurs.
... is it possible to load it into a string array and check if that array contains the text I want?
Yes. It is straightforward programming to read a file into an array of strings, and to check if one of the strings in the array contains another string.
If I check it from 5-10 different threads at exactly the same time, will it matter?
Yes, it matters. You have to implement the code the right way to ensure that it always works.
Your question is very hard to decipher, but I am guessing that you want the array of strings to be shared between the threads, AND you want the threads to update the array. In that case, proper synchronization is essential, or you are liable to run into race conditions and memory anomalies.
How can I write text to a file from all these threads at the same time? It should check whether the file is being used, wait, and then write, so that no error occurs.
You need to synchronize properly so that only one thread attempts to write to the file at any one time. In addition, you need to make sure that one thread doesn't attempt to open a stream to the file while another stream has the file open. (That is most likely the cause of your current errors. Java on Windows won't let you do that ... though Java on Linux will allow it.)
I suggest you read the Oracle Java Tutorials on how to write multi-threaded programs.
How does the monitor store the data displayed on it? Is it stored in memory and if so how can I access it? The reason I am asking is because I am programming a text editor, in which I use an array to store the data being manipulated. I was wondering if I could access the memory containing the data displayed on the screen rather than using my own array. It seems redundant to reserve memory for the same data twice. But I just don't know how the monitor stores the data displayed on it or if it even stores it at all.
You can make very few assumptions about where stdout goes. It might go to a terminal, where it will end up in a buffer somewhere. Or it might be piped to another process. Or it might go to /dev/null. Or to a line printer, etc. And even in the cases where it does end up in memory somewhere, that buffer will have a limited size, and hence will not necessarily hold the whole file. And you probably won't have permission to access that memory anyway. So while this could in theory work in certain circumstances, it is definitely not the right way to go.
You will probably not want to use stdout for your text editor at all, but something like ncurses, which lets you place text where you want in the terminal and update it at will. The actual contents of the file are probably best managed through your own internal buffers, the way you are already doing it, though you might consider mmap too.
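To give a feel for ncurses, here is a minimal sketch (compile with -lncurses):

#include <ncurses.h>

int main(void)
{
    initscr();   /* take over the terminal */
    mvprintw(5, 10, "Hello from row 5, column 10");
    mvprintw(6, 10, "Press any key to quit");
    refresh();   /* push the changes to the physical screen */
    getch();
    endwin();    /* restore the terminal */
    return 0;
}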
Stdout is the output stream of a program. The environment you run the program from determines where this stream points to. You are probably running the program from either a console terminal or from some IDE.
Console terminals by default store the output internally themselves, unless instructed to redirect the output to a file or another program's input.
You can't rely on a third party to store output for you to query later without an explicit agreement. You'll want to hold enough data inside your program to generate the views you want. And yes, as stated above, ncurses and similar libraries make console app building a bit easier.
I've written two relatively small programs using C. Both of them communicate with each other using textual data. Program A generates some problems from given input, B evaluates them and creates input for another iteration of A.
Here's a bash script that I currently use:
for i in {1..1000}
do
./A data > data2;
./B data2 > data;
done
The problem is that, since what A and B do is not very time consuming, most of the time is spent (I suppose) starting the apps up. When I measure how long the script runs, I get:
$ time ./bash.sh
real 0m10.304s
user 0m4.010s
sys 0m0.113s
So my main question is: is there any way to communicate data between those two apps faster? I don't want to integrate them into one application, because I'm trying to build a toolset with independent, easily communicating tools (as was suggested in "The Art of Unix Programming", from which I'm learning how to write reusable software).
PS. The data and data2 files contain sets of data needed in whole, all at once, by those applications (so communicating, e.g., one line of data at a time is impossible).
Thanks for any suggestions.
cheers,
kajman
Can you create a named pipe?
mkfifo data1
mkfifo data2
./A data1 > data2 &
./B data2 > data1
If your application is reading and writing in a loop, this could work :)
If you used pipes to transfer the stdout of program A to the stdin of program B you would remove the need to write the file "data2" each loop.
./A data1 | ./B > data1
Program B would need to have the capability of using input from stdin rather than a specified file.
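A common idiom for that is to read from a named file when one is given and fall back to stdin otherwise; a sketch:

#include <stdio.h>

int main(int argc, char *argv[])
{
    /* Read from the named file if one is given, from stdin otherwise,
     * so the same tool works as "./B data2" and inside a pipeline. */
    FILE *in = (argc > 1) ? fopen(argv[1], "r") : stdin;
    if (!in) { perror(argv[1]); return 1; }

    char line[1024];
    while (fgets(line, sizeof line, in))
        fputs(line, stdout);   /* the real processing would go here */

    if (in != stdin)
        fclose(in);
    return 0;
}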
If you want to make a program run faster, you need to understand what is making the program run slowly. The field of computer science dedicated to measuring the performance of a running program is called profiling.
Once you discover which internal portion of your program is running slow, you can generally speed it up. How you go about speeding up that item depends heavily on what "the slow part" is doing and how it is "being done".
Several people have recommended pipes for moving the data directly from the output of one program into the input of another program. Assuming you rewrite your tools to handle input and output in a piped manner, this might improve performance. Again, it depends on what you are doing and how you are doing it.
For example, if your tool just fixes windows style end-of-lines into unix style end-of-lines, the program might read in one line, waiting for it to be available, check the end-of-line and write out the line with the desired end-of-line. Or the tool might read in all of the data, do a replacement call on each "wrong" end-of-line in memory, and then write out all of the data. With the first solution, piping speeds things up. With the second solution piping doesn't speed up anything.
The reason it is truly so hard to answer such a question is that the fix you need really depends on the code you have, the problem you are trying to solve, and the means by which you are solving it now. In the end, there isn't always a 100% guarantee that the code can be sped up; however, virtually every piece of code has opportunities to be sped up. Use profiling to speed up the parts that are slow, instead of wasting your time working on a part of your program that is only called once and represents 0.001% of the program's runtime.
Remember if you speed up something that is 0.001% of your program's runtime by 50%, you actually only sped up your entire program by 0.0005%. Use profiling to determine the block of code that's taking up 90% of your runtime and concentrate on it.
I do have to wonder why, if A and B depend on each other to run, you want them to be part of an independent toolset.
One solution is a compromise between the two (a sketch follows these steps):
Create a library that contains A.
Create a library that contains B.
Create a program that spawns two threads, 1 containing A and 2 containing B.
Create a semaphore that tells A to run and another that tells B to run.
After the function that calls A in 1, increment B's semaphore.
After the function that calls B in 2, increment A's semaphore.
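A minimal sketch of this ping-pong, assuming A and B have been refactored into callable functions (run_A and run_B are hypothetical stand-ins; compile with -pthread):

#include <pthread.h>
#include <semaphore.h>

/* Hypothetical stand-ins for the refactored A and B. */
static void run_A(void) { /* generate problems */ }
static void run_B(void) { /* evaluate them */ }

static sem_t a_turn, b_turn;

static void *thread_a(void *arg)
{
    (void)arg;
    for (int i = 0; i < 1000; i++) {
        sem_wait(&a_turn);   /* wait for our turn */
        run_A();
        sem_post(&b_turn);   /* hand off to B */
    }
    return NULL;
}

static void *thread_b(void *arg)
{
    (void)arg;
    for (int i = 0; i < 1000; i++) {
        sem_wait(&b_turn);
        run_B();
        sem_post(&a_turn);
    }
    return NULL;
}

int main(void)
{
    pthread_t ta, tb;
    sem_init(&a_turn, 0, 1);   /* A goes first */
    sem_init(&b_turn, 0, 0);
    pthread_create(&ta, NULL, thread_a, NULL);
    pthread_create(&tb, NULL, thread_b, NULL);
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    return 0;
}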
Another possibility is to use file locking in your programs (a sketch follows these steps):
Make both A and B execute in infinite loops (or however many times you're processing data)
Add code to attempt to lock both files at the beginning of the infinite loop in A and B (if the lock attempt fails, sleep and try again so that you don't do anything until you have the lock).
Add code to unlock and sleep for longer than the sleep in step 2 at the end of each loop.
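A rough sketch of one iteration of A's loop using flock(), simplified to lock just the one file "data" from your script; the sleep intervals are arbitrary:

#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* One iteration of A's loop; B would do the mirror image. */
void one_iteration(void)
{
    int fd = open("data", O_RDWR);
    if (fd < 0)
        return;

    /* Step 2: try the lock; if it's busy, sleep briefly and retry. */
    while (flock(fd, LOCK_EX | LOCK_NB) < 0)
        usleep(100 * 1000);   /* 100 ms between attempts */

    /* ... read the input, compute, write the output ... */

    /* Step 3: release the lock, then sleep for longer than the retry
     * interval so the other program gets a chance to grab it. */
    flock(fd, LOCK_UN);
    close(fd);
    usleep(500 * 1000);
}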
Either of these approaches removes the overhead of launching the programs between runs.
It's almost certainly not application startup which is the bottleneck. Linux will end up caching large portions of your programs, which means that launching will progressively get faster (to a point) the more times you start your program.
You need to look elsewhere for your bottleneck.