Read from a file which is being actively written to - c

I have two programs: one (program1) writes to a file continuously, and I want the other program (program2) to read the file continuously. What is happening is that my second program is only reading up to the data point which was written when the second code was executed and stops instead of continuously reading.
Is there any way to achieve the thing?
Basically, I want the output of program1 to be used as the input of program2. Is there any way to read and write in RAM instead of file as disk read costs more time.
Code 2:
#include <stdio.h>
int main(){
FILE *fptr;
fptr = fopen("gbbct1.seq","r");
char c;
c = fgetc(fptr);
while (c != EOF){
printf("%c", c);
c = fgetc(fptr);
}
}
I am looking for platform independent approach. If that's not possible, I would like to know for Linux platform. I don't need to preserve the data once read. I don't want to block program1.

The most basic version of your code needs to reset the file stream status when it encounters EOF and then sleep for a while. For example, assuming POSIX and using only the simplest (most ubiquitous) functions:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(void)
{
const char filename[] = "gbbct1.seq";
FILE *fptr = fopen(filename, "r");
if (fptr == 0)
{
fprintf(stderr, "failed to open file '%s' for reading\b", filename);
exit(EXIT_FAILURE);
}
while (1)
{
int c;
while ((c = fgetc(fptr)) != EOF)
fputc(c, stdout);
clearerr(fptr);
sleep(1);
}
/*NOTREACHED*/
return EXIT_FAILURE;
}
The sleep() function sleeps for an integral number of seconds; if you want sub-second sleeps, you can consider usleep(),
nanosleep(), timer_create() and relatives, etc.
I have a program I call dribbler (because it dribbles data to its output):
Usage: dribbler [-hlntV][-s nap.time][-r std.dev][-f outfile][-i infile][-m message][-o openstr][-F format]
-V Print version information and exit
-f outfile Write to named file (dribbler.out)
-h Print this help message and exit
-i infile Read lines from input file
-l Loop back to start of input file on EOF
-m message Write message on each line of output
-n Number lines read from input file
-o openstr Flags passed to fopen() (a+)
-s nap.time Sleep for given interval between writes (1.000 second)
-r std.dev Randomize the time (Gaussian around nap.time with std.dev)
-t Write to standard output instead of file
-F format Printf format to use instead of %zu
I used:
$ dribbler -s 3 -r 1.3 -f gbbct1.seq &
[1] 81129
$
to write to the control file that program2 is coded to read. I then ran program2 on it, and it produced the outputs as it proceeded.
It's hard to show the time sequence on SO. I have another program (the story of my life) called tstamp which reads lines of input and prints them with a timestamp prefixed to the line:
Usage: tstamp [-hV][-f num][-F format]
-f num Number of fractional digits (0, 3, 6, 9)
-h Print this help message and exit
-F fmt Time format (strftime(3)) to use
-V Print version information and exit
I tried modifying program2.c to set line buffered mode on my Mac (macOS Sierra 10.12.5, GCC 7.1.0), by adding the line below before the while loop in program2.c, but it was effectively ignored, somewhat to my surprise and chagrin:
setvbuf(fptr, 0, _IOLBF, 0);
So, I rewrote the while loop as:
while ((c = fgetc(fptr)) != EOF)
{
fputc(c, stdout);
if (c == '\n')
fflush(stdout);
}
Then I was able to run dribbler in the background, and program2 | tstamp -f 3 to get output like:
$ program2 | tstamp -f 3
2017-06-03 23:52:44.836: 0: message written to file
2017-06-03 23:52:44.836: 1: message written to file
2017-06-03 23:52:44.836: 2: message written to file
2017-06-03 23:52:44.836: 3: message written to file
[…more similar lines with the same time stamp…]
2017-06-03 23:52:44.836: 22: message written to file
2017-06-03 23:52:44.836: 23: message written to file
2017-06-03 23:52:44.836: 24: message written to file
2017-06-03 23:52:44.836: 25: message written to file
2017-06-03 23:52:50.859: 26: message written to file
2017-06-03 23:52:54.866: 27: message written to file
2017-06-03 23:52:58.880: 28: message written to file
2017-06-03 23:53:02.888: 29: message written to file
2017-06-03 23:53:05.902: 30: message written to file
2017-06-03 23:53:07.907: 31: message written to file
2017-06-03 23:53:09.913: 32: message written to file
2017-06-03 23:53:12.925: 33: message written to file
2017-06-03 23:53:14.935: 34: message written to file
2017-06-03 23:53:15.938: 35: message written to file
2017-06-03 23:53:19.954: 36: message written to file
2017-06-03 23:53:21.964: 37: message written to file
2017-06-03 23:53:23.972: 38: message written to file
^C
$ kill %1
[1]+ Terminated: 15 dribbler -s 3 -r 1.3 -f gbbct1.seq
$
You can see that I'd had dribbler running for a while when I started program2 (it got modified and recompiled — part of my chagrin), so there was quite a lot of date to read immediately (hence the multiple lines with the timestamp 2017-06-03 23:52:44.836:), but then it was waiting on dribbler to write more, and as you can see, it sometimes waited nearly 6 seconds between lines, and other times about 1 second, and various intervals in between. The gaps are made more uniform by program2 sleeping for a second at a time. (Yes, I wrote these tools to help answer questions on SO — but dribbler and tstamp both pre-date this question by months.)

I have two programs: one (program1) writes to a file continuously, and I want the other program (program2) to read the file continuously.
What happens then is platform-specific. BTW, the mere ability to run several programs at once (in several processes) is provided by the operating system (and is not defined in, since outside of, the C11 standard). Read Operating Systems : Three Easy Pieces (freely downloadable chapters).
IIRC, on Windows (that I don't know and never used) that is not allowed to happen (one of the programs would be blocked or would have its file open fail).
However, if on Linux with a native local file system (such as Ext4), you could consider using inotify(7) facilities (which won't work with remote file systems à la NFS, and probably won't work on FAT filesystems like e.g. some USB key; but you need to check).
Basically, I want the output of program1 to be used as the input of program2. Is there any way to read and write in RAM instead of file as disk read costs more time.
(I am supposing you wrote both program1 and program2, or at least have their source code and can modify it)
BTW, application programs don't read directly from RAM; they work in virtual memory and each process has its own virtual address space. See this answer.
You surely want to have some inter-process communications, which are provided by your operating system.
On Linux there are many ways of doing that (you should read Advanced Linux Programming whose chapters are freely downloadable). I suggest considering some fifo(7), or some pipe(7) (if both running programs can be started from a common process), or some unix(7) sockets.
You surely need to multiplex for I/O (in both processes) e.g. by having some event loop around a poll(2).
Windows also has inter-process communication facilities. But I don't know them.
(I strongly recommend to spend a few days reading, notably Advanced Linux Programming or some other similar book, before writing a single line of code. You lack an overall picture on OSes and on Linux)
I would recommend using a pipe(7), or else some named fifo(7), or else some unix(7) socket. You could then write code portable on all POSIX systems. I don't recommend to use a file and inotify(7) (which is complex and Linux specific). See also popen(3).
You might find some framework libraries (e.g. Glib from GTK, QtCore, POCO, libevent, 0mq) to help you write portable code able to run on many platforms.

Related

posix_spawn pipe dmesg to python script

I've got several USB to 422 adapters in my test system. I've used FTProg to give each adapter a specific name: Sensor1, Sensor2, etc. They will all be plugged in at power on. I don't want to hard code each adapter to a specific ttyUSBx. I want the drivers to figure out which tty it needs to use. I'm developing in C for a linux system. My first thought was to something like this in my startup code.
system("dmesg | find_usb.py");
The python script would find the devices since each one has a unique Product Description. Then using the usb tree to associate each device with its ttyUSBx. The script would then create /tmp/USBDevs which would just be a simple device:tty pairing that would be easy for the C code to search.
I've been told...DoN't UsE sYsTeM...use posix_spawn(). But I'm having problems getting the output of dmesg piped to my python script. This isn't working
char *my_args[] = {"dmesg", "|", "find_usb.py", NULL};
pid_t pid;
int status;
status = posix_spawn(&pid, "/bin/dmesg", NULL, NULL, my_args, NULL);
if(status == 0){
if(waitpid(pid, &status, 0) != -1);{
printf("posix_spawn exited: %i", status);
}
I've been trying to figure out how to do this with posix_spawn_file_actions(), but I'm not allowed to hit the peak of the 'Ballmer Curve' at work.
Thanks in advance
Instead of using /dev/ttyUSB* devices, write udev rules to generate named symlinks to the devices. For a brief how-to, see here. Basically, you'll have an udev rule for each device, ending with say SYMLINK+=Sensor-name, and in your program, use /dev/Sensor-name for each sensor. (I do recommend using Sensor- prefix, noting the initial Capital letter, as all device names are currently lowercase. This avoids any clashes with existing devices.)
These symlinks will then only exist when the matching device is plugged in, and will point to the correct device (/dev/ttyUSB* in this case). When the device is removed, udev automagically deletes the symlink also. Just make sure your udev rule identifies the device precisely (not just vendor:device, but serial number also). I'd expect the rule to look something like
SUBSYSTEM=="tty", ATTRS{idVendor}=="VVVV", ATTRS{idProduct}=="PPPP", ATTRS{serial}=="SSSSSSSS", SYMLINK+="Sensor-name"
where VVVV is the USB Vendor ID (four hexadecimal digits), PPPP is the USB Product ID (four hexadecimal digits), and SSSSSSSS is the serial number string. You can see these values using e.g. udevadm info -a -n /dev/ttyUSB* when the device is plugged in.
If you still insist on parsing dmesg output, using your own script is a good idea.
You could use FILE *handle = popen("dmesg | find_usb.py", "r"); and read from handle like it was a file. When complete, close the handle using int exitstatus = pclose(handle);. See man popen and man pclose for the details, and man 2 wait for the WIFEXITED(), WEXITSTATUS(), WIFSIGNALED(), WTERMSIG() macros you'll need to use to examine exitstatus (although in your case, I suppose you can just ignore any errors).
If you do want to use posix_spawn() (or roughly equivalently, fork() and execvp()), you'd need to set up at least one pipe (to read the output of the spawned command) – two if you spawn/fork+exec both dmesg and your Python script –, and that gets a bit more complicated. See man pipe for details on that. Personally, I would rewrite the Python script so that it executes dmesg itself internally, and only outputs the device name(s). With posix_spawn(), you'd init a posix_file_actions_t, with three actions: _adddup2() to duplicate the write end of the pipe to STDOUT_FILENO, and two _addclose()s to close both ends of the pipe. However, I myself prefer to use fork() and exec() instead, somewhat similar to the example by Glärbo in this answer.

How to read and write concurrently a pcap file in c

I have 2 programs written in C, one program writes to the pcap file and the second program reads from it at the same time.For writing ,I am using the following code
while(j < 100000)
{
pcount = pcap_dispatch(p,2000,&pcap_dump,(u_char *)pd);
j = j+pcount;
printf("Got %d packets , total packets %d\n",pcount,j);
}
And for decoding the packets, I am using the following code
while( (returnValue=pcap_next_ex(pcap,&header,&data)) >= 0)
{
printf("Packet # %d ",++packetCount);
printf("return value %d\n",returnValue);
}
When i run the program separately i.e when I stop writing to the pcap file, It decodes the packets perfectly. But when I run both the programs at the same time, the decoder does not decode all the packets. If i get 100 packets, the decoder will show only 50-60 decoded packets.
Any help will be appreciated.
In my opinion, reader's file is not getting updated as soon as the writer writes in the pcap file. This might be due to the reason that reader's file pointer is not refreshed i.e its reading the non updated version of file.
Hope it will help you.
This is what pipes are for. I suggest something like
pcap_writer -w - | tee permanent-file.pcap | pcap_reader -r -
where pcap_writer and pcap_reader are your programs. This way you create something that can be combined in a different manner, if wanted.

Executing System Function and Parsing Output

I want to run a system function within a program written in C.
This system function is blocking and can take some time before it returns to stdout. The function to be called is snort, and normally is executed on a raspberry pi as followed:
sudo snort -q -A console -i eth0 -c /etc/snort/snort.conf
In the case snort triggers an alert, the parent program should read that line and turn on a LED. I currently am turning on leds as followed:
void triggerLed(void) {
pinMode(7], OUTPUT);
digitalWrite(7, HIGH);
}
int main(void) {
//Execute this function call: sudo snort -q -A console -i eth0 -c /etc/snort/snort.conf
//while executing
//On new line from readline()
//if strcmp(line,"alert")
triggerLed();
//endif
//end while
}
How would you solve this? I tried monitoring syslog, snort however does not write to syslog as I cannot find any alerts.
fyi: Last week I asked this question on: Execute script on Snort alert . Unfortunately, due to a combination a vaguely formed question and a change of scope I rephrased the question here.
The function you are looking for is system(3). You get the exit code of the process back.
But if you intend to read the output (stdout) of the called process you have to implement a fork(3)/exec(3) combination, reconnecting the child's file descriptors (at least fd 1) and then reading from it.

C script running as cron giving permission denied error

I have a .c file compiled and would like to run via a cron job but I end up getting this error:
/bin/sh: /usr/local/bin/get1Receive.c: Permission denied.
What is causing this error and how do I fix it?
Should I be running the .c file in cron or a different compiled file?
Results from /tmp/myvars
GROUPS=()
HOME=/root
HOSTNAME=capture
HOSTTYPE=x86_64
IFS='
'
LOGNAME=root
MACHTYPE=x86_64-redhat-linux-gnu
OPTERR=1
OPTIND=1
OSTYPE=linux-gnu
PATH=/usr/bin:/bin
POSIXLY_CORRECT=y
PPID=11086
PS4='+ '
PWD=/root
SHELL=/bin/sh
SHELLOPTS=braceexpand:hashall:interactive-comments:posix
SHLVL=1
TERM=dumb
UID=0
USER=root
_=/bin/sh
Results from file get1Receive.c
file get1Receive.c
get1Receive.c: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, not stripped
Snippet of codes.
sprintf(queryBuf1,"SELECT ipDest, macDest,portDest, sum(totalBits) FROM dataReceive WHERE timeStampID between '%s' And '%s' GROUP BY ipDest, macDest, portDest ",buff1,buff2);
printf("\nQuery receive %s",queryBuf1);
if(mysql_query(localConn, queryBuf1))
{
//fprintf(stderr, "%s\n", mysql_error(localConn));
printf("Error in first query of select %s\n",mysql_error(localConn));
exit(1);
}
localRes1 = mysql_store_result(localConn);
int num_fields = mysql_num_fields(localRes1);
printf("\nNumf of fields : %d",num_fields);
printf("\nNof of row : %lu",mysql_num_rows(localRes1));
If the output of this command:
file get1Receive1.c
shows that file name to be a valid executable that part is very unusual, but okay.
Assuming you are using biz14 (or your real username's ) crontab try this:
use the command crontab -e to create this line in your crontab:
* * * * * set > /tmp/myvars
Wait a few minutes, go back into crontab -e and delete that entry.
Use the set command from the command line to see what variables and aliases exist.
Compare that with that you see in /tmp/myvars You have to change how your C code executes by changing the variables and aliases the cron job runs with.
If you are running the cron job in someone else's crontab, then you have a bigger problem. Check file permissions on get1Receive1.c. and the directory it lives in. That other user (the one who wons the crontab) has to have permissions set on your directory and get1Receive1.c so the job can run.
Example crontab entry:
0 10 * * 1-5 /path/to/get1Receive1.c > /tmp/outputfile
Read /tmp/outputfile to see what you got. You are using printf in your code. printf only writes to the controlling terminal. There is no controlling terminal, so redirect the printf stuff to a file.
Last effort on this problem:
Check return codes on EVERYTHING. All C functions like fread(), any db function, etc. If a return code gives a fail response ( these are different for different function calls) then report the error number the line number and function - gcc provides LINE and func. Example:
printf("error on line %d in my code %s, error message =%s\n", __LINE__, __func__, [string of error message]);
If you do not check return codes you are writing very poor C code.
CHECK return codes, please, now!
Permission wise you could have two issues.
1. The 'c' file's permissions don't allow who you are running it as to run it.
2. You are running the cron with a script which doesn't have permissions.
Here's a helpful post: How to give permission for the cron job file?
The fact that you are running a 'c' file and referring to it as a script makes me think you're using C shell and not writing it as a C language program which would need to be compiled and have the generated executable run by the cron. If you're not using gcc or have never called gcc on your 'C' script then it's not C and call it C shell to avoid confusion.

practical examples use dup or dup2

I know what dup / dup2 does, but I have no idea when it would be used.
Any practical examples?
Thanks.
One example use would be I/O redirection. For this you fork a child process and close the stdin or stdout file descriptors (0 and 1) and then you do a dup() on another filedescriptor of your choice which will now be mapped to the lowest available file descriptor, which is in this case 0 or 1.
Using this you can now exec any child process which is possibly unaware of your application and whenever the child writes on the stdout (or reads from stdin, whatever you configured) the data gets written on the provided filedescriptor instead.
Shells use this to implement commands with pipes, e.g. /bin/ls | more by connecting the stdout of one process to the stdin of the other.
The best scenario to understand dup and dup2 is redirection.
First thing we need to know is that the system has 3 default file ids(or variables indicating output or input sources) that deals with the input and output. They are stdin, stdout, stderr, in integers they are 0,1,2. Most of the functions like fprintf or cout are directly output to stdout.
If we want to redirect the output, one way is give, for example, fprintf function more arguments indicating in and out.
However, there is a more elegant way: we can overwrite the default file ids to make them pointing to the file we want to receive the output. dup and dup2 exactly work in this situation.
Let's start with one simple example now: suppose we want to redirect the output of fprintf to a txt file named "chinaisbetter.txt". First of all we need to open this file
int fw=open("chinaisbetter.txt", O_APPEND|O_WRONLY);
Then we want stdout to point to "chinaisbetter.txt" by using dup function:
dup2(fw,1);
Now stdout(1) points to the descriptor of "chinaisbetter.txt" even though it's still 1, but the output is redirected now.
Then you can use printf as normal, but the results will be in the txt file instead of showing directly on the screen:
printf("Are you kidding me? \n");
PS:
This just gives a intuitive explanation, you may need to check the manpage or detailed information. Actually, we say "copy" here, they are not copying everything.
The file id here is referring to the handler of the file. The file descriptor mentioned above is a struct the records file's information.
When you are curious about POSIX functions, especially those that seem to duplicate themselves, it's generally good to check the standard itself. At the bottom you will usually see examples, as well as reasoning behind the implementation (and existence) of both.
In this case:
The following sections are informative.
Examples
Redirecting Standard Output to a File
The following example closes standard output for the current processes, re-assigns standard output to go to the file referenced by pfd, and closes the original file descriptor to clean up.
#include <unistd.h>
...
int pfd;
...
close(1);
dup(pfd);
close(pfd);
...
Redirecting Error Messages
The following example redirects messages from stderr to stdout.
#include <unistd.h>
...
dup2(2, 1); // 2-stderr; 1-stdout
...
Application Usage
None.
Rationale
The dup() and dup2() functions are redundant. Their services are also provided by the fcntl() function. They have been included in this volume of IEEE Std 1003.1-2001 primarily for historical reasons, since many existing applications use them.
While the brief code segment shown is very similar in behavior to dup2(), a conforming implementation based on other functions defined in this volume of IEEE Std 1003.1-2001 is significantly more complex. Least obvious is the possible effect of a signal-catching function that could be invoked between steps and allocate or deallocate file descriptors. This could be avoided by blocking signals.
The dup2() function is not marked obsolescent because it presents a type-safe version of functionality provided in a type-unsafe version by fcntl(). It is used in the POSIX Ada binding.
The dup2() function is not intended for use in critical regions as a synchronization mechanism.
In the description of [EBADF], the case of fildes being out of range is covered by the given case of fildes not being valid. The descriptions for fildes and fildes2 are different because the only kind of invalidity that is relevant for fildes2 is whether it is out of range; that is, it does not matter whether fildes2 refers to an open file when the dup2() call is made.
Future Directions
None.
See also
close(), fcntl(), open(), the Base Definitions volume of IEEE Std 1003.1-2001, <unistd.h>
Change History
First released in Issue 1. Derived from Issue 1 of the SVID.
One practical example is redirecting output messages to some other stream like some log file. Here is a sample code for I/O redirection.
Please refer to original post here
#include <stdio.h>
main()
{
int fd;
fpos_t pos;
printf("stdout, ");
fflush(stdout);
fgetpos(stdout, &pos);
fd = dup(fileno(stdout));
freopen("stdout.out", "w", stdout);
f();
fflush(stdout);
dup2(fd, fileno(stdout));
close(fd);
clearerr(stdout);
fsetpos(stdout, &pos); /* for C9X */
printf("stdout again\n");
}
f()
{
printf("stdout in f()");
}
I/O redirection in the shell would most likely be implemented using dup2/fcnlt system calls.
We can easily emulate the $program 2>&1 > logfile.log type of redirection using the dup2 function.
The program below redirects both stdout and stderr .i.e emulates behavior of $program 2>&1 > output using the dup2.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
int
main(void){
int close_this_fd;
dup2(close_this_fd = open("output", O_WRONLY), 1);
dup2(1,2);
close(close_this_fd);
fprintf(stdout, "standard output\n");
fprintf(stderr, "standard error\n");
fflush(stdout);
sleep(100); //sleep to examine the filedes in /proc/pid/fd level.
return;
}
vagrant#precise64:/vagrant/advC$ ./a.out
^Z
[2]+ Stopped ./a.out
vagrant#precise64:/vagrant/advC$ cat output
standard error
standard output
vagrant#precise64:/vagrant/advC$ ll /proc/2761/fd
total 0
dr-x------ 2 vagrant vagrant 0 Jun 20 22:07 ./
dr-xr-xr-x 8 vagrant vagrant 0 Jun 20 22:07 ../
lrwx------ 1 vagrant vagrant 64 Jun 20 22:07 0 -> /dev/pts/0
l-wx------ 1 vagrant vagrant 64 Jun 20 22:07 1 -> /vagrant/advC/output
l-wx------ 1 vagrant vagrant 64 Jun 20 22:07 2 -> /vagrant/advC/output

Resources