Making sure two processes interleave - c

In a C program on Linux, I call fork() followed by execve() twice to create two processes running two separate programs. How do I make sure that the execution of the two child processes interleaves?
Thanks
I tried the approach suggested in one of the answers below, but the process seems to hang when it reaches sched_setscheduler(). The code is included below; replay1 and replay2 are two programs that simply print "Replay1" and "Replay2" respectively.
# include<stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <signal.h>
#include <sched.h>
void main()
{
int i,pid[5],pidparent,new=0;
char *newargv1[] = {"./replay1",NULL};
char *newargv2[] = {"./replay2",NULL};
char *newenviron[] = {NULL};
struct sched_param mysched;
mysched.sched_priority = 1;
sched_setscheduler(0,SCHED_FIFO, &mysched);
pidparent =getpid();
for(i=0;i<2;i++)
{
if(getpid()==pidparent)
{
pid[i] = fork();
if(pid[i] != 0)
kill(pid[i],SIGSTOP);
if(i==0 && pid[i]==0)
execve(newargv1[0], newargv1, newenviron);
if (i==1 && pid[i]==0)
execve(newargv2[0], newargv2, newenviron);
}
}
for(i=0;i<10;i++)
{
if(new==0)
new=1;
else
new=0;
kill(pid[new],SIGCONT);
sleep(100);
kill(pid[new], SIGSTOP);
}
}

Since you need random interleaving, here's a horrible hack to do it:
Immediately after forking, send a SIGSTOP to each application.
Set your parent application to have real-time priority with sched_setscheduler. This will allow you to have more fine-grained timers.
Send a SIGCONT to one of the child processes.
Loop: Wait a random, short time. Send a SIGSTOP to the currently-running application, and a SIGCONT to the other. Repeat.
This will help force execution to interleave, but it will also make things quite slow. You may also want to try using sched_setaffinity to assign each process to a different CPU (if you have a dual-core or hyperthreaded CPU); this will cause them to effectively run simultaneously, modulo wait times for I/O. I/O waits (for example on the hard disk, after which the processes are likely to wake up sequentially and thus not interleave) can be avoided by keeping whatever data they manipulate on a ramdisk (on Linux, use tmpfs).
If this is too coarse-grained for you, you can use ptrace's PTRACE_SINGLESTEP operation to step one CPU instruction at a time, interleaving as you see fit.
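The stop/continue loop described above could be sketched roughly as follows. This is only an illustration under a few assumptions: the helper names spawn_stopped, random_slice and interleave are made up for this example, and instead of the parent sending the first SIGSTOP, each child stops itself with raise(SIGSTOP) just before execv() so that it cannot run ahead before the parent gets to it. The real-time priority step is left out.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Fork a child that stops itself and then execs argv[0].
 * Returns the child's pid once it is confirmed stopped, or -1 on error. */
pid_t spawn_stopped(char *const argv[])
{
    pid_t pid = fork();
    if (pid == 0) {
        raise(SIGSTOP);          /* park until the parent sends SIGCONT */
        execv(argv[0], argv);
        _exit(127);              /* only reached if execv() failed */
    }
    int status;
    if (pid < 0 || waitpid(pid, &status, WUNTRACED) != pid || !WIFSTOPPED(status))
        return -1;
    return pid;
}

/* Sleep between 1 and max_ms milliseconds, chosen at random (max_ms > 0). */
static void random_slice(int max_ms)
{
    int ms = 1 + rand() % max_ms;
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Alternate between two stopped children: let one run for a random slice,
 * stop it, continue the other. Returns 0 on success, -1 if a kill() fails. */
int interleave(pid_t a, pid_t b, int rounds, int max_ms)
{
    pid_t kids[2] = { a, b };
    int running = 0;

    if (kill(kids[running], SIGCONT) != 0)
        return -1;
    for (int i = 0; i < rounds; i++) {
        random_slice(max_ms);
        if (kill(kids[running], SIGSTOP) != 0)
            return -1;
        running = 1 - running;   /* switch to the other child */
        if (kill(kids[running], SIGCONT) != 0)
            return -1;
    }
    return 0;
}
```

With the programs from the question, something like spawn_stopped((char *[]){ "./replay1", NULL }) and the same for ./replay2, followed by interleave(p1, p2, 100, 10), would switch between them every few milliseconds. The caller still has to send a final SIGCONT (or SIGKILL) and wait() for both children.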

As this is for testing purposes, you could place sched_yield(); calls after every line of code in the child processes.
Another potential idea is to have a parent process ptrace() the child processes, and use PTRACE_SINGLESTEP to interleave the two processes' execution on an instruction-by-instruction basis.
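A rough sketch of the PTRACE_SINGLESTEP idea, assuming Linux on an architecture that supports single-stepping (x86-64, for example). The function names are invented for this example, and the children here just spin in a loop; in a real test each child would call execve() on the program under test after PTRACE_TRACEME.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a traced child that stops itself and then spins forever.
 * Returns its pid once it has stopped, or -1 on error. */
pid_t spawn_traced(void)
{
    pid_t pid = fork();
    if (pid == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        raise(SIGSTOP);                    /* hand control to the tracer */
        for (volatile unsigned long n = 0;; n++) { }
    }
    if (pid > 0) {
        int status;
        if (waitpid(pid, &status, 0) != pid || !WIFSTOPPED(status))
            return -1;
    }
    return pid;
}

/* Execute exactly one instruction in pid; 0 if it stopped again afterwards. */
static int step_once(pid_t pid)
{
    int status;
    if (ptrace(PTRACE_SINGLESTEP, pid, NULL, NULL) == -1)
        return -1;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSTOPPED(status) ? 0 : -1;
}

/* Alternate single instructions between two traced children.
 * Returns the number of steps actually executed. */
int interleave_steps(pid_t a, pid_t b, int steps)
{
    pid_t kids[2] = { a, b };
    int done = 0;
    for (int i = 0; i < steps; i++) {
        if (step_once(kids[i % 2]) != 0)
            break;
        done++;
    }
    return done;
}
```

Strict alternation (i % 2) is just the simplest policy; any schedule can be plugged in by choosing which child to step next. Expect this to be extremely slow, since every instruction costs two system calls in the tracer.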

If you need to synchronize them and they are your own processes, use semaphores. If you do not have access to the source, then there is no way to synchronize them.
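For the case where you do control both programs, here is a minimal sketch of strict turn-taking with semaphores. It uses System V semaphores (semget/semop) on Linux; POSIX semaphores would work just as well. The function run_alternating and the 'A'/'B' tags are made up for the example, and error handling for fork() is omitted for brevity.

```c
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* On Linux the caller has to define this union for semctl(). */
union semun { int val; struct semid_ds *buf; unsigned short *array; };

/* Add delta to semaphore idx; blocks while the result would be negative. */
static int sem_change(int semid, unsigned short idx, short delta)
{
    struct sembuf op = { idx, delta, 0 };
    return semop(semid, &op, 1);
}

/* Run two children that take strict turns writing 'A' and 'B' to a pipe.
 * The observed order is stored in out; returns 0 on success. */
int run_alternating(int turns, char *out, size_t outlen)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;
    int semid = semget(IPC_PRIVATE, 2, 0600);
    if (semid == -1)
        return -1;
    union semun one = { .val = 1 }, zero = { .val = 0 };
    semctl(semid, 0, SETVAL, one);    /* semaphore 0: A may go first */
    semctl(semid, 1, SETVAL, zero);   /* semaphore 1: B has to wait */

    for (int me = 0; me < 2; me++) {
        if (fork() == 0) {
            char tag = (char)('A' + me);
            for (int i = 0; i < turns; i++) {
                sem_change(semid, (unsigned short)me, -1);       /* wait for my turn */
                if (write(fds[1], &tag, 1) != 1)
                    _exit(1);
                sem_change(semid, (unsigned short)(1 - me), 1);  /* hand over */
            }
            _exit(0);
        }
    }

    close(fds[1]);
    size_t n = 0;
    char c;
    while (n + 1 < outlen && read(fds[0], &c, 1) == 1)
        out[n++] = c;
    out[n] = '\0';
    close(fds[0]);
    while (wait(NULL) > 0) { }
    semctl(semid, 0, IPC_RMID);       /* remove the semaphore set */
    return 0;
}
```

Each child waits on its own semaphore and posts the other's, so the writes can only happen in the order A, B, A, B, and so on.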

If your aim is to do concurrency testing, I know of only two techniques:
Test exact scenarios using synchronization. For example, process 1 opens a connection and executes a query, then process 2 comes in and executes a query, then process 1 becomes active again and gets the results, etc. You do this with the synchronization techniques mentioned by others. However, getting good test scenarios is very difficult. I have rarely used this method in the past.
In random you trust: fire up a high number of test processes that execute a long running test suite. I used this method for both multithreading and multiprocess testing (my case was testing device driver access from multiple processes without blue screening out). Usually you want to make the number of processes and number of iterations of the test suite per process configurable so that you can either do a quick pass or do a longer test before a release (running this kind of test with 10 processes for 10-12 hours was not uncommon for us). A usual run for this sort of testing is measured in hours. You just fire up the processes, let them run for a few hours, and hope that they will catch all the timing windows. The interleaving is usually handled by the OS, so you don't really need to worry about it in the test processes.
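As an illustration of the second technique, a tiny launcher with a configurable number of processes and iterations might look like this. Everything here is hypothetical: run_stress, the test_fn signature, and the append_byte test pass (which just appends a byte to /tmp/stress_demo.log) stand in for a real test suite.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* One pass of the test suite; returns 0 on success.
 * worker is the index of the process running it. */
typedef int (*test_fn)(int worker);

/* Hypothetical test pass: append one byte to a file shared by all workers. */
static int append_byte(int worker)
{
    int fd = open("/tmp/stress_demo.log", O_WRONLY | O_CREAT | O_APPEND, 0600);
    if (fd < 0)
        return -1;
    char tag = (char)('0' + worker % 10);
    int ok = write(fd, &tag, 1) == 1;
    close(fd);
    return ok ? 0 : -1;
}

/* Start nprocs processes, each running test for iterations passes.
 * Returns how many workers failed or crashed, or -1 if a fork() failed. */
int run_stress(int nprocs, int iterations, test_fn test)
{
    int started = 0, failed = 0;

    for (int w = 0; w < nprocs; w++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0) {
            for (int i = 0; i < iterations; i++)
                if (test(w) != 0)
                    _exit(1);
            _exit(0);
        }
        started++;
    }
    /* Reap every worker; a signal or non-zero exit counts as a failure. */
    for (int i = 0; i < started; i++) {
        int status;
        if (wait(&status) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failed++;
    }
    return started == nprocs ? failed : -1;
}
```

A quick pass would be something like run_stress(4, 100, append_byte); the long pre-release run just raises both numbers.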

Job control is much simpler in Bash than in C. Try this:
#! /bin/bash
stop ()
{
echo "$1 stopping"
kill -SIGSTOP $2
}
cont ()
{
echo "$1 continuing"
kill -SIGCONT $2
}
replay1 ()
{
while sleep 1 ; do echo "replay 1 running" ; done
}
replay2 ()
{
while sleep 1 ; do echo "replay 2 running" ; done
}
replay1 &
P1=$!
stop "replay 1" $P1
replay2 &
P2=$!
stop "replay 2" $P2
trap "kill $P1;kill $P2" EXIT
while sleep 1 ; do
cont "replay 1 " $P1
cont "replay 2" $P2
sleep 3
stop "replay 1 " $P1
stop "replay 2" $P2
done
The two processes are running in parallel:
$ ./interleave.sh
replay 1 stopping
replay 2 stopping
replay 1 continuing
replay 2 continuing
replay 2 running
replay 1 running
replay 1 running
replay 2 running
replay 1 stopping
replay 2 stopping
replay 1 continuing
replay 2 continuing
replay 1 running
replay 2 running
replay 2 running
replay 1 running
replay 2 running
replay 1 running
replay 1 stopping
replay 2 stopping
replay 1 continuing
replay 2 continuing
replay 1 running
replay 2 running
replay 1 running
replay 2 running
replay 1 running
replay 2 running
replay 1 stopping
replay 2 stopping
^C

Related

Core dump when running C program using systemd

I have a program in C that runs well when I run it directly from the command line but fails when I run it with systemd:
Core was generated by `/usr/local/bin/midnite-modbusd'.
Program terminated with signal SIGFPE, Arithmetic exception.
#0 0x0000000000401308 in main (argc=1, argv=0x7ffeae390268) at src/midnite-modbusd.c:139
139 slen= interval - (millis % interval);
The code in question:
//wait for start of each sample interval
gettimeofday(&tv,NULL);
millis= (long long unsigned)tv.tv_sec*1000 + (tv.tv_usec/1000);
slen= interval - (millis % interval);
i= (millis+slen) % 1000;
usleep (slen*1000);
The full code is available on github.
The systemd unit:
[Unit]
Description=Midnite Classic modbus data polling
After=network.target
[Service]
Type=simple
User=midnite-modbusd
ExecStart=/usr/local/bin/midnite-modbusd
Restart=on-failure
[Install]
WantedBy=multi-user.target
What can be so different when a program runs with systemd?
Edit 1
It seems that my program has major issues that only happen when running with systemd:
it won't read my configuration file, which should throw an error message and exit(1) because of invalid values
journalctl doesn't get filled in real time. Using journalctl -f, I have to wait a couple of minutes before a bunch of logs suddenly appears
As a side note for my tests using the command line I run: sudo -H -u midnite-modbusd /usr/local/bin/midnite-modbusd
interval is initialized from the sample_interval value in the configuration file, so please check that the file is correct and that sample_interval is present. If interval is left uninitialized or zero, millis % interval is a division by zero, which is what raises the SIGFPE.
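A defensive version of that calculation could look like the sketch below. The function name ms_until_next_sample is made up for this example; the point is only that the zero case is checked before the modulo is evaluated.

```c
#include <stdio.h>

/* Milliseconds left until the next multiple of interval_ms.
 * Returns -1 when interval_ms is zero instead of dividing by zero. */
long long ms_until_next_sample(unsigned long long now_ms,
                               unsigned long long interval_ms)
{
    if (interval_ms == 0) {
        fprintf(stderr, "sample interval is 0; check the configuration file\n");
        return -1;
    }
    return (long long)(interval_ms - (now_ms % interval_ms));
}
```

The caller would then log the problem and exit instead of calling usleep() when the result is negative.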
I found the issue in this code:
if (getppid()==1) {
sprintf(str, "Daemon aready running");
log_message(log_file_path,(char*)str);
return;
}
This code is here for when the program was intended to fork itself to run as an "old style" daemon.
I didn't realize that, because systemd starts the program itself, the program's parent process is PID 1 (so getppid() returns 1 when running under systemd, but not when running from the command line).
Anyway, it is badly written: when this test matches, it should stop the program.

How to configure GDB in Eclipse such that all processes keep on running including the process being debugged?

I am new to C programming and I have been trying hard to customize an open-source tool written in C to fit my organization's needs.
IDE: Eclipse,
Debugger: GDB,
OS: RHEL
The tool is multi-process in nature (the main process runs first and spawns several child processes using fork()), and the processes share values at run time.
While debugging in Eclipse (using GDB), I find that only the process being debugged is running, while the other processes are suspended. As a result, the running process cannot do its intended job, because the other processes are suspended.
I read somewhere that the GDB command "set non-stop on" could keep the other processes running. I used that command in the gdbinit file shown below:
Note: I have overridden the above .gdbinit file with another gdbinit, because the original .gdbinit does not let me debug child processes: the debugger terminates once the main process has finished.
Unfortunately, the debugger stops responding after using this command.
These are the commands I am using in the gdbinit file:
Commenting out non-stop lets Eclipse continue its usual debugging of the current process.
Adding: in the image below you can see that only one process is running while the others are suspended.
Can anyone please help me to configure GDB according to my requirement?
Thanks in advance.
OK @n.m.: Actually, you were right. I should have spent more time understanding the flow of the code.
The tool first creates 3 processes; the third process then creates 5 threads and calls wait() until any child thread terminates.
The top 5 entries (highlighted in blue) in the image below are threads, and they are children of process ID 17991.
The first two processes only initiate the basic functionality of the tool and then simply exit(0), as you can see below.
if (0 != (pid = zbx_fork()))
exit(0);
setsid();
signal(SIGHUP, SIG_IGN);
if (0 != (pid = zbx_fork()))
exit(0);
That was the reason I was not actually able to step into these 3 processes. Whenever I tried to do so, the whole main process terminated immediately, which in turn terminated all the other processes.
So I learned that I was supposed to step into the threads only. And yes, I can actually debug now :)
This became possible once I removed the command "set follow-fork-mode child". So I just used the default ".gdbinit" file with "Automatically debug forked process" enabled.
Thanks everyone for your input. Stack Overflow is an awesome place to learn and share. :)

How to communicate from one program to another long running program?

I have a long running program in C under Linux:
longrun.c
#include <stdio.h>
#include <unistd.h> /* for sleep() */
int main()
{
int mode=0;
int c=0;
while(1)
{
printf("\nrun # mode %d value : %d ",mode,c );
if (c>100)
c=0;
if(mode==0)
c++;
else
c=c+2;
sleep(3);
}
return 0;
}
It will display
run # mode 0 value : 0
run # mode 0 value : 1
run # mode 0 value : 2
I need to write another program in C (something like changemode.c) that can communicate with longrun.c and set its value of mode to some other value, so that the running program will display values in increments of 2.
I.e., if I run that program after some x minutes, longrun will display this pattern:
run # mode 0 value : nnn
run # mode 0 value : nnn+2
run # mode 0 value : (nnn+2)+2
I can do it with a file: changemode.c creates a file saying mode=2, and longrun.c opens and checks that file on every iteration before proceeding. Is there a better way to solve this, such as interprocess communication?
If possible, can anyone write a sample of changemode.c?
One of the most basic ideas in Unix programming is process forking, and the establishment of a pipe between the 2 processes. longrun could start by creating a pipe, calling fork, and using the parent process as the changemode 'monitor' process, and the child process as you use longrun now. You will need to periodically read / write on either end.
A google search will return many examples. Here's another.
The solution has two parts:
1. A communication channel between the two processes. Unix Domain Sockets are a good tool for it, and they behave similarly to TCP/IP sockets.
2. Replacing sleep with select. select will listen on the socket, handling communication with the other program. You can also specify a 3 second timeout, so when it returns 0 (meaning no activity on the socket), you know it's time to print some output.
As an alternative to step 2, you could use two threads: one sleeping and producing output, the other handling the socket. Note that any data shared by the threads should be synchronized (in your very simple case, where there's just one integer, you probably need nothing, but you sure do when it gets more complicated).
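To make the select part concrete, here is a minimal sketch of a replacement for sleep(3). The function name wait_for_mode and the wire format (changemode sends the new mode as one raw int) are assumptions for this example; fd is assumed to be an already-open Unix domain socket, for instance a SOCK_DGRAM socket that longrun has bound to a path.

```c
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Wait up to timeout_sec seconds for a new mode value on fd.
 * Returns 1 and stores the value in *mode if one arrived,
 * 0 on timeout, and -1 on error or a short read. */
int wait_for_mode(int fd, int timeout_sec, int *mode)
{
    fd_set readable;
    struct timeval tv = { timeout_sec, 0 };

    FD_ZERO(&readable);
    FD_SET(fd, &readable);

    int ready = select(fd + 1, &readable, NULL, NULL, &tv);
    if (ready <= 0)
        return ready;                 /* 0: timed out, -1: select() failed */

    int value;
    ssize_t n = read(fd, &value, sizeof value);
    if (n != (ssize_t)sizeof value)
        return -1;
    *mode = value;
    return 1;
}
```

In longrun's loop, calling wait_for_mode(fd, 3, &mode) where sleep(3) used to be keeps the 3 second rhythm when nothing arrives and updates mode as soon as changemode writes to the socket. Note that after an early wake-up the next line is printed sooner than 3 seconds; tracking the remaining time is left out here.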
As mentioned in other answers, you need some kind of inter-process communication. You can find more info on the topic in the "Beej guide to Unix IPC" (it's a "classic"), available at:
http://beej.us/guide/bgipc/
Fernando

Execute command just before Mac going to sleep

I wrote a C program/LaunchDaemon that checks if my MacBook is at home (connected to my WLAN). If so, it disables my password protection; if not, it enables it.
Easy. But the problem is that when I take my MacBook anywhere else and password protection is disabled, it will wake up without a password protection.
My fix for this would be: enable the password protection every time just before it goes to sleep.
QUESTION: is there any way to find out when my Mac is preparing for sleep? Some interrupt I can let my program listen to?
You can do it using I/O Kit, check Apple's QA1340: Registering and
unregistering for sleep and wake notifications. You may also want to
analyze the SleepWatcher utility sources or use/integrate for your needs.
From the homepage:
SleepWatcher 2.2 (running with Mac OS X 10.5 to 10.8, source code included)
is a command line tool (daemon) for Mac OS X that monitors sleep, wakeup and
idleness of a Mac. It can be used to execute a Unix command when the Mac or
the display of the Mac goes to sleep mode or wakes up, after a given time
without user interaction or when the user resumes activity after a break or
when the power supply of a Mac notebook is attached or detached. It also can
send the Mac to sleep mode or retrieve the time since last user activity. A
little bit knowledge of the Unix command line is required to benefit from
this software.
I attach below the contents of my C file beforesleep.c which executes some command line commands (in my case shell commands and AppleScript scripts) when a "will sleep" notification is received.
Where you can put your code:
In order to run your code when the mac is going to sleep, just replace the system(...) calls with the code you wish to run.
In my case, I use system() as it allows me to run shell commands passed as strings, but if you prefer to run just C code instead, you can just put your C code there.
How to build it
In order to build this file, I run:
gcc -framework IOKit -framework Cocoa beforesleep.c
Remark
If you are going to use this code, make sure it is always running in background. For example, I have a Cron job which makes sure that this code is always running, and it launches it again in case it is accidentally killed for any reason (although it never happened to me so far). If you are experienced enough, you can find smarter ways to ensure this.
Further info
See this link (already suggested by sidyll) for more details about how this works.
Code template
#include <ctype.h>
#include <stdlib.h>
#include <stdio.h>
#include <mach/mach_port.h>
#include <mach/mach_interface.h>
#include <mach/mach_init.h>
#include <IOKit/pwr_mgt/IOPMLib.h>
#include <IOKit/IOMessage.h>
io_connect_t root_port; // a reference to the Root Power Domain IOService
void
MySleepCallBack( void * refCon, io_service_t service, natural_t messageType, void * messageArgument )
{
switch ( messageType )
{
case kIOMessageCanSystemSleep:
IOAllowPowerChange( root_port, (long)messageArgument );
break;
case kIOMessageSystemWillSleep:
system("/Users/andrea/bin/mylogger.sh");
system("osascript /Users/andrea/bin/pause_clockwork.scpt");
IOAllowPowerChange( root_port, (long)messageArgument );
break;
case kIOMessageSystemWillPowerOn:
//System has started the wake up process...
break;
case kIOMessageSystemHasPoweredOn:
//System has finished waking up...
break;
default:
break;
}
}
int main( int argc, char **argv )
{
// notification port allocated by IORegisterForSystemPower
IONotificationPortRef notifyPortRef;
// notifier object, used to deregister later
io_object_t notifierObject;
// this parameter is passed to the callback
void* refCon = NULL;
// register to receive system sleep notifications
root_port = IORegisterForSystemPower( refCon, &notifyPortRef, MySleepCallBack, &notifierObject );
if ( root_port == 0 )
{
printf("IORegisterForSystemPower failed\n");
return 1;
}
// add the notification port to the application runloop
CFRunLoopAddSource( CFRunLoopGetCurrent(),
IONotificationPortGetRunLoopSource(notifyPortRef), kCFRunLoopCommonModes );
/* Start the run loop to receive sleep notifications. Don't call CFRunLoopRun if this code
is running on the main thread of a Cocoa or Carbon application. Cocoa and Carbon
manage the main thread's run loop for you as part of their event handling
mechanisms.
*/
CFRunLoopRun();
//Not reached, CFRunLoopRun doesn't return in this case.
return (0);
}

zombiefied threads in ps (for a threaded program written in c)

I am afraid I am not sure what I'm doing wrong here.
I have a threaded application that starts 3 threads upon start
[root@Embest /]# ps
1111 root 608 S fw634c_d_cdm_sb
1112 root 608 S fw634c_d_cdm_sb
1113 root 608 S fw634c_d_cdm_sb
then waits in standby mode for commands from the serial port.
After it runs and returns to standby mode, I check with ps what's going on; there are zombified instances of the application (and the file name is shown in square brackets too):
1114 root Z [fw634c_d_cdm_sb]
...
...
...
1768 root Z [fw634c_d_cdm_sb]
about 628 of them.
The thing is,
the policy I'm following is:
- for detachable threads: don't care (they will exit and free resources on their own after completing)
- for joinable threads: I run pthread_join after pthread_create and wait for the thread function to complete, like this:
if (pthread_create(&tmp_thrd_id,&attr_joinable,run_function,(void *)&aStruct)!=0){
DEBUG(printf("thread NOT created \n"));
}else{
DEBUG(printf("thread created !\n"));
if (pthread_join(tmp_thrd_id,NULL)!=0){
DEBUG(printf("\nERROR in joining \n"));
}else{
DEBUG(printf("Thread completed\n"));
}
}
I only run pthread_exit(NULL) in main, which doesn't do much; after startup it just lies around because it must not be killed.
I'm probably forgetting something vital here, but I can't work out what after reading a few basic guides on threads...
Thank you for your help.
A "zombie" thread is a thread that has exited, and is waiting around for someone to call pthread_join to collect its exit status. So somewhere in your program you are creating threads and not eventually calling pthread_join or pthread_detach for those threads.
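The two ways of avoiding such leftovers could be sketched like this. The function names are invented for the example, and double_it is just a placeholder for the real thread function; build with -pthread.

```c
#include <pthread.h>

/* Placeholder thread function: doubles the int it is given. */
static void *double_it(void *arg)
{
    int *n = arg;
    *n *= 2;
    return NULL;
}

/* Joinable thread: the creator must call pthread_join() exactly once,
 * otherwise the finished thread lingers until the process exits. */
int run_and_join(int *value)
{
    pthread_t tid;
    if (pthread_create(&tid, NULL, double_it, value) != 0)
        return -1;
    return pthread_join(tid, NULL) == 0 ? 0 : -1;
}

/* Detached thread: its resources are released as soon as it finishes,
 * and nobody may join it. arg must stay valid while the thread runs. */
int run_detached(void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    pthread_t tid;

    if (pthread_attr_init(&attr) != 0)
        return -1;
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    int rc = pthread_create(&tid, &attr, fn, arg);
    pthread_attr_destroy(&attr);
    return rc == 0 ? 0 : -1;
}
```

If zombies keep piling up, it is worth checking that every pthread_create in the program goes through one of these two paths, including threads started by libraries or by code that assumes an attribute object was set to detached.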

Resources