How to control/restrict another process so it runs for only a very small amount of time in Linux - C

The solution in the linked question blocking-and-resuming-execution-of-an-independent-process-in-linux says I can use ptrace to achieve my goal.
I have tried to run the code from
how ptrace work between 2 processes
, but I am not getting any output.
I am asking specifically to learn how to use ptrace to make a program execute only a few instructions, as a debugger does.
How can I use ptrace in the following situation to restrict the other process to executing only a few instructions?
I have two independent C programs under Linux. Program-1 is running on CPU core-1 and Program-2 is running on CPU core-2.
Program-2 is executing a shared library function, func-2, which consists of about 200 instructions that perform an operation (add+shift) on data.
Shared library function:
-------
func-2()
{
    // code to perform operation (add+shift) on data
}
--------
Program-2:
main()
{
    while(1)
        func-2();
}
Program-1:
main()
{
    while(1)
    {
        // ptrace
        // OR
        // kill -STOP <pid of program2>
        // kill -CONT <pid of program2>
    }
}
I want to restrict Program-2 from inside Program-1 or from bash, so that Program-2 can execute only a few instructions, or is restricted to running for 1-2 microseconds at a time. I can't add any code inside Program-2.
Program-1 knows the PID of Program-2 and the base address of func-2.
I have heard that ptrace can be used to control another process, and that with ptrace it is possible to restrict a process to executing a single instruction. For me, even restricting the process for 1-2 microseconds (5-10 instructions) would be sufficient.
How can I control Program-2, which is running on the other CPU core? Any link to relevant documentation is highly appreciated. Thanks in advance.
I am using gcc under linux.
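For illustration, the kind of controller I have in mind for Program-1 is something like the untested sketch below. I am only assuming the standard PTRACE_ATTACH / PTRACE_SINGLESTEP / PTRACE_DETACH requests described in the ptrace man page, and that Program-1 has permission to trace Program-2:

/* Untested sketch: attach to Program-2, let it execute a few instructions, detach.
   Attaching may require the same user or root (and a permissive yama ptrace_scope). */
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    if (argc < 2) { fprintf(stderr, "usage: %s <pid of Program-2>\n", argv[0]); return 1; }
    pid_t pid = (pid_t)atoi(argv[1]);
    int status;

    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1) { perror("attach"); return 1; }
    waitpid(pid, &status, 0);                      /* wait until the tracee stops */

    for (int i = 0; i < 10; i++) {                 /* let it run ~10 instructions */
        if (ptrace(PTRACE_SINGLESTEP, pid, NULL, NULL) == -1) { perror("step"); break; }
        waitpid(pid, &status, 0);                  /* stops again after one instruction */
    }

    ptrace(PTRACE_DETACH, pid, NULL, NULL);        /* let Program-2 continue normally */
    return 0;
}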

Related

How to print a string without it getting disordered in a VxWorks multitasking environment?

void print_task(void)
{
    for(;;)
    {
        taskLock();
        printf("this is task %d\n", taskIdSelf());
        taskUnlock();
        taskDelay(0);
    }
}
void print_test(void)
{
    taskSpawn("t1", 100,0,0x10000, (FUNCPTR)print_task, 0,0,0,0,0,0,0,0,0,0);
    taskSpawn("t2", 100,0,0x10000, (FUNCPTR)print_task, 0,0,0,0,0,0,0,0,0,0);
}
The above code shows:
this is task this is task126738208 126672144 this is task this is
task 126712667214438208
this is task this is task 1266721441 26738208 this is task 126672144
this is task
What is the right way to print a string with multiple tasks?
The problem lies in taskLock().
Try a semaphore or mutex instead.
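A rough, untested sketch of the mutex idea, using the standard VxWorks semMCreate/semTake/semGive calls, might look like this:

/* Untested sketch: serialize printf() with a mutex semaphore instead of taskLock(). */
#include <vxWorks.h>
#include <semLib.h>
#include <taskLib.h>
#include <stdio.h>

static SEM_ID printMutex;

void print_task(void)
{
    for (;;)
    {
        semTake(printMutex, WAIT_FOREVER);      /* only one task prints at a time */
        printf("this is task %d\n", taskIdSelf());
        semGive(printMutex);
        taskDelay(0);
    }
}

void print_test(void)
{
    printMutex = semMCreate(SEM_Q_PRIORITY | SEM_INVERSION_SAFE);
    taskSpawn("t1", 100, 0, 0x10000, (FUNCPTR)print_task, 0,0,0,0,0,0,0,0,0,0);
    taskSpawn("t2", 100, 0, 0x10000, (FUNCPTR)print_task, 0,0,0,0,0,0,0,0,0,0);
}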
The main idea for printing in a multi-threaded environment is to use a dedicated task that does the printing.
Normally in vxWorks there is a log task that gets the log messages from all tasks in the system and prints to the terminal from that one task only.
The main problem with the vxWorks logger mechanism is that the logger task uses a very high priority and can change your system timing.
Therefore, you should create your own low-priority task that gets messages from the other tasks (using a message queue, shared memory protected by a mutex, …).
In that case there are two great benefits:
First, all system output is printed from one single task.
Second, and most important, the real-time tasks in the system do not lose time in the printf() function.
As you know, printf is a very slow function that uses system calls and will certainly change the timing of your tasks depending on the debug output you add.
Regarding taskLock:
taskLock is a command to the kernel; it tells the kernel to leave the currently running task on the CPU (i.e., not to preempt it) even while other tasks are READY.
As you wrote in the example code, the taskUnlock() function doesn't take arguments. The basic reason is to enable the kernel and interrupts to perform a taskUnlock in the system.
There are many system calls that perform a task unlock (and sometimes interrupt service routines do it as well).
Rather than invent a home-brew solution, just use logMsg(). It is the canonical safe and sane way to print stuff. Internally, it pushes your message onto a message queue. Then a separate task pulls messages off the queue and prints them. By using logMsg(), you gain the ability to print from ISRs, you don't get interleaved prints from multiple tasks printing simultaneously, and so on.
For example:
printf("this is task %d\n", taskIdSelf());
becomes
logMsg("this is task %d\n", taskIdSelf(), 0,0,0,0,0,0);

how to force a c program to run on a particular core

Say I have the following c program:
#include <stdio.h>
int main()
{
    printf("Hello world \n");
    getchar();
    return 0;
}
gcc 1.c -o helloworld
and, say I have a dual core machine:
cat /proc/cpuinfo | grep processor | wc -l
Now my question is: when we execute the program, how do we force it to run on core-0 (or any other particular core)?
How do I do this programmatically? Examples, APIs, or code references would be helpful.
If there are no APIs available, is there any compile-time, link-time, or load-time way of doing this?
Also, how do I check whether a program is running on core-0 or core-1 (or any other core)?
Since you are talking about /proc/cpuinfo, I assume you are using Linux. In Linux you would use the sched_setaffinity function. In your example you would call:
#define _GNU_SOURCE                              // needed for the CPU_* macros and sched_setaffinity
#include <sched.h>

cpu_set_t set;
CPU_ZERO(&set);                                  // clear cpu mask
CPU_SET(0, &set);                                // set cpu 0
sched_setaffinity(0, sizeof(cpu_set_t), &set);   // 0 is the calling process
Look up man sched_setaffinity for more details.
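For the last part of the question, a minimal sketch for checking which core the code is currently executing on could use sched_getcpu(), a GNU extension (untested):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
    /* Prints the CPU the calling thread is currently running on. */
    printf("running on cpu %d\n", sched_getcpu());
    return 0;
}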
This is OS-specific. As Felice points out, you can do it on Linux by calling sched_setaffinity in your program. If you end up running on multiple platforms, though, you'll have to code something different for each.
Alternatively, you can specify the affinity when you launch your executable, from the command line or a run script or whatever.
See taskset for a Linux command-line tool to do this.
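For example, taskset -c 0 ./helloworld launches the program pinned to core 0, and taskset -cp <pid> shows or changes the affinity of an already-running process (check the taskset man page for the exact options on your system).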

how to slow down a process?

Suppose I have a program that runs in a given amount of time (say, three seconds). I want to run this program so that it runs n-times slower (specified on command line). How would you achieve it with (or better, without) changes to the program ?
Please note that adding a sleep at the end is not a solution. The program has to run slower, not run at full speed for the first three seconds and then do nothing for the remaining time. Using "nice" under Unix is not a good solution either: the program will run slower if other processes demand the processor, but at full speed if nothing else is processor-demanding at the same time.
This is a curiosity question; nothing serious hinges on it. The fact is that I remember, 15-20 years ago, games that were simply too fast to play on newer processors because they were timed by the processor clock. You had to turn off the turbo.
Let's assume the program is a C compiled program.
One idea is to write a 'ptrace runner.' ptrace is the call that allows you to implement a debugger on platforms such as Linux and Mac.
The idea is to attach to the program and then just repeatedly tell the application to run one instruction with ptrace(PTRACE_SINGLESTEP). If that's not slow enough, you could add a sleep between each call to ptrace in the runner program.
I wrote a simple example on my Linux box showing how to slow down a child process with SIGSTOP and SIGCONT signals:
#include <unistd.h>
#include <stdio.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
void dosomething(void){
    static volatile unsigned char buffer[1000000];
    /* burn some CPU time touching the buffer */
    for(unsigned i=0;i<1000;i++)
        for(unsigned j=0;j<sizeof(buffer);buffer[j++]=i){;}
}
#define RUN  1   /* seconds the child is allowed to run */
#define WAIT 1   /* seconds the child is kept stopped   */
int main(void){
    int delay=0, status, pid = fork();
    if( !pid ){ kill(getpid(),SIGSTOP); dosomething(); return 0; }  /* child: wait for first CONT, then work */
    do{
        waitpid( pid, &status, WUNTRACED | WCONTINUED );
        if( WIFSTOPPED (status) ){ sleep(delay); kill(pid,SIGCONT); }         /* child stopped: wait, then resume it    */
        if( WIFCONTINUED(status) && WAIT ){ sleep(RUN ); kill(pid,SIGSTOP); } /* child running: let it run, then stop it */
        delay=WAIT;
    }while( !WIFEXITED(status) && !WIFSIGNALED (status) );
}
There is no slowdown when WAIT is zero; otherwise, after every RUN seconds the parent stops the child for WAIT seconds.
Runtime results:
RUN=1 WAIT=0
---------------
real 3.905s
user 3.704s
sys 0.012s
RUN=1 WAIT=1
---------------
real 9.061s
user 3.640s
sys 0.016s
RUN=1 WAIT=2
---------------
real 13.027s
user 3.372s
sys 0.032s
cpulimit is a tool that does something like this. It works by periodically sending
kill -STOP and kill -CONT to the process, which has the effect of making it run slower when averaged over time.
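For example, something like cpulimit -p <pid> -l 50 should cap the target at roughly 50% of one CPU (option names may differ between cpulimit versions; check its man page).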
If you have DTrace you may be able to use its chill() function. You can insert a chill at almost any place in a userland application, and in multiple places. It has been used before to replicate race conditions seen on slower systems.
I ran an application in a virtual machine under Ubuntu. It was really slow.
You can configure how much of the host system the virtual machine is allowed to use.
You might obfuscate the situation a little further by running a virtual machine under a virtual machine under a virtual machine, ...

How do I ensure my program runs from beginning to end without interruption?

I'm attempting to time code using RDTSC (no other profiling software I've tried is able to time to the resolution I need) on Ubuntu 8.10. However, I keep getting outliers from task switches and interrupts firing, which are causing my statistics to be invalid.
Considering my program runs in a matter of milliseconds, is it possible to disable all interrupts (which would inherently switch off task switches) in my environment? Or do I need to go to an OS which allows me more power? Would I be better off using my own OS kernel to perform this timing code? I am attempting to prove an algorithm's best/worst case performance, so it must be totally solid with timing.
The relevant code I'm using currently is:
inline uint64_t rdtsc()
{
    uint64_t ret;
    asm volatile("rdtsc" : "=A" (ret));
    return ret;
}
void test(int readable_out, uint32_t start, uint32_t end, uint32_t (*fn)(uint32_t, uint32_t))
{
    int i;
    for(i = 0; i <= 100; i++)
    {
        uint64_t clock1 = rdtsc();
        uint32_t ans = fn(start, end);
        uint64_t clock2 = rdtsc();
        uint64_t diff = clock2 - clock1;
        if(readable_out)
            printf("[%3d]\t\t%u [%llu]\n", i, ans, diff);
        else
            printf("%llu\n", diff);
    }
}
Extra points to those who notice I'm not properly handling overflow conditions in this code. At this stage I'm just trying to get a consistent output without sudden jumps due to my program losing the timeslice.
The nice value for my program is -20.
So to recap, is it possible for me to run this code without interruption from the OS? Or am I going to need to run it on bare hardware in ring0, so I can disable IRQs and scheduling? Thanks in advance!
If you call nanosleep() to sleep for a second or so immediately before each iteration of the test, you should get a "fresh" timeslice for each test. If you compile your kernel with 100HZ timer interrupts, and your timed function completes in under 10ms, then you should be able to avoid timer interrupts hitting you that way.
To minimise other interrupts, deconfigure all network devices, configure your system without swap and make sure it's otherwise quiescent.
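For instance, an untested helper along these lines could be called right before each timed iteration:

#include <time.h>

/* Sketch: call this immediately before each timed iteration so the
   measurement starts at the beginning of a fresh timeslice. */
static void settle(void)
{
    struct timespec pause = { 1, 0 };   /* sleep for one second */
    nanosleep(&pause, NULL);
}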
Tricky. I don't think you can turn the operating system 'off' and guarantee strict scheduling.
I would turn this upside down: given that it runs so fast, run it many times to collect a distribution of outcomes. Given that standard Ubuntu Linux is not a real-time OS in the narrow sense, all alternative algorithms would run in the same setup --- and you can then compare your distributions (using anything from summary statistics to quantiles to qqplots). You can do that comparison with Python, or R, or Octave, ... whichever suits you best.
You might be able to get away with running FreeDOS, since it's a single process OS.
Here's the relevant text from the second link:
Microsoft's DOS implementation, which is the de facto standard for DOS systems in the x86 world, is a single-user, single-tasking operating system. It provides raw access to hardware, and only a minimal layer for OS APIs for things like the file I/O. This is a good thing when it comes to embedded systems, because you often just need to get something done without an operating system in your way.
DOS has (natively) no concept of threads and no concept of multiple, on-going processes. Application software makes system calls via the use of an interrupt interface, calling various hardware interrupts to handle things like video and audio, and calling software interrupts to handle various things like reading a directory, executing a file, and so forth.
Of course, you'll probably get the best performance actually booting FreeDOS onto actual hardware, not in an emulator.
I haven't actually used FreeDOS, but I assume that since your program seems to be standard C, you'll be able to use whatever the standard compiler is for FreeDOS.
If your program runs in milliseconds, and if you are running on Linux:
Make sure that your timer frequency (on Linux) is set to 100 Hz (not 1000 Hz).
(cd /usr/src/linux; make menuconfig, and look at "Processor type and features" -> "Timer frequency")
This way your CPU will get interrupted every 10 ms.
Furthermore, consider that the default CPU time slice on Linux is 100 ms, so with a nice level of -20, you should not get descheduled if you are only running for a few milliseconds.
Also, you are looping 101 times on fn(). Consider making fn() a no-op to calibrate your system properly.
Compute statistics (average + stddev) instead of printing so many times (printing consumes your scheduled timeslice, the terminal will eventually get scheduled, etc.; avoid that).
RDTSC benchmark sample code
You can use chrt -f 99 ./test to run ./test with the maximum realtime priority. Then at least it won't be interrupted by other user-space processes.
Also, installing the linux-rt package will install a real-time kernel, which will give you more control over interrupt handler priority via threaded interrupts.
If you run as root, you can call sched_setscheduler() and give yourself a real-time priority. Check the documentation.
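A minimal, untested sketch of that approach, assuming the standard sched_setscheduler()/SCHED_FIFO interface:

#include <sched.h>
#include <stdio.h>

int main(void)
{
    struct sched_param param;
    param.sched_priority = sched_get_priority_max(SCHED_FIFO);

    /* 0 = the calling process; this normally requires root (or CAP_SYS_NICE). */
    if (sched_setscheduler(0, SCHED_FIFO, &param) == -1)
        perror("sched_setscheduler");

    /* ... run the timing code here ... */
    return 0;
}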
Maybe there is some way to disable preemptive scheduling on linux, but it might not be needed. You could potentially use information from /proc/<pid>/schedstat or some other object in /proc to sense when you have been preempted, and disregard those timing samples.

What's the algorithm behind sleep()?

Now there's something I always wondered: how is sleep() implemented ?
If it is all about using an API from the OS, then how is that API made?
Does it all boil down to using special machine code on the CPU? Does the CPU need a special co-processor or other gizmo without which you can't have sleep()?
The best known incarnation of sleep() is in C (to be more accurate, in the libraries that come with C compilers, such as GNU's libc). Almost every language today has its equivalent, but the implementation of sleep in some languages (think Bash) is not what we're looking at in this question...
EDIT: After reading some of the answers, I see that the process is placed in a wait queue. From there, I can guess two alternatives, either
a timer is set so that the kernel wakes the process at the due time, or
whenever the kernel is allowed a time slice, it polls the clock to check whether it's time to wake a process.
The answers only mention alternative 1. Therefore, I ask: how does this timer behave? If it's a simple interrupt that makes the kernel wake the process, how can the kernel ask the timer to "wake me up in 140 milliseconds so I can put the process in the running state"?
The "update" to question shows some misunderstanding of how modern OSs work.
The kernel is not "allowed" a time slice. The kernel is the thing that gives out time slices to user processes. The "timer" is not set to wake the sleeping process up - it is set to stop the currently running process.
In essence, the kernel attempts to fairly distribute the CPU time by stopping processes that are on CPU too long. For a simplified picture, let's say that no process is allowed to use the CPU more than 2 milliseconds. So, the kernel would set timer to 2 milliseconds, and let the process run. When the timer fires an interrupt, the kernel gets control. It saves the running process' current state (registers, instruction pointer and so on), and the control is not returned to it. Instead, another process is picked from the list of processes waiting to be given CPU, and the process that was interrupted goes to the back of the queue.
The sleeping process is simply not in the queue of things waiting for CPU. Instead, it's stored in the sleeping queue. Whenever the kernel gets a timer interrupt, the sleep queue is checked, and the processes whose time has come get transferred to the "waiting for CPU" queue.
This is, of course, a gross simplification. It takes very sophisticated algorithms to ensure security, fairness, balance, prioritize, prevent starvation, do it all fast and with minimum amount of memory used for kernel data.
There's a kernel data structure called the sleep queue. It's a priority queue. Whenever a process is added to the sleep queue, the expiration time of the most-soon-to-be-awakened process is calculated, and a timer is set. At that time, the expired job is taken off the queue and the process resumes execution.
(amusing trivia: in older unix implementations, there was a queue for processes for which fork() had been called, but for which the child process had not been created. It was of course called the fork queue.)
HTH!
Perhaps the major job of an operating system is to hide the complexity of a real piece of hardware from the application writer. Hence, any description of how the OS works runs the risk of getting really complicated, really fast. Accordingly, I am not going to deal with all the "what ifs" and "yeah buts" that a real operating system needs to deal with. I'm just going to describe, at a high conceptual level, what a process is, what the scheduler does, and how the timer queue works. Hopefully this is helpful.
What's a process:
Think of a process--let's just talk about processes, and get to threads later--as "the thing the operating system schedules". A process has an ID--think an integer--and you can think of that integer as an index into a table containing all the context of that process.
Context is the hardware information--registers, memory management unit contents, other hardware state--that, when loaded into the machine, will allow the process to "go". There are other components of context--lists of open files, state of signal handlers, and, most importantly here, things the process is waiting for.
Processes spend a lot of time sleeping (a.k.a. waiting)
A process spends much of its time waiting. For example, a process that reads or writes to disk will spend a lot of time waiting for the data to arrive or be acknowledged to be out on disk. OS folks use the terms "waiting" and "sleeping" (and "blocked") somewhat interchangeably--all meaning that the process is awaiting something to happen before it can continue on its merry way. It is just confusing that the OS API sleep() happens to use underlying OS mechanisms for sleeping processes.
Processes can be waiting for other things: network packets to arrive, window selection events, or a timer to expire, for example.
Processes and Scheduling
Processes that are waiting are said to be non-runnable. They don't go onto the run queue of the operating system. But when the event occurs which the process is waiting for, it causes the operating system to move the process from the non-runnable to the runnable state. At the same time, the operating system puts the process on the run queue, which is really not a queue--it's more of a pile of all the processes which, should the operating system decide to do so, could run.
Scheduling:
the operating system decides, at regular intervals, which processes should run. The algorithm by which the operating system decides to do so is called, somewhat unsurprisingly, the scheduling algorithm. Scheduling algorithms range from dead-simple ("everybody gets to run for 10 ms, and then the next guy on the queue gets to run") to far more complicated (taking into account process priority, frequency of execution, run-time deadlines, inter-process dependencies, chained locks and all sorts of other complicated subject matter).
The Timer Queue
A computer has a timer inside it. There are many ways this can be implemented, but the classic manner is called a periodic timer. A periodic timer ticks at a regular interval--in most operating systems today, I believe this rate is 100 times per second--100 Hz--every 10 milliseconds. I'll use that value in what follows as a concrete rate, but know that most operating systems worth their salt can be configured with different ticks--and many don't use this mechanism and can provide much better timer precision. But I digress.
Each tick results in an interrupt to the operating system.
When the OS handles this timer interrupt, it increments its idea of system time by another 10 ms. Then, it looks at the timer queue and decides what events on that queue need to be dealt with.
The timer queue really is a queue of "things which need to be dealt with", which we will call events. This queue is ordered by time of expiration, soonest events first.
An "event" can be something like, "wake up process X", or "go kick disk I/O over there, because it may have gotten stuck", or "send out a keepalive packet on that fibrechannel link over there". Whatever the operating system needs to have done.
When you have a queue ordered in this way, it's easy to manage the dequeuing. The OS simply looks at the head of the queue, and decrements the "time to expiration" of the event by 10 ms every tick. When the expiration time goes to zero, the OS dequeues that event, and does whatever is called for.
In the case of a sleeping process, it simply makes the process runnable again.
Simple, huh?
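As a toy illustration of the idea (not how any particular kernel actually codes it), the tick handler for such a queue might look roughly like this:

/* Toy illustration only -- not how any real kernel implements it. */
#include <stddef.h>

struct timer_event {
    int ticks_left;                /* 10 ms ticks until this event expires       */
    void (*action)(void *arg);     /* e.g. "make process X runnable again"       */
    void *arg;
    struct timer_event *next;      /* list kept sorted, soonest expiration first */
};

static struct timer_event *timer_queue;   /* head = next event to expire */

void on_timer_tick(void)           /* called from the periodic timer interrupt */
{
    if (timer_queue == NULL)
        return;
    timer_queue->ticks_left--;                 /* only the head needs decrementing */
    while (timer_queue != NULL && timer_queue->ticks_left <= 0) {
        struct timer_event *expired = timer_queue;
        timer_queue = expired->next;           /* dequeue the expired event       */
        expired->action(expired->arg);         /* e.g. wake the sleeping process  */
    }
}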
There are at least two different levels at which to answer this question (and a lot of other things that get confused with it; I won't touch them).
At the application level, this is what the C library does: it's a simple OS call that tells the OS not to give CPU time to this process until the time has passed. The OS has a queue of suspended applications, and some info about what they are waiting for (usually either time, or some data to appear somewhere).
At the kernel level, when the OS doesn't have anything to do right now, it executes a 'hlt' instruction. This instruction doesn't do anything, but it never finishes by itself; of course, a hardware interrupt is serviced normally. Put simply, the main loop of an OS looks like this (from very, very far away):
allow_interrupts ();
while (true) {
    hlt;
    check_todo_queues ();
}
The interrupt handlers simply add things to the todo queues. The real-time clock is programmed to generate interrupts either periodically (at a fixed rate) or at some fixed time in the future when the next process wants to be awakened.
A multitasking operating system has a component called the scheduler; this component is responsible for giving CPU time to threads, and calling sleep tells the OS not to give CPU time to this thread for some time.
see http://en.wikipedia.org/wiki/Process_states for complete details.
I don't know anything about Linux, but I can tell you what happens on Windows.
Sleep() causes the process' time-slice to end immediately to return control to the OS. The OS then sets up a timer kernel object that gets signaled after the time elapses. The OS will then not give that process any more time until the kernel object gets signaled. Even then, if other processes have higher or equal priority, it may still wait a little while before letting the process continue.
Special CPU machine code is used by the OS to do process switching. Those functions cannot be accessed by user-mode code, so they are accessed strictly by API calls into the OS.
Essentially, yes, there is a "special gizmo" - and it's important for a lot more than just sleep().
Classically, on x86 this was an Intel 8253 or 8254 "Programmable Interval Timer". In the early PCs, this was a separate chip on the motherboard that could be programmed by the CPU to assert an interrupt (via the "Programmable Interrupt Controller", another discrete chip) after a preset time interval. The functionality still exists, although it is now a tiny part of a much larger chunk of motherboard circuitry.
The OS today still programs the PIT to wake it up regularly (in recent versions of Linux, once every millisecond by default), and this is how the Kernel is able to implement pre-emptive multitasking.
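For flavour, the classic bare-metal sequence for programming PIT channel 0 to a given rate looks roughly like the sketch below. This assumes an outb() helper and x86 port I/O; it is an illustration of the idea, not something to run from an ordinary user program:

/* Bare-metal sketch: program PIT channel 0 for a periodic interrupt at roughly hz ticks per second. */
#define PIT_INPUT_HZ 1193182u          /* the PIT's fixed input clock */

static void outb(unsigned short port, unsigned char value)
{
    __asm__ volatile ("outb %0, %1" : : "a"(value), "Nd"(port));
}

void pit_set_frequency(unsigned int hz)
{
    unsigned int divisor = PIT_INPUT_HZ / hz;
    outb(0x43, 0x36);                  /* channel 0, lobyte/hibyte access, mode 3 */
    outb(0x40, divisor & 0xFF);        /* low byte of the divisor  */
    outb(0x40, (divisor >> 8) & 0xFF); /* high byte of the divisor */
}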
glibc 2.21 Linux
Forwards to the nanosleep system call.
glibc is the default implementation for the C stdlib on most Linux desktop distros.
How to find it: the first reflex is:
git ls-files | grep sleep
This contains:
sysdeps/unix/sysv/linux/sleep.c
and we know that:
sysdeps/unix/sysv/linux/
contains the Linux specifics.
On the top of that file we see:
/* We are going to use the `nanosleep' syscall of the kernel. But the
kernel does not implement the stupid SysV SIGCHLD vs. SIG_IGN
behaviour for this syscall. Therefore we have to emulate it here. */
unsigned int
__sleep (unsigned int seconds)
So if you trust comments, we are done basically.
At the bottom:
weak_alias (__sleep, sleep)
which basically says __sleep == sleep. The function uses nanosleep through:
result = __nanosleep (&ts, &ts);
After grepping:
git grep nanosleep | grep -v abilist
we get a small list of interesting occurrences, and I think __nanosleep is defined in:
sysdeps/unix/sysv/linux/syscalls.list
on the line:
nanosleep - nanosleep Ci:pp __nanosleep nanosleep
which is some super DRY magic format parsed by:
sysdeps/unix/make-syscalls.sh
Then from the build directory:
grep -r __nanosleep
Leads us to: /sysd-syscalls which is what make-syscalls.sh generates and contains:
#### CALL=nanosleep NUMBER=35 ARGS=i:pp SOURCE=-
ifeq (,$(filter nanosleep,$(unix-syscalls)))
unix-syscalls += nanosleep
$(foreach p,$(sysd-rules-targets),$(foreach o,$(object-suffixes),$(objpfx)$(patsubst %,$p,nanosleep)$o)): \
$(..)sysdeps/unix/make-syscalls.sh
$(make-target-directory)
(echo '#define SYSCALL_NAME nanosleep'; \
echo '#define SYSCALL_NARGS 2'; \
echo '#define SYSCALL_SYMBOL __nanosleep'; \
echo '#define SYSCALL_CANCELLABLE 1'; \
echo '#include <syscall-template.S>'; \
echo 'weak_alias (__nanosleep, nanosleep)'; \
echo 'libc_hidden_weak (nanosleep)'; \
) | $(compile-syscall) $(foreach p,$(patsubst %nanosleep,%,$(basename $(@F))),$($(p)CPPFLAGS))
endif
It looks like part of a Makefile. git grep sysd-syscalls shows that it is included at:
sysdeps/unix/Makefile:23:-include $(common-objpfx)sysd-syscalls
compile-syscall looks like the key part, so we find:
# This is the end of the pipeline for compiling the syscall stubs.
# The stdin is assembler with cpp using sysdep.h macros.
compile-syscall = $(COMPILE.S) -o $@ -x assembler-with-cpp - \
$(compile-mkdep-flags)
Note that -x assembler-with-cpp is a gcc option.
This #defines parameters like:
#define SYSCALL_NAME nanosleep
and then use them at:
#include <syscall-template.S>
OK, this is as far as I will go on the macro expansion game for now.
I think then this generates the posix/nanosleep.o file which must be linked together with everything.
Linux 4.2 x86_64 nanosleep syscall
Uses the scheduler: it's not a busy sleep.
Search ctags:
sys_nanosleep
Leads us to kernel/time/hrtimer.c:
SYSCALL_DEFINE2(nanosleep, struct timespec __user *, rqtp,
hrtimer stands for High Resolution Timer. From there the main line looks like:
hrtimer_nanosleep
do_nanosleep
set_current_state(TASK_INTERRUPTIBLE); which is interruptible sleep
freezable_schedule(); which calls schedule() and allows other processes to run
hrtimer_start_expires
hrtimer_start_range_ns
TODO: reach the arch/x86 timing level
TODO: are the above steps done directly in the system call interrupt handler, or in a regular kernel thread?
A few articles about it:
https://geeki.wordpress.com/2010/10/30/ways-of-sleeping-in-linux-kernel/
http://www.linuxjournal.com/article/8144
