Why does clearing the interrupt flag cause a segmentation fault in C?

I am learning some basics of assembly and C. For learning purposes I decided to write a simple program that disables interrupts, so that when the user wants to type something in the console, he/she can't:
#include <stdio.h>

int main() {
    int a;
    printf("enter your number : ");
    asm ("cli");
    scanf("%d", &a);
    printf("your number is %d\n", a);
    return 0;
}
but when I compile and run this with GCC I get a segmentation fault:
Segmentation fault (core dumped)
And when I debug it with gdb, I get this message when the program reaches the asm("cli"); line:
Program received signal SIGSEGV, Segmentation fault.
main () at cli.c:6
6 asm ("cli");

This happens because you can't disable interrupts from a user-space program: all interrupts are under the control of the kernel, so you need to do it from kernel space. Before you do, you should learn kernel internals first; playing with interrupts is a critical matter and, to my knowledge, requires deeper familiarity with the kernel.
You need to write a kernel module that can interact with user space through a /dev/ (or some other) interface. The user-space code should then ask the kernel module to disable interrupts.
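For illustration, a minimal sketch of the interesting part of such a module, using the kernel's local_irq_save()/local_irq_restore() pair; the module name and the delay are made up, and the /dev plumbing is omitted:

#include <linux/module.h>
#include <linux/init.h>
#include <linux/irqflags.h>
#include <linux/delay.h>

static int __init irqtoy_init(void)
{
    unsigned long flags;

    local_irq_save(flags);     /* interrupts off, on THIS cpu only */
    mdelay(10);                /* busy-wait briefly; do very little here */
    local_irq_restore(flags);  /* interrupts back on */

    pr_info("irqtoy: toggled local interrupts\n");
    return 0;
}

static void __exit irqtoy_exit(void) { }

module_init(irqtoy_init);
module_exit(irqtoy_exit);
MODULE_LICENSE("GPL");

Note that local_irq_save() only masks interrupts on the CPU it runs on, which is one reason user-space code has no business doing this directly.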

cli is a privileged instruction. It raises a #GP(0) exception "If the CPL is greater (has less privilege) than the IOPL of the current program or procedure". This #GP is what causes Linux to deliver a SIGSEGV to your process.
Under Linux, you could make an iopl(3) system call to raise your IO priv level to match your ring 3 CPL, and then you could disable interrupts from user-space. (But don't do this, it's not supported AFAIK. The intended use-case for iopl is to use in and out instructions from user-space with high port numbers, not cli/sti. x86 just happens to use the same permissions for both.)
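For illustration only, a sketch of that mechanism; it must run as root, and everything after the iopl() call is at your own risk:

#include <stdio.h>
#include <sys/io.h>    /* iopl(); x86 Linux, glibc */

int main(void)
{
    if (iopl(3) != 0) {       /* raise IOPL to 3 so cli/sti don't #GP */
        perror("iopl");
        return 1;
    }
    asm volatile("cli");      /* external interrupts off on this core */
    /* keep this window as short as possible */
    asm volatile("sti");      /* re-enable immediately */
    puts("survived");
    return 0;
}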
You'll probably crash your system if you don't re-enable interrupts right away, or maybe even if you do. Or you'll at least screw up that CPU on a multi-core system. Basically, don't do this unless you're ready to press the reset button, i.e. you've shut down X11, saved your files, and run sync. Also remount your filesystems read-only.
Or try it in a virtual machine or simulator like BOCHS that will let you break in with a debugger even while interrupts are disabled. Or try it while booted from a USB stick.
Note that disabling interrupts only disables external interrupts. Software-generated interrupts like int $0x80 are still taken, but making system calls with interrupts disabled is probably an even worse idea. (It might work, though. The kernel saves/restores EFLAGS, so it probably won't return to user-space with interrupts re-enabled. Still, leaving interrupts disabled for a long time is a Bad Thing for interrupt latency.)
If you want to play around with disabling interrupts as a beginner, you should probably do it from a toy boot-sector program that uses BIOS calls for I/O. Or just look in the Linux kernel source for some places where it disables/enables interrupts if you're curious why it might do that.
IMO, "normal" asm in user-space is plenty interesting. With performance counters, you can see the details of how the CPU decodes and executes instructions. See links in the x86 tag wiki for manuals, guides, and performance tuning info.

Related

How could we sleep when we are executing a syscall that executes in interrupt mode

When I execute a system call to do a write or something else, the ISR corresponding to the exception executes in interrupt mode (on Cortex-M3 the IPSR register holds a non-zero value, 0xb). And what I have learned is that when we execute code in interrupt mode we cannot sleep and cannot use functions that might block ...
My question is: is there any mechanism by which the ISR could keep executing in interrupt mode and at the same time use functions that might block, or is there some kind of trick implemented?
Caveat: This is more of a comment than an answer but is too big to fit in a comment or series of comments.
TL;DR: Needing to sleep or execute a blocking operation from an ISR is a fundamental misdesign. This seems like an XY problem: https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem
Doing sleep [even in application code] is generally a code smell. Why do you feel you need to sleep [vs. some event/completion driven mechanism]?
Please edit your question and add clarification [i.e. don't just add comments]
When I am executing a system call to do write or something else
What is your application doing? A write to what device? What "something else"?
What is the architecture/board, kernel/distro? (e.g. Raspberry Pi running Raspbian? NVIDIA Jetson? BeagleBone? Xilinx FPGA with PetaLinux?)
What is the H/W device and where did the device driver come from? Did you write the device driver yourself or is it a standard one that comes with the kernel/distro? If you wrote it, please post it in your question.
Is the device configured properly? (e.g.) Are the DTB entries correct?
Is the device a block device, such as a disk controller? Or, is it a character device, such as a UART? Does the device transfer data via DMA? Or, does it transfer data by reading/writing to/from an IO port?
What do you mean by "exception"? Generally, an exception is an abnormal condition (e.g. segfault, bus error, etc.). Please describe the exact context/scenario in which this occurs.
Generally, an ISR does only small things: (e.g.) grab [and save] status from the device, clear/rearm the interrupt in the interrupt controller, start the next queued transfer request, and wake up the sleeping base-level task (usually the task that executed the syscall [waiting on a completion event in kernel mode]).
More elaborate actions are generally deferred and handled in the interrupt's "bottom half" handler and/or tasklet (see the sketch at the end of this list). Or, the base level is woken up and it handles the remaining processing.
What kernel subsystems are involved? Are you using platform drivers? Are you interfacing from within the DMA device driver framework? Are message buses involved (e.g. I2C, SPI, etc.)?
Interrupt and device handling in the Linux kernel is somewhat different from what one might do in a "bare metal" system or RTOS (e.g. FreeRTOS). So, if you're coming from those environments, you'll need to think about restructuring your driver code [and/or application code].
What are your requirements for device throughput and latency?
You may wish to consult a good book on linux device driver design. And, you may wish to consult the kernel's Documentation subdirectory.
If you're able to provide more information, I may be able to help you further.
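As an aside, the top-half/bottom-half split described above might look roughly like this using the kernel's threaded-interrupt API; the IRQ number, device pointer, and handler names are placeholders:

#include <linux/interrupt.h>

static irqreturn_t my_quick_handler(int irq, void *dev)
{
    /* top half: read/ack device status, get out fast */
    return IRQ_WAKE_THREAD;       /* defer the real work */
}

static irqreturn_t my_thread_fn(int irq, void *dev)
{
    /* bottom half: runs in a kernel thread, may sleep,
       may use kmalloc(GFP_KERNEL), etc. */
    return IRQ_HANDLED;
}

/* in the driver's probe/init code:
 *   request_threaded_irq(MY_IRQ, my_quick_handler, my_thread_fn,
 *                        IRQF_ONESHOT, "mydev", my_dev);
 */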
UPDATE:
A system call is not really in the same class as a hardware interrupt as far as the kernel is concerned, even if the CPU hardware uses the same sort of exception vector mechanisms for handling both hardware and software interrupts. The kernel treats the system call as a transition from user mode to kernel mode. – Ian Abbott
This is a succinct/great explanation. The "mode" or "context" has little to do with how we got/get there from a H/W mechanism.
The CPU doesn't really "understand" interrupt mode [as defined by the kernel]. It understands "supervisor" vs "user" privilege level [sometimes called "mode"].
When executing at user privilege level, an interrupt/exception will cause a transition from "user" level to "supervisor" level. The CPU may have a special register that specifies the address of the [initial] supervisor stack pointer. Atomically, it swaps that value in, pushing the user SP onto the new kernel stack.
If the interrupt is interrupting a CPU that is already at supervisor level, the existing [supervisor] SP will be used unchanged.
Note that x86 has privilege "ring" levels. User mode is ring 3 and the highest [most privileged] level is ring 0. For arm, some arches can have a "hypervisor" privilege level [which is higher privilege than "supervisor" privilege].
The setup of the mode/context is handled in arch/arm/kernel/entry-*.S code.
An svc is a synchronous interrupt [generated by a special CPU instruction]. The resulting context is the context of the currently executing thread. It is analogous to "call function in kernel mode". The resulting context is "kernel thread mode". At that point, it's not terribly useful to think of it as an "interrupt" anymore.
In fact, on some arches, the syscall instruction/mechanism doesn't use the interrupt vector table. It may have a fixed address or use a "call gate" mechanism (e.g. x86).
Each thread has its own stack which is different than the initial/interrupt stack.
Thus, once the entry code has established the context/mode, it is not executing in "interrupt mode". So, the full range of kernel functions is available to it.
An interrupt from a H/W device is asynchronous [may occur at any time the CPU's internal interrupt enable flag is set]. It may interrupt a userspace application [executing in application mode] OR kernel thread mode OR an existing ISR executing in interrupt mode [from another interrupt]. The resulting ISR is executing in "interrupt mode" or "ISR mode".
Because the ISR can interrupt a kernel thread, it may not do certain things. For example, if the CPU were in [kernel] thread mode, and it was in the middle of a kmalloc call [GFP_KERNEL], the ISR would see partial state and any action that tried to adjust the kernel's heap would result in corruption of the heap.
This is a design choice by linux for speed.
Kernel ISRs may be classified as "fast interrupts". The ISR executes with the CPU interrupt enable [IE] flag cleared. No normal H/W interrupt may interrupt the ISR.
So, if another H/W device asserts its interrupt request line [in the external interrupt controller], that request will be "pending". That is, the request line has been asserted but the CPU has not acknowledged it [and the CPU has not jumped via the interrupt table].
The request will remain pending until the CPU allows further interrupts by asserting IE. Or, the CPU may clear the pending interrupt without taking action by clearing the pending interrupt in the interrupt controller.
For a "slow" interrupt ISR, the interrupt entry code will clear the interrupt in the external interrupt controller. It will then rearm interrupts by setting IE and call the ISR. This ISR can be interrupted by other [higher priority] interrupts. The result is a "stacked" interrupt.
I have been searching all over the place, and the conclusion I have come to is that interrupts have a higher priority than some exceptions in the Linux kernel.
An exception [synchronous interrupt] can be interrupted if the IE flag is enabled. An svc is treated differently but after the entry code is executed, the IE flag is set, so the actual syscall code [executing in kernel thread mode] can be interrupted by a H/W interrupt.
Or, in limited circumstances, the kernel code can generate an exception (e.g. a page fault caused by a kernel action [which is usually deemed fatal]).
but I am still looking at how exactly the context switch happens when executing an exception, letting the processor execute in thread mode while the SVCall exception is pending (was preempted and has not returned yet)... I think when I understand that, it will become clearer to me.
I think you have to be very careful with the terminology. In particular, when combining terms from disparate sources. Although user mode, kernel thread mode, or interrupt mode can be considered a context [in the dictionary sense of the word], context switching usually means that the current thread is suspended, the scheduler selects a new thread to run and resumes it. That is separate from the user-to-kernel transition.
And if there is any recommended resources about that for ARM-Cortex-M3/4, it would be nice
Here is something: https://interrupt.memfault.com/blog/arm-cortex-m-exceptions-and-nvic But, be very careful in applying the terminology therein. What it considers "pending" only exists in the kernel during the entry code. What is more relevant is what the kernel does to set up mode/context and the terms are not equivalent.
So, from the kernel's standpoint, it's probably better to not consider an svc as "pending".

How to single-step code on-target with no JTAG, breakpoints, simulator, emulator

Let's say you have a pointer to a function whose source you do not have and which is "untrusted" because it might read/write to a disallowed memory region.
Before it executes each assembly instruction, you want to verify that it doesn't access disallowed memory regions.
The OS is (almost) bare-metal i.e. a custom RTOS (so no Linux or QNX).
This is for a functionality that needs to be enabled not only during development but during normal runtime.
Ideally, it'd run something like this:
void (*fptr)(int);
fptr = &someFunction; // untrusted, don't have source

// enable interrupts for each assembly instruction
_EN_INT();

// call the function
fptr();
// every time the PC increments, some other code runs which verifies that
// if any loads/stores are executed, they don't access some disallowed
// memory range

// disable interrupts for each assembly instruction
_DIS_INT();
QUESTION
Is it possible to call that function and pause execution after every assembly instruction?
The OS is (almost) bare-metal i.e. a custom RTOS (so no Linux or QNX).
My answer assumes that you can modify the "OS" the way you need it...
Cortex MK20DX256VLH7
This seems to be a Cortex M4 CPU.
How to single-step code on-target with no JTAG, breakpoints
From the doc, it doesn't say whether you NEED an external debugger to resume execution.
If the CPU is really stopped, you'll definitely need an external signal (e.g. from a debugger).
However, most CPUs support software debugging. This means that an interrupt service routine is executed whenever a breakpoint is hit. To continue execution you simply return from the interrupt service routine.
I don't know about the Cortex M4, but on the Cortex M3 you'll have to set some special registers to enable that feature. Whenever a BKPT instruction is hit, interrupt #12 (*) is executed.
For code in RAM you simply write a BKPT instruction (0xBExx, e.g. 0xBEBE) to the address where you want to set your breakpoint. (Before writing, you read out the old value so you can restore it later on.)
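A sketch of that save-and-patch step (helper names are made up; a real implementation also has to worry about flushing the pipeline after modifying code):

#include <stdint.h>

static uint16_t saved_opcode;     /* Thumb instructions are halfwords */

static void set_ram_breakpoint(uint16_t *addr)
{
    saved_opcode = *addr;         /* remember the original instruction */
    *addr = 0xBEBE;               /* overwrite with BKPT #0xBE */
}

static void clear_ram_breakpoint(uint16_t *addr)
{
    *addr = saved_opcode;         /* restore the original instruction */
}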
For code in Flash the M3 has a "Flash patching unit" which allows you to specify up to three addresses which shall be read out as 0xBExx (0xBEBE ?) even if other data is stored there. This allows you to set up to 3 breakpoints in Flash.
Interesting for you: The register controlling the debug features in the M3 (named "DEMCR") also has a bit named "MON_STEP":
If you set this bit in interrupt handler #12, then exactly one instruction is executed after returning from the interrupt handler, and interrupt #12 is triggered again. The use case for this feature is - of course - single-stepping code!
To stop single-stepping you'll have to clear the MON_STEP bit again...
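In C, that might look like the following; the DEMCR address and bit positions below are the architecturally defined ones, but verify them against the ARM documentation for your part:

#include <stdint.h>

#define DEMCR     (*(volatile uint32_t *)0xE000EDFCu)
#define MON_EN    (1u << 16)    /* BKPT raises the debug-monitor exception */
#define MON_STEP  (1u << 18)    /* single-step after exception return */

void enable_debug_monitor(void)
{
    DEMCR |= MON_EN;
}

void DebugMon_Handler(void)     /* the "interrupt #12" from the text */
{
    /* inspect the stacked PC, check the breakpoint, etc. ... */
    DEMCR |= MON_STEP;          /* execute exactly one instruction after
                                   returning, then re-enter this handler */
    /* clear MON_STEP again to resume free-running execution */
}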
Important 1:
I don't know if the MK20DX256VLH7 really has all these features. However, because it is a Cortex M4 chip, and the M4 should have nearly all the features of the M3, these features should be present...
Important 2:
Implementing single-stepping and debugging is not done quickly. Assembly language knowledge will be very helpful and you'll need a lot of time...
From the doc, ...
You will not only need the documentation for the MK20DX256VLH7 from NXP but you'll also need the Cortex M4 documentation from ARM.
(*) Offset 4*12 in the vector table is meant here (which is named "IRQ(-4)" in some ARM documents); not IRQ12.
Yes, the ARM emulator/interpreter sounds exactly like what I want. Is there a free one?
qemu is open-source, most of it is GPLv2. https://wiki.qemu.org/License. You'd probably need to modify it a lot, because it's designed for use as a stand-alone wrapper for a whole Unix process (qemu-user) or whole machine (qemu-system).
I googled, and there's also http://www.unicorn-engine.org/ which is designed to be used as part of a larger program (written in C with bindings for calling from various languages). It's also GPLv2 (not LGPL), so you can use it if the rest of your code is also Free software.
It's actually based on the CPU-emulation code from QEMU; they stripped out all the device / BIOS emulation stuff to make a flexible library for just emulating CPUs.
Presumably you could configure some memory protections for it and set up the starting machine state, and let it run your function (with a return address that leads to some code that hands control back to your main code?)
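A rough sketch of how Unicorn's hooks could provide the per-instruction supervision the question asks for; the load address, Thumb mode, and "disallowed" range here are made-up examples:

#include <unicorn/unicorn.h>
#include <stdio.h>

static void on_insn(uc_engine *uc, uint64_t addr, uint32_t size, void *user)
{
    /* called before every emulated instruction */
}

static void on_mem(uc_engine *uc, uc_mem_type type, uint64_t addr,
                   int size, int64_t value, void *user)
{
    if (addr >= 0x40000000 && addr < 0x40001000) {   /* disallowed range */
        printf("blocked access to 0x%llx\n", (unsigned long long)addr);
        uc_emu_stop(uc);                             /* abort emulation */
    }
}

int run_untrusted(const uint8_t *code, size_t len)
{
    uc_engine *uc;
    uc_hook h1, h2;

    uc_open(UC_ARCH_ARM, UC_MODE_THUMB, &uc);
    uc_mem_map(uc, 0x1000, 0x1000, UC_PROT_ALL);     /* map a code page */
    uc_mem_write(uc, 0x1000, code, len);
    uc_hook_add(uc, &h1, UC_HOOK_CODE, on_insn, NULL, 1, 0);
    uc_hook_add(uc, &h2, UC_HOOK_MEM_READ | UC_HOOK_MEM_WRITE,
                on_mem, NULL, 1, 0);
    uc_err err = uc_emu_start(uc, 0x1000 | 1, 0x1000 + len, 0, 0);
    uc_close(uc);
    return err == UC_ERR_OK ? 0 : -1;
}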

Where to find the ISR called by the linux kernel when a keyboard button is pressed?

Where in the linux kernel can I find the ISR called by the kernel when a keyboard button is pressed?
Apparently there are hardcoded IRQ numbers; the keyboard seems to have 1, whereas the touchpad/mouse has 12. Where can I find these in the kernel source code?
I know I could probably spend hours searching for this myself, but maybe a more experienced kernel hacker can speed this up.
The actual IRQ numbers would depend on architecture. Your example numbers seem to match x86:
http://en.wikipedia.org/wiki/Interrupt_request_%28PC_architecture%29#x86_IRQs
(It also says "On Linux, IRQ mappings can be viewed by executing cat /proc/interrupts or using the procinfo utility.")

How does software recognize an interrupt has occurred?

As we know, we write embedded C programs for task management, memory management, ISRs, file systems, and so on.
I would like to know: if some task or process is running and an interrupt occurs at the same time, how does the software, process, or system come to know that the interrupt has occurred, so that it pauses the current task's execution and starts serving the ISR?
Suppose I write code like the following:
// Dummy code
void main()
{
    for (;;)
        printf("\n forever");
}

// Dummy code for ISR, for understanding
void ISR()
{
    printf("\n Interrupt occurred");
}
In the above code, if an external interrupt occurs, how does main() come to know that the interrupt occurred, so that it starts serving the ISR first?
main doesn't know. You have to execute some system-dependent function in your setup code (maybe in main) that registers the interrupt handler with the hardware interrupt routine/vector, etc.
Whether that interrupt code can execute a C function directly varies quite a lot; runtime conventions for interrupt procedures don't always follow runtime conventions for application code. Usually there's some indirection involved in getting a signal from the interrupt routine to your C code.
Your query: I understood your answer. But I wanted to know: when an interrupt occurs, how does the current task's execution get stopped/paused, and how does the ISR start executing?
Well, Rashmi, to answer your query, read on.
When the microcontroller detects an interrupt, it stops execution of the program after completing the current instruction. It then pushes the PC (program counter) onto the stack and loads the PC with the vector location of that interrupt; hence, program flow is directed to the interrupt service routine. On completion of the ISR, the microcontroller pops the stored program counter from the stack and loads it back into the PC; hence, program execution resumes from the location where it was stopped.
Does that answer your query?
It depends on your target.
For example, the Atmel mega family uses a pre-processor directive to register the ISR with an interrupt vector. When an interrupt occurs, the corresponding interrupt flag is raised in the relevant status register. If the global interrupt flag is set, the program counter is stored on the stack before the ISR is called. This all happens in hardware, and the main function knows nothing about it.
In order for main to know whether an interrupt has occurred, you need to implement a shared data resource between the interrupt routine and your main function, and all the rules from RTOS programming apply here. Because the ISR may execute at any time, it is not safe to read a shared resource from main without disabling interrupts first.
On an ATMEL target this could look like:
volatile int shared;

int main() {
    char status_register;
    int buffer;

    while (1) {
        status_register = SREG;  // save the current interrupt state
        CLI();                   // disable interrupts around the read
        buffer = shared;         // take a consistent copy of the shared data
        SREG = status_register;  // restore the previous interrupt state
        // perform some action on the buffered copy of the shared resource here.
    }
    return 0;
}

void ISR(void) {
    // update shared resource here.
}
Please note that the ISR is not added to the vector table here. Check your compiler documentation for instructions on how to do that.
Also, an important thing to remember is that ISRs should be very short and very fast to execute.
On most embedded systems the hardware has some specific memory address that the instruction pointer will move to when a hardware condition indicates an interrupt is required.
When the instruction pointer is at this specific location it will then begin to execute the code there.
On a lot of systems the programmer will place only the address of the ISR at this location, so that when the interrupt occurs and the instruction pointer moves to that specific location, execution then jumps to the ISR.
Try doing a Google search on "interrupt vectoring".
Interrupt handling is transparent to the running program. The processor branches automatically to a previously configured address, depending on the event, that address being the corresponding ISR function. When returning from the interrupt, a special instruction restores the interrupted program.
Actually, most of the time you won't even want an interrupted program to know it has been interrupted. If you need such information, the program should call a driver function instead.
Interrupts are a hardware thing, not a software thing. When the interrupt signal hits the processor, the processor (generally) completes the current instruction, preserves the state in some way, shape, or form (so it can get back to where it was), and in some way, shape, or form starts executing the interrupt service routine. The ISR is generally not C code; at least the entry point is usually special, as the processor does not conform to the compiler's calling convention. The ISR might call C code, but then you end up with mistakes like the one you made: calling functions like printf that should not be in an ISR. Once in C, it is hard to keep from writing general C code in an ISR, rather than the typical get-in-and-get-out type of thing.
Ideally your application-layer code should never know the interrupt happened; there should be no (hardware-based) residue affecting your program. You may choose to leave something for the application to see, like a counter or other shared data, which you need to mark as volatile so the application and the ISR can share it. It is not uncommon to have the ISR simply flag that an interrupt happened, have the application poll that flag/counter/variable, and do the handling primarily in the application, not the ISR. This way the application can make whatever system calls it wants. So long as the overall bandwidth or performance requirement is met, this can and does work as a solution.
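A generic sketch of that flag/counter pattern; the ISR name and the vector-table setup are target-specific and omitted:

#include <stdint.h>

volatile uint32_t isr_count;        /* written by the ISR, read by main */

void some_isr(void)                 /* installed in the vector table */
{
    isr_count++;                    /* note the event; get in, get out */
}

int main(void)
{
    uint32_t seen = 0;
    for (;;) {
        uint32_t now = isr_count;   /* a single aligned word read is
                                       atomic on most 32-bit targets */
        if (now != seen) {
            /* handle (now - seen) events here at application level,
               where printf() and system calls are safe */
            seen = now;
        }
    }
}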
Software doesn't recognize the interrupt, to be specific; that is the job of the microprocessor (INTC) or microcontroller.
An interrupt routine call is just like a normal function call as far as main() is concerned; the only difference is that main doesn't know when that routine will be called.
Every interrupt has a specific priority and vector address. Once the interrupt is received (either software or hardware), then, depending on the interrupt priority and mask values, program flow is diverted to the specific vector location associated with that interrupt.
Hope it helps.

How do I ensure my program runs from beginning to end without interruption?

I'm attempting to time code using RDTSC (no other profiling software I've tried is able to time to the resolution I need) on Ubuntu 8.10. However, I keep getting outliers from task switches and interrupts firing, which are causing my statistics to be invalid.
Considering my program runs in a matter of milliseconds, is it possible to disable all interrupts (which would inherently switch off task switches) in my environment? Or do I need to go to an OS which allows me more power? Would I be better off using my own OS kernel to perform this timing code? I am attempting to prove an algorithm's best/worst case performance, so it must be totally solid with timing.
The relevant code I'm using currently is:
inline uint64_t rdtsc()
{
    uint64_t ret;
    // note: the "=A" constraint (edx:eax) is only correct on 32-bit x86
    asm volatile("rdtsc" : "=A" (ret));
    return ret;
}

void test(int readable_out, uint32_t start, uint32_t end,
          uint32_t (*fn)(uint32_t, uint32_t))
{
    int i;
    for (i = 0; i <= 100; i++)
    {
        uint64_t clock1 = rdtsc();
        uint32_t ans = fn(start, end);
        uint64_t clock2 = rdtsc();
        uint64_t diff = clock2 - clock1;

        if (readable_out)
            printf("[%3d]\t\t%u [%llu]\n", i, ans, diff);
        else
            printf("%llu\n", diff);
    }
}
Extra points to those who notice I'm not properly handling overflow conditions in this code. At this stage I'm just trying to get a consistent output without sudden jumps due to my program losing the timeslice.
The nice value for my program is -20.
So to recap, is it possible for me to run this code without interruption from the OS? Or am I going to need to run it on bare hardware in ring0, so I can disable IRQs and scheduling? Thanks in advance!
If you call nanosleep() to sleep for a second or so immediately before each iteration of the test, you should get a "fresh" timeslice for each test. If you compile your kernel with 100Hz timer interrupts, and your timed function completes in under 10ms, then you should be able to avoid timer interrupts hitting you that way.
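A sketch of that, assuming the rdtsc() and fn() from the question:

#include <stdint.h>
#include <time.h>

uint64_t rdtsc(void);    /* as defined in the question's code */

/* one timed iteration, preceded by a sleep so the measurement starts
   at the beginning of a fresh timeslice */
void timed_iteration(uint32_t start, uint32_t end,
                     uint32_t (*fn)(uint32_t, uint32_t))
{
    struct timespec ts = { .tv_sec = 1, .tv_nsec = 0 };
    nanosleep(&ts, NULL);          /* yield the CPU, come back fresh */

    uint64_t clock1 = rdtsc();
    (void)fn(start, end);
    uint64_t clock2 = rdtsc();
    /* record clock2 - clock1 for later analysis */
}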
To minimise other interrupts, deconfigure all network devices, configure your system without swap and make sure it's otherwise quiescent.
Tricky. I don't think you can turn the operating system 'off' and guarantee strict scheduling.
I would turn this upside down: given that it runs so fast, run it many times to collect a distribution of outcomes. Given that standard Ubuntu Linux is not a real-time OS in the narrow sense, all alternative algorithms would run in the same setup --- and you can then compare your distributions (using anything from summary statistics to quantiles to qqplots). You can do that comparison with Python, or R, or Octave, ... whichever suits you best.
You might be able to get away with running FreeDOS, since it's a single process OS.
Here's the relevant text from the second link:
Microsoft's DOS implementation, which is the de facto standard for DOS systems in the x86 world, is a single-user, single-tasking operating system. It provides raw access to hardware, and only a minimal layer for OS APIs for things like the file I/O. This is a good thing when it comes to embedded systems, because you often just need to get something done without an operating system in your way.

DOS has (natively) no concept of threads and no concept of multiple, on-going processes. Application software makes system calls via the use of an interrupt interface, calling various hardware interrupts to handle things like video and audio, and calling software interrupts to handle various things like reading a directory, executing a file, and so forth.
Of course, you'll probably get the best performance actually booting FreeDOS onto actual hardware, not in an emulator.
I haven't actually used FreeDOS, but I assume that since your program seems to be standard C, you'll be able to use whatever the standard compiler is for FreeDOS.
If your program runs in milliseconds, and if you are running on Linux:
Make sure that your timer frequency (on Linux) is set to 100Hz (not 1000Hz).
(cd /usr/src/linux; make menuconfig, and look at "Processor type and features" -> "Timer frequency")
This way your CPU will get interrupted every 10ms.
Furthermore, consider that the default CPU time slice on Linux is 100ms, so with a nice level of -20, you will not get descheduled if you are running for only a few milliseconds.
Also, you are looping 101 times over fn(). Consider making fn() a no-op for one run, to calibrate your system properly.
Compute statistics (average + stddev) instead of printing on every iteration (printing would consume your scheduled timeslice, and the terminal will eventually get scheduled, etc.; avoid that).
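A sketch of that last point, assuming the per-iteration diffs were first stored in an array:

#include <math.h>
#include <stdint.h>
#include <stdio.h>

/* summarize the samples after the timed loop finishes, instead of
   calling printf() inside it */
void report(const uint64_t *diffs, int n)
{
    double sum = 0.0, sumsq = 0.0;
    int i;
    for (i = 0; i < n; i++) {
        sum   += (double)diffs[i];
        sumsq += (double)diffs[i] * (double)diffs[i];
    }
    double mean = sum / n;
    double var  = sumsq / n - mean * mean;   /* population variance */
    printf("mean %.1f cycles, stddev %.1f cycles\n",
           mean, sqrt(var > 0.0 ? var : 0.0));
}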
RDTSC benchmark sample code
You can use chrt -f 99 ./test to run ./test with the maximum realtime priority. Then at least it won't be interrupted by other user-space processes.
Also, installing the linux-rt package will install a real-time kernel, which will give you more control over interrupt handler priority via threaded interrupts.
If you run as root, you can call sched_setscheduler() and give yourself a real-time priority. Check the documentation.
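A minimal sketch of that call (SCHED_FIFO at priority 99 mirrors the chrt -f 99 example above):

#include <sched.h>
#include <stdio.h>

/* request a real-time scheduling class; needs root (or CAP_SYS_NICE) */
int go_realtime(void)
{
    struct sched_param sp = { .sched_priority = 99 };
    if (sched_setscheduler(0, SCHED_FIFO, &sp) == -1) {
        perror("sched_setscheduler");
        return -1;
    }
    return 0;
}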
Maybe there is some way to disable preemptive scheduling on Linux, but it might not be needed. You could potentially use information from /proc/<pid>/schedstat or some other object in /proc to sense when you have been preempted, and disregard those timing samples.
