ARM Cortex-M4 custom OS design

I am writing a small OS for an ARM Cortex-M4 core and have some doubts. I decided to expose OS functionality through the Supervisor Call (SVC), where I keep and maintain all kernel objects and kernel functions.
But is this a good idea, given that all kernel code would then execute on, and occupy, the user task's stack?
The only thing that comes to mind is to switch to a kernel stack in the supervisor call and lock the scheduler while kernel code executes. Is this a good approach?

On a Cortex-M you have a Process Stack Pointer (PSP) and a Main Stack Pointer (MSP). Exception handlers, including interrupts and the SVC handler, always use the MSP, and the tasks should use the PSP. Therefore any kernel work done in the SVC handler uses the MSP and does not interfere with the tasks' stacks, which use the PSP. (The hardware does push the caller's exception frame, r0-r3, r12, lr, pc and xPSR, onto the stack that was active when the SVC was issued, so each task stack needs room for that frame.) When switching tasks you set the PSP to the new task's stack. I would read the Exception Handling section of the Cortex-M4 Generic User Guide. I would also recommend getting and reading The Definitive Guide to the ARM Cortex-M3/M4, as it has a good section on RTOSes. The Cortex-M cores were designed with RTOSes in mind and provide a lot of useful features.
Note: Unless you are doing this as a learning exercise, or just really want to write your own OS, you would be better off using something like FreeRTOS, which is very well tested and provides all the features you are ever likely to use.
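To make the stack split concrete, here is a minimal sketch of the C side of an SVC dispatcher. It assumes a small assembly shim in SVC_Handler (not shown) that tests bit 2 of EXC_RETURN in LR to choose MSP or PSP, and passes that pointer in as frame together with the SVC number. The service numbers and the kernel_ticks object are made up for illustration; they are not from any particular RTOS.

```c
#include <stdint.h>

/* Order of the eight words the Cortex-M hardware stacks on exception entry. */
enum {
    FRAME_R0, FRAME_R1, FRAME_R2, FRAME_R3,
    FRAME_R12, FRAME_LR, FRAME_PC, FRAME_XPSR
};

/* Hypothetical service numbers, for illustration only. */
enum { SVC_ADD = 0, SVC_GET_TICKS = 1 };

/* Hypothetical kernel object that only handler-mode code touches. */
static uint32_t kernel_ticks = 0;

/*
 * C part of the SVC handler. 'frame' points at the caller's stacked
 * registers (on the PSP stack when a task made the call). Arguments are
 * read from the stacked r0-r3, and the result is written back to the
 * stacked r0 so the caller sees it as the return value after exception
 * return. Assumption: the assembly shim supplies 'frame' and 'svc_number'.
 */
void svc_dispatch(uint32_t *frame, uint8_t svc_number)
{
    switch (svc_number) {
    case SVC_ADD:
        frame[FRAME_R0] = frame[FRAME_R0] + frame[FRAME_R1];
        break;
    case SVC_GET_TICKS:
        frame[FRAME_R0] = kernel_ticks;
        break;
    default:
        frame[FRAME_R0] = UINT32_MAX; /* unknown service */
        break;
    }
}
```

The locals of svc_dispatch live on the MSP stack; only the eight stacked words sit on the calling task's stack.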

Related

Multicore ARM: how to assign a critical task to one dedicated core

Suppose an embedded system project where I have a multicore ARM processor (to make it simple assume 2 cores with an unshared cache between the 2 cores). Suppose my system contains a critical task and several non-critical tasks.
Can I therefore assign the critical task to "core 1" exclusively, and all the others to "core 2" exclusively?
If so, how do I do it, and what are the best practices from an implementation point of view [assume I use C]? Should I use a library (if so, which one)? An RTOS?
Ok, I see that you asked this over in the EE board as well. They gave the same answer I want to give you as well. Use an operating system of some sort to handle thread affinities. If your RTOS or whatever you have does not support this, then look into it and see how it actually handles process/thread scheduling.
Typically, each CPU on a system will be assigned some sort of thread that handles scheduling of tasks. This thread is one of the first things that an OS sets up. Feel free to research some micro kernels out there to see how this is done for your particular processor. You can also find the secret sauce for setting up this thread in the ARM documentation for your particular CPU.
But I am going out on a limb and assuming this is far, far beyond the scope of any assignment given to you for a project. I would hope that you have some kind of affinity support built into what you were given. Setting up affinity on a known OS takes a few seconds. Setting up affinity on a bare-metal system with no OS at all is much more involved.
Original question:
https://electronics.stackexchange.com/questions/356225/multicore-arm-how-to-assign-a-critical-task-to-one-dedicated-core#comment854845_356225
If you don't need real-time functionality, you can do this on a device with a Linux kernel without too much hassle.
See this question here
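For the Linux case, a minimal sketch of pinning the calling thread to one core with sched_setaffinity could look like the following. It assumes glibc on Linux; the helper names pin_calling_thread_to_cpu and first_allowed_cpu are made up for this example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* glibc: exposes sched_setaffinity and the CPU_* macros */
#endif
#include <sched.h>

/* Restrict the calling thread to a single CPU.
 * Returns 0 on success, -1 on error (errno is set by the kernel call). */
int pin_calling_thread_to_cpu(int cpu)
{
    cpu_set_t set;

    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return -1;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof set, &set); /* pid 0 = the caller */
}

/* Return the lowest-numbered CPU the caller is currently allowed to run
 * on, or -1 on error. */
int first_allowed_cpu(void)
{
    cpu_set_t set;

    if (sched_getaffinity(0, sizeof set, &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &set))
            return cpu;
    }
    return -1;
}
```

This only confines the caller. Keeping everything else off that core needs extra system configuration, for example the isolcpus boot parameter or cpusets.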

Bare bones OS kernel programming

I have recently started to take an interest in the topics of operating systems. I have a couple of things that are weighing on my mind, but I have decided to split the questions.
Let's assume we're designing a kernel for a new instruction set architecture that's out on the market. There are no C runtime libraries, no nothing. Only a compatible compiler for that ISA.
Presumably, this means that the only C constructs that are available to the kernel programmer are only basic assignment operators, bitwise operators and loops. Is this correct?
If so, how are more complex things like main memory I/O and process scheduling achieved on the lowest level? Can they only be implemented in pure assembly?
What does it mean then, for a kernel to be written in C (Linux for example). Are some parts of the kernel inherently written in assembly then?
Presumably, this means the only C constructs that are available to the kernel programmer are only basic assignment operators, bitwise operators and loops. Is this correct?
Pretty much all C language features will still work in your kernel without needing any particular runtime support. Your C compiler will translate them to assembly that runs just as well in kernel mode as it would in a normal user-mode program.
However, libraries such as the Standard C Library will not be available; you will have to write your own implementation. In particular, this means no malloc and free until you implement them yourself.
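As an illustration of "implement them yourself", here is a minimal sketch of the simplest possible kernel allocator, a bump allocator over a static buffer. The names kmalloc and HEAP_SIZE are made up for the example, it never frees anything, and it uses only headers that are available in a freestanding environment.

```c
#include <stddef.h>
#include <stdint.h>

#define HEAP_SIZE 4096u /* arbitrary size chosen for this sketch */

/* Backing storage, aligned for any object type. */
static _Alignas(max_align_t) unsigned char heap[HEAP_SIZE];
static size_t heap_used = 0;

/* Hand out the next free chunk of the static heap, or NULL when the
 * request is empty or does not fit. There is no kfree in this sketch. */
void *kmalloc(size_t size)
{
    const size_t align = _Alignof(max_align_t);
    size_t rounded;
    void *block;

    if (size == 0 || size > HEAP_SIZE)
        return NULL;
    /* Round up so the next allocation stays suitably aligned. */
    rounded = (size + align - 1) & ~(align - 1);
    if (rounded > HEAP_SIZE - heap_used)
        return NULL;
    block = &heap[heap_used];
    heap_used += rounded;
    return block;
}
```

A real kernel would replace this with something that can free memory, but a bump allocator is often enough to get through early boot.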
If so, how are more complex things like main memory I/O and process scheduling achieved on the lowest level? Can they only be implemented in pure assembly?
Memory I/O is something much more low level that is handled by the CPU, BIOS, and various other hardware on your computer. The OS thankfully doesn't have to bother with this (with some exceptions, such as some addresses being reserved, and some memory management features).
Process scheduling is a concept that doesn't really exist at the machine code level on most architectures. x86 does have a concept of tasks and hardware task switching, but almost nobody uses it. Scheduling is an abstraction set up by the OS as needed: you would have to implement it yourself, or you could decide to have a single-tasking OS if you do not want to spend the effort. It will still work.
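Because scheduling is just an abstraction the OS builds, its policy can be plain C. A minimal sketch of a round-robin "pick the next task" function, with a made-up task structure, might be:

```c
#include <stddef.h>

/* Hypothetical task states for this sketch. */
enum task_state { TASK_UNUSED, TASK_READY, TASK_BLOCKED };

struct task {
    enum task_state state;
    /* A real kernel would also keep the saved stack pointer, registers, etc. */
};

/* Round-robin policy: starting just after 'current', return the index of
 * the next READY task. 'current' itself is considered last, so a lone
 * ready task keeps running. Returns -1 when nothing is ready. */
int pick_next(const struct task tasks[], int ntasks, int current)
{
    for (int step = 1; step <= ntasks; step++) {
        int candidate = (current + step) % ntasks;
        if (tasks[candidate].state == TASK_READY)
            return candidate;
    }
    return -1;
}
```

Only the actual switch to the chosen task (saving and restoring registers) needs architecture-specific code.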
What does it mean then, for a kernel to be written in C (linux for example). Are some parts of the kernel inherently written in assembly then?
Some parts of the kernel will be heavily architecture dependent and will have to be written in ASM. For example, on x86, switching between modes (e.g. to run 16-bit code, or as part of the boot process) or interrupt handling can only be done with some privileged instructions. The reference manual of your architecture of choice, such as the Intel® 64 and IA-32 Architectures Software Developer’s Manual for x86, is the first place to look for those kinds of details.
But C is a portable language and has no notion of such low-level, architecture-specific concepts (although you could in theory do everything from a .c file with compiler intrinsics and inline ASM). It is more useful to abstract this away in assembler routines, and build your C code on top of a clean interface that you could maintain if you wanted to port your OS to another architecture.
If you are interested in the subject, I highly recommend you pay a visit to the OS Development Wiki, it's a great source of information about Operating Systems and you'll find many hobbyists that share your interest.
About the only things you need to code in assembler are:
Context switches (swapping out the machine state of one abstract process for another)
Access to device registers (and you don't even need this if the devices are memory mapped)
Entry and exit from interrupt handlers (this is a kind of context switch)
Perhaps a boot loader
Everything else you should be able to do in C code.
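To illustrate the point about memory-mapped devices from the list above, here is a minimal sketch of driving a device from C through volatile registers. The UART register layout and the TX-ready bit are invented for the example; a real device's datasheet defines the actual layout and base address.

```c
#include <stdint.h>

/* Hypothetical register block of a memory-mapped UART. */
struct uart_regs {
    volatile uint32_t data;   /* writing a byte here transmits it */
    volatile uint32_t status; /* bit 0 set = transmitter ready */
};

#define UART_STATUS_TX_READY (1u << 0)

/* Busy-wait until the transmitter is ready, then send one character.
 * 'volatile' forces the compiler to re-read status on every iteration
 * and to actually perform the store to data. */
void uart_putc(struct uart_regs *uart, char c)
{
    while ((uart->status & UART_STATUS_TX_READY) == 0) {
        /* spin */
    }
    uart->data = (uint32_t)(unsigned char)c;
}
```

On real hardware the pointer would come from a fixed address cast, for example (struct uart_regs *)0x40001000 for a device at that made-up address.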
If you want to see this job done spectacularly well, you should go and check out the Multics OS, dating from the middle 60s, which supported large-scale information services (multiple CPUs, virtual memory, ...). It was coded almost entirely in PL/1 (a C-like language) with only very small bits coded in the native assembly language of the Honeywell processor that supported Multics. The Organick book on Multics is worth its weight in gold in terms of showing how Multics worked and how clean most of it is. (We got "Eunuchs" instead).
There are some places where it will be worthwhile to code in assembler anyway. Regardless of the quality of your compiler's code generator, you will be able to hand-code certain routines in time-critical areas better in assembler than the compiler will. Places where I'd expect this to matter: the scheduler, and system call entry and exit. Other places only as measurement indicates. (On older, much smaller systems, one tended to write the OS using a lot of assembler, but that was as much for space savings as for efficiency of execution; C compilers weren't nearly as good.)
I'm wondering how a new architecture that's "out on the market" would not already have some type of operating system.
Device drivers - someone is going to have to write code for this, perhaps one driver for the BIOS and another for the OS. Memory-mapped I/O can get complicated depending on the hardware, such as a controller with a set of descriptors, each containing a physical address and length. If the OS supports virtual memory, then that memory has to be "locked" and the physical addresses obtained in order to program the controller. This is one reason for having a set of descriptors: a single memory-mapped I/O operation can handle scattered physical pages that have been mapped into a contiguous virtual address space.
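The descriptor idea can be sketched as follows. This is only an illustration under assumptions: the dma_desc layout, the 4 KiB page size, and the translation callback are all hypothetical, and demo_translate is a stand-in for a real page-table lookup on already locked pages.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u /* assumed page size */

/* Hypothetical controller descriptor: one physically contiguous piece. */
struct dma_desc {
    uint64_t phys_addr;
    uint32_t length;
};

/* Maps the virtual address of a page start to its physical address. */
typedef uint64_t (*virt_to_phys_fn)(uintptr_t virt_page);

/* Split the virtual range [virt, virt + len) at page boundaries and emit
 * one descriptor per piece. Returns the number of descriptors written,
 * or 0 if the range is empty or needs more than max_desc descriptors. */
size_t build_sg_list(uintptr_t virt, size_t len, virt_to_phys_fn translate,
                     struct dma_desc *out, size_t max_desc)
{
    size_t n = 0;

    while (len > 0) {
        uintptr_t offset = virt % PAGE_SIZE;
        size_t chunk = PAGE_SIZE - offset;

        if (n == max_desc)
            return 0;
        if (chunk > len)
            chunk = len;
        out[n].phys_addr = translate(virt - offset) + offset;
        out[n].length = (uint32_t)chunk;
        n++;
        virt += chunk;
        len -= chunk;
    }
    return n;
}

/* Stand-in translation for demonstration: virtual pages map to physical
 * frames in reverse order, so adjacent virtual pages do not form one
 * ascending physical range. */
uint64_t demo_translate(uintptr_t virt_page)
{
    return 0x80000u - (uint64_t)virt_page;
}
```

With this stand-in, a 4 KiB buffer starting at virtual address 0x1800 crosses one page boundary and therefore needs two descriptors.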
Assembly code - the other comments here have already noted that some assembly will be required: context switches and interrupt handlers (which could call C functions, so most of the code could be in C).

Running MMU-less Linux on ARM Cortex-R4

I'm using an ARM Cortex-R4 for my system. It has a Memory Protection Unit instead of a Memory Management Unit. Effectively, this means that there's dedicated hardware for memory protection but that there's a one-to-one mapping between physical and virtual addresses. I'm a little confused about which Linux I should go for - the standard Linux kernel with the MMU disabled, or uClinux.
On ARM's evaluation board, I have run the standard kernel compiled with MMU disabled. I used the cramfs filesystem which is available on the official ARM website. After the kernel boots up, I'm in the shell, but I couldn't do much experimentation as I found that, most of the time, the shell stops responding (particularly when I press "tab" for auto-completion).
So I'm still not sure whether the MMU-less kernel should run smoothly if I use the correct filesystem. Also, which distro (buildroot?) should I use for the no-VM Linux?
Any idea or suggestion is welcome.
It's been more than 2 years since I asked this question. Now is the time I should write what I found for myself.
uClinux was a project forked from the Linux kernel long ago with the aim of developing a kernel for MMU-less systems. However, after a while, it was merged back into the mainline Linux tree. So today there is no active uClinux distribution.
So, if you disable MMU from the mainline kernel configuration, you'll get an MMU-less version. In fact, now there are configuration options provided in the kernel itself whereby a user can specify the memory layout and the access permissions.
Cheers!
uClinux is a Linux distribution which uses the Linux kernel with the MMU "turned off" and adds some applications and libraries on top of it. You won't choose one or the other, as they work best one on top of the other.
If you got to a point where you have a shell running, you've managed to boot Linux sans MMU on your board but ran into a bug.
I believe uClinux was built for something just like this [MMU-less systems]
http://www.uclinux.org/description/

how do you do system call interrupts in C?

I learned from download.savannah.gnu.org/.../ProgrammingGroundUp-1-0-booksize.pdf
that programs interrupt the kernel, and that is how things are done. What I want to know is how you do that in C (if it's possible)
There is no platform-independent way (obviously)! On x86 Linux, system calls were traditionally made by placing the system call number in the eax register and executing int 80h in assembler, which causes a switch to kernel mode. The kernel then executes the relevant code based on what it sees in eax.
User processes usually request kernel services by calling system call wrapper functions from the Standard C Library. You can do it manually with syscall(2).
The user program's interaction with the kernel is going to be very platform-specific, so it usually happens behind the scenes in the various library routines. So one just calls printf, write, select, or other library routines, which allow the programmer to write code without worrying about the details of the kernel interface, device drivers, and so forth.
And the way it usually works is that when one of those library routines needs the kernel to do something on its behalf, it performs a low-level system call that yields its control of the CPU to the kernel. It's the user program, not the kernel, that is the one being interrupted.
If you're using glibc (which you probably are if you are using gcc and linux) then there is a syscall function in unistd.h that you can use. It has different implementations for different architectures and operating systems, but the implementation is done in assembly (could be inline assembly). syscall has a man page, so:
man syscall
will give you some info.
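A minimal sketch of using syscall(2) directly on Linux with glibc, bypassing the usual wrappers, could look like this. The helper names raw_getpid and raw_write are made up; the SYS_* constants come from sys/syscall.h.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* make sure glibc declares syscall() */
#endif
#include <stddef.h>
#include <sys/syscall.h> /* SYS_getpid, SYS_write */
#include <sys/types.h>
#include <unistd.h>      /* syscall() */

/* Same result as getpid(), but requested through the generic entry point. */
pid_t raw_getpid(void)
{
    return (pid_t)syscall(SYS_getpid);
}

/* Same as write(fd, buf, len): returns the number of bytes written,
 * or -1 with errno set on failure. */
long raw_write(int fd, const void *buf, size_t len)
{
    return syscall(SYS_write, fd, buf, len);
}
```

For example, raw_write(1, "hello\n", 6) prints to standard output without going through printf or the write wrapper.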
If you are just curious about how all of this works, you should know that it has changed in Linux on x86 in recent years. Originally, interrupt 0x80 was the normal system call entry point for Linux on x86. This worked well enough, but as processors gained more advanced pipelining (starting an instruction before previous instructions have completed), interrupts became slower relative to regular code, which sped up (some tests have shown that they slowed down by more than that). The reason is that even when the int instruction is used to trigger an interrupt, it works mostly the same way as a hardware-triggered interrupt. Those occur unpredictably, so they do not play well with instruction pipelining, which works better when code paths are predictable.
To help with this, newer x86 processors have instructions specifically intended for making system calls, but Intel and AMD introduced different instructions for this (sysenter and syscall, respectively). Additionally, the Intel sysenter instruction clobbers a general-purpose register that Linux has used on x86_32 to pass a parameter to the kernel. This means that programs have to know which of 3 possible system call mechanisms to use, as well as possibly different ways of passing arguments to the kernel. To get around all of this, newer kernels map a special page of memory into programs (this page is called vsyscall, or [vdso] on current kernels, and if you cat /proc/self/maps you will see an entry for it) that contains code for the system call mechanism the kernel has determined should be used on the system, and newer versions of glibc can implement their system call entry using the code in this page.
The point of all of this is that this isn't as simple as it used to be, but if you are just playing around on x86_32 then you should be able to use the int 80h instruction because, for backwards compatibility, it is still supported on systems that can use one of the other mechanisms.
In C, you don't really do it directly, but you'll end up doing this indirectly any time you use library functions that end up invoking system calls. File access, network access, etc, are typical examples of this.
Those functions will all end up "trapping" to the kernel, which will handle the request.

Any open-source ARM7 emulators suitable for linking with C?

I have an open-source Atari 2600 emulator (Z26), and I'd like to add support for cartridges containing an embedded ARM processor (NXP 21xx family). The idea would be to simulate the 6507 until it tries to read or write a byte of memory (which it will do every 841ns). If the 6507 performs a write, put the address and data on some of the ARM's I/O ports and let the ARM code run 20 cycles, confirm that the ARM is floating its data bus, and let the ARM run for another 38 cycles. If the 6507 performs a read, put the address on the ARM's I/O ports, let the ARM run 38 cycles, grab the data from the ARM's I/O port (hopefully the ARM software will have put it there), and let the ARM run another 20 cycles.
The ARM7 seems pretty straightforward to implement; I don't need to simulate a whole lot of hardware features. Any thoughts?
Edit
What I have in mind would be a routine that would take as a parameter a struct holding the machine state and pointers to a memory access routine. When called, the routine would emulate the ARM's instruction engine, generating appropriate reads, writes, and code fetches. I could then write the memory access routine to regard appropriate areas as flash (with roughly-approximated wait states), RAM, I/O ports, and timer registers. Some other areas would be marked as don't-care, and accesses to any other areas would flag an error and stop the emulator.
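A minimal sketch of such an interface, under heavy assumptions, might look like this. All names here are hypothetical, only "MOV Rd, #imm" with condition AL is decoded, and r[15] simply holds the address of the next instruction to fetch rather than modelling the real pipeline offset. The demo_flash bus is a stand-in for the flash/RAM/I/O routing described above.

```c
#include <stddef.h>
#include <stdint.h>

enum step_result { STEP_OK, STEP_BUS_ERROR, STEP_UNIMPLEMENTED };

/* Machine state plus a pointer to the host's memory access routine. */
struct arm_state {
    uint32_t r[16]; /* r[15]: address of the next instruction to fetch */
    void *bus_ctx;  /* passed back to the callback unchanged */
    int (*read32)(void *ctx, uint32_t addr, uint32_t *value); /* 0 = ok */
};

static uint32_t ror32(uint32_t v, unsigned n)
{
    n &= 31u;
    return n ? (v >> n) | (v << (32u - n)) : v;
}

/* Fetch and execute one instruction; stop on bus errors or on anything
 * this sketch does not decode. */
enum step_result arm_step(struct arm_state *cpu)
{
    uint32_t insn;

    if (cpu->read32(cpu->bus_ctx, cpu->r[15], &insn) != 0)
        return STEP_BUS_ERROR;

    /* MOV Rd, #imm with cond = AL, S = 0: 0xE3A0 in the top 16 bits. */
    if ((insn & 0xFFFF0000u) == 0xE3A00000u) {
        unsigned rd = (insn >> 12) & 0xFu;
        unsigned rot = ((insn >> 8) & 0xFu) * 2u;

        if (rd == 15u)
            return STEP_UNIMPLEMENTED; /* writes to PC not handled here */
        cpu->r[rd] = ror32(insn & 0xFFu, rot);
        cpu->r[15] += 4u;
        return STEP_OK;
    }
    return STEP_UNIMPLEMENTED;
}

/* Stand-in bus: a read-only word array; any other address is an error. */
struct demo_flash {
    uint32_t base;
    const uint32_t *words;
    size_t count;
};

int demo_flash_read32(void *ctx, uint32_t addr, uint32_t *value)
{
    const struct demo_flash *flash = ctx;
    size_t index;

    if (addr < flash->base || (addr & 3u) != 0)
        return -1;
    index = (addr - flash->base) / 4u;
    if (index >= flash->count)
        return -1;
    *value = flash->words[index];
    return 0;
}
```

The host emulator would call arm_step in a loop for the number of cycles it wants to grant, and its own read32 routine would decide which addresses count as flash, RAM, I/O ports, or errors.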
Perhaps QEMU uses such a thing internally. Since the ARM emulation would be integrated into an already-existing emulation engine (which I didn't write and don't fully understand--the only parts of Z26 I've patched have been the memory read/write logic) I would need something with a fairly small footprint.
Any idea how QEMU works inside? Any idea what the GPL licence would require if I just use 2% of the code in QEMU--whether I'd have to bundle the code for the whole thing, or just the part that I use, or what?
Try QEMU.
With some work, you can make my emulator do what you want. It was written for ARM920, and the Thumb instruction set isn't done yet. Neither is the MMU/cache interface. Also, it's slow because it is an interpreter. On the bright side, it's all written in C99.
http://code.google.com/p/gp2xemu/
I haven't worked on it for a while (The svn trunk is 2 years old), but if you're going to use the code, I'll be glad to help you out with the missing features. It is licensed under MIT, so it's just the same as the broad BSD license.

Resources