What's a good system test for keeping a deadline?

Reading about RTOSes, the defining characteristic of a "hard" RTOS is that it can keep a deadline deterministically, but how do we test or prove that the system actually fulfils that requirement?
The MicroC/OS II RTOS is characterized as a hard RTOS, but how can I validate that claim? If I have some C code and an ISR for my FPGA that can run C programs and perform context switches between threads with semaphores, similar to what an RTOS does, how can I know whether the OS/RTOS is a "hard" or "soft" RTOS?
Does it depend on the application? Must there be a timer, so that using the built-in hardware timer (e.g. the Altera DE2 has a 50 MHz oscillator) with hardware interrupts is preferred, and then we simply test whether threads and processes can be scheduled according to a deadline and check whether the deadline was met?
Or is there some general practice for what must be included to distinguish an operating system, a real-time operating system, and a hard versus soft RTOS?
Is there some "typical test" with a typical requirement for the label "hard RTOS"?

It is hard to answer this question, because your premise is wrong.
A system classified as hard realtime is distinguished from a soft realtime system only by the severity of a missed deadline. In hard RT, a missed deadline is classified as a system failure, which may or may not cause harm to hardware or people, while soft realtime usually means that a missed deadline only degrades system performance but does not bring it to a grinding halt.
A typical example of a hard RT system would be a watchdog that shuts down a system on overheating: if it fails to meet its deadline, the system breaks. General safety-related systems in power plants or airplanes also fall into this category.
A soft RT example would be video streaming, where a missed deadline causes degraded visual quality or stuttering but does not necessarily cause a failure of the system.
Long story short, hard and soft RT are characteristics of complete software systems, measured against their specifications and fault models. So typically it is the application running on the operating system that fits the hard/soft RT criteria; the OS merely provides interfaces with predictable timing behaviour that allow the application to make timing assumptions.
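As a concrete (hedged) illustration, here is a minimal sketch of the kind of latency/deadline measurement the question is after, assuming a free-running hardware cycle counter at a hypothetical memory-mapped address clocked by the DE2's 50 MHz oscillator; the ISR hook, semaphore signalling, and the 100 µs deadline are placeholders, not anything mandated by MicroC/OS II:

    #include <stdint.h>

    #define CPU_HZ        50000000u                 /* 50 MHz oscillator on the DE2 */
    #define DEADLINE_US   100u                      /* example deadline: 100 microseconds */
    #define DEADLINE_CYC  (DEADLINE_US * (CPU_HZ / 1000000u))

    /* hypothetical memory-mapped free-running counter clocked at CPU_HZ */
    #define CYCLE_COUNTER (*(volatile uint32_t *)0x00010000u)

    static volatile uint32_t t_event;               /* timestamp taken in the ISR */
    static uint32_t worst_case, misses, samples;

    void timer_isr(void)                            /* hooked to the hardware interrupt */
    {
        t_event = CYCLE_COUNTER;                    /* timestamp the triggering event */
        /* ... post the semaphore that unblocks responding_task() ... */
    }

    void responding_task(void)                      /* runs once per event, after the post */
    {
        uint32_t latency = CYCLE_COUNTER - t_event; /* unsigned subtraction handles wrap */
        if (latency > worst_case) worst_case = latency;
        if (latency > DEADLINE_CYC) misses++;
        samples++;
        /* To support a "hard" claim, run this under worst-case load and interrupt
         * pressure: misses must stay at 0 for every sample, not just on average. */
    }

A soft-realtime claim would instead be argued from the distribution (e.g. the percentage of samples under the deadline) rather than from a zero-miss worst case.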

Related

Why do we need an RTOS on ARM Cortex-M

If we can already execute C programs on Cortex-M-like microcontrollers, why do we even need to install an RTOS (or another operating system)?
What benefits can it provide if the microcontroller is intended to be multi-purpose?
No, you don't need an RTOS; you only need one if you need/want the features of the (particular) RTOS. You can program the microcontroller the way you/we always have, without one, if you prefer.
Typical things an RTOS might bring:
Memory management (who owns memory)
Interrupt handling support
Scheduling (pre-emptive or co-operative)
Usually several drivers in a BSP for your hardware/SOC
Debug tools
Some sort of shell
File systems
IPC (inter-process communication)
A tool suite
A build environment
Memory protection
Networking
Your application may or may not need these features depending on your end goal. Some of them may be detrimental to your organization's work flow (like the tool suite and build environment). As a product matures, you may end up needing features you didn't account for.
However, a completely custom solution will probably have a smaller footprint. The race conditions involved in interrupt handling can be quite difficult to get right, and most RTOSes will probably give a better implementation than something custom that evolves over time. If you are very dedicated, a state machine with polling of devices can be more optimal (hard real time), but again it is difficult to get right.
If the RTOS is BSD (or otherwise permissively) licensed, it may be possible to reuse its driver code in your own custom infrastructure. At some point your code may become an 'RTOS' of sorts. There are many to choose from.
POSIX compliance is a common standard. If you confine your code to POSIX, you are portable to many different RTOSes/OSes. However, most RTOSes offer an API that is richer than POSIX; it is one way they differentiate themselves from each other. You may also be able to use more third-party libraries if the RTOS is POSIX compliant.
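For example, a periodic task confined to plain POSIX calls, as in the minimal sketch below, should build on Linux and on POSIX-compliant RTOSes alike; the 10 ms period and the empty do_work() stub are placeholders:

    #include <time.h>

    #define PERIOD_NS 10000000L            /* 10 ms period, chosen arbitrarily */

    static void do_work(void)              /* stub for the application's real work */
    {
    }

    void periodic_task(void)
    {
        struct timespec next;
        clock_gettime(CLOCK_MONOTONIC, &next);

        for (;;) {
            do_work();

            /* advance the absolute wake-up time by one period */
            next.tv_nsec += PERIOD_NS;
            if (next.tv_nsec >= 1000000000L) {
                next.tv_nsec -= 1000000000L;
                next.tv_sec  += 1;
            }

            /* sleep until an absolute time so jitter does not accumulate as drift */
            clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
        }
    }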
An operating system provides a level of abstraction between the code written by an application programmer and the actual hardware the program runs on.
So you don't have to worry, as an application programmer, about the details of the hardware, as they are handled by drivers.
And thus you can compile the same program for many different hardware platforms, if they run the same (or a compatible) operating system.

Full software timer: deriving the time?

I have been asked a question but I am not sure if I answered it correctly.
"Is it possible to rely only on software timer?"
My answer was "yes, in theory".
But then I added:
"Just relying on hardware timer at the kernel loading (rtc) and then
software only is a mess to manage since we must be able to know
how many cpu cycles each instruction took + eventual cache miss +
branching cost + memory speed and put a counter after each one or
group (good luck with out-of-order cpu).
And do the calculation to derivate the current cpu cycle. That is
insane.
Not talking about the overall performance drop.
The best we could have is a brittle approximation of the time which
become more wrong over time. Even possibly on short laps."
But even if it seems logical to me, did my thinking go wrong?
Thanks
On current processors and hardware (e.g. Intel, AMD, or ARM in laptops, desktops, or tablets) with common operating systems (Linux, Windows, FreeBSD, MacOSX, Android, iOS, ...), processes are scheduled at essentially random times, so cache behavior is non-deterministic and instruction timing is not reproducible. You need some hardware time measurement.
A typical desktop or laptop gets hundreds, or thousands, of interrupts every second, most of them time related. Try running cat /proc/interrupts on a Linux machine twice, with a few seconds between the runs.
I guess that even with a single-tasked MS-DOS like operating system, you'll still get random behavior (e.g. induced by ACPI, or SMM). On some laptops, the processor frequency can be throttled by its temperature, which depends upon the CPU load and the external temperature...
In practice, you really want to use a timer provided by the operating system. For Linux, read time(7).
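For instance, a minimal sketch measuring elapsed time with the OS-provided monotonic clock (see clock_gettime(2)); the busy loop is just a stand-in workload:

    #include <stdio.h>
    #include <time.h>

    static void work_to_measure(void)      /* stand-in workload */
    {
        volatile unsigned long i;
        for (i = 0; i < 10000000UL; i++) { /* busy loop */ }
    }

    int main(void)
    {
        struct timespec start, end;

        clock_gettime(CLOCK_MONOTONIC, &start);   /* monotonic: immune to date changes */
        work_to_measure();
        clock_gettime(CLOCK_MONOTONIC, &end);

        double elapsed = (double)(end.tv_sec - start.tv_sec)
                       + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
        printf("elapsed: %.9f s\n", elapsed);
        return 0;
    }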
So you practically cannot rely on a purely software timer; the processor itself has internal timers, and even in principle you cannot avoid them on current processors.
You might be able, if you put your hardware in a very controlled (thermostatically regulated) environment, to run a very limited piece of software (an OS-like freestanding thing) sitting entirely in the processor cache and perhaps then get some determinism, but in practice current laptop, desktop, or tablet hardware is non-deterministic and you cannot predict the time needed for a given small machine routine.
Timers are extremely useful in interesting (non-trivial) software; see e.g. J. Pitrat's "CAIA, a sleeping beauty" blog entry for an interesting point. Also look at the many uses of watchdog timers in software (e.g. in the Parma Polyhedra Library).
Read also about Worst Case Execution Time (WCET).
So I would say that even in theory it is not possible to rely upon a purely software timer (unless, of course, that software uses the processor's timers, which are hardware circuits). In the previous century (up to the 1980s or 1990s) hardware was much more deterministic, and the number of clock cycles or microseconds needed for each machine instruction was documented (though some instructions, e.g. division, needed a variable amount of time depending on the actual data!).

Low interrupt latency via dedicated architectures and operating systems

This question may seem slightly vague; however, I am researching how interrupt systems work and their latency times. I am trying to understand how architectural facilities such as FIQ on ARM help decrease latency. How does this differ from using an operating system that does not have, or cannot provide, access to these facilities? For example, Windows RT is made for ARM, and that operating system cannot be ported to other architectures.
Simply put: how is interrupt latency different in dedicated architectures that have dedicated operating systems, compared to operating systems that can be ported across many different architectures (Linux, for example)?
Sorry for the rant - I'm pretty confused, as you can probably tell.
I'll start with your Windows RT example: Windows RT is a port of Windows to the ARM architecture. It is not a 'dedicated operating system'. There are (probably) many OSes that run on only one architecture, but that is more a function of nobody having bothered to port them, for one reason or another.
What does 'port' really mean, though?
Windows has a kernel (we'll call it NT here, it doesn't matter), and that NT kernel has a bunch of concepts that need to be implemented. These concepts are things like timers, memory virtualisation, exceptions, etc.
These concepts are implemented differently between architectures, so the port of the kernel and drivers (I will ignore the rest of the OS here, as that is often a recompile only) is a matter of using the available pieces of silicon to implement the required concepts. This implementation is called a 'port'.
Let's zoom in on interrupts (AKA exceptions) on an ARM that has FIQ and IRQ.
In general an interrupt can occur asynchronously, by that I mean at any time. The CPU is generally busy doing something when an IRQ is asserted so that context (we'll call it UserContext1) needs to be stored before the CPU can use any resources in use by UserContext1. Generally this means storing registers on the stack before using them.
On ARM, when an IRQ occurs the CPU switches to IRQ mode. Registers r13 and r14 have their own copies in IRQ mode; the rest need to be saved if they are used, so that is what happens. Those stores to memory take some time. The IRQ is handled, UserContext1 is popped back off the stack, then IRQ mode is exited.
So the latency in this case might be the time from IRQ assertion to the time the IRQ vector starts executing. That is going to be some set number of clock cycles based upon what the CPU was doing when the IRQ happened.
The latency before the IRQ handling can occur is the time from the IRQ assert to the time the CPU has finished storing the context.
The latency before user mode code can execute depends on too much stuff in the OS/Kernel to explain here, but the minimum boils down to the time from the IRQ assertion to the return after restoring UserContext1 + the time for the OS context switch.
FIQ - If you are a hard-as-nails programmer, you might only need 7 registers to completely handle your interrupt servicing. I mentioned that IRQ mode has its own copies of 2 registers; well, FIQ mode has its own copies of 7 registers. Yup, that's 28 bytes of context that doesn't need to be pushed out onto the stack (actually one of them is the link register, so it's really 6 you have to work with). That can remove the need to store and then restore UserContext1. Thus the latency can be reduced by up to the length of time needed to do that save/restore.
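To make that concrete, here is a minimal sketch of a tiny FIQ handler, assuming a 32-bit ARM target built with an arm-none-eabi GCC toolchain; the interrupt("FIQ") function attribute and the peripheral register address are assumptions to check against your toolchain and SoC manual, and the compiler is still free to spill registers if the handler body grows:

    #include <stdint.h>

    /* hypothetical timer "clear interrupt" register */
    #define TIMER_CLEAR_REG (*(volatile uint32_t *)0x40001000u)

    volatile uint32_t fiq_tick_count;      /* shared with the main-line code */

    /* keep the body small so it can live in the banked FIQ registers */
    void __attribute__((interrupt("FIQ"))) fiq_handler(void)
    {
        TIMER_CLEAR_REG = 1u;              /* acknowledge the interrupt source */
        fiq_tick_count++;                  /* minimal work, minimal register pressure */
    }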
None of this has much to do with the OS. The OS can choose to use or not use these features. The OS can choose to make guarantees regarding how long it will take to execute the OSes concept of an interrupt handler, or it may not. This is one of the basic concepts of an RTOS, the contract about how long before the handler will run.
The OS is designed for some purpose (and that purpose may be 'general'); that target design goal will have a lot more effect on latency than how many targets the OS has been ported to.
Go have a read about something like FreeRTOS, then buy some hardware and try it. Annotate the code to figure out the latencies you really want to look at. It will likely be the best way to get your head around it.
(Multi-CPU systems do the same, but with some synchronization and barrier functions and a sprinkling of complexity.)

Ultra-low latency programming on Linux, where to begin?

I heard there are some ways to modify Linux so that a particular application can obtain very low latency, such that whenever it asks for resources, the OS will try to give it the resource as soon as possible, kind of overriding the default preemptive multitasking mechanism. I don't have a CS background, but the application I am working on is very latency-sensitive. Can anyone tell me whether there are any docs/resources on this specific matter? Many thanks.
Guaranteed low-latency response is called the real time capability. It means that timing goals that are realistic are guaranteed to be met.
There is a project for it called RTLinux. See the Real-Time Linux Wiki: https://rt.wiki.kernel.org/index.php/Main_Page
There are two real time models:
soft real time system - you get it by applying the RT preempt kernel patches. I think it guarantees a context switch within 10 ms. The goal of this project is to conform to hard real time requirements (see the sketch after this list for the usual application-side setup).
hard real time system - has stricter guarantees (a response of 1 ms). There are some libraries (like Xenomai) that claim they provide a hard real time system.
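As a rough starting point, the usual first application-side steps on an RT-patched kernel look like the minimal sketch below; the priority value 80 is an arbitrary example, and the calls need root privileges or CAP_SYS_NICE:

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>

    int main(void)
    {
        /* lock current and future pages in RAM so page faults cannot add latency */
        if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
            perror("mlockall");

        /* request the SCHED_FIFO real-time scheduling policy for this process */
        struct sched_param sp;
        memset(&sp, 0, sizeof sp);
        sp.sched_priority = 80;            /* example priority, 1..99 on Linux */
        if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
            perror("sched_setscheduler");

        /* ... the latency-sensitive work goes here ... */
        return 0;
    }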

Is there a difference between a real time system and one that is just deterministic?

At work we're discussing the design of a new platform and one of the upper management types said it needed to run our current code base (C on Linux) but be real time because it needed to respond in less than a second to various inputs. I pointed out that:
That point doesn't mean it needs to be "real time", just that it needs a faster clock and more streamlining in its interrupt handling.
One of the key points to consider is the OS that's being used. They wanted to stick with embedded Linux; I pointed out we need an RTOS. Using Linux will prevent "real time" because of the kernel/user space memory split: I/O is done via files and sockets, which introduces delay.
What we really need to determine is whether it needs to be deterministic (needs to respond to input in <200ms 90% of the time, for example).
Really, in my mind, if point 3 is true, then it needs to be a real time system, and then point 2 is the biggest consideration.
I felt confident answering, but then I was thinking about it later... What do others think? Am I on the right track here or am I missing something?
Is there any difference that I'm missing between a "real time" system and one that is just "deterministic"? And besides a RTC and a RTOS, am I missing anything major that is required to execute a true real time system?
Look forward to some great responses!
EDIT:
Got some good responses so far, looks like there's a little curiosity about my system and requirements so I'll add a few notes for those who are interested:
My company sells units in the tens of thousands, so I don't want to go overkill on the price.
Typically we sell a main processor board and an independent display. There's also an attached network of other CAN devices.
The board (currently) runs the devices and also acts as a webserver sending basic XML docs to the display for end users
The requirements come in here: management wants the display to be updated "quickly" (<1s), but the true constraints IMO come from the devices that can be attached over CAN. These are frequently motor-controlled devices with requirements such as "must respond in less than 200ms".
You need to distinguish between:
Hard realtime: there is an absolute limit on response time that must not be breached (a breach counts as a failure). This is appropriate, for example, when you are controlling robotic motors or medical devices where failure to meet a deadline could be catastrophic.
Soft realtime: there is a requirement to respond quickly most of the time (perhaps 99.99%+), but it is acceptable for the time limit to be occasionally breached provided the response on average is very fast. This is appropriate, for example, when performing realtime animation in a computer game - missing a deadline might cause a skipped frame but won't fundamentally ruin the gaming experience.
Soft realtime is readily achievable in most systems as long as you have adequate hardware and pay sufficient attention to identifying and optimising the bottlenecks. With some tuning, it's even possible to achieve in systems that have non-deterministic pauses (e.g. the garbage collection in Java).
Hard realtime requires dedicated OS support (to guarantee scheduling) and deterministic algorithms (so that once scheduled, a task is guaranteed to complete within the deadline). Getting this right is hard and requires careful design over the entire hardware/software stack.
It is important to note that most business apps don't require either: in particular I think that targeting a <1sec response time is far away from what most people would consider a "realtime" requirement. Having said that, if a response time is explicitly specified in the requirements then you can regard it as soft realtime with a fairly loose deadline.
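As a hedged illustration of the soft case, a requirement such as "respond in under 200 ms, 99.99% of the time" boils down to a measurement like this sketch, where record_response_ms() is a hypothetical hook called once per handled request:

    #include <stdio.h>

    #define DEADLINE_MS 200.0              /* example soft deadline */

    static unsigned long total, misses;
    static double worst_ms;

    void record_response_ms(double ms)     /* call with each measured response time */
    {
        total++;
        if (ms > DEADLINE_MS) misses++;
        if (ms > worst_ms) worst_ms = ms;
    }

    void report(void)
    {
        if (total == 0) return;
        printf("met deadline: %.4f%%   worst case: %.1f ms\n",
               100.0 * (double)(total - misses) / (double)total, worst_ms);
    }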
From the definition of the real-time tag:
A task is real-time when the timeliness of the activities' completion is a functional requirement and correctness condition, rather than merely a performance metric. A real-time system is one where some (though perhaps not all) of the tasks are real-time tasks.
In other words, if something bad will happen when your system responds too slowly to meet a deadline, the system needs to be real-time and you will need an RTOS.
A real-time system does not need to be deterministic: if the response time randomly varies between 50ms and 150ms but the response time never exceeds 150ms then the system is non-deterministic but it is still real-time.
Maybe you could try to use RTLinux or RTAI if you have sufficient time to experiment with them. With this approach you can keep the non-realtime applications on the Linux side, while the realtime applications are moved to the RTOS part. In that case, you will (or might) achieve a <1 second response time.
The advantages are -
Large amount of code can be re-used
You can manually partition realtime and non-realtime tasks and try to achieve the response <1s as you desire.
I think migration time will not be very high, since most of the code will be in Linux.
Just as a side note, be careful about the hardware drivers that you might need to run on the realtime part.
The RTLinux architecture diagram (not included here) might help you understand how this can be possible.
It sounds like you're on the right track with the RTOS. Different RTOSes prioritize different things, whether robustness, speed, or something else. You will need to figure out whether you need a hard or soft RTOS and, based on what you need, how your scheduler is going to be driven. One thing is for sure: there is a serious difference between using a regular OS and an RTOS.
Note: perhaps for the truest real time system you will need hard event-based resolution so that you can guarantee that your processes will execute when you expect them to.
An RTOS, or real-time operating system, is designed for embedded applications. In a multitasking system which handles critical applications, the operating system must be
1. deterministic in memory allocation,
2. able to allot CPU time to different threads, tasks, and processes,
3. preemptive in its kernel, which means a context switch can happen as soon as a higher-priority task becomes ready rather than only after the current task finishes, etc.
So normal Windows or Linux cannot be used.
Examples of RTOS use in embedded systems: satellites, Formula 1 cars, car navigation systems.
Embedded system: a system which is designed to perform a single function or a few dedicated functions.
System with an RTOS: can also be an embedded system, but naturally an RTOS will be used in a real-time system which needs to perform many functions.
Real-time system: a system which can provide its output in a definite/predicted amount of time. This does not mean that real-time systems are faster.
Difference between the two:
1. Normal embedded systems are not real-time systems.
2. Systems with an RTOS are real-time systems.
