I'm currently in the process of writing a state machine in C for a microcontroller (a TI MSP430). Now, I don't have any problems with writing the code and implementing my design, but I am wondering how to prove the state machine logic without having to use the actual hardware (which, of course, isn't yet available).
Using debugging features, I can simulate interrupts (although I haven't yet tried to do this, I'm just assuming it will be okay - it's documented after all) and I have defined and reserved a specific area of memory for holding TEST data, which, using debugging macros, I can access at runtime from outside the application in a Python script. In other words, I have some test foundations in place. However, the focus of my question is this:
"How best do I force a certain state machine flow for decisions that require hardware input, e.g., for when an input pin is high or low". For example, "if some pin is high, follow this path, otherwise follow this path".
Again, using debugging macros, I can write to registers outside of the application (for example, to light an LED), but I can't (understandably) write to the read-only registers used for input, and so forcing a state machine flow in the way described above is proving taxing.
I had thought of using #ifdefs, where if I wanted to test flow I could use an output pin and check this value instead of the input pin that would ultimately be used. However, this will no doubt pepper my codebase with test-only code, which feels like the wrong approach to take. Does anyone have any advice on a good way of achieving this level of testing? I'm aware that I could probably just use a simulator, but I want to use real hardware wherever possible (albeit an evaluation board at this stage).
Sounds like you need abstraction.
Instead of hard-coding input reads (e.g. GPIO register reads) into the "application" code (the state machine), encapsulate those reads in functions that perform the check and return the value. Inside such a function you can put #ifdef'd code that reads from your TEST memory area instead, and thus simulates a response from the GPIO pin that isn't there.
This should be possible even if you're aiming for high performance: it's not a lot of overhead, and if you work at it, you should be able to inline the functions.
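As a rough sketch (the pin choice and the test-memory symbol are invented; P1IN and BIT3 come from the MSP430 device header), such a wrapper might look like:

#include <stdint.h>
#include <msp430.h>                       /* device header: provides P1IN, BIT3 */

#ifdef TEST_BUILD
/* Hypothetical symbol the linker script places in the reserved TEST memory area;
   the Python test script writes the desired pin pattern here via debugging macros. */
extern volatile uint8_t test_input_pins;
#endif

/* Returns nonzero when "input pin 3" is high. */
static inline int input_pin3_is_high(void)
{
#ifdef TEST_BUILD
    return (test_input_pins & BIT3) != 0; /* value injected by the test harness */
#else
    return (P1IN & BIT3) != 0;            /* real GPIO read */
#endif
}

Build the test image with -DTEST_BUILD and the production image without it; the state machine code that calls input_pin3_is_high() is identical in both.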
Even though you don't have all the hardware yet, you can simulate pretty much everything.
A possible way of doing it in C...
Interrupt handlers = threads waiting on events.
Input devices = threads firing the above events. They can be "connected" to the PC keyboard, so you initiate "interrupts" manually. Or they can have their own state machines to do whatever necessary in an automated manner (you can script those too, they don't have to be hardwired to a fixed behavior!).
Output devices = likewise threads. They can be "connected" to the PC display, so you can see the "LED" states. You can log outputs to files as well.
I/O pins/ports can be just dedicated global variables. If you need to wake up I/O device threads upon reading/writing from/to them, you can do so too. Either wrap accesses to them into appropriate synchronization-and-communication code or even map the underlying memory in such a way that any access to these port variables would trigger a signal/page fault whose handler would do all the necessary synchronization and communication for you.
And the main part is in, well, main(). :)
This will create an environment very close to the real. You can even get race conditions!
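As a rough illustration of the "pins as globals, devices as threads" idea (names invented, POSIX threads assumed), an "input device" thread can flip a global "pin" while the state machine polls it, just as the real code would poll a GPIO register:

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <unistd.h>

/* Simulated I/O port: one global per pin (or a bitmask per port). */
static atomic_int sim_pin_high = 0;

/* "Input device" thread: toggles the pin once a second, like a slow external signal. */
static void *input_device(void *arg)
{
    (void)arg;
    for (;;) {
        sleep(1);
        atomic_fetch_xor(&sim_pin_high, 1);
    }
    return NULL;
}

int main(void)
{
    pthread_t dev;
    pthread_create(&dev, NULL, input_device, NULL);

    for (;;) {                            /* stand-in for the real state machine loop */
        if (atomic_load(&sim_pin_high))
            printf("pin high -> take path A\n");
        else
            printf("pin low  -> take path B\n");
        usleep(100000);
    }
}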
If you want to be even more hardcore about it and if you have time, you can simulate the entire MSP430 as well. The instruction set is very compact and simple. Some simulators exist today, so you have some reference code to leverage.
If you want to test your code well, you will need to make it flexible enough for the purpose. This may include adding #ifdefs, macros, explicit parameters in functions instead of accessing global variables, pointers to data and functions, which you can override while testing, all kinds of test hooks.
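One common test hook, sketched here with invented names, is to route the hardware read through a function pointer that production code points at the real reader and a test can temporarily repoint at a stub:

#include <stdint.h>

/* Production implementation (would touch the real GPIO register). */
static uint8_t read_pins_hw(void) { /* ... real register read ... */ return 0; }

/* Hook: tests may repoint this at a stub before exercising the state machine. */
uint8_t (*read_pins)(void) = read_pins_hw;

/* Example test stub forcing the "pin high" branch. */
static uint8_t read_pins_forced_high(void) { return 0x10; }

void test_force_high_path(void)
{
    read_pins = read_pins_forced_high;
    /* ... run a state machine step and assert on the resulting state ... */
    read_pins = read_pins_hw;
}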
You should also think of splitting the code into hardware-specific parts, very hardware-specific parts and plain business logic parts, which you can compile into separate libraries. If you do so, you'll be able to substitute the real hardware libs with test libs simulating the hardware.
Anyhow, you should abstract away the hardware devices and use test state machines to test production code and its state machines.
Build a test bench. First off, I recommend that when you read the input registers (for example), you use some sort of function call rather than a volatile access to some magic address. Basically, everything gets at least one layer of abstraction. Now your main application can easily be lifted and placed anywhere, with test functions standing in for each of the abstractions. You can completely test that code without any of the real hardware. Also, once on the real hardware, you can use the abstraction (wrapper function, whatever you want to call it) as a way to change or fake the input.
switch(state)
{
    case X:
        r = read_gpio_port();             /* abstracted hardware read */
        if(r & 0x10) next_state = Y;      /* branch on bit 4 of the port value */
        break;
}
In a test bench (or even on hardware):
unsigned int test_count;

unsigned int read_gpio_port(void)
{
    /* Test stub: returns an incrementing value instead of touching the hardware. */
    test_count++;
    return test_count;
}
Eventually implement read_gpio_port in asm or C to access the gpio port, and link that in with the main application instead of the test code.
Yes, you suffer a function call unless you inline, but in return your debugging and testing abilities are significantly greater.
BACKGROUND
I'm integrating micropython into my custom cooperative multitasking OS (no, my company won't change to preemptive).
Micropython uses garbage collection, and this takes much more time than my allotted time slice even when there's nothing to collect (i.e. I called it twice in a row, timed it, and it still takes A LOT of time).
OBVIOUS SOLUTION
Yes, I could refactor the micropython source, but then whenever there's a change . . .
IDEAL SOLUTION
The ideal solution would involve calling some function void pause(&func_in_call_stack) that would jump out, leaving the stack intact, all the way to the function that is at the top of the call stack, say main. And resume would . . . resume.
QUESTION
Is it possible, using C and assembly, to implement pause?
UPDATE
As I wrote this, I realize that the C-based exception handling code nlr_push()/nlr_pop() already does most of what I need.
Your question is about implementing context switching. As we've covered fairly exhaustively in comments, support for context switching is among the key characteristics of any multitasking system, and of a multitasking OS in particular. Inasmuch as you posit no OS support for context switching, you are talking about implementing multitasking for a single-tasking OS.
That you describe the OS as providing some kind of task queue ("to relinquish control, a thread must simply exit its run loop") does not change this, though to some extent we could consider it a question of semantics. I imagine that a typical task for such a system would operate by creating and executing a series of microtasks (the work of the "run loop"), providing a shared, mutable memory context to each. Such a run loop could safely exit and later be reentered, to resume generating microtasks from where it left off.
Dividing tasks into microtasks at boundaries defined by affirmative application action (i.e. your pause()) would depend on capabilities beyond those provided by ISO C. Very likely, however, it could be done with the help of some assembly, plus some kind of framework support. You need at least these things:
A mechanism for recording a task's current execution context -- stack, register contents, and maybe other details. This is inherently system-specific.
A task-associated place to store recorded execution context. There are various ways in which such a thing could be established. Promising alternatives include (i) provided by the OS; (ii) provided by some kind of userland multi-tasking system running on top of the OS; (iii) built into the task by the compiler.
A mechanism for restoring recorded execution context -- this, too, will be system-specific.
If the OS does not provide such features, then you could consider the (now removed) POSIX context system as a model interface for recording and restoring execution context. (See makecontext(), swapcontext(), getcontext(), and setcontext().) You would need to implement those yourself, however, and you might want to wrap them to present a simpler interface to applications. Details will be highly dependent on hardware and underlying OS.
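For illustration only, here is a minimal pause/resume sketch in that style, built on the ucontext functions (removed from current POSIX but still available on Linux/glibc); task_body and the stack size are invented:

#include <stdio.h>
#include <ucontext.h>

static ucontext_t main_ctx, task_ctx;
static char task_stack[64 * 1024];

/* pause(): jump back to the scheduler/main, keeping this task's stack intact. */
static void task_pause(void) { swapcontext(&task_ctx, &main_ctx); }

/* resume(): re-enter the task exactly where it paused. */
static void task_resume(void) { swapcontext(&main_ctx, &task_ctx); }

static void task_body(void)
{
    printf("task: step 1\n");
    task_pause();                          /* e.g. GC ran too long, yield the time slice */
    printf("task: step 2\n");
}

int main(void)
{
    getcontext(&task_ctx);
    task_ctx.uc_stack.ss_sp = task_stack;
    task_ctx.uc_stack.ss_size = sizeof task_stack;
    task_ctx.uc_link = &main_ctx;          /* return here when task_body finishes */
    makecontext(&task_ctx, task_body, 0);

    task_resume();                         /* runs until the first pause */
    printf("main: task is paused, doing other work\n");
    task_resume();                         /* finishes the task */
    return 0;
}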
As an alternative, you might implement transparent multitasking support for such a system by providing compilers that emit specially instrumented code (i.e. even more specially instrumented than you otherwise need). For example, consider compilers that emit bytecode for a VM of your own design. The VMs in which the resulting programs run would naturally track the state of the program running within, and could yield after each sequence of a certain number of opcodes.
Assume that a large file is saved on disk and I want to run a computation on every chunk of data contained in the file.
The C/C++ code that I would write to do so would load part of the file, then do the processing, then load the next part, then do the processing of this next part, and so on.
If I am, however, interested to do so in the shortest possible time, I could actually do the following: First, tell DMA-controller to load first part of the file. When this part is loaded tell the DMA-controller to load the second part (in some other part of the memory) and then immediately start processing the first part.
If I get an interrupt from the DMA during processing the first part, I finish the first part and afterwards tell the DMA to overwrite it with the third part of the file; then I process the second part.
If I do not get an interrupt from the DMA during processing the first part, I finish the first part and wait for the interrupt of the DMA.
Depending on how long the processing takes in relation to the disk read, this should be up to twice as fast. In reality, of course, one would have to measure. But that is not the question I am asking.
The question is: Is it possible to do this a) in C using some non-standard extension or b) in assembly? Or do operating systems not allow such things in general? The question is meant primarily in a single-thread context, although I also would be interested to know how to do it with two threads. Also, I am not primarily interested in specific code; this is more of a theoretical question.
You're right that you will not get the benefit of this by default, because a blocking read stops your thread from doing any processing. Hans is right that modern OSes already take care of all the little details of DMA and interrupt completion routines.
You need to use the architecture you've described, of issuing a request in advance of when you will use the data. Issue asynchronous I/O requests (on Windows these are called OVERLAPPED). Then the flow will go exactly as you envision, but the DMA and interrupts are handled in the drivers.
On Windows, take a look at FILE_FLAG_OVERLAPPED (to CreateFile) and ReadFile (if you like events) or ReadFileEx (if you like callbacks). If you don't have to process the data in any particular order, then add a completion port to the mix, which queues the completion responses.
On Linux, OSX, and many other Unix-like OSes, look at aio_read. Or fadvise. Or use mmap with madvise.
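To make the double-buffering concrete on the POSIX side, here is a rough sketch using aio_read (link with -lrt on Linux; the file name is invented and error handling is trimmed): while one chunk is being processed, the read of the next chunk is already in flight.

#include <aio.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define CHUNK (1 << 20)
static char buf[2][CHUNK];

static void process(const char *data, ssize_t len) { (void)data; (void)len; /* ... computation ... */ }

int main(void)
{
    int fd = open("bigfile.bin", O_RDONLY);      /* hypothetical input file */
    struct aiocb cb[2];
    off_t offset = 0;
    int cur = 0;

    /* Kick off the read of the first chunk. */
    memset(&cb[cur], 0, sizeof cb[cur]);
    cb[cur].aio_fildes = fd;
    cb[cur].aio_buf    = buf[cur];
    cb[cur].aio_nbytes = CHUNK;
    cb[cur].aio_offset = offset;
    aio_read(&cb[cur]);

    for (;;) {
        /* Wait for the current chunk, then immediately start reading the next one. */
        const struct aiocb *wait_list[1] = { &cb[cur] };
        aio_suspend(wait_list, 1, NULL);
        ssize_t got = aio_return(&cb[cur]);
        if (got <= 0) break;

        int next = 1 - cur;
        offset += got;
        memset(&cb[next], 0, sizeof cb[next]);
        cb[next].aio_fildes = fd;
        cb[next].aio_buf    = buf[next];
        cb[next].aio_nbytes = CHUNK;
        cb[next].aio_offset = offset;
        aio_read(&cb[next]);

        process(buf[cur], got);                   /* overlaps with the next read */
        cur = next;
    }
    close(fd);
    return 0;
}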
And you can get these benefits without even writing native code. .NET recently added the ReadAsync method to its FileStream, which can be used with continuation-passing style in the form of Task objects, with async/await syntactic sugar in the C# compiler.
Typically, in a multi-mode (user/system) operating system, you do not have access to direct dma or to interrupts. In systems that extend those features from kernel(system) mode down to user mode, the overhead eliminates the benefit of using them.
Ignoring that what you're asking to do requires a very specialized environment to support it, the idea is sound and common: declaring two (or more) buffers to enable DMA to the next while you process the first. When two buffers are used they're sometimes referred to as ping-pong buffers.
I am aware that one cannot listen for, detect, and perform some action upon encountering context switches on Windows machines via managed languages such as C#, Java, etc. However, I was wondering if there was a way of doing this using assembly (or some other language, perhaps C)? If so, could you provide a small code snippet that gives an idea of how to do this (as I am relatively new to kernel programming)?
What this code will essentially be designed to do is run in the background on a standard Windows UI and listen for when a particular process is either context-switched in or out of the CPU. Upon detecting either of these actions, it will send a signal. To clarify, I am looking to detect only the context switches directly involving a specific process, not all context switches. What I ultimately would like to achieve is to notify another machine (via a signal sent over the internet) whenever a specific process begins making use of the CPU, as well as when it ceases doing so.
My first attempt at doing this involved simply calculating the CPU usage percentage of the specific process, but this ultimately proved to be too coarse-grained to catch the most minute calculations. For example, I wrote a test program that simply performed the operation 2+2 and placed the answer inside an int. The CPU usage method did not pick up on this. Thus, I am looking for something lower level, hence the origin of this question. If there are potential alternatives, I would be more than happy to field them.
There's Event Tracing for Windows (ETW), which you can configure to receive messages about a variety of events occurring in the system.
You should be able to receive messages about thread scheduling events. The CSwitch class of events is for that.
Sorry, I don't know any good ETW samples that you could easily reuse for your task. Read MSDN and look around.
Simon pointed out a good link explaining why ETW can be useful. Very enlightening: http://randomascii.wordpress.com/2012/05/11/the-lost-xperf-documentationcpu-scheduling/
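To give an idea of the shape of an ETW consumer (this is a heavily condensed sketch, not production code: it needs administrator rights, ignores errors, links against advapi32, and leaves the thread-ID-to-process mapping to you), the kernel logger is started with the CSwitch flag and events are pumped through a callback:

#define INITGUID
#include <windows.h>
#include <evntrace.h>
#include <evntcons.h>
#include <stdio.h>
#include <stdlib.h>

/* CSwitch events arrive from the kernel Thread provider with opcode 36;
   the payload starts with NewThreadId and OldThreadId (both ULONG). */
static void WINAPI on_event(PEVENT_RECORD rec)
{
    if (IsEqualGUID(&rec->EventHeader.ProviderId, &ThreadGuid) &&
        rec->EventHeader.EventDescriptor.Opcode == 36 &&
        rec->UserDataLength >= 2 * sizeof(ULONG)) {
        ULONG *ids = (ULONG *)rec->UserData;
        /* Map ids[0]/ids[1] (thread IDs) to the watched process here. */
        printf("switched in: tid %lu, out: tid %lu\n", ids[0], ids[1]);
    }
}

int main(void)
{
    ULONG size = sizeof(EVENT_TRACE_PROPERTIES) + sizeof(KERNEL_LOGGER_NAME);
    EVENT_TRACE_PROPERTIES *props = calloc(1, size);
    TRACEHANDLE session;

    props->Wnode.BufferSize = size;
    props->Wnode.Guid = SystemTraceControlGuid;
    props->Wnode.ClientContext = 1;                      /* QPC timestamps */
    props->Wnode.Flags = WNODE_FLAG_TRACED_GUID;
    props->EnableFlags = EVENT_TRACE_FLAG_CSWITCH;       /* ask for context switches */
    props->LogFileMode = EVENT_TRACE_REAL_TIME_MODE;
    props->LoggerNameOffset = sizeof(EVENT_TRACE_PROPERTIES);
    StartTrace(&session, KERNEL_LOGGER_NAME, props);     /* NT Kernel Logger session */

    EVENT_TRACE_LOGFILE trace = {0};
    trace.LoggerName = KERNEL_LOGGER_NAME;
    trace.ProcessTraceMode = PROCESS_TRACE_MODE_REAL_TIME | PROCESS_TRACE_MODE_EVENT_RECORD;
    trace.EventRecordCallback = on_event;

    TRACEHANDLE consumer = OpenTrace(&trace);
    ProcessTrace(&consumer, 1, NULL, NULL);              /* blocks, pumping events */
    CloseTrace(consumer);
    return 0;
}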
Please see the updates below. In particular update 3: ETW appears to be the way to go.
In theory you could install your own trap handler for the old int 2Eh and the new sysenter. However, in practice this isn't going to be as easy anymore as it used to be because of Patchguard (since Vista) and signing requirements. I'm not aware of any other generic means to detect context switches, meaning you'd have to roll your own. All context switches of the OS go through call gates (the aforementioned trap handlers) and ReactOS allows you to peek behind the scenes if you feel uncomfortable with debugging/disassembling.
However, in either case there shouldn't be a generic way to install something like this without kernel mode privileges (usually referred to as ring 0) - anything else would be a security flaw in Windows. I'm not aware of a Windows-supplied method to achieve what you want either.
The book "Undocumented Windows NT" has a pretty good chapter about the exact topic (although obviously targeted at the old int 2Eh method).
If you can live with hooking only certain functions, you may be able to get away with some filter driver(s) or user-mode API hooking. Depends on your exact requirements.
Update: reading your updated question, I think you need to read up on the internals, in particular on the concept of IRQLs (not to be confused with IRQs from DOS times) and the scheduler. The problem is that there can - and usually will - be literally hundreds of context switches every second. However, your watcher process (the one watching for context switches) will, like any user-mode process, be preemptable. This means that there is no way for you to achieve real-time signaling or anything close to it, which puts a big question mark over the method.
What is it actually that you want to achieve? The number of context switches doesn't really give you anything. Every single SEH exception will cause a context switch. What is it that you are interested in? Perhaps performance counters cater to your needs better?
Update 2: the sheer number of context switches, even for a single thread, within a single second will be flabbergasting. So assuming you'd install your own trap handler, you'd still end up (adversely) affecting all other threads on the system (after all, you'd catch every context switch, then check whether it involves the process/threads you care about, and then do your thing or pass it on).
If you could tell us what you ultimately want to achieve, not with the means already pre-defined, we may be able to suggest alternatives.
Update 3: so apparently I was wrong in one respect here. Windows comes with something on board that signals context switches, and ETW can be harnessed to tap into it. Thanks to Simon for pointing it out.
How can I emulate a memory I/O device for unit testing on Linux?
I'm writing a unit test for some source code for embedded deployment.
The code is accessing a specific address space to communicate with a chip.
I would like to unit test(UT) this code on Linux.
The unit test must be able to run without human intervention.
I need to run the UT as a normal user.
The code being tested must be exactly the source code that runs on the target system.
Any ideas of where I could go for inspiration on how to solve this?
Can an ordinary user somehow tell the MMU that a particular memory allocation must be made at a specific address?
Or that a data block must reside in a particular memory area?
As I understand it:
SIGSEGV can't be used, since after the return from the handler the same memory-access code will be executed again and fail again (or, by accident, the memory area might actually contain valid data, just not what I would like).
Thanks
Henry
First, make the address to be read an injected dependency of the code, instead of a hard-coded dependency. Now you don't have to worry about the location under test conditions, it can be anything you like.
Then, you may also need to inject a function to read/write from/to the magic address as a dependency, depending on what you're testing. Now you don't have to worry about how to trick the code being tested into thinking it's performing I/O. You can stub/mock/whatever the hardware I/O behavior.
It's quite difficult to test low-level code under the conditions you describe, whilst also keeping it super-efficient in non-test mode, because you don't want to introduce too many levels of indirection.
"Exactly the source code" can hide a multitude of sins, though, depending how you interpret it. For example, your "dependency injection" could be via a macro, so that the unit source is "the same", but you've completely changed what it does with a sneaky -D compiler option.
AFAIK you need to create a block device (I am not sure whether a character device will work). Create a kernel module that maps that memory range to itself.
Create read/write functions, so that whenever that memory range is touched, those read/write functions are called.
Register those read/write functions with the kernel, so that whenever there is a read/write to those addresses, the kernel is invoked and performs the read/write on behalf of the user.
What is the best way to permit C code to regularly access the instantaneous value of an integer generated by a separate LabVIEW program?
I have time-critical C code that controls a scientific experiment and records data once every 20ms. I also have some LabVIEW code that operates a different instrument and outputs an integer value every 100ms. I want my C code to be able to record the value from LabVIEW. What is the best way to do this?
One idea is to have LabVIEW write the integer to a file in a loop, and have the C code read the value from the file in a loop. (I could add a second thread to my C code if necessary.) LabVIEW can also link to C DLLs, so I might be able to write a DLL in C that somehow facilitates sharing between the two programs. Is that advisable? How would I do that?
I have a similar application here and use TCP sockets with the TCP_NODELAY option set (this disables the Nagle algorithm, which does some packet buffering). Sockets should allow for a 100ms update rate without problems, although the actual network delay will always remain an unknown variable. For my application this does not matter as long as it stays under a certain limit (this is also checked for by sending a timestamp with each packet, and big red dialog boxes if the timestamp delta becomes too large :]). Does it matter for your application? I.e., is it important that whenever the LV instrument acquires a new sample, its value makes it to the C app within x ms?
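For reference, disabling Nagle from C is a one-liner (POSIX sockets shown; on Windows it is the same setsockopt call after the WSA setup):

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Disable Nagle buffering on an already-connected TCP socket. */
int disable_nagle(int sock)
{
    int one = 1;
    return setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &one, sizeof one);
}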
You might get the DLL approach working, but it's not as straightforward as sockets and it will make the two applications more dependent on each other. Variable access will be pretty much instantaneous though. I see at least two possibilities:
put your entire C app in a DLL (might seem a weird approach at first, but it works), and have LV load it and call methods on it. E.g., to start your app, LV calls the DLL's Start() method; then, in the loop where LV acquires its samples, it calls the DLL's NewSampleValue() method or similar. This also means your app cannot run standalone unless you write a separate host process for it.
look into shared process memory, and have the C app and another dll share common memory. LV will load that dll and call a method on it to write a value to the shared memory, then the C app can read it after polling a flag (which needs a lock!).
it might also be possible to have the C app call the LV program using dll/activeX/? calls, but I don't know how that system works.
I would definitely stay away from the file approach: disk I/O can be a real bottleneck, and it also has the locking problem, which is messy to solve with files. The C app cannot read the file while LV is writing it, and vice versa, which might introduce extra delays.
On a side note, you can see that each of the approaches above uses either a push or a pull model (the TCP one can be implemented both ways); this might affect your final decision of which way to go. Push = LV signals the C app directly; pull = the C app has to poll a flag or ask LV for the value.
I'm an employee at National Instruments and I wanted to make sure you didn't miss the Network Variable API that is provided with LabWindows/CVI, the National Instruments C development environment. The Network Variable API will allow you to easily communicate with the LabVIEW program over Shared Variables (http://zone.ni.com/devzone/cda/tut/p/id/4679). While reading these links, note that a Network Variable and a Shared Variable are the same thing - the different names are unfortunate...
The nice thing about the Network Variable API is that it allows easy interoperability with LabVIEW, it provides a strongly typed communication mechanism, and it provides a callback model for notification when the Network/Shared variable's properties (such as value) change.
You can obtain this API by installing LabWindows/CVI, but it is not necessary to use the LabWindows/CVI environment. The header file is available at C:\Program Files\National Instruments\CVI2010\include\cvinetv.h, and the .lib file located at C:\Program Files\National Instruments\CVI2010\extlib\msvc\cvinetv.lib can be linked in with whatever C development tools you are using.
I followed up on one of @stijn's ideas:
have the C app and another dll share common memory. LV will load that dll and call a method on it to write a value to the shared memory, then the C app can read it after polling a flag (which needs a lock!).
I wrote the InterProcess library, available here: http://github.com/samuellab/InterProcess
InterProcess is a compact general library that sets up windows shared memory using CreateFileMapping() and MapViewOfFile(). It allows the user to seamlessly store values of any type (int, char, your struct.. whatever) in an arbitrary number of named fields. It also implements Mutex objects to avoid collisions and race conditions, and it abstracts away all of this in a clean and simple interface. Tested on Windows XP. Should work with any modern Windows.
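For reference, the core of that pattern looks roughly like this (independent of the InterProcess library itself; names are invented and error handling is omitted): both processes open the same named mapping and guard it with a named mutex.

#include <windows.h>
#include <stdio.h>

/* Both the LabVIEW-loaded DLL and the C app run this same setup code. */
int main(void)
{
    HANDLE map = CreateFileMapping(INVALID_HANDLE_VALUE, NULL, PAGE_READWRITE,
                                   0, sizeof(int), "Local\\demo_shared_int");
    volatile int *value = MapViewOfFile(map, FILE_MAP_ALL_ACCESS, 0, 0, sizeof(int));
    HANDLE lock = CreateMutex(NULL, FALSE, "Local\\demo_shared_int_lock");

    /* Writer side: */
    WaitForSingleObject(lock, INFINITE);
    *value = 42;                          /* the integer produced every 100 ms */
    ReleaseMutex(lock);

    /* Reader side (in the other process): take the mutex, copy the value, release. */
    WaitForSingleObject(lock, INFINITE);
    printf("current value: %d\n", *value);
    ReleaseMutex(lock);

    UnmapViewOfFile((LPCVOID)value);
    CloseHandle(map);
    CloseHandle(lock);
    return 0;
}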
For interfacing between my existing C code and labview, I wrote a small wrapper DLL that sits on top of InterProcess and exposes only the specific functions that my C code or labview need to access. In this way, all of the shared memory is completely abstracted away.
Hopefully someone else will find this code useful.