Context
I'm writing some libraries to manage an internet protocol over GPRS. Some parts of this communication (done over UART) are rather slow (some operations can take more than 30 seconds) because the module has to connect through GPRS.
First I made a driver library to control the module and manage TCP/IP connections. This library worked with blocking functions; for example, a function like Init_GPRS_connection() could take several seconds to finish. I have been made aware that this is bad practice, because now I have to implement a watchdog timer, and this kind of function is not friendly with the short timeouts watchdogs have (I cannot kick the timer before it expires).
What I have thought
I need to rewrite part of my libraries to be watchdog friendly. For this purpose I have thought of the following scheme: functions with a state machine inside, which poll data acquired through UART interrupts to advance through the states, so that I can write code like:
GPRS_typef Init_GPRS_connection(void)
{
    switch(state){ // state is a variable (global to the driver) holding the current state of the state machine
    ...           // here would be all the other states of the state machine
    case end:
        state = 0;
        return Done;
    }
}
while(Init_GPRS_connection() != Done){
    Do_stuff(); // like kicking the watchdog
}
But I see a few problems with this solution:
This is a less user-friendly implementation. The user has to be careful when using this library driver, because extra lines of code are always necessary (which partly defeats the purpose of using functions).
If, for some reason, the module stops answering at some point, the code gets stuck in the state machine, yet the watchdog keeps being kicked outside this function even though the code is stuck in a loop. This rather defeats the purpose of using a watchdog timer.
My question
What kind of implementation should I use to make a user- and watchdog-friendly driver library? How do other driver libraries manage this?
Extra information
All this is in the context of embedded systems.
I would like to implement the watchdog kicking action outside the driver's functions
Given where you are, and assuming you do not want too much upheaval to your project to "do it properly", what you might do is add a variable watchdog timeout extension: set a counter that is decremented in a timer interrupt, and if the counter is not zero, the watchdog is reset.
That way you are not allowing the timer interrupt to reset the watchdog indefinitely while your main thread is stuck, but you can extend the watchdog immediately before executing any blocking code, essentially setting a timeout for that operation.
So you might have (pseudocode):
static volatile uint8_t wdg_repeat_count = 0 ;
void extendWatchdog( uint8_t repeat ) { wdg_repeat_count = repeat ; }
void timerISR( void )
{
    if( wdg_repeat_count > 0 )
    {
        resetWatchdog() ;
        wdg_repeat_count-- ;
    }
}
Then you can either:
extendWatchdog( CONNECTION_INIT_WDG_TIMEOUT ) ;
while( Init_GPRS_connection() != Done ){
    Do_stuff(); // like kicking the watchdog
}
or continue to use your existing non-state-machine based solution:
extendWatchdog( CONNECTION_INIT_WDG_TIMEOUT ) ;
bool connected = Init_GPRS_connection() ;
if( connected ) ...
The idea is compatible with both what you have and what you propose; it simply allows you to extend the watchdog timeout beyond that dictated by the hardware.
I suggest a uint8_t because it prevents a lazy developer from simply setting a large value and effectively disabling the watchdog protection, and because it is likely to be atomic and therefore shareable between the main and interrupt contexts.
All that said, it would clearly have been better to design in your integrity infrastructure from the outset, at the architectural level, rather than trying to bolt it on after the event. For example, if you were using an RTOS, you might reset the watchdog in a low-priority task that, if starved, would cause a watchdog expiry; that "watchdog task" could be used to monitor the other tasks to ensure they are scheduling as expected.
Without an RTOS you might have a "big-loop" architecture with each "task" implemented as a state machine. In your example you seem to have missed the point of a state machine: "initialising connection" should be a single state of a high-level state machine, and the internals of that state may themselves be a state machine (hierarchical state machines). So your entire system would be a single master state machine in the main loop, with the watchdog reset once at each loop iteration. Nothing in any sub-state should block, to ensure the loop time is low and deterministic. That is how, for example, the Arduino framework's loop() function should work (when done properly - unfortunately seldom the case in examples).

To understand how to implement a real-time deterministic state-machine architecture you could do worse than to look at the work of Miro Samek. The framework described therein is available via his company.
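To make that concrete, here is a minimal sketch of such a big-loop architecture (all task and function names here are illustrative, not from any particular framework):

void systemInit( void ) ;
void gprsTask( void ) ;
void uartTask( void ) ;
void applicationTask( void ) ;
void resetWatchdog( void ) ;

int main( void )
{
    systemInit() ;

    for( ;; )
    {
        gprsTask() ;        // advances the GPRS state machine one step, never blocks
        uartTask() ;        // drains the UART buffer filled by the receive ISR
        applicationTask() ; // everything else, also non-blocking

        resetWatchdog() ;   // the one and only watchdog kick
    }
}

Because no task ever blocks, the worst-case loop time is the sum of the tasks' worst-case step times, and a stuck sub-state machine shows up as a watchdog reset rather than a silent hang.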
You should make your library non-blocking, but other than that, you should not worry about the watchdog at all. The watchdog management should be left to the user.
To allow the user to do other work while your library is waiting, you can use these approaches:
1. Provide a function to feed data into your library (e.g. receive()). The user calls this function when data is available, for example from the interrupt. As this function can be called from an interrupt, make sure it does not do heavy processing; typically, you would just buffer the data and process it later (step 2).
2. Provide a function that the user calls periodically, which updates the state of your library and does any other housekeeping tasks (like timeout detection). Typically, this function is called run(), process(), tick() or something along these lines. The user would call this function in their main loop or from a dedicated RTOS task.
3. Provide a way to tell the user the state of your library. You can do it either with some sort of getState() function, with a callback, or both. Based on this information, the user can implement their own state machine to do things on connect, disconnect etc. A sketch of such an API follows below.
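Put together, the public surface of the library might look something like this (a sketch only; the names and the GPRS_Status type are illustrative, not an existing API):

#include <stdint.h>

typedef enum { GPRS_IDLE, GPRS_CONNECTING, GPRS_CONNECTED, GPRS_ERROR } GPRS_Status;

/* 1. Called from the UART RX interrupt: only buffers the byte. */
void GPRS_receive( uint8_t byte );

/* 2. Called periodically from the main loop or an RTOS task:
      parses buffered data, advances the internal state machine,
      checks timeouts. Never blocks. */
void GPRS_process( void );

/* 3. State reporting: poll it... */
GPRS_Status GPRS_getState( void );

/* ...or register for change notifications. */
void GPRS_setCallback( void (*on_change)( GPRS_Status new_state ) );

The user's main loop then stays in control of the watchdog:

while( 1 )
{
    GPRS_process();
    /* react to GPRS_getState() here */
    kickWatchdog(); /* entirely the user's business */
}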
Related
I am using FreeRTOS on an ARM Cortex-A9 CPU and I'm desperately trying to find out if it is possible to determine whether the processor is executing a normal thread or an interrupt service routine. It implements the ARMv7-A architecture.
I found some promising references hinting at the ICSR register (the VECTACTIVE bits), but this register only exists in the Cortex-M family. Is there a comparable register in the Cortex-A family as well? I tried to read out the processor mode from the current program status register (CPSR), but when reading it during an ISR I saw that the mode bits indicate supervisor mode rather than IRQ or FIQ mode.
It looks a lot like there is no way to determine which state the processor is in, but I wanted to ask anyway; maybe I missed something...
The processor has a PL390 Generic Interrupt Controller. Maybe it is possible to determine if an interrupt has been triggered by reading some of its registers?
If anybody can give me a clue I would be very grateful!
Edit1:
The IRQ handler of FreeRTOS switches the processor to supervisor mode:
And subsequently switches back to system mode:
Can I just check whether the processor is in supervisor mode and assume that this means execution is taking place in an ISR, or are there other situations where the kernel may switch to supervisor mode without being in an ISR?
Edit2:
On request I'll add an overall background description of what I want to achieve in the first place by solving the problem of knowing the current execution context.
I'm writing a set of libraries for the Cortex-A9 and FreeRTOS that will access peripherals. Among other things, I want to implement a library for the HW timers available in the processor's periphery.
In order to secure access to the HW and to avoid multiple tasks trying to access the HW resource simultaneously, I added mutex semaphores to the timer library implementation. The first thing a lib function does when called is try to take the mutex. If that fails, the function returns an error; otherwise it continues its execution.
Let's focus on the function that starts the timer:
static ret_val_e TmrStart(tmr_ctrl_t * pCtrl)
{
    ret_val_e retVal = RET_ERR_DEF;
    BaseType_t retVal_os = pdFAIL;
    XTtcPs * pHwTmrInstance;

    // Check status of driver (before dereferencing pCtrl)
    if(pCtrl == NULL)
    {
        return RET_ERR_TMR_CTRL_REF;
    }
    else if(!pCtrl->bInitialized)
    {
        return RET_ERR_TMR_UNINITIALIZED;
    }
    else
    {
        retVal_os = xSemaphoreTake(pCtrl->osSemMux_Tmr, INSTANCE_BUSY_ACCESS_DELAY_TICKS);
        if(retVal_os != pdPASS)
        {
            return RET_ERR_OS_SEM_MUX;
        }
    }

    pHwTmrInstance = (XTtcPs *) pCtrl->pHwTmrInstance; // safe: pCtrl checked above

    // This function starts the timer
    XTtcPs_Start(pHwTmrInstance);
    (...)
Sometimes it can be helpful to start the timer directly inside an ISR. The problem is that, while the rest of the function would support it, the xSemaphoreTake() call MUST be changed to xSemaphoreTakeFromISR(); moreover, no wait ticks are supported when calling from an ISR, in order to avoid blocking inside the ISR.
In order to have code that is suitable for both execution modes (thread mode and IRQ mode), we would need the function to first check the execution state and, based on that, invoke either xSemaphoreTake() or xSemaphoreTakeFromISR() before proceeding to access the HW.
That's the context of my question. As mentioned in the comments, I do not want to implement this by adding a parameter, supplied by the user on every call, that tells the function whether it has been called from a thread or an ISR, as I want to keep the API as slim as possible.
I could take FreeRTOS's approach and implement a copy of TmrStart() under the name TmrStartFromISR(), containing the ISR-specific calls to FreeRTOS's system resources. But I would rather avoid that too, as duplicating all my functions makes the code harder to maintain overall.
So determining the execution state by reading some processor register would be the only way I can think of. But apparently the A9 does not supply this information easily, unlike the M3 for example.
Another approach that just came to my mind would be to set a global variable in the FreeRTOS assembly code that handles exceptions: it could be set in portSAVE_CONTEXT and cleared in portRESTORE_CONTEXT.
The downside of this solution is that the library would then not work with the official A9 port of FreeRTOS, which does not sound good either. Moreover, you could get problems with race conditions if the variable changes right after it has been checked by the lib function; but I guess this would also be a problem when reading the state from a processor register directly... Probably one would need to enclose this check in a critical section that disables interrupts for a short period of time.
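A variation on that idea, which avoids patching the port layer, would be to keep the flag in my own interrupt entry points by wrapping every handler I install (a sketch; the handler names are illustrative, and it only works if every interrupt that can reach the library goes through such a wrapper):

#include <stdbool.h>
#include <stdint.h>

static volatile uint32_t isr_depth = 0; /* > 0 while any wrapped ISR is running */

void TmrIrqHandler(void *arg); /* the real handler (illustrative name) */

void TmrIrqHandlerWrapper(void *arg) /* registered instead of the real handler */
{
    isr_depth++;
    TmrIrqHandler(arg);
    isr_depth--;
}

bool InIsrContext(void)
{
    return isr_depth != 0;
}

TmrStart() could then select between xSemaphoreTake() and xSemaphoreTakeFromISR() based on InIsrContext(). Note that the check-then-use race is less of a concern here than it first appears: a thread-mode caller cannot become an ISR mid-call, and an ISR caller cannot drop back to thread mode before returning.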
If somebody sees other solutions that I did not think of, please do not hesitate to bring them up.
Also please feel free to discuss the solutions I brought up so far.
I'd just like to find the best way to do it.
Thanks!
On a Cortex-A processor, when an interrupt handler is triggered, the processor enters IRQ mode, with interrupts disabled. This is reflected in the mode field of the CPSR. IRQ mode is not suitable for receiving nested interrupts, because if a second interrupt happened, the return address of the first interrupt would be overwritten. So, if an interrupt handler ever needs to re-enable interrupts, it must switch to supervisor mode first.
Generally, one of the first things an operating system's interrupt handler does is switch to supervisor mode. By the time the code reaches a particular driver, the processor is in supervisor mode. So the behavior you're observing is perfectly normal.
A FreeRTOS interrupt handler is a C function. It runs with interrupts enabled, in supervisor mode. If you want to know whether your code is running in the context of an interrupt handler, never call the interrupt handler function directly, and when it calls auxiliary functions that care, pass a variable that indicates who the caller is.
void code_that_wants_to_know_who_called_it(int context)
{
    if (context != 0) {
        /* called from an interrupt handler */
    } else {
        /* called from outside an interrupt handler */
    }
}

void my_handler1(void)
{
    code_that_wants_to_know_who_called_it(1);
}

void my_handler2(void)
{
    code_that_wants_to_know_who_called_it(1);
}

int main(void)
{
    Install_Interrupt(EVENT1, my_handler1);
    Install_Interrupt(EVENT2, my_handler2);
    code_that_wants_to_know_who_called_it(0);
}
I am learning embedded systems on an ARM9 processor (SAM9G20). I am more familiar with general-purpose procedural programming, so what I am doing is going through the datasheet, learning what registers there are and how to manipulate them.
My question is: how do I know when the computer has reset? I know there is a Reset Controller that manages resets, and a register called the Status Register (RSTC_SR) stores the source of the reset. Do I need to keep reading this register periodically?
My solution is to store the number of resets in the FRAM (or start by setting it to 0); once a reset happens, I compare this variable with the register value in my main function. If the register value is higher, then obviously it reset. However, I am sure there is a more optimized way (perhaps using interrupts). Or is this how it's usually done?
You do not need to check periodically, since every time the machine resets your program will restart from the beginning.
Simply add checks to the start-up code, i.e. early in main(), as needed. If you want to figure out things like how often you reset, that is more difficult, since typically (no experience with SAMs, I'm an STM32 type of guy) on-board timers etc. will also reset. Best would be some kind of independent real-world clock, like an RTC that you can poll and whose value you save. Please consider whether you really need this, though.
A simple solution is to exploit the structure of your code.
Many embedded code bases take this form:
int main(void)
{
    // setup stuff here
    while (1)
    {
        // handle stuff here
    }
    return 0;
}
You can exploit the fact that the code above while(1) runs only once, at startup. You could increment a counter there and save it in non-volatile storage; that would tell you how many times the microcontroller has reset.
Another example is Arduino, where the code is structured such that a function called setup() is called once and a function called loop() is called continuously. With this structure, you could increment the variable in the setup() function to achieve the same effect, as in the sketch below.
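A sketch of the boot-counter idea, assuming hypothetical non-volatile read/write helpers (replace them with whatever FRAM/EEPROM driver your board actually provides):

#include <stdint.h>

/* Hypothetical non-volatile storage accessors. */
uint32_t nv_read_u32(uint32_t addr);
void     nv_write_u32(uint32_t addr, uint32_t value);

#define BOOT_COUNT_ADDR 0x0000u /* illustrative address */

int main(void)
{
    /* This runs exactly once per reset, so the stored value
       equals the number of times the microcontroller has booted. */
    uint32_t boot_count = nv_read_u32(BOOT_COUNT_ADDR) + 1u;
    nv_write_u32(BOOT_COUNT_ADDR, boot_count);

    while (1)
    {
        // handle stuff here
    }
    return 0;
}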
Whenever your processor starts up, it has by definition come out of reset. What the reset status register does is indicate the source of, or reason for, the reset: power-on, watchdog timer, brown-out, software instruction, reset pin, etc.
It is not a matter of knowing when your processor has reset - that is implicit by the fact that your code has restarted. It is rather a matter of knowing the cause of the reset.
You need not monitor or read the reset status at all if your application has no need of it, but in some applications it is a useful diagnostic, for example to maintain a count of the various reset causes, as this may be indicative of the stability of your system software, its power supply, or the behaviour of the operators. Ideally you'd want to log the cause with a timestamp, assuming you have a suitable RTC source early enough in your start-up. The timing of resets is often a useful diagnostic where simply counting them may not be.
Any counting of reset causes should occur early in your code's start-up, before any interrupts are enabled (because an interrupt may itself cause a reset). This may require you to implement the counters in the start-up code before main() is invoked, in cases where the start-up code might enable interrupts - for stdio or filesystem support, for example. A sketch of the idea follows.
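As an illustration of counting reset causes early in start-up (the register address and RSTTYP encoding below are from memory for SAM9-class parts; verify them against your datasheet before use):

#include <stdint.h>

/* AT91SAM9-style reset controller status register - verify for your part. */
#define RSTC_SR     (*(volatile uint32_t *)0xFFFFFD04u)
#define RSTC_RSTTYP ((RSTC_SR >> 8) & 0x7u)   /* reset type field */

static uint32_t reset_counts[8]; /* would live in non-volatile storage in practice */

void count_reset_cause(void) /* call early, before enabling interrupts */
{
    /* Typical RSTTYP values on SAM9-class parts: 0 = general (power-on),
       2 = watchdog, 3 = software, 4 = user (NRST pin). */
    reset_counts[RSTC_RSTTYP]++;
}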
One way to do this is to run the code in debug mode (if you have a debugger for the SAM). After a reset, the program counter (PC) points to the address where your code starts.
I asked this question on the EE forum. You guys on StackOverflow know more about coding than we do on EE, so maybe you can give more detailed information about this :)
When I learned about microcontrollers, teachers taught me to always end the code with a while(1); with no code inside that loop.
This was to make sure the software gets "stuck", to keep interrupts working. When I asked them whether it was possible to put some code in this infinite loop, they told me it was a bad idea. Knowing that, I now try my best to keep this loop empty.
I now need to implement a finite state machine in a microcontroller. At first view, it seems that this code belongs in that loop, which makes coding easier.
Is that a good idea? What are the pros and cons?
This is what I plan to do :
void main(void)
{
    // init phase
    while(1)
    {
        switch(current_State)
        {
        case 1:
            if(...)
            {
                current_State = 2;
            }
            else if(...)
            {
                current_State = 3;
            }
            else
            {
                current_State = 4;
            }
            break;
        case 2:
            if(...)
            {
                current_State = 3;
            }
            else if(...)
            {
                current_State = 1;
            }
            else
            {
                current_State = 5;
            }
            break;
        }
    }
}
Instead of:
void main(void)
{
    // init phase
    while(1);
}
and manage the FSM with interrupts.
This is like the rule that says all functions should return in one place, or other such habits. There is one type of design where you might want to do this: one that is purely interrupt/event based. There are products that go completely the other way: polled, and not even interrupt driven. And anything in between.
What matters is doing your system engineering; that's it, end of story. Interrupts add complication and risk; they have a higher price than not using them. Automatically making any design interrupt driven is automatically a bad decision; it simply means no effort was put into the design, the requirements, the risks, etc.
Ideally you want most of your code in the main loop, and your interrupts lean and mean, in order to keep the latency down for other time-critical tasks. Not all MCUs have a complicated interrupt priority system that would allow you to burn a lot of time or have all of your application in handlers. These are inputs into your system engineering; they may help you choose the MCU, but here again you are adding risk.
You have to ask yourself what tasks your MCU has to do; what latency, if any, there is for each task from when an event happens until it has to start responding, and until it has to finish; and, per event/task, what portion of the work, if any, can be deferred. Can any task be interrupted part-way through? Can there be a gap in time? All the questions you would ask for a hardware design, or a CPLD or FPGA design - except that there you have real parallelism.
What you are likely to end up with in real-world solutions is some portion in interrupt handlers and some portion in the main (infinite) loop: the main loop polling breadcrumbs left by the interrupts and/or directly polling status registers to know what to do during the loop. If/when you get to where you need to be real-time, you can still use the main super loop; your real-time response comes from the possible paths through the loop and the worst-case time for any of those paths.
Most of the time you are not going to need to do this much work. Maybe some interrupts, maybe some polling, and a main loop doing some percentage of the work.
As you should know from the EE world, if a teacher/other says there is one and only one way to do something and everything else is by definition wrong... time to find a new teacher, and/or pretend to drink the kool-aid, pass the class and move on with your life. Also note that the classroom experience is not the real world. There are so many things that can go wrong with MCU development that you are really in a controlled sandbox, with ideally only a few variables you can play with, so that you don't have to spend years to get through a few-month class. Some percentage of the rules they state in class are there to get you through the class and/or to get the teacher through the class; it is easier to grade papers if you tell folks a function can't be bigger than X, or no gotos, or whatever. The first thing you should do when the class is over, or add to your lifetime bucket list, is to question all of these rules. Research and try on your own, fall into the traps and dig out.
When doing embedded programming, one commonly used idiom is the "super loop": an infinite loop, entered after initialization is complete, that dispatches the separate components of your program as they need to run. Under this paradigm, you can run the finite state machine within the super loop as you're suggesting, and continue to run the hardware management functions from the interrupt context, as it sounds like you're already doing.

One disadvantage of doing this is that your processor will always be in a high-power-draw state: since you're always running that loop, the processor can never go to sleep. This would actually also be a problem with the code you had written before, however; even an empty infinite while loop keeps the processor running. The usual solution is to end the while loop with a sequence of instructions that puts the processor into a low-power state (completely architecture dependent), from which it wakes when an interrupt arrives to be processed. If there are things happening in the FSM that are not driven by any interrupt, the usual approach to keep the processor waking up at periodic intervals is to configure a timer to interrupt on a regular basis, so that your main loop continues execution.
One other thing to note: if you were previously executing all of your code from the interrupt context, interrupt service routines (ISRs) really should be as short as possible, because they literally "interrupt" the main execution of the program, which may cause unintended side effects if they take too long. A normal way to handle this is to have handlers in your super loop that are merely signalled by the ISR, so that the bulk of whatever processing needs to be done happens in the main context when there is time, rather than interrupting a potentially time-critical section of the main context.
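A minimal sketch combining both points, the ISR-to-main-loop handshake and the low-power wait (the handler and function names are illustrative, and the wfi instruction stands in for whatever sleep primitive your architecture offers):

#include <stdbool.h>

void handle_rx(void);          /* illustrative: the deferred work */
void run_state_machine(void);  /* illustrative: the FSM discussed above */

static volatile bool rx_pending = false;

void UART_IRQHandler(void)     /* illustrative vector name */
{
    rx_pending = true;         /* keep the ISR short: record the event and leave */
}

int main(void)
{
    // init phase
    while (1)
    {
        if (rx_pending)
        {
            rx_pending = false;
            handle_rx();       /* bulk of the work, done in main context */
        }

        run_state_machine();   /* non-blocking */

        __asm volatile ("wfi"); /* ARM: sleep until the next interrupt */
    }
}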
Which one you should implement is your choice, and a matter of how easy your code will be to debug.
There are times when it is right to use the while(1); statement at the end of the code, if your uC handles everything in interrupts (ISRs). In other applications, the uC runs with code inside an infinite loop (the polling method):
while(1)
{
    // code here;
}
And in some other applications, you might mix the ISR method with the polling method.
As for 'debugging easiness': using only the ISR method (putting the while(1); statement at the end) will give you a hard time debugging your code, since when an interrupt event triggers, the debugger of choice will not give you step-by-step event and register following. Also, please note that writing completely ISR-based code is not recommended, since ISR handlers should do minimal work (such as incrementing a counter or raising/clearing a flag) and exit swiftly.
It belongs in one thread that executes it in response to input messages from a producer-consumer queue. All the interrupts etc. feed inputs to the queue, and the thread processes them through its FSM serially.
It's the only way I've found to avoid undebuggable messes while retaining the low latency and efficient CPU use of interrupt-driven I/O.
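A bare-bones version of that queue on a bare-metal target might look like this (a sketch: a real implementation needs overflow handling and, on some targets, memory barriers; it also assumes the ISRs calling queue_put() cannot preempt one another, and fsm_dispatch() is an illustrative name):

#include <stdint.h>

#define QUEUE_SIZE 16u /* power of two, so the modulo stays cheap and wrap-around is safe */

typedef uint8_t event_t;

void fsm_dispatch(event_t e);  /* illustrative: runs the state machine on one event */

static volatile event_t queue[QUEUE_SIZE];
static volatile uint8_t head = 0, tail = 0; /* head: written by ISRs, tail: by the thread */

void queue_put(event_t e)      /* the only thing ISRs do */
{
    queue[head % QUEUE_SIZE] = e; /* no overflow check in this sketch */
    head++;
}

void fsm_thread(void)          /* the single consumer */
{
    for (;;)
    {
        while (tail == head)
            ;                  /* or block/sleep here until an event arrives */

        event_t e = queue[tail % QUEUE_SIZE];
        tail++;
        fsm_dispatch(e);       /* events are processed serially, in arrival order */
    }
}

With a single producer context and a single consumer, each index has exactly one writer, which is what keeps this debuggable.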
'while(1);' UGH!
I'm working on a project for an automotive system where we use the MPC5748 MCU. The application uses an RTOS based on AUTOSAR OS, and this MPC target supports two types of watchdog, software and hardware (we have used the soft WDT).
My mission is to fit an algorithm into this application. The development of the algorithm is done; the problem is that the task in which the algorithm runs is a 1ms task, and the algorithm needs much more time than the time dedicated to that function.
I'm a newbie to the embedded world. By the way, in the algorithm's main function the program resets itself, and this seems to be a timeout generated by the expiration of the watchdog.
My questions are:
Can I disable the watchdog timer for this specific function (it must not stay disabled; this is just for testing purposes)? Is it possible to use a longer timeout for the watchdog in that specific function?
Must I develop another task with a big delay in order to run the algorithm? The problem is that the algorithm needs to be synchronised with the 1ms task, since we are receiving CAN commands.
Can I add a sleep (<1ms) in the desired function in order to wait a little bit without affecting other tasks?
What are other options to try?
NB: This is a general problem with watchdog timers, and any useful information will be very helpful to me. Sorry that I can't share the code.
Can I disable the watchdog timer for this specific function (it must not stay disabled; this is just for testing purposes)? Is it possible to use a longer timeout for the watchdog in that specific function?
Let's forget that one - it is a really bad idea. If it is possible to defeat the watchdog, then it is possible to do so by error, and then the whole point of the watchdog is defeated. Apart from that, it's an XY question - a question about your proposed solution to a different problem. You should ask about the problem directly.
Must I develop another task with a big delay in order to run the algorithm? The problem is that the algorithm needs to be synchronised with the 1ms task, since we are receiving CAN commands.
Yes, you need another task, but you should not add a "big delay"; that is probably unnecessary and certainly a bad design. If the 1ms task needs the result of the algorithm, then the algorithm should run in a service task that is triggered by the 1ms task but runs asynchronously to it; the service task then makes the results available to the 1ms task when they are ready (by shared memory or message passing, perhaps). Alternatively, if the result is not specifically needed by the 1ms task, the service task could take the necessary action independently of the 1ms task.
There are many options, but essentially it seems that your task partitioning is inappropriate: your CAN Rx task should be responsible for receiving CAN messages only, and any action required in response to CAN messages should be deferred to one or more other tasks, perhaps fed from a message queue, as sketched below.
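As a sketch of that partitioning, using FreeRTOS-style queue primitives as a stand-in for whatever queue/event services your AUTOSAR OS provides (all names here are illustrative):

#include "FreeRTOS.h"
#include "queue.h"

typedef struct { uint8_t data[8]; } can_cmd_t;   /* illustrative message type */

bool can_read(can_cmd_t *cmd);                   /* illustrative driver call */
void run_algorithm(const can_cmd_t *cmd);        /* the long-running algorithm */

static QueueHandle_t can_cmd_queue; /* created at init: xQueueCreate(8, sizeof(can_cmd_t)) */

void CanRxTask1ms(void) /* the existing 1ms task: receive and hand off only */
{
    can_cmd_t cmd;
    while (can_read(&cmd))
    {
        xQueueSend(can_cmd_queue, &cmd, 0); /* never block the 1ms task */
    }
}

void AlgorithmServiceTask(void *arg) /* lower-priority worker for the algorithm */
{
    can_cmd_t cmd;
    for (;;)
    {
        if (xQueueReceive(can_cmd_queue, &cmd, portMAX_DELAY) == pdPASS)
        {
            run_algorithm(&cmd); /* may take many milliseconds without starving the 1ms task */
        }
    }
}

The 1ms task keeps meeting its deadline (and its watchdog checkpoint), while the algorithm consumes whatever CPU time is left over.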
What are other options to try?
Software design should not be a matter of trial and error; get the design right, then implement it. However, you might consider whether 1ms is appropriate: is it possible that the period could be extended to encompass the worst-case execution time without causing deadlines to be missed in general? If the answer is "no", then the algorithm does not belong in this task.
I don't think you can disable/delay the watchdog timer, and even if you could, that's not a good option to go for.
The problem, I think, is that the task you are calling has a 1ms period, which is very little time to read CAN messages and then operate on them. I think the minimum task period should be 5ms, and the optimal one 10ms.
Can I disable the watchdog timer for this specific function (it must not stay disabled; this is just for testing purposes)? Is it possible to use a longer timeout for the watchdog in that specific function?
You should never disable the watchdog anywhere in your code.
It might not even be possible: on the MPC5x families you typically set up the watchdog once, and then for safety reasons all watchdog registers turn into read-only registers.
Must I develop another task with a big delay in order to run the algorithm? The problem is that the algorithm needs to be synchronised with the 1ms task, since we are receiving CAN commands.
Ideally you should service the watchdog from one single location in the program. Your CAN peripheral will be FlexCAN, which has a lot of available "mailboxes" for CAN messages. In most cases, you shouldn't need to poll it; a flag will be set when the desired message arrives.
So it isn't obvious to me why you would need a delay to wait for them. Simply do:
void the_task (void)
{
    wdog_refresh();

    ... // do other things

    if(can_message_available)
    {
        // do something with the message
    }

    ... // do other things
}
rather than
// BAD:
while(!can_message_available)
; // do nothing
Even if you need to use the CAN as a FIFO and poll it repeatedly, you would still use the same approach. You'd just have to ensure that the task runs often enough that there will never be an overflow in the FIFO buffer.
I'm using an stm32f103 with GCC and have a task which can be described with the following pseudocode:
void http_server(void) {
    transmit(data, len);
    event = waitfor(data_sent_event | disconnect_event | send_timeout_event);
}

void tcp_interrupt(void) {
    if (int_reg & DATA_SENT) {
        emit(data_sent_event);
    }
}

// main.c
void main(void) {
    run_task(http_server);
}
I know that all embedded OSes offer such functionality, but they are too big for this single task. I don't need preemption, mutexes, queues or other features: just waiting for flags in secondary tasks and raising these flags in interrupts.
I hope someone knows a good tutorial on this topic or has a piece of code implementing the context switching and waiting.
You will probably need to use an interrupt driven finite state machine.
There are a number of IP stacks that are independent of an operating system, or even of interrupts. lwIP (lightweight IP) comes to mind; I used it indirectly as it was provided by Xilinx. The FreeDOS folks may have had one; certainly the Crynwr packet drivers come to mind, on top of which there were no doubt stacks built.
As for the perhaps simpler question: your code sits in a foreground task, in the waitfor() function, which appears to be an infinite loop waiting for some global variables to change. An interrupt comes along, and the interrupt handler, with a lot of stack work (to know it is a TCP interrupt), calls tcp_interrupt(), which modifies the flags; the interrupt finishes, and now waitfor() sees the global flag change. The context switch is the interrupt itself, which is built into the processor; no need for an operating system or anything fancy, just a global variable or two and the ISR. The context switch and flags/events are a freebie compared to the TCP/IP stack. UDP is significantly easier; do you really need TCP, by the way?
If you want more than one of these waitfor()s active - that is, you don't want only the one foreground task sitting in a single waitfor() - then I would do one of two things. Have the foreground task poll: instead of waitfor(something), change it to if(checkfor(something)) { then do something }.
Or set up your system so that the interrupt handler - which in your pseudocode is already complicated enough to know this is TCP packet data - examines the TCP header more deeply and knows to call the http_server() thing for port 80 events, and other functions for other events for which you might have had a waitfor(). So in this case, instead of a multitasking series of functions that are waitfor()ing, create a single list of the events and look for them in the ISR. Use a timer interrupt and globals for the timeouts (reset a counter when a packet arrives, bump the counter on a timer interrupt; if the counter reaches N then a timeout has occurred, so call the timeout handler function).
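For the flag mechanics themselves, something this small may be all you need (a sketch for a single foreground task; on the stm32f103 the wfi instruction sleeps the core until any interrupt fires, and in production the event-clearing step should be done with interrupts briefly disabled to avoid a read-modify-write race):

#include <stdint.h>

#define DATA_SENT_EVENT    (1u << 0)
#define DISCONNECT_EVENT   (1u << 1)
#define SEND_TIMEOUT_EVENT (1u << 2)

static volatile uint32_t pending_events = 0;

void emit(uint32_t event)       /* called from ISRs */
{
    pending_events |= event;
}

uint32_t waitfor(uint32_t mask) /* called from the foreground task */
{
    uint32_t hits;
    while ((hits = (pending_events & mask)) == 0)
    {
        __asm volatile ("wfi"); /* sleep until the next interrupt */
    }
    pending_events &= ~hits;    /* consume the events we return (see race note above) */
    return hits;
}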