Does this sound like a stack overflow? - c

I think I might be having a stack overflow problem or something similar in my embedded firmware code. I am a new programmer and have never dealt with a SO so I'm not sure if that is what's happening or not.
The firmware controls a device with a wheel that has magnets evenly spaced around it and the board has a hall effect sensor that senses when magnet is over it. My firmware operates the stepper and also count steps while monitoring the magnet sensor in order to detect if the wheel has stalled.
I am using a timer interrupt on my chip (8 bit, 8057 acrh.) to set output ports to control the motor and for the stall detection. The stall detection code looks like this...
// Enter ISR
// Change the ports to the appropriate value for the next step
// ...
StallDetector++; // Increment the stall detector
if(PosSensor != LastPosMagState)
{
StallDetector = 0;
LastPosMagState = PosSensor;
}
else
{
if (PosSensor == ON)
{
if (StallDetector > (MagnetSize + 10))
{
HandleStallEvent();
}
}
else if (PosSensor == OFF)
{
if (StallDetector > (GapSize + 10))
{
HandleStallEvent();
}
}
}
this code is called every time the ISR is triggered. PosSensor is the magnet sensor. MagnetSize is the number of stepper steps that it takes to get through the magnet field. GapSize is the number of steps between two magnets. So I want to detect if the wheel gets stuck either with the sensor over a magnet or not over a magnet.
This works great for a long time but then after a while the first stall event will occur because 'StallDetector > (MagnetSize + 10)' but when I look at the value of StallDetector it is always around 220! This doesn't make sense because MagnetSize is always around 35. So the stall event should have been triggered at like 46 but somehow it got all the way up to 220? And I don't set the value of stall detector anywhere else in my code.
Do you have any advice on how I can track down the root of this problem?
The ISR looks like this
void Timer3_ISR(void) interrupt 14
{
OperateStepper(); // This is the function shown above
TMR3CN &= ~0x80; // Clear Timer3 interrupt flag
}
HandleStallEvent just sets a few variable back to their default values so that it can attempt another move...
#pragma save
#pragma nooverlay
void HandleStallEvent()
{
///*
PulseMotor = 0; //Stop the wheel from moving
SetMotorPower(0); //Set motor power low
MotorSpeed = LOW_SPEED;
SetSpeedHz();
ERROR_STATE = 2;
DEVICE_IS_HOMED = FALSE;
DEVICE_IS_HOMING = FALSE;
DEVICE_IS_MOVING = FALSE;
HOMING_STATE = 0;
MOVING_STATE = 0;
CURRENT_POSITION = 0;
StallDetector = 0;
return;
//*/
}
#pragma restore

Is PosSensor volatile? That is, do you update PosSensor somewhere, or is it directly reading a GPIO?
I assume GapSize is rather large (> 220?) It sounds to me like you might have a race condition.
// PosSensor == OFF, LastPosMagState == OFF
if(PosSensor != LastPosMagState)
{
StallDetector = 0;
LastPosMagState = PosSensor;
}
else
{
// Race Condition: PosSensor turns ON here
// while LastPosMagState still == OFF
if (PosSensor == ON)
{
if (StallDetector > (MagnetSize + 10))
{
HandleStallEvent();
}
}
else if (PosSensor == OFF)
{
if (StallDetector > (GapSize + 10))
{
HandleStallEvent();
}
}
}
You should cache the value of PosSensor once, right after doing StallDetector++, so that in the event PosSensor changes during your code, you don't start testing the new value.

This is definitely not stack overflow. If you blew the stack (overflowed it) your application would simply crash. This sounds more like something we used to call memory stomping in my C++ days. You may not be accessing the memory location that the StallDetector value occupies via StallDetector variable alone. There may be another part of your code "stomping" this particular memory location erroneously.
Unfortunately, this kind of issue is very hard to track down. About the only thing you could do is systematically isolate (remove from execution) chunks of your code until you narrow down and find the bug.

Do you have nest ISRs on your system? Could be something along the lines of start your ISR and increment your count, then interrupt it and do it again. Do this enough times and your interrupt stack can overflow. It could also explain such a high counter variable as well.

Does HandleStallEvent() "look at" StallDetector within the ISR or does it trigger something on the main loop? If it's on the main loop, are you clearing the interrupt bit?
Or are you looking at StallDetector from a debugger outside the ISR? Then a retriggered interrupt would use the correct value each time, but execute too many times, and you would only see the final, inflated value.
On second thought, more likely you don't have to clear an interrupt-generating register, but rather the interrupt pin is remaining asserted by the sensor. You need to ignore the interrupt after it's first handled until the line deasserts, such as by having the original ISR disable itself and and reinstall it in a second ISR which handles the 1->0 transition.
You might then also need to add debouncing hardware or adjust it if you have it.

Check your parameter types. If you defined the parameters in a way different than the caller expects then calling your method could overwrite the space that variable is stored in. (For instance if you wrote the function expecting an int but it is pushing a long onto the stack.)

You could see what additional options your debugger supports. In Visual Studio, for example, it is possible to set a "data breakpoint", where you break when a memory location changes (or is set to a certain value, or above a threshold, ...).
If something like this is possible in your case, you could see where the data is changed and if there is someone else writing to the memory erroneously.

Related

how to stop looping and wait until different value received from serial

hi i planning to made a multiple servo controlling with serial as control trigger signal on AVR with C and codevision
but when the trigger is true, the servo running in crazy loop, it back to original position (0 degree) instead of stay on desired position, my tutor give me hint to use "wait ... until" statement with the old data comparison but i'm not found the way to utilize it yet on google
because utilizing break; at the end of the if left the chip freeze until it reset
and the old code(it runs the servo forth and back continously)
while (1)
{
while(UCSRA & (1<<RXC))
{
// Place your code here
//data=UDR;
PORTC=UDR;
data=UDR;
//PORTB=data;
} ;
if (data== 0x0A || data== 0x0B)
{
if (data== 0x0A)
{
old_data=data;
PORTA=0x00;
PORTA.1=1;
movservo0(90,7);
movservo1(15,3);
}
if (data== 0x0B)
{
old_data=data;
PORTA=0x00;
PORTA.1=1;
movservo0(15,3);
movservo1(90,7);
}
}
}
as for movservo0 (another movservo() almost had same code)
void set_servo1(unchar derajat)
{ unchar n;
servo2=1;
delay_us(750);
for(n=0; n<derajat; n++)
{
delay_us(12);
};
servo2=0;
delay_ms(10);
}
void movservo0(unsigned char sudut, unsigned char speed)
{
unchar i;
set_servo1(sudut);
for (i=1;i<=sudut;i+=speed){
set_servo1(i);
delay_ms(100/speed);
}
}
This new code is a bit better.
The top of the while(UCSRA & (1<<RXC)) loop is fine this way. The loop body will only be entered if there is a character to be read. (Although, I'm not sure why you are expecting to read '\n' or vertical tab.)
A minor problem is the way the UDR is read. The act of reading UDR erases the contents. So you should be using
data=UDR;
PORTC=data;
The jumping of the servos appears to be in the moveservo() function. This function always sets the angle to 1 and then gradually increases the angle of the servo until it reaches the desired angle.
setservo() appears to be an attempt to perform PWM to drive a servo, but it doesn't work correctly. To keep the servo at a desired angle, you have to keep switching the pin from 0 to 1 at the correct times, not just once like this function does. Have you looked at the PWM functions of the timers? You just set these up and they run in the background. An alternative is to use a timer to set an interrupt to wake up and toggle the pins as you need.
If you want to do the switching of the pins without interrupts, then you should be using delays in the while(1) loop itself. Just use the setservo() functions to change some variables.
while(1)
{
// read the UART if ready and set the target values
// this part only runs occasionally
while(UCSRA & (1<<RXC))
{
// etc
}
// The rest of the loop body runs every time
// adjust the servo values toward the target values
// use a counter to determine if to adjust during this loop iteration
// e.g., a slow speed counts to a higher number before adjusting
delay(750);
for(int i = 0; i < NUM_INTERVALS; ++i)
{
// decide whether to set pins to 1 or 0 based on the servo values
delay(INTERVAL);
}
}
I don't know about your particular servos, but they typically have a period where most of the time the pin is 0, and then a short time in the period where the pin is 1. You will need to adjust NUM_INTERVALS and INTERVAL and 750 to add up to the correct length of time.
There is so much wrong with this snippet, it is hard to start. It is difficult to say why the servo is moving, as it is only ever "moved" to the same value every time. (Unless, you have omitted some code that sets it to some other value.)
Typically, when receiving UART, the processor should wait until the character is received. This is accomplished by this
while(UCSRA & (1<<RXC) == 0);
Note the ; creating an empty body of the while loop. This is probably what your tutor meant. When the flag is set, then the loop exits and the data is ready to be read.
while(UCSRA & (1<<RXC) == 0);
data = UDR;
Next, you have a block which looks like you meant to be part of a while loop, but it isn't. The body of the loop is the single statement abbove it. The block gets executed every time.
{
if(data=0x0a)
{
olddata=data;
movservo0(90,7); //move servo to certain degree
}
}
Another error is the condition in the if statement. It looks like you are trying to test if data is 0x0A, but that is not what is happening. Instead, you set data to be 0x0A, and then execute the inner part every time. The condition you probably want is if(data == 0x0A). Note the == instead of =.
So your code is equivalent to
while(1)
{
while(ucsra & (1<<RXC)) data=udr; // maybe read UDR into data
data=0x0a; // set data anyway
olddata=data;
movservo0(90,7); //move servo to certain degree
}
Again, for the jumping servo, I suspect some code that is omitted here that is also runs every time. Or else, the moveservo() function has a problem itself.

Prevent nested calls

I have a function to disable interrupts, but the problem is that if I disable them and I call a function which also disables/enables them, they get re-enabled too early. Is the following logic enough to prevent this?
static volatile int IrqCounter = 0;
void EnableIRQ()
{
if(IrqCounter > 0)
{
IrqCounter--;
}
if(IrqCounter == 0)
{
__enable_irq();
}
}
void DisableIRQ()
{
if(IrqCounter == 0)
{
__disable_irq();
}
IrqCounter++;
}
The way every operating system I know of does it is to save IRQ state into a local variable, and then restore that.
Clearly, your code has TOCTOU issues - if two threads run at the same time, checking the IrqCounter > 0, if IrqCounter == 1, then the first thread will see it as 1, the second thread sees it as 1, and both decrement the counter.
I would definitely try to arrange something like this:
int irq_state = irq_save();
irq_disable();
... do stuff with IRQ's turned off ...
irq_restore(irq_state);
Now, you don't have to worry about counters that can get out of sync, etc.
Assuming that you've got a system where you can't change context when interrupts are disabled, then what you've got is fine, assuming you keep careful track of when call the enable().
In the usage you're describing in the comments below, you plan on using these sections from within an interrupt service routine. Your main use is blocking higher-priority interrupts from running for a certain portion of an ISR.
Be aware that you'll have to consider the stack depth of these nested ISRs, as when you enable interrupts before your return from interrupt, you'll have interrupts enabled in the ISR.
Regarding other answers: the lack of thread-safety of the enable() (due to the if(IrqCounter > 0)) doesn't matter, because anytime you're in the enable() context switches are already disabled due to interrupts being off. (Unless for some reason you have unmatched disable/enable pairs, and in that case you've got other issues.)
The only suggestion I'd have would be to add an ASSERT to the enable instead of the run-time check, as you should never be enabling interrupts that you didn't disable.
void EnableIRQ()
{
ASSERT(IrqCounter != 0) //should never be 0, or we'd have an unmatched enable/disable pair
IrqCounter--; //doesn't matter that this isn't thread safe, as the enable is always called with interrupts disabled.
if(IrqCounter == 0)
{
__enable_irq();
}
}
I prefer the technique you've listed over the save(); disable(); restore(); technique as I don't like having to keep track of a piece of the OS' data every time I work with the interrupts. But, you do have to be aware of when you (directly or indirectly) make a call to the enable() from an ISR.
That looks fine, except it's not thread-safe.
Another common option is to query the interrupt-enable/disable state and save it into a local variable, then disable interrupts, then do whatever you want to be done while interrupts are disabled, then restore the state from the local variable.
static volatile int IrqCounter = 0;
void EnableIRQ(void)
{
ASSERT(IrqCounter != 0) //should never be 0, or we'd have an unmatched enable/disable pair
if (IrqCounter > 0)
{
IrqCounter--;
}
if (IrqCounter == 0)
{
__enable_irq();
}
}
void DisableIRQ(void)
{
__disable_irq(); // Fix TOCTOU issues. In CMSIS there is no harm in extra disables, so always disable.
IrqCounter++;
}

C - While and IF statements - Trying to timeout after X time

I am having difficulties with the below code;
int i = 0;
int x = 0;
int ch ;
int n;
while((i < sizeof(buffer) - 1) && (x < (TIMER_FREQ*30)))
{
//getkey_serial0 returns either a (int)character or 0 if nothing on
//UART0
if((ch = getkey_serial0) == 0)
{
x++; //Increment X as there is nothing being received.
}
else
{
if(ch == '\n')
{
n++;
}
if(n < 8){ //Yes I can simplify this but for some reason
} //I only just noticed this :/ Anyway, it is
else{ //just here to avoid saving info I don't need
buffer[i] = ch ;
i++;
}
}
}
As the input it is reading in is the results of a wireless scan the number of entries scanned can vary greatly, and so I need to be able to avoid infinitely looping.
Originally I just read up to 11 \n's but this was rubbish as I kept missing SSID's which I needed, so I decided I needed some sort of timer or method to help me break after X amount of time.
TIMER_FREQ is defined as 10.
Clearly I am doing something stupid so any suggestions or tips would be greatly appreciated.
I generally prefer suggestions to help me try and think out the problem as opposed to fixed code posts :) I always seem to miss something simple despite my best efforts!
Thanks
EDIT: I should mention, this is on an embedded system (ARM7)
You should have access to a general purpose timer interrupt -- commonly called sys_tick().
The general practice in such "bare metal" applications is to configure the interrupt to fire every n milliseconds (10 ms is frequently used on my Cortex M3). Then, have the ISR update a counter. You'll want to ensure the counter update is atomic, so use a 32-bit, properly-aligned variable. (I'm assuming your processor is 32-bit, I can't recall for certain). Then your "application" code can poll the elapsed time as needed.
BUT - this timer discussion might be moot. In my ARM9 applications, we tie an interrupt to the UART's receive buffer. The associated ISR captures the keystroke and then performs any buffer management. Is that an option for you?
Do you really mean:
if((ch = getkey_serial0) == 0) { ...
Or do you actually mean:
if((ch = getkey_serial0()) == 0) { ...
If the latter, this is why your program never returns zero as you are giving it a function pointer. Does your program have many warnings at build?
If you want to time things, look into time(). It will let you see the system's wall clock, so you can determine if too many seconds have elapsed.

Can't write to SC1DRL register on 68HC12 board--what am I missing?

I am trying to write to use the multiple serial interface on the 68HC12 but am can't get it to talk. I think I've isolated the problem to not being able to write to the SC1DRL register (SCI Data Register Low).
The following is from my SCI ISR:
else if (HWRegPtr->SCI.sc1sr1.bit.tdre) {
/* Transmit the next byte in TX_Buffer. */
if (TX_Buffer.in != TX_Buffer.out || TX_Buffer.full) {
HWRegPtr->SCI.sc1drl.byte = TX_Buffer.buffer[TX_Buffer.out];
TX_Buffer.out++;
if (TX_Buffer.out >= SCI_Buffer_Size) {
TX_Buffer.out = 0;
}
TX_Buffer.full = 0;
}
/* Disable the transmit interrupt if the buffer is empty. */
if (TX_Buffer.in == TX_Buffer.out && !TX_Buffer.full) {
Disable_SCI_TX();
}
}
TX_Buffer.buffer has the right thing at index TX_Buffer.out when its contents are being written to HWRegPtr->SCI.sc1drl.byte, but my debugger doesn't show a change, and no data is being transmitted over the serial interface.
Anybody know what I'm missing?
edit:
HWRegPtr is defined as:
extern HARDWARE_REGISTER *HWRegPtr;
HARDWARE_REGISTER is a giant struct with all the registers in it, and is volatile.
It's likely that SC1DRL is a write-only register (check the official register docs to be sure -- google isn't turning up the right PDF for me). That means you can't read it back (even with an in-target debugger) to verify your code.
How is HWRegPtr defined? Does it have volatile in the right places to ensure the compiler treats every write through that pointer as something which must happen immediately?

Is this a good implementation of a FPS independant game loop?

I currently have something close to the following implementation of a FPS independent game loop for physics based games. It works very well on just about every computer I have tested it on, keeping the game speed consistent when frame rate drops. However I am going to be porting to embedded devices which will likely struggle harder with video and I am wondering if it will still cut the mustard.
edits:
For this question assume that msecs() returns the time passed in milliseconds which the program has run. The implementation of msecs is different on different platforms. This loop is also run in different ways on different platforms.
#define MSECS_PER_STEP 20
int stepCount, stepSize; // these are not globals in the real source
void loop() {
int i,j;
int iterations =0;
static int accumulator; // the accumulator holds extra msecs
static int lastMsec;
int deltatime = msec() - lastMsec;
lastMsec = msec();
// deltatime should be the time since the last call to loop
if (deltatime != 0) {
// iterations determines the number of steps which are needed
iterations = deltatime/MSECS_PER_STEP;
// save any left over millisecs in the accumulator
accumulator += deltatime%MSECS_PER_STEP;
}
// when the accumulator has gained enough msecs for a step...
while (accumulator >= MSECS_PER_STEP) {
iterations++;
accumulator -= MSECS_PER_STEP;
}
handleInput(); // gathers user input from an event queue
for (j=0; j<iterations; j++) {
// here step count is a way of taking a more granular step
// without effecting the overall speed of the simulation (step size)
for (i=0; i<stepCount; i++) {
doStep(stepSize/(float) stepCount); // forwards the sim
}
}
}
I just have a few comments. The first is that you don't have enough comments. There are places where it's not clear what you are trying to do so it is difficult to say if there is a better way to do it, but I'll point those out as I come to them. First, though:
#define MSECS_PER_STEP 20
int stepCount, stepSize; // these are not globals in the real source
void loop() {
int i,j;
int iterations =0;
static int accumulator; // the accumulator holds extra msecs
static int lastMsec;
These are not initialized to anything. The probably turn up as 0, but you should have initialized them. Also, rather than declaring them as static you might want to consider putting them in a structure that you pass into loop by reference.
int deltatime = msec() - lastMsec;
Since lastMsec wasn't (initialized and is probably 0) this probably starts out as a big delta.
lastMsec = msec();
This line, just like the last line, calls msec. This is probably meant as "the current time", and these calls are close enough that the returned value is probably the same for both calls, which is probably also what you expected, but still, you call the function twice. You should change these lines to int now = msec(); int deltatime = now - lastMsec; lastMsec = now; to avoid calling this function twice. Current time getting functions often have much higher overhead than you think.
if (deltatime != 0) {
iterations = deltatime/MSECS_PER_STEP;
accumulator += deltatime%MSECS_PER_STEP;
}
You should have a comment here that says what this does, as well as a comment above
that says what the variables were meant to mean.
while (accumulator >= MSECS_PER_STEP) {
iterations++;
accumulator -= MSECS_PER_STEP;
}
This loop needs a comment. It also needs to not be there. It appears that it could have been replaced with iterations += accumulator/MSECS_PER_STEP; accumulator %= MSECS_PER_STEP;. The division and modulus should run in shorter and more consistent time than the loop on any machine that has hardware division (which many do).
handleInput(); // gathers user input from an event queue
for (j=0; j<iterations; j++) {
for (i=0; i<stepCount; i++) {
doStep(stepSize/(float) stepCount); // forwards the sim
}
}
Doing steps in a loop independent of input will have the effect of making the game unresponsive if it does execute slow and get behind. It appears, at least, that if the game gets behind all of the input will start to stack up and get executed together and all of the in-game time will pass in one chunk. This is a less than graceful way to fail.
Additionally, I can guess what the j loop (outer loop) means, but the inner loop I am less clear on. also, the value passed to the doStep function -- what does that mean.
}
This is the last curly brace. I think that it looks lonely.
I don't know what goes on as far as whatever calls your loop function, which may be out of your control, and that may dictate what this function does and how it looks, but if not I hope that you will reconsider the structure. I believe that a better way to do it would be to have a function that is called repeatedly but with only one event at the time (issued regularly at a relatively short period). These events can be either user input events or timer events. User input events just set things up to react upon the next timer event. (when you don't have any events to process you sleep)
You should always assume that each timer event is processed at the same period, even though there may be some drift here if the processing gets behind. The main oddity that you may notice here is that if the game gets behind on processing timer events and then catches up again the time within the game may appear to slow down (below real time), then speed up (to real time), and then slow back down (to real time).
Ways to deal with this include only allowing one timer event to be in the event queue at one time, which would result in time appearing to slow down (below real time) and then speed back up (to real time) with no super speed interval.
Another way to do this, which is functionally similar to what you have, would be to have the last step of processing each timer event be to queue up the next timer event (note that no one else should send timer events {except for the first one} if this is the way you choose to implement the game). This would mean doing away with the regular time intervals between timer events and also restrict the ability for the program to sleep, since at the very least every time the event queue were inspected there would be a timer event to process.

Resources