One-hot encoding of the states of a C FSM

Basically, I would like to know whether it is a good idea to manually one-hot encode the states of a C FSM.
I implemented this to make it easy to write a state transition validator:
typedef enum
{
FSM_State1 = (1 << 0),
FSM_State2 = (1 << 1),
FSM_State3 = (1 << 2),
FSM_StateError = (1 << 3)
} states_t;
Then the validation:
states_t nextState, requestedState;
uint32_t validDestStates = 0;
// Compute requested state
requestedState = FSM_State1;
// Define valid transitions
validDestStates |= FSM_State2;
validDestStates |= FSM_State3;
// Check transition
if (validDestStates & requestedState)
{
// Valid transition
nextState = requestedState;
}
else
{
// Illegal transition
nextState = FSM_StateError;
}
I know that I am limited by the width of the largest integer type I can use, but I don't have that many states, so it is not an issue.
Is there something better than this encoding?
Are there any drawbacks I don't see yet?
Thanks for your help!
Edit: changed the validation test according to user3386109's comment
Final thoughts
So, finally, here is what I did:
1/ State enum is a "classical" enum:
typedef enum
{
FSM_State1,
FSM_State2,
FSM_State3,
FSM_StateError
} states_t;
2/ Bit fields for valid transitions:
struct s_fsm_stateValidation
{
bool State1IsValid: 1;
bool State2IsValid: 1;
bool State3IsValid: 1;
bool StateErrorIsValid: 1;
/// Padding so the bit field fills the 32 bits reserved in the union
uint32_t reserved: 28;
};
3/ Create a union for the validation:
typedef union FSM_stateValidation_u
{
/// The bit field of critical system errors
struct s_fsm_stateValidation state;
/// Access the bit field as a whole
uint32_t all;
} u_FSM_stateValidation;
4/ I changed the validation:
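u_FSM_stateValidation validDestStates;
// Clear every bit first so the reserved bits are not indeterminate
validDestStates.all = 0;
// Set valid states
validDestStates.state.State1IsValid = true;
// Compute requestedState
requestedState = FSM_State2;
// Note: this assumes the compiler allocates bit fields starting from the least significant bit
if (validDestStates.all & ((uint32_t)1 << requestedState))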
{
// Next state is legal
return requestedState;
}
else
{
return FSM_StateError;
}

From a quick Google, "one hot encoded" means that every valid code has precisely one bit set, which seems to be what you're doing. The search results suggested this was a hardware design pattern.
Drawbacks I can think of are...
As you suggest, you're dramatically limiting the number of valid codes - for 32 bits you have a maximum of 32 codes/states instead of more than 4 billion.
It's not ideal for lookup tables, which are a common implementation for switch statements. There is usually an intrinsic available to find the lowest set bit, but I wouldn't bet on compilers using it automatically.
Those aren't big issues, though, provided the number of states is small.
The question IMO, then, is whether there's an advantage to justify that cost. It doesn't need to be a huge advantage, but there has to be some kind of point.
The best I can come up with is that you can use bitwise tricks to specify sets of states, so you can test whether the current state is in a given set efficiently - if you have some action that needs to be done in states (1<<0) and (1<<3), for example, you could test if (state & 0x9).
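As a minimal sketch of that set-membership test, reusing the one-hot enum from the question: the names STATES_NEEDING_ACTION and state_in_set are made up for illustration and are not from the original code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One-hot state codes, as in the question. */
typedef enum
{
    FSM_State1     = (1 << 0),
    FSM_State2     = (1 << 1),
    FSM_State3     = (1 << 2),
    FSM_StateError = (1 << 3)
} states_t;

/* Hypothetical set: the states in which some shared action must run (0x9). */
#define STATES_NEEDING_ACTION ((uint32_t)(FSM_State1 | FSM_StateError))

/* True if `state` is a member of the set described by `mask`. */
static bool state_in_set(states_t state, uint32_t mask)
{
    /* A one-hot code has exactly one bit set; reject anything else. */
    assert(state != 0 && ((uint32_t)state & ((uint32_t)state - 1u)) == 0);
    return ((uint32_t)state & mask) != 0;
}
```

With a plain 0, 1, 2, ... enum the same test needs a shift first, (1u << state) & mask, which is what the question's final version ends up doing.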


Broken CRC32 Combine

I have a problem with a certain CRC method when trying to combine CRCs.
I have been using the combine CRC method, and a while ago even adapted it to work with CRC16 etc., and I (hope I?) understand how the zero filling works according to the answer here https://stackoverflow.com/a/23126768/5495036 with zlib's code.
The thing is, I've been racking my brain over how to get an operator that I can use, as I am also working with a non-standard CRC calculation. The original CRC calculation is based on the CRC32 0xEDB88320 polynomial for the lookup table, but the calculation itself was broken, so instead of the 256-entry lookup table looking like
0x00000000,0x77073096,0xee0e612c,0x990951ba,0x076dc419,0x706af48f,0xe963a535,... it now looks like this
0x00000000,0x96000000,0x30960000,0x07309600,0x77073096,0x2c770730,0x612c7707,... and the combine CRC uses the operator to be able to zero out the rest of the bits.
I can't change the calculation unfortunately so that idea is out :P
Any ideas?
EDIT:
The calculation is standard; it just uses a table that was built incorrectly from the standard polynomial 0xEDB88320. The table is still 256 ints. The starting CRC is 0xFFFFFFFF.
Complete class code
public static byte[] ByteLookupTable =
{
0x00,0x00,0x00,0x00,0x96,0x30,0x07,0x77,0x2C,0x61,0x0E,0xEE,0xBA,0x51,
0x09,0x99,0x19,0xC4,0x6D,0x07,0x8F,0xF4,0x6A,0x70,0x35,0xA5,0x63,
0xE9,0xA3,0x95,0x64,0x9E,0x32,0x88,0xDB,0x0E,0xA4,0xB8,0xDC,0x79,0x1E,
0xE9,0xD5,0xE0,0x88,0xD9,0xD2,0x97,0x2B,0x4C,0xB6,0x09,0xBD,0x7C,
0xB1,0x7E,0x07,0x2D,0xB8,0xE7,0x91,0x1D,0xBF,0x90,0x64,0x10,0xB7,0x1D,
0xF2,0x20,0xB0,0x6A,0x48,0x71,0xB9,0xF3,0xDE,0x41,0xBE,0x84,0x7D,
0xD4,0xDA,0x1A,0xEB,0xE4,0xDD,0x6D,0x51,0xB5,0xD4,0xF4,0xC7,0x85,0xD3,
0x83,0x56,0x98,0x6C,0x13,0xC0,0xA8,0x6B,0x64,0x7A,0xF9,0x62,0xFD,
0xEC,0xC9,0x65,0x8A,0x4F,0x5C,0x01,0x14,0xD9,0x6C,0x06,0x63,0x63,0x3D,
0x0F,0xFA,0xF5,0x0D,0x08,0x8D,0xC8,0x20,0x6E,0x3B,0x5E,0x10,0x69,
0x4C,0xE4,0x41,0x60,0xD5,0x72,0x71,0x67,0xA2,0xD1,0xE4,0x03,0x3C,0x47,
0xD4,0x04,0x4B,0xFD,0x85,0x0D,0xD2,0x6B,0xB5,0x0A,0xA5,0xFA,0xA8,
0xB5,0x35,0x6C,0x98,0xB2,0x42,0xD6,0xC9,0xBB,0xDB,0x40,0xF9,0xBC,0xAC,
0xE3,0x6C,0xD8,0x32,0x75,0x5C,0xDF,0x45,0xCF,0x0D,0xD6,0xDC,0x59,
0x3D,0xD1,0xAB,0xAC,0x30,0xD9,0x26,0x3A,0x00,0xDE,0x51,0x80,0x51,0xD7,
0xC8,0x16,0x61,0xD0,0xBF,0xB5,0xF4,0xB4,0x21,0x23,0xC4,0xB3,0x56,
0x99,0x95,0xBA,0xCF,0x0F,0xA5,0xBD,0xB8,0x9E,0xB8,0x02,0x28,0x08,0x88,
0x05,0x5F,0xB2,0xD9,0x0C,0xC6,0x24,0xE9,0x0B,0xB1,0x87,0x7C,0x6F,
0x2F,0x11,0x4C,0x68,0x58,0xAB,0x1D,0x61,0xC1,0x3D,0x2D,0x66,0xB6,0x90,
0x41,0xDC,0x76,0x06,0x71,0xDB,0x01,0xBC,0x20,0xD2,0x98,0x2A,0x10,
0xD5,0xEF,0x89,0x85,0xB1,0x71,0x1F,0xB5,0xB6,0x06,0xA5,0xE4,0xBF,0x9F,
0x33,0xD4,0xB8,0xE8,0xA2,0xC9,0x07,0x78,0x34,0xF9,0x00,0x0F,0x8E,
0xA8,0x09,0x96,0x18,0x98,0x0E,0xE1,0xBB,0x0D,0x6A,0x7F,0x2D,0x3D,0x6D,
0x08,0x97,0x6C,0x64,0x91,0x01,0x5C,0x63,0xE6,0xF4,0x51,0x6B,0x6B,
0x62,0x61,0x6C,0x1C,0xD8,0x30,0x65,0x85,0x4E,0x00,0x62,0xF2,0xED,0x95,
0x06,0x6C,0x7B,0xA5,0x01,0x1B,0xC1,0xF4,0x08,0x82,0x57,0xC4,0x0F,
0xF5,0xC6,0xD9,0xB0,0x65,0x50,0xE9,0xB7,0x12,0xEA,0xB8,0xBE,0x8B,0x7C,
0x88,0xB9,0xFC,0xDF,0x1D,0xDD,0x62,0x49,0x2D,0xDA,0x15,0xF3,0x7C,
0xD3,0x8C,0x65,0x4C,0xD4,0xFB,0x58,0x61,0xB2,0x4D,0xCE,0x51,0xB5,0x3A,
0x74,0x00,0xBC,0xA3,0xE2,0x30,0xBB,0xD4,0x41,0xA5,0xDF,0x4A,0xD7,
0x95,0xD8,0x3D,0x6D,0xC4,0xD1,0xA4,0xFB,0xF4,0xD6,0xD3,0x6A,0xE9,0x69,
0x43,0xFC,0xD9,0x6E,0x34,0x46,0x88,0x67,0xAD,0xD0,0xB8,0x60,0xDA,
0x73,0x2D,0x04,0x44,0xE5,0x1D,0x03,0x33,0x5F,0x4C,0x0A,0xAA,0xC9,0x7C,
0x0D,0xDD,0x3C,0x71,0x05,0x50,0xAA,0x41,0x02,0x27,0x10,0x10,0x0B,
0xBE,0x86,0x20,0x0C,0xC9,0x25,0xB5,0x68,0x57,0xB3,0x85,0x6F,0x20,0x09,
0xD4,0x66,0xB9,0x9F,0xE4,0x61,0xCE,0x0E,0xF9,0xDE,0x5E,0x98,0xC9,
0xD9,0x29,0x22,0x98,0xD0,0xB0,0xB4,0xA8,0xD7,0xC7,0x17,0x3D,0xB3,0x59,
0x81,0x0D,0xB4,0x2E,0x3B,0x5C,0xBD,0xB7,0xAD,0x6C,0xBA,0xC0,0x20,
0x83,0xB8,0xED,0xB6,0xB3,0xBF,0x9A,0x0C,0xE2,0xB6,0x03,0x9A,0xD2,0xB1,
0x74,0x39,0x47,0xD5,0xEA,0xAF,0x77,0xD2,0x9D,0x15,0x26,0xDB,0x04,
0x83,0x16,0xDC,0x73,0x12,0x0B,0x63,0xE3,0x84,0x3B,0x64,0x94,0x3E,0x6A,
0x6D,0x0D,0xA8,0x5A,0x6A,0x7A,0x0B,0xCF,0x0E,0xE4,0x9D,0xFF,0x09,
0x93,0x27,0xAE,0x00,0x0A,0xB1,0x9E,0x07,0x7D,0x44,0x93,0x0F,0xF0,0xD2,
0xA3,0x08,0x87,0x68,0xF2,0x01,0x1E,0xFE,0xC2,0x06,0x69,0x5D,0x57,
0x62,0xF7,0xCB,0x67,0x65,0x80,0x71,0x36,0x6C,0x19,0xE7,0x06,0x6B,0x6E,
0x76,0x1B,0xD4,0xFE,0xE0,0x2B,0xD3,0x89,0x5A,0x7A,0xDA,0x10,0xCC,
0x4A,0xDD,0x67,0x6F,0xDF,0xB9,0xF9,0xF9,0xEF,0xBE,0x8E,0x43,0xBE,0xB7,
0x17,0xD5,0x8E,0xB0,0x60,0xE8,0xA3,0xD6,0xD6,0x7E,0x93,0xD1,0xA1,
0xC4,0xC2,0xD8,0x38,0x52,0xF2,0xDF,0x4F,0xF1,0x67,0xBB,0xD1,0x67,0x57,
0xBC,0xA6,0xDD,0x06,0xB5,0x3F,0x4B,0x36,0xB2,0x48,0xDA,0x2B,0x0D,
0xD8,0x4C,0x1B,0x0A,0xAF,0xF6,0x4A,0x03,0x36,0x60,0x7A,0x04,0x41,0xC3,
0xEF,0x60,0xDF,0x55,0xDF,0x67,0xA8,0xEF,0x8E,0x6E,0x31,0x79,0xBE,
0x69,0x46,0x8C,0xB3,0x61,0xCB,0x1A,0x83,0x66,0xBC,0xA0,0xD2,0x6F,0x25,
0x36,0xE2,0x68,0x52,0x95,0x77,0x0C,0xCC,0x03,0x47,0x0B,0xBB,0xB9,
0x16,0x02,0x22,0x2F,0x26,0x05,0x55,0xBE,0x3B,0xBA,0xC5,0x28,0x0B,0xBD,
0xB2,0x92,0x5A,0xB4,0x2B,0x04,0x6A,0xB3,0x5C,0xA7,0xFF,0xD7,0xC2,
0x31,0xCF,0xD0,0xB5,0x8B,0x9E,0xD9,0x2C,0x1D,0xAE,0xDE,0x5B,0xB0,0xC2,
0x64,0x9B,0x26,0xF2,0x63,0xEC,0x9C,0xA3,0x6A,0x75,0x0A,0x93,0x6D,
0x02,0xA9,0x06,0x09,0x9C,0x3F,0x36,0x0E,0xEB,0x85,0x67,0x07,0x72,0x13,
0x57,0x00,0x05,0x82,0x4A,0xBF,0x95,0x14,0x7A,0xB8,0xE2,0xAE,0x2B,
0xB1,0x7B,0x38,0x1B,0xB6,0x0C,0x9B,0x8E,0xD2,0x92,0x0D,0xBE,0xD5,0xE5,
0xB7,0xEF,0xDC,0x7C,0x21,0xDF,0xDB,0x0B,0xD4,0xD2,0xD3,0x86,0x42,
0xE2,0xD4,0xF1,0xF8,0xB3,0xDD,0x68,0x6E,0x83,0xDA,0x1F,0xCD,0x16,0xBE,
0x81,0x5B,0x26,0xB9,0xF6,0xE1,0x77,0xB0,0x6F,0x77,0x47,0xB7,0x18,
0xE6,0x5A,0x08,0x88,0x70,0x6A,0x0F,0xFF,0xCA,0x3B,0x06,0x66,0x5C,0x0B,
0x01,0x11,0xFF,0x9E,0x65,0x8F,0x69,0xAE,0x62,0xF8,0xD3,0xFF,0x6B,
0x61,0x45,0xCF,0x6C,0x16,0x78,0xE2,0x0A,0xA0,0xEE,0xD2,0x0D,0xD7,0x54,
0x83,0x04,0x4E,0xC2,0xB3,0x03,0x39,0x61,0x26,0x67,0xA7,0xF7,0x16,
0x60,0xD0,0x4D,0x47,0x69,0x49,0xDB,0x77,0x6E,0x3E,0x4A,0x6A,0xD1,0xAE,
0xDC,0x5A,0xD6,0xD9,0x66,0x0B,0xDF,0x40,0xF0,0x3B,0xD8,0x37,0x53,
0xAE,0xBC,0xA9,0xC5,0x9E,0xBB,0xDE,0x7F,0xCF,0xB2,0x47,0xE9,0xFF,0xB5,
0x30,0x1C,0xF2,0xBD,0xBD,0x8A,0xC2,0xBA,0xCA,0x30,0x93,0xB3,0x53,
0xA6,0xA3,0xB4,0x24,0x05,0x36,0xD0,0xBA,0x93,0x06,0xD7,0xCD,0x29,0x57,
0xDE,0x54,0xBF,0x67,0xD9,0x23,0x2E,0x7A,0x66,0xB3,0xB8,0x4A,0x61,
0xC4,0x02,0x1B,0x68,0x5D,0x94,0x2B,0x6F,0x2A,0x37,0xBE,0x0B,0xB4,0xA1,
0x8E,0x0C,0xC3,0x1B,0xDF,0x05,0x5A,0x8D,0xEF,0x02,0x2D,0xC3,0x8D,0x40,0x00
};
protected void BuildLookupTable()
{
if (LookupTable == null)
{
LookupTable = new uint[256];
for (int i = 0; i < LookupTable.Length; i++)
{
LookupTable[i] = BitConverter.ToUInt32(ByteLookupTable, i);
}
}
}
protected override uint CalculateBuffer(byte[] buffer, uint crc, int startPos, int endPos)
{
for (int i = 0; i < endPos; i++)
{
crc = LookupTable[(crc ^ buffer[i]) & 0xff] ^ (crc >> 8);
}
return crc;
}
The lookup table constants are correct for that polynomial, but apparently the conversion done by BuildLookupTable() is totally messed up. It would have been easy to rewrite the table to avoid needing BuildLookupTable().
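To illustrate what the conversion seems to be doing, here is a sketch in C rather than C#. It assumes a little-endian host (BitConverter follows the platform byte order), and the helper names are invented for the example. BuildLookupTable() passes i as the byte offset, so consecutive entries overlap by three bytes; a stride of four bytes would give back the standard table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard reflected CRC-32 table entry for polynomial 0xEDB88320. */
static uint32_t crc32_table_entry(uint32_t n)
{
    uint32_t c = n;
    for (int k = 0; k < 8; k++)
        c = (c & 1u) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
    return c;
}

/* Serialise the 256 standard entries as little-endian bytes (1024 bytes). */
static void fill_table_bytes(uint8_t bytes[1024])
{
    for (uint32_t n = 0; n < 256; n++) {
        uint32_t c = crc32_table_entry(n);
        for (int b = 0; b < 4; b++)
            bytes[4 * n + b] = (uint8_t)(c >> (8 * b));
    }
}

/* Read a little-endian 32-bit value starting at byte offset `off`. */
static uint32_t read_le32(const uint8_t *bytes, size_t off)
{
    assert(off + 4 <= 1024);
    return (uint32_t)bytes[off]
         | ((uint32_t)bytes[off + 1] << 8)
         | ((uint32_t)bytes[off + 2] << 16)
         | ((uint32_t)bytes[off + 3] << 24);
}

/* Entry i the way the posted BuildLookupTable() reads it: byte offset i. */
static uint32_t entry_as_posted(const uint8_t *bytes, size_t i)
{
    return read_le32(bytes, i);
}

/* Entry i with a 4-byte stride: byte offset 4 * i. */
static uint32_t entry_with_stride(const uint8_t *bytes, size_t i)
{
    return read_le32(bytes, 4 * i);
}
```

Reading at offsets 1, 2, 3 and 4 yields 0x96000000, 0x30960000, 0x07309600 and 0x77073096, which matches the table quoted in the question.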
What follows is from the original answer, which assumed that the table was converted correctly, which it isn't.
As it is, this isn't a CRC, so the combination approach for a CRC does not apply here.
The one thing missing from your definition is what the initial value for the CRC is, and possibly if there is an exclusive-or done on the final CRC. That polynomial is the same as used by zlib, PKZIP, etc., where the initial CRC value is 0xffffffff and the final exclusive-or is with 0xffffffff. That CRC is referred to as CRC-32/ISO-HDLC.
Whatever your initial and the final exclusive-or values are, if they are equal, then you can use the crc32_combine() function from zlib as is. If they are not, you can still use crc32_combine(), but you need to exclusive-or the input and output CRC values of that function with the exclusive-or of the initial and final exclusive-or values.

Preventing torn reads with an HCS12 microcontroller

Summary
I'm trying to write an embedded application for an MC9S12VR microcontroller. This is a 16-bit microcontroller but some of the values I deal with are 32 bits wide and while debugging I've captured some anomalous values that seem to be due to torn reads.
I'm writing the firmware for this micro in C89 and running it through the Freescale HC12 compiler, and I'm wondering if anyone has any suggestions on how to prevent them on this particular microcontroller assuming that this is the case.
Details
Part of my application involves driving a motor and estimating its position and speed based on pulses generated by an encoder (a pulse is generated on every full rotation of the motor).
For this to work, I need to configure one of the MCU timers so that I can track the time elapsed between pulses. However, the timer has a clock rate of 3 MHz (after prescaling) and the timer counter register is only 16-bit, so the counter overflows every ~22ms. To compensate, I set up an interrupt handler that fires on a timer counter overflow, and this increments an "overflow" variable by 1:
// TEMP
static volatile unsigned long _timerOverflowsNoReset;
// ...
#ifndef __INTELLISENSE__
__interrupt VectorNumber_Vtimovf
#endif
void timovf_isr(void)
{
// Clear the interrupt.
TFLG2_TOF = 1;
// TEMP
_timerOverflowsNoReset++;
// ...
}
I can then work out the current time from this:
// TEMP
unsigned long MOTOR_GetCurrentTime(void)
{
const unsigned long ticksPerCycle = 0xFFFF;
const unsigned long ticksPerMicrosecond = 3; // 24 MHZ / 8 (prescaler)
const unsigned long ticks = _timerOverflowsNoReset * ticksPerCycle + TCNT;
const unsigned long microseconds = ticks / ticksPerMicrosecond;
return microseconds;
}
In main.c, I've temporarily written some debugging code that drives the motor in one direction and then takes "snapshots" of various data at regular intervals:
// Test
for (iter = 0; iter < 10; iter++)
{
nextWait += SECONDS(secondsPerIteration);
while ((_test2Snapshots[iter].elapsed = MOTOR_GetCurrentTime() - startTime) < nextWait);
_test2Snapshots[iter].position = MOTOR_GetCount();
_test2Snapshots[iter].phase = MOTOR_GetPhase();
_test2Snapshots[iter].time = MOTOR_GetCurrentTime() - startTime;
// ...
In this test I'm reading MOTOR_GetCurrentTime() in two places very close together in code and assigning the results to members of a globally available struct.
In almost every case, I find that the first value read is a few microseconds beyond the point the while loop should terminate, and the second read is a few microseconds after that - this is expected. However, occasionally I find the first read is significantly higher than the point the while loop should terminate at, and then the second read is less than the first value (as well as the termination value).
The screenshot below gives an example of this. It took about 20 repeats of the test before I was able to reproduce it. In the code, <snapshot>.elapsed is written to before <snapshot>.time so I expect it to have a slightly smaller value:
For snapshot[8], my application first reads 20010014 (over 10ms beyond where it should have terminated the busy-loop) and then reads 19988209. As I mentioned above, an overflow occurs every 22ms - specifically, a difference in _timerOverflowsNoReset of one unit will produce a difference of 65535 / 3 = 21845 in the calculated microsecond value. If we account for this:
20010014 - 21845 = 19988169
19988209 - 19988169 = 40
A difference of 40 isn't that far off the discrepancy I see between my other pairs of reads (~23/24), so my guess is that there's some kind of tear going on involving an off-by-one read of _timerOverflowsNoReset. As in while busy-looping, it will perform one call to MOTOR_GetCurrentTime() that erroneously sees _timerOverflowsNoReset as one greater than it actually is, causing the loop to end early, and then on the next read after that it sees the correct value again.
I have other problems with my application that I'm having trouble pinning down, and I'm hoping that if I resolve this, it might resolve these other problems as well if they share a similar cause.
Edit: Among other changes, I've changed _timerOverflowsNoReset and some other globals from 32-bit unsigned to 16-bit unsigned in the implementation I now have.
You can read this value TWICE:
unsigned long GetTmrOverflowNo()
{
unsigned long ovfl1, ovfl2;
do {
ovfl1 = _timerOverflowsNoReset;
ovfl2 = _timerOverflowsNoReset;
} while (ovfl1 != ovfl2);
return ovfl1;
}
unsigned long MOTOR_GetCurrentTime(void)
{
const unsigned long ticksPerCycle = 0xFFFF;
const unsigned long ticksPerMicrosecond = 3; // 24 MHZ / 8 (prescaler)
const unsigned long ticks = GetTmrOverflowNo() * ticksPerCycle + TCNT;
const unsigned long microseconds = ticks / ticksPerMicrosecond;
return microseconds;
}
If _timerOverflowsNoReset increments much more slowly than GetTmrOverflowNo() executes, then in the worst case the loop runs only twice. In most cases ovfl1 and ovfl2 will be equal after the first pass through the do/while loop.
Calculate the tick count, then check whether the overflow count changed while doing so, and if so, repeat:
#define TCNT_BITS 16 // TCNT register width
uint32_t MOTOR_GetCurrentTicks(void)
{
uint32_t ticks = 0 ;
uint32_t overflow_count = 0;
do
{
overflow_count = _timerOverflowsNoReset ;
ticks = (overflow_count << TCNT_BITS) | TCNT;
}
while( overflow_count != _timerOverflowsNoReset ) ;
return ticks ;
}
The while loop will iterate either once or twice, no more.
Based on the answers @AlexeyEsaulenko and @jeb provided, I gained an understanding of the cause of this problem and how I could tackle it. As both their answers were helpful and the solution I currently have is sort of a mixture of the two, I can't decide which of the two answers to accept, so instead I'll upvote both answers and keep this question open.
This is how I now implement MOTOR_GetCurrentTime:
unsigned long MOTOR_GetCurrentTime(void)
{
const unsigned long ticksPerMicrosecond = 3; // 24 MHZ / 8 (prescaler)
unsigned int countA;
unsigned int countB;
unsigned int timerOverflowsA;
unsigned int timerOverflowsB;
unsigned long ticks;
unsigned long microseconds;
// Loops until TCNT and the timer overflow count can be reliably determined.
do
{
timerOverflowsA = _timerOverflowsNoReset;
countA = TCNT;
timerOverflowsB = _timerOverflowsNoReset;
countB = TCNT;
} while (timerOverflowsA != timerOverflowsB || countA >= countB);
ticks = ((unsigned long)timerOverflowsA << 16) + countA;
microseconds = ticks / ticksPerMicrosecond;
return microseconds;
}
This function might not be as efficient as other proposed answers, but it gives me confidence that it will avoid some of the pitfalls that have been brought to light. It works by repeatedly reading both the timer overflow count and TCNT register twice, and only exiting the loop when the following two conditions are satisfied:
the timer overflow count hasn't changed while reading TCNT for the first time in the loop
the second count is greater than the first count
This basically means that if MOTOR_GetCurrentTime is called around the time that a timer overflow occurs, we wait until we've safely moved on to the next cycle, indicated by the second TCNT read being greater than the first (e.g. 0x0001 > 0x0000).
This does mean that the function blocks until TCNT increments at least once, but since that occurs every 333 nanoseconds I don't see it being problematic.
I've tried running my test 20 times in a row and haven't noticed any tearing, so I believe this works. I'll continue to test and update this answer if I'm wrong and the issue persists.
Edit: As Vroomfondel points out in the comments below, the check I do involving countA and countB also incidentally works for me and can potentially cause the loop to repeat indefinitely if _timerOverflowsNoReset is read fast enough. I'll update this answer when I've come up with something to address this.
Atomic reads are not the main problem here.
The problem is that the overflow ISR and TCNT are closely related,
and you get problems when you first read TCNT and then the overflow counter.
Three sample situations:
TCNT=0x0000, Overflow=0 --- okay
TCNT=0xFFFF, Overflow=1 --- fails
TCNT=0x0001, Overflow=1 --- okay again
You get the same problems when you change the order to: first read the overflow counter, then TCNT.
You could solve it by reading the totalOverflows counter twice.
disable_ints();
uint16_t overflowsA=totalOverflows;
uint16_t cnt = TCNT;
uint16_t overflowsB=totalOverflows;
enable_ints();
uint32_t totalCnt = cnt;
if ( overflowsA != overflowsB )
{
if (cnt < 0x4000)
totalCnt += 0x10000;
}
totalCnt += (uint32_t)overflowsA << 16;
If totalOverflows changed while reading TCNT, then it's necessary to check whether the value read from TCNT has already wrapped (it is above 0 but below, e.g., 0x4000) or whether it was read just before the overflow.
One technique that can be helpful is to maintain two or three values that, collectively, hold overlapping portions of a larger value.
If one knows that a value will be monotonically increasing, and one will never go more than 65,280 counts between calls to the "update timer" function, one could use something like:
// Note: Assuming a platform where 16-bit loads and stores are atomic
uint16_t volatile timerHi, timerMed, timerLow;
void updateTimer(void) // Must be only thing that writes timers!
{
timerLow = HARDWARE_TIMER;
timerMed += (uint8_t)((timerLow >> 8) - timerMed);
timerHi += (uint8_t)((timerMed >> 8) - timerHi);
}
uint32_t readTimer(void)
{
uint16_t tempTimerHi = timerHi;
uint16_t tempTimerMed = timerMed;
uint16_t tempTimerLow = timerLow;
tempTimerMed += (uint8_t)((tempTimerLow >> 8) - tempTimerMed);
tempTimerHi += (uint8_t)((tempTimerMed >> 8) - tempTimerHi);
return (((uint32_t)tempTimerHi) << 16) | tempTimerLow;
}
Note that readTimer reads timerHi before it reads timerLow. It's possible that updateTimer might update timerLow or timerMed between the time readTimer reads timerHi and the time it reads those other values, but if that occurs, it will notice that the lower part of timerHi needs to be incremented to match the upper part of the value that got updated later.
This approach can be cascaded to arbitrary length, and need not use a full 8 bits of overlap. Using 8 bits of overlap, however, makes it possible to form a 32-bit value by using the upper and lower values while simply ignoring the middle one. If less overlap were used, all three values would need to take part in the final computation.
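The overlapping-counter idea can be tried off-target with a simulated timer. This is only a sketch: fakeHardwareTimer and advance() are hypothetical stand-ins for the real hardware counter and for the passage of time, and the update and read functions simply repeat the logic from the code above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the free-running 16-bit hardware timer. */
static uint16_t fakeHardwareTimer;
#define HARDWARE_TIMER fakeHardwareTimer

static volatile uint16_t timerHi, timerMed, timerLow;

/* Same update rule as above: each wider value tracks the top byte of the one below. */
static void updateTimer(void)
{
    timerLow = HARDWARE_TIMER;
    timerMed += (uint8_t)((timerLow >> 8) - timerMed);
    timerHi  += (uint8_t)((timerMed >> 8) - timerHi);
}

static uint32_t readTimer(void)
{
    uint16_t tempTimerHi  = timerHi;
    uint16_t tempTimerMed = timerMed;
    uint16_t tempTimerLow = timerLow;
    tempTimerMed += (uint8_t)((tempTimerLow >> 8) - tempTimerMed);
    tempTimerHi  += (uint8_t)((tempTimerMed >> 8) - tempTimerHi);
    return ((uint32_t)tempTimerHi << 16) | tempTimerLow;
}

/* Advance the fake timer by `total` counts in steps of `step`, updating after each step. */
static void advance(uint32_t total, uint16_t step)
{
    assert(step > 0 && step <= 65280u); /* the precondition stated above */
    while (total > 0) {
        uint16_t n = (total < step) ? (uint16_t)total : step;
        fakeHardwareTimer = (uint16_t)(fakeHardwareTimer + n);
        total -= n;
        updateTimer();
    }
}
```

As long as updateTimer() runs at least once every 65,280 counts, readTimer() should return the full count even though no single stored variable is wider than 16 bits.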
The problem is that the writes to _timerOverflowsNoReset aren't atomic and you don't protect them. This is a bug. Writing atomically from the ISR isn't very important, as the HCS12 blocks the background program during an interrupt. But reading atomically in the background program is absolutely necessary.
Also, keep in mind that CodeWarrior/HCS12 generates somewhat inefficient code for 32-bit arithmetic.
Here is how you can fix it:
Drop unsigned long for the shared variable. In fact you don't need a counter at all, provided that your background program can service the variable within 22ms in real time, which should be a very easy requirement to meet. Keep your 32-bit counter local and away from the ISR.
Ensure that reads of the shared variable are atomic. Disassemble! It must be a single MOV instruction or similar; otherwise you must implement semaphores.
Don't read any volatile variable inside complex expressions. Not only the shared variable but also the TCNT. Your program as it stands has a tight coupling between the slow 32 bit arithmetic algorithm's speed and the timer, which is very bad. You won't be able to reliably read TCNT with any accuracy, and to make things worse you call this function from other complex code.
Your code should be changed to something like this:
static volatile bool overflow;
void timovf_isr(void)
{
// Clear the interrupt.
TFLG2_TOF = 1;
// TEMP
overflow = true;
// ...
}
unsigned long MOTOR_GetCurrentTime(void)
{
bool of = overflow; // read this on a line of its own, ensure this is atomic!
uint16_t tcnt = TCNT; // read this on a line of its own
overflow = false; // ensure this is atomic too
if(of)
{
_timerOverflowsNoReset++;
}
/* calculations here */
return microseconds;
}
If you don't end up with atomic reads, you will have to implement semaphores, block the timer interrupt or write the reading code in inline assembler (my recommendation).
Overall I would say that your design relying on TOF is somewhat questionable. I think it would be better to set up a dedicated timer channel and let it count up a known time unit (10ms?). Any reason why you can't use one of the 8 timer channels for this?
It all boils down to the question of how often you read the timer and how long the maximum interrupt sequence will be in your system (i.e. the maximum time the timer code can be stopped without making "substantial" progress).
Iff you test for time stamps more often than the cycle time of your hardware timer AND those tests have the guarantee that the end of one test is no further apart from the start of its predecessor than one interval (in your case 22ms), all is well. If your code is held up for so long that these preconditions don't hold, the following solution will not work - the question then, however, is whether the time information coming from such a system has any value at all.
The good thing is that you don't need an interrupt at all. Any attempt to compensate for the system's inability to satisfy two equally hard real-time problems (updating your overflow counter and delivering the hardware time) is either futile, or ugly and still fails to meet the basic system properties.
unsigned long MOTOR_GetCurrentTime(void)
{
static uint16_t last;
static uint16_t hi;
volatile uint16_t now = TCNT;
if (now < last)
{
hi++;
}
last = now;
return now + (hi * 65536UL);
}
BTW: I return ticks, not microseconds. Don't mix concerns.
PS: the caveat is that such a function is not reentrant and in a sense a true singleton.
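One possible way around the singleton caveat, sketched under assumptions: keep the state in a caller-owned struct and pass the counter reading in as a parameter. tick_extender_t and extend_ticks are names invented for this example; on the target the reading would come from TCNT. The precondition is unchanged: it must be called at least once per timer wrap.

```c
#include <assert.h>
#include <stdint.h>

/* State for extending a 16-bit counter to 32 bits by polling. */
typedef struct
{
    uint16_t last; /* most recent 16-bit reading */
    uint16_t hi;   /* number of wraparounds seen so far */
} tick_extender_t;

/* Feed one 16-bit reading and get the extended tick count back. */
static uint32_t extend_ticks(tick_extender_t *ext, uint16_t now)
{
    assert(ext != 0);
    if (now < ext->last)
    {
        ext->hi++; /* the counter wrapped since the previous call */
    }
    ext->last = now;
    return ((uint32_t)ext->hi << 16) | now;
}
```

Passing the reading in also makes the wrap detection testable on a PC, since a sequence such as 10, 65000, 5 can be replayed without the hardware.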

Understanding Logic Using Iteration for Master/Slave

I took over a project from an unknown predecessor who left without proper documentation or comments.
Now I am trying to analyze his code, but it is hard to follow.
Basically, there are 32 channels hooked up to a microcontroller. What he seems to be trying to do is find the slave channels among the 32 channels once that information is sent from a server.
The call to nextslave():
for (scan=0 ; (scan = nextslave(chan, scan)) != -1 ; scan++)
nextslave() looks like this:
/**
* nextslave - gets the channel number of the next slave channel
* associated with the master. returns -1 if no more slaves.
* channel and start are zero-based.
*/
short nextslave(short channel, short start)
{
short mask, major, minor;
unsigned char *p;
/* fix-up the slaveflag[] index values */
major = start / 8;
minor = start % 8;
/* init a pointer into the slaveflag[] array */
p = &(chparam[channel].slaveflag[major]);
/* now let us find the next slave channel (if any) */
for (; major < (NUMCHANS / 8) ; major++, p++)
{
minor &= 0x07;
for (mask = (0x01 << minor) ; minor < 8 ; mask <<= 1, minor++)
{
if (*p & mask)
{
/* found one so calculate channel# and return */
return ((major * 8) + minor);
}
}
}
/* if we reach here then there are no (more) slaves */
return (-1);
}
What I have analyzed so far is:
The start variable keeps iterating up to 32 in nextslave().
When start is 0~7, major is 0 and minor changes from 0 to 7; mask also changes: 1, 2, 4, 8, 16...
When start is 8~15, major is 1, and minor still keeps changing from 0 to 7.
This keeps iterating until major becomes 4. (I don't know the meaning of major in his code.)
In the second for loop, something is returned if *p (pointer to values from the server) & mask is true.
I am not clear on the general idea of what he intended with this process. In particular, if nothing matches if (*p & mask) in the second for loop, execution goes back to the first for loop. However, minor has been incremented without being cleared back to 0. So once the code hits minor &= 0x07, the processor will do the bitwise AND with the last value of minor, although major keeps increasing.
For instance, the range of minor is 0 to 7 and no value matches, so it ends up as 7 in the second for loop. Execution leaves that loop, goes back to the first loop, and increases major by 1. But minor is still 7, so the second loop will start with mask = (0x01 << minor) and minor = 7.
I feel like minor needs to be reset to 0 whenever the second for loop exits, but I don't know what he was aiming for. He just wrote down "master/slave technique".
My questions are:
Is my analysis correct?
How is the master/slave technique used for 32 channels getting ADC?
Is code to reset minor required whenever the second for loop exits?
If you have any idea, please answer with anything that helps me understand his code.
Thanks,
Jin
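On the question about minor, a small host-side harness can check the behaviour. The sketch below assumes NUMCHANS is 32 and that slaveflag is a 4-byte bitmap with one bit per channel; the struct layout and collect_slaves() are hypothetical, while nextslave() keeps the posted logic. Reading the loop condition, the inner for only exits once minor has reached 8 (not 7), and 8 & 0x07 is 0, so minor &= 0x07 already acts as the reset for the next byte.

```c
#include <assert.h>

#define NUMCHANS 32 /* assumption: 32 channels, as described in the question */

/* Hypothetical layout: one bit per possible slave channel, 8 channels per byte. */
struct chparam_s
{
    unsigned char slaveflag[NUMCHANS / 8];
};
static struct chparam_s chparam[NUMCHANS];

/* nextslave() with the same logic as posted. */
static short nextslave(short channel, short start)
{
    short mask, major, minor;
    unsigned char *p;
    major = start / 8;
    minor = start % 8;
    p = &(chparam[channel].slaveflag[major]);
    for (; major < (NUMCHANS / 8); major++, p++)
    {
        minor &= 0x07; /* inner loop exits with minor == 8, and 8 & 7 == 0 */
        for (mask = (0x01 << minor); minor < 8; mask <<= 1, minor++)
        {
            if (*p & mask)
            {
                return ((major * 8) + minor);
            }
        }
    }
    return (-1);
}

/* Collect every slave of `chan` into `out`; returns how many were found. */
static int collect_slaves(short chan, short out[NUMCHANS])
{
    int count = 0;
    short scan;
    assert(chan >= 0 && chan < NUMCHANS);
    for (scan = 0; (scan = nextslave(chan, scan)) != -1; scan++)
    {
        out[count++] = scan;
    }
    return count;
}
```

Under these assumptions, major indexes the byte of the bitmap and minor the bit within that byte, so the calling for loop visits each set bit once, in ascending channel order.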

How to Interpret typedef enum property on MCOIMAPMessage

My question is mostly about how to interpret a typedef enum, but here is the background:
I am using MailCore2, and I am trying to figure out how to read the flags off of an individual email object that I am fetching.
Each MCOIMAPMessage *email that I fetch has a property on it called 'flags.' Flags is of type MCOMessageFlag. When I look up the definition of MCOMessageFlag, I find that it is a typedef enum:
typedef enum {
MCOMessageFlagNone = 0,
/** Seen/Read flag.*/
MCOMessageFlagSeen = 1 << 0,
/** Replied/Answered flag.*/
MCOMessageFlagAnswered = 1 << 1,
/** Flagged/Starred flag.*/
MCOMessageFlagFlagged = 1 << 2,
/** Deleted flag.*/
MCOMessageFlagDeleted = 1 << 3,
/** Draft flag.*/
MCOMessageFlagDraft = 1 << 4,
/** $MDNSent flag.*/
MCOMessageFlagMDNSent = 1 << 5,
/** $Forwarded flag.*/
MCOMessageFlagForwarded = 1 << 6,
/** $SubmitPending flag.*/
MCOMessageFlagSubmitPending = 1 << 7,
/** $Submitted flag.*/
MCOMessageFlagSubmitted = 1 << 8,
} MCOMessageFlag;
Since I do not know how typedef enums really work - particularly this one with the '= 1 << 8' type components, I am a little lost about how to read the emails' flags property.
For example, I have an email message that has both an MCOMessageFlagSeen and an MCOMessageFlagFlagged on the server. I'd like to find out from the email.flags property whether or not the fetched email has one, both or neither of these flags (if possible). However, in the debugger when I print 'email.flags' for an email that has both of the above flags, I get back just the number 5. I don't see how that relates to the typedef enum definitions above.
Ultimately, I want to set a BOOL value based on whether or not the flag is present. In other words, I'd like to do something like:
BOOL wasSeen = email.flags == MCOMessageFlagSeen;
BOOL isFlagged = email.flags == MCOMessageFlagFlagged;
Of course this doesn't work, but this is the idea. Can anyone suggest how I might accomplish this and/or how to understand the typedef enum?
These flags are used as a bitmask.
This allows multiple on/off flags to be stored in a single numeric type (be it an unsigned char or an unsigned int). Basically, if a flag is set then its corresponding bit is set too.
For example:
MCOMessageFlagMDNSent = 1 << 5
1<<5 means 1 shifted to the left by 5 bits, so in binary:
00000001 << 5 = 00100000
This works only if no flag overlaps with other flags, which is typically achieved by starting with 1 and shifting it to the left by a different amount for every flag.
Then to check if a flag is set you check if the corresponding bit is set, eg:
if (flags & MCOMessageFlagMDNSent)
The result will be true if the bitwise AND is nonzero, which can happen only if the corresponding bit is set.
You can set a flag easily with OR:
flags |= MCOMessageFlagMDNSent;
or reset it with AND:
flags &= ~MCOMessageFlagMDNSent;
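This also explains the 5 seen in the debugger: it is MCOMessageFlagSeen (1) combined with MCOMessageFlagFlagged (4). A plain C sketch of the check follows, using a subset of the enum values quoted in the question; has_flag is a hypothetical helper, not part of MailCore2.

```c
#include <assert.h>
#include <stdbool.h>

/* Subset of the MCOMessageFlag values quoted in the question. */
typedef enum {
    MCOMessageFlagNone     = 0,
    MCOMessageFlagSeen     = 1 << 0, /* 1 */
    MCOMessageFlagAnswered = 1 << 1, /* 2 */
    MCOMessageFlagFlagged  = 1 << 2, /* 4 */
    MCOMessageFlagDeleted  = 1 << 3  /* 8 */
} MCOMessageFlag;

/* True if every bit of `flag` is set in `flags`. */
static bool has_flag(int flags, MCOMessageFlag flag)
{
    assert(flag != MCOMessageFlagNone); /* "None" is the absence of bits, not a bit to test */
    return (flags & flag) == (int)flag;
}
```

Comparing with == against a single flag, as in the question, only succeeds when that flag is the only one set, which is why it fails for a message that is both seen and flagged.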
The values of the enum represent the individual bits, so you need bitwise operations to check for flags:
BOOL wasSeen = ( email.flags & MCOMessageFlagSeen ) != 0; // check if a bit was set; comparing with 0 gives a clean YES/NO
BTW: Your code seems to suggest this is C, not C++. Tagging a question as both is almost always wrong; I suggest you pick the language you are using and remove the other tag.

Intermittent bugs - sometimes this code works and sometimes it doesn't!

This code works intermittently. It's running on a small microcontroller. It will work fine even after restarting the processor, but if I change some part of the code, it breaks. This makes me think that it's some kind of pointer bug or memory corruption. What's happening is that the coordinate p_res.pos.x is sometimes read as 0 (the incorrect value) and sometimes as 96 (the correct value) when it is passed to write_circle_outlined. y seems to be correct most of the time. If anyone can spot anything obviously wrong, please point it out!
int demo_game()
{
long int d;
int x, y;
struct WorldCamera p_viewer;
struct Point3D_LLA p_subj;
struct Point2D_CalcRes p_res;
p_viewer.hfov = 27;
p_viewer.vfov = 32;
p_viewer.width = 192;
p_viewer.height = 128;
p_viewer.p.lat = 51.26f;
p_viewer.p.lon = -1.0862f;
p_viewer.p.alt = 100.0f;
p_subj.lat = 51.20f;
p_subj.lon = -1.0862f;
p_subj.alt = 100.0f;
while(1)
{
fill_buffer(draw_buffer_mask, 0x0000);
fill_buffer(draw_buffer_level, 0xffff);
compute_3d_transform(&p_viewer, &p_subj, &p_res, 10000.0f);
x = p_res.pos.x;
y = p_res.pos.y;
write_circle_outlined(x, y, 1.0f / p_res.est_dist, 0, 0, 0, 1);
p_viewer.p.lat -= 0.0001f;
//p_viewer.p.alt -= 0.00001f;
d = 20000;
while(d--);
}
return 1;
}
The code for compute_3d_transform is:
void compute_3d_transform(struct WorldCamera *p_viewer, struct Point3D_LLA *p_subj, struct Point2D_CalcRes *res, float cliph)
{
// Estimate the distance to the waypoint. This isn't intended to replace
// proper lat/lon distance algorithms, but provides a general indication
// of how far away our subject is from the camera. It works accurately for
// short distances of less than 1km, but doesn't give distances in any
// meaningful unit (lat/lon distance?)
res->est_dist = hypot2(p_viewer->p.lat - p_subj->lat, p_viewer->p.lon - p_subj->lon);
// Save precious cycles if outside of visible world.
if(res->est_dist > cliph)
goto quick_exit;
// Compute the horizontal angle to the point.
// atan2(y,x) so atan2(lon,lat) and not atan2(lat,lon)!
res->h_angle = RAD2DEG(angle_dist(atan2(p_viewer->p.lon - p_subj->lon, p_viewer->p.lat - p_subj->lat), p_viewer->yaw));
res->small_dist = res->est_dist * 0.0025f; // by trial and error this works well.
// Using the estimated distance and altitude delta we can calculate
// the vertical angle.
res->v_angle = RAD2DEG(atan2(p_viewer->p.alt - p_subj->alt, res->est_dist));
// Normalize the results to fit in the field of view of the camera if
// the point is visible. If they are outside of (0,hfov] or (0,vfov]
// then the point is not visible.
res->h_angle += p_viewer->hfov / 2;
res->v_angle += p_viewer->vfov / 2;
// Set flags.
if(res->h_angle < 0 || res->h_angle > p_viewer->hfov)
res->flags |= X_OVER;
if(res->v_angle < 0 || res->v_angle > p_viewer->vfov)
res->flags |= Y_OVER;
res->pos.x = (res->h_angle / p_viewer->hfov) * p_viewer->width;
res->pos.y = (res->v_angle / p_viewer->vfov) * p_viewer->height;
return;
quick_exit:
res->flags |= X_OVER | Y_OVER;
return;
}
Structure for the results:
typedef struct Point2D_Pixel { unsigned int x, y; };
// Structure for storing calculated results (from camera transforms.)
typedef struct Point2D_CalcRes
{
struct Point2D_Pixel pos;
float h_angle, v_angle, est_dist, small_dist;
int flags;
};
The code is part of an open source project of mine so it's okay to post a lot of code here.
I see that some of your calculations depend on p_viewer->yaw, but I do not see any initialization for p_viewer->yaw. Is this your problem?
A couple of things that seem sketchy:
You can return from compute_3d_transform without setting many of the fields in p_res/res but the caller never checks for this situation.
You consistently read from res->flags without initializing it first.
Whenever the output differs from run to run, it possibly means some value is not initialized and the outcome depends on the garbage value present in a variable. Keeping that in mind, I looked for uninitialized variables. The structure p_res is not initialized.
if(res->est_dist > cliph)
goto quick_exit;
That means the if condition may turn out to be true or false depending on what garbage value is stored in res->est_dist. When the condition turns out to be true, execution goes straight to the quick_exit label and p_res.pos.x is not updated. If the condition turns out to be false, it is updated.
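One way to act on the points above, sketched with simplified stand-in types: calc_res, compute, position_is_usable, and the X_OVER/Y_OVER values here are assumptions for illustration, not the project's real definitions. The idea is to clear the result before filling it in, and to have the caller check the flags before trusting the position.

```c
#include <string.h>

#define X_OVER 0x1  /* assumed values; the real ones are not shown */
#define Y_OVER 0x2

struct calc_res {            /* simplified stand-in for Point2D_CalcRes */
    unsigned int x, y;
    float est_dist;
    int flags;
};

/* Stand-in for compute_3d_transform: clears the result first so that no
   field, including flags, is ever read while indeterminate. */
static void compute(struct calc_res *res, float dist, float cliph)
{
    memset(res, 0, sizeof *res);
    res->est_dist = dist;
    if (dist > cliph) {
        res->flags |= X_OVER | Y_OVER;  /* early exit: x and y stay 0 */
        return;
    }
    res->x = 96;  /* placeholder for the real projection */
    res->y = 64;
}

/* Caller side: only use the position when neither flag is set. */
static int position_is_usable(const struct calc_res *res)
{
    return (res->flags & (X_OVER | Y_OVER)) == 0;
}
```

With this shape, a stale flag or position from a previous loop iteration cannot leak into the next one, and the drawing call can be skipped whenever position_is_usable returns 0.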
When I used to program in C, I would use a divide and conquer debugging technique for this kind of problem to try to isolate the offending operation (paying attention to whether the symptoms change as debugging code is added, which is indicative of dangling pointer type bugs).
Essentially, start with the first line where the value is known to be good (and prove that it is consistently good at that line). Then identify where it is known to be bad. Then, approximately halfway between the two points, insert a test to see if it's bad. If it is not, insert a test halfway between the midpoint and the known bad location; if it is bad, insert a test halfway between the midpoint and the known good location, and so on.
If the line identified is itself a function call, this process can be repeated in that called function, and so on.
When using this kind of approach, it's important to minimize the amount of added code and the artificial "noise", which can create timing changes.
Use this if you don't have (or can't use) an interactive debugger, or if the problem does not manifest when using one.
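A low-noise way to insert such tests might look like the sketch below. The CHECKPOINT macro, first_bad_line, and the example function are hypothetical names for illustration; the sketch records a line number in a variable instead of printing, on the assumption that printing would disturb timing too much on a small target.

```c
/* Line number of the first checkpoint where the value was bad; 0 = none yet. */
static volatile int first_bad_line = 0;

/* Record only the first failure, with no printing, to keep the added
   timing noise small. `ok` is any cheap test of the value being tracked. */
#define CHECKPOINT(ok) \
    do { if (!(ok) && first_bad_line == 0) first_bad_line = __LINE__; } while (0)

/* Hypothetical usage: x is expected to stay 96 through each step. */
static int example(int corrupt_at_step)
{
    int x = 96;
    CHECKPOINT(x == 96);              /* known-good point */
    if (corrupt_at_step == 1) x = 0;  /* stands in for the suspect code */
    CHECKPOINT(x == 96);              /* midpoint test */
    if (corrupt_at_step == 2) x = 0;  /* stands in for more suspect code */
    CHECKPOINT(x == 96);              /* known-bad point */
    return first_bad_line;
}
```

After a run, first_bad_line can be inspected (for example by blinking it out or reading it over a serial link) to decide which half of the code to bisect next.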

Resources