I am wondering if it is possible to speed up the ISR without changing the prescaler.
I have a timer with 2 compare registers A and B.
COMPA is used for a PWM output from around 22% up to 100%. This has a fixed frequency and I am not allowed to change it at least not much.
Now I would like to use the COMPB with around 4 times the speed but with a fixed duty cycle of 50%.
If I set the OCIE0B bit in TIMSK0 for the attiny13 can I do the following to speed things up?
Or am I misunderstanding something here?
ISR(TIM0_COMPB_vect){
switch (timing){
case 0:
OCR0B = 63;
PORTB ^= (1 << PB3);
timing = 1;
break;
case 1:
OCR0B = 127;
PORTB ^= (1 << PB3);
timing = 2;
break;
case 2:
OCR0B = 191;
PORTB ^= (1 << PB3);
timing = 3;
break;
case 3:
OCR0B = 255;
PORTB ^= (1 << PB3);
timing = 0;
break;
}
}
Any help appreciated.
Thanx :D
You can do this very efficiently by creatively using the Normal Mode.
The trick is to set the prescaller to get a clock period that is double what you want the variable-duty PWM signal to run at. So if, for example, you want that to PWM at 1Mhz, set the prescaller to 2Mhz.
Assume the variable duty cycle PWM is on pin A and the fixed 50% 4x clock signal is on pin B. (You can also swap these and and also update the code everything will still work)
Enable interrupts for "On compare match B" and "Overflow".
Force pin A high with a force compare match. (Alternately you can skip this step and instead use the inverse of the desired duty cycle in step 7)
Set the COM bits for 'A' to Toggle on match.
Leave the COM bits for B to off. Assumes you have DDR set for this pin to be normal GPIO.
Set the OCR for B to 128.
Set the WGM timer mode to 0 - "Normal Mode".
Set the OCR for A whatever you want the variable duty cycle to be. Note that you might need to special case here for extreme values of 0 and/or 255 depending on what you want to have happen (just turn the pin ON of OFF with GPIO).
You can repeat step 6 anytime you want to change the duty cycle of A and it will update on the next TOP.
Once you do these steps, the A pin will output the desired duty cycle at 1/2 the prescaller clock and the B will output 50% duty at 2x the prescaller clock (which is the desired 4x of the A period).
Here is the ISR code (note that I am not sure what the TOV vector is called in the attiny13 headers [it is sometimes different across AVRs] so you might have to edit the TIM0_OVF_vect name)...
ISR(TIM0_COMPB_vect,TIM0_OVF_vect){
PINB |= (1 << PB3); // Compiles to a single cycle SBI
}
See how this works?
Note that setting a bit in the PIN register actually toggles the PORT bit. This is a quirk of the AVR GPIOs that is documented in the datasheets.
Hopefully this is fast enough. If you really want to squeeze every last cycle out, you can even potentially save the 2 cycles of the RJMP from the interrupt vector by putting the single SBI instruction that the ISR compiles down to directly into the interrupt vector table with a trailing RETI, but this is more complicated!
Focusing solely on the C code aspects, then this can be trivially optimized as:
ISR(TIM0_COMPB_vect)
{
static const uint8_t OCR[4] = {63,127,191,255};
OCR0B = OCR[timing];
PORTB ^= 1u << PB3;
timing++;
if(timing==4)
timing=0;
}
Disassembled on gcc AVR -O3 (with all variables/registers volatile) this brings down the amount of instructions from ~50 to ~20, so it's about twice as fast and takes less memory.
If you just want the fastest equivalent version of the supplied code, then here it is...
ISR(TIM0_COMPB_vect){
OCR0B += 64;
PINB |= (1 << PB3);
}
OCR0B will overflow every 4 passes, which is defined behavior. Probably wise to initialize OCR0B to some non-zero number like 1 to avoid edge cases.
This avoids all variables and memory access - only register access.
Avoids all compares and braches.
The PINB method of toggling the pin compiles down to a single SBI instruction rather than a PUSH, load, XOR, store, POP sequence.
...but again, none of this matters if it does not work and unless you are using one of the two "immediate OCR update" modes, then updating OCR in the middle of a timer cycle will have no effect.
Related
I am having some trouble with a timer on my Arduino atmega328p-pu with 16MHz clock.
I have a really simply program with only one timer, two ISRs, and one pin.
The program does the following:
Iterates through the bits of 'sequence' and sets pin4 high or low respectively. However it doesnt set the bit high for the entire period, only 1/12 of it. What you see below is a single timer that counts up from 0 to 340. There is ISRB at 28, and then ISRA happens at 340, then it loops (that is what CTC mode does, loops after ISRA). ISRB always turns off the pin, and ISRA handles whether or not the pin should be high.
Now then the problem. All the timing works for each bit, but for some reason the loopover event causes the pulse spacing to SHORTEN. Yes shorten, not widen (which if anything was what I would expect because of extra clock cycles for executing the loop event).
it makes waveforms that look like this.
_|_|_|_|_ _ _ _ _|_|_|_||_|_|_|_ _ _ _
You can see that the problem resides in the junction between two packets, but the rest of the timing is good. I cant seem to track down why.
#include <stdint.h>
uint32_t sequence =0b111100001111; // example data sequence
uint8_t packetlength = 12;
uint8_t index = 0;
void setup(){
DDRD = 0xFF; // all port D as input
bitSet(DDRD, 4); // board pin 4 output
bitSet(PORTD, 4); // start high
// initialize timer1
TCCR1A = 0; // zeros timer control register
TCCR1B = 0;
TCNT1 = 0; // sets timer counter to 0
OCR1A = 340; // compare match register 340*62.5ns = 21.25us
OCR1B = 28; // 28*62.5ns = 1.75us
TIMSK1 = 0;
TCCR1B |= (1 << WGM12); // CTC mode
TCCR1B |= (1 << CS10); // CS10 no prescaler (use CS12 for 256 prescaler for testing)
TIMSK1 |= (1 << OCIE1A); // enable timer compare A interrupt
TIMSK1 |= (1 << OCIE1B); // enable timer compare B interrupt
}
ISR(TIMER1_COMPA_vect){ // controls bit repeat rate
if (bitRead(sequence,index) == 1){
bitSet(PORTD, 4); //set high
}
index ++;
if (index == packetlength){ //loop over when end reached.
index = 0;
}
}
ISR(TIMER1_COMPB_vect){ // controls duty cycle
bitClear(PORTD, 4); // set low
}
void loop(){
//nothing
}
Edit: April 5. Scope photos demonstrating inter pulsetrain period shortening.
The important measurement value is BX-AX
Normal. 340 + 6 calculation clock cycles (best estimate from scope)
Bad. Timer only counting 284 cycles before interrupt is firing.
Also Bad, but not a huge problem. This pulse is far to wide to be reasonably explained by the clock cycles needed set the bit low. It appears to take 17, I would expect 3.
I do not see why you should expect precise timing at the bit output. Interrupts begin after a delay once requested which will vary depending on instruction execution time for each instruction being run in whatever is being interrupted. I suspect (without seeing evidence in your report of the problem) that the variation you see is identical to instruction execution time variation.
If you want precise hardware output timing, you must either use never-interrupted programmed I/O or use the various flip-bit-upon-timer-compare features of the uP's hardware peripheral set. The ISR can be used to set things up for the next compare, but not to directly flip output bits.
Once you've figured out how to setup the action to be performed by the hardware upon comparitor matches, it will be simpler to do it all in a single ISR. That service routine can arrange for both the conditional bit set and the following unconditional bit clear. You probably want the ISR to run during the lengthier part of the cycle so that latency in the actual running of your [a] ISR code does not cause the setup to be too late.
[a. In addition to your ISR code, the programming environment is causing some context save a restore to wrap what you wrote. This can add execution cycles that might not be expected. Auto-generated context save/restore often is extravagant about tucking away state so that naive programmers are not puzzled by strange foreground-background interactions. ]
Recently, I started programming the Atmega328P in pure c/c++, without all the arduino libraries and arduino IDE. I want to blink an LED at a rate of 1Hz (1 sec. on, 1 sec. off). Unfortunately, I can only use Timer0, an 8-bit timer. The mcu has a clock frequency of 16MHz and I chose a prescaler of 1024 to reduce to number of overflows as much as possible to reduce jitter, as an interrupt routine always has overhead (this makes more sense if you read the rest of my question). Using simple maths, I came to conclusion that after 1 second, Timer0 has overflowed 61 times and that the TCNT0 register equals 8.
Then, I came up with this solution:
#define F_CPU 16000000ul
#include <avr/io.h>
#include <avr/interrupt.h>
#define bit(b) (1 << (b))
void initBlinkTimer()
{
//Timer0 with CTC Mode
TCCR0A |= bit(WGM01) | bit(WGM00);
//Compare TCNT0 with 8
OCR0B = 8;
//Interrupt at OCR0B compare match aka: execute interrupt when TCNT0 equals 8
TIMSK0 |= bit(OCIE0B);
//Set PB5 as output
DDRB |= bit(PB5);
//Set the prescaler to 1024
TCCR0B |= bit(CS02) | bit(CS00);
//Enable global interrupts, so that the interrupt routine can be executed upon the OCR0B compare match
sei();
}
//Keeps track of the number of overflows
volatile unsigned char nOverflows = 0;
ISR(TIMER0_COMPB_vect)
{
if(nOverflows >= 61)
{
//Toggle PB5
PORTB ^= bit(PB5);
//Reset timer
nOverflows = 0;
TCNT0 = 0;
}
else
{
nOverflows++;
}
}
int main()
{
initBlinkTimer();
while(1){}
}
This code initializes an 8-bit CTC timer with a prescaler of 1024. An interrupt service routine (ISR) is executed when TCNT0 equals OCR0B, which I set to 8 in my code. In the ISR, the nOverflows variable is compared to 61. If nOverflows equals 61, one second has passed and the PB5 pin is to be toggled. It also performs a greater-than check, in case the mcu missed the 61th overflow (if that's somehow possible in this case). The timer and nOverflows variable are cleared after toggling the pin and the timer then starts again from zero.
My question is: Is this a good way to blink an LED at 1Hz, when only an 8-bit timer is available? Can/should a part of this be implemented in the hardware instead of in the software?
You don't need super high precision in your LED blinking; no one will be able to tell with their eyes if it is slightly off.
I'd say you should configure the timer to overflow about one to ten times per millisecond. Let's say it overflows every 251 microseconds (the exact number doesn't matter). Then every time it overflows, you add 251 to a uint16_t variable for counting microseconds. If the microsecond counter is greater than or equal to 1000, you subtract 1000 from it and increment your millisecond counter, which should probably be a uint32_t so you can use it to time long periods of time.
But what if the overflow period is not an exact number of microseconds? That's still fine, because it will probably be expressible in the form "N/M milliseconds", where N and M are whole numbers. So every time the overflow happens, you increment your counter by N. If the counter reaches M, you subtract M from the counter and add 1 to your millisecond counter. (In the preivous paragraph, we only considered the M=1000 case, and I think by starting with that case it is easier to see what I'm talking about.)
You can either use an ISR to update your millisecond counter or do it in your main-line code by frequently checking for an overflow flag. Either way would work.
So now you have a variable that counts up once per millisecond, like the Arduino millis() function, and you can use it for all sorts of millisecond-scale timing, including blinking LEDs.
The least-significant bit (bit 0) of the millisecond counter toggles once per millisecond, so its period is 2 ms. The next bit up from that (bit 1) has a period of 4 ms. Bit 10 has a period of 2048 ms.
So a simple way to blink your LED with a period of 2048 ms would be:
led_state = millisecond_counter >> 10 & 1;
(The code above uses bitwise operations to set led_state to be a copy of bit 10 from the millisecond counter. Assuming you run this code frequently and write led_state out to the LED using the I/O registers, your LED would then have a period of 2048 ms.)
If you really care that your LED is slow by 2.4%, you could do something more complicated like this:
static uint16_t last_toggle = 0;
if ((uint16_t)(millisecond_counter - last_toggle) >= 1000)
{
last_toggle += 1000;
toggle_the_led();
}
Hardware : Arduino Uno with ATmega328P
Software : Atmel Studio 6.2.1153, Arduino 1.0.6
Calculating the cycles needed for 1s
Clock Frequency of ATmega328P = 16M Hz
Clock Frequency with Prescaler CLK/1024 = 16M / 1024 = 15625 Hz
Clock Period with Prescaler CLK/1024 = (15625)^-1 = 6.4*10^-5s
Cycles of 1s = 1 / 6.4*10^-5 = 15625 cycles
Cycle needed for 1s = 15625 - 1 = 15624 cycles = 0x3D08
My Codes
OCR1AH = 0x3D; //Load higher byte of 15624 into output compare register
OCR1AL = 0x08; //Load lower byte of 15624 into output compare register
TCCR1A = 0b00000000;
TCCR1B = 0b00001101; //Turn on CTC mode and prescaler of CLK/1024
while((TIFR1 & (1<<OCF1A)) == 0); //If OCF1A is set (TCNT1 = OCR1A), break
TCCR1A = 0;
TCCR1B = 0; //Stop the timer
TIFR1 &= ~(1<<OCF1A); //Clear OCF1A for the next time delay
When I click "Start Debugging and Break" and "step over" the above codes as a function. It always show me "running" without a stop. Why? How to solve it?
Thank you for your help.
Your configuration looks OK.
When You pause execution of Your code in debugger, peripherals (independent of CPU) are not stopped. Some architectures/microcontrollers have additional hardware registers to stop peripherals (such a DMA or timers) when execution of code is stopped by debugger. Anyway, AVR does not.
If You are running Your code in simulation, You should be able to see all registers set by instructions, step-by-step. I advise You to turn off code optimizations for debugging purposes.
For debugging code in hardware (in case of AVR architecture) you need additional debugger. Debugging provided by Arduino use only software running over Your code in MPU and in some cases You cant rely on it.
Anyway, your code looks right. Only mistake: write logic 1 to TIFR1 to clear bit.
You should run Your code in loop to check if timer is working:
OCR1AH = 0x3D; //Load higher byte of 15624 into output compare register
OCR1AL = 0x08; //Load lower byte of 15624 into output compare register
TCCR1A = 0b00000000;
TCCR1B = 0b00001101; //Turn on CTC mode and prescaler of CLK/1024
while(1)
{
while((TIFR1 & (1<<OCF1A)) == 0); //If OCF1A is set (TCNT1 = OCR1A), break
//Blink LED here
TIFR1 = (1<<OCF1A); //Writing logic 1 to that register clears it
}
Of course if You dont want to run code in loop, just remove it. That code is just for testing purposes.
Edit:
Take look at Atmega328 datasheet: http://www.atmel.com/Images/doc8161.pdf, pages 139-140, TIFR1 register, bit 1 – OCF1A:
OCF1A is cleared by hardware when executing the corresponding interrupt. OCF1A can also be cleared by writing a logic one to that bit.
Some bits hardware registers (usually, which only can be cleared by user, never set by user) can be cleared by writing 1. Hardware connected with register set 0 to bit value when you write 1 to it. Writing 0 is ignored and not affect register/bit value. That prevents from setting bit from software, when bit may be set only with hardware (that case - timer). Think about it - there is no sense setting output compare register from code. Other actions (reading value, clearing bit) have sense and are allowed.
There are also some registers which can only be written, and not read (ie read always return same value).
When working with hardware registers remember to always check in datasheet how to set/clear bits.
I have been for some time on how to control a motor (control its speed) with fast pwm mode with my atmega32. I need to use the 8-bit Timer0, because i have other uses for the other counters. I think I know how to inialise the timer for this task:
void initial_io (void){
DDRC = 0xFF;
DDRB = 0xFF;
PORTA = (1<<PA4)|(1<<PA5);
TCCR0 = (1<<WGM01)|(1<<WGM00); // PWM mode : Fast PWM.
TCCR0 = (1<<CS02)|(1<<CS00); // PWM clock = CPU_Clock/1024
}
But then comes the problem. I simply don't know what to do next, what to do on my main.
My exact project is to drive a remote controlled car with acceleration. So when I ask from the car to go forward it must accelerate from stop to maximum speed whith fixed acceleration. I don't know any Assembly, so if you can help me please do it in C. Any help will be much appreciated.
Well, I guess you're okay with all port B and C pins being outputs. Also, you shouldn't need to do anything in assembly on AVR. The assembly-only stuff is available as macros in avr-libc.
Setup
First off, you're setting up TCCR0 wrong. You have to set all the bits at once, or you have to use a read-modify-write operation (usually TCCR0 |= _BV(bit_num); to set a bit or TCCR0 &= ~_BV(bit_num); to clear it). (_BV(N) is an avr-libc macro that's more legible than the (1<<N) stuff you're using, but does the same thing.) Also, you're missing the polarity of your PWM output, set by the COM00 and COM01 bits. Right now you have them (implicitly) disabled PWM output (OC0 disconnected).
So I'm going to assume you want a positive-going PWM, i.e. larger PWM input values result in larger high-output duty cycles. This means COM01 needs to be set and COM00 needs to be cleared. (See pp. 80-81 of the ATmega32(L) data sheet.) This results in the setup line:
TCCR0 = _BV(WGM01) | _BV(WGM00) // PWM mode: Fast PWM.
| _BV(COM01) // PWM polarity: active high
| _BV(CS02) | _BV(CS00); // PWM clock: CPU_Clock / 1024
Duty Cycle
Now we get to the actual duty cycle generation. Timer 0 is pretty stupid, and hard wires its BOTTOM to 0 and TOP to 0xFF. This means that each PWM period is PWM_Clock / 256, and since you set PWM_Clock to CPU_Clock / 1024, the period is CPU_Clock / 262144, which is about 33 ms for an 8 MHz CPU clock. So each PWM clock, this counter counts up from 0 to 255, then loops back to 0 and repeats.
The actual PWM is generated by the OC circuit per Table 40. For the COM0* setting we have, it says:
Clear OC0 on compare match, set OC0 at BOTTOM
What this means is that each time the counter counts up, it compares the count value to the OCR0 register, and if they match it drives the OC0 output pin to GND. When the counter wraps around to 0, it drives the pin to VCC.
So to set the duty cycle, you just write a value corresponding to that duty cycle into OCR0:
OCR0 = 0; // 0% duty cycle: always GND.
OCR0 = 64; // 25% duty cycle
OCR0 = 128; // 50% duty cycle
OCR0 = 172; // 67% duty cycle
OCR0 = 255; // 100% duty cycle; always VCC. See below.
That last case is there to resolve a common problem with PWM: the number of possible duty cycle settings is always one more than the number of count steps. In this case, there are 256 steps, and if the output could be VCC for 0, 1, 2, … 256 of those steps, that gives 257 options. So rather than prevent either the 0% or 100% case, they make the one-shy-of-100% case disappear. Note 1 on Table 40 says:
A special case occurs when OCR0 equals TOP and COM01 is set. In this case, the compare match is ignored, but the set or clear is done at BOTTOM.
One more thing: if you write to OCR0 in the middle of a PWM cycle, it just waits until the next cycle.
Simulating Acceleration
Now to get the "constant acceleration" you want, you need to have some kind of standard time base. The TOV0 (timer 0 overflow) interrupt might work, or you could use another of the timers or some kind of external reference. You'll use this standard time base to know when to update OCR0.
Constant acceleration just means that the speed changes linearly with time. Taking this a step further, it means that for each update event, you need to change the speed by a constant amount. This is probably nothing more than saturation arithmetic:
#define kAccelStep 4
void accelerate_step() {
uint8_t x = OCR0;
if(x < (255 - kAccelStep))
OCR0 = x + kAccelStep;
else
OCR0 = 255;
}
Just do something like this for each time step and you'll get constant acceleration. A similar algorithm can be used for deceleration, and you could even use fancier algorithms to simulate nonlinear functions or to compensate for the fact that the motor does not instantly go to the PWM-specified speed.
As you seem to be a beginner on AVR programming, I suggest you go the easy way: start with Arduino.
The Arduino environment offers simple functions so you don't need to manipulate the processor registers directly. For instance, to control a PWM output, you simply have to call analogWrite() (documentation here)
Here is a tutorial to hook up a motor to an Arduino.
Here is a tutorial to program a ATMega32 from Arduino IDE
I am trying to set up two timer interrupt routines with Teensy 2.0 Microcontroller (which is based on ATMEGA32U4 8 bit AVR 16 MHz) for independent control of two servo motors
After much trial - I was able to set one up on pin 7 of port C, but
How do I set up the second ISR to be initialized and called independently of the first?
Do I need to setup the second timer and, if so, what would such code look like?
Here is the setup code:
int main(void)
{
DDRE = 0xFF;
TCCR1A |= 1 << WGM12; // Configure timer 1 for CTC mode
TCCR1B = (1<<WGM12) | (1<<CS11) ;
OCR1A = 1000; // initial
TIMSK1 |= 1 << OCIE1A; // Output Compare A Match Interrupt Enable
sei(); // enable interrupts
// ...code that sets pulseWidth based on app logic variable.
// Not showing as its not important
}
ISR(TIMER1_COMPA_vect)
{
if (0 == pulseWidth)
{
return;
}
static uint8_t state = 0;
int dutyTotal = 20*1000;
if (0 == state)
{
PORTC |= 0b10000000;
OCR1A = pulseWidth;
state = 1;
}
else if (1 == state)
{
PORTC &= 0b01111111;
OCR1A = dutyTotal - pulseWidth;
state = 0;
}
}
While it's difficult to give a definitive answer without knowing more about your application (e.g. what kind of servo/motor, - I'm guessing model RC type with 1-2ms pule?) there are two approaches to solving the problem:
Firstly, In your code you seem to be manually generating a PWM signal by toggling PC7. You could add another output by increasing your number of states - you need one more than the number of servos to give the gap which sets the pulse repetition frequency. This is a common technique when you need to drive a lot of servos, since most RC servos don't care about pulse phasing or frequency (within limits), only the pulse width, so you can generate a bunch of different pulses one after the other on different outputs while only using one timer like this (in a sort of pseudo-code state diagram):
State 0:
Turn on output 1
Set timer TOP to pulse duration 1.
Go to state 1:
State 1:
Turn off output 1
Turn on output 2
Set timer TOP to pulse duration 1.
Go to state 2:
State 2:
Turn off output 2
Set timer TOP to pulse duration 3.
Go to state 0:
"Pulse duration 3" sets the PRF (pulse repetition frequency). If you want get fancy,
you can set this to 1/f-pd1-pd2, to give a constant frequency.
[ "TOP" is AVR-speak for the thing that sets the wrap (overflow) rate of the timer. See data sheet. ]
Secondly, there is a much easier way if you're only using two servos - use the hardware PWM functionality of the timer. The AVR timers have a built-in PWM function to do the pin-toggling for you. Timer1 on the mega32 has two PWM output pins, which could work great for your two servos and you then don't (necessarily) need an interrupt handler at all.
This is also the right solution if you are PWM driving motors directly (e.g. through an H-Bridge.)
To do this, you need to put the timer into PWM mode and enable the OC1A and OC1B output pins, e.g.
/*
* Set fast PWM mode on OC1A and OC1B with ICR1 as TOP
* (Mode 14)
*/
TCCR1A = (1 << WGM11) | (1 << COM1B1) | (1 << COM1A1);
TCCR1B = (3 << WGM12);
/*
* Clock source internal, pre-scale by 8
* (i.e. count rate = 2MHz for 16MHz crystal)
*/
TCCR1B |= (1 << CS11);
/*
* Set counter TOP value to set pulse repetition frequency.
* E.g. 50Hz (good for RC servos):
* 2e6/50 = 40000. N.B. This must be less than 65535.
* We count from t down to 0 so subtract 1 for true freq.
*/
ICR1 = 40000-1;
/* Enable OC1A and OC1B PWM output */
DDRB |= (1 << PB5) | (1 << PB6);
/* Uncomment to enable TIMER1_OVF_vect interrupts at 50Hz */
/* TIMSK1 = (1 << TOV1); */
/*
* Set both servos to centre (1.5ms pulse).
* Value for OCR1x is 2000 per ms then subtract one.
*/
OCR1A = 3000-1;
OCR1B = 3000-1;
Disclaimer - this code fragment compiles but I have not checked it on an actual device so you may need to double-check the register values. See the full datasheet at http://www.atmel.com/Images/doc7766.pdf
Also, you perhaps have some typos in your code, bit WGM12 doesn't exist in TCC1A (you've actually set bit 3, which is FOC1A - "force compare", see datasheet.) Also, you are writing DDRE to enable outputs on port E but toggling a pin on port C.
Halzephron, thank you so much for your answer. I dont have the high enough reputation to mark your answer, hopefully someone else will.
Details below:
You are absolutely right about being able to use a single IRS to control a number of servos - embarrassing I did not think of that, but given how well your simpler solution worked - I'll just use that.
... Also, you are writing DDRE to enable outputs on port E but toggling a pin on port C.
Thanks, I commented out that line, and it works the same - (so I dont need to enable output at all?)
Bit WGM12 doesn't exist in TCC1A (you've actually set bit 3, which is FOC1A - "force compare", see datasheet.)
I removed that too, but leaving the rest of my code unchanged - it results in servos moving slower, with way less torque and jittering as they do, even after arriving at desired position. Servo makes a weird "shaky" noise (frequency ~10-20, I'd say) and the arm it trembling, so for the reasons I don't understand - setting this bit seems necessary.
I suspected that a timer per motor is highly inelegant, and so I really like your second approach (built - in timer-generated PWM),
... This is also the right solution if you are PWM driving motors directly (e.g. through an H-Bridge.)
Very curious, why? I.e. what's the difference between using this method vs, say, timer-generated PWM.
In your code, I had to change below line to get 50HZ, otherwise I was getting 25HZ before (verified with a scope)
ICR1 = 20000-1; // was 40000 - 1;
One other thing I noticed with scope was that the timer-generated PWM code I have - produces less "rectangular" shape than the PWM code snippet you attached. It takes about 0.5 milliseconds for the signal to drop off to 0 with my code, and its absolutely instantaneous with yours (which is great). This solved another problem I had been bashing my head against: I could get analog servos to work fine with my code (IRS-generated PWM), but any digital servo I tried - just did not move, as if it was broken. I guess the shape of the signal is critical for digital servos, I never read this anywhere. Or maybe its something else I dont know.
As a side note, another weirdness I spent a bunch of time on - I uncommented the TIMSK1 = (1 << TOV1); line, thinking I always needed it - but what happened, my main function would get frozen, blocked forever upon first call to delay_ms(...) not sure what it is - but keeping it commented out unblocks my main loop (where I read the servo positin values from the USB HID using Teensy's sample code)
Again, many thanks for the help.