ARM USB functions take too long to execute - arm

I just started working on an XMC4500 microcontroller. I'm currently implementing USB CDC communication and I ran into a problem.
After calling the function "USBD_VCOM_SendData", the program waits for a frame in which to send the data.
More precisely, the program waits in "Endpoint_Write_Stream_LE" in the file "usbd_endpoint_stream_xmc4000.c". There it waits for the endpoint to become ready in the "Endpoint_WaitUntilReady()" function.
uint8_t Endpoint_WaitUntilReady(void)
{
#if (USB_STREAM_TIMEOUT_MS < 0xFF)
    uint8_t TimeoutMSRem = USB_STREAM_TIMEOUT_MS;
#else
    uint16_t TimeoutMSRem = USB_STREAM_TIMEOUT_MS;
#endif
    uint16_t PreviousFrameNumber = USB_Device_GetFrameNumber();

    for (;;) {
        if (Endpoint_GetEndpointDirection() == ENDPOINT_DIR_IN) {
            if (Endpoint_IsINReady())
                return ENDPOINT_READYWAIT_NoError;
        } else {
            if (Endpoint_IsOUTReceived())
                return ENDPOINT_READYWAIT_NoError;
        }

        uint8_t USB_DeviceState_LCL = USB_DeviceState;
        if (USB_DeviceState_LCL == DEVICE_STATE_Unattached)
            return ENDPOINT_READYWAIT_DeviceDisconnected;
        else if (USB_DeviceState_LCL == DEVICE_STATE_Suspended)
            return ENDPOINT_READYWAIT_BusSuspended;
        else if (Endpoint_IsStalled())
            return ENDPOINT_READYWAIT_EndpointStalled;

        /* One frame passes per millisecond; count frames down to the timeout. */
        uint16_t CurrentFrameNumber = USB_Device_GetFrameNumber();
        if (CurrentFrameNumber != PreviousFrameNumber) {
            PreviousFrameNumber = CurrentFrameNumber;
            if (!(TimeoutMSRem--))
                return ENDPOINT_READYWAIT_Timeout;
        }
    }
}
This wait is approximately 100-150 µs, which is too long. When I was working on STM32 microcontrollers, the execution time was significantly shorter.
Has anybody dealt with this problem before?
Is there a way to write the data to a buffer and then let the peripheral handle the transaction without using processor time?
Or at least to trigger an interrupt when the endpoint is ready for the transaction?

The loop terminates instantly if Endpoint_IsINReady() returns true. Perhaps you should try to run that macro in your code, and only send messages to the USB host if it evaluates to true.
Also, to ensure that the library does not use its blocking behavior, you'll have to think about how much data the endpoint buffers can hold at once, and make sure you don't send any more than that at a time. (You can make your own buffer and send it to the USB library piecemeal if you need to.)
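A minimal sketch of the "make your own buffer and send it piecemeal" idea, under some assumptions: tx_queue_t, tx_enqueue and tx_next_chunk are illustrative names that are not part of the XMC USB library, and the 64-byte chunk size is only a guess at the endpoint buffer size. The application queues its data and hands over at most one endpoint-sized chunk at a time.

```c
#include <stddef.h>
#include <stdint.h>

#define TX_QUEUE_SIZE 256u  /* assumed queue capacity, tune to your traffic */
#define EP_CHUNK_SIZE 64u   /* assumed endpoint buffer size */

/* Hypothetical software transmit queue (ring buffer). */
typedef struct {
    uint8_t data[TX_QUEUE_SIZE];
    size_t head;   /* next write position */
    size_t tail;   /* next read position */
    size_t count;  /* bytes currently queued */
} tx_queue_t;

/* Queue up to len bytes; returns how many were actually accepted. */
size_t tx_enqueue(tx_queue_t *q, const uint8_t *src, size_t len)
{
    size_t n = 0;
    while (n < len && q->count < TX_QUEUE_SIZE) {
        q->data[q->head] = src[n++];
        q->head = (q->head + 1u) % TX_QUEUE_SIZE;
        q->count++;
    }
    return n;
}

/* Copy at most max bytes out of the queue; returns the chunk length. */
size_t tx_next_chunk(tx_queue_t *q, uint8_t *dst, size_t max)
{
    size_t n = 0;
    while (n < max && q->count > 0u) {
        dst[n++] = q->data[q->tail];
        q->tail = (q->tail + 1u) % TX_QUEUE_SIZE;
        q->count--;
    }
    return n;
}
```

In the main loop, the intended use would be to call tx_next_chunk() only when Endpoint_IsINReady() evaluates to true and to pass the resulting chunk to USBD_VCOM_SendData, so that the library never has to wait inside Endpoint_WaitUntilReady().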


STM32 - Read I/O from multiple Tasks

I'm using FreeRTOS on my STM32F4-based board. I read about communication between tasks using queues and semaphores, which is easy to understand and apply.
But in the documentation I can't find any information on whether it is safe to call the same function from different tasks, for example:
void DefaultTask(void const * argument)
{
    uint8_t pin = 10;
    uint16_t analog = ADC_GetAnalog(pin);
    uint32_t encoder = Encoder_GetCount(1);
}

void SecondTask(void const * argument)
{
    uint8_t pin = 14;
    uint16_t analog = ADC_GetAnalog(pin);
    uint32_t encoder = Encoder_GetCount(2);
}

The ADC_GetAnalog:
uint16_t ADC_GetAnalog(uint8_t PinNumber)
{
    if ((PinNumber >= 1) && (PinNumber <= 18))
        return ADC_Pin[PinNumber].AnalogValue;
    return 0;
}
I also have multiple encoders in my system (interrupts that increment/decrement the CNT register of the htim# instances), and I call the read function in the same way as for the ADC, also from different tasks:
uint32_t Encoder_GetCount(uint8_t encoder_num)
{
    volatile __IO uint32_t count = 0;
    switch (encoder_num)
    {
        case 1:
            count = htim1.Instance->CNT;
            break;
        case 2:
            count = htim3.Instance->CNT;
            break;
        case 3:
            count = htim5.Instance->CNT;
            break;
    }
    return (uint32_t)count;
}
This is how I do it today, but I would like to know whether it is the best (safest) way.
From what you provide, it appears that the functions that can be called simultaneously only read data and never write it. So you are good to go. Even if you were writing, it would be fine as long as you write only to local variables (each task has its own copy).
You need to care about synchronization when you write global variables or write to certain peripherals (e.g. a serial flash chip: you don't want two tasks to use it at the same time). One way to deal with this is to use semaphores/mutexes or, preferably (if possible), to have only one task with access to the peripheral. A clean design is key to a maintainable system.
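As an illustration of the mutex approach, here is a minimal host-side sketch. It uses a POSIX pthread mutex purely as a stand-in so that it can be compiled on a PC; on FreeRTOS the same pattern would use a mutex created with xSemaphoreCreateMutex() and the xSemaphoreTake()/xSemaphoreGive() calls noted in the comments. The names shared_log and shared_log_append are hypothetical.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define LOG_SIZE 64u

/* Shared state written by more than one task: must be protected. */
static char shared_log[LOG_SIZE];
static size_t shared_len = 0;

/* Stand-in for a FreeRTOS mutex handle from xSemaphoreCreateMutex(). */
static pthread_mutex_t log_lock = PTHREAD_MUTEX_INITIALIZER;

/* Append msg to the shared buffer.
 * Returns 0 on success, -1 if the message does not fit. */
int shared_log_append(const char *msg)
{
    int rc = -1;
    size_t n = strlen(msg);

    pthread_mutex_lock(&log_lock);    /* FreeRTOS: xSemaphoreTake(m, portMAX_DELAY) */
    if (shared_len + n < LOG_SIZE) {
        memcpy(&shared_log[shared_len], msg, n + 1u); /* copy including '\0' */
        shared_len += n;
        rc = 0;
    }
    pthread_mutex_unlock(&log_lock);  /* FreeRTOS: xSemaphoreGive(m) */
    return rc;
}
```

Read-only helpers such as ADC_GetAnalog() need no such lock; the lock only matters once several tasks modify the same data.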
It depends on what the function does.
Your functions do not change any global variable, so it should be safe to call them from different tasks.
For example, if you had a function that writes to a global variable, e.g. a buffer, a second call would overwrite the changes made by the first call. If the buffer is used to send data, both tasks could (depending on timing) send the same bytes.

Need some help in C code for optimization (Poll + delay/sleep)

Currently I'm polling the register to get the expected value, and now I want to reduce the CPU usage and improve performance.
My idea is to poll for a certain time (say 10 ms) and, if the expected value has not appeared, to wait for a while (e.g. udelay(10*1000) or usleep(10*1000)). Then I would continue polling for a longer period (say 100 ms) and, if the expected value still has not appeared, sleep/delay for 100 ms, and so on until the maximum timeout is reached.
Please let me know if you have any suggestions.
This is the old code:
#include <stdio.h>    /* for printf, perror */
#include <sys/time.h> /* for setitimer */
#include <unistd.h>   /* for pause */
#include <signal.h>   /* for signal */

#define INTERVAL 500 /* timeout in ms */

/* Written by the signal handler, read by the polling loop. */
static volatile sig_atomic_t timedout = 0;
struct itimerval it_val; /* for setting itimer */
char temp_reg[2];

void DoStuff(void);

int main(void)
{
    /* Upon SIGALRM, call DoStuff().
     * Set interval timer. We want frequency in ms,
     * but the setitimer call needs seconds and useconds. */
    if (signal(SIGALRM, (void (*)(int)) DoStuff) == SIG_ERR) {
        perror("Unable to catch SIGALRM");
    }
    it_val.it_value.tv_sec = INTERVAL/1000;
    it_val.it_value.tv_usec = (INTERVAL*1000) % 1000000;
    it_val.it_interval = it_val.it_value;
    if (setitimer(ITIMER_REAL, &it_val, NULL) == -1) {
        perror("error calling setitimer()");
    }

    do {
        /* Read the register here and copy the value into the char array (temp_reg) */
        temp_reg[0] = read_reg();
        if (timedout == 1)
            return -1; /* Timed out */
    } while (temp_reg[0] != 0); /* Check the value and, if it is not the expected one, read the register again (poll) */
    return 0;
}

/*
 * DoStuff
 */
void DoStuff(void)
{
    timedout = 1;
    printf("Timer went off.\n");
}
Now I want to optimize this to reduce the CPU usage and improve performance.
Can anyone help me with this?
Thanks for your help.
Currently I'm polling the register to get the expected value [...]
Wow, wow, wow, hold on a moment here. There is a huge story hidden behind this sentence: what is "the register"? What is "the expected value"? What does read_reg() do? Are you polling some external hardware? If so, it all depends on how your hardware behaves.
There are two possibilities:
Your hardware buffers the values that it produces. This means that the hardware will keep each value available until you read it; it will detect when you have read the value, and then it will provide the next value.
Your hardware does not buffer values. This means that values are being made available in real time, for an unknown length of time each, and they are replaced by new values at a rate that only your hardware knows.
If your hardware is buffering, then you do not need to be afraid that some values might be lost, so there is no need to poll at all: just try reading the next value once and only once, and if it is not what you expect, sleep for a while. Each value will be there when you get around to reading it.
If your hardware is not buffering, then there is no strategy of polling and sleeping that will work for you. Your hardware must provide an interrupt, and you must write an interrupt-handling routine that will read every single new value as quickly as possible from the moment that it has been made available.
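For the buffered case, the "read once, then sleep" idea combined with an overall timeout might be sketched as follows on a POSIX system. This is only an illustration: wait_for_reg and now_ms are hypothetical helpers, and the read_reg() defined here is a fake stand-in (it reports the expected value 0 after a few reads) because the real register access is not shown in the question.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Stand-in for the asker's read_reg(): returns 0 ("ready") after a few reads. */
static int fake_reads_until_ready = 3;
static int read_reg(void)
{
    return (fake_reads_until_ready-- > 0) ? 1 : 0;
}

/* Milliseconds from a monotonic clock, unaffected by wall-clock changes. */
static int64_t now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Read once, sleep if the value is not the expected one, and give up after
 * timeout_ms. Returns true if the expected value was seen in time. */
bool wait_for_reg(int expected, int sleep_ms, int timeout_ms)
{
    const int64_t deadline = now_ms() + timeout_ms;
    const struct timespec nap = { sleep_ms / 1000, (sleep_ms % 1000) * 1000000L };

    for (;;) {
        if (read_reg() == expected)
            return true;
        if (now_ms() >= deadline)
            return false;
        nanosleep(&nap, NULL);  /* yields the CPU instead of spinning */
    }
}
```

Because the thread sleeps between reads, CPU usage drops roughly in proportion to sleep_ms, at the cost of up to sleep_ms of extra latency before a new value is noticed.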
Here is some pseudocode that might help:
// Pseudo code
start_time = get_current_time();
do
{
    do
    {
        //Read the register here and copy the value into char array (temp_reg)
        temp_reg[0] = read_reg();
        if (timedout == 1)
            return -1; //Timed out
        // Pseudo code
        stop_time = get_current_time();
        if (stop_time - start_time > some_limit) break;
    } while (temp_reg[0] != 0);
    if (temp_reg[0] != 0)
    {
        // Pseudo code: sleep before the next polling burst
        sleep_for_a_while();
        start_time = get_current_time();
    }
} while (temp_reg[0] != 0);
To turn the pseudo code into real code, see

--fill command on PIC32MX boot flash memory

I have been trying for the last few weeks to find out why this isn't working. I have tried reading all the documentation I could find on my PIC32 MCU (PIC32MX795F512L) and the XC32 compiler I am using (v1.34) but with no success as of yet.
I need a special constant value written to the physical boot flash address 0x1FC02FEC (Virtual address: 0x9FC02FEC). This constant is 0x3BDFED92.
I have managed to successfully do this on my main program (If I program my pic32 directly using Real ICE) by means of the following command line (Which I put in xc32-ld under "Additional options" under the Project Properties):
I am then able to check (Inside my main program) if this address indeed does have the correct value stored inside it, and this works too. I use the following code for that:
if(*(int *)(0x9fc02fec) == 0x3bdfed92)
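A slightly more defensive form of this check might look like the sketch below. The names signature_present and BOOT_SIGNATURE_VALUE are illustrative and not from the Microchip example code; the point is only to read through a volatile, unsigned 32-bit pointer so that the comparison is not affected by sign or cached by the compiler.

```c
#include <stdbool.h>
#include <stdint.h>

#define BOOT_SIGNATURE_VALUE 0x3BDFED92u  /* constant from the question */

/* Returns true if the 32-bit word at addr holds the expected signature.
 * The volatile qualifier forces a real memory read on every call. */
bool signature_present(const volatile uint32_t *addr)
{
    return *addr == BOOT_SIGNATURE_VALUE;
}
```

On the target, the call would be signature_present((const volatile uint32_t *)0x9FC02FEC), using the virtual address given above.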
My problem is the following. I do not want my main program hex file to write the constant into that location. I want my bootloader hex file to do this, and my main program must only read that location and check whether the constant is there. If I use the --fill command in the xc32-ld options of my bootloader program, it writes the constant successfully, just like the main program did (I have tested this by running my bootloader program with the same --fill command in debug mode and checking the 0x1FC02FEC address for the constant). The problem is that when my bootloader reads in a new main program from the MicroSD card and then jumps to it, nothing works. It seems that something goes wrong before the jump to the new main program and everything crashes, almost as if writing a value to the 0x1FC02FEC location causes a problem when the program jumps from the bootloader to the main program.
Is there a reason for this? I hope my explanation is OK; if not, please let me know and I will try to reword it more clearly.
I am using the example code provided by Microchip to do the bootloader using the MicroSD card. The following is the code:
int main(void)
volatile UINT i;
volatile BYTE led = 0;
// Setup configuration
TRISBbits.TRISB14 = 0;
LATBbits.LATB14 = 0;
// Create a startup delay to resolve trigger switch bouncing issues
unsigned char x;
WORD ms = 500;
DWORD dwCount = 25;
volatile DWORD _dcnt;
_dcnt = dwCount*((DWORD)(0.00001/(1.0/GetInstructionClock())/10));
#if defined(__C32__)
if(!CheckTrigger() && ValidAppPresent())
// This means the switch is not pressed. Jump
// directly to the application
else if(CheckTrigger() && ValidAppPresent()){
myFile = FSfopen("image.hex","r");
if(myFile == NULL){
//Initialize the media
while (!MDD_MediaDetect())
// Waiting for media to be inserted.
// Initialize the File System
//Indicate error and stay in while loop.
myFile = FSfopen("image.hex","r");
if(myFile == NULL)// Make sure the file is present.
//Indicate error and stay in while loop.
// Erase Flash (Block Erase the program Flash)
// Initialize the state-machine to read the records.
record.status = REC_NOT_FOUND;
// For a faster read, read 512 bytes at a time and buffer it.
readBytes = FSfread((void *)&asciiBuffer[pointer],1,512,myFile);
if(readBytes == 0)
// Nothing to read. Come out of this loop
// break;
// Something fishy. The hex file has ended abruptly, looks like there was no "end of hex record".
//Indicate error and stay in while loop.
for(i = 0; i < (readBytes + pointer); i ++)
// This state machine separates out the valid hex records from the read 512 bytes.
if(asciiBuffer[i] == ':')
// We have a record found in the 512 bytes of data in the buffer.
record.start = &asciiBuffer[i];
record.len = 0;
record.status = REC_FOUND_BUT_NOT_FLASHED;
if((asciiBuffer[i] == 0x0A) || (asciiBuffer[i] == 0xFF))
// We have got a complete record. (0x0A is new line feed and 0xFF is End of file)
// Start the hex conversion from element
// 1. This will discard the ':' which is
// the start of the hex record.
record.status = REC_FLASHED;
// Move to next byte in the buffer.
record.len ++;
if(record.status == REC_FOUND_BUT_NOT_FLASHED)
// We still have a half read record in the buffer. The next half part of the record is read
// when we read 512 bytes of data from the next file read.
memcpy(asciiBuffer, record.start, record.len);
pointer = record.len;
record.status = REC_NOT_FOUND;
pointer = 0;
// Blink LED at Faster rate to indicate programming is in progress.
led += 3;
mLED = ((led & 0x80) == 0);
return 0;
If I remember correctly (I used the PIC32 a very long time ago), you can add this to the MEMORY section of your linker script:
/* ... other stuff */
signature (RX) : ORIGIN = 0x9FC02FEC, LENGTH = 0x4
Googling around, I also found that you may be able to do this in your source code:
const int __attribute__((space(prog), address(0x9FC02FEC))) signature = 0x3bdfed92;
In my program I use an attribute to place a certain value at a certain location in memory space. My bootloader and App can read this location. This may be a simpler way for you to do this. This is using xc16 and a smaller part but I've done it on a PIC32 as well.
#define CHECK_SUM 0x45FB
#define CH_HIGH ((CHECK_SUM & 0xFF00) >> 8)
#define CH_LOW ((CHECK_SUM & 0x00FF) >> 0)
const char __attribute__((space(prog), address(APP_CS_LOC))) CHKSUM[2] = {CH_LOW,CH_HIGH};
Just a note, when you read it, it will be in the right format: HIGH->LOW
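If the value is read back byte by byte rather than as one word, it can be reassembled explicitly. A minimal sketch, assuming the two bytes are stored low byte first as in the {CH_LOW, CH_HIGH} initializer above (checksum_from_bytes is a hypothetical helper name):

```c
#include <stdint.h>

/* Rebuild the 16-bit checksum from two bytes stored low byte first,
 * matching the {CH_LOW, CH_HIGH} initializer. */
uint16_t checksum_from_bytes(const unsigned char bytes[2])
{
    return (uint16_t)(bytes[0] | ((uint16_t)bytes[1] << 8));
}
```

With CHECK_SUM defined as 0x45FB, the stored bytes are 0xFB followed by 0x45, and the function returns 0x45FB again regardless of the endianness of the CPU that runs it.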

Calculating the delay between write and read on I2C in Linux

I am currently working with I2C on Arch Linux ARM and I am not sure how to calculate the absolute minimum delay required between a write and a read. Without this delay, the read does not come through. I have put usleep(1000) between the two calls, which works, but the value was found empirically and should be replaced by the real minimum. How do I determine it?
Here is the write_and_read function I am using:
int write_and_read(int handler, char *buffer, const int bytesToWrite, const int bytesToRead) {
    write(handler, buffer, bytesToWrite);
    usleep(1000); /* empirically chosen delay between write and read */
    int r = read(handler, buffer, bytesToRead);
    if (r != bytesToRead) {
        return -1;
    }
    return 0;
}
Normally there's no need to wait. If your write and read functions somehow run in background threads (why would you do that?), then synchronizing them is mandatory.
I2C is a very simple linear protocol, and all the devices I have used were able to produce their output data within microseconds.
Are you using 100 kHz, 400 kHz or 1 MHz I2C?
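The bus speed matters because it sets how long the transfer itself occupies the wire. As a rough, hedged estimate (it ignores clock stretching and any processing time inside the device, which only the datasheet can tell you), the time for one transfer can be computed as below; i2c_transfer_time_us is an illustrative helper, not a Linux API.

```c
#include <stdint.h>

/* Approximate time on the wire, in microseconds, for one I2C transfer:
 * 1 address byte plus payload_bytes data bytes at 9 clocks each
 * (8 bits + ACK), plus roughly one clock each for START and STOP. */
uint32_t i2c_transfer_time_us(uint32_t payload_bytes, uint32_t bus_hz)
{
    uint64_t clocks = 9ull * (payload_bytes + 1ull) + 2ull;
    return (uint32_t)((clocks * 1000000ull + bus_hz - 1u) / bus_hz); /* round up */
}
```

For a 2-byte write this gives 29 clocks, i.e. about 290 µs at 100 kHz and about 73 µs at 400 kHz, both well below the 1000 µs currently used as a delay.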
After some discussion, I suggest you try this:
void dataRequest() {
x = 0;
void dataReceive(int numBytes)
x = numBytes;
for (int i = 0; i < numBytes; i++) {;
Here x is a global variable defined in the header and set to 0 in setup(). You can add a simple if condition to the main loop, e.g. if x > 0, send a debug message with Serial.print() and then reset x to 0.
This way the serial traffic does not block the I2C operation.

Can't write to SC1DRL register on 68HC12 board--what am I missing?

I am trying to use the multiple serial interface on the 68HC12 but can't get it to talk. I think I've isolated the problem to not being able to write to the SC1DRL register (SCI Data Register Low).
The following is from my SCI ISR:
else if (HWRegPtr->SCI.sc1sr1.bit.tdre) {
/* Transmit the next byte in TX_Buffer. */
if ( != TX_Buffer.out || TX_Buffer.full) {
HWRegPtr->SCI.sc1drl.byte = TX_Buffer.buffer[TX_Buffer.out];
if (TX_Buffer.out >= SCI_Buffer_Size) {
TX_Buffer.out = 0;
TX_Buffer.full = 0;
/* Disable the transmit interrupt if the buffer is empty. */
if ( == TX_Buffer.out && !TX_Buffer.full) {
TX_Buffer.buffer has the right thing at index TX_Buffer.out when its contents are being written to HWRegPtr->SCI.sc1drl.byte, but my debugger doesn't show a change, and no data is being transmitted over the serial interface.
Anybody know what I'm missing?
HWRegPtr is defined as:
HARDWARE_REGISTER is a giant struct with all the registers in it, and is volatile.
It's likely that SC1DRL is a write-only register (check the official register docs to be sure -- google isn't turning up the right PDF for me). That means you can't read it back (even with an in-target debugger) to verify your code.
How is HWRegPtr defined? Does it have volatile in the right places to ensure the compiler treats every write through that pointer as something which must happen immediately?
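Both points can be illustrated with a small sketch. Everything here is an assumption for illustration: sci_regs_t is a heavily reduced, hypothetical register block (the real HARDWARE_REGISTER layout is not shown in the question), and the shadow variable is just one way to see in a debugger what was last written when the register itself does not read back.

```c
#include <stdint.h>

/* Hypothetical, heavily reduced register block; the real layout is not
 * shown in the question. */
typedef struct {
    volatile uint8_t sc1drl;  /* SCI data register low */
} sci_regs_t;

/* Last byte written to SC1DRL, kept in RAM because the register itself
 * may not read back what was written. */
static uint8_t sc1drl_shadow;

/* Write one byte to the data register and remember it for debugging. */
void sci_write_byte(sci_regs_t *regs, uint8_t byte)
{
    regs->sc1drl = byte;   /* volatile member: the store cannot be optimized away */
    sc1drl_shadow = byte;  /* inspect this in the debugger instead */
}
```

If the shadow variable changes but nothing appears on the line, the write is reaching the code path and the problem is more likely in the SCI configuration (baud rate, transmitter enable) than in the store itself.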
