STM32 interrupt-driven UART receive fails after several flawless receives - c

Please note the clarification and update at the end of the post
TL;DR: An STM32 has 3 UART connections, 1 for debugging and 2 for actual communication which use the interrupt-driven HAL_UART_Receive_IT. Initially, interrupt driven UART receive works fine, though over time the receive callback for one of the UARTs fires less and less until eventually the STM32 doesn't receive any packets on that one UART at all (despite me being able to verify that they were sent). I suspect the issue to be timing related.
Situation: As part of my thesis, I developed a novel protocol which now has to be implemented and tested. It involves two classes of actors, a server and devices. A device consists of an STM32, ESP32 and a UART to Ethernet bridge. The STM32 is connected via UART to the bridge and via UART to the ESP32. The bridge connects the STM32 to the server by converting serial data sent by the STM32 to TCP packets which it forwards to the server (and vice versa). The ESP32 receives framed packets from the STM32, broadcasts them via BLE and forwards all received and well-formed BLE packets to the STM32. I.e. the ESP32 is just a BLE bridge. The server and ESP32 seem to be working flawlessly.
In a nutshell, the server tries to find out which devices D_j can hear BLE advertisements from device D_i. The server does that by periodically iterating over all devices D_1, ..., D_n and sends them nonces Y_1, ..., Y_n encrypted as X_1, ..., X_n. Upon D_i receiving X_i, it decrypts it to get Y_i, which it then forwards to the ESP32 to be broadcasted via BLE. Conversely, whenever the STM32 receives a packet from the ESP32 (i.e. a packet broadcasted via BLE), it extracts some data, encrypts it and forwards it to the server.
After the server has iterated over all devices, it looks at all the messages it received during that round. If it e.g. received a message with value Y_i sent by D_j, it can deduce that D_i's broadcast somehow arrived at D_j.
Problem: The way I have it set up right now, each STM32 seems to occasionally "miss" messages sent by the ESP32. The more such devices I have in my setup, the worse it gets! With just two devices, the protocol works 100% of the time. With three devices, it also seems to work fine. However, with four devices the STM32's UART receive callback for the ESP32 works fine initially, but after a couple of such rounds it doesn't trigger all the time until eventually it doesn't trigger at all.
Visualization:
The picture below shows a sample topology of n devices. Not drawn here, but if e.g. D_1 were to receive Y_2, it would encrypt it to X_2' and send it across the bridge to the server.
N.B.:
Encryption and Decryption each take ca. 130ms
Average one way delay for one ESP32 receiving packet, broadcasting it and another ESP32 receiving is ca. 15ms
I am aware that UART is not a reliable protocol per se and that one should use framing in a real setting. Nevertheless, I was instructed to just assume that UART is perfect and doesn't drop anything.
Due to the larger scope of the project, using an RTOS is not an option
Code:
#define LEN_SERVER_FRAMED_PACKET 35
#define LEN_BLE_PACKET 24
volatile bool_t new_server_msg;
volatile bool_t new_ble_msg;
byte_t s_rx_framed_buf[LEN_SERVER_FRAMED_PACKET]; // Receive buffer to be used in all subsequent Server send operations
ble_packet_t ble_rx_struct; // A struct; the whole struct is interpreted as a uint8_t pointer when sent to the ESP32 over UART
Init:
< set up some stuff>
err = HAL_UART_Receive_IT(&SERVER_UART, s_rx_framed_buf, LEN_SERVER_FRAMED_PACKET);
if (!check_success_hal("Init, setting Server ISR", __LINE__, err)) {
    print_string("Init after Signup: Was NOT able to set SERVER_UART ISR");
} else {
    print_string("Init after Signup: Was able to set SERVER_UART ISR");
}
err = HAL_UART_Receive_IT(&BLE_UART, (uint8_t *)&ble_rx_struct, LEN_BLE_PACKET);
if (!check_success_hal("Init, setting BLE ISR", __LINE__, err)) {
    print_string("Init after Signup: Was NOT able to set BLE_UART ISR");
} else {
    print_string("Init after Signup: Was able to set BLE_UART ISR");
}
Main loop:
while (1)
{
    // (2) Go over all 3 cases: new local alert, new BLE message and new server message, and handle them accordingly
    // (2.1) Check whether a new local alert has come in
    if (<something irrelevant happens>)
    {
        <do something irrelevant>
    }
    // (2.2) Check for a new BLE packet. Technically this checks for packets from the UART to the ESP32.
    if (new_ble_msg)
    {
        new_ble_msg = FALSE;
        int ble_rx_type_code = ble_parse_packet(&ble_rx_nonce, &ble_rx_struct);
        HAL_UART_Receive_IT(&BLE_UART, (uint8_t *)&ble_rx_struct, LEN_BLE_PACKET); // Listen for new BLE messages.
        <compute some stuff, rather quick>
        // Encrypts <stuff computed> and sends it to the server using a BLOCKING HAL_UART_Transmit(...).
        // Encryption takes ca. 130 ms.
        server_tx_encrypted(<stuff computed>, &c_write, "BLE", __LINE__);
    }
    // (2.3) Check for a new server packet
    if (new_server_msg)
    {
        new_server_msg = FALSE; // Set flag to false
        // Copy from the framed receive buffer to the framed working buffer, so that the
        // current message can be processed while new messages are being received.
        memcpy(s_wx_framed_buf, s_rx_framed_buf, LEN_SERVER_FRAMED_PACKET);
        HAL_UART_Receive_IT(&SERVER_UART, s_rx_framed_buf, LEN_SERVER_FRAMED_PACKET); // Listen for new server messages.
        <decrypt it, takes ca. 130 - 150 ms, results in buffer ble_tx_struct>
        err = HAL_UART_Transmit(&BLE_UART, ble_tx_struct, LEN_BLE_PACKET, UART_TX_TIMEOUT);
        check_success_hal(err); // If unsuccessful, print that to the debug UART
    }
    /* USER CODE END WHILE */
    /* USER CODE BEGIN 3 */
}
UART receive callback function:
void HAL_UART_RxCpltCallback(UART_HandleTypeDef *huart)
{
    if (huart == &SERVER_UART)
    { // One should technically compare huart->Instance, but this works as well...
        new_server_msg = TRUE;
        // Blocking write to the debug UART. I know that this is typically considered bad form,
        // but as the callback is only called once per receive and because it is the only way of
        // letting me know that the callback has occurred, I chose to keep the print in.
        print_string("UART Callback: Server ISR happened!\r\n");
    }
    else if (huart == &BLE_UART)
    {
        new_ble_msg = TRUE;
        print_string("UART Callback: BLE ISR happened!\r\n");
    }
    else
    {
        print_string("UART Callback: ISR triggered by unknown UART_HandleTypeDef!\r\n");
    }
}
What I have tried so far:
I wrote a client implementation in Go and ran it on my computer, where clients would just directly send UDP messages to each other instead of BLE. As that version functioned flawlessly even with many "devices", I am confident that the problem lies squarely at the STM32 and its STM32 <-> ESP32 UART connection.
To get it working with 3 devices, I simply removed most of the debugging statements of the STM32 and made the server wait 250ms between sending X_i to D_{i} and X_{i + 1} to D_{i + 1}. As this seems to have at least made the problem so infrequent that I haven't noticed it anymore, I reckon that the core issue is timing related.
Through drawing execution traces, I have already found an inherent weakness in my approach: if an STM32 calls HAL_UART_Receive_IT(&BLE_UART, ble_rx_buf, LEN_BLE_PACKET) while the ESP32 is currently transmitting a packet to the STM32 and has already sent k bytes, the STM32 will only receive LEN_BLE_PACKET - k bytes of it. This causes BLE_UART.RxXferCount to be wrong when the next packet is sent by the ESP32.
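One way to make the receiver self-heal from exactly this mid-frame arming problem is the ReceiveToIdle API, which completes a reception when the RX line goes idle rather than only after a fixed byte count. A sketch, reusing the question's names and assuming a HAL version recent enough to provide HAL_UARTEx_ReceiveToIdle_IT():

```c
/* Arm reception so it also completes on RX line idle. A partial frame
 * then arrives with Size < LEN_BLE_PACKET and can simply be dropped,
 * instead of silently shifting RxXferCount for the next frame. */
HAL_UARTEx_ReceiveToIdle_IT(&BLE_UART, (uint8_t *)&ble_rx_struct, LEN_BLE_PACKET);

void HAL_UARTEx_RxEventCallback(UART_HandleTypeDef *huart, uint16_t Size)
{
    if (huart == &BLE_UART) {
        if (Size == LEN_BLE_PACKET) {
            new_ble_msg = TRUE;   /* complete frame */
        }
        /* else: partial frame; discarding it and re-arming below
         * resynchronizes the receiver on the next frame boundary */
        HAL_UARTEx_ReceiveToIdle_IT(&BLE_UART, (uint8_t *)&ble_rx_struct,
                                    LEN_BLE_PACKET);
    }
}
```

If you re-arm in the callback like this, drop the HAL_UART_Receive_IT() call from the main loop so reception is not armed twice.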
On a more theoretical front, I first considered DMA instead of interrupt-driven receive. I refrained, however, since on the STM32 DMA doesn't use descriptor rings like on more powerful systems; it really just removes the overhead of servicing LEN_BLE_PACKET (resp. LEN_SERVER_FRAMED_PACKET) interrupts per packet.
Of course, I have also already checked Stack Overflow; several people seem to have experienced similar issues, e.g. "UART receive interrupt stops triggering after several hours of successful receive" and "Uart dma receive interrupt stops receiving data after several minutes".
Questions:
Given what I have described above, how is it possible for the STM32's callback of BLE_UART to simply stop triggering after some time without any apparent reason?
Does it seem plausible that the issue I raised in the last paragraph of "What I have tried so far" is actually the cause of the problem?
How can I fix this issue?
Clarification:
After the server sends a request to a device D_i, the server waits 250ms before sending the next request to D_{i + 1}. Hence, D_i has a 250ms transmission window in which no D_j can transmit anything. I.e., when it's D_i's turn to broadcast its nonce, the other devices simply have to receive one UART message.
As reception from the server is typically rather fast, decryption takes 130ms, and the UART transmit at a baud rate of 115200 is also quick, this window should be long enough.
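As a rough sanity check of that claim (my own arithmetic, assuming 8N1 framing, i.e. 10 bits per byte on the wire, and the worst-case 150 ms decryption figure from the question):

```c
/* Time to move a given number of bytes over a UART at 8N1 framing:
 * each byte costs 10 bit times (1 start + 8 data + 1 stop). */
static double uart_ms(int bytes, int baud) {
    return (double)bytes * 10.0 * 1000.0 / (double)baud;
}

/* Worst-case per-device budget inside the 250 ms window:
 * receive a 35-byte server frame, decrypt (<= 150 ms),
 * forward a 24-byte BLE frame to the ESP32. */
static double window_ms(void) {
    return uart_ms(35, 115200) + 150.0 + uart_ms(24, 115200);
}
```

The 35-byte frame costs about 3 ms and the 24-byte frame about 2 ms, so the total stays near 155 ms, comfortably inside 250 ms, which supports the conclusion that the window itself is long enough.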
UPDATE:
After posting the question, I changed the ESP32 firmware such that BLE packets are not immediately forwarded over UART to the STM32. Instead, they are enqueued, and a dedicated task in the ESP32 dequeues them with a minimum 5ms delay between packets. Hence, the STM32 should now have a guaranteed 5ms between each BLE packet. This was done to reduce burstiness (despite there not actually being any bursts, due to what is mentioned in the clarification... I was just desperate). Nevertheless, this seems to have made the STM32 "survive" for longer before the UART receiver locks up.

You need to be very careful when using the STM32 HAL library in production; it isn't reliable when receiving fast, continuous data from the server or anywhere else.
I will suggest a solution to this problem based on what I did when implementing a similar application. This worked well for my Firmware-Over-The-Air (FOTA) project and helps eliminate UART failures when using the STM32 HAL library.
The steps are listed below:
Ensure you reset the UART by calling MX_USARTx_UART_Init()
Reconfigure the callback, either for HAL_UART_Receive_IT() or HAL_UART_Receive_DMA()
These two steps should eliminate UART failures for receive interrupts using the STM32 HAL.
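For concreteness, here is a sketch of those two steps wired into HAL_UART_ErrorCallback(), which the HAL invokes when e.g. an overrun (ORE) has parked the receiver in an error state; until reception is re-armed, RxCpltCallback will never fire again, which matches the "stops triggering" symptom. MX_USARTx_UART_Init() is a placeholder for your CubeMX-generated init function, and the handle/buffer names are taken from the question:

```c
/* Sketch: recover the BLE UART after a receive error.
 * MX_USARTx_UART_Init() is the CubeMX-generated name; substitute
 * whichever instance your project generates. */
void HAL_UART_ErrorCallback(UART_HandleTypeDef *huart)
{
    if (huart == &BLE_UART) {
        HAL_UART_DeInit(&BLE_UART);
        MX_USARTx_UART_Init();                        /* step 1: reset  */
        HAL_UART_Receive_IT(&BLE_UART,                /* step 2: re-arm */
                            (uint8_t *)&ble_rx_struct, LEN_BLE_PACKET);
    }
}
```

Logging huart->ErrorCode here (ORE, FE, NE...) before the reset is also a cheap way to confirm which error is actually killing the receiver.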

Related

How to Read data from RS232 port without RS232 task creation (Embedded FreeRTOS C)?

I want to write C code for an embedded system such that the data received at the RS232 port should be read continuously without creating a separate "RS232 TASK" for reading the data.
Can anyone help me with this?
I just need a basic approach for reading data without task creation.
Identify the function that tells you whether some data was received. Commonly it returns a boolean value or the number of received bytes. (BTW, most protocols on RS232 allow 5 to 8 data bits per transmission.)
Use that function in a conditional block to call the next function that actually reads one or more received bytes. In case nothing was received, this prevents your loop from blocking.
Example (without knowing how the functions are named in your case):
/* any task */ {
    for (;;) /* or any other way of looping */ {
        /* do some stuff, if needed */
        if (areRs232DataAvailable()) {
            uint8_t data = fetchRs232ReceivedByte();
            /* handle received data */
        }
        /* do some stuff, if needed */
    }
}
I would ask why you think reading data from a UART (which I assume is what you mean by "RS-232") requires a task at all? A solution will depend a great deal on your platform and environment, which you have not specified beyond FreeRTOS, and FreeRTOS does not provide any serial I/O support.
If your platform or device library already includes serial I/O, then you might use that, but at the very lowest level, the UART will have a status register with a "data available" bit, and a register or FIFO containing that data. You can simply poll the data availability, then read the data.
To avoid data loss while the processor is perhaps busy with other tasks, you would use either interrupts or DMA. At the very least the UART will be capable of generating an interrupt on receipt of a character. The interrupt handler would place the new data into a FIFO buffer (such as an RTOS message queue), and tasks that receive serial data simply read from the buffer asynchronously.
DMA works similarly, but you queue the data in response to the DMA interrupt. That will reduce the interrupt rate, but you have to deal with the possibility of a partially full DMA buffer waiting indefinitely. Also not all platforms necessarily support UART as a DMA source, or even DMA at all.
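The interrupt-plus-FIFO approach described above can be sketched as a minimal single-producer/single-consumer ring buffer (illustrative code, not tied to any particular platform): the ISR is the only writer of head and the consuming task the only writer of tail, so no locking is needed as long as the index type is written atomically.

```c
#include <stdint.h>
#include <stdbool.h>

#define RB_SIZE 256  /* must be a power of two so wrap-around is a mask */

typedef struct {
    volatile uint16_t head; /* advanced only by the ISR (producer)  */
    volatile uint16_t tail; /* advanced only by the task (consumer) */
    uint8_t buf[RB_SIZE];
} ringbuf_t;

/* Called from the RX interrupt: drop the byte if the buffer is full
 * (and count it as an overrun in real code). */
static bool rb_put(ringbuf_t *rb, uint8_t byte) {
    uint16_t next = (uint16_t)((rb->head + 1) & (RB_SIZE - 1));
    if (next == rb->tail) return false;      /* full */
    rb->buf[rb->head] = byte;
    rb->head = next;
    return true;
}

/* Called from the main loop: returns false when no data is pending,
 * so the loop never blocks on the UART. */
static bool rb_get(ringbuf_t *rb, uint8_t *byte) {
    if (rb->tail == rb->head) return false;  /* empty */
    *byte = rb->buf[rb->tail];
    rb->tail = (uint16_t)((rb->tail + 1) & (RB_SIZE - 1));
    return true;
}
```

Note the usable capacity is RB_SIZE - 1: one slot is sacrificed to distinguish "full" from "empty" without a separate count variable.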

lwip board cannot maintain connection to another lwip board

I have a strange problem. For some time I've been trying to replace a small protocol converter (basically a two way serial to ethernet ... master and slave) that I've got for something that has more features.
Backstory
After a lot of reverse engineering, I found out how the device works and have been trying to replicate it, successfully so far: I've tried connecting the original as the master and my board as slave and vice versa, and everything works perfectly. It's actually better, since at higher speeds there are no more packet losses (connecting two of the original ones would cause packet losses).
However, when I tried connecting one of my devices as master and another one of my devices as slave, running the exact same piece of code, it works for 2 or 3 exchanges and then it stops... eventually, SOMETIMES after some minutes, it will try again 2 or 3 more times.
How the tests were made
I connected a modbus master and slave (modbustools, two different instances). The master is a serial RTU modbus and the slave is an serial RTU modbus;
I configure one of my devices as master and connect it to the serial port so that it receives the serial modbus and sends the protocol to a device connected to it;
I configure my slave so that it connects via the serial port to the slave modbus. Basically it works by creating a socket and connecting to the master's IP, it then waits for a master transmission via ethernet, sends it via serial to the slave modbus (modbustools), receives a response, sends it its master and then it sends it to the modbus master (modbustools);
It's a bit confusing, but that's how it works... my master awaits a socket connection and then the communication between them starts, because that is how the old ones work.
I've written an echo client now to test the connection. Basically now, my code connects to a server (my master), it receives a packet, then it replies back the same packet that it received. When I try connecting this to my 2 boards they don't work. It's more of the same, 2 or 3 exchanges and then it stops, but when I connect it to the original device it keeps running without a hitch.
Sources
Here is my TCP master (server actually) initialization:
void initClient() {
    if (tcp_modbus == NULL) {
        tcp_modbus = tcp_new();
        previousPort = port;
        tcp_bind(tcp_modbus, IP_ADDR_ANY, port);
        tcp_sent(tcp_modbus, sent);
        tcp_poll(tcp_modbus, poll, 2);
        tcp_setprio(tcp_modbus, 128);
        tcp_err(tcp_modbus, error);
        tcp_modbus = tcp_listen(tcp_modbus);
        tcp_modbus->so_options |= SOF_KEEPALIVE; // enable keep-alive
        tcp_modbus->keep_intvl = 1000;           // sends keep-alive every second
        tcp_accept(tcp_modbus, acceptmodbus);
        isListening = true;
    }
}
static err_t acceptmodbus(void *arg, struct tcp_pcb *pcb, err_t err) {
    tcp_arg(pcb, pcb);
    /* Set up the various callback functions */
    tcp_recv(pcb, modbusrcv);
    tcp_err(pcb, error);
    tcp_accepted(pcb);
    gb_ClientHasConnected = true;
    return ERR_OK;
}
// Receives the packet, puts it in an array "ptransparent->data",
// states which PCB to use in order to reply and the length that was received
static err_t modbusrcv(void *arg, struct tcp_pcb *pcb, struct pbuf *p, err_t err) {
    if (p == NULL) {
        return ERR_OK;
    } else if (err != ERR_OK) {
        return err;
    }
    tcp_recved(pcb, p->len);
    memcpy(ptransparent->data, p->payload, p->len);
    ptransparent->pcb = pcb;
    ptransparent->len = p->len;
    pbuf_free(p); // the receive callback owns the pbuf and must free it
    return ERR_OK;
}
The serial reception is basically this:
detect one byte received, start timeout, when timeout ends send whatever was received via a TCP socket that was already connected to the server .. it then receives the packet via the acceptmodbus function and sends it via serial port.
This is my client's (slave) code:
void init_slave() {
    if (tcp_client == NULL) {
        tcp_client = tcp_new();
        tcp_bind(tcp_client, IP_ADDR_ANY, 0);
        tcp_arg(tcp_client, NULL);
        tcp_recv(tcp_client, modbusrcv);
        tcp_sent(tcp_client, sent);
        tcp_client->so_options |= SOF_KEEPALIVE; // enable keep-alive
        tcp_client->keep_intvl = 100;            // sends keep-alive every 100 milliseconds
        tcp_err(tcp_client, error);
        err_t ret = tcp_connect(tcp_client, &addr, portCnt, connected);
    }
}
The rest of the code is identical. The only thing that changes is the flow of operation.
Connect to server
Wait for packet
send it via serial
wait for the response timeout (same timeout as the server, it just starts counting in a different way... the server starts after receiving one byte and the client after it has sent something via the serial port)
get response and send it to the server
Observation:
No error is detected in the communication. After some testing it doesn't seem to be the number of exchanges that causes the hang. It happens after some time. In my opinion this sounds like a disconnection problem or timeout error, but no disconnection occurs and no more packets are received. When I stop debugging and check the sockets nothing out of the ordinary is detected.
If I understood your question the right way, you have a computer with two serial ports, each running a Modbus client and server instance. From each of these ends, you then go to your STM32 boards that receive data on their serial ports and forward to TCP on an Ethernet network connecting them to each other.
Not easy to say but based on the symptoms you describe it certainly looks like you are having one or several timeout issues, likely on the serial sides. I think it won't be easy to help you pinpoint what is exactly wrong with your code without testing it and certainly not if you can't show a complete functional piece.
But what you can improve a lot is the way you debug on the end sides.
You can try replacing modbustools with something that gives you more details.
The easiest solution to get additional debugging info is to use pymodbus, you just need to install the library with pip and use the client and server provided with the examples. The only modification you need is to change them to the serial interface commenting and uncommenting a couple of lines. This will give you very useful details for debugging.
If you have a C development environment on your computer better go for libmodbus. This library has a fantastic set of unit tests. Again, you just have to edit the code to set the name of your serial ports and run server and client.
Lastly, I don't know to what extent this might be useful for you but you might want to take a look at SerialPCAP. With this tool, you can tap on an RS-485 bus and see all queries and responses running on it. I imagine you have RS-232, which is point-to-point and will not work with three devices on the bus. If so, you can try port forwarding.
EDIT: Reading your question more carefully I find this sentence particularly troublesome:
...detect one byte received, start timeout, when timeout ends send whatever was received via a TCP socket that was already connected to the server...
Why would you need to introduce this artificial delay? In Modbus, you have very well defined packages that you can identify by the minimum 3.5 frame spacing, is that what you mean by timeout?
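For reference, the frame spacing mentioned above is the Modbus RTU inter-frame gap t3.5: 3.5 character times, where one RTU character is 11 bits on the wire, with the specification fixing the gap at 1750 µs for baud rates above 19200. A small helper to compute it (my own, for illustration):

```c
/* Modbus RTU inter-frame gap t3.5 in microseconds.
 * One RTU character is 11 bits (start + 8 data + parity + stop);
 * above 19200 baud the spec fixes the gap at 1750 us. */
static double t35_us(long baud) {
    if (baud > 19200) return 1750.0;
    return 3.5 * 11.0 * 1e6 / (double)baud;
}
```

At 9600 baud this comes out to roughly 4 ms, which is the sort of value the end-of-frame timeout should be tuned to rather than an arbitrary delay.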
Unrelated, but I've also remembered there is a serial forwarder example included with pymodbus that might somehow help you (maybe you can use it to emulate one of your boards?).

How to design a test case to validate throttling capacity of a packet decoder?

I am implementing a packet decoder on a micro-controller. The packets are of 32-bytes each, received through a UART, every 10 milliseconds. The UART ISR (Interrupt Service Routine) keeps the received bytes in a ring buffer, and a thread scheduled every 7.5ms decodes the packets from ring buffer. There are instrumentation routines implemented to report the number of times ring buffer was full, error count after decoding, dropped bytes count. The micro-controller can send these packets back to PC running my test case through a different UART.
How do I design a test case to check if the system is meeting my performance requirements. These are the test cases which I should take care of --
The transmitter clock may run slightly faster (Sending a packet every 8ms, rather than the nominal 10ms).
The channel may introduce errors into the data bits. There are checksum fields included in the packet to cope with that. How do I simulate the channel errors?
The test case should be maintainable and extendable.
I already have a simulator through which I tested the decoder (implemented in micro-controller) for functional correctness. This simulator sends packets at programmable intervals, and the value of data fields can be changed through a UI. How can this simulator be modified to do this?
Are there standard practices/test cases to handle such throttling tests? Are there edge cases I am missing? I need to make sure that the ring buffer has enough space to handle the higher packet rates sent by the transmitter.
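One common way to simulate channel errors is to inject them in the test harness itself, between building a packet and writing it to the serial port: flip each bit with a configurable probability from a seeded PRNG, so failing runs are reproducible. A sketch (all names are hypothetical, and the XOR checksum is a stand-in for whatever checksum the packets actually use):

```c
#include <stdint.h>
#include <stdlib.h>

#define PKT_LEN 32  /* packet size from the question */

/* Flip each bit of the packet with probability p. Seeding the PRNG
 * explicitly makes every test run reproducible. */
static void inject_bit_errors(uint8_t pkt[PKT_LEN], double p, unsigned seed) {
    srand(seed);
    for (int i = 0; i < PKT_LEN; i++)
        for (int b = 0; b < 8; b++)
            if ((double)rand() / RAND_MAX < p)
                pkt[i] ^= (uint8_t)(1u << b);
}

/* Simple XOR checksum as a placeholder for the real checksum field. */
static uint8_t checksum(const uint8_t pkt[PKT_LEN]) {
    uint8_t c = 0;
    for (int i = 0; i < PKT_LEN; i++) c ^= pkt[i];
    return c;
}
```

The existing simulator's send path could call inject_bit_errors() with a per-test probability (and, for the rate test, a per-test inter-packet interval), then the decoder's reported error counts can be asserted against the number of corrupted packets actually sent.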

How does packetbuf work in ContikiOS if there's an incoming packet during a pending send?

I have trouble understanding how to write asynchronous sending/receiving in Contiki. Suppose I am using the xmac layer, or any layer that is based on packetbuf. I am sending a message, or a list of packets. I start sending a message using void(*send)(mac_callback_t sent_callback, void *ptr). This takes the message that is in the global buffer packetbuf, and tries to send it. Meanwhile while the send is pending (for example waiting for the other device to wake up or acknowledge the transmission), the device receives a packet from a third device.
Will this packet overwrite the packet waiting to be sent that is in the packetbuf? How should I handle this?
I thought that maybe you can't send packets and listen for incoming packets at the same time, but then there is an obvious deadlock: two devices sending messages to each other at the same time.
I am porting a higher-level routing layer to Contiki. This is the second OS I am porting it to, but the previous OS didn't use a single buffer for both incoming and outgoing packets.
The packetbuf is a space for short-term data and metadata storage. It's not meant to be used by code that blocks longer than a few timer ticks. If you can't send the packet immediately from your send() function, do not block there! You need to schedule a timer callback in the future and return MAC_TX_DEFERRED. To store packet data in between invocations of send(), use the queuebuf module.
The fact that there is a single packetbuf for both reception and transmission is not a problem, since the radio is a half-duplex communication medium anyway. It cannot both send and receive data at the same time. Similarly, a packet that is received is first stored in the radio chip's memory: it does not overwrite the packetbuf. Contiki interrupt handlers similarly never write to packetbuf directly. They simply wake up the rx handler process, which takes the packet from the radio chip and puts it in the packetbuf. Since one process cannot unexpectedly interrupt another, this operation is safe: a process wanting to send a packet cannot interrupt the process reading another packet.
To summarize, the recommendations are:
Do not block in Contiki process context (this is a generic rule when programming this OS, not specific to this question).
Do not expect the contents of packetbuf to be preserved across yielding execution in Contiki process context. Serialize to a queuebuf if you need this.
Do not access the packetbuf from interrupt context.
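A minimal sketch of the queuebuf-based deferral described above, using the Contiki queuebuf API (queuebuf_new_from_packetbuf(), queuebuf_to_packetbuf(), queuebuf_free()); the function names and the deferral condition are illustrative, not from any particular MAC driver:

```c
#include "net/queuebuf.h"

static struct queuebuf *pending;

/* send() path: if the radio is not ready, snapshot the packetbuf into
 * a queuebuf and return MAC_TX_DEFERRED via the sent callback instead
 * of blocking. radio_is_busy() is a placeholder. */
static void my_send(mac_callback_t sent, void *ptr)
{
    if (radio_is_busy()) {
        pending = queuebuf_new_from_packetbuf();  /* copy out of packetbuf */
        /* schedule retry_later() with a ctimer, report MAC_TX_DEFERRED */
        return;
    }
    /* otherwise hand the packetbuf contents to the radio directly */
}

/* timer callback: restore the snapshot into packetbuf and transmit.
 * By now packetbuf may have been reused for a received packet, which
 * is exactly why the snapshot was taken. */
static void retry_later(void *unused)
{
    queuebuf_to_packetbuf(pending);
    queuebuf_free(pending);
    pending = NULL;
    /* hand the restored frame to the radio driver */
}
```

In real code `pending` would be a queue (or per-neighbor slots) rather than a single pointer, since several sends can be deferred at once.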

creating a serial loopback under VxWorks

I'm fairly new to VxWorks OS and hence wouldn't mind explanations in case I differ in my understanding of things under the hood when they differ from more traditional OSes like Linux and the likes. With that out of the way, let me begin my actual question.
I am trying to create a loop-back test for testing changes I made to the serial UART driver on the board. Since I do not want to use a cross cable to actually short two UART ports externally, I've connected both of those ports to my dev machine. One configured as an output port from the dev machine's perspective (and consequently as an Input port on the board) and the other an input port (an output port on the board). I am actually doing the loopback using a shared memory buffer which I am protecting using a mutex. So there are two threads on the board, one of which reads from the input port, copies data to the shared memory and the other reads from the memory and sends it over the output port.
I am using regular open, read and write calls in my VxWorks application (by the way, I think this counts as application code, since I call the functions from usrAppInit.c, notwithstanding the fact that I can even call driver routines from there! Is that because of a flat memory model vis-a-vis Linux? Anyhow).
Now, these ports on VxWorks have been opened in non-blocking mode, and here's the code snippet which configures one of the ports:
if ((fdIn = open(portstrIn, O_RDONLY | O_NOCTTY, 0)) == ERROR) {
    return 1;
}
if ((status = ioctl(fdIn, FIOSETOPTIONS, OPT_RAW)) == ERROR) {
    return 1;
}
/* set the baud rate to 115200 */
if ((status = ioctl(fdIn, FIOBAUDRATE, 115200)) == ERROR) {
    return 1;
}
/* set the HW options */
if ((status = ioctl(fdIn, SIO_HW_OPTS_SET, (CS8 | 0 | 0))) == ERROR) {
    return 1;
}
And similarly the output port is also configured. These two are part of two separate tasks spawned using taskSpawn, both with a priority of 100. However, what annoys me is that when I write to the in port from my dev machine (using a Python script), the read call on the board gets sort of staggered (I wonder if that's the right way to refer to it). It is most likely due to the limited hardware buffer space on the UART input (or some such). This is usually not much of a problem if that is all I am doing.
However, as explained earlier, I am trying to copy the whole received character stream into a common memory buffer (guarded by a mutex of course) which is then read by another task and then re-transmitted over another serial port (sort of an in memory loopback if you will)
In light of the aforementioned staggering of the read calls, I thought of holding the mutex as long as there are characters to be read from the in port and, once there are no chars to be read, releasing the mutex and, since this is VxWorks, doing an explicit taskDelay(0) to schedule the next ready task (my other task). However, since this is a blocking read, I am (as expected) stuck on the read call, due to which my other thread never gets a chance to execute.
I did think of checking if the buffer was full and then doing the explicit task switch however if any of you have a better idea, I'm all ears.
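One way to avoid blocking in read() here is to poll the driver's pending byte count with the FIONREAD ioctl and only read what is already buffered; when nothing is pending, release the mutex and yield. A sketch of the reader task's inner loop (bufMutex and the shared-buffer handling are placeholders for your own code):

```c
/* Reader task inner loop: consume only what the tyLib driver has
 * already buffered, so read() never blocks and the writer task can
 * run whenever the UART goes quiet. */
int nAvail = 0;
if (ioctl(fdIn, FIONREAD, (int)&nAvail) != ERROR && nAvail > 0) {
    char buf[128];
    int n = read(fdIn, buf,
                 (nAvail < (int)sizeof(buf)) ? nAvail : (int)sizeof(buf));
    /* copy n bytes into the shared memory buffer (mutex already held) */
} else {
    semGive(bufMutex);               /* nothing pending: let the writer run */
    taskDelay(1);                    /* yield at least one tick             */
    semTake(bufMutex, WAIT_FOREVER);
}
```

This also removes the need for the taskDelay(0) trick, since the reader no longer sits inside a blocking read while holding the mutex.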
Also, just to see how this staggered read thing works from the perspective of the kernel, I timed it using a time(NULL) call just before and right after the read. Surprisingly, the very first chunk shows a number; every other chunk after that (if it's part of the same data block coming from outside) shows 0. Could anyone explain that as well?
Keen to hear
I don't have 50 rep points for commenting, but without a loopback cable attached, the only way you can test serial loopback behavior is to switch the UART into loopback mode. This often means making changes to the specific hardware part's driver.
