Please note the clarification and update at the end of the post
TL;DR: An STM32 has 3 UART connections, 1 for debugging and 2 for actual communication which use the interrupt-driven HAL_UART_Receive_IT. Initially, interrupt-driven UART receive works fine, but over time the receive callback for one of the UARTs fires less and less often until eventually the STM32 doesn't receive any packets on that UART at all (even though I can verify that they were sent). I suspect the issue is timing related.
Situation: As part of my thesis, I developed a novel protocol which now has to be implemented and tested. It involves two classes of actors, a server and devices. A device consists of an STM32, ESP32 and a UART to Ethernet bridge. The STM32 is connected via UART to the bridge and via UART to the ESP32. The bridge connects the STM32 to the server by converting serial data sent by the STM32 to TCP packets which it forwards to the server (and vice versa). The ESP32 receives framed packets from the STM32, broadcasts them via BLE and forwards all received and well-formed BLE packets to the STM32. I.e. the ESP32 is just a BLE bridge. The server and ESP32 seem to be working flawlessly.
In a nutshell, the server tries to find out which devices D_j can hear BLE advertisements from device D_i. To do that, the server periodically iterates over all devices D_1, ..., D_n and sends them nonces Y_1, ..., Y_n encrypted as X_1, ..., X_n. When D_i receives X_i, it decrypts it to get Y_i, which it then forwards to the ESP32 to be broadcast via BLE. Conversely, whenever the STM32 receives a packet from the ESP32 (i.e. a packet broadcast via BLE), it extracts some data, encrypts it and forwards it to the server.
After the server has iterated over all devices, it looks at all the messages it received during that round. If it e.g. received a message with value Y_i sent by D_j, it can deduce that D_i's broadcast somehow arrived at D_j.
Problem: The way I have it set up right now, each STM32 seems to occasionally "miss" messages sent by the ESP32. The more such devices I have in my setup, the worse it gets! With just two devices, the protocol works 100% of the time. With three devices, it also seems to work fine. However, with four devices the STM32's UART receive callback for the ESP32 works fine initially, but after a couple of rounds it fires less and less often until eventually it doesn't fire at all.
Visualization:
The below picture shows a sample topology of n devices. Not drawn here, but if e.g. D_1 was to receive Y_2, it would encrypt it to X_2' and send it across the bridge to the server.
N.B.:
Encryption and Decryption each take ca. 130ms
Average one way delay for one ESP32 receiving packet, broadcasting it and another ESP32 receiving is ca. 15ms
I am aware that UART is not a reliable protocol per se and that one should use framing in a real setting. Nevertheless, I was instructed to just assume that UART is perfect and doesn't drop anything.
Due to the larger scope of the project, using an RTOS is not an option
Code:
#define LEN_SERVER_FRAMED_PACKET 35
#define LEN_BLE_PACKET 24
volatile bool_t new_server_msg;
volatile bool_t new_ble_msg;
byte_t s_rx_framed_buf[LEN_SERVER_FRAMED_PACKET]; // Receive buffer to be used in all subsequent Server send operations
ble_packet_t ble_rx_struct; // A struct. The whole struct is then interpreted as uint8_t ptr. when being sent to the ESP32 over UART
Init:
< set up some stuff>
err = HAL_UART_Receive_IT(&SERVER_UART, s_rx_framed_buf, LEN_SERVER_FRAMED_PACKET);
if (!check_success_hal("Init, setting Server ISR", __LINE__, err)){
print_string("Init after Signup: Was NOT able to set SERVER_UART ISR");
}else{
print_string("Init after Signup: Was able to set SERVER_UART ISR");
}
err = HAL_UART_Receive_IT(&BLE_UART, (uint8_t *)&ble_rx_struct, LEN_BLE_PACKET);
if(!check_success_hal("Init, setting BLE ISR", __LINE__, err)){
print_string("Init after Signup: Was NOT able to set BLE_UART ISR");
}else{
print_string("Init after Signup: Was able to set BLE_UART ISR");
}
Main loop:
while (1)
{
// (2) Go over all 3 cases: New local alert, new BLE message and new Server message and handle them accordingly
// (2.1) Check whether a new local alert has come in
if (<something irrelevant happens>)
{
<do something irrelevant>
}
// (2.2) Check for new ble packet. Technically it checks for packets from the UART to the ESP32.
if (new_ble_msg)
{
new_ble_msg = FALSE;
int ble_rx_type_code = ble_parse_packet(&ble_rx_nonce, &ble_rx_struct);
HAL_UART_Receive_IT(&BLE_UART, (uint8_t *)&ble_rx_struct, LEN_BLE_PACKET); // Listen for new BLE messages.
<compute some stuff, rather quick>
server_tx_encrypted(<stuff computed>, &c_write, "BLE", __LINE__); // Encrypts <stuff computed> and sends it to the server using a BLOCKING HAL_UART_Transmit(...).
// Encryption takes ca. 130ms.
}
// (2.3) Check for new server packet
if (new_server_msg)
{
new_server_msg = FALSE; // Set flag to false
memcpy(s_wx_framed_buf, s_rx_framed_buf, LEN_SERVER_FRAMED_PACKET); // Copy from framed receive buffer to framed working buffer.
// This is done such that we can process the current message while also being able to receive new messages
HAL_UART_Receive_IT(&SERVER_UART, s_rx_framed_buf, LEN_SERVER_FRAMED_PACKET); // Listen for new server messages.
<decrypt it, takes ca.130 - 150ms. results in buffer ble_tx_struct>
err = HAL_UART_Transmit(&BLE_UART, ble_tx_struct,
LEN_BLE_PACKET, UART_TX_TIMEOUT);
check_success_hal(err); // If unsuccessful, print that to debug UART
}
/* USER CODE END WHILE */
/* USER CODE BEGIN 3 */
}
UART receive callback function:
void HAL_UART_RxCpltCallback(UART_HandleTypeDef *huart)
{
if (huart == &SERVER_UART)
{ // Strictly speaking one should compare huart->Instance, but this works as well...
new_server_msg = TRUE;
print_string("UART Callback: Server ISR happened!\r\n"); // Blocking write to debug UART. I know that this is typically considered bad form,
// but as the callback function is only called once per receive and because that's the only way of letting me know that the callback has occurred,
// I chose to keep the print in.
}
else if (huart == &BLE_UART)
{
new_ble_msg = TRUE;
print_string("UART Callback: BLE ISR happened!\r\n");
}
else
{
print_string("UART Callback: ISR triggered by unknown UART_HandleTypeDef!\r\n");
}
}
What I have tried so far:
I wrote a client implementation in Go and ran it on my computer, where clients would just directly send UDP messages to each other instead of BLE. As that version functioned flawlessly even with many "devices", I am confident that the problem lies squarely at the STM32 and its STM32 <-> ESP32 UART connection.
To get it working with 3 devices, I simply removed most of the debugging statements of the STM32 and made the server wait 250ms between sending X_i to D_{i} and X_{i + 1} to D_{i + 1}. As this seems to have at least made the problem so infrequent that I haven't noticed it anymore, I reckon that the core issue is timing related.
By drawing execution traces, I have already found an inherent weakness in my approach: if an STM32 calls HAL_UART_Receive_IT(&BLE_UART, ble_rx_buf, LEN_BLE_PACKET) while the ESP32 is in the middle of transmitting a packet to the STM32 and has already sent k bytes, the STM32 will only receive the remaining LEN_BLE_PACKET - k bytes of that packet. The receive is then still waiting for k more bytes, so BLE_UART.RxXferCount is wrong when the ESP32 sends the next packet.
On a more theoretical front, I first considered using DMA instead of interrupt-driven receive. I refrained, however, because on the STM32 DMA doesn't use descriptor rings as on more powerful systems; it really just removes the overhead of having to service LEN_BLE_PACKET (resp. LEN_SERVER_FRAMED_PACKET) interrupts.
I have of course also checked Stack Overflow, where several people seem to have experienced similar issues, e.g. "UART receive interrupt stops triggering after several hours of successful receive" and "Uart dma receive interrupt stops receiving data after several minutes".
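The misalignment from the execution traces (a receive armed k bytes into a packet) can be reproduced off-target with a small simulation. This is only a sketch: fixed_rx_t, fixed_rx_feed and bytes_from_next_packet are hypothetical names, and the model assumes a receive that completes after exactly LEN_BLE_PACKET bytes and is re-armed immediately, which simplifies what the HAL actually does.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LEN_BLE_PACKET 24

/* Hypothetical stand-in for a fixed-length interrupt receive: it completes
 * only after exactly LEN_BLE_PACKET bytes have arrived, whatever they are. */
typedef struct {
    uint8_t buf[LEN_BLE_PACKET];
    size_t  filled;
} fixed_rx_t;

/* Feed one byte from the wire; returns 1 when the buffer has just filled. */
int fixed_rx_feed(fixed_rx_t *rx, uint8_t byte)
{
    rx->buf[rx->filled++] = byte;
    if (rx->filled == LEN_BLE_PACKET) {
        rx->filled = 0;   /* model an immediate re-arm */
        return 1;
    }
    return 0;
}

/* Two back-to-back packets are on the wire (all 0xAA, then all 0xBB), but the
 * receiver is armed only after the first k bytes have already gone by.
 * Returns how many bytes of the first completed buffer belong to the
 * second packet. Meaningful for 0 <= k < LEN_BLE_PACKET. */
size_t bytes_from_next_packet(size_t k)
{
    uint8_t wire[2 * LEN_BLE_PACKET];
    fixed_rx_t rx = { {0}, 0 };
    size_t i, j, foreign = 0;

    memset(wire, 0xAA, LEN_BLE_PACKET);
    memset(wire + LEN_BLE_PACKET, 0xBB, LEN_BLE_PACKET);

    for (i = k; i < sizeof wire; i++) {   /* bytes 0..k-1 are missed */
        if (fixed_rx_feed(&rx, wire[i])) {
            for (j = 0; j < LEN_BLE_PACKET; j++) {
                foreign += (rx.buf[j] == 0xBB);
            }
            return foreign;
        }
    }
    return 0;
}
```

For 0 < k < LEN_BLE_PACKET, bytes_from_next_packet(k) returns k: the first completed buffer ends with the first k bytes of the following packet, so every later buffer stays shifted by k bytes unless something resynchronizes the stream.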
Questions:
Given what I have described above, how is it possible for the STM32's callback of BLE_UART to simply stop triggering after some time without any apparent reason?
Does it seem plausible that the issue I raised in the last paragraph of "What I have tried so far" is actually the cause of the problem?
How can I fix this issue?
Clarification:
After the server sends a request to a device D_i, the server waits for 250ms before sending the next request to D_{i + 1}. Hence, D_i has a 250ms transmission window in which no D_j can transmit anything. I.e. when it's D_i's turn to broadcast its nonce, the other devices simply have to receive one UART message.
As reception from the server is typically rather fast, the decryption takes 130ms and the UART transmit at a baud rate of 115200 is also quick, this window should be long enough.
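As a rough check of that budget, the wire time of one packet can be computed from the baud rate. The sketch assumes 8N1 framing (10 bits on the wire per byte), which is not stated above, and uart_tx_time_us is a hypothetical helper.

```c
#include <stdint.h>

/* Microseconds needed to shift n_bytes over a UART, assuming 8N1 framing
 * (1 start + 8 data + 1 stop = 10 bits per byte). Rounds down. */
uint32_t uart_tx_time_us(uint32_t n_bytes, uint32_t baud)
{
    const uint32_t bits_per_byte = 10u;
    return (uint32_t)(((uint64_t)n_bytes * bits_per_byte * 1000000u) / baud);
}
```

Under that assumption, a 24-byte BLE packet at 115200 baud takes about 2.08 ms and a 35-byte server packet about 3.04 ms, so the UART transfers themselves are small compared with the 130 ms of decryption and the 250 ms window.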
UPDATE:
After posting the question, I changed the ESP32 such that BLE packets are not immediately forwarded over UART to the STM32. Instead, they are enqueued and a dedicated task in the ESP32 dequeues them with a minimum 5ms delay between packets. Hence, the STM32 should now have a guaranteed 5ms between each BLE packet. This was done to reduce the burstiness (despite there not actually being any bursts due to what is mentioned in the clarification... I was just desperate). Nevertheless, this seems to have made the STM32 "survive" for longer before the UART receiver locks up.
You need to be very careful when using the STM32 HAL library in production: it isn't reliable when receiving fast, continuous data from a server or any other source.
I will suggest a solution based on what I did when implementing a similar application. It works well for my Firmware-Over-The-Air (FOTA) project and helps to eliminate UART failures when using the STM32 HAL library.
Steps are listed below:
Reset the UART by calling MX_USARTx_UART_Init().
Reconfigure the receive callback by calling either HAL_UART_Receive_IT() or HAL_UART_Receive_DMA() again.
These two steps should eliminate receive-interrupt failures with the STM32 HAL.
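The two steps could look roughly like the sketch below. To keep it compilable on a PC, the HAL types and functions are replaced by small stand-ins, which are assumptions and not the real driver; on the STM32 you would delete that section and use the real HAL header and the CubeMX-generated init function. The handle huart_stub, the flag uart_needs_recovery and the helpers uart_recover and rearm_or_flag are hypothetical names. The sketch also checks the return value of HAL_UART_Receive_IT(), so a failed re-arm is noticed instead of silently ignored.

```c
#include <stdint.h>

/* ---- Off-target stand-ins (assumptions) so the sketch compiles on a PC.
 * On the STM32, drop this section and use the real HAL instead. ---- */
typedef enum { HAL_OK = 0, HAL_ERROR, HAL_BUSY, HAL_TIMEOUT } HAL_StatusTypeDef;
typedef struct { int init_calls; int rx_armed; } UART_HandleTypeDef;

UART_HandleTypeDef huart_stub;            /* hypothetical UART handle */

void MX_USARTx_UART_Init(void)            /* stand-in for the generated init */
{
    huart_stub.init_calls++;
    huart_stub.rx_armed = 0;              /* re-init drops a receive in progress */
}

HAL_StatusTypeDef HAL_UART_Receive_IT(UART_HandleTypeDef *huart,
                                      uint8_t *pData, uint16_t Size)
{
    (void)pData;
    (void)Size;
    if (huart->rx_armed) {
        return HAL_BUSY;                  /* a receive is already pending */
    }
    huart->rx_armed = 1;
    return HAL_OK;
}
/* ---- end of stand-ins ---- */

#define LEN_BLE_PACKET 24
uint8_t ble_rx_buf[LEN_BLE_PACKET];
volatile int uart_needs_recovery;         /* hypothetical flag */

/* Steps 1 and 2 from the list above, meant to be called from the main loop. */
HAL_StatusTypeDef uart_recover(void)
{
    uart_needs_recovery = 0;
    MX_USARTx_UART_Init();                                   /* 1. reset  */
    return HAL_UART_Receive_IT(&huart_stub, ble_rx_buf,      /* 2. re-arm */
                               LEN_BLE_PACKET);
}

/* Normal re-arm that flags a failure instead of ignoring the return code. */
void rearm_or_flag(void)
{
    if (HAL_UART_Receive_IT(&huart_stub, ble_rx_buf, LEN_BLE_PACKET) != HAL_OK) {
        uart_needs_recovery = 1;
    }
}
```

In the main loop, one would call rearm_or_flag() where the receive is normally restarted and call uart_recover() whenever uart_needs_recovery is set.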
I have the following setup:
A: 1 x Coordinator connected via USB dongle (sparkfun) to a Windows 10 IoT device - Serial communication
B: 1 x Router connected to an Arduino Fio
C: 1 x Router connected via USB dongle (sparkfun) to Windows 10 to XCTU
All above are API mode 1.
My scenario is as follows:
Every 5 seconds, I send a 6-byte message from A to B and C.
B is instructed to reply to that message with another one of the same size.
After some time, typically 40 - 50 minutes, A no longer receives messages from B.
Reads from the serial port still work (a Transmit Status message is received for each message sent by A).
C receives messages as seen in XCTU.
If nothing changes A will never hear from B again.
However if (by some internal logic) B sends a message to A (other than the reply) or if C sends a 6 byte message (same as the one A sends to B and C) to B, suddenly A starts receiving messages from B.
Does anyone know why this is happening?
We had misused the Arduino library.
It only works in API Mode 2, and we had the module configured for API Mode 1.
(Does anyone know why the library has not yet been updated to work with API Mode 1?)
It only happened after a while because we have an incrementing counter in our message, and at some point that counter reached a value that is a special character from the API Mode 2 perspective.
From XCTU it always worked, since there was no incrementing logic there.
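To illustrate why a counter can trigger this: in API mode 2 the bytes 0x7E, 0x7D, 0x11 and 0x13 are reserved, and inside a frame each of them is sent as 0x7D followed by the byte XORed with 0x20. A peer running in API mode 1 does no such escaping, so the two sides disagree as soon as the frame contains one of those values. A minimal sketch of the escaping rule (the function names are hypothetical and not taken from the Arduino library):

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes that API mode 2 (AP = 2) escapes inside a frame. */
int xbee_needs_escape(uint8_t b)
{
    return b == 0x7E   /* frame delimiter */
        || b == 0x7D   /* escape character */
        || b == 0x11   /* XON */
        || b == 0x13;  /* XOFF */
}

/* Escape len bytes that follow the start delimiter, writing them to out
 * (out must hold up to 2 * len bytes). Returns the number of bytes written. */
size_t xbee_escape(const uint8_t *in, size_t len, uint8_t *out)
{
    size_t i, n = 0;
    for (i = 0; i < len; i++) {
        if (xbee_needs_escape(in[i])) {
            out[n++] = 0x7D;
            out[n++] = (uint8_t)(in[i] ^ 0x20);
        } else {
            out[n++] = in[i];
        }
    }
    return n;
}
```

xbee_needs_escape(0x11) is true while xbee_needs_escape(0x10) is not, which is why a message can work for a long time and then fail once a single byte changes.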
Many thanks to #tomlogic for his suggestion. Helped a lot!
I have a sound processor device with a MIDI interface over USB. I would like to control the device from my PC with something other than the device's official app. However, I don't have the command protocol description.
I managed to dump a couple of USB packets sent to the device with the help of usbmon. They look like:
0x0B 0xB0 0x00 0x00
0x0C 0xC0 0x05 0x00
If I send this command from my app, then the device activates program no 5.
The protocol seems to be MIDI, but if I follow it and try to interact with other functions of the device, I don't get the desired result.
So, I am looking for any help to get it working. For example, I need to learn how to select an effect or control the volume and other parameters.
Regards,
Dmitry
You'll find what you need in the Universal Serial Bus Device Class Definition for MIDI Devices and the MIDI specification.
Your example consists of two packets, each containing a MIDI event. They can be decoded as follows:
Packet 1 (0x0B 0xB0 0x00 0x00):
cable: 0
event: control change
channel number: 0
controller number: 0 (bank select)
controller value: 0
Packet 2 (0x0C 0xC0 0x05 0x00):
cable: 0
event: program change
channel number: 0
program number: 5
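The decoding above follows the 4-byte USB-MIDI event packet layout: the high nibble of byte 0 is the cable number, the low nibble is the code index number, and bytes 1 to 3 carry the MIDI message. A minimal sketch in C (usb_midi_event_t and usb_midi_decode are hypothetical names):

```c
#include <stdint.h>

typedef struct {
    uint8_t cable;    /* cable number: high nibble of byte 0 */
    uint8_t cin;      /* code index number: low nibble of byte 0 */
    uint8_t status;   /* MIDI status nibble: 0xB = control change,
                         0xC = program change */
    uint8_t channel;  /* MIDI channel: low nibble of byte 1 */
    uint8_t data1;    /* controller number or program number */
    uint8_t data2;    /* controller value (unused for program change) */
} usb_midi_event_t;

/* Split one 4-byte USB-MIDI event packet into its fields. */
usb_midi_event_t usb_midi_decode(const uint8_t pkt[4])
{
    usb_midi_event_t ev;
    ev.cable   = (uint8_t)(pkt[0] >> 4);
    ev.cin     = (uint8_t)(pkt[0] & 0x0F);
    ev.status  = (uint8_t)(pkt[1] >> 4);
    ev.channel = (uint8_t)(pkt[1] & 0x0F);
    ev.data1   = pkt[2];
    ev.data2   = pkt[3];
    return ev;
}
```

For 0x0C 0xC0 0x05 0x00 this gives cable 0, status 0xC (program change), channel 0 and data1 = 5, matching the second packet above.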
The Zoom G3X device uses the standard USB MIDI protocol.
However, just because it uses MIDI messages does not automatically imply that you know what these messages mean.
There are additional standards, such as General MIDI, but when the device is not a 'normal' synthesizer but an effect processor, most standard messages would not make sense.
To find out what MIDI messages the device accepts, look into the documentation.
If the messages are not documented (like in this case, where the device was meant to be used only with the supplied software), you have to do the changes on the device, and record any MIDI messages that it sends out (with amidi --dump, or aseqdump).
If the device does not send out messages to show changes in its current status, you have to capture the messages sent by the official app with a USB monitor (like usbmon in Linux).
I want to set up an XBee network with four Series 1 modules. Any two of them should be able to communicate with each other in both directions. The transmitted data is a string rather than a single byte.
My original design is to set up a nonbeacon (with coordinator) network: one module is configured as coordinator. The remaining three modules are configured as end devices. The coordinator broadcasts the data from end devices.
The communication workflow is: if end device 1 wants to send data to end device 2, it sends the data to the coordinator first. Then the coordinator broadcasts the data received from end device 1. End device 2 can receive the broadcast data. The communication workflow finishes.
I want the received string to be atomic. If end device 1 and end device 3 send out data at the same time, there would be a conflict. The two strings would be combined, and end device 2 can't distinguish which byte is from which device. That is, end device 1 sends out the string "{AAAA}" (quotes aren't included). In the meantime, end device 3 sends out the string "<2222>". End device 2 may receive a string like "{A<22AA2A2}>", which isn't what I want. My expected string is "{AAAA}<2222>" or "<2222>{AAAA}".
How do I setup the network to meet my requirements?
There are two ways to achieve atomic transmissions using Digi's XBee modules. The method varies depending on if API-mode (AP parameter > 0) is in use or not.
If API mode is not in use (AP = 0) then the atomicity of data can be encouraged by setting the RO time to be greater than the number of characters of the longest string you are going to send from one of your nodes. This will make the XBee buffer wait the specified number of character times (the time it takes to send a character at a particular baud rate) before starting the over-the-air transmission. Note: you'll have to ensure that you send your entire string all at once to the radio in order for this scheme to work.
If API mode is being used (AP > 0) then it is very easy to get the behavior you want. You'll simply use the Tx Request frame (API frame type 0x01) and specify the string data you want to send. The data will always be sent atomically.
If API mode is being used on the receiving node (i.e. in this case, the coordinator) then the frame data will always arrive atomically as well.
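As an illustration, a Tx Request with a 16-bit destination address can be assembled as follows. This is a sketch based on my understanding of the 802.15.4 API frame layout for AP = 1 (no escaping): start delimiter 0x7E, a two-byte length, then frame type 0x01, frame ID, destination address, options, the data, and a checksum equal to 0xFF minus the 8-bit sum of everything after the length. xbee_build_tx16 is a hypothetical helper, not a function of any particular XBee library.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Build a Tx Request (16-bit address, API frame type 0x01) for AP = 1.
 * out must hold len + 9 bytes. Returns the total frame length. */
size_t xbee_build_tx16(uint8_t frame_id, uint16_t dest,
                       const uint8_t *data, size_t len, uint8_t *out)
{
    size_t i, n = 0;
    uint16_t body_len = (uint16_t)(len + 5);  /* type + id + addr(2) + options */
    uint8_t sum = 0;

    out[n++] = 0x7E;                          /* start delimiter */
    out[n++] = (uint8_t)(body_len >> 8);      /* length, high byte */
    out[n++] = (uint8_t)(body_len & 0xFF);    /* length, low byte */
    out[n++] = 0x01;                          /* frame type: Tx Request, 16-bit */
    out[n++] = frame_id;                      /* non-zero requests a Tx Status */
    out[n++] = (uint8_t)(dest >> 8);          /* destination MY, high byte */
    out[n++] = (uint8_t)(dest & 0xFF);        /* destination MY, low byte */
    out[n++] = 0x00;                          /* options: none */
    memcpy(out + n, data, len);               /* the whole string in one frame */
    n += len;

    for (i = 3; i < n; i++) {
        sum = (uint8_t)(sum + out[i]);
    }
    out[n++] = (uint8_t)(0xFF - sum);         /* checksum */
    return n;
}
```

Sending "{AAAA}" this way puts all six characters into a single frame, so the radio handles them as one unit instead of as separate bytes that can interleave with another sender's data.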
Please refer to the Digi XBee 802.15.4 product support page for more information on how to use API mode and search the Internet for the many wonderful XBee libraries which allow you to use Digi XBee modules in API mode easily.
I setup a NonBeacon (w/Coordinator) network with three XBee Series 1 modules. One is configured as coordinator. The other two are configured as end devices. The firmware version and configuration are as below.
Firmware
Modem: XBEE    Function Set     Version
XB24           XB24 802.15.4    10CD
Coordinator
Parameter             Value    Comments
CH (Channel)          0x0F     Identical
ID (PAN ID)           0x5241   Identical
DH                    0x0
DL                    0x0
MY (Source Address)   0xFF01   Unique
CE (Coord. Enable)    1
A2 (Coord. Assoc.)    0x04     Allow end devices to associate to it.
End device
Parameter             Value    Comments
CH (Channel)          0x0F     Identical
ID (PAN ID)           0x5241   Identical
DH                    0x0
DL                    0x0
MY (Source Address)   0xFF02   Unique
CE (Coord. Enable)    0
A1 (End Dev Assoc.)   0x04     Allow association to the coordinator.
When end device 1 sends out the data, the coordinator can receive the data, but end device 2 can't. I want end device 2 to receive data from end device 1 in this network. My current solution is to let the coordinator broadcast the data, so end device 2 can receive it. I'm not sure if this is a good solution to the communication issue among end devices. Is there any other solution?
With Digi XBee 802.15.4 modules (also known as Digi XBee Series 1 modules), there is no penalty to using broadcasts on the coordinator to speak with your end devices.
If on the other hand you wanted to be able to establish communication between any two Digi XBee 802.15.4 modules you'd need to use unicast addressing. Unicast addressing is performed the following way:
Set an address on each node by setting the MY parameter to a unique value
Set the coordinator's DL parameter to the MY value of the end device node you wish to speak with.
Note that each end device will always be able to speak to the coordinator (the node with CE set to 1) by setting DL to 0.
It can be very clumsy to have to change the DL parameter on the coordinator to be able to speak with each end device in turn. This is why many end up using the Digi XBee radios in API mode.
If you download the manual from the Documentation section of the Digi XBee 802.15.4 Support Page, you'll find a section entitled "API Operation". Setting the AP parameter to a value greater than 0 (1 or 2) enables this mode.
If you send some data from an end device to the coordinator in API mode, you'll see RX frames (API type 0x81) emerge from the radio. Likewise, if you send packets of a similar format (using API type 0x01) and specify the MY address of an end device as the destination, you'll see the data emerge from the serial port of the end device XBee.
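To give an idea of what handling such an RX frame involves, here is a sketch of a parser for type 0x81 frames in AP = 1 (no escaping). It assumes the layout start delimiter 0x7E, two-byte length, frame type, 16-bit source address, RSSI, options, data and checksum, where the bytes after the length including the checksum must sum to 0xFF. The names xbee_rx16_t and xbee_parse_rx16 are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t source;        /* sender's MY address */
    uint8_t  rssi;          /* signal strength byte as reported by the radio */
    uint8_t  options;
    const uint8_t *data;    /* points into the caller's frame buffer */
    size_t   data_len;
} xbee_rx16_t;

/* Parse one complete frame of n bytes. Returns 1 on success, 0 if the buffer
 * is not a well-formed RX frame with a 16-bit source address (type 0x81). */
int xbee_parse_rx16(const uint8_t *frame, size_t n, xbee_rx16_t *out)
{
    size_t i, body_len;
    uint8_t sum = 0;

    if (n < 9 || frame[0] != 0x7E) {
        return 0;
    }
    body_len = ((size_t)frame[1] << 8) | frame[2];
    if (body_len < 5 || n != body_len + 4) {
        return 0;                   /* delimiter + length(2) + body + checksum */
    }
    if (frame[3] != 0x81) {
        return 0;
    }
    for (i = 3; i < n; i++) {
        sum = (uint8_t)(sum + frame[i]);
    }
    if (sum != 0xFF) {
        return 0;                   /* checksum mismatch */
    }
    out->source   = (uint16_t)((frame[4] << 8) | frame[5]);
    out->rssi     = frame[6];
    out->options  = frame[7];
    out->data     = frame + 8;
    out->data_len = body_len - 5;
    return 1;
}
```

The source field tells the coordinator which end device sent the data, which is what makes it possible to forward or answer per device without changing DL each time.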
If you search for "XBee API library" you'll find lots of useful links for libraries which can speak to Digi XBee modules using your language of choice such as this handy one for the Java language