Delayed ACK at receiver vs. RTO at sender

If H1 opens a TCP connection to H2 and then H1 sends a small packet (< MSS) to H2 (H2 will not respond with data), how long will it take H2 to send a delayed ACK?
How does TCP ensure that the RTO timer at H1 won't expire before the delayed ACK arrives?
Is my understanding correct that Linux has a minimum RTO of 200 ms by default, so that if the network is fast (the RTO stays at the 200 ms minimum) and the data packet is lost on the way from H1 to H2, then H1 will retransmit after 200 ms? And if the network is slow, H1 may wait well longer than 200 ms?

About delayed ACK timing, RFC 1122 says:
A TCP SHOULD implement a delayed ACK, but an ACK should not
be excessively delayed; in particular, the delay MUST be
less than 0.5 seconds, and in a stream of full-sized
segments there SHOULD be an ACK for at least every second
segment.
So it depends on the implementation, and of course it could reduce application performance.
The Linux kernel does not send delayed ACKs on a fixed timer or at a fixed interval; as the following code shows, the behavior varies with conditions.
In net/ipv4/tcp_input.c, a comment explains:
There is something which you must keep in mind when you analyze the
behavior of the tp->ato delayed ack timeout interval. When a
connection starts up, we want to ack as quickly as possible. The
problem is that "good" TCP's do slow start at the beginning of data
transmission. The means that until we send the first few ACK's the
sender will sit on his end and only queue most of his data, because
he can only send snd_cwnd unacked packets at any given time. For
each ACK we send, he increments snd_cwnd and transmits more of his
queue. -DaveM
static void tcp_event_data_recv(struct sock *sk, struct sk_buff *skb)
{
    struct tcp_sock *tp = tcp_sk(sk);
    struct inet_connection_sock *icsk = inet_csk(sk);
    u32 now;

    inet_csk_schedule_ack(sk);

    tcp_measure_rcv_mss(sk, skb);

    tcp_rcv_rtt_measure(tp);

    now = tcp_time_stamp;

    if (!icsk->icsk_ack.ato) {
        /* The _first_ data packet received, initialize
         * delayed ACK engine.
         */
        tcp_incr_quickack(sk);
        icsk->icsk_ack.ato = TCP_ATO_MIN;
    } else {
        int m = now - icsk->icsk_ack.lrcvtime;

        if (m <= TCP_ATO_MIN / 2) {
            /* The fastest case is the first. */
            icsk->icsk_ack.ato = (icsk->icsk_ack.ato >> 1) + TCP_ATO_MIN / 2;
        } else if (m < icsk->icsk_ack.ato) {
            icsk->icsk_ack.ato = (icsk->icsk_ack.ato >> 1) + m;
            if (icsk->icsk_ack.ato > icsk->icsk_rto)
                icsk->icsk_ack.ato = icsk->icsk_rto;
        } else if (m > icsk->icsk_rto) {
            /* Too long gap. Apparently sender failed to
             * restart window, so that we send ACKs quickly.
             */
            tcp_incr_quickack(sk);
            sk_mem_reclaim(sk);
        }
    }

    icsk->icsk_ack.lrcvtime = now;

    tcp_ecn_check_ce(tp, skb);

    if (skb->len >= 128)
        tcp_grow_window(sk, skb);
}
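
As a side note (an editor's addition, not from the original answer): Linux also exposes a per-socket knob for this. A receiver that cannot tolerate the delay can request immediate ACKs with the TCP_QUICKACK socket option. A minimal sketch, assuming fd is a connected TCP socket:

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Ask the kernel to ACK immediately rather than delaying. Note that
 * TCP_QUICKACK is not sticky: the kernel may re-enter delayed-ACK
 * mode, so applications typically re-arm it after each read. */
static int enable_quickack(int fd)
{
    int one = 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_QUICKACK, &one, sizeof(one));
}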

Related

Problem with keeping alive master and slave in a proprietary communication protocol

I'm writing a UART communication protocol for my STM32L152C-Discovery. Briefly, the master sends the communication start packet to the slave when the user button is pressed, and the slave, if it receives it correctly, sends a start-ACK packet and passes to the connected state. In the same way, if the master receives the start-ACK packet correctly, it goes into the connected state. In this state, master and slave keep alive with keep-alive packets that must be sent every TPOLL, which can be from 100 to 5000 milliseconds. If the TSILENT timeout, which is TPOLL + 100 milliseconds, expires, the board disconnects. The event counter is managed by a timer which increments every millisecond.
I'm running into a robustness problem: after a few packets are exchanged, one of the two boards disconnects. I have tried various TPOLL values. If I increase TSILENT from TPOLL + 100 to TPOLL + 1000, the program works correctly. The thing is, I have no delays in the operations I perform in this state, and all the functions to be performed take 12 milliseconds in total (also counting the 10 for UART reception). Timer 3 is initialized before entering the connected state. I've posted the parts of the code that might be of interest.
void TIM3_IRQHandler(void)
{
    /* USER CODE BEGIN TIM3_IRQn 0 */
    /* USER CODE END TIM3_IRQn 0 */
    HAL_TIM_IRQHandler(&htim3);
    /* USER CODE BEGIN TIM3_IRQn 1 */
    GlobalTick++;
    if (GlobalTick == 1)
    {
        TX_BufferReset();
        SendPacket(KEEP_ALIVE);
        HAL_UART_Transmit_DMA(&huart3, TX_Buffer.buffer, TX_Buffer.count);
    }
    if (GlobalTick - t_event_poll >= TPOLL)
    {
        TX_BufferReset();
        SendPacket(KEEP_ALIVE);
        HAL_UART_Transmit_DMA(&huart3, TX_Buffer.buffer, TX_Buffer.count);
        t_event_poll = GlobalTick;
    }
    if (GlobalTick - t_event_silence > TSILENT)
    {
        state = STATE_DISCONNECTED;
        ResetTime();
    }
    /* USER CODE END TIM3_IRQn 1 */
}

case STATE_CONNECTED:
    RX_BufferReset();
    HAL_UART_Receive(&huart3, RX_Buffer.buffer, MAX_PKT_BYTES, 10);
    if (WaitFullPacket() == TRUE)
        if (ReceivePacket() == CORRECT && RX_Buffer.buffer[3] == KEEP_ALIVE)
            t_event_silence = GlobalTick;
    break;
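
One thing worth making explicit (an editor's illustrative sketch, not part of the original question): with TSILENT = TPOLL + 100, a single corrupted or unrecognized keep-alive is enough to trip the silence timer, because the next valid refresh of t_event_silence cannot arrive sooner than a full TPOLL later:

/* Illustrative arithmetic, using the question's own figures. */
#define TPOLL_MS    100                        /* keep-alive period */
#define TSILENT_MS  (TPOLL_MS + 100)           /* disconnect threshold: 200 ms */
#define LOOP_MS     12                         /* per-loop processing, per the question */

/* Gap seen by the silence timer if one keep-alive is missed: */
#define GAP_MS      (2 * TPOLL_MS + LOOP_MS)   /* 212 ms > TSILENT_MS -> disconnect */

Widening TSILENT to TPOLL + 1000, as described above, tolerates a missed keep-alive for any TPOLL up to 1000 ms, which would be consistent with the observed behavior.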

MSP430 I2C slave holding clock line low

I'm more of a high-level software guy but have been working on some embedded projects lately, so I'm sure there's something obvious I'm missing here, though I have spent over a week trying to debug this and every 'MSP'-related link in Google is purple at this point...
I currently have an MSP430F5529 set up as an I2C slave device whose only responsibility is to receive packets from a master device. The master uses industry-grade I2C and has been heavily tested and ruled out as the source of my problem here. I'm using Code Composer as my IDE with the TI v15.12.3.LTS compiler.
What currently happens is that the master queries how many packets (of size 62 bytes) the slave can hold, then sends over a few packets, which the MSP currently just discards. This happens every 100 ms on the master side, and for the minimal example below the MSP will always just send back 63 when asked how many packets it can hold. I have tested the master with a Total Phase Aardvark and everything works fine with that, so I'm sure it's a problem on the MSP side. The problem is as follows:
The program will work for 15-20 minutes, sending over tens of thousands of packets. At some point the slave starts to hold the clock line low, and when paused in debug mode it is shown to be stuck in the start interrupt. The same sequence of events happens every single time to cause this:
1) The master queries how many packets the MSP can hold.
2) A packet is sent successfully.
3) Another packet is attempted, but < 62 bytes are received by the MSP (counted by logging how many Rx interrupts I receive). No stop condition is sent, so the master times out.
4) Another packet is attempted. A single byte is sent before the stop condition is sent.
5) Another packet is attempted. A start interrupt, then a Tx interrupt happens, and the device hangs.
Ignoring the fact that I'm not handling the timeout errors on the master side, something very strange is happening to cause that sequence of events, but that's what happens every single time.
Below is the minimal working example which reproduces the problem. My particular concern is with the SetUpRx and SetUpTx functions. The examples that the Code Composer Resource Explorer gives only show Rx or Tx in isolation, and I'm not sure if I'm combining them in the right way. I also tried removing SetUpRx completely, putting the device into transmit mode and replacing all calls to SetUpTx/SetUpRx with mode = TX_MODE/RX_MODE, which did work but still eventually holds the clock line low. Ultimately I'm not 100% sure how to set this up to handle both Rx and Tx requests.
#include "driverlib.h"
#define SLAVE_ADDRESS (0x48)
// During main loop, set mode to either RX_MODE or TX_MODE
// When I2C is finished, OR mode with I2C_DONE, hence upon exit mdoe will be one of I2C_RX_DONE or I2C_TX_DONE
#define RX_MODE (0x01)
#define TX_MODE (0x02)
#define I2C_DONE (0x04)
#define I2C_RX_DONE (RX_MODE | I2C_DONE)
#define I2C_TX_DONE (TX_MODE | I2C_DONE)
/**
* I2C message ids
*/
#define MESSAGE_ADD_PACKET (3)
#define MESSAGE_GET_NUM_SLOTS (5)
static volatile uint8_t mode = RX_MODE; // current mode, TX or RX
static volatile uint8_t rx_buff[64] = {0}; // where to write rx data
static volatile uint8_t* rx_data = rx_buff; // used in rx interrupt
static volatile uint8_t tx_len = 0; // number of bytes to reply with
static inline void SetUpRx(void) {
// Specify receive mode
USCI_B_I2C_setMode(USCI_B0_BASE, USCI_B_I2C_RECEIVE_MODE);
// Enable I2C Module to start operations
USCI_B_I2C_enable(USCI_B0_BASE);
// Enable interrupts
USCI_B_I2C_clearInterrupt(USCI_B0_BASE, USCI_B_I2C_TRANSMIT_INTERRUPT);
USCI_B_I2C_enableInterrupt(USCI_B0_BASE, USCI_B_I2C_START_INTERRUPT + USCI_B_I2C_RECEIVE_INTERRUPT + USCI_B_I2C_STOP_INTERRUPT);
mode = RX_MODE;
}
static inline void SetUpTx(void) {
//Set in transmit mode
USCI_B_I2C_setMode(USCI_B0_BASE, USCI_B_I2C_TRANSMIT_MODE);
//Enable I2C Module to start operations
USCI_B_I2C_enable(USCI_B0_BASE);
//Enable master trasmit interrupt
USCI_B_I2C_clearInterrupt(USCI_B0_BASE, USCI_B_I2C_RECEIVE_INTERRUPT);
USCI_B_I2C_enableInterrupt(USCI_B0_BASE, USCI_B_I2C_START_INTERRUPT + USCI_B_I2C_TRANSMIT_INTERRUPT + USCI_B_I2C_STOP_INTERRUPT);
mode = TX_MODE;
}
/**
* Parse the incoming message and set up the tx_data pointer and tx_len for I2C reply
*
* In most cases, tx_buff is filled with data as the replies that require it either aren't used frequently or use few bytes.
* Straight pointer assignment is likely better but that means everything will have to be volatile which seems overkill for this
*/
static void DecodeRx(void) {
static uint8_t message_id = 0;
message_id = (*rx_buff);
rx_data = rx_buff;
switch (message_id) {
case MESSAGE_ADD_PACKET: // Add some data...
// do nothing for now
tx_len = 0;
break;
case MESSAGE_GET_NUM_SLOTS: // How many packets can we send to device
tx_len = 1;
break;
default:
tx_len = 0;
break;
}
}
void main(void) {
//Stop WDT
WDT_A_hold(WDT_A_BASE);
//Assign I2C pins to USCI_B0
GPIO_setAsPeripheralModuleFunctionInputPin(GPIO_PORT_P3, GPIO_PIN0 + GPIO_PIN1);
//Initialize I2C as a slave device
USCI_B_I2C_initSlave(USCI_B0_BASE, SLAVE_ADDRESS);
// go into listening mode
SetUpRx();
while(1) {
__bis_SR_register(LPM4_bits + GIE);
// Message received over I2C, check if we have anything to transmit
switch (mode) {
case I2C_RX_DONE:
DecodeRx();
if (tx_len > 0) {
// start a reply
SetUpTx();
} else {
// nothing to do, back to listening
mode = RX_MODE;
}
break;
case I2C_TX_DONE:
// go back to listening
SetUpRx();
break;
default:
break;
}
}
}
/**
* I2C interrupt routine
*/
#pragma vector=USCI_B0_VECTOR
__interrupt void USCI_B0_ISR(void) {
switch(__even_in_range(UCB0IV,12)) {
case USCI_I2C_UCSTTIFG:
break;
case USCI_I2C_UCRXIFG:
*rx_data = USCI_B_I2C_slaveGetData(USCI_B0_BASE);
++rx_data;
break;
case USCI_I2C_UCTXIFG:
if (tx_len > 0) {
USCI_B_I2C_slavePutData(USCI_B0_BASE, 63);
--tx_len;
}
break;
case USCI_I2C_UCSTPIFG:
// OR'ing mode will let it be flagged in the main loop
mode |= I2C_DONE;
__bic_SR_register_on_exit(LPM4_bits);
break;
}
}
Any help on this would be much appreciated!
Thank you!

Unicast UDP packet miss

In unicast (one-to-one) UDP communication, the packets received are not the same each time: if I send 1000 packets at an interval of 500 ms, I get about 9 packets missed. I am working on the Windows platform with VC++ 6.0, using the sendto system call to send the Ethernet packets. On the host side I lose the packets to checksum or header errors.
Please let me know if you need more details. My goal is to not miss any packets on the target side.
Any help regarding this issue will be highly appreciated.
{
    // Initialize local variables
    MAINAPP(pAppPtr);

    int iResult = 0;
    int sRetVal = 0;
    static char cTransmitBuffer[1024];
    unsigned long ulTxPacketLength = 0;
    int in_usTimeOut = 0;
    unsigned short usTimeout = 0;
    S_QJB_POWER_CNTRL S_Out_QJB_Power_Cntrl = {0};

    pAppPtr->S_Tcp_Handle.Tcp_Tx_Msg.m_ucHeader[0] = QJB_TCP_HEADER_BYTE1;
    pAppPtr->S_Tcp_Handle.Tcp_Tx_Msg.m_ucHeader[1] = QJB_TCP_HEADER_BYTE2;
    pAppPtr->S_Tcp_Handle.Tcp_Tx_Msg.m_usCmdID = QJB_ETH_POWER_ON;
    pAppPtr->S_Tcp_Handle.Tcp_Tx_Msg.m_usCmdResults = 0;
    pAppPtr->S_Tcp_Handle.Tcp_Tx_Msg.m_usDataSize = sizeof(S_QJB_POWER_CNTRL);

    // Fill the controls & delay
    sRetVal = PowerCntrlStructFill(&pAppPtr->S_Tcp_Handle.Tcp_Tx_Msg.U_Tcp_Msg.S_QJB_PowerCntrl, &usTimeout);
    if (sRetVal)
    {
        return sRetVal;
    }

    pAppPtr->S_Tcp_Handle.Tcp_Tx_Msg.m_usReserved = 0;
    pAppPtr->S_Tcp_Handle.Tcp_Tx_Msg.m_usChecksum = 0;

    // Perform endian swap
    pAppPtr->objEndianConv.EndianSwap(&pAppPtr->S_Tcp_Handle.Tcp_Tx_Msg.U_Tcp_Msg.S_QJB_PowerCntrl, &S_Out_QJB_Power_Cntrl);

    // Frame the transmission packet
    QJB_Frame_TXBuffer(cTransmitBuffer, &(pAppPtr->S_Tcp_Handle.Tcp_Tx_Msg), &ulTxPacketLength, (void *)&S_Out_QJB_Power_Cntrl);

    // Send the data to the target
    iResult = sendto(pAppPtr->sktConnectSocket, cTransmitBuffer, ulTxPacketLength, 0, (struct sockaddr *)&pAppPtr->g_dest_sin, sizeof(pAppPtr->g_dest_sin));
    if (iResult == SOCKET_ERROR)
    {
        return QJB_TARGET_DISCONNECTED;
    }

    memset(&pAppPtr->S_Tcp_Handle.Tcp_Rx_Msg, 0, sizeof(S_QJB_ETHERNET_PKT)); // 1336

    // Send the command and obtain the response
    sRetVal = QJB_ETHResRev(pAppPtr->sktConnectSocket, &pAppPtr->S_Tcp_Handle.Tcp_Rx_Msg, 3);
    return sRetVal;
}
Sathishkumar.
Unfortunately, since UDP makes no guarantees about delivery, the network stack can drop your sent packets at any time for any reason. It is also worth noting that there is no guarantee about the order in which packets arrive.
If ordering and reliable delivery are critical to your application, which I think they are, consider switching to TCP.
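
If switching to TCP is not an option, a common mitigation (an editor's sketch, not from the original answer; all names are illustrative) is to prepend an application-level sequence number to each datagram, so the receiver can at least detect loss and reordering:

#include <stdint.h>
#include <string.h>

static uint32_t g_tx_seq = 0;

/* Sender: build a datagram as [4-byte sequence number][payload],
 * then hand the result to sendto() as before. */
static int frame_datagram(char *out, const char *payload, int payload_len)
{
    uint32_t seq = g_tx_seq++;
    memcpy(out, &seq, sizeof(seq));
    memcpy(out + sizeof(seq), payload, payload_len);
    return (int)sizeof(seq) + payload_len;
}

/* Receiver: report how many datagrams were lost before this one. */
static uint32_t check_gap(const char *in, uint32_t *expected_seq)
{
    uint32_t seq;
    uint32_t lost;
    memcpy(&seq, in, sizeof(seq));
    lost = (seq >= *expected_seq) ? seq - *expected_seq : 0;
    *expected_seq = seq + 1;
    return lost;
}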

WinUSB Bulk IN transfer fails on transfer size greater than maximum packet size

I am using WinUSB on the Windows host side to communicate with my WinUSB USB device.
My USB device is a full-speed device.
I am able to get the device handle and do the OUT and IN data transfers.
I am facing an issue with the Bulk IN transfer on the full-speed WinUSB device. When I do a loopback of data from the PC to the device and back to the PC, sizes from 1 to 64 bytes work fine. When I transfer 65 bytes, I can read back the first 64 bytes on the PC, but the last byte is missing.
Has anybody faced the same kind of issue, or can anyone suggest a solution?
Regards,
Nisheedh
First you should read out MAXIMUM_TRANSFER_SIZE. For sending, WinUSB "divides the buffer into appropriately sized chunks, if necessary" (source).
Also check the remarks of WinUsb_ReadPipe:
If the data returned by the device is greater than a maximum transfer
length, WinUSB divides the request into smaller requests of maximum
transfer length and submits them serially. If the transfer length is
not a multiple of the endpoint's maximum packet size (retrievable
through the WINUSB_PIPE_INFORMATION structure's MaximumPacketSize
member), WinUSB increases the size of the transfer to the next
multiple of MaximumPacketSize.
USB packet size does not factor into the transfer for a read request.
If the device responds with a packet that is too large for the client
buffer, the behavior of the read request corresponds to the type of
policy set on the pipe. If policy type for the pipe is
ALLOW_PARTIAL_READS, WinUSB adds the remaining data to the beginning
of the next transfer. If ALLOW_PARTIAL_READS is not set, the read
request fails. For more information about policy types, see WinUSB
Functions for Pipe Policy Modification.
Check your settings and whether the last byte is sent in a second transfer.
You should also check how many bytes were actually written/read.
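
To make the first suggestion concrete, here is a minimal sketch (an editor's addition; the interface handle, pipe ID, and pipe index are assumed to come from your existing setup code) of reading out both limits before issuing transfers:

#include <windows.h>
#include <winusb.h>
#include <stdio.h>

/* Query MAXIMUM_TRANSFER_SIZE and the endpoint's MaximumPacketSize. */
static void QueryPipeLimits(WINUSB_INTERFACE_HANDLE hWinUsb, UCHAR pipeId, UCHAR pipeIndex)
{
    ULONG maxTransfer = 0;
    ULONG cb = sizeof(maxTransfer);
    WINUSB_PIPE_INFORMATION pipeInfo;

    if (WinUsb_GetPipePolicy(hWinUsb, pipeId, MAXIMUM_TRANSFER_SIZE, &cb, &maxTransfer))
        printf("MAXIMUM_TRANSFER_SIZE: %lu bytes\n", maxTransfer);

    if (WinUsb_QueryPipe(hWinUsb, 0 /* alternate setting */, pipeIndex, &pipeInfo))
        printf("MaximumPacketSize: %u bytes\n", pipeInfo.MaximumPacketSize);
}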
I recently met this issue when programming my STM32F4 Discovery board as a USB device and using WinUSB in an application to transmit loopback messages over USB bulk transfers.
I did three things to get packets of more than 64 bytes sent by the PC and received back from the device in loopback fashion:
1. In the application, set the pipe policy to allow partial reads:
BOOL policy_allow_partial = true;
if (WinUsb_SetPipePolicy(deviceData.WinusbHandle, pipe_id.PipeInId, ALLOW_PARTIAL_READS, sizeof(UCHAR), &policy_allow_partial)) {
    printf("WinUsb_SetPipePolicy for ALLOW_PARTIAL_READS OK\n");
}
else {
    printf("WinUsb_SetPipePolicy for ALLOW_PARTIAL_READS failed: %s\n", GetLastErrorAsString().c_str());
}
2. In the firmware, let the USB receive handler do only its read task, and maintain the read pointer offset carefully.
static int8_t CDC_Receive_FS(uint8_t* Buf, uint32_t *Len)
{
    /* USER CODE BEGIN 6 */
    extern void on_CDC_Receive_FS(uint32_t len);
    extern volatile int8_t usb_rxne;

    USBD_CDC_SetRxBuffer(&hUsbDeviceFS, &Buf[0]);
    USBD_CDC_ReceivePacket(&hUsbDeviceFS);

    uint32_t len = *Len;
    if ((ir + len) >= RX_DATA_LEN) {
        len = RX_DATA_LEN - 1 - ir;
        if (len > 0) {
            memcpy(usb_rx + ir, Buf, len);
        }
        ir = RX_DATA_LEN - 1;
    }
    else {
        memcpy(usb_rx + ir, Buf, len);
        ir += len;
    }
    usb_rxne = SET;
    on_CDC_Receive_FS(len);
    return (USBD_OK);
}

void on_CDC_Receive_FS(uint32_t len)
{
    extern int8_t CDC_is_busy(void);
    if (CDC_is_busy()) return;

    // USB loopback method 2: transmit later, but can support more than 64 bytes
    if (iw < 512) {
        memcpy(usb_tx + iw, usb_rx, len);
        iw += len;
        tx_len += len;
    }
    else {
        memset(usb_tx, 0, 512);
        iw = 0;
        tx_len = 0;
    }
}
3. Let the transmit task run in the main loop:
int main(void)
{
    while (1) {
        if (iw != 0) {
            usb_txe = RESET;
            ret_tran = CDC_Transmit_FS(usb_tx, tx_len);
            iw = 0;
            tx_len = 0;
            usb_txe = SET;
        }
        else {
            usb_txe = SET;
        }
    }
}
For more details on this question, see the source code in my GitHub project.

UART DMA for varying sized arrays

Using MPLAB X 1.70 with a dsPIC33FJ128GP802 microcontroller.
I've got an application which collects data from two sensors at different sampling rates (one at 50 Hz, the other at 1000 Hz); the two sensors' packets are also different sizes (one is 5 bytes, the other is 21 bytes). Up until now I've used manual UART transmission, as seen below:
void UART_send(char *txbuf, char size) {
    // Loop variable.
    char i;

    // Loop through the size of the buffer until all data is sent. The while
    // loop inside checks for the buffer to be clear.
    for (i = 0; i < size; i++) {
        while (U1STAbits.UTXBF);
        U1TXREG = *txbuf++;
    }
}
The varying-sized arrays (5 or 21 bytes) were passed to this function along with their size, and a simple for loop stepped through each byte and wrote it to the UART TX register U1TXREG.
Now I want to implement DMA to relieve some pressure on the system when transmitting the large amount of data. I've used DMA for my UART receive and ADC, but I'm having trouble with transmit. I've tried ping-pong mode both on and off, and both one-shot and continuous mode, but whenever it comes to sending the 21-byte packet it messes up with strange values and zero-value padding.
I'm initialising the DMA as seen below.
void UART_TX_DMA_init() {
    DMA2CONbits.SIZE = 0;     // 0: word; 1: byte
    DMA2CONbits.DIR = 1;      // 0: peripheral to RAM; 1: RAM to peripheral
    DMA2CONbits.AMODE = 0b00;
    DMA2CONbits.MODE = 1;     // 0: contin, no ping pong; 1: one-shot, no ping pong; 2: contin, ping pong; 3: one-shot, ping pong

    DMA2PAD = (volatile unsigned int) &U1TXREG;
    DMA2REQ = 12;             // Select UART1 transmitter

    IFS1bits.DMA2IF = 0;      // Clear DMA interrupt flag
    IEC1bits.DMA2IE = 1;      // Enable DMA interrupt
}
In the DMA interrupt I'm just clearing the flag (a minimal sketch of that ISR appears after the next code block). To build the DMA arrays I've got the following function:
char TXBufferADC[5] __attribute__((space(dma)));
char TXBufferIMU[21] __attribute__((space(dma)));

void UART_send(char *txbuf, char size) {
    // Loop variable.
    int i;

    DMA2CNT = size - 1; // x DMA requests
    if (size == ADCPACKETSIZE) {
        DMA2STA = __builtin_dmaoffset(TXBufferADC);
        for (i = 0; i < size; i++) {
            TXBufferADC[i] = *txbuf++;
        }
    } else if (size == IMUPACKETSIZE) {
        DMA2STA = __builtin_dmaoffset(TXBufferIMU);
        for (i = 0; i < size; i++) {
            TXBufferIMU[i] = *txbuf++;
        }
    } else {
        NOTIFICATIONLED ^= 1;
    }

    DMA2CONbits.CHEN = 1;   // Re-enable DMA2 channel
    DMA2REQbits.FORCE = 1;  // Manual mode: kick-start the first transfer
}
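
For completeness, a minimal sketch of the "just clear the flag" DMA interrupt mentioned above (an editor's addition, using the XC16/C30 vector name for DMA channel 2):

void __attribute__((interrupt, no_auto_psv)) _DMA2Interrupt(void)
{
    IFS1bits.DMA2IF = 0; // Clear the DMA channel 2 interrupt flag
}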
This example is with ping-pong turned off. I'm using the same DMA2STA register but changing the array depending on which packet type I have. I determine the packet type from the data to be sent, change the number of DMA transfers (DMA2CNT), build the array with a for loop as before, then force the first transfer along with re-enabling the channel.
It takes much longer to process the data for the large packet, and I'm starting to think the DMA is firing before I've built the buffer and forced the first transfer, sending null/weird packets in place of the real ones. Perhaps the force isn't necessary for every transfer; I don't know...
Any help would be great.
After a few days of working on this I think I've got it.
The main issue I experienced was that the next transmission was being kicked off before the previous one had finished, so I was only getting segments of packets before the next packet overwrote the previous one. This was solved by simply waiting for the end of the UART transmission with:
while (!U1STAbits.TRMT);
I managed to avoid the redundancy of copying the packet data into a separate DMA buffer by simply making the original data array the one recognised by the DMA.
In the end the process was pretty minimal; the function called every time a packet was created is:
void sendData() {
    // Check that the last transmission has completed.
    while (!U1STAbits.TRMT);

    DMA2CNT = bufferSize - 1;
    DMA2STA = __builtin_dmaoffset(data);

    DMA2CONbits.CHEN = 1;   // Re-enable DMA2 channel
    DMA2REQbits.FORCE = 1;  // Manual mode: kick-start the first transfer
}
Regardless of the packet size, the amount the DMA sends is set via the DMA2CNT register; then it's simply a matter of re-enabling the DMA channel and forcing the first transfer.
Setting up the DMA was:
DMA2CONbits.SIZE = 1;
DMA2CONbits.DIR = 1;
DMA2CONbits.AMODE = 0b00;
DMA2CONbits.MODE = 1;
DMA2PAD = (volatile unsigned int) &U1TXREG;
DMA2REQ = 12; // Select UART1 Transmitter
IFS1bits.DMA2IF = 0; // Clear DMA Interrupt Flag
IEC1bits.DMA2IE = 1; // Enable DMA interrupt
That is one-shot mode, no ping-pong, byte-sized transfers, and all the correct parameters for UART1 TX.
Hope this helps someone in the future; the general principle can be applied to most PIC microcontrollers.
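
As an illustrative usage sketch (an editor's addition; data and bufferSize are the variables from the answer, while the packet layout here is assumed), each packet is assembled directly in the DMA-visible array and then handed to sendData():

// The array the DMA reads from must live in DMA-accessible RAM.
char data[21] __attribute__((space(dma))); // big enough for either packet
unsigned int bufferSize;

void buildAndSendADCPacket(int sample) {
    while (!U1STAbits.TRMT);   // previous frame done: safe to reuse 'data'
    data[0] = 'A';             // header byte (assumed format)
    data[1] = (char)(sample >> 8);
    data[2] = (char)(sample & 0xFF);
    data[3] = '\r';
    data[4] = '\n';
    bufferSize = 5;            // the ADC packet is 5 bytes
    sendData();                // waits on TRMT again, then re-arms DMA2
}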
