Reading data on serial port - byte by byte


How can I read data from a serial port byte by byte?
I have a source that sends packets of varying size. I am reading the data in raw (non-canonical) mode. When I set VMIN, I get a packet of that size or slightly larger.
For example, if the received packet is 46 bytes and I set VMIN to 1, I receive the data in 2 chunks (2 read calls are needed to get the complete data, the first fetching 32 bytes and the next fetching the remaining 14).
If I set VMIN to 46, the complete packet is fetched.
But the problem is the varying packet size. If a packet is larger (say 70 bytes), it messes up the buffer and the following reads, because the first read returns 60+ bytes and the next read returns the rest.
So I am thinking of reading the data byte by byte and determining the end of the packet myself.
Does anyone know if this is doable? Or is there any suggestion on how to read the complete data packet in one read operation?
UART setting:
Baud: 115200
No parity.
1 stop bit.
8N1.
No flow control.
Thanks in advance.

A good approach for processing serial data is to read chunks of data from the port into a buffer and then pull bytes one at a time from the buffer.
Serial port reading is affected by the timeout settings and the incoming data flow, so the number of bytes per read is not guaranteed to be consistent. For example, if you knew that packets were always going to be 46 bytes, then you might think to set VMIN to 46 and expect to get 46 bytes per read. However, if the sending source sends multiple packets without delays between them, then you might get all of one packet and part of another. If the sending source were to pause during the transmission of a packet for longer than the receiving port's timeout, then you would get fewer than VMIN bytes.
Be sure to code for the possibility of lost data. For example, let's say that packets start with a start marker and end with an end marker. You start pulling data from the buffer and the first byte is the start marker, but 49 bytes later you encounter another start marker, meaning a new packet, without having seen the end marker of the previous packet. There should of course also be a CRC for the packet, or at least a checksum.
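A minimal sketch of that idea in C: every byte pulled from the read buffer is fed to a small state machine that resynchronizes when a start marker shows up before the end marker. The marker values (0x02 and 0x03), the MAX_PAYLOAD limit and the framer_* names are assumptions made for illustration, and the sketch assumes the payload itself never contains either marker (there is no escaping and no CRC check here).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STX 0x02          /* assumed start-of-packet marker */
#define ETX 0x03          /* assumed end-of-packet marker */
#define MAX_PAYLOAD 256   /* assumed upper bound on payload size */

typedef struct {
    uint8_t  payload[MAX_PAYLOAD];
    size_t   len;        /* payload bytes collected so far */
    int      in_packet;  /* nonzero once a start marker has been seen */
    unsigned dropped;    /* packets abandoned because data was lost */
} framer_t;

void framer_init(framer_t *f)
{
    assert(f != NULL);
    f->len = 0;
    f->in_packet = 0;
    f->dropped = 0;
}

/* Feed one byte taken from the read buffer.
 * Returns 1 when f->payload holds a complete packet of f->len bytes. */
int framer_feed(framer_t *f, uint8_t byte)
{
    assert(f != NULL);
    if (byte == STX) {
        if (f->in_packet)
            f->dropped++;          /* previous packet never saw its end marker */
        f->in_packet = 1;
        f->len = 0;
        return 0;
    }
    if (!f->in_packet)
        return 0;                  /* noise between packets: ignore */
    if (byte == ETX) {
        f->in_packet = 0;
        return 1;                  /* complete packet available */
    }
    if (f->len == MAX_PAYLOAD) {   /* overlong packet: give up and resync */
        f->in_packet = 0;
        f->dropped++;
        return 0;
    }
    f->payload[f->len++] = byte;
    return 0;
}
```

The caller would loop over whatever read() returned, however many bytes that was, and handle a packet each time framer_feed returns 1.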

Since you are reading data that is structured into packets of variable size, you should add a 2 byte header for each packet and set it to the packet size.
In the reader you would read 2 bytes first and then decide how many bytes to read to receive the whole packet.
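Here is a rough sketch of such a reader, written so that it does not care how the bytes were split across read calls. The big-endian byte order of the 2-byte header, the MAX_PACKET limit and the len_parser_* names are assumptions for illustration; the sender would have to use the same convention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PACKET 512   /* assumed largest payload the receiver accepts */

typedef struct {
    uint8_t data[MAX_PACKET];
    size_t  have;       /* payload bytes stored so far */
    size_t  need;       /* payload length announced by the header */
    int     hdr_bytes;  /* header bytes seen so far: 0, 1 or 2 */
} len_parser_t;

void len_parser_init(len_parser_t *p)
{
    assert(p != NULL);
    p->have = 0;
    p->need = 0;
    p->hdr_bytes = 0;
}

/* Feed one received byte.
 * Returns 1 when p->data holds a complete payload of p->have bytes,
 * -1 when the header announces more than MAX_PACKET bytes, 0 otherwise. */
int len_parser_feed(len_parser_t *p, uint8_t byte)
{
    assert(p != NULL);
    if (p->hdr_bytes < 2) {
        if (p->hdr_bytes == 0) {   /* first header byte: start a new packet */
            p->need = 0;
            p->have = 0;
        }
        p->need = (p->need << 8) | byte;   /* big-endian length (assumed) */
        p->hdr_bytes++;
        if (p->hdr_bytes < 2)
            return 0;
        if (p->need > MAX_PACKET) {
            p->hdr_bytes = 0;
            return -1;
        }
        if (p->need == 0) {        /* empty packet is complete at once */
            p->hdr_bytes = 0;
            return 1;
        }
        return 0;
    }
    p->data[p->have++] = byte;
    if (p->have < p->need)
        return 0;
    p->hdr_bytes = 0;              /* next byte starts a new header */
    return 1;
}
```

With this, VMIN can stay at 1: whether a 46-byte packet arrives as 32 + 14 bytes or in one piece, the parser reports it only once all announced bytes are in.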

Related

TCP fragmentation - unread bytes

A slight variation on this SO question.
Say the receiver expects packets to be at most 100 bytes.
Say at time X there are actually 100 bytes available in the buffer, but for reasons the receiver only determines it needs to read 75 of those.
What happens with data not read from a socket?
Example:
Using the MSG_PEEK flag, the receiver determines that there is a full valid reply of 75 bytes in the buffer. The remaining 25 bytes must be the start of the next packet.
The receiver elects to remove only 75 bytes (i.e. ::recv() without the MSG_PEEK flag) from the buffer, leaving 25 bytes unread/unmoved in the buffer.

there are actually 100 bytes available in the buffer, but for reasons the receiver only determines it needs to read 75 of those.
I guess receiver refers to the application reading from the TCP socket. The remaining 25 bytes simply stay in the socket buffer to be read at some later time. If the socket is closed before that the data is lost.
Using the MSG_PEEK flag, the read data isn't removed from the buffer at all, so it still contains all 100 bytes after reading.
From the application level, you receive a continuous data stream from a TCP socket. If and how the data was segmented or even fragmented for transport doesn't matter and isn't visible to the application. You can read the data in chunks of any size, regardless of how the source application has written it.
Say the receiver expects packets to be at most 100 bytes.
If you are trying to refer to TCP's Maximum Segment Size (MSS): every IPv4 host must be able to accept datagrams of at least 576 bytes, so the default MSS is 576 - 20 (IP header) - 20 (TCP header) = 536 bytes.

STM32 USB CDC Long packet receive

I need to send data from the PC to my STM32F3, so I decided to use the USB peripheral built into the uC.
But now I have a problem: I want to send a large amount of data to the STM32 at once, something like 200-500 bytes.
When I send packets of fewer than 64 characters from the PC with minicom, everything is fine: the callback CDC_Receive_FS(uint8_t* Buf, uint32_t *Len) is called once and sets UsbRxFlag, just to inform the running program that there is data available.
static int8_t CDC_Receive_FS(uint8_t* Buf, uint32_t *Len)
{
    /* USER CODE BEGIN 6 */
    USBD_CDC_SetRxBuffer(&hUsbDeviceFS, &Buf[0]);
    USBD_CDC_ReceivePacket(&hUsbDeviceFS);
    if( (Buf[0] == 'A') & (Buf[1] == 'T') ){
        GPIOB->BSRR = (uint32_t)RX_Led_Pin;
        UsbRxFlag = 1;
    }
    return (USBD_OK);
    /* USER CODE END 6 */
}
But when I try to send more data (just a long text from minicom) to the uC, something weird happens: sometimes the uC doesn't react at all, and sometimes it ignores part of the data.
How can I handle sending more than 64 bytes to the STM32F3 over USB-CDC?

The maximum packet length for full-speed USB communication is 64 bytes. So the data will be transferred in chunks of 64 bytes and needs to be reassembled on the other end.
USB CDC is based on bulk transfer endpoints and implements a data stream (also known as pipe), not a message stream. It's basically a stream of bytes. So if you send 200 bytes, do not expect any indication of where the 200 bytes end. Such information is not transmitted.
Your code looks a bit suspicious:
You probably meant '&&' instead of '&' as pointed out by Reinstate Monica.
Unless you change buffers, USBD_CDC_SetRxBuffer only needs to be called once at initialization.
When CDC_Receive_FS is called, a data packet has already been received. Buf will point to the buffer you have specified with USBD_CDC_SetRxBuffer. Len provides the length of the packet. So the first thing you would do is process the received data. Once the data has been processed and the buffer can be reused again, you would call USBD_CDC_ReceivePacket to indicate that you are ready to receive the next packet. So move USBD_CDC_SetRxBuffer to another function (unless you want to use several buffers) and move USBD_CDC_ReceivePacket to the end of CDC_Receive_FS.
The incorrect order of the function calls could likely have led to the received data being overwritten while you are still processing it.
But the biggest issue is likely that you expect the entire data to be received in a single piece if you sent it as a single piece, or at least to contain an indication of where the piece ends. That's not the case. You will have to implement this yourself.
If you are using a text protocol, you could buffer all incoming data until you detect a line feed. Then you know that you have a complete command and can execute it.
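A hardware-independent sketch of that buffering could look like the following. It would be fed the bytes of each received packet (for instance from inside the receive callback). The names line_buf_t and line_buf_feed, the CMD_MAX limit, and the choice to ignore carriage returns are assumptions for this example, not part of the ST library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CMD_MAX 512   /* assumed longest command, including the terminating NUL */

typedef struct {
    char   line[CMD_MAX];
    size_t len;    /* characters collected for the current line */
    int    done;   /* set once the current line has been reported */
} line_buf_t;      /* initialize with: line_buf_t lb = {0}; */

/* Feed one received byte.
 * Returns 1 when lb->line holds a complete NUL-terminated command
 * of lb->len characters; the next call starts a fresh line. */
int line_buf_feed(line_buf_t *lb, uint8_t byte)
{
    assert(lb != NULL);
    if (lb->done) {                 /* previous line was consumed: start over */
        lb->len = 0;
        lb->done = 0;
    }
    if (byte == '\r')
        return 0;                   /* assumption: ignore CR, terminate on LF */
    if (byte == '\n') {
        lb->line[lb->len] = '\0';
        lb->done = 1;
        return 1;
    }
    if (lb->len < CMD_MAX - 1)      /* bytes beyond the limit are dropped */
        lb->line[lb->len++] = (char)byte;
    return 0;
}
```

Because the state lives in the struct, a 200-byte command arriving as four USB packets is assembled across four callback invocations, and the "AT" check can then be done on the complete line.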

The following is a general purpose implementation for reading an arbitrary number of bytes: https://github.com/philrawlings/bluepill-usb-cdc-test.
The full code is a little too long to post here, but this essentially modifies usb_cdc_if.c to create a circular buffer and exposes additional functions (CDC_GetRxBufferBytesAvailable_FS(), CDC_ReadRxBuffer_FS() and CDC_FlushRxBuffer_FS()) which can be consumed from main.c. The readme.md text shown on the main page describes all the code changes required.
As mentioned by @Codo, you will need to either add termination characters to your source data, or include a "length" value (which itself would be a fixed number of bytes) at the beginning to indicate how many bytes are in the data payload.

How can I clear UDP buffer without recvfrom?

I have an embedded Linux project. It receives data via UDP into a static char array of 20000 bytes. I want to ignore UDP messages that exceed this size. But when a bigger datagram arrives, it stays in the UDP buffer forever, since it is never read with recvfrom. Is there any way to clear this bigger datagram from the UDP buffer?

One cannot discard data from the socket buffer without reading it. But one can read these large datagrams even with a smaller buffer: anything that does not fit into the given buffer is simply discarded. To find out whether the datagram was too large, pass the MSG_TRUNC flag so that recvfrom returns the original length of the datagram. If this indicates an oversized datagram, just discard it and continue with the next one.

Isn't recv() in C socket programming blocking?

In Receiver, I have
recvfd=accept(sockfd,&other_side,&len);
while(1)
{
recv(recvfd,buf,MAX_BYTES-1,0);
buf[MAX_BYTES]='\0';
printf("\n Number %d contents :%s\n",counter,buf);
counter++;
}
In Sender , I have
send(sockfd,mesg,(size_t)length,0);
send(sockfd,mesg,(size_t)length,0);
send(sockfd,mesg,(size_t)length,0);
MAX_BYTES is 1024 and the length of mesg is 15. Currently, recv is called only once. I want recv to be called three times, once for each corresponding send. How do I achieve this?

In short: yes, it is blocking. But not in the way you think.
recv() blocks until any data is readable. But you don't know the size in advance.
In your scenario, you could do the following:
1. Call select() and put the socket you want to read from into the read FD set.
2. When select() returns a positive number, your socket has data ready to be read.
3. Check whether you can receive length bytes from the socket: recv(recvfd, buf, MAX_BYTES-1, MSG_PEEK). See man recv(2) for the MSG_PEEK parameter, or look at MSDN; they have it as well.
4. Now you know how much data is available.
5. If fewer than length bytes are available, return and do nothing.
6. If at least length bytes are available, read length bytes and return. (If more than length bytes are available, we continue with step 2, since select() will signal a new read event.)

To send discrete messages over a byte stream protocol, you have to encode messages into some kind of framing language. The network can chop up the protocol into arbitrarily sized packets, and so the receives do not correlate with your messages in any way. The receiver has to implement a state machine which recognizes frames.
A simple framing protocol is to have some length field (say two octets: 16 bits, for a maximum frame length of 65535 bytes). The length field is followed by exactly that many bytes.
You must not even assume that the length field itself is received all at once. You might ask for two bytes, but recv could return just one. This won't happen for the very first message received from the socket, because network (or local IPC pipe, for that matter) segments are never just one byte long. But somewhere in the middle of the stream, it is possible that the first byte of the 16-bit length field lands in the last position of one network frame.
An easy way to deal with this is to use a buffered I/O library instead of raw operating system file handles. In a POSIX environment, you can take an open socket handle, and use the fdopen function to associate it with a FILE * stream. Then you can use functions like getc and fread to simplify the input handling (somewhat).
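As a sketch of the buffered approach: wrap the socket once with fdopen, then let fread worry about short reads. The function names wrap_socket and read_frame and the big-endian two-octet length are assumptions for illustration.

```c
#define _POSIX_C_SOURCE 200809L   /* for fdopen */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Wrap an already-connected stream socket in a buffered FILE *.
 * Do this once per connection; all later reads must go through the
 * FILE *, because stdio may already hold bytes of the next frame. */
FILE *wrap_socket(int sockfd)
{
    return fdopen(sockfd, "rb");
}

/* Read one frame: a 2-byte big-endian length (assumed byte order)
 * followed by exactly that many payload bytes.
 * Returns the payload length, or -1 on EOF, error, or if the frame
 * does not fit into buf. */
long read_frame(FILE *in, uint8_t *buf, size_t cap)
{
    assert(in != NULL && buf != NULL);
    uint8_t hdr[2];
    if (fread(hdr, 1, sizeof hdr, in) != sizeof hdr)
        return -1;                 /* fread keeps reading until 2 bytes or EOF */
    size_t len = ((size_t)hdr[0] << 8) | hdr[1];
    if (len > cap)
        return -1;
    if (fread(buf, 1, len, in) != len)
        return -1;
    return (long)len;
}
```

read_frame takes any FILE *, so the same code can be exercised against a regular file while testing.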
If in-band framing is not acceptable, then you have to use a protocol which supports framing, namely datagram type sockets. The main disadvantage of this is that the principal datagram-based protocol used over IP is UDP, and UDP is unreliable. This brings in a lot of complexity in your application to deal with out of order and missing frames. The size of the frames is also restricted by the maximum IP datagram size which is about 64 kilobytes, including all the protocol headers.
Large UDP datagrams get fragmented, which, if there is unreliability in the network, adds up to greater unreliability: if any IP fragment is lost, the entire packet is lost. All of it must be retransmitted; there is no way to just get a repetition of the fragment that was lost. The TCP protocol performs "path MTU discovery" to adjust its segment size so that IP fragmentation is avoided, and TCP has selective retransmission to recover missing segments.

I bet you've created a TCP socket using SOCK_STREAM, which would cause the three messages to be read into your buffer during the first recv call. If you want to read the messages one by one, create a UDP socket using SOCK_DGRAM, or develop some type of message format which allows you to parse your messages when they arrive in a stream (assuming your messages will not always be fixed length).

First send the length of the message, encoded in a fixed number of bytes, then make recv() loop until that many bytes have been received.
Note (as already mentioned in other answers) that the size and number of chunks received do not necessarily match what was sent. Only the total number of bytes received will be the same as the total number of bytes sent.
Read the man pages for recv and send. Especially read the sections on what those functions RETURN.
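A minimal sketch of such a loop, assuming a 2-byte big-endian length prefix; the names recv_exact and recv_message are made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Keep calling recv() until exactly n bytes have arrived.
 * Returns 0 on success, -1 on error or if the peer closed early. */
int recv_exact(int fd, void *buf, size_t n)
{
    assert(buf != NULL || n == 0);
    uint8_t *p = buf;
    while (n > 0) {
        ssize_t got = recv(fd, p, n, 0);
        if (got == 0)
            return -1;             /* connection closed before n bytes */
        if (got < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;
        }
        p += got;                  /* recv may return fewer bytes than asked */
        n -= (size_t)got;
    }
    return 0;
}

/* Receive one message: 2-byte big-endian length (assumed format),
 * then the payload. Returns the payload length, or -1 on failure
 * or if the payload does not fit into buf. */
long recv_message(int fd, uint8_t *buf, size_t cap)
{
    uint8_t hdr[2];
    if (recv_exact(fd, hdr, sizeof hdr) != 0)
        return -1;
    size_t len = ((size_t)hdr[0] << 8) | hdr[1];
    if (len > cap)
        return -1;
    if (recv_exact(fd, buf, len) != 0)
        return -1;
    return (long)len;
}
```

With the sender prefixing each 15-byte mesg with its length, three calls to recv_message return the three messages one by one, no matter how the stream was chunked.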

By default, recv blocks until at least some data is available or the socket is closed; it waits for the entire buffer to be filled only if the MSG_WAITALL flag is set.
If you want to read length bytes and return, then you must only pass to recv a buffer of size length.
You can use select to determine if
there are any bytes waiting to be read,
how many bytes are waiting to be read, then
read only those bytes
This can keep recv from blocking.
Edit:
After re-reading the docs, the following may be true: your three "messages" may be being read all-at-once since length + length + length < MAX_BYTES - 1.
Another possibility, if recv is never returning, is that you may need to flush your socket from the sender-side. The data may be waiting in a buffer to actually be sent to the receiver.

libpcap format - packet header - incl_len / orig_len

The libpcap packet header structure has 2 length fields:
typedef struct pcaprec_hdr_s {
guint32 ts_sec; /* timestamp seconds */
guint32 ts_usec; /* timestamp microseconds */
guint32 incl_len; /* number of octets of packet saved in file */
guint32 orig_len; /* actual length of packet */
} pcaprec_hdr_t;
incl_len: the number of bytes of packet data actually captured and saved in the file. This value should never become larger than orig_len or the snaplen value of the global header.
orig_len: the length of the packet as it appeared on the network when it was captured. If incl_len and orig_len differ, the actually saved packet size was limited by snaplen.
Can anyone tell me what the difference between the 2 length fields is? We are saving the packet in its entirety, so how can the 2 differ?

Reading through the documentation at the Wireshark wiki ( http://wiki.wireshark.org/Development/LibpcapFileFormat ) and studying an example pcap file, it looks like incl_len and orig_len are usually the same quantity. The only time they will differ is if the length of the packet exceeded the size of snaplen, which is specified in the global header for the file.
I'm just guessing here, but I imagine that snaplen specifies the size of the static buffer used for capturing. In the event that a packet was too large for the capture buffer, this is the format's method for signaling that fact. snaplen is documented to "usually" be 65535, which is large enough for most packets. But the documentation stipulates that the size might be limited by the user.

"Can anyone tell me what the difference between the 2 length fields is? We are saving the packet in its entirety, so how can the 2 differ?"
If you're saving the entire packet, the 2 shouldn't differ.
However, if, for example, you run tcpdump or TShark or dumpcap or a capture-from-the-command-line Wireshark and specify a small value with the "-s n" flag, or specify a small value in the "Limit each packet to [n] bytes" option in the Wireshark GUI, then libpcap/WinPcap will be passed that value and will only supply the first n bytes of each packet to the program, and the entire packet won't be saved.
A limited "snapshot length" means you don't see all the packet data, so some analysis might not be possible, but means that less memory is needed in the OS to buffer packets (so fewer packets might be dropped), and less CPU bandwidth is needed to copy packet data to the application and less disk bandwidth is needed to save packets to disk if the application is saving them (which might also reduce the number of packets dropped), and less disk space is needed for the saved packets.
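As a small illustration of how the two fields relate, the sketch below pulls a record header out of raw file bytes and checks whether the packet was cut by the snapshot length. It uses uint32_t in place of guint32, assumes the file was written in the reader's byte order (no swapping is done), and the function names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct pcaprec_hdr_s {
    uint32_t ts_sec;    /* timestamp seconds */
    uint32_t ts_usec;   /* timestamp microseconds */
    uint32_t incl_len;  /* number of octets of packet saved in file */
    uint32_t orig_len;  /* actual length of packet */
} pcaprec_hdr_t;

/* Copy the 16-byte record header out of raw file bytes.
 * Assumption: the capture file uses the host's byte order. */
pcaprec_hdr_t parse_record_header(const uint8_t raw[16])
{
    assert(raw != NULL);
    pcaprec_hdr_t h;
    memcpy(&h, raw, sizeof h);
    return h;
}

/* Returns 1 if only part of the packet was saved (snaplen cut it off),
 * 0 if the whole packet is in the file. */
int record_was_truncated(const pcaprec_hdr_t *h)
{
    assert(h != NULL);
    return h->incl_len < h->orig_len;
}
```

In either case, exactly incl_len bytes of packet data follow the header in the file, so that is the value to use when skipping to the next record.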