Receive message of undefined size in UART in C

I'm writing my own drivers for LPC2148 and a question came to mind.
How do I receive a message of unspecified size in UART?
The only two things that come to mind are: 1) configure a watchdog and end the reception when the time runs out; 2) require that every message sent to it ends with an end-of-message character.
The first choice seems better in my opinion, but I'd like to know if anybody has a better answer, and I know there must be.
Thank you very much

Just give the caller whatever bytes you have received so far. The UART driver shouldn't try to implement the application protocol; the application should do that.

That looks like the wrong use for a watchdog. I ended up with three solutions for this problem:
Use fixed-size packets and DMA; so, you receive one packet per transaction. Apparently, it is not possible in your case.
Receive the message char by char until the end-of-message character arrives. Somewhat error-prone, since the EOM character may also appear in the data.
Use a fixed-size header before every packet. In the header, store payload size and/or message type ID.
The third approach is probably the best one. You may combine it with the first one, i.e. use DMA to receive header and then data (in the second transaction, after the data size is known from the header). It is also one of the most flexible approaches.
One more thing to worry about is keeping the byte stream in sync. There may be rubbish lying in the UART input buffers, which may get read as data, or you may get only part of a packet after your MCU is powered up (i.e. the beginning of the packet had already been sent by then). To avoid that, you can add magic bytes to your packet header, and probably a CRC.
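For illustration, a minimal sketch of such a header in C (field names, widths and the magic value are my own choices, not anything from the question):

#include <stdint.h>

#define PKT_MAGIC 0xA55Au   /* arbitrary sync marker; pick anything unlikely */

/* Fixed-size header sent before every payload. Both sides must agree on
 * byte order and struct packing, or serialize the fields explicitly. */
typedef struct {
    uint16_t magic;     /* sync marker used to re-align the byte stream */
    uint8_t  type;      /* message type ID                              */
    uint8_t  length;    /* number of payload bytes that follow          */
    uint16_t crc;       /* CRC over type, length and the payload        */
} packet_header_t;

The receiver first reads sizeof(packet_header_t) bytes (by DMA or interrupt), checks the magic and the CRC, and only then reads exactly length payload bytes.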
EDIT
OK, one more option :) Just store everything you receive in a growing buffer for later use. That is basically what PC drivers do.

Real embedded UART drivers usually use a ring buffer. Bytes are stored in order, and the clients promise to read from the buffer before it's full.
A state machine can then process the message in multiple passes, with no need for a watchdog to tell it reception is over.
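On the LPC2148 (or any small MCU) that can be as simple as the following sketch; the names and the buffer size are illustrative, not from any particular driver:

#include <stdint.h>

#define RB_SIZE 256u   /* power of two, so wrap-around is a cheap mask */

typedef struct {
    volatile uint16_t head;    /* written by the UART receive ISR */
    volatile uint16_t tail;    /* written by the consumer         */
    uint8_t data[RB_SIZE];
} ringbuf_t;

/* Called from the receive interrupt; returns -1 if the buffer is full. */
static inline int rb_put(ringbuf_t *rb, uint8_t byte)
{
    uint16_t next = (rb->head + 1u) & (RB_SIZE - 1u);
    if (next == rb->tail)
        return -1;                    /* overflow: drop or flag an error */
    rb->data[rb->head] = byte;
    rb->head = next;
    return 0;
}

/* Called from the main loop or state machine; returns -1 when empty. */
static inline int rb_get(ringbuf_t *rb)
{
    if (rb->tail == rb->head)
        return -1;
    uint8_t byte = rb->data[rb->tail];
    rb->tail = (rb->tail + 1u) & (RB_SIZE - 1u);
    return byte;
}

The state machine then calls rb_get() whenever it likes and interprets the bytes at its own pace.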

Better to go for option 2: append an end-of-transmission character to the transmitted string.
But I suggest adding a start-of-transmission character as well, to validate that you are receiving an actual transmission.

A watchdog timer is used to reset the system when the device behaves unexpectedly. I think it is better to use a buffer that can store the amount of data your application requires.

Related

Sending variable sized packets over the network using TCP/IP

I want to send variable sized packets between two Linux machines over an internal network. The packet is variable sized and its length and CRC are indicated in the header, which is also sent along with the packet. Something roughly like:
struct hdr {
    uint32 crc;
    uint32 dataSize;
    void  *data;
};
I'm using a CRC at the application layer to overcome the inherent limitation of TCP checksums.
The problem I have is that there is a chance the dataSize field itself gets corrupted, in which case I don't know where the next packet starts. Because at the receiver, when I read the socket buffer, I read n such packets next to one another, dataSize is the only way I can get to the next packet correctly.
Some ideas I have are to:
Restart the connection if a CRC mismatch occurs.
Aggregate X such packets into one big packet of fixed size and discard the big packet if any CRC error is detected. The big packet is to make sure we lose at most one packet's worth of data in case of errors.
Any other ideas for these variable sized packets?
Since TCP is stream based, a data length is the generally used way to extract one full message for processing at the application. If you believe that the length field itself is wrong for some reason, there is not much you can do except discard the packet, "flush" the connection, and expect the sender and receiver to re-sync. The best option is to disconnect the line, unless there is a protocol at the application layer to re-sync the connection.
Another method, other than length bytes, is to use markers: Start-of-Message and End-of-Message. When the application encounters Start-of-Message it should start collecting data until the End-of-Message byte is received, and then process the message. This requires that the markers be escaped appropriately within the message.
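A rough sketch of such marker escaping (byte stuffing); the marker and escape values here are arbitrary, chosen in the spirit of HDLC/PPP framing:

#include <stdint.h>
#include <stddef.h>

enum { SOM = 0x7E, EOM = 0x7F, ESC = 0x7D, ESC_XOR = 0x20 };

/* Wrap src in start/end markers; any marker or escape byte occurring in the
 * payload is replaced by ESC followed by (byte ^ ESC_XOR).
 * dst must hold at least 2*len + 2 bytes; returns the encoded length. */
size_t frame_encode(const uint8_t *src, size_t len, uint8_t *dst)
{
    size_t n = 0;
    dst[n++] = SOM;
    for (size_t i = 0; i < len; i++) {
        uint8_t b = src[i];
        if (b == SOM || b == EOM || b == ESC) {
            dst[n++] = ESC;
            dst[n++] = (uint8_t)(b ^ ESC_XOR);
        } else {
            dst[n++] = b;
        }
    }
    dst[n++] = EOM;
    return n;
}

The decoder does the reverse: discard everything until SOM, un-escape ESC pairs, and hand the message over when EOM arrives.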
I think you are dealing with second-order error possibilities, while the major risk is somewhere else.
When we used serial line transmissions, errors were frequent (one or two every several kilobytes). We used good old Kermit with a CRC and a packet size of about 100 bytes, and that was enough: many times I encountered a failed transfer because the line dropped, but never a correct transfer with a bad file.
With current networks, unless you have very, very poor lines, the hardware level is not that bad, and anyway the layer 2 data link already has a checksum to verify that each packet was not modified between two nodes. HDLC is commonly used at that level, and it normally uses a CRC16 or CRC32 checksum, which is quite robust.
So the checksum at the TCP level is not meant to detect random errors in the byte stream, but is simply a last line of defense against unexpected errors, for example if a router goes mad because of an electrical shock and sends pure garbage. I do not have any statistical data on it, but I am pretty sure that the number of errors reaching the TCP level is already very, very low. Said differently, do not worry about that: unless you are dealing with highly sensitive data - and in that case I would prefer to have two different channels, one for the data, the other for a global checksum - TCP/IP is enough.
That being said, adding a control at the application level as an ultimate defense is perfectly acceptable. It will only catch errors that went undetected at the data link and TCP levels, or more probably errors in the peer application (who wrote it and how was it tested?). So the probability of getting an error is low enough to use a very rough recovery procedure:
close the connection
open a new one
restart after the last correctly exchanged packet (if that makes sense), or simply continue sending new packets if you can
But the risk of a physical disconnection, or a power outage anywhere in the network, is much higher, not to mention a flaw in the application-level implementations...
And do not forget to fully specify the byte order and the sizes of the crc and dataSize fields...

What's the 'safest' way to read from this buffer?

I'm trying to read and write a serial port in Linux (Ubuntu 12.04) where a microcontroller on the other end blasts 1 or 3 bytes whenever it finishes a certain task. I'm able to successfully read and write to the device, but the problem is my reads are a little 'dangerous' right now:
do
{
    nbytes = read(fd, buffer, sizeof(buffer));
    usleep(50000);
} while (nbytes == -1);
I.e. to simply monitor what the device is sending me, I poll the buffer every half second. If it's empty, it idles in this loop. If it receives something or errors out, it kicks out. Some logic then processes the 1- or 3-byte packet and prints it to a terminal. A half second is usually a long enough window for something to fully appear in the buffer, but quick enough that a human who will eventually see it doesn't think it's slow.
'Usually' is the keyword. If I read the buffer in the middle of it blasting 3 bytes, I'll get a bad read; the buffer will have either 1 or 2 bytes in it and it'll get rejected in the packet processing (if I catch the first byte of a 3-byte packet, it won't be a purposely-sent one-byte value).
Solutions I've considered/tried:
I've thought of simply reading in one byte at a time and feeding in additional bytes if it's part of a 3-byte transmission. However, this creates some ugly loops (as read() only returns the number of bytes from the most recent read) that I'd like to avoid if I can.
I've tried to read 0 bytes (e.g. nbytes = read(fd, buffer, 0);) just to see how many bytes are currently in the buffer before I try to load it into my own buffer, but as I suspected it just returns 0.
It seems like a lot of my problems would be easily solved if I could peek into the contents of the port buffer before I load it into a buffer of my own. But read() is destructive up to the number of bytes that you tell it to read.
How can I read from this buffer such that I don't do it in the middle of receiving a transmission, but do it fast enough to not appear slow to a user? My serial messenger is divided into a sender and receiver thread, so I don't have to worry about my program loop blocking somewhere and neglecting the other half.
Thanks for any help.
Fix your packet processing. I always end up using a state machine for instances like this, so that if I get a partial message, I remember (stateful) where I left off processing and can resume when the rest of the packet arrives.
Typically I have to verify a checksum at the end of the packet, before proceeding with other processing, so "where I left off processing" is always "waiting for checksum". But I store the partial packet, to be used when more data arrives.
Even though you can't peek into the driver buffer, you can load all those bytes into your own buffer (in C++ a deque is a good choice) and peek into that all you want.
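As a concrete sketch of that idea for the 1-or-3-byte protocol described above (the rule for telling the two packet kinds apart is hypothetical, since the question doesn't say how they differ):

#include <stdint.h>
#include <stddef.h>

/* Hypothetical rule: bytes >= 0x80 start a 3-byte packet, anything else is a
 * complete 1-byte packet. Substitute the device's real rule. */
static size_t expected_len(uint8_t first) { return (first >= 0x80) ? 3 : 1; }

static uint8_t pkt[3];
static size_t  pkt_have;   /* bytes collected so far (the "state") */

/* Feed every received byte here, in order, no matter how read() chopped
 * the stream; partial packets simply wait for the next read. */
void feed_byte(uint8_t b)
{
    pkt[pkt_have++] = b;
    if (pkt_have == expected_len(pkt[0])) {
        /* handle_packet(pkt, pkt_have);   application-specific */
        pkt_have = 0;                      /* ready for the next packet */
    }
}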
You need to know how large the messages being sent are. There are a couple of ways to do that:
Prefix the message with the length of the message.
Have a message-terminator, a byte (or sequence of bytes) that can not be part of a message.
Use the "command" to calculate the length, i.e. when you read a command-byte you know how much data should follow, so read that amount.
The second method is best for cases where you can fall out of sync, because then you can read until you get the message-terminator sequence and be sure that the next bytes belong to a new message.
You can of course combine these methods.
To poll a device, you are better off using a multiplexing syscall like poll(2), which succeeds when some data is available for reading from that device. Notice that poll is multiplexing: you can poll several file descriptors at once, and poll will succeed as soon as one (any) file descriptor is readable with POLLIN (or writable, if so asked with POLLOUT, etc...).
Once poll has succeeded for some fd on which you requested POLLIN, you can read(2) from that fd.
Of course, you need to know the conventions used by the hardware device for its messages. Notice that a single read could get several messages, or only a part of one (or more). There is no way to prevent reading partial messages (or "packets"), probably because your PC's serial I/O is much faster than the serial I/O inside your microcontroller. You have to deal with that by knowing the conventions defining the messages (and if you can change the software inside the microcontroller, define an easy convention for them) and implementing the appropriate state machine, buffering, etc...
NB: There is also the older select(2) syscall for multiplexing, which has limitations related to the C10K problem. I recommend poll instead of select in new code.
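A minimal sketch of that pattern for the serial fd in the question (the 500 ms timeout is just an example value):

#include <poll.h>
#include <unistd.h>

/* Block until the fd has data (or the timeout expires), then read whatever
 * is there; the result may still be only part of a message. */
ssize_t read_when_ready(int fd, unsigned char *buf, size_t bufsize)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int rc = poll(&pfd, 1, 500);            /* timeout in milliseconds */
    if (rc > 0 && (pfd.revents & POLLIN))
        return read(fd, buf, bufsize);
    return rc;                              /* 0 on timeout, -1 on error */
}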

Padding data over TCP

I am working on a client-server project and need to implement a logic where I need to check whether I have received the last data over a TCP socket connection, before I proceed.
To make sure that I have received all the data, I am planning to append a flag to the last packet sent. I had two options in mind, as below, along with their related problems.
i. Use a struct as below, populate vst_pad for the last packet sent, and check for its presence on the recv side. The advantage over option two is that I don't have to remove the flag from the actual data before writing it to a file; just check the first member of the struct.
typedef struct
{
    /* String holding padding for last packet when socket is changed */
    char vst_pad[10];
    /* Pointer to data being transmitted */
    char *vst_data;
    //unsigned char vst_data[1];
} st_packetData;
The problem is that I have to serialize the struct on every send call. Also, I am not sure whether I will receive the entire struct over TCP in one recv call, and so I have to add logic/overhead to check this every time. I have implemented this so far, but figured out later that stream-based TCP may not guarantee receiving the entire struct in one call.
ii. Use a function like strncat to append that flag to the end of the last data being sent.
The problem is that on every receive call I have to check for the presence of that flag, either with regex functions or a function like strstr, and if it is present, remove it from the data.
This application is going to be used for large data transfers, and hence I want to add minimal overhead on every send/recv/read/write call. I would really appreciate knowing if there is a better option than the above two, or any other option to check the receipt of the last packet. The program is multithreaded.
Edit: I do not know the total size of the file I am going to send, but I am sending a fixed amount of data. That is, fgets reads until the specified size minus 1, or until a newline is encountered.
Do you know the size of the data in advance, and is it a requirement that you implement an end-of-message flag?
Because I would simplify the design: add a 4-byte header (assuming you're not sending more than 4 GB of data per message) that contains the expected size of the message.
Thus you parse out the first 4 bytes, calculate the size, then continue calling recv until you get that much data.
You'll need to handle the case where your recv call gets data from the next message, and obviously error handling.
Another issue not raised with your 10-byte pad solution is what happens if the actual message contains 10 zero bytes (assuming you're padding with zeros). You'd need to escape those 10 bytes of zeros, otherwise you may mistakenly truncate the message.
Using a fixed sized header and a known size value will alleviate this problem.
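A sketch of that receive side, assuming a 4-byte big-endian length prefix (function names are mine, and error handling is minimal):

#include <stdint.h>
#include <stddef.h>
#include <sys/socket.h>
#include <arpa/inet.h>   /* ntohl */

/* Read exactly len bytes, looping over short recv() results. */
static int recv_all(int sock, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = recv(sock, p, len, 0);
        if (n <= 0)
            return -1;              /* error or peer closed the connection */
        p   += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Receive one message: 4-byte network-order size, then that many bytes.
 * A size larger than the caller's buffer is treated as an error (or a
 * corrupted header). */
int recv_message(int sock, void *payload, uint32_t max, uint32_t *out_len)
{
    uint32_t netlen;
    if (recv_all(sock, &netlen, sizeof netlen) < 0)
        return -1;
    uint32_t len = ntohl(netlen);
    if (len > max)
        return -1;
    if (recv_all(sock, payload, len) < 0)
        return -1;
    *out_len = len;
    return 0;
}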
For a message (data packet), first send a short (in network order) holding the size, followed by the data. This can be achieved in one write system call.
On the receiving end, just read the short and convert it back into host order (this lets you use different processors at a later stage). You can then read the rest of the data.
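One way to push both pieces out with a single system call is writev(), instead of packing them into one buffer and calling write(); a sketch, with the 2-byte length in network order as suggested above:

#include <stdint.h>
#include <sys/types.h>
#include <sys/uio.h>     /* writev */
#include <arpa/inet.h>   /* htons  */

/* Send one message as a 2-byte network-order length followed by the data.
 * Short-write handling is omitted for brevity. */
ssize_t send_message(int sock, const void *data, uint16_t len)
{
    uint16_t netlen = htons(len);
    struct iovec iov[2] = {
        { .iov_base = &netlen,      .iov_len = sizeof netlen },
        { .iov_base = (void *)data, .iov_len = len           },
    };
    return writev(sock, iov, 2);
}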
In such cases, it's common to block up the data into chunks and provide a chunk header as well as a trailer. The header contains the length of the data in the chunk and so the peer knows when the trailer is expected - all it has to do is count rx bytes and then check for a valid trailer. The chunks allow large data transfers without huge buffers at both ends.
It's no great hassle to add a 'status' byte in the header that can identify the last chunk.
An alternative is to open another data connection, stream the entire serialization and then close this data connection, (like FTP does).
Could you make use of an open-source network communication library written in C#? If so, check out networkComms.net.
If this is truly the last data sent by your application, use shutdown(socket, SHUT_WR); on the sender side.
This will set the FIN TCP flag, which signals that the sender->receiver stream is over. The receiver will know this because his recv() will return 0 (just like an EOF condition) when everything has been received. The receiver can still send data afterward, and the sender can still listen for them, but it cannot send more using this connection.
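On the receiving side, the end of the stream then looks like this (a sketch; the buffer size is arbitrary):

#include <sys/types.h>
#include <sys/socket.h>

/* Keep reading until the peer's shutdown(SHUT_WR) shows up as recv() == 0. */
void drain_until_eof(int sock)
{
    char buf[4096];
    ssize_t n;
    while ((n = recv(sock, buf, sizeof buf, 0)) > 0) {
        /* process n bytes of payload here */
    }
    /* n == 0: orderly end of stream; n < 0: error (check errno) */
}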

Buffering of stream data

I'm trying to develop a simple IRC bot. First I want to think out a proper design for this project. One of the things I'm wondering about right now is the read mechanism. I'm developing this bot on a Linux system (Fedora 12). To read from a socket I use the system call read(). I plan to use the reading functionality in the following way (the code is just an example, not something from the final product):
while (uBytesRead = read(iServerSocket, caBuffer, MAX_MESSAGE_SIZE))
{
    // 1. Parse the buffer and place it into a Message structure.
    // 2. Add the message structure to a linked list that will act as a queue of messages to be processed.
}
This code will run in its own thread. I chose this option because I wanted as small a delay between reads as possible (writes will be implemented in the same way). This is all somewhat based on assumptions that I would like to clear up. My question is: what if data arrives so quickly that reading and processing it (in this case just parsing it) is slower than the rate at which it comes in? I made the assumption that this data will be buffered by the system. Is that a correct assumption? And if so:
How big is this buffer?
What happens with incoming data when this buffer gets full?
How can I best protect my application against spam?
I hope I've explained my issue clear enough.
Thanks in advance.
IRC uses TCP sockets for networking. Linux/Posix TCP sockets have a data buffer for sending and another one for receiving. You can resize the buffers with setsockopt() and SO_SNDBUF/SO_RCVBUF.
TCP has flow control, so when a receive buffer is getting full the receiving OS advertises a smaller window to the sender. Packets that didn't fit in the buffer will not be acknowledged by the receiver and will eventually be retransmitted by the sender.
So that's nothing to worry about. What matters is what the sender program does when its socket's send buffer gets full. Some programs will close the socket, others just discard the data and try again, while others might buffer internally.
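For completeness, a sketch of resizing the receive buffer with SO_RCVBUF (the 256 KB figure is just an example; the kernel may round or cap the value):

#include <stdio.h>
#include <sys/socket.h>

void grow_rcvbuf(int sock)
{
    int size = 256 * 1024;                   /* requested size in bytes */
    if (setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &size, sizeof size) < 0)
        perror("setsockopt(SO_RCVBUF)");

    socklen_t len = sizeof size;             /* read back what was granted */
    if (getsockopt(sock, SOL_SOCKET, SO_RCVBUF, &size, &len) == 0)
        printf("receive buffer is now %d bytes\n", size);
}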

Data structure for storing serial port data in firmware

I am sending data from a linux application through serial port to an embedded device.
In the current implementation a byte circular buffer is used in the firmware. (Nothing but an array with a read and write pointer)
As the bytes come in, they are written to the circular buffer.
Now the PC application appears to be sending the data too fast for the firmware to handle. Bytes are missed, resulting in the firmware returning WRONG_INPUT too many times.
I think baud rate (115200) is not the issue. A more efficient data structure at the firmware side might help. Any suggestions on choice of data structure?
A circular buffer is the best answer. It is the easiest way to model a hardware FIFO in pure software.
The real issue is likely to be either the way you are collecting bytes from the UART to put in the buffer, or overflow of that buffer.
At 115200 baud with the usual 1 start bit, 1 stop bit and 8 data bits, you can see as many as 11520 bytes per second arrive at that port. That gives you an average of just about 86.8 µs per byte to work with. In a PC, that will seem like a lot of time, but in a small microprocessor, it might not be all that many total instructions or in some cases very many I/O register accesses. If you overfill your buffer because bytes are arriving on average faster than you can consume them, then you will have errors.
Some general advice:
Don't do polled I/O.
Do use a Rx Ready interrupt.
Enable the receive FIFO, if available.
Empty the FIFO completely in the interrupt handler.
Make the ring buffer large enough.
Consider flow control.
Sizing your ring buffer large enough to hold a complete message is important. If your protocol has known limits on the message size, then you can use the higher levels of your protocol to do flow control and survive without the pains of getting XON/XOFF flow to work right in all of the edge cases, or RTS/CTS to work as expected in both ends of the wire which can be nearly as hairy.
If you can't make the ring buffer that large, then you will need some kind of flow control.
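A sketch of the interrupt side of that advice; the register names are placeholders, since the real ones depend on the UART in use:

#include <stdint.h>

extern volatile uint8_t UART_LSR;        /* placeholder: line status, bit 0 = data ready */
extern volatile uint8_t UART_RBR;        /* placeholder: receive buffer register         */
#define LSR_RX_READY 0x01u

#define RX_BUF_SIZE 512u                 /* power of two, sized for one burst */
static volatile uint8_t  rx_buf[RX_BUF_SIZE];
static volatile uint16_t rx_head, rx_tail;

/* RX interrupt: drain the hardware FIFO completely into the ring buffer,
 * then return quickly; the main loop consumes via rx_tail. */
void uart_rx_isr(void)
{
    while (UART_LSR & LSR_RX_READY) {
        uint8_t byte = UART_RBR;
        uint16_t next = (rx_head + 1u) & (RX_BUF_SIZE - 1u);
        if (next != rx_tail) {           /* otherwise drop the byte and count an overrun */
            rx_buf[rx_head] = byte;
            rx_head = next;
        }
    }
}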
There is nothing better than a circular buffer.
You could use a slower baud rate or speed up the application in the firmware so that it can handle data coming at full speed.
If the output of the PC is in bursts it may help to make the buffer big enough to handle one burst.
The last option is to implement some form of flow control.
What do you mean by embedded device? I think most current DSPs and processors can easily handle this kind of load. The problem is not with the circular buffer, but with how you collect bytes from the serial port.
Does your UART have a hardware FIFO? If yes, then you should enable it. If you have an interrupt per byte, you can quickly get into trouble, especially if you are working with an OS or with virtual memory, where the IRQ cost can be quite high.
If your receiving firmware is very simple (no multitasking), and you don't have a hardware FIFO, polled mode can be a better solution than interrupt-driven, because then your processor is doing only UART data reception, and you have no interrupt overhead.
Another problem might be with the transfer protocol. For example, if you have a long packet of data that you have to checksum, and you compute the whole checksum at the end of the packet, then all of the processing time falls at the end of the packet, and that is why you may miss the beginning of the next one.
So the circular buffer is fine, and you have two ways to improve:
- The way you interact with the hardware
- The protocol (packet length, acknowledgment, etc.)
Before trying to solve the problem, first you need to establish what the problem really is. Otherwise you might waste time trying to fix something that isn't actually broken.
Without knowing more about your set-up it's hard to give more specific advice. But you should investigate further to establish what exactly the hardware and software is currently doing when the bytes come in, and then what is the weak point where they're going missing.
A circular buffer with Interrupt driven IO will work on the smallest and slowest of embedded targets.
First try it at the lowest baud rate and only then try at high speeds.
Using a circular buffer in conjunction with an IRQ is an excellent suggestion. If your processor generates an interrupt each time a byte is received, take that byte and store it in the buffer. How you decide to empty that buffer depends on whether you are processing a stream of data or data packets. If you are processing a stream, simply have your background process remove the bytes from the buffer and process them first-in-first-out. If you are processing packets, then just keep filling the buffer until you have a complete packet. I've used the packet method successfully many times in the past. I would also implement some type of flow control to signal to the PC if something went wrong, like a full buffer, or, if packet-processing time is long, to indicate to the PC when it is ready for the next packet.
You could implement something like an IP datagram, which contains a data length, ID, and checksum.
Edit:
Then you could hard-code some fixed length for the packets, for example 1024 bytes or whatever makes sense for the device. The PC side would then check whether the device's queue is full every time it writes a packet. The firmware side would run the checksum to see whether the data is valid, and read up to the data length.
