What is the use of Dart RawSockets?

I mean, I've read questions about Dart RawSockets and also read the API, but it still wasn't clear to me how to use them. Are Dart RawSockets the same as raw sockets in C?
Also, what is the difference between Dart RawSockets and normal Sockets?

A Socket is the higher-level concept. It implements a Stream of bytes (actually byte arrays) and an IOSink. Listen to the stream, and data arriving at the socket appears in the stream. When you want to send data down the socket, add it to the sink and away it goes.
A RawSocket is the lower-level concept. Now, instead of getting a stream of bytes, you are just told when bytes are available to be read. (You get a stream of events telling you when data is available; you are then responsible for calling read to collect it.) This lets you work more in the mode of a Unix socket, where you use select to know that data is available so that you don't block when trying to read it.
Dart's streams relieve you of much of the select / blocking-read / separate-reader-thread responsibility found in other languages. When reading from a Socket you don't need to worry; data just arrives in the stream when it's available.
Note that for UDP there is only a RawDatagramSocket, with no higher-level equivalent. This makes sense, since UDP packets are discrete, not a byte stream: a UDP socket just tells you that a packet is available to be read, and you then read it.
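For comparison, a minimal C sketch of that select-then-read pattern (the function name and error handling are just illustrative):

/* Wait until the kernel reports the socket as readable, then read it
   ourselves -- roughly the pattern a Dart RawSocket's read event plus an
   explicit read() call corresponds to. */
#include <sys/select.h>
#include <unistd.h>

ssize_t read_when_ready(int fd, char *buf, size_t len)
{
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    /* Block only in select(); it returns once fd has data waiting. */
    if (select(fd + 1, &readfds, NULL, NULL, NULL) <= 0)
        return -1;

    /* Now read() will not block: the data is already in the socket buffer. */
    return read(fd, buf, len);
}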

Related

C socket atomic non-blocking read

I'm implementing a TCP/IP server application which uses epoll in edge-triggered mode and does non-blocking socket operations. The clients are using simple blocking operations without epoll.
I don't see how "atomic reads" can be implemented on the server side. To explain what I mean by "atomic read", see this example with simple blocking operations:
Both the client and the server use 64K buffers. (At the application level; they don't change the kernel-level socket buffers.)
The client writes 12K of data with a single write operation.
The server reads it. In this case it always reads the whole 12K, since the buffers are the same size, so it can't read only half of it. This is what I call "atomic".
But in the case of epoll + non-blocking operations this can happen:
Both the client and the server use 64K buffers. (At the application level; they don't change the kernel-level socket buffers.)
The client writes 12K of data with a single write operation.
6K arrives at the server.
epoll tells the application that data has arrived on the socket.
The application reads the 6K into its buffer using a non-blocking operation.
When it repeats the read, it returns EAGAIN / EWOULDBLOCK.
In this case the read is not "atomic". It is not guaranteed that data written with a single write operation will be returned whole, in one piece, by the read.
Is it possible to know when the data is partial? I know that one solution is to always prepend the data size, and another could be to always close and reopen the connection, but I don't want to do either of these: I think the kernel must know that the full "package" (what is that unit called, BTW?) has not arrived, since it guarantees atomicity for the blocking operations.
Many thanks!
TCP is stream-based, not message-oriented. Even in the case of a blocking socket, you have no guarantee that what the application sends will go out on the wire as-is in one go. TCP will decide its own course.
So it is up to the application to do an "atomic" read if it wishes. For example:
The application protocol should dictate that each message is prepended by length bytes. The length bytes tell the peer the size of the application data of interest. Of course, the application needs to know where the two-byte length indicator begins.
[2 byte msg length][Data bytes of interest]
Based on this information, the application doing the read should take action: keep polling the socket until it has received all the bytes indicated by the length field, and only then process the data.
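A rough sketch of that accumulate-until-complete loop, assuming the two-byte length prefix shown above and a non-blocking socket (the function name read_full_message is made up for illustration):

/* Keep reading a non-blocking socket until the whole length-prefixed
   message has been collected in buf. Returns the payload length, or -1. */
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>
#include <sys/socket.h>

static int read_full_message(int sd, unsigned char *buf, size_t bufsize)
{
    size_t have = 0;
    size_t need = 2;                 /* first we only need the 2-byte length */
    int have_len = 0;

    while (have < need) {
        ssize_t n = recv(sd, buf + have, need - have, 0);
        if (n > 0) {
            have += (size_t)n;
            if (!have_len && have >= 2) {
                uint16_t msglen;
                memcpy(&msglen, buf, 2);
                need = 2 + ntohs(msglen);    /* now wait for the payload too */
                have_len = 1;
                if (need > bufsize)
                    return -1;               /* message too big for buf */
            }
        } else if (n == 0) {
            return -1;                       /* peer closed mid-message */
        } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
            continue;   /* in a real epoll loop you would return here and
                           resume on the next readiness event */
        } else {
            return -1;
        }
    }
    return (int)(need - 2);
}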
If you need an "atomic" read rather than a partial read, you can use the MSG_PEEK flag with recv. This does not remove the data from the socket buffer; the application peeks into the socket and, based on the return value, checks whether the required amount of data is in the socket buffer.
ret = recv(sd, buf, MAX_CALL_DATA_SIZE, MSG_PEEK);
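Expanded into a small (hypothetical) helper, the idea is: peek first, and only when the peek shows the complete message is buffered do the real recv that consumes it. It assumes the same [2 byte msg length][data] framing as above:

/* Peek first; only consume the message once it is completely buffered. */
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>
#include <sys/socket.h>

static int try_read_message(int sd, unsigned char *buf, size_t bufsize)
{
    unsigned char hdr[2];

    /* Peek at the length prefix without removing it from the socket buffer. */
    ssize_t n = recv(sd, hdr, sizeof hdr, MSG_PEEK);
    if (n < (ssize_t)sizeof hdr)
        return 0;                     /* header not fully here yet (or error) */

    uint16_t msglen;
    memcpy(&msglen, hdr, 2);
    size_t total = 2 + ntohs(msglen);
    if (total > bufsize)
        return -1;

    /* Peek again to see whether the whole message is already in the buffer. */
    n = recv(sd, buf, total, MSG_PEEK);
    if (n < (ssize_t)total)
        return 0;                     /* still partial; try again later */

    /* Now consume it for real; all 'total' bytes are known to be there. */
    n = recv(sd, buf, total, 0);
    return (n == (ssize_t)total) ? (int)total : -1;
}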

Abstracting UDP and TCP send/receive procedures

Good day.
Intro.
Recently I've started to study some 'low-level' network programming as well as networking protocols in Linux. For this purpose I decided to create a small library for networking.
Now I'm wondering about a few things; I will ask one of them here.
As you know, there are at least two protocols built on top of IP. I'm talking about TCP and UDP. Their implementations in the OS may differ because one is connection-oriented and the other is not.
According to man 7 udp, each receive operation on a UDP socket returns only one packet. That is reasonable, as different datagrams may come from different sources.
On the other hand, the packet sequence of a TCP connection may be treated as a continuous byte stream.
Now, about the problem itself.
Say, I have an API for TCP connection socket and for UDP socket like:
void tcp_connection_recv(endpoint_t *ep, buffer_t *b);
void udp_recv(endpoint_t *ep, buffer_t *b);
The endpoint_t type will describe the endpoint (remote for a TCP connection, local for UDP). The buffer_t type will describe some kind of vector-based or array-based buffer.
It is quite possible that the buffer is already allocated by the user, and for UDP I'm not sure it is right to leave its size unchanged. So, to share code between the TCP and UDP operations, I think it will need to allocate as much buffer space as needed to contain the whole received data.
Also, to avoid resizing the user's buffer, each socket could be mapped to its own buffer (a userspace buffer, hidden from the user). Then, on the user's request, data would be copied from that "inner" buffer into the user's buffer, or read from the socket if there is not enough of it.
Any suggestions or opinions?
If you want to create such an API, it will depend on the service you want to provide. For TCP it will be different than for UDP, as TCP is stream-oriented.
For TCP, instead of having tcp_connection_recv reallocate a buffer when the one passed by the user is not big enough, you can fill the whole buffer and then return, perhaps with an output parameter indicating that there is more data waiting to be read. Basically you can use the receive buffer that the TCP connection already provides in the kernel; there is no need to create another buffer.
For UDP, you can ask the user for a number indicating the maximum datagram size it is waiting for. When you read from a UDP socket with recvfrom, if you read less data than what came in the arriving datagram, the rest of the datagram's data is lost. You can read first with the MSG_PEEK flag in order to find out how much data is available.
In general I wouldn't handle the buffer for the application, as the application (actually the application-layer protocol) is the one that knows how it expects to receive the data.
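As a sketch of the UDP side, one possible shape (this is not the question's exact endpoint_t/buffer_t API, and MSG_TRUNC reporting the true datagram length is Linux-specific):

/* Find out how big the waiting datagram is, grow the buffer if needed,
   then receive it for real. On Linux, MSG_PEEK | MSG_TRUNC makes recvfrom()
   return the datagram's true length even though only one byte is peeked. */
#include <stdlib.h>
#include <sys/socket.h>

ssize_t udp_recv(int sd, unsigned char **buf, size_t *bufsize,
                 struct sockaddr *from, socklen_t *fromlen)
{
    char probe;
    ssize_t len = recvfrom(sd, &probe, 1, MSG_PEEK | MSG_TRUNC, NULL, NULL);
    if (len < 0)
        return -1;

    if ((size_t)len > *bufsize) {
        unsigned char *p = realloc(*buf, (size_t)len);
        if (p == NULL)
            return -1;
        *buf = p;
        *bufsize = (size_t)len;
    }

    /* This call consumes the datagram; anything beyond the buffer would be
       lost, but we already sized the buffer to hold the whole thing. */
    return recvfrom(sd, *buf, *bufsize, 0, from, fromlen);
}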

Reading all available bytes via socket using blocking I/O

When reading from a socket using read(2) and blocking I/O, when do I know that the other side (the client) has no more data to send? (By "no more data to send" I mean, as an example, that the client is waiting for a response.) At first, I thought that this point is reached when fewer than count bytes are returned by read (as in read(fd, *buf, count)).
But what if the client sends the data fragmented? Reading until read returns 0 would be a solution, but as far as I know 0 is only returned when the client closes the connection - otherwise, read would just block until the connection is closed. I thought of using non-blocking I/O and a timeout for select(2), but this does not seem to be a tidy solution to me.
Are there any known best practices?
The concept of "the other side has no more data to send", without either a timeout or some semantics in the transmitted data, is quite pointless. Normally, code on the client/server will be able to process data faster than the network can transmit it. So if there's no data in the receive buffer when you're trying to read() it, this just means the network has not yet transmitted everything, but you have no way to tell if the next packet will arrive within a millisecond, a second, or a day. You'd probably consider the first case as "there is more data to send", the third as "no more data to send", and the second depends on your application.
If the other side doesn't close the connection, you probably don't know when it's ready to send the next data packet either.
So unless you have specific semantics and knowledge about what the client sends, using select() and non-blocking I/O is the best you can do.
In specific cases, there might be other ways. For example, if you know the client will send an XML tag, some data, and a closing tag every n seconds, you could start reading n seconds after the last packet you received, and then just read on until you receive the closing tag. But as I said, this isn't a general approach, since it requires semantics on the channel.
TCP is a byte-stream protocol, not a message protocol. If you want messages you really have to implement them yourself, e.g. with a length-word prefix, lines, XML, etc. You can guess with the FIONREAD option of ioctl(), but guessing is all it is, as you can't know whether the client has paused in the middle of transmission of the message, or whether the network has done so for some reason.
The protocol needs to give you a way to know when the client has finished sending a message.
Common approaches are to send the length of each message before it, or to send a special terminator after each message (similar to the NUL character at the end of strings in C).
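For illustration, a length-prefix reader over blocking I/O might look like this (recv_exact and recv_message are invented names, and the 4-byte network-order length is just one possible convention):

/* Read exactly len bytes from a blocking socket, looping over short reads. */
#include <stdint.h>
#include <unistd.h>
#include <arpa/inet.h>

static int recv_exact(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0)
            return -1;           /* error, or peer closed before the end */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* A "message" is then a 4-byte network-order length followed by the payload. */
static ssize_t recv_message(int fd, void *buf, size_t bufsize)
{
    uint32_t netlen;
    if (recv_exact(fd, &netlen, sizeof netlen) < 0)
        return -1;
    uint32_t len = ntohl(netlen);
    if (len > bufsize)
        return -1;
    if (recv_exact(fd, buf, len) < 0)
        return -1;
    return (ssize_t)len;
}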

c select() reading until null character

I am implementing a proxy in c and am using select() to not block on I/O. There are multiple clients connecting to the proxy, so I include the socket descriptor # in my messages so that I know to which socket to forward a reply message from the server.
However, sometimes read() will not receive the full message up to the null character, but will deliver the rest of the message on the next round of select(). I would like to receive the full message at once so that I know which socket to forward the reply to (buffering will not work, since I don't know which message belongs to which client when there are multiple clients). Is there a way to do this without blocking on read while waiting for a null character to arrive?
There is no such thing as a message in TCP. It is a byte stream protocol. You write bytes, it sends bytes, you read bytes. There is no guarantee how many bytes you will receive at any one time and there is no guaranteed association between the amount of data written by a single write and read by a single read. If you want messages you must implement them yourself. Any given read may read zero, one, or more bytes, up to the length of the buffer. It might be half a message. It might be one and a half messages. What it is is entirely up to you.
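One common way to do it, sketched here with invented names and a fixed-size per-client buffer, is to accumulate whatever read() returns per descriptor and only hand a message on once its terminating NUL has arrived:

/* Append whatever read() just returned to this client's buffer, then hand
   out every complete NUL-terminated message; leftovers wait for next round. */
#include <string.h>

struct client {
    char   buf[4096];
    size_t used;
};

static void feed_client(struct client *c, const char *data, size_t n,
                        void (*on_message)(const char *msg))
{
    if (n > sizeof c->buf - c->used)
        n = sizeof c->buf - c->used;          /* crude overflow guard */
    memcpy(c->buf + c->used, data, n);
    c->used += n;

    char *start = c->buf;
    char *nul;
    while ((nul = memchr(start, '\0', c->used - (size_t)(start - c->buf))) != NULL) {
        on_message(start);                    /* one complete message */
        start = nul + 1;
    }
    /* Keep the unterminated tail for the next select()/read() round. */
    c->used -= (size_t)(start - c->buf);
    memmove(c->buf, start, c->used);
}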
Use ZeroMQ if you're doing individual messages. It has bindings for a huge number of languages and is a great abstraction for networking. In fact, it can handle this proxy model for you.

Buffering of stream data

I'm trying to develop a simple IRC bot. First I want to think out a proper design for this project. One of the things I'm wondering about right now is the read mechanism. I'm developing this bot on a Linux system (Fedora 12). To read from a socket I use the system call read(). I plan to use the reading functionality in the following way (the code is just an example, not something from the final product):
while ((uBytesRead = read(iServerSocket, caBuffer, MAX_MESSAGE_SIZE)) > 0)
{
//1. Parse the buffer and place it into a Message structure.
//2. Add the message structure to a linked list that will act as a queue of message that are to be processed.
}
This code will run in its own thread. I chose this approach because I wanted as small a delay between reads as possible. (Writes will be implemented in the same way.) This is all somewhat based on assumptions that I would like to clear up. My question is: what if data arrives so quickly that reading and processing it (in this case just parsing it) goes slower than the rate at which it comes in? I assume this data will be buffered by the system. Is that a correct assumption? And if so:
How big is this buffer?
What happens with incoming data when this buffer gets full?
To make my application protected against spam, how could I best deal with it?
I hope I've explained my issue clearly enough.
Thanks in advance.
IRC uses TCP sockets for networking. Linux/POSIX TCP sockets have one data buffer for sending and another for receiving. You can resize the buffers with setsockopt() and SO_SNDBUF/SO_RCVBUF.
TCP has flow control, so when a receive buffer is getting full the receiver advertises a smaller window to the sender. Received packets that didn't fit in the buffer will not be acknowledged by the receiver and will eventually be retransmitted by the sender.
So that's nothing to worry about. What matters is what the sender program does when its socket's send buffer gets full. Some programs will close the socket, others will just discard the written data and try again, while others might buffer internally.
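For example, enlarging and inspecting the kernel receive buffer could look like this (the 256 KiB figure is arbitrary, and Linux typically reports back double the value you set):

/* Ask for a bigger kernel receive buffer and check what we actually got. */
#include <stdio.h>
#include <sys/socket.h>

static void bump_rcvbuf(int sd)
{
    int wanted = 256 * 1024;
    setsockopt(sd, SOL_SOCKET, SO_RCVBUF, &wanted, sizeof wanted);

    int actual = 0;
    socklen_t len = sizeof actual;
    getsockopt(sd, SOL_SOCKET, SO_RCVBUF, &actual, &len);
    printf("receive buffer is now %d bytes\n", actual);
}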
