How Do Sockets Work in C?

I am a bit confused about socket programming in C.
You create a socket, bind it to an interface and an IP address and get it to listen. I found a couple of web resources on that, and understood it fine. In particular, I found an article Network programming under Unix systems to be very informative.
What confuses me is the timing of data arriving on the socket.
How can you tell when packets arrive and how big each packet is? Do you have to do all the heavy lifting yourself?
My basic assumption here is that packets can be of variable length, so once binary data starts appearing down the socket, how do you begin to construct packets from that?

Short answer is that you have to do all the heavy lifting yourself. You can be notified that there is data available to be read, but you won't know how many bytes are available. In most IP protocols that use variable length packets, there will be a header with a known fixed length prepended to the packet. This header will contain the length of the packet. You read the header, get the length of the packet, then read the packet. You repeat this pattern (read header, then read packet) until communication is complete.
When reading data from a socket, you request a certain number of bytes. The read call may block until the requested number of bytes are read, but it can return fewer bytes than what was requested. When this happens, you simply retry the read, requesting the remaining bytes.
Here's a typical C function for reading a set number of bytes from a socket:
#include <unistd.h>  /* read */

/* buffer points to a memory block at least as large as the number of bytes to be read */
/* socket is an open socket that is connected to a sender */
/* bytesToRead is the number of bytes expected from the sender */
/* bytesRead is a pointer to an integer variable that will hold the number of bytes */
/* actually received from the sender. */
/* The function returns either the number of bytes read, */
/* 0 if the socket was closed by the sender, and */
/* -1 if an error occurred while reading from the socket */
int readBytes(int socket, char *buffer, int bytesToRead, int *bytesRead)
{
    *bytesRead = 0;
    while (*bytesRead < bytesToRead)
    {
        int ret = (int)read(socket, buffer + *bytesRead, bytesToRead - *bytesRead);
        if (ret <= 0)
        {
            /* either connection was closed or an error occurred */
            return ret;
        }
        else
        {
            *bytesRead += ret;
        }
    }
    return *bytesRead;
}
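The read-header-then-read-packet pattern described above might look like the following sketch. It assumes a hypothetical wire format (a 4-byte big-endian length followed by the payload); the names read_full and read_packet, and the max_len guard, are made up for illustration and are not part of any library.

```c
#include <arpa/inet.h>   /* ntohl */
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Read exactly n bytes. Returns 1 on success, 0 if the peer closed the
   connection first, -1 on error. */
int read_full(int fd, void *buf, size_t n)
{
    char *p = buf;
    while (n > 0) {
        ssize_t r = read(fd, p, n);
        if (r == 0)
            return 0;
        if (r < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        p += r;
        n -= (size_t)r;
    }
    return 1;
}

/* Assumed format: 4-byte big-endian length, then that many payload bytes.
   On success stores a malloc'd payload in *out (caller frees) and its
   length in *len. Returns 1, 0 (closed) or -1 (error or length too large). */
int read_packet(int fd, char **out, uint32_t *len, uint32_t max_len)
{
    uint32_t netlen;
    int rc = read_full(fd, &netlen, sizeof netlen);
    if (rc <= 0)
        return rc;
    *len = ntohl(netlen);
    if (*len > max_len)
        return -1;                  /* refuse implausible lengths */
    *out = malloc(*len ? *len : 1);
    if (*out == NULL)
        return -1;
    rc = read_full(fd, *out, *len);
    if (rc <= 0) {
        free(*out);
        *out = NULL;
    }
    return rc;
}
```

The max_len check is a design choice rather than a requirement: without it, a corrupted or hostile length field would drive the size of the allocation.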

So, the answer to your question depends a fair bit on whether you are using UDP or TCP as your transport.
For UDP, life gets a lot simpler, in that you can call recv/recvfrom/recvmsg with the packet size you need (you'd likely send fixed-length packets from the source anyway), and make the assumption that if data is available, it's there in multiples of packet-length sizes. (I.E. You call recv* with the size of your sending side packet, and you're set.)
For TCP, life gets a bit more interesting - for the purpose of this explanation, I will assume that you already know how to use socket(), bind(), listen() and accept() - the latter being how you get the file descriptor (FD) of your newly made connection.
There are two ways of doing the I/O for a socket: blocking, in which you call read(fd, buf, N) and the call waits until at least some data has arrived (it may still return fewer than N bytes, so you loop until you have all N), or non-blocking, in which you check (using select() or poll()) whether the FD is readable, and THEN do your read().
When dealing with TCP-based connections, the OS doesn't pay attention to the packet sizes, since it's considered a continuous stream of data, not separate packet-sized chunks.
If your application uses "packets" (packed or unpacked data structures that you're passing around), you can call read() with the proper size argument and, looping if a call returns fewer bytes than requested, read an entire data structure off the socket at a time. The one caveat is to remember to properly byte-order any data that you're sending, in case the source and destination systems are of different endianness. This applies to both UDP and TCP.
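To illustrate the byte-ordering caveat, here is a small sketch that encodes a hypothetical record into a fixed wire layout with htonl()/htons() and decodes it again. The struct, its fields and the 6-byte layout are invented for the example; copying field by field also sidesteps compiler padding inside the struct.

```c
#include <arpa/inet.h>   /* htonl, htons, ntohl, ntohs */
#include <stdint.h>
#include <string.h>

/* Hypothetical application record, for illustration only. */
struct record {
    uint32_t id;
    uint16_t kind;
};

#define RECORD_WIRE_SIZE 6   /* 4 bytes id + 2 bytes kind, no padding */

/* Encode r into out[RECORD_WIRE_SIZE] in network (big-endian) byte order. */
void record_encode(const struct record *r, unsigned char *out)
{
    uint32_t id = htonl(r->id);
    uint16_t kind = htons(r->kind);
    memcpy(out, &id, 4);
    memcpy(out + 4, &kind, 2);
}

/* Decode RECORD_WIRE_SIZE bytes from in back into host byte order. */
void record_decode(const unsigned char *in, struct record *r)
{
    uint32_t id;
    uint16_t kind;
    memcpy(&id, in, 4);
    memcpy(&kind, in + 4, 2);
    r->id = ntohl(id);
    r->kind = ntohs(kind);
}
```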
As far as *NIX socket programming is concerned, I highly recommend W. Richard Stevens' "Unix Network Programming, Vol. 1" (UNPv1) and "Advanced Programming in the UNIX Environment" (APUE). The first is a tome regarding network-based programming, regardless of the transport, and the latter is a good all-around programming book as it applies to *NIX based programming. Also, look for "TCP/IP Illustrated", Volumes 1 and 2.

When you do a read on the socket, you tell it how many maximum bytes to read, but if it doesn't have that many, it gives you however many it's got. It's up to you to design the protocol so you know whether you've got a partial packet or not. For instance, in the past when sending variable length binary data, I would put an int at the beginning that said how many bytes to expect. I'd do a read requesting a number of bytes greater than the largest possible packet in my protocol, and then I'd compare the first int against however many bytes I'd received, and either process it or try more reads until I'd gotten the full packet, depending.
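A minimal sketch of that bookkeeping follows: bytes accumulate in a buffer, and a helper reports whether the front of the buffer already holds a complete message. It assumes a 4-byte big-endian length prefix (fixed width and byte order rather than a plain int, so both ends agree); frame_peek and frame_consume are hypothetical names.

```c
#include <arpa/inet.h>   /* ntohl */
#include <stdint.h>
#include <string.h>

/* Inspect the first `have` bytes of buf. If they contain a complete
   message (4-byte big-endian length + payload), store the payload
   pointer and length and return the total number of bytes the message
   occupies. Return 0 if more data is needed. */
size_t frame_peek(const unsigned char *buf, size_t have,
                  const unsigned char **payload, uint32_t *len)
{
    uint32_t netlen;
    if (have < sizeof netlen)
        return 0;                      /* length prefix itself incomplete */
    memcpy(&netlen, buf, sizeof netlen);
    *len = ntohl(netlen);
    if (have - sizeof netlen < *len)
        return 0;                      /* partial payload: keep reading */
    *payload = buf + sizeof netlen;
    return sizeof netlen + *len;
}

/* Drop `used` processed bytes from the front of buf; returns the new fill. */
size_t frame_consume(unsigned char *buf, size_t have, size_t used)
{
    memmove(buf, buf + used, have - used);
    return have - used;
}
```

A caller would append each read() result to the buffer, then call frame_peek in a loop, processing and consuming messages until it returns 0, since one read can deliver several messages or only part of one.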

Sockets operate at a higher level than raw packets - it's like a file you can read/write from. Also, when you try to read from a socket, the operating system will block (put on hold) your process until it has data to fulfill the request.

Related

Sending integer atomically over a socket in c

I have a simple client which accepts a single uint32_t from the server through a socket. Using the solution that appeared here many times (e.g. transfer integer over a socket in C) seems to work, but:
When calling "read" on files I know that the system is not guaranteed to read the entire content of the message at once, and it therefore returns the number of bytes read. Couldn't the same happen when accepting 4 bytes over a network socket?
If this can't happen, why is that? and if it can, how is it possible to make sure the send is "atomic", or is it necessary to piece back the bytes myself?

Depending on the socket type, different protocols can be used. SOCK_STREAM (which corresponds to TCP on network sockets) is a stream-oriented protocol, so packets may be re-combined by the sender, the receiver or any equipment in the middle.
But SOCK_DGRAM (UDP) or SOCK_SEQPACKET actually send packets that cannot be changed. In that case 4 bytes in the same packet are guaranteed to be available in the same read operation, except if the receive buffer is too small. From man socket:
If a message is too long to fit in the supplied buffer, excess
bytes may be discarded depending on the type of socket the message is
received from
So if you want to have atomic blocks, use a packet protocol and not a stream one, and make sure to have large enough receive buffers.
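A small sketch of the packet-protocol approach: each integer travels as its own datagram, so one recv() yields one whole value. The helper names are hypothetical, and the 64-byte receive buffer is an arbitrary choice that is deliberately larger than the message so that an unexpected oversize datagram shows up as a size mismatch.

```c
#include <arpa/inet.h>   /* htonl, ntohl */
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Send one 32-bit value as a single datagram. 0 on success, -1 on error. */
int send_u32_dgram(int fd, uint32_t value)
{
    uint32_t wire = htonl(value);
    ssize_t n = send(fd, &wire, sizeof wire, 0);
    return n == (ssize_t)sizeof wire ? 0 : -1;
}

/* Receive one datagram that must be exactly one 32-bit value.
   Returns 0 on success, -1 on error or unexpected datagram size. */
int recv_u32_dgram(int fd, uint32_t *value)
{
    unsigned char buf[64];
    uint32_t wire;
    ssize_t n = recv(fd, buf, sizeof buf, 0);
    if (n != (ssize_t)sizeof wire)
        return -1;
    memcpy(&wire, buf, sizeof wire);
    *value = ntohl(wire);
    return 0;
}
```

Keep in mind that boundaries being preserved says nothing about delivery: over UDP a datagram can still be lost or reordered.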

When calling "read" on files I know that the system is not guaranteed
to read the entire content of the message at once
That is wrong: if the requested number of bytes is available, they are read.
POSIX read manual says: The value returned may be less than nbyte if the number of bytes left
in the file is less than nbyte
This is at least correct for regular files; for pipes and the like it is a different story.
Couldn't the same happen when accepting 4 bytes over a network socket?
(I suppose you are talking about TCP sockets.) That may happen with sockets because the underlying protocol may transport your bytes in any suitable manner (read about TCP fragmentation, for example); the only thing ensured is that bytes are received in the same order in which they were sent. So, to read a given number of bytes you may need several reads. This is usually done by looping over read until the needed bytes have been received.

If the underlying protocol is TCP/IP, which is stream-oriented (there are no "packets" or "messages", just two byte streams), then yes.
You need to take care to manage the amount of read data so that you can know where each "message" (in your case, a single integer) begins and ends.
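For the single-integer case, one possible sketch uses the MSG_WAITALL flag, which asks recv() to wait for the full amount, wrapped in a loop because the call can still return early. The function name is hypothetical, and the code assumes the sender transmits the value in network (big-endian) byte order.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Receive one 32-bit integer sent in network byte order over a stream socket.
   Returns 1 on success, 0 if the peer closed before 4 bytes arrived,
   -1 on error. */
int recv_u32(int fd, uint32_t *out)
{
    unsigned char raw[4];
    size_t got = 0;
    while (got < sizeof raw) {
        /* MSG_WAITALL requests the full amount, but a signal, an error or
           EOF can still cut the call short, hence the loop. */
        ssize_t n = recv(fd, raw + got, sizeof raw - got, MSG_WAITALL);
        if (n == 0)
            return 0;
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        got += (size_t)n;
    }
    /* Assemble the big-endian bytes independently of host endianness. */
    *out = ((uint32_t)raw[0] << 24) | ((uint32_t)raw[1] << 16) |
           ((uint32_t)raw[2] << 8)  |  (uint32_t)raw[3];
    return 1;
}
```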

Receive continuous stream of varying length packets over sockets in C at a quick rate?

I've been working with sockets (in C, with no prior experience in socket programming) for the past few days.
I have to collect WiFi packets on a Raspberry Pi, do some processing, and send the formatted information to another device over sockets (both devices are connected in a network).
The challenge I'm facing is when receiving the data over the sockets.
While sending the data, the data is sent successfully over the sockets from the sending side but on the receiving side, sometimes some junk or previous data is received.
On Sending Side (client):
int server_socket = socket(AF_INET, SOCK_STREAM, 0);
//connecting to the server with connect function
send(server_socket, &datalength, sizeof(datalength),0); //datalength is an integer containing the number of bytes that are going to be sent next
send(server_socket, actual_data, sizeof(actual_data),0); //actual data is a char array containing the actual character string data
On Receiving Side (Server Side):
int server_socket = socket(AF_INET, SOCK_STREAM, 0);
//bind the socket to the ip and port with bind function
//listen to the socket for any clients
//int client_socket = accept(server_socket, NULL, NULL);
int bytes;
recv(client_socket, &bytes, sizeof(bytes),0);
char* actual_message = malloc(bytes);
int rec_bytes = recv(client_socket, actual_message, bytes,0);
*The above lines of code are not the actual lines of code, but the flow and procedure would be similar (with exception handling and comments).
Sometimes, I could get the actual data for all the packets quickly(without any errors and packet loss). But sometimes the bytes (integer sent to tell the size of byte stream for the next transaction) is received as a junk value, so my code is breaking at that point.
Also sometimes the number of bytes that I receive on the receiving side is less than the number of bytes expected (known from the received integer bytes). So in that case, I check for that condition and retrieve the remaining bytes.
Actually the rate at which packets arrive is very high (around 1000 packets in less than a second and I have to dissect, format and send it over sockets). I'm trying different ideas (using SOCK_DGRAM but there is some packet loss here, inserting some delay between transactions, opening and closing a new socket for each packet, adding an acknowledgement after receiving a packet) but none of them meets my requirement (quick transfer of packets with 0 packet loss).
Kindly, suggest a way to send and receive varying length of packets at a quick rate over sockets.

I see a few main issues:
I think your code ignores the possibility of a full buffer in the send function.
It also seems to me that your code ignores the possibility of partial data being collected by recv. (Never mind, I just saw the new comment on that.)
In other words, you need to manage a user-land buffer for send and handle fragmentation in recv.
The code uses sizeof(int) which might be a different length on different machines (maybe use uint32_t instead?).
The code doesn't translate to and from network byte order. This means that you're sending the memory structure of the int instead of an integer that can be read by different machines (some machines store the bytes backwards, some forward, some mix and match).
Notice that when you send larger data using TCP/IP, it will be fragmented into smaller packets.
This depends, among others, on the MTU network value (which often runs at ~500 bytes in the wild and usually around ~1500 bytes in your home network).
To handle these cases you should probably use an evented network design rather than blocking sockets.
Consider routing the send through something similar to this (if you're going to use blocking sockets):
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

int send_complete(int fd, void *data, size_t len) {
    size_t act = 0;
    while (act < len) {
        ssize_t tmp = send(fd, (char *)data + act, len - act, 0);
        if (tmp < 0) {
            if (errno == EWOULDBLOCK || errno == EAGAIN || errno == EINTR)
                continue; // transient failure; add `select` here to poll the socket
            return -1; // connection error
        }
        act += (size_t)tmp; // only count bytes that were actually accepted
    }
    return (int)act;
}
As for the sizeof issues, I would replace the int with a specific byte length integer type, such as int32_t.
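Putting the fixed-width length and the byte-order conversion together, the sending side could be sketched as below. It assumes a blocking socket and an unsigned 32-bit length prefix; send_all and send_message are hypothetical helper names.

```c
#include <arpa/inet.h>   /* htonl */
#include <errno.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Keep calling send() until all len bytes have been queued.
   Returns 0 on success, -1 on error. Assumes a blocking socket. */
int send_all(int fd, const void *data, size_t len)
{
    const char *p = data;
    while (len > 0) {
        ssize_t n = send(fd, p, len, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Send a fixed-width, network-byte-order length followed by the payload. */
int send_message(int fd, const void *data, uint32_t len)
{
    uint32_t netlen = htonl(len);
    if (send_all(fd, &netlen, sizeof netlen) < 0)
        return -1;
    return send_all(fd, data, len);
}
```

The receiver would mirror this: read exactly 4 bytes, convert with ntohl(), then read exactly that many payload bytes.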
A few more details
Please notice that sending the integer separately doesn't guarantee that it will be received separately or that the integer itself won't be fragmented.
The send function writes to the system's buffer for the socket, not to the network (just like recv reads from the available buffer and not from the wire).
You can't control where fragmentation occurs or how the TCP packets are packed (unless you implement your own TCP/IP stack).
I'm sure it's clear to you that the "junk" value is data that was sent by the server. This means that the code isn't reading the integer you send, but reading another piece of data.
It's probably a question of alignment to the message boundaries, caused by an incomplete read or an incomplete send.
P.S.
I would consider using the Websocket protocol on top of the TCP/IP layer.
This guarantees a binary packet header that works with different CPU architectures (endianness) as well as offers a wider variety of client connectivity (such as connecting with a browser, etc.).
It will also solve the packet alignment issue you're experiencing (not because it won't exist, but because it was resolved in whatever Websocket parser you will adopt).

what does readable/writable mean in a socket file descriptor? And why regular files don't bother with that?

Since I'm new in learning libev recently, there's a readable/writable concept in a io_watcher that I don't quite understand. For my knowledge there's a parameter in linux system programming:
O_ASYNC
A signal (SIGIO by default) will be generated when the specified file
becomes readable or writable. This flag is available only for
terminals and sockets, not for regular files.
So, since a regular file won't bother with readable/writable, what readable/writable really mean in socket programming? And what measure did kernel do to find out whether a socket file descriptor is readable?
Considering the everything-is-a-file philosophy, does every socket descriptor with different descriptor number actually point to the same file? If so,can I consider the readable/writable problem is caused by the synchronisation?
OK, it seems that I've asked a silly question. What I really mean is that both sockets and regular files are read and written via a file descriptor, so why does a socket descriptor have a readable/writable concept when a regular file doesn't? Since EJP told me that this is because of the buffers, and that each descriptor has its own pair of buffers, here's my conclusion: the readable/writable concept is about buffers. If a buffer is empty, it's unreadable; if it is full, it's unwritable. Readable and writable have nothing to do with synchronisation, and since a regular file doesn't have such a buffer, it is always readable and writable.
And there are more questions: when saying receive buffer, this buffer is not the same thing in int recv(SOCKET socket, char FAR* buf, int len, int flags);, right?

This question is specifically addressed in Unix Network Programming, Volume 1: The Sockets Networking API (3rd Edition) [W. Richard Stevens, Bill Fenner, Andrew M. Rudoff] (see it here; I'll add some minor edits for enhanced readability):
Under What Conditions Is a Descriptor Ready?
[...]
The conditions that cause select to return "ready" for sockets [are]:
1. A socket is ready for reading if any of the following four conditions is true:
  The number of bytes of data in the socket receive buffer is greater than or equal to the current size of the low-water mark for the socket receive buffer. A read operation on the socket will not block and will return a value greater than 0 (i.e., the data that is ready to be read). [...]
  The read half of the connection is closed (i.e., a TCP connection that has received a FIN). A read operation on the socket will not block and will return 0 (i.e., EOF).
  The socket is a listening socket and the number of completed connections is nonzero. [...]
  A socket error is pending. A read operation on the socket will not block and will return an error (–1) with errno set to the specific error condition. [...]
2. A socket is ready for writing if any of the following four conditions is true:
  The number of bytes of available space in the socket send buffer is greater than or equal to the current size of the low-water mark for the socket send buffer and either: (i) the socket is connected, or (ii) the socket does not require a connection (e.g., UDP). This means that if we set the socket to nonblocking, a write operation will not block and will return a positive value (e.g., the number of bytes accepted by the transport layer). [...]
  The write half of the connection is closed. A write operation on the socket will generate SIGPIPE.
  A socket using a non-blocking connect has completed the connection, or the connect has failed.
  A socket error is pending. A write operation on the socket will not block and will return an error (–1) with errno set to the specific error condition. [...]
3. A socket has an exception condition pending if there is out-of-band data for the socket or the socket is still at the out-of-band mark.
[Notes:]
Our definitions of "readable" and "writable" are taken directly from the
kernel's soreadable and sowriteable macros on pp. 530–531 of TCPv2.
Similarly, our definition of the "exception condition" for a socket is
from the soo_select function on these same pages.
Notice that when an error occurs on a socket, it is marked as both
readable and writable by select.
The purpose of the receive and send low-water marks is to give the application control over how much data must be available for reading or how much space must be available for writing before select returns a readable or writable status. For example, if we know that our application has nothing productive to do unless at least 64 bytes of data are present, we can set the receive low-water mark to 64 to prevent select from waking us up if less than 64 bytes are ready for reading.
As long as the send low-water mark for a UDP socket is less than the send buffer size (which should always be the default relationship), the UDP socket is always writable, since a connection is not required.
A related read, from the same book: TCP socket send buffer and UDP socket (pseudo) send buffer
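As a hedged illustration of the receive low-water mark mentioned in the notes, the sketch below sets and reads back SO_RCVLOWAT with setsockopt()/getsockopt(). The helper names are made up, and whether select() or poll() actually honor the mark depends on the platform and kernel version, so treat the 64-byte example from the text as something to verify on your system.

```c
#include <sys/socket.h>

/* Set the receive low-water mark. Returns 0 on success, -1 on error. */
int set_recv_low_water(int fd, int bytes)
{
    return setsockopt(fd, SOL_SOCKET, SO_RCVLOWAT, &bytes, sizeof bytes);
}

/* Read the mark back. Returns the value, or -1 on error. */
int get_recv_low_water(int fd)
{
    int bytes = 0;
    socklen_t len = sizeof bytes;
    if (getsockopt(fd, SOL_SOCKET, SO_RCVLOWAT, &bytes, &len) < 0)
        return -1;
    return bytes;
}
```

For example, set_recv_low_water(sockfd, 64) corresponds to the "at least 64 bytes" case in the quoted passage.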

Readable means there is data or a FIN present in the socket receive buffer.
Writable means there is space available in the socket send buffer.
Files don't have socket send or receive buffers.
Considering the everything-is-a-file philosophy
What philosophy is that?
does every socket descriptor with different descriptor number actually point to the same file?
What file? Why would they point to the same anything? Question doesn't make sense.
I'm confused with one thing: when a socket is created, the descriptor is actually point to the receive and send buffers of the socket
It 'points to' a lot of things: a source address, a target address, a source port, a target port, a pair of buffers, a set of counters and timers, ...
not the file represent the net hardware.
There is no such thing as 'the file represent[ing] the net hardware', unless you're talking about the device driver entry in /dev/..., which is barely relevant. A TCP socket is an endpoint of a connection. It is specific to that connection, to TCP, to the source and target addresses and ports, ...

Isn't recv() in C socket programming blocking?

In Receiver, I have
recvfd=accept(sockfd,&other_side,&len);
while(1)
{
recv(recvfd,buf,MAX_BYTES-1,0);
buf[MAX_BYTES]='\0';
printf("\n Number %d contents :%s\n",counter,buf);
counter++;
}
In Sender , I have
send(sockfd,mesg,(size_t)length,0);
send(sockfd,mesg,(size_t)length,0);
send(sockfd,mesg,(size_t)length,0);
MAX_BYTES is 1024 and length of mesg is 15. Currently, It calls recv only one time. I want recv function to be called three times for each corresponding send. How do I achieve it?

In short: yes, it is blocking. But not in the way you think.
recv() blocks until any data is readable. But you don't know the size in advance.
In your scenario, you could do the following:
1. call select() and put the socket where you want to read from into the READ FD set
2. when select() returns with a positive number, your socket has data ready to be read
3. then, check if you could receive length bytes from the socket: recv(recvfd, buf, MAX_BYTES-1, MSG_PEEK), see man recv(2) for the MSG_PEEK param or look at MSDN, they have it as well
4. now you know how much data is available
5. if there's less than length available, return and do nothing
6. if there's at least length available, read length and return (if there's more than length available, we'll continue with step 2 since a new READ event will be signalled by select())

To send discrete messages over a byte stream protocol, you have to encode messages into some kind of framing language. The network can chop up the protocol into arbitrarily sized packets, and so the receives do not correlate with your messages in any way. The receiver has to implement a state machine which recognizes frames.
A simple framing protocol is to have some length field (say two octets: 16 bits, for a maximum frame length of 65535 bytes). The length field is followed by exactly that many bytes.
You must not even assume that the length field itself is received all at once. You might ask for two bytes, but recv could return just one. This won't happen for the very first message received from the socket, because network (or local IPC pipe, for that matter) segments are never just one byte long. But somewhere in the middle of the stream, it is possible that the first byte of the 16-bit length field could land on the last position of one network frame.
An easy way to deal with this is to use a buffered I/O library instead of raw operating system file handles. In a POSIX environment, you can take an open socket handle, and use the fdopen function to associate it with a FILE * stream. Then you can use functions like getc and fread to simplify the input handling (somewhat).
If in-band framing is not acceptable, then you have to use a protocol which supports framing, namely datagram type sockets. The main disadvantage of this is that the principal datagram-based protocol used over IP is UDP, and UDP is unreliable. This brings in a lot of complexity in your application to deal with out of order and missing frames. The size of the frames is also restricted by the maximum IP datagram size which is about 64 kilobytes, including all the protocol headers.
Large UDP datagrams get fragmented, which, if there is unreliability in the network, adds up to greater unreliability: if any IP fragment is lost, the entire packet is lost. All of it must be retransmitted; there is no way to just get a repetition of the fragment that was lost. The TCP protocol performs "path MTU discovery" to adjust its segment size so that IP fragmentation is avoided, and TCP has selective retransmission to recover missing segments.

I bet you've created a TCP socket using SOCK_STREAM, which would cause the three messages to be read into your buffer during the first recv call. If you want to read the messages one-by-one, create a UDP socket using SOCK_DGRAM, or develop some type of message format which allows you to parse your messages when they arrive in a stream (assuming your messages will not always be fixed length).

First send the length of the message as a fixed-size field (so the receiver knows how many bytes the length itself occupies), then make recv() loop until length bytes have been received.
Note (as already mentioned by other answers) that the size and number of chunks received do not necessarily match what was sent. Only the total number of bytes received will equal the total number of bytes sent.
Read the man pages for recv and send. Especially read the sections on what those functions RETURN.

recv will block until some data is available (without the MSG_WAITALL flag it does not wait for the entire buffer to be filled), or the socket is closed.
If you want to read length bytes and return, then you must only pass to recv a buffer of size length.
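You can use select to determine if
there are any bytes waiting to be read,
find out how many bytes are waiting to be read (select itself only reports readiness; a call such as ioctl with FIONREAD reports the count), then
read only those bytes
This can keep recv from blocking.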
Edit:
After re-reading the docs, the following may be true: your three "messages" may be being read all-at-once since length + length + length < MAX_BYTES - 1.
Another possibility, if recv is never returning, is that you may need to flush your socket from the sender-side. The data may be waiting in a buffer to actually be sent to the receiver.

How large should my recv buffer be when calling recv in the socket library

I have a few questions about the socket library in C. Here is a snippet of code I'll refer to in my questions.
char recv_buffer[3000];
recv(socket, recv_buffer, 3000, 0);
How do I decide how big to make recv_buffer? I'm using 3000, but it's arbitrary.
what happens if recv() receives a packet bigger than my buffer?
how can I know if I have received the entire message without calling recv again and have it wait forever when there is nothing to be received?
is there a way I can make a buffer not have a fixed amount of space, so that I can keep adding to it without fear of running out of space? maybe using strcat to concatenate the latest recv() response to the buffer?
I know it's a lot of questions in one, but I would greatly appreciate any responses.

The answers to these questions vary depending on whether you are using a stream socket (SOCK_STREAM) or a datagram socket (SOCK_DGRAM) - within TCP/IP, the former corresponds to TCP and the latter to UDP.
How do you know how big to make the buffer passed to recv()?
SOCK_STREAM: It doesn't really matter too much. If your protocol is a transactional / interactive one just pick a size that can hold the largest individual message / command you would reasonably expect (3000 is likely fine). If your protocol is transferring bulk data, then larger buffers can be more efficient - a good rule of thumb is around the same as the kernel receive buffer size of the socket (often something around 256kB).
SOCK_DGRAM: Use a buffer large enough to hold the biggest packet that your application-level protocol ever sends. If you're using UDP, then in general your application-level protocol shouldn't be sending packets larger than about 1400 bytes, because they'll certainly need to be fragmented and reassembled.
What happens if recv gets a packet larger than the buffer?
SOCK_STREAM: The question doesn't really make sense as put, because stream sockets don't have a concept of packets - they're just a continuous stream of bytes. If there are more bytes available to read than your buffer has room for, then they'll be queued by the OS and available for your next call to recv.
SOCK_DGRAM: The excess bytes are discarded.
How can I know if I have received the entire message?
SOCK_STREAM: You need to build some way of determining the end-of-message into your application-level protocol. Commonly this is either a length prefix (starting each message with the length of the message) or an end-of-message delimiter (which might just be a newline in a text-based protocol, for example). A third, lesser-used, option is to mandate a fixed size for each message. Combinations of these options are also possible - for example, a fixed-size header that includes a length value.
SOCK_DGRAM: A single recv call always returns a single datagram.
Is there a way I can make a buffer not have a fixed amount of space, so that I can keep adding to it without fear of running out of space?
No. However, you can try to resize the buffer using realloc() (if it was originally allocated with malloc() or calloc(), that is).

For streaming protocols such as TCP, you can pretty much set your buffer to any size. That said, common values that are powers of 2 such as 4096 or 8192 are recommended.
If there is more data than your buffer can hold, it will simply be saved in the kernel for your next call to recv.
Yes, you can keep growing your buffer. To recv into the middle of the buffer, starting at offset idx, you would do:
recv(socket, recv_buffer + idx, recv_buffer_size - idx, 0);
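A minimal sketch of a growable receive buffer built on realloc(), combining the offset-based recv shown above with doubling of the allocation. The struct, the function name and the 4096-byte starting size are illustrative choices only.

```c
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Growable receive buffer; the fields are illustrative. */
struct dynbuf {
    char  *data;
    size_t used;   /* bytes filled so far */
    size_t cap;    /* bytes allocated */
};

/* Receive once into the free tail of b, doubling the allocation first
   if the buffer is full.
   Returns recv()'s result: >0 bytes appended, 0 on close, -1 on error. */
ssize_t dynbuf_recv(struct dynbuf *b, int fd)
{
    if (b->used == b->cap) {
        size_t ncap = b->cap ? b->cap * 2 : 4096;
        char *p = realloc(b->data, ncap);
        if (p == NULL)
            return -1;          /* the old buffer is still valid */
        b->data = p;
        b->cap = ncap;
    }
    ssize_t n = recv(fd, b->data + b->used, b->cap - b->used, 0);
    if (n > 0)
        b->used += (size_t)n;
    return n;
}
```

Start from a zero-initialized struct dynbuf and free(b.data) when done. Tracking the fill level explicitly is what makes this work for socket data: strcat would not, because received bytes are not NUL-terminated and may themselves contain zero bytes.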

If you have a SOCK_STREAM socket, recv just gets "up to the first 3000 bytes" from the stream. There is no clear guidance on how big to make the buffer: the only time you know how big a stream is, is when it's all done;-).
If you have a SOCK_DGRAM socket, and the datagram is larger than the buffer, recv fills the buffer with the first part of the datagram and the rest is discarded. How the truncation is reported depends on the platform: Winsock's recv fails with WSAEMSGSIZE, while on POSIX systems recv typically returns the truncated length without an error (recvmsg reports MSG_TRUNC in msg_flags). Either way, if the protocol is UDP, this means the rest of the datagram is lost -- part of why UDP is called an unreliable protocol (I know that there are reliable datagram protocols but they aren't very popular -- I couldn't name one in the TCP/IP family, despite knowing the latter pretty well;-).
To grow a buffer dynamically, allocate it initially with malloc and use realloc as needed. But that won't help you with recv from a UDP source, alas.

For SOCK_STREAM socket, the buffer size does not really matter, because you are just pulling some of the waiting bytes and you can retrieve more in a next call. Just pick whatever buffer size you can afford.
For SOCK_DGRAM socket, you will get the fitting part of the waiting message and the rest will be discarded. You can get the waiting datagram size with the following ioctl:
#include <sys/ioctl.h>
int size;
ioctl(sockfd, FIONREAD, &size);
Alternatively you can use MSG_PEEK and MSG_TRUNC flags of the recv() call to obtain the waiting datagram size.
ssize_t size = recv(sockfd, buf, len, MSG_PEEK | MSG_TRUNC);
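You need MSG_PEEK to peek at (not consume) the waiting message, and MSG_TRUNC (on Linux) to make recv return the real, untruncated size of the datagram even when it is larger than the buffer you passed; only len bytes are ever copied, so the buffer is not overflowed.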
Then you can just malloc(size) the real buffer and recv() datagram.
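Those two steps might be sketched like this. The function name is hypothetical, and the code relies on the Linux behaviour of MSG_TRUNC described above, so it should be treated as Linux-specific unless your platform documents the same semantics.

```c
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Receive the next datagram into a freshly malloc'd buffer sized to fit.
   Returns the datagram length and stores the buffer in *out (caller
   frees), or returns -1 on error. */
ssize_t recv_datagram_alloc(int fd, char **out)
{
    char probe;
    /* Peek with a 1-byte buffer: with MSG_TRUNC the return value is the
       full datagram size, and the datagram stays queued. */
    ssize_t size = recv(fd, &probe, sizeof probe, MSG_PEEK | MSG_TRUNC);
    if (size < 0)
        return -1;
    char *buf = malloc(size > 0 ? (size_t)size : 1);
    if (buf == NULL)
        return -1;
    /* Now consume the datagram for real into the right-sized buffer. */
    ssize_t n = recv(fd, buf, (size_t)size, 0);
    if (n < 0) {
        free(buf);
        return -1;
    }
    *out = buf;
    return n;
}
```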

There is no absolute answer to your question, because technology is always bound to be implementation-specific. I am assuming you are communicating in UDP because incoming buffer size does not bring problem to TCP communication.
According to RFC 768, the packet size (header-inclusive) for UDP can range from 8 to 65 515 bytes. So the fail-proof size for incoming buffer is 65 507 bytes (~64KB)
However, not all large packets can be properly routed by network devices, refer to existing discussion for more information:
What is the optimal size of a UDP packet for maximum throughput?
What is the largest Safe UDP Packet Size on the Internet

16kb is about right; if you're using gigabit ethernet, each packet could be 9kb in size.
