I need to read from an AF_UNIX socket to a buffer using the function read from C, but I don't know the buffer size.
I think the best way is to read N bytes until the read returns 0 (no more writers in the socket). Is this correct? Is there a way to guess the size of the buffer being written on the socket?
I was thinking that a socket is a special file. Opening the file in binary mode and getting the size would help me in knowing the correct size to give to the buffer?
I'm very new to C, so please keep that in mind.
One common way is to use ioctl(..) to query FIONREAD on the socket, which returns how much data is currently available.
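int len = 0;
if (ioctl(sock, FIONREAD, &len) == 0 && len > 0) {
    /* Never read more than the buffer can hold (buffer is assumed to be an array). */
    if ((size_t)len > sizeof(buffer))
        len = sizeof(buffer);
    len = read(sock, buffer, len);
}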
One way to read an unknown amount from the socket while avoiding blocking could be to poll() a non-blocking socket for data.
E.g.
char buffer[1024];
int ptr = 0;
ssize_t rc;
struct pollfd fd = {
    .fd = sock,
    .events = POLLIN
};
poll(&fd, 1, 0); // Doesn't wait for data to arrive.
while ( fd.revents & POLLIN )
{
    rc = read(sock, buffer + ptr, sizeof(buffer) - ptr);
    if ( rc <= 0 )
        break;
    ptr += rc;
    poll(&fd, 1, 0);
}
printf("Read %d bytes from sock.\n", ptr);
I think the best way is to read N
bytes until the read returns 0 (no
more writers in the socket). Is this
correct?
0 means EOF: the other side has closed the connection. If the other side of the communication closes the connection, then it is correct.
If the connection isn't closed (multiple transfers over the same connection, chatty protocol), then the case is a bit more complicated, and the behavior generally depends on whether you have a SOCK_STREAM or a SOCK_DGRAM socket.
Datagram sockets are already delimited for you by the OS.
Stream sockets do not delimit messages (all data is an opaque byte stream), and if delimiting is desired one has to implement it at the application level: for example, by defining a size field in the message header structure or by using a delimiter (e.g. '\n' for single-line text messages). In the first case you would first read the header, extract the length, and use the length to read the rest of the message. In the other case, read the stream into a partial buffer, search for the delimiter, and extract from the buffer the message including the delimiter (you might need to keep the partial buffer around since, depending on the protocol, several commands can be received with a single recv()/read()).
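The size-field approach could look roughly like the sketch below. The wire format (a 4-byte big-endian length followed by the payload), the names read_full and read_message, and the MAX_MESSAGE limit are all assumptions made for illustration; they are not part of any standard API.

```c
#include <arpa/inet.h>   /* htonl, ntohl */
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

#define MAX_MESSAGE (1u << 20)  /* assumed sanity limit for this sketch */

/* Read exactly len bytes. Returns 0 on success, -1 on error or early EOF. */
int read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;        /* interrupted by a signal: retry */
            return -1;
        }
        if (n == 0)
            return -1;           /* peer closed before the message was complete */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical framing: 4-byte big-endian length, then the payload.
 * Returns a malloc'd buffer (caller frees) and sets *out_len, or NULL. */
char *read_message(int fd, uint32_t *out_len)
{
    uint32_t net_len;
    if (read_full(fd, &net_len, sizeof net_len) < 0)
        return NULL;

    uint32_t len = ntohl(net_len);
    if (len > MAX_MESSAGE)
        return NULL;             /* refuse absurd sizes from the peer */

    char *msg = malloc(len ? len : 1);
    if (msg == NULL)
        return NULL;
    if (read_full(fd, msg, len) < 0) {
        free(msg);
        return NULL;
    }
    *out_len = len;
    return msg;
}
```

The sender would write htonl(length) followed by the payload. Because the length arrives first, the receiver can allocate exactly the right amount instead of guessing.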
Is there a way to guess the
size of the buffer being written on
the socket?
For stream sockets, there is no reliable way, as the other side of the communication might still be in the process of writing the data. Imagine the quite normal case: the socket buffer is 32K and 128K is being written. The writing application would block inside send()/write(), with the OS waiting for the reading application to read out the data and thus free space for the next chunk of written data.
For datagram sockets, one normally knows the size of the message beforehand. Or one can try (I never did that myself) recvmsg() with MSG_PEEK and, if MSG_TRUNC is set in the returned msghdr.msg_flags, increase the buffer size and try again.
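A minimal sketch of that peek-and-grow idea follows. The function name recv_datagram and the starting size of 64 bytes are assumptions for illustration, and whether MSG_TRUNC is reported on a peek may vary by platform, so treat this as something to verify on your system rather than a guaranteed recipe.

```c
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Peek at the next datagram, doubling the buffer until it fits, then
 * receive it for real. Returns a malloc'd buffer (caller frees) and
 * stores the datagram length in *out_len; returns NULL on error. */
char *recv_datagram(int fd, size_t *out_len)
{
    size_t cap = 64;                 /* arbitrary starting size */
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        struct iovec iov = { .iov_base = buf, .iov_len = cap };
        struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };

        /* MSG_PEEK leaves the datagram queued on the socket. */
        if (recvmsg(fd, &msg, MSG_PEEK) < 0) {
            free(buf);
            return NULL;
        }
        if (!(msg.msg_flags & MSG_TRUNC))
            break;                   /* the whole datagram fits in cap bytes */

        cap *= 2;
        char *bigger = realloc(buf, cap);
        if (bigger == NULL) {
            free(buf);
            return NULL;
        }
        buf = bigger;
    }

    /* The buffer is now large enough: consume the datagram. */
    ssize_t n = recv(fd, buf, cap, 0);
    if (n < 0) {
        free(buf);
        return NULL;
    }
    *out_len = (size_t)n;
    return buf;
}
```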
You are correct: if you don't know the size of the input, you can just read one byte at a time and append it to a larger buffer.
read N bytes until the read returns 0
Yes!
One added detail. If the sender doesn't close the connection, the socket will just block, instead of returning. A nonblocking socket will return -1 (with errno == EAGAIN) when there's nothing to read; that's another case.
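For the blocking case where the sender does close the connection, reading until read() returns 0 can be sketched as below. The name read_all and the 4096-byte starting capacity are illustrative assumptions; the buffer is grown with realloc as data arrives, so the total size never has to be known up front.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Read from fd until EOF, growing the buffer as needed.
 * Returns a malloc'd buffer (caller frees) and sets *out_len,
 * or returns NULL on error. */
char *read_all(int fd, size_t *out_len)
{
    size_t cap = 4096, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        if (len == cap) {            /* buffer full: double it */
            cap *= 2;
            char *bigger = realloc(buf, cap);
            if (bigger == NULL) {
                free(buf);
                return NULL;
            }
            buf = bigger;
        }
        ssize_t n = read(fd, buf + len, cap - len);
        if (n < 0) {
            if (errno == EINTR)
                continue;            /* interrupted by a signal: retry */
            free(buf);
            return NULL;
        }
        if (n == 0)
            break;                   /* EOF: the writer closed its end */
        len += (size_t)n;
    }
    *out_len = len;
    return buf;
}
```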
Opening the file in binary mode and getting the size would help me in knowing the correct size to give to the buffer?
Nope. Sockets don't have a size. Suppose you sent two messages over the same connection: How long is the file?
Related
I'm working with POSIX sockets in C.
Given X, I have a need to verify that the socketfd contains at least X bytes before proceeding to perform an operation with it.
With that being said, I don't want to receive X bytes and store it into a buffer using recv as X has the potential of being very large.
My first idea was to use MSG_PEEK...
int X = 9999999;
char buffer[1];
int num_bytes = recv(socketfd, buffer, X, MSG_PEEK);
(num_bytes == X) ? good : bad;
...
...
...
// Do some operation
But I'm concerned that X > 1 corrupts memory. The MSG_TRUNC flag seems to resolve the memory concern, but it removes X bytes from socketfd.
There's a big difference between e.g. TCP and UDP in this regard.
UDP is packet based: you send and receive discrete packets (datagrams), and each packet has a definite size.
TCP is a streaming protocol, where data begins to stream on connection and stops at disconnection. There are no message boundaries or delimiters in TCP, other than what you add at the application layer. It's simply a stream of bytes without any meaning (in TCP's point of view).
That means there's no way to tell how much will be received with a single recv call.
You need to come up with an application-level protocol (on top of TCP) that lets the receiver tell where a message ends. For example, there might be a fixed-size data header that contains the size of the following data, or you could have a specific delimiter between messages, something that can't occur in the stream of bytes.
Then you receive in a loop until you have either received all the data or received the delimiter. But note that with a delimiter you may also receive the beginning of the next message, so you need to be able to handle the partial beginning of a message after the current message has been fully received.
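The delimiter case, including the leftover bytes of the next message, might be handled along these lines. The struct stream_buf type, the extract_line function, the 4096-byte capacity, and the choice of '\n' as delimiter are all assumptions for the sake of the sketch.

```c
#include <string.h>

/* Bytes received from the stream that have not been processed yet. */
struct stream_buf {
    char   data[4096];
    size_t len;
};

/* If buf holds a complete '\n'-terminated message, copy it into out
 * (without the delimiter, NUL-terminated), keep whatever follows it in
 * buf, and return 1. Returns 0 if no complete message is buffered yet,
 * -1 if out is too small. */
int extract_line(struct stream_buf *buf, char *out, size_t out_size)
{
    char *nl = memchr(buf->data, '\n', buf->len);
    if (nl == NULL)
        return 0;                    /* need more data from recv() */

    size_t msg_len = (size_t)(nl - buf->data);
    if (msg_len + 1 > out_size)
        return -1;

    memcpy(out, buf->data, msg_len);
    out[msg_len] = '\0';

    /* Shift the start of the next message to the front of the buffer. */
    size_t consumed = msg_len + 1;
    memmove(buf->data, buf->data + consumed, buf->len - consumed);
    buf->len -= consumed;
    return 1;
}
```

A receive loop would append each recv() result at buf->data + buf->len, add the returned count to buf->len, and then call extract_line repeatedly until it returns 0.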
int num_bytes = recv(socketfd, buffer, X, MSG_PEEK);
This will copy up to X bytes into buffer and return them without removing them from the socket. But your buffer is only 1 byte large. Increase your buffer.
Have you tried this?
ssize_t available = recv(socketfd, NULL, 0, MSG_PEEK | MSG_TRUNC);
Or this?
int available = 0; /* FIONREAD stores its result in an int */
ioctl(socketfd, FIONREAD, &available);
Today I have encountered some weird looking code that at first glance it's not apparent to me what it does.
send(file_desc,"Input \'y\' to continue.\t",0x18,0);
read(file_desc,buffer,100);
iVar1 = strcmp("y",(char *)buffer);
if (iVar1 == 0) {
// some more code
}
It seems that a text string is being written into the file descriptor. Immediately after that, it reads from that file descriptor into a buffer, and then it checks whether the text in the buffer is "y".
My understanding (please correct me if I am wrong), is that it writes some data which is a text string into the file descriptor, and then the file descriptor acts as a temporary storage location for anything you write to it. And after that it reads that data from the file descriptor into the buffer. It actually is the same file descriptor. It seems as a primitive way of using a file descriptor to copy data from the text string into the buffer. Why not just use a strcpy() instead?
What would be the use case of writing to a file descriptor and then immediately read from it? It seems like a convoluted way to copy data using file descriptors. Or maybe I don't understand this code well enough, what this sequence of a send() and a read() does?
And assuming that this code is instead using the file descriptor to copy the text string "Input \'y\' to continue.\t" into the buffer, why are they comparing it with the string "y"? It should probably be false every single time.
I am assuming that any data written into a file descriptor stays in that file descriptor until it is read from. So here it seems that send() is being used to write the string into, and read() is used to read it back out.
In man send it says:
The only difference between send() and write(2) is the presence of flags. With a zero
flags argument, send() is equivalent to write(2).
Why would they use send() instead of write()? This code is just so mind-boggling.
Edit: here's the full function where this code is originally from:
void send_read(int file_desc)
{
    int are_equal;
    undefined2 buffer [8];
    char local_28 [32];
    /* 0x6e == 110 == 'n' */
    buffer[0] = 0x6e;
    send(file_desc,"Input \'y\' to continue.\t",0x18,0);
    read(file_desc,buffer,100);
    are_equal = strcmp("y",(char *)buffer);
    if (are_equal == 0) {
        FUN_00400a86(file_desc,local_28);
    }
    else {
        close(file_desc);
    }
    return;
}
The send() and recv() functions are for use with sockets (send: send a message on a socket — recv: receive a message from a connected socket). See also the POSIX description of Sockets in general.
Socket file descriptors are bi-directional — you can read and write on them. You can't read what you wrote, unlike with pipe file descriptors. With pipes, the process writing to the write end of a pipe can read what it wrote from the read end of the pipe — if another process didn't read it first. When a process writes on a socket, that information goes to the peer process and cannot be read by the writer.
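A small experiment with socketpair() illustrates the point. The function name peer_gets_the_data is made up for this sketch; it sends one byte on one end of a connected pair and checks that the byte shows up on the other end but not on the end that sent it.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if a byte sent on sv[0] is readable on sv[1] but NOT on
 * sv[0] itself, 0 if the behavior differs, -1 if the pair can't be made. */
int peer_gets_the_data(void)
{
    int sv[2];
    char buf[16];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    ssize_t sent = send(sv[0], "y", 1, 0);

    /* Non-blocking attempt to read our own data back: nothing is there,
     * so this fails with -1 instead of returning the byte. */
    ssize_t own = recv(sv[0], buf, sizeof buf, MSG_DONTWAIT);

    /* The peer end does receive the byte. */
    ssize_t peer = recv(sv[1], buf, sizeof buf, 0);

    close(sv[0]);
    close(sv[1]);
    return sent == 1 && own == -1 && peer == 1;
}
```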
send(2) is a system call that can only be used with sockets. A socket is a descriptor that lets you send data to, or receive data from, a remote endpoint (a remote socket), which can be on a different computer or on the same one. It works like a phone line: what you send is received by your partner, and what he/she sends is received by you. The read(2) system call can be used on sockets, while send(2) cannot be used on files, so your sample code mixes a socket-only call with a general file call (that's not uncommon, as read(2) and write(2) can both be used with sockets).
The code you posted is erroneous, as it blindly passes the received buffer to strcmp, assuming that it received a null-terminated string. That may be the case, but it is not guaranteed.
Even if the sender (on the other side of the connection) agreed to send a full, nul-terminated string as the message, the receiver must first check the amount of data received. This is the return value of the read(2) call, which can be:
-1 indicating an error on reception. The connection may have been reset by the other side, or the other side may have rebooted while you were sending the data.
0 indicating no more data or end of data (the other side closed the connection). This can happen if the other side has a timeout and you take too long to respond: it closes the connection without sending anything, and you just receive nothing.
n some data, less than the buffer size, but including the full packet sent by the peer (and the agreed nul byte it sent with it). This is the only case in which you can safely strcmp the data.
n some data, less than the buffer size, and less than the data transmitted. This can happen when the data is split across several packets. Then you have to do another read until you have all the data sent by your peer. Such splitting is natural in TCP, for example.
n some data, less than the buffer size, and more than the data transmitted. The sender did another transmit after the one you were expecting, and both packets got into the kernel buffer. You have to handle this case: you have one full packet and must save the rest of the received data in the buffer for later processing, or you'll lose data you have received.
n some data, the full buffer filled, and no space left to store the rest of the transmitted data. You have filled the buffer and no \0 char came: the packet is larger than the buffer, you have run out of buffer space, and you have to decide what to do (allocate another buffer to receive the rest, discard the data, or whatever you decide). This will not happen to you because you expect a packet of 1 or 2 characters and you have a buffer of 100, but who knows...
At the very least, as a minimal safety net, you can do this:
send(file_desc,"Input \'y\' to continue.\t",0x18,0);
int n = read(file_desc,buffer,sizeof buffer - 1); /* one cell reserved for '\0' */
switch (n) {
case -1: /* error */
    do_error();
    break;
case 0: /* disconnect */
    do_disconnect();
    break;
default: /* some data */
    buffer[n] = '\0'; /* append the null */
    break;
}
if (n > 0) {
    iVar1 = strcmp("y",(char *)buffer);
    if (iVar1 == 0) {
        // some more code
    }
}
Note:
As you didn't post a complete and verifiable example, I couldn't post a complete and verifiable response.
My apologies for that.
With the following pseudo-Python script for sending data to a local socket:
s = socket.socket(AF_UNIX, SOCK_STREAM)
s.connect("./sock.sock")
s.send("test\n")
s.send("aaa\0")
s.close()
My C program will randomly end up recving the following buffers:
test\n
test\n<random chars>
test\naaa (as expected)
The socket is being recv()'d after select() reports that the socket is readable. The question is, how to avoid the first two cases?
And side question: Is it possible to send the following two messages from that script:
asd\0
dsa\0
And have select() show the socket as readable on each of those sends, or will it only do that if I run the script again (restarting the socket client connection) and send one message per connection?
At a guess, the len argument to recv specifies a maximum amount of data to read, not the precise amount to be returned. recv is free to return any amount of data up to len bytes instead.
If you want to read a specific number of bytes, call recv in a loop.
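int bytes = 0;
while (bytes < len) {
    int remaining = len - bytes;
    /* recv may return fewer bytes than requested */
    ssize_t n = recv(sockfd, buf + bytes, remaining, 0);
    if (n < 0) {
        // error
        break;
    }
    if (n == 0) {
        // peer closed the connection before len bytes arrived
        break;
    }
    bytes += n;
}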
As noted by junix, if you'll need to send unpredictable amounts of data, consider defining a simple protocol that either starts each message with a note of its length or ends with a particular byte or sequence of bytes.
C++ has the following function to receive bytes from socket, it can check for number of bytes available with the MSG_PEEK flag. With MSG_PEEK, the returned value of 'recv' is the number of bytes available in socket:
#include <sys/socket.h>
ssize_t recv(int socket, void *buffer, size_t length, int flags);
I need to get the number of bytes available in the socket without creating buffer (without allocating memory for buffer). Is it possible and how?
What you're looking for is ioctl(fd,FIONREAD,&bytes_available), and under Windows ioctlsocket(socket,FIONREAD,&bytes_available).
Be warned though, the OS doesn't necessarily guarantee how much data it will buffer for you, so if you are waiting for very much data you are going to be better off reading in data as it comes in and storing it in your own buffer until you have everything you need to process something.
To do this, what is normally done is you simply read chunks at a time, such as
char buf[4096];
ssize_t bytes_read;
do {
    bytes_read = recv(socket, buf, sizeof(buf), 0);
    if (bytes_read > 0) {
        /* do something with buf, such as append it to a larger buffer or
         * process it */
    }
} while (bytes_read > 0);
And if you don't want to sit there waiting for data, you should look into select or epoll to determine when data is ready to be read or not, and the O_NONBLOCK flag for sockets is very handy if you want to ensure you never block on a recv.
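As a rough sketch of the select() route, the helper below waits for a descriptor to become readable with a timeout and only then calls read(). The names wait_readable and read_with_timeout, and the -2 return value for a timeout, are conventions invented for this example.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error.
 * Note: select() only supports fd values below FD_SETSIZE. */
int wait_readable(int fd, int timeout_ms)
{
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    struct timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int rc = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return rc > 0 && FD_ISSET(fd, &readfds);
}

/* Read whatever is available, waiting at most timeout_ms for it.
 * Returns bytes read, 0 on EOF, -1 on error, -2 on timeout. */
ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    int ready = wait_readable(fd, timeout_ms);
    if (ready < 0)
        return -1;
    if (ready == 0)
        return -2;
    return read(fd, buf, len);
}
```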
On Windows, you can use the ioctlsocket() function with the FIONREAD flag to ask the socket how many bytes are available without needing to read/peek the actual bytes themselves. The value returned is the minimum number of bytes recv() can return without blocking. By the time you actually call recv(), more bytes may have arrived.
Be careful when using FIONREAD! The problem with using ioctl(fd, FIONREAD, &available) is that on some systems it will always return the total number of bytes available for reading in the socket buffer.
This is no problem for STREAM sockets (TCP) but is misleading for DATAGRAM sockets (UDP). For datagram sockets, read requests are capped to the size of the first datagram in the buffer, and when reading less than the size of the first datagram, all unread bytes of that datagram are still discarded. So ideally you want to know only the size of the next datagram in the buffer.
E.g. on macOS/iOS it is documented that FIONREAD always returns the total amount (see comments about SO_NREAD). To only get the size of the next datagram (and total size for stream sockets), you can use the code below:
int available;
socklen_t optlen = sizeof(available);
int err = getsockopt(soc, SOL_SOCKET, SO_NREAD, &available, &optlen);
On Linux FIONREAD is documented to only return the size of the next datagram for UDP sockets.
On Windows ioctlsocket(socket, FIONREAD, &available) is documented to always give the total size:
If the socket passed in the s parameter is message oriented (for example, type SOCK_DGRAM), FIONREAD returns the reports the total number of bytes available to read, not the size of the first datagram (message) queued on the socket.
Source: https://learn.microsoft.com/en-us/windows/win32/api/ws2spi/nc-ws2spi-lpwspioctl
I am unaware of a way to get the size of the first datagram only on Windows.
The short answer is: this cannot be done with MS-Windows WinSock2,
as I have discovered over the last week of trying.
Glad to have finally found this post, which sheds some light on the issues I've been having, using latest Windows 10 Pro, version 20H2 Build 19042.867 (x86/x86_64) :
On a bound, disconnected UDP socket 'sk' (in Listening / Server mode):
1. Any attempt to use either ioctlsocket(sk, FIONREAD, &n_bytes)
or WSAIoctl with a shifted FIONREAD argument, after a call to
select() returns > 0 with the 'sk' FD bit set in the read FD set,
succeeds (the ioctl call returns 0 and n_bytes is > 0), but
leaves the socket sk in a state where any
subsequent call to recv(), recvfrom(), or ReadFile() returns
SOCKET_ERROR with a WSAGetLastError() of
10045, 'Operation Not Supported', or a ReadFile
error 87, 'Invalid Parameter'.
Moreover, even worse:
2. Any attempt to use recv or recvfrom with the 'MSG_PEEK' msg_flags
parameter returns -1 and WSAGetLastError returns :
10040 : 'A message sent on a datagram socket was larger than
the internal message buffer or some other network limit,
or the buffer used to receive a datagram into was smaller
than the datagram itself.
' .
Yet for that socket I DID successfully call:
setsockopt(s, SOL_SOCKET, SO_RCVBUF, bufsz = 4096 , sizeof(bufsz) )
and the UDP packet being received was of only 120 bytes in size.
In short, with modern windows winsock2 ( winsock2.h / Ws2_32.dll) ,
there appears to be absolutely no way to use any documented API
to determine the number of bytes received on a bound UDP socket
before calling recv() / recvfrom() in MSG_WAITALL blocking mode to
actually receive the whole packet.
If you do not call ioctlsocket() or WSAIoctl or
recv{,from}(...,MSG_PEEK,...)
before entering recv{,from}(...,MSG_WAITALL,...) ,
then the recv{,from} succeeds.
I am considering advising clients that they must install and run
a Linux instance with MS Services for Linux under their windows
installation , and developing some
API to communicate with it from Windows, so that reliable
asynchronous UDP communication can be achieved - or does anyone
know of a good open source replacement for WinSock2 ?
I need access to a "C" library TCP+UDP/IP implementation for
modern Windows 10 that conforms to its own documentation,
unlike WinSock2 - does anyone know of one ?
I am trying to think of the best way to have a video file sent over a TCP socket. I have made a standard socket program, but after the read command I'm not sure how I can go about saving it.
Code sample
//bind server socketfd
if (bind(sfdServer, (struct sockaddr*)&adrServer, ServerAddrLen) < 0)
    error("ERROR binding");
listen(sfdServer, 5);
while(1){
    printf("Waiting for connections...\n");
    sfdClient = accept(sfdServer, (struct sockaddr*)&adrClient, &ClientAddrLen);
    if(sfdClient < 0)
        error("ERROR accepting");
    printf("Connection Established.\n");
    //set buffer to zero
    bzero(buff, 2048);
    printf("Reading from client.\n");
    numChar = read(sfdClient, buff, 2048);
    //What should go here?
    close(sfdClient);
    close(sfdServer);
}
Would I just save the buffer as a file movie.mp4 or something like that? I realize I may need to change my buffer size or possibly send it in chunks. But I can't find any good info on the best way to do this. Any help or a point in the right direction would be appreciated!
You'd write the buffer chunks to a file.
First, open an output file for writing:
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int outfd;
outfd = open("movie.mp4", O_WRONLY | O_CREAT | O_TRUNC, 0644); /* create or overwrite the file */
Then, after the read(), write your bytes:
numWrite = write(outfd, buff, numChar);
Note that you will need to deal with a number of chunking/buffering cases, such as:
Either the read() or the write() returning zero or -1 (error)
Reading until no more bytes are available. Right now, your code only reads 2048 bytes and then closes the socket.
The write() writing fewer bytes than requested (this can happen on network filesystems)
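Taken together, those cases suggest a loop roughly like the following sketch. The names write_all and copy_until_eof and the 4096-byte chunk size are assumptions for illustration; in_fd would be the accepted client socket and out_fd the opened output file.

```c
#include <errno.h>
#include <unistd.h>

/* Write all len bytes, retrying after short writes.
 * Returns 0 on success, -1 on error. */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;        /* interrupted by a signal: retry */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Copy everything from in_fd to out_fd until read() reports EOF.
 * Returns the number of bytes copied, or -1 on error. */
long long copy_until_eof(int in_fd, int out_fd)
{
    char buf[4096];
    long long total = 0;

    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof buf);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            return total;        /* sender closed the connection */
        if (write_all(out_fd, buf, (size_t)n) < 0)
            return -1;
        total += n;
    }
}
```

This relies on the sender closing the connection to mark the end of the file; if the connection stays open, the receiver needs to be told the size in advance.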
Firstly, you should consider using sendfile() or the equivalent for your system. That is, rather than reading a chunk of the file into your memory only to write it back out somewhere, you let the kernel copy the data straight from the file descriptor to the socket.
As for chunking, TCP takes care of splitting the stream into packets for you. Ideally the remote node should know exactly how much to expect upfront so it can handle premature termination (that's why HTTP headers contain a content length).
So use some sort of handshake before sending the data. Then the sender should just call sendfile() in a loop, paying attention to the amount of sent data and the returned offset. The receiver should call recv() in a loop to make sure it all arrives (there is no standard "recvfile()").
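On Linux, the sender side might be sketched as follows. This assumes the Linux sendfile(2) signature from <sys/sendfile.h> (other systems declare sendfile differently or not at all), the function name send_file is made up, and the size handshake is left out for brevity.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/sendfile.h>   /* Linux-specific header */
#include <sys/socket.h>
#include <sys/stat.h>
#include <unistd.h>

/* Send the whole file at path over the connected socket sock.
 * Returns the number of bytes sent, or -1 on error. */
long long send_file(int sock, const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }

    /* sendfile() advances offset by the number of bytes it sent,
     * so the loop resumes where a short transfer stopped. */
    off_t offset = 0;
    while (offset < st.st_size) {
        ssize_t n = sendfile(sock, fd, &offset,
                             (size_t)(st.st_size - offset));
        if (n < 0 && errno == EINTR)
            continue;            /* interrupted by a signal: retry */
        if (n <= 0) {            /* error, or file shrank underneath us */
            close(fd);
            return -1;
        }
    }
    close(fd);
    return (long long)offset;
}
```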