Windows C socket, read N bytes from the socket stream

Windows C socket, read N bytes from the socket stream - c

How do i read exactly N bytes from the socket stream in C on windows 10, the only function i was able to find is recv() that reads everything.

As the official documentation says, you have to pass to the recv() function 4 parameters:
s - The socket descriptor
buf - The memory buffer which will hold the read data
len - The length of the memory buffer
flags - "A flag or a set of flags that influence the behavior of this function"
In particular, when the last parameter is set to the defined value MSG_WAITALL, the recv() function will complete only when one of these events occurs:
The buffer supplied by the caller is completely full.
The connection has been closed.
The request has been canceled or an error occurred.
In other words, if you want to read exactly 10 bytes, you should create a 10 bytes buffer and ask the recv() function to return only once the buffer is full!
char data[10];
recv(streamSocket, data, 10, MSG_WAITALL);
Hope it helps

Related

How does recv work in socket programming?

I'm trying to understand recv() at a high level. So recv takes data in "chunks" but I'm still not getting how it is precisely handled. Example:
char buffer[1000];
int received= recv(sock, buffer, sizeof(buffer), 0)
Does this mean if I'm receiving a massive file, the buffer, if connected through sock might for example reflect it stored 500 bytes in the received variable, then in a loop receive another 300 bytes, and all 800 bytes of data will be stored in buffer by the end of the loop (lost in the received variable unless accounted for), or does buffer need a pointer to keep track of where it last received the data to store it in then next iteration?

recv has no context. All it knows that it got some address (pointer) to write into and some maximum size - and then it will try this. It will always start writing with the given address. If for example on wish to add data after some previously received data one can simply give the pointer into the location after the previous data instead of the beginning of the buffer. Of course on should adjust the maximum size it is allowed to read to not overflow the buffer.

You asked "How does recv() work?", so it may be worth briefly studying a simpler function that does essentially the same thing - read().
recv() operates in more or less the same way as the read() function. The main difference is that recv() allows you to pass flags in the last argument - but you are not using these flags anyway.
My suggestion would be - before trying to use recv() to read from a network socket - to practice using read() on a plain text file.
Both functions return the number of bytes read - except in the case of an error, in which case they will return -1. You should always check for this scenario - and handle appropriately.
Both functions can also return less than the number of bytes requested. In the case of recv() - and reading from a socket - this may be because the other end has simply not sent all the required data yet. In the case of a reading from a file - with read() - it may be because you have reached the end of the file.
Anyway ...
You will need to keep track of the current offset within your buffer - and update it on each read. So declare a file-scope variable offset.
static off_t offset; static char buffer[1000];
Then - when your 'loop' is running - increment the offset after each read ...
while (1) {
size_t max_len = sizeof(buffer) - offset;
ssize_t count = recv(sock, buffer+offset, max_len, 0);
if (count == -1) {
switch (errno) {
case EAGAIN:
usleep(20000);
break;
default:
perror("Failed to read from socket");
close(sock);
break;
}
}
if (count == 0) {
puts("Looks like connection has been closed.");
break;
}
offset += count;
if (offset >= expected_len) {
puts("Got the expected amount of data. Wrapping up ...");
}
}
Notes:
Using this approach, you will either need to know the expected amount of data before-hand - or use a special delimiter to mark the end of the message
the max_len variable indicates how much space is left in your buffer - and (perhaps needless to say) you should not try to read more bytes than this
the destination for the recv() command is buffer+offset - not buffer.
if recv() returns zero, AFAIK this indicates that the other end has performed an "orderly shutdown".
if recv() returns -1, you really need to check the return code. EAGAIN is non-fatal - and just means you need to try again.

Write to file descriptor and immediately read from it

Today I have encountered some weird looking code that at first glance it's not apparent to me what it does.
send(file_desc,"Input \'y\' to continue.\t",0x18,0);
read(file_desc,buffer,100);
iVar1 = strcmp("y",(char *)buffer);
if (iVar1 == 0) {
// some more code
}
It seems that a text string is being written into the file descriptor. Immediately then after that it reads from that file descriptor into a buffer. And it compares if the text written into the buffer is a "y".
My understanding (please correct me if I am wrong), is that it writes some data which is a text string into the file descriptor, and then the file descriptor acts as a temporary storage location for anything you write to it. And after that it reads that data from the file descriptor into the buffer. It actually is the same file descriptor. It seems as a primitive way of using a file descriptor to copy data from the text string into the buffer. Why not just use a strcpy() instead?
What would be the use case of writing to a file descriptor and then immediately read from it? It seems like a convoluted way to copy data using file descriptors. Or maybe I don't understand this code well enough, what this sequence of a send() and a read() does?
And assuming that this code is instead using the file descriptor to copy the text string "Input \'y\' to continue.\t" into the buffer, why are they comparing it with the string "y"? It should probably be false every single time.
I am assuming that any data written into a file descriptor stays in that file descriptor until it is read from. So here it seems that send() is being used to write the string into, and read() is used to read it back out.
In man send it says:
The only difference between send() and write(2) is the presence of flags. With a zero
flags argument, send() is equivalent to write(2).
why would they use send() instead of write()? This code is just so mind boggling.
Edit: here's the full function where this code is originally from:
void send_read(int file_desc)
{
int are_equal;
undefined2 buffer [8];
char local_28 [32];
/* 0x6e == 110 == 'n' */
buffer[0] = 0x6e;
send(file_desc,"Input \'y\' to continue.\t",0x18,0);
read(file_desc,buffer,100);
are_equal = strcmp("y",(char *)buffer);
if (are_equal == 0) {
FUN_00400a86(file_desc,local_28);
}
else {
close(file_desc);
}
return;
}

The send() and recv() functions are for use with sockets (send: send a message on a socket — recv: receive a message from a connected socket). See also the POSIX description of Sockets in general.
Socket file descriptors are bi-directional — you can read and write on them. You can't read what you wrote, unlike with pipe file descriptors. With pipes, the process writing to the write end of a pipe can read what it wrote from the read end of the pipe — if another process didn't read it first. When a process writes on a socket, that information goes to the peer process and cannot be read by the writer.

send(2) is a system call that can only be used with sockets. A socket is a descriptor that allows you to use it to send data or receive from a remote point (a remote socket) that can be on a different computer or in the same as you are. But it works like a phone line, what you send is received by your parnter and what he/she sends is received by you. read(2) system call can be used by sockets, while send(2) cannot be used by files, so your sample code is mixing calls related to files with calls related to sockets (that's not uncommon, as read(2) and write(2) can both be used with sockets)
The code you post above is erroneous, as it blindly compares the received buffer with strcmp function, assuming that it received a null terminated string. This can be the case, but it also cannot.
Even if the sender (in the other side of the connection) agreed on sending a full message, nul terminated string. The receiver must first get the amount of data received (this is the return value of the read(2) call, which can be:
-! indicating some error on reception. The connection can be reset by the other side, or the other side can have rebooted while you send the data.
0 indicating no more data or end of data (the other side closed the connection) This can happen if the other side has a timeout and you take too much to respond. It closes the connection without sending anything. You just receive nothing.
n some data, less than the buffer size, but including the full packet sent by the peer (and the agreed nul byte it sent with it). This is the only case in which you can safely strcmp the data.
n some data, less than the buffer size, and less than the data transmitted. This can happen due to some data fragmentation of the data in several packets. Then you have to do another read until you have all the data send by your peer. Packet fragmentation is something natural in TCP, for example.
n some data, less than the buffer size, and more than the data transmitted. The sender did another transmit, after the one you receive, and both packets got into the kernel buffer. You have to investigate this case, as you have one full packet, and must save the rest of the received data in the buffer, for later processing, or you'll lose data you have received.
n some data, the full buffer filled, and no space to store the full transmitted data remained. You have filled the buffer and no \0 char came... the packet is larger than the buffer, you run out of buffer space and have to decide what to do (allocate other buffer to receive the rest, discard the data, or whatever you decide to do) This will not happen to you because you expect a packet of 1 or 2 characters, and you have a buffer of 100, but who knows...
At least, and as a minimum safe net, you can do this:
send(file_desc,"Input \'y\' to continue.\t",0x18,0);
int n = read(file_desc,buffer,sizeof buffer - 1); /* one cell reserved for '\0' */
switch (n) {
case -1: /* error */
do_error();
break;
case 0: /* disconnect */
do_disconnect();
break;
default: /* some data */
buffer[n] = '\0'; /* append the null */
break;
}
if (n > 0) {
iVar1 = strcmp("y",(char *)buffer);
if (iVar1 == 0) {
// some more code
}
}
Note:
As you didn't post a complete and verifiable example, I couldn't post a complete and verifiable response.
My apologies for that.

Counting bytes received by posix read()

I get confused with one line of code:
temp_uart_count = read(VCOM, temp_uart_data, 4096);
I found more about read function at http://linux.die.net/man/3/read, but if everything is okay it returns 0, so how we can get num of bytes received from that?
temp_uart_count is used to count how much bytes we received from virtual COM port and stored it to temp_uart_data which is 4096 bytes wide.
Am I really getting how much bytes i received with this line of code?

... but if everything is okay it returns 0, so how we can get num of bytes received from that?
A return code of zero simply means that read() was unable to provide any data.
Am I really getting how much bytes i received with this line of code?
Yes, a positive return code (i.e. >= 0) from read() is an accurate count of bytes that were returned in the buffer. Zero is a valid count.
If you're expecting more data, then simply repeat the read() syscall. (However you may have setup the termios arguments poorly, e.g. VMIN=0 and VTIME=0).
And - zero indicates end of file
If you get 0, it means that the end of file (or an equivalent condition) has been reached and there is nothing else to read.
The above (one from a comment, and the other in an answer) are incorrect.
Reading from a tty device (e.g. a serial port) is not like reading from a file on a block device, but is temporal. Data for reading is only available as it is received over the comm link.
A non-blocking read() will return with -1 and errno set to EAGAIN when there is no data available.
A blocking non-canonical read() will return zero when there is no data available. Correlate your termios configuration with this to confirm that a return of zero is valid (and does not indicate "end of file").
In either case, the read() can be repeated to get more data when/if it arrives.
Also when using non-canonical (aka raw) mode (or non-blocking reads), do not expect or rely on the the read() to perform message or packet management for you. You will need to add a layer to your program to read bytes, concatenate those bytes into a complete message datagram/packet, and validate it before that message can be processed.

ssize_t read(int fd, void *buf, size_t count); returns you the size of bytes he read and stores it into the value you passed in parameters. And when errors happen it returns -1 (with errno set to EINTR) or to return the number of bytes already read..
From the linux man :
On files that support seeking, the read operation commences at the current file offset, and the file offset is incremented by the number of bytes read. If the current file offset is at or past the end of file, no bytes are read, and read() returns zero.

Yes, temp_uart_count will contain the actual number of bytes read, and obviously that number will be smaller or equal to the number of elements of temp_uart_data. If you get 0, it means that the end of file (or an equivalent condition) has been reached and there is nothing else to read.
If it returns -1 this indicate that an error has occurred and you'll need to check the errno variable to understand what happened.

Get the number of bytes available in socket by 'recv' with 'MSG_PEEK' in C++

C++ has the following function to receive bytes from socket, it can check for number of bytes available with the MSG_PEEK flag. With MSG_PEEK, the returned value of 'recv' is the number of bytes available in socket:
#include <sys/socket.h>
ssize_t recv(int socket, void *buffer, size_t length, int flags);
I need to get the number of bytes available in the socket without creating buffer (without allocating memory for buffer). Is it possible and how?

You're looking for is ioctl(fd,FIONREAD,&bytes_available) , and under windows ioctlsocket(socket,FIONREAD,&bytes_available).
Be warned though, the OS doesn't necessarily guarantee how much data it will buffer for you, so if you are waiting for very much data you are going to be better off reading in data as it comes in and storing it in your own buffer until you have everything you need to process something.
To do this, what is normally done is you simply read chunks at a time, such as
char buf[4096];
ssize_t bytes_read;
do {
bytes_read = recv(socket, buf, sizeof(buf), 0);
if (bytes_read > 0) {
/* do something with buf, such as append it to a larger buffer or
* process it */
}
} while (bytes_read > 0);
And if you don't want to sit there waiting for data, you should look into select or epoll to determine when data is ready to be read or not, and the O_NONBLOCK flag for sockets is very handy if you want to ensure you never block on a recv.

On Windows, you can use the ioctlsocket() function with the FIONREAD flag to ask the socket how many bytes are available without needing to read/peek the actual bytes themselves. The value returned is the minimum number of bytes recv() can return without blocking. By the time you actually call recv(), more bytes may have arrived.

Be careful when using FIONREAD! The problem with using ioctl(fd, FIONREAD, &available) is that it will always return the total number of bytes available for reading in the socket buffer on some systems.
This is no problem for STREAM sockets (TCP) but misleading for DATAGRAM sockets (UDP). As for datagram sockets read requests are capped to the size of the first datagram in the buffer and when reading less than she size of the first datagram, all unread bytes of that datagram are still discarded. So ideally you want to know only the size of the next datagram in the buffer.
E.g. on macOS/iOS it is documented that FIONREAD always returns the total amount (see comments about SO_NREAD). To only get the size of the next datagram (and total size for stream sockets), you can use the code below:
int available;
socklen_t optlen = sizeof(readable);
int err = getsockopt(soc, SOL_SOCKET, SO_NREAD, &available, &optlen);
On Linux FIONREAD is documented to only return the size of the next datagram for UDP sockets.
On Windows ioctlsocket(socket, FIONREAD, &available) is documented to always give the total size:
If the socket passed in the s parameter is message oriented (for example, type SOCK_DGRAM), FIONREAD returns the reports the total number of bytes available to read, not the size of the first datagram (message) queued on the socket.
Source: https://learn.microsoft.com/en-us/windows/win32/api/ws2spi/nc-ws2spi-lpwspioctl
I am unaware of a way how to get the size of the first datagram only on Windows.

The short answer is : this cannot be done with MS-Windows WinSock2,
as I can discovered over the last week of trying.
Glad to have finally found this post, which sheds some light on the issues I've been having, using latest Windows 10 Pro, version 20H2 Build 19042.867 (x86/x86_64) :
On a bound, disconnected UDP socket 'sk' (in Listening / Server mode):
1. Any attempt to use either ioctlsocket(sk, FIONREAD, &n_bytes)
OR WsaIoctl with a shifted FIONREAD argument, though they succeed,
and retern 0, after a call to select() returns > with that
'sk' FD bit set in the read FD set,
and the ioctl call returns 0 (success), and n_bytes is > 0,
causes the socket sk to be in a state where any
subsequent call to recv(), recvfrom(), or ReadFile() returns
SOCKET_ERROR with a WSAGetLastError() of :
10045, Operation Not Supported, or ReadFile
error 87, 'Invalid Parameter'.
Moreover, even worse:
2. Any attempt to use recv or recvfrom with the 'MSG_PEEK' msg_flags
parameter returns -1 and WSAGetLastError returns :
10040 : 'A message sent on a datagram socket was larger than
the internal message buffer or some other network limit,
or the buffer used to receive a datagram into was smaller
than the datagram itself.
' .
Yet for that socket I DID successfully call:
setsockopt(s, SOL_SOCKET, SO_RCVBUF, bufsz = 4096 , sizeof(bufsz) )
and the UDP packet being received was of only 120 bytes in size.
In short, with modern windows winsock2 ( winsock2.h / Ws2_32.dll) ,
there appears to be absolutely no way to use any documented API
to determine the number of bytes received on a bound UDP socket
before calling recv() / recvfrom() in MSG_WAITALL blocking mode to
actually receive the whole packet.
If you do not call ioctlsocket() or WsaIoctl or
recv{,from}(...,MSG_PEEK,...)
before entering recv{,from}(...,MSG_WAITALL,...) ,
then the recv{,from} succeeds.
I am considering advising clients that they must install and run
a Linux instance with MS Services for Linux under their windows
installation , and developing some
API to communicate with it from Windows, so that reliable
asynchronous UDP communication can be achieved - or does anyone
know of a good open source replacement for WinSock2 ?
I need access to a "C" library TCP+UDP/IP implementation for
modern Windows 10 that conforms to its own documentation,
unlike WinSock2 - does anyone know of one ?

Read from socket

I need to read from an AF_UNIX socket to a buffer using the function read from C, but I don't know the buffer size.
I think the best way is to read N bytes until the read returns 0 (no more writers in the socket). Is this correct? Is there a way to guess the size of the buffer being written on the socket?
I was thinking that a socket is a special file. Opening the file in binary mode and getting the size would help me in knowing the correct size to give to the buffer?
I'm a very new to C, so please keep that in mind.

On common way is to use ioctl(..) to query FIONREAD of the socket which will return how much data is available.
int len = 0;
ioctl(sock, FIONREAD, &len);
if (len > 0) {
len = read(sock, buffer, len);
}

One way to read an unknown amount from the socket while avoiding blocking could be to poll() a non-blocking socket for data.
E.g.
char buffer[1024];
int ptr = 0;
ssize_t rc;
struct pollfd fd = {
.fd = sock,
.events = POLLIN
};
poll(&fd, 1, 0); // Doesn't wait for data to arrive.
while ( fd.revents & POLLIN )
{
rc = read(sock, buffer + ptr, sizeof(buffer) - ptr);
if ( rc <= 0 )
break;
ptr += rc;
poll(&fd, 1, 0);
}
printf("Read %d bytes from sock.\n", ptr);

I think the best way is to read N
bytes until the read returns 0 (no
more writers in the socket). Is this
correct?
0 means EOF, other side has closed the connection. If other side of communication closes the connection, then it is correct.
If connection isn't closed (multiple transfers over the same connect, chatty protocol), then the case is bit more complicated and behavior generally depends on whether you have SOCK_STREAM or SOCK_DGRAM socket.
Datagram sockets are already delimited for you by the OS.
Stream sockets do not delimit messages (all data are an opaque byte stream) and if desired one has to implement that on application level: for example by defining a size field in the message header structure or using a delimiter (e.g. '\n' for single-line text messages). In first case you would first read the header, extract length and using the length read the rest of the message. In other case, read stream into partial buffer, search for the delimiter and extract from buffer the message including the delimiter (you might need to keep the partial buffer around as depending on protocol several command can be received with single recv()/read()).
Is there a way to guess the
size of the buffer being written on
the socket?
For stream sockets, there is no reliable way as the other side of communication might be still in process of writing the data. Imagine the quite normal case: socket buffer is 32K and 128K is being written. Writing application would block inside send()/write(), the OS waiting for reading application to read out the data and thus free space for the next chunk of written data.
For datagram sockets, one normally knows the size of the message beforehand. Or one can try (never did that myself) recvmsg( MSG_PEEK ) and if the MSG_TRUNC is in the returned msghdr.msg_flags, try to increase the buffer size.

you are correct, if you don't know the size of the input you can just read one byte each time and append it to a larger buffer.

read N bytes until the read returns 0
Yes!
One added detail. If the sender doesn't close the connection, the socket will just block, instead of returning. A nonblocking socket will return -1 (with errno == EAGAIN) when there's nothing to read; that's another case.
Opening the file in binary mode and getting the size would help me in knowing the correct size to give to the buffer?
Nope. Sockets don't have a size. Suppose you sent two messages over the same connection: How long is the file?

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight