I am unable to make sense of how and why the following code segments work:
/* Now let's try to set the send buffer size to 5000 bytes */
size = 5000;
err = setsockopt(sockfd, SOL_SOCKET, SO_SNDBUF, &size, sizeof(int));
if (err != 0) {
printf("Unable to set send buffer size, continuing with default size\n");
}
If we check the value of the send buffer, it is indeed correctly set to 5000*2 = 10000.
However, if we try to send more than the send buffer size, it does send all of it. For example:
n = send(sockfd, buf, 30000, 0);
/* Let's check how much was actually sent */
printf("No. of bytes sent is %d\n", n);
This prints out 30000.
How exactly did this work? Didn't the fact that the send buffer size was limited to 10000 have any effect? If it did, what exactly happened? Some kind of fragmentation?
UPDATE: What happens if the socket is in non-blocking mode? I tried the following:
With the buffer size set to 10000 (5000 doubled), a non-blocking send() of 30000 bytes reports 16384 bytes sent.
With the buffer size set to 20000 (10000 doubled), it reports 30000 bytes sent.
Once again, why?
The effect of setting SO_SNDBUF option is different for TCP and UDP.
For UDP this sets the limit on the size of a single datagram, i.e. an attempt to send anything larger fails (typically with EMSGSIZE).
For TCP this just sets the size of in-kernel buffer for given socket (with some rounding to page boundary and with an upper limit).
Since it looks like you are talking about TCP, the effect you are observing is explained by the socket being in blocking mode: send(2) blocks until the kernel can accept all of your data, while the network stack asynchronously dequeues data and pushes it to the network card, freeing space in the buffer.
Also, TCP is a stream protocol, it does not preserve any "message" structure. One send(2) can correspond to multiple recv(2)s on the other side, and the other way around. Treat it as byte-stream.
SO_SNDBUF configures the buffer that the socket implementation uses internally. If your socket is non-blocking you can only send up to the configured size, if your socket is blocking there is no limitation for your call.
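To make those numbers less surprising, here is a minimal sketch (my own illustration, not code from the question) of how short writes on a non-blocking TCP socket are normally handled: whatever fits into the send buffer is accepted, and the remainder is retried once the socket becomes writable again.
/* Sketch only. Assumes <sys/socket.h>, <poll.h>, <errno.h>, <stdio.h>;
 * 'sockfd' is connected and already set O_NONBLOCK, and 'buf'/'len' hold
 * the data to transmit. */
size_t off = 0;
while (off < len) {
    ssize_t n = send(sockfd, buf + off, len - off, 0);
    if (n > 0) {
        off += (size_t)n;                 /* this much was queued in the send buffer */
    } else if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
        /* Send buffer is full: wait until the kernel drains some of it. */
        struct pollfd pfd = { .fd = sockfd, .events = POLLOUT };
        poll(&pfd, 1, -1);
    } else {
        perror("send");                   /* a real error occurred */
        break;
    }
}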
Related
I'm working with POSIX sockets in C.
Given X, I have a need to verify that the socketfd contains at least X bytes before proceeding to perform an operation with it.
With that being said, I don't want to receive X bytes and store it into a buffer using recv as X has the potential of being very large.
My first idea was to use MSG_PEEK...
int X = 9999999;
char buffer[1];
int num_bytes = recv(socketfd, buffer, X, MSG_PEEK);
(num_bytes == X) ? good : bad;
...
...
...
// Do some operation
But I'm concerned that with X > 1 this corrupts memory (the buffer is only 1 byte); the MSG_TRUNC flag seems to resolve the memory concern but removes X bytes from socketfd.
There's a big difference between e.g. TCP and UDP in this regard.
UDP is packet based: you send and receive individual datagrams, each delivered (or dropped) as a whole.
TCP is a streaming protocol, where data begins to stream on connection and stops at disconnection. There are no message boundaries or delimiters in TCP, other than what you add at the application layer. It's simply a stream of bytes without any meaning (in TCP's point of view).
That means there's no way to tell how much will be received with a single recv call.
You need to come up with an application-level protocol (on top of TCP) that can tell the size of the data to be received. For example, there might be a fixed-size header that contains the size of the following data, or a specific delimiter between messages, something that can't occur in the stream of bytes.
Then you receive in a loop until you either have received all the data, or until you have received the delimiter. But note that with a delimiter there's the possibility that you also receive the beginning of the next message, so you need to be able to handle a partial beginning of the next message after the current message has been fully received.
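As an illustration of the header variant, here is a hedged sketch (the 4-byte network-order length prefix and the read_exact helper are my own conventions for the example, not something the answer prescribes): read the fixed-size header first, then loop until exactly that many payload bytes have arrived.
/* Sketch only. Needs <stdint.h>, <stdlib.h>, <sys/socket.h>, <arpa/inet.h>. */
static int read_exact(int fd, void *dst, size_t n)
{
    char *p = dst;
    while (n > 0) {
        ssize_t r = recv(fd, p, n, 0);
        if (r <= 0)
            return -1;            /* error, or peer closed mid-message */
        p += r;
        n -= (size_t)r;
    }
    return 0;
}

/* ... inside a receive function: one framed message. A real implementation
 * would also sanity-check 'len' against a protocol maximum. */
uint32_t len_net;
if (read_exact(fd, &len_net, sizeof(len_net)) == 0) {
    uint32_t len = ntohl(len_net);
    char *msg = malloc(len);
    if (msg != NULL && read_exact(fd, msg, len) == 0) {
        /* msg[0..len-1] now holds exactly one application-level message */
    }
    free(msg);
}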
int num_bytes = recv(socketfd, buffer, X, MSG_PEEK);
This will copy up to X bytes into buffer and return the count without removing the data from the socket. But your buffer is only 1 byte large. Increase your buffer.
Have you tried this?
ssize_t available = recv(socketfd, NULL, 0, MSG_PEEK | MSG_TRUNC);
Or this?
int available;
ioctl(socketfd, FIONREAD, &available);   /* FIONREAD expects an int on Linux */
I'm testing the CAN interface on an embedded device (SOC / ARM core / Linux) using SocketCAN, and I want to send data as fast as possible for testing, using efficient code.
I can open the CAN device ("can0") as a BSD socket, and send frames with "write". This all works well.
My desktop can obviously generate frames faster than the CAN transmission rate (I'm using 500000 bps). To send efficiently, I tried using a "select" on the socket file descriptor to wait for it to become ready, followed by the "write". However, the "select" seems to return immediately regardless of the state of the send buffer, and "write" also doesn't block. This means that when the buffer fills up, I get an error from "write" (return value -1), and errno is set to 105 ("No buffer space available").
This means I have to wait an arbitrary amount of time and then try the write again, which seems very inefficient (polling!).
Here's my code (C, edited for brevity):
printf("CAN Data Generator\n");
int skt; // CAN raw socket
struct sockaddr_can addr;
struct canfd_frame frame;
const int WAIT_TIME = 500;
// Create socket:
skt = socket(PF_CAN, SOCK_RAW, CAN_RAW);
// Get the index of the supplied interface name:
unsigned int if_index = if_nametoindex(argv[1]);
// Bind CAN device to socket created above:
addr.can_family = AF_CAN;
addr.can_ifindex = if_index;
bind(skt, (struct sockaddr *)&addr, sizeof(addr));
// Generate example CAN data: 8 bytes; 0x11,0x22,0x33,...
// ...[Omitted]
// Send CAN frames:
fd_set fds;
const struct timeval timeout = { .tv_sec=2, .tv_usec=0 };
struct timeval this_timeout;
int ret;
ssize_t bytes_writ;
while (1)
{
// Use 'select' to wait for socket to be ready for writing:
FD_ZERO(&fds);
FD_SET(skt, &fds);
this_timeout = timeout;
ret = select(skt+1, NULL, &fds, NULL, &this_timeout);
if (ret < 0)
{
printf("'select' error (%d)\n", errno);
return 1;
}
else if (ret == 0)
{
// Timeout waiting for buffer to be free
printf("ERROR - Timeout waiting for buffer to clear.\n");
return 1;
}
else
{
if (FD_ISSET(skt, &fds))
{
// Ready to write!
bytes_writ = write(skt, &frame, CAN_MTU);
if (bytes_writ != CAN_MTU)
{
if (errno == 105)
{
// Buffer full!
printf("X"); fflush(stdout);
usleep(20); // Wait for buffer to clear
}
else
{
printf("FAIL - Error writing CAN frame (%d)\n", errno);
return 1;
}
}
else
{
printf("."); fflush(stdout);
}
}
else
{
printf("-"); fflush(stdout);
}
}
usleep(WAIT_TIME);
}
When I set the per-frame WAIT_TIME to a high value (e.g. 500 uS) so that the buffer never fills, I see this output:
CAN Data Generator
...............................................................................
................................................................................
...etc
Which is good! At 500 uS I get 54% CAN bus utilisation (according to canbusload utility).
However, when I try a delay of 0 to max out my transmission rate, I see:
CAN Data Generator
................................................................................
............................................................X.XX..X.X.X.X.XXX.X.
X.XX..XX.XX.X.XX.X.XX.X.X.X.XX..X.X.X.XX..X.X.X.XX.X.XX...XX.X.X.X.X.XXX.X.XX.X.
X.X.XXX.X.XX.X.X.X.XXX.X.X.X.XX.X.X.X.X.XX..X..X.XX.X..XX.X.X.X.XX.X..X..X..X.X.
.X.X.XX.X.XX.X.X.X.X.X.XX.X.X.XXX.X.X.X.X..XX.....XXX..XX.X.X.X.XXX.X.XX.XX.XX.X
.X.X.XX.XX.XX.X.X.X.X.XX.X.X.X.X.XX.XX.X.XXX...XX.X.X.X.XX..X.XX.X.XX.X.X.X.X.X.
The initial dots "." show the buffer filling up; Once the buffer is full, "X" starts appearing meaning that the "write" call failed with error 105.
Tracing through the logic, this means the "select" must have returned and the "FD_ISSET(skt, &fds)" was true, although the buffer was full! (or did I miss something?).
The SocketCAN docs just say "Writing CAN frames can be done similarly, with the write(2) system call".
This post suggests using "select".
This post suggests that "write" won't block for CAN priority arbitration, but doesn't cover other circumstances.
So is "select" the right way to do it? Should my "write" block? What other options could I use to avoid polling?
After a quick look at canbusload:184, it seems that it computes efficiency (#data/#total bits on the bus).
On the other hand, according to this, max efficiency for CAN bus is around 57% for 8-byte frames, so you seem not to be far away from that 57%... I would say you are indeed flooding the bus.
When setting a 500uS delay, 500kbps bus bitrate, 8-byte frames, it gives you a (control+data) bitrate of 228kbps, which is lower than max bitrate of the CAN bus, so, no bottleneck here.
Also, since in this case only 1 socket is being monitored, you don't need pselect, really. All you can do with pselect and 1 socket can be done without pselect and using write.
(Disclaimer: hereinafter, this is just guessing since I cannot test it right now, sorry.)
As for why pselect behaves this way: think of the buffer as having byte semantics, so it tells you there is still room for more bytes (at least 1), not necessarily for another whole can_frame. So, when it returns, pselect does not guarantee you can send a whole CAN frame. I guess you could solve this by querying SIOCOUTQ and the maximum size of the send buffer with SO_SNDBUF and comparing the two, but I am not sure whether that works for CAN sockets (the nice thing would be to use SO_SNDLOWAT, but it is not changeable in Linux's implementation).
So, to answer your questions:
Is "select" the right way to do it?
Well, you can do it both ways, either (p)select or write, since you are only waiting for one file descriptor, there is no real difference.
Should my "write" block? It should if there is no single byte available in the send buffer.
What other options could I use to avoid polling? Maybe by ioctl'ing SIOCOUTQ and getsockopt'ing SO_SNDBUF and substracting... you will need to check this yourself. Alternatively, maybe you could set the send buffer size to a multiple of sizeof(can_frame) and see if it keeps you signaling when less than sizeof(can_frame) are available.
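A rough sketch of that subtraction (my own illustration, untested; as noted, it is not certain that SIOCOUTQ is even supported on CAN_RAW sockets):
/* Sketch only. Needs <sys/ioctl.h> and <linux/sockios.h> (SIOCOUTQ). */
int sndbuf = 0, queued = 0;
socklen_t optlen = sizeof(sndbuf);

if (getsockopt(skt, SOL_SOCKET, SO_SNDBUF, &sndbuf, &optlen) == 0 &&
    ioctl(skt, SIOCOUTQ, &queued) == 0)
{
    if (sndbuf - queued >= (int)sizeof(struct can_frame)) {
        /* Probably room for at least one more frame. Note the kernel charges
         * per-frame overhead beyond sizeof(struct can_frame), so this margin
         * may still be optimistic. */
    }
}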
Anyhow, if you are interested in more precise timing, you could use a BCM socket. There, you can instruct the kernel to send a specific frame at a specific interval. Once set up, transmission runs in kernel space, without any further system calls, so the user/kernel buffer problem is avoided (a minimal sketch follows). I would test different rates until canbusload shows no further rise in bus utilisation.
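For reference, a minimal CAN_BCM sketch along those lines (my own illustration, with example CAN ID, payload and interval; the message layout follows <linux/can/bcm.h>, and if_index is obtained as in the question):
/* Sketch only. Needs <linux/can.h>, <linux/can/bcm.h>, <string.h>,
 * <sys/socket.h>, <unistd.h>. */
struct {
    struct bcm_msg_head head;
    struct can_frame    frame;
} tx;

int bcm = socket(PF_CAN, SOCK_DGRAM, CAN_BCM);
struct sockaddr_can baddr = { .can_family = AF_CAN, .can_ifindex = if_index };
connect(bcm, (struct sockaddr *)&baddr, sizeof(baddr));

memset(&tx, 0, sizeof(tx));
tx.head.opcode        = TX_SETUP;
tx.head.flags         = SETTIMER | STARTTIMER;
tx.head.count         = 0;              /* 0: repeat forever at ival2 */
tx.head.ival2.tv_usec = 500;            /* example: one frame every 500 us */
tx.head.can_id        = 0x123;          /* example CAN ID */
tx.head.nframes       = 1;
tx.frame.can_id       = 0x123;
tx.frame.can_dlc      = 8;
memcpy(tx.frame.data, "\x11\x22\x33\x44\x55\x66\x77\x88", 8);

write(bcm, &tx, sizeof(tx));            /* the kernel now transmits cyclically */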
select and poll worked correctly for me with SocketCAN. However, careful configuration is required.
Some background:
Between the user app and the HW there are 2 buffers:
the socket buffer, whose size (in bytes) is controlled by setsockopt's SO_SNDBUF option
the driver's qdisc, whose size (in packets) is controlled by the "ifconfig can0 txqueuelen 5" command.
The data path is: user app write() --> socket buffer --> driver's qdisc --> HW TX mailbox.
Two flow control points exist along this path:
When there is no free TX mailbox, the driver freezes its qdisc (__QUEUE_STATE_DRV_XOFF) to prevent more packets from being dequeued from the qdisc into the HW. It is unfrozen when a TX mailbox becomes free (upon the TX completion interrupt).
When the socket buffer fills above half of its capacity, poll/select blocks until it drains back below half of its capacity.
Now, assume that the socket buffer has room for 20 packets, while the driver's qdisc has room for 5 packets. Let's also assume the HW has a single TX mailbox.
poll/select lets the user write up to 10 packets.
Those packets are moved down to the socket buffer.
5 of those packets continue and fill the driver's qdisc.
The driver dequeues the 1st packet from its qdisc, puts it into the HW TX mailbox and freezes the qdisc (no more dequeuing). Now there is room for 1 packet in the qdisc.
The 6th packet is moved down successfully from the socket buffer to the driver's qdisc.
The 7th packet is moved down from the socket buffer to the driver's qdisc, but since there is no room it is dropped and error 105 ("No buffer space available") is generated.
What is the solution?
Under the above assumptions, let's configure the socket buffer for 8 packets. In this case, poll/select will block the user app after 4 packets, ensuring that there is room in the driver's qdisc for all of those 4 packets.
However, the socket buffer is configured in bytes, not packets. The translation is as follows: each CAN packet occupies ~704 bytes in the socket buffer (most of that for socket bookkeeping structures), so to configure the socket buffer for 8 packets the size in bytes should be 8*704:
int size = 8*704;
setsockopt(s, SOL_SOCKET, SO_SNDBUF, &size, sizeof(size));
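If it helps, the size actually granted can be read back and checked (this snippet is my own addition, not part of the answer); note that Linux stores and reports double the requested value to account for bookkeeping overhead, so the readback will not be exactly 8*704.
/* Sketch only. Needs <stdio.h>. */
int granted = 0;
socklen_t glen = sizeof(granted);
if (getsockopt(s, SOL_SOCKET, SO_SNDBUF, &granted, &glen) == 0)
    printf("kernel send buffer is %d bytes\n", granted);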
3 different messages are being sent to the same port at different rates:
Message   Size (bytes)   Sent every   Transmit rate
High      232            10 ms        100 Hz
Medium    148            20 ms        50 Hz
Low       20             60 ms        16.6 Hz
I can only process one message every ~ 6 ms.
Single threaded. Blocking read.
A strange situation is occurring, and I don't have an explanation for it.
When I set my receive buffer to 4,799 bytes, all of my low speed messages get dropped.
I see maybe one or two get processed, and then nothing.
When I set my receive buffer to 4,800(or higher!), it appears as though all of the low speed messages start getting processed. I see about 16/17 a second.
This has been observed consistently. The application sending the packets is always started before the receiving application. The receiving application always has a long delay after the sockets are created, and before it begins processing. So the buffer is always full when the processing starts, and it is not the same starting buffer each time a test occurs. This is because the socket is created after the sender is already sending out messages, so the receiver might start listening in the middle of a send cycle.
Why does increasing the receive buffer size by a single byte cause a huge change in low speed message processing?
I built a table to better visualize the expected processing:
As some of these messages get processed, more messages presumably get put on the queue instead of being dropped.
Nonetheless, I would expect a 4,799 byte buffer to behave the same way as 4,800 bytes.
However that is not what I have observed.
I think the issue is related to the fact that the low speed messages are sent at the same time as the other two messages. They are always received after the high/medium speed messages. (This has been confirmed with Wireshark.)
For example, assuming the buffer was empty to begin with, it is clear that the low speed message would need queued longer than the other messages.
*1 message every 6ms is about 5 messages every 30ms.
This still doesn't explain the buffer size.
We are running VxWorks, and using their sockLib, which is an implementation of Berkeley sockets. Here is a snippet of what our socket creation looks like:
SOCKET_BUFFER_SIZE is what I'm changing.
struct sockaddr_in tSocketAddress; // Socket address
int nSocketAddressSize = sizeof(struct sockaddr_in); // Size of socket address structure
int nSocketOption = 0;
// Already created
if (*ptParameters->m_pnIDReference != 0)
return FALSE;
// Create UDP socket
if ((*ptParameters->m_pnIDReference = socket(AF_INET, SOCK_DGRAM, 0)) == ERROR)
{
// Error
CreateSocketMessage(ptParameters, "CreateSocket: Socket create failed with error.");
// Not successful
return FALSE;
}
// Valid local address
if (ptParameters->m_szLocalIPAddress != SOCKET_ADDRESS_NONE_STRING && ptParameters->m_usLocalPort != 0)
{
// Set up the local parameters/port
bzero((char*)&tSocketAddress, nSocketAddressSize);
tSocketAddress.sin_len = (u_char)nSocketAddressSize;
tSocketAddress.sin_family = AF_INET;
tSocketAddress.sin_port = htons(ptParameters->m_usLocalPort);
// Check for any address
if (strcmp(ptParameters->m_szLocalIPAddress, SOCKET_ADDRESS_ANY_STRING) == 0)
tSocketAddress.sin_addr.s_addr = htonl(INADDR_ANY);
else
{
// Convert IP address for binding
if ((tSocketAddress.sin_addr.s_addr = inet_addr(ptParameters->m_szLocalIPAddress)) == ERROR)
{
// Error
CreateSocketMessage(ptParameters, "Unknown IP address.");
// Cleanup socket
close(*ptParameters->m_pnIDReference);
*ptParameters->m_pnIDReference = ERROR;
// Not successful
return FALSE;
}
}
// Bind the socket to the local address
if (bind(*ptParameters->m_pnIDReference, (struct sockaddr *)&tSocketAddress, nSocketAddressSize) == ERROR)
{
// Error
CreateSocketMessage(ptParameters, "Socket bind failed.");
// Cleanup socket
close(*ptParameters->m_pnIDReference);
*ptParameters->m_pnIDReference = ERROR;
// Not successful
return FALSE;
}
}
// Receive socket
if (ptParameters->m_eType == SOCKTYPE_RECEIVE || ptParameters->m_eType == SOCKTYPE_RECEIVE_AND_TRANSMIT)
{
// Set the receive buffer size
nSocketOption = SOCKET_BUFFER_SIZE;
if (setsockopt(*ptParameters->m_pnIDReference, SOL_SOCKET, SO_RCVBUF, (char *)&nSocketOption, sizeof(nSocketOption)) == ERROR)
{
// Error
CreateSocketMessage(ptParameters, "Socket buffer size set failed.");
// Cleanup socket
close(*ptParameters->m_pnIDReference);
*ptParameters->m_pnIDReference = ERROR;
// Not successful
return FALSE;
}
}
and the socket receive that's being called in an infinite loop:
*The buffer size is definitely large enough
int SocketReceive(int nSocketIndex, char *pBuffer, int nBufferLength)
{
int nBytesReceived = 0;
char szError[256];
// Invalid index or socket
if (nSocketIndex < 0 || nSocketIndex >= SOCKET_COUNT || g_pnSocketIDs[nSocketIndex] == 0)
{
sprintf(szError,"SocketReceive: Invalid socket (%d) or ID (%d)", nSocketIndex, g_pnSocketIDs[nSocketIndex]);
perror(szError);
return -1;
}
// Invalid buffer length
if (nBufferLength == 0)
{
perror("SocketReceive: zero buffer length");
return 0;
}
// Receive data
nBytesReceived = recv(g_pnSocketIDs[nSocketIndex], pBuffer, nBufferLength, 0);
// Error in receiving
if (nBytesReceived == ERROR)
{
// Create error string
sprintf(szError, "SocketReceive: Data Receive Failure: <%d> ", errno);
// Set error message
perror(szError);
// Return error
return ERROR;
}
// Bytes received
return nBytesReceived;
}
Any clues on why increasing the buffer size to 4,800 results in successful and consistent reading of low speed messages?
The basic answer to why a SO_RCVBUF size of 4,799 results in lost low speed messages while a size of 4,800 works fine is that, with the particular mixture of UDP packets coming in, the rate at which they arrive, the rate at which you process incoming packets, and the sizing of the mbuf and cluster pools in your VxWorks kernel, the larger size allows just enough network stack throughput that the low speed messages are no longer being discarded.
The SO_SNDBUF option description in the setsockopt() man page at http://www.vxdev.com/docs/vx55man/vxworks/ref/sockLib.html#setsockopt (mentioned in a comment above) has this to say about the size specified and its effect on mbuf usage:
The effect of setting the maximum size of buffers (for both SO_SNDBUF
and SO_RCVBUF, described below) is not actually to allocate the mbufs
from the mbuf pool. Instead, the effect is to set the high-water mark
in the protocol data structure, which is used later to limit the
amount of mbuf allocation.
UDP packets are discrete units. If you send 10 packets of size 232 that is not considered to be 2320 bytes of data in contiguous memory. Instead that is 10 memory buffers within the network stack because UDP is discrete packets while TCP is a continuous stream of bytes.
See How do I tune the network buffering in VxWorks 5.4? in the DDS community web site which gives a discussion about the interdependence of the mixture of mbuff sizes and network clusters.
See How do I resolve a problem with VxWorks buffers? in the DDS community web site.
See this PDF of a slide presentation, A New Tool to study Network Stack Exhaustion in VxWorks from 2004 which discusses using various tools such as mBufShow and inetStatShow to see what is happening in the network stack.
Without a detailed analysis of every network stack implementation along the path your UDP messages take, it is nearly impossible to state the resulting behaviour.
UDP implementations are allowed to drop any packet at their own discretion. Usually this happens when a stack comes to the conclusion that it would need to drop packets to be able to receive new ones. There is no formal requirement that the packets dropped are the oldest or the newest being received. It could also be that a certain size class is more affected due to internal memory management strategies.
Of the IP stacks involved, the most interesting one is the one on the receiving machine.
You will certainly get a better receive experience if you give the receive side a buffer that can hold several seconds' worth of the expected messages. I'd start with at least 10k.
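For illustration (my own sketch, reusing the identifiers from the question's code; the 64 KB value is just an example above the suggested minimum), the change amounts to passing a larger value to the existing SO_RCVBUF setsockopt() call:
/* Sketch only: request a receive buffer large enough for a couple of
 * seconds of the described traffic instead of ~4.8 KB. */
nSocketOption = 64 * 1024;
if (setsockopt(*ptParameters->m_pnIDReference, SOL_SOCKET, SO_RCVBUF,
               (char *)&nSocketOption, sizeof(nSocketOption)) == ERROR)
{
    CreateSocketMessage(ptParameters, "Socket buffer size set failed.");
}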
The observed "change" in behaviour when going from 4,799 to 4,800 may result from the later just allowing one of the small messages to be received before it needs to be dropped again, while the smaller size just causes it to be dropped slightly earlier. If the receiving application is quick enough to read the pending message you will receive small messages in the one case and no small messages in the other case.
C/C++ has the following function to receive bytes from a socket; it can check the number of bytes available with the MSG_PEEK flag. With MSG_PEEK, the return value of recv is the number of bytes available in the socket (capped at the length passed):
#include <sys/socket.h>
ssize_t recv(int socket, void *buffer, size_t length, int flags);
I need to get the number of bytes available in the socket without creating buffer (without allocating memory for buffer). Is it possible and how?
What you're looking for is ioctl(fd, FIONREAD, &bytes_available), and under Windows ioctlsocket(socket, FIONREAD, &bytes_available).
Be warned though, the OS doesn't necessarily guarantee how much data it will buffer for you, so if you are waiting for very much data you are going to be better off reading in data as it comes in and storing it in your own buffer until you have everything you need to process something.
To do this, what is normally done is you simply read chunks at a time, such as
char buf[4096];
ssize_t bytes_read;
do {
bytes_read = recv(socket, buf, sizeof(buf), 0);
if (bytes_read > 0) {
/* do something with buf, such as append it to a larger buffer or
* process it */
}
} while (bytes_read > 0);
And if you don't want to sit there waiting for data, you should look into select or epoll to determine when data is ready to be read or not, and the O_NONBLOCK flag for sockets is very handy if you want to ensure you never block on a recv.
On Windows, you can use the ioctlsocket() function with the FIONREAD flag to ask the socket how many bytes are available without needing to read/peek the actual bytes themselves. The value returned is the minimum number of bytes recv() can return without blocking. By the time you actually call recv(), more bytes may have arrived.
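A minimal sketch of that call (my own illustration; 'sock' stands for an already-created and bound SOCKET, and the out parameter must be a u_long):
/* Sketch only. Needs <winsock2.h> and linking against Ws2_32. */
u_long available = 0;
if (ioctlsocket(sock, FIONREAD, &available) == 0) {
    /* 'available' is a lower bound; more data may arrive before recv() runs */
}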
Be careful when using FIONREAD! The problem with using ioctl(fd, FIONREAD, &available) is that it will always return the total number of bytes available for reading in the socket buffer on some systems.
This is no problem for STREAM sockets (TCP) but misleading for DATAGRAM sockets (UDP). For datagram sockets, read requests are capped to the size of the first datagram in the buffer, and when reading less than the size of the first datagram, all unread bytes of that datagram are still discarded. So ideally you want to know only the size of the next datagram in the buffer.
E.g. on macOS/iOS it is documented that FIONREAD always returns the total amount (see comments about SO_NREAD). To only get the size of the next datagram (and total size for stream sockets), you can use the code below:
int available;
socklen_t optlen = sizeof(available);
int err = getsockopt(soc, SOL_SOCKET, SO_NREAD, &available, &optlen);
On Linux FIONREAD is documented to only return the size of the next datagram for UDP sockets.
On Windows ioctlsocket(socket, FIONREAD, &available) is documented to always give the total size:
If the socket passed in the s parameter is message oriented (for example, type SOCK_DGRAM), FIONREAD returns the total number of bytes available to read, not the size of the first datagram (message) queued on the socket.
Source: https://learn.microsoft.com/en-us/windows/win32/api/ws2spi/nc-ws2spi-lpwspioctl
I am unaware of a way how to get the size of the first datagram only on Windows.
The short answer is: this cannot be done with MS-Windows WinSock2, as I discovered over the last week of trying.
Glad to have finally found this post, which sheds some light on the issues I've been having, using latest Windows 10 Pro, version 20H2 Build 19042.867 (x86/x86_64) :
On a bound, disconnected UDP socket 'sk' (in listening / server mode):
1. Any attempt to use either ioctlsocket(sk, FIONREAD, &n_bytes) or WSAIoctl with a shifted FIONREAD argument, after a call to select() returns with that 'sk' FD bit set in the read FD set, succeeds (the call returns 0 and n_bytes is > 0), but it leaves the socket sk in a state where any subsequent call to recv(), recvfrom(), or ReadFile() returns SOCKET_ERROR with a WSAGetLastError() of 10045 (Operation Not Supported), or a ReadFile error of 87 (Invalid Parameter).
Moreover, even worse:
2. Any attempt to use recv or recvfrom with the MSG_PEEK msg_flags parameter returns -1, and WSAGetLastError returns 10040: 'A message sent on a datagram socket was larger than the internal message buffer or some other network limit, or the buffer used to receive a datagram into was smaller than the datagram itself.'
Yet for that socket I DID successfully call:
setsockopt(s, SOL_SOCKET, SO_RCVBUF, (char *)&bufsz, sizeof(bufsz))   /* with bufsz = 4096 */
and the UDP packet being received was only 120 bytes in size.
In short, with modern Windows WinSock2 (winsock2.h / Ws2_32.dll), there appears to be absolutely no way to use any documented API to determine the number of bytes received on a bound UDP socket before calling recv() / recvfrom() in MSG_WAITALL blocking mode to actually receive the whole packet.
If you do not call ioctlsocket() or WSAIoctl or recv{,from}(..., MSG_PEEK, ...) before entering recv{,from}(..., MSG_WAITALL, ...), then the recv{,from} succeeds.
I am considering advising clients that they must install and run a Linux instance with MS Services for Linux under their Windows installation, and developing some API to communicate with it from Windows, so that reliable asynchronous UDP communication can be achieved - or does anyone know of a good open source replacement for WinSock2?
I need access to a "C" library TCP+UDP/IP implementation for modern Windows 10 that conforms to its own documentation, unlike WinSock2 - does anyone know of one?
I need to read from an AF_UNIX socket to a buffer using the function read from C, but I don't know the buffer size.
I think the best way is to read N bytes until the read returns 0 (no more writers in the socket). Is this correct? Is there a way to guess the size of the buffer being written on the socket?
I was thinking that since a socket is a special file, opening the file in binary mode and getting its size would help me know the correct size to give to the buffer?
I'm a very new to C, so please keep that in mind.
One common way is to use ioctl() to query FIONREAD on the socket, which returns how much data is available to read.
int len = 0;
ioctl(sock, FIONREAD, &len);
if (len > 0) {
len = read(sock, buffer, len);
}
One way to read an unknown amount from the socket while avoiding blocking could be to poll() a non-blocking socket for data.
E.g.
char buffer[1024];
int ptr = 0;
ssize_t rc;
struct pollfd fd = {
.fd = sock,
.events = POLLIN
};
poll(&fd, 1, 0); // Doesn't wait for data to arrive.
while ( fd.revents & POLLIN )
{
rc = read(sock, buffer + ptr, sizeof(buffer) - ptr);
if ( rc <= 0 )
break;
ptr += rc;
poll(&fd, 1, 0);
}
printf("Read %d bytes from sock.\n", ptr);
I think the best way is to read N bytes until the read returns 0 (no more writers in the socket). Is this correct?
0 means EOF: the other side has closed the connection. If the other side of the communication closes the connection, then yes, it is correct.
If the connection isn't closed (multiple transfers over the same connection, a chatty protocol), then the case is a bit more complicated, and the behavior generally depends on whether you have a SOCK_STREAM or SOCK_DGRAM socket.
Datagram sockets are already delimited for you by the OS.
Stream sockets do not delimit messages (all data is an opaque byte stream), and if desired you have to implement that at the application level: for example by defining a size field in a message header structure, or by using a delimiter (e.g. '\n' for single-line text messages). In the first case you would first read the header, extract the length, and using the length read the rest of the message. In the other case, you read the stream into a partial buffer, search for the delimiter and extract from the buffer the message including the delimiter (you might need to keep the partial buffer around, since depending on the protocol several commands can be received with a single recv()/read()).
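As an illustration of the delimiter variant, here is a sketch that accumulates data until a newline is seen (the '\n' delimiter, buffer size, and handle_message callback name are my own choices for the example); note how the bytes following the delimiter are kept, since they belong to the next message.
/* Sketch only. Needs <string.h> and <unistd.h>. */
char   acc[4096];
size_t used = 0;

for (;;) {
    char *nl = memchr(acc, '\n', used);
    if (nl != NULL) {
        size_t msg_len = (size_t)(nl - acc) + 1;
        /* handle_message(acc, msg_len);  -- hypothetical application callback */
        memmove(acc, acc + msg_len, used - msg_len);   /* keep partial next message */
        used -= msg_len;
        continue;
    }
    if (used == sizeof(acc))
        break;                     /* message longer than the buffer: protocol error */
    ssize_t r = read(sock, acc + used, sizeof(acc) - used);
    if (r <= 0)
        break;                     /* 0: peer closed; <0: error */
    used += (size_t)r;
}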
Is there a way to guess the size of the buffer being written on the socket?
For stream sockets, there is no reliable way, as the other side of the communication might still be in the process of writing the data. Imagine the quite normal case: the socket buffer is 32K and 128K is being written. The writing application would block inside send()/write(), with the OS waiting for the reading application to read out the data and thus free space for the next chunk of written data.
For datagram sockets, one normally knows the size of the message beforehand. Or one can try (I never did that myself) recvmsg() with MSG_PEEK, and if MSG_TRUNC is set in the returned msghdr.msg_flags, increase the buffer size and try again.
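A sketch of that probe (untested, as the answer itself says; the 512-byte trial buffer is an arbitrary starting point of my own):
/* Sketch only. Needs <sys/socket.h> and <sys/uio.h>. */
char trial[512];                       /* arbitrary starting size */
struct iovec  iov = { .iov_base = trial, .iov_len = sizeof(trial) };
struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };

ssize_t n = recvmsg(sock, &msg, MSG_PEEK);
if (n >= 0 && (msg.msg_flags & MSG_TRUNC)) {
    /* The next datagram is larger than 'trial': retry the peek with a bigger
     * buffer before doing the real read. */
}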
You are correct: if you don't know the size of the input, you can just read one byte at a time and append it to a larger buffer.
read N bytes until the read returns 0
Yes!
One added detail: if the sender doesn't close the connection, the read will just block instead of returning. A non-blocking socket will return -1 (with errno == EAGAIN) when there's nothing to read; that's another case.
Opening the file in binary mode and getting the size would help me in knowing the correct size to give to the buffer?
Nope. Sockets don't have a size. Suppose you sent two messages over the same connection: How long is the file?