How many bytes can I write at once on a TCP socket?

How many bytes can I write at once on a TCP socket? - c

As the title says, is there a limit to the number of bytes that can be written at once on a connection-oriented socket?
If I want to send a buffer of, for example, 1024 bytes, can I use a
write(tcp_socket, buffer, 1024);
or should I use multiple write() calls with a lower amount of bytes for each one?

write() does not guarantee that all bytes will be written so multiple calls to write() are required. From man write:
The number of bytes written may be less than count if, for example, there is insufficient space on the underlying physical medium, or the RLIMIT_FSIZE resource limit is encountered (see setrlimit(2)), or the call was interrupted by a signal handler after having written less than count bytes. (See also pipe(7).)
write() returns the number of bytes written so a running total of the bytes written must be maintained and used as an index into buffer and to calculate the number of remaining bytes to be written:
ssize_t total_bytes_written = 0;
while (total_bytes_written != 1024)
{
assert(total_bytes_written < 1024);
ssize_t bytes_written = write(tcp_socket,
&buffer[total_bytes_written],
1024 - total_bytes_written);
if (bytes_written == -1)
{
/* Report failure and exit. */
break;
}
total_bytes_written += bytes_written;
}

From my experience, it is better to stay in the 1024 byte limit

There's no inherent limit. TCP/IP will fragment and reassemble packets as required. Your system may impose (a possibly tunable) upper limit, but it is likely to be in the multi-MB range. See your manpage for setsockopt() and always check write()'s return value.

The actual amount you can write will depend on the type of socket. In general, you need to check the return value to see how many bytes were actually written. The number of bytes written can vary depend on whether the socket is in blocking mode or not.
Also, if the socket is blocking, you may not want to wait for all the data to be written in one go. You may want to write some at a time in order to be able to something else in between write operations.

as you can see write socket the maximum buffer size is 1048576 bytes.

Related

How does recv work in socket programming?

I'm trying to understand recv() at a high level. So recv takes data in "chunks" but I'm still not getting how it is precisely handled. Example:
char buffer[1000];
int received= recv(sock, buffer, sizeof(buffer), 0)
Does this mean if I'm receiving a massive file, the buffer, if connected through sock might for example reflect it stored 500 bytes in the received variable, then in a loop receive another 300 bytes, and all 800 bytes of data will be stored in buffer by the end of the loop (lost in the received variable unless accounted for), or does buffer need a pointer to keep track of where it last received the data to store it in then next iteration?

recv has no context. All it knows that it got some address (pointer) to write into and some maximum size - and then it will try this. It will always start writing with the given address. If for example on wish to add data after some previously received data one can simply give the pointer into the location after the previous data instead of the beginning of the buffer. Of course on should adjust the maximum size it is allowed to read to not overflow the buffer.

You asked "How does recv() work?", so it may be worth briefly studying a simpler function that does essentially the same thing - read().
recv() operates in more or less the same way as the read() function. The main difference is that recv() allows you to pass flags in the last argument - but you are not using these flags anyway.
My suggestion would be - before trying to use recv() to read from a network socket - to practice using read() on a plain text file.
Both functions return the number of bytes read - except in the case of an error, in which case they will return -1. You should always check for this scenario - and handle appropriately.
Both functions can also return less than the number of bytes requested. In the case of recv() - and reading from a socket - this may be because the other end has simply not sent all the required data yet. In the case of a reading from a file - with read() - it may be because you have reached the end of the file.
Anyway ...
You will need to keep track of the current offset within your buffer - and update it on each read. So declare a file-scope variable offset.
static off_t offset; static char buffer[1000];
Then - when your 'loop' is running - increment the offset after each read ...
while (1) {
size_t max_len = sizeof(buffer) - offset;
ssize_t count = recv(sock, buffer+offset, max_len, 0);
if (count == -1) {
switch (errno) {
case EAGAIN:
usleep(20000);
break;
default:
perror("Failed to read from socket");
close(sock);
break;
}
}
if (count == 0) {
puts("Looks like connection has been closed.");
break;
}
offset += count;
if (offset >= expected_len) {
puts("Got the expected amount of data. Wrapping up ...");
}
}
Notes:
Using this approach, you will either need to know the expected amount of data before-hand - or use a special delimiter to mark the end of the message
the max_len variable indicates how much space is left in your buffer - and (perhaps needless to say) you should not try to read more bytes than this
the destination for the recv() command is buffer+offset - not buffer.
if recv() returns zero, AFAIK this indicates that the other end has performed an "orderly shutdown".
if recv() returns -1, you really need to check the return code. EAGAIN is non-fatal - and just means you need to try again.

How socket send data?

I got a snippet from internet for send data through a socket .
Here is the code .
u32_t nLength = 0;
u32_t nOffset = 0;
do {
nLength = nFullLength - nOffset;
status = Socket->Send(((u8_t*) buff) + nOffset, &nLength);
if (status != ERROR_SUCCESS) {
break;
}
nOffset += nLength;
} while (nOffset < nFullLength);
My doubts are :
When send(sock_fd, buf+bytes, buflen-bytes, flags); function running , it will send the entire data ?
Let's assume i have a buff with 45 byte length . So it will send like
send(buf+0, 45-0) = send(buf+0, 45);
So it will send complete data with length 45 ? what is the use of length here ? initially it will 45 . Isn't ?

Well, no. There's no guarantee that it will send all the data you ask it to send, that's why the code looks the way it does.
The manual page for send() states this pretty clearly:
Return Value
On success, these calls return the number of characters sent. On error, -1
is returned, and errno is set appropriately.
The same is true for e.g. a regular write() to a local file, by the way. It might never happen, but the way the interface is designed you're supposed to handle partial sends (and writes) if they do happen.

TCP is a streaming transport. There is no guarantee that a given send() operation will accept all of the bytes given to it at one time. It depends on the available kernel buffer space, the I/O mode of the socket (blocking vs non-blocking), etc. send() returns the number of bytes it actually accepted and put into the kernel buffer for subsequent transmission.
In the code example shown, it appears that Socket->Send() expects nLength to be initially set to the total number of bytes to sent, and it will then update nLength with the number of bytes actually sent. The code is adjusting its nOffset variable accordingly, looping just in case send() returns fewer bytes than requested, so it can call send() as many times as it takes to send the full number of bytes.
So, for example, lets assume the kernel accepts up to 20 bytes at a time. The loop would call send() 3 times:
send(buf+0, 45-0) // returns 20
send(buf+20, 45-20) // returns 20
send(buf+40, 45-40) // returns 5
// done
This is typical coding practice for TCP programming, given the streaming nature of TCP.

Counting bytes received by posix read()

I get confused with one line of code:
temp_uart_count = read(VCOM, temp_uart_data, 4096);
I found more about read function at http://linux.die.net/man/3/read, but if everything is okay it returns 0, so how we can get num of bytes received from that?
temp_uart_count is used to count how much bytes we received from virtual COM port and stored it to temp_uart_data which is 4096 bytes wide.
Am I really getting how much bytes i received with this line of code?

... but if everything is okay it returns 0, so how we can get num of bytes received from that?
A return code of zero simply means that read() was unable to provide any data.
Am I really getting how much bytes i received with this line of code?
Yes, a positive return code (i.e. >= 0) from read() is an accurate count of bytes that were returned in the buffer. Zero is a valid count.
If you're expecting more data, then simply repeat the read() syscall. (However you may have setup the termios arguments poorly, e.g. VMIN=0 and VTIME=0).
And - zero indicates end of file
If you get 0, it means that the end of file (or an equivalent condition) has been reached and there is nothing else to read.
The above (one from a comment, and the other in an answer) are incorrect.
Reading from a tty device (e.g. a serial port) is not like reading from a file on a block device, but is temporal. Data for reading is only available as it is received over the comm link.
A non-blocking read() will return with -1 and errno set to EAGAIN when there is no data available.
A blocking non-canonical read() will return zero when there is no data available. Correlate your termios configuration with this to confirm that a return of zero is valid (and does not indicate "end of file").
In either case, the read() can be repeated to get more data when/if it arrives.
Also when using non-canonical (aka raw) mode (or non-blocking reads), do not expect or rely on the the read() to perform message or packet management for you. You will need to add a layer to your program to read bytes, concatenate those bytes into a complete message datagram/packet, and validate it before that message can be processed.

ssize_t read(int fd, void *buf, size_t count); returns you the size of bytes he read and stores it into the value you passed in parameters. And when errors happen it returns -1 (with errno set to EINTR) or to return the number of bytes already read..
From the linux man :
On files that support seeking, the read operation commences at the current file offset, and the file offset is incremented by the number of bytes read. If the current file offset is at or past the end of file, no bytes are read, and read() returns zero.

Yes, temp_uart_count will contain the actual number of bytes read, and obviously that number will be smaller or equal to the number of elements of temp_uart_data. If you get 0, it means that the end of file (or an equivalent condition) has been reached and there is nothing else to read.
If it returns -1 this indicate that an error has occurred and you'll need to check the errno variable to understand what happened.

Read from socket

I need to read from an AF_UNIX socket to a buffer using the function read from C, but I don't know the buffer size.
I think the best way is to read N bytes until the read returns 0 (no more writers in the socket). Is this correct? Is there a way to guess the size of the buffer being written on the socket?
I was thinking that a socket is a special file. Opening the file in binary mode and getting the size would help me in knowing the correct size to give to the buffer?
I'm a very new to C, so please keep that in mind.

On common way is to use ioctl(..) to query FIONREAD of the socket which will return how much data is available.
int len = 0;
ioctl(sock, FIONREAD, &len);
if (len > 0) {
len = read(sock, buffer, len);
}

One way to read an unknown amount from the socket while avoiding blocking could be to poll() a non-blocking socket for data.
E.g.
char buffer[1024];
int ptr = 0;
ssize_t rc;
struct pollfd fd = {
.fd = sock,
.events = POLLIN
};
poll(&fd, 1, 0); // Doesn't wait for data to arrive.
while ( fd.revents & POLLIN )
{
rc = read(sock, buffer + ptr, sizeof(buffer) - ptr);
if ( rc <= 0 )
break;
ptr += rc;
poll(&fd, 1, 0);
}
printf("Read %d bytes from sock.\n", ptr);

I think the best way is to read N
bytes until the read returns 0 (no
more writers in the socket). Is this
correct?
0 means EOF, other side has closed the connection. If other side of communication closes the connection, then it is correct.
If connection isn't closed (multiple transfers over the same connect, chatty protocol), then the case is bit more complicated and behavior generally depends on whether you have SOCK_STREAM or SOCK_DGRAM socket.
Datagram sockets are already delimited for you by the OS.
Stream sockets do not delimit messages (all data are an opaque byte stream) and if desired one has to implement that on application level: for example by defining a size field in the message header structure or using a delimiter (e.g. '\n' for single-line text messages). In first case you would first read the header, extract length and using the length read the rest of the message. In other case, read stream into partial buffer, search for the delimiter and extract from buffer the message including the delimiter (you might need to keep the partial buffer around as depending on protocol several command can be received with single recv()/read()).
Is there a way to guess the
size of the buffer being written on
the socket?
For stream sockets, there is no reliable way as the other side of communication might be still in process of writing the data. Imagine the quite normal case: socket buffer is 32K and 128K is being written. Writing application would block inside send()/write(), the OS waiting for reading application to read out the data and thus free space for the next chunk of written data.
For datagram sockets, one normally knows the size of the message beforehand. Or one can try (never did that myself) recvmsg( MSG_PEEK ) and if the MSG_TRUNC is in the returned msghdr.msg_flags, try to increase the buffer size.

you are correct, if you don't know the size of the input you can just read one byte each time and append it to a larger buffer.

read N bytes until the read returns 0
Yes!
One added detail. If the sender doesn't close the connection, the socket will just block, instead of returning. A nonblocking socket will return -1 (with errno == EAGAIN) when there's nothing to read; that's another case.
Opening the file in binary mode and getting the size would help me in knowing the correct size to give to the buffer?
Nope. Sockets don't have a size. Suppose you sent two messages over the same connection: How long is the file?

Case when blocking recv() returns less than requested bytes

The recv() library function man page mention that:
It returns the number of bytes received. It normally returns any data available, up to the requested amount, rather than waiting for receipt of the full amount requested.
If we are using blocking recv() call and requested for 100 bytes:
recv(sockDesc, buffer, size, 0); /* Where size is 100. */
and only 50 bytes are send by the server then this recv() is blocked until 100 bytes are available or it will return receiving 50 bytes.
The scenario could be that:
server crashes after sendign only 50 bytes
bad protocol design where server is only sending 50 bytes while client is expecting 100 and server is also waiting for client's reply (i.e. socket close connection has not been initiated by server in which recv will return)
I am interested on Linux / Solaris platform. I don't have the development environment to check it out myself.

recv will return when there is data in the internal buffers to return. It will not wait until there is 100 bytes if you request 100 bytes.
If you're sending 100 byte "messages", remember that TCP does not provide messages, it is just a stream. If you're dealing with application messages, you need to handle that at the application layer as TCP will not do it.
There are many, many conditions where a send() call of 100 bytes might not be read fully on the other end with only one recv call when calling recv(..., 100); here's just a few examples:
The sending TCP stack decided to bundle together 15 write calls, and the MTU happened to be 1460, which - depending on timing of the arrived data might cause the clients first 14 calls to fetch 100 bytes and the 15. call to fetch 60 bytes - the last 40 bytes will come the next time you call recv() . (But if you call recv with a buffer of 100 , you might get the last 40 bytes of the prior application "message" and the first 60 bytes of the next message)
The sender buffers are full, maybe the reader is slow, or the network is congested. At some point, data might get through and while emptying the buffers the last chunk of data wasn't a multiple of 100.
The receiver buffers are full, while your app recv() that data, the last chunk it pulls up is just partial since the whole 100 bytes of that message didn't fit the buffers.
Many of these scenarios are rather hard to test, especially on a lan where you might not have a lot of congestion or packet loss - things might differ as you ramp up and down the speed at which messages are sent/produced.
Anyway. If you want to read 100 bytes from a socket, use something like
int
readn(int f, void *av, int n)
{
char *a;
int m, t;
a = av;
t = 0;
while(t < n){
m = read(f, a+t, n-t);
if(m <= 0){
if(t == 0)
return m;
break;
}
t += m;
}
return t;
}
...
if(readn(mysocket,buffer,BUFFER_SZ) != BUFFER_SZ) {
//something really bad is going on.
}

The behavior is determined by two things. The recv low water mark and whether or not you pass the MSG_WAITALL flag. If you pass this flag the call will block until the requested number of bytes are received, even if the server crashes. Other wise it returns as soon as at least SO_RCVLOWAT bytes are available in the socket's receive buffer.
SO_RCVLOWAT
Sets the minimum number of bytes to
process for socket input operations.
The default value for SO_RCVLOWAT is
1. If SO_RCVLOWAT is set to a larger value, blocking receive calls normally
wait until they have received the
smaller of the low water mark value or
the requested amount. (They may return
less than the low water mark if an
error occurs, a signal is caught, or
the type of data next in the receive
queue is different than that returned,
e.g. out of band data). This option
takes an int value. Note that not all
implementations allow this option to
be set.

If you read the quote precisely, the most common scenario is:
the socket is receiving data. That 100 bytes will take some time.
the recv() call is made.
If there are more than 0 bytes in the buffer, recv() returns what is available and does not wait.
While there are 0 bytes available it blocks and the granularity of the threading system determines how long that is.