the docs say for send:
When the message does not fit into the send buffer of the socket,
send() normally blocks, unless the socket has been placed in non-blocking
I/O mode. In non-blocking mode it would return EAGAIN in this
case. The select(2) call may be used to determine when it is possible
to send more data.
I am in blocking mode, doing something along the lines of:
buf = malloc(size);
send(socket, buf, size, 0);
free(buf);
Assume buf is very large, larger than the socket's send buffer can hold at a time (so it would need to go into the buffer as two chunks, let's say). Anyway, in blocking mode, which I'm in, can I feel safe that after send returns the data has been fully copied or otherwise dealt with, and is therefore safe to free?
In blocking mode, send blocks until the I/O is complete or an error occurs. You should check the return value, because a send operation does not guarantee that the number of bytes sent equals the number of bytes passed as the third argument.
Only when send returns a value equal to the size of the buffer you passed can you be sure that the whole block has been copied into kernel memory, passed through device memory, or sent to the destination.
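As an illustration of checking the return value (my own sketch, not part of the original answer; the helper name send_all is made up), a loop like this keeps calling send() until the whole buffer has been accepted:

    #include <sys/types.h>
    #include <sys/socket.h>
    #include <errno.h>
    #include <stddef.h>

    /* Keep calling send() until every byte of buf has been accepted,
       or an error other than EINTR occurs. Returns 0 on success, -1 on error. */
    static int send_all(int fd, const char *buf, size_t len)
    {
        size_t sent = 0;
        while (sent < len) {
            ssize_t n = send(fd, buf + sent, len - sent, 0);
            if (n < 0) {
                if (errno == EINTR)
                    continue;   /* interrupted before any data was sent: retry */
                return -1;      /* real error */
            }
            sent += (size_t)n;  /* partial send: advance and keep going */
        }
        return 0;               /* the whole buffer has been handed to the kernel */
    }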
The short answer is: yes, you can free the buffer after the send() call returns successfully (without errors) when the file descriptor is in blocking mode.
The reason for this is the blocking behaviour itself: a send() call on a blocking file descriptor only returns when an error occurs or when the requested number of bytes from buf has been buffered or transmitted by the underlying layer of the operating system (typically the kernel).
Also note that a successful return from send() doesn't mean that the data was transmitted. It means that it was, at least, buffered by the underlying layer.
Related
This refers to C sockets. Say I have written some data into the socket and then I call read. Will the read system call read up to the buffer size (say 4096 bytes) and then delete all of the information in the socket? Basically, does read just move a position forward by however many bytes it read, or does it read and then discard everything in the socket, so that the next time read is called it starts again from the 0th index?
Or say I write into the socket again without read being called from anywhere else. Will the data be replaced or appended?
If there is more data available on a socket than the amount that you read(), the extra data will be kept in the socket's buffer until you read it. No data is lost during a short read.
Writing works similarly. If you call write() multiple times, each write will append data to the buffer on the remote host. Again, no data is lost.
(Eventually, the buffer on the remote host will fill up. When this happens, write() will block -- the local host will wait for the buffer to empty before sending more data.)
Conceptually, each direction in a socket pair behaves like a pipe between the two peers. The overall stream of data sent will be received in the same order it was sent, regardless of how much data was read/written at a time.
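To make the "no data is lost" point concrete, here is a sketch of my own (not from the original answer) that reads a fixed number of bytes in small chunks; any bytes not consumed by one read() simply stay queued in the socket's buffer for the next one:

    #include <unistd.h>
    #include <stddef.h>

    /* Read exactly want bytes into dst, looping over as many short reads
       as it takes. Returns bytes read, or -1 on error. */
    static ssize_t read_exact(int fd, char *dst, size_t want)
    {
        size_t got = 0;
        while (got < want) {
            ssize_t n = read(fd, dst + got, want - got);
            if (n < 0)
                return -1;          /* error */
            if (n == 0)
                break;              /* peer closed the connection */
            got += (size_t)n;       /* leftover data stays in the socket buffer */
        }
        return (ssize_t)got;
    }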
I read "How large should my recv buffer be when calling recv in the socket library" in order to understand the buffer used with read. There are still some points I wish to know about the read buffer in a TCP socket connection.
My application is sending video packets. When I set the buffer to 80000 the sender could send the packets, but when I set it lower, for example 8000, after sending a few packets the connection stops with an RST.
a) Is this buffer the TCP receive window?
b) Is there any relation between this buffer and net.ipv4.tcp_rmem, net.ipv4.tcp_wmem? If yes, should I set the read buffer based on rmem or wmem?
I would greatly appreciate any responses.
a) Is this buffer the TCP receive window?
No, it is just a buffer that you provide for the TCP stack to place bytes into when you call recv().
b) Is there any relation between this buffer and net.ipv4.tcp_rmem, net.ipv4.tcp_wmem?
No.
If yes, should I set the read buffer based on rmem or wmem?
You can pass any size buffer you want to recv(); it is unrelated to any of the above, except that there isn't any benefit to making the buffer you pass to recv() larger than the socket's current SO_RCVBUF size, since it's unlikely that recv() would ever return more bytes at once than can be present in the socket's internal buffer.
As for how to decide what size buffer to use -- consider that a larger buffer will (of course) take up more memory, and if you are allocating that buffer on the stack, a very large buffer might cause a stack overflow. On the other hand, a smaller buffer means that you can read fewer bytes with any given call to recv(), so you may have to call recv() more times to read in the same total number of bytes.
Note that number of bytes of data returned by recv() may be any number from 1 byte up to the total size of the buffer that you passed in to recv()'s third argument, and there is no way to predict how many bytes you'll get. In particular, with TCP the number of bytes you receive from any particular call to recv() will not have any correlation to the number of bytes previously passed to any particular call to send() on the sending side. So you just need to use a "reasonably sized" array (for whatever definition of "reasonably sized" you prefer) and recv() as many bytes into it as possible, and then handle that many bytes (based on recv()'s return value).
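As a hedged illustration of the "no benefit beyond SO_RCVBUF" point above (my own sketch, not part of the original answer), the socket's current receive-buffer size can be queried with getsockopt() and used to size the buffer passed to recv():

    #include <sys/socket.h>
    #include <stdlib.h>

    /* Sketch: size the recv() buffer from the socket's SO_RCVBUF value. */
    void recv_sized_by_rcvbuf(int fd)
    {
        int rcvbuf = 0;
        socklen_t optlen = sizeof(rcvbuf);

        if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcvbuf, &optlen) < 0 || rcvbuf <= 0)
            rcvbuf = 8192;                    /* fall back to a modest default */

        char *buf = malloc((size_t)rcvbuf);
        if (buf == NULL)
            return;

        ssize_t n = recv(fd, buf, (size_t)rcvbuf, 0);
        if (n > 0) {
            /* handle exactly n bytes here; n cannot be predicted in advance */
        }
        free(buf);
    }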
Consider the following invocation of read() on a nonblocking stream-mode socket (SOCK_STREAM):
ssize_t n = read(socket_fd, buffer, size);
Assume that the remote peer will not close the connection, and will not shut down its writing half of the connection (the reading half, from a local point of view).
On Linux, a short read (n > 0 && n < size) under these circumstances means that the kernel-level read buffer has been exhausted, and an immediate follow-up invocation would normally fail with EAGAIN/EWOULDBLOCK (it would fail unless new data manages to arrive in between the two calls).
In other words, on Linux, an invocation of read() will always consume everything that is immediately available provided that size is large enough.
Likewise for write(), on Linux a short write always means that the kernel-level buffer was filled, and an immediate follow-up invocation is likely to fail with EAGAIN/EWOULDBLOCK.
Question 1: Is this also guaranteed on macOS/OSX?
Question 2: Is this also guaranteed on FreeBSD?
Question 3: Is this required/guaranteed by POSIX?
I know this is true on Linux, because of the following note in the manual page for epoll (section 7):
For stream-oriented files (e.g., pipe, FIFO, stream socket), the condition that the read/write I/O space is exhausted can also be detected by checking the amount of data read from / written to the target file descriptor. For example, if you call read(2) by asking to read a certain amount of data and read(2) returns a lower number of bytes, you can be sure of having exhausted the read I/O space for the file descriptor. The same is true when writing using write(2). (Avoid this latter technique if you cannot guarantee that the monitored file descriptor always refers to a stream-oriented file.)
EDIT: As a motivation for the question, consider a case where you want to process input on a number of sockets simultaneously, and for whatever reason, you want to do this by fully exhausting the in-kernel buffer for each socket in turn (i.e., "depth first" rather than "breadth first"). This can obviously be done by repeating a read on a read-ready socket until it fails with EAGAIN/EWOULDBLOCK, but the last invocation would be redundant if the previous read was short and we knew that a short read was a guarantee of exhaustion.
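For context, a minimal sketch of that "drain until exhausted" loop (mine, not the asker's), using the short-read heuristic to skip the final redundant call:

    #include <unistd.h>
    #include <errno.h>
    #include <stddef.h>

    /* Drain a nonblocking stream socket, relying on the (Linux-documented)
       behaviour that a short read means the in-kernel buffer is exhausted. */
    static void drain_socket(int fd, char *buf, size_t size)
    {
        for (;;) {
            ssize_t n = read(fd, buf, size);
            if (n < 0) {
                if (errno == EAGAIN || errno == EWOULDBLOCK)
                    return;             /* nothing left right now */
                return;                 /* real error: handle as appropriate */
            }
            if (n == 0)
                return;                 /* peer closed the connection */

            /* process n bytes here ... */

            if ((size_t)n < size)
                return; /* short read: buffer exhausted, skip the redundant call */
        }
    }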
It is guaranteed by POSIX:
data shall be returned to the user as soon as it becomes available.
... and therefore on all the other platforms you mention as well, and also Windows, OS/2, NetWare, ...
Any other implementation would be pointless.
I write about 50k bytes of data every 200 ms to a file (stored on a USB disk, mounted on Linux 2.6.37, FAT32), using O_NONBLOCK. Is there any risk of the write() function returning EAGAIN? If yes, why, and in what case? I have already run the program for half an hour, and no error return has been reported.
Copy of correct-but-deleted answer:
No. The O_NONBLOCK flag doesn't affect working with regular files.
Some reference, for completeness:
That applies only to pipes; for regular files, it's ignored.
If the O_NONBLOCK flag is clear, a write request may cause the thread to block, but on normal completion it shall return nbyte.
If the O_NONBLOCK flag is set, write() requests shall be handled differently, in the following ways:
The write() function shall not block the thread.
A write request for {PIPE_BUF} or fewer bytes shall have the following effect: if there is sufficient space available in the pipe, write() shall transfer all the data and return the number of bytes requested. Otherwise, write() shall transfer no data and return -1 with errno set to [EAGAIN].
A write request for more than {PIPE_BUF} bytes shall cause one of the following:
When at least one byte can be written, transfer what it can and return the number of bytes written. When all data previously written to the pipe is read, it shall transfer at least {PIPE_BUF} bytes.
When no data can be written, transfer no data, and return -1 with errno set to [EAGAIN].
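To illustrate the pipe behaviour quoted above (my own sketch, not part of the answer, and deliberately distinct from the regular-file case it contrasts with), a nonblocking write to a full pipe fails with EAGAIN and the caller simply retries later:

    #include <fcntl.h>
    #include <unistd.h>
    #include <errno.h>
    #include <stddef.h>

    /* Attempt a nonblocking write to a pipe; returns bytes written,
       0 if the pipe is currently full (EAGAIN), or -1 on other errors. */
    static ssize_t try_pipe_write(int pipe_wr_fd, const char *data, size_t len)
    {
        /* put the write end in nonblocking mode (normally done once at setup) */
        int flags = fcntl(pipe_wr_fd, F_GETFL, 0);
        fcntl(pipe_wr_fd, F_SETFL, flags | O_NONBLOCK);

        ssize_t n = write(pipe_wr_fd, data, len);
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            return 0;   /* pipe full: try again later (e.g. after poll/select) */
        return n;       /* > 0: bytes written; -1: some other error */
    }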
The read and write functions (and relatives like send, recv, readv, ...) can return a number of bytes less than the requested read/write length if interrupted by a signal (under certain circumstances), and perhaps in other cases too. Is there a well-defined set of conditions for when this can happen, or is it largely up to the implementation? Here are some particular questions I'm interested in the answers to:
If a signal handler is non-interrupting (SA_RESTART), IO operations interrupted before any data is transferred will be restarted after the signal handler returns. But if a partial read/write has already occurred and the signal handler is non-interrupting, will the syscall return immediately with the partial length, or will it resume attempting to read/write the remainder?
Obviously read functions can return short reads on network, pipe, and terminal file descriptors when less data than the requested amount is available. But can write functions return short writes in these cases due to limited buffer size, or will they block until all the data can be written?
I'd be interested in all three of standards-required, common, and Linux-specific behavior.
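For reference, the SA_RESTART behaviour discussed above is selected when installing the handler with sigaction(); a minimal sketch of my own, for illustration only:

    #include <signal.h>
    #include <string.h>

    static void on_sigusr1(int sig) { (void)sig; /* async-signal-safe work only */ }

    /* Install a handler; with SA_RESTART, syscalls interrupted before any data
       was transferred are restarted instead of failing with EINTR. */
    static int install_handler(void)
    {
        struct sigaction sa;
        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = on_sigusr1;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = SA_RESTART;   /* omit this flag to get EINTR instead */
        return sigaction(SIGUSR1, &sa, NULL);
    }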
For your second question: write can return a short write on a limited buffer if the descriptor is non-blocking.
There's at least one standard condition that can cause write on a regular file to return a short size:
If a write() requests that more bytes be written than there is room for (for example, [XSI] the file size limit of the process or the physical end of a medium), only as many bytes as there is room for shall be written. For example, suppose there is space for 20 bytes more in a file before reaching a limit. A write of 512 bytes will return 20. The next write of a non-zero number of bytes would give a failure return (except as noted below).