I'm writing a TCP server in C, but I'm running into problems with send. I read a local file and send the data back to the client. When the file is small I have no problems, but when it gets bigger I hit this strange situation:
TCP server:
// create socket, bind, listen, accept
// read file
fseek(fptr, 0, SEEK_SET);
// malloc the sending buffer
ssize_t read = fread(sbuf, 1, file_size, fptr);
while (to_send > 0) {
    sent = send(socket, sbuf, buf_size, 0);
    sbuf += sent;
    to_send -= sent;
}
On huge files sent becomes equal to the max value of size_t; I think I have a buffer overflow. How can I prevent this? What is the best practice for reading from a file and sending it back?
The problem is that you send buf_size bytes every time, even if there aren't that many left.
For example, pretend buf_size is 8 and you are sending 10 bytes (so initially to_send is also 10). The first send sends 8 bytes, so you need to send 2 more. The second time, you also send 8 bytes (which probably reads out of bounds). Then to_send will be -6, which wraps around to SIZE_MAX - 5 as an unsigned size_t.
A simple fix is to send to_send bytes when that is smaller:
sent = send(socket, sbuf, to_send < buf_size ? to_send : buf_size, 0);
Also, send returns -1 if it is unsuccessful, which becomes SIZE_MAX when assigned to a size_t. You need some error handling to deal with this.
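Putting both fixes together, a sketch of the corrected loop using the question's own variables (note that sent must be declared as ssize_t for the error check to work):

while (to_send > 0) {
    /* never ask for more bytes than actually remain */
    size_t chunk = to_send < buf_size ? to_send : buf_size;
    ssize_t sent = send(socket, sbuf, chunk, 0);
    if (sent < 0) {
        perror("send");   /* error: stop instead of wrapping to_send around */
        break;
    }
    sbuf += sent;
    to_send -= (size_t)sent;
}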
On huge files sent becomes equal to the max value of size_t; I think I have a buffer overflow.
Since sent gets its value as the return value of send(), and send() returns ssize_t, which is a signed type unlikely to be wider than size_t, it is virtually certain that what is actually happening is that send() is indicating an error by returning -1. In that case, it will also be setting errno to a value indicative of the error. It cannot return the maximum value of size_t on any system I've ever had my hands on.
How can I prevent this?
In the first place, before you worry about preventing it, you should be sure to detect it by:

- declaring sent as a ssize_t to match the return type of send(), not a size_t, and
- checking the value returned into sent for such error conditions.
Second, if you are really dealing with files longer than can be represented by a ssize_t (much less a size_t), then it is a poor idea to load the whole thing into memory before sending any of it. Instead, load it in (much) smaller blocks, and send the data one such block at a time. Not only will this tend to have lower perceived latency, but it will also avoid any risk associated with approaching the limits of the data types involved.
Additionally, when you do so, be careful to do it right. You have done well to wrap your send() call in a loop to account for short writes, but as @Artyer describes in his answer, you don't quite get it right, because you do not reduce the number of bytes you try to send on the second and subsequent calls.
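A minimal sketch of that block-at-a-time approach (the function name, block size, and error handling here are illustrative choices; assumes <stdio.h> and <sys/socket.h>, a blocking socket, and an already-open FILE *):

int send_file(int sock, FILE *fptr)
{
    char block[65536];
    size_t nread;

    while ((nread = fread(block, 1, sizeof block, fptr)) > 0) {
        char *p = block;
        size_t left = nread;
        while (left > 0) {                /* inner loop covers short sends */
            ssize_t n = send(sock, p, left, 0);
            if (n < 0)
                return -1;                /* caller inspects errno */
            p += n;
            left -= (size_t)n;
        }
    }
    return ferror(fptr) ? -1 : 0;         /* distinguish EOF from a read error */
}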
What is the best practice for reading from a file and sending it back?
As above.
Related
How does the send()* function handle the following call?

ssize_t retval = send(sock, buf, SIZE_MAX, 0);
If send is successfully able to send SIZE_MAX bytes, then the return type isn't big enough to hold such a big value (typically SSIZE_MAX ~= SIZE_MAX/2).
Will the send() function limit itself to send only SSIZE_MAX bytes in this case?
* ssize_t send(int socket, const void *buffer, size_t length, int flags);
- https://linux.die.net/man/3/send
The only way for send to conform to the requirement that it return the number of bytes sent upon successful completion and that it return −1 if it failed is for send never to send more than SSIZE_MAX bytes. When the length parameter exceeds SSIZE_MAX, send must either fail or return a length less than requested.
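So a caller that might really pass such huge lengths can cap the request itself; a minimal sketch (remaining, p, and sock are hypothetical names; SSIZE_MAX comes from <limits.h>):

size_t chunk = remaining > (size_t)SSIZE_MAX ? (size_t)SSIZE_MAX : remaining;
ssize_t n = send(sock, p, chunk, 0);   /* the return value is now always representable */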
send()'s length is limited by the underlying protocol's ability to handle the message. The return type is signed so that errors can be reported as negative numbers. So it is theoretically limited, but there is as yet no protocol that can transfer a SIZE_MAX-sized message.
send() will attempt to send SIZE_MAX bytes from the buffer pointed to by buf. However, SIZE_MAX is essentially the largest value that the architecture's pointer type could conceivably represent, so it is usually at minimum the size of the entire address space, if not more.
I am interposing the read operation with my own implementation of read that prints some logging and calls the libc read. I am wondering what the right way is to handle a read with a huge nbyte parameter. Since nbyte is a size_t, what is the right way to handle an out-of-range read request? From the read manpage:
If the value of nbyte is greater than {SSIZE_MAX}, the result is implementation-defined
What does this mean, and if I have to handle a large read request, what should I do?
Don't change the behavior of the read() call - just wrap the OS-provided call and allow it to do what it does.
#define _GNU_SOURCE     /* for RTLD_NEXT; assumes the usual dlsym trick to find libc's read() */
#include <dlfcn.h>
#include <unistd.h>

ssize_t read(int fd, void *buf, size_t bytes)
{
    static ssize_t (*real_read)(int, void *, size_t);
    ssize_t result;

    if (real_read == NULL)   /* locate the real read() once */
        real_read = (ssize_t (*)(int, void *, size_t))dlsym(RTLD_NEXT, "read");

    /* ... log the request ... */
    result = real_read(fd, buf, bytes);
    /* ... log the result ... */

    return result;
}
What could you possibly do if you're implementing a 64-bit library and a caller passes you a size_t value that's greater than SSIZE_MAX? You can't split that up into anything reasonable anyway.
And if you're implementing a 32-bit library, how would you pass the proper result back even if you did split up the read?
You could break up the one large request into several smaller ones.
Besides, SSIZE_MAX is positively huge. Are you really sure you need to read >2GB of data in one go?
You could simply use strace(1) to get some logs of your read syscalls.
In practice the read count is the size of some in-memory buffer, so it is very unusual for it to be bigger than a dozen megabytes; it is often a few kilobytes.
So I believe you should not care about the SSIZE_MAX limit in real life.
The last parameter of read is the buffer size. It's not the number of bytes to read. So:

- If the buffer size you received is less than SSIZE_MAX, call the read syscall with that buffer size.
- If the buffer size you received is greater than SSIZE_MAX, read SSIZE_MAX bytes.
- If the read syscall returns -1, return -1 too.
- If the read syscall returns 0 or fewer than SSIZE_MAX bytes, return the total number of bytes read.
- If the read syscall returns exactly SSIZE_MAX, subtract SSIZE_MAX from the buffer size you received and loop (back to "So").

Do not forget to adjust the buffer pointer and to count the total number of bytes read; a sketch of this loop follows.
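A minimal sketch of that loop (the helper name read_all is hypothetical; assumes POSIX read() from <unistd.h> and SSIZE_MAX from <limits.h>):

#include <limits.h>
#include <unistd.h>

ssize_t read_all(int fd, void *buf, size_t nbyte)
{
    char *p = buf;
    size_t total = 0;

    while (nbyte > 0) {
        size_t chunk = nbyte > (size_t)SSIZE_MAX ? (size_t)SSIZE_MAX : nbyte;
        ssize_t n = read(fd, p, chunk);

        if (n < 0)
            return -1;               /* propagate the error */
        total += (size_t)n;
        if ((size_t)n < chunk)
            break;                   /* EOF or short read: stop here */
        p += n;
        nbyte -= (size_t)n;
    }
    /* caveat: a total above SSIZE_MAX still cannot be represented in the
       return type, which is exactly the ambiguity the question asks about */
    return (ssize_t)total;
}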
Being implementation-defined means that there is no correct answer, and callers should never do this (because they can't be certain how it will be handled). Given that you are interposing the syscall, I suggest you just assert(3) that the value is in range. If you end up failing that assert somewhere, fix the calling code to be compliant.
I read the documentation for the send() function, which says the third parameter (len) is "The length, in bytes, of the data in buffer pointed to by the buf parameter".
I can't seem to understand whether it sends the number of bytes I pass, or whether I need to pass the size of the buffer and it sends all the data in it.
Example:
#define X 256

// in main:
char test[X] = {0};
memcpy(test, "hello", 6);
send(sockfd, test, 6, 0);
send(sockfd, test, 256, 0);
// will the first option send only hello? or hello000000....?
Thanks!
The send function sends precisely the number of bytes you tell it to (assuming it's not interrupted and doesn't otherwise fail). The number of bytes you need to send is determined by the protocol you are implementing. For example, if the protocol says you should send "FOO\r\n", then you need to send 5 bytes. If the protocol specifies that integers are represented as 4 bytes in network byte order and you're sending an integer, the buffer should contain an integer in network byte order and you should send 4 bytes. The size of the buffer doesn't matter to send.
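Applied to the question's example (assuming each call succeeds and is not cut short; see the next answer about partial sends):

send(sockfd, test, 6, 0);     /* transmits exactly 6 bytes: 'h','e','l','l','o','\0' */
send(sockfd, test, 256, 0);   /* transmits all 256 bytes: "hello", its NUL, and 250 more zeros */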
As a complement to David Schwartz's correct answer:
Whether or not the socket is non-blocking, it is NOT guaranteed that a single send will actually send all the data. You must check the return value, and you might have to call send again (with correct buffer offsets).
For instance, if you want to send 10 bytes of data (len = 10), you call send(sock, buf, len, 0). However, let's say it only manages to send 5 bytes; then send(..) will return 5, meaning that you will have to call it again later as send(sock, (buf + 5), (len - 5), 0). That is, skip the first five bytes in the buffer, since they're already sent, and subtract five bytes from the total number of bytes (len) you want to send.
(Note that I used parentheses only to make it easier to read, and this assumes that buf is a pointer to a 1-byte type.)
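A sketch of that retry logic wrapped in a helper (the name send_all is hypothetical; assumes a blocking socket and <sys/socket.h>):

ssize_t send_all(int sock, const char *buf, size_t len)
{
    size_t total = 0;

    while (total < len) {
        ssize_t n = send(sock, buf + total, len - total, 0);
        if (n < 0)
            return -1;               /* caller inspects errno */
        total += (size_t)n;
    }
    return (ssize_t)total;           /* == len on success */
}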
I just recently started my programming education with inter-process communications, and this piece of code was written in the parent process's code section. From what I have read about write(), it returns -1 if it failed, 0 if nothing was written to the pipe(), and a positive integer if successful. How exactly does sizeof(value) help us identify this? Isn't if (write(request[WRITE], &value, sizeof(value)) < 1) a much more reader-friendly alternative to the sizeof(value) comparison?
if (sizeof(value) != write(request[WRITE], &value, sizeof(value)))
{
    perror("Cannot write thru pipe.\n");
    return 1;
}
Code clarification: the variable value is a digit read as input by the parent process, which the parent then sends to the child process through a pipe for the child to do some arithmetic operation on.
Any help or clarification on the subject is very much appreciated.
Edit: How do I highlight my system functions here when asking questions?
This also catches a successful but partial write, which the application wants to treat as a failure.
It's slightly easier to read without the pointless parentheses:
if(write(request[WRITE], &value, sizeof value) != sizeof value)
So, for instance, if value is an int, it might occupy 4 bytes, but if write() writes just 2 of those it will return 2, which is caught by this test.
At least in my opinion. Remember that sizeof is not a function.
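For example, because sizeof is an operator, parentheses are only required around type names, not expressions:

size_t a = sizeof value;    /* fine: the operand is an expression */
size_t b = sizeof (int);    /* parentheses required for a type name */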
That's not a read, that's a write. The principle is almost the same, but there's a bit of a twist.
As a general rule you are correct: write() could return a "short count", indicating a partial write. For instance, you might ask to write 2000 bytes to some file descriptor, and write might return a value like 1024 instead, indicating that 976 (2000 - 1024) bytes were not written even though no actual error occurred. (This occurs, for instance, when receiving a signal while writing on a "slow" device like a tty or pty. Of course, the application must decide what to do about the partial write: should it consider this an error? Should it retry the remaining bytes? It's pretty common to wrap the write in a loop that retries the remaining bytes in case of short counts; the stdio fwrite code does this, for instance.)
With pipes, however, there's a special case: writes of sufficiently small size (less than or equal to PIPE_BUF) are atomic. So assuming sizeof(value) <= PIPE_BUF, and that this really is writing on a pipe, this code is correct: write will return either sizeof(value) or -1.
(If sizeof(value) is 1, the code is correct—albeit misleading—for any descriptor: write never returns zero. The only possible return values are -1 and some positive value between 1 and the number of bytes requested-to-write, inclusive. This is where read and write are not symmetric with respect to return values: read can, and does, return zero.)
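For illustration, the atomicity assumption can be made explicit at compile time (a sketch assuming value is an int, C11's _Static_assert, and PIPE_BUF from <limits.h>):

#include <limits.h>

/* writes of at most PIPE_BUF bytes to a pipe are atomic, so the
   all-or-nothing comparison in the question is sound for this type */
_Static_assert(sizeof(int) <= PIPE_BUF,
               "pipe writes of this size are not guaranteed atomic");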
I'm in the middle of reading Internetworking with TCP/IP Vol III, by Comer.
I am looking at some sample code for a "TIME" client for UDP.
The code gets to the point where it reads the response, takes what should be 4 bytes, and converts it to a 32-bit unsigned integer so it can be converted to UNIX time.
"n" is a file descriptor that points to a socket that listens for UDP.
n = read(s, (char *)&now, sizeof(now));
if (n < 0)
    errexit("read failed: %s\n", strerror(errno));
now = ntohl((u_long)now);    /* put in host byte order */
What I am wondering is:
Are there some assumptions that should be checked before making the conversion? This is in C, and I am wondering if there are situations where read would return a number of bytes that is not 4. If so, it seems like "now" would be a mess.
"Now" is defined as:
time_t now; /* 32-bit integer to hold time */
So maybe I don't understand the nature of "time_t", or how the bytes are passed around in C, or what situations UDP would return the wrong number of bytes to the file descriptor...
Thanks in advance.
With UDP, as long as the receive buffer you pass to read is long enough, a single UDP packet won't be broken up between read calls.
However, there's no guarantee that the other side sent a packet of at least 4 bytes - you're quite right: if a server sent only a 2-byte response, then that code would leave now containing garbage.
That probably doesn't matter too much in this precise situation - after all, the server is just as free to send 4 bytes of garbage as it is to send only 2 bytes. If you want to check for it, just check that the n returned by read is as long as you were expecting.
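A sketch of that check, reusing the question's variables and its errexit helper (assuming errexit accepts printf-style arguments, as its use in the question suggests):

n = read(s, (char *)&now, sizeof(now));
if (n < 0)
    errexit("read failed: %s\n", strerror(errno));
if (n != (int)sizeof(now))
    errexit("short reply: got %d bytes, expected %d\n", n, (int)sizeof(now));
now = ntohl((u_long)now);    /* put in host byte order */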