Non-block socket consecutive file transfer - c

The server that I'm working on (a multi-threaded Unix C server using non-blocking sockets) needs to receive a file from a client and broadcast it to all the other clients connected to the server.
Everything works, except that I'm having a hard time determining when a file is done transferring. Since I'm using non-blocking sockets, recv() sometimes returns -1 during the transfer (which I assumed meant the end of the file), and then on the next pass more bytes come in.
I tried to hack around this by putting "END" at the end of the stream. However, when multiple files are sent in a row, the "END" is sometimes part of the same recv buffer as the beginning of the next file. Or even worse, sometimes I end up with a buffer that finishes with "EN" and the "D" comes in on the next pass.
What would be the best approach to avoid the situations mentioned above? I don't really want to scan the whole accumulated buffer each time I receive some bytes from the socket, just to check whether "END" is part of it and then cut appropriately. I'm sure there's a better solution to this, right?
Thanks in advance!

If recv() returns -1 it is an error and you need to inspect errno. Most probably it was EAGAIN or EWOULDBLOCK, which just means there is no data currently in the socket receive buffer. So you need to re-select().
When recv() returns zero the peer has disconnected the socket and the transfer is complete.

Signaling the end of a file with some byte sequence is not reliable, because the file itself could contain that sequence. Instead, send the file length first (4 bytes, or 8 if you allow huge file transfers) in network byte order, then count down as the data arrives:
if ((n = read(..., filelen)) > 0) {
filelen -= n;
}
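The length-prefix idea can be sketched as a small receive state that is fed whatever each recv() call happens to return. This is only an illustration under assumptions not in the original post: a 4-byte big-endian length prefix, and the hypothetical names struct file_rx and file_rx_feed(). It handles a header split across chunks and a chunk that contains the tail of one file plus the start of the next.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-connection receive state for one length-prefixed file. */
struct file_rx {
    unsigned char hdr[4];  /* big-endian length prefix, possibly partial */
    size_t hdr_have;       /* header bytes collected so far */
    uint32_t remaining;    /* payload bytes still expected */
    int in_body;           /* 0 while reading the header, 1 afterwards */
};

/*
 * Feed one chunk exactly as returned by recv(). Payload bytes are appended
 * to `out` (the caller guarantees enough room) and *out_len is advanced.
 * Returns the number of bytes consumed from `data` and sets *done to 1 when
 * the current file is complete. Unconsumed bytes belong to the next file.
 */
size_t file_rx_feed(struct file_rx *rx, const unsigned char *data, size_t len,
                    unsigned char *out, size_t *out_len, int *done)
{
    size_t used = 0;

    assert(rx && data && out && out_len && done);
    *done = 0;

    /* Collect the 4-byte header, which may arrive split across chunks. */
    while (!rx->in_body && used < len) {
        rx->hdr[rx->hdr_have++] = data[used++];
        if (rx->hdr_have == sizeof rx->hdr) {
            rx->remaining = ((uint32_t)rx->hdr[0] << 24) |
                            ((uint32_t)rx->hdr[1] << 16) |
                            ((uint32_t)rx->hdr[2] << 8) |
                            (uint32_t)rx->hdr[3];
            rx->in_body = 1;
        }
    }

    if (rx->in_body) {
        /* Take at most the bytes that still belong to this file. */
        size_t take = len - used;
        if (take > rx->remaining)
            take = rx->remaining;
        memcpy(out + *out_len, data + used, take);
        *out_len += take;
        used += take;
        rx->remaining -= (uint32_t)take;
        if (rx->remaining == 0) {
            memset(rx, 0, sizeof *rx);  /* ready for the next file */
            *done = 1;
        }
    }
    return used;
}
```

When file_rx_feed() returns fewer bytes than it was given, the caller would call it again with the leftover bytes, which start the next file's header.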

The simplest case EJP is referring to, where you treat the other end closing the socket as end-of-file, could look like the following:
{
    ssize_t sizeRead = 0;
    while ((sizeRead = recv(...)) != 0) {
        if (0 > sizeRead) { /* recv() failed */
            if ((EAGAIN == errno) || (EWOULDBLOCK == errno)) { /* retry the recv() on those two kinds of error */
                usleep(1); /* optional */
                continue;
            }
            else
                break;
        }
        ... /* process the data read ... */
    }
    if (0 > sizeRead) {
        /* There had been an error during recv() */
    }
}

Related

Write to file descriptor and immediately read from it

Today I have encountered some weird looking code that at first glance it's not apparent to me what it does.
send(file_desc,"Input \'y\' to continue.\t",0x18,0);
read(file_desc,buffer,100);
iVar1 = strcmp("y",(char *)buffer);
if (iVar1 == 0) {
// some more code
}
It seems that a text string is being written into the file descriptor. Immediately then after that it reads from that file descriptor into a buffer. And it compares if the text written into the buffer is a "y".
My understanding (please correct me if I am wrong), is that it writes some data which is a text string into the file descriptor, and then the file descriptor acts as a temporary storage location for anything you write to it. And after that it reads that data from the file descriptor into the buffer. It actually is the same file descriptor. It seems as a primitive way of using a file descriptor to copy data from the text string into the buffer. Why not just use a strcpy() instead?
What would be the use case of writing to a file descriptor and then immediately read from it? It seems like a convoluted way to copy data using file descriptors. Or maybe I don't understand this code well enough, what this sequence of a send() and a read() does?
And assuming that this code is instead using the file descriptor to copy the text string "Input \'y\' to continue.\t" into the buffer, why are they comparing it with the string "y"? It should probably be false every single time.
I am assuming that any data written into a file descriptor stays in that file descriptor until it is read from. So here it seems that send() is being used to write the string into, and read() is used to read it back out.
In man send it says:
The only difference between send() and write(2) is the presence of flags. With a zero
flags argument, send() is equivalent to write(2).
why would they use send() instead of write()? This code is just so mind boggling.
Edit: here's the full function where this code is originally from:
void send_read(int file_desc)
{
int are_equal;
undefined2 buffer [8];
char local_28 [32];
/* 0x6e == 110 == 'n' */
buffer[0] = 0x6e;
send(file_desc,"Input \'y\' to continue.\t",0x18,0);
read(file_desc,buffer,100);
are_equal = strcmp("y",(char *)buffer);
if (are_equal == 0) {
FUN_00400a86(file_desc,local_28);
}
else {
close(file_desc);
}
return;
}
The send() and recv() functions are for use with sockets (send: send a message on a socket — recv: receive a message from a connected socket). See also the POSIX description of Sockets in general.
Socket file descriptors are bi-directional — you can read and write on them. You can't read what you wrote, unlike with pipe file descriptors. With pipes, the process writing to the write end of a pipe can read what it wrote from the read end of the pipe — if another process didn't read it first. When a process writes on a socket, that information goes to the peer process and cannot be read by the writer.
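To see this behaviour directly, here is a small sketch using a hypothetical helper, has_pending_input(), that peeks at a socket without blocking. It relies on MSG_PEEK and MSG_DONTWAIT, which are available on Linux and the BSDs; the helper name and the socketpair() setup in the usage note are illustrative assumptions, not part of the code being discussed.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Returns 1 if at least one byte can be read from socket `fd` right now,
 * 0 otherwise (nothing queued, peer closed, or error). MSG_PEEK leaves the
 * data in the receive queue; MSG_DONTWAIT keeps the call from blocking.
 */
int has_pending_input(int fd)
{
    char c;

    assert(fd >= 0);
    return recv(fd, &c, 1, MSG_PEEK | MSG_DONTWAIT) > 0;
}
```

With a connected pair created by socketpair(AF_UNIX, SOCK_STREAM, 0, sv), sending on sv[0] makes has_pending_input(sv[1]) return 1 while has_pending_input(sv[0]) stays 0: the bytes went to the peer, not back to the writer.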
send(2) is a system call that can only be used with sockets. A socket is a descriptor that lets you send data to, or receive data from, a remote endpoint (a remote socket), which can be on a different computer or on the same one. It works like a phone line: what you send is received by your partner, and what your partner sends is received by you. The read(2) system call can be used with sockets, while send(2) cannot be used with files, so your sample code is mixing a socket-specific call with a general file-descriptor call (that's not uncommon, as read(2) and write(2) can both be used with sockets).
The code you posted above is erroneous, as it blindly passes the received buffer to strcmp(), assuming that it received a null-terminated string. That may be the case, but it may also not be.
Even if the sender (on the other side of the connection) agreed to send each message as a full nul-terminated string, the receiver must first check the amount of data received. This is the return value of the read(2) call, which can be:
-1 indicating some error on reception. The connection can be reset by the other side, or the other side can have rebooted while you send the data.
0 indicating no more data or end of data (the other side closed the connection) This can happen if the other side has a timeout and you take too much to respond. It closes the connection without sending anything. You just receive nothing.
n some data, less than the buffer size, but including the full packet sent by the peer (and the agreed nul byte it sent with it). This is the only case in which you can safely strcmp the data.
n some data, less than the buffer size, and less than the data transmitted. This can happen when the data is fragmented across several packets. Then you have to do another read until you have all the data sent by your peer. Packet fragmentation is natural in TCP, for example.
n some data, less than the buffer size, and more than the data transmitted. The sender did another transmit, after the one you receive, and both packets got into the kernel buffer. You have to investigate this case, as you have one full packet, and must save the rest of the received data in the buffer, for later processing, or you'll lose data you have received.
n some data, the full buffer filled, and no space to store the full transmitted data remained. You have filled the buffer and no \0 char came... the packet is larger than the buffer, you run out of buffer space and have to decide what to do (allocate other buffer to receive the rest, discard the data, or whatever you decide to do) This will not happen to you because you expect a packet of 1 or 2 characters, and you have a buffer of 100, but who knows...
At the least, as a minimal safety net, you can do this:
send(file_desc,"Input \'y\' to continue.\t",0x18,0);
int n = read(file_desc,buffer,sizeof buffer - 1); /* one cell reserved for '\0' */
switch (n) {
case -1: /* error */
do_error();
break;
case 0: /* disconnect */
do_disconnect();
break;
default: /* some data */
buffer[n] = '\0'; /* append the null */
break;
}
if (n > 0) {
iVar1 = strcmp("y",(char *)buffer);
if (iVar1 == 0) {
// some more code
}
}
Note:
As you didn't post a complete and verifiable example, I couldn't post a complete and verifiable response.
My apologies for that.

Is it OK to loop over recv / read to read all data from socket

I'm building a multi-client<->server messaging application over TCP.
I created a non blocking server using epoll to multiplex linux file descriptors.
When a fd receives data, I read() /or/ recv() into buf.
I know that I need to either specify a data length* at the start of the transmission, or use a delimiter** at the end of the transmission to segregate the messages.
*using a data length:
char *buffer_ptr = buffer;
do {
switch (recvd_bytes = recv(new_socket, buffer_ptr, rem_bytes, 0)) {
case -1: return SOCKET_ERR;
case 0: return CLOSE_SOCKET;
default: break;
}
buffer_ptr += recvd_bytes;
rem_bytes -= recvd_bytes;
} while (rem_bytes != 0);
**using a delimiter:
void get_all_buf(int sock, std::string & inStr)
{
int n = 1, total = 0, found = 0;
char c;
char temp[1024*1024];
// Keep reading up to a '\n'
while (!found) {
n = recv(sock, &temp[total], sizeof(temp) - total - 1, 0);
if (n == -1) {
/* Error, check 'errno' for more details */
break;
}
total += n;
temp[total] = '\0';
found = (strchr(temp, '\n') != 0);
}
inStr = temp;
}
My question: Is it OK to loop over recv() until one of those conditions is met? What if a client sends a bogus message length or no delimiter, or there is packet loss? Won't I be stuck looping on recv() in my program forever?
Is it OK to loop over recv() until one of those conditions is met?
Probably not, at least not for production-quality code. As you suggested, the problem with looping until you get the full message is that it leaves your thread at the mercy of the client -- if a client decides to only send part of the message and then wait for a long time (or even forever) without sending the last part, then your thread will be blocked (or looping) indefinitely and unable to serve any other purpose -- usually not what you want.
What if a client sends a bogus message length
Then you're in trouble (although if you've chosen a maximum-message-size you can detect obviously bogus message-lengths that are larger than that size, and defend yourself by e.g. forcibly closing the connection)
or there is packet loss?
If there is a reasonably small amount of packet loss, the TCP layer will automatically retransmit the data, so your program won't notice the difference (other than the message officially "arriving" a bit later than it otherwise would have). If there is really bad packet loss (e.g. someone pulled the Ethernet cable out of the wall for 5 minutes), then the rest of the message might be delayed for several minutes or more (until connectivity recovers, or the TCP layer gives up and closes the TCP connection), trapping your thread in the loop.
So what is the industrial-grade, evil-client-and-awful-network-proof solution to this dilemma, so that your server can remain responsive to other clients even when a particular client is not behaving itself?
The answer is this: don't depend on receiving the entire message all at once. Instead, you need to set up a simple state-machine for each client, such that you can recv() as many (or as few) bytes from that client's TCP socket as it cares to send to you at any particular time, and save those bytes to a local (per-client) buffer that is associated with that client, and then go back to your normal event loop even though you haven't received the entire message yet. Keep careful track of how many valid received-bytes-of-data you currently have on-hand from each client, and after each recv() call has returned, check to see if the associated per-client incoming-data-buffer contains an entire message yet, or not -- if it does, parse the message, act on it, then remove it from the buffer. Lather, rinse, and repeat.
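As a rough illustration of such a per-client buffer, the sketch below assumes newline-delimited messages and a fixed maximum message size. The names struct client_buf, client_buf_append(), client_buf_next() and CLIENT_BUF_MAX are hypothetical. The event loop would call client_buf_append() with whatever recv() returned, then call client_buf_next() repeatedly until it reports that no complete message is left.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CLIENT_BUF_MAX 4096  /* assumed maximum message size */

/* Hypothetical per-client accumulation buffer. */
struct client_buf {
    char data[CLIENT_BUF_MAX];
    size_t used;  /* number of valid bytes currently in data */
};

/*
 * Append the bytes from one recv() call. Returns 0 on success, or -1 if the
 * buffer would overflow (the client sent too much without a delimiter, so
 * the caller should probably drop the connection).
 */
int client_buf_append(struct client_buf *cb, const char *bytes, size_t n)
{
    assert(cb && bytes);
    if (n > sizeof cb->data - cb->used)
        return -1;
    memcpy(cb->data + cb->used, bytes, n);
    cb->used += n;
    return 0;
}

/*
 * If a complete '\n'-terminated message is buffered, copy it (without the
 * newline) into `msg` as a C string, remove it from the buffer and return 1.
 * Return 0 if no complete message has arrived yet. A message longer than
 * msg_cap - 1 bytes is truncated in `msg`.
 */
int client_buf_next(struct client_buf *cb, char *msg, size_t msg_cap)
{
    assert(cb && msg && msg_cap > 0);

    char *nl = memchr(cb->data, '\n', cb->used);
    if (nl == NULL)
        return 0;

    size_t len = (size_t)(nl - cb->data);
    size_t copy = len < msg_cap - 1 ? len : msg_cap - 1;
    memcpy(msg, cb->data, copy);
    msg[copy] = '\0';

    /* Shift any bytes that follow the newline to the front of the buffer. */
    memmove(cb->data, nl + 1, cb->used - len - 1);
    cb->used -= len + 1;
    return 1;
}
```

Because neither function ever waits for more data, a slow or malicious client only ties up its own buffer, not the thread running the event loop.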

One socket descriptor always blocked on write. Select not working?

Hello I have a server program and a client program. The server program is working fine, as in I can telnet to the server and I can read and write in any order (like a chat room) without any issue. However I am now working on my client program and when I use 'select' and check if the socket descriptor is set to read or write, it always goes to write and then is blocked. As in messages do not get through until the client sends some data.
How can I fix this on my client end so I can read and write in any order?
while (quit != 1)
{
FD_ZERO(&read_fds);
FD_ZERO(&write_fds);
FD_SET(client_fd, &read_fds);
FD_SET(client_fd, &write_fds);
if (select(client_fd+1, &read_fds, &write_fds, NULL, NULL) == -1)
{
perror("Error on Select");
exit(2);
}
if (FD_ISSET(client_fd, &read_fds))
{
char newBuffer[100] = {'\0'};
int bytesRead = read(client_fd, &newBuffer, sizeof(newBuffer));
printf("%s",newBuffer);
}
if(FD_ISSET(client_fd, &write_fds))
{
quit = transmit(handle, buffer, client_fd);
}
}
Here is code to transmit function
int transmit(char* handle, char* buffer, int client_fd)
{
int n;
printf("%s", handle);
fgets(buffer, 500, stdin);
if (!strchr(buffer, '\n'))
{
while (fgetc(stdin) != '\n');
}
if (strcmp (buffer, "\\quit\n") == 0)
{
close(client_fd);
return 1;
}
n = write(client_fd, buffer, strlen(buffer));
if (n < 0)
{
error("ERROR writing to socket");
}
memset(buffer, 0, 501);
}
I think you are misinterpreting the use of the writefds parameter of select(): only set the bit when you want to write data to the socket. In other words, if there is no data to send, do not set the bit.
Setting the bit will check if there is room for writing, and if yes, the bit will remain on. Assuming you are not pumping megabytes of data, there will always be room, so right now you will always call transmit() which waits for input from the command line with fgets(), thus blocking the rest of the program. You have to monitor both the client socket and stdin to keep the program running.
So, check for READ action on stdin (use STDIN_FILENO to get the file descriptor for that), READ on client_fd always and just write() your data to the client_fd if the amount of data is small (if you need to write larger data chunks consider non-blocking sockets).
BTW, you forget to return a proper value at the end of transmit().
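A minimal sketch of that approach, assuming a hypothetical helper wait_readable() that watches two descriptors for readability only and leaves writefds empty:

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>  /* STDIN_FILENO, read(), write() for callers */

/*
 * Block until fd_a and/or fd_b has data to read.
 * Returns a bitmask (1 = fd_a readable, 2 = fd_b readable), or -1 if
 * select() fails. No descriptor is placed in the write set, so the call
 * does not return just because the socket has room in its send buffer.
 */
int wait_readable(int fd_a, int fd_b)
{
    fd_set rfds;
    int maxfd = fd_a > fd_b ? fd_a : fd_b;

    assert(fd_a >= 0 && fd_b >= 0 && maxfd < FD_SETSIZE);
    FD_ZERO(&rfds);
    FD_SET(fd_a, &rfds);
    FD_SET(fd_b, &rfds);
    if (select(maxfd + 1, &rfds, NULL, NULL, NULL) == -1)
        return -1;
    return (FD_ISSET(fd_a, &rfds) ? 1 : 0) | (FD_ISSET(fd_b, &rfds) ? 2 : 0);
}
```

In the client loop this could be called as wait_readable(STDIN_FILENO, client_fd): if bit 1 is set, read the line from stdin and write() it to the socket; if bit 2 is set, read() from the socket and print what arrived.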
Sockets are almost always writable, except when the socket send buffer is full, which indicates that you are sending faster than the receiver is receiving.
So your transmit() function will be entered every time around the loop, so it will read some data from stdin, which blocks until you type something, so nothing happens.
You should only select on writability when a prior send() has returned EWOULDBLOCK/EAGAIN. Otherwise you should just send, when you have something to send.
I would throw this code away and use two or three threads in blocking mode.
select is used to check whether a socket has become ready to read or write. If it is blocking for read, that indicates there is no data to read. If it is blocking for write, that indicates the TCP send buffer is likely full and the remote end has to read some data before the socket will accept more. Since select blocks until one of the socket descriptors is ready, you should also pass a timeout to select to avoid waiting for a long time.
In your specific case, if your remote/receiving end keep reading data from the socket then the select will not block for the write on the other end. Otherwise the tcp buffer will become full on the sender side and select will block. Answers posted also indicate the importance of handling EAGAIN or EWOULDBLOCK.
Sample flow:
while(bytesleft > 0)
then
nbytes = write data
if(nbytes > 0)
bytesleft -= nbytes;
else
if write returns with EAGAIN or EWOULDBLOCK
call poll or select to wait for the socket to become ready
endif
endif
if poll or select times out
then handle the timeout error(e.g. the remote end did not send the
data within expected time interval)
endif
end while
The code should also handle error conditions and the special return values of read/write (for example, write/read returning 0). Also note that read/recv returning 0 indicates the remote end closed the socket.
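The sample flow could be written in C roughly as follows. This is a sketch, not a drop-in implementation: write_all() is a hypothetical name, a timeout is reported by setting errno to ETIMEDOUT, and a write() that returns 0 is simply treated as a failure.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write all `len` bytes to `fd`, which may be in non-blocking mode.
 * When the kernel buffer is full (EAGAIN/EWOULDBLOCK), wait with poll()
 * for up to timeout_ms milliseconds until the descriptor is writable.
 * Returns 0 on success, -1 on error or timeout (errno = ETIMEDOUT).
 */
int write_all(int fd, const char *buf, size_t len, int timeout_ms)
{
    assert(buf != NULL);

    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n > 0) {
            buf += n;
            len -= (size_t)n;
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;  /* interrupted by a signal: retry the write */
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            struct pollfd p = { .fd = fd, .events = POLLOUT };
            int r = poll(&p, 1, timeout_ms);
            if (r > 0 || (r < 0 && errno == EINTR))
                continue;  /* writable again, or poll was interrupted */
            if (r == 0)
                errno = ETIMEDOUT;  /* the peer did not drain in time */
            return -1;
        }
        return -1;  /* n == 0 or an unrecoverable error */
    }
    return 0;
}
```

Note that the socket is only polled for writability after a write has actually reported that it would block, which matches the advice given in the other answers.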

send file socket C linux

I have a problem with a server (call it serverA) that receives a file from another server (serverB) and sends it to the client. The problem is that the client receives 0 as the file size, and so zero bytes of the file:
/* receive file size from serverB */
recv(s,&bytes,sizeof(bytes),0);
/* send file size to client */
send(file_descriptor,&bytes,sizeof(bytes),0);
bytes = ntohs(bytes);
/* receive (from serverb) and send immediately (to client)*/
while (total != bytes) {
nread = read(s,&c,sizeof(char));
if(nread == 1){
send(file_descriptor,&c,sizeof(c),0);
total += nread;
}
}
What's wrong?
Everything could be wrong.
You must check I/O calls for errors before relying on side-effects, otherwise you will get unpredictable results.
In your case, perhaps the first recv() fails, leaving bytes unset, so it still holds whatever value it had before (possibly 0).
Also, the loop reading a single byte at a time is very inefficient, and still fails to check that it manages to send that byte (send() can fail in which case you need to re-try).
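A chunked relay that checks every call might look like the sketch below. It assumes blocking descriptors and that the byte count has already been received in full and converted to host byte order; write_full() and relay_bytes() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Write exactly len bytes to a blocking fd. Returns 0 on success, -1 on error. */
int write_full(int fd, const char *buf, size_t len)
{
    assert(buf != NULL);
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0 && errno == EINTR)
            continue;  /* interrupted: retry */
        if (n <= 0)
            return -1;
        buf += n;      /* partial write: send the rest on the next pass */
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Copy exactly `total` bytes from `in` to `out`, a chunk at a time.
 * Returns 0 on success, -1 on a read/write error or if `in` reaches
 * end-of-file before `total` bytes have been relayed.
 */
int relay_bytes(int in, int out, uint32_t total)
{
    char chunk[4096];

    while (total > 0) {
        size_t want = total < sizeof chunk ? total : sizeof chunk;
        ssize_t n = read(in, chunk, want);
        if (n < 0 && errno == EINTR)
            continue;
        if (n <= 0)
            return -1;  /* error, or the sender closed early */
        if (write_full(out, chunk, (size_t)n) != 0)
            return -1;
        total -= (uint32_t)n;
    }
    return 0;
}
```

The same checking applies to the size header itself: a recv() of sizeof(bytes) can fail or return fewer bytes than requested, so its return value needs to be tested before the value is used or forwarded.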

Forward proxy detecting FIN packet

I have written a forward proxy. I'm going to use it on both Windows and Linux, and I have made the required changes for each OS. However, I keep seeing some race conditions. I believe they are mostly due to my misunderstanding of how to tell which is the last packet (the FIN signal). Currently I do a select on a set of sockets. Whichever socket gets signalled, I do a read() on it. If read() returns 0 then I assume it is a FIN packet and I close that socket. Can it happen that my read() returns a non-zero value, but the packet also contains a FIN (I think it can happen)? If so, I am leaving some sockets open even though they have been closed.
I am not sure how proxies detect which socket has closed, or which is the last packet on an established connection.
My code looks like the following:
I have 100 fds which I have accepted from clients. I store them in an array sock_array[total_size].
select(copy_of_sock_array,timeout)
for(int cnt=0;cnt<total_size;cnt++)
{
if(FD_ISSET(sock_array[cnt],sock_array))
{
ret = recv(sock_array[cnt],buffer,len);
if(ret<=0){
/*This must be a FIN packet */
/* Close corresponding socket which is opened with outer world */
close(/*corresponding socket*/);
}
}
}
Does this look ok?
Thanks
You need to do a non-blocking read, and keep reading from the socket until you get a return value that indicates you should stop reading.
ssize_t r = 0;
for (;;) {
r = recv(sock, buf, bufsz, MSG_DONTWAIT);
if (r <= 0) {
if (r < 0 && errno == EINTR) {
continue;
}
break;
}
/* ... handle data in buf .. */
}
if (r < 0) {
if (errno == EAGAIN) {
/* ... wait in select again ... */
} else {
/* ... handle error ... */
}
} else {
/* got FIN */
}
Note that just because FIN is received does not necessarily mean the connection should be closed. The FIN merely indicates that no more data will be sent, but the peer may still be willing to accept more data. This can happen in HTTP where the client only wants a single response, so it delivers a FIN after its request. It still expects to receive the response though.
Your proxy likely has two sockets, say sock1 and sock2. So receipt of the FIN on sock1 should mean that this indication be forwarded onto sock2 after any data that has been queued on it has been delivered (and the mirror is true as well). You can forward the FIN by using shutdown.
shutdown(sock2, SHUT_WR);
When FIN has been received from both sock1 and sock2, you can call close on both sockets.
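The bookkeeping for this can be sketched with a small per-connection structure. The names struct proxy_pair and proxy_on_eof() are hypothetical, and the sketch assumes that any data still queued for the other side has already been written before the FIN is forwarded.

```c
#include <assert.h>
#include <sys/socket.h>

/* Hypothetical state for one proxied connection. */
struct proxy_pair {
    int sock1, sock2;
    int eof1, eof2;  /* set once recv() on that socket has returned 0 */
};

/*
 * Call when recv() on `from` (either sock1 or sock2) has returned 0.
 * Forwards the FIN to the opposite socket with shutdown(SHUT_WR), which
 * still allows data to flow in the other direction. Returns 1 once a FIN
 * has been seen on both sockets, meaning both can now be closed, else 0.
 */
int proxy_on_eof(struct proxy_pair *p, int from)
{
    assert(p && (from == p->sock1 || from == p->sock2));

    if (from == p->sock1) {
        p->eof1 = 1;
        shutdown(p->sock2, SHUT_WR);  /* forward the FIN to the other side */
    } else {
        p->eof2 = 1;
        shutdown(p->sock1, SHUT_WR);
    }
    return p->eof1 && p->eof2;
}
```

When proxy_on_eof() returns 1 the caller would close() both descriptors and discard the pair; until then the remaining direction keeps relaying data as usual.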
So addressing your questions.
Can it happen that my read() gives non zero value. But that packet does contain FIN (I think it can happen).
Yes, this may happen. This is why you continue reading until you get an indication to stop. Well, technically, you don't have to. You can defer that until you have processed some other connection if you have per connection fairness issues. But, you need to come back to it and finish reading before you enter your select wait.
So, I do not close some sockets though they have got closed. I am not sure how proxies detect which socket has closed? Or which is a last packet on the established connection.
As I described, as a (transparent) proxy, the socket can be safely closed once you have forwarded a FIN on it and a FIN has been received on it. If you are not a transparent proxy, you play by a different set of rules, since you really are the server for the client in that case. So, you can close the socket whenever the application protocol you are implementing permits you to do so.
Sockets have a well defined behavior. If you receive data and the connection is closed after that, you'll need two read()s. The first will return the data and the second one will return 0, to signal the end of connection.
You always have to read until the syscall returns 0.
And you don't need a non-blocking read to detect this!

Resources