I have written a forward proxy. I am going to use it on both Windows and Linux, and I have made the required changes for each OS. However, I keep seeing some race conditions. Mostly I believe they are due to my misunderstanding in guessing which is the last packet (FIN signal). Currently I do select on a set of sockets. Whichever socket gets signalled, I do read() on it. If read returns 0 then I assume it is a FIN packet and I close that socket. Can it happen that my read() gives non zero value. But that packet does contain FIN (I think it can happen). So, I do not close some sockets though they have got closed.
I am not sure how proxies detect which socket has closed? Or which is a last packet on the established connection.
My code looks like the following:
I have 100 fds which I have accepted from clients. I store them in an array sock_array[total_size].
select(copy_of_sock_array,timeout)
for(int cnt=0;cnt<total_size;cnt++)
{
if(FD_ISSET(sock_array[cnt],sock_array))
{
ret = recv(sock_array[cnt],buffer,len);
if(ret<=0){
/*This must be a FIN packet */
/* Close corresponding socket which is opened with outer world */
close(/*corresponding socket*/);
}
}
}
Does this look ok?
Thanks
You need to do a non-blocking read, and keep reading from the socket until you get a return value that indicates you should stop reading.
ssize_t r = 0;
for (;;) {
r = recv(sock, buf, bufsz, MSG_DONTWAIT);
if (r <= 0) {
if (r < 0 && errno == EINTR) {
continue;
}
break;
}
/* ... handle data in buf .. */
}
if (r < 0) {
if (errno == EAGAIN) {
/* ... wait in select again ... */
} else {
/* ... handle error ... */
}
} else {
/* got FIN */
}
Note that just because FIN is received does not necessarily mean the connection should be closed. The FIN merely indicates that no more data will be sent, but the peer may still be willing to accept more data. This can happen in HTTP where the client only wants a single response, so it delivers a FIN after its request. It still expects to receive the response though.
Your proxy likely has two sockets, say sock1 and sock2. So receipt of the FIN on sock1 should mean that this indication be forwarded onto sock2 after any data that has been queued on it has been delivered (and the mirror is true as well). You can forward the FIN by using shutdown.
shutdown(sock2, SHUT_WR);
When FIN has been received from both sock1 and sock2, you can call close on both sockets.
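The forwarding rule above can be sketched as follows. This is a minimal illustration, not the answer's own code: the helper name `relay_once` and the 4096-byte buffer are assumptions, and the helper is meant to be called only after select() has reported `from` as readable.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: copy one chunk from `from` to `to`.
 * Returns 1 if data was forwarded, 0 if a FIN was received on `from`
 * and passed on to `to` with shutdown(), -1 on error. */
static int relay_once(int from, int to)
{
    char buf[4096];
    ssize_t n;

    do {
        n = recv(from, buf, sizeof buf, 0);
    } while (n < 0 && errno == EINTR);

    if (n < 0)
        return -1;

    if (n == 0) {
        /* Peer finished sending: forward the FIN, but keep `to` open for
         * reading so the reply can still flow back the other way. */
        return shutdown(to, SHUT_WR) < 0 ? -1 : 0;
    }

    /* Forward everything that was read; send() may accept less than asked. */
    ssize_t off = 0;
    while (off < n) {
        ssize_t w = send(to, buf + off, (size_t)(n - off), 0);
        if (w < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        off += w;
    }
    return 1;
}
```

The caller would record a return value of 0 per direction and call close on both sockets only once both directions have returned 0.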
So addressing your questions.
Can it happen that my read() gives non zero value. But that packet does contain FIN (I think it can happen).
Yes, this may happen. This is why you continue reading until you get an indication to stop. Well, technically, you don't have to. You can defer that until you have processed some other connection if you have per connection fairness issues. But, you need to come back to it and finish reading before you enter your select wait.
So, I do not close some sockets though they have got closed. I am not sure how proxies detect which socket has closed? Or which is a last packet on the established connection.
As I described, as a (transparent) proxy, the socket can be safely closed once you have forwarded a FIN on it and a FIN has been received on it. If you are not a transparent proxy, you play by a different set of rules, since you really are the server for the client in that case. So, you can close the socket whenever the application protocol you are implementing permits you to do so.
Sockets have a well defined behavior. If you receive data and the connection is closed after that, you'll need two read()s. The first will return the data and the second one will return 0, to signal the end of connection.
You always have to read until the syscall returns 0.
And you don't need a non-blocking read to detect this!
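As a small sketch of that behavior with a plain blocking socket (the helper name `drain_until_eof` is hypothetical), the loop below keeps calling recv() until it returns 0 and reports how many bytes arrived before the end of the connection:

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: read from a blocking socket until the peer closes.
 * Returns the number of bytes received before end-of-stream, or -1 on error. */
static long drain_until_eof(int fd)
{
    char buf[1024];
    long total = 0;

    for (;;) {
        ssize_t n = recv(fd, buf, sizeof buf, 0);  /* blocks until data or FIN */
        if (n > 0) {
            total += n;        /* data first ... */
            continue;
        }
        if (n == 0)
            return total;      /* ... then 0 marks the end of the connection */
        if (errno == EINTR)
            continue;
        return -1;
    }
}
```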
I'm building a multi-client<->server messaging application over TCP.
I created a non blocking server using epoll to multiplex linux file descriptors.
When a fd receives data, I read() /or/ recv() into buf.
I know that I need to either specify a data length* at the start of the transmission, or use a delimiter** at the end of the transmission to segregate the messages.
*using a data length:
char *buffer_ptr = buffer;
do {
switch (recvd_bytes = recv(new_socket, buffer_ptr, rem_bytes, 0)) {
case -1: return SOCKET_ERR;
case 0: return CLOSE_SOCKET;
default: break;
}
buffer_ptr += recvd_bytes;
rem_bytes -= recvd_bytes;
} while (rem_bytes != 0);
**using a delimiter:
void get_all_buf(int sock, std::string & inStr)
{
int n = 1, total = 0, found = 0;
char c;
char temp[1024*1024];
// Keep reading up to a '\n'
while (!found) {
n = recv(sock, &temp[total], sizeof(temp) - total - 1, 0);
if (n == -1) {
/* Error, check 'errno' for more details */
break;
}
total += n;
temp[total] = '\0';
found = (strchr(temp, '\n') != 0);
}
inStr = temp;
}
My question: Is it OK to loop over recv() until one of those conditions is met? What if a client sends a bogus message length or no delimiter or there is packet loss? Won't I be stuck looping recv() in my program forever?
Is it OK to loop over recv() until one of those conditions is met?
Probably not, at least not for production-quality code. As you suggested, the problem with looping until you get the full message is that it leaves your thread at the mercy of the client -- if a client decides to only send part of the message and then wait for a long time (or even forever) without sending the last part, then your thread will be blocked (or looping) indefinitely and unable to serve any other purpose -- usually not what you want.
What if a client sends a bogus message length
Then you're in trouble (although if you've chosen a maximum-message-size you can detect obviously bogus message-lengths that are larger than that size, and defend yourself by e.g. forcibly closing the connection)
or there is packet loss?
If there is a reasonably small amount of packet loss, the TCP layer will automatically retransmit the data, so your program won't notice the difference (other than the message officially "arriving" a bit later than it otherwise would have). If there is really bad packet loss (e.g. someone pulled the Ethernet cable out of the wall for 5 minutes), then the rest of the message might be delayed for several minutes or more (until connectivity recovers, or the TCP layer gives up and closes the TCP connection), trapping your thread in the loop.
So what is the industrial-grade, evil-client-and-awful-network-proof solution to this dilemma, so that your server can remain responsive to other clients even when a particular client is not behaving itself?
The answer is this: don't depend on receiving the entire message all at once. Instead, you need to set up a simple state-machine for each client, such that you can recv() as many (or as few) bytes from that client's TCP socket as it cares to send to you at any particular time, and save those bytes to a local (per-client) buffer that is associated with that client, and then go back to your normal event loop even though you haven't received the entire message yet. Keep careful track of how many valid received-bytes-of-data you currently have on-hand from each client, and after each recv() call has returned, check to see if the associated per-client incoming-data-buffer contains an entire message yet, or not -- if it does, parse the message, act on it, then remove it from the buffer. Lather, rinse, and repeat.
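A minimal sketch of such a per-client buffer, assuming newline-delimited messages and a fixed maximum size. The names `client_state`, `client_feed`, `client_next_message` and the 4096-byte limit are illustrative choices, not part of the original answer.

```c
#include <stddef.h>
#include <string.h>

#define CLIENT_BUF_MAX 4096   /* assumed maximum message size */

/* Hypothetical per-client state: the bytes received so far. */
struct client_state {
    char   buf[CLIENT_BUF_MAX];
    size_t used;
};

/* Append freshly received bytes. Returns 0 on success, or -1 if the client
 * has exceeded the maximum size (the caller should close the connection). */
static int client_feed(struct client_state *c, const char *data, size_t len)
{
    if (len > sizeof c->buf - c->used)
        return -1;
    memcpy(c->buf + c->used, data, len);
    c->used += len;
    return 0;
}

/* If a complete '\n'-terminated message is buffered, copy it into out
 * (without the newline, NUL-terminated, truncated to outsz - 1 bytes),
 * remove it from the buffer and return 1. Return 0 if the message is
 * still incomplete. outsz must be at least 1. */
static int client_next_message(struct client_state *c, char *out, size_t outsz)
{
    char *nl = memchr(c->buf, '\n', c->used);
    if (nl == NULL)
        return 0;

    size_t msglen = (size_t)(nl - c->buf);
    size_t copy = msglen < outsz - 1 ? msglen : outsz - 1;
    memcpy(out, c->buf, copy);
    out[copy] = '\0';

    /* Shift the start of any following message to the front of the buffer. */
    c->used -= msglen + 1;
    memmove(c->buf, nl + 1, c->used);
    return 1;
}
```

In the event loop, each successful recv() on a client would be followed by one `client_feed` call and then a loop over `client_next_message` until it returns 0, after which control returns to epoll or select.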
Before I Start
Please don't mark this question as a duplicate. I have already seen the numerous posts on SO about handling multiple clients with socket programming. Most people recommend just multi-threading, but I am trying to avoid that path because I have read it has a few problems:
Bad Scalability
Large Overhead/Inefficient/Memory Hungry
Difficult to Debug
Any posts that I have read that specifically talk about using a single thread either have bad/no answers or have unclear explanations, like people saying "Just use select()!"
The Problem
I am writing code for a server to handle multiple (~1000) clients, and I'm having trouble figuring out how to create an efficient solution. Right now I already have the code for my server that is able to handle 1 client at a time. Both are written in C; the server is on Windows using WinSock and the client is on Linux.
The server and client send several communications back and forth, using send() and blocking recv() calls. Writing this code was pretty simple, and I won't post it here because it is pretty long and I doubt anyone will actually read through all of it. Also the exact implementation is not important, I just want to talk about high level pseudocode. The real difficulty is changing the server to handle multiple clients.
What's Already Out There
I have found a nice PDF tutorial about how to create a WinSock server that handles multiple clients and it can be found here: WinSock Multiple Client Support. It's in C++ but it's easily transferable to C.
From what I understand the server operates something like this:
while (running) {
Sleep(1000);
/* Accept all incoming clients and add to clientArray. */
for (client in clientArray) {
/* Interact with client */
if (recv(...) == "disconnect") {
/* Disconnect from client */
}
}
}
/* Close all connections. */
The problem that I see with using this approach is that you essentially only handle one client at a time (which is obvious because you aren't multithreading), but what if the interaction with each client only needs to happen once? Meaning, what if I just want to send some data back and forth and close the connection? This operation could take anywhere from 5 seconds to 5 minutes depending on the speed of the clients connection, so other clients would be blocking on a connect() call to the server while the server handles a client for 5 minutes. It doesn't seem very efficient, but maybe the best way would be to implement a waiting queue, where clients are connected and told to wait for a while? I'm not sure, but it makes me curious about how large servers send out update downloads concurrently to thousands of clients, and if I should operate the same way.
Also, is there a reason for adding a Sleep(1000) call in the main server loop, if the send() and recv() between the server and client take a while (~1 minute)?
What I'm Asking For
What I want is a solution to handling multiple clients on a single threaded server that is efficient enough for ~1000 clients. If you tell me that the solution in the PDF is fine, that's good enough for me (maybe I'm just too preoccupied with efficiency.)
Please give answers that include a verbal explanation of the implementation, server/client pseudocode, or even a small sample of code for the server (if you're feeling sadistic).
Thanks in advance.
I have written a single-threaded socket pool handler. I'm using non-blocking sockets and the select call to handle all sends, receives and errors.
My class keeps all sockets in an array and builds three fd sets for the select call. When something happens, it checks the read, write and error sets and handles those events.
For example, a non-blocking client socket can trigger a write or error event while it is connecting. If an error event happens, the connection failed. If a write event happens, the connection is established.
All sockets are in the read fd set. If you create a server socket (with bind and listen), a new connection will trigger a read event on it. So check whether the socket is the server socket, and if so call accept for the new connection. If the read event is triggered by a regular socket, then there are some bytes to read: just call recv with a buffer large enough to take all the data from that socket.
SOCKET maxset=0;
fd_set rset, wset, eset;
FD_ZERO(&rset);
FD_ZERO(&wset);
FD_ZERO(&eset);
for (size_t i=0; i<readsockets.size(); i++)
{
SOCKET s = readsockets[i]->s->GetSocket();
FD_SET(s, &rset);
if (s > maxset) maxset = s;
}
for (size_t i=0; i<writesockets.size(); i++)
{
SOCKET s = writesockets[i]->s->GetSocket();
FD_SET(s, &wset);
if (s > maxset) maxset = s;
}
for (size_t i=0; i<errorsockets.size(); i++)
{
SOCKET s = errorsockets[i]->s->GetSocket();
FD_SET(s, &eset);
if (s > maxset) maxset = s;
}
int ret = 0;
if (bBlocking)
ret = select(maxset + 1, &rset, &wset, &eset, NULL/*&tv*/);
else
{
timeval tv= {0, timeout*1000};
ret = select(maxset + 1, &rset, &wset, &eset, &tv);
}
if (ret < 0)
{
//int err = errno;
NetworkCheckError();
return false;
}
if (ret > 0)
{
// Loop through eset and check each socket with FD_ISSET. A socket found here means its connect failed.
// Loop through wset and check each socket with FD_ISSET. If a socket is set, check whether it has a pending connection. If it does, that socket has just connected. Otherwise select is reporting that some data has been sent and you can send more.
// Finally, loop through rset and check each socket with FD_ISSET. If a socket is set, check whether it is your server socket (bind and listen). If it is, a new client wants to connect: call accept and the new connection is established. If it is not the server socket, call recv on it to collect the new data.
}
There are a few more things to handle. All sockets must be in non-blocking mode. A send or recv call may then return -1 (error) with the error code EWOULDBLOCK. That's normal, and the error can be ignored. If recv returns 0, the connection was closed by the peer. If send returns 0 bytes sent, the internal buffer is full.
You need to write additional code to serialize and parse data. For example, after a recv the message may not be complete (depending on message size), so it may take more than one recv call to receive a complete message. And if messages are short, a single recv call can deliver several messages in the buffer. So you need to write a good parser or design a good protocol that is easy to parse.
First, regarding the single-thread approach: I'd say it's a bad idea because your server's processing power is limited by the performance of a single processor core. But other than that it'll work to some extent.
Now about the multi-client problem. I'd suggest using WSASend and WSARecv with their completion routines. It can also be scaled to multiple threads if necessary.
Server core will look something like this:
struct SocketData {
::SOCKET socket;
::WSAOVERLAPPED overlapped;
::WSABUF bufferRef;
char buf [1024];
// other client-related data
SocketData (void) {
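overlapped = {}; // zero the structure before first use
// hEvent is free for application use when a completion routine is supplied
overlapped.hEvent = (HANDLE) this;
bufferRef.buf = buf;
bufferRef.len = sizeof (buf);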
// ...
}
};
void CALLBACK OnRecv (
DWORD dwError,
DWORD cbTransferred,
LPWSAOVERLAPPED lpOverlapped,
DWORD dwFlags) {
auto data = (SocketData*) lpOverlapped->hEvent;
if (dwError || !cbTransferred) {
::closesocket (data->socket);
delete data;
return;
}
// process received data
// ...
}
// same for OnSend
int main (void) {
// init and start async listener
::SOCKET serverSocket = ::socket (...);
HANDLE hAccept = ::CreateEvent (nullptr, 0, 0, nullptr);
// WSAEventSelect takes the event handle first, then the network-event mask
::WSAEventSelect (serverSocket, hAccept, FD_ACCEPT);
::bind (serverSocket, ...);
::listen (serverSocket, ...);
// main loop
for (;;) {
int r = ::WaitForSingleObjectEx (hAccept, INFINITE, 1);
if (r == WAIT_IO_COMPLETION)
continue;
// accept processing
auto data = new SocketData ();
data->socket = ::accept (serverSocket, ...);
// detach new socket from hAccept event
::WSAEventSelect (data->socket, nullptr, 0);
// recv first data from client
DWORD flags = 0; // WSARecv needs a valid pointer for lpFlags
::WSARecv (
data->socket,
&data->bufferRef,
1,
nullptr,
&flags,
&data->overlapped,
&OnRecv);
}
}
Key points:
wait in main loop (WaitForSingleObjectEx, WaitForMultipleObjectsEx etc.) must be alertable;
most data processing done in OnSend/OnRecv;
all processing must be done without blocking APIs in OnSend/OnRecv;
for event-based processing events must be waited in main loop.
OnRecv will be called for each completed receive operation, and OnSend for each completed send operation. Keep in mind that the amount of data you asked to send or receive is not necessarily the amount actually transferred; check cbTransferred.
I am trying to read data off an Openssl linked socket using SSL_read. I perform Openssl operations in client mode that sends command and receives data from a real-world server. I used two threads where one thread handles all Openssl operations like connect, write and close. I perform the SSL_read in a separate thread. I am able to read data properly when I issue SSL_read once.
But I ran into problems when I tried to perform multiple connect, write, close sequences. Ideally I should terminate the thread performing the SSL_read in response to close. This is because for the next connect we would get a new ssl pointer and so we do not want to perform read on old ssl pointer. But problem is when I do SSL_read, I am stuck until there is data available in SSL buffer. It gets blocked on the SSL pointer, even when I have closed the SSL connection in the other thread.
while(1) {
memset(sbuf, 0, sizeof(uint8_t) * TLS_READ_RCVBUF_MAX_LEN);
read_data_len = SSL_read(con, sbuf, TLS_READ_RCVBUF_MAX_LEN);
switch (SSL_get_error(con, read_data_len)) {
case SSL_ERROR_NONE:
.
.
.
}
I tried every solution I could find, but none works. Mostly I tried to get an indication that there might be data in the SSL buffer, but none of them gives a proper indication.
I tried:
- Doing SSL_pending first to know if there is data in SSL buffer. But this always returns zero
- Doing select on the Openssl socket to see if it returns value bigger than zero. But it always returns zero.
- Making the socket non-blocking and trying the select, but it doesn't seem to work. I am not sure if I got the code right.
An example of where I used select for blocking socket is as follows. But select always returns zero.
while(1) {
// The use of Select here is to timeout
// while waiting for data to read on SSL.
// The timeout is set to 1 second
i = select(width, &readfds, NULL,
NULL, &tv);
if (i < 0) {
// Select Error. Take appropriate action for this error
}
// Check if there is data to be read
if (i > 0) {
if (FD_ISSET(SSL_get_fd(con), &readfds)) {
// TODO: We have data in the SSL buffer. But are we
// sure that the data is from read buffer? If not,
// SSL_read can be stuck indefinitely.
// Maybe we can do SSL_read(con, sbuf, 0) followed
// by SSL_pending to find out?
memset(sbuf, 0, sizeof(uint8_t) * TLS_READ_RCVBUF_MAX_LEN);
read_data_len = SSL_read(con, sbuf, TLS_READ_RCVBUF_MAX_LEN);
error = SSL_get_error(con, read_data_len);
switch (error) {
.
.
}
So as you can see, I have tried a number of ways to get the thread performing SSL_read to terminate in response to close, but I didn't get any of them to work as expected. Has anybody got SSL_read to work properly? Is a non-blocking socket the only solution to my problem? With a blocking socket, how do you get out of SSL_read if you never receive a response to a command? Can you give an example of a working solution for a non-blocking socket with read?
I can point you to a working example of non-blocking client socket with SSL ... https://github.com/darrenjs/openssl_examples
It uses non-blocking sockets with standard linux IO (based on poll event loop). Raw data is read from the socket and then fed into SSL memory BIO's, which then perform the decryption.
The approach I used was single threaded. A single thread performs the connect, write, and read. This means there cannot be any problems associated with one thread closing a socket, while another thread is trying to use that socket. Also, as noted by the SSL FAQ, "an SSL connection cannot be used concurrently by multiple threads" (https://www.openssl.org/docs/faq.html#PROG1), so single threaded approach avoids problems with concurrent SSL write & read.
The challenge with the single-threaded approach is that you then need some kind of synchronized queue and signalling mechanism for submitting and holding outbound data (e.g. the commands that you want to send from client to server), and the socket event loop has to detect when there is data pending for write and pull it from the queue. For that I would look at the standard std::list, std::mutex etc., and either pipe2 or eventfd for signalling the event loop.
OpenSSL calls recv() which in turn obeys the SOCKET's timeout, which by default is infinite. You can change the timeout thusly:
void socket_timeout_receive_set(SOCKET handle, dword milliseconds)
{
if(handle==SOCKET_HANDLE_NULL)
return;
struct timeval tv = { long(milliseconds / 1000), (milliseconds % 1000) * 1000 };
setsockopt(handle, SOL_SOCKET, SO_RCVTIMEO, (char *)&tv, sizeof(tv));
}
Unfortunately, ssl_error_get() returns SSL_ERROR_SYSCALL which it returns in other situations too, so it's not easy to determine that it timed out. But this function will help you determine if the connection is lost:
bool socket_dropped(SOCKET handle)
{
// Special thanks: "Detecting and terminating aborted TCP/IP connections" by Vinayak Gadkari
if(handle==SOCKET_HANDLE_NULL)
return true;
// create a socket set containing just this socket
fd_set socket_set;
FD_ZERO(&socket_set);
FD_SET(handle, &socket_set);
// if the connection is unreadable, it is not dropped (strange but true)
static struct timeval timeout = { 0, 0 };
int count = select(0, &socket_set, NULL, NULL, &timeout);
if(count <= 0) {
// problem: count==0 on a connection that was cut off ungracefully, presumably by a busy router
// for connections that are open for a long time but may not talk much, call keepalive_set()
return false;
}
if(!FD_ISSET(handle, &socket_set)) // creates a dependency on __WSAFDIsSet()
return false;
// peek at the next character
// recv() returns 0 if the connection was dropped
char dummy;
count = recv(handle, &dummy, 1, MSG_PEEK);
if(count > 0)
return false;
if(count==0)
return true;
int sec = WSAGetLastError(); // assumption: `sec` was meant to hold the last socket error code
return sec==WSAECONNRESET || sec==WSAECONNABORTED || sec==WSAENETRESET || sec==WSAEINVAL;
}
Hello I have a server program and a client program. The server program is working fine, as in I can telnet to the server and I can read and write in any order (like a chat room) without any issue. However I am now working on my client program and when I use 'select' and check if the socket descriptor is set to read or write, it always goes to write and then is blocked. As in messages do not get through until the client sends some data.
How can I fix this on my client end so I can read and write in any order?
while (quit != 1)
{
FD_ZERO(&read_fds);
FD_ZERO(&write_fds);
FD_SET(client_fd, &read_fds);
FD_SET(client_fd, &write_fds);
if (select(client_fd+1, &read_fds, &write_fds, NULL, NULL) == -1)
{
perror("Error on Select");
exit(2);
}
if (FD_ISSET(client_fd, &read_fds))
{
char newBuffer[100] = {'\0'};
int bytesRead = read(client_fd, &newBuffer, sizeof(newBuffer));
printf("%s",newBuffer);
}
if(FD_ISSET(client_fd, &write_fds))
{
quit = transmit(handle, buffer, client_fd);
}
}
Here is code to transmit function
int transmit(char* handle, char* buffer, int client_fd)
{
int n;
printf("%s", handle);
fgets(buffer, 500, stdin);
if (!strchr(buffer, '\n'))
{
while (fgetc(stdin) != '\n');
}
if (strcmp (buffer, "\\quit\n") == 0)
{
close(client_fd);
return 1;
}
n = write(client_fd, buffer, strlen(buffer));
if (n < 0)
{
error("ERROR writing to socket");
}
memset(buffer, 0, 501);
}
I think you are misinterpreting the use of the writefds parameter of select(): only set the bit when you want to write data to the socket. In other words, if there is no data, do not set the bit.
Setting the bit will check if there is room for writing, and if yes, the bit will remain on. Assuming you are not pumping megabytes of data, there will always be room, so right now you will always call transmit() which waits for input from the command line with fgets(), thus blocking the rest of the program. You have to monitor both the client socket and stdin to keep the program running.
So, check for READ action on stdin (use STDIN_FILENO to get the file descriptor for that), READ on client_fd always and just write() your data to the client_fd if the amount of data is small (if you need to write larger data chunks consider non-blocking sockets).
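A minimal sketch of that arrangement, assuming a hypothetical helper `wait_readable` that watches two descriptors for reading only. No write set is passed to select(), so the call does not return just because the socket has room to send.

```c
#include <sys/select.h>
#include <unistd.h>   /* STDIN_FILENO */

#define READY_A 1   /* first descriptor is readable  */
#define READY_B 2   /* second descriptor is readable */

/* Hypothetical helper: block until at least one of the two descriptors is
 * readable. Returns a bit mask of READY_A / READY_B, or -1 on error. */
static int wait_readable(int fd_a, int fd_b)
{
    fd_set read_fds;
    FD_ZERO(&read_fds);
    FD_SET(fd_a, &read_fds);
    FD_SET(fd_b, &read_fds);

    int maxfd = fd_a > fd_b ? fd_a : fd_b;
    if (select(maxfd + 1, &read_fds, NULL, NULL, NULL) < 0)
        return -1;

    int ready = 0;
    if (FD_ISSET(fd_a, &read_fds))
        ready |= READY_A;
    if (FD_ISSET(fd_b, &read_fds))
        ready |= READY_B;
    return ready;
}

/* Intended use in the client loop:
 *   int ready = wait_readable(STDIN_FILENO, client_fd);
 *   if (ready & READY_A)  read a line from stdin and write() it to client_fd
 *   if (ready & READY_B)  read() from client_fd and print it
 */
```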
BTW, you forget to return a proper value at the end of transmit().
Sockets are almost always writable, except when the socket send buffer is full, which indicates that you are sending faster than the receiver is receiving.
So your transmit() function will be entered every time around the loop, so it will read some data from stdin, which blocks until you type something, so nothing happens.
You should only select on writability when a prior send() has returned EWOULDBLOCK/EAGAIN. Otherwise you should just send, when you have something to send.
I would throw this code away and use two or three threads in blocking mode.
select is used to check whether a socket has become ready to read or write. If it is blocking for read, then that indicates there is no data to read. If it is blocking for write, then that indicates the TCP buffer is likely full and the remote end has to read some data before the socket will allow more data to be written. Since select blocks until one of the socket descriptors is ready, you also need to use a timeout in select to avoid waiting for a long time.
In your specific case, if your remote/receiving end keep reading data from the socket then the select will not block for the write on the other end. Otherwise the tcp buffer will become full on the sender side and select will block. Answers posted also indicate the importance of handling EAGAIN or EWOULDBLOCK.
Sample flow:
while(bytesleft > 0)
then
nbytes = write data
if(nbytes > 0)
bytesleft -= nbytes;
else
if write returns with EAGAIN or EWOULDBLOCK
call poll or select to wait for the socket to become ready
endif
endif
if poll or select times out
then handle the timeout error(e.g. the remote end did not send the
data within expected time interval)
endif
end while
The code should also handle error conditions and unusual return values (for example, write/read returning 0). Also note that read/recv returning 0 indicates the remote end closed the socket.
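The sample flow could look roughly like this in C. It is a sketch under assumptions: the helper name `send_all` and its timeout parameter are hypothetical, and a real server would probably also want to suppress SIGPIPE (for example with MSG_NOSIGNAL where available).

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: send len bytes, waiting with poll() only when the
 * send buffer is full. Returns 0 once everything was sent, -1 on error or
 * if the socket stays unwritable for timeout_ms milliseconds. */
static int send_all(int fd, const char *data, size_t len, int timeout_ms)
{
    while (len > 0) {
        ssize_t n = send(fd, data, len, 0);
        if (n > 0) {
            data += n;
            len -= (size_t)n;
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            /* Buffer full: wait until the peer has read some data. */
            struct pollfd pfd = { .fd = fd, .events = POLLOUT };
            int r = poll(&pfd, 1, timeout_ms);
            if (r > 0 || (r < 0 && errno == EINTR))
                continue;
            return -1;   /* timed out or poll failed */
        }
        return -1;       /* any other error, or an unexpected 0 */
    }
    return 0;
}
```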
The server that I'm working on (a Unix C multi-threaded server using non-blocking sockets) needs to receive a file from a client and broadcast it to all the other clients connected to the server.
Everything is working, except that I'm having a hard time determining when a file is done transferring. Since I'm using non-blocking sockets, sometimes during the file transfer recv returns -1 (which I was assuming was the end of the file), and then on the next pass more bytes come in.
I tried to hack around this by putting "END" at the end of the stream. However, sometimes when multiple files are sent in a row, the "END" is part of the same recv buffer as the beginning of the next file. Or even worse, sometimes I end up with a buffer that finishes with EN, and the D comes in on the next pass.
What would be the best approach to avoid the situations mentioned above? I don't really want to scan the whole accumulated buffer for "END" each time I receive some bytes from the socket and then cut appropriately. I'm sure there's a better solution to this, right?
Thanks in advance!
If recv() returns -1 it is an error and you need to inspect errno. Most probably it was EAGAIN or EWOULDBLOCK, which just means there is no data currently in the socket receive buffer. So you need to re-select().
When recv() returns zero the peer has disconnected the socket and the transfer is complete.
Signaling the end of a file with some byte sequence is not reliable, the file could contain that sequence. First send the file length - 4 bytes or 8 if you allow huge file transfer, use network byte order.
if ((n = read(..., filelen)) > 0) {
filelen -= n;
}
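A small sketch of the length-prefix idea, assuming a 4-byte header. The helper names `put_length`, `get_length` and `take_from_chunk` are hypothetical and only illustrate the bookkeeping; the actual send and recv calls are left to the caller.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Sender side: encode the file length as a 4-byte header in network byte order. */
static void put_length(unsigned char hdr[4], uint32_t len)
{
    uint32_t be = htonl(len);
    memcpy(hdr, &be, 4);
}

/* Receiver side: decode the 4-byte header once all four bytes have arrived. */
static uint32_t get_length(const unsigned char hdr[4])
{
    uint32_t be;
    memcpy(&be, hdr, 4);
    return ntohl(be);
}

/* Receiver side: given how many file bytes are still expected and how many
 * bytes the last recv() delivered, return how many of them belong to the
 * current file. Any bytes beyond that are the start of the next header. */
static size_t take_from_chunk(uint32_t *remaining, size_t chunk_len)
{
    size_t take = chunk_len < *remaining ? chunk_len : *remaining;
    *remaining -= (uint32_t)take;
    return take;
}
```

When `*remaining` reaches 0 the file is complete, with no need to search the data for a marker, and a recv() that returns -1 with EAGAIN simply means the rest has not arrived yet.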
The simplest case EJP is referring to, where you take the closing of the socket by the other end as end-of-file, could look like the following:
{
ssize_t sizeRead = 0;
while (sizeRead = recv(...)) {
if (0 > sizeRead) { /* recv() failed */
if ((EAGAIN == errno) || (EWOULDBLOCK == errno)) { /* retry the recv() on those two kinds of error */
usleep(1); /* optional */
continue;
}
else
break;
}
... /* process the data read ... */
}
if (0 > sizeRead) {
/* There had been an error during recv() */
}
}