My application creates a TCP connection, and this normally works.
But in one network the server has several IP addresses, say
174.X.X.X
54.x.x.x
and so on.
A TCP connect (non-blocking, with a timeout of 60 seconds)
to IP 174.X.X.X always succeeds.
But a TCP connect to the same server at IP 54.x.x.x fails (most of the time) with errno 115,
which means "Operation now in progress".
Can you please explain the possible reasons for errno 115?
OS: Linux
My TCP connect code is below:
tcp_connect(......)
{
int iValOpt = 0;
socklen_t iLength = 0;
fcntl((int)(long)sockID, F_SETFL, O_NONBLOCK);
ret = connect(sockID, (struct sockaddr*)pstSockAdr, uiSockLen);
if (ret < 0)
{
if (errno == EINPROGRESS)
{
stTv.tv_sec = 60;
stTv.tv_usec = 0;
FD_ZERO(&write_fd);
FD_SET(sockID, &write_fd);
iLength = sizeof(int);
/* select() returns > 0 only if the socket became writable in time */
if (0 < select(sockID+1, NULL, &write_fd, NULL, &stTv))
{
/* SO_ERROR holds the final result of the asynchronous connect */
if (0 > getsockopt(sockID, SOL_SOCKET, SO_ERROR, (void*)(&iValOpt), &iLength))
{
return -1;
}
if (0 != iValOpt)
{
return -1;
}
return success;
}
else
{
return -1;
}
}
else
{
return -1;
}
}
return success;
}
Based on your information:
You are trying to do a connect() to 54.x.x.x
The socket is non-blocking
Connection timeout is 60 sec
First, if you look into your /usr/include/asm-generic/errno.h you'll see the following:
#define EINPROGRESS 115 /* Operation now in progress */
It means an operation on the socket is still in progress. Since you said you are doing a connect() call, let's look at man connect:
EINPROGRESS
The socket is nonblocking and the connection cannot be completed
immediately. It is possible to select(2) or poll(2) for completion by
selecting the socket for writing. After select(2) indicates
writability, use getsockopt(2) to read the SO_ERROR option at level
SOL_SOCKET to determine whether connect() completed successfully
(SO_ERROR is zero) or unsuccessfully (SO_ERROR is one of the usual
error codes listed here, explaining the reason for the failure).
So, the best guess would be that the TCP 3-way handshake (your connect() call to the 54.x.x.x IP address) is taking longer than expected to complete. Since the socket is non-blocking, connect() returns at once with EINPROGRESS while the handshake continues in the background. As suggested in the man page, use select() or poll() to wait until the socket becomes writable, then check SO_ERROR before performing read() or write() calls.
You can pinpoint what is preventing your TCP handshake from completing by capturing and analyzing the traffic between your own machine and 54.x.x.x. The best tool to help you with this is Wireshark. Good luck.
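The sequence the man page describes can be sketched as follows. This is a minimal illustration, not the asker's code: the helper name connect_with_timeout and its millisecond timeout parameter are assumptions, and it uses poll() instead of select().

```c
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>   /* struct sockaddr_in, for callers */
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: start a non-blocking connect() and wait up to
 * timeout_ms for it to finish.
 * Returns 0 on success, -1 on failure with errno set
 * (ETIMEDOUT if the handshake did not finish in time). */
int connect_with_timeout(int fd, const struct sockaddr *addr,
                         socklen_t addrlen, int timeout_ms)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;

    if (connect(fd, addr, addrlen) == 0)
        return 0;                    /* connected immediately */
    if (errno != EINPROGRESS)
        return -1;                   /* genuine failure */

    /* EINPROGRESS: the handshake continues in the background. */
    struct pollfd pfd = { .fd = fd, .events = POLLOUT, .revents = 0 };
    int rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0)
        return -1;                   /* poll() itself failed */
    if (rc == 0) {
        errno = ETIMEDOUT;           /* still in progress after the wait */
        return -1;
    }

    /* Writable: SO_ERROR tells whether the connect succeeded. */
    int so_error = 0;
    socklen_t len = sizeof so_error;
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &so_error, &len) < 0)
        return -1;
    if (so_error != 0) {
        errno = so_error;
        return -1;
    }
    return 0;
}
```

With this shape, EINPROGRESS never reaches the caller; a slow handshake shows up as ETIMEDOUT and a rejected one as the SO_ERROR value (for example ECONNREFUSED).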
This seems to be the behaviour of connect():
If the connection cannot be established immediately and O_NONBLOCK is
set for the file descriptor for the socket, connect() shall fail and
set errno to [EINPROGRESS], but the connection request shall not be
aborted, and the connection shall be established asynchronously.
Subsequent calls to connect() for the same socket, before the
connection is established, shall fail and set errno to [EALREADY].
App A and app B each create a Unix domain datagram socket. A calls connect() to connect to B, so A can use read and write or send and recv to communicate with B. But if B crashes, A blocks in recv. After B restarts, sending a message to A fails with error 1, "Operation not permitted". Is there any way A can detect that B crashed?
OS: Ubuntu 18.04, kernel 4.18.0
If you want to detect connection errors, a connection-oriented socket type might be more appropriate, such as SOCK_STREAM or SOCK_SEQPACKET.
From the man page of socket:
...
If a piece of data for which the peer protocol has buffer space cannot be successfully transmitted within a reasonable length of time, then the connection is considered to be dead. When SO_KEEPALIVE is enabled on the socket the protocol checks in a protocol-specific manner if the other end is still alive.
...
SOCK_SEQPACKET sockets employ the same system calls as SOCK_STREAM sockets.
Nonblocking reads can be achieved via the O_NONBLOCK flag (on the descriptor) or the recv flag MSG_DONTWAIT.
Connection error detection on connectionless protocols has to be implemented in the application. You could implement a simple ping/heartbeat mechanism, where a client has to send (empty) packets within a specific time interval to indicate that it is still alive or still participating in the communication.
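The receiving side of such a heartbeat scheme could look like the sketch below. The helper name wait_for_heartbeat and the idea of treating any datagram as a sign of life are assumptions made for illustration; the interval is whatever the two applications agree on.

```c
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: wait up to timeout_ms for the next datagram
 * (heartbeat or payload) on fd.
 * Returns 1 if a datagram arrived, 0 if the peer stayed silent for the
 * whole interval (the caller may treat it as dead), -1 on error. */
int wait_for_heartbeat(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int rc = poll(&pfd, 1, timeout_ms);
    if (rc <= 0)
        return rc;                  /* 0: timeout, -1: poll() error */

    char buf[64];
    /* An empty datagram is a valid heartbeat, so any result >= 0 counts. */
    ssize_t n = recv(fd, buf, sizeof buf, 0);
    return n < 0 ? -1 : 1;
}
```

A caller would typically allow a few missed intervals before declaring the peer dead, since a single late datagram does not prove a crash.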
Edit: I've used TCP/UDP synonymously for SOCK_STREAM/SOCK_DGRAM (as the user Shawn pointed out in the comments below).
I had the same problem recently, and the solutions that I found on the web did not entirely convince me. Therefore, I came up with this:
// Client: replace every close() with this close_dgram_socket():
int close_dgram_socket(int fd)
{
if (send(fd, "Bye-bye.", 8, 0) == -1)
perror("send");
return close(fd);
}
// Server loop:
do
{
fd_set readfds; FD_ZERO(&readfds); FD_SET(fd, &readfds);
struct timeval timeout = { 0, 300000 }; // 300 ms
rx_len = 0; // keeps the loop condition well defined after a timeout
if (select(fd+1, &readfds, 0, 0, &timeout) == 0) // nothing received
{
if (kill(getppid(), 0) != 0) // signal 0 only checks that the process exists
break; // client has died
else
continue;
}
rx_len = recv(fd, buf, sizeof(buf), 0);
} while(rx_len != 8 || strncmp(buf, "Bye-bye.", 8));
close(fd); // peer has closed, so do we.
The idea is that (1) a peer sends a Bye-bye datagram when it closes the socket in an orderly way, and (2) the receiving side checks with kill() whether the other process is still alive, to handle the case where it crashes and cannot send the Bye-bye datagram. This works fine for socketpair() but not for UDP.
I'm trying to do a https connection over TLSv1.2. Context creation and everything is fine which I've not included in the below code.
if(https) //block 1
{
int ssl_err = 0;
int ssl_rc = 0;
ssl_fd = SSL_new(ctx); /* create new SSL connection state */
if(!ssl_fd) //block 2
{
printf("Failure in SSL state creation.");
ssl_err = -1;
}
if(ssl_err == 0) //block 3
{
SSL_set_fd(ssl_fd, fd); /* attach the socket descriptor */
ERR_clear_error();
ssl_rc = SSL_connect(ssl_fd); /* perform the connection*/
printf("Failure in SSL connection %d returned.\n", ssl_rc);
if ( ssl_rc == -1 ) //block 4
ssl_err = -1;
}
if ( ssl_err == -1 ) //block 5
{
printf("Failure in SSL connection.\n");
SSL_free(ssl_fd);
shutdown(fd, 2);
abort();
}
}
In the code above, it's showing output as
Failure in SSL connection -1 returned.
Failure in SSL connection.
I've checked the packet capture. Immediately (within 200 microseconds) after sending the ClientHello, the code enters if block 5 and sends a FIN. I could not find the cause, because SSL_connect returns an error without waiting for the server's response.
I commented out if block 5 and tested again. To my surprise, since shutdown is no longer called, the SSL handshake completes and data transfer over TLSv1.2 continues to the end. That means SSL_connect is actually succeeding, but somehow asynchronously. But this way I can't report real errors in the SSL handshake.
Can anyone help me with this behaviour?
Is it actually doing the handshake asynchronously? If yes, why does it return -1 immediately? Shouldn't it wait for the handshake to complete before returning -1?
As you mentioned, you have non-blocking sockets. In that case, if SSL_connect() returns -1, you need to call SSL_get_error() and check whether it returns SSL_ERROR_WANT_READ or SSL_ERROR_WANT_WRITE, then call SSL_connect() again once select() reports that the underlying socket is ready for reading or writing.
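The waiting step of that retry loop can be sketched as a small helper. The name wait_ready and the 5000 ms timeout are assumptions; the OpenSSL calls appear only in a comment so that the block compiles without the OpenSSL headers, and poll() stands in for select().

```c
#include <poll.h>
#include <sys/socket.h>

/* Hypothetical helper: block until fd is readable (want_write == 0) or
 * writable (want_write != 0), or until timeout_ms elapses.
 * Returns 1 when ready, 0 on timeout, -1 on poll() error. */
int wait_ready(int fd, int want_write, int timeout_ms)
{
    struct pollfd pfd = {
        .fd = fd,
        .events = want_write ? POLLOUT : POLLIN,
        .revents = 0,
    };
    int rc = poll(&pfd, 1, timeout_ms);
    return rc > 0 ? 1 : rc;
}

/* Intended use with OpenSSL (ssl and fd are the caller's objects):
 *
 *   for (;;) {
 *       int rc = SSL_connect(ssl);
 *       if (rc == 1)
 *           break;                               // handshake complete
 *       int err = SSL_get_error(ssl, rc);
 *       if (err == SSL_ERROR_WANT_READ) {
 *           if (wait_ready(fd, 0, 5000) <= 0) goto fail;
 *       } else if (err == SSL_ERROR_WANT_WRITE) {
 *           if (wait_ready(fd, 1, 5000) <= 0) goto fail;
 *       } else {
 *           goto fail;                           // genuine handshake error
 *       }
 *   }
 */
```

Only an error other than SSL_ERROR_WANT_READ or SSL_ERROR_WANT_WRITE, or a timeout in the wait, should lead to the cleanup path in block 5.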
I am trying to read data off an Openssl linked socket using SSL_read. I perform Openssl operations in client mode that sends command and receives data from a real-world server. I used two threads where one thread handles all Openssl operations like connect, write and close. I perform the SSL_read in a separate thread. I am able to read data properly when I issue SSL_read once.
But I ran into problems when I tried to perform multiple connect, write, close sequences. Ideally I should terminate the thread performing the SSL_read in response to close. This is because for the next connect we would get a new ssl pointer and so we do not want to perform read on old ssl pointer. But problem is when I do SSL_read, I am stuck until there is data available in SSL buffer. It gets blocked on the SSL pointer, even when I have closed the SSL connection in the other thread.
while(1) {
memset(sbuf, 0, sizeof(uint8_t) * TLS_READ_RCVBUF_MAX_LEN);
read_data_len = SSL_read(con, sbuf, TLS_READ_RCVBUF_MAX_LEN);
switch (SSL_get_error(con, read_data_len)) {
case SSL_ERROR_NONE:
.
.
.
}
I tried all the solutions I could find, but none works. Mostly I tried to get some indication that there might be data in the SSL buffer, but none of them gives a proper indication.
I tried:
- Calling SSL_pending first to find out whether there is data in the SSL buffer. But this always returns zero.
- Calling select on the OpenSSL socket to see whether it returns a value bigger than zero. But it always returns zero.
- Making the socket non-blocking and trying select, but it doesn't seem to work. I am not sure I got the code right.
An example of where I used select with a blocking socket follows. But select always returns zero.
while(1) {
// The use of Select here is to timeout
// while waiting for data to read on SSL.
// The timeout is set to 1 second
i = select(width, &readfds, NULL,
NULL, &tv);
if (i < 0) {
// Select Error. Take appropriate action for this error
}
// Check if there is data to be read
if (i > 0) {
if (FD_ISSET(SSL_get_fd(con), &readfds)) {
// TODO: We have data in the SSL buffer. But are we
// sure that the data is from read buffer? If not,
// SSL_read can be stuck indefinitely.
// Maybe we can do SSL_read(con, sbuf, 0) followed
// by SSL_pending to find out?
memset(sbuf, 0, sizeof(uint8_t) * TLS_READ_RCVBUF_MAX_LEN);
read_data_len = SSL_read(con, sbuf, TLS_READ_RCVBUF_MAX_LEN);
error = SSL_get_error(con, read_data_len);
switch (error) {
.
.
}
So as you can see, I have tried a number of ways to get the thread performing SSL_read to terminate in response to close, but I didn't get it to work as expected. Has anybody got SSL_read to work properly? Is a non-blocking socket the only solution to my problem? With a blocking socket, how do you solve the problem of returning from SSL_read if you never get a response to a command? Can you give an example of a working solution for a non-blocking socket with read?
I can point you to a working example of non-blocking client socket with SSL ... https://github.com/darrenjs/openssl_examples
It uses non-blocking sockets with standard Linux IO (based on a poll event loop). Raw data is read from the socket and then fed into SSL memory BIOs, which then perform the decryption.
The approach I used was single threaded. A single thread performs the connect, write, and read. This means there cannot be any problems associated with one thread closing a socket, while another thread is trying to use that socket. Also, as noted by the SSL FAQ, "an SSL connection cannot be used concurrently by multiple threads" (https://www.openssl.org/docs/faq.html#PROG1), so single threaded approach avoids problems with concurrent SSL write & read.
The challenge with the single-threaded approach is that you then need some kind of synchronized queue and signalling mechanism for holding data pending for output (e.g., the commands that you want to send from client to server), and the socket event loop must detect when there is data pending for write and pull it from the queue. For that I would look at the standard std::list, std::mutex, etc., and either pipe2 or eventfd for signalling the event loop.
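The eventfd signalling part could be sketched in C as below. The function names and the bitmask return convention are assumptions made for illustration, the command queue itself is left out, and eventfd(2) is Linux-only.

```c
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Producer side (hypothetical name): call after pushing a command onto
 * the outbound queue. Returns 0 on success, -1 on failure. */
int wake_event_loop(int efd)
{
    uint64_t one = 1;
    return write(efd, &one, sizeof one) == (ssize_t)sizeof one ? 0 : -1;
}

/* Event-loop side (hypothetical name): wait for socket input or a wake-up.
 * Returns a bitmask: 1 = socket has input or an error/hangup to handle,
 * 2 = commands are pending in the queue; 0 on timeout, -1 on poll() error. */
int wait_socket_or_wakeup(int sock_fd, int efd, int timeout_ms)
{
    struct pollfd pfds[2] = {
        { .fd = sock_fd, .events = POLLIN, .revents = 0 },
        { .fd = efd,     .events = POLLIN, .revents = 0 },
    };
    if (poll(pfds, 2, timeout_ms) < 0)
        return -1;

    int result = 0;
    if (pfds[0].revents & (POLLIN | POLLHUP | POLLERR))
        result |= 1;
    if (pfds[1].revents & POLLIN) {
        uint64_t count;
        /* Reading resets the eventfd counter so the next poll() blocks. */
        if (read(efd, &count, sizeof count) == (ssize_t)sizeof count)
            result |= 2;
    }
    return result;
}
```

The eventfd would be created once with eventfd(0, EFD_NONBLOCK); when bit 2 is set, the loop drains the queue under its mutex and writes the commands to the SSL connection from the same thread.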
OpenSSL calls recv() which in turn obeys the SOCKET's timeout, which by default is infinite. You can change the timeout thusly:
void socket_timeout_receive_set(SOCKET handle, dword milliseconds)
{
if(handle==SOCKET_HANDLE_NULL)
return;
struct timeval tv = { long(milliseconds / 1000), (milliseconds % 1000) * 1000 };
setsockopt(handle, SOL_SOCKET, SO_RCVTIMEO, (char *)&tv, sizeof(tv));
}
Unfortunately, ssl_error_get() returns SSL_ERROR_SYSCALL which it returns in other situations too, so it's not easy to determine that it timed out. But this function will help you determine if the connection is lost:
bool socket_dropped(SOCKET handle)
{
// Special thanks: "Detecting and terminating aborted TCP/IP connections" by Vinayak Gadkari
if(handle==SOCKET_HANDLE_NULL)
return true;
// create a socket set containing just this socket
fd_set socket_set;
FD_ZERO(&socket_set);
FD_SET(handle, &socket_set);
// if the connection is unreadable, it is not dropped (strange but true)
static struct timeval timeout = { 0, 0 };
int count = select(0, &socket_set, NULL, NULL, &timeout);
if(count <= 0) {
// problem: count==0 on a connection that was cut off ungracefully, presumably by a busy router
// for connections that are open for a long time but may not talk much, call keepalive_set()
return false;
}
if(!FD_ISSET(handle, &socket_set)) // creates a dependency on __WSAFDIsSet()
return false;
// peek at the next character
// recv() returns 0 if the connection was dropped
char dummy;
count = recv(handle, &dummy, 1, MSG_PEEK);
if(count > 0)
return false;
if(count==0)
return true;
int sec = WSAGetLastError(); // error code of the failed recv()
return sec==WSAECONNRESET || sec==WSAECONNABORTED || sec==WSAENETRESET || sec==WSAEINVAL;
}
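For comparison, a plain-POSIX version of the receive-timeout idea from this answer might look like the sketch below. The helper names are assumptions. On Linux, socket(7) documents that a receive that times out with no data fails with EAGAIN or EWOULDBLOCK, which is what the second helper checks.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Hypothetical helper: make blocking recv() calls on fd give up after the
 * given number of milliseconds. Returns 0 on success, -1 on error. */
int set_recv_timeout(int fd, unsigned int milliseconds)
{
    struct timeval tv;
    tv.tv_sec  = milliseconds / 1000;
    tv.tv_usec = (milliseconds % 1000) * 1000;
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);
}

/* Call right after recv() returned -1: nonzero means the call timed out
 * rather than failing for another reason. */
int recv_timed_out(void)
{
    return errno == EAGAIN || errno == EWOULDBLOCK;
}
```

Because OpenSSL reads through the same socket, checking errno this way right after a failed SSL_read is one possible way to tell a timeout from other SSL_ERROR_SYSCALL cases.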
Our application uses non-blocking sockets with connect and select operations (C code).
The pseudocode is below:
unsigned int ConnectToServer(struct sockaddr_in *pSelfAddr,struct sockaddr_in *pDestAddr)
{
int sktConnect = -1;
sktConnect = socket(AF_INET,SOCK_STREAM,0);
if(sktConnect == INVALID_SOCKET)
return -1;
fcntl(sktConnect,F_SETFL,fcntl(sktConnect,F_GETFL) | O_NONBLOCK);
if(pSelfAddr != 0)
{
if(bind(sktConnect,(const struct sockaddr*)(void *)pSelfAddr,sizeof(*pSelfAddr)) != 0)
{
closesocket(sktConnect);
return -1;
}
}
errno = 0;
int nRc = connect(sktConnect,(const struct sockaddr*)(void *)pDestAddr, sizeof(*pDestAddr));
if(nRc != -1)
{
return sktConnect;
}
if(errno != EINPROGRESS)
{
int savedError = errno;
closesocket(sktConnect);
return -1;
}
fd_set scanSet;
FD_ZERO(&scanSet);
FD_SET(sktConnect,&scanSet);
struct timeval waitTime;
waitTime.tv_sec = 2;
waitTime.tv_usec = 0;
int tmp;
tmp = select(sktConnect +1, (fd_set*)0, &scanSet, (fd_set*)0,&waitTime);
if(tmp == -1 || !FD_ISSET(sktConnect,&scanSet))
{
int savedErrorNo = errno;
writeLog("Connect %s failed after select, cause %d, error %s",inet_ntoa(pDestAddr->sin_addr),savedErrorNo,strerror(savedErrorNo));
closesocket(sktConnect);
return -1;
}
.
.
.
.
.}
Problem statement
In the above code, the select fails with error code 115, which is "Operation now in progress". I do not see any documentation on select failing with errno 115.
a. When does select fail with error code 115 on a non-blocking socket? Under what scenario?
b. Are there any system logs that hint at this problem? My only concern is that I could not find any documentation that describes such a problem.
PS: We are using SUSE Linux 11 Enterprise Edition.
The errno EINPROGRESS isn't from select(), it is left over from the prior connect() operation. You enter the block that reports it if either select() returned -1 or the FD isn't set. All this means is that the connection is still in progress. errno is never cleared, only set.
Some thoughts on your code:
I think the condition after the select can be changed to check only whether select has returned a value greater than 0. If it has, check the output of getsockopt for the socket (getsockopt(..., SOL_SOCKET, SO_ERROR, ..., ...)) to see whether connect has failed.
I am not sure that select always reports the socket as writable when a connection succeeds. So, in your case, it may (only may) be that the tmp variable is not -1 and the errno shown is the errno of the previous connect call.
Additional Reasons:
Another likely reason is that the destination address you are connecting to is either not reachable or has no server listening on the specified address and port combination. In that case, you can try once with a blocking socket to see whether it connects.
As far as I understand, you are trying to make a connection with timeout.
If so, there is an error in your code. After the connect() call but before select() you should remove the O_NONBLOCK option using fcntl(). Otherwise the select() will always return at once because the operations with your socket (which has O_NONBLOCK) would not block.
The EINPROGRESS which you read is probably generated not by select() but by previous connect() call.
You also should not use bind() call here because connect() implicitly binds your address to socket.
I have a TCP connection. Server just reads data from the client. Now, if the connection is lost, the client will get an error while writing the data to the pipe (broken pipe), but the server still listens on that pipe. Is there any way I can find if the connection is UP or NOT?
You could call getsockopt just like the following:
int error = 0;
socklen_t len = sizeof (error);
int retval = getsockopt (socket_fd, SOL_SOCKET, SO_ERROR, &error, &len);
To test if the socket is up:
if (retval != 0) {
/* there was a problem getting the error code; getsockopt() sets errno */
fprintf(stderr, "error getting socket error code: %s\n", strerror(errno));
return;
}
if (error != 0) {
/* socket has a non zero error status */
fprintf(stderr, "socket error: %s\n", strerror(error));
}
The only way to reliably detect if a socket is still connected is to periodically try to send data. It's usually more convenient to define an application-level 'ping' packet that the clients ignore, but if the protocol is already specced out without such a capability you should be able to configure TCP sockets to do this by setting the SO_KEEPALIVE socket option. I've linked to the winsock documentation, but the same functionality should be available on all BSD-like socket stacks.
TCP keepalive socket option (SO_KEEPALIVE) would help in this scenario and close server socket in case of connection loss.
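Enabling keepalive on Linux could look like the sketch below. The helper name and its parameters are assumptions; SO_KEEPALIVE is portable, while TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are the Linux tuning options described in tcp(7).

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: enable TCP keepalive on fd.
 * idle_sec:     idle time before the first probe is sent
 * interval_sec: time between unanswered probes
 * probes:       unanswered probes before the connection is dropped
 * Returns 0 on success, -1 on error with errno set. */
int enable_keepalive(int fd, int idle_sec, int interval_sec, int probes)
{
    int on = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,
                   &idle_sec, sizeof idle_sec) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL,
                   &interval_sec, sizeof interval_sec) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT,
                   &probes, sizeof probes) < 0)
        return -1;
    return 0;
}
```

Without the three tuning options the system defaults apply, and on Linux the default idle time is two hours, which is usually too long for detecting a lost client.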
There is an easy way to check the socket connection state via a poll call. First, poll the socket to see whether it has a POLLIN event.
If the socket is not closed and there is data to read, then read will return more than zero.
If there is no new data on the socket, then POLLIN will not be set in revents.
If the socket is closed, then the POLLIN flag will be set and read will return 0.
Here is a small code snippet:
int client_socket_1, client_socket_2;
if ((client_socket_1 = accept(listen_socket, NULL, NULL)) < 0)
{
perror("Unable to accept s1");
abort();
}
if ((client_socket_2 = accept(listen_socket, NULL, NULL)) < 0)
{
perror("Unable to accept s2");
abort();
}
pollfd pfd[]={{client_socket_1,POLLIN,0},{client_socket_2,POLLIN,0}};
char sock_buf[1024];
while (true)
{
poll(pfd,2,5);
if (pfd[0].revents & POLLIN)
{
int sock_readden = read(client_socket_1, sock_buf, sizeof(sock_buf));
if (sock_readden == 0)
break;
if (sock_readden > 0)
write(client_socket_2, sock_buf, sock_readden);
}
if (pfd[1].revents & POLLIN)
{
int sock_readden = read(client_socket_2, sock_buf, sizeof(sock_buf));
if (sock_readden == 0)
break;
if (sock_readden > 0)
write(client_socket_1, sock_buf, sock_readden);
}
}
Very simple, as pictured in the recv.
To check, read 1 byte from the socket with MSG_PEEK and MSG_DONTWAIT. This will not dequeue data (MSG_PEEK), and the operation is non-blocking (MSG_DONTWAIT).
char peek_byte; // scratch buffer; MSG_PEEK leaves the data queued
while (recv(client->socket, &peek_byte, 1, MSG_PEEK | MSG_DONTWAIT) != 0) {
sleep(rand() % 2); // Sleep for a bit to avoid spam
fflush(stdin);
printf("I am alive: %d\n", client->socket);
}
// When the client has disconnected, this line will execute
printf("Client %d went away :(\n", client->socket);
Found the example here.
I had a similar problem. I wanted to know whether the server is connected to the client or the client is connected to the server. In such circumstances the return value of the recv function can come in handy. If the peer has closed the connection, recv returns 0 bytes. Using this, I broke out of the loop and did not have to use any extra threads or functions. You might use the same approach if experts feel this is the correct method.
getsockopt may be somewhat useful; however, another way would be to install a signal handler for SIGPIPE. Whenever you write to a socket whose connection has broken, the kernel sends a SIGPIPE signal to the process, and you can then take the necessary action. But this still does not tell you the status of the connection at an arbitrary time. Hope this helps.
You should try the getpeername function.
When the connection is down, you will get in errno:
ENOTCONN - The socket is not connected.
which for you means DOWN.
Otherwise (if there are no other failures) the return code will be 0, which means UP.
resources:
man page: http://man7.org/linux/man-pages/man2/getpeername.2.html
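A minimal sketch of that check follows; the helper name socket_has_peer is an assumption. Note that getpeername only reports what the local kernel currently knows, so a peer that vanished without closing the connection may still look UP until a send or a keepalive probe fails.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: returns 1 if fd has a connected peer (UP),
 * 0 if getpeername() fails with ENOTCONN (DOWN), -1 on any other error. */
int socket_has_peer(int fd)
{
    struct sockaddr_storage peer;
    socklen_t len = sizeof peer;
    if (getpeername(fd, (struct sockaddr *)&peer, &len) == 0)
        return 1;
    return errno == ENOTCONN ? 0 : -1;
}
```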
On Windows you can query the precise state of any port on any network-adapter using:
GetExtendedTcpTable
You can filter it to only those related to your process, etc and do as you wish periodically monitoring as needed. This is "an alternative" approach.
You could also duplicate the socket handle and set up an IOCP/Overlapped i/o wait on the socket and monitor it that way as well.
#include <sys/socket.h>
#include <poll.h>
...
int client = accept(sock_fd, (struct sockaddr*)&address, (socklen_t*)&addrlen);
pollfd pfd = {client, POLLERR, 0}; // monitor errors occurring on client fd
...
while(true)
{
...
if(not check_connection(pfd, 5))
{
close(client);
close(sock[1]);
if(reconnect(HOST, PORT, reconnect_function))
printf("Reconnected.\n");
pfd = {client, POLLERR, 0};
}
...
}
...
bool check_connection(pollfd &pfd, int poll_timeout)
{
poll(&pfd, 1, poll_timeout);
return not (pfd.revents & POLLERR);
}
You can use the SS_ISCONNECTED macro with the getsockopt() function.
SS_ISCONNECTED is defined in socketvar.h.
For BSD sockets I'd check out Beej's guide. When recv returns 0 you know the other side disconnected.
Now you might actually be asking, what is the easiest way to detect the other side disconnecting? One way of doing it is to have a thread always doing a recv. That thread will be able to instantly tell when the client disconnects.