I'm trying to connect two machines, A and B, and send TCP messages one way, from A to B. In the normal case this works fine. But if the socket on B is closed while communication is under way, send() on A gets stuck forever and the process ends up in the zombie state. The socket on A is in blocking mode. Below is the code that gets stuck:
if (send (txSock,&txSockbuf,sizeof(sockstruct),0) == -1) {
printf ("Error in sending the socket Data\n");
}
else {
printf ("The SENT String is %s \n",sock_buf);
}
How do I find out whether the socket on the other side is closed? What does send() return if the destination socket is closed? Would select() be helpful?
A process in the "zombie" state means that it has already exited, but its parent has not yet read its return code. What's probably happening is that your process is receiving a SIGPIPE signal (this is what you'll get by default when you write to a closed socket), your program has already terminated, but the zombie state hasn't yet been resolved.
This related question gives more information about SIGPIPE and how to handle it: SIGPIPE, Broken pipe
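As a rough illustration of the point above, the sketch below ignores SIGPIPE so that send() reports the failure through its return value instead of terminating the process. It is only a sketch under stated assumptions: the function names are made up for this example, and a local AF_UNIX socketpair stands in for the TCP connection between A and B.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns the errno seen when sending to a peer
 * that has already closed, or 0 if the send unexpectedly succeeded. */
int errno_after_peer_close(void)
{
    int sv[2];
    int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
    assert(rc == 0);
    (void)rc;

    signal(SIGPIPE, SIG_IGN);   /* default action would terminate the process */
    close(sv[1]);               /* the "B" side goes away */

    const char msg[] = "hello";
    ssize_t n = send(sv[0], msg, sizeof msg, 0);
    int err = (n == -1) ? errno : 0;

    close(sv[0]);
    return err;
}
```

On a real TCP connection the first send() after the peer closes may still succeed, because the local stack only learns about the problem when the peer answers with a reset; the error then shows up on a later send().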
I'm trying to force TCP reset on a connection.
The suggested way is to set SO_LINGER with a timeout of 0 and then call close().
I'm doing exactly that, but the connection remains in ESTABLISHED state.
The socket is operating in non blocking mode. The operating system is Raspbian.
The code:
struct linger l;
l.l_onoff = 1;
l.l_linger = 0;
if (setsockopt(server->connection_socket, SOL_SOCKET, SO_LINGER, &l, sizeof(l)) != 0) {
LOG_E(tcp, "setting SO_LINGER failed");
}
if (close(server->connection_socket) != 0) {
LOG_E(tcp, "closing socket failed");
}
server->connection_socket = 0;
LOG_I(tcp, "current TCP connection was closed");
Wireshark trace shows no RST as well.
No other threads of the application are performing any operation on that socket.
I can't figure out what is wrong, any suggestion would be much appreciated.
SOLVED
The issue was file descriptors leaking to children created through the system() call.
In fact, when I listed all TCP socket descriptors with lsof -i tcp, I found that child processes still held the file descriptor inherited from the parent process (even though the parent had already closed it).
The solution was to mark the descriptor close-on-exec right after accept(), so that children created by system() do not inherit it:
fcntl(server->connection_socket, F_SETFD, FD_CLOEXEC)
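A slightly more defensive variant of that call reads the current descriptor flags first and then adds FD_CLOEXEC. The following is a minimal sketch, not the poster's actual code; set_cloexec and demo_cloexec are names invented for this example, and a socketpair is used only to have a descriptor to work on.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: sets FD_CLOEXEC while preserving other descriptor
 * flags. Returns 0 on success, -1 on error (errno set by fcntl). */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1 ? -1 : 0;
}

/* Returns 1 if the flag is set on the descriptor afterwards. */
int demo_cloexec(void)
{
    int sv[2];
    int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
    assert(rc == 0);
    (void)rc;

    int ok = set_cloexec(sv[0]) == 0 &&
             (fcntl(sv[0], F_GETFD) & FD_CLOEXEC) != 0;
    close(sv[0]);
    close(sv[1]);
    return ok;
}
```

Note that FD_CLOEXEC takes effect at exec time, so a child that forks without calling exec still keeps its copy of the descriptor.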
In your case you can no longer send or receive data after calling close(). Also, after the close() call the RST is sent only when the reference count of the socket drops to 0. Only then does the connection go into the CLOSED state, and the data in the receive and send buffers is discarded.
The answer probably lies in how you forked a process (as mentioned by EJP in a comment). It seems that a copy of the accepted socket was still open in another process after the fork. So the socket's reference count is nonzero and no RST is sent immediately after your close().
Such situations are well described by Stevens in UNIX Network Programming.
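The reference counting can be observed without forking at all. In the sketch below, which is only an illustration, dup() stands in for the copy of the descriptor that a forked child inherits, and an AF_UNIX socketpair stands in for the TCP connection; the function name is made up. The peer sees end of file only after the last copy is closed.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if the peer saw the connection stay open while a second
 * descriptor still referenced it, and saw EOF once that copy was closed. */
int demo_shared_descriptor(void)
{
    int sv[2];
    int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
    assert(rc == 0);
    (void)rc;

    int copy = dup(sv[0]);      /* stands in for the child's inherited copy */
    close(sv[0]);               /* the "parent" closes its descriptor */

    char c;
    ssize_t before = recv(sv[1], &c, 1, MSG_DONTWAIT);
    int still_open = (before == -1 &&
                      (errno == EAGAIN || errno == EWOULDBLOCK));

    close(copy);                /* last reference gone: now the socket closes */
    ssize_t after = recv(sv[1], &c, 1, MSG_DONTWAIT);

    close(sv[1]);
    return still_open && after == 0;
}
```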
I'm writing an epoll-based network server in C. When I create my socket to listen for incoming connections, I make it non-blocking using fcntl. Similarly when incoming connections arrive from clients, I make their sockets non-blocking before doing anything with them, and likewise for outgoing connections' sockets.
Sometimes my server gets a SIGPIPE. I think this happens when I try to write to a client connection that has been closed by the client. This seems strange to me; I thought that with non-blocking sockets, instead of a SIGPIPE, I should get a -1 back from the call to write and ECONNRESET in errno.
Is there something I'm missing? Or is it just normal to get both a SIGPIPE and an error code even with non-blocking sockets (meaning that I should explicitly ignore the signal with signal(SIGPIPE, SIG_IGN) in my setup)?
Yes, this is normal. If you write to a socket (non-blocking or not) where the other end has closed the connection, you will get a SIGPIPE or (if you are blocking the SIGPIPE signal) an error return (-1) with errno set to EPIPE.
From the man page for write:
EPIPE: fd is connected to a pipe or socket whose reading end is closed. When this happens the writing process will also receive a SIGPIPE signal. (Thus, the write return value is seen only if the program catches, blocks or ignores this signal.)
The POSIX standard is here: http://pubs.opengroup.org/onlinepubs/009695399/functions/write.html and says:
[EPIPE] An attempt is made to write to a pipe or FIFO that is not open for reading by any process, or that only has one end open. A SIGPIPE signal shall also be sent to the thread.
The SIGPIPE is normal. Another option, besides installing a signal handler solely for this purpose, is to pass the MSG_NOSIGNAL flag whenever you call send().
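A minimal sketch of the MSG_NOSIGNAL approach follows. The wrapper name send_all and the demo function are assumptions made for this example; the flag itself is available on Linux but not on every platform, so treat this as Linux-oriented. The demo leaves SIGPIPE at its default disposition to show that the process survives anyway.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical wrapper: sends exactly len bytes, or returns -1 with errno
 * set. MSG_NOSIGNAL suppresses SIGPIPE for these calls only. */
int send_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = send(fd, p, len, MSG_NOSIGNAL);
        if (n == -1) {
            if (errno == EINTR)
                continue;       /* interrupted before any data was sent */
            return -1;          /* EPIPE, ECONNRESET, EAGAIN, ... */
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Returns the errno from sending to a closed peer, or 0 if no error. */
int demo_send_all(void)
{
    int sv[2];
    int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
    assert(rc == 0);
    (void)rc;

    int first = send_all(sv[0], "ping", 4);     /* peer still open */
    close(sv[1]);
    int second = send_all(sv[0], "ping", 4);    /* peer closed */
    int err = (second == -1) ? errno : 0;

    close(sv[0]);
    return first == 0 ? err : -1;
}
```

With a non-blocking socket, as in the question, EAGAIN from this wrapper means the send buffer is full and the remaining bytes should be retried once epoll reports the socket writable.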
Two processes are communicating via sockets - Process A and Process B.
Process B is using the select() call to check when the socket is ready for I/O.
Process A is suddenly killed. What will happen to the socket on B's side? Will it automatically detect that A's socket is no longer available, so that select() returns -1 with EBADF? Or will the select() call remain blocked forever?
select() will unblock and report either an error or a readable socket.
select() returns and says that the socket is readable. When you read the socket, you will get -1 (and the corresponding error in errno) or 0 (EOF).
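The behaviour described above can be sketched as follows. This is an illustrative example under assumptions: wait_and_read and its return codes are invented here, and closing one end of a local socketpair simulates process A being killed (the kernel closes a killed process's descriptors in the same way).

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: waits up to timeout_ms for fd to become readable,
 * then reads. Returns 1 if data arrived, 0 if the peer closed (EOF),
 * -1 on error (errno set), -2 on timeout. */
int wait_and_read(int fd, char *buf, size_t cap, int timeout_ms)
{
    fd_set rfds;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    struct timeval tv = { timeout_ms / 1000, (timeout_ms % 1000) * 1000 };

    int ready = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (ready == -1)
        return -1;
    if (ready == 0)
        return -2;

    ssize_t n = recv(fd, buf, cap, 0);
    if (n == -1)
        return -1;
    return n == 0 ? 0 : 1;
}

int demo_peer_exit(void)
{
    int sv[2];
    int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
    assert(rc == 0);
    (void)rc;

    close(sv[1]);               /* simulates process A being killed */
    char buf[16];
    int result = wait_and_read(sv[0], buf, sizeof buf, 1000);
    close(sv[0]);
    return result;
}
```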
The TCP socket will remain half-open for some time if there is no heartbeat between the two sides.
Eventually the TCP connection will time out, depending on the timeout settings.
Refer to: http://en.wikipedia.org/wiki/Half-open_connection
Hello everyone,
Two days ago I asked about threads and fork. I have now ended up using fork.
I create a second process. Parent and child execute different code, but both end up in a while loop, because one sends packets through a socket forever and the other listens on a socket forever. Now I want them to clean up when Ctrl-C is pressed, i.e. both should close their open sockets before returning.
I have three files. The first one, the main file, creates the processes. The second file contains the parent code and the third the child code. You can find some more information (code snippets) here: c / interrupted system call / fork vs. thread
Now my question: where do I have to put the signal handler, or do I have to specify two of them, one for each process? It seems like a simple question, but somehow not for me. I tried different ways, but could only get one of the two to clean up successfully before returning (sorry for my bad English). The two have to do different things, which is the problem for me, so one handler wouldn't be enough, right?
struct sigaction new_action;
new_action.sa_handler = termination_handler_1;
sigemptyset (&new_action.sa_mask);
new_action.sa_flags = 0;
sigaction(SIGINT, &new_action, NULL);
....more code here ...
/* will run until Ctrl-C is pressed */
while(keep_going) {
recvlen = recvfrom(sockfd_in, msg, itsGnMaxSduSize_MIB, 0, (struct sockaddr *) &incoming, &ilen);
if(recvlen < 0) {
perror("something went wrong / incoming\n");
exit(1);
}
buflen = strlen(msg);
sentlen = ath_sendto(sfd, &athinfo, &addrnwh, &nwh, buflen, msg, &selpv2, &depv);
if(sentlen == E_ERR) {
perror("Failed to send network header packet.\n");
exit(1);
}
}
close(sockfd_in);
/* Close network header socket */
gnwh_close(sfd);
/* Terminate network header library */
gnwh_term();
printf("pc2wsu: signal received, closed all sockets and so on!\n");
return 0;
}
void termination_handler_1(wuint32 signum) {
keep_going = 0;
}
As you can see, handling the signal in my case is just changing the loop condition "keep_going". After exiting the loop, each process should clean up.
Thanks in advance for your help.
nyyrikki
There is no reason to close the sockets. When a process exits (as is the default action for SIGINT), all its file descriptors are inherently closed. Unless you have other essential cleanup to do (like saving data to disk) then forget about handling the signal at all. It's almost surely the wrong thing to do.
Your code suffers from a race condition. You test for keep_going and then enter recvfrom, but it might have gotten the signal between then. That is pretty unlikely, so we will ignore it.
It sounds like the sender and receiver were started by the same process and that process was started from the shell. If you have not done anything, they will be in the same process group and all three processes will receive SIGINT when you hit ^C. Thus it would be best if both processes handled SIGINT if you want to run cleanup code (note closing FDs isn't a good reason...the fds will be autoclosed when the process exits). If these are TCP sockets between the two, closing one side would eventually cause the other side to close (but for sender, not until they try to send again).
Can a socket be closed from another thread when a send / recv on the same socket is going on?
Suppose one thread is in blocking recv call and another thread closes the same socket, will the thread in the recv call know this and come out safely?
I would like to know if the behavior will differ between different OS / Platforms. If yes, how will it behave in Solaris?
In Linux, closing a socket won't wake up recv(). Also, as @jxh says:
If a thread is blocked on recv() or send() when the socket is closed
by a different thread, the blocked thread will receive an error.
However, it is difficult to detect the correct remedial action after
receiving the error. This is because the file descriptor number
associated with the socket may have been picked up by yet a different
thread, and the blocked thread has now been woken up on an error for a
"valid" socket. In such a case, the woken up thread should not call
close() itself.
The woken up thread will need some way to differentiate whether the
error was generated by the connection (e.g. a network error) that
requires it to call close(), or if the error was generated by a
different thread having called close() on it, in which case it should
just error out without doing anything further to the socket.
So the best way to avoid both problems is to call shutdown() instead of close(). shutdown() leaves the file descriptor allocated, so its number cannot be reused for another socket in the meantime. It also wakes up the blocked recv() (with an error or an end-of-file return, depending on the platform), and the thread that called recv() can then close the socket the normal way, as if an ordinary error had happened.
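A small sketch of that pattern, assuming POSIX threads and using a local socketpair in place of a real connection; the function names are made up for this example. One thread blocks in recv(), the main thread calls shutdown(), and the descriptor is closed only after the reader has returned.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

/* Reader thread: blocks in recv() because the peer never sends anything. */
static void *reader(void *arg)
{
    int fd = *(int *)arg;
    char buf[64];
    ssize_t n = recv(fd, buf, sizeof buf, 0);
    return (void *)(intptr_t)n;
}

/* Returns the reader's recv() result (0 or -1 once woken by shutdown),
 * or 1 if the thread could not be started. */
int demo_shutdown_wakes_reader(void)
{
    int sv[2];
    int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
    assert(rc == 0);
    (void)rc;

    pthread_t tid;
    if (pthread_create(&tid, NULL, reader, &sv[0]) != 0) {
        close(sv[0]);
        close(sv[1]);
        return 1;
    }

    struct timespec pause = { 0, 50 * 1000 * 1000 };
    nanosleep(&pause, NULL);        /* give the reader time to block */

    shutdown(sv[0], SHUT_RDWR);     /* wakes the reader; fd stays allocated */

    void *ret;
    pthread_join(tid, &ret);
    close(sv[0]);                   /* safe: no thread is using the fd now */
    close(sv[1]);
    return (int)(intptr_t)ret;
}
```

Closing the descriptor only after pthread_join() is the design point: no other thread can be handed the same descriptor number while the reader might still be using it.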
I don't know Solaris network stack implementation but I'll throw out my theory/explanation of why it should be safe.
Thread A enters some blocking system call, say read(2), for this given socket. There's no data in the socket receive buffer, so thread A is taken off the processor and put onto the wait queue for this socket. No network stack events are initiated here, and the connection state (assuming TCP) has not changed.
Thread B issues close(2) on the socket. While the kernel socket structure should be locked while thread B is accessing it, no other thread is holding that lock (thread A released the lock when it was put to sleep-wait). Assuming there's no outstanding data in the socket send buffer, a FIN packet is sent and the connection enters the FIN WAIT 1 state (again I assume TCP here, see the connection state diagram).
I'm guessing that the socket connection state change would generate a wakeup for all threads blocked on the given socket. That is, thread A would enter a runnable state and discover that the connection is closing. The wait might be re-entered if the other side has not sent its own FIN; otherwise the system call would return with EOF.
In any case, internal kernel structures will be protected from inappropriate concurrent access. This does not mean it's a good idea to do socket I/O from multiple threads. I would advise to look into non-blocking sockets, state machines, and frameworks like libevent.
For me, calling shutdown() on the socket from another thread does the job on Linux.
If a thread is blocked on recv() or send() when the socket is closed by a different thread, the blocked thread will receive an error. However, it is difficult to detect the correct remedial action after receiving the error. This is because the file descriptor number associated with the socket may have been picked up by yet a different thread, and the blocked thread has now been woken up on an error for a "valid" socket. In such a case, the woken up thread should not call close() itself.
The woken up thread will need some way to differentiate whether the error was generated by the connection (e.g. a network error) that requires it to call close(), or if the error was generated by a different thread having called close() on it, in which case it should just error out without doing anything further to the socket.
Yes, it is ok to close the socket from another thread. Any blocked/busy threads that are using that socket will report a suitable error.