How to atomically create "close-on-exec" socket on Mac? - c

When I create a socket on Linux, it's possible to specify the SOCK_CLOEXEC flag at creation time:
auto fd = socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC, 0);
That way there is no window in which some other thread can fork()+exec() and leave this socket open in the child.
But on Mac, according to the manual, I can't do the same trick:
The socket has the indicated type, which specifies the semantics of communication.
Currently defined types are:
SOCK_STREAM
SOCK_DGRAM
SOCK_RAW
SOCK_SEQPACKET
SOCK_RDM
The close-on-exec flag can only be set afterwards with a call to fcntl(), and that is not atomic: another thread may fork()+exec() between the calls to socket() and fcntl().
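For reference, this is the non-atomic two-step pattern the manual leaves you with (a sketch; another thread could fork()+exec() between the two calls):

    int fd = socket(AF_INET, SOCK_STREAM, 0);   /* descriptor is still inheritable across exec() here */
    if (fd != -1)
        fcntl(fd, F_SETFD, FD_CLOEXEC);         /* the race window closes only after this call */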
How to resolve this issue?

Assuming that your code is using pthreads, consider using pthread_atfork(), which makes it possible to register callbacks around fork(). This extends the idea from Alk's post, but implements the wait in the fork callbacks.
From the OP, it looks like it is possible to control the code that creates the sockets (which on Linux would use SOCK_CLOEXEC), but not to control or modify the forking calls.
#include <pthread.h>
#include <fcntl.h>
#include <sys/socket.h>

static pthread_mutex_t socket_lock = PTHREAD_MUTEX_INITIALIZER;

static void set_socket_lock(void)
{
    pthread_mutex_lock(&socket_lock);
}

static void release_socket_lock(void)
{
    pthread_mutex_unlock(&socket_lock);
}

/* All socket creation goes through this wrapper instead of calling socket() directly. */
int socket_wrapper(int domain, int type, int protocol)
{
    set_socket_lock();
    int sd = socket(domain, type, protocol);
    if (sd != -1)
        fcntl(sd, F_SETFD, FD_CLOEXEC);
    release_socket_lock();
    return sd;
}

int main(void)
{
    /* Install the fork handlers before starting any thread. */
    pthread_atfork(set_socket_lock, release_socket_lock, release_socket_lock);

    /* pthread_create(...); start the worker threads here */
    /* ... rest of the code ... */
}
As an added bonus, the above mutex logic will also cover the case where fork() is called and, while the fork is executing, one of the threads in the parent/child attempts to create a socket. It is not clear whether this can even happen, as the kernel might first suspend all threads before starting any of the fork activity.
Disclaimer: I did not try to run the code, as I do not have access to Mac machines.

Related

Interrupting pselect when it's waiting - multithread

So, according to the manual, pselect() takes a timeout parameter and will wait if no file descriptors become ready. It can also be interrupted by a signal:
sigemptyset(&emptyset); /* Signal mask to use during pselect() */
res = pselect(0, NULL, NULL, NULL, NULL, &emptyset);
if (res == -1 && errno == EINTR) printf("Interrupted by signal\n");
It is, however, not obvious from the manual which signals are able to interrupt pselect().
If I have threads (producers and consumers), and each (consumer) thread is using pselect, is there a way to interrupt only one (consumer) thread from another (producer) thread?
I think the issue is analyzed in https://lwn.net/Articles/176911/:
For this reason, the POSIX.1g committee devised an enhanced version of
select(), called pselect(). The major difference between select() and
pselect() is that the latter call has a signal mask (sigset_t) as an
additional argument:
int pselect(int n, fd_set *readfds, fd_set *writefds, fd_set *exceptfds,
const struct timespec *timeout, const sigset_t *sigmask);
pselect() uses the sigmask argument to configure which signals can interrupt it.
The collection of signals that are currently blocked is called the
signal mask. Each process has its own signal mask. When you create a
new process (see Creating a Process), it inherits its parent’s mask.
You can block or unblock signals with total flexibility by modifying
the signal mask.
source : https://www.gnu.org/software/libc/manual/html_node/Process-Signal-Mask.html
https://linux.die.net/man/2/pselect
https://www.linuxprogrammingblog.com/code-examples/using-pselect-to-avoid-a-signal-race
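As a hedged sketch of that pattern (the choice of SIGUSR1 and the function names are illustrative): keep the signal blocked during normal work, pass a mask with it unblocked to pselect(), and use pthread_kill() to wake exactly one consumer thread.

    #include <errno.h>
    #include <pthread.h>
    #include <signal.h>
    #include <stdio.h>
    #include <sys/select.h>

    static void on_usr1(int sig) { (void)sig; }   /* the handler only needs to exist */

    void consumer_wait(void)
    {
        sigset_t blocked, during_wait;

        /* Keep SIGUSR1 blocked while doing normal work... */
        sigemptyset(&blocked);
        sigaddset(&blocked, SIGUSR1);
        pthread_sigmask(SIG_BLOCK, &blocked, NULL);
        signal(SIGUSR1, on_usr1);

        /* ...and atomically unblock it only for the duration of pselect(). */
        sigemptyset(&during_wait);
        if (pselect(0, NULL, NULL, NULL, NULL, &during_wait) == -1 && errno == EINTR)
            printf("woken by SIGUSR1\n");
    }

    /* Producer side: wake exactly one consumer thread. */
    void wake_consumer(pthread_t consumer_tid)
    {
        pthread_kill(consumer_tid, SIGUSR1);
    }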
Regarding your second question, there are multiple algorithms for process synchronization; see e.g. https://www.geeksforgeeks.org/introduction-of-process-synchronization/ and the links at the bottom of that page, or https://en.wikipedia.org/wiki/Sleeping_barber_problem and associated pages. So signals are only one path for IPC on Linux, cf. IPC using Signals on Linux.
(Ignoring the signal part of the question and answering only the following, since the title does not imply the use of signals: "If I have threads (producers and consumers), and each (consumer) thread is using pselect, is there a way to interrupt only one (consumer) thread from another (producer) thread?")
The easiest way I know is for the thread to expose a file descriptor that is always included in the descriptors monitored by p/select, so that it always monitors at least one. If another thread writes to it, the p/select call will return:
#include <sys/eventfd.h>
#include <sys/select.h>
#include <pthread.h>

struct thread {
    pthread_t tid;
    int wake;
    /* ... */
};

void *thread_cb(void *arg) {
    struct thread *me = arg;
    me->wake = eventfd(0, 0);
    /* ... */
    fd_set readfds;
    FD_ZERO(&readfds);
    /* populate readfds with the descriptors this thread cares about */
    FD_SET(me->wake, &readfds);
    select(me->wake + 1, &readfds, NULL, NULL, NULL);   /* nfds must be the highest fd + 1 */
    /* if me->wake is readable, another thread asked us to wake up */
    return NULL;
}

void interrupt_thread(struct thread *t) {
    eventfd_write(t->wake, 1);
}
If no eventfd is available, you can replace it with a classic (and more verbose) pipe, or other similar communication mechanism.
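For example, a minimal self-pipe sketch (usable on systems without eventfd, such as macOS; the names wake, setup_wake, wait_for_events and interrupt are illustrative):

    #include <sys/select.h>
    #include <unistd.h>

    int wake[2];   /* wake[0] is monitored by select(), wake[1] is written to */

    void setup_wake(void)
    {
        pipe(wake);
    }

    void wait_for_events(void)
    {
        fd_set readfds;
        FD_ZERO(&readfds);
        FD_SET(wake[0], &readfds);
        /* add the descriptors you actually care about here as well */
        select(wake[0] + 1, &readfds, NULL, NULL, NULL);
        if (FD_ISSET(wake[0], &readfds)) {
            char buf[16];
            read(wake[0], buf, sizeof buf);   /* drain the wake-up bytes */
        }
    }

    void interrupt(void)
    {
        write(wake[1], "x", 1);
    }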

Making a descriptor for the poll function

I find the poll() function quite useful for multiplexing pipes and sockets, but I want to extend that and poll my own mediums, i.e. implement my own pipe and have it work with poll() for POLLIN and POLLOUT events. How would I do that?
int self = GenerateMyPipe();
int sock = socket(...);
struct pollfd fd[2];
//Init Pollfd and Stuff...
poll(fd, 2, -1);
...
Thanks for reading...
There's no standard POSIX method for this, but on Linux you can use eventfd():
eventfd() creates an "eventfd object" that can be used as an event wait/notify mechanism by user-space applications, and by the kernel to notify user-space applications of events. The object contains an unsigned 64-bit integer (uint64_t) counter that is maintained by the kernel. This counter is initialized with the value specified in the argument initval.
...
The returned file descriptor supports poll(2) (and analogously epoll(7)) and select(2), as follows:
The file descriptor is readable (the select(2) readfds argument; the poll(2) POLLIN flag) if the counter has a value greater than 0.
The file descriptor is writable (the select(2) writefds argument; the poll(2) POLLOUT flag) if it is possible to write a value of at least "1" without blocking.
You change the counter by writing to the descriptor.
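A minimal sketch of that usage (error handling omitted): create an eventfd, bump the counter with a write, and poll() the descriptor for POLLIN/POLLOUT like any other fd.

    #include <sys/eventfd.h>
    #include <poll.h>
    #include <unistd.h>
    #include <stdio.h>

    int main(void)
    {
        int efd = eventfd(0, 0);               /* counter starts at 0: not readable yet */

        eventfd_write(efd, 1);                 /* counter > 0: POLLIN becomes ready */

        struct pollfd pfd = { .fd = efd, .events = POLLIN | POLLOUT };
        if (poll(&pfd, 1, -1) > 0) {
            if (pfd.revents & POLLIN) {
                eventfd_t value;
                eventfd_read(efd, &value);     /* resets the counter to 0 */
                printf("readable, counter was %llu\n", (unsigned long long)value);
            }
            if (pfd.revents & POLLOUT)
                printf("writable\n");
        }

        close(efd);
        return 0;
    }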

Are parallel calls to send/recv on the same socket valid as per POSIX standard?

I am trying to understand the usage of the socket APIs (recv, send, select, close, etc.) on parallel threads, that is, using one socket file descriptor in two parallel threads. I have gone through this question, but I am still not able to find any standard document which explains the usage of the socket APIs in multiple threads. Even the Open Group man pages do not say anything about this.
I also want to know whether the parallel-thread usage scenarios listed below are valid for the POSIX socket APIs.
1) Calling recv and send in two parallel threads
int main_thread() {
fd = do_connect(); //TCP or UDP
spawn_thread(recv_thread, fd);
spawn_thread(send_thread, fd);
...
}
int recv_thread(fd) {
while(1) {
recv(fd, ..)
...
}
}
int send_thread(fd) {
while(1) {
send(fd, ..)
...
}
}
2) Calling recv and send with select in two parallel threads
int recv_thread(fd) {
while(1) {
select(fd in readfd)
recv(fd, ..)
...
}
}
int send_thread(fd) {
while(1) {
select(fd in write)
send(fd, ..)
...
}
}
3) Calling recv and send with setsockopt, ioctl, fcntl in two parallel threads
int recv_thread(fd) {
int flag = 1
while(1) {
ioctl(fd, FIONBIO, &flag); //enable non block
recv(fd, ..)
flag = 0;
ioctl(fd, FIONBIO, &flag); //disable non block
...
}
}
int send_thread(fd) {
while(1) {
select(fd in write)
send(fd, ..)
...
}
}
POSIX functions are thread-safe "by default":
2.9.1 Thread-Safety
All functions defined by this volume of POSIX.1-2008 shall be
thread-safe, except that the following functions need not be
thread-safe.
As many have already commented, you can safely make the mentioned calls from different threads.
Cases 1 and 2 are quite typical for production code (one thread receiving, one sending, each thread handling many connections with select()).
Case 3 is somewhat odd and probably a source of trouble: the calls are valid and it will work, but it may not be straightforward to get the desired behaviour. Generally you either put the socket in non-blocking mode at the beginning and handle EAGAIN/EWOULDBLOCK errors on the send()/recv() calls, or leave it blocking and use select()/pselect()/poll()/ppoll().
The sending thread in case 3 will randomly find the socket in blocking or non-blocking mode; I wouldn't do that.
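As a hedged sketch of that recommendation (the wrapper names are illustrative): set non-blocking mode once before the threads start, then treat EAGAIN/EWOULDBLOCK as "not ready yet" in the I/O loops.

    #include <errno.h>
    #include <fcntl.h>
    #include <sys/types.h>
    #include <sys/socket.h>

    /* Put the socket in non-blocking mode once, before spawning the threads. */
    int make_nonblocking(int fd)
    {
        int flags = fcntl(fd, F_GETFL, 0);
        return (flags == -1) ? -1 : fcntl(fd, F_SETFL, flags | O_NONBLOCK);
    }

    /* Receive side: -1 with EAGAIN/EWOULDBLOCK just means "no data yet". */
    ssize_t recv_some(int fd, void *buf, size_t len)
    {
        ssize_t n = recv(fd, buf, len, 0);
        if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
            return 0;   /* caller can select()/poll() for readability and retry */
        return n;
    }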

File Descriptor not closed on exec

I'm having a problem with child processes hanging onto a socket after exec(). This process 1) reads UDP packets, and 2) kills/starts other processes. It monitors the other processes via the UDP packets that they send.
This runs on Windows, Linux, and AIX. I have not experienced any issues on AIX, only on Linux. (The Windows code is significantly different, so I won't go into details about that.)
I am setting the FD_CLOEXEC flag on the returned descriptor immediately after creating it, via fcntl(). This must run on Red Hat EL 4-6, so using SOCK_CLOEXEC at creation is not an option (the kernels in RHEL 4/5 do not support it).
For maintenance, the monitoring process may need to be restarted, and when I attempt to restart it, I find that occasionally one of the child processes is still bound to the socket, preventing the monitoring process from doing so. [Normally this wouldn't be an issue (the user would see the restart fail and take appropriate action); however, the monitor itself is monitored via a different mechanism (to avoid a SPOF), and an automated restart of the monitoring process may fail if one of its child processes is holding onto the socket. This can lead to more Bad Things happening downstream.]
I have gone so far as to add code between the fork() and the exec() calls to explicitly close the socket (with an associated shutdown) in the child process, and I have synchronized the fork() and the read() via a pthread_mutex so that I am not reading from the socket when a fork occurs.
The socket is created with
s = socket( AF_INET, SOCK_DGRAM, IPPROTO_UDP );
and no other options. Immediately after the creation, I make the call to fcntl to set FD_CLOEXEC. The process is still single-threaded at this point, so there is no race condition (in theory) before the flag is set.
The bind is done next, while still single-threaded. It binds to the first IPV4 address matching "localhost" as returned by getaddrinfo (probably unnecessary, but it's using an underlying utility function to simplify the call to bind.)
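A condensed sketch of the setup described above (the helper name make_monitor_socket, the getaddrinfo() hints, and the port argument are illustrative, not the original utility function):

    #include <sys/socket.h>
    #include <netdb.h>
    #include <fcntl.h>
    #include <string.h>

    int make_monitor_socket(const char *port)
    {
        int s = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
        if (s == -1)
            return -1;

        /* Still single-threaded here, so in theory the window before the
           flag is set cannot be hit. */
        fcntl(s, F_SETFD, FD_CLOEXEC);

        struct addrinfo hints, *res = NULL;
        memset(&hints, 0, sizeof hints);
        hints.ai_family = AF_INET;
        hints.ai_socktype = SOCK_DGRAM;
        if (getaddrinfo("localhost", port, &hints, &res) == 0 && res != NULL) {
            bind(s, res->ai_addr, res->ai_addrlen);   /* first matching IPv4 address */
            freeaddrinfo(res);
        }
        return s;
    }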
The close logic in the child process after the fork (none of which should be necessary because of the FD_CLOEXEC) is:
char retryClose = 1;
int  eno = 0;
int  retries = 20;

if ( shutdown( s, SHUT_RDWR ) ) {
    /* Failed to shut down. Wait and try again. */
    my_sleep( 3000 );   /* sleep using select(0,NULL,NULL,NULL, timeval) */
    shutdown( s, SHUT_RDWR );
    /* not much else can be done... */
}

while ( retryClose && ( close( s ) == -1 ) )
{
    /* save error number */
    eno = errno;
    /* check specific error */
    switch ( eno ) {
    case EIO:
        /* terminate loop if retries have expired; otherwise sleep for a while and try again */
        if ( --retries <= 0 ) {
            retryClose = 0;
        }
        else {
            my_sleep( 50 );
        }
        break;
    case EINTR:
        break;
    case EBADF:
    default:
        retryClose = 0;
        break;
    } /* switch ( eno ) */
}
So, I'm setting the FD_CLOEXEC flag, and explicitly closing the fd prior to the exec() call.
Am I missing anything? Is there anything I can do to ensure that the child process really doesn't hang onto the socket?
Turns out, it wasn't the fork/exec that was causing the problem.
The server process could be restarted several times after starting all of the child processes without any problem, but occasionally, when the server died, one of the child processes would actually grab the server socket.
Switching from using connect()/send() in the client to just sendto() seems to have resolved the problem.
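A rough sketch of that change on the client side (the function names and parameters are illustrative, not the original code):

    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <string.h>

    /* Before: the client attached itself to the server address with connect() and used send(). */
    void report_connected(int fd, const struct sockaddr_in *srv, const char *msg)
    {
        connect(fd, (const struct sockaddr *)srv, sizeof *srv);
        send(fd, msg, strlen(msg), 0);
    }

    /* After: connectionless sendto(), so the client never holds a connected association. */
    void report_connectionless(int fd, const struct sockaddr_in *srv, const char *msg)
    {
        sendto(fd, msg, strlen(msg), 0, (const struct sockaddr *)srv, sizeof *srv);
    }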

Listen to multiple ports from one server

Is it possible to bind and listen to multiple ports in Linux in one application?
For each port that you want to listen to, you:
1) create a separate socket with socket(),
2) bind it to the appropriate port with bind(),
3) call listen() on the socket so that it's set up with a listen queue.
At that point, your program is listening on multiple sockets. In order to accept connections on those sockets, you need to know which socket a client is connecting to. That's where select comes in. As it happens, I have code that does exactly this sitting around, so here's a complete tested example of waiting for connections on multiple sockets and returning the file descriptor of a connection. The remote address is returned in additional parameters (the buffer must be provided by the caller, just like accept).
(socket_type here is a typedef for int on Linux systems, and INVALID_SOCKET is -1. Those are there because this code has been ported to Windows as well.)
socket_type
network_accept_any(socket_type fds[], unsigned int count,
                   struct sockaddr *addr, socklen_t *addrlen)
{
    fd_set readfds;
    socket_type maxfd, fd;
    unsigned int i;
    int status;

    FD_ZERO(&readfds);
    maxfd = -1;
    for (i = 0; i < count; i++) {
        FD_SET(fds[i], &readfds);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    status = select(maxfd + 1, &readfds, NULL, NULL, NULL);
    if (status < 0)
        return INVALID_SOCKET;
    fd = INVALID_SOCKET;
    for (i = 0; i < count; i++)
        if (FD_ISSET(fds[i], &readfds)) {
            fd = fds[i];
            break;
        }
    if (fd == INVALID_SOCKET)
        return INVALID_SOCKET;
    else
        return accept(fd, addr, addrlen);
}
This code doesn't tell the caller which port the client connected to, but you could easily add an int * parameter that would get the file descriptor that saw the incoming connection.
You only bind() to a single socket, then listen() and accept() -- the socket for the bind is for the server, the fd from the accept() is for the client. You do your select on the latter looking for any client socket that has data pending on the input.
In such a situation, you may be interested in libevent. It will do the work of the select() for you, probably using a much better interface such as epoll().
The huge drawback of select() is the use of the FD_... macros, which limit the descriptor numbers to the number of bits in an fd_set (FD_SETSIZE, commonly 1024 on Linux). If you have a small server with 2 or 3 connections, you'll be fine. If you intend to work on a much larger server, the fd_set can easily be overflowed.
Also, the use of select() or poll() allows you to avoid threads in the server (i.e. you can poll() your sockets and know whether you can accept(), read(), or write() to them).
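If you stick with the multi-socket approach, here is a hedged sketch of the same accept-any idea using poll(), which avoids the FD_SETSIZE limit (the function name mirrors the select() example above and is not from the original answer):

    #include <poll.h>
    #include <sys/socket.h>

    /* Illustrative poll()-based variant of network_accept_any(). */
    int network_accept_any_poll(int fds[], unsigned int count,
                                struct sockaddr *addr, socklen_t *addrlen)
    {
        struct pollfd pfds[count];
        for (unsigned int i = 0; i < count; i++) {
            pfds[i].fd = fds[i];
            pfds[i].events = POLLIN;
        }
        if (poll(pfds, count, -1) < 0)
            return -1;
        for (unsigned int i = 0; i < count; i++)
            if (pfds[i].revents & POLLIN)
                return accept(pfds[i].fd, addr, addrlen);
        return -1;
    }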
But if you really want to do it the Unix way, then you want to consider fork()-ing before you call accept(). In this case you do not absolutely need select() or poll() (unless you are listening on many IPs/ports and want all children to be capable of answering any incoming connection, but that has drawbacks: the kernel may send you another request while you are already handling one, whereas with just an accept() the kernel knows that you are busy if you are not in the accept() call itself. Well, it does not work exactly like that, but as a user, that's the way it behaves for you.)
With the fork() you prepare the socket in the main process and then call handle_request() in a child process to call the accept() function. That way you may have any number of ports and one or more children listening on each. That's the best way to respond very quickly to any incoming connection under Linux (as long as you have child processes waiting for a client, it is effectively instantaneous).
void init_server(int port)
{
    int server_socket = socket();
    bind(server_socket, ...port...);
    listen(server_socket);
    for(int c = 0; c < 10; ++c)
    {
        pid_t child_pid = fork();
        if(child_pid == 0)
        {
            // here we are in a child
            handle_request(server_socket);
        }
    }
    // WARNING: this loop cannot be here, since it is blocking...
    //          you will want to wait and see which child died and
    //          create a new child for the same `server_socket`...
    //          but this loop should get you started
    for(;;)
    {
        // wait on children's death (you'll need to do things with SIGCHLD too)
        // and create new children as they die...
        wait(...);
        pid_t child_pid = fork();
        if(child_pid == 0)
        {
            handle_request(server_socket);
        }
    }
}

void handle_request(int server_socket)
{
    // here the child blocks until a connection arrives on 'server_socket'
    int client_socket = accept(server_socket, ...);
    ...handle the request...
    exit(0);
}

int create_servers()
{
    init_server(80);    // create a connection on port 80
    init_server(443);   // create a connection on port 443
}
Note that the handle_request() function is shown here handling one request. The advantage of handling a single request per child is that you can do it the Unix way: allocate resources as required and, once the request is answered, just exit(0). The exit(0) will do the necessary close(), free(), etc. for you.
In contrast, if you want to handle multiple requests in a row, you want to make sure that resources get deallocated before you loop back to the accept() call. Also, the sbrk() function is pretty much never called to shrink the memory footprint of your child, so it will tend to grow a little every now and then. This is why a server such as Apache2 is set up to answer a certain number of requests per child before starting a new child (by default it is between 100 and 1,000 these days).