Is there a way to get the sockfd from a struct sock or any other way that would allow me to uniquely identify the socket / connection I'm working with in kernel space?
I need this piece of information in the context of a device driver for a network adapter.
I thought it was impossible but actually there is a way, at least for simple cases where we have no duplicate file descriptors for a single socket. I'm answering my own question, hoping it'll help people out there.
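#include <linux/fdtable.h>   /* files_fdtable(), struct fdtable */
#include <linux/rcupdate.h>  /* rcu_read_lock(), rcu_access_pointer() */
#include <linux/sched.h>     /* current */
#include <net/sock.h>        /* struct sock */

int get_sockfd(struct sock *sk)
{
	int sockfd = -1;
	unsigned int i;
	struct fdtable *fdt;
	struct file *sock_filp;

	/* Nothing to look up if the sock is not attached to a file. */
	if (!sk || !sk->sk_socket || !sk->sk_socket->file)
		return -1;
	sock_filp = sk->sk_socket->file;

	rcu_read_lock();	/* the fd table is RCU-protected */
	fdt = files_fdtable(current->files);
	/* Scan every slot: closed descriptors leave NULL holes in the table. */
	for (i = 0; i < fdt->max_fds; i++) {
		if (rcu_access_pointer(fdt->fd[i]) == sock_filp) {
			sockfd = i;
			break;
		}
	}
	rcu_read_unlock();
	return sockfd;
}
The loop has to run up to max_fds rather than stop at the first NULL entry, because any closed descriptor below the socket leaves a NULL hole in the table. The lookup is also only meaningful in process context, where current is the task that owns the socket; in softirq or interrupt context, which is common in a network driver, current is whichever task happened to be interrupted. If the descriptor was duplicated, only the lowest matching number is returned.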
So, basically, the idea is that the numerical value of a file descriptor (a sockfd is just a regular file descriptor, after all) is the index of its entry in the process's open file table. Given a struct sock *sk pointer, all we have to do is loop over the open file table of the current process until the address stored in sk->sk_socket->file matches an entry in the table.
Related
I want to study the source code of the kernel's networking code to understand how it works. But when I looked at the listen function, I ran into the following problem. According to the man page, the first parameter of the listen function is an int.
int listen(int sockfd, int backlog);
But in https://github.com/torvalds/linux, the first parameter of the sctp_inet_listen function is a struct socket *. From protocol.c we know that the listen function pointer is set to sctp_inet_listen:
static const struct proto_ops inet_seqpacket_ops = {
.family = PF_INET,
.owner = THIS_MODULE,
.release = inet_release, /* Needs to be wrapped... */
.bind = inet_bind,
.connect = sctp_inet_connect,
.socketpair = sock_no_socketpair,
.accept = inet_accept,
.getname = inet_getname, /* Semantics are different. */
.poll = sctp_poll,
.ioctl = inet_ioctl,
.gettstamp = sock_gettstamp,
.listen = sctp_inet_listen,
.shutdown = inet_shutdown, /* Looks harmless. */
.setsockopt = sock_common_setsockopt, /* IP_SOL IP_OPTION is a problem */
.getsockopt = sock_common_getsockopt,
.sendmsg = inet_sendmsg,
.recvmsg = inet_recvmsg,
.mmap = sock_no_mmap,
.sendpage = sock_no_sendpage,
};
int sctp_inet_listen(struct socket *sock, int backlog)
{
struct sock *sk = sock->sk;
struct sctp_endpoint *ep = sctp_sk(sk)->ep;
int err = -EINVAL;
if (unlikely(backlog < 0))
return err;
lock_sock(sk);
/* Peeled-off sockets are not allowed to listen(). */
if (sctp_style(sk, UDP_HIGH_BANDWIDTH))
goto out;
if (sock->state != SS_UNCONNECTED)
goto out;
if (!sctp_sstate(sk, LISTENING) && !sctp_sstate(sk, CLOSED))
goto out;
/* If backlog is zero, disable listening. */
if (!backlog) {
if (sctp_sstate(sk, CLOSED))
goto out;
err = 0;
sctp_unhash_endpoint(ep);
sk->sk_state = SCTP_SS_CLOSED;
if (sk->sk_reuse || sctp_sk(sk)->reuse)
sctp_sk(sk)->bind_hash->fastreuse = 1;
goto out;
}
/* If we are already listening, just update the backlog */
if (sctp_sstate(sk, LISTENING))
WRITE_ONCE(sk->sk_max_ack_backlog, backlog);
else {
err = sctp_listen_start(sk, backlog);
if (err)
goto out;
}
err = 0;
out:
release_sock(sk);
return err;
}
In Linux, the C library function listen(fd,backlog) corresponds to a syscall (SYS_listen) with the same prototype. This syscall is implemented in net/socket.c (see SYSCALL_DEFINE2(listen, int, fd, int, backlog)). It calls net/socket.c:__sys_listen().
net/socket.c:__sys_listen() looks up the socket description (which is of type struct socket) by looking up the file description table entry fd, and does basic checks and bookkeeping work.
The struct socket structure contains member ops, which is a pointer to struct proto_ops. This is a set of function pointers, so that different types of sockets (say, Unix domain sockets, or IP sockets) can be supported in the same interface. (Each socket type defines its own proto_ops, basically.)
net/socket.c:__sys_listen() obtains the listen function pointer of that set, and calls it, so that different socket types can implement their own 'listen' facility. Because the file descriptor was already looked up, and converted to a pointer to the socket description, that pointer is passed (instead of the file descriptor). (This same – or very similar – interface is used across all file/socket descriptor using functions.)
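The lookup-then-dispatch pattern described above can be mimicked in user space. This is only an analogy, not kernel code: demo_socket, demo_ops, demo_table, demo_listen and the two *_listen handlers are hypothetical names made up for illustration.

```c
#include <stddef.h>

struct demo_socket;

/* Per-protocol table of function pointers, in the spirit of struct proto_ops. */
struct demo_ops {
    int (*listen)(struct demo_socket *sock, int backlog);
};

struct demo_socket {
    const struct demo_ops *ops;  /* chosen when the socket is created */
    int backlog;                 /* what the handler recorded */
};

/* Two hypothetical protocol implementations of "listen". */
static int stream_listen(struct demo_socket *sock, int backlog)
{
    sock->backlog = backlog;
    return 0;
}

static int dgram_listen(struct demo_socket *sock, int backlog)
{
    (void)sock;
    (void)backlog;
    return -1;                   /* this protocol does not support listening */
}

static const struct demo_ops stream_ops = { .listen = stream_listen };
static const struct demo_ops dgram_ops  = { .listen = dgram_listen };

/* Hypothetical descriptor table: the index plays the role of the fd. */
#define DEMO_MAX_FDS 8
static struct demo_socket *demo_table[DEMO_MAX_FDS];

/* Analogue of the syscall entry point: resolve the number, then dispatch. */
int demo_listen(int fd, int backlog)
{
    if (fd < 0 || fd >= DEMO_MAX_FDS || demo_table[fd] == NULL)
        return -1;               /* no such descriptor */
    return demo_table[fd]->ops->listen(demo_table[fd], backlog);
}
```

The caller only ever sees the integer; the handler only ever sees the pointer. That is the same split as between listen(int, int) and sctp_inet_listen(struct socket *, int).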
The core point to realize here is that file descriptor numbers are just indexes into a process-specific table of references to file descriptions. For sockets, that reference leads to a struct socket *. (The table of file descriptions is internal to the kernel, and is usually called the file table; the process-specific table of references is usually called the file descriptor table; and a file descriptor is an index into the file descriptor table. If you find this confusing, read e.g. the Wikipedia File descriptor article for further details.)
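The distinction between descriptor numbers and file descriptions can be observed from user space. The sketch below assumes a POSIX system; open_dup_pair and shares_description are hypothetical helper names. It duplicates a descriptor and checks that a write through one number moves the file offset seen through the other, which only happens when both numbers refer to the same file description.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Opens a scratch file and duplicates its descriptor.
 * Returns 0 on success and stores the two descriptors in out[0] and out[1]. */
int open_dup_pair(int out[2])
{
    char path[] = "/tmp/fd_demo_XXXXXX";

    out[0] = mkstemp(path);
    if (out[0] == -1)
        return -1;
    unlink(path);                /* the file lives on until the fds are closed */

    out[1] = dup(out[0]);        /* new number, same file description */
    if (out[1] == -1) {
        close(out[0]);
        return -1;
    }
    return 0;
}

/* Returns 1 if writing through fd a moves the offset seen through fd b,
 * 0 if it does not, and -1 on error. */
int shares_description(int a, int b)
{
    off_t before = lseek(b, 0, SEEK_CUR);

    if (before == (off_t)-1 || write(a, "x", 1) != 1)
        return -1;
    return lseek(b, 0, SEEK_CUR) == before + 1;
}
```

Two different numbers, one description: this is also why a reverse lookup from a kernel object to "the" descriptor number is ambiguous once dup() is involved.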
I am using UNIX domain datagram sockets to send records from multiple clients to a single server in a multithreaded program. Everything is done within one process; I'm sending records from multiple threads to a single thread that acts as the server. All threads are assigned to separate cores using their affinity masks.
It works fine with a single client, but now I am using multiple clients. The server will read data from the socket using select() to return file descriptors that are ready ("set"), then use recvfrom to get the records.
But first I need to write the file descriptors to the fd_set struct so I can use it with select(). I created fd_set as a global struct at the top of the C file that contains the programs to open client and server sockets and pass messages between them:
fd_set fdset;
I create client sockets in this way:
int64_t *create_socket_client(struct sockaddr_un *claddr, int64_t retvals[])
{
    int sfd;

    retvals[0] = 0;
    retvals[1] = 0;

    sfd = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (sfd == -1)
        return retvals;

    /* Fill in the caller's struct; the address of a by-value copy would dangle after return. */
    memset(claddr, 0, sizeof(*claddr));
    claddr->sun_family = AF_UNIX;
    snprintf(claddr->sun_path, sizeof(claddr->sun_path), "/tmp/ud_ucase_cl.%ld", (long) getpid());

    retvals[0] = sfd;
    retvals[1] = (int64_t)(intptr_t)claddr;
    return retvals;
}
The array retvals is passed in and returned with file descriptor and client address. But to be used with select() I need to insert the file descriptor in the fd_set when the socket is created (in the program above).
Normally that wouldn't be a problem if I knew the layout of fd_set. It's defined in sys/select.h:
/* fd_set for select and pselect. */
typedef struct
{
/* XPG4.2 requires this member name. Otherwise avoid the name
from the global namespace. */
#ifdef __USE_XOPEN
__fd_mask fds_bits[__FD_SETSIZE / __NFDBITS];
# define __FDS_BITS(set) ((set)->fds_bits)
#else
__fd_mask __fds_bits[__FD_SETSIZE / __NFDBITS];
# define __FDS_BITS(set) ((set)->__fds_bits)
#endif
} fd_set;
but from that definition I can't tell what the fields are or how to get a file descriptor or array of file descriptors into fd_set.
So my question is: how can I get the file descriptors into fd_set so it can be used with select()?
The way to manipulate an fd_set is with the following macros (from the man page for select()):
void FD_CLR(int fd, fd_set *set);
int FD_ISSET(int fd, fd_set *set);
void FD_SET(int fd, fd_set *set);
void FD_ZERO(fd_set *set);
A new fd_set must be cleared before it is used:
FD_ZERO(&my_fd_set);
To set a file descriptor in an fd_set, do:
FD_SET(my_fd, &my_fd_set);
Similarly, to remove an fd from an fd_set, do:
FD_CLR(my_fd, &my_fd_set);
To test if a file descriptor is set in an fd_set (i.e. to test which descriptors returned ready):
if (FD_ISSET(my_fd, &my_fd_set)) {
// Take action on my_fd
}
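Putting the four macros together, a minimal sketch of one select() round might look like the following. The names wait_readable and select_demo are hypothetical, and the socket pairs merely stand in for the clients from the question.

```c
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Waits until at least one descriptor in fds[0..n-1] is readable or the
 * timeout expires. Stores the readable ones in ready[] and returns how many
 * there are (0 on timeout, -1 on error). */
int wait_readable(const int fds[], int n, int timeout_ms, int ready[])
{
    fd_set readfds;
    struct timeval tv;
    int maxfd = -1, count = 0, i;

    FD_ZERO(&readfds);                  /* always start from an empty set */
    for (i = 0; i < n; i++) {
        if (fds[i] < 0 || fds[i] >= FD_SETSIZE)
            return -1;                  /* FD_SET is undefined outside this range */
        FD_SET(fds[i], &readfds);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* select() overwrites readfds, so the set is rebuilt on every call. */
    if (select(maxfd + 1, &readfds, NULL, NULL, &tv) < 0)
        return -1;

    for (i = 0; i < n; i++)
        if (FD_ISSET(fds[i], &readfds))
            ready[count++] = fds[i];
    return count;
}

/* Two socket pairs stand in for two clients; only client 1 sends a record.
 * Returns the index of the single client reported readable, or -1. */
int select_demo(void)
{
    int c0[2], c1[2], fds[2], ready[2], result = -1;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, c0) < 0)
        return -1;
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, c1) < 0) {
        close(c0[0]); close(c0[1]);
        return -1;
    }
    fds[0] = c0[0];                     /* server side of client 0 */
    fds[1] = c1[0];                     /* server side of client 1 */

    if (send(c1[1], "rec", 3, 0) == 3 &&
        wait_readable(fds, 2, 1000, ready) == 1)
        result = (ready[0] == fds[1]) ? 1 : 0;

    close(c0[0]); close(c0[1]); close(c1[0]); close(c1[1]);
    return result;
}
```

Note that the first argument to select() is the highest descriptor plus one, not the number of descriptors, and that the set has to be rebuilt before each call because select() modifies it in place.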
I am trying to implement a server / multi-client program in Linux with C using select() and fd_sets. I am trying to broadcast messages sent from one connected client to all other connected clients, but I don't know how to access the sockets for the other clients in the fd_set once they are added dynamically. I am trying to replicate an implementation of this I found in C++, but the fd_set in C doesn't have the same members as the one in the C++ code. This is the code I'm trying to replicate:
for(int i = 0; i < master.fd_count; i++)
{
SOCKET outSock = master.fd_array[i];
if(outSock != listening && outSock != sock)
{
send(outSock, buffer, 250, 0);
}
}
where master is the fd_set, listening is the original socket listening for new clients and sock is the socket the message about to be broadcast came from.
Can anyone help me learn how to access the fd_set socket elements so that I can do != comparisons on them like in the example? Or alternatively, point me to another method to implement the multi-client setup where I can broadcast a message back to all connected clients? I initially tried using multiple processes with fork() and pipes, but I could not find enough information on how to implement that properly.
In C, you use the macro FD_ISSET to find out whether a given bit is set or not. See the manual page for select(2) for details.
The basic idea is that first you zero the set with FD_ZERO, then you set some bits with FD_SET, then you call select() (or pselect(), according to taste). When select() returns you iterate over the set and use FD_ISSET to find out whether you can do a non-blocking I/O operation on the specified descriptor.
There are many examples on the net; for example, an example from IBM.
Get in the habit of reading manpages. The one for select(2) lists the macros provided for fd_set:
void FD_CLR(int fd, fd_set *set);
int FD_ISSET(int fd, fd_set *set);
void FD_SET(int fd, fd_set *set);
void FD_ZERO(fd_set *set);
You can use the FD_SET and FD_ISSET macros to set or test the bits in an fd_set that correspond to your file descriptors.
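Since a POSIX fd_set exposes no fd_count or fd_array, one way to reproduce the loop from the question is to walk every descriptor number up to the highest one in use and test each with FD_ISSET. The following is a sketch under that assumption; broadcast_fdset is a hypothetical name, and max_fd has to be tracked by the caller as descriptors are added.

```c
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>

/* Sends msg to every descriptor in *master except listening and sender.
 * max_fd is the highest descriptor ever added to the set.
 * Returns the number of descriptors the message was sent to. */
int broadcast_fdset(fd_set *master, int max_fd,
                    int listening, int sender, const char *msg)
{
    int fd, sent = 0;

    for (fd = 0; fd <= max_fd && fd < FD_SETSIZE; fd++) {
        if (!FD_ISSET(fd, master))
            continue;                   /* this number is not in the set */
        if (fd == listening || fd == sender)
            continue;                   /* skip the listener and the origin */
        if (send(fd, msg, strlen(msg), 0) >= 0)
            sent++;
    }
    return sent;
}
```

Pass a master set that is kept separately from the one handed to select(), because select() overwrites its arguments with the ready descriptors.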
Keep your own list of connected "users", like a linked list of structures containing what is needed for each user (like username and other data) and the socket descriptor.
Then if you need to send a message to all users just iterate over this list. Since the structures contain all information about the user, it's easy to skip one or more users when iterating, for example to not send to the user from which the message originated.
Simple example
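#include <string.h>      // strlen, strcmp
#include <sys/socket.h>  // send

typedef int SOCKET;      // On Linux a socket descriptor is a plain int

struct user
{
char *name; // Name of the user
SOCKET socket; // Socket descriptor for communication
// Other data needed for the user...
struct user *next; // For linking into a list
};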
// The list of all users
struct user *users = NULL;
// Broadcast a message to *all* connected users
void broadcast(const char *message)
{
for (struct user *u = users; u != NULL; u = u->next)
{
send(u->socket, message, strlen(message), 0);
}
}
// Broadcast to all except a specific user
void broadcast_except_name(const char *message, const char *name)
{
for (struct user *u = users; u != NULL; u = u->next)
{
if (strcmp(u->name, name) != 0)
{
send(u->socket, message, strlen(message), 0);
}
}
}
// Broadcast to all except a specific socket
void broadcast_except_socket(const char *message, SOCKET socket)
{
for (struct user *u = users; u != NULL; u = u->next)
{
if (u->socket != socket)
{
send(u->socket, message, strlen(message), 0);
}
}
}
[Functions for creating or otherwise operating on the list omitted]
I am using Unix domain sockets and want to know where they are located in the system.
Suppose I create a socket pair using the system call
socketpair(AF_UNIX,SOCK_STREAM,0,fd) ;
I have read that this creates unnamed sockets (sockets that have not been bound to a pathname using bind).
On the other hand, a named socket, or rather a socket bound to a file system pathname using bind, is stored in the directory we specify.
For example:
struct sockaddr_un {
sa_family_t sun_family; /* AF_UNIX */
char sun_path[UNIX_PATH_MAX]; /* pathname */
};
Here sun_path can be /tmp/sock, for example.
So, similarly, I want to know: does an unnamed socket have any location in the file system, in memory, or in the kernel?
Thanks in advance.
I'm no kernel expert, so take this as an (educated?) guess.
#include <sys/un.h>
#include <sys/socket.h>
#include <stdio.h>
#include <string.h>
int main()
{
struct sockaddr_un sun;
socklen_t socklen;
int fd[2];
if(socketpair(AF_UNIX,SOCK_STREAM,0,fd) < 0) {
perror("socketpair");
return 111;
}
socklen = sizeof(sun);
memset(&sun, 0, sizeof sun);
sun.sun_path[0] = '!'; /* replace with any character */
if(getsockname(fd[0], (struct sockaddr *)&sun, &socklen) < 0) {
perror("getsockname");
return 111;
}
printf("sunpath(%s)\n", sun.sun_path);
return 0;
}
This program says the socket doesn't have a corresponding path, so my guess is that a unix socketpair is never associated with a filename -- it only stays alive as a data structure inside the kernel until all references are closed.
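A slightly more direct check, assuming Linux behaviour as documented in unix(7), is to look at the address length that getsockname() returns: for an unnamed socket it covers only the family field, and sun_path should not be inspected at all. The helper name unix_socket_is_unnamed is hypothetical.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Returns 1 if the AF_UNIX socket fd is unnamed, 0 if it has a name
 * (pathname or abstract), and -1 on error. */
int unix_socket_is_unnamed(int fd)
{
    struct sockaddr_un addr;
    socklen_t len = sizeof(addr);

    memset(&addr, 0, sizeof(addr));
    if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0)
        return -1;

    /* Only the family field was filled in: there is no name at all. */
    return len <= offsetof(struct sockaddr_un, sun_path);
}
```

Calling it on either end of a socketpair() should report an unnamed socket, which supports the guess that such a pair exists only as kernel data structures.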
A better answer is welcome of course :)
How can I distinguish between "listener" file descriptors and "client" file descriptors?
Here's what I saw in the manpage example:
if(events[n].data.fd == listener) {
...
} else {
...
}
But what if I don't have access to listener?
Sorry if this is a vague question. I'm not quite sure how to word it.
Assuming you are writing a server, you should either keep the listening socket descriptor around in some variable (listener in the manual page), or set up a small structure for each socket you give to epoll_ctl(2) and point to it with the data.ptr member of struct epoll_event (don't forget to deallocate that structure when the socket is closed).
Something like this:
struct socket_ctl
{
int fd; /* socket descriptor */
int flags; /* my info about the socket, say (flags&1) != 0 means server */
/* whatever else you want to have here, like pointers to buffers, etc. */
};
...
struct socket_ctl* pctl = malloc( sizeof( struct socket_ctl ));
/* check for NULL */
pctl->fd = fd;
pctl->flags = 1; /* or better some enum or define */
struct epoll_event ev;
ev.events = EPOLLIN|...;
ev.data.ptr = pctl;
...
if (( ((struct socket_ctl *)events[n].data.ptr)->flags & 1 ) != 0 ) /* data.ptr is a void *, so cast before dereferencing */
{
/* this is server socket */
}
As you can see, it's much more work than just having access to the server socket descriptor, but it has the nice property of keeping all information related to one socket in one place.
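For completeness, here is a compact sketch of the same idea that compiles on Linux. The names sock_role, watch_fd, wait_one and unwatch_fd are hypothetical, and an enum replaces the raw flag bit, as suggested in the comment above.

```c
#include <stdlib.h>
#include <sys/epoll.h>
#include <unistd.h>

enum sock_role { ROLE_LISTENER = 1, ROLE_CLIENT = 2 };

struct socket_ctl {
    int fd;                             /* socket descriptor */
    enum sock_role role;                /* listener or client */
};

/* Registers fd with the epoll instance, tagging it with a role.
 * Returns the heap-allocated tag, or NULL on failure. */
struct socket_ctl *watch_fd(int epfd, int fd, enum sock_role role)
{
    struct socket_ctl *ctl = malloc(sizeof(*ctl));

    if (ctl == NULL)
        return NULL;
    ctl->fd = fd;
    ctl->role = role;

    /* data is a union: use either ptr or fd, never both. */
    struct epoll_event ev = { .events = EPOLLIN, .data = { .ptr = ctl } };
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
        free(ctl);
        return NULL;
    }
    return ctl;
}

/* Waits for one event and returns the tag of the descriptor that fired,
 * or NULL on timeout or error. */
struct socket_ctl *wait_one(int epfd, int timeout_ms)
{
    struct epoll_event ev;

    if (epoll_wait(epfd, &ev, 1, timeout_ms) != 1)
        return NULL;
    return ev.data.ptr;                 /* void * converts implicitly in C */
}

/* Stops watching, closes the descriptor and releases its tag. */
void unwatch_fd(int epfd, struct socket_ctl *ctl)
{
    epoll_ctl(epfd, EPOLL_CTL_DEL, ctl->fd, NULL);
    close(ctl->fd);
    free(ctl);
}
```

In the event loop, the branch from the manual page then becomes a test on the role field of the tag returned by wait_one(), with no global listener variable needed.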