Query on the select() system call in C

select() is declared as:
int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *errorfds, struct timeval *timeout);
nfds represents the highest file descriptor in all given sets, plus one. I would like to know why this data is required for select() when the fd_set information is available.
If the FDs in the set are, say, 4, 8 and 9, the value of nfds would be 10. Would select() monitor fds 4, 5, 6, 7, 8 and 9?

The catch is that fd_set is not really a "set" in the way you're thinking. The behind-the-scenes detail is that an fd_set is essentially an array of integers used as a bit mask. In other words, executing
fd_set foo;
FD_ZERO(&foo);
FD_SET(3, &foo);
sets foo to the decimal value 8: it sets the fourth-least-significant bit to 1 (remember that 0 is a valid descriptor). That is,
FD_SET(3, &foo);
is equivalent to
foo |= (1 << 3);
So in order for select to work correctly, it needs to know which bits of the fd_set you actually care about. Otherwise it would have no way to distinguish a zero bit that is "in" the set but false from a bit that is "not in" the set at all.
In your example, an fd_set with 4, 8 and 9 set and nfds = 10 is interpreted as: "a set with 10 entries (fds 0-9). Entries 4, 8 and 9 are true (monitor them). Entries 0-3 and 5-7 are false (don't monitor them). Any fd greater than 9 is simply not in the set, period."

Select monitors those FDs which you have enabled using the FD_SET macro. If you do not enable any FD for monitoring, select() does not monitor any.
"nfds" is definitely redundant, but it is part of the select() interface, so you need to use it :)
Anyway, if you have {4, 8, 9} in the set, you set nfds to 10 (as you mentioned), and select() will only monitor the three FDs 4, 8 and 9.

It's probably an optimization, so that select doesn't have to walk through the whole fd_set to find out which descriptors are actually used. Without that parameter, select would always need to scan the entire set; with it, some of that work can be skipped.


FD_SET not putting the file descriptor in the set

I am using the following code to set the value of a file descriptor
fd_set current_sockets, ready_sockets;
FD_SET(sock_fd, &current_sockets);
In the above sock_fd is 3. And after the execution of this line I don't see this value 3 in
current_sockets. In fact I see some weird set of values. What could be the reason for this ?
When you declare fd_set current_sockets, ready_sockets;, both of those variables are uninitialized. man 2 select says this:
FD_ZERO()
This macro clears (removes all file descriptors from) set.
It should be employed as the first step in initializing a
file descriptor set.
But you skipped this non-optional step, so your FD sets are full of random garbage.
Also, the contents of the fd_set structure are unspecified, so you shouldn't expect to be able to make sense of them by any means other than using FD_ISSET.

linux: the first parameter of select

int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *errorfds, struct timeval *timeout);
For the first parameter nfds, here is what I get from wiki:
This is an integer one more than the maximum of any file descriptor in
any of the sets. In other words, while adding file descriptors to each
of the sets, you must calculate the maximum integer value of all of
them, then increment this value by one, and then pass this as nfds.
I have a simple question:
If I have more than one socket to process, how should I set the first parameter of select?
Should I set it with the largest socket number + 1?
If so, does it mean that select is listening to all file descriptors less than the largest socket number + 1?
For example, I have three sockets: 111, 222 and 333. If I set the first parameter as 334, does it mean that I'm listening all of file descriptors from 0 to 333?
Should I set it with the largest socket number + 1?
Yes!
If so, does it mean that select is listening to all file descriptors
less than the largest socket number + 1?
No, it only performs its operations on the fd_sets that are listed in
readfds, writefds, and exceptfds
For example, I have three sockets: 111, 222 and 333. If I set the
first parameter as 334, does it mean that I'm listening all of file
descriptors from 0 to 333?
No, you are only doing select() on 111, 222 and 333.
Internally, sys_select sets up three bitmaps, one for each of the three fd_sets, and if any of these bits are set (each corresponding to a file descriptor operation), the wait_key_set operation is performed on it.
The reason for this interface is that in the kernel it reduces to a very predictable for-loop, making it quite safe to work with, rather than having the kernel derive the range itself.

Meaning of FLAG in socket send and recv

While searching in the Linux manual page, what I have found about the format of send and recv in socket is like below:
For send,
ssize_t send(int sockfd, const void *buf, size_t len, int flags);
For recv,
ssize_t recv(int sockfd, void *buf, size_t len, int flags);
But I am not sure what they are trying to say about int flags. In one sample code I found the flags value 0 (zero). What does it mean? Also, what is the meaning of the line below in the man page?
"The flags argument is the bitwise OR of zero or more of the following flags."
Then the list of flags:
MSG_CONFIRM
MSG_DONTROUTE
.
.
.
etc.
If flags is equal to 0, it means that no flags are specified; they are optional.
To answer about ORing flags: it is a mechanism that allows you to specify more than one flag at once. MSG_CONFIRM | MSG_DONTWAIT specifies two flags.
OR gate:        AND gate:
a b | out       a b | out
0 0 |  0        0 0 |  0
0 1 |  1        0 1 |  0
1 0 |  1        1 0 |  0
1 1 |  1        1 1 |  1
By ORing flags you set specific bits to 1 in an int variable.
Later in the code, by ANDing that variable with a specific flag, you can tell whether the flag was set or not.
If you had specified the MSG_DONTWAIT flag, the expression flags & MSG_DONTWAIT evaluates to a nonzero value, so you know the flag was set.
Let's have a look at how MSG_DONTWAIT is defined (in glibc):
enum
{
    ...
    MSG_DONTWAIT = 0x40, /* Nonblocking IO. */
#define MSG_DONTWAIT MSG_DONTWAIT
    ...
};
The hex value 0x40 means that only the 7th least significant bit is set to 1.
Below is an example of bitwise operations from socket.c. It checks whether the O_NONBLOCK flag was set when the socket file descriptor was created; if so, it sets the MSG_DONTWAIT bit in the flags variable.
if (sock->file->f_flags & O_NONBLOCK)
flags |= MSG_DONTWAIT;
a nice reference about bitwise operations: http://teaching.idallen.com/cst8214/08w/notes/bit_operations.txt
Flags let you request extra behavior. The default value (0) gives the default behavior, which is what you want in most simple cases.
If you want the network subsystem to behave in a specific way, you can pass one flag value or combine several by ORing them. For example, if you want both the MSG_DONTWAIT and the MSG_MORE behavior (as described by the man page), you can use MSG_DONTWAIT | MSG_MORE.
Passing 0 as the flags argument gives you a regular recv() with the standard behavior.
If you want to customize recv(), you combine the flags listed in the man page with the OR operator, as stated there:
"The flags argument is the bitwise OR of zero or more of the following flags."
Like this:
recv(sockfd, buf, buflen, FLAG | FLAG | FLAG);

Select from multiple sockets - right nfds value?

I think that nfds in select() determines how many sockets the function will check in readfds and the other fd_sets. So if we put 3 sockets in our fd_set but I want to check only the first one, I have to call select(1 + 1, ...). Is this right?
Or does "nfds is the highest-numbered file descriptor in any of the three sets, plus 1" in linux select man means something different? Also why do we need to add + 1?
Example code - fixed
int CLIENTS[max_clients]; // client sockets
fd_set to_read;
FD_ZERO(&to_read);
int i;
int max_socket_fd = 0;
for (i = 0; i < max_clients; i++)
{
    if (CLIENTS[i] < 0)
        continue;
    int client_socket = CLIENTS[i];
    if (client_socket > max_socket_fd)
        max_socket_fd = client_socket;
    FD_SET(client_socket, &to_read);
}
struct timeval wait;
wait.tv_sec = 0;
wait.tv_usec = 1000;
int select_ret = select(max_socket_fd + 1, &to_read, NULL, NULL, &wait);
...
int select_ret = select(current_clients + 1, &to_read, NULL, NULL, &wait);
Your code is wrong. You don't need to pass the number of file descriptors monitored; you need to pass the biggest descriptor you're interested in, plus 1.
The standard says:
The nfds argument specifies the range of descriptors to be tested. The
first nfds descriptors shall be checked in each set; that is, the
descriptors from zero through nfds-1 in the descriptor sets shall be
examined
So this is just the expected semantics of select: nfds is not the number of file descriptors (as its name would imply) but rather the upper limit of the watched range.
The phrase "zero through nfds-1" in the quote also explains why you need to add 1 to get nfds.
"nfds is the highest-numbered file descriptor in any of the three sets, plus 1"
Every file descriptor is represented by an integral value. So they are not asking for the x-th descriptor that you want to check; they are asking for the highest integral value of the descriptors in your READFDS, plus 1.
Btw, you should check out poll(2) and ppoll(2).
Basically, the "fd" you put into the FD_SET() and similar calls are integer numbers. The "nfds" required by select is the max() of all these values, plus 1.

Increasing limit of FD_SETSIZE and select

I want to increase the FD_SETSIZE macro value for my system.
Is there any way to increase FD_SETSIZE so that select will not fail?
Per the standards, there is no way to increase FD_SETSIZE. Some programs and libraries (libevent comes to mind) try to work around this by allocating additional space for the fd_set object and passing values larger than FD_SETSIZE to the FD_* macros, but this is a very bad idea since robust implementations may perform bounds-checking on the argument and abort if it's out of range.
I have an alternate solution that should always work (even though it's not required to by the standards). Instead of a single fd_set object, allocate an array of them large enough to hold the max fd you'll need, then use FD_SET(fd%FD_SETSIZE, &fds_array[fd/FD_SETSIZE]) etc. to access the set.
I also suggest using poll if possible. There are also several event-processing libraries like libevent or libev (or the event facilities of GLib from GTK, or QtCore, etc.) which should help you, and mechanisms like epoll. Your problem is related to the C10k problem.
It would be better (and easy) to replace select with poll. Generally poll() is a simple drop-in replacement for select() and isn't limited by FD_SETSIZE (typically 1024)...
fd_set fd_read;
int id = 42;

FD_ZERO(&fd_read);
FD_SET(id, &fd_read);

struct timeval tv;
tv.tv_sec = 5;
tv.tv_usec = 0;

if (select(id + 1, &fd_read, NULL, NULL, &tv) != 1) {
    // Error.
}
becomes:
struct pollfd pfd_read;
int id = 42;
int timeout = 5000;

pfd_read.fd = id;
pfd_read.events = POLLIN;

if (poll(&pfd_read, 1, timeout) != 1) {
    // Error.
}
You need to include poll.h for the pollfd structure.
If you need to write as well as read then set the events flag as POLLIN | POLLOUT.
In order to use an fd_set larger than FD_SETSIZE, it is possible to define an extended one like this:
#include <sys/select.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define EXT_FD_SETSIZE 2048

typedef struct
{
    long __fds_bits[EXT_FD_SETSIZE / 8 / sizeof(long)];
} ext_fd_set;

int main()
{
    ext_fd_set fd;
    int s;

    printf("FD_SETSIZE:%d sizeof(fd):%zu\n", EXT_FD_SETSIZE, sizeof(fd));
    /* FD_ZERO would only clear sizeof(fd_set) bytes, so clear the
       whole extended set with memset instead. */
    memset(&fd, 0, sizeof(fd));
    while (((s = dup(0)) != -1) && (s < EXT_FD_SETSIZE))
    {
        FD_SET(s, &fd);
    }
    printf("select:%d\n", select(EXT_FD_SETSIZE, (fd_set *)&fd, NULL, NULL, NULL));
    return 0;
}
This prints:
FD_SETSIZE:2048 sizeof(fd):256
select:2045
In order to open more than 1024 file descriptors, you also need to raise the per-process limit, for instance with ulimit -n 2048.
Actually, there IS a way to increase FD_SETSIZE on Windows. It's defined in winsock.h, and per Microsoft you can increase it by simply defining it BEFORE you include winsock.h:
See Maximum Number of Sockets an Application Can Use (old link), or the more recent page Maximum Number of Sockets Supported.
I do it all the time and have had no problems. The largest value I have used was around 5000 for a server I was developing.
