SO_LINGER and closing sockets(WINSOCK) - c

I'm writing a multithreaded Winsock application and I'm having some issues with closing the sockets.
First of all, is there a limit on the number of simultaneously open sockets? Let's say something like 32 sockets all at once.
I establish a connection on one of the sockets and pass information over it, and it all goes fine.
The problem is that when I disconnect the socket and then reconnect to the same destination, I get a RST from the server after my SYN.
I don't have the code for the server app, so I can't debug it.
When I used SO_LINGER so that a RST was sent at the end of each session, it worked.
But I don't want to end my connections this way.
When not using SO_LINGER, a FIN was sent, but it seems the connection was not really closed.
Any help?
Thanks

On Unix there's a file descriptor limit per process - I'm guessing on Windows it's "handles".
You are probably bind()-ing your client socket to a fixed port. That might be the reason the server is rejecting your subsequent connection. Try normal ephemeral ports.

Firstly, I agree with Nikolai: are you binding your client socket?
If so, it sounds like the socket on the server side is still in TIME_WAIT and is discarding the new connection attempt. By binding the client socket you force every reconnect to use the same local port, so the new connection has exactly the same address/port 4-tuple as the one that is still in its 2MSL wait period. That 4-tuple can't be reused yet, which is why your SYN is rejected. There's usually no need to bind the client port; stop doing it and your problem will likely go away.
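To illustrate connecting without a fixed client port, here is a minimal sketch using POSIX-style socket calls. The question is about Winsock, where the same socket, connect and getsockname calls exist but WSAStartup and closesocket would also be needed; that part is left out. The helper name connect_unbound is made up for this example. It never calls bind(), so the OS picks an ephemeral local port for each connection.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: connect to `server` WITHOUT calling bind() first,
 * so the OS assigns an ephemeral local port.
 * Returns the connected descriptor, or -1 on failure.
 * If local_port is non-NULL it receives the port the OS chose (0 if unknown). */
int connect_unbound(const struct sockaddr_in *server, uint16_t *local_port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* Deliberately no bind() here. */
    if (connect(fd, (const struct sockaddr *)server, sizeof(*server)) < 0) {
        close(fd);
        return -1;
    }

    if (local_port != NULL) {
        struct sockaddr_in local;
        socklen_t len = sizeof(local);

        memset(&local, 0, sizeof(local));
        if (getsockname(fd, (struct sockaddr *)&local, &len) == 0)
            *local_port = ntohs(local.sin_port);
        else
            *local_port = 0;
    }
    return fd;
}
```

Each reconnect then gets a fresh 4-tuple, so an old connection sitting in TIME_WAIT on either side should not get in the way.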
Secondly, yes, there are limits to the number of open sockets on Windows platforms but they're resource related rather than some hard coded number.
Each open socket uses some 'non paged pool' memory, and each pending read or write request on a socket is also likely to use 'non paged pool' and have pages of memory locked during I/O (there's a limit to the number of pages that can be locked). That said, on Vista and later there's much more 'non paged pool' available than on earlier versions of Windows, and even then I've managed to achieve more than 70,000 concurrent active connections on a pretty low spec XP box (see here: http://www.lenholgate.com/blog/2005/11/windows-tcpip-server-performance.html). Note that there is a separate limit on the number of outbound connections that you can establish (which is more likely to be of interest to you); that's around 4000 by default and can be tuned by setting the MaxUserPort registry value. See here: Maximum number of concurrent TCP/IP connections - Win XP SP3 for more details.

Related

How to properly restart server socket?

Once in a while my server's accept call just stops working properly.
There is a much deeper story behind this: I'm being flooded with SYN and SYN/ACK packets, my network router goes disco, and accept keeps returning ECONNABORTED... I already tried to debug and fix this specific attack, but without success. By now I have given up and would rather look for a more generic server recovery solution.
Anyway, I figured out that simply "restarting" the server socket by closing it and calling socket again helps. Theoretically very simple, but in practice I'm facing a huge challenge because (a) the server is quite complex by now and (b) I don't know when exactly I should restart the server socket.
My setup is one accept-thread that calls accept and feeds epoll, one listener-thread that listens for epoll read/write etc. events and feeds a queue of a thread pool.
I have not found any literature that guides one through restarting the server socket.
Particularly:
When do I actually restart the server socket? I mean, I do not really know whether an ECONNABORTED return value from accept is just an aborted connection or the accept/file descriptor is going banana.
How does closing the server socket affect epoll and connected clients? Should I close the server socket immediately or rather have a buffer time such that all clients have finished first?
Or is it even best to have two alternating server sockets such that if one goes banana I just try the other one?
I am making some assumptions about the things you say in your question all being true and accurate even though some of them seem like they may be misdiagnosed. Unfortunately, you didn't really explain how you reached the conclusions presented, so I really can't do much other than assume they're true.
For example, you don't explain how or why you figured that closing and calling socket again will help. From just the information you gave, I would strongly suspect the opposite is true. But again, without knowing the evidence and rationale that led you to figure that, all I can do is assume it's true despite my instinct and experience saying it's wrong.
When do I actually restart the server socket? I mean, I do not really know whether an ECONNABORTED return value from accept is just an aborted connection or the accept/file descriptor is going banana.
If it really is the case that accepting connections will recover faster from a restart than without one and you really can't get any connections through, keep track of the last successful connection and the number of failures since the last successful connection. If, for example, you've gone 120 seconds or more without a successful connection and had at least four failed connections since the last successful one, then close and re-open. You may need to tune those parameters.
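The bookkeeping described above could look roughly like this. It is only a sketch: the struct, the function names and the two constants are invented for illustration, and the thresholds are just the example figures from the paragraph (120 seconds, four failures), which would need tuning.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical bookkeeping for the accept thread. */
struct accept_health {
    time_t   last_success;  /* time of the last successful accept() */
    unsigned failures;      /* failed accept() calls since then */
};

/* Example thresholds from the text above; tune for your own server. */
enum { RESTART_IDLE_SECS = 120, RESTART_MIN_FAILURES = 4 };

void accept_health_init(struct accept_health *h, time_t now)
{
    h->last_success = now;
    h->failures = 0;
}

/* Call after every accept(): ok is true if it returned a descriptor. */
void accept_health_record(struct accept_health *h, bool ok, time_t now)
{
    if (ok) {
        h->last_success = now;
        h->failures = 0;
    } else {
        h->failures++;
    }
}

/* True only if BOTH conditions hold: a long gap without any success
 * and several failures during that gap. */
bool accept_health_should_restart(const struct accept_health *h, time_t now)
{
    return difftime(now, h->last_success) >= RESTART_IDLE_SECS
        && h->failures >= RESTART_MIN_FAILURES;
}
```

Requiring both conditions means a single ECONNABORTED on a busy server, or a quiet night with no traffic at all, does not trigger a restart on its own.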
How does closing the server socket affect epoll and connected clients?
It has no effect on them unless you're using epoll on the server socket itself. In that case, make sure to remove it from the set before closing it.
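If the listening socket is registered with epoll, removing it before the close might be sketched as follows (Linux only; retire_listener is a made-up name). Deregistering explicitly avoids relying on close() to do it, which only happens when no duplicated descriptor still refers to the same open socket.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: take listen_fd out of the epoll set, then close it.
 * Returns 0 on success, -1 on failure. */
int retire_listener(int epfd, int listen_fd)
{
    /* ENOENT just means the descriptor was never registered with this
     * epoll instance, which is fine for our purposes. */
    if (epoll_ctl(epfd, EPOLL_CTL_DEL, listen_fd, NULL) < 0 && errno != ENOENT)
        return -1;

    return close(listen_fd);
}
```

Already-accepted client sockets are separate descriptors, so they stay registered and connected.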
Should I close the server socket immediately or rather have a buffer time such that all clients have finished first?
I would suggest "draining" the socket by calling accept without blocking until it returns EWOULDBLOCK. Then you can close it. If you get any legitimate connections in that process, don't close it since it's obviously still working.
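A rough sketch of the draining step, assuming a POSIX system. The function name, the callback and the retry cap are inventions of this example. The cap is there so that a socket that keeps returning ECONNABORTED cannot keep the loop spinning forever.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/socket.h>
#include <unistd.h>

enum { MAX_DRAIN_TRIES = 1024 };  /* arbitrary cap for this sketch */

/* Hypothetical helper: accept everything currently queued on listen_fd
 * without blocking. Each accepted descriptor is passed to on_conn, or
 * closed if on_conn is NULL. Returns the number of connections accepted,
 * or -1 if the socket could not be made non-blocking.
 * Note: listen_fd is left in non-blocking mode. */
int drain_listener(int listen_fd, void (*on_conn)(int fd))
{
    int flags = fcntl(listen_fd, F_GETFL, 0);
    if (flags < 0 || fcntl(listen_fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;

    int accepted = 0;
    for (int tries = 0; tries < MAX_DRAIN_TRIES; tries++) {
        int fd = accept(listen_fd, NULL, NULL);
        if (fd >= 0) {
            accepted++;
            if (on_conn != NULL)
                on_conn(fd);
            else
                close(fd);
        } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
            break;  /* queue is empty: draining is done */
        } else if (errno != EINTR && errno != ECONNABORTED) {
            break;  /* unexpected error: stop and let the caller decide */
        }
    }
    return accepted;
}
```

A return value of 0 suggests the socket really is stuck and can be closed; anything greater means legitimate connections still got through, so the restart is probably not needed.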
A client that tries to get in between your close and getting around to calling listen on a new socket might get an error. But if they're getting errors anyway, that should be acceptable.
Or is it even best to have two alternating server sockets such that if one goes banana I just try the other one?
A long time ago, port DoS attacks were common because built-in defenses to things like SYN-bombs weren't as good as they are now. In those days, it was common for a server to support several different ports and for clients to try the ports in rotation. This is why IRC servers often accepted connections on ranges of ports such as 6660-6669. That meant an attacker had to do ten times as much work to make all the ports unusable. These days, it's pretty rare for an attack to take out a specific inbound port so the practice has largely gone away. But if you are facing an attack that can take out specific listening ports, it might make sense to open more listening ports.
Or you could work harder to understand the attack and figure out why you are having a problem that virtually nobody else is having.

Deny a client's TCP connect request before accept()

I'm trying to code a TCP server in C. I just noticed that accept() only returns once the connection is already established.
Some clients flood the server with random data, and some send random data just once. After that I want to close their current connection and refuse their future connections for a few minutes (or more, depending on how much load the program has).
I can save the bad clients' IP addresses in an array, and the timings too, but I can't find any function to abort the current connection or deny future connections from bad clients.
I found a Windows function called WSAAccept that lets you deny connections at your discretion, but I don't use Windows.
I tried coding a raw TCP server, which gives you access to each TCP packet from the start, including the whole TCP header, and doesn't accept connections automatically. I tried handling connections in the program itself, including SYN ACK and other TCP signals. It worked, but then I noticed that the raw TCP server receives every packet on my network interface, so when other programs generate heavy traffic my program becomes laggy too.
I tried libnetfilter, which lets you filter all the traffic on your network interface. It works too, but like the raw TCP server it receives all of the interface's packets, which makes it slow when there is a lot of traffic. I also compared libnetfilter with iptables: libnetfilter is slower than iptables.
So, in summary, how can I abort a client's current and future connections without hurting other clients' connections?
I'm on Linux, Debian 10.
Once you do blacklisting at the packet level, you very quickly become vulnerable to trivial attacks based on IP spoofing. With a very basic implementation, an attacker could use your packet-level blacklisting to blacklist anyone he wants just by sending you many packets with a fake source IP address. Usually you don't want to touch this kind of filtering (unless you really know what you are doing); you just trust your firewall etc.
So I recommend really just to close the file descriptor immediately after getting it from accept.
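A minimal sketch of that approach: a small fixed-size table of blocked IPv4 addresses with an expiry time, checked right after accept(). Everything here (names, the table size, the -2 return code) is hypothetical. Because the check happens after the TCP handshake has completed, the peer address is much harder to spoof than at the packet level.

```c
#include <netinet/in.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

#define BLOCKLIST_SIZE 64  /* arbitrary size for this sketch */

struct block_entry { uint32_t addr; time_t until; };  /* addr in network byte order */
struct blocklist   { struct block_entry e[BLOCKLIST_SIZE]; };  /* zero-initialise before use */

/* Block addr for secs seconds. Reuses a live entry for the same address,
 * otherwise overwrites the entry that expires (or expired) earliest. */
void blocklist_add(struct blocklist *bl, uint32_t addr, time_t now, int secs)
{
    struct block_entry *slot = &bl->e[0];
    for (int i = 0; i < BLOCKLIST_SIZE; i++) {
        struct block_entry *cur = &bl->e[i];
        if (cur->addr == addr && cur->until > now) { slot = cur; break; }
        if (cur->until < slot->until) slot = cur;
    }
    slot->addr = addr;
    slot->until = now + secs;
}

bool blocklist_contains(const struct blocklist *bl, uint32_t addr, time_t now)
{
    for (int i = 0; i < BLOCKLIST_SIZE; i++)
        if (bl->e[i].addr == addr && bl->e[i].until > now)
            return true;
    return false;
}

/* Accept one connection; if the peer is blocked, close it immediately.
 * Returns the new descriptor, -1 if accept() failed, -2 if the peer was dropped. */
int accept_filtered(int listen_fd, const struct blocklist *bl)
{
    struct sockaddr_in peer;
    socklen_t len = sizeof(peer);
    int fd = accept(listen_fd, (struct sockaddr *)&peer, &len);
    if (fd < 0)
        return -1;
    if (blocklist_contains(bl, peer.sin_addr.s_addr, time(NULL))) {
        close(fd);
        return -2;
    }
    return fd;
}
```

The linear scan is fine for a few dozen entries; a server with many offenders would probably want a hash table instead.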

Programmatically detect if local web server has hung

I realise that I'll get at least one answer along the lines of "(re)write the code so it doesn't hang" but let's assume we don't live in that shiny happy utopia just yet...
In our embedded system we have a big SDK including a web-server (Boa) which is the primary method of user interaction.
It's possible, during certain phases of the moon, that something can cause the web server to hang or become otherwise stuck in such a way that the process appears running normally (not crashed/dead/using 100% CPU) but does not serve any web pages.
So, the question is, how do we test/detect this situation?
To test whether the server is hung, create a TCP socket and connect to port 80 on IP address 127.0.0.1 (loopback address). Then send the following text over the socket:
GET / HTTP/1.1\r\n\r\n
Most servers will interpret that as a request for index.html. Alternatively, you could implement an undocumented URL for testing (which allows for a shorter, predetermined response), e.g.
GET /test/fdoaoqfaf12491r2h1rfda HTTP/1.1\r\n\r\n
You then need to read the response from the server. This involves using select with a reasonable timeout to determine whether any data came back from the server, and if so, use recv to read the data. The response from the server will consist of a header followed by content. The header consists of lines of text, with a blank line at the end of the header. Lines end with \r\n, so the end of the header is \r\n\r\n.
Getting the content involves calling select and recv until recv returns 0. This assumes that the server will send the response and then close the socket. Some sophisticated servers will leave a socket open to allow multiple requests over the same socket. A simple embedded server should not be doing that. (If your server is trying to use the same socket for multiple requests, then you need to figure out how to turn that feature off.)
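The probe described above might be sketched like this. The function name and the result codes are made up, the port is a parameter rather than hard-coded to 80, and the sketch only waits for the start of the response instead of reading the whole body.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

enum probe_result { PROBE_OK = 0, PROBE_NO_CONNECT = 1, PROBE_NO_REPLY = 2 };

/* Hypothetical watchdog probe: connect to 127.0.0.1:port, send a GET,
 * and wait up to timeout_secs for something that looks like HTTP. */
enum probe_result probe_http(uint16_t port, int timeout_secs)
{
    static const char req[] = "GET / HTTP/1.1\r\n\r\n";
    struct sockaddr_in addr;
    enum probe_result res = PROBE_NO_CONNECT;

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return res;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0) {
        res = PROBE_NO_REPLY;
        /* MSG_NOSIGNAL: get an error instead of SIGPIPE if the peer is gone. */
        if (send(fd, req, sizeof(req) - 1, MSG_NOSIGNAL) == (ssize_t)(sizeof(req) - 1)) {
            fd_set rfds;
            struct timeval tv = { timeout_secs, 0 };

            FD_ZERO(&rfds);
            FD_SET(fd, &rfds);
            if (select(fd + 1, &rfds, NULL, NULL, &tv) > 0) {
                char buf[16];
                ssize_t n = recv(fd, buf, sizeof(buf), 0);
                /* A status line starts with "HTTP/"; that is enough here. */
                if (n >= 5 && memcmp(buf, "HTTP/", 5) == 0)
                    res = PROBE_OK;
            }
        }
    }
    close(fd);
    return res;
}
```

A watchdog could call probe_http(80, 5) periodically and restart the server after a few consecutive failures. Note that connect() is blocking here, so if the server's listen queue is full the call itself can stall for the system's connect timeout; a more careful version would make the connect non-blocking as well.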
That's all very well and good, but you really need to rewrite your code so it doesn't hang.
The most likely cause of the problem is that the server has a bunch of dangling sockets, i.e. connections from clients that were never properly cleaned up. Dangling sockets will eventually prevent the server from accepting more connections, either because the server has a limit on the number of open connections, or because the process that's running the server uses up all of its file descriptors.
The first thing to check is the TCP timeout value. One project that I worked on had a default timeout of 5 hours, which meant that dangling sockets stayed open for 5 hours. A reasonable timeout is 1 minute.
Then you need to create a client that deliberately misbehaves. Clients can misbehave by
leaving a socket open without reading the server's response
abruptly closing the socket while reading the response
gracefully closing the socket while reading the response
The first situation should be handled by the TCP timeout. The other two need to be properly handled by the server code. Graceful and abrupt socket closure is controlled via the SO_LINGER socket option (set with setsockopt) and the shutdown function. After the client misbehaves, check the number of open file descriptors in the server process, to verify that the server has handled the situation correctly.
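For the misbehaving test client, the two kinds of closure could be sketched as below (POSIX sockets; both function names are made up). An abrupt close sets SO_LINGER with a zero timeout so that close() resets a TCP connection instead of sending a FIN; a graceful close calls shutdown() and then reads until the peer closes too.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Abrupt close: with l_onoff = 1 and l_linger = 0, close() discards any
 * unsent data and, on a TCP socket, sends a RST rather than a FIN. */
int close_abortive(int fd)
{
    struct linger lg = { .l_onoff = 1, .l_linger = 0 };

    if (setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg)) < 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}

/* Graceful close: signal end of our data (FIN on TCP), then drain whatever
 * the peer still sends until it closes its side (recv returns 0).
 * Caveat: this blocks for as long as the peer keeps the connection open;
 * real code would bound the wait with a timeout. */
int close_graceful(int fd)
{
    char buf[256];
    ssize_t n;

    if (shutdown(fd, SHUT_WR) < 0) {
        close(fd);
        return -1;
    }
    while ((n = recv(fd, buf, sizeof(buf), 0)) > 0)
        ;  /* discard remaining data */
    close(fd);
    return n == 0 ? 0 : -1;
}
```

Calling either function partway through reading a response reproduces the second and third misbehaviours in the list above.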

How to use SO_KEEPALIVE option properly to detect that the client at the other end is down?

I was trying to learn the usage of option SO_KEEPALIVE in socket programming in C language under Linux environment.
I created a server socket and used my browser to connect to it. It was successful and I was able to read the GET request, but I got stuck on the usage of SO_KEEPALIVE.
I checked this link keepalive_description#tldg.org but I could not find any example which shows how to use it.
As soon as accept() returns a client connection, I set the SO_KEEPALIVE option to 1 on the client socket. Now I don't know how to check whether the client is down, how to change the time interval between the probes, etc.
I mean, how will I get the signal that the client is down (without reading from or writing to the client)? I thought I would get some signal when the probes are not answered by the client. How should I program it after turning the SO_KEEPALIVE option on?
Also, suppose the probes are sent every 3 seconds and the client goes down in between: I will not know that the client is down, and I may get SIGPIPE.
Anyway, most importantly, I want to know how to use SO_KEEPALIVE in code.
To modify the number of probes or the probe intervals, you write values to the /proc filesystem, like this:
echo 600 > /proc/sys/net/ipv4/tcp_keepalive_time
echo 60 > /proc/sys/net/ipv4/tcp_keepalive_intvl
echo 20 > /proc/sys/net/ipv4/tcp_keepalive_probes
Note that these values are global for all keepalive-enabled sockets on the system. You can also override these settings on a per-socket basis with setsockopt; see section 4.2 of the document you linked.
You can't "check" the status of the socket from userspace with keepalive. Instead, the kernel is simply more aggressive about forcing the remote end to acknowledge packets, and determining if the socket has gone bad. When you attempt to write to the socket, you will get a SIGPIPE if keepalive has determined that the remote end is down.
You'll get the same result if you enable SO_KEEPALIVE, as if you don't enable SO_KEEPALIVE - typically you'll find the socket ready and get an error when you read from it.
You can set the keepalive timeout on a per-socket basis under Linux (this may be a Linux-specific feature). I'd recommend this rather than changing the system-wide setting. See the man page for tcp for more info.
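A sketch of the per-socket variant on Linux, using the TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT options from the tcp man page. The helper name is made up; the three parameters correspond to tcp_keepalive_time, tcp_keepalive_intvl and tcp_keepalive_probes respectively.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: enable keepalive on one socket and override the
 * system-wide timings for that socket only (Linux-specific options).
 *   idle_secs  - idle time before the first probe is sent
 *   intvl_secs - time between unanswered probes
 *   probes     - unanswered probes before the connection is dropped
 * Returns 0 on success, -1 on failure (errno is set by setsockopt). */
int enable_keepalive(int fd, int idle_secs, int intvl_secs, int probes)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_secs, sizeof(idle_secs)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_secs, sizeof(intvl_secs)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probes, sizeof(probes)) < 0)
        return -1;
    return 0;
}
```

For example, enable_keepalive(client_fd, 600, 60, 20) on the descriptor returned by accept() would apply the same values as the /proc example, but only to that one connection.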
Finally, if your client is a web browser, it's quite likely that it will close the socket fairly quickly anyway - most of them will only hold keepalive (HTTP 1.1) connections open for a relatively short time (30s, 1 min etc). Of course if the client machine has disappeared or network down (which is what SO_KEEPALIVE is really useful for detecting), then it won't be able to actively close the socket.
As already discussed, SO_KEEPALIVE makes the kernel more aggressive about continually verifying the connection even when you're not doing anything, but does not change or enhance the way the information is delivered to you. You'll find out when you try to actually do something (for example "write"), and you'll find out right away since the kernel is now just reporting the status of a previously set flag, rather than having to wait a few seconds (or much longer in some cases) for network activity to fail. The exact same code logic you had for handling the "other side went away unexpectedly" condition will still be used; what changes is the timing (not the method).
Virtually every "practical" sockets program in some way provides non-blocking access to the sockets during the data phase (maybe with select()/poll(), or maybe with fcntl()/O_NONBLOCK/EINPROGRESS&EWOULDBLOCK, or if your kernel supports it maybe with MSG_DONTWAIT). Assuming this is already done for other reasons, it's trivial (sometimes requiring no code at all) to in addition find out right away about a connection dropping. But if the data phase does not already somehow provide non-blocking access to the sockets, you won't find out about the connection dropping until the next time you try to do something.
(A TCP socket connection without some sort of non-blocking behaviour during the data phase is notoriously fragile, as if the wrong packet encounters a network problem it's very easy for the program to then "hang" indefinitely, and there's not a whole lot you can do about it.)
Short answer, add
int flags = 1;
if (setsockopt(sfd, SOL_SOCKET, SO_KEEPALIVE, &flags, sizeof(flags)) < 0) {
    perror("ERROR: setsockopt(), SO_KEEPALIVE");
    exit(EXIT_FAILURE); /* non-zero status on failure; needs <stdlib.h> */
}
on the server side, and read() will be unblocked when the client is down.
A full explanation can be found here.

sockets - discover firewalled ports

I was reading the nmap source code because I'd like to find out how it discovers that certain ports are filtered or firewalled. I have some experience with sockets in C and I've built simple port scanners. That's easy: if the connection succeeds, the port is open, otherwise it's closed (because of the RST returned). But firewalled ports don't return a RST packet, and my port scanner just "waits" forever.
If someone has experience with this topic, please point me to the parts of the nmap code where the actual scanning and port-state determination occurs, or at least tell me if there is any other code available that deals with this problem.
Use asynchronous socket API calls (i.e. don't wait for the connection to be established, and instead try the next port/address in parallel) and define a reasonable timeout (e.g. if the connection isn't established after a minute you can consider it filtered).
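As a rough sketch of that idea for a single port (not how nmap itself does it), a non-blocking connect() can be combined with poll() and a timeout. All names here are made up. Treating "host unreachable" style errors as filtered is an assumption of this sketch.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

enum port_state { PORT_OPEN, PORT_CLOSED, PORT_FILTERED, PORT_ERROR };

/* Map a connect() error code to a port state (assumption of this sketch:
 * timeouts and unreachable errors are reported as "filtered"). */
static enum port_state classify(int err)
{
    switch (err) {
    case 0:            return PORT_OPEN;
    case ECONNREFUSED: return PORT_CLOSED;   /* RST came back */
    case ETIMEDOUT:
    case EHOSTUNREACH:
    case ENETUNREACH:  return PORT_FILTERED;
    default:           return PORT_ERROR;
    }
}

enum port_state probe_port(const char *ipv4, uint16_t port, int timeout_ms)
{
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ipv4, &addr.sin_addr) != 1)
        return PORT_ERROR;

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return PORT_ERROR;
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fd);
        return PORT_ERROR;
    }

    enum port_state state = PORT_ERROR;
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0) {
        state = PORT_OPEN;                   /* connected immediately */
    } else if (errno == EINPROGRESS) {
        struct pollfd pfd = { .fd = fd, .events = POLLOUT };
        int n = poll(&pfd, 1, timeout_ms);
        if (n == 0) {
            state = PORT_FILTERED;           /* neither SYN-ACK nor RST in time */
        } else if (n > 0) {
            int err = 0;
            socklen_t len = sizeof(err);
            if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == 0)
                state = classify(err);       /* outcome of the async connect */
        }
    } else {
        state = classify(errno);             /* failed straight away */
    }
    close(fd);
    return state;
}
```

To scan in parallel, the same logic can be applied to many sockets at once by starting all the connects first and then passing the whole array of descriptors to a single poll() call.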

Resources