Even though a lot has been said on this topic, I am still stumped.
I am experimenting with a monster Linux server that should be capable of handling serious load ramps, presumably many thousands of connections per second. Now, if I check the default listen() queue:
# cat /proc/sys/net/core/somaxconn
128
which can't possibly be the actual queue size. I suspect it might be a legacy setting, and that the actual size is given by this:
# cat /proc/sys/net/ipv4/tcp_max_syn_backlog
2048
However, man tcp says the latter counts connections still awaiting an ACK from the client, which is different from the total number of connections that have not yet been accepted, which is what the listen() backlog is.
So my question is: how can I increase the listen() backlog, and how do I get/set its upper limit (short of recompiling the kernel)?
somaxconn is the number of completed connections waiting to be accepted.
tcp_max_syn_backlog is the number of incomplete connections still waiting for the handshake to finish.
They aren't the same thing. It's all described in the man page.
You increase it by following these instructions: https://serverfault.com/questions/271380/how-can-i-increase-the-value-of-somaxconn - basically by using sysctl.
And yes, somaxconn is the cap on the listen() backlog.
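For completeness, here is a hedged sketch of reading the current cap from procfs at runtime and passing it straight to listen(); listen_with_system_cap is a made-up helper name, and the kernel would silently truncate anything larger anyway:

#include <stdio.h>
#include <sys/socket.h>

/* Read net.core.somaxconn from procfs and use it as the listen()
 * backlog; values above it are silently truncated by the kernel. */
int listen_with_system_cap(int sockfd)
{
    int cap = 128;                             /* documented default */
    FILE *f = fopen("/proc/sys/net/core/somaxconn", "r");
    if (f) {
        if (fscanf(f, "%d", &cap) != 1)
            cap = 128;
        fclose(f);
    }
    return listen(sockfd, cap);
}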
I am building a UDP port scanner in C.
This is the outline of the code:
Create a socket
Build a raw UDP packet for port i
Send the packet and wait n milliseconds for a reply
I need to perform those tasks X times, depending on the number of ports to be scanned. It may be up to 65535 times.
My goal is to optimize resources, considering an i386 machine running under a 3.5.0-17-generic Linux kernel.
How many threads should be created?
How many packets should be sent inside a single thread?
Thanks for your attention.
One thread, using select, epoll or similar.
All of them. Remember to rate limit since that doesn't happen automatically with UDP.
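For illustration, here is a minimal single-threaded sketch along those lines. It uses an ordinary UDP socket with select() and a usleep()-based rate limit rather than the raw packets from the question; scan_ports, timeout_ms, and gap_us are made-up names, and error handling is trimmed:

#include <stdio.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>

/* One socket, one thread: send a probe to each port, wait briefly for
 * a reply with select(), and sleep between probes as a crude rate limit. */
int scan_ports(const char *target_ip, int timeout_ms, useconds_t gap_us)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    struct sockaddr_in dst = {0};
    dst.sin_family = AF_INET;
    inet_pton(AF_INET, target_ip, &dst.sin_addr);

    for (int port = 1; port <= 65535; port++) {
        dst.sin_port = htons(port);
        char probe = 0;
        sendto(fd, &probe, 1, 0, (struct sockaddr *)&dst, sizeof(dst));

        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        struct timeval tv = { timeout_ms / 1000, (timeout_ms % 1000) * 1000 };

        if (select(fd + 1, &rfds, NULL, NULL, &tv) > 0) {
            char buf[512];
            recvfrom(fd, buf, sizeof(buf), 0, NULL, NULL);
            printf("port %d replied\n", port);
        }
        usleep(gap_us);   /* crude rate limit */
    }
    close(fd);
    return 0;
}

A real scanner would also watch for ICMP port-unreachable errors (for example via a connected socket, where they surface as ECONNREFUSED) to tell closed ports from filtered ones.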
I read man 2 listen.
I don't understand what the backlog value is; it says:
The backlog argument defines the maximum length to which the queue of pending connections for sockfd may grow
Right, but how can I determine the best value?
Thanks
Basically, what the listen() backlog affects is how many incoming connections can queue up if your application isn't accept()ing connections as soon as they come in. It's not particularly important to most applications. The maximum value used by most systems is 128, and passing that is generally safe.
It's a race between clients trying to connect, pushing connection requests onto the queue, and the accepting thread(s) pulling them off. Usually, the threads win. I usually set it to 32, but it's not usually an important parameter.
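If you'd rather not pick a number yourself, a common idiom is to pass the SOMAXCONN constant from <sys/socket.h> and let the kernel clamp it to its own limit; a minimal sketch, with start_listening being a made-up helper name:

#include <stdio.h>
#include <sys/socket.h>

/* Ask for the system maximum; the kernel silently truncates anything
 * above net.core.somaxconn anyway. */
int start_listening(int sockfd)
{
    if (listen(sockfd, SOMAXCONN) < 0) {
        perror("listen");
        return -1;
    }
    return 0;
}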
I have the following problem:
I have sockfd = socket(AF_INET, SOCK_STREAM, 0)
After I set up and bind() the socket (let's say with addr.sin_port = htons(666) in the sockaddr_in), I immediately do:
listen(sockfd, 3);
sleep(50); // for test purposes
I'm sleeping for 50 seconds to test the backlog argument, which seems to be ignored because I can establish a connection* more than 3 times on port 666.
*: What I mean is that I get a SYN/ACK for each Nth SYN (N > 3) sent from the client, and the connection is placed in the listen queue instead of being dropped. What could be wrong? I've read the man pages of listen(2) and tcp(7) and found:
The behavior of the backlog argument on TCP sockets changed with Linux 2.2. Now it specifies the queue length for completely established sockets waiting to be accepted, instead of the number of incomplete connection requests. The maximum length of the queue for incomplete sockets can be set using /proc/sys/net/ipv4/tcp_max_syn_backlog. When syncookies are enabled there is no logical maximum length and this setting is ignored. See tcp(7) for more information.
But even with sysctl -w net.ipv4.tcp_max_syn_backlog=2 and sysctl -w net.ipv4.tcp_syncookies=0, I still get the same results! I must be missing something or completely misunderstand listen()'s backlog purpose.
The backlog argument to listen() is only advisory.
POSIX says:
The backlog argument provides a hint to the implementation which the implementation shall use to limit the number of outstanding connections in the socket's listen queue.
Current versions of the Linux kernel round it up to the next highest power of two, with a minimum of 16. The relevant code is in reqsk_queue_alloc().
Different operating systems turn the same backlog number into different actual queue lengths. FreeBSD appears to be one of the few OSes with an actual 1-to-1 mapping. (source: http://books.google.com/books?id=ptSC4LpwGA0C&lpg=PA108&ots=Kq9FQogkTr&dq=berkeley%20listen%20backlog%20ack&pg=PA108#v=onepage&q=berkeley%20listen%20backlog%20ack&f=false )
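To make that concrete, here is a rough userspace model of the rounding just described, based on the text above rather than the kernel source; it would also explain why the backlog of 3 in the question behaved like a much larger queue:

#include <stdio.h>

/* Model of the described sizing: round the requested backlog up to the
 * next power of two, with a floor of 16. Not the actual kernel code. */
static unsigned int effective_backlog(unsigned int backlog)
{
    unsigned int size = 16;
    while (size < backlog)
        size <<= 1;
    return size;
}

int main(void)
{
    printf("listen(fd, 3)   -> queue of %u\n", effective_backlog(3));   /* 16 */
    printf("listen(fd, 100) -> queue of %u\n", effective_backlog(100)); /* 128 */
    return 0;
}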
Basically, I set up a test to see which method is the fastest way to get data from another computer on my network, for a server with only a few clients (10 at most, 1 at least).
I tried two methods, both done in a thread-per-client fashion, and looped the read 10000 times. I timed the loop from the creation of the threads to the joining of the threads right after. In my threads I used these two methods; both used standard read(2)/write(2) calls and SOCK_STREAM/AF_INET:
In one, my client polled for data, reading (non-blocking) whenever data was available, and my server sent data immediately whenever it got a connection. My thread returned on a read of the correct number of bytes (which happened every time).
In the other, my client sent a message to the server on connect, and my server sent a message to my client on a read (both sides blocked here to make this more turn-based and synchronous). My thread returned after my client's read.
I was pretty sure polling would be faster. I made a histogram of thread completion times and, as expected, polling was faster by a slight margin, but two things were unexpected about the read/write method. First, the read/write method gave me two distinct time spikes, i.e. some event occasionally occurred that slowed the read/write down by about 0.01 microseconds. I ran this test on a switch initially and thought this might be packet collisions, but then I ran the server and client on the same computer and still got these two different time spikes. Does anyone know what event may be occurring?
Second, my read function sometimes returned too many bytes, and some of the bytes were garbage. I know streams don't guarantee you'll get all the data correctly, but why would the read function return extra garbage bytes?
It seems you are confusing the purpose of these two alternatives:
The connection-per-thread approach does not need polling (unless your protocol allows a random sequence of messages either way, which would be very confusing to implement). Blocking reads and writes will always be faster here, since you skip an extra system call to select(2)/poll(2)/epoll(7).
The polling approach lets you multiplex I/O on many sockets/files in a single-threaded or fixed-number-of-threads setup. This is how web servers like nginx handle thousands of client connections in very few threads. The idea is that waiting on any one file descriptor does not block the others: you wait on all of them at once.
So I would say you are comparing apples and goblins :) Take a look here:
High Performance Server Architecture
The C10K problem
libevent
As for the spikes: check whether TCP goes into retransmission, i.e. one of the sides is not reading fast enough to drain the receive buffers, and play with the SO_RCVBUF and SO_SNDBUF socket options; see the sketch below.
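A hedged sketch of adjusting those options; grow_buffers and the size are made up, and note that Linux doubles the requested value and clamps it to net.core.rmem_max / net.core.wmem_max (see socket(7)):

#include <sys/socket.h>

/* Enlarge both socket buffers on fd; check return values because the
 * kernel may refuse or clamp the request. */
static int grow_buffers(int fd, int bytes)
{
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) < 0)
        return -1;
    return setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes));
}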
Too many bytes is definitely wrong; that looks like API misuse. Check whether you are comparing signed and unsigned numbers, and compile with a high warning level.
Edit:
Looks like you are caught between two separate issues: data corruption and data-transfer performance. I would strongly recommend focusing on the first one before tackling the second. Reduce the test to a minimum and try to figure out what you are doing wrong with the sockets, i.e. where that garbage data comes from. Do you check the return values of the read(2) and write(2) calls? Do you share buffers between threads? Paste the reduced code sample into the question (or provide a link to it) if you're really stuck.
Hope this helps.
I know streams don't guarantee you'll get all the data correctly, but why would the read function return extra garbage bytes?
Actually, streams do guarantee you will get all the data correctly, and in order. Datagrams (UDP) are what you were thinking of, SOCK_DGRAM, which is not what you are using. Within AF_INET, SOCK_STREAM means TCP and TCP means reliable.
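The usual culprit for extra garbage bytes is treating one read(2) as one whole message and ignoring its return value. A minimal sketch of a checked loop, with read_full being a made-up helper name:

#include <errno.h>
#include <unistd.h>

/* Keep calling read(2) until exactly len bytes have arrived. TCP is a
 * byte stream, so one write() on the peer may arrive as several reads. */
ssize_t read_full(int fd, void *buf, size_t len)
{
    size_t got = 0;
    while (got < len) {
        ssize_t n = read(fd, (char *)buf + got, len - got);
        if (n == 0)                 /* peer closed the connection */
            break;
        if (n < 0) {
            if (errno == EINTR)     /* interrupted: retry */
                continue;
            return -1;
        }
        got += n;
    }
    return (ssize_t)got;
}

Only the first got bytes of the buffer are valid; anything past the returned count is leftover memory, which looks like garbage if you print it.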
I have a question regarding the backlog value in the listen() system call. From the man page of listen:
If the backlog argument is greater than the value in /proc/sys/net/core/somaxconn, then it is silently truncated to that value; the default value in this file is 128.
Does that mean my server can accept only 128 connections at once? What if I want to accept more than 128 connections? Can I simply set the value in that file to the maximum possible number so that I can handle more connections?
That number is only the size of the connection queue, where new connections wait for somebody to accept them. As soon as your application calls accept(), a waiting connection is removed from that queue. So, you can definitely handle more than 128 simultaneous connections because they usually only spend a short time in the queue.
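To make that concrete, here is a minimal sketch of the usual pattern (serve is a made-up name, and error handling is trimmed); the faster this loop gets back to accept(), the less the queue size matters:

#include <stdio.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Each accept() removes one connection from the listen queue, so the
 * queue only fills up if this loop falls behind the connection rate. */
int serve(int port)
{
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lfd, 128) < 0) {
        perror("bind/listen");
        return -1;
    }
    for (;;) {
        int cfd = accept(lfd, NULL, NULL);   /* dequeues one connection */
        if (cfd < 0)
            continue;
        /* hand cfd off to a worker thread or event loop here */
        close(cfd);
    }
}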
Yes. Use a command such as
$ echo 1000 >/proc/sys/net/core/somaxconn
to set the limit higher. See, for instance, this page for more tuning tips.
The backlog value is not the maximum number of connections; it's the number of outstanding connections, i.e. connections which you haven't accept()ed yet.