Best way to deal with a multithreaded TCP server in C on Linux

I've read about this topic here in a lot of different ways, and I want to know what the best practices are for "creating a Linux TCP server with C and multithreading".
So far I've read about:
1-Duplicating the process with fork().
2-Creating a separate thread for each client (see "multithread server/client implementation in C").
3-Creating asynchronous threads for each connection.
I've read that fork and thread-per-connection are not best practices, but I'm not sure what really is one.
I have a small server with asynchronous threads for each connection and I have problems with bind(): if I kill the process and start it again, it needs about 5 minutes before it can start, because I get "ERROR on binding: Address already in use". I decided to fix it, but following best practices.
Many thanks in advance, and sorry for my English.

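Regarding your bind() problem:
Set the socket option SO_REUSEADDR to enable binding to a port already in use (under certain circumstances). The delay you see typically comes from connections of the old process lingering in TIME_WAIT. Set the option before calling bind().
With that in place the restart should work fine: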
...
servSock=socket(PF_INET,SOCK_STREAM,IPPROTO_TCP);
int optval = 1;
setsockopt(servSock,SOL_SOCKET,SO_REUSEADDR,(void *)&optval,sizeof(optval));
/* Construct local address structure */
memset(&echoServAddr,0,sizeof(echoServAddr)); /* Zero out the structure */
echoServAddr.sin_family=AF_INET; /* Internet address family*/
echoServAddr.sin_addr.s_addr=htonl(INADDR_ANY); /* Any incoming interface */
echoServAddr.sin_port = htons(echoServPort); /* Local port */
/* Bind to the local address */
bind(servSock, (struct sockaddr *) &echoServAddr, sizeof(echoServAddr));
...
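For completeness, here is a minimal sketch of the same idea with error checks added. The helper name make_listener() is made up for illustration and is not part of the snippet above; it also calls listen(), which the excerpt elides.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns a listening TCP socket bound to `port`
 * on all interfaces, or -1 on failure. */
int make_listener(uint16_t port)
{
    int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (fd < 0)
        return -1;

    /* Must be set before bind() to have any effect on it. */
    int optval = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &optval, sizeof optval) < 0) {
        close(fd);
        return -1;
    }

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);   /* any incoming interface */
    addr.sin_port = htons(port);                /* local port */

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, SOMAXCONN) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Passing port 0 lets the kernel pick a free port, which is handy for quick experiments.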

1. Is obsolete since the introduction of threads.
2. This is the most widely used technique.
3. You've misread this. You can use asynchronous I/O, but it's a complex programming model and not to be entered into lightly.
4. You left out non-blocking I/O with select(), poll(), epoll().
If you know what you're doing and you expect very high load you should investigate 3 or 4. Otherwise you should start with 2, as it's the easiest to program and get working, and see by observing it in production whether you have a capacity problem. Odds are that you will never need to progress beyond this model.
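A rough sketch of model 2 (one thread per connection) might look like the following. The function names are invented for the example, the handler just echoes data back as a stand-in for real work, and a real server would also want to ignore SIGPIPE and limit the number of threads.

```c
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Per-client thread: echoes bytes back until the peer closes. */
static void *client_thread(void *arg)
{
    int fd = (int)(intptr_t)arg;
    char buf[512];
    ssize_t n;

    while ((n = read(fd, buf, sizeof buf)) > 0) {
        ssize_t off = 0;
        while (off < n) {                   /* write() may be partial */
            ssize_t w = write(fd, buf + off, (size_t)(n - off));
            if (w < 0)
                goto out;
            off += w;
        }
    }
out:
    close(fd);
    return NULL;
}

/* Starts a detached thread that owns `fd`. Returns 0 on success. */
int spawn_client_thread(int fd)
{
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, client_thread, (void *)(intptr_t)fd);
    if (rc == 0)
        pthread_detach(tid);
    return rc;
}

/* Accept loop: one thread per accepted connection. */
void serve_clients(int listener)
{
    for (;;) {
        int fd = accept(listener, NULL, NULL);
        if (fd < 0) {
            if (errno == EINTR)
                continue;                   /* interrupted, try again */
            return;                         /* real code would log and decide */
        }
        if (spawn_client_thread(fd) != 0)
            close(fd);
    }
}
```

Because each thread blocks only on its own socket, the per-client code stays as simple as a single-client server, which is the main attraction of this model.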

I would suggest reading the documentation and code of libev, which is state-of-the-art.

Related

Compute data from multiple clients simultaneously

I'm trying to write a server able to handle multiple (more than a thousand) client connections concurrently in C language. Every connection is meant to accomplish three things:
Send data to the server
The server processes the data
The server returns data to the client
I am using non-blocking sockets and epoll() for handling all the connections, but my problem is right in the moment after the server receives the data from one client and has to call a function which spends several seconds in processing the data before it returns the result that has to be sent back to the client before closing the connection.
My question is, what paradigm can I use in order to be able to keep handling more connections while the data of one client "is cooking"?
I've been researching the possibility of creating a thread or a process every time I need to call the computing function, but I'm not sure whether this is feasible given the number of possible concurrent connections. That's why I came here, hoping that someone more experienced than me in the matter could shed some light on my ignorance.
Code snippet:
while (1)
{
ssize_t count;
char buf[512];
count = read (events[i].data.fd, buf, sizeof buf); // read the data
if (count == -1)
{
/* If errno == EAGAIN, that means we have read all
data. So go back to the main loop. */
if (errno != EAGAIN)
{
perror ("read");
done = 1;
}
/* Here is where I should call the processing function before
exiting the loop and closing the actual connection */
answer = proc_function(buf);
count = write (events[i].data.fd, answer, sizeof answer); // send the answer to the client
break;
}
...
Thanks in advance.
It seems sensible to multi-thread or multi-process to some degree to accomplish this. The degree to which you multi-thread or multi-process is the question.
1) You could dump the polling system entirely and use a thread/process per connection. That thread can then stall as long as it wants working on the processing for that connection. You'd then have to decide on creating/killing a thread/process each time (probably easiest) or having a pool of threads/processes (probably fastest).
2) You could have a thread/process for the networky bits and hand off the processing to one other thread. This is less parallel, but it does mean you can at least keep handling network connections whilst you're chopping through the list of work. This gives you control of what processing is being handled at least. It would be easy to prioritise incoming connections this way, whereas option 1 might not.
3) (sort of a combination of 1 & 2) You could use asynchronous I/O to multiplex your connections. You still need to handle the processing in the same way as 1 & 2 above.
You also have the question of threads vs processes. Threads are probably quicker to get going but it's more difficult to ensure data integrity. Processes are going to be more resilient but require more interfacing between them.
You also have to decide on a way to pass data between the threads/processes. This is less of an issue for option 1 as you only have to pass off the connection to the thread. Option 2 may (depending on what your data is) be more of a problem. You could use a message queue for passing the messages about but if you have a lot of data to send shared memory is more appropriate. Shared memory is a pain to engineer for processes but easy with threads (as all threads share the same memory space).
There are performance issues as you get to this scale too. It's worth investigating performance characteristics for these things. The differences in how calls like select and poll scale are significant when you're dealing with a lot of connections.
Without knowledge of what data is being sent and received it's hard to give solid recommendations.
Incidentally, this isn't a new problem. Dan Kegel had a good article about it a few years back. It's now out-of-date, but the overview is still good. You should research the current state of the art for the concepts he discusses though.
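To make the hand-off idea concrete, here is a minimal sketch of a fixed pool of worker threads fed from a queue, so the thread running epoll() never blocks on the slow computation. All names are invented for the example, and slow_compute() is only a stand-in for the question's proc_function().

```c
#include <pthread.h>
#include <stdlib.h>

struct job {
    int client_fd;              /* connection the result belongs to */
    int input;                  /* stand-in for the received data */
    int result;                 /* filled in by a worker */
    struct job *next;
};

struct pool {
    pthread_mutex_t lock;
    pthread_cond_t has_work;
    pthread_cond_t job_done;
    struct job *head, *tail;    /* pending jobs, FIFO */
    int pending;                /* queued or in-progress jobs */
};

/* Hypothetical stand-in for the slow processing function. */
static int slow_compute(int x) { return x * x; }

static void *worker(void *arg)
{
    struct pool *p = arg;
    for (;;) {
        pthread_mutex_lock(&p->lock);
        while (p->head == NULL)
            pthread_cond_wait(&p->has_work, &p->lock);
        struct job *j = p->head;
        p->head = j->next;
        if (p->head == NULL)
            p->tail = NULL;
        pthread_mutex_unlock(&p->lock);

        j->result = slow_compute(j->input);   /* runs without the lock */
        /* A real server would now send the result to j->client_fd,
         * or pass it back to the epoll thread. */

        pthread_mutex_lock(&p->lock);
        p->pending--;
        pthread_cond_broadcast(&p->job_done);
        pthread_mutex_unlock(&p->lock);
    }
    return NULL;
}

int pool_start(struct pool *p, int nthreads)
{
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->has_work, NULL);
    pthread_cond_init(&p->job_done, NULL);
    p->head = p->tail = NULL;
    p->pending = 0;
    for (int i = 0; i < nthreads; i++) {
        pthread_t tid;
        if (pthread_create(&tid, NULL, worker, p) != 0)
            return -1;
        pthread_detach(tid);
    }
    return 0;
}

/* Called by the network thread once a request has been read completely. */
void pool_submit(struct pool *p, struct job *j)
{
    j->next = NULL;
    pthread_mutex_lock(&p->lock);
    if (p->tail)
        p->tail->next = j;
    else
        p->head = j;
    p->tail = j;
    p->pending++;
    pthread_cond_signal(&p->has_work);
    pthread_mutex_unlock(&p->lock);
}

/* Blocks until every submitted job has finished. */
void pool_wait_idle(struct pool *p)
{
    pthread_mutex_lock(&p->lock);
    while (p->pending > 0)
        pthread_cond_wait(&p->job_done, &p->lock);
    pthread_mutex_unlock(&p->lock);
}
```

Sizing the pool near the number of CPU cores is a common starting point for compute-bound work; more than a thousand connections do not need more than a thousand threads.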

How to combine TCP and UDP in a single server?

I am totally new to socket programming and I want to program a combined TCP/UDP-Server socket in C but I don't know how to combine those two.
So at the moment, I do know how TCP- and UDP-Server/-Clients work and I have already coded the Clients for TCP and UDP. I also know that I have to use the select()-function somehow, but I don't know how to do it.
I have to read two numbers, which are sent to the TCP-/UDP-Server with either TCP- or UDP-Clients and then do some calculations with these numbers and then print the result on the server.
Does anyone know a tutorial for that or an example code or can help me with that?
Or at least a good explanation of the select() function.
Basically, use an event loop. It works like this:
1. Is there anything I need to do now? If so, do it.
2. Compute how long until I next need to do something (this becomes the select() timeout).
3. Call select(), specifying all sockets I'm willing to read from in the read set and all sockets I'm trying to write to in the write set.
4. If we discovered any sockets that are ready for reading, read from them.
5. If we discovered any sockets that are ready for writing, try to write to them. If we wrote everything we need to write, remove them from the write set.
6. Go to step 1.
Generally, to write to a socket, you follow this logic:
1. Am I already trying to write to this socket? If so, just add this to the queue and we're done.
2. Try to write the data to the socket. If we sent it all, we're done.
3. Save the leftover in the queue and add this socket to our write set.
Three things to keep in mind:
1. You must set all sockets non-blocking.
2. Make sure to copy your file descriptor sets before you pass them to select() because select() modifies them.
3. For TCP connections, you will probably need your own write queue.
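The write logic described above could be sketched roughly like this. The struct and function names are made up for the example, and the queue is a plain growable byte buffer, which is simple but not the most efficient choice.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Bytes accepted from the application but not yet written to the socket. */
struct out_queue {
    char *data;
    size_t len;
};

/* Puts fd into non-blocking mode. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Appends n bytes to the queue. Returns 0 on success, -1 if out of memory. */
static int queue_append(struct out_queue *q, const char *buf, size_t n)
{
    char *p = realloc(q->data, q->len + n);
    if (p == NULL)
        return -1;
    memcpy(p + q->len, buf, n);
    q->data = p;
    q->len += n;
    return 0;
}

/* Call when select() reports fd writable. Returns 1 if the queue is now
 * empty (remove fd from the write set), 0 if data remains, -1 on error. */
int queue_flush(int fd, struct out_queue *q)
{
    while (q->len > 0) {
        ssize_t w = write(fd, q->data, q->len);
        if (w < 0) {
            if (errno == EINTR)
                continue;
            if (errno == EAGAIN || errno == EWOULDBLOCK)
                return 0;
            return -1;
        }
        memmove(q->data, q->data + w, q->len - (size_t)w);
        q->len -= (size_t)w;
    }
    return 1;
}

/* Sends n bytes on a non-blocking fd. Returns 1 if everything was written,
 * 0 if some of it was queued (add fd to the write set), -1 on error. */
int queue_send(int fd, struct out_queue *q, const void *buf, size_t n)
{
    const char *p = buf;
    if (q->len == 0) {              /* nothing queued: try the socket first */
        while (n > 0) {
            ssize_t w = write(fd, p, n);
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                if (errno == EAGAIN || errno == EWOULDBLOCK)
                    break;
                return -1;
            }
            p += w;
            n -= (size_t)w;
        }
        if (n == 0)
            return 1;
    }
    return queue_append(q, p, n) == 0 ? 0 : -1;
}
```

Checking q->len first preserves ordering: new data must never overtake bytes that are still waiting in the queue.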
The idea is to mix inside your server a TCP part and a UDP part.
Then you multiplex the inputs. You could use the old select(2) multiplexing call, but it has limitations (google for the C10K problem). Using the poll(2) multiplexing call is preferable.
You may want to use an event loop library, like libev (which uses select or poll or some fancier mechanism like epoll). BTW, graphical toolkits (e.g. GTK or Qt) also provide their own event loop machinery.
Read a good Linux programming book like Advanced Linux Programming (available online), which has good chapters about multiplexing syscalls and event loops. These are too complex to be explained well in a few minutes in such an answer. Books explain them better.
1) Simply write the TCP/UDP server code so that, when a message is received, it just prints it out.
2) Replace the print code with a process_message() function.
Then you have successfully combined the TCP and UDP servers in the same program.
Be careful with your handler: it should cope with parallel execution.
You may try stream_route_handler, a C/C++ library that lets you add TCP/UDP handlers to a single C/C++ application. It has been used for heavy-traffic transport routing and for logging services.
Example usage:
void read_data(srh_request_t *req);
void read_data(srh_request_t *req) {
char *a = "CAUSE ERROR FREE INVALID";
if (strncmp( (char*)req->in_buff->start, "ERROR", 5) == 0) {
free(a);
}
// printf("%d, %.*s\n", i++, (int) (req->in_buff->end - req->in_buff->start), req->in_buff->start);
srh_write_output_buffer_l(req, req->in_buff->start, (req->in_buff->end - req->in_buff->start));
// printf("%d, %.*s\n", i++, (int) (req->out_buff->end - req->out_buff->start), req->out_buff->start);
}
int main(void) {
srh_instance_t * instance = srh_create_routing_instance(24, NULL, NULL);
srh_add_udp_fd(instance, 12345, read_data, 1024);
srh_add_tcp_fd(instance, 3232, read_data, 64);
srh_start(instance);
return 0;
}
If you are writing C++, you may like this sample code: stream route with spdlog.

Implementing: udp receive queue dropping packets

How can I implement following scenario?
I want my FreeBSD kernel to drop UDP packets on high load.
I can set sysctl net.inet.udp.recvspace to very low number to drop the packet. But how do I implement such an application?
I assume I would need some kind of client/server application.
Any pointers are appreciated.
p.s. This is not a homework. And I am not looking for exact code. I am just looking for ideas.
It will do that automatically. You don't have to do anything about it at all, let alone fiddle with kernel parameters.
Most people posting about UDP are looking for ways to stop UDP from dropping packets!
Use the (SOL_SOCKET, SO_RCVBUF) socket option via setsockopt() to change the size of your socket buffer.
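A minimal sketch of that call, with a made-up helper name. The kernel may round or adjust the requested size, so the helper reads the value back with getsockopt() instead of assuming it was applied as given.

```c
#include <sys/socket.h>

/* Requests a receive buffer of `bytes` on fd and returns the size the
 * kernel actually applied, or -1 on error. */
int set_rcvbuf(int fd, int bytes)
{
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof bytes) < 0)
        return -1;

    int actual = 0;
    socklen_t len = sizeof actual;
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &actual, &len) < 0)
        return -1;
    return actual;
}
```

A small buffer makes the kernel drop datagrams sooner when the application reads too slowly; a large one delays the drops.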
Either tweak the sending app to 'drop' the occasional packet or, if that is not possible, route the UDP messages through a proxy that does the same thing.
What I would do is the following. I don't know if you need a kernel module or a program.
Suppose you have a function that is called when you receive a UDP datagram, and there you can choose what to do: drop it or process it. The process function can trigger several threads.
FOREVER:
    DATAGRAM := DEQUEUE()
    IF (HIGHLOAD > LIMIT)
        SEND(HIGH_LOAD_TO(DATAGRAM.SOURCE))
        CONTINUE // start again from the beginning
    HIGHLOAD := HIGHLOAD + 1
    PROCESS(DATAGRAM)

PROCESS(DATAGRAM):
    ...PROCESS DATAGRAM...
    HIGHLOAD := HIGHLOAD - 1
You can tweak this however you want, but that is the idea. When you start processing a packet, you increment the counter, and when processing is finished, you decrement it. So you can basically choose how many packets you are processing at any one time.
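Since processing may happen in several threads, the counter has to be updated atomically. One possible sketch in C11 follows; the limit and the function names are arbitrary choices for the example, and the test here is "below the limit" where the pseudocode uses "above the limit".

```c
#include <stdatomic.h>
#include <stdbool.h>

#define LOAD_LIMIT 2            /* hypothetical limit, tune as needed */

static atomic_int in_flight;    /* datagrams currently being processed */

/* Returns true and counts the datagram if we are below the limit.
 * Returns false if the caller should drop it instead. */
bool begin_processing(void)
{
    int cur = atomic_load(&in_flight);
    while (cur < LOAD_LIMIT) {
        if (atomic_compare_exchange_weak(&in_flight, &cur, cur + 1))
            return true;
        /* cur was refreshed by the failed exchange, so just retry */
    }
    return false;
}

/* Call once processing of a datagram has finished. */
void end_processing(void)
{
    atomic_fetch_sub(&in_flight, 1);
}
```

The compare-and-exchange loop keeps the check and the increment in one atomic step, so two threads cannot both slip past the limit.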

Reading from the serial port in a multi-threaded program on Linux

I'm writing a program in linux to interface, through serial, with a piece of hardware. The device sends packets of approximately 30-40 bytes at about 10Hz. This software module will interface with others and communicate via IPC so it must perform a specific IPC sleep to allow it to receive messages that it's subscribed to when it isn't doing anything useful.
Currently my code looks something like:
while(1){
IPC_sleep(some_time);
read_serial();
process_serial_data();
}
The problem with this is that sometimes the read will be performed while only a fraction of the next packet is available at the serial port, which means that it isn't all read until next time around the loop. For the specific application it is preferable that the data is read as soon as it's available, and that the program doesn't block while reading.
What's the best solution to this problem?
The best solution is not to sleep! What I mean is that a good solution is probably to mix the IPC event and the serial event. select() is a good tool for this. Then you have to find an IPC mechanism that is select()-compatible:
socket-based IPC is select()-able
pipe-based IPC is select()-able
POSIX message queues are also select()-able
And then your loop looks like this
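while (1) {
    fd_set rfds;
    int maxfd = serial_fd > ipc_fd ? serial_fd : ipc_fd;

    /* select() modifies the set, so rebuild it on every iteration */
    FD_ZERO(&rfds);
    FD_SET(serial_fd, &rfds);
    FD_SET(ipc_fd, &rfds);
    if (select(maxfd + 1, &rfds, NULL, NULL, NULL) < 0)
        continue; /* e.g. interrupted by a signal; check errno in real code */

    if (FD_ISSET(serial_fd, &rfds)) {
        parse_serial(serial_fd, serial_context);
        if (complete_serial_message)
            process_serial_data(serial_context);
    }
    if (FD_ISSET(ipc_fd, &rfds)) {
        do_ipc();
    }
}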
read_serial is replaced with parse_serial, because if you spend all your time waiting for a complete serial packet, then all the benefit of select() is lost. But from your question, it seems you are already doing that, since you mention getting serial data in two different loop iterations.
With the proposed architecture you have good reactivity on both the IPC and the serial side. You read serial data as soon as it is available, but without stopping IPC processing.
Of course this assumes you can change the IPC mechanism. If you can't, perhaps you can make a "bridge process" that interfaces on one side with whatever IPC you are stuck with, and on the other side uses a select()-able IPC to communicate with your serial code.
Store away what you got so far of the message in a buffer of some sort.
If you don't want to block while waiting for new data, use something like select() on the serial port to check that more data is available. If not, you can continue doing some processing or whatever needs to be done instead of blocking until there is data to fetch.
When the rest of the data arrives, add to the buffer and check if there is enough to comprise a complete message. If there is, process it and remove it from the buffer.
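A small sketch of such a buffer is shown below. It assumes, purely for illustration, that each packet ends with a newline; the real device's framing (a length field, a checksum, a start byte) would replace that test. The names are invented for the example.

```c
#include <stddef.h>
#include <string.h>

#define RX_CAP 256              /* comfortably larger than one packet */

struct rx_buf {
    char data[RX_CAP];
    size_t len;                 /* bytes currently buffered */
};

/* Appends freshly read bytes. Returns how many were stored, which is
 * less than n only if the buffer is full. */
size_t rx_append(struct rx_buf *b, const char *src, size_t n)
{
    size_t room = RX_CAP - b->len;
    if (n > room)
        n = room;
    memcpy(b->data + b->len, src, n);
    b->len += n;
    return n;
}

/* If a complete '\n'-terminated packet is buffered, copies it (without
 * the terminator, truncated to cap) into out, removes it from the
 * buffer and returns the number of bytes copied. Returns -1 if no
 * complete packet has arrived yet. */
long rx_take_packet(struct rx_buf *b, char *out, size_t cap)
{
    char *nl = memchr(b->data, '\n', b->len);
    if (nl == NULL)
        return -1;

    size_t plen = (size_t)(nl - b->data);
    size_t ncopy = plen < cap ? plen : cap;
    memcpy(out, b->data, ncopy);

    b->len -= plen + 1;                     /* drop packet and terminator */
    memmove(b->data, nl + 1, b->len);       /* keep the start of the next one */
    return (long)ncopy;
}
```

After every read() on the serial port, call rx_append() and then rx_take_packet() in a loop until it returns -1, since one read can complete more than one packet.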
You must cache enough of a message to know whether or not it is a complete message or if you will have a complete valid message.
If it is not valid or won't be in an acceptable timeframe, then you toss it. Otherwise, you keep it and process it.
This is typically called implementing a parser for the device's protocol.
This is the algorithm (blocking) that is needed:
while(! complete_packet(p) && time_taken < timeout)
{
p += reading_device.read(); //only blocks for t << 1sec.
time_taken.update();
}
//now you have a complete packet or a timeout.
You can intersperse a callback if you like, or inject relevant portions in your processing loops.

C Socket Programming, problems with select() and fd_set

I'm learning my way around socket programming in C (referring to Beej).
Here is a simple multi-user chat server I'm trying to implement:
http://pastebin.com/gDzd0WqP
At runtime it gives a Bus Error, coming from lines 68-78.
Can you help me trace the source of the problem?
In fact, WHY is my code even REACHING that particular region? I've just run the server; no clients have connected.
PS: I know my code is highly unreliable (no error checks anywhere), but I WILL add them at a later stage. I just want to TEST the functionality of the code before implementing it in all its glory ;)
Line 81,
msg[MSG_SIZE] = '\0';
overruns your buffer. Make it
msg[MSG_SIZE - 1] = '\0';
You also need to check the return value of all the calls that can fail; that's lines 39, 42, 45, 68 and 80.
Edit: If you'd checked for errors, you'd likely have seen the accept() call fail, probably because the socket is not in listen mode. That is, you're missing a call to listen().
Another thing to consider is that you can't necessarily copy fd_set variables by simple assignment. The only portable way to handle them is to regenerate the fd_set from scratch by looping over a list of active file descriptors each time.
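Regenerating the set each time is only a few lines. A sketch, with an invented helper name, assuming the active descriptors are kept in a plain int array:

```c
#include <sys/select.h>

/* Rebuilds *set from the nfds descriptors in fds. Returns the highest
 * descriptor added (pass this value plus one to select()), or -1 if
 * the list contained no usable descriptor. */
int build_read_set(const int *fds, int nfds, fd_set *set)
{
    int maxfd = -1;

    FD_ZERO(set);
    for (int i = 0; i < nfds; i++) {
        /* skip empty slots and descriptors select() cannot represent */
        if (fds[i] < 0 || fds[i] >= FD_SETSIZE)
            continue;
        FD_SET(fds[i], set);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    return maxfd;
}
```

Call it at the top of every loop iteration, right before select(), so the set always reflects the current list of clients.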

Resources