I have a tcp chat program: server.c and client.c.
The server is in a while(1) loop and uses select to detect clients wanting to connect on its socket. A new thread is then created for the accepted client, and its socket descriptor is passed as the thread argument: pthread_create (&thread,NULL, do_something, (void *) &socket_descriptor);
When the server receives a message from a client, it should send this message to all connected clients (not implemented yet).
Now I'm wondering how to do this. I absolutely need each accepted connection to be in a thread.
I was thinking of using a select inside the do_something as well; will select detect if data is incoming on the socket descriptor? Or would you do it another way?
edit: added code
my code:
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>
#include <string.h>
#include "tcp_comm.h"
#include <sys/time.h>
#include <sys/types.h>
#define BUFSIZE 1024
#define PORT 1234
void *do_something(void *a);
int main (void){
Socket server = tcp_passive_open( PORT );
MySocket *s = (MySocket *)server;
printf("Server socked_id (main): %i", s->sd);
pthread_t thread;
fd_set active_socketDescriptors,read_socketDescriptors;
FD_ZERO(&active_socketDescriptors);
FD_SET(s->sd,&active_socketDescriptors);
while (1){
read_socketDescriptors = active_socketDescriptors;
if (select (FD_SETSIZE, &read_socketDescriptors, NULL, NULL, NULL) < 0){
perror ("select");
exit (EXIT_FAILURE);
}
int i;
for (i = 0; i < FD_SETSIZE; ++i){
if (FD_ISSET (i, &read_socketDescriptors)){
if (i == s->sd){
Socket client = tcp_wait_for_connection( server );
pthread_create (&thread,NULL, do_something, (void *)client);
FD_SET (s->sd, &active_socketDescriptors);
} else {
}
}
}
}
tcp_close( server );
return 0;
}
void *do_something(void *client){
unsigned char input[BUFSIZE];
pthread_detach(pthread_self());
MySocket *s = (MySocket *)client;
printf("Client socked_id (thread): %i", s->sd);
int j;
while (1){
int nbytes = tcp_receive(client, input, BUFSIZE );
if (nbytes <= 0){
if (nbytes ==0){
/* connection closed by client*/
printf("Client closed connection");
} else {
/* other error*/
perror("tcp_receive");
}
tcp_close(&client);
/*remove the socket descriptor from set in the main BRAINSTORM ABOUT THIS */
break; /* leave the receive loop once the connection is closed */
} else {
/*data incoming */
printf("\nMessage from client: %s",input);
}
}
return 0;
}
edit 2: reformulation of problem
I have to use threads, not because of the system (Linux), but because the assignment requires a thread for each client.
The specific problem is that only the main thread can forward the data received from each client to all the other clients, because only the main thread has access to the set that contains the socket descriptors.
edit 3: this is what I need to add in each thread, but I can't, because s.thread and s.main live in different places and the thread does not know the main thread's set.
for (j=0; j<FD_SETSIZE;j++){
if(FD_ISSET(j,&active_socketDescriptors)){
if (j != s.thread && j!=s.main){
tcp_send(j, (void*)input,nbytes);
}
}
}
edit 4: I solved it this way:
I have a dynamic array list in which I keep the connected clients with their socket descriptors. Inside the server thread (do_something), the receive call blocks until it gets input; this input is then sent to all connected clients by looping through the list and using each client's socket descriptor. Inside each client there is one thread listening and one thread sending.
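A minimal sketch of the approach described in edit 4, for illustration only. It assumes plain socket descriptors and send() rather than the tcp_comm.h wrappers, and a fixed-size array instead of a dynamic list; the names client_add, client_remove, broadcast and MAX_CLIENTS are hypothetical. The point is that the list is shared between all client threads, so every access goes through one mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

#define MAX_CLIENTS 64   /* assumed fixed capacity, to keep the sketch short */

static int clients[MAX_CLIENTS];
static int client_count = 0;
static pthread_mutex_t clients_lock = PTHREAD_MUTEX_INITIALIZER;

/* Register a connected descriptor. Returns 0 on success, -1 if the list is full. */
int client_add(int fd)
{
    int rc = -1;
    assert(fd >= 0);
    pthread_mutex_lock(&clients_lock);
    if (client_count < MAX_CLIENTS) {
        clients[client_count++] = fd;
        rc = 0;
    }
    pthread_mutex_unlock(&clients_lock);
    return rc;
}

/* Forget a descriptor, e.g. after its client disconnected. */
void client_remove(int fd)
{
    pthread_mutex_lock(&clients_lock);
    for (int i = 0; i < client_count; i++) {
        if (clients[i] == fd) {
            clients[i] = clients[--client_count]; /* fill the hole with the last entry */
            break;
        }
    }
    pthread_mutex_unlock(&clients_lock);
}

/* Send buf to every registered client except the sender.
 * Returns the number of clients that received the whole message. */
int broadcast(int sender, const void *buf, size_t len)
{
    int delivered = 0;
    pthread_mutex_lock(&clients_lock);
    for (int i = 0; i < client_count; i++) {
        if (clients[i] == sender)
            continue;
        /* MSG_NOSIGNAL: get an error instead of SIGPIPE if the peer is gone */
        if (send(clients[i], buf, len, MSG_NOSIGNAL) == (ssize_t)len)
            delivered++;
    }
    pthread_mutex_unlock(&clients_lock);
    return delivered;
}
```

Each client thread would call client_add() once after accept, broadcast() after every successful receive, and client_remove() before closing its socket. Holding the lock while sending keeps the sketch simple; a slow client will stall the others, which may or may not matter for an assignment.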
If the client connection sockets are non-blocking, then using e.g. select to wait for the socket to receive data is a possible way. However, since you already have the connected sockets in threads, you can keep them blocking and just do a read call on them. The call to read will block until you receive data, which can then be passed on to the other threads.
Edit
After better understanding your requirements, you should probably make the sockets non-blocking and use a loop around select with a short timeout. When select times out (i.e. returns 0), you check whether there is data to send. If there is, send the data and go back to the select call.
Given your description it might be worth rethinking the architecture of your application. (Unless this has been dictated by limitations on your system). Let me explain this a little more...
By your description, if I understood you correctly, after a client has connected to the server, any messages it (the client) sends will be relayed (by the server) to all other clients. So, rather than creating a new thread, why not simply add the newly connected socket to the fd_set passed to select? Then, when a message comes in, you can simply relay it to the others.
If you expect a large number of clients for a single server you should see if the poll system call is available on your system (it's just like select but supports monitoring more clients). A good poll/select version ought to out-perform your threaded version.
If you really want to continue with your threaded version here's one way to accomplish what you are trying to do. When you create the thread for each accepted client you also create a pipe back to the server thread (and you add this to the server select/poll set.) and pass that to the client thread. So your server thread now not only receives new connections but relays the messages too.
Although you said that you absolutely must deal with each client in a separate thread, unless you are using a real time operating system, you will probably find that the thread context-switch/synchronization you need to do will soon dominate over the multiplexing overhead of the first solution I suggested. (But since you did not mention an OS I am guessing!)
This depends on your design.
If you only need to provide one or two features for each connected client, I suggest implementing the server with a single thread.
If you have to provide many features for each connected client, a multithreaded design is fine.
However, the real question is how to pass data from the receiving thread to all the others. I suggest either:
a) Use message queues to pass data between threads: each thread has one message queue and listens to both its own socket and this queue. When a thread receives data from its socket, it sends the data to all the other message queues.
b) Use a single global buffer: when data arrives from a socket, put it into the global buffer and tag it to indicate where it came from.
My 2 cents.
Related
I created a multithreaded C TCP server. It seems to work: as a client I type a message, the message is sent to the server, and the server prints what the client sent in a thread (and sends back the client id).
Does it follow the "best practices" for a multithreaded TCP server in C?
Maybe I should use a semaphore to access / use the client_counter variable ?
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h> // disable close() warning
#include <sys/socket.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <pthread.h>
#define MAX_CONNECTIONS 5
static int client_counter = 0;
void* serverWorker(void* context)
{
char client_response[256];
int sock = *(int*)context;
char message[256] = "\n Hello dear client, you are the client number \n";
char numero[12];
sprintf(numero, "%d", client_counter); // SHOULD I USE A SEMAPHORE HERE FOR client_counter ?
while(1)
{
memset(client_response, 0, sizeof(client_response)); // clean string
recv(sock, &client_response, sizeof(client_response), 0);
printf("client number %s sent: '%s' \n", numero, client_response);
if (send(sock, numero , strlen(numero) , 0) < 0)
{
printf("ERROR while sending response to client from worker \n");
}
}
return NULL;
}
int main()
{
printf("Waiting for incoming connections ...\n");
// socket creation
int server_socket;
server_socket = socket(AF_INET, SOCK_STREAM, 0);
// dserver address
struct sockaddr_in server_address;
server_address.sin_family = AF_INET;
server_address.sin_port = htons(9002);
server_address.sin_addr.s_addr = INADDR_ANY;
// bind the socket to IP and port
bind(server_socket, (struct sockaddr*) &server_address, sizeof(server_address));
listen(server_socket, MAX_CONNECTIONS);
int client_socket;
while((client_socket = accept(server_socket, NULL ,NULL)))
{
client_counter++;
pthread_t thread_id;
pthread_create(&thread_id, NULL, serverWorker, (void*)&client_socket);
printf("new client ! \n");
}
close(server_socket);
return 0;
}
There are several problems in your code. You create a thread for each incoming connection and pass every thread a pointer to the same variable, the one in which you store the socket descriptor. All threads therefore share a single variable for every descriptor that accept() returns from the listening socket. You may think: I make a copy as soon as the thread starts, so this cannot happen. But consider two connections that arrive almost simultaneously, so that main() processes both before either thread runs. When the first and second threads are finally scheduled, both read the same descriptor (the second one), and the first connection is leaked.
Another thing is that this variable is local to main(), so it ceases to exist as soon as main() returns (which is not the end of the program if the threads are to survive past main()'s return), and a thread that dereferences the pointer afterwards could get a SIGSEGV. In practice you are in an endless loop: you probably don't know it, but about the only ways for server_socket to report an error are closing it (close()) from a thread or losing the interface it is attached to.
You can pass an int value cast to (void *) without problems, since the thread function converts it back to an int before use; the round trip is effectively a no-op because pointer types are normally at least as wide as int. Strictly speaking the result of these conversions is implementation-defined, but it will probably work (legacy software is full of such conversions, so compilers generally try to support them). The right way to do this is to declare a struct holding the information passed to the thread on start and returned from it. You can store whatever you want in it, but since the number of threads is dynamic, the structs must be dynamically allocated.
As for the client_counter variable, the only thread that modifies it is the one running main(). That poses no problem beyond the risk described above: after two updates in quick succession, both threads may read the value as it stands after main() has made both updates.
Another issue is that you need to declare it volatile; otherwise the compiler may assume that nothing else changes it between accesses in the thread code and keep it cached in a register.
The data exchanged between main() and the threads can be handled in two ways. This is why thread routines take a void * on input and return a void *:
The first uses a dynamically allocated struct (malloc()ed) that is passed from main() to the thread and handed back on termination, when main() joins the thread. This lets main() collect result information from the thread. The struct serves as a message between the thread and main() in both directions, and you can store in it anything you need to pass in or return. Once the thread has finished, free() the structure in main() (don't free it in the thread, as it has to outlive the thread).
The second involves no further communication with main(), and each thread must deallocate its own structure once it is finished with it. This is simpler and better suited to your example. In this way, you can destroy the struct in the thread, or in main(), but in main() only if you have already joined the thread and are sure the struct is not going to be used by it.
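To illustrate the first of the two ways, here is a minimal sketch. The struct client_ctx, its fields, and the helper spawn_worker are hypothetical names chosen for the example, and the worker body is a placeholder where the real recv()/send() loop would go. Each thread gets its own heap-allocated struct, so no two threads ever look at the same variable.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-client context: one heap allocation per thread. */
struct client_ctx {
    int sock;       /* connected socket descriptor for this client */
    int client_id;  /* snapshot of the counter taken at accept time */
    int status;     /* result filled in by the worker before it returns */
};

/* Thread body: receives its own private struct and hands it back on exit. */
static void *worker(void *arg)
{
    struct client_ctx *ctx = arg;
    /* A real server would recv()/send() on ctx->sock here. */
    ctx->status = ctx->client_id * 10;  /* placeholder result for the sketch */
    return ctx;                         /* collected by pthread_join() */
}

/* Allocate a context and start a worker for it.
 * Returns 0 on success, -1 on failure (nothing is leaked). */
int spawn_worker(pthread_t *tid, int sock, int client_id)
{
    struct client_ctx *ctx;

    assert(tid != NULL);
    ctx = malloc(sizeof *ctx);
    if (ctx == NULL)
        return -1;
    ctx->sock = sock;
    ctx->client_id = client_id;
    ctx->status = 0;
    if (pthread_create(tid, NULL, worker, ctx) != 0) {
        free(ctx);
        return -1;
    }
    return 0;
}
```

The code that joins the thread receives the struct pointer through pthread_join() and is responsible for free()ing it. For the second way (detached threads), the worker would instead call free(ctx) itself just before returning NULL.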
One common mistake is that you do not examine the return values of the send and recv calls. These calls may send or receive less than the entire buffer, and such cases must be handled, as well as disconnects. That will also remove the need to use memset and strlen on received data.
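As a rough illustration of handling short writes, a helper along these lines is common. The name send_all is an assumption made for the example, not something from the question's code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Keep calling send() until all len bytes have been handed to the kernel.
 * Returns 0 on success, -1 on error (errno is left as set by send()). */
int send_all(int sock, const void *buf, size_t len)
{
    const char *p = buf;

    assert(buf != NULL || len == 0);
    while (len > 0) {
        ssize_t n = send(sock, p, len, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;          /* real error, e.g. the peer disconnected */
        }
        p += n;                 /* skip the part that was accepted */
        len -= (size_t)n;
    }
    return 0;
}
```

On the receiving side the same idea applies: use the value returned by recv() as the message length (0 means the peer closed the connection, a negative value means an error) instead of relying on memset and strlen.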
Generally, dedicating a thread to each client is considered non-scalable. You may like to read the famous The C10K problem for a good treatment of I/O strategies for handling many clients. The article is old but the advice is timeless.
I am working on sending traffic (such as UDP/TCP packets) from one machine to another. I am writing a C application which has 1 thread for each traffic type. I want these two threads to concurrently send packets.
Do I need to use any synchronization primitives such as a mutex lock within the sendMsg function since it is being called from each thread?
sockaddr_in dest;
void * udp(void * arg){
struct * info = arg;
int fd = socket(AF_INET, SOCK_DGRAM, 0);
//set up socket info
while(1){
sendMsg(udpInfo, fd);
}
}
void * tcp(void * arg){
struct * info = arg;
int fd = socket(AF_INET, SOCK_STREAM, 0);
// set up socket info
while(1){
sendMsg(tcpInfo, fd);
}
}
void sendMsg(struct * info, int fd){
sendto(fd, "hello", strlen("hello") + 1, 0, (struct sockaddr*)&dest, sizeof(dest));
}
You seem to be a little unclear about how a mutex works. A mutex does not protect a piece of code, but a piece of data used within the code. Here the function is called by both threads, but no data is shared between them. So, as @Sami Kuhmonen said in a comment above, you don't need mutexes here.
You might need a mutex in the future if, say, a third thread pushed data into a queue from which your thread then took items to send to the connected computer. You would then need to synchronise how these threads push data into and pop data from the queue.
If you read the manpage for sendmsg you might see the following sentence:
If the message is too long to pass atomically through the underlying protocol, the error EMSGSIZE is returned, and the message is not transmitted.
Atomically means that the packet is always sent as one blob, with no other data able to insert itself in the middle. So no matter how many threads use sendmsg, the kernel will not mix packets.
So, I have this client/server application, where the server has a producer/consumer architecture. I have 2 functions that handle writing to and reading from the socket. The main thread of the server (the Producer) handles connections and passes socket descriptors via a Stack to the second thread, the Consumer, for processing.
The problem is, whenever I try to write() or read() the socket from a different-than-main thread, it always returns -1 and causes a Connection reset by peer error on the client and a Transport endpoint is not connected error on the server. Surprisingly, it works perfectly when the socket is read/written from the main thread.
Why does this happen? Is this official behaviour? How do I go about replying to the client with the Consumer thread? I don't believe it's because of the code I wrote, since the Consumer thread only calls the read/write-to-socket functions.
If you have any suspicion on which part could be a culprit, ask me to post some of the code.
EDIT:
typedef struct s_stack {
int * c_stack;
int base;
int top;
unsigned char is_full;
unsigned char is_empty;
int max_size;
} s_stack_t;
s_stack_t stack;
void * producer_routine(void * arguments) {
/* socket(), bind(), listen(), etc.,
socket fd on "socket_fd",
new connection fd on "new_fd" */
for(;;) {
new_fd = accept(socket_fd, (struct sockaddr *)&client_addr, &clen);
pthread_mutex_lock(&mutex);
while (stack.is_full) {
pthread_cond_wait(&stack_not_full, &mutex);
}
if (!stack.is_full){
push(&stack, new_fd);
pthread_cond_signal(&stack_not_empty);
}
pthread_mutex_unlock(&mutex);
}
close(new_fd);
}
void * consumer_routine(void *args) {
for(;;) {
int socket_fd;
/* same mutex lock as above, just reversed, pop to socket_fd */
write_a_message_to_client(socket_fd);
}
}
int main() {
stack_init(&stack, 1024); // (s_stack_t * stack, int max_size)
pthread_t tidp, tidc;
int prc = pthread_create(&tidp, NULL, producer_routine, NULL);
int crc = pthread_create(&tidc, NULL, consumer_routine, NULL);
stack_destroy(&stack);
return 0;
}
The client just sends a message, and waits to receive one. If write_a_message_to_client() is called within any of those threads, even with the socket_fd passed as a parameter, I get the same errors. If it's called directly in main, it has no problem.
EDIT #2:
I tested this, and found that my stack implementation does not work on Cygwin. Cygwin adds gibberish after the 3rd element for some reason, so the socket fds were invalid. Also, I was testing this in a Debian 6 VM and the server was crashing after a connection from the client. But I tested it on Arch, Kali and my Uni servers (Debian 7) and it works as it should. A whole lot of trouble for a whole lot of nothing. Thanks Cygwin!
You should not call stack_destroy() until after both threads have completed. I think your entire program is running using a destroyed stack.
Use pthread_join() to wait for the threads to complete before destroying the stack.
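A small self-contained sketch of that ordering follows. It does not use the question's stack; struct shared, fill and run_and_cleanup are made-up names standing in for the shared stack and the producer/consumer routines. The only point is the order: create, join, and only then destroy.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define ITEMS_PER_THREAD 100

/* Stand-in for the shared stack from the question. */
struct shared {
    int *items;
    int count;
    pthread_mutex_t lock;
};

/* Each thread appends ITEMS_PER_THREAD values under the lock. */
static void *fill(void *arg)
{
    struct shared *s = arg;
    for (int i = 0; i < ITEMS_PER_THREAD; i++) {
        pthread_mutex_lock(&s->lock);
        s->items[s->count++] = i;
        pthread_mutex_unlock(&s->lock);
    }
    return NULL;
}

/* Returns the number of items stored by both threads, or -1 on failure. */
int run_and_cleanup(void)
{
    struct shared s;
    pthread_t a, b;
    int total = -1;

    s.count = 0;
    s.items = malloc(2 * ITEMS_PER_THREAD * sizeof *s.items);
    if (s.items == NULL)
        return -1;
    pthread_mutex_init(&s.lock, NULL);

    if (pthread_create(&a, NULL, fill, &s) == 0) {
        if (pthread_create(&b, NULL, fill, &s) == 0) {
            pthread_join(b, NULL);      /* wait for the second thread */
            pthread_join(a, NULL);      /* wait for the first thread */
            total = s.count;
            assert(total <= 2 * ITEMS_PER_THREAD);
        } else {
            pthread_join(a, NULL);      /* still wait before tearing down */
        }
    }

    /* Only now is it safe to destroy what the threads were using. */
    pthread_mutex_destroy(&s.lock);
    free(s.items);
    return total;
}
```

In the question's main(), the equivalent change would be to call pthread_join(tidp, NULL) and pthread_join(tidc, NULL) before stack_destroy(&stack).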
I am working on a project in which I need to read from 80 or more clients, continuously write their output to a file, and then read this new data for another task. My question is: should I use select or multithreading?
I also tried multithreading with read/fgets and write/fputs calls, but since they are blocking calls and only one operation can be performed at a time, it is not feasible. Any ideas are much appreciated.
Update 1: I have tried to implement this using a condition variable. It works, but it writes and reads one at a time. When another client tries to write, it cannot do so until I quit the first thread. I do not understand this. This should work. What mistake am I making?
Update 2: Thanks, all. I managed to implement this model using a mutex and a condition variable.
The updated code is below:
**header file*******
char *mailbox ;
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER ;
pthread_cond_t writer = PTHREAD_COND_INITIALIZER;
int main(int argc,char *argv[])
{
pthread_t t1 , t2;
pthread_attr_t attr;
int fd, sock , *newfd;
struct sockaddr_in cliaddr;
socklen_t clilen;
void *read_file();
void *update_file();
//making a server socket
if((fd=make_server(atoi(argv[1])))==-1)
oops("Unable to make server",1)
//detaching threads
pthread_attr_init(&attr);
pthread_attr_setdetachstate(&attr,PTHREAD_CREATE_DETACHED);
///opening thread for reading
pthread_create(&t2,&attr,read_file,NULL);
while(1)
{
clilen = sizeof(cliaddr);
//accepting request
sock=accept(fd,(struct sockaddr *)&cliaddr,&clilen);
//error comparison against failire of request and INT
if(sock==-1 && errno != EINTR)
oops("accept",2)
else if ( sock ==-1 && errno == EINTR)
oops("Pressed INT",3)
newfd = (int *)malloc(sizeof(int));
*newfd = sock;
//creating thread per request
pthread_create(&t1,&attr,update_file,(void *)newfd);
}
free(newfd);
return 0;
}
void *read_file(void *m)
{
pthread_mutex_lock(&lock);
while(1)
{
printf("Waiting for lock.\n");
pthread_cond_wait(&writer,&lock);
printf("I am reading here.\n");
printf("%s",mailbox);
mailbox = NULL ;
pthread_cond_signal(&writer);
}
}
void *update_file(int *m)
{
int sock = *m;
int fs ;
int nread;
char buffer[BUFSIZ] ;
if((fs=open("database.txt",O_RDWR))==-1)
oops("Unable to open file",4)
while(1)
{
pthread_mutex_lock(&lock);
write(1,"Waiting to get writer lock.\n",29);
if(mailbox != NULL)
pthread_cond_wait(&writer,&lock);
lseek(fs,0,SEEK_END);
printf("Reading from socket.\n");
nread=read(sock,buffer,BUFSIZ);
printf("Writing in file.\n");
write(fs,buffer,nread);
mailbox = buffer ;
pthread_cond_signal(&writer);
pthread_mutex_unlock(&lock);
}
close(fs);
}
I think for the networking portion of things, either thread-per-client or multiplexed single-threaded would work fine.
As for the disk I/O, you are right that disk I/O operations are blocking operations, and if your data throughput is high enough (and/or your hard drive is slow enough), they can slow down your network operations if the disk I/O is done synchronously.
If that is an actual problem for you (and you should measure first to verify that it really is a problem; no point complicating things if you don't need to), the first thing I would try to ameliorate the problem would be to make your file's output-buffer larger by calling setbuffer. With a large enough buffer, it may be possible for the C runtime library to hide any latency caused by disk access.
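A minimal sketch of enlarging the stdio buffer. Note that it uses the standard C function setvbuf rather than the BSD-style setbuffer named above (both adjust the stream's buffer); the helper name open_buffered and the buffer size passed by the caller are assumptions for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Open path for appending with a fully buffered stdio buffer of bufsize bytes.
 * Returns the stream, or NULL on failure. */
FILE *open_buffered(const char *path, size_t bufsize)
{
    FILE *fp;

    assert(path != NULL);
    fp = fopen(path, "a");
    if (fp == NULL)
        return NULL;
    /* Must be called before any other operation on the stream.
     * Passing NULL lets the C library allocate the buffer itself. */
    if (setvbuf(fp, NULL, _IOFBF, bufsize) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}
```

Writes through fputs()/fwrite() then reach the disk only when the buffer fills, on fflush(), or on fclose(), so whether this helps depends on how soon the data must be visible to the reader.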
If larger buffers aren't sufficient, the next thing I'd try is creating one or more threads dedicated to reading and/or writing data. That is, when your network thread wants to save data to disk, rather than calling fputs()/write() directly, it allocates a buffer containing the data it wants written, and passes that buffer to the IO-write thread via a (mutex-protected or lockless) FIFO queue. The I/O thread then pops that buffer out of the queue, writes the data to the disk, and frees the buffer. The I/O thread can afford to be occasionally slow in writing because no other threads are blocked waiting for the writes to complete. Threaded reading from disk is a little more complex, but basically the IO-read thread would fill up one or more buffers of in-memory data for the network thread to drain; and whenever the network thread drained some of the data out of the buffer, it would signal the IO-read thread to refill the buffer up to the top again. That way (ideally) there is always plenty of input-data already present in RAM whenever the network thread needs to send some to a client.
Note that the multithreaded method above is a bit tricky to get right, since it involves inter-thread synchronization and communication; so don't do it unless there isn't any simpler alternative that will suffice.
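For what it's worth, the write side of the scheme described above might look roughly like this. It is a sketch under assumptions: a mutex-protected linked-list FIFO rather than a lockless one, and the names write_queue, wq_init, wq_push, wq_close and wq_writer are invented for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct node {
    struct node *next;
    size_t len;
    char data[];                 /* copy of the bytes to write */
};

struct write_queue {
    struct node *head, *tail;
    int closed;                  /* set once no more data will be pushed */
    pthread_mutex_t lock;
    pthread_cond_t nonempty;
    FILE *out;                   /* destination file, owned by the caller */
};

void wq_init(struct write_queue *q, FILE *out)
{
    assert(q != NULL && out != NULL);
    q->head = q->tail = NULL;
    q->closed = 0;
    q->out = out;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->nonempty, NULL);
}

/* Called by network threads: copy the data and queue it. 0 on success, -1 on failure. */
int wq_push(struct write_queue *q, const void *data, size_t len)
{
    struct node *n = malloc(sizeof *n + len);
    if (n == NULL)
        return -1;
    n->next = NULL;
    n->len = len;
    memcpy(n->data, data, len);
    pthread_mutex_lock(&q->lock);
    if (q->tail != NULL)
        q->tail->next = n;
    else
        q->head = n;
    q->tail = n;
    pthread_cond_signal(&q->nonempty);
    pthread_mutex_unlock(&q->lock);
    return 0;
}

/* Tell the writer thread to exit once the queue has been drained. */
void wq_close(struct write_queue *q)
{
    pthread_mutex_lock(&q->lock);
    q->closed = 1;
    pthread_cond_broadcast(&q->nonempty);
    pthread_mutex_unlock(&q->lock);
}

/* Body of the dedicated I/O thread. */
void *wq_writer(void *arg)
{
    struct write_queue *q = arg;
    for (;;) {
        pthread_mutex_lock(&q->lock);
        while (q->head == NULL && !q->closed)
            pthread_cond_wait(&q->nonempty, &q->lock);
        struct node *n = q->head;
        if (n == NULL) {                     /* closed and fully drained */
            pthread_mutex_unlock(&q->lock);
            break;
        }
        q->head = n->next;
        if (q->head == NULL)
            q->tail = NULL;
        pthread_mutex_unlock(&q->lock);
        fwrite(n->data, 1, n->len, q->out);  /* slow disk write happens outside the lock */
        free(n);
    }
    return NULL;
}
```

A network thread only pays for a malloc and a memcpy per message; the potentially slow fwrite() runs in the writer thread. The queue here is unbounded, so a real server would probably want a size limit to avoid running out of memory when the disk cannot keep up.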
Either select/poll or multithreading is OK if your program solves the problem.
I guess your program will become I/O-bound as the number of clients grows, since you read from and write to disk frequently. Having multiple threads do the I/O would therefore not speed it up, so polling may be a better choice.
You can set a socket that you get from accept to be non-blocking. Then it is easy to use select to find out when there is data, read the number of bytes that are available and process them.
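A minimal sketch of switching a descriptor to non-blocking mode with fcntl(). The helper names set_nonblocking and read_if_ready, and the -2 return code for "no data yet", are assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Add O_NONBLOCK to the descriptor's flags. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags;

    assert(fd >= 0);
    flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Read whatever is available right now.
 * Returns the byte count, 0 on end of file, -1 on error,
 * or -2 when the call would have blocked (no data ready yet). */
ssize_t read_if_ready(int fd, void *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;
    return n;
}
```

You would call set_nonblocking() on each descriptor returned by accept(), then use select to learn which descriptors are readable before calling read_if_ready() on them.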
With (only) 80 clients, I see no reason to expect any significant difference from using threads unless you get very different amounts of data from different clients.
Is it possible to bind and listen to multiple ports in Linux in one application?
For each port that you want to listen to, you:
Create a separate socket with socket.
Bind it to the appropriate port with bind.
Call listen on the socket so that it's set up with a listen queue.
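The three steps above can be sketched as follows. This is an illustrative sketch, not the tested code referred to below: it uses plain int descriptors, IPv4 only, and the helper names listen_on and listen_on_ports are made up for the example.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Create a TCP socket listening on the given port (host byte order)
 * on all IPv4 interfaces. Returns the descriptor, or -1 on error. */
int listen_on(unsigned short port)
{
    struct sockaddr_in addr;
    int yes = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);          /* step 1: socket */

    if (fd == -1)
        return -1;
    /* Allow quick restarts of the server on the same port. */
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) == -1 ||  /* step 2: bind */
        listen(fd, SOMAXCONN) == -1) {                            /* step 3: listen */
        close(fd);
        return -1;
    }
    return fd;
}

/* Open one listening socket per port and store them in fds[].
 * Returns 0 on success; on failure closes what was opened and returns -1. */
int listen_on_ports(const unsigned short ports[], int fds[], unsigned int count)
{
    assert(ports != NULL && fds != NULL);
    for (unsigned int i = 0; i < count; i++) {
        fds[i] = listen_on(ports[i]);
        if (fds[i] == -1) {
            while (i > 0)
                close(fds[--i]);
            return -1;
        }
    }
    return 0;
}
```

The resulting fds[] array is what a function such as network_accept_any would be given, together with its length.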
At that point, your program is listening on multiple sockets. In order to accept connections on those sockets, you need to know which socket a client is connecting to. That's where select comes in. As it happens, I have code that does exactly this sitting around, so here's a complete tested example of waiting for connections on multiple sockets and returning the file descriptor of a connection. The remote address is returned in additional parameters (the buffer must be provided by the caller, just like accept).
(socket_type here is a typedef for int on Linux systems, and INVALID_SOCKET is -1. Those are there because this code has been ported to Windows as well.)
socket_type
network_accept_any(socket_type fds[], unsigned int count,
struct sockaddr *addr, socklen_t *addrlen)
{
fd_set readfds;
socket_type maxfd, fd;
unsigned int i;
int status;
FD_ZERO(&readfds);
maxfd = -1;
for (i = 0; i < count; i++) {
FD_SET(fds[i], &readfds);
if (fds[i] > maxfd)
maxfd = fds[i];
}
status = select(maxfd + 1, &readfds, NULL, NULL, NULL);
if (status < 0)
return INVALID_SOCKET;
fd = INVALID_SOCKET;
for (i = 0; i < count; i++)
if (FD_ISSET(fds[i], &readfds)) {
fd = fds[i];
break;
}
if (fd == INVALID_SOCKET)
return INVALID_SOCKET;
else
return accept(fd, addr, addrlen);
}
This code doesn't tell the caller which port the client connected to, but you could easily add an int * parameter that would get the file descriptor that saw the incoming connection.
You only bind() to a single socket, then listen() and accept() -- the socket for the bind is for the server, the fd from the accept() is for the client. You do your select on the latter looking for any client socket that has data pending on the input.
In such a situation, you may be interested in libevent. It will do the work of the select() for you, probably using a much better interface such as epoll().
The big drawback of select() is that the FD_... macros limit descriptor values to the number of bits in the fd_set type, FD_SETSIZE (commonly 1024 on Linux, and as low as 64 by default on some other systems). If you have a small server with 2 or 3 connections, you'll be fine. If you intend to run a much larger server, the fd_set could easily overflow.
Also, the use of the select() or poll() allows you to avoid threads in the server (i.e. you can poll() your socket and know whether you can accept(), read(), or write() to them.)
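As a rough sketch of the poll() variant, assuming a small fixed upper bound on the number of descriptors: the names poll_readable, read_any and MAX_POLL_FDS are hypothetical. Unlike select(), poll() takes an array of descriptors, so it is not limited by the size of an fd_set.

```c
#include <assert.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_POLL_FDS 64   /* assumed cap, only to keep the array on the stack */

/* Wait up to timeout_ms (-1 means forever) for one of fds[] to become readable.
 * Returns the index of the first readable descriptor, or -1 on timeout or error. */
int poll_readable(const int fds[], unsigned int count, int timeout_ms)
{
    struct pollfd pfds[MAX_POLL_FDS];

    assert(fds != NULL);
    if (count == 0 || count > MAX_POLL_FDS)
        return -1;
    for (unsigned int i = 0; i < count; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;     /* interested in incoming data or connections */
        pfds[i].revents = 0;
    }
    if (poll(pfds, (nfds_t)count, timeout_ms) <= 0)
        return -1;
    for (unsigned int i = 0; i < count; i++)
        if (pfds[i].revents & (POLLIN | POLLHUP))
            return (int)i;
    return -1;
}

/* Poll, then read from whichever descriptor became readable.
 * Stores its index in *which (if not NULL) and returns read()'s result, or -1. */
ssize_t read_any(const int fds[], unsigned int count, int timeout_ms,
                 void *buf, size_t len, int *which)
{
    int idx = poll_readable(fds, count, timeout_ms);
    if (idx < 0)
        return -1;
    if (which != NULL)
        *which = idx;
    return read(fds[idx], buf, len);
}
```

For a listening socket, POLLIN means a connection is waiting, so the caller would call accept() on it instead of read().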
But if you really want to do it Unix like, then you want to consider fork()-ing before you call accept(). In this case you do not absolutely need the select() or poll() (unless you are listening on many IPs/ports and want all children to be capable of answering any incoming connections, but you have drawbacks with those... the kernel may send you another request while you are already handling a request, whereas, with just an accept(), the kernel knows that you are busy if not in the accept() call itself—well, it does not work exactly like that, but as a user, that's the way it works for you.)
With fork(), you prepare the socket in the main process and then call handle_request() in a child process, which calls the accept() function. That way you may have any number of ports and one or more children listening on each. That's the best way to respond very quickly to any incoming connection under Linux (i.e. as a user, and as long as you have child processes waiting for a client, the response is instantaneous.)
void init_server(int port)
{
int server_socket = socket();
bind(server_socket, ...port...);
listen(server_socket);
for(int c = 0; c < 10; ++c)
{
pid_t child_pid = fork();
if(child_pid == 0)
{
// here we are in a child
handle_request(server_socket);
}
}
// WARNING: this loop cannot be here, since it is blocking...
// you will want to wait and see which child died and
// create a new child for the same `server_socket`...
// but this loop should get you started
for(;;)
{
// wait on children death (you'll need to do things with SIGCHLD too)
// and create a new children as they die...
wait(...);
pid_t child_pid = fork();
if(child_pid == 0)
{
handle_request(server_socket);
}
}
}
void handle_request(int server_socket)
{
// here child blocks until a connection arrives on 'server_socket'
int client_socket = accept(server_socket, ...);
...handle the request...
exit(0);
}
int create_servers()
{
init_server(80); // create a connection on port 80
init_server(443); // create a connection on port 443
}
Note that the handle_request() function is shown here as handling one request. The advantage of handling a single request is that you can do it the Unix way: allocate resources as required and once the request is answered, exit(0). The exit(0) will call the necessary close(), free(), etc. for you.
In contrast, if you want to handle multiple requests in a row, you want to make sure that resources get deallocated before you loop back to the accept() call. Also, the sbrk() function is pretty much never going to be called to reduce the memory footprint of your child. This means the footprint will tend to grow a little bit every now and then. This is why a server such as Apache2 is set up to answer a certain number of requests per child before starting a new child (by default it is between 100 and 1,000 these days.)