I have created a thread with pthread_create on my Linux system.
The thread routine listens on an IPC socket and processes each message it receives.
Each message is either 1 or 0, and a conditional check on the message decides which function is called next. So far this works as expected.
In my scenario, '1' and '0' arrive regularly, at varying intervals.
Current Behavior:
Test 1: '1', '0' dis_func.
Test 2: '1', '0' dis_func, '1', '0' dis_func, '1', '0' dis_func, '1', '0' dis_func, and so on.
Expected Behavior:
Test: '1', '0', '1', '0', '1', '0', '1', '0' dis_func.
I want to delay the disable function call by 'n' seconds after the socket receives '0', to see whether a follow-up '1' arrives. If no '1' is received within that time, the disable function should be called. If a '1' is received, the disable call should be postponed until a '0' arrives that is not followed by a '1'.
I tried calling sleep() before disable_func(), but the delays queue up behind each other and the approach fails.
int main() {
    int ret = 0;
    handle = custom_socket_open();
    ret = pthread_create(&Thread, (const pthread_attr_t *)NULL, client_monitor, out);
    if (ret != 0)
        printf("%s: can't create Thread: [%d]", __func__, ret);
    else
        printf("%s: Thread created successfully: %ld", __func__, Thread);
    custom_socket_close();
    return 0;
}
void *client_monitor(void *context) {
    payload_t rx_payload;
    int ret = 0;
    struct data *rx = (struct data *)context;
    while (1) {
        ret = custom_socket(handle, (void *)&rx_payload, &rx_payload_size, 0);
        if (ret != 0)
            printf("failed\n");
        if (rx_payload.enable) {
            printf("received 1 from server socket\n");
            enable_func();
        } else {
            printf("received 0 from server socket\n");
            disable_func();
        }
    }
    return 0;
}
Any suggestion?
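One way to get this behaviour without sleep() is to let the wait for the next message double as the delay: once a '0' has arrived, wait on the socket with a timeout of n seconds; if the timeout expires, disable, and if a '1' arrives first, drop the pending disable. The following is only a minimal sketch. It assumes the socket behind custom_socket() is, or can expose, an ordinary file descriptor usable with poll(), and that each message is a single byte '1' or '0'. The names next_event, enum event and pending_disable are hypothetical.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

enum event { EV_ENABLE, EV_ARMED, EV_DISABLE, EV_CLOSED, EV_ERROR };

/*
 * Wait for the next thing the monitor thread has to act on.
 * *pending_disable is nonzero while a '0' has been seen and no '1' has
 * followed it yet; only in that state is the wait bounded by delay_ms.
 * (If poll() is interrupted by a signal, the wait simply starts over.)
 */
enum event next_event(int fd, int delay_ms, int *pending_disable)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int rc;

    do {
        rc = poll(&pfd, 1, *pending_disable ? delay_ms : -1);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0)
        return EV_ERROR;
    if (rc == 0) {                 /* quiet for delay_ms after a '0' */
        *pending_disable = 0;
        return EV_DISABLE;
    }

    char msg;
    ssize_t n = read(fd, &msg, 1);
    if (n < 0)
        return EV_ERROR;
    if (n == 0)
        return EV_CLOSED;          /* peer closed the connection */

    if (msg == '1') {
        *pending_disable = 0;      /* a follow-up '1' cancels the disable */
        return EV_ENABLE;
    }
    *pending_disable = 1;          /* '0': start (or restart) the delay */
    return EV_ARMED;
}
```

In the thread routine, the loop would call enable_func() on EV_ENABLE and disable_func() on EV_DISABLE, and do nothing on EV_ARMED. With the sequence '1', '0', '1', '0' followed by silence, disable_func() then runs once, n seconds after the last '0'.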
Related
So I'm trying to do the following:
I have two participants (let's call them A and B) communicating via TCP socket (send() and recv()). A is sending a counter and a random Nonce, B is just responding with that same message it gets. A then checks if the response matches the sent packet and if yes, it increments the counter and repeats.
This is a code snippet illustrating what A does at the moment:
send(sock, payload, strlen(payload), 0);
struct timeval t_out;
t_out.tv_sec = 0;
t_out.tv_usec = 5000;
if (setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO, &t_out, sizeof(t_out)) < 0)
{
    // error handling
}
int len = recv(sock, rx_buffer, sizeof(rx_buffer) - 1, 0);
if (len < 0)
{
    print("Timeout reached, recv failed: errno %d", errno);
}
else
{
    rx_buffer[len] = 0;
    if (strncmp(rx_buffer, payload, payload_len) == 0)
    {
        pack_nr++;
    }
}
Now I'm encountering one problem.
Let's say B, for some reason, is slow to respond. This leads to something like the following:
1. A sends something like "1xyz"
2. B has a delay ......
3. A times out and resends something like "1abc"
4. B's first response ("1xyz") reaches A, A decides that this is the wrong payload
5. B's second response ("1abc") reaches A too, but A is only executing one recv() and it's unseen for now
6. A resends something like "1uvw"
7. A reads "1abc" from recv() and again decides that this is the wrong payload
8. B's third response ("1uvw") reaches A, and so on and on
So what I'd like to do is to put a recv() in a loop, so that in step 5, A would first look for another response from B until the timeout is reached.
So is there a clever way to do this? I was thinking about something like
send(sock, payload, strlen(payload), 0);
int flag = 0;
gettimeofday(&start_time, NULL);
while((tx_time < start_time + timeout) && flag == 0)
{
    gettimeofday(&tx_time, NULL);
    recv(sock, rx_buffer, sizeof(rx_buffer) - 1, 0);
    if(rx_buffer is okay)
    {
        flag = 1;
    }
    wait_a_bit();
}
if(flag == 1) pack_nr++;
"... B is just responding with that same message it gets. A then checks if the response matches the sent packet ..."
You have a code problem and a terminology problem.
First, the terminology problem: Don't say "matches the sent packet". The data can be sent in one packet or ten packets, TCP doesn't care. You don't receive packets, you receive data that may be split or combined across packets as TCP wishes. It really helps (trust me) to be very precise in your use of words. If you mean a message, say "message". If you mean data, say "data". If you mean a datagram, say "datagram".
Unfortunately, your code problem is enormous. You want B to respond to A with the same message it received. That means you need a protocol that sends and receives messages. TCP is not a message protocol. So you need to implement a message protocol and write code that actually sends and receives messages.
If A writes "foobar", B might receive "foobar" or it might first receive "foo" and then later "bar". If A writes "foo" then "bar", B might receive "foobar" or "f" and then "oobar". That's TCP. If you need a message protocol, you need to implement one.
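A common way to build messages on top of the byte stream is a length prefix: each message goes out as a fixed-size length followed by that many bytes, and the receiver keeps buffering until a whole message is present. A minimal sketch follows; the 4-byte big-endian prefix and the names frame_encode and frame_decode are assumptions for illustration, not something the question's code uses.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAME_HEADER 4

/* Write a 4-byte big-endian length followed by the payload into out.
 * Returns the total number of bytes written, or 0 if out is too small. */
size_t frame_encode(const void *payload, uint32_t len,
                    uint8_t *out, size_t out_cap)
{
    if (out_cap < FRAME_HEADER || out_cap - FRAME_HEADER < len)
        return 0;
    out[0] = (uint8_t)(len >> 24);
    out[1] = (uint8_t)(len >> 16);
    out[2] = (uint8_t)(len >> 8);
    out[3] = (uint8_t)len;
    if (len > 0)
        memcpy(out + FRAME_HEADER, payload, len);
    return FRAME_HEADER + (size_t)len;
}

/* Try to extract one message from the bytes received so far.
 * Returns the number of bytes consumed (header + payload) and sets
 * *msg and *msg_len, or returns 0 if buf does not yet hold a complete
 * message and more data has to be received first. */
size_t frame_decode(const uint8_t *buf, size_t have,
                    const uint8_t **msg, uint32_t *msg_len)
{
    if (have < FRAME_HEADER)
        return 0;
    uint32_t len = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
                   ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
    if (have - FRAME_HEADER < len)
        return 0;
    *msg = buf + FRAME_HEADER;
    *msg_len = len;
    return FRAME_HEADER + (size_t)len;
}
```

The receiver appends whatever recv() returns to its buffer and calls frame_decode() until it returns 0, so it does not matter how TCP split or merged the data. A real receiver would also reject lengths above some maximum message size before waiting for the payload.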
First off, you are not checking for a timeout correctly. recv() could fail for any number of reasons. You need to check errno (or WSAGetLastError() on Windows) to find out WHY it failed. But even if it did actually fail due to timeout, TCP is a byte stream, the delayed data may still show up (especially since 5000 microseconds (0.005 seconds) is way too short a timeout to reliably use for TCP network traffic), but your sender would have moved on. The only sensible thing to do if a timeout occurs in TCP is to close the connection, since you don't know the state of the stream anymore.
In your situation, you are basically implementing an ECHO protocol. Whatever the sender sends just gets echoed back as-is. As such, if you send 4 bytes (which you are not verifying, BTW), then you should keep reading until 4 bytes are received, THEN compare them. If any failure occurs in that process, immediately close the connection.
int sendAll(int sock, void *data, int len)
{
    char *ptr = (char*) data;
    while (len > 0) {
        int sent = send(sock, ptr, len, 0);
        if (sent < 0) {
            if (errno != EINTR)
                return -1;
        }
        else {
            ptr += sent;
            len -= sent;
        }
    }
    return 0;
}

int recvAll(int sock, void *data, int len)
{
    char *ptr = (char*) data;
    while (len > 0) {
        int recvd = recv(sock, ptr, len, 0);
        if (recvd < 0) {
            if (errno != EINTR)
                return -1;
        }
        else if (recvd == 0) {
            return 0;
        }
        else {
            ptr += recvd;
            len -= recvd;
        }
    }
    return 1;
}

...

int payload_len = strlen(payload);
if (sendAll(sock, payload, payload_len) < 0)
{
    // error handling
    close(sock);
}
else
{
    struct timeval t_out;
    t_out.tv_sec = 5;
    t_out.tv_usec = 0;
    if (setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO, &t_out, sizeof(t_out)) < 0)
    {
        // error handling
        close(sock);
    }
    else
    {
        int res = recvAll(sock, rx_buffer, payload_len);
        if (res < 0)
        {
            if (errno == EAGAIN || errno == EWOULDBLOCK)
                print("Timeout reached");
            else
                print("recv failed: errno %d", errno);
            close(sock);
        }
        else if (res == 0)
        {
            print("disconnected");
            close(sock);
        }
        else
        {
            if (memcmp(rx_buffer, payload, payload_len) == 0)
            {
                print("data matches");
                pack_nr++;
            }
            else
                print("data mismatch!");
        }
    }
}
After installing zmq and czmq with brew, I tried to compile and run the Asynchronous Majordomo Pattern example, but it did not work because it requires czmq v3. As far as I understand, it has to be updated for v4 to use zactor, because
zthread is deprecated in favor of zactor http://czmq.zeromq.org/czmq3-0:zthread
The following code looks fine to me as an updated version of the async Majordomo pattern, but it does not work as expected: it does not seem to create any threads when I run it from my terminal.
// Round-trip demonstrator
// While this example runs in a single process, that is just to make
// it easier to start and stop the example. The client task signals to
// main when it's ready.
#include "czmq.h"
#include <stdlib.h>
void dbg_write_in_file(char * txt, int nb_request) {
FILE * pFile;
pFile = fopen ("myfile.txt","a");
if (pFile!=NULL)
{
fputs (txt, pFile);
char str_nb_request[12];
sprintf(str_nb_request, "%d", nb_request);
fputs (str_nb_request, pFile);
fputs ("\n", pFile);
fclose (pFile);
}
}
static void
client_task (zsock_t *pipe, void *args)
{
zsock_t *client = zsock_new (ZMQ_DEALER);
zsock_connect (client, "tcp://localhost:5555");
printf ("Setting up test...\n");
zclock_sleep (100);
printf("child 1: parent: %i\n\n", getppid());
printf("child 1: my pid: %i\n\n", getpid());
int requests;
int64_t start;
printf ("Synchronous round-trip test...\n");
start = zclock_time ();
for (requests = 0; requests < 10000; requests++) {
zstr_send (client, "hello");
// stuck here /!\
char *reply = zstr_recv (client);
zstr_free (&reply);
// check if it does something
dbg_write_in_file("sync round-trip requests : ", requests);
// end check
}
printf (" %d calls/second\n",
(1000 * 10000) / (int) (zclock_time () - start));
printf ("Asynchronous round-trip test...\n");
start = zclock_time ();
for (requests = 0; requests < 100000; requests++) {
zstr_send (client, "hello");
// check if it does something
dbg_write_in_file("async round-trip send requests : ", requests);
// end check
}
for (requests = 0; requests < 100000; requests++) {
char *reply = zstr_recv (client);
zstr_free (&reply);
// check if it does something
dbg_write_in_file("async round-trip rec requests : ", requests);
// end check
}
printf (" %d calls/second\n",
(1000 * 100000) / (int) (zclock_time () - start));
zstr_send (pipe, "done");
}
// Here is the worker task. All it does is receive a message, and
// bounce it back the way it came:
static void
worker_task (zsock_t *pipe, void *args)
{
printf("child 2: parent: %i\n\n", getppid());
printf("child 2: my pid: %i\n\n", getpid());
zsock_t *worker = zsock_new (ZMQ_DEALER);
zsock_connect (worker, "tcp://localhost:5556");
while (true) {
zmsg_t *msg = zmsg_recv (worker);
zmsg_send (&msg, worker);
}
zsock_destroy (&worker);
}
// Here is the broker task. It uses the zmq_proxy function to switch
// messages between frontend and backend:
static void
broker_task (zsock_t *pipe, void *args)
{
printf("child 3: parent: %i\n\n", getppid());
printf("child 3: my pid: %i\n\n", getpid());
// Prepare our sockets
zsock_t *frontend = zsock_new (ZMQ_DEALER);
zsock_bind (frontend, "tcp://localhost:5555");
zsock_t *backend = zsock_new (ZMQ_DEALER);
zsock_bind (backend, "tcp://localhost:5556");
zmq_proxy (frontend, backend, NULL);
zsock_destroy (&frontend);
zsock_destroy (&backend);
}
// Finally, here's the main task, which starts the client, worker, and
// broker, and then runs until the client signals it to stop:
int main (void)
{
// Create threads
zactor_t *client = zactor_new (client_task, NULL);
assert (client);
zactor_t *worker = zactor_new (worker_task, NULL);
assert (worker);
zactor_t *broker = zactor_new (broker_task, NULL);
assert (broker);
// Wait for signal on client pipe
char *signal = zstr_recv (client);
zstr_free (&signal);
zactor_destroy (&client);
zactor_destroy (&worker);
zactor_destroy (&broker);
return 0;
}
When I run it, it looks like the program is stuck at the comment
// stuck here /!\
Since it neither finishes nor prints anything, I have to kill it, and that takes five presses of Ctrl+C ( ^C ). Only then does more output appear on the console, as if the program had indeed been running. (Note that I deleted the output of all my printf() steps, as it was really messy to read.)
While it runs, dbg_write_in_file() writes nothing to the file; that only happens after the five Ctrl+C ( ^C ).
The client, worker and broker tasks all report the same getppid() value (my terminal) and the same getpid() value (the program itself).
I use gcc trippingv4.c -o trippingv4 -L/usr/local/lib -lzmq -lczmq to compile.
When I try to kill it:
./trippingv4
Setting up test...
child 1: parent: 60967
child 1: my pid: 76853
Synchronous round-trip test...
^Cchild 2: parent: 60967
child 2: my pid: 76853
^Cchild 3: parent: 60967
child 3: my pid: 76853
^C^C^CE: 18-02-28 00:16:37 [76853]dangling 'PAIR' socket created at src/zsys.c:471
E: 18-02-28 00:16:37 [76853]dangling 'DEALER' socket created at trippingv4.c:29
E: 18-02-28 00:16:37 [76853]dangling 'PAIR' socket created at src/zsys.c:471
E: 18-02-28 00:16:37 [76853]dangling 'DEALER' socket created at trippingv4.c:89
Update
Thanks for the detailed answer, @user3666197. For the first part, the assert calls do not compile, so I just print the values instead and compare them visually; they are the same.
int czmqMAJOR,
czmqMINOR,
czmqPATCH;
zsys_version ( &czmqMAJOR, &czmqMINOR, &czmqPATCH );
printf( "INF: detected CZMQ ( %d, %d, %d ) -version\n",
czmqMAJOR,
czmqMINOR,
czmqPATCH
);
printf( "INF: CZMQ_VERSION_MAJOR %d, CZMQ_VERSION_MINOR %d, CZMQ_VERSION_PATCH %d\n",
CZMQ_VERSION_MAJOR,
CZMQ_VERSION_MINOR,
CZMQ_VERSION_PATCH
);
Output :
INF: detected CZMQ ( 4, 1, 0 ) -version
INF: CZMQ_VERSION_MAJOR 4, CZMQ_VERSION_MINOR 1, CZMQ_VERSION_PATCH 0
The zsys_info call does compile but does not show anything on the terminal, even with an fflush(stdout) just in case, so I used printf instead:
INF: This system's Context() limit is 65535 ZeroMQ socketsINF: current state of the global Context()-instance has:
( 1 )-IO-threads ready
( 1 )-ZMQ_BLOCKY state
Then I changed the global context's thread count with zsys_set_io_threads(2) and/or set zmq_ctx_set (aGlobalCONTEXT, ZMQ_BLOCKY, false);, but it still blocked. It looks like zactor does not work with system threads the way zthread did, or does not behave in a similar way. Given my experience with ZeroMQ (also zero), I am probably trying something that can't be achieved.
Update: solved, but not properly
My main error was not initializing the zactor instances properly:
An actor function MUST call zsock_signal (pipe) when initialized and MUST listen to pipe and exit on $TERM command.
The second error was not blocking the task that runs the proxy before it called zactor_destroy (&proxy);
I leave the final code below, but you still need to exit at the end with Ctrl+C because I did not figure out how to handle the $TERM command properly. Also, zactor still appears not to use system threads. It is probably designed that way, but I don't know how it works under the hood.
// Round-trip demonstrator
// While this example runs in a single process, that is just to make
// it easier to start and stop the example. The client task signals to
// main when it's ready.
#include <czmq.h>
static void
client_task (zsock_t *pipe, void *args)
{
assert (streq ((char *) args, "Hello, Client"));
zsock_signal (pipe, 0);
zsock_t *client = zsock_new (ZMQ_DEALER);
zsock_connect (client, "tcp://127.0.0.1:5555");
printf ("Setting up test...\n");
zclock_sleep (100);
int requests;
int64_t start;
printf ("Synchronous round-trip test...\n");
start = zclock_time ();
for (requests = 0; requests < 10000; requests++) {
zstr_send (client, "hello");
zmsg_t *msgh = zmsg_recv (client);
zmsg_destroy (&msgh);
}
printf (" %d calls/second\n",
(1000 * 10000) / (int) (zclock_time () - start));
printf ("Asynchronous round-trip test...\n");
start = zclock_time ();
for (requests = 0; requests < 100000; requests++) {
zstr_send (client, "hello");
}
for (requests = 0; requests < 100000; requests++) {
char *reply = zstr_recv (client);
zstr_free (&reply);
}
printf (" %d calls/second\n",
(1000 * 100000) / (int) (zclock_time () - start));
zstr_send (pipe, "done");
printf("send 'done' to pipe\n");
}
// Here is the worker task. All it does is receive a message, and
// bounce it back the way it came:
static void
worker_task (zsock_t *pipe, void *args)
{
assert (streq ((char *) args, "Hello, Worker"));
zsock_signal (pipe, 0);
zsock_t *worker = zsock_new (ZMQ_DEALER);
zsock_connect (worker, "tcp://127.0.0.1:5556");
bool terminated = false;
while (!terminated) {
zmsg_t *msg = zmsg_recv (worker);
zmsg_send (&msg, worker);
// zstr_send (worker, "hello back"); // Give better perf I don't know why
}
zsock_destroy (&worker);
}
// Here is the broker task. It uses the zmq_proxy function to switch
// messages between frontend and backend:
static void
broker_task (zsock_t *pipe, void *args)
{
assert (streq ((char *) args, "Hello, Task"));
zsock_signal (pipe, 0);
// Prepare our proxy and its sockets
zactor_t *proxy = zactor_new (zproxy, NULL);
zstr_sendx (proxy, "FRONTEND", "DEALER", "tcp://127.0.0.1:5555", NULL);
zsock_wait (proxy);
zstr_sendx (proxy, "BACKEND", "DEALER", "tcp://127.0.0.1:5556", NULL);
zsock_wait (proxy);
bool terminated = false;
while (!terminated) {
zmsg_t *msg = zmsg_recv (pipe);
if (!msg)
break; // Interrupted
char *command = zmsg_popstr (msg);
if (streq (command, "$TERM")) {
terminated = true;
printf("broker received $TERM\n");
}
freen (command);
zmsg_destroy (&msg);
}
zactor_destroy (&proxy);
}
// Finally, here's the main task, which starts the client, worker, and
// broker, and then runs until the client signals it to stop:
int main (void)
{
// Create threads
zactor_t *client = zactor_new (client_task, "Hello, Client");
assert (client);
zactor_t *worker = zactor_new (worker_task, "Hello, Worker");
assert (worker);
zactor_t *broker = zactor_new (broker_task, "Hello, Task");
assert (broker);
char *signal = zstr_recv (client);
printf("signal %s\n", signal);
zstr_free (&signal);
zactor_destroy (&client);
printf("client done\n");
zactor_destroy (&worker);
printf("worker done\n");
zactor_destroy (&broker);
printf("broker done\n");
return 0;
}
Let's diagnose the as-is state, going step by step:
int czmqMAJOR,
czmqMINOR,
czmqPATCH;
zsys_version ( &czmqMAJOR, &czmqMINOR, &czmqPATCH );
printf( "INF: detected CZMQ( %d, %d, %d )-version",
czmqMAJOR,
czmqMINOR,
czmqPATCH
);
assert ( czmqMAJOR == CZMQ_VERSION_MAJOR && "Major: does not match\n" );
assert ( czmqMINOR == CZMQ_VERSION_MINOR && "Minor: does not match\n" );
assert ( czmqPATCH == CZMQ_VERSION_PATCH && "Patch: does not match\n" );
If this matches your expectations, you may hope that the DLL versions match and are found in the proper locations.
Next:
One may test the whole circus in a non-blocking mode, to prove there is no other blocker. As briefly inspected, I have not found such an option exposed in the CZMQ API; the native API allows one to flag a NOBLOCK option on { _send() | _recv() } operations, which prevents them from remaining blocked (which may be the case for a DEALER socket instance on _send()-s when there is not yet any counterparty with a POSACK-ed .bind()/.connect() state).
I did not find tools in CZMQ that do this as quickly as the native API does. Maybe you will have more luck going through it.
Test the presence of a global Context() instance, if it is ready:
Add the following self-reporting row before the first socket instantiation, so as to be sure we are ahead of any and all socket creation and the respective _bind()/_connect() operations:
zsys_info ( "INF: This system's Context() limit is %zu ZeroMQ sockets",
zsys_socket_limit ()
);
One may also enforce the Context() instantiation manually, so as to be sure the global Context() instance is up and running before any higher-level instances ask it to implement additional internals (sockets, counters, handlers, port management, etc.):
// Initialize CZMQ zsys layer; this happens automatically when you create
// a socket or an actor; however this call lets you force initialization
// earlier, so e.g. logging is properly set-up before you start working.
// Not threadsafe, so call only from main thread. Safe to call multiple
// times. Returns global CZMQ context.
CZMQ_EXPORT void *
zsys_init (void);
// Optionally shut down the CZMQ zsys layer; this normally happens automatically
// when the process exits; however this call lets you force a shutdown
// earlier, avoiding any potential problems with atexit() ordering, especially
// with Windows dlls.
CZMQ_EXPORT void
zsys_shutdown (void);
One may also tune the IO performance, using this right at the initialisation stage:
// Configure the number of I/O threads that ZeroMQ will use. A good
// rule of thumb is one thread per gigabit of traffic in or out. The
// default is 1, sufficient for most applications. If the environment
// variable ZSYS_IO_THREADS is defined, that provides the default.
// Note that this method is valid only before any socket is created.
CZMQ_EXPORT void
zsys_set_io_threads (size_t io_threads);
This manual instantiation gives one an additional benefit: having the instance handle (a void pointer), one can inspect its current state and shape with the zmq_ctx_get() tools:
void *aGlobalCONTEXT = zsys_init();
printf( "INF: current state of the global Context()-instance has:\n" );
printf( " ( %d )-IO-threads ready\n", zmq_ctx_get( aGlobalCONTEXT,
ZMQ_IO_THREADS
)
);
printf( " ( %d )-ZMQ_BLOCKY state\n", zmq_ctx_get( aGlobalCONTEXT,
ZMQ_BLOCKY
)
); // may generate -1 in case DLL is << 4.2+
...
If unhappy with the default signal handling, one may design and use another handler:
// Set interrupt handler; this saves the default handlers so that a
// zsys_handler_reset () can restore them. If you call this multiple times
// then the last handler will take affect. If handler_fn is NULL, disables
// default SIGINT/SIGTERM handling in CZMQ.
CZMQ_EXPORT void
zsys_handler_set (zsys_handler_fn *handler_fn);
where
// Callback for interrupt signal handler
typedef void (zsys_handler_fn) (int signal_value);
I have a client-server program in C. The client sends commands and the server receives them.
But if the client is shut down by pressing Ctrl+C, the server application processes the previous input again.
Example:
Client.c                      Server.c
-------------------------------------------------
Enter COmmand: adf            Command from client: adf
Enter COmmand: bbb            Command from client: bbb
Enter Command: Ctrl+c         Command from client: bb
I don't understand why it processes the previous input.
Given below is my main logic.
main(){
// bind, listen, accept is done.
while(!done && !shutFlag){ // Main server command Loop
done = ReceiveRequestMessage(&request, connectedSock );
if(done)
{
printf("Client closed the connection while recv() \n");
printf("Listening for new client connection to establish... \n");
connectedSock = accept(srvSock, (struct sockaddr *)&connectSAddr, &addrLen );
printf("GetLastError: %d\n", GetLastError());
done = FALSE;
continue;
}
request.record[strlen(request.record)] ='\0';
commandLen = strcspn(request.record, "\n\t");
memcpy(sysCommand, request.record, commandLen);
sysCommand[commandLen] = '\0';
printf("Request recieved from client: %s -> Hex: %X\n\n", request.record, *(request.record));
}
}
ReceiveRequestMessage Function:
static BOOL ReceiveRequestMessage(REQUEST *pRequest, SOCKET sd){
LONG32 nRemainRecv = 0, nRecv;
LPBYTE pBuffer;
BOOL disconnect = FALSE;
nRemainRecv = RQ_HEADER_LEN;
pBuffer = (LPBYTE) pRequest;
while(nRemainRecv > 0 && !disconnect )
{
nRecv = recv (sd, pBuffer, nRemainRecv, 0); // Reading the 1st 4 bytes(length of record)to pRequest.
if ( nRecv > 0 )
printf("Bytes received in request.rqLen: %d\n", nRecv);
else if ( nRecv == SOCKET_ERROR ){
printf("Connection closed\n");
return TRUE;
}
disconnect = (nRecv == 0); // check connection is closed
nRemainRecv -= nRecv;
pBuffer += nRecv;
}
/* Read the request record */
nRemainRecv = pRequest->rqLen;
/* Exclude buffer overflow */
nRemainRecv = min(nRemainRecv, MAX_RQRS_LEN);
pBuffer = (LPSTR)pRequest->record;
while(nRemainRecv > 0 && !disconnect)
{
nRecv = recv(sd, pBuffer, nRemainRecv, 0);
if(nRecv > 0)
printf("Bytes Received in request.record: %d\n", nRecv);
else if(nRecv == SOCKET_ERROR){
printf("Connection closed");
return TRUE;
}
disconnect = (nRecv == 0); // check connection is closed
nRemainRecv -= nRecv;
pBuffer += nRecv;
}
return disconnect;
}
How can I eliminate the last print statement after pressing Ctrl+C in the client?
In other words:
Whenever the client disconnects, by pressing Ctrl+C or in any other way, how can the server be notified?
Function ReceiveRequestMessage() returns the value of its variable disconnect.
That variable is set nonzero only if recv() returns 0.
The documentation for recv() promises a return value of 0 only for the case that the remote end performs an orderly shutdown (and no more data are available to receive). It will definitely return -1, not 0, if an error occurs.
It is not safe to assume that the client will perform an orderly shutdown every time it effectively disconnects, and in particular, it is not safe to assume that it will perform one when it is killed by a signal, as happens when you press Ctrl+C.
If ReceiveRequestMessage() returns 0 without modifying the object pointed to by its pRequest parameter, then it will appear to the caller that the previously-sent request was repeated. This is what you are observing.
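The point above can be sketched as a receive helper that clears the request before reading and reports every way the connection can end, not only an orderly shutdown. This is a POSIX-flavoured sketch rather than the Winsock code from the question; request_t, MAX_RECORD, recv_exact and read_request are hypothetical names, and the length header is assumed to be a 4-byte integer in host byte order.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

#define MAX_RECORD 256

typedef struct {
    uint32_t len;                     /* length of record, sent first */
    char     record[MAX_RECORD + 1];  /* always NUL-terminated here   */
} request_t;

/* Read exactly len bytes. Returns 1 on success, 0 if the peer closed
 * the connection first, -1 on any other error. */
static int recv_exact(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = recv(fd, p, len, 0);
        if (n == 0)
            return 0;
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 1;
}

/* Returns 1 if a complete request was read, 0 if the client is gone
 * (orderly shutdown or error). req never keeps data from an earlier call. */
int read_request(int fd, request_t *req)
{
    memset(req, 0, sizeof *req);
    if (recv_exact(fd, &req->len, sizeof req->len) != 1)
        return 0;
    if (req->len > MAX_RECORD)
        return 0;                     /* refuse oversized records */
    if (recv_exact(fd, req->record, req->len) != 1) {
        memset(req, 0, sizeof *req);  /* drop the partial record */
        return 0;
    }
    return 1;
}
```

Because the request is zeroed on entry and on every failure path, the caller can no longer mistake a stale buffer for a repeated command, whichever way the client went away.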
Objective: N nodes (running on different machines) should communicate with each other by establishing TCP connections with each other. Sending and receiving messages are done by 2 threads created by the process. Initially the main process connects all nodes with each other, creates the 2 threads and gives it a list of file descriptors which can be used by threads to send and receive data. The below structure is filled by the main process and passed to the threads.
typedef struct
{
char hostName[MAXIMUM_CHARACTERS_IN_HOSTNAME]; /* Host name of the node */
char portNumber[MAXIMUM_PORT_LENGTH]; /* Port number of the node */
char nodeId[MAXIMUM_NODE_ID_LENGTH]; /* Node ID of the node */
int socketFd; /* Socket file descriptor */
int socketReady; /* Flag to indicate if socket information is filled */
}SNodeInformation;
PS: socketFd is the socket descriptor received by either accept() or by socket() depending on how the connection was established (Either listening to connections from a node or connecting to a node).
An array of SNodeInformation of size MAXIMUM_NUMBER_OF_NODES is used.
The send thread goes through the nodeInformation and sends the message "Hello" to all nodes except itself, as shown below.
void *sendMessageThread(void *pNodeInformation) {
int i;
int ownNodeId;
int bytesSent = 0;
char ownHostName[MAXIMUM_CHARACTERS_IN_HOSTNAME];
SNodeInformation *nodeInformation = (SNodeInformation *) pNodeInformation;
SNodeInformation *iterNodeInformation;
printf("SendMessageThread: Send thread created\n");
if(gethostname(ownHostName, MAXIMUM_CHARACTERS_IN_HOSTNAME) != 0) {
perror("Error: sendMessageThread, gethostname failed\n");
exit(1);
}
for(i=0, iterNodeInformation=nodeInformation ; i<MAXIMUM_NUMBER_OF_NODES ; i++, iterNodeInformation++) {
if(strcmp((const char*) iterNodeInformation->hostName, (const char*) ownHostName) != 0) {
/* Send message to all nodes except yourself */
bytesSent = send(iterNodeInformation->socketFd, "Hello", 6, 0);
if(bytesSent == -1) {
printf("Error: sendMessageThread, sending failed, code: %s FD %d\n", strerror(errno), iterNodeInformation->socketFd);
}
}
}
pthread_exit(NULL);
}
The receive thread goes through the nodeInformation, sets up a file descriptor set and uses select to wait for incoming data, as shown below.
void *receiveMessageThread(void *pNodeInformation)
{
int i;
int fileDescriptorMax = -1;
int doneReceiving = 0;
int numberOfBytesReceived = 0;
int receiveCount = 0;
fd_set readFileDescriptorList;
char inMessage[6];
SNodeInformation *nodeInformation = (SNodeInformation *) pNodeInformation;
SNodeInformation *iterNodeInformation;
printf("ReceiveMessageThread: Receive thread created\n");
/* Initialize the read file descriptor */
FD_ZERO(&readFileDescriptorList);
for(i=0, iterNodeInformation=nodeInformation ; i<MAXIMUM_NUMBER_OF_NODES ; i++, iterNodeInformation++) {
FD_SET(iterNodeInformation->socketFd, &readFileDescriptorList);
if(iterNodeInformation->socketFd > fileDescriptorMax) {
fileDescriptorMax = iterNodeInformation->socketFd;
}
}
printf("ReceiveMessageThread: fileDescriptorMax:%d\n", fileDescriptorMax);
while(!doneReceiving) {
if (select(fileDescriptorMax+1, &readFileDescriptorList, NULL, NULL, NULL) == -1) {
perror("Error receiveMessageThread, select failed \n");
return -1;
}
for(i=0 ; i<fileDescriptorMax ; i++) {
if (FD_ISSET(i, &readFileDescriptorList)) {
/* Check if any FD was set */
printf("ReceiveThread: FD set %d\n", i);
/* Receive data from one of the nodes */
if ((numberOfBytesReceived = recv(i, &inMessage, 6, 0)) <= 0) {
/* Got error or connection closed by client */
if (numberOfBytesReceived == 0) {
/* Connection closed */
printf("Info: receiveMessageThread, node %d hung up\n", i);
}
else {
perror("Error: receiveMessageThread, recv FAILED\n");
}
close(i);
/* Remove from Master file descriptor set */
FD_CLR(i, &readFileDescriptorList);
doneReceiving = 1;
}
else {
/* Valid data from a node */
inMessage[6] = '\0';
if(++receiveCount == MAXIMUM_NUMBER_OF_NODES-1) {
doneReceiving = 1;
}
printf("ReceiveThread: %s received, count: %d\n", inMessage, receiveCount);
}
}
}
}
pthread_exit(NULL);
}
Expected Output: I tried with just two processes, P1 (started first) running on machine1 and P2 running on machine2. The two processes should first connect, and then the threads should send and receive the message "Hello" and exit.
Observed Output: P1 is able to send the message and P2's receiver thread is able to receive "Hello". But P1's receiver thread does not get the message from P2's sending thread. The application code is the same on both machines, yet every time the process started first does not get the message from the other process. I added a print to check whether any file descriptor was set; I see it for P2 but not for P1. The send in the receiving process is not failing; it returns 6. I checked the maximum file descriptor value, and it is correct.
If I start P2 first and then P1, then P1 receives the message from P2 and exits, while P2 waits indefinitely for the message from P1.
I am not sure whether the problem is caused by incorrect use of socket descriptors or by the threads.
Two issues:
1. The loop that tests whether a file descriptor is set does not cover all the file descriptors put into the set. (This programming error is expected to be the reason for the malfunction described in the OP.)
2. The sets of file descriptors passed to select() are modified by select(), so the set needs to be re-initialized before each subsequent call to select(). (This programming error would only become noticeable if data is to be received from more than one socket.)
Please see the following modifications to the OP's code:
void *receiveMessageThread(void *pNodeInformation)
{
    ...
    printf("ReceiveMessageThread: Receive thread created\n");
    while(!doneReceiving) {
        /* Initialize the read-set of file descriptors */
        /* Issue 2 fixed from here ... */
        FD_ZERO(&readFileDescriptorList);
        for(i=0, iterNodeInformation=nodeInformation ; i<MAXIMUM_NUMBER_OF_NODES ; i++, iterNodeInformation++) {
            FD_SET(iterNodeInformation->socketFd, &readFileDescriptorList);
            if (iterNodeInformation->socketFd > fileDescriptorMax) {
                fileDescriptorMax = iterNodeInformation->socketFd;
            }
        }
        /* ... up to here. */
        printf("ReceiveMessageThread: fileDescriptorMax:%d\n", fileDescriptorMax);
        if (select(fileDescriptorMax+1, &readFileDescriptorList, NULL, NULL, NULL) == -1) {
            perror("Error receiveMessageThread, select failed \n");
            return NULL; /* the function returns void *, so not -1 */
        }
        for(i=0 ; i <= fileDescriptorMax ; i++) { /* Issue 1 fixed here. */
            ...
I have two nodes communicating over a socket. Each node has a read thread and a write thread to communicate with the other. Given below is the code for the read thread. The communication works fine between the two nodes with that code. But I am trying to add a select call in this thread and that is giving me problems (the code for select is commented out; I just uncomment it to add the functionality). The problem is that one node does not receive messages and only times out. The other node gets the messages from the first node but never times out. Without the select (keeping the comments /* */), the problem does not occur: both nodes send and receive messages.
Can anyone point out what the problem might be? Thanks.
void *Read_Thread(void *arg_passed)
{
int numbytes;
unsigned char *buf;
buf = (unsigned char *)malloc(MAXDATASIZE);
/*
fd_set master;
int fdmax;
FD_ZERO(&master);
*/
struct RWThread_args_template *my_args = (struct RWThread_args_template *)arg_passed;
/*
FD_SET(my_args->new_fd, &master);
struct timeval tv;
tv.tv_sec = 2;
tv.tv_usec = 0;
int s_rv = 0;
fdmax = my_args->new_fd;
*/
while(1)
{
/*
s_rv = -1;
if((s_rv = select(fdmax+1, &master, NULL, NULL, &tv)) == -1)
{
perror("select");
exit(1);
}
if(s_rv == 0)
{
printf("Read: Timed out\n");
continue;
}
else
{
printf("Read: Received msg\n");
}
*/
if( (numbytes = recv(my_args->new_fd, buf, MAXDATASIZE-1, 0)) == -1 )
{
perror("recv");
exit(1);
}
buf[numbytes] = '\0';
printf("Read: received '%s'\n", buf);
}
pthread_exit(NULL);
}
You must set up master and tv before each call to select(), within the loop. They are both modified by the select() call.
In particular, if select() returned 0, then master will now be empty.
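A minimal sketch of that fix, wrapping the wait in a helper so that the set and the timeout are rebuilt on every call. The names read_with_timeout and READ_TIMED_OUT are hypothetical, and read() is used in place of recv() so the sketch works on any descriptor; on a socket the two behave the same with flags set to 0.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

#define READ_TIMED_OUT (-2)

/*
 * Wait up to timeout_ms milliseconds for fd to become readable, then read
 * once. Returns the read() result, -1 if select() fails, or READ_TIMED_OUT
 * if nothing arrived in time.
 */
ssize_t read_with_timeout(int fd, void *buf, size_t len, long timeout_ms)
{
    fd_set readable;
    struct timeval tv;

    /* Both are set up here, on every call: select() removes descriptors
     * that are not ready from the set and may modify tv as well. */
    FD_ZERO(&readable);
    FD_SET(fd, &readable);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int rc = select(fd + 1, &readable, NULL, NULL, &tv);
    if (rc == -1)
        return -1;
    if (rc == 0)
        return READ_TIMED_OUT;
    return read(fd, buf, len);
}
```

Calling this from the while(1) loop in Read_Thread gives a fresh 2-second timeout on each iteration, so a timeout on one pass no longer leaves an empty set for the next.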