select() times out immediately after long runtime (C++) - c

Most of the time this code works just fine. But sometimes when the executable has been running for a while, select() appears to time out immediately, then get into a weird state where it keeps getting called, timing out immediately, over and over. Then it has to be killed from the outside.
My guess would be that the way standard input changes over time is at fault; that is what select() is blocking on.
Looking around on StackOverflow, most of people's select() troubles seem to be solved by making sure to reset with the macros (FD_ZERO & FD_SET) every time and using the right initial parameter to select. I don't think those are the issues here.
int rc = 0;
fd_set fdset;
struct timeval timeout;
// -- clear out the response -- //
readValue = "";
// -- set the timeout -- //
timeout.tv_sec = passedInTimeout; // 5 seconds
timeout.tv_usec = 0;
// -- indicate which file descriptors to select from -- //
FD_ZERO(&fdset);
FD_SET(passedInFileDescriptor, &fdset); //passedInFileDescriptor = 0
// -- perform the selection operation, with timeout -- //
rc = select(1, &fdset, NULL, NULL, &timeout);
if (rc == -1) // -- select failed -- //
{
result = TR_ERROR;
}
else if (rc == 0) // -- select timed out -- //
{
result = TR_TIMEDOUT;
}
else
{
if (FD_ISSET(mFileDescriptor, &fdset))
{
if(rc = readData(readValue) <= 0)
{
result = TR_ERROR;
}
} else {
result = TR_SUCCESS;
}
}

Beware that some implementations of select() apply the specification strictly:
"nfds is the highest-numbered file descriptor in any of the three sets, plus 1".
So you had better replace "1" with "passedInFileDescriptor + 1" as the first parameter.
I don't know if this can solve your problem, but at least your code becomes more... uhm... "traditional" ;)
Bye

On some OSes, timeout is modified when calling select to reflect the amount of time not slept. It doesn't look like you're reusing timeout in your example, but make sure that you are indeed reinitializing it to 5 seconds every time before calling select.
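A minimal sketch of the two suggestions above: pass the highest descriptor plus one as nfds, and rebuild both the fd_set and the timeout before every call. The helper names wait_readable and read_with_timeout, and the -2 return code for a timeout, are made up for illustration and are not from the question's code.

```c
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <unistd.h>

/* Returns 1 if fd is readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, long timeout_sec)
{
    /* select() can only watch descriptors below FD_SETSIZE. */
    assert(fd >= 0 && fd < FD_SETSIZE);

    for (;;) {
        fd_set readfds;
        struct timeval timeout;

        /* Rebuild the set and the timeout on every call: select() may
           modify both of them before it returns. */
        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);
        timeout.tv_sec = timeout_sec;
        timeout.tv_usec = 0;

        /* nfds is the highest descriptor in any set, plus one. */
        int rc = select(fd + 1, &readfds, NULL, NULL, &timeout);
        if (rc == -1 && errno == EINTR)
            continue;           /* interrupted by a signal: try again */
        if (rc <= 0)
            return rc;          /* 0 = timed out, -1 = error */
        return FD_ISSET(fd, &readfds) ? 1 : -1;
    }
}

/* Returns the byte count from read() (0 means end of file),
   -1 on error, or -2 if nothing arrived within timeout_sec. */
ssize_t read_with_timeout(int fd, void *buf, size_t len, long timeout_sec)
{
    int ready = wait_readable(fd, timeout_sec);
    if (ready == 0)
        return -2;
    if (ready < 0)
        return -1;
    return read(fd, buf, len);
}
```

Keeping the timeout and end-of-file results distinct makes it easier to see which of the two a caller is actually hitting when the function starts returning immediately.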

I'm having the same problem: it works fine on Windows but not on Linux, and I have maxfd set to the last socket + 1. It occurs periodically after long runs. I pick up the connection on accept, and then the first call to select periodically times out.

Look at this code:
if (FD_ISSET(mFileDescriptor, &fdset))
{
if(rc = readData(readValue) <= 0)
{
result = TR_ERROR;
}
} else {
result = TR_SUCCESS;
}
There are two things bothering me here:
1. If your FD has no data in it (say, because an error occurred), FD_ISSET() will return false and your function returns TR_SUCCESS!?
2. You FD_SET(passedInFileDescriptor, &fdset), but then check a different value: FD_ISSET(mFileDescriptor, &fdset). If mFileDescriptor != passedInFileDescriptor at some point, you'll fall into the first case.
It should be looking like this:
if (FD_ISSET(passedInFileDescriptor, &fdset))
{
if((rc = readData(readValue)) <= 0) // parenthesized: <= binds tighter than =
{
result = TR_ERROR;
}
else
{
result = TR_SUCCESS;
}
}
else
{
result = TR_ERROR;
}
No?
(Edit: this answer also points out the problem of calling select() with a bad high_fd value.)
Another edit: well, looks like the guys never came back... frustrating.

Related

Measure wall clock time before and after a function

I am measuring wall time using clock_gettime() from <time.h>. It works perfectly fine when I use it in main(), but not the way I am attempting to use it.
I am familiarizing myself with the Linux scheduler and I am measuring performance on different parts.
I want to be able to measure waiting time, which is defined as "the total time a thread spends in the ready queue" (how long until it starts executing the function).
I can measure this easily enough by calling clock_gettime() before creating the thread and again right inside the thread function. However, the problem I am having is that the time inside the thread function is lower than the one outside, giving a negative time.
I am running this on my Windows PC through Ubuntu.
What could the problem be?
code:
clock_gettime(CLOCK_REALTIME,&data.before);
thread_array[i-1] = data;
if(pthread_create(&tids[i],&attr,workLoad,(void*) &data) != 0){
perror("Could not create thread");
return 1;
}
}
for(int i = 1;i < threadAmount; i++){
if(pthread_join(tids[i],NULL)!= 0){
perror("Thread could not wait");
return 1;
}
}
and here is my threadfunc:
void *workLoad(void *args)
{
threadData* data = (threadData*) args;
clock_gettime(CLOCK_REALTIME,&data->after);
int loopAmount = data->loopAmount;
int counter = 0;
for(int i = 0; i < loopAmount; i++){
counter++;
}
return NULL;
}
[Image: result of time interval]
In the following code:
clock_gettime(CLOCK_REALTIME,&data.before);
thread_array[i-1] = data;
if(pthread_create(&tids[i],&attr,workLoad,(void*) &data) != 0){
data seems to be a local variable whose address you pass to the thread. You also copy this variable into thread_array[i-1]. If you then compute thread_array[i-1].after - thread_array[i-1].before, you are reading a field the thread never wrote: the thread stored after in data, not in the copy held in thread_array[i-1]. You need to pass &thread_array[i-1] to the thread, e.g.:
if(pthread_create(&tids[i],&attr,workLoad,(void*)&thread_array[i-1]) != 0){
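For illustration, here is a small self-contained sketch of that fix: each thread gets a pointer to its own slot in an array that outlives the thread, so before and after end up in the same struct. The names thread_data, elapsed_ns and measure_min_start_latency are hypothetical. It also uses CLOCK_MONOTONIC where the question uses CLOCK_REALTIME; that is my substitution, chosen because a monotonic clock is not affected by the wall clock being set, and it is not part of the fix itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

typedef struct {
    struct timespec before;  /* written by the creator before pthread_create */
    struct timespec after;   /* written by the thread as its first action */
} thread_data;

static void *work_load(void *arg)
{
    thread_data *data = arg;  /* points at this thread's own array slot */
    clock_gettime(CLOCK_MONOTONIC, &data->after);
    return NULL;
}

/* Nanoseconds from *start to *end. */
long long elapsed_ns(const struct timespec *start, const struct timespec *end)
{
    return (end->tv_sec - start->tv_sec) * 1000000000LL
         + (end->tv_nsec - start->tv_nsec);
}

/* Starts n threads and returns the smallest start latency seen, in
   nanoseconds, or -1 if a thread could not be created or joined. */
long long measure_min_start_latency(int n)
{
    enum { MAX_THREADS = 16 };
    pthread_t tids[MAX_THREADS];
    thread_data slots[MAX_THREADS];
    long long min_ns = -1;
    int started = 0;

    assert(n > 0 && n <= MAX_THREADS);

    for (; started < n; started++) {
        clock_gettime(CLOCK_MONOTONIC, &slots[started].before);
        /* Pass the address of the slot itself, not of a temporary copy. */
        if (pthread_create(&tids[started], NULL, work_load, &slots[started]) != 0)
            break;
    }
    for (int i = 0; i < started; i++) {
        if (pthread_join(tids[i], NULL) != 0)
            return -1;
        long long ns = elapsed_ns(&slots[i].before, &slots[i].after);
        if (min_ns < 0 || ns < min_ns)
            min_ns = ns;
    }
    return started == n ? min_ns : -1;
}
```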

Contents in message-queue is changed

I am using Ubuntu 18.04.1 LTS and studying IPC in C. This time I'm testing Unix I/O over LPC, and there's a problem when more than one client connects to the server at the same time.
(When only one client is connected, there is no problem.)
sprintf(s1,"./%sA",t);
sprintf(s2, "./%sB", t);
if (MakeDirectory(s1, 0755) == -1) {
return -1;
}
if (MakeDirectory(s2, 0755) == -1) {
return -1;
}
for (i = 0; i < 5; i++)
{
memset(dirName, 0, SIZE);
sprintf(dirName, "%s/%d",s1,i);
usleep(300000);
if (MakeDirectory(dirName, 0755) == -1) {
return -1;
}
}
This code is the client's main function. The top part runs without problems, but after the loop has run once (when i = 1), MakeDirectory() returns -1 with an error.
(t refers to the pid of the forked process converted into a string.)
int MakeDirectory(char* path, int mode) {
memset(&pRequest, 0x00, LPC_REQUEST_SIZE);
memset(&pResponse, 0x00, LPC_RESPONSE_SIZE);
pRequest.pid = getpid();
pRequest.service = LPC_MAKE_DIRECTORY;
pRequest.numArg = 2;
pRequest.lpcArgs[0].argSize = strlen(path);
strcpy(pRequest.lpcArgs[0].argData, path);
pRequest.lpcArgs[1].argSize = mode;
msgsnd(rqmsqid, &pRequest, LPC_REQUEST_SIZE, 0);
msgrcv(rpmsqid, &pResponse, LPC_RESPONSE_SIZE, getpid(), 0);
int res = pResponse.responseSize;
return res;
}
This is client's MakeDirectory, and
int MakeDirectory(LpcRequest* pRequest) {
memset(&pResponse, 0x00, LPC_RESPONSE_SIZE);
char *path = pRequest->lpcArgs[0].argData;
int mode = pRequest->lpcArgs[1].argSize;
int res = mkdir(path, mode);
pResponse.errorno = 0;
pResponse.pid = pRequest->pid;
printf("%ld\n", pResponse.pid);
pResponse.responseSize = res;
msgsnd(rpmsqid, &pResponse, LPC_RESPONSE_SIZE, 0);
return res;
}
This is the server function that runs, after checking pRequest.service, when the client calls MakeDirectory.
Again, nothing goes wrong with a single client, only when there is more than one. I checked with printf(): the server sends 0 but the client receives -1. I don't know why this happens.
There's too much missing from your code to know definitively what's happening. I'm placing my bet on either using unallocated memory, or not recognizing a syscall error.
I'm using LTS 16, and there's no definition on my system for LpcRequest or LPC_REQUEST_SIZE, etc. You don't show how they're defined, so we don't know for example if pRequest.lpcArgs[1] exists.
You're also not checking the return code for msgsnd and msgrcv, a sure recipe for endless hours of entertaining debugging.
I suggest you edit your question to include working code, and a shell script that produces the mysterious result. Then someone will be able, if willing, to debug it and explain where you went wrong.
My other suggestion in this area is pretty standard: W. Richard Stevens's books on TCP/IP, specifically Unix Network Programming. If you're studying this stuff, you'll absolutely be glad to have read it.
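To make the point about return codes concrete, here is a hedged sketch of a checked send and receive on a System V message queue. Everything named here (demo_msg, send_checked, receive_checked, round_trip) is hypothetical and stands in for the LPC structs the question does not show. It assumes the usual System V layout, where the message starts with a long type greater than zero and the size passed to msgsnd()/msgrcv() covers only the payload after that long; whether the question's LPC_REQUEST_SIZE follows that rule cannot be told from the code posted.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Hypothetical message: the first member must be a long type > 0. */
struct demo_msg {
    long mtype;
    char text[64];
};

/* Sends text with the given type; returns 0 on success, -1 on failure. */
int send_checked(int qid, long type, const char *text)
{
    struct demo_msg msg;

    assert(type > 0);               /* msgsnd() rejects types <= 0 */
    memset(&msg, 0, sizeof msg);
    msg.mtype = type;
    strncpy(msg.text, text, sizeof msg.text - 1);

    /* The size argument excludes the leading long. */
    if (msgsnd(qid, &msg, sizeof msg.text, 0) == -1) {
        perror("msgsnd");
        return -1;
    }
    return 0;
}

/* Receives one message of the given type; returns 0 or -1. */
int receive_checked(int qid, long type, struct demo_msg *out)
{
    if (msgrcv(qid, out, sizeof out->text, type, 0) == -1) {
        perror("msgrcv");
        return -1;
    }
    return 0;
}

/* Creates a private queue, sends and receives one message, removes the queue. */
int round_trip(long type, const char *text, struct demo_msg *out)
{
    int qid = msgget(IPC_PRIVATE, IPC_CREAT | 0600);
    if (qid == -1) {
        perror("msgget");
        return -1;
    }
    int rc = send_checked(qid, type, text);
    if (rc == 0)
        rc = receive_checked(qid, type, out);
    if (msgctl(qid, IPC_RMID, NULL) == -1) {
        perror("msgctl");
        rc = -1;
    }
    return rc;
}
```

With checks like these in place, a failing call reports its errno at the point of failure instead of surfacing later as a mysterious -1 in the response.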

What is the purpose of WSA_WAIT_EVENT_0 in overlapped IO?

All my experience in networking has been on linux so I'm an absolute beginner at windows networking. This is probably a stupid question but I can't seem to find the answer anywhere. Consider the following code snippet:
DWORD Index = WSAWaitForMultipleEvents(EventTotal, EventArray, FALSE, WSA_INFINITE, FALSE);
WSAResetEvent( EventArray[Index - WSA_WAIT_EVENT_0]);
Every time an event is selected from the EventArray, WSA_WAIT_EVENT_0 is subtracted from the index, but WSA_WAIT_EVENT_0 is defined in winsock2.h as being equal to zero.
Why is the code cluttered with this seemingly needless subtraction? Obviously the compiler will optimize it out, but I still don't understand why it's there.
The fact that WSA_WAIT_EVENT_0 is defined as 0 is irrelevant (it is just an alias for WAIT_OBJECT_0 from the WaitFor(Single|Multiple)Object(s)() API, which is also defined as 0 - WSAWaitForMultipleEvents() is itself just a wrapper for WaitForMultipleObjectsEx(), though Microsoft reserves the right to change the implementation in the future without breaking existing user code).
WSAWaitForMultipleEvents() can operate on multiple events at a time, and its return value will be one of the following possibilities:
WSA_WAIT_EVENT_0 .. (WSA_WAIT_EVENT_0 + cEvents - 1): a specific event object was signaled.
WSA_WAIT_IO_COMPLETION: one or more alertable I/O completion routines were executed.
WSA_WAIT_TIMEOUT: a timeout occurred.
WSA_WAIT_FAILED: the function failed.
Typically, code should be looking at the return value and act accordingly, eg:
DWORD ReturnValue = WSAWaitForMultipleEvents(...);
if ((ReturnValue >= WSA_WAIT_EVENT_0) && (ReturnValue < (WSA_WAIT_EVENT_0 + EventTotal)))
{
DWORD Index = ReturnValue - WSA_WAIT_EVENT_0;
// handle event at Index as needed...
}
else if (ReturnValue == WSA_WAIT_IO_COMPLETION)
{
// handle I/O as needed...
}
else if (ReturnValue == WSA_WAIT_TIMEOUT)
{
// handle timeout as needed...
}
else
{
// handle error as needed...
}
Which can be simplified given the fact that bAlertable is FALSE (no I/O routines can be called) and dwTimeout is WSA_INFINITE (no timeout can elapse), so there are only 2 possible outcomes - an event is signaled or an error occurred:
DWORD ReturnValue = WSAWaitForMultipleEvents(EventTotal, EventArray, FALSE, WSA_INFINITE, FALSE);
if (ReturnValue != WSA_WAIT_FAILED)
{
DWORD Index = ReturnValue - WSA_WAIT_EVENT_0;
WSAResetEvent(EventArray[Index]);
}
else
{
// handle error as needed...
}
The documentation says it will return WSA_WAIT_EVENT_0 if event 0 was signaled, WSA_WAIT_EVENT_0 + 1 if event 1 was signaled, and so on.
Sure, they set WSA_WAIT_EVENT_0 to 0 in this version of Windows, but what if it's 1 in the next version, or 100?

Synchronizing the result of threads with incremented shared variable and condition

The title might not appear particularly clear, but the code explains itself:
int shared_variable;
int get_shared_variable() {
int result;
pthread_mutex_lock(&shared_variable_mutex);
result = shared_variable;
pthread_mutex_unlock(&shared_variable_mutex);
return result;
}
void* thread_routine(void *arg) {
while (get_shared_variable() < 5000) {
printf();
printf();
sleep(2);
int i = 0;
while (pthread_mutex_trylock(&foo_mutexes[i]) != 0) {
i++;
pthread_mutex_lock(&foo_count_mutex);
int wrapped = (i == foo_count); // read foo_count under its lock
pthread_mutex_unlock(&foo_count_mutex); // unlock exactly once on every path
if (wrapped) {
sleep(1); // wait one second and retry
i = 0;
}
}
pthread_mutex_lock(&shared_variable_mutex);
shared_variable += 10;
pthread_mutex_unlock(&shared_variable_mutex);
}
return NULL;
}
I'm passing thread_routine to pthread_create (pretty standard), but I'm having a problem with the synchronization of the result. Basically, the first thread checks the while condition and it passes, and then another thread checks it and it passes too. However, when the first thread finishes and shared_variable reaches 5000, the second thread has not yet finished, so it adds another 10 and the end result becomes 5010 (or overshoots by (NUM_OF_THREADS - 1) * 10 if I run more than two threads), while the whole process should end at 5000.
Another issue is that in // do some work I output something on the screen, so the whole thing inside the loop should pretty much work as a transaction in database terms. I can't seem to figure out how to solve this problem, but I suppose there's something simple that I'm missing. Thanks in advance.
This answer may or may not be what you are after because, as explained in the comments, your description of the program's expected behaviour is incomplete. Without the exact expected behaviour it is difficult to give a full answer. But since you ask, here is a possible structure for the program based on the code shown. The main principle it illustrates is that the critical section for shared_variable needs to be both minimal and complete: the check against 5000 and the increment must happen under the same lock acquisition.
int shared_variable;
void* thread_routine(void *arg)
{
while (1) {
pthread_mutex_lock(&shared_variable_mutex);
if (shared_variable >= 5000) {
pthread_mutex_unlock(&shared_variable_mutex);
break;
}
shared_variable += 10;
pthread_mutex_unlock(&shared_variable_mutex);
/* Other code that doesn't use shared_variable goes here */
}
return NULL;
}
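As a rough, runnable version of that structure, the sketch below starts several threads that all run the same check-and-increment loop and returns the final counter. The names run_counter, LIMIT and STEP are my own, and the "other code" part is left out, so treat this as an illustration of the locking pattern only.

```c
#include <assert.h>
#include <pthread.h>

#define LIMIT 5000
#define STEP  10

static int shared_variable;
static pthread_mutex_t shared_variable_mutex = PTHREAD_MUTEX_INITIALIZER;

static void *thread_routine(void *arg)
{
    (void)arg;
    for (;;) {
        /* The test and the increment share one lock acquisition, so no
           other thread can slip in between them. */
        pthread_mutex_lock(&shared_variable_mutex);
        if (shared_variable >= LIMIT) {
            pthread_mutex_unlock(&shared_variable_mutex);
            break;
        }
        shared_variable += STEP;
        pthread_mutex_unlock(&shared_variable_mutex);
        /* Work that does not touch shared_variable would go here. */
    }
    return NULL;
}

/* Runs thread_count threads to completion and returns the final value of
   shared_variable, or -1 if a thread could not be created. */
int run_counter(int thread_count)
{
    enum { MAX_THREADS = 32 };
    pthread_t tids[MAX_THREADS];
    int started = 0;

    assert(thread_count > 0 && thread_count <= MAX_THREADS);
    shared_variable = 0;

    while (started < thread_count &&
           pthread_create(&tids[started], NULL, thread_routine, NULL) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    /* All threads are joined, so reading without the lock is safe here. */
    return started == thread_count ? shared_variable : -1;
}
```

Because LIMIT is a multiple of STEP and the check happens under the lock, the counter stops at exactly LIMIT regardless of how many threads run.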

How to set an int to 1 if dependent on a button and in a while loop?

I'm programming a robot, and unfortunately in its autonomous mode I'm having some issues.
I need to set an integer to 1 when a button is pressed, but in order for the program to recognize the button, it must be in a while loop. As you can imagine, the program ends up in an infinite loop and the integer values end up somewhere near 4,000.
task autonomous()
{
while(true)
{
if(SensorValue[positionSelectButton] == 1)
{
positionSelect = positionSelect + 1;
wait1Msec(0350);
}
}
}
I've managed to get the value by using a wait, but I do NOT want to do this. Is there any other way I can approach this?
Assuming that SensorValue comes from a physical component that is asynchronous to the while loop, and that it is a push button (i.e. not a toggle button):
task autonomous()
{
while(true)
{
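// only count a press once the hold-off window since the last one has elapsed
if(current_time >= next_detect_time && SensorValue[positionSelectButton] == 1)
{
positionSelect = positionSelect + 1;
// no waiting here
next_detect_time = current_time + 350; // 350 ms; written as 0350 it would be an octal literal (232) in C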
}
// carry on to other tasks
if(enemy_is_near)
{
fight();
}
// refresh current_time; built_in_now() is a placeholder for the platform's timer
current_time = built_in_now();
}
}
Get the current time either from a built-in function, or by incrementing an integer counter that wraps around once it reaches its maximum value.
Or if you are in another situation:
task autonomous()
{
while(true)
{
// check whether the flag allows incrementing
if(should_detect && SensorValue[positionSelectButton] == 1)
{
positionSelect = positionSelect + 1;
// no waiting here
should_detect = false;
}
// carry on to other tasks
if(enemy_is_near)
{
if(fight() == LOSING)
should_detect = true;
}
}
}
Try remembering the current position of the button, and only take action when its state changes from off to on.
Depending on the hardware, you might also get a signal as though it flipped back and forth several times in a millisecond. If that's an issue, you might want to also store the timestamp of the last time the button was activated, and then ignore repeat events during a short window after that.
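A small sketch of that idea in plain C, assuming the caller can supply the current button reading and a millisecond timestamp on each pass through the loop. The type button_state, the function button_rising_edge and the 50 ms DEBOUNCE_MS window are all hypothetical choices for illustration, not part of any robot API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEBOUNCE_MS 50u   /* assumed window; tune for the real hardware */

typedef struct {
    bool was_pressed;      /* button reading seen on the previous call */
    bool has_fired;        /* false until the first accepted press */
    uint32_t last_fire_ms; /* timestamp of the last accepted press */
} button_state;

/* Returns true exactly once per accepted press: on an off-to-on
   transition that is not within DEBOUNCE_MS of the previous one. */
bool button_rising_edge(button_state *st, bool pressed, uint32_t now_ms)
{
    assert(st != NULL);

    bool edge = pressed && !st->was_pressed;
    st->was_pressed = pressed;
    if (!edge)
        return false;      /* still held, still released, or just released */

    /* Unsigned subtraction stays correct when the timestamp wraps. */
    if (st->has_fired && (uint32_t)(now_ms - st->last_fire_ms) < DEBOUNCE_MS)
        return false;      /* contact bounce right after a real press */

    st->has_fired = true;
    st->last_fire_ms = now_ms;
    return true;
}
```

In the loop from the question, positionSelect would then be incremented only when this function returns true, with no wait call needed and the rest of the loop free to keep running.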
You could connect the button to an interrupt and then make the necessary change in the interrupt handler.
This might not be the best approach, but it will be the simplest.
From the Vex Robotics catalogue:
(12) Fast digital I/O ports which can be used as interrupts
So most probably whichever Vex micro-controller you are using will support interrupts.
Your question is a bit vague.
I'm not sure why you need this variable to increment or exactly how things work, but I'll give it a try. Explain a bit more about how the robot is meant to move and we will be able to help more.
task autonomous()
{
int buttonPressed=0;
while(true)
{
if(SensorValue[positionSelectButton] == 1)
{
positionSelect = positionSelect +1;
buttonPressed=1;
}
else{
buttonPressed = 0;
}
//use your variables here
if( buttonPressed == 1){
//Move robot front a little
}
}
}
The general idea is:
First you detect all the buttons that are pressed, and then you act on them.
All of this goes in your while loop, which will (and should) run forever (at least as long as your robot is alive :) )
Hope this helps!
