Program getting throttled on Linux (C)

I have the following loop that is supposed to run at a little above 40 Hz (24 ms between runs). It works well for a few seconds to a few minutes, but then drops to 20 Hz for no reason that is apparent to me.
I have tried running the code on an Onion Omega2 (which runs OpenWrt) and on an Ubuntu laptop, with the same result: the rate dropped to 25 Hz on the Omega2 and to 20 Hz on my laptop.
How can I force it to keep running at the right speed? I have checked the output with an oscilloscope, and the send_brk and send_dmx functions seem to run perfectly even when the program is throttled, so I don't think they are the problem.
double t1 = now();
double t2 = now();
while ( 1 ) {
// Attempt receive UDP data
int x = read(sock, &DMX[1], SIZE - 1);
t2 = now();
double dt = t2 - t1;
if(dt < 0.024){
usleep(500);
continue;
}
t1 = t2;
printf("Frame in %f %f/sec\n", dt, 1.0/dt);
if (x < 0) {
if(errno != EAGAIN){
fprintf(stderr, "Error reading from socket %s\n", strerror(errno));
}
}
send_brk();
send_dmx();
}
The functions called are shown here:
send_brk() sends a serial break that lasts 88 µs:
void send_brk() {
ioctl(fd, TIOCSBRK);
double start = now();
while((now() - start) < 0.000088){
}
ioctl(fd, TIOCCBRK);
}
send_dmx() writes a byte buffer of length 512 to the serial port:
void send_dmx() {
// setmode(TX);
int n = write(fd, DMX, SIZE);
if(n == -1) {
if (errno != EAGAIN) {
fprintf(stderr, "Error writing to serial %s\n", strerror(errno));
exit(1);
}
}
if (n != SIZE) {
printf("couldn't write full frame =(\n");
}
}
and now() returns the current time in seconds as a double:
double now() {
struct timeval tv;
gettimeofday(&tv, NULL);
double time = tv.tv_sec;
time = time + (tv.tv_usec / 1000000.0);
return time;
}
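One common way to pace a loop like the one above is to sleep until absolute deadlines on a monotonic clock, rather than polling gettimeofday() with short relative sleeps. The sketch below only illustrates that technique and does not claim to identify the cause of the slowdown described here. The names run_paced, sleep_until, timespec_add_ns and FRAME_NS are hypothetical and do not appear in the original program.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

#define FRAME_NS 24000000L /* 24 ms, the period mentioned in the question */

/* Advance an absolute timestamp by ns nanoseconds, keeping tv_nsec in range. */
static void timespec_add_ns(struct timespec *t, long ns)
{
    t->tv_nsec += ns;
    while (t->tv_nsec >= 1000000000L) {
        t->tv_nsec -= 1000000000L;
        t->tv_sec += 1;
    }
}

/* Sleep until the absolute CLOCK_MONOTONIC time *deadline, retrying on EINTR. */
static void sleep_until(const struct timespec *deadline)
{
    while (clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, deadline, NULL) == EINTR) {
        /* interrupted by a signal: sleep again until the same deadline */
    }
}

/* Call work() once per period for `frames` periods; return elapsed seconds. */
static double run_paced(int frames, long period_ns, void (*work)(void))
{
    struct timespec start, next, end;

    clock_gettime(CLOCK_MONOTONIC, &start);
    next = start;
    for (int i = 0; i < frames; i++) {
        if (work)
            work();                        /* e.g. read + send_brk + send_dmx */
        timespec_add_ns(&next, period_ns); /* deadline derived from the last deadline */
        sleep_until(&next);
    }
    clock_gettime(CLOCK_MONOTONIC, &end);
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

In the original loop, work would stand for the read(), send_brk() and send_dmx() calls. Because each deadline is computed from the previous deadline rather than from the current time, a late wake-up does not shift the frames that follow it.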

Related

Golang TCP Connection Slow when Server Doesn't Have Backlog Available

Update
With the added pthreaded C client, the problem is recreated, indicating the long connection times are part of the TCP protocol, rather than specific implementations. Altering those protocols doesn't seem easily available.
Initial Question
I believe my question is largely: What does the Golang net package do when attempting to connect to a server over TCP and:
The server has no connections available, even in backlog.
The connection is not refused/failed.
There seems to be a large amount of overhead in that connection with server response times ramping up from 5 ms to several seconds. This was seen both in a production environment and in the minimal example below. The proper solution is to use connection pools to the server, which will be implemented. This is largely my curiosity.
To reproduce:
Run server with backlog = 1, run client.go.
All 50 goroutines fire at once, with a total completion time of almost 2 minutes.
Run server with backlog = 100, run client.go.
All 50 goroutines fire at once, queue up connected to the server, and complete in ~260 ms.
Three C clients using 50 µs retry intervals completed their connections within 12 ms on average, so they did not show this issue.
Example output for backlog = 1 (first time is time to dial, second is time to completion):
user#computer ~/tcp-tests $ go run client.go 127.0.0.1:46999
Long Elapsed Time: 216.579µs, 315.196µs
Long Elapsed Time: 274.169µs, 5.970873ms
Long Elapsed Time: 74.4µs, 10.753871ms
Long Elapsed Time: 590.965µs, 205.851066ms
Long Elapsed Time: 1.029287689s, 1.029574065s
Long Elapsed Time: 1.02945649s, 1.035098229s
...
Long Elapsed Time: 3.045881865s, 6.378597166s
Long Elapsed Time: 3.045314838s, 6.383783688s
Time taken stats: 2.85 +/- 1.59 s // average +/- STDEV
Main Taken: 6.384677948s
Example output for backlog = 100:
...
Long Elapsed Time: 330.098µs, 251.004077ms
Long Elapsed Time: 298.146µs, 256.187795ms
Long Elapsed Time: 315.832µs, 261.523685ms
Time taken stats: 0.13 +/- 0.08 s
Main Taken: 263.186955ms
So what's going on under the hood of net.DialTCP (we used other flavors of dial as well, with no discernible difference) that causes the dial time to grow?
Polling time between attempts to make a connection?
An RFC 5681 Global Congestion Control (likely including mutex lock?) variable that gets incremented on all the initial failed connection attempts?
Something else?
I'm leaning towards the first two, as the 1s, 3s, 5s values seem to be magic numbers. They show up both on my modest local machine, and a large scale production environment.
Here is the minimal server written in C. The configuration value of interest is the backlog argument to listen.
/*
Adapted from
https://www.geeksforgeeks.org/tcp-server-client-implementation-in-c/
Compile and run with:
gcc server.c -o server; ./server
*/
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <sys/time.h>
int main(void)
{
int socket_desc, client_sock, client_size;
struct sockaddr_in server_addr, client_addr;
char server_message[2000], client_message[2000];
// Clean buffers:
memset(server_message, '\0', sizeof(server_message));
memset(client_message, '\0', sizeof(client_message));
// Create socket:
socket_desc = socket(AF_INET, SOCK_STREAM, 0);
if(socket_desc < 0){
printf("Error while creating socket\n");
return -1;
}
printf("Socket created successfully\n");
// Set port and IP:
server_addr.sin_family = AF_INET;
server_addr.sin_port = htons(46999);
server_addr.sin_addr.s_addr = inet_addr("127.0.0.1");
// Bind to the set port and IP:
if(bind(socket_desc, (struct sockaddr*)&server_addr, sizeof(server_addr))<0){
printf("Couldn't bind to the port\n");
return -1;
}
printf("Done with binding\n");
// Listen for clients:
// Increasing the backlog allows the Go client to connect and wait
// rather than poll/retry.
if(listen(socket_desc, 100) < 0){
printf("Error while listening\n");
return -1;
}
printf("\nListening for incoming connections.....\n");
// Accept an incoming connection:
client_size = sizeof(client_addr);
int server_run = 1;
do
{
struct timeval start, end;
double cpu_time_used;
gettimeofday(&start, NULL);
client_sock = accept(socket_desc, (struct sockaddr*)&client_addr, &client_size);
if (client_sock < 0){
printf("Can't accept\n");
return -1;
}
// Receive client's message:
if (recv(client_sock, client_message, sizeof(client_message), 0) < 0){
printf("Couldn't receive\n");
return -1;
}
if (strcmp(client_message, "stop") == 0)
{
server_run = 0;
printf("Received stop message.\n");
}
// Respond to client:
strcpy(server_message, "This is the server's message.");
if (send(client_sock, server_message, strlen(server_message), 0) < 0){
printf("Can't send\n");
return -1;
}
// sleep for 5 ms
usleep(5000);
// Closing the socket:
close(client_sock);
gettimeofday(&end, NULL);
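// Elapsed wall-clock time in ms; including tv_sec keeps the result
// correct when tv_usec wraps around between the two samples.
cpu_time_used = (end.tv_sec - start.tv_sec) * 1000.0 + (end.tv_usec - start.tv_usec) / 1000.0;
printf("Server Time: %.4f ms\n", cpu_time_used);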
} while(server_run);
close(socket_desc);
return 0;
}
Here is the testing Go client
/*
Adapted from
https://www.linode.com/docs/guides/developing-udp-and-tcp-clients-and-servers-in-go/
Run once the server.c is compiled and running with:
go run client.go 127.0.0.1:46999
*/
package main
import (
"fmt"
"net"
"os"
"time"
"github.com/montanaflynn/stats"
"sync"
)
func do_message(wg *sync.WaitGroup, connect string, time_taken *float64) {
defer wg.Done()
message := make([]byte, 128)
start_time := time.Now()
pAddr, err := net.ResolveTCPAddr("tcp", connect)
if err != nil {
return
}
c, err := net.DialTCP("tcp", nil, pAddr)
if err != nil {
fmt.Println(err)
return
}
c.SetLinger(0)
dialed_time := time.Since(start_time)
defer func() {
c.Close()
elapsed_time := time.Since(start_time)
if elapsed_time.Microseconds() > 60 { // microseconds
fmt.Println("Long Elapsed Time: " + dialed_time.String() + ", " + elapsed_time.String())
}
*time_taken = float64(elapsed_time.Microseconds())
}()
text := "{\"service\": \"magic_service_str\"}"
c.Write([]byte(text))
code, _ := c.Read(message) // Does not actually wait for response.
code = code
}
func main() {
main_start := time.Now()
arguments := os.Args
if len(arguments) == 1 {
fmt.Println("Please provide host:port.")
return
}
n_messages := 50
wg := new(sync.WaitGroup)
wg.Add(n_messages)
times := make([]float64, n_messages)
for i := 0; i < n_messages; i++ {
// Used to turn the goroutines into serial implementation
// time.Sleep(5500 * time.Microsecond)
go do_message(wg, arguments[1], &times[i])
}
wg.Wait()
avg, _ := stats.Mean(times)
std, _ := stats.StandardDeviation(times)
fmt.Println("Time taken stats: " + fmt.Sprintf("%.2f", avg / 1000000.0) + " +/- " + fmt.Sprintf("%.2f", std / 1000000.0) + " s")
main_taken := time.Since(main_start)
fmt.Println("Main Taken: " + main_taken.String())
}
Updated pthreaded client in C and confirmed the issue is not the Golang implementation:
// gcc client_p.c -o pclient -lpthread
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/time.h>
#include <pthread.h>
#include <errno.h>
#ifndef THREAD_LOOP_COUNT
#define THREAD_LOOP_COUNT 1
#endif
/* Subtract the ‘struct timeval’ values X and Y,
storing the result in RESULT.
Return 1 if the difference is negative, otherwise 0.
https://www.gnu.org/software/libc/manual/html_node/Calculating-Elapsed-Time.html
*/
int
timeval_subtract (struct timeval *result, struct timeval *x, struct timeval *y)
{
/* Perform the carry for the later subtraction by updating y. */
if (x->tv_usec < y->tv_usec) {
int nsec = (y->tv_usec - x->tv_usec) / 1000000 + 1;
y->tv_usec -= 1000000 * nsec;
y->tv_sec += nsec;
}
if (x->tv_usec - y->tv_usec > 1000000) {
int nsec = (x->tv_usec - y->tv_usec) / 1000000;
y->tv_usec += 1000000 * nsec;
y->tv_sec -= nsec;
}
/* Compute the time remaining to wait.
tv_usec is certainly positive. */
result->tv_sec = x->tv_sec - y->tv_sec;
result->tv_usec = x->tv_usec - y->tv_usec;
/* Return 1 if result is negative. */
return x->tv_sec < y->tv_sec;
}
static void* workerThreadFunc(void* arg)
{
int socket_desc;
struct sockaddr_in server_addr;
char server_message[2000], client_message[2000];
// Clean buffers:
memset(server_message,'\0',sizeof(server_message));
memset(client_message,'\0',sizeof(client_message));
// Set port and IP the same as server-side:
server_addr.sin_family = AF_INET;
server_addr.sin_port = htons(46999);
server_addr.sin_addr.s_addr = inet_addr("127.0.0.1");
int retries = 0;
struct timeval start, end, difference;
double cpu_time_used;
for(int i = 0; i < THREAD_LOOP_COUNT; i++)
{
gettimeofday(&start, NULL);
// Create socket:
socket_desc = socket(AF_INET, SOCK_STREAM, 0);
if(socket_desc < 0){
printf("Unable to create socket\n");
return NULL;
}
// Send connection request to server:
while(connect(socket_desc, (struct sockaddr*)&server_addr, sizeof(server_addr)) < 0){
retries++;
if (retries > 10)
{
printf("Unable to connect\n");
retries = 0;
}
usleep(5);
}
retries = 0; // reset the shared retry counter instead of declaring a new one
// Send the message to server:
if(send(socket_desc, client_message, strlen("client message."), 0) < 0){
printf("Unable to send message\n");
close(socket_desc);
return NULL;
}
// Receive the server's response:
if(recv(socket_desc, server_message, sizeof(server_message), 0) < 0){
printf("Error while receiving server's msg\n");
close(socket_desc);
return NULL;
}
// Close the socket:
close(socket_desc);
gettimeofday(&end, NULL);
timeval_subtract (&difference, &end, &start);
double cpu_time_used = (double)difference.tv_sec + (double)difference.tv_usec / 1000000.0;
printf("Client Time: %.4e s\n", cpu_time_used);
}
return NULL; // thread functions must return a void* value
}
int main(int argc, char **argv)
{
int n_threads = 50; // default value
if (argc > 1)
n_threads = atoi(argv[1]);
pthread_t *threads = (pthread_t*)malloc(n_threads * sizeof(pthread_t));
struct timeval start, end, difference;
gettimeofday(&start, NULL);
for(int i = 0; i < n_threads; i++)
{
int createRet = pthread_create(&threads[i], NULL, workerThreadFunc, NULL);
if (createRet != 0)
{
printf("failed to create thread\n");
}
}
for(int i = 0; i < n_threads; i++)
pthread_join(threads[i], NULL);
gettimeofday(&end, NULL);
timeval_subtract (&difference, &end, &start);
double cpu_time_used = (double)difference.tv_sec + (double)difference.tv_usec / 1000000.0;
printf("Total Client Time: %.4e s\n", cpu_time_used);
free(threads);
return 0;
}
As indicated by @user207421, the issue lies in the TCP implementation, which applies an exponential backoff to connection retries. Neither Golang nor C appears to offer an easy way to alter this behavior.
The answer is: don't repeatedly open and close TCP connections if you need high throughput; use a connection pool.
There has been some work on removing the exponential backoff, cited below, but there is likely a better solution for specific cases. There was for me.
ACM SIGCOMM Computer Communication Review, "Removing Exponential Backoff from TCP", Volume 38, Number 5, October 2008.
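To make the backoff concrete, the sketch below computes when connection attempts would be retransmitted if the first timeout is 1 second and every later timeout is double the previous one. The 1 second starting value is an assumption (it is a common default, not something measured here), and syn_retry_schedule is a hypothetical helper name. Under that assumption the first two retries fall at 1 s and 3 s, which lines up with the dial times in the output above.

```c
#include <stddef.h>

/* Fill out[0..n-1] with the time, in seconds after the first attempt, at
 * which the k-th retransmission would be sent. Assumes (hypothetically) an
 * initial timeout of initial_rto seconds that doubles after every
 * unanswered attempt. */
static void syn_retry_schedule(double initial_rto, double *out, size_t n)
{
    double rto = initial_rto;
    double t = 0.0;

    for (size_t k = 0; k < n; k++) {
        t += rto;   /* wait one timeout, then retransmit */
        out[k] = t;
        rto *= 2.0; /* exponential backoff */
    }
}
```

With initial_rto = 1.0 the first four entries are 1, 3, 7 and 15 seconds, so a client whose first attempts are dropped pays at least one of these delays before it connects.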

retrieve data from gpsd without wait time

I am a newbie using gpsd with C. I implemented my first client, which uses the gps_stream function. If I understood correctly, it works like a pub/sub mechanism in which you read GPS data using gps_read. I want to retrieve the data as soon as it is available. The only way I have found is to decrease the timeout passed to the gps_waiting function. I wonder if there is a way to avoid gps_waiting and retrieve the data as soon as possible. My code is below.
int runGpsStreamClient() {
int rc;
int count = 0;
clock_t t;
struct gps_data_t gps_data;
t = clock();
if ((rc = gps_open("localhost", "2947", &gps_data)) == -1) {
printf("code: %d, reason: %s\n", rc, gps_errstr(rc));
return EXIT_FAILURE;
}
get_metric(t, "gps_open");
t = clock();
gps_stream(&gps_data, WATCH_ENABLE | WATCH_JSON, NULL);
get_metric(t, "gps_stream");
while (count < 60) {
/* wait for 0.1 second to receive data */
if (gps_waiting(&gps_data, 100000)) {
t = clock();
int rc = gps_read(&gps_data);
get_metric(t, "gps_read");
/* read data */
if (rc == -1) {
printf("error occurred reading gps data. code: %d, reason: %s\n", rc, gps_errstr(rc));
} else {
/* Display data from the GPS receiver. */
double lat = gps_data.fix.latitude;
double lon = gps_data.fix.longitude;
double alt = gps_data.fix.altitude;
double speed = gps_data.fix.speed;
double climb = gps_data.fix.climb;
time_t seconds = (time_t) gps_data.fix.time;
int status = gps_data.status;
int mode = gps_data.fix.mode;
printf("status[%d], ", status);
printf("mode[%d], ", mode);
printf("latitude[%f], ", lat);
printf("longitude[%f], ", lon);
printf("altitude[%f], ", alt);
printf("speed[%f], ", speed);
printf("v speed[%f], ", climb);
printf("Time[%s].", ctime(&seconds));
if ((status == STATUS_FIX)
&& (mode == MODE_2D || mode == MODE_3D)
&& !isnan(lat) && !isnan(lon)) {
printf(" GPS data OK.\n");
} else {
printf(" GPS data NOK.\n");
}
}
} else {
printf("counter[%d]. Timeout to retrieve data from gpsd. Maybe increase gps_waiting.\n", count);
}
count++;
}
/* When you are done... */
gps_stream(&gps_data, WATCH_DISABLE, NULL);
gps_close(&gps_data);
return EXIT_SUCCESS;
}
Thanks,
Felipe
From the gpsd documentation (emphasis mine):
gps_waiting() can be used to check whether there is new data from the daemon. The second argument is the maximum amount of time to wait (in microseconds) on input before returning. It returns true if there is input waiting, false on timeout (no data waiting) or error condition. When using the socket export, this function is a convenience wrapper around a select(2) call...
gps_waiting(&gps_data, t) will block for up to t microseconds if there is no new data. As soon as new data is received from the GPS, gps_waiting should return. If no new data is received, the function will time out and return after t microseconds.
Getting a faster data rate depends on how fast your GPS is outputting data. Merely decreasing the second parameter of gps_waiting will give you the illusion of faster data rates, but if you check the function's return value, you'll see that all you've done is cause the function to time out sooner.

esp32 idf multi-socket-server

This is my first post, so please ask for anything I didn't provide that might help.
My application requires the master to open multiple sockets at once; the slaves then connect to WiFi and then to the sockets.
The problem is that I have to make it "bulletproof" against constant reconnecting from the slaves, and I get an accept error:
E (23817) TCP SOCKET: accept error: -1 Too many open files in system
It appears when I reconnect a client for the 5th time, with Max Number of Open Sockets = 5 in menuconfig.
I disconnect clients from the server when they don't send anything for 1 second; at that point I assume they have disconnected.
I do this with close().
void closeOvertimedTask(void * ignore)
{
while(1)
{
for(int i = 0; i < openedSockets;)
{
if(needsRestart[i] == 1)
{
ESP_LOGI("RESTARTING", " task#%d",i);
//lwip_close_r(clientSock[i]);
//closesocket(clientSock[i]);
//ESP_LOGI("closing result", "%d", close(clientSock[i]));
stopSocketHandler(i);
needsRestart[i] = 0;
//if(isSocketOpened[i])
{
}
ESP_LOGI("close", "%d", lwip_close_r(clientSock[i]));
isSocketOpened[i] = 0;
xTaskCreate( handleNthSocket, "TCP_HANDLER", 10*1024, &(sockNums[i]) , tskIDLE_PRIORITY, &socketHandlerHandle[i]);
configASSERT(socketHandlerHandle[i]);
needsRestart[i] = 0;
}
if(isSocketOpened[i])
{
int diff = ((int)((uint64_t)esp_timer_get_time()) - lastWDT[i]) - 2*TCPWDT;
if(diff > 0)
{
if(isSocketOpened[i])
{
ESP_LOGI("I FOUND OUT HE DC-d","");
//closesocket(clientSock[i]);
}
ESP_LOGI("close", "%d", close(clientSock[i]));
stopSocketHandler(i);
isSocketOpened[i] = 0;
xTaskCreate( handleNthSocket, "TCP_HANDLER", 10*1024, &(sockNums[i]) , tskIDLE_PRIORITY, &socketHandlerHandle[i]);
configASSERT(socketHandlerHandle[i]);
}
}
}
}
}
For each socket I run one task that is supposed to receive from that socket and act on the data.
A separate task checks when the last message arrived on each socket and restarts the corresponding task when the time limit (2 seconds) has been exceeded.
I need around 16 sockets open in the final version, so there is no room for sockets that are still closing after a slave has restarted its whole connection.
How do I properly stop a task that has a recv() call running in it, so that the socket is closed properly?
Is there a way for the server side to detect that a socket has been closed if the WiFi layer hasn't noticed that the station disconnected?
Is this about TIME_WAIT in the TCP stack?
Socket read code:
void handleNthSocket(void * param) // 0 <= whichSocket < openedSockets
{
int whichSocket = *((int *) param);
ESP_LOGI("TCP SOCKET", "%s #%d", getSpaces(whichSocket), whichSocket);
struct sockaddr_in clientAddress;
while (1)
{
if(needsRestart [whichSocket] == 0)
{
socklen_t clientAddressLength = sizeof(clientAddress);
clientSock[whichSocket] = accept(sock[whichSocket], (struct sockaddr *)&clientAddress, &clientAddressLength);
if (clientSock[whichSocket] < 0)
{
ESP_LOGE("TCP SOCKET", "accept error: %d %s", clientSock[whichSocket], strerror(errno)); //HERE IT FLIPS
//E (232189) TCP SOCKET: accept error: -1 Too many open files in system
isSocketOpened[whichSocket] = 0;
needsRestart[whichSocket] = 1;
continue;
}
//isSocketOpened[whichSocket] = 1;
// We now have a new client ...
int total = 1000;
char dataNP[1000];
char *data;
data = &dataNP[0];
for(int z = 0; z < total; z++)
{
dataNP[z] = 0;
}
ESP_LOGI("TCP SOCKET", "%snew client",getSpaces(whichSocket));
ESP_LOGI(" ", "%s#%d connected",getSpaces(whichSocket), whichSocket);
lastWDT[whichSocket] = (uint64_t)esp_timer_get_time() + 1000000;
isSocketOpened[whichSocket] = 1;
// Loop reading data.
while(isSocketOpened[whichSocket])
{
/*
if (sizeRead < 0)
{
ESP_LOGE(tag, "recv: %d %s", sizeRead, strerror(errno));
goto END;
}
if (sizeRead == 0)
{
break;
}
sizeUsed += sizeRead;
*/
ssize_t sizeRead = recv(clientSock[whichSocket], data, total, 0);
/*for (int k = 0; k < sizeRead; k++)
{
if(*(data+k) == '\n')
{
ESP_LOGI("TCP DATA ", "%sthere was enter", getSpaces(whichSocket));
//ESP_LOGI("TIME ", "%d", (int)esp_timer_get_time());
}
//ESP_LOGI("last wdt", "%d", (int)lastWDT[whichSocket]);
}*/
lastWDT[whichSocket] = (uint64_t)esp_timer_get_time();
int diff = ((int)((uint64_t)esp_timer_get_time()) - lastWDT[whichSocket]) - 2*TCPWDT;
ESP_LOGI("last wdt", "%d, data = %s", (int)lastWDT[whichSocket], data);
if(diff > 0)
{
ESP_LOGI("last wdt", "too long - %d", diff);
isSocketOpened[whichSocket] = 0;
}
if (sizeRead < 0)
{
isSocketOpened[whichSocket] = 0;
}
//TODO: all RX from slave routine
for(int k = 0; k < sizeRead; k++)
{
*(data+k) = 0;
}
// ESP_LOGI("lol data", "clientSock[whichSocket]=%d,
/*if(sizeRead > -1)
{
ESP_LOGI("TCP DATA: ", "%c", *(data + sizeRead-1));
}
else
{
ESP_LOGI("TCP DC ", "");
goto END;
}*/
}
if(isSocketOpened[whichSocket])
{
ESP_LOGI("closing result", "%d", close(clientSock[whichSocket]));
}
}
}
}
I don't see you closing your sockets anywhere?
Sockets, no matter the platform, are usually a limited resource, and one that gets reused. If you don't close the sockets, the system will think that you are still using them and can't reuse them for new connections (and on POSIX systems even opening files will be affected).
So close connections immediately when they are no longer needed.
Usually this is done by checking what recv and send return: if they return a value less than zero, an error occurred, and in most cases it is a non-recoverable error, so the connection should be closed. Even if it is a recoverable error, it's easier to close the connection and let the client reconnect.
For recv there's also the special case when it returns zero. That means the other end has closed the connection, in which case you of course need to close your end as well.
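The rule above can be sketched as a small wrapper around recv(). This is only an illustration using the POSIX socket API; recv_or_close is a hypothetical name, and the assumption that the ESP-IDF socket layer behaves the same way for these return values is mine, not something confirmed in this thread.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Read once from *sock. If recv() reports an error (< 0) or an orderly
 * shutdown by the peer (0), close the descriptor and set *sock to -1 so
 * the slot can be reused. Returns the recv() result unchanged.
 * Following the simplification suggested above, recoverable errors are
 * treated like fatal ones and the client is expected to reconnect. */
static ssize_t recv_or_close(int *sock, void *buf, size_t len)
{
    ssize_t n = recv(*sock, buf, len, 0);

    if (n <= 0) {
        close(*sock);
        *sock = -1;
    }
    return n;
}
```

A receive loop written this way stops as soon as the wrapper returns a value of zero or less, and the descriptor has already been released by then.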
This post solved all my problems:
https://www.esp32.com/viewtopic.php?t=911

How to make read() non-blocking and reset read()

So I made this function, which acts like a countdown. I want to read a command while the countdown decreases. My big problem is making read() wait for input while the countdown is decreasing. As you can see, I tried using select(), but after the first printf("timeout.\n"); it stops trying to read. I made it show the timeout message only once, or else it would keep printing until the countdown reached 0. I need it to try reading again.
int timer(int seconds)
{
time_t start, end;
double elapsed;
int opened=0;
char command[10];
struct timeval tv;
int fd_stdin,rv;
fd_set rd;
fd_stdin=fileno(stdin);
FD_ZERO(&rd);
FD_SET(fileno(stdin),&rd);
tv.tv_sec=5;
tv.tv_usec=0;
time(&start); /* start the timer */
do
{
time(&end);
elapsed = difftime(end, start);
if(fmod(elapsed,5)==0)
{
printf("Time remaining: %f minutes.\n", (seconds-elapsed)/60);
sleep(1);
if(opened==0)
{
printf("Use opentest to open your test.\n");
opened=1;
}
fflush(stdout);
}
int c;
rv=select(fd_stdin+1,&rd,NULL,NULL,&tv);
if(rv==-1)
{
perror("Error on select.\n");
exit(1);
}
else if (rv==0 && c!=1)
{
printf("timeout.\n");
rv=select(fd_stdin+1,&rd,NULL,NULL,&tv);
c=1;
}
else
{
c=0;
read(fd_stdin,command,10);
}
}
while(elapsed < seconds);
return 0;
}
EDIT: To use the fmod() function, I compile like this: gcc client.c -lm -o client.exe. I don't think this is the problem, but I am not sure.
select() modifies the fd_set upon exit to reflect which descriptors have been signaled. You are not resetting the fd_set after each timeout.
Also, on some platforms, select() modifies the timeval structure to reflect how much time is remaining, so you would have to reset the timeval each time you call select() on those platforms.
Also, your c variable is declared inside the loop and is uninitialized. Move it outside the loop instead.
Try something more like this:
int timer(int seconds)
{
time_t start, end;
double elapsed;
int opened = 0;
char command[10];
struct timeval tv;
int fd_stdin, rv;
fd_set rd;
int c = 0;
fd_stdin = fileno(stdin);
time(&start); /* start the timer */
do
{
time(&end);
elapsed = difftime(end, start);
if (fmod(elapsed, 5) == 0)
{
printf("Time remaining: %f minutes.\n", (seconds-elapsed)/60);
sleep(1);
if (opened == 0)
{
printf("Use opentest to open your test.\n");
opened = 1;
}
fflush(stdout);
}
FD_ZERO(&rd);
FD_SET(fd_stdin, &rd);
tv.tv_sec = 5;
tv.tv_usec = 0;
rv = select(fd_stdin+1, &rd, NULL, NULL, &tv);
if (rv == -1)
{
perror("Error on select.\n");
exit(1);
}
else if (rv == 0)
{
if (c != 1)
{
printf("timeout.\n");
c = 1;
}
}
else
{
c = 0;
read(fd_stdin, command, 10);
}
}
while (elapsed < seconds);
return 0;
}

SIGPIPE With Running Program

I have two daemons, and A is speaking to B. B is listening on a port, and A opens a TCP connection to that port. A is able to open a socket to B, but when it attempts to actually write to said socket, I get a SIGPIPE, so I'm trying to figure out where B could be closing the open socket.
However, if I attach to both daemons in gdb, the SIGPIPE happens before any of the code for handling data is called. This kind of makes sense, because the initial write is never successful, and the listeners are triggered from receiving data. My question is - what could cause daemon B to close the socket before any data is sent? The socket is closed less than a microsecond after opening it, so I'm thinking it can't be a timeout or anything of the sort. I would love a laundry list of possibilities to track down, as I've been chewing on this one for a few days and I'm pretty much out of ideas.
As requested, here is the code that accepts and handles communication:
{
extern char *PAddrToString(pbs_net_t *);
int i;
int n;
time_t now;
fd_set *SelectSet = NULL;
int SelectSetSize = 0;
int MaxNumDescriptors = 0;
char id[] = "wait_request";
char tmpLine[1024];
struct timeval timeout;
long OrigState = 0;
if (SState != NULL)
OrigState = *SState;
timeout.tv_usec = 0;
timeout.tv_sec = waittime;
SelectSetSize = sizeof(char) * get_fdset_size();
SelectSet = (fd_set *)calloc(1,SelectSetSize);
pthread_mutex_lock(global_sock_read_mutex);
memcpy(SelectSet,GlobalSocketReadSet,SelectSetSize);
/* selset = readset;*/ /* readset is global */
MaxNumDescriptors = get_max_num_descriptors();
pthread_mutex_unlock(global_sock_read_mutex);
n = select(MaxNumDescriptors, SelectSet, (fd_set *)0, (fd_set *)0, &timeout);
if (n == -1)
{
if (errno == EINTR)
{
n = 0; /* interrupted, cycle around */
}
else
{
int i;
struct stat fbuf;
/* check all file descriptors to verify they are valid */
/* NOTE: selset may be modified by failed select() */
for (i = 0; i < MaxNumDescriptors; i++)
{
if (FD_ISSET(i, GlobalSocketReadSet) == 0)
continue;
if (fstat(i, &fbuf) == 0)
continue;
/* clean up SdList and bad sd... */
pthread_mutex_lock(global_sock_read_mutex);
FD_CLR(i, GlobalSocketReadSet);
pthread_mutex_unlock(global_sock_read_mutex);
} /* END for each socket in global read set */
free(SelectSet);
log_err(errno, id, "Unable to select sockets to read requests");
return(-1);
} /* END else (errno == EINTR) */
} /* END if (n == -1) */
for (i = 0; (i < max_connection) && (n != 0); i++)
{
pthread_mutex_lock(svr_conn[i].cn_mutex);
if (FD_ISSET(i, SelectSet))
{
/* this socket has data */
n--;
svr_conn[i].cn_lasttime = time(NULL);
if (svr_conn[i].cn_active != Idle)
{
void *(*func)(void *) = svr_conn[i].cn_func;
netcounter_incr();
pthread_mutex_unlock(svr_conn[i].cn_mutex);
func((void *)&i);
/* NOTE: breakout if state changed (probably received shutdown request) */
if ((SState != NULL) &&
(OrigState != *SState))
break;
}
else
{
pthread_mutex_lock(global_sock_read_mutex);
FD_CLR(i, GlobalSocketReadSet);
pthread_mutex_unlock(global_sock_read_mutex);
close_conn(i, TRUE);
pthread_mutex_unlock(svr_conn[i].cn_mutex);
pthread_mutex_lock(num_connections_mutex);
sprintf(tmpLine, "closed connections to fd %d - num_connections=%d (select bad socket)",
i,
num_connections);
pthread_mutex_unlock(num_connections_mutex);
log_err(-1, id, tmpLine);
}
}
else
pthread_mutex_unlock(svr_conn[i].cn_mutex);
} /* END for i */
/* NOTE: break out if shutdown request received */
if ((SState != NULL) && (OrigState != *SState))
return(0);
/* have any connections timed out ?? */
now = time((time_t *)0);
for (i = 0;i < max_connection;i++)
{
struct connection *cp;
pthread_mutex_lock(svr_conn[i].cn_mutex);
cp = &svr_conn[i];
if (cp->cn_active != FromClientDIS)
{
pthread_mutex_unlock(svr_conn[i].cn_mutex);
continue;
}
if ((now - cp->cn_lasttime) <= PBS_NET_MAXCONNECTIDLE)
{
pthread_mutex_unlock(svr_conn[i].cn_mutex);
continue;
}
if (cp->cn_authen & PBS_NET_CONN_NOTIMEOUT)
{
pthread_mutex_unlock(svr_conn[i].cn_mutex);
continue; /* do not time-out this connection */
}
/* NOTE: add info about node associated with connection - NYI */
snprintf(tmpLine, sizeof(tmpLine), "connection %d to host %s has timed out after %d seconds - closing stale connection\n",
i,
PAddrToString(&cp->cn_addr),
PBS_NET_MAXCONNECTIDLE);
log_err(-1, "wait_request", tmpLine);
/* locate node associated with interface, mark node as down until node responds */
/* NYI */
close_conn(i, TRUE);
pthread_mutex_unlock(svr_conn[i].cn_mutex);
} /* END for (i) */
return(0);
}
NOTE: I didn't write this code.
Is it possible you messed up and somewhere else in the program you try to close the same handle twice?
That could do this to you very easily.
HINT: systrace can determine if this is happening.
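A double close is dangerous because descriptor numbers are reused: the lowest free number is handed out again, so a second close() on a stale handle can silently close a descriptor that was opened in between. The sketch below demonstrates this with a pipe instead of a socket. double_close_demo is a hypothetical function written for illustration and is not taken from either daemon.

```c
#include <errno.h>
#include <unistd.h>

/* Returns 1 if a stale second close() ended up closing a descriptor that
 * was opened in between, 0 if it did not, -1 on setup failure. */
static int double_close_demo(void)
{
    int p[2];

    if (pipe(p) != 0)
        return -1;

    int stale = p[0];
    close(p[0]);           /* first, legitimate close */

    int fresh = dup(p[1]); /* receives the lowest free descriptor number */
    if (fresh < 0) {
        close(p[1]);
        return -1;
    }

    close(stale);          /* BUG: second close of the old handle */

    /* If the number was reused, `fresh` is no longer a valid descriptor and
     * write() fails with EBADF. The write is only attempted in that case. */
    int hit = (fresh == stale)
           && (write(fresh, "x", 1) == -1)
           && (errno == EBADF);

    if (!hit)
        close(fresh);
    close(p[1]);
    return hit;
}
```

If the same thing happens to a freshly accepted socket in B, the connection is torn down before any handler runs, and A's first write can then fail with SIGPIPE.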
