What are some ways I can throttle back a send()/sendto() call inside a loop? I am creating a port scanner for my network and I have tried two methods, but they only seem to work locally (they work when I test them on my home machine, but when I try them against another machine they don't produce appropriate throttling).
Method 1
I was originally parsing /proc/net/dev, reading the "bytes sent" counter, and basing my sleep time on that. That worked locally (the sleep delay adjusted to regulate the flow of bandwidth), but as soon as I tried it on another server, which also has /proc/net/dev, it did not seem to adjust correctly. I ran dstat on a machine I was scanning locally and it showed far too much data going out far too fast.
Method 2
I then tried keeping a running total of the bytes I had sent in a total_sent variable, which my bandwidth thread would read and use to compute a sleep time. This also worked on my local machine, but on a server the thread saw only 1-2 packets sent each time it checked total_sent, so it reduced the sleep to 0. Even with the sleep at 0, though, total_sent did not increase any faster; it stayed the same.
Overall, I want a way to monitor the bandwidth of the Linux machine and calculate a sleep time I can pass to usleep() before or after each of my send()/sendto() socket calls to throttle back the bandwidth.
Edit: A couple of things I forgot to mention: I have a speed-test function that calculates the upload speed of the machine, and I have two threads. Thread 1 adjusts a global sleep timer based on bandwidth usage, and thread 2 sends the packets to the ports on a remote machine to test whether they are open and to fingerprint them (right now I am just using UDP packets with sendto() to test all of this).
How can I implement bandwidth throttling for a send()/sendto() call using usleep()?
Edit: Here is the code for my bandwidth monitoring thread. Don't concern yourself with the structure stuff, it's just my way of passing data to a thread.
void *bandwidthmonitor_cmd(void *param)
{
int i = 0;
double prevbytes = 0, elapsedbytes = 0, byteusage = 0, maxthrottle = 0;
//recreating my param struct i passed to the thread
command_struct bandwidth = *((command_struct *)param);
free(param);
//set SLEEP (global variable) to a base time in case it was edited and not reset
SLEEP = 5000;
//find the maximum throttle speed in kb/s (takes the global var UPLOAD_SPEED
//which is in kb/s and times it by how much bandwidth % you want to use
//and divides by 100 to find the maximum in kb/s
//ex: UPLOAD_SPEED = 60, throttle = 90, maxthrottle = 54
maxthrottle = (UPLOAD_SPEED * bandwidth.throttle) / 100;
printf("max throttle: %.1f\n", maxthrottle);
while(1)
{
//find out how many bytes elapsed since last polling of the thread
elapsedbytes = TOTAL_BYTES_SEND - prevbytes;
printf("elapsedbytes: %.1f\n", elapsedbytes);
//set prevbytes to our current bytes so we can have results next loop
prevbytes = TOTAL_BYTES_SEND;
//convert our bytes to kb/s
byteusage = 8 * (elapsedbytes / 1024);
//throttle control: intended to adjust SLEEP periodically
//(i & 0x40 is true for 64 iterations out of every 128)
if(i & 0x40)
{
//adjust SLEEP by 1.1 gain
SLEEP += (maxthrottle - byteusage) * -1.1;
if(SLEEP < 0){
SLEEP = 0;
}
printf("sleep:%.1f\n\n", SLEEP);
}
//sleep the thread for a short bit then start the process over
usleep(25000);
//increment variable i for our iteration throttling
i++;
}
}
My sending thread is just a simple sendto() routine in a while(1) loop sending UDP packets for testing. sock is my socket fd, buff is a 64-byte character array filled with "A", and sin is my sockaddr_in.
while(1)
{
TOTAL_BYTES_SEND += 64;
sendto(sock, buff, strlen(buff), 0, (struct sockaddr *) &sin, sizeof(sin));
usleep(SLEEP);
}
I know my socket functions work because I can see the usage in dstat on both my local machine and the remote machine. This bandwidth code works on my local system (all the variables change as they should), but on the server I tested, elapsedbytes does not change (it is always 64 or 128 per iteration of the thread), which drives SLEEP down to 0. In theory that should make the machine send packets faster, but even with SLEEP at 0, elapsedbytes stays at 64/128. I have also wrapped the sendto() call in an if statement that checks for a -1 return and prints the error code, but no error has shown up in the tests I've done.
It seems like this could be most directly solved by calculating the throttle sleep time in the send thread. I'm not sure I see the benefit of another thread to do this work.
Here is one way to do this:
Select a time window in which you will measure your send rate. Your target bandwidth then gives you a maximum number of bytes for that window. After each sendto(), check whether you have sent that many bytes; if you exceed the byte threshold, sleep until the end of the window to perform the throttling.
Here is some untested code showing the idea. Sorry that clock_gettime and struct timespec add some complexity. Google has some nice code snippets for doing more complete comparisons, addition, and subtraction with struct timespec.
#define MAX_BYTES_PER_SECOND (128L * 1024L)
#define TIME_WINDOW_MS 50L
#define MAX_BYTES_PER_WINDOW ((MAX_BYTES_PER_SECOND * TIME_WINDOW_MS) / 1000L)
#include <time.h>
#include <stdlib.h>
int foo(void) {
struct timespec window_start_time;
size_t bytes_sent_in_window = 0;
clock_gettime(CLOCK_REALTIME, &window_start_time);
while (1) {
ssize_t bytes_sent = sendto(sock, buff, strlen(buff), 0, (struct sockaddr *) &sin, sizeof(sin));
if (bytes_sent < 0) {
// error handling
} else {
bytes_sent_in_window += bytes_sent;
if (bytes_sent_in_window >= MAX_BYTES_PER_WINDOW) {
struct timespec now;
struct timespec thresh;
// Calculate the end of the window
thresh.tv_sec = window_start_time.tv_sec;
thresh.tv_nsec = window_start_time.tv_nsec;
thresh.tv_nsec += TIME_WINDOW_MS * 1000000;
if (thresh.tv_nsec >= 1000000000L) {
thresh.tv_sec += 1;
thresh.tv_nsec -= 1000000000L;
}
// get the current time
clock_gettime(CLOCK_REALTIME, &now);
// if we have not gotten to the end of the window yet
if (now.tv_sec < thresh.tv_sec ||
(now.tv_sec == thresh.tv_sec && now.tv_nsec < thresh.tv_nsec)) {
struct timespec remaining;
// calculate the time remaining in the window
// - See google for more complete timespec subtract algorithm
remaining.tv_sec = thresh.tv_sec - now.tv_sec;
if (thresh.tv_nsec >= now.tv_nsec) {
remaining.tv_nsec = thresh.tv_nsec - now.tv_nsec;
} else {
remaining.tv_nsec = 1000000000L + thresh.tv_nsec - now.tv_nsec;
remaining.tv_sec -= 1;
}
// Sleep to end of window
nanosleep(&remaining, NULL);
}
// Reset counters and timestamp for next window
bytes_sent_in_window = 0;
clock_gettime(CLOCK_REALTIME, &window_start_time);
}
}
}
}
If you'd like to do this at the application level, you could use a utility such as trickle to limit or shape the socket transfer rates available to the application.
For instance,
trickle -s -d 50 -w 100 firefox
would start firefox with a maximum download rate of 50 KB/s and a peak detection window of 100 KB. Adjusting these values may produce something suitable for your application testing.
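Since your scanner is sending rather than receiving, trickle's upload limit (-u, also in KB/s) is probably the more relevant knob than -d. Assuming the scanner is a dynamically linked binary named ./scanner (a hypothetical name), something like
trickle -s -u 50 ./scanner
would cap its upload rate at roughly 50 KB/s. Note that trickle works by preloading a wrapper over the socket calls, so it has no effect on statically linked programs.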
Related
I am making a simple sender and receiver to check the time difference between the Reno and Cubic congestion-control algorithms.
In order to make an accurate comparison between them I am using struct timeval instead of clock(), and I am also simulating packet loss with
sudo tc qdisc add dev lo root netem loss 10%
My code transfers the file in two parts, the first part with Cubic and the second part with Reno, repeated for as many iterations as I ask for.
For some reason the timeval result is 0 on the second loop. I tried checking with gdb but could not find out why or how to fix it.
Relevant code:
while (1)
{
struct timeval start_t_cubic, end_t_cubic, tval_result_cubic; // will use them to check the timing
struct timeval start_t_reno, end_t_reno, tval_result_reno; // will use them to check the timing
// set the algorithm to cubic
char *cc = "cubic";
if (setsockopt(client_socket, IPPROTO_TCP, TCP_CONGESTION, cc, strlen(cc)) != 0)
{
printf("setsockopt failed \n");
return;
}
gettimeofday(&start_t_cubic, NULL); // start the time
while (num_of_bytes < BUFSIZE / 2)
{
recv(client_socket, client_message, 1, 0);
num_of_bytes++;
}
gettimeofday(&end_t_cubic, NULL); // finish count for first part of the file
timersub(&end_t_cubic, &start_t_cubic, &tval_result_cubic); // the total time cubic
printf("algo: cubic, time: %ld.%06ld, iter num: %d\n",
(long int)tval_result_cubic.tv_sec,
(long int)tval_result_cubic.tv_usec,
iteration_number);
// change the algorithm to reno
char *cc_algo = "reno"; // the CC algorithm to use (in this case, "reno")
check(setsockopt(server_socket, IPPROTO_TCP, TCP_CONGESTION, cc_algo, strlen(cc_algo)),
"setsockopt failed");
gettimeofday(&start_t_reno, NULL); // start the time
// receive the second half of the file (half a megabyte)
while (num_of_bytes < BUFSIZE)
{
recv(client_socket, client_message, 1, 0);
num_of_bytes++;
}
gettimeofday(&end_t_reno, NULL); // finish timing the second part of the file
timersub(&end_t_reno, &start_t_reno, &tval_result_reno); // the total time reno
printf("algo: reno, time: %ld.%06ld, iter num: %d\n", (long int)tval_result_reno.tv_sec, (long int)tval_result_reno.tv_usec, iteration_number);
// store the time elapsed in a variable
long int time_elapsed_reno = tval_result_reno.tv_sec * 1000000 + tval_result_reno.tv_usec;
}
If someone has come across this issue, I would be glad for any help or hint.
full source code:
https://github.com/dolev146/networking
The problem was that in the second iteration the timed section completed too quickly, so the output was 0.
Presumably the PC/compiler optimizes the code somehow.
I added usleep(1000) so that I don't get a 0 value for the time difference.
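If you run into resolution limits again, clock_gettime() with CLOCK_MONOTONIC gives nanosecond resolution and is not affected by wall-clock adjustments, which makes it a bit more robust than gettimeofday() for short intervals. A minimal sketch of timing one receive phase this way (the recv() loop itself is elided, names otherwise as in the question):
#include <stdio.h>
#include <time.h>

/* Sketch: time one receive phase with CLOCK_MONOTONIC instead of gettimeofday() */
struct timespec t0, t1;
clock_gettime(CLOCK_MONOTONIC, &t0);
/* ... recv() loop for the cubic half goes here ... */
clock_gettime(CLOCK_MONOTONIC, &t1);
long elapsed_us = (t1.tv_sec - t0.tv_sec) * 1000000L
                + (t1.tv_nsec - t0.tv_nsec) / 1000L;
printf("algo: cubic, time: %ld us\n", elapsed_us);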
Currently I'm polling the register to get the expected value, and now I want to reduce the CPU usage and improve performance.
So my idea is: poll for a particular time (say 10 ms), and if we don't get the expected value, wait for some time (e.g. udelay(10*1000) or usleep(10*1000), a delay/sleep in ms), then continue polling for a longer period (say 100 ms). If we still don't get the expected value, sleep/delay for 100 ms, and so on, until we reach the maximum timeout value.
Please let me know if you have any suggestions.
This is the old code:
#include <sys/time.h> /* for setitimer */
#include <unistd.h> /* for pause */
#include <signal.h> /* for signal */
#include <stdio.h> /* for perror, printf */
#include <stdlib.h> /* for exit */
#define INTERVAL 500 //timeout in ms
void DoStuff(void); /* SIGALRM handler, defined below */
static volatile sig_atomic_t timedout = 0;
struct itimerval it_val; /* for setting itimer */
char temp_reg[2];
int main(void)
{
/* Upon SIGALRM, call DoStuff().
* Set interval timer. We want frequency in ms,
* but the setitimer call needs seconds and useconds. */
if (signal(SIGALRM, (void (*)(int)) DoStuff) == SIG_ERR)
{
perror("Unable to catch SIGALRM");
exit(1);
}
it_val.it_value.tv_sec = INTERVAL/1000;
it_val.it_value.tv_usec = (INTERVAL*1000) % 1000000;
it_val.it_interval = it_val.it_value;
if (setitimer(ITIMER_REAL, &it_val, NULL) == -1)
{
perror("error calling setitimer()");
exit(1);
}
do
{
temp_reg[0] = read_reg();
//Read the register here and copy the value into the char array (temp_reg)
if (timedout == 1 )
return -1;//Timedout
} while (temp_reg[0] != 0 );//Check the value and if not try to read the register again (poll)
}
/*
* DoStuff
*/
void DoStuff(void)
{
timedout = 1;
printf("Timer went off.\n");
}
Now I want to optimize this to reduce CPU usage and improve performance.
Can anyone help me with this?
Thanks for your help.
Currently I'm polling the register to get the expected value [...]
wow wow wow, hold on a moment here, there is a huge story hidden behind this sentence. What is "the register"? What is "the expected value"? What does read_reg() do? Are you polling some external hardware? Well then, it all depends on how your hardware behaves.
There are two possibilities:
Your hardware buffers the values that it produces. This means that the hardware will keep each value available until you read it; it will detect when you have read the value, and then it will provide the next value.
Your hardware does not buffer values. This means that values are being made available in real time, for an unknown length of time each, and they are replaced by new values at a rate that only your hardware knows.
If your hardware is buffering, then you do not need to be afraid that some values might be lost, so there is no need to poll at all: just try reading the next value once and only once, and if it is not what you expect, sleep for a while. Each value will be there when you get around to reading it.
If your hardware is not buffering, then there is no strategy of polling and sleeping that will work for you. Your hardware must provide an interrupt, and you must write an interrupt-handling routine that will read every single new value as quickly as possible from the moment that it has been made available.
Here is some pseudocode that might help:
do
{
// Pseudo code
start_time = get_current_time();
do
{
temp_reg[0] = read_reg();
//Read the register here and copy the value into the char array (temp_reg)
if (timedout == 1 )
return -1;//Timedout
// Pseudo code
stop_time = get_current_time();
if (stop_time - start_time > some_limit) break;
} while (temp_reg[0] != 0 );
if (temp_reg[0] != 0)
{
usleep(some_time);
start_time = get_current_time();
}
} while (temp_reg[0] != 0 );
To turn the pseudo code into real code, see https://stackoverflow.com/a/2150334/4386427
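If you want to go straight from the pseudocode to something concrete, here is a rough, untested sketch of the escalating poll/sleep scheme described in the question, using clock_gettime() for the elapsed-time checks. read_reg() and the expected value of 0 are taken from the question's code (the declared signature is an assumption); the function and variable names and the factor-of-10 growth are just illustrative:
#include <time.h>

extern char read_reg(void); /* assumed signature, as used in the question */

/* elapsed milliseconds since *start, using the monotonic clock */
static long elapsed_ms(const struct timespec *start)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    return (now.tv_sec - start->tv_sec) * 1000L
         + (now.tv_nsec - start->tv_nsec) / 1000000L;
}

int poll_reg(long max_timeout_ms)
{
    long window_ms = 10;   /* poll busily for this long ...            */
    long sleep_ms  = 10;   /* ... then sleep this long, then grow both */
    struct timespec start;
    clock_gettime(CLOCK_MONOTONIC, &start);

    while (elapsed_ms(&start) < max_timeout_ms) {
        struct timespec window_start;
        clock_gettime(CLOCK_MONOTONIC, &window_start);
        while (elapsed_ms(&window_start) < window_ms) {
            if (read_reg() == 0)       /* expected value found */
                return 0;
        }
        struct timespec nap = { sleep_ms / 1000, (sleep_ms % 1000) * 1000000L };
        nanosleep(&nap, NULL);         /* back off before polling again */
        window_ms *= 10;               /* widen the next poll window */
        sleep_ms  *= 10;               /* and lengthen the next sleep */
    }
    return -1;                         /* reached the maximum timeout */
}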
I'm using sleep like this to grab a frame every 1/25th of a second. OS is Debian 6 armel.
#define VIDEO_FRAME_RATE 25.0f
while (RECORDING) {
sprintf(buffer, "Someting from a data struct that is updated\n");
fprintf(Output, buffer);
usecToSleep = (1.0f/VIDEO_FRAME_RATE) * 1000000;
usleep(usecToSleep);
}
Question: What is the guarantee that the loop will output the buffer to Output file descriptor at every 1/25th of a second?
Is there a better way to do this in C? I need it to be as precise as possible to prevent drifting.
Thank you.
Your "recording operation" still takes some time...
So you need to calculate the time that should be spent on one frame (usecToSleep = (1.0f/VIDEO_FRAME_RATE) * 1000000;), then calculate the time the operation really took and adjust the sleep time accordingly:
usecToSleep = (1.0f/VIDEO_FRAME_RATE) * 1000000;
lastFrameUsec = 0;
while (RECORDING) {
sprintf(buffer, "Someting from a data struct that is updated\n");
fprintf(Output, buffer);
// currentFrameUsec - lastFrameUsec = actual time spent on the operation
currentFrameUsec = getUsecElapsedFromStart();
actualSleep = usecToSleep - (currentFrameUsec - lastFrameUsec);
// If there's time to sleep left, sleep
if(actualSleep > 0){
usleep(actualSleep);
lastFrameUsec = getUsecElapsedFromStart();
} else {
lastFrameUsec = currentFrameUsec;
}
}
I'm not aware of a multi-platform getUsecElapsedFromStart(), so you will probably have to implement your own, for example like this one:
int getUsecElapsedFromStart(const struct timespec *tstart)
{
    struct timespec tnow;
    clock_gettime(CLOCK_MONOTONIC, &tnow);
    return (int)((tnow.tv_sec - tstart->tv_sec) * 1000000L +
                 (tnow.tv_nsec - tstart->tv_nsec) / 1000L);
}
struct timespec tstart;
clock_gettime(CLOCK_MONOTONIC, &tstart);
while(RECORDING){
// ...
currentFrameUsec = getUsecElapsedFromStart(&tstart);
}
In response to your first question, there is no such guarantee. usleep() promises only that it will sleep at least as long as you tell it to. But it may sleep longer:
The usleep() function suspends execution of the calling process for (at least) usec microseconds. The sleep may be lengthened slightly by any system activity or by the time spent processing the call or by the granularity of system timers.
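If drift is the main concern, another option (beyond compensating for the per-frame work time as in the earlier answer) is to sleep until an absolute deadline with clock_nanosleep() and CLOCK_MONOTONIC, so the error does not accumulate from frame to frame. A minimal sketch, reusing the RECORDING flag from the question and eliding the frame-writing work:
#include <time.h>

#define FRAME_NS (1000000000L / 25L)   /* 25 frames per second */

/* Sketch: pace the loop against an absolute deadline so error does not accumulate */
struct timespec next;
clock_gettime(CLOCK_MONOTONIC, &next);
while (RECORDING) {
    /* ... write the frame here ... */

    next.tv_nsec += FRAME_NS;          /* advance the deadline by one frame period */
    if (next.tv_nsec >= 1000000000L) {
        next.tv_sec += 1;
        next.tv_nsec -= 1000000000L;
    }
    /* sleep until the absolute deadline (interruption handling omitted) */
    clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
}
With TIMER_ABSTIME the call sleeps until a point in time rather than for a duration, so a frame that takes longer to produce simply leaves less time to sleep instead of pushing every later frame back.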
I am doing some trivial benchmarking of writing x lines of the same text into a file using two methods:
Direct fwrite.
Make a new thread and communicate with it via an asynchronous queue (the main thread inserts on one side and the other thread reads from the other). This method is used to try to minimize the slowest writes (due to flushing).
This is a snippet of the code which should give a basic idea of the program:
int i;
char * buf;
int buf_size;
double local_start, local_end, global_start, global_end;
double slowest, fastest;
double local_time_difference;
buf = "A string to be printed to a file \n";
buf_size = strlen(buf);
fastest = MAX_WRITE_TIME;
slowest = 0;
logger_init(atoi(argv[1]));
global_start = get_time();
for(i = 0 ; i < 100000000 ; i++)
{
local_start = get_time();
logger_write(buf, buf_size);
local_end = get_time();
local_time_difference = local_end-local_start;
if(local_time_difference < fastest && local_time_difference != 0)
fastest = local_time_difference;
if(local_time_difference > slowest)
slowest = local_time_difference;
if(i % 10000 == 0)
usleep(1);
}
global_end = get_time();
printf("Fastest: %1.9f\nSlowest: %1.9f\nTotal Time: %1.9f\n", fastest, slowest, global_end-global_start);
logger_destroy();
The get_time() procedure returns the time in seconds, with microsecond resolution:
double get_time()
{
struct timeval t;
struct timezone tzp;
gettimeofday(&t, &tzp);
return t.tv_sec + t.tv_usec*1e-6;
}
Depending on the argument passed to logger_init, logger_write will either write directly to the file or insert the message into the queue (the size of the queue must not exceed a particular limit). GAsyncQueue is being used.
The method I'm currently using to calculate the fastest and slowest write certainly works, but my question is: is there a tool or profiler that would do this for me, i.e. give me statistics for each function (maximum, minimum and average execution time per call)?
Tools that I've tried so far but had no luck with:
gprof
Zoom
Kcachegrind
VTune
TL;DR
I am looking for a tool that gives me the min, max and average execution time of a particular function, not just the overall time taken.
Use the correct high-resolution OS API functions for benchmarking.
Don't calculate execution times from inside the measurement itself, especially not if you are using floating-point numbers.
Why are you calling a sleep function? Are you trying to force a context switch or some oddity like that? The OS will likely handle that better and more efficiently than your program.
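If no existing tool does exactly this, one low-tech option is a thin timing wrapper around the function of interest using clock_gettime(CLOCK_MONOTONIC) and integer nanoseconds (avoiding the double-based gettimeofday() arithmetic). A rough sketch, wrapping the logger_write() call from the question; the wrapper and statistic names are made up for illustration:
#include <stdio.h>
#include <time.h>

/* Sketch: accumulate per-call statistics for logger_write() in integer nanoseconds */
static long long min_ns = -1, max_ns = 0, total_ns = 0, calls = 0;

static long long now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

static void timed_write(char *buf, int len)
{
    long long t0 = now_ns();
    logger_write(buf, len);              /* function under test (from the question) */
    long long dt = now_ns() - t0;
    if (min_ns < 0 || dt < min_ns) min_ns = dt;
    if (dt > max_ns) max_ns = dt;
    total_ns += dt;
    calls++;
}

static void report(void)
{
    printf("min: %lld ns, max: %lld ns, avg: %lld ns over %lld calls\n",
           min_ns, max_ns, calls ? total_ns / calls : 0, calls);
}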
gcc (GCC) 4.6.0 20110419 (Red Hat 4.6.0-5)
I am trying to get the start and end times and the difference between them.
The function I have is for creating an API for our existing hardware.
The API wait_events takes one argument, a timeout in milliseconds. So I get the start time before the while loop, using time() to get the number of seconds; then, after each iteration of the loop, I get the time difference and compare that difference with the timeout.
Many thanks for any suggestions.
/* Wait for an event up to a specified time out.
* If an event occurs before the time out, return 0.
* If it times out before an event occurs, return -1. */
int wait_events(int timeout_ms)
{
time_t start = 0;
time_t end = 0;
double time_diff = 0;
/* convert to seconds */
int timeout = timeout_ms / 100;
/* Get the initial time */
start = time(NULL);
while(TRUE) {
if(open_device_flag == TRUE) {
device_evt.event_id = EVENT_DEV_OPEN;
return TRUE;
}
/* Get the end time after each iteration */
end = time(NULL);
/* Get the difference between times */
time_diff = difftime(start, end);
if(time_diff > timeout) {
/* timed out before getting an event */
return FALSE;
}
}
}
The calling code will look like this:
int main(void)
{
#define TIMEOUT 500 /* 1/2 sec */
while(TRUE) {
if(wait_events(TIMEOUT) != 0) {
/* Process incoming event */
printf("Event fired\n");
}
else {
printf("Event timed out\n");
}
}
return 0;
}
=============== EDIT with updated results ==================
1) With no sleep -> 99.7% - 100% CPU
2) Setting usleep(10) -> 25% CPU
3) Setting usleep(100) -> 13% CPU
4) Setting usleep(1000) -> 2.6% CPU
5) Setting usleep(10000) -> 0.3 - 0.7% CPU
You're overcomplicating it - simplified:
time_t start = time(NULL);
for (;;) {
// try something
if (time(NULL) > start + 5) {
printf("5s timeout!\n");
break;
}
}
time_t is in general just an int or long int, depending on your platform, counting the number of seconds since January 1st, 1970.
Side note:
int timeout = timeout_ms / 1000;
One second consists of 1000 milliseconds.
Edit - another note:
You'll most likely have to ensure that the other thread(s) and/or event handling can happen, so include some kind of thread inactivity (using sleep(), nanosleep() or whatever).
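Putting the two notes together, here is a minimal, untested sketch of wait_events() with the milliseconds-to-seconds conversion fixed and a short nanosleep() per iteration so the loop does not spin at 100% CPU; open_device_flag, device_evt and EVENT_DEV_OPEN are the names from the question, and the return values follow the comment there (0 on event, -1 on timeout):
#include <time.h>

int wait_events(int timeout_ms)
{
    time_t start = time(NULL);
    int timeout = timeout_ms / 1000;                /* 1000 ms per second */
    struct timespec nap = { 0, 10 * 1000 * 1000 };  /* 10 ms between checks */

    for (;;) {
        if (open_device_flag) {
            device_evt.event_id = EVENT_DEV_OPEN;
            return 0;                               /* event occurred */
        }
        if (difftime(time(NULL), start) > timeout)
            return -1;                              /* timed out */
        nanosleep(&nap, NULL);                      /* yield the CPU between checks */
    }
}
Note that time() only has one-second resolution, so a 500 ms timeout effectively rounds up to a second; for sub-second timeouts you would have to switch to clock_gettime() and compare milliseconds instead.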
Without calling a Sleep() function, this is a really bad design: your loop will use 100% of the CPU. Even if you are using threads, your other threads won't get much time to run, as this thread will consume many CPU cycles.
You should design something like that:
while(true) {
Sleep(100); // let's say you want a precision of 100 ms (e.g. usleep(100 * 1000) on POSIX)
// Do the compare time stuff here
}
If you need precise timing and are using different threads/processes, use mutexes (semaphores with an increment/decrement of 1) or critical sections to make sure the time comparison in your function is not interrupted by another of your own processes/threads.
I believe your Red Hat is System V based, so you can synchronize using System V IPC.
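If the cooperating pieces are threads within the same process, a pthread mutex is the simplest form of this. A minimal sketch (the lock name and the critical-section contents are illustrative, not from the question):
#include <pthread.h>

/* One mutex shared by the threads that touch the timing state */
static pthread_mutex_t time_lock = PTHREAD_MUTEX_INITIALIZER;

/* inside the loop, around the time comparison */
pthread_mutex_lock(&time_lock);
/* ... read the clock and compare against the timeout here ... */
pthread_mutex_unlock(&time_lock);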