Measure TCP connection speed - C

I want to write a simple Unix application that measures TCP connection speed.
So I have:
the server listens on a specified port, accepts connections and measures the speed
the client sends messages continuously
I thought the measurement on the server would be something like this:
clock_gettime(CLOCK_REALTIME, &start);
size = recv(csocket_fd, buf, BUFFER_SIZE, 0);
clock_gettime(CLOCK_REALTIME, &end);
but it seems like that's the wrong way to do it.
Any suggestions?

On the server, when you receive the first data from the client, record the current time to a variable.
Also on the server, whenever you receive data from the client, add the number of bytes received to a counter variable.
Then at any time you want, you can calculate the cumulative average bytes-per-second speed of the connection by calculating (total_bytes_received)/(current_time - first_data_received_time); (Watch out for a potential divide by zero if current_time and first_data_received_time are equal!)
If you want to do something more elaborate, like a running average over the last 10 seconds, that's a little more involved, but computing the cumulative average is pretty easy.
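A minimal C sketch of that bookkeeping, reusing csocket_fd, buf and BUFFER_SIZE from the question (first_data_time and total_bytes are names I'm assuming for illustration):
/* once per connection, before the receive loop (requires <time.h>) */
time_t first_data_time = 0;
long long total_bytes = 0;

/* inside the server's receive loop */
ssize_t n = recv(csocket_fd, buf, BUFFER_SIZE, 0);
if (n > 0) {
    if (total_bytes == 0)
        first_data_time = time(NULL);        /* first data from the client */
    total_bytes += n;
}

/* at any time afterwards: cumulative average bytes per second */
double elapsed = difftime(time(NULL), first_data_time);
double bytes_per_sec = (elapsed > 0) ? total_bytes / elapsed : 0.0;   /* guard against divide by zero */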

I have done a few assignments in networking, and one thing I noticed is that it wouldn't work the way you are trying. You have to complete a send-receive exchange before the server can receive again, and there are other factors that get in the way, from MTUs to buffer sizes etc. What I have used to benchmark bandwidth before is netperf (is this the speed you are talking about?). The code is open source.

Related

Flink Dashboard Throughput doesn't add up

I have two operators, a source and a map. The incoming throughput of the map is stuck at just above 6K messages/s, whereas the message count reaches the size of the whole stream (~350K) in under 20 s (see duration). 350000/20 means that I have a throughput of at least 17500 and not 6000 as Flink suggests! What's going on here?
as shown in the picture:
start time = 13:10:29
all messages are already read by = 13:10:46 (less than 20s)
I checked the Flink library code, and it seems that the numRecordsOutPerSecond statistic (as well as the other similar ones) operates on a window. This means that it displays the average throughput of the last X seconds, not the average throughput of the whole execution.

Is there an easy way to get the percentage of successful reads of last x minutes?

I have a setup with a BeagleBone Black which communicates over I²C with its slaves every second and reads data from them. Sometimes the I²C readout fails, though, and I want to gather statistics about these failures.
I would like to implement an algorithm which displays the percentage of successful communications over the last 5 minutes (up to 24 hours) and updates that value constantly. If I implemented that 'normally', with an array where I store the success/no-success of every second, that would mean a lot of wasted RAM/CPU load for a minor feature (especially if I want to see the statistics of the last 24 hours).
Does someone know a good way to do that, or can anyone point me in the right direction?
Why don't you just implement a low-pass filter? For every successful transfer you push in a 1, for every failed one a 0; the result is a number between 0 and 1. Assuming that your transfers happen periodically, this works well; you just have to adjust the cutoff frequency of the filter to your desired "averaging duration".
However, I can't follow your RAM argument: assuming you store one byte representing success or failure per transfer, which you say happens every second, you end up with 86400 bytes per day; roughly 85 KB/day is really negligible.
EDIT Cutoff frequency is something from signal theory and describes the highest or lowest frequency that passes a low or high pass filter.
Implementing such a low-pass filter is trivial; in C it looks something like this:
double new_val = 1.0;        /* init: assume no failed transfers */
const double alpha = 0.001;
while (1) {
    double old_val = new_val;
    int success = do_transfer_and_return_1_on_success_or_0_on_failure();
    new_val = alpha * success + (1.0 - alpha) * old_val;
}
That's a single-tap IIR (infinite impulse response) filter; single tap because there's only one alpha and thus only one number stored as state.
EDIT2: the value of alpha defines the behaviour of this filter: the smaller alpha is, the more slowly old values are forgotten, i.e. the longer the effective averaging window.
EDIT3: you can use a filter design tool to give you the right alpha; just set your low pass filter's cutoff frequency to something like 0.5/integrationLengthInSamples, select an order of 0 for the IIR and use an elliptic design method (most tools default to butterworth, but 0 order butterworths don't do a thing).
I'd use scipy and convert the resulting (b,a) tuple (a will be 1, here) to the correct form for this feedback form.
UPDATE: In light of the OP's comment 'determine a trend of which devices are failing', I would recommend the geometric average that Marcus Müller put forward.
ACCURATE METHOD
The method below is aimed at obtaining 'well defined' statistics for performance over time that are also useful for 'after the fact' analysis.
Notice that the geometric average 'looks back' over recent messages rather than over a fixed time period.
Maintain a rolling array of 24*60/5 = 288 'prior success rates' (SR[i] with i=-1, -2,...,-288) each representing a 5 minute interval in the preceding 24 hours.
That will consume about 2.3 KB if the elements are 64-bit doubles.
To 'effect' constant updating use an Estimated 'Current' Success Rate as follows:
ECSR = (t*S/M+(300-t)*SR[-1])/300
Where S and M are the counts of successes and messages in the current (partially complete) period. SR[-1] is the previous (now complete) bucket.
t is the number of seconds expired of the current bucket.
NB: When you start up you need to use 300*S/M/t.
In essence the approximation assumes the error rate was steady over the preceding 5 - 10 minutes.
To 'effect' a 24-hour look-back you can either 'shuffle' the data down (by copy or memcpy()) at the end of each 5-minute interval, or implement a circular array by keeping track of the current bucket index.
NB: For many management/diagnostic purposes intervals of 15 minutes are often entirely adequate. You might want to make the 'grain' configurable.
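For concreteness, here is a minimal C sketch of the circular-array variant with 5-minute buckets; the names and the single update_and_estimate() helper are illustrative assumptions, not part of the original description:
#define BUCKETS     288      /* 24 h of 5-minute intervals */
#define BUCKET_SECS 300

static double sr[BUCKETS];   /* prior success rates, SR[i]               */
static int    cur;           /* index of the current bucket              */
static long   succ, msgs;    /* successes and messages in current bucket */

/* Call once per transfer attempt; success is 1 or 0,
   t is the number of seconds elapsed in the current bucket. */
double update_and_estimate(int success, int t)
{
    succ += success;
    msgs++;

    if (t >= BUCKET_SECS) {                     /* bucket complete: roll over          */
        sr[cur] = (double)succ / msgs;
        cur = (cur + 1) % BUCKETS;              /* circular array, no shuffling needed */
        succ = msgs = 0;
        t = 0;
    }

    double current = msgs ? (double)succ / msgs : 1.0;
    int prev = (cur + BUCKETS - 1) % BUCKETS;   /* SR[-1]: previous (complete) bucket  */
    /* ECSR = (t*S/M + (300-t)*SR[-1]) / 300 */
    /* NB: during the very first bucket sr[prev] is still 0; see the start-up note above */
    return (t * current + (BUCKET_SECS - t) * sr[prev]) / BUCKET_SECS;
}
Switching the 'grain' to 15-minute intervals only changes BUCKETS and BUCKET_SECS.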

Persistent Connection on a web server HTTP1.1

I'm trying to write a web server in C under Linux using HTTP/1.1.
I've used select() for multiple requests, and I'd like to implement persistent connections, but it hasn't worked so far because I can't set a timeout properly. How can I do it? I was thinking about the setsockopt() function:
setsockopt(connsd, SOL_SOCKET, SO_RCVTIMEO, (char *)&tv, sizeof(tv))
where tv is a struct timeval. This isn't working either.
Any suggestions?
SO_RCVTIMEO will only work when you are actually reading data; select() won't honor it. select() takes a timeout parameter as its last argument. If you have a timer data structure to organize which connections should time out in what order, then you can pass the timeout of the connection that will expire soonest to select(). If the return value is 0, then a timeout has occurred, and you should expire all timed-out connections. After processing live connections (and resetting their idle timeouts in your timer data structure), you should again check whether any connections should be timed out before calling select() again.
There are various data structures you can use, but popular ones include the timing wheel and timer heap.
A timing wheel is basically an array organized as a circular buffer, where each buffer position represents a time unit. If the wheel units is in seconds, you could construct a 300 element array to represent 5 minutes of time. There is a sticky index which represents the last time any timers were expired, and the current position would be the current time modulo the size of the array. To add a timeout, calculate the absolute time it needs to be timed out, modulo that by the size of the array, and add it to the list at that array position. All buckets between the last index and the current position whose time out has been reached need to be expired. After expiring the entries, the last index is updated to the current position. To calculate the time until the next expiration, the buckets are scanned starting from the current position to find a bucket with an entry that will expire.
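A rough C sketch of such a wheel with one-second slots; the struct layout, the fixed 300-slot size, and the expiry callback are illustrative assumptions:
#include <time.h>

#define WHEEL_SLOTS 300                      /* 5 minutes at one-second resolution          */

struct timer {
    time_t expires;                          /* absolute expiry time                        */
    struct timer *next;                      /* timers sharing a slot form a list           */
};

static struct timer *wheel[WHEEL_SLOTS];
static time_t last_expired;                  /* the "sticky" index; set to time(NULL) at start-up */

static void wheel_add(struct timer *t, time_t now, int timeout_secs)
{
    t->expires = now + timeout_secs;
    int slot = t->expires % WHEEL_SLOTS;     /* absolute time modulo wheel size             */
    t->next = wheel[slot];
    wheel[slot] = t;
}

static void wheel_expire(time_t now, void (*on_expire)(struct timer *))
{
    for (time_t tick = last_expired + 1; tick <= now; tick++) {
        struct timer **pp = &wheel[tick % WHEEL_SLOTS];
        while (*pp) {
            if ((*pp)->expires <= now) {     /* its timeout has been reached: expire it     */
                struct timer *dead = *pp;
                *pp = dead->next;
                on_expire(dead);
            } else {
                pp = &(*pp)->next;
            }
        }
    }
    last_expired = now;
}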
A timer heap is basically a priority queue, where entries that expire sooner have higher priority than entries that expire later. The top of a non-empty heap determines the time to next expiration.
If your application is inserting lots and lots of timers all the time, and then cancelling them all the time, a wheel may be more appropriate, as inserting into and removing from the wheel is more efficient than inserting into and removing from a priority queue.
The simplest solution is probably to keep a last-time-request-received timestamp for each connection, then regularly check that time, and if it is too long ago, close the connection. For example:
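This is a minimal sketch of that idea, assuming an IDLE_TIMEOUT constant, a conns array holding each connection's fd and last_active time, and an fd_set readfds built elsewhere:
/* before each call to select() */
time_t now = time(NULL);
struct timeval tv = { .tv_sec = IDLE_TIMEOUT, .tv_usec = 0 };

for (int i = 0; i < nconns; i++) {
    time_t idle = now - conns[i].last_active;
    if (idle >= IDLE_TIMEOUT) {
        close(conns[i].fd);                     /* idle too long: drop the connection   */
        conns[i] = conns[--nconns];             /* remove it from the array             */
        i--;
    } else if (IDLE_TIMEOUT - idle < tv.tv_sec) {
        tv.tv_sec = IDLE_TIMEOUT - idle;        /* wake up in time for the next expiry  */
    }
}

select(maxfd + 1, &readfds, NULL, NULL, &tv);
/* whenever a connection has readable data: conns[i].last_active = time(NULL); */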

c - Multiple select()s to monitor multiple FD_SETs

I'm not an expert in Network Programming. I basically have two kinds of clients who have different time-outs. I am supposed to use UDP with connected sockets for client-server communication.
The problem is twofold:
a) I need to mark as dead whichever client (alternatively, socket) does not respond for t1 seconds. Using select would time out if none of the sockets in read_fd_set has anything to read within the timeout value. So, how do I time out any one socket that has had no data to read for quite some time?
Currently, whenever select returns, I keep track myself of which sockets are responding and which are not, and I add t1.tv_sec to the elapsed idle time of each client (socket). Then I manually close, and exclude from the FD_SET, any socket which does not respond for (n) * (t1.tv_sec) time. Is this a good enough approach?
b) The main problem is that there are two kinds of clients which have different time-outs, t1 and t2. How do I handle this?
Can I have two select()s for the two kinds of clients in the same loop? Would it cause starvation without threads? Is using threads advisable (or even required) in this case?
I've been roaming around the web for ages!
Any help is much appreciated.
This is just a special case of a very common pattern, where a select/poll loop is associated with a collection of timers.
You can use a priority queue of tasks, ordered by next (absolute) firing time; the select timeout is then just the time remaining until the firing time at the front of the queue (sketched below).
when select times out (and just before the next iteration, if your tasks may take a long time to complete), get the current time, pull every task that should already have executed off the queue, and execute it
(some) tasks will need to be re-scheduled, so make sure they can mutate the priority queue while you do this
Then your logic is trivial:
on read, mark the socket busy
on timer execution, mark the socket idle
if it was already idle, that means nothing was received since the last timer expiry: it's dead
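A small C sketch of deriving the select timeout from the queue head; the min-heap API here (heap_empty()/heap_peek_min() on a timers heap) is a hypothetical placeholder, and readfds/maxfd are built as usual:
struct timeval tv, *tvp = NULL;                         /* NULL timeout = block forever   */
time_t now = time(NULL);

if (!heap_empty(&timers)) {
    time_t next_fire = heap_peek_min(&timers);          /* earliest absolute firing time  */
    tv.tv_sec  = next_fire > now ? next_fire - now : 0; /* convert to a relative timeout  */
    tv.tv_usec = 0;
    tvp = &tv;
}

int ready = select(maxfd + 1, &readfds, NULL, NULL, tvp);
if (ready == 0) {
    /* timed out: pop and run every task whose firing time <= time(NULL) */
}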
A quick solution that comes to my mind, is to keep the sockets in a collection sorted by the time remaining until the nearest timeout.
Use select with the timeout set to the smallest time remaining, remove/close/delete the timed-out socket from the collection, and repeat.
So, in C it might look something like this (with the bookkeeping steps as helper functions):
/* conns: array of struct { int fd; int timeout; int time_remaining; }, time_remaining initialised to timeout */
for (;;) {
    sort_by_time_remaining(conns, nconns);                 /* soonest timeout first              */
    struct timeval tv = { .tv_sec = conns[0].time_remaining, .tv_usec = 0 };
    fd_set rfds;
    FD_ZERO(&rfds);
    for (int i = 0; i < nconns; i++)
        FD_SET(conns[i].fd, &rfds);
    select(max_fd(conns, nconns) + 1, &rfds, NULL, NULL, &tv);
    update_all_time_remaining(conns, nconns);              /* subtract the elapsed time          */
    remove_timed_out(conns, &nconns);                      /* close sockets whose time ran out, i.e. a timeout occurred */
}
It can easily be solved with a single select call. For each socket, keep two values related to the timeout: the actual timeout, and the amount of time remaining until timeout. Then count down the "time until timeout" every 0.1 seconds (or similar), and when it reaches zero, close the socket. If the socket receives traffic before the timeout, simply reset the "time until timeout" to the timeout value and start the countdown again.
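A compact sketch of that countdown variant; conns, rfds, maxfd and the handle_traffic() helper are assumptions, and because each connection carries its own timeout_ms, the two client classes with different timeouts fall out naturally:
#define TICK_MS 100                                       /* wake up every 0.1 s           */

struct timeval tv = { .tv_sec = 0, .tv_usec = TICK_MS * 1000 };
int ready = select(maxfd + 1, &rfds, NULL, NULL, &tv);

/* for better accuracy, subtract the actually elapsed time instead of a fixed tick */
for (int i = 0; i < nconns; i++) {
    if (ready > 0 && FD_ISSET(conns[i].fd, &rfds)) {
        handle_traffic(&conns[i]);                        /* hypothetical handler          */
        conns[i].ms_left = conns[i].timeout_ms;           /* traffic seen: reset countdown */
    } else if ((conns[i].ms_left -= TICK_MS) <= 0) {
        close(conns[i].fd);                               /* counted down to zero: dead    */
    }
}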

Socket : measure data transfer rate (in bytes / second) between 2 applications

I have an application that keeps emitting data to a second application (consumer application) using TCP socket. How can I calculate the total time needed from when the data is sent by the first application until the data is received by the second application? Both the applications are coded using C/C++.
My current approach is as follow (in pseudocode):
struct packet{
long sent_time;
char* data;
}
FIRST APP (EMITTER) :
packet p = new packet();
p.data = initialize data (either from file or hard coded)
p.sent_time = get current time (using gettimeofday function)
//send the packet struct (containing sent time and packet data)
send (sockfd, p, ...);
SECOND APP (CONSUMER)
packet p = new packet();
nbytes = recv (sockfd, p, .....); // get the packet struct (which contains the sent time and data)
receive_time = get current time
data_transfer_time = receive_time - p.sent_time (assume I have converted this to seconds)
data_transfer_rate = nbytes / data_transfer_time; // in bytes per second
However, the problem with this is that the local clocks of the two applications (emitter and consumer) are not the same, because they are running on different computers, which makes the result completely useless.
Is there any better way to do this properly (programmatically), and to get as accurate a data transfer rate as possible?
If your protocol allows it, you could send back an acknowledgement from the server for the received packet. This is also a must if you want to be sure that the server received/processed the data.
If you have that, you can simply calculate the rate on the client. Just subtract the RTT from the length of the send+ACK interval and you'll have a quite accurate measurement.
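A hedged sketch of that client-side calculation; send_all() and wait_for_ack() are hypothetical helpers, and rtt_sec is assumed to have been measured separately (e.g. with a tiny ping-style message):
#include <sys/time.h>

/* seconds between two gettimeofday() samples */
static double seconds_between(struct timeval a, struct timeval b)
{
    return (b.tv_sec - a.tv_sec) + (b.tv_usec - a.tv_usec) / 1e6;
}

/* client side: returns bytes per second for one send+ACK round */
static double measure_rate(int sockfd, const char *buf, size_t nbytes, double rtt_sec)
{
    struct timeval t_send, t_ack;
    gettimeofday(&t_send, NULL);
    send_all(sockfd, buf, nbytes);       /* hypothetical: loops over send() until done  */
    wait_for_ack(sockfd);                /* hypothetical: blocks until the server's ACK */
    gettimeofday(&t_ack, NULL);

    double transfer_time = seconds_between(t_send, t_ack) - rtt_sec;   /* subtract the RTT */
    return nbytes / transfer_time;
}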
Alternatively, you can use a time synchronization tool like NTP to synchronize the clocks on the two servers.
First of all: even if your clocks were in sync, you would be calculating latency, not throughput. On every network connection, chances are that there is more than one packet en route at a given point in time, rendering your single-packet approach useless for throughput measurement.
E.g. compare the ping time from your mobile to an HTTP server with the max download speed: the ping time will be tens of ms, the packet size will be ca. 1.5 KB, which would result in a much lower max throughput than what you observe when downloading.
If you want to measure real throughput, use a blocking socket on the sender side and send e.g. 1 million packets as fast as the system will allow you; on the receiving side, measure the time between the arrival of the first packet and the arrival of the last packet.
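A rough sketch of the receiving side under that scheme; the 64 KB read buffer and the total_bytes parameter (the size of the transfer, agreed upon out of band) are assumptions:
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/time.h>

/* receiver side: time from the first byte to the last byte of a known-size transfer */
static double measure_throughput(int sockfd, long long total_bytes)
{
    char buf[65536];
    struct timeval first, last;
    long long received = 0;
    ssize_t n;

    while ((n = recv(sockfd, buf, sizeof buf, 0)) > 0) {
        if (received == 0)
            gettimeofday(&first, NULL);      /* clock starts at the first data */
        received += n;
        if (received >= total_bytes)
            break;
    }
    gettimeofday(&last, NULL);

    double secs = (last.tv_sec - first.tv_sec) + (last.tv_usec - first.tv_usec) / 1e6;
    return received / secs;                  /* bytes per second */
}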
If OTOH you want to accurately measure latency, use
struct packet{
long sent_time;
long reflect_time;
char* data;
}
and have the server reflect the packet. On the client side, check all three timestamps, then reverse the roles to get a grip on asymmetric latencies.
Edit: I meant: The reflect time will be the "other" clock, so when running the test back and forth you will be able to filter out the offset.
