What exactly does concurrency mean in Apache Bench?

The benchmark documentation says concurrency is how many requests are done simultaneously, while number of requests is the total number of requests. What I'm wondering is: if I run 100 requests at a concurrency level of 20, does that mean 5 tests of 20 simultaneous requests each, or 100 tests of 20 simultaneous requests each? I'm assuming the second option, because of the example numbers quoted below.
I'm wondering because I frequently see results such as this one on some testing blogs:
Complete requests: 1000000
Failed requests: 2617614
This seems implausible, since the number of failed requests is higher than the number of total requests.
Edit: the site that displays the aforementioned numbers: http://zgadzaj.com/benchmarking-nodejs-basic-performance-tests-against-apache-php
OR could it be that it keeps trying until it reaches one million successes? Hm...

It means a single test with a total of 100 requests, keeping 20 requests open at all times. I think the misconception you have is that requests all take the same amount of time, which is virtually never the case. Instead of issuing requests in batches of 20, ab simply starts with 20 requests and issues a new one each time an existing request finishes.
For example, testing with ab -n 10 -c 3 would start with 3 concurrent requests:
[1, 2, 3]
Let's say #2 finishes first; ab replaces it with a fourth:
[1, 4, 3]
... then #1 may finish, replaced by a fifth:
[5, 4, 3]
... Then #3 finishes:
[5, 4, 6]
... and so on, until a total of 10 requests have been made. (As requests 8, 9, and 10 complete the concurrency tapers off to 0 of course.)
Make sense?
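The slot-refilling behaviour described above can be sketched as a small simulation. This is not ab's actual code, only a model of the scheduling idea: the function name simulate_ab and the per-request durations are made up for illustration, and the in-flight list is printed sorted rather than by slot position.

```python
import heapq

def simulate_ab(total_requests, concurrency, durations):
    """Model of ab-style scheduling: keep `concurrency` requests in flight
    and start a new one each time one finishes, until `total_requests`
    have been issued. `durations[i]` is the assumed duration of request i+1.
    Returns (time, sorted in-flight request ids) after every state change."""
    in_flight = []          # min-heap of (finish_time, request_id)
    snapshots = []
    next_id = 1
    now = 0.0
    # Open the initial window of concurrent requests.
    while next_id <= min(concurrency, total_requests):
        heapq.heappush(in_flight, (now + durations[next_id - 1], next_id))
        next_id += 1
    snapshots.append((now, sorted(r for _, r in in_flight)))
    # Each completion frees a slot, which is refilled while requests remain.
    while in_flight:
        now, _finished = heapq.heappop(in_flight)
        if next_id <= total_requests:
            heapq.heappush(in_flight, (now + durations[next_id - 1], next_id))
            next_id += 1
        snapshots.append((now, sorted(r for _, r in in_flight)))
    return snapshots

# Hypothetical durations chosen so that #2 finishes first, then #1, then #3.
durations = [2.0, 1.0, 3.0] + [5.0] * 7
for time, active in simulate_ab(10, 3, durations):
    print(time, active)
```

With these durations the window goes from [1, 2, 3] to [1, 3, 4], [3, 4, 5], [4, 5, 6] and so on, and only shrinks below 3 once all 10 requests have been issued.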
As to your question about why you see results with more failures than total requests... I don't know the answer to that. I can't say I've seen that. Can you post links or test cases that show this?
Update: In looking at the source, ab tracks four types of errors which are detailed below the "Failed requests: ..." line:
Connect - (err_conn in source) Incremented when ab fails to set up the HTTP connection
Receive - (err_recv in source) Incremented when a read from the connection fails
Length - (err_length in source) Incremented when the response length is different from the length of the first good response received.
Exceptions - (err_except in source) Incremented when ab sees an error while polling the connection socket (e.g. the connection is killed by the server?)
The logic around when these occur and how they are counted (and how the total bad count is tracked) is, of necessity, a bit complex. It looks like the current version of ab should only count a failure once per request, but perhaps the author of that article was using a prior version that was somehow counting more than one? That's my best guess.
If you're able to reproduce the behavior, definitely file a bug.

I see nothing wrong. Failed requests can increment more than one error each. That's how ab works.
There are various statically declared buffers of fixed length. Combined with the lazy parsing of the command line arguments, the response headers from the server and other external inputs, this might bite you.
You might notice for example that the previous node results have a similar count for 3 of the error counters. Most probably, from the 100 000 requests made only 8409 failed and not 25227.
Receive: 8409, Length: 8409, Exceptions: 8409
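The arithmetic behind that estimate can be written out as a rough sketch. It assumes, as suggested above, that one failed request can increment several counters but each counter at most once; the helper name failure_bounds is made up for illustration and is not part of ab.

```python
def failure_bounds(counters):
    """Given ab's per-type error counters, return (lower, upper) bounds on
    the number of distinct failed requests, ASSUMING each failed request
    increments any single counter at most once."""
    values = list(counters.values())
    # Lower bound: every failure hit the largest counter.
    # Upper bound: no request was counted under more than one type.
    return max(values), sum(values)

# Counter values quoted in the answer above.
print(failure_bounds({"Receive": 8409, "Length": 8409, "Exceptions": 8409}))
```

When the counters are identical, as here, the lower bound (8409) is the more plausible reading, and the upper bound (25227) is what gets reported if every counter is simply added up.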

Related

Starvation of one of 2 streams in ConnectedStreams

Background
We have 2 streams, let's call them A and B.
They produce elements a and b respectively.
Stream A produces elements at a slow rate (one every minute).
Stream B receives a single element once every 2 weeks. It uses a flatMap function which receives this element and generates ~2 million b elements in a loop:
(Java)
// Emit every b element produced from the single incoming element
for (BElement value : valuesList) {
    out.collect(value);
}
The valuesList here contains ~2 million b elements.
We connect those streams (A and B) using connect, key by some key and perform another flatMap on the connected stream:
streamA.connect(streamB).keyBy(AClass::someKey, BClass::someKey).flatMap(processConnectedStreams)
Each of the b elements has a different key, meaning there are ~2 million keys coming from the B stream.
The Problem
What we see is starvation. Even though there are a elements ready to be processed they are not processed in the processConnectedStreams.
Our attempts to solve the issue
We tried to throttle stream B to 10 elements per second by calling Thread.sleep() after every 10 elements:
long totalSent = 0;
for (BElement value : valuesList) {
    totalSent++;
    out.collect(value);
    // Pause for one second after every 10 emitted elements
    if (totalSent % 10 == 0) {
        Thread.sleep(1000);
    }
}
The processConnectedStreams is simulated to take 1 second with another Thread.sleep() and we have tried it with:
* Setting parallelism of 10 to all the pipeline - didn't work
* Setting parallelism of 15 to all the pipeline - did work
The question
We don't want to use all these resources, since stream B is activated very rarely and high parallelism is overkill for the elements of stream A.
Is it possible to solve it without setting the parallelism to more than the number of b elements we send every second?
It would be useful if you shared the complete workflow topology. For example, you don't mention doing any keying or random partitioning of the data. If that's really the case, then Flink is going to pipeline multiple operations in one task, which can (depending on the topology) lead to the problem you're seeing.
If that's the case, then forcing partitioning prior to the processConnectedStreams can help, as then that operation will be reading from network buffers.

How to handle db concurrency?

Let's say I've got 3 tables (LiveDataTable, ReducedDataTable, ScheduleTable).
Basically I've got a stream of events: whenever I receive an event, I write the data extracted from it to LiveDataTable.
The problem is that there is a huge number of events, so LiveDataTable may become really large. That's why I've got another table, ReducedDataTable, where I combine rows from LiveDataTable (think of selecting 100 rows from LiveDataTable, reducing them to 1 row, writing it to ReducedDataTable and then deleting these 100 rows from LiveDataTable).
To determine the right time to perform these reducing operations there's ScheduleTable. You may think of 1 row in ScheduleTable as corresponding to 1 reducing operation.
I want to be able to support List<Data> getData() method from Interface. There're 2 cases: I should either read from ReducedDataTable only or merge the results from ReducedDataTable and LiveDataTable.
Here's how my caching works step-by-step:
1. Read 1 row from ScheduleTable
2. Read from LiveDataTable
3. Write to ReducedDataTable (at least 4 rows)
4. Remove (<= INT_MAX) rows from LiveDataTable
5. Remove 1 row from ScheduleTable
The problem is I want to determine whether I should read from LiveDataTable and ReducedDataTable programmatically when receiving getData() request. For every step (before #3) I want to read from LiveDataTable and then I'd like to read from ReducedDataTable. How do I determine what step I'm currently at when receiving getData() request?
The reason I ask is that I believe this is a common problem in databases when handling concurrency.
(Assuming that your compaction process is fast enough)
You can first optimistically read from the small table and if the data is missing - then read from the non-compacted one.
In most cases there will be only one request, not two.
Otherwise you can maintain the timestamp of the data that has been already compacted.
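The timestamp idea can be sketched with in-memory lists standing in for the tables. Everything here is an assumption made for illustration: the class and method names, the (timestamp, value) row layout, and sum() as the reducing function. In a real database the write to the reduced table and the watermark update would have to happen in one transaction.

```python
class Store:
    def __init__(self):
        self.live = []            # (timestamp, value) rows, like LiveDataTable
        self.reduced = []         # (window_end, combined) rows, like ReducedDataTable
        self.compacted_up_to = 0  # timestamp up to which data has been compacted

    def write_reduced(self, up_to):
        """Combine live rows up to `up_to` and advance the watermark together."""
        old = [v for ts, v in self.live if self.compacted_up_to < ts <= up_to]
        if old:
            self.reduced.append((up_to, sum(old)))
        self.compacted_up_to = up_to

    def delete_compacted(self):
        """Remove live rows that are already covered by the watermark."""
        self.live = [(ts, v) for ts, v in self.live if ts > self.compacted_up_to]

    def get_data(self):
        """Reduced rows plus only those live rows newer than the watermark."""
        fresh = [(ts, v) for ts, v in self.live if ts > self.compacted_up_to]
        return self.reduced + fresh

store = Store()
store.live = [(1, 10), (2, 20), (3, 30)]
store.write_reduced(up_to=2)
print(store.get_data())   # same result before and after delete_compacted()
store.delete_compacted()
print(store.get_data())
```

Because readers filter the live rows by the watermark, they do not need to know which compaction step is currently running: rows that were reduced but not yet deleted are simply ignored.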

Generating a unique id with 6 characters - what to do when too many ids are already used

In my program you can book an item. This item has an id with 6 characters from 32 possible characters.
So there are 32^6 possible ids. Every id must be unique.
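func tryToAddItem() {
    // Generate a candidate id and use the same one for the check and the insert
    let id = generateId()
    if !db.contains(id) {
        addItem(id)
    } else {
        // Collision: retry with a fresh id
        tryToAddItem()
    }
}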
For example 90% of my ids are used. So the probability that I call tryToAddItem 5 times is 0,9^5 * 100 = 59% isn't it?
So that is quite high. These are 5 database queries on a lot of data.
When the probability is that high I want to introduce a prefix „A-xxxxxx“.
What is a good condition for that? At what point will I need a prefix?
In my example 90% of the ids were used. What about the rest? Do I throw them away?
What about database performance when I call tryToAddItem 5 times? I could imagine that this is not best practice.
For example 90% of my ids are used. So the probability that I call tryToAddItem 5 times is 0,9^5 * 100 = 59% isn't it?
Not quite. Let's represent the number of calls you make with the random variable X, and let's call the probability of an id collision p. You want the probability that you make the call at most five times, or in general at most k times:
P(X≤k) = P(X=1) + P(X=2) + ... + P(X=k)
= (1-p) + (1-p)*p + (1-p)*p^2 +... + (1-p)*p^(k-1)
= (1-p)*(1 + p + p^2 + .. + p^(k-1))
If we expand this out all but two terms cancel and we get:
= 1- p^k
Which we want to be greater than some probability, x:
1 - p^k > x
Or with p in terms of k and x:
p < (1-x)^(1/k)
where you can adjust x and k for your specific needs.
If you want less than a 50% probability of needing more than 5 calls, then no more than (1-0.5)^(1/5) ≈ 87% of your ids should be taken.
First of all make sure there is an index on the id columns you are looking up. Then I would recommend thinking more in terms of setting a very low probability of a very bad event occurring. For example maybe making 20 calls slows down the database for too long, so we'd like to set the probability of this occurring to <0.1%. Using the formula above we find that no more than 70% of ids should be taken.
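The two formulas above are easy to evaluate directly. The following is a small sketch; the function names are made up for illustration, and p is taken to be the fraction of ids already in use.

```python
def prob_at_most(k, p):
    """P(X <= k) = 1 - p^k: probability that a free id is found within k calls
    when each call collides independently with probability p."""
    return 1 - p ** k

def max_fill_fraction(k, x):
    """Largest collision probability p for which P(X <= k) still exceeds x,
    from p < (1 - x)^(1/k)."""
    return (1 - x) ** (1 / k)

# With 90% of ids taken: chance of succeeding within 5 calls.
print(prob_at_most(5, 0.9))
# At least a 50% chance of succeeding within 5 calls.
print(max_fill_fraction(5, 0.5))
# At least a 99.9% chance of succeeding within 20 calls.
print(max_fill_fraction(20, 0.999))
```

With p = 0.9 the chance of succeeding within 5 calls is about 41%. The 59% from the question is the complementary probability, 0.9^5, that all of the first five attempts collide.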
But you should also consider alternative solutions. Is remapping all ids to a larger space one time only a possibility?
Or if adding ids with prefixes is not a big deal then you could generate longer ids with prefixes for all new items going forward and not have to worry about collisions.
Thanks for the response. I searched for alternatives and want to show three possibilities.
First possibility: Create an UpcomingItemIdTable with 200 (more or less) valid itemIds. A task in the background can calculate them every minute (or what you need). So the action tryToAddItem will always get a valid itemId.
Second possibility
Is remapping all ids to a larger space one time only a possibility?
In my case yes. I think for other problems the answer will be: it depends.
Third possibility: Try to generate an itemId and when there is a collision try it again.
Possible collision handling: run some tests beforehand. Measure the time needed to generate itemIds when there are already 1,000, 10,000, 100,000, 1,000,000 etc. entries in the table. When the tryToAddItem method needs more than 100 ms (or whatever threshold you prefer), increase the id length from 6 to 7, 8 or 9 characters.
Some thoughts
every request must be atomic
create an index on itemId
Disadvantages for long UUIDs in API: See https://zalando.github.io/restful-api-guidelines/#144
less usable, because...
- cannot be memorized and easily communicated by humans
- harder to use in debugging and logging analysis
- less convenient for consumer facing usage
- quite long: readable representation requires 36 characters and comes with higher memory and bandwidth consumption
- not ordered along their creation history and no indication of used id volume
- may be in conflict with additional backward compatibility support of legacy ids
[...]
TLDR: In my case every possibility works. As so often, it depends on the problem. Thanks for the input.
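The third possibility, together with the notes on atomicity and the index, can be sketched with SQLite. This is only one way to do it, under stated assumptions: the table layout, the 32-character alphabet, the retry limit and the function names are all made up for illustration. The primary key gives both the index and the uniqueness check, so "check, then insert" collapses into a single atomic INSERT.

```python
import secrets
import sqlite3

# Hypothetical alphabet of 32 symbols (letters without I and O, digits 2-9).
ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"

def generate_id(length=6):
    """Random id of `length` characters drawn from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def add_item(conn, name, max_attempts=20, make_id=generate_id):
    """Insert an item under a fresh id, retrying on collision.
    Returns (item_id, number_of_attempts)."""
    for attempt in range(1, max_attempts + 1):
        item_id = make_id()
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO items (item_id, name) VALUES (?, ?)",
                    (item_id, name),
                )
            return item_id, attempt
        except sqlite3.IntegrityError:
            continue  # id already taken, try another one
    raise RuntimeError("no free id found; consider a longer id")

conn = sqlite3.connect(":memory:")
# PRIMARY KEY creates the index and enforces uniqueness.
conn.execute("CREATE TABLE items (item_id TEXT PRIMARY KEY, name TEXT)")
print(add_item(conn, "first booking"))
```

The returned attempt count could be logged to see when collisions become frequent enough to justify longer ids.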

Array Partitioning

I have a set of objects numbered from 1 to n, and there is only one instance of each object. Now say I have received k requests, each of which asks for some subset of the objects.
How should I process these requests so that the maximum number of requests is fulfilled? All requests are known in advance.
Ex: say we have 10 apples numbered from 1 to 10 and three requests:
request 1: {2,6}
request 2: {1,2,3,4,5}
request 3: {6,7,8,9,10}
Here, if we process request 1 we can fulfil only 1 request, but if we process requests 2 and 3 we can fulfil 2 requests.
Please suggest an optimized algorithm.
The problem you describe is called maximum set packing: given a set (your array) and a set of subsets (your requests), find the maximum number of subsets (requests) that do not have an element in common (are pairwise disjoint).
As shown in the wikipedia article, the problem can be formulated as an integer linear program, which can be solved by a standard solver. Since those are highly optimized, that is as optimal as you can probably get. Like many packing problems, the problem is NP-hard, so you cannot do much better than brute-force if you try to implement this yourself.
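For a small number of requests, the brute-force approach mentioned above can be sketched as follows. It tries groups of requests from largest to smallest and returns the first group that is pairwise disjoint. The function name is made up for illustration, and the running time grows exponentially with the number of requests, so an ILP solver is the better choice for anything large.

```python
from itertools import combinations

def max_set_packing(requests):
    """Return the 0-based indices of a largest group of pairwise disjoint
    requests, found by exhaustive search."""
    sets = [frozenset(r) for r in requests]
    for size in range(len(sets), 0, -1):
        for combo in combinations(range(len(sets)), size):
            used = set()
            disjoint = True
            for i in combo:
                if used & sets[i]:   # shares an object with an earlier pick
                    disjoint = False
                    break
                used |= sets[i]
            if disjoint:
                return list(combo)
    return []

requests = [{2, 6}, {1, 2, 3, 4, 5}, {6, 7, 8, 9, 10}]
print(max_set_packing(requests))  # indices 1 and 2, i.e. requests 2 and 3
```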

Is neural network's response guaranteed on training data?

I'm trying to train an ANN (I use this library: http://leenissen.dk/fann/ ) and the results are somewhat puzzling - basically if I run the trained network on the same data used for training, the output is not what is specified in the training set, but some random number.
For example, the first entry in the training file is something like
88.757004 88.757004 104.487999 138.156006 100.556000 86.309998 86.788002
1
with the first line being the input values and the second line is the desired output neuron's value. But when I feed the exact same data to the trained network, I get different results on each train attempt, and they are quite different from 1, e.g.:
Max epochs 500000. Desired error: 0.0010000000.
Epochs 1. Current error: 0.0686412785. Bit fail 24.
Epochs 842. Current error: 0.0008697828. Bit fail 0.
my test result -4052122560819626000.000000
and then on another attempt:
Max epochs 500000. Desired error: 0.0010000000.
Epochs 1. Current error: 0.0610717005. Bit fail 24.
Epochs 472. Current error: 0.0009952184. Bit fail 0.
my test result -0.001642
I realize that the training set size may be inadequate (I only have about 100 input/output pairs so far), but shouldn't at least the training data trigger the right output value? The same code works fine for the "getting started" XOR function described at the FANN's website (I've already used up my 1 link limit)
Short answer: No
Longer answer (but possibly not as correct):
1st: a training run only moves the weights of the neurons towards a position where they make the output match the training data. After some or many iterations the output should be close to the expected output, if and only if the neural network is up to the task, which brings me to
2nd: Not every neural network works for every problem. For a single neuron it is pretty easy to come up with a simple function that cannot be approximated by a single neuron. Though not as easy to see, the same limit applies to every neural network. In such cases your results will very likely look like random numbers. Edit after comment: In many cases this can be fixed by adding neurons to the network.
3rd: actually the first point is a strength of a neural network, because it allows the network to handle outliers nicely.
4th: I blame 3 for my lacking understanding of music. It just doesn't fit my brain ;-)
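The second point can be illustrated with a standard example: a single neuron with a hard threshold can reproduce AND but not XOR. The sketch below searches a coarse grid of weights rather than training anything; the grid, the step activation and the function names are assumptions made for illustration and have nothing to do with FANN's API.

```python
from itertools import product

def neuron(w1, w2, b, x1, x2):
    """Single neuron with a step activation: fires when the weighted sum is positive."""
    return 1 if w1 * x1 + w2 * x2 + b > 0 else 0

def find_weights(table, grid):
    """Return the first (w1, w2, b) on the grid that reproduces the whole
    truth table, or None if no grid point does."""
    for w1, w2, b in product(grid, repeat=3):
        if all(neuron(w1, w2, b, x1, x2) == y for (x1, x2), y in table.items()):
            return (w1, w2, b)
    return None

GRID = [i / 4 for i in range(-8, 9)]  # -2.0 to 2.0 in steps of 0.25
AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

print(find_weights(AND, GRID))  # some weight triple that works
print(find_weights(XOR, GRID))  # None
```

XOR is not linearly separable, so no choice of weights would work, on the grid or off it. A network with a hidden layer can represent it, which matches the remark about adding neurons.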
No, if you get your ANN to work perfectly on the training data, you either have a really easy problem or you're overfitting.
