aiohttp BaseConnector _acquired vs _conns

I see that internally, connections (aiohttp.client_proto.ResponseHandler objects) are stored in _conns and _acquired.
What is the difference between those?
Context: I am trying to investigate why connections hang around.
When the limit for TCPConnector is set to, say, 10,
lsof shows something like 70 of them (ESTABLISHED).
The number drops after something like 20 minutes.
PS: I use session.get(url, ssl=False)


Wrk vs Gatling benchmark test comparison

With wrk, I run the following command:
wrk -t10 -c10 -d30s http://localhost:8080/myService --latency -H "Accept-Encoding: gzip"
As a result, I obtain Requests/sec: 15000 and no errors.
I am trying to reproduce the same kind of test with Gatling, so I have tried the following:
scn.inject(
  rampUsersPerSec(1) to 15000 during (30 seconds)
)
But as a result, I obtain errors:
---- Errors --------------------------------------------------------------------
i.n.c.AbstractChannel$AnnotatedSocketException: Can't assign requested address: localhost/127.0.0.1:8080    573 (42,44%)
i.n.c.AbstractChannel$AnnotatedSocketException: Resource temporarily unavailable: localhost/0:0:0:0:0:0:0:1:8080    530 (39,26%)
j.i.IOException: Premature close    247 (18,30%)
From wrk, I believe my server can handle 15000 requests/s, but with Gatling it seems that's not the case. Do you have an idea why there is such a difference?
Disclaimer: Gatling's creator here
You're comparing apples and oranges.
With wrk, you're opening 10 connections and looping as fast as possible for 30s.
With your current Gatling setup, you're spawning 225,015 virtual users ((1 + 15,000) / 2 * 30), each one trying to open its own connection, as the quick calculation below shows.
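Purely as back-of-the-envelope arithmetic (a tiny Python snippet, not Gatling code; the numbers come from your question), here is how differently the two runs hit the socket layer:

duration_s = 30

# wrk: a fixed pool of 10 connections looping as fast as possible.
wrk_connections = 10
wrk_requests = 15000 * duration_s              # ~450,000 requests over 10 sockets

# Gatling: rampUsersPerSec(1) to 15000 during 30s -- each virtual user opens
# its own connection, so the sockets opened equal the area under the ramp.
gatling_users = (1 + 15000) * duration_s // 2  # 225,015 connections

print(wrk_connections, gatling_users)          # 10 vs 225,015 local sockets

Opening that many client sockets against a single local endpoint is consistent with the "Can't assign requested address" errors above (ephemeral port exhaustion).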
I recommend reading this article about picking injection profiles that make sense for your use case.
If you really want to do the same thing as wrk here, you need to wrap your scenario in a during(30) loop and change your injection profile to atOnceUsers(10).
You also have the option of using a shared connection pool.
Then, you can't expect any other load testing tool to be as fast as wrk for this kind of logicless, static test.
Also note that:
there was a mistake in Gatling's JVM configuration, fixed in Gatling 3.4.0, that hurt performance in this kind of minimalistic, super-high-throughput test (see the issue)
Gatling runs on a JVM, hence with a runtime, so it needs to warm up; boot throughput will be lower than the warm one

Aggregation Strategy based on message size of aggregate

I would like to aggregate exchanges, and when the exchange hits a certain size (say 20KB) I would like to mark the exchange as closed.
I have a rudimentary implementation that checks the size of the exchange and returns true from my predicate once it reaches 18KB. However, if the aggregate is currently 17KB and a 4KB message comes in, the aggregation will complete at 21KB, which is too big.
Any ideas on how to solve this? Can I do something in the aggregation strategy to reject the join and start a new Exchange to aggregate on?
I figured I could put it through another processor that checks the actual size, removes messages from the end of the aggregate to fit the limit, and pushes each removed message back through... but that seems a little ugly, because I would have a constantly compensating routine that would likely execute.
Thanks in advance for any tips.
I think there is an eager completion check option you can use to mark it as complete when you hit that 17 + 4 > 20 situation. Then it will complete the 17, and start a new group with the 4.
See the docs at: https://github.com/apache/camel/blob/master/camel-core/src/main/docs/eips/aggregate-eip.adoc
And you would also likely need to use PreCompletionAwareAggregationStrategy and return true in that 17 + 4 > 20 situation, as otherwise it would group them together first and then complete, e.g. so it becomes 21. But by using both the eager completion check option and this interface you can do what you want.
https://github.com/apache/camel/blob/master/camel-core/src/main/java/org/apache/camel/processor/aggregate/PreCompletionAwareAggregationStrategy.java
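For what it's worth, here is a rough sketch of that pre-completion decision (in Python, purely to illustrate the logic; the real thing is the preComplete method of the Java interface linked above returning true):

LIMIT = 20 * 1024  # 20KB

def pre_complete(aggregate_size, incoming_size, limit=LIMIT):
    # True means: close the existing group first and start a new one with the
    # incoming message, so a 17KB group completes at 17KB instead of 21KB.
    return aggregate_size + incoming_size > limit

group_sizes = [0]
for msg_size in (17 * 1024, 4 * 1024, 2 * 1024):
    if pre_complete(group_sizes[-1], msg_size):
        group_sizes.append(0)
    group_sizes[-1] += msg_size

print(group_sizes)  # [17408, 6144] -> the 17KB group closed before the 4KB message joined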

Understanding Datastore Get RPCs in Google App Engine

I'm using sharded counters (https://cloud.google.com/appengine/articles/sharding_counters) in my GAE application for performance reasons, but I'm having some trouble understanding why it's so slow and how I can speed things up.
Background
I have an API that grabs a set of 20 objects at a time and for each object, it gets a total from a counter to include in the response.
Metrics
With Appstats turned on and a clear cache, I notice that getting the totals for 20 counters makes 120 RPCs by datastore_v3.Get which takes 2500ms.
Thoughts
This seems like quite a lot of RPC calls and quite a bit of time for reading just 20 counters. I assumed this would be faster and maybe that's where I'm wrong. Is it supposed to be faster than this?
Further Inspection
I dug into the stats a bit more, looking at these two lines in the get_count method:
all_keys = GeneralCounterShardConfig.all_keys(name)
for counter in ndb.get_multi(all_keys):
If I comment out the get_multi line, I see that there are 20 RPC calls by datastore_v3.Get totaling 185ms.
As expected, this leaves get_multi as the culprit for 100 RPC calls by datastore_v3.Get taking upwards of 2500ms. I verified this, but this is where I'm confused. Why does calling get_multi with 20 keys cause 100 RPC calls?
Update #1
I checked out Traces in the GAE console and saw some additional information. They show a breakdown of the RPC calls there as well - but in the insights they say to "Batch the gets to reduce the number of remote procedure calls." Not sure how to do that outside of using get_multi. I thought that did the job. Any advice here?
Update #2
Here are some screen shots that show the stats I'm looking at. The first one is my base line - the function without any counter operations. The second one is after a call to get_count for just one counter. This shows a difference of 6 datastore_v3.Get RPCs.
Base Line
After Calling get_count On One Counter
Update #3
Based on Patrick's request, I'm adding a screenshot of info from the console Trace tool.
Try splitting the actual get_multi call itself from the for loop that goes through each item. So something like:
all_values = ndb.get_multi(all_keys)
for counter in all_values:
    # Insert amazeballs codes here
I have a feeling it's one of these:
The generator pattern (yield from for loop) is causing something funky with get_multi execution paths
Perhaps the number of items you are expecting doesn't match the actual result count, which could reveal a problem with GeneralCounterShardConfig.all_keys(name) (see the quick check after this list)
The number of shards is set too high. I've realized that anything over 10 shards causes performance issues.
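One quick, illustrative way to test the second guess above, reusing the names from the question:

all_keys = GeneralCounterShardConfig.all_keys(name)
all_values = ndb.get_multi(all_keys)
# How many shard keys were requested, and how many came back empty?
missing = sum(1 for v in all_values if v is None)
print(len(all_keys), len(all_values), missing)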
When I've dug into similar issues, one thing I've learned is that get_multi can cause multiple RPCs to be sent from your application. It looks like the default in the SDK is set to 1000 keys per get, but the batch size I've observed in production apps is much smaller: something more like 10 (going from memory).
I suspect the reason it does this is that at some batch size, it actually is better to use multiple RPCs: there is more RPC overhead for your app, but there is more Datastore parallelism. In other words: this is still probably the best way to read a lot of datastore objects.
However, if you don't need to read the absolute most current value, you can try setting the db.EVENTUAL_CONSISTENCY option, but that seems to only be available in the older db library and not in ndb. (Although it also appears to be available via the Cloud Datastore API).
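If you go that route, a minimal sketch with the older db API might look like the following (the shard model and key names below are assumptions for illustration, not the exact layout from the sharded counter recipe):

from google.appengine.ext import db

# Re-declare the shard kind as an old-style db model so db.get can load it.
class GeneralCounterShard(db.Model):
    count = db.IntegerProperty(default=0)

# Assumed key layout, purely for illustration.
shard_keys = [db.Key.from_path('GeneralCounterShard', 'mycounter-%d' % i)
              for i in range(20)]
# read_policy=db.EVENTUAL_CONSISTENCY relaxes the "read current value"
# requirement, which changes how the gets can be batched.
shards = db.get(shard_keys, read_policy=db.EVENTUAL_CONSISTENCY, deadline=5)
total = sum(s.count for s in shards if s is not None)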
Details
If you look at the Python code in the App Engine SDK, specifically the file google/appengine/datastore/datastore_rpc.py, you will see the following lines:
max_count = (Configuration.max_get_keys(config, self.__config) or
             self.MAX_GET_KEYS)
...
if is_read_current and txn is None:
  max_egs_per_rpc = self.__get_max_entity_groups_per_rpc(config)
else:
  max_egs_per_rpc = None
...
pbsgen = self._generate_pb_lists(indexed_keys_by_entity_group,
                                 base_req.ByteSize(), max_count,
                                 max_egs_per_rpc, config)
rpcs = []
for pbs, indexes in pbsgen:
  rpcs.append(make_get_call(base_req, pbs,
                            self.__create_result_index_pairs(indexes)))
My understanding of this:
Set max_count from the configuration object, or 1000 as a default
If the request must read the current value, set max_egs_per_rpc from the configuration, or 10 as a default
Split the input keys into individual RPCs, using both max_count and max_egs_per_rpc as limits.
So, this is being done by the Python Datastore library.
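Purely as an illustration of how that could add up to the 100 RPCs observed (the shard count below is an assumption, not something taken from the question):

import math

num_counters = 20        # counters read per API response (from the question)
shards_per_counter = 50  # hypothetical; depends on your shard configuration
max_egs_per_rpc = 10     # entity-group limit per RPC for "current" reads

# Each counter shard is its own root entity (its own entity group), so one
# strongly consistent get_multi per counter is split into several RPCs.
rpcs_per_counter = int(math.ceil(shards_per_counter / float(max_egs_per_rpc)))
print(rpcs_per_counter * num_counters)  # -> 100 with these assumed numbers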

How to make datastore keys mapreduce-friendly(-er)?

Edit: See my answer. The problem was in our code. MR works fine; it may have a status reporting problem, but at least the input readers work fine.
I ran an experiment several times now and I am now sure that mapreduce (or DatastoreInputReader) has odd behavior. I suspect this might have something to do with key ranges and splitting them, but that is just my guess.
Anyway, here's the setup we have:
we have an NDB model called "AdGroup"; when creating new entities of this model, we use the same id returned from AdWords (it's an integer), but we use it as a string: AdGroup(id=str(adgroupId))
we have 1,163,871 of these entities in our datastore (that's what the "Datastore Admin" page tells us - I know it's not an entirely accurate number, but we don't create/delete adgroups very often, so we can say for sure that the number is 1.1 million or more).
mapreduce is started (from another pipeline) like this:
yield mapreduce_pipeline.MapreducePipeline(
    job_name='AdGroup-process',
    mapper_spec='process.adgroup_mapper',
    reducer_spec='process.adgroup_reducer',
    input_reader_spec='mapreduce.input_readers.DatastoreInputReader',
    mapper_params={
        'entity_kind': 'model.AdGroup',
        'shard_count': 120,
        'processing_rate': 500,
        'batch_size': 20,
    },
)
So, I've tried to run this mapreduce several times today without changing anything in the code and without making changes to the datastore. Every time I ran it, mapper-calls counter had a different value ranging from 450,000 to 550,000.
Correct me if I'm wrong, but considering that I use the very basic DatastoreInputReader - mapper-calls should be equal to the number of entities. So it should be 1.1 million or more.
Note: the reason why I noticed this issue in the first place is because our marketing guys started complaining that "it's been 4 days after we added new adgroups and they still don't show up in your app!".
Right now, I can think of only one workaround - write all keys of all adgroups into a blobstore file (one per line) and then use BlobstoreLineInputReader. The writing to blob part would have to be written in a way that does not utilize DatastoreInputReader, of course. Should I go with this for now, or can you suggest something better?
Note: I have also tried using DatastoreKeyInputReader with the same code - the results were similar - mapper-calls were between 450,000 and 550,000.
So, finally, the questions. Does it matter how you generate ids for your entities? Is it better to use int ids instead of str ids? In general, what can I do to make it easier for mapreduce to find and map all of my entities?
PS: I'm still in the process of experimenting with this, I might add more details later.
After further investigation we have found that the error was actually in our code. So, mapreduce actually works as expected (mapper is called for every single datastore entity).
Our code was calling some Google services functions that were sometimes failing (the wonderful cryptic ApplicationError messages). Due to these failures, MR tasks were being retried. However, we had set a limit on taskqueue retries. MR did not detect or report this in any way - it was still showing "success" on the status page for all shards. That is why we thought that everything was fine with our code and that there was something wrong with the input reader.

App Engine Datastore Viewer, how to show count of records using GQL?

I would think this would be easy for an SQL-alike! What I want is the GQL equivalent of:
select count(*) from foo;
and to get back an answer something similar to:
1972 records.
And I want to do this in GQL from the "command line" in the web-based DataStore viewer. (You know, the one that shows 20 at a time and lets me see "next 20")
Anyway -- I'm sure it's brain-dead easy, I just can't seem to find the correct syntax. Any help would be appreciated.
Thanks!
With straight Datastore Console, there is no direct way to do it, but I just figured out how to do it indirectly, with the OFFSET keyword.
So, given a table we'll call foo, with a field called type that we want to check for the value "bar":
SELECT * FROM foo WHERE type="bar" OFFSET 1024
(We'll be doing a quick game of "warmer, colder" here, binary style)
Let's say that query returns nothing. Change OFFSET to 512, then 256, 128, 64, ... you get the idea. Same thing in reverse: Go up to 2048, 4096, 8192, 16384, etc. until you see no records, then back off.
I just did one here at work. Started with 2048, and noticed two records came up. There's 2049 in the table. In a more extreme case (let's say there are 3300 records), you could start with 2048, notice there's a lot, go to 4096, there's none... Take the midpoint next (halfway between 2048 and 4096 is 3072) and notice you have records... From there you could add half the previous step (512) to get 3584, and there's none. Whittle back down half (256) to get 3328, still none. Once more down half (128) to get 3200 and there's records. Go up half of the last value (64) to 3264 and there's still records. Go up half again (32) to 3296 - still records, but so few that you can easily see there's exactly 3300.
The nice thing about this vs. Datastore statistics to see how many records are in a table is you can limit it by the WHERE clause.
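If you want to automate the same warmer/colder search instead of clicking through the viewer, a rough ndb sketch could look like this (illustrative only; Foo is a placeholder model, and every probe still costs a keys-only query with an offset):

from google.appengine.ext import ndb

def binary_count(query, upper=1 << 20):
    # Binary-search the result count, assuming it is less than `upper`.
    lo, hi = 0, upper
    while lo < hi:
        mid = (lo + hi) // 2
        # Is there at least one entity at offset `mid`?
        if query.fetch(1, offset=mid, keys_only=True):
            lo = mid + 1
        else:
            hi = mid
    return lo

# e.g. binary_count(Foo.query(Foo.type == 'bar'))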
I don't think there is any direct way to get the count of entities via GQL. However, you can get the count directly from the dashboard.
More details - https://cloud.google.com/appengine/docs/python/console/managing-datastore
As stated in other questions, it looks like there is no count aggregate function in GQL. The GQL Reference also doesn't say there is the ability to do this, though it doesn't explicitly say that it's not possible.
In the development console (running your application locally) it looks like just clicking the "List Entities" button will show you a list of all entities of a certain type, and you can see "Results 1-10 of (some number)" to get a total count in your development environment.
In production you can use the "Datastore Statistics" tab (the link right underneath the Datastore Viewer), choose "Display Statistics for: (your entity type)" and it will show you the total number of entities, however this is not the freshest view of the data (updated "at least once per day").
Since you can't run arbitrary code in production via the browser, I don't think saying "use .count() on a query" would help, but if you're using the Remote API, the .count() method is no longer capped at 1000 entries as of August, 2010, so you should be able to run print MyEntity.all().count() and get the result you want.
This is one of those surprising things that the datastore just can't do. I think the fastest way to do it would be to select __KEY__ from foo into a List, and then count the items in the list (which you can't do in the web-based viewer).
If you're happy with statistics that can be a little bit stale, you can go to the Datastore Statistics page of the admin console, which will tell you how many entities of each type there were some time ago. It seems like those stats are usually less than 10 hours old. Unfortunately, you can't query them more specifically.
There's no way to get a total count in GQL. Here's a way to get a count using python:
def count_models(model_class, max_fetch=1000):
    """Count entities of model_class by paging through keys-only queries."""
    total = 0
    cursor = None
    while True:
        query = model_class.all(keys_only=True)
        if cursor:
            query.with_cursor(cursor)  # resume where the previous batch ended
        results = query.fetch(max_fetch)
        total += len(results)
        print('still counting: %d' % total)
        if len(results) < max_fetch:
            return total
        cursor = query.cursor()
You could run this function using the remote_api_shell, or add a custom page to your admin site to run this query. Obviously, if you've got millions of rows you're going to be waiting a while. You might be able to increase max_fetch; I'm not sure what the current fetch limit is.
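For example, from a remote_api_shell session it would be something like this (Foo and the models module are placeholders for your own code):

# remote_api_shell.py your-app-id
from models import Foo
print(count_models(Foo))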
