get_by_key_name() in GAE taking as long as 750ms. Is this expected? - google-app-engine

My program fetches ~100 entries in a loop, all of them with get_by_key_name(). Appstats shows that some get_by_key_name() requests take as long as 750ms! (Other large values are 355ms, 260ms, and 230ms.) The average for the other fetches ranges from 30ms to 100ms. These times are real time and hence contribute towards 'ms', not 'cpu_ms'.
Because of this, the total time taken to return the web page is very high: ms=5754, whereas cpu_ms=1472. (The above times are seen repeatedly for back-to-back requests.)
Environment: Python 2.7, webapp2, jinja2, High Replication, No other concurrent requests to the server, Frontend Instance Class is F1, No memcache set yet, max idle instances is automatic, min pending latency is automatic, using db (NOT NDB).
Any help will be greatly appreciated, as I based the whole database design on fetching entries from the datastore using only get_by_key_name()!
Update:
I tried profiling with time.clock() before and immediately after every get_by_key_name() call. The difference I get from time.clock() for every single call is 10ms! (Just to clarify: get_by_key_name() is called on different Kinds.)
According to time.clock(), the total execution time (wall-clock time) is 660ms. But per the GAE logs the real time is 5754 (=ms) and cpu_ms is 1472.
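(One caveat about the profiling method: in Python 2, time.clock() returns processor time on Unix rather than wall-clock time, while time.time() measures wall-clock time. A minimal sketch of measuring both around the fetch loop; the loop body is a placeholder:)

import time

start_wall = time.time()    # wall-clock seconds since the epoch
start_cpu = time.clock()    # on Unix this is CPU time used by the process

# ... the ~100 get_by_key_name() calls go here ...

print('wall-clock: %.0f ms' % ((time.time() - start_wall) * 1000))
print('cpu:        %.0f ms' % ((time.clock() - start_cpu) * 1000))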
Summary of Questions:
*[Update: this was addressed by passing a list of key names; see the sketch after this list.] Why is get_by_key_name() taking that long?*
Why is ms (5754) so much higher than cpu_ms (1472)? Is the request halted/waiting for ~75% (1 - 1472/5754) of the time, which is why the real (wall-clock) time is so long as far as the end user is concerned?
If the above is true, why does time.clock() show that only 660ms (wall-clock time) elapsed between the start of the first get_by_key_name() call and the last (~100th) one, although GAE reports this time as 5754ms?
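For reference, the batched lookup mentioned in the first question's update: get_by_key_name() also accepts a list of key names, so all entities can come back in a single batch call instead of ~100 sequential round trips. A minimal sketch with the old db API (the Entry model and key names are made up for illustration):

from google.appengine.ext import db

class Entry(db.Model):              # hypothetical model, for illustration only
    title = db.StringProperty()

key_names = ['k1', 'k2', 'k3']      # ~100 key names in the real case

# Slow: one synchronous datastore round trip per entity.
entries = [Entry.get_by_key_name(name) for name in key_names]

# Faster: pass the whole list and fetch everything in one batch call.
entries = Entry.get_by_key_name(key_names)

Since the entities here are spread across different Kinds, db.get() with a list of db.Key objects (which may mix kinds) is the equivalent batch call.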

Related

Based on Gatling report, how to make sure 100 requests are processed in less than 1 second

How can I check my requirement that 100 requests are processed in less than 1 second in my Gatling 3 report? I ran this using Jenkins.
My simulation looks like this:
rampConcurrentUsers(1) to (100) during (161 second),
constantConcurrentUsers(100) during (1 minute)
Below is my response time percentile graph of two executions for an interval of one second.
What do the min and max here tell us? I am assuming the 25%-99% percentages are the completion times of the requests.
Those graph sections are not what you're after - they show the distribution of response times and the number of active users.
So min is the fastest response time
max is the longest
95% is the response time that 95% of your requests came in under
and so on...
So what you could do is look at the section of your graph corresponding to the 100-constant-concurrent-users injection stage. In this part you would require that the max response time always be under 1 second.
(Note: there's something odd with your 2nd report - I assume it didn't come from running the stated injection profile as it has more than 100 concurrent users active)

Apache Flink - Time characteristics

How can I use the ingestion time characteristic in Apache Flink? I know we need to set the environment time characteristic, but how can I collect the data with timestamps that can be treated as ingestion time? Currently, when I use it, the window is processed based on the system clock time. I want to do the processing based on the time at which the data enters the Flink environment.
A little code extract which may help to understand it clearly:
Time characteristic for the environment:
env.setStreamTimeCharacteristic(TimeCharacteristic.IngestionTime);
Window time:
keyedEvents.timeWindow(Time.minutes(5))
Collection in the source:
ctx.collect(monSourceData);
If the data collection starts at, say, 11:03, I want the window to end at 11:08, i.e. after 5 minutes. But it stops at 11:05 (somehow behaving like processing time).
Thanks in advance for your help.
Tumbling and sliding windows in Flink are always aligned to the clock (either the event-time clock defined by the events and watermarks, or the system clock); time windows are not aligned to the first event. So if you have windows that are 5 minutes long, there will be a window for the interval from 11:00 to 11:05, for example, regardless of the TimeCharacteristic.
Tumbling windows do, however, take an optional offset parameter that can be used to shift this alignment. So you could specify TumblingEventTimeWindows.of(Time.minutes(5), Time.minutes(3)), for example, to shift the intervals by 3 minutes.
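Roughly speaking, the window an event falls into is found by rounding its timestamp down to the nearest window boundary (shifted by the optional offset). A plain-Python sketch of that alignment, with made-up timestamps rather than Flink API code:

MINUTE = 60 * 1000  # milliseconds

def window_start(timestamp_ms, size_ms, offset_ms=0):
    # Start of the tumbling window that contains timestamp_ms.
    return timestamp_ms - (timestamp_ms - offset_ms) % size_ms

event = 663 * MINUTE  # an event at 11:03, written as minutes past midnight

# Default alignment: 5-minute windows start at 11:00, 11:05, ...
print(window_start(event, 5 * MINUTE) // MINUTE)              # 660 -> the 11:00-11:05 window

# With a 3-minute offset the boundaries shift to ..., 10:58, 11:03, 11:08, ...
print(window_start(event, 5 * MINUTE, 3 * MINUTE) // MINUTE)  # 663 -> the 11:03-11:08 window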

Google App Engine Log - ms and cpu_ms [duplicate]

I see some enigmatic time fields in the standard Google app-engine logs which make me curious:
2011-05-09 21:56:00.577 /whatever 200 211ms 775cpu_ms 589api_cpu_ms
0.1.0.1 - - [09/May/2011:21:56:00 -0700] "GET /whatever HTTP/1.1"
200 34 - "AppEngine-Google; (+http://code.google.com/appengine)"
"****.appspot.com" ms=212 cpu_ms=776 api_cpu_ms=589 cpm_usd=0.021713
queue_name=__cron task_name=dc4d411120bc75ea8bbea773d23eaecc
In particular: ms, cpu_ms, and api_cpu_ms, each of which appears twice with slightly different values.
Additionally, when I log timing information myself with the simple structure below for the GET request, it prints a somewhat lower value: in this case 182 msecs, against the official 775.
protected void doGet(HttpServletRequest req, HttpServletResponse resp) {
    long t0 = System.currentTimeMillis();
    // Do the stuff
    long t1 = System.currentTimeMillis();
    log.info("Completed in " + (t1 - t0) + " msecs.\n");
}
So, my questions are: why the difference between my measured time and the cpu_ms value, and how could I lower it? What is the exact meaning of the time values in the GAE log fields?
I want to optimize my code, and based on the aforementioned facts I realized that most of the time (nearly 600 msecs!) is not spent directly in the doGet request. (I use JPA and URLFetch, and this is a cron task.)
211ms: this is the response time as perceived by the user who requested the page. You want to decrease it in order to improve the speed of your website.
775cpu_ms: According to the App Engine documentation, "CPU time is reported in "seconds," which is equivalent to the number of CPU cycles that can be performed by a 1.2 GHz Intel x86 processor in that amount of time. The actual number of CPU cycles spent varies greatly depending on conditions internal to App Engine, so this number is adjusted for reporting purposes using this processor as a reference measurement."
So it's normal that it doesn't match the "real" time: it differs from what you measured with System.currentTimeMillis() because it is adjusted. Instead, you should use the Quota API to monitor the CPU usage: see the documentation here. CPU time is billable (the free quota is 6.5 CPU-hours per day, and you can pay for more CPU time), so you will want to decrease it in order to pay less.
589api_cpu_ms: this is the adjusted CPU time spent by API usage (Datastore, User API, etc.).

What is meant by 'bucket-size' of queue in the google app engine?

Google App Engine task queues are configured like this (example):
<queue>
  <name>mail-queue</name>
  <rate>5/m</rate>
  <bucket-size>10</bucket-size>
</queue>
Here, what does 'bucket-size' mean? I could not find comprehensive documentation about it in the Google App Engine docs.
Does specifying this as 10 mean that if 100 tasks are queued at an instant, only 10 of those will be put in the queue and the rest will be ignored?
bucket-size is perfectly described here:
Limits the burstiness of the queue's processing, i.e. a higher bucket size allows bigger spikes in the queue's execution rate. For example, consider a queue with a rate of 5/s and a bucket size of 10. If that queue has been inactive for some time (allowing its "token bucket" to fill up), and 20 tasks are suddenly enqueued, it will be allowed to execute 10 tasks immediately. But in the following second, only 5 more tasks will be able to be executed because the token bucket has been depleted and is refilling at the specified rate of 5/s.
If no bucket_size is specified for a queue, the default value is 5.
For your case it means that if 100 tasks are queued at once, only ten are executed immediately, with another 5 in each following minute. You won't lose any tasks, but they will queue up if your bucket-size and rate are too low.
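To make the token-bucket behaviour concrete, here is a small plain-Python simulation of the mechanism described in the quote above (an illustration only, not App Engine code; it uses the 5/s rate and bucket size 10 from that example):

def simulate(tasks, rate_per_sec, bucket_size, seconds):
    tokens = bucket_size                     # the queue was idle, so the bucket is full
    executed_per_second = []
    for _ in range(seconds):
        runnable = min(tasks, int(tokens))   # each executed task consumes one token
        tasks -= runnable
        tokens -= runnable
        executed_per_second.append(runnable)
        tokens = min(bucket_size, tokens + rate_per_sec)  # refill at the queue's rate
    return executed_per_second

# 20 tasks arrive at once on a queue with rate 5/s and bucket size 10:
print(simulate(tasks=20, rate_per_sec=5, bucket_size=10, seconds=4))
# -> [10, 5, 5, 0]  (10 immediately, then 5 per second until the backlog drains)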

How do I measure response time in seconds given the following benchmarking data?

We recently got some data back on a benchmarking test from a software vendor, and I think I'm missing something obvious.
If there were 17 transactions (I assume they mean successfully completed requests) per second, and 1500 of these requests could be served in 5 minutes, then how do I get the response time for a single user? Is this sort of thing even possible with benchmarking? I have a lot of other data from them, including Apache config settings, but I'm not sure how to do all the math.
Given the server setup they sent, I want to know how I can deduce the user response time. I have looked at other similar benchmarking tests, but I'm having trouble relating request throughput to response time. What other data do I need to provide here to get that?
If only 1500 of these can be served in 5 minutes, then:
1500 / 5 = 300 transactions per min can be served
300 / 60 = 5 transactions per second can be served
so how are they getting 17 completed transactions per second? Last time I checked, 5 < 17!
This doesn't seem to fit. Or am I looking at it wrongly?
I presume by user response time you mean the time it takes to serve a single transaction:
If they can serve 5 per second, then it takes 200ms (1/5) per transaction
If they can serve 17 per second, then it takes 59ms (1/17) per transaction
That is all we can tell from the given data. Perhaps clarify how many transactions are being done per second.
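For what it's worth, the same arithmetic as a short Python sketch. Note that turning throughput into a per-transaction time this way assumes the transactions are served one at a time; with concurrent workers, latency cannot be derived from throughput alone:

served = 1500
window_seconds = 5 * 60

throughput = float(served) / window_seconds   # 5.0 transactions per second, not 17
print(throughput)

print(1000 / throughput)                      # 200.0 ms per transaction at 5/s
print(1000 / 17.0)                            # ~58.8 ms per transaction at the claimed 17/s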
