I have a Python 3.7 project which communicates with Datastore using the google.cloud.ndb library.
I've noticed that the first request after an instance is brought up is always an order of magnitude (several seconds) slower than subsequent ones. This is true even when running locally with an emulated Datastore. I've verified that the delay is due to the first ndb.Key(...).get() that gets run. Presumably the Datastore connection takes some time to set up?
Has anyone found a way to reduce this delay?
Code example:
from flask import Flask
from google.cloud import ndb
import time

client = ndb.Client()

def ndb_wsgi_middleware(wsgi_app):
    # Run every request inside an ndb context.
    def middleware(environ, start_response):
        with client.context():
            return wsgi_app(environ, start_response)
    return middleware

app = Flask(__name__)
app.wsgi_app = ndb_wsgi_middleware(app.wsgi_app)

@app.route('/main')
def main():
    now_ts = time.time()
    org = ndb.Key(Org, 1).get()
    print('Finished get in %f' % (time.time() - now_ts))
    return 'Does not exist' if org is None else 'Exists'

class Org(ndb.Model):
    pass

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8080, debug=True)
Output after 2 localhost:8080/main fetches from browser (using the local datastore emulator brought up by the command gcloud beta emulators datastore start):
Finished get in 2.043116
127.0.0.1 - - [09/Oct/2019 22:41:49] "GET /main HTTP/1.1" 200 -
Finished get in 0.001995
127.0.0.1 - - [09/Oct/2019 22:41:56] "GET /main HTTP/1.1" 200 -
The reason you are experiencing this behavior is that, unless configured otherwise, all App Engine applications shut down their instances when idle. When a new request then arrives, an instance must spin back up before it can respond, which results in a higher response time. This is by design: it avoids taking up resources and generating charges unnecessarily, since instances are charged by the minute while they are running.
You can avoid this with warm-up requests, a special type of loading request that prepares an instance before live requests reach it.
Another option is to set up automatic scaling on your application using modules and set a value for minimum idle instances. At least one instance will then be running at all times, so no loading requests are needed. However, because these instances run constantly, this will incur additional charges.
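A minimal sketch of the warm-up approach, assuming warm-up requests are enabled in app.yaml (inbound_services: warmup), in which case App Engine may send GET /_ah/warmup to a new instance before routing live traffic to it. The names make_warmup_middleware, prime and demo_app are hypothetical. In the question's app, prime would be a function that does a throwaway ndb.Key(...).get() inside client.context(), so that the slow first Datastore call happens before a user request. Warm-up requests are not guaranteed to be sent, so this can reduce the delay but may not remove it.

```python
def make_warmup_middleware(wsgi_app, prime):
    """Wrap a WSGI app so that a request to /_ah/warmup calls prime() and returns 200."""

    def middleware(environ, start_response):
        if environ.get("PATH_INFO") == "/_ah/warmup":
            prime()  # e.g. a throwaway Datastore read (assumption)
            start_response("200 OK", [("Content-Type", "text/plain")])
            return [b"warmed up"]
        # Every other path goes to the wrapped application unchanged.
        return wsgi_app(environ, start_response)

    return middleware


def demo_app(environ, start_response):
    """Stand-in for the real Flask app.wsgi_app."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


primed = []  # records each time the priming callback ran
app = make_warmup_middleware(demo_app, lambda: primed.append(True))
```

With Flask, the same wrapping pattern as ndb_wsgi_middleware in the question applies: app.wsgi_app = make_warmup_middleware(app.wsgi_app, prime).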
I successfully deployed a twitter screenshot bot on Google App Engine.
This is my first time deploying.
First thing I noticed was that the app didn't start running until I clicked the link.
When I did, the app worked successfully (replied to tweets with screenshots) as long as the tab was loading and open.
When I closed the tab, the bot stopped working.
Also, in the cloud shell log, I saw:
Handling signal: term
[INFO] Worker exiting (pid 18)
This behaviour surprises me, as I expected it to keep running on Google's servers indefinitely.
My bot works by streaming from the Twitter API. The "worker exiting" line above also surprises me.
Here is the relevant code:
def get_stream(set):
    global servecount
    with requests.get(f"https://api.twitter.com/2/tweets/search/stream?tweet.fields=id,author_id&user.fields=id,username&expansions=author_id,referenced_tweets.id", auth=bearer_oauth, stream=True) as response:
        print(response.status_code)
        if response.status_code == 429:
            print(f"returned code 429, waiting for 60 seconds to try again")
            print(response.text)
            time.sleep(60)
            return
        if response.status_code != 200:
            raise Exception(
                f"Cannot get stream (HTTP {response.status_code}): {response.text}"
            )
        for response_line in response.iter_lines():
            if response_line:
                json_response = json.loads(response_line)
                print(json.dumps(json_response, indent=4))
                if json_response['data']['referenced_tweets'][0]['type'] != "replied_to":
                    print(f"that was a {json_response['data']['referenced_tweets'][0]['type']} tweet not a reply. Moving on.")
                    continue
                uname = json_response['includes']['users'][0]['username']
                tid = json_response['data']['id']
                reply_tid = json_response['includes']['tweets'][0]['id']
                or_uid = json_response['includes']['tweets'][0]['author_id']
                print(uname, tid, reply_tid, or_uid)
                followers = api.get_follower_ids(user_id='1509540822815055881')
                uid = int(json_response['data']['author_id'])
                if uid not in followers:
                    try:
                        client.create_tweet(text=f"{uname}, you need to follow me first :)\nPlease follow and retry. \n\n\nIf there is a problem, please speak with my creator, #JoIyke_", in_reply_to_tweet_id=tid, media_ids=[mid])
                    except:
                        print("tweet failed")
                    continue
                mid = getmedia(uname, reply_tid)
                #try:
                client.create_tweet(text=f"{uname}, here is your screenshot: \n\n\nIf there is a problem, please speak with my creator, #JoIyke_", in_reply_to_tweet_id=tid, media_ids=[mid])
                #print(f"served {servecount} users with screenshot")
                #servecount += 1
                #except:
                #    print("tweet failed")
                editlogger()

def main():
    servecount, tries = 1, 1
    rules = get_rules()
    delete = delete_all_rules(rules)
    set = set_rules(delete)
    while True:
        print(f"starting try: {tries}")
        get_stream(set)
        tries += 1
If this is important, my app.yaml file has only one line:
runtime: python38
and I deployed the app from cloud shell with gcloud app deploy app.yaml
What can I do?
I have searched and can't seem to find a solution. Also, this is my first time deploying an app successfully.
Thank you.
Google App Engine works on demand, i.e. when it receives an HTTP(S) request.
Neither warmup requests nor min_instances > 0 will meet your needs. A warmup request tries to 'start up' an instance before your requests come in. Setting min_instances > 0 simply tells App Engine not to kill the instance, but you still need an HTTP request to invoke the service (which is what you did by opening a browser tab and entering your app's URL).
You may ask: since you've 'started up' the instance by opening a browser tab, why doesn't it keep running afterwards? The answer is that every request to a Google App Engine (Standard) app must complete within 1 to 10 minutes, depending on the type of scaling your app is using (see documentation). For Google App Engine Flexible, the timeout goes up to 60 minutes. This means your service will time out after at most 10 minutes on GAE Standard or 60 minutes on GAE Flexible.
I think the best solution for you on GCP is to use Google Compute Engine (GCE). Spin up a virtual server (pick the lowest configuration so you can stick within the free tier). If you use GCE, it means you spin up a Virtual Machine (VM), deploy your code to it and kick off your code. Your code then runs continuously.
App Engine works on demand, i.e. it is only up while there are requests to the app (this is why the app works when you click on the URL). Although you can set one instance to be "running all the time" (min_instances), that is an anti-pattern both for what you want to accomplish and for App Engine. Please read How Instances are Managed.
Looking at your code, you're pulling data from Twitter every minute, so the best option for you is Cloud Scheduler + Cloud Functions.
Cloud Scheduler will call your function, which checks whether there is data to process; if not, the process terminates. This helps you save costs because, instead of having something running all the time, the function only runs for as long as needed.
On the other hand, I'm not an expert with the Twitter API, but if there is a way for Twitter to call your function directly instead of you pulling data from Twitter, that would be better: you can optimize your costs, and the function will only run when there is data to process instead of checking every n minutes.
As advice: first review all the options you have in GCP or whichever provider you'll use, then choose the best one for your use case. Selecting one just because it works with your programming language will not necessarily work as you expect, as in this case.
I need to make decisions in an external system based on the current CPU utilization of my App Engine Flexible service. I can see the exact values / metrics I need to use in the dashboard charting in my Google Cloud Console, but I don't see a direct, easy way to get this information from something like a gcloud command.
I also need to know the count of running instances, but I think I can use gcloud app instances list -s default to get a list of my running instances in the default service, and then I can use a count of lines approach to get this info easily. I intend to make a python function which returns a tuple like (instance_count, cpu_utilization).
I'd appreciate if anyone can direct me to an easy way to get this. I am currently exploring the StackDriver Monitoring service to get this same information, but as of now it is looking super-complicated to me.
You can use the gcloud app instances list -s default command to get the list of running instances, as you said. To retrieve CPU utilization, have a look at the Python Client for Stackdriver Monitoring. To list the available metric types:
from google.cloud import monitoring
client = monitoring.Client()
for descriptor in client.list_metric_descriptors():
    print(descriptor.type)
Metric descriptors are described here. To display utilization across your GCE instances during the last five minutes:
metric = 'compute.googleapis.com/instance/cpu/utilization'
query = client.query(metric, minutes=5)
print(query.as_dataframe())
Do not forget to add google-cloud-monitoring==0.28.1 to “requirements.txt” before installing it.
Check this code that locally runs for me:
import logging
from flask import Flask
from google.cloud import monitoring as mon

app = Flask(__name__)

@app.route('/')
def list_metric_descriptors():
    """Return all metric descriptors"""
    # Instantiate client
    client = mon.Client()
    for descriptor in client.list_metric_descriptors():
        print(descriptor.type)
    # Only the last descriptor type is returned in the response body.
    return descriptor.type

@app.route('/CPU')
def cpuUtilization():
    """Return CPU utilization"""
    client = mon.Client()
    metric = 'compute.googleapis.com/instance/cpu/utilization'
    query = client.query(metric, minutes=5)
    print(type(query.as_dataframe()))
    print(query.as_dataframe())
    data = str(query.as_dataframe())
    return data

@app.errorhandler(500)
def server_error(e):
    logging.exception('An error occurred during a request.')
    return """
    An internal error occurred: <pre>{}</pre>
    See logs for full stacktrace.
    """.format(e), 500

if __name__ == '__main__':
    # This is used when running locally. Gunicorn is used to run the
    # application on Google App Engine. See entrypoint in app.yaml.
    app.run(host='127.0.0.1', port=8080, debug=True)
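For the instance-count half of the (instance_count, cpu_utilization) tuple asked about in the question, here is a minimal sketch that shells out to gcloud. It assumes gcloud is installed and authenticated, and that --format "value(id)" prints one instance ID per line with no header row. parse_instance_ids and count_instances are hypothetical helper names, not part of any Google library.

```python
import subprocess


def parse_instance_ids(gcloud_output):
    """Return the non-empty lines of gcloud's value(id) output."""
    return [line.strip() for line in gcloud_output.splitlines() if line.strip()]


def count_instances(service="default"):
    """Count running instances of a service by calling the gcloud CLI.

    Assumes `gcloud` is on PATH; raises CalledProcessError if the command fails.
    """
    result = subprocess.run(
        ["gcloud", "app", "instances", "list",
         "--service", service, "--format", "value(id)"],
        check=True, capture_output=True, text=True,
    )
    return len(parse_instance_ids(result.stdout))
```

Parsing one ID per line avoids the off-by-one that a plain line count of the default table output would have because of its header row.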
I have a Google App Engine program that calls BigQuery for data.
The query usually takes 3 - 4.5 seconds and is fine but sometimes takes over five seconds and throws this error:
DeadlineExceededError: The API call urlfetch.Fetch() took too long to respond and was cancelled.
This article shows the deadlines and the different kinds of deadline errors.
Is there a way to set the deadline for a BigQuery job to be above 5 seconds? Could not find it in the BigQuery API docs.
BigQuery queries are fast, but often take longer than the default App Engine urlfetch timeout. The BigQuery API is async, so you need to break up the steps into API calls that each are shorter than 5 seconds.
For this situation, I would use the App Engine Task Queue:
1. Make a call to the BigQuery API to insert your job. This returns a JobID.
2. Place a task on the App Engine task queue to check the status of the BigQuery query job at that ID.
3. If the BigQuery Job Status is not "DONE", place a new task on the queue to check it again.
4. If the Status is "DONE", make a call using urlfetch to retrieve the results.
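The steps above can be sketched as one small function that a task handler would run each time. This is only an illustration under stated assumptions: check_job, run_until_done and the three callables (get_state, enqueue_check, fetch_results) are hypothetical stand-ins for the BigQuery jobs.get call, the task-queue add call and the results fetch, and the task queue is simulated with an in-memory list.

```python
def check_job(job_id, get_state, enqueue_check, fetch_results):
    """One task-queue step: re-enqueue until the job is DONE, then fetch results."""
    state = get_state(job_id)
    if state != "DONE":
        enqueue_check(job_id)  # check again in a later task
        return None
    return fetch_results(job_id)


def run_until_done(job_id, get_state, fetch_results):
    """Simulate the task queue locally by draining an in-memory list."""
    pending = [job_id]
    checks = 0
    while pending:
        current = pending.pop(0)
        checks += 1
        result = check_job(current, get_state, pending.append, fetch_results)
        if result is not None:
            return result, checks
    return None, checks
```

Each call does a single short status check, so no individual API call has to stay open for longer than the urlfetch deadline.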
Note: I would go with Michael's suggestion, since that is the most robust. I just wanted to point out that you can increase the urlfetch timeout up to 60 seconds, which should be enough time for most queries to complete.
How to set timeout for urlfetch in Google App Engine?
I was unable to get the urlfetch.set_default_fetch_deadline() method to apply to the BigQuery API, but was able to increase the timeout when authorizing the BigQuery session as follows:
import time
from apiclient.discovery import build
from httplib2 import Http
from oauth2client.service_account import ServiceAccountCredentials

credentials = ServiceAccountCredentials.from_json_keyfile_dict(credentials_dict, scopes)
# Create an authorized session and set the URL fetch timeout.
http_auth = credentials.authorize(Http(timeout=60))
# Build the service.
service = build(service_name, version, http=http_auth)
# Make the query.
request = service.jobs().query(body=query_body).execute()
Or with an asynchronous approach using jobs().insert:
query_response = service.jobs().insert(body=query_body).execute()
big_query_job_id = query_response['jobReference']['jobId']
# Poll the jobs.get endpoint until the job is complete.
while True:
    job_status_response = service.jobs()\
        .get(jobId=big_query_job_id).execute()
    if job_status_response['status']['state'] == 'DONE':
        break
    time.sleep(1)
results_response = service.jobs()\
    .getQueryResults(**query_params)\
    .execute()
We ended up going with an approach similar to what Michael suggests above, however even when using the asynchronous call, the getQueryResults method (paginated with a small maxResults parameter) was timing out on url fetch, throwing the error posted in the question.
So, in order to increase the timeout of URL Fetch in Big Query / App Engine, set the timeout accordingly when authorizing your session.
To issue HTTP requests in AppEngine you can use urllib, urllib2, httplib, or urlfetch. However, no matter what library you choose, AppEngine will perform HTTP requests using App Engine's URL Fetch service.
The googleapiclient uses httplib2. It looks like httplib2.Http passes its timeout to urlfetch. Since that timeout has a default value of None, urlfetch sets the deadline of the request to 5s no matter what you set with urlfetch.set_default_fetch_deadline.
Under the covers httplib2 uses the socket library for HTTP requests.
To set the timeout you can do the following:
import socket
socket.setdefaulttimeout(30)
You should also be able to do this but I haven't tested it:
http = httplib2.Http(timeout=30)
If you don't have existing code to time the request you can wrap your query like so:
import time
start_query = time.time()
<your query code>
end_query = time.time()
print(end_query - start_query)
This is one way to solve bigquery timeouts in AppEngine for Go. Simply set TimeoutMs on your queries to well below 5000. The default timeout for bigquery queries is 10000ms which is over the default 5 second deadline for outgoing requests in AppEngine.
The gotcha is that the timeout must be set both in the initial request: bigquery.service.Jobs.Query(…) and the subsequent b.service.Jobs.GetQueryResults(…) which you use to poll the query results.
Example:
query := &gbigquery.QueryRequest{
	DefaultDataset: &gbigquery.DatasetReference{
		DatasetId: "mydatasetid",
		ProjectId: "myprojectid",
	},
	Kind:      "json",
	Query:     "<insert query here>",
	TimeoutMs: 3000, // <- important!
}
queryResponse, err := bigquery.service.Jobs.Query("myprojectid", query).Do()
// handle err, then determine if queryResponse is a completed job and if not start to poll
queryResponseResults, err := bigquery.service.Jobs.
	GetQueryResults("myprojectid", queryResponse.JobReference.JobId).
	TimeoutMs(DefaultTimeoutMS). // <- important!
	Do()
// handle err, then determine if queryResponseResults is a completed job and if not continue to poll
The nice thing about this is that you maintain the default request deadline for the overall request (60s for normal requests and 10min for tasks and cronjobs) while avoiding setting the deadline for outgoing requests to some arbitrary large value.
I'm using urlfetch in my app, and while everything works perfectly fine in the development environment, I'm finding urlfetch to be VERY unreliable when it's actually deployed. Sometimes it works as it should (retrieving data), but then a few minutes later it might return nothing, then it'll be working fine again a few minutes after that. This is unacceptable. I've checked to make sure it's NOT the source URL that's the problem (YQL) and, again, everything works as it should in the development environment.
Are there any third-party libraries I could try?
Example code:
url = "http://query.yahooapis.com/v1/public/yql?q=%s&format=json" % urllib.quote_plus(query)
result = urlfetch.fetch(url, deadline=10)
if result.status_code == 200:
    r = json.loads(result.content)
else:
    return
a = r['query']['results']
# Do stuff with 'a'
Sometimes it'll work as it should, but other times, completely randomly and with no code changes, I'll get this error:
a = r['query']['results']
TypeError: 'NoneType' object is unsubscriptable
Sometimes it'll work as it should,
but other times completely randomly with no code changes
This is a common symptom that your application's requests have exceeded the Yahoo API calls rate limit.
Quoting the Yahoo developer documentation on rate limits:
IP Based Limits
Our service rate limits are imposed as
a limit on the number of API calls
made per IP address during a specific
time window. If your IP address
changes during that time period, you
may find yourself with more "credit"
available. However, if someone else
had been using the address and hit the
limit, you'll need to wait until the
end of the time period to be allowed
to make more API calls.
Google App Engine uses a pool of IP addresses for outgoing urlfetch requests and your application is sharing these IP addresses with other applications that are calling the same Yahoo endpoint; when the rate limit is exceeded, the endpoint replies with a limit exceeded error causing UrlFetch to fail.
Here is another case using the Twitter search API.
When you mix Google App Engine with third-party web APIs, you need to be sure that the API provides authenticated calls, allowing your application to have its own quota (the StackApps API, for example).
import urllib2
response = urllib2.urlopen('http://python.org/')
html = response.read()
This isn't an error in URLFetch; it's an issue with the JSON being returned. Either json.loads is returning None, or r['query'] is. I'm guessing it's probably the latter. Try logging result.content to see what the service is returning. You probably also want to check result.status_code.
One possibility is that your request is being denied or ratelimited by Yahoo in production, but not on your development machine.
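A minimal sketch of a defensive version of the parsing step, so that a reply without results (for example a rate-limit or error body) yields None instead of raising TypeError. extract_results is a hypothetical helper name, and it assumes the JSON shape used in the question: an object with a "query" member that holds "results".

```python
import json


def extract_results(status_code, content):
    """Return the YQL results object, or None if the reply has none."""
    if status_code != 200:
        return None
    try:
        payload = json.loads(content)
    except ValueError:  # body was not valid JSON
        return None
    if not isinstance(payload, dict):
        return None
    query = payload.get("query")
    if not isinstance(query, dict):
        return None
    # May still be None when the service returns "results": null.
    return query.get("results")
```

In the question's code this would be called as a = extract_results(result.status_code, result.content), followed by logging result.content whenever a is None.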
I need to make an application that polls the server often, but GAE has limits on requests, so making a lot of requests could be very costly. Is it possible to use long polling and make requests wait for changes for the maximum of 30 seconds?
Google App Engine has a new feature, the Channel API, which makes it
possible to build a good real-time application.
Another solution is to use a third-party comet server such as mochiweb
or twisted with an iframe pattern.
Client1, waiting a event:
client1 --Iframe Pattern--> Erlang/Mochiweb(HttpLongPolling):
Client2, sending a message:
client2 --XhrIo--> AppEngine --UrlFetch--> Erlang/Mochiweb
To use mochiweb with the comet pattern, Richard Jones has written a good
article (search Google for: Richard Jones A Million-user Comet Application).
We've tried implementing a Comet-like long-polling solution on App Engine, with mixed results.
def wait_for_update(request, blob):
    """
    Wait for blob update, if wait option specified in query string.
    Otherwise, return 304 Not Modified.
    """
    wait = request.GET.get('wait', '')
    if not wait.isdigit():
        return blob
    start = time.time()
    deadline = start + int(wait)
    original_sha1 = blob.sha1
    try:
        while time.time() < deadline:
            # Sleep one or two seconds.
            elapsed = time.time() - start
            time.sleep(1 if elapsed < 7 else 2)
            # Try to read updated blob from memcache.
            logging.info("Checking memcache for blob update after %.1fs",
                         elapsed)
            blob = Blob.cache_get_by_key_name(request.key_name)
            # Detect changes.
            if blob is None or blob.sha1 != original_sha1:
                break
    except DeadlineExceededError:
        logging.info("Caught DeadlineExceededError after %.1fs",
                     time.time() - start)
    return blob
The problem I'm seeing is that requests following a long-polling one are getting serialized (synchronized) behind the long-polling request. I can look at a trace in Chrome and see a timeline like this:
Request 1 sent. GET (un-modified) blob (wait until changed).
Request 2 sent. Modify the blob.
After full time-out, Request 1 returns (data unmodified).
Request 2 gets processed on server, and returns success.
I've used Wireshark and Chrome/timeline to confirm that I AM sending the modification request to the server on a distinct TCP connection from the long-polling one. So this synchronization must be happening on the App Engine production server. Google doesn't document this detail of the server behavior, as far as I know.
I think waiting for the channel API is the best hope we have of getting good real-time behavior from App Engine.
I don't think long polling is possible. The default request timeout for Google App Engine is 30 seconds.
With long polling, if the message takes more than 30 seconds to generate, the request will fail.
You are probably better off using short polling.
Another approach is to "simulate" long polling within the 30-second limit. If a message does not arrive within, say, 20 seconds, the server can send a "token" message instead of a normal message, requiring the client to consume it and connect again.
There seems to be a feature request (which has been accepted) on Google App Engine for long polling.
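A minimal sketch of the "simulated" long polling described above, assuming the server side can check for a pending message with a cheap call. wait_for_message and poll are hypothetical names, and the message format (a dict with a "type" key) is only an illustration. The clock and sleep parameters exist so the loop can be exercised without real waiting.

```python
import time


def wait_for_message(poll, timeout=20.0, interval=1.0,
                     clock=time.time, sleep=time.sleep):
    """Poll until a message arrives or `timeout` seconds pass.

    Returns {"type": "message", "data": ...} when poll() yields something,
    otherwise {"type": "token"} so the client knows to reconnect.
    """
    deadline = clock() + timeout
    while True:
        message = poll()
        if message is not None:
            return {"type": "message", "data": message}
        if clock() >= deadline:
            return {"type": "token"}
        sleep(interval)
```

Keeping the timeout at 20 seconds leaves headroom under the 30-second request limit. The client treats a "token" reply as a signal to issue the next request immediately.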