How to warm up app engine endpoint - google-app-engine

I have appengine endpoint and trying to reduce latency on few first calls to newly created endpoint instance. Application is written in Java and endpoints are auto scaled.
To address this issue I configured idle instance, although even if instance is already created, first few calls routed to it consume some extra time. Following documentation I've implemented the custom servlet handling warm up requests and marked the EndpointsServlet as load on startup.
Inside the warm up servlet I've put code that initiates some commonly used services, load some data etc. Effect was almost impossible to notice.
After it I have implemented calls to methods exposed by the endpoint like that:
call("/_ah/api/teamly/v1/test/dummy")
It works for some cases (even most of them) and after calling few key methods instance is really ready to serve. The problem I'm facing now is that if I'm using auto scaling for some module I can't route the request to specific instance.
So the question is:
How should I properly warm up the endpoint instance to avoid load requests initiated from frontend.

You need to put a listener to /_ah/warmup and then make calls to any resources you want it to be warmed up. You can find detailed information at:
Configuring Warmup Requests to Improve Performance

Related

How to programmatically scale up app engine?

I have an application which uses app engine auto scaling. It usually runs 0 instances, except if some authorised users use it.
This application need to run automated voice calls as fast as possible on thousands of people with keypad interactions (no, it's not spam, it's redcall!).
Programmatically speaking, we ask Twilio to initialise calls through its Voice API 5 times/sec and it basically works through webhooks, at least 2, but most of the time 4 hits per call. So GAE need to scale up very quickly and some requests get lost (which is just a hang up on the user side) at the beginning of the trigger, when only one instance is ready.
I would like to know if it is possible to programmatically scale up App Engine (through an API?) before running such triggers in order to be ready when the storm will blast?
I think you may want to give warmup requests a try. As they load your app's code into a new instance before any live requests reach that instance. Thus reducing the time it takes to answer while you GAE instance has scaled down to zero.
The link I have shared with you, includes the PHP7 runtime as I see you are familiar with it.
I would also like to agree with John Hanley, since finding a sweet spot on how many idle instances you have available, would also help the performance of your app.
Finally, the solution was to delegate sending the communication through Cloud Tasks:
https://cloud.google.com/tasks/docs/creating-appengine-tasks
https://github.com/redcall-io/app/blob/master/symfony/src/Communication/Processor/QueueProcessor.php
Tasks can try again hitting the app engine in case of errors, and make the app engine pop new instances when the surge comes.

Is catch-all handler pointing to "auto" a bad idea?

My instance has little to no traffic but I have a min-idle instance set to 1. What I notice is that whenever there is a random url (via some bot) that doesn't exist is accessed, it is considered a dynamic request since my catch all handler is auto. This is fine, except I see these 404 errors (404 because there are no http handlers associated with these url patterns even though the yaml defines a catch all pattern) resulting in instance restarts. Why should the instance restart if it runs into 404 errors?
I have all my dynamic handlers follow "/api" pattern and then a few that don't. So, I can explicitly list all valid patterns and map them to the auto handler. Would that then consider these random links as static but not present and throw 404 error (which I am fine with)? I want to make sure the instance doesn't keep running just because of some rouge requests.
I just did a local experiment (I don't presently have any quickly deployable play app) and it looks like your quite interesting idea could work.
I replaced the .* pattern previously catching all stragglers and routing them to my default service script (I'm using the python runtime) with specific patterns, then added this handler after all others:
- url: /(.*)$
static_files: images/\1
upload: images/.*
My images directory is real, holding static images (but for which I already have another handler with a more specific pattern).
With this in place I made a request to /crap and got, as expected (there is no images/crap file):
INFO 2019-11-08 03:06:02,463 module.py:861] default: "GET /crap
HTTP/1.1" 404 -
I added logging calls in my script handler's get() and dispatch() calls to confirm they're not actually getting invoked (the development server request logging casts a bit of doubt).
I also checked on an already deployed GAE app that requesting an image that matches a static handler pattern but which doesn't actually exist gets the 404 answer without causing a service's instance to be started (no instance was running at the time), i.e. it comes directly from the GAE's static content CDN.
So I think it's well worth a try with the go runtime, this could save some significant instance time for an app without a lot of activity faced with random bot traffic.
As for the instance restarts, I suspect what you see is just a symptom of your min-idle instance set to 1. Unlike a dynamic instance the idle (aka resident) instance is not normally meant to handle traffic, it's just ready to do it if/when needed. Only when there is no dynamic instance running (and able to handle incoming traffic efficiently) and a new request comes in that request is immediately routed to the idle instance. At that moment:
the idle instance becomes a dynamic one and will continue to serve traffic until it shuts due to inactivity or dies
a fresh idle instance is started to meet the min-idle configuration, it will remain idle until another similar event occurs
Note: your idea will help with the instance hours portion used by the dynamic instances, but not with the idle instance portion.
According to the documentation which quotes the following:
"When an instance responds to the request /_ah/startwith an HTTP status code of 200–299 or 404, it is considered to have started correctly and that it can handle additional requests. Otherwise, App Engine cancels the instance. Instances with manual scale adjustment restart immediately, while instances with basic scale adjustment restart only when necessary to deliver traffic."
You can find more detail about how instances are managed for Standard App Engine environment for Go 1.12 on the link: https://cloud.google.com/appengine/docs/standard/go112/how-instances-are-managed
As well, I recommend you to read the document "How instances are managed", on which quotes the following:
"Secondary routing
If a request matches the part [YOUR_PROJECT_ID].appspot.comof the host name, but includes the name of a service, version, or instance that does not exist, the service is routed default. Secondary routing does not apply to custom domains; requests sent to these domains will show an HTTP status code 404if the hostname is not valid."
https://cloud.google.com/appengine/docs/standard/go112/how-instances-are-managed

Programatically listing and sending requests to dynamic App Engine instances

I want to send a particular HTTP request (or otherwise communicate a message) to every (dynamic/autoscaled) instance which is currently running for a particular App Engine application.
My goal is to trigger each instance to discard some locally cached data (because I have just modified the underlying data and want them to reload it).
One possible solution is to store a value in Memcache, and have instances check this each time they handle a request to see if they should flush their cache. But this adds latency to every request.
Another possible solution would be to somehow stop all running instances. No fixed overhead, but some impact while instances are restarted.
An even less desirable solution would be to redeploy the application code in order to cause all instances to be stopped. This now adds additional delay on my end as a deployment takes some time.
You could use the management API to list instances for a given version, but I'd suggest that you'd probably want to use something like the PubSub API to create a subscription on each of your App Engine instances. Since each instance has its own subscription, any messages sent to the monitored queue will be received by all instances.
You can create the subscription at startup (the /_ah/start endpoint may be useful), and then delete it at shutdown (using the /_ah/stop endpoint).

Long-running script on Google App Engine

I'm attempting to create a microservice on Google App Engine that is not intended to handle HTTP requests.
Instead, I was hoping to have a continuously running Python script that monitors a remote queue--RabbitMQ, to be precise--and sends out an api-call to another service as tasks are pushed to the queue.
I was wondering, firstly, is it possible to run a script upon deployment--one that did not originate with a user action/request?
Secondly, how would I accomplish this?
Thanks in advance for your time!
You can deploy your "script" as a manually scaled module -- see https://cloud.google.com/appengine/docs/python/modules/ -- with exactly one instance. As the docs say, "When you start a manual scaling instance, App Engine immediately sends a /_ah/start request to each instance"; so, just set that module's handler for /_ah/start to the handler you want to run (in the module's yaml file and the WSGI app in the Python code, using whatever lightweight framework you like -- webapp2, falcon, flask, bottle, or whatever else... the framework won't be doing much for you in this case save the one-off routing).
Note that the number of free machine hours for manual scaling modules is limited to 8 hours per day (for the smaller, B1 instance class; proportionally fewer for larger instance classes), so you may need to upgrade to paid-app status if you need to run for more than 8 hours.
Like #brant said, App Engine is designed to handle HTTP requests. It's not a perfect fit for background jobs, unless you try to wrap your logic into one http request.
Further, App Engine will emit an error when the response timeout, depending on your scaling settings. If you want to try it, consider basic or manual scaling.
For this type of workload, I would suggest you use a VM.
I think there are a few problems with this design.
First, App Engine is designed to be an HTTP request processor, not a RabbitMQ message processor. GAE is intended for many small requests, not one long-running process.
Second, "RabbitMQ should not be exposed to the public internet, it wasn't created for such use case."
I would recommend that you keep the RabbitMQ clients on the same internal network as the RabbitMQ broker, and have the clients send HTTP requests to App Engine.

NLog's WebService target: How to sequentially send the requests?

I use NLog's WebService target in Silverlight and run into a problem if the logging service is unavailable.
What happens is that all calls to the logging service hang for a long time until they time out.
This is firstly ugly and secondly problematic in the face of a request limit, which I have under my given circumstances. After the request limit is reached due to several pending logging requests, the application also fails to make requests that are not logging related.
Ideally I'd like a WebService target that sends the requests sequentially, but I can't configure it to do that, can I?
Since I have full control about the logging server I could also move to a different target, but I'd rather have a purely configuration-based solution.
Some time back I implemented a logging target like that for Silverlight. We were using Common.Logging for .NET and it did not support Silverlight. So, we ported part of Common.Logging to Silverlight and implemented a "logging service adapter" to send our logging messages to a logging service. I implemented a logging queue using the producer/consumer pattern. Maybe you will find it useful.
In the end, the project that I was working on when I implemented this didn't go anywhere, so this particular piece of code is not in use.
Using WCF service via async interface from worker thread, how do I ensure that events are sent from the client "in order"

Resources