Long-running script on Google App Engine - google-app-engine

I'm attempting to create a microservice on Google App Engine that is not intended to handle HTTP requests.
Instead, I was hoping to have a continuously running Python script that monitors a remote queue--RabbitMQ, to be precise--and sends out an api-call to another service as tasks are pushed to the queue.
I was wondering, firstly, is it possible to run a script upon deployment--one that did not originate with a user action/request?
Secondly, how would I accomplish this?
Thanks in advance for your time!

You can deploy your "script" as a manually scaled module -- see https://cloud.google.com/appengine/docs/python/modules/ -- with exactly one instance. As the docs say, "When you start a manual scaling instance, App Engine immediately sends a /_ah/start request to each instance"; so, just set that module's handler for /_ah/start to the handler you want to run (in the module's yaml file and the WSGI app in the Python code, using whatever lightweight framework you like -- webapp2, falcon, flask, bottle, or whatever else... the framework won't be doing much for you in this case save the one-off routing).
Note that the number of free machine hours for manual scaling modules is limited to 8 hours per day (for the smaller, B1 instance class; proportionally fewer for larger instance classes), so you may need to upgrade to paid-app status if you need to run for more than 8 hours.

Like #brant said, App Engine is designed to handle HTTP requests. It's not a perfect fit for background jobs, unless you try to wrap your logic into one http request.
Further, App Engine will emit an error when the response timeout, depending on your scaling settings. If you want to try it, consider basic or manual scaling.
For this type of workload, I would suggest you use a VM.

I think there are a few problems with this design.
First, App Engine is designed to be an HTTP request processor, not a RabbitMQ message processor. GAE is intended for many small requests, not one long-running process.
Second, "RabbitMQ should not be exposed to the public internet, it wasn't created for such use case."
I would recommend that you keep the RabbitMQ clients on the same internal network as the RabbitMQ broker, and have the clients send HTTP requests to App Engine.

Related

How to programmatically scale up app engine?

I have an application which uses app engine auto scaling. It usually runs 0 instances, except if some authorised users use it.
This application need to run automated voice calls as fast as possible on thousands of people with keypad interactions (no, it's not spam, it's redcall!).
Programmatically speaking, we ask Twilio to initialise calls through its Voice API 5 times/sec and it basically works through webhooks, at least 2, but most of the time 4 hits per call. So GAE need to scale up very quickly and some requests get lost (which is just a hang up on the user side) at the beginning of the trigger, when only one instance is ready.
I would like to know if it is possible to programmatically scale up App Engine (through an API?) before running such triggers in order to be ready when the storm will blast?
I think you may want to give warmup requests a try. As they load your app's code into a new instance before any live requests reach that instance. Thus reducing the time it takes to answer while you GAE instance has scaled down to zero.
The link I have shared with you, includes the PHP7 runtime as I see you are familiar with it.
I would also like to agree with John Hanley, since finding a sweet spot on how many idle instances you have available, would also help the performance of your app.
Finally, the solution was to delegate sending the communication through Cloud Tasks:
https://cloud.google.com/tasks/docs/creating-appengine-tasks
https://github.com/redcall-io/app/blob/master/symfony/src/Communication/Processor/QueueProcessor.php
Tasks can try again hitting the app engine in case of errors, and make the app engine pop new instances when the surge comes.

How to warm up app engine endpoint

I have appengine endpoint and trying to reduce latency on few first calls to newly created endpoint instance. Application is written in Java and endpoints are auto scaled.
To address this issue I configured idle instance, although even if instance is already created, first few calls routed to it consume some extra time. Following documentation I've implemented the custom servlet handling warm up requests and marked the EndpointsServlet as load on startup.
Inside the warm up servlet I've put code that initiates some commonly used services, load some data etc. Effect was almost impossible to notice.
After it I have implemented calls to methods exposed by the endpoint like that:
call("/_ah/api/teamly/v1/test/dummy")
It works for some cases (even most of them) and after calling few key methods instance is really ready to serve. The problem I'm facing now is that if I'm using auto scaling for some module I can't route the request to specific instance.
So the question is:
How should I properly warm up the endpoint instance to avoid load requests initiated from frontend.
You need to put a listener to /_ah/warmup and then make calls to any resources you want it to be warmed up. You can find detailed information at:
Configuring Warmup Requests to Improve Performance

Programatically listing and sending requests to dynamic App Engine instances

I want to send a particular HTTP request (or otherwise communicate a message) to every (dynamic/autoscaled) instance which is currently running for a particular App Engine application.
My goal is to trigger each instance to discard some locally cached data (because I have just modified the underlying data and want them to reload it).
One possible solution is to store a value in Memcache, and have instances check this each time they handle a request to see if they should flush their cache. But this adds latency to every request.
Another possible solution would be to somehow stop all running instances. No fixed overhead, but some impact while instances are restarted.
An even less desirable solution would be to redeploy the application code in order to cause all instances to be stopped. This now adds additional delay on my end as a deployment takes some time.
You could use the management API to list instances for a given version, but I'd suggest that you'd probably want to use something like the PubSub API to create a subscription on each of your App Engine instances. Since each instance has its own subscription, any messages sent to the monitored queue will be received by all instances.
You can create the subscription at startup (the /_ah/start endpoint may be useful), and then delete it at shutdown (using the /_ah/stop endpoint).

Using 1 intance of google-app-engine to monitor external service

I planning to create a NodeJS program, that work 24/7, that ping and make requests to an external server (outside of google cloud) every minute. Just to see that it the external services are are live.
If there is any error it will notify me by SMS & Email.
I don't need any front-end for this app, and no one needs to connect to it. Just simple NodeJS program.
The monitoring and configuration will be by texts files.
Now the questions:
It looks like it will cost me just $1.64. It sounds very cheap. Am I missing something?
It needs to work around the clock, I will request it to start it once, and it need to continue working, (by using setInterval). Is it will be aborted?
What it is exactly mean buy 1 instance. What an instance can do? Only respond to one request or what?
I tried to search in Google: appengine timeout, but didn't found anything that helps.
Free Quota
If you write your application in Python, PHP, Go or Java it can fit in free usage quota:
https://cloud.google.com/appengine/docs/quotas
So there will be absolutely no costs to run it on Google App Engine platform.
There are limit of 657,000 UrlFetch API Calls per day (more than 450 calls per minute in 24/7 mode) for free apps. 4GB traffic may also be sufficient for this kind of work.
Keep in mind there is no SMS sending services provided by Google App Engine and you will need to spend additional UrlFetch API calls to use external SMS services.
Email sending is also limited to 100 Emails per day (or 5000 Emails to admin address), so try not so send repeated notifications about same monitored server every minute, or you'll deplete your Email quote in 1.5 hours.
Scheduled Tasks
There is no way to run single process indefinitely without interruption on App Engine. But you don't have to!
You'll need to encapsulate all the work you're planning to execute in every iteration into single task and then schedule it to run every minute with Cron. See this documentation for Python: https://cloud.google.com/appengine/docs/python/config/cron
It is recommended to have some configuration page where you can set some internal configuration or see monitoring statistics, at least manage flag to temporarily pause tasks execution without redeploying your app.

Google Application Engine - Using URL Fetch Service

I've looked at http://code.google.com/appengine/docs/java/urlfetch/overview.html
but the code does not show a pooling example,
i mean if i want to fetch www.example.com/1.html, www.example.com/3.html, www.example.com/3.html, ...., www.example.com/1000.html
I'd have to open 1000 connection and close 1000 connections.
I think I could just open 1 connection 'keep-alive', and issue 1000 request and then close it.
that should be faster.
but i have no idea how to do that using url.openStream()
The URLFetch service operates at a higher level of abstraction than individual connections, and the native Python and Java libraries that use it are modified to use this service. As such, you have no direct control over connections - but you can expect that the underlying service will keep connections open when it deems it appropriate.
Unfortunately, as the docs for Java App Engine say, at this time "The Java API for the URL Fetch service only supports synchronous requests". The Python version of App Engine does support async requests, so, if porting to Python is just unthinkable, you may wait in the reasonable hope that such functionality will eventually be in the Java side too. After all, the Python version has been around for a year more, so of course it's more mature, stable, and function-rich.

Resources