Use App Engine Modules to run cron job in F1 instance class - google-app-engine

I have a frequent cron job in my app on google app engine standard, and it's using tons of instance hours to perform a quick task. I find that the instance hours problem goes away if I switch the app to F1 in app.yaml, but the web front-end needs more power (been using F4_1G).
It seems like a simple solution would be to use App Engine Modules to run the cron job on F1 while keeping the rest of the app on F4_1G, but the documentation is short on actual code. Can somebody please show how this can be accomplished?

This doesn't actually require code changes; it's controlled by your project's configuration (YAML) files.
You create a service (formerly module) by specifying it in a separate .yaml file, deploying the service, and then telling your cron job to run on that service.
Let's assume you want to create a service called "lightweight".
Start by copying your existing app.yaml to lightweight.yaml,
add (or modify) the "service" line to read "service: lightweight", and set instance_class to F1.
Optionally, clean up the handlers so that only the ones you need for your cron job are present.
For example, lightweight.yaml:
application: yourapp
service: lightweight
version: 0-4
runtime: python27
api_version: 1
threadsafe: true
instance_class: F1
handlers:
- url: /mycronjob
  script: main.app
  login: admin
Then, in your cron.yaml, specify the service as your target.
cron:
- description: example
  url: /mycronjob
  schedule: every 5 minutes
  target: lightweight
Once that is done, deploy lightweight.yaml and cron using gcloud or appcfg.
Once deployed, your cron job will run on the lightweight service, using an F1 instance. You can also access the lightweight service directly in your browser at lightweight.yourapp.appspot.com.
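For completeness, here is a minimal sketch of what the handler behind "script: main.app" could look like. It assumes a main.py exposing a plain WSGI callable named app; the handle_cron_job function is a hypothetical placeholder for your actual task and is not part of the original setup. App Engine cron treats a 2xx response as success, so the handler returns 200 when the task completes.

```python
# main.py (sketch): a plain WSGI app serving only the cron URL.

def handle_cron_job():
    # Hypothetical placeholder: put the real periodic task here.
    return "cron job done"


def app(environ, start_response):
    """WSGI entry point matching 'script: main.app' in lightweight.yaml."""
    if environ.get("PATH_INFO") == "/mycronjob":
        body = handle_cron_job().encode("utf-8")
        status = "200 OK"
    else:
        body = b"not found"
        status = "404 Not Found"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]
```

Access control is left to "login: admin" in the YAML, so the handler itself does no authentication.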

Related

Google App Engine Manual Scaling Prevents Restart

I have a Python App Engine app that handles API results, and it's stateful. However, it seems that after a few hours of inactivity (no requests) the server shuts off, resetting all state; when a new request is made, it starts listening again.
But the state has been reset. I want the server to stay up unchanged 24/7 and not reset or restart, because I need to maintain state.
I have configured it as per the documentation but it's still restarting, and I am not sure what's wrong.
Here is my app.yaml:
runtime: python37
entrypoint: python main.py
manual_scaling:
  instances: 1
In App Engine, the general recommendation is to create stateless applications, as mentioned in the documentation:
Your app should be "stateless" so that nothing is stored on the instance.
As an alternative, to keep the application from being restarted you can deploy it on Compute Engine. Because that service is a virtual machine, you have full control over its state.

Securing a service on Google App Engine standard

I have two services default and taskworker that get deployed to app engine's standard environment.
I communicate from default to taskworker via cloud tasks on a few exposed HTTP handlers. E.g: background/check_emails.
I also have a cron job running every minute for background/check_emails.
My deployments of default and taskworker are straightforward e.g:
runtime: python37
service: taskworker
handlers:
- url: /background/.*
  script: auto

runtime: python37
handlers:
- url: /static
  static_dir: static/
- url: /favicon.ico
  static_files: static/img/favicon.ico
  upload: static/img/favicon.ico
- url: .*
  script: auto
Given that I want to continue getting external traffic to "default" and restrict traffic of "taskworker" from everyone except 1) cron job 2) cloud task http requests:
What are my options?
p.s: I'm not very firewall savvy, and the app engine rules for the project seem to affect the whole project, I do not know how to do a service-based firewall.
You can check for a few headers in your handler to ensure the request really comes from Cloud Tasks.
https://cloud.google.com/tasks/docs/creating-appengine-handlers#reading_request_headers
X-AppEngine-QueueName is one of them; see the doc for the others.
The document specifically says "If your request handler finds any of the headers listed above, it can trust that the request is a Cloud Tasks request."
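A minimal sketch of such a check, written as a framework-independent helper. The function name is_internal_request is hypothetical. X-AppEngine-QueueName comes from the page linked above; X-Appengine-Cron is not mentioned in this answer and is added here on the assumption that you also want to admit cron requests, which App Engine marks with that header set to "true".

```python
def is_internal_request(headers):
    """Return True if the headers mark a Cloud Tasks or cron request.

    `headers` is any mapping of header name to value; names are
    compared case-insensitively because frameworks normalize them
    differently.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    # Cloud Tasks sets the queue name on every task it dispatches.
    if lowered.get("x-appengine-queuename"):
        return True
    # Assumption: cron requests carry this header with the value "true".
    return lowered.get("x-appengine-cron") == "true"
```

In a handler for /background/.* you would call this first and return a 403 response when it is False.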
As of now, a service-based firewall is not supported for App Engine Standard; the App Engine firewall acts upon all the services that comprise your application and doesn't support per-service granularity.
But there is a Feature Request; please go to the following link to check its status. Note that there is no ETA or guarantee that this will be implemented.
As suggested on the Feature Request link shared, the current workaround consists of migrating your application to the App Engine Flexible environment and configuring the respective firewall rules on the VPC network of the Compute Engine instances where your app will reside.

How to backup datastore via cron without datastore_admin?

I have been backing up Datastore via cron, using a cron.yaml like the following:
- description: My Daily Backup
  url: /_ah/datastore_admin/backup.create?name=BackupToCloud&kind=LogTitle&kind=EventLog&filesystem=gs&gs_bucket_name=whitsend
  schedule: every 12 hours
  target: ah-builtin-python-bundle
However, according to Google's announcement, datastore-admin will be deprecated:
https://cloud.google.com/datastore/docs/console/datastore-backing-up-restoring
How can I back up Datastore via cron without datastore_admin?
https://cloud.google.com/appengine/articles/scheduled_backups
only covers using gcloud.
Note that just the backup/restore functionality based on the datastore-admin will be deprecated, not the datastore-admin itself.
The deprecation note points to the Managed export and import service as the recommended replacement alternative.
Exports based on this method can also be scheduled, see Scheduling an Export. You'll note in that article that a standard env GAE app with a cron service is exactly what the method is based on.
The article is targeted at those apps using the Datastore outside of GAE. Since you already have a GAE app you can just modify your existing backup cron job handler following the example in the article or, if you want to separate it a bit from your main app, you can add a separate service to your app, dedicated to the backup cron job.
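As a rough sketch of what such a cron handler would send, the helper below builds the URL and JSON body for a managed export request (the Datastore Admin API's projects.export method). The function name build_export_request and the project ID "yourapp" are placeholders; the bucket and kinds are taken from the cron.yaml in the question. Sending the request needs an OAuth access token with a Datastore scope, which is deliberately left out here.

```python
import json


def build_export_request(project_id, bucket, kinds=None, namespace_ids=None):
    """Return (url, json_body) for a Datastore managed export request."""
    url = "https://datastore.googleapis.com/v1/projects/{}:export".format(
        project_id
    )
    body = {"outputUrlPrefix": "gs://{}".format(bucket)}
    # The entity filter is optional; omitting it exports all kinds.
    entity_filter = {}
    if kinds:
        entity_filter["kinds"] = list(kinds)
    if namespace_ids:
        entity_filter["namespaceIds"] = list(namespace_ids)
    if entity_filter:
        body["entityFilter"] = entity_filter
    return url, json.dumps(body)


# Example with the values from the question (project ID is a placeholder).
url, payload = build_export_request(
    "yourapp", "whitsend", kinds=["LogTitle", "EventLog"]
)
```

The handler would POST payload to url with a Content-Type of application/json and an Authorization bearer token.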

Cron per Service/Module (AppEngine)

I have two modules/services and each one has a cron.xml. Only one of these ever seems to run (the most recently deployed in my experience), and the other doesn't fail, but the endpoint is never triggered (never shows up in the logs).
Is there a 1-cron per project limit? What is the best way to manage crons so that modules aren't cross-dependent?
The cron.yaml is an app-level config file, not a service/module one. Meaning when you deploy the one you have in one module it'll overwrite the cron config from the other one.
So you have to create a single cron.yaml file containing job configs for all services/modules. As @GAEfan mentioned, you'll also need to add target configs for each job. You might also need to add a dispatch.yaml file and maybe revisit or adjust the request paths so that the cron-issued requests make it to the right service/module.
Deploying the app-level cron.yaml might not happen implicitly when you deploy the service(s); you may need to deploy it explicitly. From "Uploading cron jobs":
Option 2: Upload only your cron updates
To update just the cron configuration without uploading the rest of the application, run the following command:
appcfg.py update_cron <app-directory>
Some more or less related Q&As:
Deploying different languages services to the same Application [Google App Engine]
Can a default service/module in a Google App Engine app be a sibling of a non-default one in terms of folder structure?
Use the target: backend-module-name parameter inside a cron job you want to send to a module other than default. Only one cron.yaml needed.
Make sure you update: appcfg.py update app.yaml backend_module.yaml cron.yaml

AppEngine Managed VMs Cron Job?

So I'm using an AE managed VM to host a website with the nodejs docker image. It works great: the site works, etc. However, I can't seem to get an AE cron job registered. I added a cron.yaml file right next to my app.yaml file, and I'm not excluding it in my Dockerfile.
Is there some extra step I need to take for the cron job to be registered? Or are the cron jobs not supported on managed VMs?
cron.yaml:
cron:
- description: daily summary job
  url: /cron/socialmedia/twitter
  schedule: every 2 minutes
At least on regular GAE (i.e. not managed VM) simply uploading the application with appcfg.py update doesn't always also update the cron jobs.
Updating cron jobs specifically, using appcfg.py update_cron should work in such cases.
You can deploy your cron.yaml jobs using gcloud preview app deploy cron.yaml
