Google app engine cron job scheduling setup and pricing - google-app-engine

I want to trigger a https endpoint every 1 minute I was using cron-job.org but it is not that reliable and goes down often. I have looked at 2 options Microsoft azure scheduler and Google app engine cron scheduler. Microsoft scheduler pricing is very clear, however, I dont understand how to setup google cron job and pricing to run the cron job every minute.

To use Google's cron scheduler, you will have to pay for the app engine running 24x7. Whereas Azure Scheduler is a true microservice and you only pay based on number of jobs/job collections, not the underlying resources consumed.

Unlike Microsoft's scheduler which appears to be an independently configurable and billable service, the GAE cron service can only be a part of a GAE app.
A standard environment GAE app is charged by instance-hours plus the various services it uses. See App Engine Pricing. But it also comes with fairly generous free daily quotas.
A simple app which would only make a few requests per minute - like the one you describe - should have no problems staying within the free daily limits.
Check the Quickstart to see how to get a basic app skeleton running. You already have the cron service doc, you only need the cron.yaml Reference to add a cron service to your app.

Related

Who starts the google cloud scheduler?

I can't find any documentation on how gcp scheduler works under the hood. An App Engine is needed in the project, so I assume that the Http calls or Pub/Sub messages are started from the App Engine.
Currently I can use a cloud scheduler even without an App Engine in the project. Apparently a compute engine that also contains a permanently running VM is also sufficient. Could someone confirm my assumptions please or does anyone have sources on this?
I can't tell you how work Cloud Scheduler under the hood. I can just tell you that works well!
I'm sure there is a VM, or a cluster of VM, on Google serverless environment, and your Cloud Scheduler job is set on it. It's serverless, the under the hood doesn't matter, it works, and it's what I want!
Now, the relation with App Engine can be confusing. In fact, there is no longer relation between the product now, but you need the App Engine API activated on your project to use Cloud Scheduler. This strange things is normal if you have been using Google Cloud for a while. At the beginning, only App Engine existed, and Datastore, Cloud Task, Cloud Scheduler was all "modules" of App Engine. Years, after years, google has refactored and extracted these modules to create independent products, as you can see them today. However, some relations are still present, like the API activation.

What's the difference between Google Cloud Scheduler and GAE cron job?

After reading the docs
cloud scheduler - https://cloud.google.com/scheduler/
GAE cron job -
https://cloud.google.com/appengine/docs/flexible/nodejs/scheduling-jobs-with-cron-yaml
cloud function pub/sub trigger -
https://cloud.google.com/functions/docs/calling/pubsub
I think they are mostly the same.
I can use GAE cron job + pub/sub + cloud function to implement the same functions which cloud scheduler has.
In my understanding, it seems there are some differences between them:
Cloud Scheduler can be more convenient to adjust frequency. To update the frequency of GAE cron job, you must update the config, like schedule: every 1 hours of cron.yaml and redeploy.
There is no need to implement the cron job architecture(integrate GAE, GAE cron service, pub/sub, cloud function, etc..) by yourself which means you don't need to write code for combining them together anymore.
Am I correct? Or, any other differences?
You're right in that the Google Cloud Scheduler is kind of an evolution of the GAE cron job mechanism to make it more user-friendly and flexible. You can see that they are still related since the Cloud Scheduler doc specifies:
To use Cloud Scheduler your project must contain an App Engine app
that is located in one of the supported regions. If your project does
not have an App Engine app, you must create one.
Historically, GAE cron job was the only cron service offered by the platform. You could only target a GAE handler to receive the request from cron. From there you could indeed perform actions like like publish on pub/sub, call an HTTP Cloud Function or launch a dataflow job, but you always had to deploy a GAE service to handle it, which wasn't optimal.
The new Cloud Scheduler makes it more straightforward to use with Pub/Sub, Cloud Functions but also any publicly available HTTP endpoint (may be on-premise). And of course App Engine handlers. More targets may be added in the future for more use-cases.
Finally, as you mentioned, the API exposed to manage it decouples it from App Engine and its cron.yaml file and makes it easier to create and update cron jobs dynamically.

In GCP, is there a canonical way to scrape data from an API?

I'm building an application that will periodically pull data from several APIs and write them to cloud storage for later processing by Dataflow. There are many different ways to do this so I wanted to sanity check before I jumped in.
My plan is this:
For each API, Cloud Scheduler will hit an endpoint for an App Engine app
The app will create a Compute Engine VM instance with a startup script that pulls the data from the API and writes it to storage
When done, the VM will hit another endpoint on the App Engine app that shuts down the VM.
Is this a reasonable way to perform this sort of action? Is there a simpler or more straight-forward method? Thank you in advance for the replies.
Cloud Scheduler can schedule Compute Engine without App Engine however it seems that you cannot create and delete the VM with this method.
You can just use App Engine cron jobs to schedule the tasks. Your App Engine app cron handler can simply run the script that pulls data from the APIs. Maybe I am missing something, why do you need to use a Compute Engine instance to run the script?

Where to run continuous jobs on Google Cloud Platform?

I have a job that involves continuously listening to one or more websocket/mqtt feeds and forwarding this data to an event queue. This job is written in javascript and would run 24/7 in a continuous loop.
The most obvious solution is to run this job on a VM with Compute Engine, but I was wondering is there is a more elegant solution. Azure, for example, has WebJobs that's well-suited to this kind of task. It even restarts the script if there is an error.
Is there some other component on GCP that can run this job in a "managed" way?
Google Cloud does not have a product similar to Azure WebJobs at the moment. Both the standard and flexible environments of Google Cloud App Engine do not currently support websockets. In order to use websockets you can use Compute Engine or Kubernetes Engine.

GAE Microservices with dedicated Cron Jobs per microservice

Can different microservices in GAE, own dedicated cron jobs?
Background
We have written multiple services on GAE microservices application.
One micronservice say Service1(default) [JAVA in GAE Standard environment] has 10 cron jobs, wheareas another microservice say, Service2 [Python in GAE Flexible environment] has 5 other cronjobs.
When we deploy both the services, cron jobs get replaced with the latest service cron jobs.
I know that Task Queue is shared resource in GAE Microservices and hence Cron jobs too may be shared. But is it impossible to let microservice have their dedicated cronjobs based on their service scope and get them uploaded on Server where all cronjobs can co-exist?
Timely response is highly appreciated.
The cron configuration is also an application level configuration, not a module/service level one, which is why when you deploy it for one service it overwrites the previous one from another service.
You need to combine all cron jobs for all your services into a single cron configuration file and deploy that one instead, preferably using the specific cron deployment command, not by uploading it together with a particular service (sometimes that fails for multi service apps).
There are other such app level configurations as well, see https://stackoverflow.com/a/42361987/4495081

Resources