Google Cloud Tasks availability in asia-southeast1 - google-app-engine

I'm not sure whether this is the right place to ask such question. Feel free to direct me to the right place.
According to https://cloud.google.com/about/locations/#asia-pacific Cloud Tasks is available in the region asia-southeast1 (but App Engine is not available in the region)
But from the Cloud Tasks documentation
To access the Cloud Tasks service using Cloud Tasks API , you must
have a project that contains an App Engine app that hosts the Cloud
Tasks queues that you create. This app is located in a specific region
which serves as the LOCATION_ID parameter for your Cloud Tasks
requests, so make a note of it. The app serves as the location for
whatever queues the developer creates. The underlying Cloud Tasks
service itself runs in that same location.
I assume that there should be App Engine configured for a particular region in order to use Cloud Tasks for that region. And since only 1 App Engine configuration can be made for each project, am I correct to assume that we can only create Cloud Tasks queue in only 1 region per project?
But asia-southeast1 seems to have Cloud Tasks available, but not App Engine. Is the information on https://cloud.google.com/about/locations/#asia-pacific incorrect or is there a way to create a Cloud Tasks queue in asia-southeast1 ?

Related

Google Cloud Scheduler: Cost of required App Engine instance?

I would like to use google cloud scheduler to invoke a Google Cloud Run function on a routine schedule.
On google cloud scheduler documentation it states:
Cloud Scheduler is currently available in all App Engine supported
regions. To use Cloud Scheduler your Cloud project must contain an App
Engine app that is located in one of the supported regions. If your
project does not have an App Engine app, you must create one.
I have never used app engine as a deployment target and don't really want to. This one cloud run function meets my needs.
Beyond the stated costs of cloud scheduler pricing will I also incur google app engine costs for a service I don't otherwise use?
You can definitely create an scheduler job that runs your Cloud function on a timed interval.
However, App Engine needs to be at least enabled since Cloud scheduler requires it (see point 4 of the before you begin section):
Cloud Scheduler uses App Engine cron jobs, so Cloud Scheduler requires App Engine enablement and configuration.
You would only need to set up the region of App Engine, not deploy an App Engine app which incurs costs.

Who starts the google cloud scheduler?

I can't find any documentation on how gcp scheduler works under the hood. An App Engine is needed in the project, so I assume that the Http calls or Pub/Sub messages are started from the App Engine.
Currently I can use a cloud scheduler even without an App Engine in the project. Apparently a compute engine that also contains a permanently running VM is also sufficient. Could someone confirm my assumptions please or does anyone have sources on this?
I can't tell you how work Cloud Scheduler under the hood. I can just tell you that works well!
I'm sure there is a VM, or a cluster of VM, on Google serverless environment, and your Cloud Scheduler job is set on it. It's serverless, the under the hood doesn't matter, it works, and it's what I want!
Now, the relation with App Engine can be confusing. In fact, there is no longer relation between the product now, but you need the App Engine API activated on your project to use Cloud Scheduler. This strange things is normal if you have been using Google Cloud for a while. At the beginning, only App Engine existed, and Datastore, Cloud Task, Cloud Scheduler was all "modules" of App Engine. Years, after years, google has refactored and extracted these modules to create independent products, as you can see them today. However, some relations are still present, like the API activation.

What's the difference between Google Cloud Scheduler and GAE cron job?

After reading the docs
cloud scheduler - https://cloud.google.com/scheduler/
GAE cron job -
https://cloud.google.com/appengine/docs/flexible/nodejs/scheduling-jobs-with-cron-yaml
cloud function pub/sub trigger -
https://cloud.google.com/functions/docs/calling/pubsub
I think they are mostly the same.
I can use GAE cron job + pub/sub + cloud function to implement the same functions which cloud scheduler has.
In my understanding, it seems there are some differences between them:
Cloud Scheduler can be more convenient to adjust frequency. To update the frequency of GAE cron job, you must update the config, like schedule: every 1 hours of cron.yaml and redeploy.
There is no need to implement the cron job architecture(integrate GAE, GAE cron service, pub/sub, cloud function, etc..) by yourself which means you don't need to write code for combining them together anymore.
Am I correct? Or, any other differences?
You're right in that the Google Cloud Scheduler is kind of an evolution of the GAE cron job mechanism to make it more user-friendly and flexible. You can see that they are still related since the Cloud Scheduler doc specifies:
To use Cloud Scheduler your project must contain an App Engine app
that is located in one of the supported regions. If your project does
not have an App Engine app, you must create one.
Historically, GAE cron job was the only cron service offered by the platform. You could only target a GAE handler to receive the request from cron. From there you could indeed perform actions like like publish on pub/sub, call an HTTP Cloud Function or launch a dataflow job, but you always had to deploy a GAE service to handle it, which wasn't optimal.
The new Cloud Scheduler makes it more straightforward to use with Pub/Sub, Cloud Functions but also any publicly available HTTP endpoint (may be on-premise). And of course App Engine handlers. More targets may be added in the future for more use-cases.
Finally, as you mentioned, the API exposed to manage it decouples it from App Engine and its cron.yaml file and makes it easier to create and update cron jobs dynamically.

In GCP, is there a canonical way to scrape data from an API?

I'm building an application that will periodically pull data from several APIs and write them to cloud storage for later processing by Dataflow. There are many different ways to do this so I wanted to sanity check before I jumped in.
My plan is this:
For each API, Cloud Scheduler will hit an endpoint for an App Engine app
The app will create a Compute Engine VM instance with a startup script that pulls the data from the API and writes it to storage
When done, the VM will hit another endpoint on the App Engine app that shuts down the VM.
Is this a reasonable way to perform this sort of action? Is there a simpler or more straight-forward method? Thank you in advance for the replies.
Cloud Scheduler can schedule Compute Engine without App Engine however it seems that you cannot create and delete the VM with this method.
You can just use App Engine cron jobs to schedule the tasks. Your App Engine app cron handler can simply run the script that pulls data from the APIs. Maybe I am missing something, why do you need to use a Compute Engine instance to run the script?

How to make computations with Google Cloud?

I have a machine learning project and for this project, I have to get data from a website in evey 15 minutes. And I decided to use google cloud platform to do it. I've coded a python script to do the process(get the data from website and write down to a csv file) and when I run this script on my computer, it works well. I need to run this script for a couple weeks. So it should be running in google cloud's computers and it should continue running when I close my computer. How can I do this?
I can also use another cloud service if it's required to but google cloud would be better.
Disclaimer: I'm with Google Cloud Platform Support
Google Cloud Compute Engine is defined as an Infrastructure as a Service. It basically provides access to Virtual Machines (VMs), Disks and Networking functionalities. By using this product, you are able to configure your resources from scratch, defining one or multiple VM instances, configuring your work environment, etc. It might require more configuration and boiler plating than needed, but it offers the most control. You can always use some resources for free but in my opinion it is a lot of scratch to start from.
Google Cloud App Engine is defined as a Platform as a Service. It is basically a managed app platform. The management can be automatised to certain degrees. It is based on Compute Engine, in the sense that it provides functionalities, a platform, on top of the infrastructure defined by Compute Engine VMs. You can thus deploy your python script in an App Engine Flexible Python Environment. You can define your whole application as a collection of interrelated microservices, i.e. one service gets the data from a website, maybe another writes csv files and another might trigger ML jobs.
App Engine also provides the possibility to schedule jobs as cron jobs. So if your application needs to run periodical jobs or at a specific time, this is the tool to use. App Engine pricing is correlated with the used resources, but you can estimate eventual budgets by using the Google Cloud Platform Pricing Calculator.
You can store the csv files in Google Cloud Storage as objects in buckets or as data in Datastore, Cloud SQL or BigQuery. Components of Google Cloud Platform can communicate with each other via service acounts. This allows your App Engine deployment, for example, to perform CRUD operations in your Cloud SQL instance, programatically. Or... to trigger a Cloud Machine Learning job.
Your question is very broad and can be addressed in multiple, various ways. I would initially deploy the python script in App Engine Flexible. I would deploy a cron job if needed, to fetch data every 15 minutes. I would upload the csv files in Google Cloud Storage Buckets. I would then use the Cloud Machine Learning python client to trigger Machine Learning jobs programatically.
There are other products that might interest you:
Cloud Dataflow - configure stream/batch data processing
Cloud Dataprerp - transform/clean raw data
Cloud Pub/Sub - global real-time messaging.
All the products/components and sub-products/sub-components can communicate with each other and processes can easily be automated in the Cloud. So the whole project can run in Google's Cloud infrastructure when you close your computer. But, of course, you have to configure it beforehand, in your Google Cloud Platform Project(s).
I am aware that I met your broad question with a broad answer. For any specific issues along your path of implementing the project in the Cloud, the community will be here to provide support.
Good luck!

Resources