MySQL has the sample Sakila database, which gives us a bunch of data to start playing around with for our application. Is there something like this for the Google App Engine (GAE/J) datastore?
I recently started experimenting with Google App Engine and was confronted with the same question. I was interested in a REST-based App Engine backbone that I could easily load/unload with data, but couldn't find anything to play around with.
So I started two projects on GitHub that support me in this kind of work.
clb-appEngineTemplate is a skeleton application for a Google App Engine Java REST backend. It provides some sample code for a standardized REST API based persistence layer at the business object level and can be easily extended (using Objectify and GSON).
clb-test is a utility class that loads test data from Excel CSV files into your Google App Engine REST backend.
Both projects are Maven based and let me easily define data objects which I can upload into App Engine. Mostly I run them against the local test server, which serves me for initial testing.
I just released a first version and will extend it incrementally over the next few weeks.
AFAIK, there is no sample DB for GAE, probably because datastore write operations are expensive. There are demos bundled with the GAE SDK. If you are using Eclipse you can import the samples into your workspace. Some of them involve the datastore, so you can run the application and add data yourself.
Another way is to use the bulkloader to upload data in one go from CSV files. But you can quickly run out of free quota for datastore writes.
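If you just need a handful of entities to play with on the local dev server, a short seeding script is often the simplest route. Here is a minimal sketch using the Python ndb API; the Customer kind and its fields are made up for illustration:

```python
from google.appengine.ext import ndb

class Customer(ndb.Model):
    # Hypothetical sample kind, only for seeding the dev datastore.
    name = ndb.StringProperty()
    city = ndb.StringProperty()

def seed_sample_data():
    # On the local dev server these writes don't touch your real quota,
    # so you can reseed as often as you like while experimenting.
    samples = [("Alice", "Zurich"), ("Bob", "Austin"), ("Carol", "Tokyo")]
    ndb.put_multi([Customer(name=n, city=c) for n, c in samples])
```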
I'm a little bit confused between GCP components. Here is my use case:
Daily, I need to fetch data from an external API (the API returns JSON data), store it in GCS, then load it into BigQuery. I've already written the Python script that fetches the data and stores it in GCS, and I'm confused about which component to use for deployment:
Cloud Run: from the docs it is used for deploying services, so I think it's a bad choice.
Cloud Functions: I think it would work, but it is meant for event-based processing (through single-purpose functions...).
Composer: (I'll use Composer to orchestrate tasks, such as preprocessing of files in GCS, loading them into BQ, and transferring them to an archive bucket) through KubernetesPodOperator, create a task that triggers the script to get the data.
Compute Engine: I don't think it's the best choice, since there are better ones.
App Engine: also, I don't think it's a good idea, since it is used to deploy and scale web apps...
(Correct me if I'm wrong in what I said.) So my question is: what is the GCP component to use for this kind of task?
Cloud Run: from the docs it is used for deploying services
App Engine: also, I don't think it's a good idea, since it is used to deploy and scale web apps...
I think you've misunderstood. Both Cloud Run and Google App Engine (GAE) are serverless offerings from Google Cloud. You deploy your code to either of them and you can invoke their URLs, which in turn causes your code to execute and do things like go fetch data from somewhere and save it somewhere.
Google App Engine has a shorter timeout than Cloud Run (I can't remember if Cloud Run has a timeout). So if your code takes a long time to run, you don't want to use Google App Engine (unless you make it a background task), and if you don't need a UI, then you don't need GAE.
For your specific scenario, you can deploy your code to Cloud Run and use Cloud Scheduler to schedule it to be invoked at specific times. We have that architecture running in a similar scenario (we have a task that runs once daily; it's deployed to Cloud Run; Cloud Scheduler invokes the endpoint; it runs and saves data to a Datastore linked to an App Engine app). We wrote a blog article on deploying to Cloud Run and another on securing your Cloud Run service (based on our experience in the scenario described earlier).
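As a rough sketch of that setup (the API URL, bucket and object names are placeholders, and error handling is omitted): a tiny Flask app deployed to Cloud Run, with a Cloud Scheduler HTTP job pointed at its /fetch URL on a daily schedule.

```python
# main.py - minimal Cloud Run service, invoked daily by Cloud Scheduler.
import json

import requests
from flask import Flask
from google.cloud import storage

app = Flask(__name__)

@app.route("/fetch", methods=["GET", "POST"])
def fetch():
    # Pull JSON from the external API (placeholder URL).
    data = requests.get("https://api.example.com/data", timeout=60).json()
    # Land it in GCS; a downstream step can load it into BigQuery.
    bucket = storage.Client().bucket("my-landing-bucket")
    bucket.blob("daily/data.json").upload_from_string(
        json.dumps(data), content_type="application/json")
    return "ok", 200
```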
GAE Timeout:
Every request to Google App Engine (Standard) must complete within 1-10 minutes for automatic scaling and up to 24 hours for basic scaling (see the documentation). For Google App Engine Flexible, the timeout is 60 minutes (documentation).
I have a machine learning project, and for this project I have to get data from a website every 15 minutes. I decided to use Google Cloud Platform to do it. I've written a Python script to do the work (get the data from the website and write it to a CSV file), and when I run this script on my computer it works well. I need to run this script for a couple of weeks, so it should run on Google Cloud's machines and keep running when I close my computer. How can I do this?
I can also use another cloud service if required, but Google Cloud would be better.
Disclaimer: I'm with Google Cloud Platform Support
Google Cloud Compute Engine is Infrastructure as a Service. It basically provides access to virtual machines (VMs), disks and networking functionality. With this product you configure your resources from scratch, defining one or multiple VM instances, configuring your work environment, etc. It may require more configuration and boilerplate than you need, but it offers the most control. You can always use some resources for free, but in my opinion it is a lot to build from scratch.
Google Cloud App Engine is Platform as a Service: basically a managed app platform, whose management can be automated to various degrees. It is built on Compute Engine, in the sense that it provides functionality, a platform, on top of the infrastructure defined by Compute Engine VMs. You can thus deploy your Python script in an App Engine Flexible Python environment. You can define your whole application as a collection of interrelated microservices, e.g. one service gets the data from a website, another writes CSV files, and another might trigger ML jobs.
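For reference, deploying to App Engine Flexible mostly comes down to an app.yaml next to your code; the sketch below assumes a gunicorn-served app object in main.py, which may not match your script as-is:

```yaml
runtime: python
env: flex
entrypoint: gunicorn -b :$PORT main:app

runtime_config:
  python_version: 3
```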
App Engine also provides the ability to schedule jobs as cron jobs, so if your application needs to run jobs periodically or at a specific time, this is the tool to use. App Engine pricing is correlated with the resources used, but you can estimate budgets with the Google Cloud Platform Pricing Calculator.
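As an illustration, App Engine cron jobs are declared in a cron.yaml file deployed alongside the app; the handler URL here is a placeholder for whatever route your script exposes:

```yaml
cron:
- description: "fetch data from the website"
  url: /tasks/fetch
  schedule: every 15 minutes
```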
You can store the CSV files in Google Cloud Storage as objects in buckets, or as data in Datastore, Cloud SQL or BigQuery. Components of Google Cloud Platform can communicate with each other via service accounts. This allows your App Engine deployment, for example, to perform CRUD operations on your Cloud SQL instance programmatically. Or... to trigger a Cloud Machine Learning job.
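For example, once the CSV files are in a bucket, a load job can move them into BigQuery. A minimal sketch with the google-cloud-bigquery client, where the bucket, dataset and table names are placeholders:

```python
from google.cloud import bigquery

client = bigquery.Client()
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,   # assume a header row
    autodetect=True,       # let BigQuery infer the schema
)
load_job = client.load_table_from_uri(
    "gs://my-bucket/data/points.csv",   # placeholder source object
    "my_dataset.gps_points",            # placeholder destination table
    job_config=job_config,
)
load_job.result()  # block until the load completes
```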
Your question is very broad and can be addressed in many ways. I would initially deploy the Python script in App Engine Flexible, deploy a cron job to fetch data every 15 minutes, upload the CSV files to Google Cloud Storage buckets, and then use the Cloud Machine Learning Python client to trigger machine learning jobs programmatically.
There are other products that might interest you:
Cloud Dataflow - configure stream/batch data processing
Cloud Dataprep - transform/clean raw data
Cloud Pub/Sub - global real-time messaging.
All these products and components can communicate with each other, and processes can easily be automated in the cloud, so the whole project can run on Google's cloud infrastructure when you close your computer. But, of course, you have to configure it beforehand in your Google Cloud Platform project(s).
I am aware that I have met your broad question with a broad answer. For any specific issues along your path of implementing the project in the cloud, the community will be here to provide support.
Good luck!
I've inherited a project which uses Google App Engine Blobstore to store files, and I need to download all the images so they can be migrated to a new system.
I was able to get all the Google Datastore data out with https://github.com/jodeleeuw/export-google-datastore and was hoping there was something similarly easy for Blobstore, or at least an example of how to read all the files in the Blobstore and download them.
The Datastore solution you mention relies on the Datastore API being available to apps not running on GAE. But AFAIK the Blobstore API is not available outside GAE, so a similar solution is likely impossible.
I see a couple of options:
Enhance the GAE app code you inherited by adding the capability to read the data in the Blobstore (which it normally should be able to do) and export it, either directly to where you want the data moved or to an intermediate place, say GCS, from where you can ship the data to its final destination more easily. Moving directly to the final location would be preferable IMHO if you need to keep the app working with the new location - you can build a nice, hitless migration story. A sketch of this option follows this list.
Use the developer console Blobstore browser/viewer, which also allows downloading and deleting blobs, either manually or using a GUI automation tool (like Selenium, for example) for a programmatic approach.
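A rough sketch of the first option, assuming the inherited app is a first-generation Python one: a handler that iterates over all BlobInfo records and copies each blob to a GCS bucket via the GAE cloudstorage library. The bucket name and route are placeholders, and for a large Blobstore you would want to copy in chunks and spread the work across task queue tasks.

```python
import cloudstorage as gcs
import webapp2
from google.appengine.ext import blobstore

BUCKET = "/my-export-bucket"  # placeholder GCS bucket

class ExportBlobsHandler(webapp2.RequestHandler):
    def get(self):
        for info in blobstore.BlobInfo.all():
            # Prefer the stored filename; fall back to the blob key.
            name = info.filename or str(info.key())
            reader = blobstore.BlobReader(info.key())
            with gcs.open("%s/%s" % (BUCKET, name), "w") as dest:
                # Fine for modest blobs; chunk the copy for big files.
                dest.write(reader.read())
        self.response.write("export done")

app = webapp2.WSGIApplication([("/admin/export-blobs", ExportBlobsHandler)])
```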
We have a couple million records in the current GAE project using the Google Cloud datastore, mostly GPS point information. We want to be able to use all these GPS points in another demo instance, hosted in another GAE instance. Is there any way we can do it?
Using Golang + Google App Engine
There is a Google Cloud Datastore API that you can use to access your Datastore data from any other deployment, including a different App Engine app. It's not available in Go, so you will have to mix in some Python or Java.
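For instance, with today's google-cloud-datastore Python client you can read the GPS points from outside the source app; a minimal sketch, where the project ID and kind name are placeholders:

```python
from google.cloud import datastore

# Placeholder project ID; credentials come from the environment.
client = datastore.Client(project="source-project-id")
query = client.query(kind="GpsPoint")  # placeholder kind
for entity in query.fetch(limit=1000):
    print(entity.key.id_or_name, dict(entity))
```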
I'm building an Android application with the Kinvey platform as the back end, with the business logic based on Google App Engine - using the recent integration between Google and Kinvey.
My question is: would it be better - faster, cheaper and more efficient - to use Kinvey's out-of-the-box datastore collections, or should I implement the application's data model layer with Google Cloud Datastore? And if I start with Kinvey now, will it be easy to migrate later to Google Cloud Datastore if needed?
Thanks
I think we'd need a bit more detail about how you're designing your application to answer this for you...
If you're using App Engine for your application logic, it stands to reason that you might want to store your data there too (this lets you do things like run offline operations on data using an App Engine module). If you do that, you'll have to write your own API handlers on App Engine to process requests from your mobile app.
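By way of illustration, a hand-rolled App Engine (Python/webapp2) JSON handler might look like the sketch below; the Record model and the route are invented for the example:

```python
import json

import webapp2
from google.appengine.ext import ndb

class Record(ndb.Model):
    # Hypothetical data model for the mobile app's data.
    name = ndb.StringProperty()
    value = ndb.FloatProperty()

class RecordsHandler(webapp2.RequestHandler):
    def get(self):
        # Serve the latest records to the mobile client as JSON.
        records = Record.query().fetch(20)
        self.response.headers["Content-Type"] = "application/json"
        self.response.write(json.dumps(
            [{"name": r.name, "value": r.value} for r in records]))

app = webapp2.WSGIApplication([("/api/records", RecordsHandler)])
```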
Hope that helps, feel free to delete if not.