Query an existing Google Cloud Datastore from Google App Engine

I have an entity kind with ~50k rows in Google Cloud Datastore (the standalone service, not GAE). I am starting development with GAE and would like to query this existing datastore without having to import it into GAE. I have been unable to find a way to connect to an existing Datastore kind.
Here is the basic code, altered from Hello World and other guides, that I'm trying to get working as a proof of concept:
import webapp2
import json
import time
from google.appengine.ext import ndb


class Product(ndb.Model):
    type = ndb.StringProperty()

    @classmethod
    def query_product(cls):
        return ndb.gql("SELECT * FROM Product where name >= :a LIMIT 5 ")


class MainPage(webapp2.RequestHandler):
    def get(self):
        self.response.headers['Content-Type'] = 'text/plain'
        query = Product.query_product()
        self.response.write(query)


app = webapp2.WSGIApplication([
    ('/', MainPage),
], debug=True)
The returned error is:
TypeError: Model Product has no property named 'name'
It seems obvious that it's trying to use a GAE datastore with the kind Product instead of my existing Datastore where Product is already defined, but I can't find how to make that connection.

There is only one Google Cloud Datastore. App Engine does not have a datastore of its own - it works with the same Google Cloud Datastore.
All entities in the Datastore are stored for a particular project. If you are trying to access data from a different project, you will not be able to see it without going through special authentication.

I'm not sure what you mean when you say you would like to query this existing datastore without having to import it to GAE. I'm guessing that you have project A, whose datastore holds the 50k rows, and that you're starting project B and want to access project A's datastore from it. If that is the case, this previous answer that mentions the Remote API may help you.

Below is working code. I was pretty close when I made the original post; the reason I was getting no data back was that I was running my app locally. As soon as I deployed my code to App Engine, it pulled from Datastore with no problem.
import webapp2
import json
import time
from google.appengine.datastore.datastore_query import Cursor
from google.appengine.ext import ndb


class Product(ndb.Model):
    name = ndb.StringProperty()


class MainPage(webapp2.RequestHandler):
    def get(self):
        self.response.headers['Content-Type'] = 'text/plain'
        query = ndb.gql("SELECT * FROM Product where name >= 'a' LIMIT 5 ")
        output = query.fetch()
        #query = Product.query(Product.name == 'zubo - pre-owned - nintendo ds')
        #query = Product.query()
        #output = query.fetch(10)
        self.response.write(output)


app = webapp2.WSGIApplication([
    ('/', MainPage),
], debug=True)

Related

Datastore entity access from ODK

I'm trying to access data that ODK has pushed into the datastore. The code below works fine when I query an entity kind that I created via Python, called "ProductSalesData". The kind name ODK has given its data is "opendatakit.test1". When I update the data model to class opendatakit.test1(db.Model), it obviously bombs with a syntax error. How do I query that data?
#!/usr/bin/env python
import webapp2
from google.appengine.ext import db


class ProductSalesData(db.Model):
    product_id = db.IntegerProperty()
    date = db.DateTimeProperty()
    store = db.StringProperty()


q = ProductSalesData.all()


class simplequery(webapp2.RequestHandler):
    def get(self):
        for ProductSalesData in q:
            self.response.out.write('Result:%s<br />' % ProductSalesData.store)


app = webapp2.WSGIApplication(
    [('/', simplequery)],
    debug=True)
I know you tagged GAE, but do you have to access it straight through the datastore?
If not, I've had better success using the API that has already been built into aggregate: https://code.google.com/p/opendatakit/wiki/BriefcaseAggregateAPI
If you need GAE access I'd suggest the ODK developers group over on google groups - they're pretty active.

AppEngine datastore - backup programmatically

I would like to backup my app's datastore programmatically, on a regular basis.
It seems possible to create a cron that backs up the datastore, according to https://developers.google.com/appengine/articles/scheduled_backups
However, I require a more fine-grained solution: Create different backup files for dynamically changing namespaces.
Is it possible to simply call the /_ah/datastore_admin/backup.create url with GET/POST?
Yes; I'm doing exactly that in order to implement some logic that couldn't be done with cron.
Use the taskqueue API to add the URL request, like this:
from google.appengine.api import taskqueue

taskqueue.add(url='/_ah/datastore_admin/backup.create',
              method='GET',
              target='ah-builtin-python-bundle',
              params={'kind': ('MyKind1', 'MyKind2')})
If you want to use more parameters that would otherwise go into the cron url, like 'filesystem', put those in the params dict alongside 'kind'.
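For the per-namespace requirement in the question, the same request can be parameterized. The sketch below only builds the parameters and the equivalent query string; it does not enqueue anything. It assumes the backup handler accepts a `namespace` parameter alongside `kind`, `filesystem`, and `gs_bucket_name`, which should be checked against the scheduled backups article. The helper names `backup_params` and `backup_url` are hypothetical.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2.7 runtime

BACKUP_PATH = '/_ah/datastore_admin/backup.create'


def backup_params(kinds, namespace=None, bucket=None):
    """Build the params dict for one backup request (hypothetical helper)."""
    params = {'kind': tuple(kinds)}
    if namespace is not None:
        # Assumption: the backup handler accepts a 'namespace' parameter.
        params['namespace'] = namespace
    if bucket is not None:
        params['filesystem'] = 'gs'
        params['gs_bucket_name'] = bucket
    return params


def backup_url(params):
    """Render the same request as a URL, e.g. for logging."""
    # doseq=True expands the 'kind' tuple into repeated kind=... pairs.
    return BACKUP_PATH + '?' + urlencode(params, doseq=True)
```

Each dict returned by `backup_params` could be passed as `params=` to `taskqueue.add`, once per namespace.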
Programmatically backup datastore based on environment
This comes in addition to Jamie's answer. I needed to back up the datastore to Cloud Storage, based on the environment (staging/production). Unfortunately, this can no longer be achieved via a cron job alone, so I needed to do it programmatically and create a cron job that calls my script. I can confirm that the code below works, as I saw some people complaining that they get a 404. However, it only works in a live environment, not on the local development server.
from datetime import datetime

from flask.views import MethodView
from google.appengine.api import taskqueue
from google.appengine.api.app_identity import app_identity


class BackupDatastoreView(MethodView):
    BUCKETS = {
        'app-id-staging': 'datastore-backup-staging',
        'app-id-production': 'datastore-backup-production'
    }

    def get(self):
        environment = app_identity.get_application_id()
        task = taskqueue.add(
            url='/_ah/datastore_admin/backup.create',
            method='GET',
            target='ah-builtin-python-bundle',
            queue_name='backup',
            params={
                'filesystem': 'gs',
                'gs_bucket_name': self.get_bucket_name(environment),
                'kind': (
                    'Kind1',
                    'Kind2',
                    'Kind3'
                )
            }
        )
        if task:
            return 'Started backing up %s' % environment

    def get_bucket_name(self, environment):
        return "{bucket}/{date}".format(
            bucket=self.BUCKETS.get(environment, 'datastore-backup'),
            date=datetime.now().strftime("%d-%m-%Y %H:%M")
        )
You can now use the managed export and import feature, which can be accessed through gcloud or the Datastore Admin API:
Exporting and Importing Entities
Scheduling an Export
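As a rough illustration of the gcloud route, the sketch below assembles a `gcloud datastore export` command restricted to certain kinds and namespaces. The function name `export_command` and the bucket and namespace values are made up for the example; the `--kinds` and `--namespaces` flags should be confirmed against the current gcloud reference before relying on them.

```python
def export_command(bucket, kinds=(), namespaces=()):
    """Return the argv list for a managed export (hypothetical helper)."""
    cmd = ['gcloud', 'datastore', 'export', 'gs://' + bucket]
    if kinds:
        # Comma-separated list of kinds to include in the export.
        cmd.append('--kinds=' + ','.join(kinds))
    if namespaces:
        # Comma-separated list of namespaces to include in the export.
        cmd.append('--namespaces=' + ','.join(namespaces))
    return cmd
```

The returned list could be handed to `subprocess.check_call` on a machine where gcloud is installed and authenticated.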

Using GWT/GAE Blobstore as a database

Can the Blobstore in GWT/GAE be used as a database? Or is a new Blobstore created each time I launch the application? I would like to store information without losing it when the application is closed. But I can't seem to find a way to name a Blobstore and then reference it by its ID. Thanks!
If all you want to do is store a string I'd still suggest using the datastore.
Here's the complete python source to an App Engine app that retrieves, modifies, and stores some text in the datastore:
from google.appengine.ext import webapp, db
from google.appengine.ext.webapp import util


class TextDoc(db.Model):
    text = db.TextProperty(default="")


class MainHandler(webapp.RequestHandler):
    def get(self):
        my_text_doc = TextDoc.get_or_insert('my_text_doc')
        my_text_doc.text += "Blah, blah, blah. "
        my_text_doc.put()
        self.response.out.write(my_text_doc.text)


def main():
    application = webapp.WSGIApplication([('/', MainHandler)],
                                         debug=True)
    util.run_wsgi_app(application)


if __name__ == '__main__':
    main()
If you're working in Java it would be more verbose, but similar.

How to manipulate files in google app engine datastore

My problem revolves around a user uploading a text file to my app. I need to get this file and process it with my app before saving it to the datastore. From the little I have read, I understand that user uploads go directly to the datastore as blobs, which is OK if I could then get that file, perform operations on it (meaning change the data inside), and then rewrite it back to the datastore. All these operations need to be done by the app.
Unfortunately, according to the datastore documentation (http://code.google.com/appengine/docs/python/blobstore/overview.html), an app cannot directly create a blob in the datastore. That's my main headache. I simply need a way of creating a new blob/file in the datastore from my app without any user upload interaction.
blobstore != datastore.
You can read and write data to the datastore as much as you like so long as your data is <1MB using a db.BlobProperty on your entity.
As Wooble comments, the new File API lets you write to the blobstore, but unless you are writing to the blobstore file incrementally, using tasks or something like the mapreduce library, you are still limited by the 1MB API call limit for reading/writing.
Thanks for your help. After many sleepless nights, 3 App Engine Books and A LOT of Googling, I've found the answer. Here is the code (it should be pretty self explanatory):
from __future__ import with_statement
from google.appengine.api import files
from google.appengine.ext import blobstore
from google.appengine.ext import webapp
from google.appengine.ext.webapp import util


class MainHandler(webapp.RequestHandler):
    def get(self):
        self.response.out.write('Hello WOrld')
        form=''' <form action="/" method="POST" enctype="multipart/form-data">
Upload File:<input type="file" name="file"><br/>
<input type="submit"></form>'''
        self.response.out.write(form)
        blob_key="w0MC_7MnZ6DyZFvGjgdgrg=="
        blob_info=blobstore.BlobInfo.get(blob_key)
        start=0
        end=blobstore.MAX_BLOB_FETCH_SIZE-1
        read_content=blobstore.fetch_data(blob_key, start, end)
        self.response.out.write(read_content)

    def post(self):
        self.response.out.write('Posting...')
        content=self.request.get('file')
        #self.response.out.write(content)
        #print content
        file_name=files.blobstore.create(mime_type='application/octet-stream')
        with files.open(file_name, 'a') as f:
            f.write(content)
        files.finalize(file_name)
        blob_key=files.blobstore.get_blob_key(file_name)
        print "Blob Key="
        print blob_key


def main():
    application=webapp.WSGIApplication([('/', MainHandler)],debug=True)
    util.run_wsgi_app(application)


if __name__=='__main__':
    main()
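Note that the handler above reads only the first chunk of the blob. One way to cover a larger blob is to compute successive byte ranges and fetch them one at a time. The helper below is a plain-Python sketch of that range arithmetic; `fetch_ranges` is a hypothetical name, and the end index is inclusive to match the `MAX_BLOB_FETCH_SIZE-1` convention used in the code above.

```python
def fetch_ranges(blob_size, chunk_size):
    """Yield inclusive (start, end) byte ranges covering blob_size bytes."""
    if chunk_size <= 0:
        raise ValueError('chunk_size must be positive')
    start = 0
    while start < blob_size:
        # The last range may be shorter than chunk_size.
        end = min(start + chunk_size, blob_size) - 1
        yield start, end
        start = end + 1
```

In the handler, each pair could be passed to `blobstore.fetch_data(blob_key, start, end)`, with `blob_size` taken from `blob_info.size` and `chunk_size` set to `blobstore.MAX_BLOB_FETCH_SIZE`.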

What's a namespace used for in the App Engine datastore?

In the development admin console, when I look at my data, it says "Select different namespace".
What are namespaces for and how should I use them?
Namespaces allow you to implement segregation of data for multi-tenant applications. The official documentation links to some sample projects to give you an idea how it might be used.
Namespaces are used in Google App Engine to create multitenant applications. In a multitenant application, a single instance of the application runs on a server and serves multiple client organizations (tenants). The application can be designed to virtually partition its data and configuration (business logic), so that each client organization works with a customized virtual application instance. You can easily partition data across tenants simply by specifying a unique namespace string for each tenant.
Other Uses of namespace:
Compartmentalizing user information
Separating admin data from application data
Creating separate datastore instances for testing and production
Running multiple apps on a single app engine instance
For More information visit the below links:
http://www.javacodegeeks.com/2011/12/multitenancy-in-google-appengine-gae.html
https://developers.google.com/appengine/docs/java/multitenancy/
http://java.dzone.com/articles/multitenancy-google-appengine
http://www.sitepoint.com/multitenancy-and-google-app-engine-gae-java/
This question has not been answered in much detail, so here is an attempt.
With namespaces, keys and values are kept separate within each namespace. The following example shows how to read and temporarily change the current namespace:
from google.appengine.api import namespace_manager
from google.appengine.ext import db
from google.appengine.ext import webapp


class Counter(db.Model):
    """Model for containing a count."""
    count = db.IntegerProperty()


def update_counter(name):
    """Increment the named counter by 1."""
    def _update_counter(name):
        counter = Counter.get_by_key_name(name)
        if counter is None:
            counter = Counter(key_name=name)
            counter.count = 1
        else:
            counter.count = counter.count + 1
        counter.put()
    # Update counter in a transaction.
    db.run_in_transaction(_update_counter, name)


class SomeRequest(webapp.RequestHandler):
    """Perform synchronous requests to update counter."""
    def get(self):
        update_counter('SomeRequest')
        # try/finally pattern to temporarily set the namespace.
        # Save the current namespace.
        namespace = namespace_manager.get_namespace()
        try:
            namespace_manager.set_namespace('-global-')
            update_counter('SomeRequest')
        finally:
            # Restore the saved namespace.
            namespace_manager.set_namespace(namespace)
        self.response.out.write('<html><body><p>Updated counters')
        self.response.out.write('</p></body></html>')
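The save/set/restore pattern in the handler above can also be wrapped in a context manager so it is not repeated at every call site. The following is a minimal sketch, not part of the App Engine API: `temporary_namespace` is a hypothetical helper, and `FakeNamespaceManager` is a stand-in that exists only so the example runs outside App Engine. Inside an app, the real `namespace_manager` module would be passed instead, since it exposes the same `get_namespace` and `set_namespace` functions.

```python
from contextlib import contextmanager


@contextmanager
def temporary_namespace(manager, namespace):
    """Switch to `namespace` for the duration of a with-block."""
    previous = manager.get_namespace()
    manager.set_namespace(namespace)
    try:
        yield
    finally:
        # Restore the saved namespace even if the body raises.
        manager.set_namespace(previous)


class FakeNamespaceManager(object):
    """Stand-in for namespace_manager, for demonstration only."""
    def __init__(self):
        self._namespace = ''

    def get_namespace(self):
        return self._namespace

    def set_namespace(self, namespace):
        self._namespace = namespace


manager = FakeNamespaceManager()
with temporary_namespace(manager, '-global-'):
    inside = manager.get_namespace()  # namespace seen inside the block
after = manager.get_namespace()       # namespace restored afterwards
```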
