Google's documentation states the following on their help page for Backup/Restore, Copy and Delete Data:
Note: Blob data is not backed up by this backup feature!
https://developers.google.com/appengine/docs/adminconsole/datastoreadmin#Backup_And_Restore
I did a simple backup/restore with an entity type in my application that contains a Blob field. After I backed up the entity, I removed the data stored in the Blob field. When I restored the entity, it had that data once again.
Is it safe to infer that the warning in the documentation refers to data in the Blobstore and not Blob fields of entities stored in the normal data store?
I would say that it is safe to assume the two are not related. As per the Google Blobstore Java API Overview:
Note: Blobs as defined by the Blobstore service are not related to blob property values used by the datastore.
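A minimal sketch of the distinction in the Python runtime (the quoted note comes from the Java docs, but the split is the same); the Record kind and its field names are just illustrative:

from google.appengine.ext import blobstore, db

class Record(db.Model):
    # Bytes stored inline in the datastore entity: covered by the
    # datastore backup, which matches the behaviour you observed.
    inline_blob = db.BlobProperty()
    # Only a reference to a Blobstore blob; the underlying Blobstore
    # data is what the documentation's warning says is not backed up.
    attachment = blobstore.BlobReferenceProperty()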
I use the following GCP products together for a CRM system:
Cloud SQL
App Engine
BigQuery
Once a week an external application exports data from BigQuery in this way:
The external application makes a request to App Engine with a token.
App Engine retrieves the permissions for this token from Cloud SQL and performs some additional computation to obtain a list of allowed IDs.
App Engine runs a BigQuery query filtered by these IDs, something like: SELECT * FROM table WHERE id IN (ids)
App Engine responds to the external application with the unmodified query result as JSON.
The problem is that the export doesn't happen very often, but the amount of data can be large, and I don't want to load App Engine with this data. What other GCP products would be useful in this case? Remember that I need to retrieve the permissions via App Engine and Cloud SQL.
It's unclear whether the JSON comes directly from the BigQuery query results, or whether you do additional processing in the application to render/format it. I'm assuming direct results.
An option that comes to mind is to leverage Cloud Storage. You can use the signed URL feature to provide a time-limited link to your (potentially large) results without exposing public access.
This, coupled with BigQuery's ability to export results to GCS (either via an export job, or using the newer EXPORT DATA SQL statement), allows you to run a query and deliver the results directly to GCS.
With this, you could simply redirect the user to the signed URL at the end of your current flow. There are additional features that are complementary here, such as using GCS object lifecycle rules to age out and remove files automatically, so you don't need to concern yourself with a slow accumulation of results.
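A minimal sketch of that flow, assuming the google-cloud-bigquery and google-cloud-storage client libraries, credentials that can sign URLs, and placeholder bucket/dataset/table names:

from datetime import timedelta
from google.cloud import bigquery, storage

def export_and_sign(allowed_ids):
    # IDs are coerced to int before inlining, so the generated SQL stays safe.
    id_list = ', '.join(str(int(i)) for i in allowed_ids)
    bq = bigquery.Client()
    # EXPORT DATA writes the query result straight to GCS as JSON files,
    # replacing the * in the URI with a sequence number (000000000000, ...).
    bq.query("""
        EXPORT DATA OPTIONS(
          uri='gs://my-export-bucket/results/*.json',
          format='JSON',
          overwrite=true
        ) AS
        SELECT * FROM `my_dataset.my_table`
        WHERE id IN ({ids})
    """.format(ids=id_list)).result()
    # Hand back a time-limited link instead of streaming the payload through
    # App Engine; this assumes the result fits in the first exported file.
    blob = storage.Client().bucket('my-export-bucket').blob('results/000000000000.json')
    return blob.generate_signed_url(version='v4', expiration=timedelta(hours=1))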
I'm wondering if deletion of these entities in blobuploadsession would affect my app's functionality or performance in any way. The reason for deletion is that when a new form is created and no files were uploaded to it, unnecessary entities end up being created.
(edit: additional info from comment)
I use the blobstore to store images asynchronously via the upload URL functionality. When I run the app on localhost, a kind called "BlobUploadSession" is created automatically in the datastore. This is where all the URLs for the images to be uploaded are stored as entities. When I upload a photo to the URL, it goes into the "BlobInfo" kind. Now, I don't need the URLs anymore since the photo has already been uploaded. So I'm wondering: can I delete the BlobUploadSession entities? Btw, BlobUploadSession and BlobInfo are default kinds that were created automatically.
The __BlobUploadSession__ and __BlobInfo__ entities are created, and only used internally, by the development server while it emulates the blobstore functionality.
There are other, similarly named __SomeEntityName__ entities for emulating other pieces of functionality; for example, a pile of them are created when you request datastore stats (that functionality doesn't exist per se in production).
These entities aren't created on GAE, so no need to worry about them in production.
See also related How to remove built-in kinds' names in google datastore using kind queries
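If you want to see this for yourself, a kind metadata query makes the difference visible; a minimal sketch with ndb (compare the output on the dev server against production):

from google.appengine.ext.ndb import metadata

# On the dev server this list typically includes __BlobUploadSession__,
# __BlobInfo__ and similar; in production those dev-only kinds are absent.
internal_kinds = [k for k in metadata.get_kinds() if k.startswith('__')]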
We're using Cloudant as the remote database for our app. The database contains documents for each user of the app. When the app launches, we need to query the database for all the documents belonging to a user. What we found is that the CDTDatastore API only allows pulling the entire database and storing it inside the app, then performing the query on the local copy. The initial pull to the local datastore takes about 10 seconds, and I imagine it will take longer as more users are added.
Is there a way I can save only part of the remote database to the local datastore? Or, are we using the wrong service for our app?
You can use a server-side replication filter function; you'll need to add information about your filter to the pull replicator. However, replication will take a performance hit when using a filter function.
That being said, a common pattern is to use one database per user; however, this has other trade-offs and is something you should read up on. There is some information on the one-database-per-user pattern here.
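A minimal sketch of the filter approach, assuming a hypothetical database URL, credentials, and a user_id field on each document; the filter itself is CouchDB JavaScript stored in a design document, created here via the HTTP API:

import requests

DB_URL = 'https://ACCOUNT.cloudant.com/appdb'  # hypothetical account/database

# The filter passes only documents that belong to the requesting user.
design_doc = {
    '_id': '_design/app',
    'filters': {
        'by_user': (
            'function(doc, req) {'
            '  return doc.user_id === req.query.user_id;'
            '}'
        ),
    },
}

resp = requests.put(DB_URL + '/_design/app', json=design_doc,
                    auth=('APIKEY', 'PASSWORD'))  # hypothetical credentials
resp.raise_for_status()

# A pull replication then references the filter as 'app/by_user' with
# query params {'user_id': ...}; in CDTDatastore this corresponds to
# setting the filter name and parameters on the pull replication object.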
In my app, I'd like to have an entity defined like this:
from google.appengine.ext import db

class MyModel(db.Model):
    title = db.StringProperty()
    myBlob = db.BlobProperty()
Say this blob holds around 1MB. Will this slow down any queries I make on the MyModel kind? Is it fetching the entire 1MB per entity, or just references until I actually try to access the blob?
The minute you retrieve the entity, the blob is loaded from the datastore unless you do a projection query.
You have a few options to avoid loading the BlobProperty until you need it.
Do a projection query, and then only fetch the full entity when you need it.
Stick the BlobProperty in a child entity (make the top-level one the ancestor) and only fetch that entity with a get when you need it.
Don't use a BlobProperty, but stick the data in GCS (Google Cloud Storage) and serve it from there.
The last option has the benefit that, if you do no processing on the blob, your App Engine instance doesn't need to get involved in serving it (depending on what your requirements are, of course).
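A minimal sketch of the first two options, using the db API from the question; the child kind, key name, and helper functions are illustrative choices, and projection support in db is assumed (it exists in later SDKs; ndb has the same feature):

from google.appengine.ext import db

class MyModel(db.Model):
    title = db.StringProperty()

class MyModelBlob(db.Model):
    # The 1MB payload lives in its own entity so queries on MyModel stay light.
    myBlob = db.BlobProperty()

def create(title, blob_bytes):
    parent = MyModel(title=title)
    parent.put()
    # A fixed key_name lets us fetch the blob later with a get, no query needed.
    MyModelBlob(parent=parent, key_name='blob', myBlob=blob_bytes).put()
    return parent

def get_blob(parent_key):
    # Option 2: fetch the heavy payload only when it is actually needed.
    return db.get(db.Key.from_path('MyModelBlob', 'blob', parent=parent_key))

# Option 1: a projection query returns only 'title'; the blob is never loaded.
titles = db.Query(MyModel, projection=('title',)).fetch(20)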
I have a situation where I am uploading an image to SharePoint and it is being saved as a BLOB. I need to create an XML file with the data of the blob and other data that helps users identify it. The following is a hint of what I want:
<image>
  <name>mydog</name>
  <extension>.jpg</extension>
  <blobid>0234234</blobid>
  <blobpath>435343445</blobpath>
</image>
I was looking at the tables in wss_content and came across alldocumentstreams, where there is a column called rbsid. Unfortunately, I cannot link this ID to any of my documents. My question is this: is there a way I can get all the blob information from the DB so I can link it to the other details?
Directly accessing the SharePoint database isn't supported by Microsoft.
If a server component requires information from the database, it must
get that data by using the appropriate items in the SharePoint object
model, and not by trying to get the items from the data structures in
the database through some query mechanism.
You might be better off using the SharePoint object model to read these files.
Some links that should help
http://www.codeproject.com/KB/sharepoint/File_Shunter.aspx
http://www.learningsharepoint.com/2011/04/01/read-a-file-in-sharepoint-document-library/