Auto creation of entities in blobuploadsession datastore when using upload to url functionality in blob datastore - google-app-engine

I'm wondering whether deleting these BlobUploadSession entities would affect my app's functionality or performance in any way. The reason for deleting them is that when a new form is created and no files are uploaded to it, unnecessary entities are left behind.
(edit: additional info from comment)
I use blobstore (part of NDB) to store images asynchronously via the upload URL functionality. When I run the app on localhost, a datastore kind called "BlobUploadSession" is created automatically. This is where the URLs for the images to be uploaded are stored as entities. When I upload a photo to the URL, it goes into the "BlobInfo" kind. I no longer need the URLs once the photo has been uploaded, so I'm wondering whether I can delete the BlobUploadSession entities. Btw, BlobUploadSession and BlobInfo are default kinds that are created automatically.

The __BlobUploadSession__ and __BlobInfo__ entities are created by, and only used internally by, the development server while it emulates the blobstore functionality.
There are other, similarly named __SomeEntityName__ entities for emulating other pieces of functionality; for example, a pile of them are created when you request datastore stats (no such function exists per se in production).
These entities aren't created on GAE, so there is no need to worry about them in production.
See also the related question "How to remove built-in kinds' names in google datastore using kind queries".

Related

The best GCP architecture for exporting Bigquery data to an external application with API

I use the following GCP products together for a CRM system:
Cloud SQL
App Engine
Bigquery
Once a week, an external application exports data from BigQuery as follows:
The external application makes a request to App Engine with a token.
App Engine retrieves the permissions for this token from Cloud SQL and performs some additional computation to obtain a list of allowed IDs.
App Engine runs a BigQuery query filtered by these IDs, something like: SELECT * FROM table WHERE id IN(ids)
App Engine responds to the external application with the unmodified query result in JSON.
The problem is that the export does not happen very often, but the amount of data can be large, and I don't want to load App Engine with this data. What other GCP products are useful in this case? Remember that I need to retrieve permissions from App Engine and Cloud SQL.
It's unclear whether the JSON comes directly from the BigQuery query results or whether you do additional processing in the application to render/format it. I'm assuming direct results.
One option that comes to mind is to leverage Cloud Storage. You can use the signed URL feature to provide a time-limited link to your (potentially large) results without exposing them to public access.
This, coupled with BigQuery's ability to export results to GCS (either via an export job or using the newer EXPORT DATA SQL statement), allows you to run a query and deliver the results directly to GCS.
With this, you could simply redirect the user to the signed URL at the end of your current flow. There are additional complementary features here, such as using GCS lifecycle features to age out and remove files automatically, so you don't need to concern yourself with the slow accumulation of results.
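As a rough illustration of the EXPORT DATA route, here is a minimal Python sketch that only builds the SQL text. The table name, bucket, and path are hypothetical placeholders, and the @ids array parameter is an assumption about how the allowed IDs would be passed in by whichever BigQuery client runs the statement. Running the query and signing the resulting GCS URL are not shown.

```python
def build_export_sql(table: str, gcs_uri: str) -> str:
    """Build an EXPORT DATA statement that writes the filtered rows to GCS as JSON."""
    # The destination must be a gs:// URI containing a single '*' wildcard,
    # because BigQuery may split the output across several files.
    if not gcs_uri.startswith("gs://") or gcs_uri.count("*") != 1:
        raise ValueError("gcs_uri must be a gs:// URI with exactly one '*' wildcard")
    return (
        "EXPORT DATA OPTIONS(\n"
        f"  uri='{gcs_uri}',\n"
        "  format='JSON',\n"
        "  overwrite=true\n"
        ") AS\n"
        # @ids is a named array parameter holding the allowed IDs (assumption).
        f"SELECT * FROM `{table}` WHERE id IN UNNEST(@ids)"
    )


# Hypothetical table and bucket names, for illustration only.
sql = build_export_sql(
    "my_project.crm.contacts",
    "gs://my-export-bucket/exports/run-001/*.json",
)
print(sql)
```

Passing the IDs as a query parameter rather than concatenating them into the statement keeps the SQL text fixed and avoids quoting problems when the list is long.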

Cloudant CDTDatastore to pull only part of the database

We're using Cloudant as the remote database for our app. The database contains documents for each user of the app. When the app launches, we need to query the database for all the documents belonging to a user. What we found is that the CDTDatastore API only allows pulling the entire database and storing it inside the app, then performing the query on the local copy. The initial pull to the local datastore takes about 10 seconds, and I imagine it will take longer as more users are added.
Is there a way I can save only part of the remote database to the local datastore? Or, are we using the wrong service for our app?
You can use a server-side replication filter function; you'll need to add information about your filter to the pull replicator. However, replication will take a performance hit when using the function.
That being said, a common pattern is to use one database per user. This has other trade-offs, though, and it is something you should read up on. There is some information on the one-database-per-user pattern here.
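For the replication filter option mentioned above, a filter lives in a design document on the server as a JavaScript function taking (doc, req). The following Python sketch only assembles such a design document as JSON; the "owner" field, the design document name, and the filter name are hypothetical, and uploading the document to Cloudant is left out. The pull replicator would then be pointed at the filter by its "designdoc/filtername" reference and given the owner value as a query parameter.

```python
import json

# JavaScript filter body stored as a string in the design document.
# Assumption: every user document carries an "owner" field.
FILTER_JS = (
    "function (doc, req) {"
    " return doc.owner === req.query.owner;"
    " }"
)


def build_design_doc(ddoc_name: str, filter_name: str) -> dict:
    """Return a design document containing a single replication filter."""
    return {
        "_id": f"_design/{ddoc_name}",
        "filters": {filter_name: FILTER_JS},
    }


def filter_reference(ddoc_name: str, filter_name: str) -> str:
    """Return the 'designdoc/filtername' string used to select the filter."""
    return f"{ddoc_name}/{filter_name}"


# Hypothetical names, for illustration only.
ddoc = build_design_doc("app", "by_owner")
print(json.dumps(ddoc, indent=2))
print(filter_reference("app", "by_owner"))
```

Because the server evaluates the filter for every changed document, this is where the performance cost noted above comes from.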

GAE backup, Blob fields vs. Blobstore

Google's documentation states the following on their help page for Backup/Restore, Copy and Delete Data:
Note: Blob data is not backed up by this backup feature!
https://developers.google.com/appengine/docs/adminconsole/datastoreadmin#Backup_And_Restore
I did a simple backup/restore with an entity type in my application that contains a Blob field. After I backed up the entity, I removed the data that was stored in the Blob field. When I restored the entity it had that data once again.
Is it safe to infer that the warning in the documentation refers to data in the Blobstore and not Blob fields of entities stored in the normal data store?
I would say that it is safe to assume the two are not related. As per Google's Blobstore Java API Overview:
Note: Blobs as defined by the Blobstore service are not related to blob property values used by the datastore.

Deleted Datastore entries reappear

I'd like to re-open "Deleted Datastore entries reappear" as a registered user. Can the old question be deleted?
I'll try to be more specific this time. I'm experiencing the following problem:
Initially I put N entities of the same kind into the Datastore like this:
datastore_entity = MyModel(model_property=property_value)
datastore_entity.put()
Afterwards I delete them. I have used both the Datastore Admin interface and a self-defined handler for the mapreduce library to do so. The deleted entities appear in neither the Datastore viewer nor the Datastore Admin view.
When I put even a single new entity of this kind into the Datastore, the old Datastore entities reappear in the Datastore Admin view while the new entity does not (judging by the number of entities). The Datastore viewer, by contrast, correctly reflects the Datastore state. A query also returns only the newly created entity.
There are no tasks at the time the new entity is being put into the Datastore.
I'm also not encountering this problem on my local machine where I'm using the --clean_datastore option when starting the server.
The Datastore Admin and Datastore Statistics are not "live". The Datastore viewer offers a live view.
Check "Entity statistics last updated..." and you will notice the difference.
If the old entities are not visible in the Datastore viewer, there is no need to worry. Eventually the statistics will be updated.
