I'm migrating a project from no organization to a new organization. Will the project ID, dataset IDs, and any other IDs remain the same? Are there any potential disruptions besides those mentioned in this document?
According to the official documentation, the project ID can only be set when you create the project; it cannot be modified afterwards. So when you migrate your project to a new organization, the project ID will not change.
Regarding your question about the dataset IDs: once you create a dataset it can't be relocated, but you can copy a dataset, either manually or using the BigQuery Data Transfer Service. When you copy your dataset to a new location, you can create the new dataset with the same ID or use a different one.
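If you go the Data Transfer Service route, a rough sketch with its Python client could look like the following (the project, dataset names and region are placeholders, and the destination dataset is assumed to already exist in the new location):

    # Hypothetical sketch: copy a dataset to another location with the
    # BigQuery Data Transfer Service client. All names are placeholders.
    from google.cloud import bigquery_datatransfer

    client = bigquery_datatransfer.DataTransferServiceClient()

    transfer_config = bigquery_datatransfer.TransferConfig(
        destination_dataset_id="my_dataset_copy",   # same or different ID
        display_name="copy my_dataset to asia-northeast1",
        data_source_id="cross_region_copy",         # built-in dataset-copy source
        params={
            "source_project_id": "my-project",
            "source_dataset_id": "my_dataset",
        },
    )

    # The destination dataset (created beforehand in the new location)
    # determines where the copy lands.
    config = client.create_transfer_config(
        parent=client.common_project_path("my-project"),
        transfer_config=transfer_config,
    )
    print("Created transfer config:", config.name)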
To manually move a dataset from one location to another, follow this process:
1. Export the data from your BigQuery tables to a Cloud Storage bucket in either the same location as your dataset or in a location contained within your dataset's location. For example, if your dataset is in the EU multi-region location, you could export your data to the europe-west1 (Belgium) location, which is part of the EU.
2. Copy or move the data from your export Cloud Storage bucket to a new bucket you created in the destination location. For example, if you are moving your data from the US multi-region to the asia-northeast1 (Tokyo) region, you would transfer the data to a bucket you created in Tokyo. For information on transferring Cloud Storage objects, see Copying, renaming, and moving objects in the Cloud Storage documentation.
3. After you transfer the data to a Cloud Storage bucket in the new location, create a new BigQuery dataset in the new location. Then load your data from the Cloud Storage bucket into BigQuery.
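As a rough illustration of those steps with the google-cloud-bigquery Python client (all project, dataset, table, bucket names and locations below are placeholders, and the Cloud Storage copy in step 2 is left out):

    # Sketch of the manual move; every name here is a placeholder.
    from google.cloud import bigquery

    client = bigquery.Client(project="my-project")

    # 1. Export the table to a Cloud Storage bucket in (or within) the
    #    dataset's current location.
    extract_job = client.extract_table(
        "my-project.us_dataset.my_table",
        "gs://my-us-export-bucket/my_table-*.avro",
        job_config=bigquery.ExtractJobConfig(destination_format="AVRO"),
    )
    extract_job.result()  # wait for the export to finish

    # 2. Copy the exported files to a bucket in the destination location
    #    (e.g. with gsutil or the Storage Transfer Service) -- not shown.

    # 3. Create the dataset in the new location and load the data into it.
    dataset = bigquery.Dataset("my-project.tokyo_dataset")
    dataset.location = "asia-northeast1"
    client.create_dataset(dataset, exists_ok=True)

    load_job = client.load_table_from_uri(
        "gs://my-tokyo-bucket/my_table-*.avro",
        "my-project.tokyo_dataset.my_table",
        job_config=bigquery.LoadJobConfig(
            source_format=bigquery.SourceFormat.AVRO),
    )
    load_job.result()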
I'm wondering if deleting these BlobUploadSession entities would affect my app's functionality or performance in any way. The reason for deleting them is that when a new form is created and no files are uploaded to it, unnecessary entities end up being created.
(edit: additional info from comment)
I use the blobstore (alongside NDB) to store images asynchronously via its upload URL functionality. When I run the app on localhost, a datastore kind called "BlobUploadSession" is created automatically. This is where the URLs for the images to be uploaded are stored as entities. When I upload a photo to the URL, it goes into the "BlobInfo" kind. Now, I don't need the URLs anymore since the photo has already been uploaded, so I'm wondering if I can delete the BlobUploadSession entities? Btw, BlobUploadSession and BlobInfo are default kinds created automatically.
The __BlobUploadSession__ and __BlobInfo__ entities are created by, and used only internally by, the development server while emulating the blobstore functionality.
There are other, similarly named __SomeEntityName__ entities used to emulate other pieces of functionality; for example, a pile of them are created when you request datastore stats (a function that doesn't exist per se in production).
These entities aren't created on GAE, so no need to worry about them in production.
See also the related question How to remove built-in kinds' names in google datastore using kind queries.
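If you just want to list your own kinds without the development-server internals, a small sketch along these lines (Python runtime with ndb assumed) filters out the double-underscore names:

    # Sketch: list kinds in the local datastore and skip the
    # development-server-internal ones, which are wrapped in double
    # underscores (e.g. __BlobUploadSession__, __BlobInfo__).
    from google.appengine.ext.ndb import metadata

    def user_kinds():
        return [k for k in metadata.get_kinds()
                if not (k.startswith('__') and k.endswith('__'))]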
We're using Cloudant as the remote database for our app. The database contains documents for each user of the app. When the app launches, we need to query the database for all the documents belonging to a user. What we found is that the CDTDatastore API only allows pulling the entire database, storing it inside the app, and then performing the query on the local copy. The initial pull to the local datastore takes about 10 seconds, and I imagine it will take longer as more users are added.
Is there a way I can save only part of the remote database to the local datastore? Or, are we using the wrong service for our app?
You can use a server-side replication filter function; you'll need to add information about your filter to the pull replicator. However, replication takes a performance hit when a filter function is used.
That being said, a common pattern is to use one database per user; however, this has other trade-offs and is something you should read up on. There is some information on the one-database-per-user pattern here.
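For illustration, a minimal sketch of publishing such a filter function over plain HTTP from Python; the design-document name, the "user" field and the credentials are all assumptions, and the filter body itself is the JavaScript that Cloudant/CouchDB executes:

    # Sketch: create a design document containing a replication filter.
    import requests

    CLOUDANT_DB = "https://ACCOUNT.cloudant.com/mydb"   # placeholder
    AUTH = ("api_key", "api_password")                  # placeholder

    design_doc = {
        "_id": "_design/app",
        "filters": {
            # Filter functions are JavaScript stored in the design doc;
            # this one keeps only one user's documents.
            "by_user": (
                "function(doc, req) {"
                "  return doc.user === req.query.user;"
                "}"
            )
        },
    }

    resp = requests.put(CLOUDANT_DB + "/_design/app",
                        json=design_doc, auth=AUTH)
    resp.raise_for_status()

    # The CDTDatastore pull replication would then be configured with the
    # filter name "app/by_user" and a {"user": "<user id>"} parameter, so
    # only that user's documents are replicated to the device.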
I am building an Azure chargeback solution, and for that I am pulling Azure usage data from the Azure Billing REST APIs for multiple subscriptions and different dates. I need to store this in a custom MS SQL database, as per the customer's requirements. I get various usage records from Azure.
Problem: From these usage records, I am not able to find any combination of columns in the data I receive that gives me a unique key to identify a usage record for a particular subscription and a specific date. The only column that differs is Quantity, but even that can be duplicated. E.g., if there are two VMs of type A1 with no data or applications on them, in the same cloud service, then they will have exactly the same usage quantity. I do not get the exact name of the VM, or of any other resource, via the Usage APIs.
One custom solution (ineffective): I can append a counter or unique ID to the usage records, but if I fetch the data next time, the order may shuffle or new data may be introduced, thereby breaking the logic for uniqueness. Any logic I build to check whether data is missing in the DB will be buggy if there is any change in the order in which the usage records are returned (for a specific subscription and a specific date).
I am sure that Microsoft stores this data in some database, but I can't find a unique ID to identify a usage record among the many records returned by the Billing API. Maybe I am missing something here.
I will appreciate any help or any pointers on this.
When you call the Usage API, set the showDetails parameter to true: &showDetails=true
MSDN Doc
This will populate the instance data in the returned JSON with the unique URI for the resource, which includes its name, for example:
Website:
"instanceData": "{\"Microsoft.Resources\":{\"resourceUri\":\"/subscriptions/xxx-xxxx/resourceGroups/mygoup/providers/Microsoft.Web/sites/WebsiteName\",\"...
Virtual Machine:
"instanceData": "{\"Microsoft.Resources\":{\"resourceUri\":\"/subscriptions/xxx-xxxx/resourceGroups/TESTINGBillGroup/providers/Microsoft.Compute/virtualMachines/TestingBillVM\",\...
If showDetails is false, all your resources are aggregated on the server side based on the resource type; for example, all your websites will show up as one entry.
The resource URI, date range, and meter ID together form a unique key, as far as I know.
NOTE: If you are using the legacy API, your VMs will be aggregated under the cloud service hosting them.
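For illustration, a rough Python sketch against the Usage Aggregates endpoint that requests detailed rows and builds such a composite key; the subscription ID, bearer token and dates are placeholders, and you should double-check the parameter and property names against the current docs:

    # Sketch: pull detailed usage and build (resourceUri, window, meterId) keys.
    import json
    import requests

    SUBSCRIPTION_ID = "xxx-xxxx"                    # placeholder
    TOKEN = "<bearer token from Azure AD>"          # placeholder

    url = (
        "https://management.azure.com/subscriptions/{}/providers/"
        "Microsoft.Commerce/UsageAggregates".format(SUBSCRIPTION_ID)
    )
    params = {
        "api-version": "2015-06-01",
        "reportedStartTime": "2017-01-01T00:00:00Z",
        "reportedEndTime": "2017-01-02T00:00:00Z",
        "aggregationGranularity": "Daily",
        "showDetails": "true",   # include per-resource instanceData
    }

    resp = requests.get(url, params=params,
                        headers={"Authorization": "Bearer " + TOKEN})
    resp.raise_for_status()

    for record in resp.json().get("value", []):
        props = record["properties"]
        instance = json.loads(props["instanceData"])   # instanceData is a JSON string
        resource_uri = instance["Microsoft.Resources"]["resourceUri"]

        # Candidate unique key: resource URI + usage window + meter ID.
        key = (resource_uri, props["usageStartTime"],
               props["usageEndTime"], props["meterId"])
        print(key, props["quantity"])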
Google's documentation states the following on their help page for Backup/Restore, Copy and Delete Data:
Note: Blob data is not backed up by this backup feature!
https://developers.google.com/appengine/docs/adminconsole/datastoreadmin#Backup_And_Restore
I did a simple backup/restore with an entity type in my application that contains a Blob field. After I backed up the entity, I removed the data that was stored in the Blob field. When I restored the entity, it had that data once again.
Is it safe to infer that the warning in the documentation refers to data in the Blobstore and not Blob fields of entities stored in the normal data store?
I would say that it is safe to assume the two are not related. As per the Google Blobstore Java API Overview:
Note: Blobs as defined by the Blobstore service are not related to blob property values used by the datastore.
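To make the distinction concrete, here is a small illustrative ndb sketch (Python runtime; the model and property names are made up):

    from google.appengine.ext import ndb

    class Photo(ndb.Model):
        # Raw bytes stored in the datastore entity itself -- this is the
        # kind of "blob field" that the datastore backup does include.
        thumbnail = ndb.BlobProperty()

        # Reference to data held by the Blobstore service -- that data is
        # what the note in the documentation says is not backed up.
        original = ndb.BlobKeyProperty()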
I'm building a client/server app where I want to sync data. I'm thinking about including the largest key from the local client database in the query so the server can fetch all entities added after that entity (with key > largest_local_key).
Can I be sure that Google App Engine always increases the IDs of new entities?
Is that a good way to implement synchronization?
No, IDs do not increase monotonically.
Consider synchronizing based on a create/update timestamp.
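A minimal sketch of what that could look like with ndb (the model and property names are illustrative):

    from google.appengine.ext import ndb

    class Item(ndb.Model):
        payload = ndb.JsonProperty()
        updated = ndb.DateTimeProperty(auto_now=True)  # refreshed on every put()

    def changed_since(last_sync):
        """Return items created or modified after the client's last sync time."""
        return Item.query(Item.updated > last_sync).order(Item.updated).fetch()

The client would send the largest updated timestamp it has seen instead of the largest key, and store the newest value it receives for the next sync round.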