I am using the NDB Datastore to save some information that my application requires. Let's say I have a class like this:
class Example(db.Model):
    Entity1 = db.TextProperty()
    Entity2 = db.StringProperty(multiline=True)
When I populate my db and view it locally, I can see my updates and even make manual changes to both Entity1 (TextProperty) and Entity2 (StringProperty).
But when I deploy this app and later want to change a value from the datastore viewer on appengine.google.com, only Entity2 (StringProperty) is editable; I can't change the value of Entity1 (TextProperty). Is there a setting I need to change to make this work?
The datastore viewer is just a convenience, so it's not surprising that only some property types can be edited directly.
As you've seen, just a difference in model type changes the behavior. And the behavior on the development server is often different from the live system in any case.
The simplest (only) solution is to write code that lets you perform the required edit on the model.
Related
I am writing a datastore migration for our current production App Engine application.
We made some fairly extensive changes to the data model so I am trying to put in place an architecture to allow easier migrations in the future. This includes test suites for the migrations and common class structures for migration scripts.
I am running into a problem with my current strategy. For both the migrations and the test scripts I need a way to load the Model classes from the old schema and the Model classes for the new data schema into memory at the same time and load entities using either.
Here is an example set of schemas.
rev1.py
class Account(db.Model):
    _version = db.IntegerProperty(default=1)
    user = db.UserProperty(auto_current_user_add=True, required=True)
    name = db.StringProperty()
    contact_email = db.EmailProperty()
rev2.py
class Account(db.Model):
    _version = db.IntegerProperty(default=2)
    auth_id = db.StringProperty()
    name = db.StringProperty()
    pwd_hash = db.StringProperty(required=True, indexed=False)
A migration script may look something like:
import rev1
import rev2

class MyMigration(...):
    def isNeeded(self):
        num_accounts = num_entities_with_version(rev1.Account, 1)
        return num_accounts > 0

    def run(self):
        rev1_accounts = rev1.Account.all()
        # Filter on each candidate `a`; `account` is not yet bound inside the comprehension.
        for account in [a for a in rev1_accounts if a._version == 1]:
            auth_id = account.contact_email
            if auth_id is None or auth_id == '':
                auth_id = account.user.email()
            new_account = rev2.Account.create(auth_id=auth_id,
                                              name=account.name)
And a test suite would look something like this:
import rev1
import rev2

class MyTest(...):
    def testIt(self):
        # Setup data
        act1 = rev1.Account(name='..', contact_email='..')
        act1.put()
        act2 = rev1.Account(name='..', contact_email='..')
        act2.put()
        # Run migration
        migration.run()
        # Check results
        accounts = rev2.Account.all().fetch(99)
So as you can see, I am using the old revision in two ways. In the migration, I use it to read data in the old format and convert it into the new format. (Note: I can't read it in the new format because of things like the required pwd_hash field and other field changes.) In the test suite, I use it to set up test data in the old format before running the migration.
It all seems great in theory, but in practice it falls apart because GAE doesn't allow multiple models to be loaded for the same kind. More specifically, queries only return instances of the most recently defined model.
In the development server this seems to be due to the fact that the process of calling get() on a query for an entity (ex: Account.get(my_key)) calls a result hook that builds the result Model object by calling class_for_kind on the entity kind name from the data. So even though I may call rev2.Account.get(), it may build up rev1.Account Model objects because the kind 'Account' maps to rev1.Account in the _kind_map dictionary.
This has made me rethink my migration strategy a bit and I wanted to ask if anyone has thoughts. Specifically:
Would it be safe to manually override google.appengine.ext.db._kind_map at runtime in test and on the production servers to allow this migration method to work?
Is there some better way to keep two versions of a Model in memory at the same time?
Is there a different migration method that may be a smarter way to go about this work?
Other methods I have thought of trying include:
Change the entity kind when the version changes. (use kind() to change it) Then when we migrate we move all classes to the new kind name.
Find a way to query the entities and get back a 'raw' object (proto buffers??) that has not been built into a full object. (would not work with tests)
'Just Do It Live': Don't write tests for any of this and just try to migrate using the latest schema, loading the older data and working around issues as they come up.
I think there are actually several questions within the greater question. There seem to be two key questions here though, one is how to test and the other is how to really do it.
I wouldn't define the kind multiple times; as you've noted there are nuances to doing this, and, if you wind up with the wrong model loaded, you'll get all sorts of headaches. That said, it is completely possible for you to manipulate the kind_map. I've done this in some special cases, but I try to avoid it when possible.
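To see why the most recently defined class wins, here is a toy model of a kind registry in plain Python. It is not App Engine's implementation: `KIND_MAP`, `register`, `class_for_kind`, and `make_account` are stand-ins written only for illustration.

```python
# Toy stand-in for a kind registry; not App Engine's actual code.
KIND_MAP = {}


def register(cls):
    """Record cls under its kind name, replacing any earlier class."""
    KIND_MAP[cls.kind()] = cls
    return cls


def class_for_kind(kind):
    """Return whichever class was registered last for this kind."""
    return KIND_MAP[kind]


def make_account(version):
    """Build a hypothetical Account class for one schema revision."""
    class Account(object):
        schema_version = version

        @classmethod
        def kind(cls):
            return 'Account'
    return register(Account)


Rev1Account = make_account(1)
Rev2Account = make_account(2)
```

Both revisions report the kind 'Account', so the second registration replaces the first, and any lookup by kind builds the later class regardless of which class issued the query.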
For a live migration where you've got significant schema changes, you've got two choices: use Expando or use the lower-level API. When adding required fields, you might find it easier to use Expando, then run a migration to add the new information, then switch back to a plain db.Model. The lower-level API sits right under the ext.db stuff, and it presents the entity as a Python dict. This can be very convenient for manipulating an entity. Use whichever method you're more comfortable with. I prefer Expando when possible, since it is a higher-level interface, but it is a two-step process.
For testing, I'd personally suggest you focus on the actual conversion routines. So instead of testing the method from the point of querying down, test to ensure your conversion routines themselves function correctly. You might even choose to pass in the old entity as a Python dict, then return the new entity.
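A minimal sketch of such a conversion routine, assuming the old entity arrives as a plain dict. The key `user_email` is a hypothetical stand-in for `account.user.email()` in the migration script, and `pwd_hash` is left out because the question does not say how it would be populated.

```python
def convert_account_v1_to_v2(old):
    """Return a new rev 2 account dict built from a rev 1 account dict.

    The input dict is not modified. 'user_email' is a hypothetical key
    standing in for account.user.email() in the migration script.
    """
    auth_id = old.get('contact_email')
    if not auth_id:
        # Fall back to the user's email when contact_email is None or ''.
        auth_id = old.get('user_email')
    return {
        '_version': 2,
        'auth_id': auth_id,
        'name': old.get('name'),
    }
```

Because the function only takes and returns dicts, a test can call it directly without a datastore and without loading either revision of the Model class.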
I'd make one other adjustment here as well. I'd rather use a query to find all my rev 1 accounts. That's the great thing about having an indexed _version on your models: you can trivially find the entities that need to be migrated.
Also, check out Google's article on updating schemas. It is old, but still good.
Another approach is to simply do the migration on version 2, leaving the old attributes on the model and setting them to None after you update the version. This will clear out the space they use but will still leave them defined. Then in a following release you can just remove them from the model.
This method is pretty simple, but it requires two releases to remove the old attributes completely, so it is more akin to deprecating the existing attributes.
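As a rough sketch of the first release in that two-step approach, again treating the entity as a plain dict: the old keys are kept but set to None, and the version is bumped. The field names come from the rev1/rev2 example; deriving `auth_id` from `contact_email` alone is a simplification made for this sketch.

```python
OLD_ATTRIBUTES = ('user', 'contact_email')  # rev 1 fields from the example


def migrate_in_place(account):
    """Upgrade a rev 1 account dict to rev 2, keeping old keys set to None."""
    if account.get('_version') != 1:
        return account  # already migrated, nothing to do
    # Simplification: only contact_email is used to derive auth_id here.
    account['auth_id'] = account.get('contact_email') or None
    for attr in OLD_ATTRIBUTES:
        account[attr] = None  # frees the stored value, keeps the key defined
    account['_version'] = 2
    return account
```

The second release would then drop `user` and `contact_email` from the model definition entirely.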
I need to know when my app's data store was last updated.
I could find and patch every line of code where INSERT, UPDATE and DELETE queries are used, but maybe there is an official capability for this in the datastore?
You can use a 'database service hook' to execute your own bit of code whenever the database is written to.
See http://code.google.com/appengine/articles/hooks.html
I would advise against trying to accomplish this with an RPC hook. RPC hooks are neat, but they plug into relatively low-level components of the datastore stack. It's preferable to work with the high-level abstractions unless there's a good reason not to.
Why not just attach an update timestamp to your models?
class BaseModel(db.Model):
    updated_at = db.DateTimeProperty(auto_now=True)

class MyModel(BaseModel):
    name = db.StringProperty()

class OtherModel(BaseModel):
    total = db.IntegerProperty()
Every model that inherits from BaseModel will automatically track an update timestamp.
If you use an object database, what happens when you need to change the structure of your object model?
For instance, I'm playing around with the Google App Engine. While I'm developing my app, I've realized that in some cases, I mis-named a class, and I want to change the name. And I have two classes that I think I need to consolidate.
However, I don't think I can, because the name of the class is intimately tied to the datastore, and there is actual data stored under those class names.
I suppose the good thing about the "old way" of abstracting the object model from the data storage is that the data storage doesn't know anything about the object model --it's just data. So, you can change your object model and just load the data out of the datastore differently.
So, in general, when using a datastore which is intimate with your data model...how do you change things around?
If it's just class naming you're concerned about, you can change the class name without changing the kind (the identifier that is used in the datastore):
class Foo(db.Model):
    @classmethod
    def kind(cls):
        return 'Bar'
If you want to rename your class, just implement the kind() method as above, and have it return the old kind name.
If you need to make changes to the actual representation of data in the datastore, you'll have to run a mapreduce to update the old data.
The same way you do it in relational databases, except without a nice simple SQL script: http://code.google.com/appengine/articles/update_schema.html
Also, just like the old days, objects without properties don't automatically get defaults and properties that don't exist in the schema still hang around as phantoms in the objects.
To rename a property, I expect you can remove the old property (the phantom hangs around), add the new name, and populate the data with a copy from the old (phantom) property. The rewritten object will only have the new property.
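The copy step can be sketched on a dict-like entity as follows. `rename_property` is a hypothetical helper, not a datastore API; it only illustrates moving the value from the old name to the new one.

```python
def rename_property(entity, old_name, new_name):
    """Copy old_name to new_name and drop old_name; returns the same dict."""
    if old_name in entity:
        # Do not overwrite a value already written under the new name.
        entity.setdefault(new_name, entity[old_name])
        del entity[old_name]
    return entity
```

Calling it on an entity that has already been rewritten is a no-op, so the migration can safely be re-run.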
You may be able to do it the way we are doing it in our project:
Before we update the object-model (schema), we export our data to a file or blob in json format using a custom export function and version tag on top. After the schema has been updated we import the json with another custom function which creates new entities and populates them with old data. Of course the import version needs to know the json format associated with each version number.
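A minimal sketch of that export/import idea using only the standard `json` module. The version numbers, the `title` to `name` rename, and all function names are hypothetical examples, not the project's actual code.

```python
import json

CURRENT_VERSION = 2


def export_entities(entities, version):
    """Serialize entity dicts to JSON with a version tag on top."""
    return json.dumps({'version': version, 'entities': entities})


def _upgrade_v1(record):
    """Hypothetical v1 -> v2 step: 'title' was renamed to 'name'."""
    record = dict(record)
    record['name'] = record.pop('title', None)
    return record


# One upgrade step per old version; each maps version n to n + 1.
UPGRADES = {1: _upgrade_v1}


def import_entities(payload):
    """Parse an export and upgrade every record to CURRENT_VERSION."""
    data = json.loads(payload)
    version = data['version']
    records = data['entities']
    while version < CURRENT_VERSION:
        records = [UPGRADES[version](r) for r in records]
        version += 1
    return records
```

Chaining one small upgrade step per version keeps each step simple, and an export from any older version can still be imported by applying the steps in order.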
I have created a class that inherits from DomainService and have a Silverlight app that uses System.ServiceModel.DomainServices.Client to get a DomainContext. I have also created POCO DataContracts that are used in the DomainService's Query, Update, Insert, and Delete methods. I also have a ViewModel that executes all the LoadOperations. Now I'm at the part of my app where I want to add new entities to the generated EntitySets, but I am unsure what will happen when one user creates a new entity and sets its Key value while another user creates a similar entity with the same Key value.
I have seen in the documentation that an ObjectContext is used, but in my situation I was not able to use the EntityFramework model generator. So I had to create my datacontracts by hand.
So I guess my question is, is there any way I can force other silverlight apps to update on database change?
When you make a Save operation to your DomainContext, depending on the load behavior, it will automatically refresh.
TicketContext.Load(TicketContext.GetTicketByIdQuery(ticketId),
    LoadBehavior.RefreshCurrent,
    x =>
    {
        Ticket = x.Entities.First();
        Ticket.Load();
        ((IChangeTracking) TicketContext.EntityContainer).AcceptChanges();
    }, null);
Here I've set the LoadBehavior to RefreshCurrent. When you make a save, RIA will send the entity back across the wire to the client and merge the changes with the entity already cached on your client side context. I don't know if that quite answers your question or not however.
Google proposes changing one entry at a time to the default values:
http://code.google.com/appengine/articles/update_schema.html
I have a model with a million rows, and doing this with a web browser will take me ages. Another option is to run this using task queues, but this will cost me a lot of CPU time.
Is there an easy way to do this?
Because the datastore is schema-less, you do literally have to add or remove properties on each instance of the Model. Using Task Queues should use the exact same amount of CPU as doing it any other way, so go with that.
Before you go through all of that work, make sure that you really need to do it. As noted in the article that you link to, it is not the case that all entities of a particular model need to have the same set of properties. Why not change your Model class to check for the existence of new or removed properties and update the entity whenever you happen to be writing to it anyhow?
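The upgrade-on-write idea could look roughly like this, with entities modeled as plain dicts. `NEW_DEFAULTS`, `REMOVED`, and `save` are hypothetical names, and the `store` list stands in for a real datastore put.

```python
NEW_DEFAULTS = {'new_attribute': 'default'}  # hypothetical added properties
REMOVED = ('old_attribute',)                 # hypothetical removed properties


def upgrade_before_write(entity):
    """Bring an entity dict up to the current shape just before saving."""
    for name, default in NEW_DEFAULTS.items():
        entity.setdefault(name, default)  # add only if missing
    for name in REMOVED:
        entity.pop(name, None)            # drop if still present
    return entity


def save(entity, store):
    """Stand-in for a put(): upgrade first, then append to a list."""
    store.append(upgrade_before_write(entity))
```

Entities that are never written again keep their old shape, so code that reads them still has to tolerate missing properties.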
Instead of what the docs suggest, I would suggest using the low-level GAE API to migrate.
The following code will migrate all the items of type DbMyModel:
new_attribute will be added if it does not exist.
old_attribute will be deleted if it exists.
changed_attribute will be converted from boolean to string (True to Priority 1, False to Priority 3)
Please note that query.Run returns an iterator that yields Entity objects. Entity objects behave like dicts:
from google.appengine.api.datastore import Query, Put

query = Query("DbMyModel")
for item in query.Run():
    if 'new_attribute' not in item:
        item['new_attribute'] = some_value  # some_value: placeholder for your default
    if 'old_attribute' in item:
        del item['old_attribute']
    # .get() avoids a KeyError on entities that lack the property.
    if item.get('changed_attribute') is True:
        item['changed_attribute'] = 'Priority 1'
    elif item.get('changed_attribute') is False:
        item['changed_attribute'] = 'Priority 3'
    # and so on...
    # Put the item to the db:
    Put(item)
In case you need to select only some records, see the google.appengine.api.datastore module's source code for extensive documentation and examples of how to create a filtered query.
With this approach it is simpler to add or remove properties, and you avoid the issues that GAE's suggested approach runs into once you have already updated your application model.
For example, now-required fields might not exist (yet) causing errors while migrating. And deleting fields does not work for static properties.
This doesn't help OP but may help googlers with a tiny app: I did what Alex suggested, but simpler. Obviously this isn't appropriate for production apps.
deploy App Engine Console
write code right inside the web interpreter against your live datastore
like so:
from models import BlogPost

for item in BlogPost.all():
    item.attr = "defaultvalue"
    item.put()