Migrating an App to High-Replication Datastore - now ancestor issues - google-app-engine

Following these instructions:
http://neogregious.blogspot.com/2011/04/migrating-app-to-high-replication.html
I have managed to migrate to the High Replication Datastore; however, I am now getting the following exception:
datastore_errors.BadArgumentError('ancestor argument should match app ("%r" != "%r")' %
(ancestor.app(), app))
The model data looks something like this:
class base_business(polymodel.PolyModel):
    created = db.DateTimeProperty(auto_now_add=True)

class business(base_business):
    some_data = db.StringProperty()
    # etc.

class business_image(db.Model):
    image = db.BlobProperty(default=None)
    mimetype = db.StringProperty()
    comment = db.StringProperty(required=False)

# the image is associated like so
image_item = business_image(parent=business_item, ...)
image_item.put()
The new app name has not been applied to the ancestor keys in the migrated data. The data is still being returned at the moment, but the logs are filling up with this exception message.
The actual stack trace using logging.exception:
2011-11-03 16:45:40.211
======= get_business_image exception [ancestor argument should match app ("'oldappname'" != "'s~newappname'")] =======
Traceback (most recent call last):
  File "/base/data/home/apps/s~newappname/3.354412961756003398/oldappname/entities/views.py", line 82, in get_business_image
    business_img = business_image.gql("WHERE ANCESTOR IS :ref_business and is_primary = True", ref_business = db.Key(business_key)).get()
  File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 2049, in get
    results = self.fetch(1, config=config)
  File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 2102, in fetch
    raw = raw_query.Get(limit, offset, config=config)
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore.py", line 1668, in Get
    config=config, limit=limit, offset=offset, prefetch_size=limit))
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore.py", line 1600, in GetBatcher
    return self.GetQuery().run(_GetConnection(), query_options)
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore.py", line 1507, in GetQuery
    order=self.GetOrder())
  File "/base/python_runtime/python_lib/versions/1/google/appengine/datastore/datastore_rpc.py", line 93, in positional_wrapper
    return wrapped(*args, **kwds)
  File "/base/python_runtime/python_lib/versions/1/google/appengine/datastore/datastore_query.py", line 1722, in __init__
    ancestor=ancestor)
  File "/base/python_runtime/python_lib/versions/1/google/appengine/datastore/datastore_rpc.py", line 93, in positional_wrapper
    return wrapped(*args, **kwds)
  File "/base/python_runtime/python_lib/versions/1/google/appengine/datastore/datastore_query.py", line 1561, in __init__
    (ancestor.app(), app))
BadArgumentError: ancestor argument should match app ("'oldappname'" != "'s~newappname'")
Is there a way to manually set the app on the model data? Could I do something like this to resolve it?
if ancestor.app() != app:
    set_app('my_app')
    put()
Before I do this or apply any other hack, is there something I should have done as part of the data migration?

This sort of error usually occurs because you're using fully qualified keys that have been stored as strings in the datastore (instead of as ReferenceProperty values), or outside the datastore, such as in URLs. You should be able to work around it by reconstructing any keys that come from external sources so that the embedded app ID is ignored, something like this:
my_key = db.Key.from_path(*db.Key(my_key).to_path())
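A small helper built from that one-liner can be applied wherever a stored key string enters a query. This is only a sketch against the legacy google.appengine.ext.db API; normalize_key is a made-up name:

```
def normalize_key(key_or_str):
    # Rebuild the key from its kind/id path only; from_path fills in the
    # *current* app ID, so the stale 'oldappname' prefix is dropped.
    return db.Key.from_path(*db.Key(key_or_str).to_path())

business_img = business_image.gql(
    "WHERE ANCESTOR IS :ref_business AND is_primary = True",
    ref_business=normalize_key(business_key)).get()
```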

Related

Keyerror after looping a dictionary

I am making an API call to a large database that sends back data in JSON format. Since the data is large, the database returns the JSON in separate batches, each batch containing a nextPageUrl field pointing to the next batch. I want to crawl through the batches, collect the URL of each batch, store them in a list, and then loop over that list again to parse all the JSON data and populate my own SQLite database with the results. However, I get this error message:
Traceback (most recent call last):
File "Database_download_v2.py", line 52, in <module>
if (len(json_dict['nextPageUrl']) > 0):
KeyError: 'nextPageUrl'
The code I use is:
load_page = requests.get(form_response_tree, headers=headers).content
page_decode = load_page.decode()
json_dict = json.loads(page_decode)
url_subseq_page = json_dict['nextPageUrl']
url_list = list()
url_list.append(url_subseq_page)

for all_pages in url_list:
    load_page = requests.get(all_pages, headers=headers).content
    page_decode = load_page.decode()
    json_dict = json.loads(page_decode)
    if (len(json_dict['nextPageUrl']) > 0):
        url_subseq_page = json_dict['nextPageUrl']
        url_list.append(url_subseq_page)
    else:
        continue
Any idea what is wrong here?
The traceback shows a KeyError: the final batch has no nextPageUrl key at all, so json_dict['nextPageUrl'] raises before len() is ever evaluated. Guard the lookup with dict.get(), which returns None for a missing key.
Solution
if json_dict.get('nextPageUrl'):
    url_list.append(json_dict['nextPageUrl'])
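A minimal, self-contained sketch of the corrected crawl loop. The fetch callable stands in for requests.get(url, headers=headers).content.decode(), and collect_page_urls is a made-up name; canned batches play the role of HTTP responses here:

```python
import json

def collect_page_urls(fetch, first_url):
    """Follow nextPageUrl links, tolerating a final batch without one.

    fetch(url) must return the raw JSON text of one batch.
    Returns every batch URL visited, in order.
    """
    urls = [first_url]
    for url in urls:                         # the list grows as we iterate
        batch = json.loads(fetch(url))
        next_url = batch.get('nextPageUrl')  # None when the key is absent
        if next_url:
            urls.append(next_url)
    return urls

# Canned batches standing in for the real API; the last one has no nextPageUrl.
pages = {
    'p1': json.dumps({'nextPageUrl': 'p2', 'data': [1]}),
    'p2': json.dumps({'nextPageUrl': 'p3', 'data': [2]}),
    'p3': json.dumps({'data': [3]}),
}
print(collect_page_urls(pages.get, 'p1'))  # ['p1', 'p2', 'p3']
```

To use it against the real API, swap pages.get for an actual fetcher, e.g. lambda url: requests.get(url, headers=headers).content.decode().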

django Model one-to-many save fail

I am starting to develop an app that uses the Django model ORM. I came upon some strange behavior when creating and saving objects related by a one-to-many relation. I have made two model-creation test cases of a simple one-to-many relationship; one works and the other fails, and I don't understand why. Here are my models:
class Document(models.Model):
    pass

class Section(models.Model):
    document = models.ForeignKey('Document', on_delete=models.CASCADE)
Here is the working creation test case (in manage.py shell):
>>> doc = models.Document()
>>> doc.save()
>>> section = models.Section(document=doc)
>>> section.document
<Document: Document object (5)>
>>> section.save()
>>>
And here is the test case that fails:
>>> doc = models.Document()
>>> section = models.Section(document=doc)
>>> section.document
<Document: Document object (None)>
>>> doc.save()
>>> section.document
<Document: Document object (6)>
>>> section.save()
Traceback (most recent call last):
File "<input>", line 1, in <module>
section.save()
File "/home/philippe/.local/lib/python3.6/site-packages/django/db/models/base.py", line 741, in save
force_update=force_update, update_fields=update_fields)
File "/home/philippe/.local/lib/python3.6/site-packages/django/db/models/base.py", line 779, in save_base
force_update, using, update_fields,
File "/home/philippe/.local/lib/python3.6/site-packages/django/db/models/base.py", line 870, in _save_table
result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
File "/home/philippe/.local/lib/python3.6/site-packages/django/db/models/base.py", line 908, in _do_insert
using=using, raw=raw)
File "/home/philippe/.local/lib/python3.6/site-packages/django/db/models/manager.py", line 82, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "/home/philippe/.local/lib/python3.6/site-packages/django/db/models/query.py", line 1186, in _insert
return query.get_compiler(using=using).execute_sql(return_id)
File "/home/philippe/.local/lib/python3.6/site-packages/django/db/models/sql/compiler.py", line 1335, in execute_sql
cursor.execute(sql, params)
File "/home/philippe/.local/lib/python3.6/site-packages/django/db/backends/utils.py", line 99, in execute
return super().execute(sql, params)
File "/home/philippe/.local/lib/python3.6/site-packages/django/db/backends/utils.py", line 67, in execute
return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
File "/home/philippe/.local/lib/python3.6/site-packages/django/db/backends/utils.py", line 76, in _execute_with_wrappers
return executor(sql, params, many, context)
File "/home/philippe/.local/lib/python3.6/site-packages/django/db/backends/utils.py", line 84, in _execute
return self.cursor.execute(sql, params)
File "/home/philippe/.local/lib/python3.6/site-packages/django/db/backends/mysql/base.py", line 76, in execute
raise utils.IntegrityError(*tuple(e.args))
django.db.utils.IntegrityError: (1048, "Column 'document_id' cannot be null")
>>> section.document
Traceback (most recent call last):
File "<input>", line 1, in <module>
section.document
File "/home/philippe/.local/lib/python3.6/site-packages/django/db/models/fields/related_descriptors.py", line 189, in __get__
"%s has no %s." % (self.field.model.__name__, self.field.name)
diagnosisrank.models.common.Section.document.RelatedObjectDoesNotExist: Section has no document.
>>>
The only difference from the working test case is that the related Document is not saved before the Section is instantiated. After saving the Document, we see that the Section's related document points at the saved Document (its id is set). But when I try to save the Section, the related id is not set and the related model is lost. Why is that? Why must the Document instance be assigned to the Section only after it has been saved?
To clarify my problem: my goal is to collect all the information and instantiate all my models in a first step, then save everything to the database in a second step. I can still do this as follows: say D has many S; create D and S, save D, assign D to S, save S. But I would prefer: create D, create S with its related D, save D, save S. Why can't I?
Thanks for any help or insight!
Phil
Well, it was a bug and it has been fixed: https://github.com/django/django/commit/519016e5f25d7c0a040015724f9920581551cab0
However, the fix is not in the latest stable release (2.2.4) that I get from pip3 ...
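Until that fix is in a release, one workaround is to re-assign the relation after saving the parent, which refreshes the cached document_id. A sketch against the models above (it assumes the same shell session, so it is not runnable on its own):

```
doc = models.Document()
section = models.Section(document=doc)

doc.save()              # doc gets its primary key here
section.document = doc  # re-assign so the FK cache picks up the new pk
section.save()          # document_id is no longer NULL
```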

django model method fails database commit

The Configuration model below allows for uploading of a .ZIP and ties the extracted network device .TXT files within to the location where the device configurations came from.
class Configuration(models.Model):
    configfile = models.FileField('Configuration File Upload', upload_to='somecompany/configs/',
                                  help_text='Select a .ZIP file which contains .TXT file configuration dumps from devices which belong to a single location.')
    location_name = models.ForeignKey('Location',
                                      help_text='Associate the .ZIP file selected above to the location from which the device .TXT file configuration dumps were taken.')
I extended the default save() method of the model to process the .ZIP (code not shown for brevity). I've parsed the extracted .TXT files, collected all of the desired information into variables, and I'm trying to insert that information into my database, but it's failing. Below is an example of the values collected from a single extracted .TXT file (modified slightly for privacy) and my attempt at the DB insertion:
dbadd_ln = 'Red Rock'
dbadd_dn = 'DEVICE4'
dbadd_manu = 'cisco'
dbadd_os = 'nxos'
dbadd_dt = '-'
dbadd_prot = '-'
dbadd_cred = '-'
dbadd_ser = 'ABCD1234'
dbadd_addr = '10.10.10.10'
dbadd_model = 'N7K-C7010'
dbadd_ram = '2048256000'
dbadd_flash = '1109663744/1853116416'
dbadd_image = 'n7000-s1-dk9.5.2.9.bin'
dbadd = Device(location_name=dbadd_ln, device_name=dbadd_dn, device_type=dbadd_dt,
               protocol=dbadd_prot, credential=dbadd_cred, serial=dbadd_ser,
               address=dbadd_addr, manufacturer=dbadd_manu, model=dbadd_model,
               ram=dbadd_ram, flash=dbadd_flash, os=dbadd_os, image=dbadd_image)
Traceback (most recent call last):
File "<console>", line 1, in <module>
File "c:\code-projects\MYVIRTUALENV\lib\site-packages\django\db\models\base.py", line 431, in __init__
setattr(self, field.name, rel_obj)
File "c:\code-projects\MYVIRTUALENV\lib\site-packages\django\db\models\fields\related_descriptors.py", line 207, in __set__
self.field.remote_field.model._meta.object_name,
ValueError: Cannot assign "'Red Rock'": "Device.location_name" must be a "Location" instance.
'Red Rock' is a legitimate Location entry which already exists in my database...
>>> Location.objects.filter(location_name='Red Rock')
[Location: Red Rock]
... so I guess I'm unclear on what this really means:
"Device.location_name" must be a "Location" instance.
Any assistance to help resolve this issue is appreciated. Thanks in advance.
Did some more searching and found this:
Cannot assign "u''": "Company.parent" must be a "Company" instance
Now I see. When I pass an actual Location instance, I get past the particular error I was describing. Now on to solve the next problem. :)
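In other words, a ForeignKey must be assigned a Location instance, not the location's name string. A sketch of the corrected insertion, assuming the models above (not runnable outside the project):

```
# Fetch the existing Location row, then pass the instance itself.
dbadd_location = Location.objects.get(location_name='Red Rock')

dbadd = Device(location_name=dbadd_location,  # instance, not 'Red Rock'
               device_name=dbadd_dn, device_type=dbadd_dt, protocol=dbadd_prot,
               credential=dbadd_cred, serial=dbadd_ser, address=dbadd_addr,
               manufacturer=dbadd_manu, model=dbadd_model, ram=dbadd_ram,
               flash=dbadd_flash, os=dbadd_os, image=dbadd_image)
dbadd.save()
```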

Cannot get records back from Google App Engine

I have harvested data (not particularly clean data) and bulk-uploaded it into the datastore. However, I get the following issue when trying to simply loop through all the records. I don't much care about validation at this point; all I want to do is perform a bulk operation, but GAE appears not to let me even loop through the data records. I want to get to the bottom of this. To my knowledge all records have data in the country field, and I could switch off validation, but can someone explain why this is happening and why GAE is being so sensitive? Thanks
result = Company.all()
my_count = result.count()
if result:
    for r in result:
        self.response.out.write("hello")
The data model has these properties:
class Company(db.Model):
    companyurl = db.LinkProperty(required=True)
    companyname = db.StringProperty(required=True)
    companydesc = db.TextProperty(required=True)
    companyaddress = db.PostalAddressProperty(required=False)
    companypostcode = db.StringProperty(required=False)
    companyemail = db.EmailProperty(required=True)
    companycountry = db.StringProperty(required=True)
The error message is below
Traceback (most recent call last):
File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/webapp/_webapp25.py", line 701, in __call__
handler.get(*groups)
File "/base/data/home/apps/XXX/1.358667163009710608/showcompanies.py", line 99, in get
for r in result:
File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 2312, in next
return self.__model_class.from_entity(self.__iterator.next())
File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 1441, in from_entity
return cls(None, _from_entity=entity, **entity_values)
File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 973, in __init__
prop.__set__(self, value)
File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 613, in __set__
value = self.validate(value)
File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 2815, in validate
value = super(StringProperty, self).validate(value)
File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 640, in validate
raise BadValueError('Property %s is required' % self.name)
BadValueError: Property companycountry is required
If you have the bulk process you wish to run in its own script, you can construct a modified version of your Company class without validation. Since db.Model classes are just wrappers to the datastore based on the name of the class, you can have different classes in different parts of your code with different behaviors.
So you might have a model.py file with:
class Company(db.Model):
    companyurl = db.LinkProperty(required=True)
    # ...
    companycountry = db.StringProperty(required=True)

# Normal operations go here
And, another bulk_process.py file with:
class Company(db.Model):
    companyurl = db.LinkProperty()
    # ...
    companycountry = db.StringProperty()

result = Company.all()
my_count = result.count()
if result:
    for r in result:
        self.response.out.write("hello")
Because this second model class lacks the validation, it should run just fine. And, because the code is logically separated you don't have to worry about unintentional side-effects from removing the validation in the rest of your code. Just make sure that your bulk process doesn't accidentally write back data without validation (unless you're OK with this).

Google app engine: problem with autoload reference keys!

I have a model Image with a property named "uploaded_by_user", which is a db.ReferenceProperty to the User model. When I query for the image, the "uploaded_by_user" property already resolves to the full User model. I don't want that; I just want the key, without the extra query to load the User. Is that possible?
EDIT1:
The array of images is then sent over PyAMF to Flex, so the items must contain all the necessary data.
EDIT2
Error:
Traceback (most recent call last):
  File "D:\Totty\webDevelopment\TottysWorld\src\pyamf\remoting\amf0.py", line 108, in __call__
    *args, **kwargs)
  File "D:\Totty\webDevelopment\TottysWorld\src\pyamf\remoting\amf0.py", line 61, in _getBody
    **kwargs)
  File "D:\Totty\webDevelopment\TottysWorld\src\pyamf\remoting\gateway\__init__.py", line 510, in callServiceRequest
    return service_request(*args)
  File "D:\Totty\webDevelopment\TottysWorld\src\pyamf\remoting\gateway\__init__.py", line 233, in __call__
    return self.service(self.method, args)
  File "D:\Totty\webDevelopment\TottysWorld\src\pyamf\remoting\gateway\__init__.py", line 133, in __call__
    return func(*params)
  File "D:\Totty\webDevelopment\TottysWorld\src\app\services\lists\get_contents.py", line 39, in get_contents
    item.uploaded_by_user = str(Image.uploaded_by_user.get_value_for_datastore(item))
  File "C:\GAE\google\appengine\ext\db\__init__.py", line 3216, in __set__
    value = self.validate(value)
  File "C:\GAE\google\appengine\ext\db\__init__.py", line 3246, in validate
    if value is not None and not value.has_key():
AttributeError: 'str' object has no attribute 'has_key'
This error is gone now: instead of overwriting item.uploaded_by_user, I save the key on item.uploaded_by_user_key. But item.uploaded_by_user still loads the User model..
item.uploaded_by_user_key = str(Image.uploaded_by_user.get_value_for_datastore(item))
item is my Image in this case; as Image inherits from Item, I call it item.
class Image(db.Model):
    uploaded_by_user = db.ReferenceProperty(User)

class User(db.Model):
    ...
You could do:
image = db.get(image_key)
user_id = Image.uploaded_by_user.get_value_for_datastore(image).id()
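The same get_value_for_datastore call fits the PyAMF case above: read the raw key per item so serialization never touches the User entity. A sketch against the Image model (uploaded_by_user_key is the questioner's own ad-hoc attribute, not part of the model):

```
for item in images:
    user_key = Image.uploaded_by_user.get_value_for_datastore(item)
    if user_key is not None:
        # Raw db.Key -> string; no User entity is fetched.
        item.uploaded_by_user_key = str(user_key)
```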
