I have harvested some data (it's not particularly clean) and bulk uploaded it into the Datastore. However, I get the error below when I simply try to loop through all the records. I don't much care about validation at this point, since all I want to do is perform a bulk operation, but GAE will not even let me loop through the records. I want to get to the bottom of this. To my knowledge every record has data in the country field, and I could switch off validation, but can someone explain why this is happening and why GAE is being so sensitive? Thanks
result = Company.all()
my_count = result.count()
if result:
    for r in result:
        self.response.out.write("hello")
The data model has these properties:
class Company(db.Model):
    companyurl = db.LinkProperty(required=True)
    companyname = db.StringProperty(required=True)
    companydesc = db.TextProperty(required=True)
    companyaddress = db.PostalAddressProperty(required=False)
    companypostcode = db.StringProperty(required=False)
    companyemail = db.EmailProperty(required=True)
    companycountry = db.StringProperty(required=True)
The error message is below
Traceback (most recent call last):
File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/webapp/_webapp25.py", line 701, in __call__
handler.get(*groups)
File "/base/data/home/apps/XXX/1.358667163009710608/showcompanies.py", line 99, in get
for r in result:
File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 2312, in next
return self.__model_class.from_entity(self.__iterator.next())
File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 1441, in from_entity
return cls(None, _from_entity=entity, **entity_values)
File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 973, in __init__
prop.__set__(self, value)
File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 613, in __set__
value = self.validate(value)
File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 2815, in validate
value = super(StringProperty, self).validate(value)
File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 640, in validate
raise BadValueError('Property %s is required' % self.name)
BadValueError: Property companycountry is required
If you have the bulk process you wish to run in its own script, you can construct a modified version of your Company class without validation. Since db.Model classes are just wrappers to the datastore based on the name of the class, you can have different classes in different parts of your code with different behaviors.
So you might have a model.py file with:
class Company(db.Model):
    companyurl = db.LinkProperty(required=True)
    # ...
    companycountry = db.StringProperty(required=True)

# Normal operations go here
And, another bulk_process.py file with:
class Company(db.Model):
    companyurl = db.LinkProperty()
    # ...
    companycountry = db.StringProperty()

result = Company.all()
my_count = result.count()
if result:
    for r in result:
        self.response.out.write("hello")
Because this second model class lacks the validation, it should run just fine. And, because the code is logically separated you don't have to worry about unintentional side-effects from removing the validation in the rest of your code. Just make sure that your bulk process doesn't accidentally write back data without validation (unless you're OK with this).
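As for why it happens: the traceback shows that iterating the query calls from_entity, which constructs a Company and runs each property's validate, so an entity stored with an empty companycountry fails when it is loaded, not only when it is written. The following is a toy sketch of that mechanism in plain Python, not App Engine code. The StringProperty class, the load helper, and the sample entity are all hypothetical stand-ins, and it assumes that the bulk upload stored a missing or empty value for at least one record.

```python
class BadValueError(Exception):
    """Stand-in for the datastore's BadValueError."""


class StringProperty(object):
    """Toy property: models only the 'required' check seen in the traceback."""

    def __init__(self, required=False):
        self.required = required

    def validate(self, name, value):
        # Assumption: None and '' both count as "no value".
        if self.required and not value:
            raise BadValueError('Property %s is required' % name)
        return value


def load(schema, entity):
    """Validate every declared property while building the loaded record."""
    return dict((name, prop.validate(name, entity.get(name)))
                for name, prop in schema.items())


# Hypothetical stored entity whose country came through empty.
stored = {'companyname': 'Acme', 'companycountry': ''}

strict = {'companyname': StringProperty(required=True),
          'companycountry': StringProperty(required=True)}
relaxed = {'companyname': StringProperty(),
           'companycountry': StringProperty()}

try:
    load(strict, stored)
    strict_error = None
except BadValueError as exc:
    strict_error = str(exc)

relaxed_record = load(relaxed, stored)
print(strict_error)
print(relaxed_record)
```

The same stored data loads under the relaxed schema and fails under the strict one, which is why a second Company class without required=True can iterate over records that the original class rejects.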
I am making an API call to a large database that returns data in JSON format. Because the data set is so large, the database sends the JSON in separate batches, each containing a nextPageUrl that points to the next batch. I want to crawl through the batches, collect the URL of each batch in a list, and then loop over that list again to parse all the JSON data and populate my own SQLite database with the results. However, I get this error message:
Traceback (most recent call last):
File "Database_download_v2.py", line 52, in <module>
if (len(json_dict['nextPageUrl']) > 0):
KeyError: 'nextPageUrl'
The code I use is:
load_page = requests.get(form_response_tree, headers=headers).content
page_decode = load_page.decode()
json_dict = json.loads(page_decode)
url_subseq_page = json_dict['nextPageUrl']
url_list = list()
url_list.append(url_subseq_page)
for all_pages in url_list:
    load_page = requests.get(all_pages, headers=headers).content
    page_decode = load_page.decode()
    json_dict = json.loads(page_decode)
    if (len(json_dict['nextPageUrl']) > 0):
        url_subseq_page = json_dict['nextPageUrl']
        url_list.append(url_subseq_page)
    else:
        continue
Any idea what is wrong here?
It is because you don't have anything in the list so it will technically not work.
Solution
url_list = ["putstringsinthislist"]
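The traceback suggests another reading, though: the KeyError is raised inside the loop, so the list was not empty, and the likely cause is that the last batch simply has no nextPageUrl key. Here is a minimal sketch of following the links with dict.get, which returns None for a missing key instead of raising. The collect_page_urls function and the fake_api dictionary are hypothetical; fake_api stands in for the real requests.get(...) call so that the example runs without a network.

```python
def collect_page_urls(first_url, fetch_json):
    """Follow nextPageUrl links until a batch has none; return every URL visited."""
    urls = [first_url]
    current = first_url
    while True:
        batch = fetch_json(current)
        next_url = batch.get('nextPageUrl')  # None when the key is absent
        if not next_url:
            break  # last batch reached
        urls.append(next_url)
        current = next_url
    return urls


# Hypothetical stand-in for the remote API: the last page has no nextPageUrl.
fake_api = {
    'page1': {'items': [1], 'nextPageUrl': 'page2'},
    'page2': {'items': [2], 'nextPageUrl': 'page3'},
    'page3': {'items': [3]},
}

urls = collect_page_urls('page1', fake_api.__getitem__)
print(urls)
```

With the real API, fetch_json could be a small function that calls requests.get(url, headers=headers) and decodes the JSON body, as the original code does.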
code:
query = Order.update(is_review=1, review_time=review_time).where(Order.order_sn == order_sn)
query.execute()
exception:
cursor = database.execute(self)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\peewee.py", line 2952, in execute
sql, params = ctx.sql(query).query()
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\peewee.py", line 601, in sql
return obj.__sql__(self)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\peewee.py", line 2363, in __sql__
for k, v in sorted(self._update.items(), key=ctx.column_sort_key):
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\peewee.py", line 555, in column_sort_key
return item[0].get_sort_key(self)
AttributeError: 'UnknownField' object has no attribute 'get_sort_key'
environment:
peewee version:3.9.5
python:3.7
I notice that you have posted this as a GitHub issue on the Peewee issue tracker as well. I've responded to your ticket and will paste my response here for anyone else:
You are clearly using models generated using the pwiz tool. If pwiz cannot determine the right field type to use for a column, it uses a placeholder "UnknownField". You will need to edit your model definitions and replace the UnknownField with an appropriate field type (e.g. TextField, IntegerField, whatever) if you wish to be able to use it in queries.
The Configuration model below allows for uploading of a .ZIP and ties the extracted network device .TXT files within to the location where the device configurations came from.
class Configuration(models.Model):
    configfile = models.FileField('Configuration File Upload', upload_to='somecompany/configs/', help_text='Select a .ZIP file which contains .TXT file configuration dumps from devices which belong to a single location.')
    location_name = models.ForeignKey('Location', help_text='Associate the .ZIP file selected above to the location from which the device .TXT file configuration dumps were taken.')
I extended the default SAVE model method for the class to allow for processing of the .ZIP (code not shown for brevity). I've parsed the extracted .TXT files, collected all of the desired information into variables, and I'm trying to insert that information into my database but it's failing. Specifically, below I show an example of all of the values collected from a single one of the extracted .TXT files (modified slightly for privacy) and my attempt at DB insertion:
dbadd_ln = 'Red Rock'
dbadd_dn = 'DEVICE4'
dbadd_manu = 'cisco'
dbadd_os = 'nxos'
dbadd_dt = '-'
dbadd_prot = '-'
dbadd_cred = '-'
dbadd_ser = 'ABCD1234'
dbadd_addr = '10.10.10.10'
dbadd_model = 'N7K-C7010'
dbadd_ram = '2048256000'
dbadd_flash = '1109663744/1853116416'
dbadd_image = 'n7000-s1-dk9.5.2.9.bin'
dbadd = Device(location_name=dbadd_ln, device_name=dbadd_dn, device_type=dbadd_dt, protocol=dbadd_prot, credential=dbadd_cred, serial=dbadd_ser, address=dbadd_addr, manufacturer=dbadd_manu, model=dbadd_model, ram=dbadd_ram, flash=dbadd_flash, os=dbadd_os, image=dbadd_image)
Traceback (most recent call last):
File "<console>", line 1, in <module>
File "c:\code-projects\MYVIRTUALENV\lib\site-packages\django\db\models\base.py", line 431, in __init__
setattr(self, field.name, rel_obj)
File "c:\code-projects\MYVIRTUALENV\lib\site-packages\django\db\models\fields\related_descriptors.py", line 207, in __set__
self.field.remote_field.model._meta.object_name,
ValueError: Cannot assign "'Red Rock'": "Device.location_name" must be a "Location" instance.
'Red Rock' is a legitimate Location entry which already exists in my database...
>>> Location.objects.filter(location_name='Red Rock')
[Location: Red Rock]
... so I guess I'm unclear on what this really means:
"Device.location_name" must be a "Location" instance.
Any assistance to help resolve this issue is appreciated. Thanks in advance.
Did some more searching and found this:
Cannot assign "u''": "Company.parent" must be a "Company" instance
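Now I see. The foreign key field has to be given a Location instance rather than the location's name as a string. When I fetch or create the instance first and pass that instead, I get past the particular error I was describing. Now on to solve the next problem. :)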
I'm currently learning Django from a video tutorial. I had difficulty figuring out which line causes the following error.
Validating models...
Unhandled exception in thread started by <bound method Command.inner_run of <django.contrib.staticfiles.management.commands.runserver.Command object at 0xb6e5722c>>
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/django/core/management/commands/runserver.py", line 88, in inner_run
self.validate(display_num_errors=True)
File "/usr/lib/python2.7/dist-packages/django/core/management/base.py", line 253, in validate
raise CommandError("One or more models did not validate:\n%s" % error_text)
django.core.management.base.CommandError: One or more models did not validate:
events.event: 'attendees' specifies an m2m relation through model Attendence, which has not been installed
I have put the source code on Bitbucket.
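In the models.ManyToManyField() of the Event class I had put the wrong name in through, which caused the problem: the error message refers to Attendence, while the class is actually named Attendance. After I corrected it to the right class name, the problem was solved.
After correcting the error, model.py looks like this: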
...
class Event(models.Model):  ##error
    #"""docstring for Event"""
    description = models.TextField()
    # Pass datetime.now without parentheses so it is called for each new object;
    # datetime.now() would be evaluated only once, when the class is defined.
    creation_date = models.DateTimeField(default=datetime.now)
    start_date = models.DateTimeField(null=True, blank=True)
    creator = models.ForeignKey(User, related_name='event_creator_set')
    attendees = models.ManyToManyField(User, through='Attendance')  # the error was caused by a wrong class name in through
    latest = models.BooleanField(default=True)
    objects = EventManager()

    def __unicode__(self):
        return self.description

    def save(self, **kwargs):
        Event.objects.today().filter(latest=True,
                                     creator=self.creator).update(latest=False)
        super(Event, self).save(**kwargs)

class Attendance(models.Model):  ##error
    user = models.ForeignKey(User)  #,related_name='attendance_user_set')
    event = models.ForeignKey(Event)  #,related_name='attendance_event_set') ###error
    registration_date = models.DateTimeField(default=datetime.now)

    def __unicode__(self):
        return "%s is attending %s" % (self.user.username, self.event)
...
Also, the question "Django syncdb error: One or more models did not validate" suggested adding related_name, but leaving it out does not cause any error, so I have not included it.
Following these instructions:
http://neogregious.blogspot.com/2011/04/migrating-app-to-high-replication.html
I have managed to migrate to the High Replication Datastore; however, I am now getting the following exception:
datastore_errors.BadArgumentError('ancestor argument should match app ("%r" != "%r")' %
(ancestor.app(), app))
The model data looks something like this:
class base_business(polymodel.PolyModel):
    created = db.DateTimeProperty(auto_now_add=True)

class business(base_business):
    some_data = db.StringProperty()
    # etc..

class business_image(db.Model):
    image = db.BlobProperty(default=None)
    mimetype = db.StringProperty()
    comment = db.StringProperty(required=False)

# the image is associated like so
image_item = business_image(parent = business_item, etc... )
image_item.put()
The new app name has not been assigned to the ancestor model data. At the moment the data is returned however the logs are being populated with this exception message.
The actual stack trace using logging.exception:
2011-11-03 16:45:40.211
======= get_business_image exception [ancestor argument should match app ("'oldappname'" != "'s~newappname'")] =======
Traceback (most recent call last):
File "/base/data/home/apps/s~newappname/3.354412961756003398/oldappname/entities/views.py", line 82, in get_business_image
business_img = business_image.gql("WHERE ANCESTOR IS :ref_business and is_primary = True", ref_business = db.Key(business_key)).get()
File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 2049, in get
results = self.fetch(1, config=config)
File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 2102, in fetch
raw = raw_query.Get(limit, offset, config=config)
File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore.py", line 1668, in Get
config=config, limit=limit, offset=offset, prefetch_size=limit))
File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore.py", line 1600, in GetBatcher
return self.GetQuery().run(_GetConnection(), query_options)
File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore.py", line 1507, in GetQuery
order=self.GetOrder())
File "/base/python_runtime/python_lib/versions/1/google/appengine/datastore/datastore_rpc.py", line 93, in positional_wrapper
return wrapped(*args, **kwds)
File "/base/python_runtime/python_lib/versions/1/google/appengine/datastore/datastore_query.py", line 1722, in __init__
ancestor=ancestor)
File "/base/python_runtime/python_lib/versions/1/google/appengine/datastore/datastore_rpc.py", line 93, in positional_wrapper
return wrapped(*args, **kwds)
File "/base/python_runtime/python_lib/versions/1/google/appengine/datastore/datastore_query.py", line 1561, in __init__
(ancestor.app(), app))
BadArgumentError: ancestor argument should match app ("'oldappname'" != "'s~newappname'")
Is there a way to manually set the app on the model data? Could I do something like this to resolve this?
if( ancestor.app() != app )
set_app('my_app')
put()
Before I do this or apply any other HACK is there something I should have done as part of the data migration?
This sort of error usually occurs because you're using fully qualified keys somewhere that have been stored in the datastore as strings (instead of ReferenceProperty), or outside the datastore, such as in URLs. You should be able to work around this by reconstructing any keys from external sources such that you ignore the App ID, something like this:
my_key = db.Key.from_path(*db.Key(my_key).to_path())