Query or Expression for excluding certain values from DAL selection - google-app-engine

I'm trying to exclude posts which have a tag named meta from my selection, by:
meta_id = db(db.tags.name == "meta").select().first().id
not_meta = ~db.posts.tags.contains(meta_id)
posts=db(db.posts).select(not_meta)
But those posts still show up in my selection.
What is the right way to write that expression?
My tables look like:
db.define_table('tags',
    db.Field('name', 'string'),
    db.Field('desc', 'text', default="")
)
db.define_table('posts',
    db.Field('title', 'string'),
    db.Field('message', 'text'),
    db.Field('tags', 'list:reference tags'),
    db.Field('time', 'datetime', default=datetime.utcnow())
)
I'm using Web2Py 1.99.7 on GAE with High Replication DataStore on Python 2.7.2
UPDATE:
I just tried posts=db(not_meta).select() as suggested by @Anthony, but it gives me a ticket with the following traceback:
Traceback (most recent call last):
File "E:\Programming\Python\web2py\gluon\restricted.py", line 205, in restricted
exec ccode in environment
File "E:/Programming/Python/web2py/applications/vote_up/controllers/default.py", line 391, in <module>
File "E:\Programming\Python\web2py\gluon\globals.py", line 173, in <lambda>
self._caller = lambda f: f()
File "E:/Programming/Python/web2py/applications/vote_up/controllers/default.py", line 8, in index
posts=db(not_meta).select()#orderby=settings.sel.posts, limitby=(0, settings.delta)
File "E:\Programming\Python\web2py\gluon\dal.py", line 7578, in select
return adapter.select(self.query,fields,attributes)
File "E:\Programming\Python\web2py\gluon\dal.py", line 3752, in select
(items, tablename, fields) = self.select_raw(query,fields,attributes)
File "E:\Programming\Python\web2py\gluon\dal.py", line 3709, in select_raw
filters = self.expand(query)
File "E:\Programming\Python\web2py\gluon\dal.py", line 3589, in expand
return expression.op(expression.first)
File "E:\Programming\Python\web2py\gluon\dal.py", line 3678, in NOT
raise SyntaxError, "Not suported %s" % first.op.__name__
SyntaxError: Not suported CONTAINS
UPDATE 2:
As ~ isn't currently working on GAE with Datastore, I'm using the following as a temporary work-around:
meta = db.posts.tags.contains(settings.meta_id)
all = db(db.posts).select()  #, limitby=(0, settings.delta)
meta = db(meta).select()
posts = []
i = 0
for post in all:
    if i == settings.delta: break
    if post in meta: continue
    else:
        posts.append(post)
        i += 1
# settings.delta is a long integer to be used with limitby
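A possibly tidier version of the same Python-side filtering would use the Rows.find method instead of the manual loop. This is only an untested sketch: it assumes Rows.find is available in this web2py version and that row.tags holds a list of tag ids (or None when the post has no tags):
all_posts = db(db.posts).select()
non_meta = all_posts.find(lambda row: settings.meta_id not in (row.tags or []))
posts = [row for row in non_meta][:settings.delta]  # emulate limitby in Python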

Try:
meta_id = db(db.tags.name == "meta").select().first().id
not_meta = ~db.posts.tags.contains(meta_id)
posts = db(not_meta).select()
First, your initial query returns a complete Row object, so you need to pull out just the "id" field. Second, not_meta is a Query object, so it goes inside db(not_meta) to create a Set object defining the set of records to select (the select() method takes a list of fields to return for each record, as well as a few other arguments, such as orderby, groupby, etc.).

Related

Field 'id' expected a number but got ' '. But I don't even have a field named 'id'

We are making a web app using Django, PostgreSQL, and ReactJS. I am creating two models and connecting them with a one-to-one relationship in Django. The views file is literally empty. This is the models.py file. I changed the primary key fields for each table to avoid the error, but it is not solving anything. I am new to Django. Please help.
from django.db import models

class userInfo(models.Model):
    Username = models.CharField(max_length=100, primary_key=True)
    Password = models.CharField(max_length=100)

    def __str__(self):
        return self.Username

class rentDetails(models.Model):
    user = models.OneToOneField(
        userInfo,
        on_delete=models.CASCADE,
        primary_key=True
    )
    floor_no = models.CharField(max_length=20, default=True)
    distance = models.IntegerField(default=True)
    location = models.CharField(max_length=200, default=True)
    area = models.IntegerField(default=True)
    no_of_rooms = models.IntegerField(default=True)
    price = models.IntegerField(default=True)
    property_choices = [
        ('hostel', 'Hostel'),
        ('house', 'House'),
        ('room', 'Room'),
        ('flat', 'flat')
    ]
    property_type = models.CharField(
        max_length=10,
        choices=property_choices,
    )
    images = models.ImageField(upload_to='uploads/', null=True)
Here's the traceback:
Traceback (most recent call last):
File "C:\Users\acer\Documents\CHAANO\Chano\manage.py", line 22, in <module>
main()
File "C:\Users\acer\Documents\CHAANO\Chano\manage.py", line 18, in main
execute_from_command_line(sys.argv)
" ".join(
File "C:\Users\acer\Documents\python 310\lib\site-packages\django\db\backends\base\schema.py", line 296, in _iter_column_sql
default_value = self.effective_default(field)
File "C:\Users\acer\Documents\python 310\lib\site-packages\django\db\backends\base\schema.py", line 410, in effective_default
return field.get_db_prep_save(self._effective_default(field), self.connection)
File "C:\Users\acer\Documents\python 310\lib\site-packages\django\db\models\fields\related.py", line 1126, in get_db_prep_save
return self.target_field.get_db_prep_save(value, connection=connection)
File "C:\Users\acer\Documents\python 310\lib\site-packages\django\db\models\fields\__init__.py", line 910, in get_db_prep_save
return self.get_db_prep_value(value, connection=connection, prepared=False)
File "C:\Users\acer\Documents\python 310\lib\site-packages\django\db\models\fields\__init__.py", line 2668, in get_db_prep_value
value = self.get_prep_value(value)
File "C:\Users\acer\Documents\python 310\lib\site-packages\django\db\models\fields\__init__.py", line 1990, in get_prep_value
raise e.__class__(
ValueError: Field 'id' expected a number but got ' '.

How can I create and shuffle a dataset for triplet mining in TensorFlow 2?

I'm working on a network using triplet mining for training. In order to make it work properly, I need my batches to contain several images of the same class. The problem I'm currently facing is that I have 751 classes, for a total of 12,937 pictures, and a batch size of 48 pictures. When shuffling the dataset using the command below, the odds of getting pictures from the same class are really low, making the triplet mining inefficient.
dataset = dataset.shuffle(12937)
What I would need instead is a way of generating batches that contain a specific number of pictures for every class represented in the batch. As an example, let's say I want 12 classes per batch; there would then be 4 pictures for each of them.
Another problem I'm facing is how I would reshuffle this dataset at the end of every epoch so that I still get different batches that satisfy the condition above: 12 classes, 4 pictures for each of them.
Is there any proper way to do it? I can't really find one. Please let me know if I'm unclear, and if you need further details.
================ EDIT ================
I've been trying a few things, and came up with something that would do what I want. The function would be the following:
counter = 0.
# Assuming a format such as (data, label)
def predicate(data, label):
    global counter
    allowed_labels = tf.constant([counter])
    isallowed = tf.equal(allowed_labels, tf.cast(label, tf.float32))
    reduced = tf.reduce_sum(tf.cast(isallowed, tf.float32))
    counter += 1
    return tf.greater(reduced, tf.constant(0.))
#@tf.function
def custom_shuffle(train_dataset, batch_size, samples_per_class=4, iterations_in_epoch=100, database='market'):
    assert batch_size % samples_per_class == 0, F'batch size must be a {samples_per_class} multiple.'
    if database == 'market':
        class_nbr = 751
    else:
        raise Exception('Unsuported database yet')
    all_datasets = [train_dataset.filter(predicate) for _ in range(class_nbr)]  # Every element of this array is a dataset of one class
    for i in range(iterations_in_epoch):
        choice = tf.random.uniform(
            shape=(batch_size // samples_per_class,),
            minval=0,
            maxval=class_nbr,
            dtype=tf.dtypes.int64,
        )  # Which classes will be in batch
        choice = tf.data.Dataset.from_tensor_slices(tf.concat([choice for _ in range(4)], axis=0))  # Exactly 4 pictures from each class in the batch
        batch = tf.data.experimental.choose_from_datasets(all_datasets, choice)
        if i == 0:
            all_batches = batch
        else:
            all_batches = all_batches.concatenate(batch)
    all_batches = all_batches.batch(batch_size)
    return all_batches
It does what I want, but the returned dataset is extremely slow to iterate, making model learning impossible. As per this thread, I understood that I needed to decorate custom_shuffle with @tf.function, as in the line commented out above. However, when doing so, it raises the following error:
Traceback (most recent call last):
File "training.py", line 137, in <module>
main()
File "training.py", line 80, in main
train_dataset = get_dataset(TRAINING_FILENAMES, IMG_SIZE, BATCH_SIZE, database=database, func_type='train')
File "E:\Morgan\TransReID_TF\tfr_to_dataset.py", line 260, in get_dataset
dataset = custom_shuffle(dataset, batch_size)
File "D:\Programs\Anaconda3\envs\AlignedReID_TF\lib\site-packages\tensorflow\python\eager\def_function.py", line 780, in __call__
result = self._call(*args, **kwds)
File "D:\Programs\Anaconda3\envs\AlignedReID_TF\lib\site-packages\tensorflow\python\eager\def_function.py", line 846, in _call
return self._concrete_stateful_fn._filtered_call(canon_args, canon_kwds) # pylint: disable=protected-access
File "D:\Programs\Anaconda3\envs\AlignedReID_TF\lib\site-packages\tensorflow\python\eager\function.py", line 1843, in _filtered_call
return self._call_flat(
File "D:\Programs\Anaconda3\envs\AlignedReID_TF\lib\site-packages\tensorflow\python\eager\function.py", line 1923, in _call_flat
return self._build_call_outputs(self._inference_function.call(
File "D:\Programs\Anaconda3\envs\AlignedReID_TF\lib\site-packages\tensorflow\python\eager\function.py", line 545, in call
outputs = execute.execute(
File "D:\Programs\Anaconda3\envs\AlignedReID_TF\lib\site-packages\tensorflow\python\eager\execute.py", line 59, in quick_execute
tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InternalError: No unary variant device copy function found for direction: 1 and Variant type_index: class tensorflow::data::`anonymous namespace'::DatasetVariantWrapper
[[{{node BatchDatasetV2/_206}}]] [Op:__inference_custom_shuffle_11485]
Function call stack:
custom_shuffle
Which I don't understand, and don't see how to fix.
Is there something I'm doing wrong?
PS: I'm aware the lack of minimal code to reproduce this behavior makes it hard to debug, I'll try to provide some as soon as possible.
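For reference, one alternative way to build such P-classes-times-K-samples batches, without filtering the dataset once per class, is a plain Python generator fed into tf.data.Dataset.from_generator. This is only a rough sketch under a few assumptions: integer labels, images indexable by label up front, TF 2.4+ for output_signature, and a hypothetical load_image helper; paths_by_label and load_image are placeholders, not names from the code above:
import random
import tensorflow as tf

# paths_by_label: hypothetical dict mapping label -> list of image paths
def pk_generator(paths_by_label, p=12, k=4):
    # Yield (path, label) pairs so that every run of p*k consecutive items
    # contains exactly p classes with k samples each.
    labels = list(paths_by_label.keys())
    while True:
        for label in random.sample(labels, p):                       # p distinct classes
            for path in random.choices(paths_by_label[label], k=k):  # k samples per class
                yield path, label

dataset = tf.data.Dataset.from_generator(
    lambda: pk_generator(paths_by_label),
    output_signature=(tf.TensorSpec((), tf.string), tf.TensorSpec((), tf.int32)),
).map(lambda path, label: (load_image(path), label)).batch(12 * 4)
Because the generator samples classes anew on every pass, each epoch naturally sees different batches while still satisfying the 12-classes, 4-pictures constraint.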

How to send only 1 variable from a set of 3 in TinyDB

[DISCORD.PY and TINYDB]
I have set up a warning system for Discord. If someone gets warned, the data is saved like so:
{'userid': 264452325945376768, 'warnid': 37996302, 'reason': "some reason"}
The problem: I want the command "!warnings" to display these warnings, but I don't want the ugly raw JSON formatting; instead I want it to display like so:
{member} has {amount} warnings.
WARN-ID: {warning id here}
REASON: {reason here}
To do this, I need to somehow access only one field at a time instead of getting all 3 (as in the JSON above) at once.
My code is as follows:
@Ralf.command()
async def warnings(ctx, *, member: discord.Member = None):
    if member is None:
        member = Ralf.get_user(ctx.author.id)
        member_id = ctx.author.id
    else:
        member = Ralf.get_user(member.id)
        member_id = member.id
    WarnList = Query()
    Result = warndb.search(WarnList.userid == member_id)
    warnAmt = len(Result)
    if warnAmt == 1:
        await ctx.send(f"**{member}** has `{warnAmt}` warning.")
    else:
        await ctx.send(f"**{member}** has `{warnAmt}` warnings.")
    for item in Result:
        await ctx.send(item)
This code is working, but it shows the ugly {'userid': 264452325945376768, 'warnid': 37996302, 'reason': "some reason"} as output.
MAIN QUESTION: How do I call only userid without having warnid and reason displayed?
EDIT 1:
Trying to index Result like a dict (e.g. Result['warnid']) gives me the following:
Ignoring exception in command warnings:
Traceback (most recent call last):
File "C:\Users\entity2k3\AppData\Local\Programs\Python\Python39\lib\site-packages\discord\ext\commands\core.py", line 85, in wrapped
ret = await coro(*args, **kwargs)
File "c:\Users\entity2k3\Desktop\Discord Bots All\Entity2k3's Moderation\main.py", line 201, in warnings
await ctx.send(f"WARN-ID: `{Result['warnid']}` REASON: {Result['reason']}")
TypeError: list indices must be integers or slices, not str
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Users\entity2k3\AppData\Local\Programs\Python\Python39\lib\site-packages\discord\ext\commands\bot.py", line 903, in invoke
await ctx.command.invoke(ctx)
File "C:\Users\entity2k3\AppData\Local\Programs\Python\Python39\lib\site-packages\discord\ext\commands\core.py", line 859, in invoke
await injected(*ctx.args, **ctx.kwargs)
File "C:\Users\entity2k3\AppData\Local\Programs\Python\Python39\lib\site-packages\discord\ext\commands\core.py", line 94, in wrapped
raise CommandInvokeError(exc) from exc
discord.ext.commands.errors.CommandInvokeError: Command raised an exception: TypeError: list indices must be integers or slices, not str
You are getting the TypeError because your Result is a list of dictionaries.
Make sure to iterate through Result and process each dictionary separately.
Your Result object is like this [{}, {}, {} ...]
Also, you shouldn't capitalize the first letter of your variable; name it results, since it may contain more than one result.
for item in results:
    user_id = item.get("userid")
    warn_id = item.get("warnid")
    reason = item.get("reason")
    # do stuff with those
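Putting it together with the question's code, the loop can build the message text instead of sending raw dicts (a sketch reusing the names member, warnAmt and Result from the question, placed inside the warnings command):
lines = [f"**{member}** has `{warnAmt}` warning{'' if warnAmt == 1 else 's'}."]
for item in Result:
    lines.append(f"WARN-ID: `{item.get('warnid')}` REASON: {item.get('reason')}")
await ctx.send("\n".join(lines))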

Datastore error: BadValueError: Expected integer, got [0, 1, 2, 3]

Others have reported a similar error, but the solutions given do not solve my problem.
For example there is a good answer here. The answer in the link mentions how ndb changes from a first use to a later use and suggests there is a problem because a first run produces a None in the Datastore. I cannot reproduce or see that happening in the Datastore for my sdk, but that may be because I am running it here from the interactive console.
I am pretty sure I got an initial good run with the GAE interactive console, but every run since then has failed with the error in the title of this question.
I have left the print statements in the following code because they show good results and assure me that the error is occurring in the put() at the very end.
from google.appengine.ext import ndb

class Account(ndb.Model):
    week = ndb.IntegerProperty(repeated=True)
    weeksNS = ndb.IntegerProperty(repeated=True)
    weeksEW = ndb.IntegerProperty(repeated=True)

terry = Account(week=[], weeksNS=[], weeksEW=[])
terry_key = terry.put()
terry = terry_key.get()
print terry
for t in list(range(4)):  # just dummy input, but like real input
    terry.week.append(t)
print terry.week
region = 1  # same error message for region = 0
if region:
    terry.weeksEW.append(terry.week)
else:
    terry.weeksNS.append(terry.week)
print 'EW' + str(terry.weeksEW)
print 'NS' + str(terry.weeksNS)
terry.week = []
print 'week' + str(terry.week)
terry.put()
The idea of my code is to first build up the terry.week list values incrementally and then later store the whole list to the appropriate region, either NS or EW. So I'm looking for a workaround for this scheme.
The error message is likely of no value but I am reproducing it here.
Traceback (most recent call last):
File "/Users/brian/google-cloud-sdk/platform/google_appengine/google/appengine/tools/devappserver2/python/runtime/request_handler.py", line 237, in handle_interactive_request
exec(compiled_code, self._command_globals)
File "<string>", line 55, in <module>
File "/Users/brian/google-cloud-sdk/platform/google_appengine/google/appengine/ext/ndb/model.py", line 3458, in _put
return self._put_async(**ctx_options).get_result()
File "/Users/brian/google-cloud-sdk/platform/google_appengine/google/appengine/ext/ndb/tasklets.py", line 383, in get_result
self.check_success()
File "/Users/brian/google-cloud-sdk/platform/google_appengine/google/appengine/ext/ndb/tasklets.py", line 427, in _help_tasklet_along
value = gen.throw(exc.__class__, exc, tb)
File "/Users/brian/google-cloud-sdk/platform/google_appengine/google/appengine/ext/ndb/context.py", line 824, in put
key = yield self._put_batcher.add(entity, options)
File "/Users/brian/google-cloud-sdk/platform/google_appengine/google/appengine/ext/ndb/tasklets.py", line 430, in _help_tasklet_along
value = gen.send(val)
File "/Users/brian/google-cloud-sdk/platform/google_appengine/google/appengine/ext/ndb/context.py", line 358, in _put_tasklet
keys = yield self._conn.async_put(options, datastore_entities)
File "/Users/brian/google-cloud-sdk/platform/google_appengine/google/appengine/datastore/datastore_rpc.py", line 1858, in async_put
pbs = [entity_to_pb(entity) for entity in entities]
File "/Users/brian/google-cloud-sdk/platform/google_appengine/google/appengine/ext/ndb/model.py", line 697, in entity_to_pb
pb = ent._to_pb()
File "/Users/brian/google-cloud-sdk/platform/google_appengine/google/appengine/ext/ndb/model.py", line 3167, in _to_pb
prop._serialize(self, pb, projection=self._projection)
File "/Users/brian/google-cloud-sdk/platform/google_appengine/google/appengine/ext/ndb/model.py", line 1422, in _serialize
values = self._get_base_value_unwrapped_as_list(entity)
File "/Users/brian/google-cloud-sdk/platform/google_appengine/google/appengine/ext/ndb/model.py", line 1192, in _get_base_value_unwrapped_as_list
wrapped = self._get_base_value(entity)
File "/Users/brian/google-cloud-sdk/platform/google_appengine/google/appengine/ext/ndb/model.py", line 1180, in _get_base_value
return self._apply_to_values(entity, self._opt_call_to_base_type)
File "/Users/brian/google-cloud-sdk/platform/google_appengine/google/appengine/ext/ndb/model.py", line 1352, in _apply_to_values
value[:] = map(function, value)
File "/Users/brian/google-cloud-sdk/platform/google_appengine/google/appengine/ext/ndb/model.py", line 1234, in _opt_call_to_base_type
value = _BaseValue(self._call_to_base_type(value))
File "/Users/brian/google-cloud-sdk/platform/google_appengine/google/appengine/ext/ndb/model.py", line 1255, in _call_to_base_type
return call(value)
File "/Users/brian/google-cloud-sdk/platform/google_appengine/google/appengine/ext/ndb/model.py", line 1331, in call
newvalue = method(self, value)
File "/Users/brian/google-cloud-sdk/platform/google_appengine/google/appengine/ext/ndb/model.py", line 1781, in _validate
(value,))
BadValueError: Expected integer, got [0, 1, 2, 3]
I believe the error comes from these lines:
terry.weeksEW.append(terry.week)
terry.weeksNS.append(terry.week)
You are not appending another integer; you are appending a list where an integer is expected.
>>> aaa = [1,2,3]
>>> bbb = [4,5,6]
>>> aaa.append(bbb)
>>> aaa
[1, 2, 3, [4, 5, 6]]
>>>
This fails the ndb.IntegerProperty test.
Try:
terry.weeksEW += terry.week
terry.weeksNS += terry.week
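For contrast, += extends the existing list element by element rather than nesting it (same dummy lists as above):
>>> aaa = [1, 2, 3]
>>> bbb = [4, 5, 6]
>>> aaa += bbb
>>> aaa
[1, 2, 3, 4, 5, 6]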
EDIT: To save a list of lists, do not use the IntegerProperty(), but instead the JsonProperty(). Better still, the ndb datastore is deprecated, so... I recommend Firestore, which uses JSON objects by default. At least use Cloud Datastore, or Cloud NDB.
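If the goal really is to keep a list of week-lists per region, a JsonProperty version might look like this (an untested sketch that keeps the naming from the question):
from google.appengine.ext import ndb

class AccountJson(ndb.Model):
    weeksNS = ndb.JsonProperty()  # each element is itself a list of week values
    weeksEW = ndb.JsonProperty()

acct = AccountJson(weeksNS=[], weeksEW=[])
acct.weeksEW.append([0, 1, 2, 3])  # the whole week list is stored as one element
acct.put()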

In Google App Engine, how to check input validity of Key created by urlsafe?

Suppose I create a key from a user-supplied urlsafe string:
key = ndb.Key(urlsafe=some_user_input)
How can I check if the some_user_input is valid?
My current experiment shows that the statement above will throw a ProtocolBufferDecodeError (Unable to merge from string.) exception if some_user_input is invalid, but I could not find anything about this in the API docs. Could someone kindly confirm this, and point me to a better way to check user input validity instead of catching the exception?
Thanks a lot!
If you try to construct a Key with an invalid urlsafe parameter
key = ndb.Key(urlsafe='bogus123')
you will get an error like
Traceback (most recent call last):
File "/opt/google/google_appengine/google/appengine/runtime/wsgi.py", line 240, in Handle
handler = _config_handle.add_wsgi_middleware(self._LoadHandler())
File "/opt/google/google_appengine/google/appengine/runtime/wsgi.py", line 299, in _LoadHandler
handler, path, err = LoadObject(self._handler)
File "/opt/google/google_appengine/google/appengine/runtime/wsgi.py", line 85, in LoadObject
obj = __import__(path[0])
File "/home/tim/git/project/main.py", line 10, in <module>
from src.tim import handlers as handlers_
File "/home/tim/git/project/src/tim/handlers.py", line 42, in <module>
class ResetHandler(BaseHandler):
File "/home/tim/git/project/src/tim/handlers.py", line 47, in ResetHandler
key = ndb.Key(urlsafe='bogus123')
File "/opt/google/google_appengine/google/appengine/ext/ndb/key.py", line 212, in __new__
self.__reference = _ConstructReference(cls, **kwargs)
File "/opt/google/google_appengine/google/appengine/ext/ndb/utils.py", line 142, in positional_wrapper
return wrapped(*args, **kwds)
File "/opt/google/google_appengine/google/appengine/ext/ndb/key.py", line 642, in _ConstructReference
reference = _ReferenceFromSerialized(serialized)
File "/opt/google/google_appengine/google/appengine/ext/ndb/key.py", line 773, in _ReferenceFromSerialized
return entity_pb.Reference(serialized)
File "/opt/google/google_appengine/google/appengine/datastore/entity_pb.py", line 1710, in __init__
if contents is not None: self.MergeFromString(contents)
File "/opt/google/google_appengine/google/net/proto/ProtocolBuffer.py", line 152, in MergeFromString
self.MergePartialFromString(s)
File "/opt/google/google_appengine/google/net/proto/ProtocolBuffer.py", line 168, in MergePartialFromString
self.TryMerge(d)
File "/opt/google/google_appengine/google/appengine/datastore/entity_pb.py", line 1839, in TryMerge
d.skipData(tt)
File "/opt/google/google_appengine/google/net/proto/ProtocolBuffer.py", line 677, in skipData
raise ProtocolBufferDecodeError, "corrupted"
ProtocolBufferDecodeError: corrupted
Interesting here is
File "/opt/google/google_appengine/google/appengine/ext/ndb/key.py", line 773, in _ReferenceFromSerialized
return entity_pb.Reference(serialized)
which is the last code executed in the key.py module:
def _ReferenceFromSerialized(serialized):
    """Construct a Reference from a serialized Reference."""
    if not isinstance(serialized, basestring):
        raise TypeError('serialized must be a string; received %r' % serialized)
    elif isinstance(serialized, unicode):
        serialized = serialized.encode('utf8')
    return entity_pb.Reference(serialized)
serialized here being the decoded urlsafe string, you can read more about it in the link to the source code.
Another interesting one is the last one:
File "/opt/google/google_appengine/google/appengine/datastore/entity_pb.py", line 1839, in TryMerge
in the entity_pb.py module which looks like this
def TryMerge(self, d):
    while d.avail() > 0:
        tt = d.getVarInt32()
        if tt == 106:
            self.set_app(d.getPrefixedString())
            continue
        if tt == 114:
            length = d.getVarInt32()
            tmp = ProtocolBuffer.Decoder(d.buffer(), d.pos(), d.pos() + length)
            d.skip(length)
            self.mutable_path().TryMerge(tmp)
            continue
        if tt == 162:
            self.set_name_space(d.getPrefixedString())
            continue
        if (tt == 0): raise ProtocolBuffer.ProtocolBufferDecodeError
        d.skipData(tt)
which is where the actual attempt to 'merge the input into a Key' is made.
You can see in the source code that during the process of constructing a Key from an urlsafe parameter, not a whole lot can go wrong. First it checks whether the input is a string; if it's not, a TypeError is raised. If it is a string but isn't 'valid', a ProtocolBufferDecodeError is indeed raised.
My current experiment shows that the statement above will throw a ProtocolBufferDecodeError (Unable to merge from string.) exception if some_user_input is invalid, but I could not find anything about this in the API docs. Could someone kindly confirm this
Sort of confirmed - we now know that a TypeError can also be raised.
and point me some better way for user input validity checking instead of catching the exception?
This is an excellent way to check validity! Why do the checks yourself if they are already done by App Engine? A code snippet could look like this (not working code, just an example):
def get(self):
    # first, fetch the user_input from somewhere
    try:
        key = ndb.Key(urlsafe=user_input)
    except TypeError:
        return 'Sorry, only string is allowed as urlsafe input'
    except ProtocolBufferDecodeError:
        return 'Sorry, the urlsafe string seems to be invalid'
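A slightly more self-contained variant could import the exception explicitly so it can be caught by name (a sketch for the classic Python 2.7 runtime; the import path for ProtocolBufferDecodeError is taken from the traceback above):
from google.appengine.ext import ndb
from google.net.proto.ProtocolBuffer import ProtocolBufferDecodeError

def key_from_urlsafe(user_input):
    try:
        return ndb.Key(urlsafe=user_input)
    except TypeError:
        return None  # input was not a string at all
    except ProtocolBufferDecodeError:
        return None  # input was a string, but not a valid key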
