How can I ensure that the = operator is always rendered case-insensitively? Are comparisons with the LOWER or UPPER functions the best bet for performance? ILIKE seems to be very slow.
If you only need case-insensitivity, use upper or lower, since LIKE does more than case-insensitive matching.
Example using lower:
my_string = 'BarFoo'
session.query(Foo).filter(func.lower(Foo.bar) == my_string.lower()).all()
See some more info on LIKE here: how to execute LIKE query in sqlalchemy?
For case-insensitive comparisons, you can subclass Comparator.
Building Custom Comparators
The example class below allows case-insensitive comparisons on the attribute named word_insensitive:
from sqlalchemy.ext.hybrid import Comparator, hybrid_property
from sqlalchemy import func, Column, Integer, String
from sqlalchemy.orm import Session
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class CaseInsensitiveComparator(Comparator):
    def __eq__(self, other):
        return func.lower(self.__clause_element__()) == func.lower(other)

class SearchWord(Base):
    __tablename__ = 'searchword'
    id = Column(Integer, primary_key=True)
    word = Column(String(255), nullable=False)

    @hybrid_property
    def word_insensitive(self):
        return self.word.lower()

    @word_insensitive.comparator
    def word_insensitive(cls):
        return CaseInsensitiveComparator(cls.word)
Above, SQL expressions against word_insensitive will apply the LOWER() SQL function to both sides:
>>> print(Session().query(SearchWord).filter_by(word_insensitive="Trucks"))
SELECT searchword.id AS searchword_id, searchword.word AS searchword_word FROM searchword
WHERE lower(searchword.word) = lower(:lower_1)
The CaseInsensitiveComparator above implements part of the ColumnOperators interface. A "coercion" operation like lowercasing can be applied to all comparison operations (i.e. __eq__, __lt__, __gt__, etc.) at once using Operators.operate():
class CaseInsensitiveComparator(Comparator):
    def operate(self, op, other):
        return op(func.lower(self.__clause_element__()), func.lower(other))
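Because operate() intercepts every comparison operator, the lowercasing is no longer limited to equality. A minimal sketch reusing the SearchWord model above (the exact rendered SQL may vary by dialect):
>>> print(Session().query(SearchWord).filter(SearchWord.word_insensitive > "apples"))
SELECT searchword.id AS searchword_id, searchword.word AS searchword_word FROM searchword
WHERE lower(searchword.word) > lower(:lower_1)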
How can I add a calculated field with an if-else condition in a Django model?
For instance,
def _get_total(self):
    "Returns the Overall Backlog"
    return self.Forecast - self.Actual_t

Overall_Backlog = property(_get_total)
Here, I want to create Overall_Backlog with a condition: if (self.Forecast - self.Actual_t) < 0 then 0, else (self.Forecast - self.Actual_t).
Please help!
Exactly like that:
@property
def Overall_Backlog(self):
    "Returns the Overall Backlog"
    val = self.Forecast - self.Actual_t
    if val < 0:
        return 0
    return val
A disadvantage of using a property, however, is that the database knows nothing about properties and methods, and thus cannot filter on them. An alternative to using properties is using .annotate(…) [Django-doc]:
from django.db.models import F, Value
from django.db.models.functions import Greatest

MyModel.objects.annotate(
    overall_backlog=Greatest(F('Forecast') - F('Actual_t'), Value(0))
)
The MyModel objects that arise from this QuerySet will have an extra attribute .overall_backlog that is the maximum of Forecast - Actual_t and 0.
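Because the annotation is computed in the database, you can also filter and order on it, which a property cannot do. A hypothetical example, assuming the same MyModel:
from django.db.models import F, Value
from django.db.models.functions import Greatest

# hypothetical: keep only rows with a positive backlog, computed in SQL
backlogged = MyModel.objects.annotate(
    overall_backlog=Greatest(F('Forecast') - F('Actual_t'), Value(0))
).filter(overall_backlog__gt=0).order_by('-overall_backlog')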
I need to run some experiments on custom datasets using PyTorch. The question is: how can I create a dataset using torch.utils.data.DataLoader?
I have two lists: one called values, which has a datapoint tensor at every entry, and one called labels, which has the corresponding label. What I did is the following:
for i in range(samples):
    dataset[i] = [values[i], labels[i]]
So I have a list with each datapoint and its respective label, and then I tried the following:
dataset = torch.tensor(dataset).float()
dataset = torch.utils.data.TensorDataset(dataset)
data_loader = torch.utils.data.DataLoader(dataset=dataset, batch_size=100, shuffle=True, num_workers=4, pin_memory=True)
But, first of all, I get the error "Not a sequence" in the torch.tensor command, and second, I'm not sure this is the right way of creating one. Any suggestion?
Thank you very much!
You do not need to subclass DataLoader; rather, create a Dataset for your data.
For instance,
from torch.utils.data import Dataset

class MyDataset(Dataset):
    def __init__(self, values, labels):
        super(MyDataset, self).__init__()
        self.values = values
        self.labels = labels

    def __len__(self):
        return len(self.values)  # number of samples in the dataset

    def __getitem__(self, index):
        return self.values[index], self.labels[index]
Just to enrich the answer by @Shai:
import numpy as np
from torch.utils.data import Dataset

class MyDataset(Dataset):
    def __init__(self, values):
        super(MyDataset, self).__init__()
        self.values = values

    def __len__(self):
        return len(self.values)

    def __getitem__(self, index):
        return self.values[index]

values = np.random.rand(51000, 3)
dataset = MyDataset(values)
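Either dataset can then be wrapped in a DataLoader, which handles batching and shuffling; a minimal sketch using the dataset built just above (default collation stacks the numpy rows into batched tensors):
from torch.utils.data import DataLoader

loader = DataLoader(dataset, batch_size=100, shuffle=True)
for batch in loader:
    pass  # each batch is a float64 tensor of shape (100, 3)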
I have a sqlite table defined as:
class HourlyUserWebsite(Base):
    __tablename__ = 'hourly_user_website'
    id = Column(Integer, primary_key=True)
    user = Column(String(600), index=True)
    domain = Column(String(600))
    time_secs = Column(Integer, index=True)

    def __repr__(self):
        return "HourlyUserWebsite(user='%s', domain='%s', time_secs=%d)" % \
            (self.user, self.domain, self.time_secs)
and I add elements to it with a class method as:
def add_elements_to_hourly_db(self, data, start_secs, end_secs, engine):
    session = self._get_session(engine)
    for el in data:
        session.add(el)
    session.commit()
    return
As the data is a time series, I expect to always add elements with increasing or equal time_secs values (never decreasing).
I get the data from the table with a query like:
session.query(HourlyUserWebsite)
I'd like to have the results from the query sorted by time_secs and by user.
Is there any way I can do it? Can the data be stored in such a way that queries for sorted data are optimised, keeping in mind that it is a time series?
session.query(HourlyUserWebsite).order_by(HourlyUserWebsite.user, HourlyUserWebsite.time_secs.desc()).all()
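If the sorted query turns out to be slow, a composite index covering both ORDER BY columns may let the database return the rows already in order. A hedged sketch (the index name is made up):
from sqlalchemy import Index

# hypothetical composite index matching the (user, time_secs) ordering
Index('ix_user_time_secs', HourlyUserWebsite.user, HourlyUserWebsite.time_secs)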
I'm struggling with Slick's lifted embedding and mapped tables. The API feels strange to me, maybe just because it is structured in a way that's unfamiliar to me.
I want to build a Task/Todo-List. There are two entities:
Task: Each task has an optional reference to the next task; that way a linked list is built. The intention is that the user can order the tasks by priority. This order is represented by the references from task to task.
TaskList: Represents a TaskList with a label and a reference to the first Task of the list.
case class Task(id: Option[Long], title: String, nextTask: Option[Task])
case class TaskList(label: String, firstTask: Option[Task])
Now I tried to write a data access object (DAO) for these two entities.
import scala.slick.driver.H2Driver.simple._
import slick.lifted.MappedTypeMapper

implicit val session: Session = Database.threadLocalSession

val queryById = Tasks.createFinderBy(t => t.id)

def task(id: Long): Option[Task] = queryById(id).firstOption

private object Tasks extends Table[Task]("TASKS") {
  def id = column[Long]("ID", O.PrimaryKey, O.AutoInc)
  def title = column[String]("TITLE")
  def nextTaskId = column[Option[Long]]("NEXT_TASK_ID")
  def nextTask = foreignKey("NEXT_TASK_FK", nextTaskId, Tasks)(_.id)
  def * = id ~ title ~ nextTask <> (Task, Task.unapply _)
}

private object TaskLists extends Table[TaskList]("TASKLISTS") {
  def label = column[String]("LABEL", O.PrimaryKey)
  def firstTaskId = column[Option[Long]]("FIRST_TASK_ID")
  def firstTask = foreignKey("FIRST_TASK_FK", firstTaskId, Tasks)(_.id)
  def * = label ~ firstTask <> (Task, Task.unapply _)
}
Unfortunately it does not compile. The problems are in the * projection of both tables, at nextTask and firstTask respectively:
could not find implicit value for evidence parameter of type
scala.slick.lifted.TypeMapper[scala.slick.lifted.ForeignKeyQuery[SlickTaskRepository.this.Tasks.type,justf0rfun.bookmark.model.Task]]
I tried to solve that with the following TypeMapper, but that does not compile either.
implicit val taskMapper = MappedTypeMapper.base[Option[Long], Option[Task]](
  option => option match {
    case Some(id) => task(id)
    case _ => None
  },
  option => option match {
    case Some(task) => task.id
    case _ => None
  })
could not find implicit value for parameter tm: scala.slick.lifted.TypeMapper[Option[justf0rfun.bookmark.model.Task]]
not enough arguments for method base: (implicit tm: scala.slick.lifted.TypeMapper[Option[justf0rfun.bookmark.model.Task]])scala.slick.lifted.BaseTypeMapper[Option[Long]]. Unspecified value parameter tm.
Main question: How do I use Slick's lifted embedding and mapped tables the right way? How do I get this to work?
Thanks in advance.
The short answer is: Use ids instead of object references and use Slick queries to dereference ids. You can put the queries into methods for re-use.
That would make your case classes look like this:
case class Task(id: Option[Long], title: String, nextTaskId: Option[Long])
case class TaskList(label: String, firstTaskId: Option[Long])
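Dereferencing an id is then just another query; a minimal sketch in the spirit of the queryById / task(id) helpers already defined above (the method names are made up):
// hypothetical helper methods that follow an id to the referenced row
def nextTaskOf(t: Task): Option[Task] =
  t.nextTaskId.flatMap(id => task(id))

def firstTaskOf(list: TaskList): Option[Task] =
  list.firstTaskId.flatMap(id => task(id))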
I'll publish an article about this topic at some point and link it here.
I have a datastore model representing items in an ecommerce site:
class Item(db.Model):
    CSIN = db.IntegerProperty()
    name = db.StringProperty()
    price = db.IntegerProperty()
    quantity = db.IntegerProperty()
Is there some way to enforce integrity constraints? For instance, I would like to make sure that quantity is never set to be less than 0.
The Property constructor lets you specify a function with the 'validator' named argument. This function should take one argument, the value, and raise an exception if the value is invalid. For example:
def range_validator(minval, maxval):
    def validator(v):
        if (minval is not None and v < minval) or (maxval is not None and v > maxval):
            raise ValueError("Value %s outside range (%s, %s)" % (v, minval, maxval))
    return validator

class Item(db.Model):
    CSIN = db.IntegerProperty()
    name = db.StringProperty()
    price = db.IntegerProperty()
    quantity = db.IntegerProperty(validator=range_validator(0, None))
Note that the example uses a nested function to define a general-purpose validator; you can, of course, use a simple function if you want to write a more special-purpose validator.
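With that in place, the constraint is enforced whenever the property is assigned; a hypothetical example:
item = Item(CSIN=1, name='widget', price=100, quantity=5)
item.quantity = 3    # fine
item.quantity = -1   # raises ValueError: Value -1 outside range (0, None)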