Storing conditions for further selecting from a database. Design challenge - database

Let’s assume I have an online computer shop and I want to add a new feature: advertising.
As a manager of the shop I want to create an advert for a particular product. Let’s say it goes like this: “IF a client has been signed up for at least 2 months AND during purchase the value of his order exceeded 150$ OR product A OR B is already in his shopping cart THEN show him this ad”.
My question is how to store such statements (conditions such as “A or B”, “A and B”, “(A or B) and C”, etc.) in a database, and then how to select the records and display (or not) the desired ad.
One of my ideas:
Adverts
1. id,
2. name,
3. description,
…
4. criterias_pattern [e.g. “(1 OR 5) AND 4”]
Second table:
AdvertsCriterias
1. id
...
2. type
3. value
In short:
I parse the pattern stored in “criterias_pattern” field, extract criterias_id and then I check the conditions.
It should work but it has many obvious drawbacks.

Random thought:
why not store actual SQL WHERE clauses that would do the evaluation? At least then you do not need to invent your own syntax and parsers.
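A minimal sketch of the stored-WHERE-clause idea, using Python's built-in sqlite3 module. The table names, column names, and the sample clause are assumptions made up for illustration; they are not from the question. Because the stored clause is spliced into the query text, this only makes sense if the clauses are written by trusted staff, never by end users.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clients (id INTEGER PRIMARY KEY, months_signed_up INTEGER, order_value REAL);
    CREATE TABLE adverts (id INTEGER PRIMARY KEY, name TEXT, where_clause TEXT);
    INSERT INTO clients VALUES (1, 3, 200.0), (2, 1, 500.0);
    INSERT INTO adverts VALUES
        (1, 'loyal big spender', 'months_signed_up >= 2 AND order_value > 150');
""")


def adverts_for(client_id: int) -> list[str]:
    """Return the names of adverts whose stored clause matches the client."""
    matching = []
    rows = conn.execute("SELECT name, where_clause FROM adverts ORDER BY id").fetchall()
    for name, clause in rows:
        # The clause becomes part of the SQL text: trusted input only.
        sql = f"SELECT 1 FROM clients WHERE id = ? AND ({clause})"
        if conn.execute(sql, (client_id,)).fetchone() is not None:
            matching.append(name)
    return matching
```

The database does the parsing and evaluation, at the cost of running one query per advert and of tying the rules to the schema.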

You have two problems: Modeling the conditions, and storing them in a DB.
Modeling the conditions: The easiest approach would probably be to reuse an existing scripting language, or something like that. If you can't, you can model the conditions like this:
interface Expression<T>
{
    T Evaluate(Context context);
}
// Evaluate implementations are omitted below; only the tree shape is shown.
class OrExpression : Expression<bool>
{
    Expression<bool> left;
    Expression<bool> right;
}
class AndExpression : Expression<bool>
{
    Expression<bool> left;
    Expression<bool> right;
}
class IfExpression<T> : Expression<T>
{
    Expression<bool> condition;
    Expression<T> thenClause;
    Expression<T> elseClause;
}
class EqualExpression<T> : Expression<bool>
{
    Expression<T> left;
    Expression<T> right;
}
// Looks up a named value (e.g. "PurchaseAmount") in the Context.
class ContextVariableValue<T> : Expression<T>
{
}
“IF a client is already signed up for at least 2 months AND during purchase the value of his order exceeded 150$ OR product A OR B is already in his shopping cart THEN show him this ad”
would be something like this:
var clientIsAlreadySignedUpFor2Months = new GreaterThanExpression<int>(
    new ContextVariableValue<int>("ClientSignedUpLengthInMonths"), new ConstantExpression<int>(2));
var purchaseExceeds150 = new GreaterThanExpression<int>(
    new ContextVariableValue<int>("PurchaseAmount"), new ConstantExpression<int>(150));
// The cart conditions (product A or B) are built and combined the same way, etc...
var result = new AndExpression(clientIsAlreadySignedUpFor2Months, purchaseExceeds150);
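The same expression-tree idea can be sketched in runnable form. This is a Python illustration, not the answerer's code: the class names are shortened, the context keys "CartHasProductA" and "CartHasProductB" are hypothetical, and the grouping "(signed up AND order over 150) OR A in cart OR B in cart" is only one possible reading of the ad rule, which does not state its precedence.

```python
from dataclasses import dataclass
from typing import Any, Protocol

Context = dict[str, Any]


class Expression(Protocol):
    def evaluate(self, context: Context) -> Any: ...


@dataclass(frozen=True)
class Constant:
    value: Any

    def evaluate(self, context: Context) -> Any:
        return self.value


@dataclass(frozen=True)
class ContextVariable:
    name: str

    def evaluate(self, context: Context) -> Any:
        return context[self.name]


@dataclass(frozen=True)
class GreaterThan:
    left: Expression
    right: Expression

    def evaluate(self, context: Context) -> bool:
        return self.left.evaluate(context) > self.right.evaluate(context)


@dataclass(frozen=True)
class And:
    left: Expression
    right: Expression

    def evaluate(self, context: Context) -> bool:
        return bool(self.left.evaluate(context)) and bool(self.right.evaluate(context))


@dataclass(frozen=True)
class Or:
    left: Expression
    right: Expression

    def evaluate(self, context: Context) -> bool:
        return bool(self.left.evaluate(context)) or bool(self.right.evaluate(context))


signed_up = GreaterThan(ContextVariable("ClientSignedUpLengthInMonths"), Constant(2))
exceeds_150 = GreaterThan(ContextVariable("PurchaseAmount"), Constant(150))
in_cart = Or(ContextVariable("CartHasProductA"), ContextVariable("CartHasProductB"))
show_ad = Or(And(signed_up, exceeds_150), in_cart)
```

Calling show_ad.evaluate(context) with a dictionary holding the four keys returns whether the ad should be shown for that client.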
To store it in the DB, you can either serialize the expressions to some equation-like text and then parse it, or you could use something like:
Conditions
Id  Type                   ParametersType  Operand1                      Operand2
1   GreaterThanExpression  Number          2                             3
2   ContextVariableValue   Number          ClientSignedUpLengthInMonths  null
3   ConstantExpression     Number          2                             null
etc...
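A rough sketch of how rows like these could be evaluated once loaded into memory, written in Python. It assumes that for composite types (GreaterThanExpression, and the AndExpression/OrExpression types added here for illustration) the operands are ids of other rows, while for ConstantExpression and ContextVariableValue Operand1 holds the literal value or the variable name. The table does not spell this out, so treat it as one interpretation.

```python
from typing import Any

# Condition rows keyed by Id: (Type, Operand1, Operand2).
CONDITIONS: dict[int, tuple[str, Any, Any]] = {
    1: ("GreaterThanExpression", 2, 3),
    2: ("ContextVariableValue", "ClientSignedUpLengthInMonths", None),
    3: ("ConstantExpression", 2, None),
}


def evaluate(condition_id: int, context: dict[str, Any]) -> Any:
    """Recursively evaluate the condition row with the given id."""
    kind, operand1, operand2 = CONDITIONS[condition_id]
    if kind == "ConstantExpression":
        return operand1
    if kind == "ContextVariableValue":
        return context[operand1]
    # Composite rows: both operands are ids of other rows.
    left = evaluate(operand1, context)
    right = evaluate(operand2, context)
    if kind == "GreaterThanExpression":
        return left > right
    if kind == "AndExpression":
        return bool(left) and bool(right)
    if kind == "OrExpression":
        return bool(left) or bool(right)
    raise ValueError(f"unknown condition type: {kind}")
```

An advert would then only need to store the id of its root condition.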
I think your best bet is to preload all the advertisements with their conditions into memory, and evaluate them every time you could be displaying an ad.
If you detect that a lot of ads have "similar" conditions, then you could add some logic to evaluate those conditions only once, to speed things up, etc...


How much trust should I put in the validity of retrieved data from database?

Another way to ask my question is: "Should I keep the data types coming from the database as simple and raw as the ones I would accept at my REST endpoint?"
Imagine this case class that I want to store in the database as a row:
case class Product(id: UUID, name: String, price: BigInt)
It clearly isn't, and shouldn't be, what it says it is, because the type signatures of name and price are a lie.
So what we do is create custom data types that better represent what things are, such as the following (for the sake of simplicity, imagine our only concern is the price data type):
import java.util.UUID
import scala.util.Try

case class Price(value: BigInt) {
  require(value > BigInt(0))
}
object Price {
  def validate(amount: BigInt): Either[String, Price] =
    Try(Price(amount)).toOption.toRight("invalid.price")
}
// As a result my Product class is now:
case class Product(id: UUID, name: String, price: Price)
So now the process of taking user input for product data would look like this:
// this class would be parsed from e.g. a form:
case class ProductInputData(name: String, price: BigInt)
def create(input: ProductInputData) = {
  for {
    validPrice <- Price.validate(input.price)
  } yield productsRepo.insert(
    Product(id = UUID.randomUUID, name = input.name, price = ???)
  )
}
Look at the triple question marks (???). This is my main point of concern from an application architecture perspective: if I can store a column as Price in the database (for example, Slick supports such custom data types), then I have the option to store the price as either price: BigInt = validPrice.value or price: Price = validPrice.
I see many pros and cons in both of these decisions and I can't decide.
Here are the arguments that I see supporting each choice:
Store data as simple database types (e.g. BigInt) because:
Performance: a simple assertion of x > 0 on the creation of Price is trivial, but imagine you want to validate a custom Email type with a complex regex. It would be detrimental when retrieving collections.
Tolerance against corruption: if a negative BigInt were inserted, it wouldn't explode in your face every time your application tried to simply read the column and show it in the user interface. It would, however, cause problems if it were retrieved and then involved in some domain-layer processing such as a purchase.
Store data as its domain-rich type (e.g. Price) because:
No implicit reasoning and trust: other methods elsewhere in the system need the price to be valid. For example:
// Two terrible variations of a calculateDiscount method.
// This version simply trusts that price is already valid because it came from the db:
def calculateDiscount(price: BigInt): BigInt = {
  // apply some positive coefficient to price and hopefully get a positive
  // number from it; if it's not positive because price is not positive, then
  // it'll explode in your face.
}
// This version is even worse. It does retain function totality and purity,
// but the unforgivable culture it encourages is the kind of defensive and
// paranoid programming that causes every developer to write guard
// expressions performing duplicated validation all over!
def calculateDiscount(price: BigInt): Option[BigInt] = {
  if (price <= BigInt(0))
    None
  else
    Some {
      // do safe processing
    }
}
// Ideally you want it to look like this:
def calculateDiscount(price: Price): Price
No constant conversion of domain types to simple types and vice versa: for representation, storage, the domain layer and so on, you simply have one representation in the system to rule them all.
The source of all this mess, as I see it, is the database. If the data were coming from the user it would be easy: you simply never trust it to be valid. You ask for simple data types, convert them to domain types with validation, and then proceed. But not with the db. Does the modern layered architecture address this issue in some definitive, or at least mitigating, way?
Protect the integrity of the database. Just as you would protect the integrity of the internal state of an object.
Trust the database. It doesn't make sense to check and re-check what has already been checked going in.
Use domain objects for as long as you can. Wait till the very last moment to give them up (raw JDBC code or right before the data is rendered).
Don't tolerate corrupt data. If the data is corrupt, the application should crash. Otherwise it's likely to produce more corrupt data.
The overhead of the require call when retrieving from the DB is negligible. If you really think it's an issue, provide 2 constructors, one for the data coming from the user (performs validation) and one that assumes the data is good (meant to be used by the database code).
I love exceptions when they point to a bug (data corruption because of insufficient validation on the way in).
That said, I regularly leave requires in code to help catch bugs in more complex validation (maybe data coming from multiple tables combined in some invalid way). The system still crashes (as it should), but I get a better error message.
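The "two constructors" suggestion can be sketched as follows. This is a Python illustration rather than the Scala from the question; the names validate and from_db and the percent parameter are hypothetical. One entry point checks untrusted input, the other trusts a value that was already validated on the way into the database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    value: int

    @classmethod
    def validate(cls, amount: int) -> "Price":
        """Entry point for untrusted input (forms, REST payloads)."""
        if amount <= 0:
            raise ValueError("invalid.price")
        return cls(amount)

    @classmethod
    def from_db(cls, amount: int) -> "Price":
        """Entry point for database code: assumes the value was validated on insert."""
        return cls(amount)


def calculate_discount(price: Price, percent: int = 10) -> Price:
    """Domain code takes and returns Price, so it needs no guard clauses of its own."""
    discounted = price.value - price.value * percent // 100
    return Price.validate(discounted)
```

Whether from_db should skip the check at all is exactly the trade-off discussed above; keeping the check in both paths costs little and makes corrupt rows fail loudly.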

How can I reassign opportunities in Salesforce based on number of opportunities per account?

I have created a simple class and Visualforce page that displays a "group by". The output is perfect; it displays the number of opportunities a given account has:
lstAR = [ select Account.Name AccountName, AccountId, Count(CampaignID) CountResult from Opportunity where CampaignID != null group by Account.Name,AccountId having COUNT(CampaignID) > 0 LIMIT 500 ];
I would like to be able to say: if an account has more than 10 opportunities, then assign the opportunity to another account that has fewer than 10.
I used the following code to get the results in my Visualforce page:
public List<OppClass> getResults() {
    List<OppClass> lstResult = new List<OppClass>();
    for (AggregateResult ar : lstAR) {
        OppClass objOppClass = new OppClass(ar);
        lstResult.add(objOppClass);
    }
    return lstResult;
}
class OppClass {
    public Integer CountResult { get; set; }
    public String AccountName { get; set; }
    public String AccountID { get; set; }
    public OppClass(AggregateResult ar) {
        // Note that ar.get() returns Object, so you need type conversion here
        CountResult = (Integer) ar.get('CountResult');
        AccountName = (String) ar.get('AccountName');
        AccountID = (String) ar.get('AccountID');
    }
}
What would be the best approach to check whether the count is greater than a given number, and then assign the opportunities to an account with fewer than that number?
As I said, code-wise I have a nice little controller and VF page that displays the account and count in a grid. I'm just not sure of a good approach for reassigning the opportunities.
Thanks
Frank
I'm not sure why you'd be moving your opportunity to another account, because typically the account is the organization/person buying the stuff.
But that said, ignoring the why and focusing on the how...
Trigger on the Opportunity, before insert
loop over trigger.new and count how many oppties you have per account (or owner) in that batch, put that into a map accountId to count [because you could be inserting 10 oppties for the same account!]. If ever your count is > 10 change the assignment using whatever assignment helper class you have.
Also populate set of accountIds.
Then run your aggregate for each account where Id in set of accountIds, you'll have to group by AccountId.
Loop over results, and update the map of accountId to count.
Then loop over trigger.new and for each oppty, look up in the map by accountId the count. If the count > 10 then do your assignment using your helper class.
And done.
Of course your assignment helper class is another issue to tackle - how do you know which account/user to assign the opportunity to, are you going to use queues, custom objects, custom settings to govern the rules, etc...
But the concept above should work...
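The counting part of the steps above can be sketched outside Salesforce. This is a plain Python illustration of the logic only, not Apex trigger code: new_opps stands in for trigger.new, existing_counts stands in for the aggregate query result, and the function name is hypothetical. Choosing the target account is left to the assignment helper mentioned above.

```python
from collections import Counter


def opportunities_to_reassign(
    new_opps: list[tuple[str, str]],
    existing_counts: dict[str, int],
    limit: int = 10,
) -> list[str]:
    """Return ids of incoming opportunities whose account would exceed the limit.

    new_opps holds (opportunity_id, account_id) pairs for the batch being inserted;
    existing_counts maps account_id to the number of opportunities already stored.
    """
    # Start from the stored counts, then add one per opportunity in the batch.
    totals = Counter(existing_counts)
    totals.update(account_id for _, account_id in new_opps)
    return [opp_id for opp_id, account_id in new_opps if totals[account_id] > limit]
```

For example, with nine stored opportunities on one account and two more arriving in the same batch, both incoming opportunities are flagged, because the combined count is 11.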

Prolog - Exercise on facts and lists

The problem is that I need to write a predicate that consults the products in a database and returns a list of the names of the products whose prices are reduced (this is indicated by the "state").
domains
    state = reduced ; normal
    element = string
    list = element*
database
    producte(string, integer, state)
predicates
    nondeterm reduced(list)
clauses
    % ---> producte(description, price, state)
    producte("Enciam", 2, reduced).
    producte("Peix", 5, normal).
    producte("Llet", 1, reduced).
    producte("Formatge", 5, normal).
    % unique case
    reduced([D]) :-
        producte(D, _, reduced).
    % general case
    reduced([D|L]) :-
        producte(D, _, reduced),
        retract(producte(D, _, reduced)),
        reduced(L).
Goal
    reduced(List).
I appreciate it.
Now, it gives me three different solutions. How could I force the predicate to give me just one solution (in fact, the last one)?
Since I don't use visual-prolog, I'll just propose something I found in the doc.
reduced(List) :-
    % collect the names (not the prices) of all reduced products
    List = [ Name || producte(Name, _, reduced) ].
What about when the first product in the list is NOT reduced -- you have no rule for that case.

Django: Avoiding ABA scenario in Database and ORM’s

I have a situation where I need to update votes for a candidate.
Citizens can vote for this candidate, with more than one vote per candidate. i.e. one person can vote 5 votes, while another person votes 2. In this case this candidate should get 7 votes.
Now, I use Django, and here is what the pseudocode looks like:
votes = candidate.votes
votes += citizen.vote
The problem here, as you can see, is a race condition: the candidate’s votes can be overwritten by another citizen’s vote that was read earlier and is written back now.
How can I avoid this with an ORM like Django's?
If this is purely an arithmetic expression then Django has a nice API called F expressions
Updating attributes based on existing fields
Sometimes you'll need to perform a simple arithmetic task on a field, such as incrementing or decrementing the current value. The obvious way to achieve this is to do something like:
>>> product = Product.objects.get(name='Venezuelan Beaver Cheese')
>>> product.number_sold += 1
>>> product.save()
If the old number_sold value retrieved from the database was 10, then the value of 11 will be written back to the database.
This can be optimized slightly by expressing the update relative to the original field value, rather than as an explicit assignment of a new value. Django provides F() expressions as a way of performing this kind of relative update. Using F() expressions, the previous example would be expressed as:
>>> from django.db.models import F
>>> product = Product.objects.get(name='Venezuelan Beaver Cheese')
>>> product.number_sold = F('number_sold') + 1
>>> product.save()
This approach doesn't use the initial value from the database. Instead, it makes the database do the update based on whatever value is current at the time that the save() is executed.
Once the object has been saved, you must reload the object in order to access the actual value that was applied to the updated field:
>>> product = Product.objects.get(pk=product.pk)
>>> print(product.number_sold)
42
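Applied to the vote counter, the relative update that F() expresses boils down to SQL of the form UPDATE ... SET votes = votes + n, where the database reads and writes the value in a single statement. Below is a minimal sketch of that statement using Python's built-in sqlite3 module instead of Django; the candidate table and the add_votes helper are made up for illustration. In Django, the equivalent for a hypothetical Candidate model would be along the lines of Candidate.objects.filter(pk=candidate_id).update(votes=F('votes') + count).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE candidate (id INTEGER PRIMARY KEY, votes INTEGER NOT NULL)")
conn.execute("INSERT INTO candidate VALUES (1, 0)")


def add_votes(candidate_id: int, count: int) -> None:
    """Increment the stored total without reading it into Python first."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE candidate SET votes = votes + ? WHERE id = ?",
            (count, candidate_id),
        )


add_votes(1, 5)  # one citizen casts 5 votes
add_votes(1, 2)  # another casts 2
total = conn.execute("SELECT votes FROM candidate WHERE id = 1").fetchone()[0]
```

Because no stale value is ever held in application memory, two citizens voting at the same time cannot overwrite each other's increments.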
Perhaps the select_for_update QuerySet method is helpful for you.
An excerpt from the docs:
All matched entries will be locked until the end of the transaction block, meaning that other transactions will be prevented from changing or acquiring locks on them.
Usually, if another transaction has already acquired a lock on one of the selected rows, the query will block until the lock is released. If this is not the behavior you want, call select_for_update(nowait=True). This will make the call non-blocking. If a conflicting lock is already acquired by another transaction, DatabaseError will be raised when the queryset is evaluated.
Mind that this is only available in the Django development release (i.e. > 1.3).

What is an appropriate data structure and database schema to store logic rules?

Preface: I don't have experience with rules engines, building rules, modeling rules, implementing data structures for rules, or whatnot. Therefore, I don't know what I'm doing or if what I attempted below is way off base.
I'm trying to figure out how to store and process the following hypothetical scenario. To simplify my problem, say that I have a type of game where a user purchases an object, where there could be 1000's of possible objects, and the objects must be purchased in a specified sequence and only in certain groups. For example, say I'm the user and I want to purchase object F. Before I can purchase object F, I must have previously purchased object A OR (B AND C). I cannot buy F and A at the same time, nor F and B,C. They must be in the sequence the rule specifies. A first, then F later. Or, B,C first, then F later. I'm not concerned right now with the span of time between purchases, or any other characteristics of the user, just that they are the correct sequence for now.
What is the best way to store this information for potentially thousands of objects that allows me to read in the rules for the object being purchased, and then check it against the user's previous purchase history?
I've attempted this, but I'm stuck at trying to implement the groupings such as A OR (B AND C). I would like to store the rules in a database where I have these tables:
Objects
(ID(int),Description(char))
ObjectPurchRules
(ObjectID(int),ReqirementObjectID(int),OperatorRule(char),Sequence(int))
But obviously, as you process the results without the grouping, you get the wrong answer. I would like to avoid excessive string parsing if possible :). One object could have an unknown number of previously required purchases. SQL or pseudocode snippets for processing the rules would be appreciated. :)
It seems like your problem breaks down to testing whether a particular condition has been satisfied.
You will have compound conditions.
So given a table of items:
ID_Item  Description
--------------------
1        A
2        B
3        C
4        F
and given a table of possible actions:
ID_Action  VerbID  ItemID  ConditionID
--------------------------------------
1          BUY     4       1
We construct a table of conditions:
ID_Condition  VerbA  ObjectA_ID  Boolean  VerbB            ObjectB_ID
---------------------------------------------------------------------
1             OWNS   1           OR       MEETS_CONDITION  2
2             OWNS   2           AND      OWNS             3
So OWNS means the id is a key to the Items table, and MEETS_CONDITION means that the id is a key to the Conditions table.
This isn't meant to restrict you. You can add other tables with quests or whatever, and add extra verbs to tell you where to look. Or, just put quests into your Items table when you complete them, and then interpret a completed quest as owning a particular badge. Then you can handle both items and quests with the same code.
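A small sketch of how the conditions table above could be evaluated, in Python. The rows are copied from the table; holding them in a dictionary and representing the user's purchase history as a set of item ids are assumptions for illustration. MEETS_CONDITION recurses into another row, which is what gives the grouping A OR (B AND C) without any string parsing.

```python
# ID_Condition -> (VerbA, ObjectA_ID, Boolean, VerbB, ObjectB_ID)
CONDITIONS = {
    1: ("OWNS", 1, "OR", "MEETS_CONDITION", 2),
    2: ("OWNS", 2, "AND", "OWNS", 3),
}


def check_term(verb: str, target_id: int, owned: set[int]) -> bool:
    """Evaluate one side of a condition row."""
    if verb == "OWNS":
        return target_id in owned  # target_id is a key into the items table
    if verb == "MEETS_CONDITION":
        return meets_condition(target_id, owned)  # target_id is another condition
    raise ValueError(f"unknown verb: {verb}")


def meets_condition(condition_id: int, owned: set[int]) -> bool:
    """Evaluate a condition row against the set of item ids the user owns."""
    verb_a, id_a, boolean, verb_b, id_b = CONDITIONS[condition_id]
    a = check_term(verb_a, id_a, owned)
    b = check_term(verb_b, id_b, owned)
    if boolean == "OR":
        return a or b
    if boolean == "AND":
        return a and b
    raise ValueError(f"unknown operator: {boolean}")
```

To decide whether F (item 4) can be bought, look up the BUY action's ConditionID (1) and call meets_condition(1, owned_item_ids).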
This is a very complex problem that I'm not qualified to answer, but I've seen lots of references to. The fundamental problem is that for games, quests and items and "stats" for various objects can have non-relational dependencies. This thread may help you a lot.
You might want to pick up a couple of books on the topic, and look into using Lua as a rules processor.
Personally I would do this in code, not in SQL. Each item should be its own class implementing an interface (e.g. IItem). IItem would have a method called OkToPurchase that determines whether it is OK to purchase that item. To do that, it would use one or more of a collection of rules (e.g. HasPreviouslyPurchased(x), CurrentlyOwns(x), etc.) that you can build.
The nice thing is that it is easy to extend this approach with new rules without breaking all the existing logic.
Here's some pseudocode:
bool OkToPurchase()
{
    return HasPreviouslyPurchased('x') && !CurrentlyOwns('y');
}
bool HasPreviouslyPurchased( item )
{
    return purchases.contains( item );
}
bool CurrentlyOwns( item )
{
    return user.Items.contains( item );
}
