Databases, bad practise to use null as ternary option? - database

I've never really used NULL before; I tend to avoid it and keep it out of my data. However, I recently designed this table (simplified):
tblPeopleWhoNeedToCastVotes
User | hasVotedYes
In this scenario, hasVotedYes can either be null, true or false. Null indicates they have not yet cast their vote (useful information).
Is this bad practice or fine?

This is really what NULL is intended for - where the data is not available. From Wikipedia:
Null is a special marker used in Structured Query Language (SQL) to indicate that a data value does not exist in the database.
I wouldn't recommend using it as a general "third option", I'd go with a tinyint and map it to an Enum in code instead. However, for Yes/No and an Unknown, I'd say a NULL is probably the best way, as reading 0/1/2 in the database would be less clear.

I would prefer a DEFAULT value over NULL. In a database like MySQL, I would create the column as
hasVotedYes TINYINT(1) DEFAULT 0
When the user votes "against" I change it to 1; when the user votes "in favor" I set it to 2. That said, NULL as the default is NOT bad practice, as long as you handle NULL in your application code.
Thinking about it a bit more, I'd say default values are the better idea. Say you want to filter users who voted in favor, voted against, or have not voted. You can write a single prepared statement, something like
... where hasVotedYes = ?;
With a NULL default, you end up writing two kinds of query:
... where hasVotedYes = ?;
This works for the voted in favor or against cases.
... where hasVotedYes is NULL;
This one is for the not-voted case.
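If the two-query split is the main worry, some databases offer a NULL-safe comparison so that one prepared statement covers all three cases. Here is a minimal sketch in Python with SQLite, where IS compares like = but also matches NULL (MySQL has the <=> operator for the same purpose). The sample rows and the helper name users_with_vote are made up for illustration.

```python
import sqlite3


def make_db():
    # In-memory table modelled on the question; the rows are sample data.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tblPeopleWhoNeedToCastVotes (user TEXT, hasVotedYes INTEGER)"
    )
    conn.executemany(
        "INSERT INTO tblPeopleWhoNeedToCastVotes VALUES (?, ?)",
        [("alice", 1), ("bob", 0), ("carol", None)],
    )
    return conn


def users_with_vote(conn, vote):
    # In SQLite, "IS ?" behaves like "= ?" for non-NULL values and also
    # matches rows where the column is NULL when the parameter is None.
    rows = conn.execute(
        "SELECT user FROM tblPeopleWhoNeedToCastVotes "
        "WHERE hasVotedYes IS ? ORDER BY user",
        (vote,),
    )
    return [row[0] for row in rows]
```

Calling users_with_vote(conn, None) then returns the people who have not voted, with the same statement used for 1 and 0.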

I would go for
tblPeopleWhoNeedToCastVote
User | Voted
where Voted is a nullable bit: 1 = Yes, 0 = No, NULL = not voted.
Otherwise (when not using NULL) you need to know two things:
A: Has the person voted?
B: How did they vote?
Without NULL, B would default to 0. That gets annoying when you want to query everyone who voted Yes (1) or No (0), because you must also check whether the person has voted.
With NULL you can simply query for 1 or 0.

NULL was built for this usage. It means (or should mean): not sure, don't know.
The only drawback shows up when you want to know, say, the number of people who didn't vote Yes: you may have to convert NULL to something else, or the result will not be correct.
select * from tblPeople t where t.hasVotedYes <> 'Y'; -- does not count NULLs
select * from tblPeople t where nvl(t.hasVotedYes, 'N') <> 'Y'; -- counts NULLs
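The difference between the two queries is easy to reproduce. A small sketch in Python with SQLite, using COALESCE as the portable counterpart of nvl; the three sample rows are invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblPeople (name TEXT, hasVotedYes TEXT)")
conn.executemany(
    "INSERT INTO tblPeople VALUES (?, ?)",
    [("alice", "Y"), ("bob", "N"), ("carol", None)],  # carol has not voted
)


def count(sql):
    # Run a COUNT(*) query and return the single number it produces.
    return conn.execute(sql).fetchone()[0]


# NULL <> 'Y' evaluates to UNKNOWN, so carol's row is silently dropped.
naive = count("SELECT COUNT(*) FROM tblPeople WHERE hasVotedYes <> 'Y'")

# Mapping NULL to 'N' first makes carol count as "did not vote Yes".
null_aware = count(
    "SELECT COUNT(*) FROM tblPeople WHERE COALESCE(hasVotedYes, 'N') <> 'Y'"
)

print(naive, null_aware)
```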

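I use NULL for UNKNOWN, but you should also take three-valued logic into consideration if you're going to be doing a lot of search clauses on the data, to make sure you don't make a mistake: any comparison with NULL evaluates to UNKNOWN rather than TRUE or FALSE, and a WHERE clause only returns rows for which the condition is TRUE.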

Null gets used for all sorts of things but usually it leads to contradictions and incorrect results.
In this case it seems that your dilemma stems from trying to overload too much information into a single binary attribute - you are trying to record both whether a person voted and how they voted. This in itself is probably going to create problems for you.
I don't really see what you hope to gain by using a null to represent the case where a person hasn't voted. Why don't you just not store the information until they have voted? That way the absence of the row clearly indicates that a person hasn't voted yet.
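A rough sketch of that row-per-vote layout, in Python with SQLite. The table names voters and votes and the column votedYes are assumptions for illustration, not part of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE voters (user TEXT PRIMARY KEY);
    -- A row exists here only once the person has voted, so votedYes
    -- never needs to be NULL.
    CREATE TABLE votes (
        user TEXT PRIMARY KEY REFERENCES voters(user),
        votedYes INTEGER NOT NULL CHECK (votedYes IN (0, 1))
    );
    INSERT INTO voters VALUES ('alice'), ('bob'), ('carol');
    INSERT INTO votes VALUES ('alice', 1), ('bob', 0);
    """
)

# People who still need to vote: voters with no matching row in votes.
not_voted = [
    row[0]
    for row in conn.execute(
        "SELECT v.user FROM voters v "
        "WHERE NOT EXISTS (SELECT 1 FROM votes x WHERE x.user = v.user) "
        "ORDER BY v.user"
    )
]

# People who voted Yes: a plain equality test, no NULL handling needed.
voted_yes = [
    row[0]
    for row in conn.execute("SELECT user FROM votes WHERE votedYes = 1 ORDER BY user")
]
```

The trade-off is an extra table and a join or subquery whenever you need the "has not voted" list.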


How to limit amount of associations in Elixir Ecto

I have this app where there is a Games table and a Players table, and they share an n:n association.
This association is mapped in Phoenix through a GamesPlayers schema.
What I'm wondering how to do is actually quite simple: I'd like there to be an adjustable limit of how many players are allowed per game.
If you need more details, carry on reading, but if you already know an answer feel free to skip the rest!
What I've Tried
I've taken a look at adding check constraints, but without much success. Here's roughly what the check constraint would have to look like:
create constraint("games_players", :limit_players, check: "count(players) <= player_limit")
Problem here is, the check syntax is very much invalid and I don't think there actually is a valid way to achieve this using this call.
I've also looked into adding a trigger to the Postgres database directly in order to enforce this (something very similar to what this answer proposes), but I am very wary of directly fiddling with the DB since I should only be using ecto's interface.
Table Schemas
For the purposes of this question, let's assume this is what the tables look like:
Games
Property      Type
--------------------------
id            integer
player_limit  integer
Players
Property      Type
--------------------------
id            integer
GamesPlayers
Property      Type
--------------------------
game_id       references(Games)
player_id     references(Players)
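As I mentioned in my comment, I think the cleanest way to enforce this is via business logic inside the code, not via a database constraint. I would approach this using a database transaction, which Ecto supports via Ecto.Repo.transaction/2. This keeps the count check and the insert together as one unit. Note that a transaction alone does not stop two concurrent requests from both passing the check under the default isolation level in Postgres; to rule that race out, also lock the game row (for example with SELECT ... FOR UPDATE) or run the transaction at the serializable isolation level.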
In this case I would do something like the following:
1. Begin the transaction.
2. Perform a SELECT query counting the number of players in the given game; if the game is already full, abort the transaction, otherwise continue.
3. Perform an INSERT query to add the player to the game.
4. Complete the transaction.
In code, this would boil down to something like this (untested):
defmodule MyApp.Games do
  # Module name is illustrative; put these functions wherever they fit.
  import Ecto.Query
  alias MyApp.Repo
  alias MyApp.GamesPlayers
  @max_allowed_players 10
  def add_player_to_game(player_id, game_id, opts \\ []) do
    max_allowed_players = Keyword.get(opts, :max_allowed_players, @max_allowed_players)
    # Run the check and the insert inside one transaction.
    Repo.transaction(fn ->
      case is_game_full?(game_id, max_allowed_players) do
        false ->
          %GamesPlayers{
            game_id: game_id,
            player_id: player_id
          }
          |> Repo.insert!()
        # Raising an error causes the transaction to fail
        true ->
          raise "Game #{inspect(game_id)} full; cannot add player #{inspect(player_id)}"
      end
    end)
  end
  defp is_game_full?(game_id, max_allowed_players) do
    current_players =
      from(r in GamesPlayers,
        # External values must be pinned with ^ inside an Ecto query.
        where: r.game_id == ^game_id,
        select: count(r.id)
      )
      |> Repo.one()
    current_players >= max_allowed_players
  end
end
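For comparison, here is the same count-then-insert flow sketched in Python with SQLite rather than Ecto, so it can be run standalone. It reads the limit from the game's player_limit column as in the question's schema; the names GameFullError and make_demo_db are made up for the example. BEGIN IMMEDIATE is SQLite-specific: it takes the write lock before the count is read, which is what closes the gap between the check and the insert here.

```python
import sqlite3


class GameFullError(Exception):
    """Raised when a game already has player_limit players."""


def add_player_to_game(conn, player_id, game_id):
    # Take SQLite's write lock up front so no other writer can insert
    # between our COUNT and our INSERT.
    conn.execute("BEGIN IMMEDIATE")
    try:
        (limit,) = conn.execute(
            "SELECT player_limit FROM games WHERE id = ?", (game_id,)
        ).fetchone()
        (current,) = conn.execute(
            "SELECT COUNT(*) FROM games_players WHERE game_id = ?", (game_id,)
        ).fetchone()
        if current >= limit:
            raise GameFullError(
                f"game {game_id} is full; cannot add player {player_id}"
            )
        conn.execute(
            "INSERT INTO games_players (game_id, player_id) VALUES (?, ?)",
            (game_id, player_id),
        )
        conn.execute("COMMIT")
    except Exception:
        # Any failure, including a full game, undoes the whole unit.
        conn.execute("ROLLBACK")
        raise


def make_demo_db(player_limit):
    # isolation_level=None: the sqlite3 module issues no implicit BEGIN,
    # so the explicit BEGIN/COMMIT/ROLLBACK above are the only ones.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.executescript(
        """
        CREATE TABLE games (id INTEGER PRIMARY KEY, player_limit INTEGER NOT NULL);
        CREATE TABLE games_players (game_id INTEGER, player_id INTEGER);
        """
    )
    conn.execute("INSERT INTO games (id, player_limit) VALUES (1, ?)", (player_limit,))
    return conn
```

With a limit of 2, the third call to add_player_to_game for the same game raises GameFullError and leaves the table unchanged.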

Reduced Survey Frequency - Salesforce Workflow

Hoping you can help me review the logic below for errors. I am looking to create a workflow that will send a survey out to end users on a reduced frequency. Basically, it will check the Account object of the Case for a field, 'Reduced Survey Frequency', which contains a # and will not send a survey until that # of days has passed since the last date set on the Contact field 'Last Survey Date'. Please review the code and let me know any recommended changes!
AND(
  OR(ISPICKVAL(Status, "Closed"), ISPICKVAL(Status, "PM Sent")),
  OR(
    CONTAINS(RecordType.Name, "Portal Case"),
    CONTAINS(RecordType.Name, "Standard Case"),
    CONTAINS(RecordType.Name, "Portal Closed"),
    CONTAINS(RecordType.Name, "Standard Closed")
  ),
  NOT(Don_t_sent_survey__c),
  OR(
    (TODAY() - Contact.Last_Survey_Date__c) >= Account.Reduced_Survey_Frequency__c,
    Account.Reduced_Survey_Frequency__c == 0,
    ISBLANK(Account.Reduced_Survey_Frequency__c),
    ISBLANK(Contact.Last_Survey_Date__c)
  )
)
Thanks,
Brian H.
Personally I prefer the syntax where && and || are used instead of the AND() and OR() functions. It just reads a bit nicer to me: no need to trace so many commas or keep track of indentation in the more complex logic. But if you're more used to this Excel-like flow, go for it. In the end it has to be readable for YOU.
Also I'd consider reordering this a bit: simple checks and those most likely to fail first.
The first part - irrelevant to your question
Don't use RecordType.Name because these Names can be translated to say French and it will screw your logic up for users who will select non-English as their preferred language. Use RecordType.DeveloperName, it's safer.
CONTAINS - do you really have so many record types that share this part in their name? What's wrong with normal = comparison? You could check if the formula would be more readable with CASE() statement. Or maybe flip the logic if there are say 6 rec types and you've explicitly listed 4 (this might have to be reviewed though when you add new rec. type). If you find yourself copy-pasting this block of 4 checks frequently - consider making a helper formula field with it...
The second part
The ISBLANK checks could be skipped if you properly use the "treat nulls as blanks / as zeroes" setting at the bottom of the formula editor, because you're making a check like
OR(...,
Account.Reduced_Survey_Frequency__c==0,
ISBLANK(Account.Reduced_Survey_Frequency__c),
...
)
which is essentially what this setting was designed for. I'd flip it to "treat nulls as zeroes" (but that means the ISBLANK check will never "fire"). If you're not comfortable with that, you can also "safely compare or subtract" by using
BLANKVALUE(Account.Reduced_Survey_Frequency__c,0)
which has a similar "treat null as zero" effect, but only in this one place.
So... I'd end up with something like this:
(ISPICKVAL(Status, 'Closed') || ISPICKVAL(Status, 'PM Sent')) &&
(RecordType.DeveloperName = 'Portal_Case' ||
 RecordType.DeveloperName = 'Standard_Case' ||
 RecordType.DeveloperName = 'Portal_Closed' ||
 RecordType.DeveloperName = 'Standard_Closed'
) &&
NOT(Don_t_sent_survey__c) &&
(ISBLANK(Contact.Last_Survey_Date__c) ||
 Contact.Last_Survey_Date__c + BLANKVALUE(Account.Reduced_Survey_Frequency__c, 0) <= TODAY())
No promises though ;)
You can easily test them by enabling debug logs. You'll see there the workflow formula together with values that are used to evaluate it.
Another option is to make a temporary formula field with same logic and observe (in a report?) where it goes true/false for mass spot check.
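Before touching the org, the date part of the rule can also be sanity-checked on paper or in a few lines of code. This is a plain Python model of the condition as stated in the question (send when the contact has never been surveyed, when the frequency is blank or 0, or when at least that many days have passed); the function name survey_due is made up and nothing here talks to Salesforce.

```python
from datetime import date, timedelta


def survey_due(today, last_survey_date, frequency_days):
    # Never surveyed before: always send.
    if last_survey_date is None:
        return True
    # A blank (None) or zero frequency means no throttling.
    days = frequency_days or 0
    # Send once at least `days` days have passed since the last survey.
    return last_survey_date + timedelta(days=days) <= today


today = date(2024, 1, 31)
print(survey_due(today, date(2024, 1, 1), 30))  # exactly 30 days ago
print(survey_due(today, date(2024, 1, 2), 30))  # only 29 days ago
```

The boundary case is the one worth checking: with >= in the original formula, a survey exactly N days after the last one should go out.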

Why properties referenced in an equality (EQUAL) or membership (IN) filter cannot be projected?

https://developers.google.com/appengine/docs/java/datastore/projectionqueries
Why is a projection query such as SELECT A FROM kind WHERE A = 1 not supported?
Because it makes no sense. You are asking
SELECT A FROM kind WHERE A = 1
so, give me A where A = 1. Well, you already know that A = 1. It makes no sense for DB to allow that.
The IN query is internally just a series of equals queries merged together, so the same logic applies to it.
The reasoning behind this could be that since you already have the values of the properties you are querying you don't need them returned by the query. This is probably a good thing in the long run, but honestly, it's something that App Engine should allow anyway. Even if it didn't actually fetch these values from the datastore, it should add them to the entities returned to you behind the scenes so you can go about your business.
Anyway, here's what you can do...
query = MyModel.query().filter(MyModel.prop1 == 'value1', MyModel.prop2 == 'value2')
results = query.fetch(projection=[MyModel.prop3])
for r in results:
    r.prop1 = 'value1'  # the value you KNOW is correct
    r.prop2 = 'value2'
Again, would be nice for this to happen behind the scenes because I don't think it's something anybody should ever care about. If I mention a property in a projection list, I'm already stating that I want that property as part of my entities. I shouldn't have to do any more computation to get that to happen.
On the other hand, it's just an extra for-loop. :)
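If you would rather not assign to the returned entities (it is worth checking whether your client library treats projection results as read-only), the same idea works on copies. A library-independent sketch in Python, where each projected result is represented as a plain dict; the helper name with_known_values and the sample data are hypothetical.

```python
def with_known_values(projected_rows, known):
    # Build new dicts: the projected properties plus the filter values
    # we already know, leaving the original rows untouched.
    return [{**row, **known} for row in projected_rows]


# Stand-ins for the projected query results (only prop3 was fetched).
projected = [{"prop3": "a"}, {"prop3": "b"}]
# The values used in the equality filters.
known = {"prop1": "value1", "prop2": "value2"}

merged = with_known_values(projected, known)
```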

What is an appropriate data structure and database schema to store logic rules?

Preface: I don't have experience with rules engines, building rules, modeling rules, implementing data structures for rules, or whatnot. Therefore, I don't know what I'm doing or if what I attempted below is way off base.
I'm trying to figure out how to store and process the following hypothetical scenario. To simplify my problem, say that I have a type of game where a user purchases an object, where there could be 1000's of possible objects, and the objects must be purchased in a specified sequence and only in certain groups. For example, say I'm the user and I want to purchase object F. Before I can purchase object F, I must have previously purchased object A OR (B AND C). I cannot buy F and A at the same time, nor F and B,C. They must be in the sequence the rule specifies. A first, then F later. Or, B,C first, then F later. I'm not concerned right now with the span of time between purchases, or any other characteristics of the user, just that they are the correct sequence for now.
What is the best way to store this information for potentially thousands of objects that allows me to read in the rules for the object being purchased, and then check it against the user's previous purchase history?
I've attempted this, but I'm stuck at trying to implement the groupings such as A OR (B AND C). I would like to store the rules in a database where I have these tables:
Objects
(ID(int),Description(char))
ObjectPurchRules
(ObjectID(int),ReqirementObjectID(int),OperatorRule(char),Sequence(int))
But obviously as you process through the results, without the grouping, you get the wrong answer. I would like to avoid excessive string parsing if possible :). One object could have an unknown number of previous required purchases. SQL or pseudocode snippets for processing the rules would be appreciated. :)
It seems like your problem breaks down to testing whether a particular condition has been satisfied.
You will have compound conditions.
So given a table of items:
ID_Item  Description
----------------------
1        A
2        B
3        C
4        F
and given a table of possible actions:
ID_Action  VerbID  ItemID  ConditionID
----------------------------------------
1          BUY     4       1
We construct a table of conditions:
ID_Condition  VerbA  ObjectA_ID  Boolean  VerbB            ObjectB_ID
---------------------------------------------------------------------
1             OWNS   1           OR       MEETS_CONDITION  2
2             OWNS   2           AND      OWNS             3
So OWNS means the id is a key to the Items table, and MEETS_CONDITION means that the id is a key to the Conditions table.
This isn't meant to restrict you. You can add other tables with quests or whatever, and add extra verbs to tell you where to look. Or, just put quests into your Items table when you complete them, and then interpret a completed quest as owning a particular badge. Then you can handle both items and quests with the same code.
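To show how those tables could be processed, here is a minimal sketch in Python. The rows are held in dicts instead of being read from a database, and the function names (check, meets_condition, can_buy) are made up; the verbs and IDs are the ones from the tables above.

```python
# Conditions table: id -> (VerbA, ObjectA_ID, Boolean, VerbB, ObjectB_ID)
CONDITIONS = {
    1: ("OWNS", 1, "OR", "MEETS_CONDITION", 2),
    2: ("OWNS", 2, "AND", "OWNS", 3),
}

# Actions table: (VerbID, ItemID) -> ConditionID
ACTIONS = {("BUY", 4): 1}


def check(verb, ref_id, owned):
    # OWNS refers to an item id; MEETS_CONDITION refers to another condition.
    if verb == "OWNS":
        return ref_id in owned
    if verb == "MEETS_CONDITION":
        return meets_condition(ref_id, owned)
    raise ValueError(f"unknown verb: {verb}")


def meets_condition(condition_id, owned):
    # Evaluate both halves, then combine them with the row's boolean.
    # Assumes conditions do not reference each other in a cycle.
    verb_a, a_id, boolean, verb_b, b_id = CONDITIONS[condition_id]
    left = check(verb_a, a_id, owned)
    right = check(verb_b, b_id, owned)
    if boolean == "AND":
        return left and right
    if boolean == "OR":
        return left or right
    raise ValueError(f"unknown boolean: {boolean}")


def can_buy(item_id, owned):
    # Items without a BUY rule are treated as always purchasable.
    condition_id = ACTIONS.get(("BUY", item_id))
    return condition_id is None or meets_condition(condition_id, owned)
```

Here owned is the set of item IDs from the user's purchase history, so can_buy(4, {2, 3}) checks F against a user who already bought B and C.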
This is a very complex problem that I'm not qualified to answer, but I've seen lots of references to. The fundamental problem is that for games, quests and items and "stats" for various objects can have non-relational dependencies. This thread may help you a lot.
You might want to pick up a couple of books on the topic, and look into using Lua as a rules processor.
Personally I would do this in code, not in SQL. Each item should be its own class implementing an interface (e.g. IItem). IItem would have a method called OkToPurchase that determines whether it is OK to purchase that item. To do that, it would use one or more of a collection of rules (e.g. HasPreviouslyPurchased(x), CurrentlyOwns(x), etc.) that you can build.
The nice thing is that it is easy to extend this approach with new rules without breaking all the existing logic.
Here's some pseudocode:
bool OkToPurchase()
{
    // Example rule: "x" was bought earlier and "y" is not currently owned.
    return HasPreviouslyPurchased("x") && !CurrentlyOwns("y");
}
bool HasPreviouslyPurchased(string item)
{
    return purchases.Contains(item);
}
bool CurrentlyOwns(string item)
{
    return user.Items.Contains(item);
}

Autocomplete Dropdown - too much data, timing out

So, I have an autocomplete dropdown with a list of townships. Initially I just had the 20 or so that we had in the database... but recently, we have noticed that some of our data lies in other counties... even other states. So, the answer to that was buy one of those databases with all towns in the US (yes, I know, geocoding is the answer but due to time constraints we are doing this until we have time for that feature).
So, when we had 20-25 towns the autocomplete worked stellarly... now that there are 80,000 it's not as easy.
As I type I am thinking that the best way to do this is default to this state, then there will be much less. I will add a state selector to the page that defaults to NJ then you can pick another state if need be, this will narrow down the list to < 1000. Though, I may have the same issue? Does anyone know of a work around for an autocomplete with a lot of data?
should I post teh codez of my webservice?
Are you trying to autocomplete after only 1 character is typed? Maybe wait until 2 or more...?
Also, can you just return the top 10 rows, or something?
It sounds like your application is choking on the amount of data being returned and then rendered by the browser.
I assume that your database has the proper indexes, and you don't have a performance problem there.
I would limit the results of your service to no more than, say, 100 results. Users will not look at any more than that anyhow.
I would also only retrieve data from the service once 2 or 3 characters have been entered, which will further reduce the scope of the query.
Good Luck!
Stupid question maybe, but... have you checked to make sure you have an index on the town name column? I wouldn't think 80K names should be stressing your database...
I think you're on the right track. Use a series of cascading inputs, State -> County -> Township where each succeeding one grabs the potential population based on the value of the preceding one. Each input would validate against its potential population to avoid spurious inputs. I would suggest caching the intermediate results and querying against them for the autocomplete instead of going all the way back to the database each time.
If you have control of the underlying SQL, you may want to try several "UNION" queries instead of one query with several "OR like" lines in its where clause.
Check out this article on optimizing SQL.
I'd just limit the SQL query with a TOP clause. I also like using a "less than" comparison instead of a LIKE:
select top 10 name from cities where @partialname < name order by name;
That way "Ce" will give you "Cedar Grove" and "Cedar Knolls" but also "Chatham" & "Cherry Hill", so you always get ten.
In LINQ:
var q = (from c in db.Cities
         where c.Name.CompareTo(partialname) > 0  // C# strings have no < operator
         orderby c.Name
         select c.Name).Take(10);
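Putting the suggestions from this thread together (index on the name column, wait for a couple of characters, cap the row count), a rough sketch in Python with SQLite could look like this. The table cities, the function suggest, and its min_chars parameter are assumptions for illustration; LIMIT is SQLite's counterpart to TOP, and the comparison uses >= rather than a strict > so that a fully typed name is still returned.

```python
import sqlite3


def suggest(conn, partial, limit=10, min_chars=2):
    # Don't hit the database until the user has typed enough characters.
    if len(partial) < min_chars:
        return []
    # Range comparison on an indexed column, capped at `limit` rows.
    rows = conn.execute(
        "SELECT name FROM cities WHERE name >= ? ORDER BY name LIMIT ?",
        (partial, limit),
    )
    return [row[0] for row in rows]


def make_demo_db():
    # Tiny in-memory stand-in for the 80,000-row town table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cities (name TEXT)")
    conn.execute("CREATE INDEX idx_cities_name ON cities (name)")
    conn.executemany(
        "INSERT INTO cities VALUES (?)",
        [
            ("Newark",),
            ("Cherry Hill",),
            ("Cedar Knolls",),
            ("Absecon",),
            ("Chatham",),
            ("Cedar Grove",),
        ],
    )
    return conn
```

Note that the comparison is case-sensitive with SQLite's default collation, so a real service would normalize the input or use a case-insensitive collation.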
