Is there a name for "tables generations" in relational modeling? - database

I mean, when I'm designing DBs, I find tables with no parents, let say "free tables", and others that are their childs, and childs of the childs...
Let me try to be more formal:
- Let say -> means "references to..."
- Let say [tableName] is a table
Consider this:
[cell]->[organism]->[community]->[region]->[planet]
According with my first approach, [planet] is a "free table", but [cell] has a certain amount of "references to" in secuence, 4 acctually. [planet] has 0, [community] has 2.
I guess it is kind of a grade... Here born my question, Is there a formal name for that number?
Thanks and sorry for my poor English.

There is no name for that number, as many tables reference multiple tables and so have multiple numbers. From a system I'm currently working on:
StructureElement -> Generator ->PlugIn -> PlugInType
PlugInType is only a sort of description identifying what sort of work the PlugIn's of that type do.
StructureElement -> Calendar -> CalendarType
What calendar it's using.
Also, you get hierarchical relationships. A StructureElement can be part of another StructureElement, so you have
StructureElement -> StructureElement
That could conceptually go on for infinity, but my software limits it to 10 levels. What is the number of StructureElement?
Cheers -

Related

Neo4j or MongoDB for relative attributes/relations

I wish to build a database of objects with various types of relations between them. It will be easier for me to explain by giving an example.
I wish to have a set of objects, each is described by a unique name and a set of attributes (say, height, weight, colour, etc.), but instead of values, these attributes may contain values which are relative to other objects. For example, I might have two objects, A and B, where A has height 1 and weight "weight of B + 2", and B has height "height of A + 3" and weight 4.
Some objects may have completely other attributes; for example, object C may represent a box, and objects A and B will be related to C by the relations "I appear x times in C".
Queries may include "what is the height of A/B" or what is the total weight of objects appearing in C with multiplicities.
I am a bit familiar with MongoDB, and fond of its simplicity. I heard of Neo4j, but never tried working with it. From its description, it sounds more suitable for my need (but I can't tell it is capable of the task). But is MongoDB, with its simplicity, suitable as well? Or perhaps a different database engine?
I am not sure it matters, but I plan to use python as the engine which processes the queries and their outputs.
Either can do this. I tend to prefer neo4j, but either way could work.
In neo4j, you'd create a graph consisting of a node (A) and its "base" (B). You could then connect them like this:
(A:Node { weight: "base+2" })-[:base]->(B:Node { weight: 2 })
Note that modeling in this way would make it possible to change the base relationship to point to another node without changing anything about A. The downside is that you'd need a mini calculator to expand expressions like "base+2", which is easy but in any case extra work.
Interpreting your question another way, you're in a situation where you'd probably want a trigger. Here's an article on neo4j triggers, how graphs handle this. If parsing that expression "base+2" at read time isn't what you want, and you want to actually set the value on A to be b.weight + 2, then you want a trigger. This would let you define some other function to be run when the graph gets updated in a certain way. In this case, when someone inserts a new :base relationship in the graph, you might check the base value (endpoint of the relationship) and add 2 to its weight, and set that new property value on the source of the relationship.
Yes, you can use either DBMS.
To help you decide, this is an example of how to support your uses cases in neo4j.
To create your sample data:
CREATE
(a:Foo {id: 'A'}), (b:Foo {id: 'B'}), (c:Box {id: 123}),
(h1:Height {value: 1}), (w4:Weight {value: 4}),
(a)-[:HAS_HEIGHT {factor: 1, offset: 0}]->(h1),
(a)-[:HAS_WEIGHT {factor: 1, offset: 2}]->(w4),
(b)-[:HAS_WEIGHT {factor: 1, offset: 0}]->(w4),
(b)-[:HAS_HEIGHT {factor: 1, offset: 3}]->(h1),
(c)-[:CONTAINS {count: 5}]->(a),
(c)-[:CONTAINS {count: 2}]->(b);
"A" and "B" are represented by Foo nodes, and "C" by a Box node. Since a given height or weight can be referenced by multiple nodes, this example data model uses shared Weight and Height nodes. The HAS_HEIGHT and HAS_WEIGHT relationships have factor and offset properties to allow adjustment of the height or weight for a particular Foo node.
To query "What is the height of A":
MATCH (:Foo {id: 'A'})-[ra:HAS_HEIGHT]->(ha:Height)
RETURN ra.factor * ha.value + ra.offset AS height;
To query "What is the ratio of the heights of A and B":
MATCH
(:Foo {id: 'A'})-[ra:HAS_HEIGHT]->(ha:Height),
(:Foo {id: 'B'})-[rb:HAS_HEIGHT]->(hb:Height)
RETURN
TOFLOAT(ra.factor * ha.value + ra.offset) /
(rb.factor * hb.value + rb.offset) AS ratio;
Note: TOFLOAT() is used above to make sure integer division, which would truncate, is never used.
To query "What is the total weight of objects appearing in C":
MATCH (:Box {id: 123})-[c:CONTAINS]->()-[rx:HAS_WEIGHT]->(wx:Weight)
RETURN SUM(c.count * (rx.factor * wx.value + rx.offset));
I have not used Mongo and decided not to after studying it. So filter my opinion with that in mind; users may find my comments easy to overcome. Mongo is not a true graph database. The user must create and manage the relationships. In Neo4j, relationships are "native" and robust.
There is a head to head comparison at this site:
[https://db-engines.com/en/system/MongoDB%3bNeo4j]
see also: https://stackoverflow.com/questions/10688745/database-for-efficient-large-scale-graph-traversal
There is a distinction between NoSQL (e.g., Mongo) and a true graph database. Many seem to assume that if it is not SQL then it's a graph database. This is not true. Most NoSQL data bases do not store relationships. The free book describes this in chapter 2.
Personally, I'm sold on Neo4j. It makes relationships, transerving graphs and collecting lists along the path easy and powerful.

RDF - citations for predicates

I'm attempting to create/implement an RDF-schema for historical data: setting up a base schema is simple, but I've run into a roadblock when trying to add a verifiable source for each claim (predicate).
For example, it's relatively easy to declare 'Alexander (subject) has a brother (predicate) Robert (object)'... but I would like that claim (that is the whole) that Robert was indeed Alexander's brother to be cited, with a link to the original source document, if possible.
The best I could come up with is a 'doubled triplet' schema:
Alexander -> has_a_brother -> Robert
Alexander -> has_a_brother_source -> 'http://sourcedocumentserver.com/blablabla'
Alexander -> has_a_brother_source -> 'http://sourcedocumentserver2.com/blablabla'
Alexander -> has_a_brother_source -> 'http://sourcedocumentserver3.com/blablabla'
Is there a better way to go about this?
PS: this problem is a bit different than a 'predicate variable' (like the question that this one was said to replicate), but in this 'citation needed' circumstance, every single predicate in the database would need a secondary description/label/name (and that's even without imagining multiple citations, attribution data, etc.).
The best I could find was transforming every 'citation needed' predicate into a named graph (with multiple RDF triplets) that would contain the object normally directly joined (if no citation were required).

Normalization 3NF

I an reading through some examples of normalization, however I have come across one that I do not understand.
The website the example is located here: http://cisnet.baruch.cuny.edu/holowczak/classes/3400/normalization/#allinone
The part I do not understand is "Third Normal Form"
In my head I see the transitive dependencies in EMPLOYEE_OFFICE_PHONE (Name, Office, Floor, Phone) as the following Name->->Office|Floor and Name->->Office|Phone
The author splits the table EMPLOYEE_OFFICE_PHONE (Name, Office, Floor, Phone) into EMPLOYEE_OFFICE (Name, Office, Floor) and EMPLOYEE_PHONE (Office, Phone)
From my judgement in the beginning, I still see the transitive dependency in Name->->Office|Floor so I don't understand why it is in 3NF. Was I wrong to state that there is a transitive dependency in Name->->Office|Floor?
Reasoning for transitivity:
Here is my list of the functional dependencies
Name -> Office
Name -> Floor
Name -> Phone
Office -> Phone
Office -> Floor (Is this the incorrect one? and why?
Thank-you your help everyone!
5) you assume a naming sheme here ... offices 4xx have to be on floor 4 ... 5xx have to be on floor 5 ... if such a scheme exists, you can have your dependency ... as long as this is not part of the specification ... no. 5 is out of the game ...
1. Name -> Office
2. Name -> Floor
3. Name -> Phone
4. Office -> Phone
5. Office -> Floor (Is this the incorrect one? and why?
(1) You and the author and I agree that Name->Office.
(2) You and the author agree that Name->Floor. While that's true based solely on the sample data, it's also true that Office->Floor. I'd explore this kind of issue by asking this question: "If an office is empty, do I still know what floor that office is on?" (Yes)
Those things suggest there's a transitive dependency, Name->Office, and Office->Floor. So I would disagree with you and with the author on this one.
(3) You say Name->Phone. The author says Office->Phone. The author also says that "each office has exactly one phone number." So given one value for Office, I know one and only one value for Phone. And given one value for Name, I know one and only one value for Phone. I'd explore this issue by asking, "If I move to a different office, does my phone number follow me?" If it does, then Name->Phone. If it doesn't, then Office->Phone.
There isn't enough information here to answer that question, and I've worked in offices that worked each of those two ways, so real-world experience doesn't help us very much, either. I'd have to side with the author in this case, although I think it's not very well thought through for a normalization example.
(4) This is really just an extension of (3) above.
(5) See (2) above. This doesn't have anything to do with a naming scheme, and you don't need to assume that offices numbered 5xx are on the 5th floor. The only relevant question is this: Given one value for Office, is there one and only one value for Floor? (Yes) I might explore this issue by asking "Can one office be on more than one floor?" (In the real world, that's remotely possible. But the sample data doesn't support that possibility.)
Some additional FDs, based solely on the sample data.
Phone->Office
Phone->Floor
Office->Name
First of all,let me define 3NF clearly :-
A relation is in 3NF if following conditions are satisfied:-
1.)Relation is in 2NF
2.)No non prime attribute is transitively dependent on the primary key.
In other words,a relation is in 3NF is one of the following conditions is satisfied for every functional dependency X->Y:-
1.)X is superkey
2.)Y is a prime attribute
For Your Question,if the following FDs are present :-
Name -> Office
Name -> Floor
Name -> Phone
Office -> Phone
Then we cannot say anything about Office and Floor.You can verify this by applying and checking any of the Armstrong Inference Rules.When you apply these rules, you will find that you cannot infer anything about office and floor.

representing play in relational db

I run a project that deals with plays, mostly Shakespeare. Right now, for storage, we just parse them out into a JSON array formatted like this:
[0: {"title": "Hamlet", "author": ["Shakespeare", "William"], "noActs": 5}, | info
1: [null, ["Scene 1 Setting", "Stage direction", ["character", ["char line 1", "2"]] | act 1
...]
and save them to a file.
We'd like to now move to a relational database for a variety of reasons (foremost currently search) but have no idea how to represent things.
I'm looking for an outline of the best way to do things?
the topic is 'normalization'
begin by identifying your main classifications, such as PLAY, AUTHOR, ACT, SCENE
then add attributes - specifically, a primary key like play_id, act_id, etc.
then add still more attributes like NAME or other identifying information.
then finally, add some relationships between these object bby creating more tables like PLAY_AUTHOR which includes PLAY_ID and AUTHOR_ID.
Not much of a theater goer myself, but this may give you some ideas should you choose a relational DB.

What is an appropriate data structure and database schema to store logic rules?

Preface: I don't have experience with rules engines, building rules, modeling rules, implementing data structures for rules, or whatnot. Therefore, I don't know what I'm doing or if what I attempted below is way off base.
I'm trying to figure out how to store and process the following hypothetical scenario. To simplify my problem, say that I have a type of game where a user purchases an object, where there could be 1000's of possible objects, and the objects must be purchased in a specified sequence and only in certain groups. For example, say I'm the user and I want to purchase object F. Before I can purchase object F, I must have previously purchased object A OR (B AND C). I cannot buy F and A at the same time, nor F and B,C. They must be in the sequence the rule specifies. A first, then F later. Or, B,C first, then F later. I'm not concerned right now with the span of time between purchases, or any other characteristics of the user, just that they are the correct sequence for now.
What is the best way to store this information for potentially thousands of objects that allows me to read in the rules for the object being purchased, and then check it against the user's previous purchase history?
I've attempted this, but I'm stuck at trying to implement the groupings such as A OR (B AND C). I would like to store the rules in a database where I have these tables:
Objects
(ID(int),Description(char))
ObjectPurchRules
(ObjectID(int),ReqirementObjectID(int),OperatorRule(char),Sequence(int))
But obviously as you process through the results, without the grouping, you get the wrong answer. I would like to avoid excessive string parsing if possible :). One object could have an unknown number of previous required purchases. SQL or psuedocode snippets for processing the rules would be appreciated. :)
It seems like your problem breaks down to testing whether a particular condition has been satisfied.
You will have compound conditions.
So given a table of items:
ID_Item Description
----------------------
1 A
2 B
3 C
4 F
and given a table of possible actions:
ID_Action VerbID ItemID ConditionID
----------------------------------------
1 BUY 4 1
We construct a table of conditions:
ID_Condition VerbA ObjectA_ID Boolean VerbB ObjectB_ID
---------------------------------------------------------------------
1 OWNS 1 OR MEETS_CONDITION 2
2 OWNS 2 AND OWNS 3
So OWNS means the id is a key to the Items table, and MEETS_CONDITION means that the id is a key to the Conditions table.
This isn't meant to restrict you. You can add other tables with quests or whatever, and add extra verbs to tell you where to look. Or, just put quests into your Items table when you complete them, and then interpret a completed quest as owning a particular badge. Then you can handle both items and quests with the same code.
This is a very complex problem that I'm not qualified to answer, but I've seen lots of references to. The fundamental problem is that for games, quests and items and "stats" for various objects can have non-relational dependencies. This thread may help you a lot.
You might want to pick up a couple books on the topic, and look into using LUA as a rules processor.
Personally I would do this in code, not in SQL. Each item should be its own class implementing an interface (i.e. IItem). IItem would have a method called OkToPurchase that would determine if it is OK to purchase that item. To do that, it would use one or more of a collection of rules (i.e. HasPreviouslyPurchased(x), CurrentlyOwns(x), etc.) that you can build.
The nice thing is that it is easy to extend this approach with new rules without breaking all the existing logic.
Here's some pseudocode:
bool OkToPurchase()
{
if( HasPreviouslyPurchased('x') && !CurrentlyOwns('y') )
return true;
else
return false;
}
bool HasPreviouslyPurchased( item )
{
return purchases.contains( item )
}
bool CurrentlyOwns( item )
{
return user.Items.contains( item )
}

Resources