Difficult kind of Hierarchical Data in a Relational Database

I have "components" which can be assembled in different ways into a "system". I want my database to hold all these "components", their type specific data and define how they are connected to each other to form a "system".
The systems are typically gearboxes and they can have rather complex branched designs. Let's start with an easy example:
This system is built up from Masses (horizontal lines) and Stiffnesses (vertical lines). Gears and clutches are types of masses and come in pairs. The colors represent different branch speeds due to gear ratios. Here's a (bad) example of how I could store everything from this particular illustration:
ID | Type | Clutch | Ends | DrivenBy | NoOfTeeth| Mass | Stiffness
--- | ---- | ------ | ---- | --------- | -------- | ---- | ---------
1 | Mass | | Input1 | | | 5 |
2 | Stiffness | | | | | | 15
3 | Mass | 1.1 | | | | 2 |
4 | Mass | 1.2 | | | | 3 |
5 | Stiffness | | | | | | 20
6 | Gear | | | | 10 | 4 |
7 | Stiffness | | | | | | 30
8 | Gear | | | | 4 | 5 |
9 | Gear | | | 8 | 7 | 2 |
10 | Stiffness | | | | | | 40
11 | Mass | | | | | 4 |
12 | Stiffness | | Output1 | | | | 10
13 | Gear | | | 6 | 5 | 4 |
14 | Stiffness | | | | | | 20
15 | Mass | 2.1 | | | | 4 |
16 | Mass | 2.2 | | | | 3 |
17 | Stiffness | | | | | | 30
18 | Mass | | Output2 | | | 2 |
Obviously, this is not a very good way to store the data. The design somewhat resembles the "repeated attributes" anti-pattern, since each component type has a different set of attributes to fill in. I could create a table for each type of component, but things become more complex when looking at other examples, such as this 2-stage gearbox:
There are also examples with more than 1 input and several outputs, but I can't post more links due to low reputation.
Either way, you will see that the usual hierarchical data storage doesn't apply here, because the data is not purely "tree-shaped" with everything branching off from one main branch.
I think that even though I could store the data in the way shown above, I would run into huge difficulties at the programming stage.
To add to the complexity, these gearboxes are actually sub-systems to a much bigger system.
So, any suggestions on a good way to store this type of data?

Perhaps this is a possible way of doing it?
Here you will see that there is a "main" table called GearboxBranch, keeping track of all elements in the gearbox, giving each one an id and identifying which branch the element belongs to.
Then, for the elements themselves, masses are defined in their own dedicated table, and so are stiffnesses. Gears and clutches (which are types of masses) are then defined in their respective tables. There is a recursive relationship in the Gear table, since a gear has to be driven by at least one other gear.
Furthermore, the table with Shaft Ends defines which of the elements in the gearbox are input or output and what number they have.
I can't see any obvious problems with this method, but I'm a little unsure how to get data out of the database. There will be considerable coding involved, I'm afraid.
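A minimal sketch of that design as Postgres-flavored DDL may make it concrete. Every table and column name below is an assumption read off the description above, not an actual definition:

```sql
-- Sketch only: names are assumptions based on the description above.
CREATE TABLE GearboxBranch (
    element_id int PRIMARY KEY,
    branch_no  int NOT NULL                       -- which branch the element is in
);

CREATE TABLE Mass (
    element_id int PRIMARY KEY REFERENCES GearboxBranch (element_id),
    mass       numeric NOT NULL
);

CREATE TABLE Stiffness (
    element_id int PRIMARY KEY REFERENCES GearboxBranch (element_id),
    stiffness  numeric NOT NULL
);

-- Gears and clutches are kinds of masses, so each row here pairs with a row
-- in Mass (the "class table inheritance" pattern).
CREATE TABLE Gear (
    element_id  int PRIMARY KEY REFERENCES Mass (element_id),
    no_of_teeth int NOT NULL,
    driven_by   int REFERENCES Gear (element_id)  -- the recursive relationship
);

CREATE TABLE Clutch (
    element_id  int PRIMARY KEY REFERENCES Mass (element_id),
    clutch_no   int NOT NULL,                     -- clutch 1, 2, ...
    clutch_side int NOT NULL                      -- side 1 or 2 (the 1.1/1.2 pairs)
);

CREATE TABLE ShaftEnd (
    element_id int PRIMARY KEY REFERENCES GearboxBranch (element_id),
    direction  text NOT NULL CHECK (direction IN ('input', 'output')),
    end_no     int NOT NULL                       -- Input1, Output2, ...
);
```

Reassembling a whole system then means joining these tables and walking Gear.driven_by with a recursive query (WITH RECURSIVE in Postgres), which is presumably where most of that coding effort would sit.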

Related

Best way of storing enumerated fields with ability to change order in Postgres

What is the best way of storing enumerated fields with the ability to change their order?
Let's say my database looks like this:
| id | name | order |
| -- | ---- | ----- |
| 1 | 1st | 1 |
| 2 | 2nd | 2 |
| 3 | 3rd | 3 |
| 4 | 4th | 4 |
Now, the user changes the order in such a way:
| id | name | order |
| -- | ---- | ----- |
| 1 | 1st | 1 |
| 4 | 4th | 2 |
| 2 | 2nd | 3 |
| 3 | 3rd | 4 |
Here I would have to update all rows in this table.
I'm considering two solutions.
Solution 1)
When inserting row X between, for example, order 2 and order 3, I would set row X's order field to 2.5, i.e. choose the number in the middle of the adjacent orders.
The table above would then look like this:
| id | name | order |
| -- | ---- | ----- |
| 1 | 1st | 1 |
| 2 | 2nd | 2 |
| 4 | 4th | 2.5 |
| 3 | 3rd | 3 |
Then, after for example 16 changes, I would update the table and normalize all order fields, so after normalization the table would look like this:
| id | name | order |
| -- | ---- | ----- |
| 1 | 1st | 1 |
| 4 | 4th | 2 |
| 2 | 2nd | 3 |
| 3 | 3rd | 4 |
Solution 2)
I have also considered adding "next" (or "next" and "prev") fields to each row, but that looks like a waste of memory to me.
I really don't want to update the whole table every time somebody changes the order. What is the best way to solve this problem?
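For what it's worth, Solution 1 only takes a few statements. A minimal sketch in Postgres, where the table and column names are placeholders ("order" is a reserved word in SQL, so the column is called sort_order here):

```sql
CREATE TABLE items (
    id         serial  PRIMARY KEY,
    name       text    NOT NULL,
    sort_order numeric NOT NULL          -- numeric, so it can hold 2.5, 2.75, ...
);

-- Moving a row between the rows at positions 2 and 3 touches one row only:
UPDATE items
SET    sort_order = (2 + 3) / 2.0        -- midpoint of the adjacent orders
WHERE  id = 4;

-- The occasional normalization pass rewrites positions back to 1..N
-- in a single statement:
UPDATE items i
SET    sort_order = r.rn
FROM  (SELECT id, row_number() OVER (ORDER BY sort_order) AS rn
       FROM   items) r
WHERE  i.id = r.id;
```

With a numeric column the repeated midpoints merely accumulate decimal digits, so the normalization pass is mostly cosmetic; with a float column it would also protect against eventually running out of precision.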

Algorithm or pseudocode for finding combinations from a table

I have a table with thousands of items, each with a lot of attributes (approx. 15+). I would like to select the following results:
Select all combinations of items that reach at least 100% of each attribute. Exactly 100% would be nice, but that's not necessary, so it can go over a little or be a little less (maybe ±2%).
All combinations would be a big dataset, so I think it would be better to sort them by price and select only the 10 cheapest ones.
Also, what if I want to modify the selects so that one or several attributes can't exceed some value, 50% for example?
| item name | attribute 1 | attribute 2 | attribute 3 | price |
| --------- | ----------- | ----------- | ----------- | ----- |
| item 1 | 25% | 1% | 5% | 1€ |
| item 2 | 10% | 10% | 10% | 2€ |
| item 3 | 5% | 20% | 5% | 3€ |
| item 4 | 20% | 15% | 50% | 12€ |
I don't know if there is an existing algorithm for my problem (I hope so), or if my problem has a name I can google, but I would be thankful for any tips on how I should proceed.
The only way I can think of for now is to brute-force all the combinations and drop the unusable ones. But I don't think that's the right way (maybe I'm wrong and that's the only way).
The number of items, prices and attribute values can change over time. If they were static, I would just run the brute-force option once and be done with it.
Sorry if this question was already asked.
EDIT:
As an example, I can provide nutritional information about food (all the numbers are made up).
The daily intake of carbohydrates/fat/protein is 225g/30g/65g:
| item name | carbohydrates | fat | protein | sodium | price |
| --------- | ------------- | --- | ------- | ------ | ----- |
| apple | 10g | 1g | 5g | 1mg | 1€ |
| banana | 20g | 2g | 10g | 1mg | 2€ |
| pear | 15g | 3g | 5g | 5mg | 3€ |
1. Find me a combination of foods which will reach the daily intake.
2. The same as in 1., but sorted by price / selecting the cheapest.
3. Only combinations with sodium not exceeding 30mg.
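The brute-force-and-drop idea from above can at least be pushed into the database with pruning, so that partial combinations which already break an upper bound are discarded early. A sketch in Postgres, assuming a hypothetical table foods(name, carbs, fat, protein, sodium, price) holding the example's numbers as plain numerics, with each food used at most once:

```sql
WITH RECURSIVE combos AS (
    -- start from single foods; requirement 3: hard sodium limit
    SELECT name AS items,
           carbs, fat, protein, sodium, price,
           name AS last_name
    FROM   foods
    WHERE  sodium <= 30
    UNION ALL
    -- extend each combination; "f.name > c.last_name" stops the same set
    -- from being generated in several orders
    SELECT c.items || ' + ' || f.name,
           c.carbs + f.carbs, c.fat + f.fat, c.protein + f.protein,
           c.sodium + f.sodium, c.price + f.price,
           f.name
    FROM   combos c
    JOIN   foods  f ON f.name > c.last_name
    WHERE  c.sodium  + f.sodium  <= 30          -- prune hopeless branches early
      AND  c.carbs   + f.carbs   <= 225 * 1.02
      AND  c.fat     + f.fat     <=  30 * 1.02
      AND  c.protein + f.protein <=  65 * 1.02
)
SELECT items, price
FROM   combos
WHERE  carbs   BETWEEN 225 * 0.98 AND 225 * 1.02   -- daily targets, +-2%
  AND  fat     BETWEEN  30 * 0.98 AND  30 * 1.02
  AND  protein BETWEEN  65 * 0.98 AND  65 * 1.02
ORDER BY price
LIMIT 10;
```

For thousands of items this still grows combinatorially; hitting per-attribute targets at minimum cost is essentially a multi-dimensional knapsack / integer-programming problem, and a dedicated LP/ILP solver will handle it far better than any brute force.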

How can we validate tabular data in Robot Framework?

In Cucumber, we can directly validate database table content in tabular format by specifying the values in the format below:
| Type | Code | Amount |
| A | HIGH | 27.72 |
| B | LOW | 9.28 |
| C | LOW | 4.43 |
Do we have something similar in Robot Framework? I need to run a query on the DB, and the output looks like the table given above.
No, there is nothing built in to do exactly what you say. However, it's fairly straightforward to write a keyword that takes a table of data and compares it to another table of data.
For example, you could write a keyword that takes the result of the query followed by rows of information (though the rows must all have exactly the same number of columns):
| | ${ResultOfQuery}= | <do the database query>
| | Database should contain | ${ResultOfQuery}
| | ... | #Type | Code | Amount
| | ... | A | HIGH | 27.72
| | ... | B | LOW | 9.28
| | ... | C | LOW | 4.43
Then it's just a matter of iterating over all of the arguments three at a time, and checking if the data has that value. It would look something like this:
*** Keywords ***
| Database should contain
| | [Arguments] | ${actual} | @{expected}
| | :FOR | ${type} | ${code} | ${amount} | IN | @{expected}
| | | <verify that the values are in ${actual}>
Even easier might be to write a Python-based keyword, which makes it a bit easier to iterate over datasets.

What database technologies should I consider for building a scalable "running average" view?

We are working on an application where millions of users will be entering information at the same time. Suppose the application allows people to rate geographic regions on where they would like to live. Each participant is allowed to rate each region using a decimal value from 0-10. Each person belongs to one or more groups based upon attributes such as gender, considering themselves active, or enjoying culture.
Every time a rating is made, we need a view which shows us the average rating for each region/group. I'm aware that most DBs have an "average" function, but for our purposes we need to be able to use our own function, as we may use the geometric mean instead of the arithmetic mean.
Below are some tables which might be used. Note: for brevity I did not include the relationship table PeopleGroups, which maps which groups a person is a member of.
Regions

+-----+------------+
| RID | NAME       |
+-----+------------+
| 1   | Florida    |
| 2   | California |
+-----+------------+

People

+-----+-------+
| PID | Name  |
+-----+-------+
| P1  | Alice |
| P2  | Bob   |
| P3  | Frank |
| P4  | Mary  |
+-----+-------+

Groups

+-----+----------+
| GID | Name     |
+-----+----------+
| G0  | Everyone |
| G1  | Women    |
| G2  | Men      |
| G3  | Active   |
| G4  | Culture  |
+-----+----------+

RegionScoresByPerson

+-----+-----+-------+
| RID | PID | Score |
+-----+-----+-------+
| 1   | P1  | 6     |
| 1   | P2  | 8     |
| 1   | P3  | 3     |
| 1   | P4  | 2     |
| 1   | P1  | 7     |
| 1   | P2  | 5     |
| 1   | P3  | 8     |
| 1   | P4  | 2     |
+-----+-----+-------+
Our current implementation uses a similar set of tables for storing ratings, but we don't calculate averages in real time. Any time we need the results (e.g. show me the average score for California among women), we have to pull all the information into memory and run the calculations manually.
I was wondering how I can leverage database technologies such as views, triggers, stored procedures, etc. to present a simple table that will allow me to get scores for people and groups, so we don't have to run calculations manually.
I would like a table like the following, where everything is handled by the DB. Any insert, update, or delete action on the RegionScoresByPerson or Groups tables would automatically be reflected in this table. If it is not apparent, the rows marked with * are calculated rows. In this case I'm using a simple arithmetic average, but the design should allow for any type of function.
EID stands for entity ID (a person or group).
Besides deciding how to build such a view, I'm unsure what sort of datatypes to use (and index) for People and Groups. I suppose I'd like the index to be integers, but that would prevent me from creating the table below because I couldn't distinguish between Person 1 and Group 1. Would having IDs such as P1 and G1 be a performance hit? I'm obviously concerned about the design being scalable.
ScoreView

+-----+-----+-------+
| RID | EID | Score |
+-----+-----+-------+
| 1   | P1  | 6     |
| 1   | P2  | 8     |
| 1   | P3  | 3     |
| 1   | P4  | 2     |
| 1   | P1  | 7     |
| 1   | P2  | 5     |
| 1   | P3  | 8     |
| 1   | P4  | 2     |
| 1   | G0  | 4.75  | *
| 1   | G1  | 4     | *
| 1   | G2  | …     | *
| 1   | G3  | …     | *
+-----+-----+-------+
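One way to get the calculated rows without pulling data into memory is a view. A minimal sketch in Postgres, using the PeopleGroups(PID, GID) relationship table the question mentions but omits; the geometric mean is built from EXP/LN/AVG to show that the aggregate doesn't have to be the built-in average:

```sql
CREATE VIEW GroupScoreView AS
SELECT s.RID,
       g.GID                            AS EID,
       AVG(s.Score)                     AS avg_score,  -- arithmetic mean
       EXP(AVG(LN(NULLIF(s.Score, 0)))) AS geo_score   -- geometric mean; zero
                                                       -- scores are skipped here
FROM   RegionScoresByPerson s
JOIN   PeopleGroups pg ON pg.PID = s.PID
JOIN   Groups g        ON g.GID  = pg.GID
GROUP  BY s.RID, g.GID;
```

A plain view recomputes on every read; if reads vastly outnumber writes at the scale described, a materialized view or a trigger-maintained summary table is the usual trade-off.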
Apache Flume is an open-source tool designed for this kind of problem. Also have a look at Google Cloud Dataflow.
https://flume.apache.org/

Database design for download presets

Newbie with databases here, I would like some advice please.
I have agencies who can download photos.
By default each agency can download "medium" & "large" photos.
Now, from their account page, I would like them to be able to create extra custom presets and manage those.
I looked at how some blog software handles categories in its database and wrapped my head around this example. Is this the right approach?
Cheers
agency 1 has preset "medium" & "large"
agency 2 has preset "medium", "large" & "Bill custom"
-----------
| presets |
-----------------------------------------------
| preset_id | preset_name | preset_dimensions |
-----------------------------------------------
| 1 | medium | 800x600 |
| 2 | large | 3000x2000 |
| 3 | Bill custom | 640x420 |
-----------------------------------------------
----------------
| preset_assoc |
------------------------------------------------------------
| presassoc_id | presassoc_preset_id | presassoc_agency_id |
------------------------------------------------------------
| 1 | 1 | 1 |
| 2 | 2 | 1 |
| 3 | 1 | 2 |
| 4 | 2 | 2 |
| 5 | 3 | 2 |
------------------------------------------------------------
------------
| agencies |
---------------------------
| agency_id | agency_name |
---------------------------
| 1 | Joe ltd |
| 2 | Bill inc |
---------------------------
The approach is right. Because you have an N:N relation (one agency can have multiple presets, and the same preset could be used by multiple agencies), you need a joining table. The only questionable thing is that preset_assoc doesn't need presassoc_id, because the other two columns can be used as a composite primary key.
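As a sketch, the same design in Postgres-style DDL with that composite key:

```sql
CREATE TABLE agencies (
    agency_id   serial PRIMARY KEY,
    agency_name text   NOT NULL
);

CREATE TABLE presets (
    preset_id         serial PRIMARY KEY,
    preset_name       text   NOT NULL,
    preset_dimensions text   NOT NULL      -- e.g. '800x600'
);

-- N:N joining table: the two foreign keys together form the primary key,
-- making the surrogate presassoc_id unnecessary.
CREATE TABLE preset_assoc (
    presassoc_preset_id int NOT NULL REFERENCES presets  (preset_id),
    presassoc_agency_id int NOT NULL REFERENCES agencies (agency_id),
    PRIMARY KEY (presassoc_preset_id, presassoc_agency_id)
);
```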
