Designing a multilevel company structure - sql-server

I am trying to build a database model for the following structure:
I have companies with up to 3 hierarchical levels. Each unit has a value; these values are assigned randomly, and duplicates are possible between companies (but not within one). For example: Level 1: 222-Amazon; Level 2: 441-Amazon: Germany, 542-Britain; Level 3: 6-Distribution, 99-Shop, 124-Programming, 5-HR.
Of course, this is different for each company. What I did is:
Table1:
ID_Worker
CompanyName
ID_CompanyLvL1
ID_CompanyLvL2
ID_CompanyLvL3
...
Table2:
ID_CompanyLevel1
Slot1
Slot2
...
Table3:
ID_CompanyLevel2
Slot1
Slot2
...
But this approach has the following problem: if two companies use the same number for a CompanyLevel1 (or 2 or 3) unit, I cannot distinguish them anymore.
Another approach that is not working is
Table1:
ID_Company
ID_Worker
ID_CompanyLevel1
...
Table2:
ID_CompanyLevel1
Slot1
ID_CompanyLevel2
...
Table3:
ID_CompanyLevel2
Slot
ID_CompanyLevel3
...
With this approach I cannot identify which person is in which level 2 unit, for example. Could anyone help me with this? I just cannot come up with the right design.

You need to decide whether the organization structure is purely hierarchical (an org unit can belong to at most one other org unit) or graphical (an org unit can belong to zero, one, or several other org units).
Your limit of three levels is a business rule, and should be enforced by database logic (a trigger), not by the database schema.
Why are the codes combined with the names?
If hierarchical, this is your schema:
create table organizations (
    organization_id int primary key,
    name varchar(whatever) not null,
    parent_id int null references organizations(organization_id)
);
Use Recursive Common Table Expressions to query them.
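To make the recursive query concrete, here is a minimal sketch. It runs against SQLite through Python's sqlite3 module purely so that it is self-contained; SQL Server uses the same pattern but writes WITH instead of WITH RECURSIVE. The surrogate ids and the sample rows are assumptions for illustration; the unit names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE organizations (
        organization_id INTEGER PRIMARY KEY,
        name            TEXT NOT NULL,
        parent_id       INTEGER NULL REFERENCES organizations(organization_id)
    );
    -- Surrogate ids are assumptions; they are unique across all companies.
    INSERT INTO organizations VALUES
        (1, 'Amazon',       NULL),
        (2, 'Germany',      1),
        (3, 'Britain',      1),
        (4, 'Distribution', 2),
        (5, 'Shop',         2),
        (6, 'HR',           3);
""")

def subtree(conn, root_id):
    """Return (organization_id, name, level) for root_id and all its descendants."""
    return conn.execute("""
        WITH RECURSIVE tree(organization_id, name, level) AS (
            -- Anchor member: the starting unit is level 1.
            SELECT organization_id, name, 1
            FROM organizations
            WHERE organization_id = ?
            UNION ALL
            -- Recursive member: children of rows already in the tree.
            SELECT o.organization_id, o.name, t.level + 1
            FROM organizations AS o
            JOIN tree AS t ON o.parent_id = t.organization_id
        )
        SELECT organization_id, name, level
        FROM tree
        ORDER BY level, organization_id
    """, (root_id,)).fetchall()

rows = subtree(conn, 1)
for org_id, name, level in rows:
    print("  " * (level - 1) + name)  # indent each unit by its depth
```

The level column is also what a three-level business rule would check: a trigger or application check could reject any insert that would produce a level above 3.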
If graphical, this is your schema:
create table organizations (
    organization_id int primary key,
    name varchar(whatever) not null
);
create table organizations_structure (
    parent_organization_id int references organizations(organization_id),
    child_organization_id int references organizations(organization_id),
    primary key (parent_organization_id, child_organization_id),
    check (parent_organization_id <> child_organization_id)
);
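As a rough illustration of the graph variant, the sketch below loads the two tables into SQLite (via Python's sqlite3, chosen only so the example runs anywhere) and gives one unit two parents. The sample rows and the helper try_link are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE organizations (
        organization_id INTEGER PRIMARY KEY,
        name            TEXT NOT NULL
    );
    CREATE TABLE organizations_structure (
        parent_organization_id INTEGER REFERENCES organizations(organization_id),
        child_organization_id  INTEGER REFERENCES organizations(organization_id),
        PRIMARY KEY (parent_organization_id, child_organization_id),
        CHECK (parent_organization_id <> child_organization_id)
    );
    INSERT INTO organizations VALUES (1, 'Germany'), (2, 'Britain'), (3, 'HR');
    -- HR belongs to both Germany and Britain: two parents for one child.
    INSERT INTO organizations_structure VALUES (1, 3), (2, 3);
""")

# All parents of the HR unit.
parents_of_hr = [row[0] for row in conn.execute("""
    SELECT p.name
    FROM organizations_structure AS s
    JOIN organizations AS p ON p.organization_id = s.parent_organization_id
    WHERE s.child_organization_id = 3
    ORDER BY p.name
""")]
print(parents_of_hr)

def try_link(conn, parent_id, child_id):
    """Attempt to add an edge; return False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO organizations_structure VALUES (?, ?)",
            (parent_id, child_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False

self_link = try_link(conn, 3, 3)  # rejected by the CHECK constraint
duplicate = try_link(conn, 1, 3)  # rejected by the primary key
```

Note that the CHECK constraint only blocks a unit being its own parent. Longer cycles (A under B and B under A) would still need a trigger or an application-level check.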

For anything like this, make sure you do not paint yourself into a corner. For example:
I have companies with up to 3 hierarchical levels
No. You have companies with CURRENTLY up to 3 hierarchical levels. And you do not want them screaming at you when one of them decides to have 4.
I would suggest reading The Data Model Resource Book, Volume 1. It describes all kinds of standard data schemas, among them organizations as entities (entity as in "legal, human or organizational entity"), which includes org charts. Things are a lot more complex than you think if you do not want to put yourself into a corner that WILL force a rewrite of the program in the not-too-distant future.

Related

Simple database design - some columns have multiple values

Caveat: very new to database design/modeling, so bear with me :)
I'm trying to design a simple database that stores information about images in an archive. Along with file_name (which is one distinct string), I have fields like genre and starring where each field might contain multiple strings (if an image is associated with multiple genres, and/or if an image has multiple actors in it).
Right now the database is just a single table keyed on file_name, and the fields like starring and genre just have multiple comma-separated values stored. I can query it fine by using wildcards and like and in operators, but I'm wondering if there's a more elegant way to break out the data such that it is easier to use/query. For instance, I'd like to be able to find how many unique actors are represented in the archive, but I don't think that's possible with the current model.
I realize this is a pretty elementary question about data modeling, but any guidance anyone can provide or reading you can direct me to would be greatly appreciated!
Thanks!
You need to create extra tables to keep the data normalized. In your situation you need 4 extra tables to represent these n->m relations (2 extra would be enough if the relations were 1->n).
Tables:
image(id, file_name)
genre(id, name)
image_genres(image_id, genre_id)
stars(id, name, ...)
image_stars(image_id, star_id)
And some data in tables:
image table
id  file_name
1   /users/home/song/empire.png
2   /users/home/song/promiscuous.png
genre table
id  name
1   pop
2   blues
3   rock
image_genres table
image_id  genre_id
1         2
1         3
2         1
stars table
id  name
1   Jay-Z
2   Alicia Keys
3   Nelly Furtado
4   Timbaland
image_stars table
image_id  star_id
1         1
1         2
2         3
2         4
To get the number of unique actors in the database, you can simply run the SQL query below:
SELECT COUNT(name) FROM stars
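Put together, the design can be tried out with the sketch below. It uses SQLite through Python's sqlite3 module only to stay self-contained; the UNIQUE and NOT NULL constraints and the helper stars_in are assumptions added for illustration, while the rows are the sample data above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE image (id INTEGER PRIMARY KEY, file_name TEXT NOT NULL UNIQUE);
    CREATE TABLE genre (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE stars (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    -- Junction tables: one row per (image, genre) and per (image, star) pair.
    CREATE TABLE image_genres (
        image_id INTEGER NOT NULL REFERENCES image(id),
        genre_id INTEGER NOT NULL REFERENCES genre(id),
        PRIMARY KEY (image_id, genre_id)
    );
    CREATE TABLE image_stars (
        image_id INTEGER NOT NULL REFERENCES image(id),
        star_id  INTEGER NOT NULL REFERENCES stars(id),
        PRIMARY KEY (image_id, star_id)
    );
    INSERT INTO image VALUES (1, '/users/home/song/empire.png'),
                             (2, '/users/home/song/promiscuous.png');
    INSERT INTO genre VALUES (1, 'pop'), (2, 'blues'), (3, 'rock');
    INSERT INTO stars VALUES (1, 'Jay-Z'), (2, 'Alicia Keys'),
                             (3, 'Nelly Furtado'), (4, 'Timbaland');
    INSERT INTO image_genres VALUES (1, 2), (1, 3), (2, 1);
    INSERT INTO image_stars  VALUES (1, 1), (1, 2), (2, 3), (2, 4);
""")

# Number of distinct actors that appear in at least one image.
actor_count = conn.execute(
    "SELECT COUNT(DISTINCT star_id) FROM image_stars"
).fetchone()[0]

def stars_in(conn, file_name):
    """Actors for one image, replacing a LIKE search on a comma-separated column."""
    rows = conn.execute("""
        SELECT s.name
        FROM image AS i
        JOIN image_stars AS x ON x.image_id = i.id
        JOIN stars AS s ON s.id = x.star_id
        WHERE i.file_name = ?
        ORDER BY s.name
    """, (file_name,))
    return [name for (name,) in rows]

print(actor_count)
print(stars_in(conn, '/users/home/song/empire.png'))
```

Counting through image_stars rather than stars makes a difference only if the stars table may hold actors who are not linked to any image.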

Recipe Database Design

I am trying to create a database to store my recipes. However, I am not sure how to implement it. I looked at other questions like this, but they do not have the same focus as mine.
I assume any dish is actually just an ingredient, which can then be used in other dishes, or in this case in other ingredients. Any ingredient may have multiple recipes. For now, each recipe indicates how much of each ingredient is needed, but I also want to know how these ingredients are combined without having a long text description of it.
For example, in text, I would describe one (very bad) scrambled eggs recipe like this:
Scrambled eggs:
Cooked for 5 minutes(
    1g Butter,
    Whisked(
        1g Salt,
        1g Pepper,
        2 Eggs
    )
)
and then Scrambled eggs could be used in another recipe as an ingredient.
But how would that translate in a database? I don't need that database to be SQL based since this is a personal project, but I don't know any other kind of databases so far.
I thought about defining an Ingredient, as having an optional Technique associated with it but that means Whisked(1g salt, 1g pepper, 2 eggs) would have to be an Ingredient. Which I guess could work and I could also make the name of ingredients optional, but it seems awkward.
I also thought about defining a Recipe as having multiple TransformedIngredients which would contain a Technique applied to many Ingredients but sometimes a Recipe contains raw, untransformed, Ingredients and sometimes TransformedIngredients would need to be applied to TransformedIngredient. From what I know of databases that wouldn't work.
PS: I stumbled onto a functional programming Tiramisu recipe which, though very much focused on the techniques, displays fairly well what I'm trying to implement for my database.
I think what's confusing is that there are two different things to think about with a recipe, 'Items' and 'Steps'.
One database structure that comes to mind for this is a Star Schema structure which separates these ideas nicely (into Dimension and Fact tables, respectively).
A quick description of each:
Dimension
"The state of something" i.e. a record is merely there to describe what the thing is. A customer's address table would be an example of a dimension table.
Fact
"Things changing over time" i.e. each record relates to a dimension table, but has changing values. An example would be shipped purchases from a website to a customer's address. The address stays the same, but the shipments are getting constantly added to the table.
This isn't to say that Dimension tables don't change, too; obviously new users sign up for websites all the time. In the above address example, if a customer were to change his address, a new primary key value would be added for the new address.
Now on to your recipe examples:
Imagine you're cooking something. I would put anything that you hold in your hands in a "dimension" table. For example: DIM_INGREDIENT (with columns such as INGREDIENT_ID, INGREDIENT_NAME), and DIM_AMOUNT (AMOUNT_ID, AMOUNT, UNITS) to describe the amounts. And DIM_ACTION (ACTION_ID, TYPE, LENGTH, UNITS) to describe the action. There are more you can come up with; these are a few to get started.
Any steps I'd be taking could go in a FACT_RECIPE_STEPS table that maps to all the dimension tables. Any dimension that doesn't apply to a step would have a null value (e.g. "stir for 5 minutes" would have null for INGREDIENT_ID).
The FACT_RECIPE_STEPS could look like this:
RECIPE_ID, RECIPE_STEP, ACTION_STEP_ID, INGREDIENT_ID, AMOUNT_ID, ACTION_ID
What gets confusing is the "substep" of whisking the stuff together. I put that in another FACT table called FCT_ACTION_STEP since "whisking" is one action in the recipe list, but to perform the action you actually need to do three things.
I think the following is what some of the tables would look like with your data:
DIM_INGREDIENT
INGREDIENT_ID  INGREDIENT_NAME
1              'Scrambled eggs'
2              'Salt'
3              'Pepper'
4              'Eggs'
5              'Butter'
DIM_ACTION
ACTION_ID  TYPE     LENGTH  UNITS
1          'Cook'   5       'minutes'
2          'Whisk'  null    null
FCT_ACTION_STEP
STEP_ID  ACTION_ID
1        2
DIM_AMOUNT
AMOUNT_ID  AMOUNT  UNITS
1          1       'grams'
2          2       null
FACT_RECIPE_STEPS
RECIPE_ID, RECIPE_STEP, ACTION_STEP_ID, INGREDIENT_ID, AMOUNT_ID, ACTION_ID
EDIT:
I was a bit unsure myself how to handle the "Whisked" part of the recipe. When you add the whisked mixture to the final result, it's like adding a single ingredient to the recipe. However, you need to prepare the mixture beforehand, and that takes three steps. It's basically its own little recipe, and the FCT_ACTION_STEP table takes that other 'recipe' into account so that the result can be added as one row in the FACT_RECIPE_STEPS table.
Now that I think about it a bit more, it might be better to just treat "Whisked" as its own recipe in FACT_RECIPE_STEPS and DIM_INGREDIENT (called something like "Whisked spices for eggs") and get rid of the FCT_ACTION_STEP table altogether. That way you can easily make more complex recipes, such as "Eggs and Pancake Breakfast", where the Eggs part is the result of this recipe.
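The "a recipe is just another ingredient" idea can be sketched as follows. This is a simplified, hypothetical layout rather than the exact star schema above: the action is folded into the ingredient row, amounts are stored inline, and an amount on a sub-recipe is assumed to mean "number of batches". It uses SQLite through Python's sqlite3 so that it can be run as-is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical simplified tables: every dish is itself a row in ingredient.
    CREATE TABLE ingredient (
        ingredient_id   INTEGER PRIMARY KEY,
        ingredient_name TEXT NOT NULL,
        action          TEXT              -- NULL for raw ingredients
    );
    CREATE TABLE recipe_step (
        recipe_id     INTEGER NOT NULL REFERENCES ingredient(ingredient_id),
        recipe_step   INTEGER NOT NULL,
        ingredient_id INTEGER NOT NULL REFERENCES ingredient(ingredient_id),
        amount        REAL NOT NULL,
        units         TEXT,
        PRIMARY KEY (recipe_id, recipe_step)
    );
    INSERT INTO ingredient VALUES
        (1, 'Scrambled eggs', 'Cook for 5 minutes'),
        (2, 'Salt', NULL), (3, 'Pepper', NULL), (4, 'Eggs', NULL), (5, 'Butter', NULL),
        (6, 'Whisked spices for eggs', 'Whisk');
    INSERT INTO recipe_step VALUES
        (6, 1, 2, 1, 'grams'), (6, 2, 3, 1, 'grams'), (6, 3, 4, 2, NULL),
        (1, 1, 5, 1, 'grams'), (1, 2, 6, 1, NULL);  -- step 2 uses recipe 6
""")

def raw_ingredients(conn, recipe_id):
    """Expand nested recipes down to ingredients that have no steps of their own."""
    return conn.execute("""
        WITH RECURSIVE parts(ingredient_id, amount, units) AS (
            SELECT ingredient_id, amount, units
            FROM recipe_step
            WHERE recipe_id = ?
            UNION ALL
            -- Descend into any part that is itself a recipe, scaling by batches.
            SELECT s.ingredient_id, p.amount * s.amount, s.units
            FROM parts AS p
            JOIN recipe_step AS s ON s.recipe_id = p.ingredient_id
        )
        SELECT i.ingredient_name, p.amount, p.units
        FROM parts AS p
        JOIN ingredient AS i ON i.ingredient_id = p.ingredient_id
        WHERE NOT EXISTS (
            SELECT 1 FROM recipe_step AS s WHERE s.recipe_id = p.ingredient_id
        )
        ORDER BY i.ingredient_name
    """, (recipe_id,)).fetchall()

for name, amount, units in raw_ingredients(conn, 1):
    print(name, amount, units)
```

Because a recipe only references other ingredient rows, "Eggs and Pancake Breakfast" would be one more ingredient row whose steps point at 'Scrambled eggs'.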
You can add some other fields to tables but I believe this schema works for you.
recipe
------------
r_id PK
recipe_name
cooking_time
recipe_of_recipes
-----------------
ror_id PK
ror_name
recipe_ror (table for many to many relation-> defining a recipe as an ingredient)
-------------
r_ror_id PK
r_id FK
ror_id FK
ingredients
-------------
i_id PK
t_id FK
r_id FK
ror_id FK (added later)
ingredient_name
quantity
technique
-------------
t_id PK
technique_name
EDIT
Let's say you want to store a recipe X (big X), which is a combination of recipes x and y plus ingredient z.
To prepare recipe X,
in the recipe, ingredients and technique tables you store
the x recipe and its ingredients w, t, r with technique p
the y recipe and its ingredients b, n, m with technique v
and also ingredient z with technique f (for this I forgot to add the field ror_id as an FK in the ingredients table)
You can define two different recipes (x and y) as ingredients of a recipe (X) using the recipe_ror table. This table ties the different recipes together as one (a many-to-many relationship between the recipe and recipe_of_recipes tables).
If you also want to store the technique for the X, x or y recipes (like "cook" in your example), you can also add a t_id field as an FK to the recipe and recipe_of_recipes tables.

Is it good practice to assign ranges to userid?

I'm building a database schema for users of my app, and I am thinking of setting the userid value according to user type. So,
buyers: 10001 to 19999
sellers: 20001 to 29999
shippers: 30001 to 39999
Next, I assign unique email addresses to the userid:
Login_table
Email.......password.......userid
aaaaa#yy.com....... password.......10005 ---> this email belongs to user 10005 (a buyer)
bbbbb#yy.com.......password.......20008 ---> this email belongs to user 20008 (a seller)
ccccc#yy.com.......password.......30187 ---> this email belongs to user 30187 (a shipper)
I then have 3 tables for buyers, sellers, and shippers because each may have different attributes:
buyer_table
buyerid.......name....... mother
10005....... John....... Mary
10006 ....... Chris....... Nancy
seller_table
sellerid....... name....... pet
20008 ....... Adam....... Dog
20018 ....... Tony ....... cat
shipper_table
shipperid....... name....... car
30187....... George....... GMC
30188 ....... Larry ....... Honda
The advantage here is that I have a single login_table for all user types. I do not want to have 3 login tables for each type. Based on the userid value I know what type of user it is. Keeping three tables for each user (buyer_table, seller_table, and shipper_table) is good for making the schema more understandable, in addition to being able to assign different attributes to each user type.
Sounds good? Maybe.
However, I have a problem in that the login_table refers to “userid” while the three user tables each have a different name for the user id: in the buyer_table I have buyerid as primary key, in the seller_table it is sellerid, and in the shipper_table it is shipperid.
How can I link these three primary keys to the login_table? The login_table has userid as a foreign key to one of those three tables, but it is called “userid”, not buyerid, or sellerid, or shipperid!
1) Is it a good idea to classify the userid value according to ranges?
2) If so, how can I resolve the PK-FK issue as described above?
3) Am I off completely?
Having ranges of values for different kinds of similar objects is not bad. If you feel like doing so, you could use sequences, which support value ranges. This way, you could have a buyer sequence that goes from 0 to 1000, a seller one from 1001 to 2000, and so on. That would also help you keep track of the increasing index of the different kinds!
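On the PK-FK naming question: a foreign key column does not need to have the same name as the key it references. The sketch below is one possible arrangement, not the only one. It assumes the reference points from each type table to login_table, and because SQLite (used here through Python's sqlite3 only so the example runs) has no sequence objects, it enforces the range with a CHECK constraint instead. The helper add_buyer is hypothetical; seller_table and shipper_table would follow the same pattern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE login_table (
        userid   INTEGER PRIMARY KEY,
        email    TEXT NOT NULL UNIQUE,
        password TEXT NOT NULL
    );
    CREATE TABLE buyer_table (
        -- Named buyerid, yet it references login_table.userid.
        buyerid INTEGER PRIMARY KEY
                REFERENCES login_table(userid)
                CHECK (buyerid BETWEEN 10001 AND 19999),
        name    TEXT NOT NULL,
        mother  TEXT
    );
""")

def add_buyer(conn, userid, email, name, mother):
    """Insert the login row and the buyer row together; undo both on failure."""
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "INSERT INTO login_table VALUES (?, ?, ?)",
                (userid, email, "password"),
            )
            conn.execute(
                "INSERT INTO buyer_table VALUES (?, ?, ?)",
                (userid, name, mother),
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = add_buyer(conn, 10005, "aaaaa#yy.com", "John", "Mary")
out_of_range = add_buyer(conn, 20008, "bbbbb#yy.com", "Adam", "Mary")  # seller-range id
login_rows = conn.execute("SELECT COUNT(*) FROM login_table").fetchone()[0]
print(ok, out_of_range, login_rows)
```

With the range enforced by a constraint, the user type can still be read off the id, but the database no longer depends on every insert picking the right range by convention.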

Enforcing a unique combination relationship in fields

Summary: I need any combination of [Field_1] and [Field_2] to be unique and for that uniqueness to be enforced. Note: This is not for permutations - and that's the difficulty.
In Depth:
I'm trying to track contacts for vendor software. I've set my DB up in the time-honored fashion, such that a Vendor record may have many contacts. The trick is that contacts may be related to each other and may not be related to the parent vendor record. An example:
1. SuperBrokenSoftware is a tool whose vendor I need to contact all the time.
2. WeMakeBadSoftware is the Vendor
3. Fred works for WeMakeBadSoftware
4. Gale works for WeHelpPeopleWhenOthersWont
Let's say Gale is the appropriate contact to fix my issue with the SuperBrokenSoftware.
There is no way, using the current hierarchy, to track Gale's relationship to SuperBrokenSoftware.
My solution is to keep track of these relationships in a table like so:
Field1 Field2 Field3
Fred Gale Gale handles specific issues for Fred
However given this solution Field_1 and Field_2 must be unique in combination. That is to say the records:
Field1 Field2 Field3
Fred Gale "Gale handles specific issues for Fred"
Gale Fred "Gale is awesome - Fred sucks"
Should be viewed as the same. Record 2 should not be allowed in the database because it is not unique.
What I have Tried:
Using the bijective - Szudzik's function: a >= b ? a * a + a + b : a + b * b; where a, b >= 0
I can calculate a unique identifier for every combination - but Access cannot enforce uniqueness on a calculated field.
What is the best way to enforce a combination in Access?
Thanks in advance!!!
Create a new field for the unique identifier, give it a unique index, and create a Before Change data macro that inserts or updates the calculated identifier in that field.
The unique key can simply be a sorted concatenation of Field1 and Field2.
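The sorted-concatenation idea can be illustrated outside Access. The sketch below uses SQLite through Python's sqlite3 only to show the mechanism; in Access the key would be filled in by the Before Change data macro instead of by the hypothetical pair_key function. The table name and the '|' separator are assumptions, and the separator must not occur inside the field values.

```python
import sqlite3

def pair_key(a, b):
    """Order-independent key: sort the two values, then join them with a separator."""
    first, second = sorted((a, b))
    return f"{first}|{second}"

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE contact_relation (
        field1   TEXT NOT NULL,
        field2   TEXT NOT NULL,
        field3   TEXT,
        pair_key TEXT NOT NULL UNIQUE   -- stored key backed by a unique index
    )
""")

def add_relation(conn, field1, field2, note):
    """Insert a relation; return False if the same pair exists in either order."""
    try:
        conn.execute(
            "INSERT INTO contact_relation VALUES (?, ?, ?, ?)",
            (field1, field2, note, pair_key(field1, field2)),
        )
        return True
    except sqlite3.IntegrityError:
        return False

first = add_relation(conn, "Fred", "Gale", "Gale handles specific issues for Fred")
second = add_relation(conn, "Gale", "Fred", "Gale is awesome - Fred sucks")
print(first, second)
```

Sorting before combining is the step that turns permutations into combinations. A pairing function such as Szudzik's gives different results for (a, b) and (b, a), so it would need the same sort applied first.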

Is there any contradiction against too many values in a table (database)?

I was wondering if there are any drawbacks or future problems with a database table that contains about 80 columns. There will be only VARCHARs, a few INTs and maybe 1 or 2 MESSAGE columns. I did some research on the net, but nothing really talks about this kind of problem. In other words, is it okay, or even 'normal', to put that many values inside one table?
Thanks in advance!
You shouldn't have any real problems if the fields are mostly integers. Most DBMSes have a limit on row length, so a bunch of long columns can cause issues...but unless the varchar columns are very long, you're probably OK.
I've honestly never even needed to think about that, though -- with a properly normalized database, it's quite rare to ever need that many columns in a table.
The more columns you have, the more memory the server needs to process the records.
I recommend using a "many-to-one" relation scheme in this case.
Example of tables:
customer
id
name
email
...
ins_app_form (Insurance application form)
id
customer_id (relation with customer)
date
... (here comes some other data if you need)
ins_app_item (Insurance application form items/fields)
id
ins_app_form_id (relation with Insurance application form)
question (the name of a question in application form)
answer (customer's answer)
So to show the application form with this scheme you will need to run a query:
SELECT
    iaf.id AS application_id,
    iaf.date AS `date`,
    iai.question,
    iai.answer
FROM ins_app_form AS iaf
LEFT JOIN ins_app_item AS iai ON iai.ins_app_form_id=iaf.id
WHERE iaf.customer_id=<ID of a customer>
This query will bring you something like this:
application_id  date        question     answer
1               2014-03-31  "Year"       "2008"
1               2014-03-31  "Car make"   "Audi"
1               2014-03-31  "Car model"  "Q7"
...
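For a quick end-to-end check, the scheme above can be loaded into SQLite through Python's sqlite3 module (picked only so the example is self-contained). The customer row is a made-up placeholder, and the helper application_rows is hypothetical; the form items are the ones shown in the result above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
    CREATE TABLE ins_app_form (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        date        TEXT
    );
    CREATE TABLE ins_app_item (
        id              INTEGER PRIMARY KEY,
        ins_app_form_id INTEGER NOT NULL REFERENCES ins_app_form(id),
        question        TEXT NOT NULL,
        answer          TEXT
    );
    -- Customer name and email are made-up placeholders.
    INSERT INTO customer VALUES (1, 'Example Customer', 'customer@example.com');
    INSERT INTO ins_app_form VALUES (1, 1, '2014-03-31');
    INSERT INTO ins_app_item (ins_app_form_id, question, answer) VALUES
        (1, 'Year', '2008'), (1, 'Car make', 'Audi'), (1, 'Car model', 'Q7');
""")

def application_rows(conn, customer_id):
    """One row per question/answer pair on the customer's application forms."""
    return conn.execute("""
        SELECT iaf.id AS application_id, iaf.date, iai.question, iai.answer
        FROM ins_app_form AS iaf
        LEFT JOIN ins_app_item AS iai ON iai.ins_app_form_id = iaf.id
        WHERE iaf.customer_id = ?
        ORDER BY iai.id
    """, (customer_id,)).fetchall()

rows = application_rows(conn, 1)

# Pivot the rows back into a single "wide" record in application code.
as_dict = {question: answer for _, _, question, answer in rows}
print(as_dict)
```

The trade-off of this layout is that every answer is stored as text, so type checks and per-question constraints move from the schema into the application.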

Resources