break normalization rules , creat table /view from muliple tables - sql-server

I have a question and am not sure if this is the correct forum to post it .
I have two tables, StudentTable and CourseTable, where each student takes more than one course.
Example: Student1 takes 2 courses: (C1, C2).
Student2 takes 3 courses (C1, C2, C3).
I need to create a table/view that contains student information from StudentTable, plus all the courses and the score for each course from CourseTable - in one row.
Example:
Row1= Student1_Id, C1_code, C1_name, C1_Score, C2_code, C2_name, C2_Score
Row2=
Student2_Id, C1_code, C1_name, C1_Score, C2_code, C2_name, C2_Score, C3_code, C3_name, C3_Score
Since Student1 has just two courses, I should enter NULL in 'Course 3 fields'
My struggle is in the insert statement. I tried the following but it showed an error.
Insert Into Newtable
( St_ID, C1_code,c1_name, C1_Score ,C2_code ,C2_name,C2_score,C3_code ,C3_name,C3_score)
Select
(Select St_ID from StudentTable)
,
(Select C_code,c_name,c_Score
from Coursetable,SudentTable
where course.Stid =Studet.stid)
,
(Select C_code,c_name,c_Score
from course ,student
where course.Stid =Studet.stid ),
(Select C_code,c_name,c_Score
from course ,student
where course.Stid =Studet.stid );
I'm fully aware that the New table/View will break the rules of normalization, but I need it for a specific purpose.
I tried also the PIVOT BY functionality but no luck with it.
FYI, I'm not expert in SQL syntax. I just know the basics.
I will be great full for any helpful suggestions to try.
I added My DB structure so you can have better Idea
First Table is Member table which Represent Students Information .The fields in this table are
member_sk (PrimaryKey), full_or_part_time, gender, age_at_entry, age_band_at_entry, disability, ethnicity,
widening_participation_level, nationality
Second Table is Modules table which include the Courses' scores that Student took .
The fields in this table are
Module_result_k(Primary Key), member_sk(Foreign key to connect to Member table), member_stage_sk ,module_k(Foreign key to connect to Module table), module_confirmed_grade_src, credit_or_result
Third Table is AllModuleInfo which is include general information for each course .The fields in this table are
Module_k (Primary key), module_name ,module_code, Module_credit, Module stage.
The New table that I will create has the following fields
member_sk (PrimaryKey), full_or_part_time, gender, age_at_entry, age_band_at_entry, disability, ethnicity,
widening_participation_level, nationality " This will be retrieved from Member table"
Also will include
Module 1_name ,module1_code, Module1_credit, Module1_ stage, member1_stage_sk , module1_confirmed_grade_src, credit1_or_result
Module 2_name ,module2_code, Module2_credit, Module2_ stage, member2_stage_sk , module2_confirmed_grade_src, credit2_or_result
-
-
-
I will repeat this fields 14 times which is equal to Maximum courses number that any of the students took.
//// I hope now my questions become more clear

Related

SSIS - Split multiple columns into single rows

Below is a very small example of the flat table I need to split. The first table is a Lesson table which has the ID, Name and Duration. The second table is a student table which only has the Student Name as a PK. And the third table will be a Many to Many of Lesson ID and Student Name.
Lesson Id
Lesson Name
Lesson Duration
Student1
Student2
Student3
Student4
1
Maths
1 Hour
Jean
Paul
Jane
Doe
2
English
1 Hour
Jean
Jane
Doe
I don't know how, using SSIS, I can assign Jean, Paul, Jane and Doe to their own tables using the Student 1, 2, 3 and 4 columns. When I figure this out, I imagine I can use the same logic to map the Lesson ID and columns to the third Many to Many table?
How do I handle duplicate entries, for example Jean Jane and Doe already exist from the first row so they do not need to be added to the Students table.
I assume I use a conditional split to skip null values? For example Student4 on the second row is Null.
Thanks for the assistance.
Were it me, I would design this as 3 data flows.
Data flow 1 - student population
Since we're assuming the name is what makes a student unique, we need to build a big list of the unique names.
SELECT D.*
FROM
(
SELECT S.Student1 AS StudentName
FROM dbo.MyTable AS S
UNION
SELECT S.Student2 AS StudentName
FROM dbo.MyTable AS S
UNION
SELECT S.Student3 AS StudentName
FROM dbo.MyTable AS S
UNION
SELECT S.Student4 AS StudentName
FROM dbo.MyTable AS S
)D
WHERE D.StudentName IS NOT NULL
ORDER BY D.StudentName;
The use of UNION in the query will handle deduplication of data and we wrap that in a derived table to filter the NULLs.
I add an explicit order by not that it's needed but since I'm assuming you're using the name as the primary key, let's avoid sort operation when we land the data.
Add an OLE DB Source to your data flow and instead of picking a table in the drop down, you'll use the above query.
Add an OLE DB Destination to the same data flow and connect the two. Assuming your target table looks something like
CREATE TABLE dbo.Student
(
StudentName varchar(50) NOT NULL CONSTRAINT PK__dbo__Student PRIMARY KEY(StudentName)
);
Data Flow 2 - Lessons
Dealers choice here, you can either write the query or just point at the source table.
A very good practice to get into with SSIS is to only bring the data you need into the buffers so I would write a query like
SELECT DISTINCT S.[Lesson Id], S.[Lesson Name], S.[Lesson Duration]
FROM dbo.MyTable AS S;
I favor a distinct here as I don't know enough about your data but if it were extended and a second Maths class was offered to accommodate another 4 students, it might be Lesson Id 1 again. Or it might be 3 as it indicates course time or something else.
Add an OLE DB Destination and land the data.
Data Flow 3 - Many to Many
There's a few different ways to handle this. I'd favor the lazy way and repeat our approach from the first data flow
SELECT D.*
FROM
(
SELECT S.Student1 AS StudentName, S.[Lesson Id]
FROM dbo.MyTable AS S
UNION
SELECT S.Student2 AS StudentName, S.[Lesson Id]
FROM dbo.MyTable AS S
UNION
SELECT S.Student3 AS StudentName, S.[Lesson Id]
FROM dbo.MyTable AS S
UNION
SELECT S.Student4 AS StudentName, S.[Lesson Id]
FROM dbo.MyTable AS S
)D
WHERE D.StudentName IS NOT NULL
ORDER BY D.StudentName;
And then land in your bridge table with an OLE DB Destination and be done with it.
If this has been homework/an assignment to have you learn the native components...
Do keep with the 3 data flow approach. Trying to do too much in one go is a recipe for trouble.
The operation of moving wide data to narrow data is an Unpivot operation. You'd use that in the Student and bridge table data flows but honestly, I think I've used that component less than 10 times in my career and/or answering SSIS questions here and I do a lot of that.
If the Unpivot operation generates a NULL, then yes, you'd likely want to use a Conditional Split to filter those rows out.
If your reference tables were more complex, then you'd likely be adding a Lookup component to your bridge table population step to retrieve the surrogate key.

How to implement many-to-many-to-many database relationship?

I am building a SQLite database and am not sure how to proceed with this scenario.
I'll use a real-world example to explain what I need:
I have a list products that are sold by many stores in various states. Not every Store sells a particular Product at all, and those that do, may only sell it in one State or another. Most stores sell a product in most states, but not all.
For example, let's say I am trying to buy a vacuum cleaner in Hawaii. Joe's Hardware sells vacuums in 18 states, but not in Hawaii. Walmart sells vacuums in Hawaii, but not microwaves. Burger King does not sell vacuums at all, but will give me a Whopper anywhere in the US.
So if I am in Hawaii and search for a vacuum, I should only get Walmart as a result. While other stores may sell vacuums, and may sell in Hawaii, they don't do both but Walmart does.
How do I efficiently create this type of relationship in a relational database (specifically, I am currently using SQLite, but need to be able to convert to MySQL in the future).
Obviously, I would need tables for Product, Store, and State, but I am at a loss on how to create and query the appropriate join tables...
If I, for example, query a certain Product, how would I determine which Store would sell it in a particular State, keeping in mind that Walmart may not sell vacuums in Hawaii, but they do sell tea there?
I understand the basics of 1:1, 1:n, and M:n relationships in RD, but I am not sure how to handle this complexity where there is a many-to-many-to-many situation.
If you could show some SQL statements (or DDL) that demonstrates this, I would be very grateful. Thank you!
An accepted and common way is the utilisation of a table that has a column for referencing the product and another for the store. There's many names for such a table reference table, associative table mapping table to name some.
You want these to be efficient so therefore try to reference by a number which of course has to uniquely identify what it is referencing. With SQLite by default a table has a special column, normally hidden, that is such a unique number. It's the rowid and is typically the most efficient way of accessing rows as SQLite has been designed this common usage in mind.
SQLite allows you to create a column per table that is an alias of the rowid you simple provide the column followed by INTEGER PRIMARY KEY and typically you'd name the column id.
So utilising these the reference table would have a column for the product's id and another for the store's id catering for every combination of product/store.
As an example three tables are created (stores products and a reference/mapping table) the former being populated using :-
CREATE TABLE IF NOT EXISTS _products(id INTEGER PRIMARY KEY, productname TEXT, productcost REAL);
CREATE TABLE IF NOT EXISTS _stores (id INTEGER PRIMARY KEY, storename TEXT);
CREATE TABLE IF NOT EXISTS _product_store_relationships (storereference INTEGER, productreference INTEGER);
INSERT INTO _products (productname,productcost) VALUES
('thingummy',25.30),
('Sky Hook',56.90),
('Tartan Paint',100.34),
('Spirit Level Bubbles - Large', 10.43),
('Spirit Level bubbles - Small',7.77)
;
INSERT INTO _stores (storename) VALUES
('Acme'),
('Shops-R-Them'),
('Harrods'),
('X-Mart')
;
The resultant tables being :-
_product_store_relationships would be empty
Placing products into stores (for example) could be done using :-
-- Build some relationships/references/mappings
INSERT INTO _product_store_relationships VALUES
(2,2), -- Sky Hooks are in Shops-R-Them
(2,4), -- Sky Hooks in x-Mart
(1,3), -- thingummys in Harrods
(1,1), -- and Acme
(1,2), -- and Shops-R-Them
(4,4), -- Spirit Level Bubbles Large in X-Mart
(5,4), -- Spiirit Level Bubble Small in X-Mart
(3,3) -- Tartn paint in Harrods
;
The _product_store_relationships would then be :-
A query such as the following would list the products in stores sorted by store and then product :-
SELECT storename, productname, productcost FROM _stores
JOIN _product_store_relationships ON _stores.id = storereference
JOIN _products ON _product_store_relationships.productreference = _products.id
ORDER BY storename, productname
;
The resultant output being :-
This query will only list stores that have a product name that contains an s or S (as like is typically case sensitive) the output being sorted according to productcost in ASCending order, then storename, then productname:-
SELECT storename, productname, productcost FROM _stores
JOIN _product_store_relationships ON _stores.id = storereference
JOIN _products ON _product_store_relationships.productreference = _products.id
WHERE productname LIKE '%s%'
ORDER BY productcost,storename, productname
;
Output :-
Expanding the above to consider states.
2 new tables states and store_state_reference
Although no real need for a reference table (a store would only be in one state unless you consider a chain of stores to be a store, in which case this would also cope)
The SQL could be :-
CREATE TABLE IF NOT EXISTS _states (id INTEGER PRIMARY KEY, statename TEXT);
INSERT INTO _states (statename) VALUES
('Texas'),
('Ohio'),
('Alabama'),
('Queensland'),
('New South Wales')
;
CREATE TABLE IF NOT EXISTS _store_state_references (storereference, statereference);
INSERT INTO _store_state_references VALUES
(1,1),
(2,5),
(3,1),
(4,3)
;
If the following query were run :-
SELECT storename,productname,productcost,statename
FROM _stores
JOIN _store_state_references ON _stores.id = _store_state_references.storereference
JOIN _states ON _store_state_references.statereference =_states.id
JOIN _product_store_relationships ON _stores.id = _product_store_relationships.storereference
JOIN _products ON _product_store_relationships.productreference = _products.id
WHERE statename = 'Texas' AND productname = 'Sky Hook'
;
The output would be :-
Without the WHERE clause :-
make Stores-R-Them have a presence in all states :-
The following would make Stores-R-Them have a presence in all states :-
INSERT INTO _store_state_references VALUES
(2,1),(2,2),(2,3),(2,4)
;
Now the Sky Hook's in Texas results in :-
Note This just covers the basics of the topic.
You will need to create combine mapping table of product, states and stores as tbl_product_states_stores which will store mapping of products, state and store. The columns will be id, product_id, state_id, stores_id.

Database schema for end user report designer

I'm trying to implement a feature whereby, apart from all the reports that I have in my system, I will allow the end user to create simple reports. (not overly complex reports that involves slicing and dicing across multiple tables with lots of logic)
The user will be able to:
1) Select a base table from a list of allowable tables (e.g., Customers)
2) Select multiple sub tables (e.g., Address table, with AddressId as the field to link Customers to Address)
3) Select the fields from the tables
4) Have basic sorting
Here's the database schema I have current, and I'm quite certain it's far from perfect, so I'm wondering what else I can improve on
AllowableTables table
This table will contain the list of tables that the user can create their custom reports against.
Id Table
----------------------------------
1 Customers
2 Address
3 Orders
4 Products
ReportTemplates table
Id Name MainTable
------------------------------------------------------------------
1 Customer Report #2 Customers
2 Customer Report #3 Customers
ReportTemplateSettings table
Id TemplateId TableName FieldName ColumnHeader ColumnWidth Sequence
-------------------------------------------------------------------------------
1 1 Customer Id Customer S/N 100 1
2 1 Customer Name Full Name 100 2
3 1 Address Address1 Address 1 100 3
I know this isn't complete, but this is what I've come up with so far. Does anyone have any links to a reference design, or have any inputs as to how I can improve this?
This needs a lot of work even though it’s relatively simple task. Here are several other columns you might want to include as well as some other details to take care of.
Store table name along with schema name or store schema name in additional column, add column for sorting and sort order
Create additional table to store child tables that will be used in the report (report id, schema name, table name, column in child table used to join tables, column in parent table used to join tables, join operator (may not be needed if it always =)
Create additional table that will store column names (report id, schema name, table name, column name, display as)
There are probably several more things that will come up after you complete this but hopefully this will get you in the right direction.

Database tables: One-to-many of different types

Due to non-disclosure at my work, I have created an analogy of the situation. Please try to focus on the problem and not "Why don't you rename this table, m,erge those tables etc". Because the actual problem is much more complex.
Heres the deal,
Lets say I have a "Employee Pay Rise" record that has to be approved.
There is a table with single "Users".
There are tables that group Users together, forexample, "Managers", "Executives", "Payroll", "Finance". These groupings are different types with different properties.
When creating a "PayRise" record, the user who is creating the record also selects both a number of these groups (managers, executives etc) and/or single users who can 'approve' the pay rise.
What is the best way to relate a single "EmployeePayRise" record to 0 or more user records, and 0 or more of each of the groupings.
I would assume that the users are linked to the groups? If so in this case I would just link the employeePayRise record to one user that it applies to and the user that can approve. So basically you'd have two columns representing this. The EmployeePayRise.employeeId and EmployeePayRise.approvalById columns. If you need to get to groups, you'd join the EmployeePayRise.employeeId = Employee.id records. Keep it simple without over-complicating your design.
My first thought was to create a table that relates individual approvers to pay rise rows.
create table pay_rise_approvers (
pay_rise_id integer not null references some_other_pay_rise_table (pay_rise_id),
pay_rise_approver_id integer not null references users (user_id),
primary key (pay_rise_id, pay_rise_approver_id)
);
You can't have good foreign keys that reference managers sometimes, and reference payroll some other times. Users seems the logical target for the foreign key.
If the person creating the pay rise rows (not shown) chooses managers, then the user interface is responsible for inserting one row per manager into this table. That part's easy.
A person that appears in more than one group might be a problem. I can imagine a vice-president appearing in both "Executive" and "Finance" groups. I don't think that's particularly hard to handle, but it does require some forethought. Suppose the person who entered the data changed her mind, and decided to remove all the executives from the table. Should an executive who's also in finance be removed?
Another problem is that there's a pretty good chance that not every user should be allowed to approve a pay rise. I'd give some thought to that before implementing any solution.
I know it looks ugly but I think somethimes the solution can be to have the table_name in the table and a union query
create table approve_pay_rise (
rise_proposal varchar2(10) -- foreign key to payrise table
, approver varchar2(10) -- key of record in table named in other_table
, other_table varchar2(15) );
insert into approve_pay_rise values ('prop000001', 'e0009999', 'USERS');
insert into approve_pay_rise values ('prop000001', 'm0002200', 'MANAGERS');
Then either in code a case statement, repeated statements for each other_table value (select ... where other_table = '' .. select ... where other_table = '') or a union select.
I have to admit I shudder when I encounter it and I'll now go wash my hands after typing a recomendation to do it, but it works.
Sounds like you'd might need two tables ("ApprovalUsers" and "ApprovalGroups"). The SELECT statement(s) would be a UNION of UserIds from the "ApprovalUsers" and the UserIDs from any other groups of users that are the "ApprovalGroups" related to the PayRiseId.
SELECT UserID
INTO #TempApprovers
FROM ApprovalUsers
WHERE PayRiseId = 12345
IF EXISTS (SELECT GroupName FROM ApprovalGroups WHERE GroupName = "Executives" and PayRiseId = 12345)
BEGIN
SELECT UserID
INTO #TempApprovers
FROM Executives
END
....
EDIT: this would/could duplicate UserIds, so you would probably want to GROUP BY UserID (i.e. SELECT UserID FROM #TempApprovers GROUP BY UserID)

table relationships, SQL 2005

Ok I have a question and it is probably very easy but I can not find the solution.
I have 3 tables plus one main tbl.
tbl_1 - tbl_1Name_id
tbl_2- tbl_2Name_id
tbl_3 - tbl_3Name_id
I want to connect the Name_id fields to the main tbl fields below.
main_tbl
___________
tbl_1Name_id
tbl_2Name_id
tbl_3Name_id
Main tbl has a Unique Key for these fields and in the other table, fields they are normal fields NOT NULL.
What I would like to do is that any time when the record is entered in tbl_1, tbl_2 or tbl_3, the value from the main table shows in that field, or other way.
Also I have the relationship Many to one, one being the main tbl of course.
I have a feeling this should be very simple but can not get it to work.
Take a look at SQL Server triggers. This will allow you to perform an action when a record is inserted into any one of those tables.
If you provide some more information like:
An example of an insert
The resulting change you would like
to see as a result of that insert
I can try and give you some more details.
UPDATE
Based on your new comments I suspect that you are working with a denormalized database schema. Below is how I would suggest you structure your tables in the Employee-Medical visit scenario you discussed:
Employee
--------
EmployeeId
fName
lName
EmployeeMedicalVisit
--------------------
VisitId
EmployeeId
Date
Cost
Some important things:
Note that I am not entering the
employees name into the
EmployeeMedicalVisit table, just the EmployeeId. This
helps to maintain data integrity and
complies with First Normal Form
You should read up on 1st, 2nd and
3rd normal forms. Database
normalization is a very imporant
subject and it will make your life
easier if you can grasp them.
With the above structure, when an employee visited a medical office you would insert a record into EmployeeMedicalVisit. To select all medical visits for an employee you would use the query below:
SELECT e.fName, e.lName
FROM Employee e
INNER JOIN EmployeeMedicalVisit as emv
ON e.EployeeId = emv.EmployeeId
Hope this helps!
Here is a sample trigger that may show you waht you need to have:
Create trigger mytabletrigger ON mytable
For INSERT
AS
INSERT MYOTHERTABLE (MytableId, insertdate)
select mytableid, getdate() from inserted
In a trigger you have two psuedotables available, inserted and deleted. The inserted table constains the data that is being inserted into the table you have the trigger on including any autogenerated id. That is how you get the data to the other table assuming you don't need other data at the same time. YOu can get other data from system stored procuders or joins to other tables but not from a form in the application.
If you do need other data that isn't available in a trigger (such as other values from a form, then you need to write a sttored procedure to insert to one table and return the id value through an output clause or using scope_identity() and then use that data to build the insert for the next table.

Resources