Keep referenced field data changes

I have a table Salary with a column PersonalId and a table Person with a column Name.
Salary data is saved in the first table with a PersonalId that relates it to the Person table. A salary bill gathers all of this data together, and the person's name is looked up from the Person table.
After one year, a specific person's name changes from Michael to Maic. I want last year's salary bills to keep the previous name, Michael, and new salary bills to be generated with the new name, Maic.
How can we do that?

It depends on which operations you need to perform most often and on how often people change their names, because the number of joins you need could vary a lot. Some options:
1) Keep a field in Person that points to the next Person row, which represents the change of name.
2) Keep another key in Person that varies only with the physical person.
3) Keep a limited number of name columns in Person that a person can use, with an index indicating the current name.
4) Keep the relations between the various names of a person in another table.
It could also depend on which normalization rules you follow; for now I'm not considering that.
With the first option you don't need to change Salary, but to reconstruct the identity of a person you need multiple requests or at least a stored procedure.
With the second option you still don't need to change Salary, because you only add a field to Person, but getting all the Salary entries for one physical person takes some work: again probably a stored procedure that reads the added field, and then something that joins all the Salary entries.
The third option is perhaps the simplest, but also the most limited, and Salary needs another field that holds the index of the name to use for that entry.
The last option gives you a stable identity, but it may need some work because of the added table, and there are still multiple implementations. You could have Salary reference that table instead of Person, or you could consult that table only when you need all the data, but you cannot reference its primary key from Salary, because that would not make it possible to tell the names apart.

Lunadir's right in a certain way -- but all of those approaches are complex, and of rather great difficulty.
The other way -- simpler, and perhaps more correct & robust -- is to keep NAME and PAID_DATE columns in Salary or SalaryPaid, and write the actual name & date paid at the time the payment is made.
Good old batch-processing style -- and it has the benefit of actually capturing the key financial facts, of what payment was made & what name it was made to, which are the actual auditable transaction history.
Do you pay each Salary entry individually, or in a batch (PaySlip or SalaryPaid)? Put the NAME column wherever you record the actual payment and the timestamp at which it occurred.
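A minimal sketch of this idea, using Python's built-in sqlite3 module so it can be run as-is. The SalaryPaid layout, the Amount column, the pay_salary helper and the dates are assumptions for illustration; only Person, PersonalId, Name, NAME and PAID_DATE come from the posts above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (PersonalId INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE SalaryPaid (
    Id         INTEGER PRIMARY KEY,
    PersonalId INTEGER NOT NULL REFERENCES Person(PersonalId),
    Amount     NUMERIC NOT NULL,  -- hypothetical column
    NAME       TEXT NOT NULL,     -- name as it was when the payment was made
    PAID_DATE  TEXT NOT NULL      -- ISO date of the payment
);
""")

def pay_salary(person_id, amount, paid_date):
    # Copy the person's current name into the payment row at payment time.
    conn.execute(
        "INSERT INTO SalaryPaid (PersonalId, Amount, NAME, PAID_DATE) "
        "SELECT PersonalId, ?, Name, ? FROM Person WHERE PersonalId = ?",
        (amount, paid_date, person_id),
    )

conn.execute("INSERT INTO Person VALUES (1, 'Michael')")
pay_salary(1, 1000, "2013-01-31")
conn.execute("UPDATE Person SET Name = 'Maic' WHERE PersonalId = 1")
pay_salary(1, 1100, "2014-01-31")

# Old bills keep the old name because they never look it up in Person.
bills = conn.execute(
    "SELECT PAID_DATE, NAME FROM SalaryPaid ORDER BY PAID_DATE"
).fetchall()
print(bills)
```

The bill query reads only SalaryPaid, so later renames in Person cannot alter what was recorded for a past payment.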

Related

Avoid duplicating fields across multiple tables

Let me describe briefly the table structures:
Customer Table
id | name | address_line_one | address_line_two | contact_no_one
SaleInvoice Table
id | id_Customer (Foreign Key) | invoice_no
If I have to print a Sale invoice, I have to use the Customer information (like name, address) from the Customer table.
Assume that after a year, some customer data changes (like name or address), and I update the new data in my Customer table. Now, if the customer asks for an old invoice, it will be printed with the new customer data, which would be legally wrong.
Does that mean I have to create
name_customer
address_line_one_customer
...
and all these fields in the Sale Invoice table too?
If yes, is there a better way to get data from these fields in the Customer table into the Sale Invoice table than to write a SQL query to get the values and then set the values?

This is really up to you. In some cases, where it is a legal document, you will save all the details so that you can always bring it up the way it was created. Alternatively, if you are producing PDF invoices, save the generated files to be 100% sure.
The other alternative is to create a CustomerHistory table, so that past versions are always saved with a date range, so that you can go back to the old version.
It depends on the use cases, but those are your main options.
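The CustomerHistory option could be sketched as follows with Python's sqlite3 module. The valid_from and valid_to columns, the invoice_date column on SaleInvoice and the sample rows are assumptions; the question does not say how invoices are dated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (id INTEGER PRIMARY KEY, name TEXT, address_line_one TEXT);
CREATE TABLE CustomerHistory (
    id_Customer      INTEGER NOT NULL REFERENCES Customer(id),
    name             TEXT,
    address_line_one TEXT,
    valid_from       TEXT NOT NULL,  -- inclusive (assumed column)
    valid_to         TEXT            -- exclusive; NULL means still current
);
CREATE TABLE SaleInvoice (
    id           INTEGER PRIMARY KEY,
    id_Customer  INTEGER NOT NULL REFERENCES Customer(id),
    invoice_no   TEXT,
    invoice_date TEXT NOT NULL       -- assumed column, not in the question
);
INSERT INTO Customer VALUES (1, 'Acme Ltd', '9 New Street');
INSERT INTO CustomerHistory VALUES (1, 'Acme Ltd', '1 Old Road', '2013-01-01', '2014-06-01');
INSERT INTO CustomerHistory VALUES (1, 'Acme Ltd', '9 New Street', '2014-06-01', NULL);
INSERT INTO SaleInvoice VALUES (1, 1, 'INV-1', '2013-03-15');
INSERT INTO SaleInvoice VALUES (2, 1, 'INV-2', '2014-09-01');
""")

# Pick the customer version whose date range contains the invoice date.
printed = conn.execute("""
    SELECT i.invoice_no, h.address_line_one
    FROM SaleInvoice i
    JOIN CustomerHistory h
      ON h.id_Customer = i.id_Customer
     AND i.invoice_date >= h.valid_from
     AND (h.valid_to IS NULL OR i.invoice_date < h.valid_to)
    ORDER BY i.invoice_no
""").fetchall()
print(printed)
```

ISO-formatted date strings compare correctly as text, which is why plain TEXT columns are enough for this sketch.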

It sounds like a problem easily solved by placing the Employee table in version normal form (VNF). This is really just a flavor of 2NF, but done in a way that makes it possible to query current data and past data using the same query.
A datetime parameter is used to provide the distinction. When the value is set to NOW, the current data is returned. When the value is set to a specific datetime value in the past, the data that was current at that date and time is returned.
A brief discussion of the particulars can be found here. That answer also contains links to more information if you think it is something that would work for you.
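One possible reading of this approach, sketched with Python's sqlite3 module: static attributes stay in the entity table, changeable ones move to a version table keyed by an effective date, and a single query takes the datetime parameter. The table names, columns and the name_as_of helper are assumptions, not taken from the linked answer.

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (id INTEGER PRIMARY KEY, hire_date TEXT);
CREATE TABLE EmployeeVersion (
    id        INTEGER NOT NULL REFERENCES Employee(id),
    effective TEXT NOT NULL,   -- when this version became current
    name      TEXT NOT NULL,
    PRIMARY KEY (id, effective)
);
INSERT INTO Employee VALUES (1, '2012-01-01');
INSERT INTO EmployeeVersion VALUES (1, '2012-01-01', 'Michael');
INSERT INTO EmployeeVersion VALUES (1, '2014-01-01', 'Maic');
""")

def name_as_of(emp_id, as_of):
    # Same query for current and past data: take the latest version
    # whose effective date is not after the requested moment.
    row = conn.execute("""
        SELECT v.name
        FROM Employee e
        JOIN EmployeeVersion v ON v.id = e.id
        WHERE e.id = ?
          AND v.effective = (SELECT MAX(v2.effective)
                             FROM EmployeeVersion v2
                             WHERE v2.id = e.id AND v2.effective <= ?)
    """, (emp_id, as_of)).fetchone()
    return row[0] if row else None

current = name_as_of(1, datetime.now().isoformat())  # parameter set to "now"
past = name_as_of(1, "2013-06-30")                   # a moment in the past
print(current, past)
```

Passing a date earlier than the first version returns None, since no version was in effect yet.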

Many employees have many courses which have expiry dates

I'm looking for the best way to store this information. Not every course has an expiry date.
The easiest way I've found so far is:
tblEmployee
-----------
ID (pk)
Expiry1
Expiry2
tblCourseCatalog
----------------
CourseID(pk)
Name
For every course in tblCourseCatalog, a new Expiry is created in tblEmployee to match tblCourseCatalog.CourseID.
I tried to have:
tblCourseExpiryDates
--------------------
EmployeeID (pk) 1:1 with tblEmployee.ID
FirstAid
UnderWaterBasketWeaving
Anytime a new course was added to tblCourseCatalog, a new column was added to tblCourseExpiryDates to match. This became tricky when trying to query some info. Does my current way (Expiry in tblEmployee) change things much from having tblCourseExpiryDates? To me, having an Expiry2 column is a waste if tblCourseCatalog.CourseID=2 (UnderWaterBasketWeaving) does not expire.

The standard normalised way to store something like this is to have a table where every row covers just one course/employee combination and holds any data that is specific to that combination. It is usually called something like EmployeeCourse (or CourseEmployee):
EmployeeID
CourseID
ExpiryDate
You only put records in this table where you actually have a date that is valid. For a course that has no dates, no employee would ever get a record. If a given employee has never done a course, they get no record. If you want to remove a date, you can remove the record, or just remove the date but leave the record (I'd probably remove it). If you want to add a new course, you just put a record into the Course table and you're done; you don't have to change anything else.
When you need to look up the record, you need to join to the table with an outer join, so that you still get any records in the main table that have no EmployeeCourse record.
The downside of this normalised data is that it makes it harder to get a list of all the expiry dates for an employee in one row of output. This is where pivot tables come in (and I'm not sure how they work in Access).
If you want to read more about this, look up database normalisation.
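As a rough illustration of the junction table and the outer join, here is a sketch using Python's sqlite3 module. The Name column on tblEmployee, the composite primary key and the sample rows are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblEmployee (ID INTEGER PRIMARY KEY, Name TEXT);  -- Name is assumed
CREATE TABLE tblCourseCatalog (CourseID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE EmployeeCourse (
    EmployeeID INTEGER NOT NULL REFERENCES tblEmployee(ID),
    CourseID   INTEGER NOT NULL REFERENCES tblCourseCatalog(CourseID),
    ExpiryDate TEXT,
    PRIMARY KEY (EmployeeID, CourseID)  -- one row per employee/course pair
);
INSERT INTO tblEmployee VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO tblCourseCatalog VALUES (1, 'FirstAid'), (2, 'UnderWaterBasketWeaving');
-- Only Ann has a course with an expiry date; Bob gets no row at all.
INSERT INTO EmployeeCourse VALUES (1, 1, '2025-12-31');
""")

# LEFT JOIN keeps employees that have no EmployeeCourse record.
rows = conn.execute("""
    SELECT e.Name, c.Name, ec.ExpiryDate
    FROM tblEmployee e
    LEFT JOIN EmployeeCourse ec ON ec.EmployeeID = e.ID
    LEFT JOIN tblCourseCatalog c ON c.CourseID = ec.CourseID
    ORDER BY e.ID
""").fetchall()
print(rows)
```

Adding a new course is then a single INSERT into tblCourseCatalog, with no schema change.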

ER diagram that implements a database for trainee

I edited and remade the ERD. I have a few more questions.
I included participation constraints (between trainee and tutor), cardinality constraints (M means many), weak entities (double-line rectangles), weak relationships (double-line diamonds), composed attributes, derived attributes (dashed-outline ovals), and primary keys.
Questions:
Apparently to reduce redundant attributes I should only keep primary keys and descriptive attributes and the other attributes I will remove for simplicity reasons. Which attributes would be redundant in this case? I am thinking start_date, end_date, phone number, and address but that depends on the entity set right? For example the attribute address would be removed from Trainee because we don't really need it?
For the part: "For each trainee we like to store (if any) also previous companies (employers) where they worked, periods of employment: start date and end date."
Isn't "periods of employment: start date, end date" a composed attribute? because the dates are shown with the symbol ":" Also I believe I didn't make an attribute for "where they worked" which is location?
Also how is it possible to show previous companies (employers) when we already have an attribute employers and different start date? Because if you look at the Question Information it states start_date for employer twice and the second time it says start_date and end_date.
I labeled many attributes as primary keys but how am I able to distinguish from derived attribute, primary key, and which attribute would be redundant?
Is there a multivalued attribute in this ERD? Would salary and job held be multivalued attributes, because an employer has many salaries and jobs?
I believe I did the participation constraints (there is one) and cardinality constraints correctly. But there are sentences where for example "An instructor teaches at least a course. Each course is taught by only one instructor"; how can I write the cardinality constraint for this when I don't have a relationship between course and instructor?
Do my relationship names make sense because all I see is "has" maybe I am not correctly naming the actions of the relationships? Also I believe schedules depend on the actual entity so they are weak entities.... so does that make course entity set also a weak entity (I did not label it as weak here)?
For the company address I put a composed attribute, street num, street address, city... would that be correct? Also would street num and street address be primary keys?
Also I added the final mark attribute to courses and course_schedule is this in the right entity set? The statement for this attribute is "Each trainee identified by: unique code, social security number, name, address, a unique telephone number, the courses attended and the final mark for each course."
For this part: "We store in the database all classrooms available on the site", do I make a composed attribute that contains site information?
Question Information:
A trainee may be self-employed or employee in a company
Each trainee identified by:
unique code, social security number, name, address, a unique telephone number, the courses attended and the final mark for each course.
If the trainee is an employee in a company: store the current company (employer), start date.
For each trainee we like to store (if any) also previous companies (employers) where they worked, periods of employment: start date and end date.
If a trainee is self-employed: store the area of expertise, and title.
For a trainee that works for a company: we store the salary and job
For each company (employer): name (unique), the address, a unique telephone number.
We store in the database all known companies in the city.
We need also to represent the courses that each trainee is attending.
Each course has a unique code and a title.
For each course we have to store: the classrooms, dates, and times (start time, and duration in minutes) the course is held.
A classroom is characterized by a building name and a room number and the maximum places’ number.
A course is given in at least a classroom, and may be scheduled in many classrooms.
We store in the database all classrooms available on the site.
We store in the database all courses given at least once in the company.
For each instructor we will store: the social security number, name, and birth date.
An instructor teaches at least a course.
Each course is taught by only one instructor.
All the instructors’ telephone numbers must also be stored (each instructor has at least a telephone number).
A trainee can be a tutor for one or many trainees for a specific period of time (start date and end date).
For a trainee it is not mandatory to be a tutor, but it is mandatory to have a tutor.

The attribute ‘Code’ will be your PK because its only use seems to be that of a unique identifier.
The relationship ‘is’ will work but having a reference to two tables like that can get messy. Also you have the reference to "Employers" in the Trainee table which is not good practice. They should really be combined. See my helpful hints section to see how to clean that up.
Company looks like the complete table of companies in the area, as your details suggest. This would mean the table is fairly static and is used as a reference by your other tables. The attribute ‘employer’ in Employed would then simply be a foreign key reference to the PK of a specific company in Company. You should draw a relationship between those two.
It seems as though when an employee is ‘employed’ they are either an Employee of a company or self-employed.
The address field in Company will be an address within your city, since the question states that the table is a complete list of companies in the city. However, because this is a unique attribute, you must store specifics such as the street address: if you only stored the city name, all companies would have the same address, which is forbidden in a unique field.
Some other helpful hints:
Stay away from adding fields with plural names to your diagram. A plural field often means you need a separate table with a foreign key reference. For example, in your Trainee table you have ‘Employers’. That should be an Employer table with a foreign key reference to the Trainee Code attribute. In the Employer table you can combine the Self-employed and Employed tables so that there is a single reference from Trainee to Employer.
ERD Link http://www.imagesup.net/?di=1014217878605. Here's a quick ERD I created for you. Note the use of linker tables to prevent many-to-many relationships between tables. There are several ways to solve this schema problem; this is just how I saw your problem laid out. The design is intended to help with normalization of the DB, that is, to prevent redundant data in the DB. Hope this helps. Let me know if you need more clarification on the design I provided. It should be fairly self-explanatory when you compare your design parameters to it.
Follow Up Questions:
If you are looking to reduce attributes that might be arbitrary perhaps phone_number and address may be ones to eliminate, but start and end dates are good for sorting and archival reasons when determining whether an entry is current or a past record.
Yes, periods_of_employment does not need to be stored, as you can derive that information from the start and end dates. I believe "where they worked" just refers to previous employers, so it is not a location; it means you should be able to get a list of all the employers the trainee has had. You can get that with the current schema if you query the Employer table for all records where the trainee code equals the requested trainee and sort by start date. The reason it states start_date twice is to let you know that every ‘previous’ employer record will have both a start and an end date, hence "previous". For the current employer, however, the employment hasn't ended, so there is no end_date and it will be null. That’s what the problem was stating, in my opinion.
To keep it simple PK’s are unique values used to reference a record within another table. Redundant values are values that you essentially don’t need in a table because the same value can be derived by querying another table. In this case most of your attributes are fine except for Final_Mark in the Course table. This is redundant because Course_Schedule will store the Final_Mark that was received. The Course table is meant to simply hold a list of all potential courses to be referenced by Course_Schedule.
There are no multivalued attributes in this design, because that is bad practice. Job and salary are singular: if a job or salary changes, you add a new record to the Employer table rather than adding to that column. Multivalued attributes make querying a DB difficult, and I would advise against them. That’s why I mentioned earlier that you should move all attributes with plural names into their own tables and use a foreign key reference.
You essentially do have that written here because Course_Schedule is a linker table meaning that it is meant to simplify relationships between tables so you don’t have many to many relationships.
All your relationships look right to me. Also since the schedules are linker tables and cannot exist without the supporting tables you could consider them weak entities. Course in this schema is a defined list of all courses available so can be independent of any other table. This by definition is not a weak entity. When creating this db you’d probably fill in the course table and it probably wouldn’t change after that, except rarely when adding or removing an available course option.
Yes, you can make address a composite attribute, and that would be right in your diagram. To be clear about your use of primary keys: just because an attribute is unique doesn’t make it a primary key. A table can have one and only one primary key, so you must pick a column that you are certain will not be repeated. In this example you may think street number might be unique, but what if one company leaves an address and another company moves into that spot? That would break that table's primary key. Typically a company name is licensed in a city or state, so it cannot be repeated. That would be a better choice for your primary key. You can also make composite primary keys, but that is a more advanced topic that I would recommend reading about at a later date.
Take final_mark out of courses. That table will contain only one row per course, and those courses won’t be linked to any trainee except through the Course_Schedule table. Final_Mark should only be in that table. If you add final_mark to the Course table and have 10 trainees in a course, you’d have 10 duplicate rows in the Course table differing only in final_mark. Instead, hold only the course_code and title, so that you can assign different instructors, trainees and classrooms using the linker tables.
No composite attribute is needed using this schema. You have a Classroom table that will hold all available classrooms and their relevant information. You then use the Classroom_Schedule linker table to assign a given Classroom to a Course_Schedule. No attributes of Classroom can be broken down to simpler attributes.
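To make the linker-table point concrete, here is a small sketch using Python's sqlite3 module. The linked ERD is not reproduced here, so the column names, key choices and sample rows are assumptions; only Course, Course_Schedule, course_code, title and Final_Mark are named in the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Trainee (code INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Course (course_code TEXT PRIMARY KEY, title TEXT);
-- Linker table: one row per trainee/course pair, so the mark lives here.
CREATE TABLE Course_Schedule (
    trainee_code INTEGER NOT NULL REFERENCES Trainee(code),
    course_code  TEXT    NOT NULL REFERENCES Course(course_code),
    final_mark   INTEGER,
    PRIMARY KEY (trainee_code, course_code)
);
INSERT INTO Trainee VALUES (1, 'A'), (2, 'B');
INSERT INTO Course VALUES ('DB101', 'Databases');
INSERT INTO Course_Schedule VALUES (1, 'DB101', 85), (2, 'DB101', 72);
""")

# The course itself is stored once, however many trainees attend it.
course_rows = conn.execute("SELECT COUNT(*) FROM Course").fetchone()[0]

marks = conn.execute("""
    SELECT t.name, cs.final_mark
    FROM Course_Schedule cs
    JOIN Trainee t ON t.code = cs.trainee_code
    WHERE cs.course_code = 'DB101'
    ORDER BY t.code
""").fetchall()
print(course_rows, marks)
```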

Data modeling : gas station managing

I need to create a database to manage a gas station.
I'm thinking of a basic product inventory and sales data model, but it needs some changes.
See http://www.databaseanswers.org/data_models/inventory_and_sales/index.htm. This is how they proceed: the manager keeps track of the inventory and sales twice a day, and each time a gas pump attendant is in charge and takes responsibility for the sales.
How can I keep track of this?

Using the models you provided, you could take the first model as a reference.
I would use all six (6) of its tables, namely:
1) Products
2) Product_Types
3) Product_In_Sales
4) Sales
5) Daily_Inventory_Level
6) Ref_Calendar
But I would make some changes by altering and adding to it.
First, I would include a SalesPerson table with at least the following fields:
1) SalesPersonID
2) Lastname
3) Firstname
4) Alias
In line with that, I would add SalesPersonID as a foreign key in the Sales table.
Now, since you want to take inventory twice a day, you could approach this in several ways.
You could add a single-column primary key to the Daily_Inventory_Level table, or you could add a new field named Inventory_Daily_Flag that only takes the value 1 or 2: 1 means the first inventory of the day and 2 means the second. That also means the primary key of Daily_Inventory_Level (whose columns are foreign keys at the same time) would no longer be just Day_Date and ProductID but would include Inventory_Daily_Flag as well.
In line with that, you also need to add a field to Product_In_Sales, such as FlagForInventory, with a Boolean data type.
So, let's say a supervisor comes in to do the first inventory. The products sold so far that day in Product_In_Sales would be flagged as True in FlagForInventory and then transferred to Daily_Inventory_Level with Inventory_Daily_Flag set to 1 to indicate the first inventory, and of course the Level would be updated as well.
When the day ends and the second inventory is taken, the day's sales in Product_In_Sales whose FlagForInventory is still False would be flagged as True and transferred to Daily_Inventory_Level with Inventory_Daily_Flag set to 2, indicating the second inventory. Again, you need to update the Level as well.
Does that make sense? If not, I could always change the approach. ;-)
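A rough sketch of one way the twice-a-day inventory described above might work, using Python's sqlite3 module. The column layouts, the Level column on Products, the record_sale and run_inventory helpers and the sample data are all assumptions; the linked model's real columns may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SalesPerson (
    SalesPersonID INTEGER PRIMARY KEY, Lastname TEXT, Firstname TEXT, Alias TEXT
);
CREATE TABLE Products (
    ProductID INTEGER PRIMARY KEY, Name TEXT,
    Level INTEGER NOT NULL            -- running stock level (assumed column)
);
CREATE TABLE Sales (
    SaleID        INTEGER PRIMARY KEY,
    SalesPersonID INTEGER NOT NULL REFERENCES SalesPerson(SalesPersonID),
    Day_Date      TEXT NOT NULL
);
CREATE TABLE Product_In_Sales (
    SaleID           INTEGER NOT NULL REFERENCES Sales(SaleID),
    ProductID        INTEGER NOT NULL REFERENCES Products(ProductID),
    Quantity         INTEGER NOT NULL,
    FlagForInventory INTEGER NOT NULL DEFAULT 0  -- 0 = not yet counted
);
CREATE TABLE Daily_Inventory_Level (
    Day_Date             TEXT NOT NULL,
    ProductID            INTEGER NOT NULL REFERENCES Products(ProductID),
    Inventory_Daily_Flag INTEGER NOT NULL CHECK (Inventory_Daily_Flag IN (1, 2)),
    Level                INTEGER NOT NULL,
    PRIMARY KEY (Day_Date, ProductID, Inventory_Daily_Flag)
);
INSERT INTO SalesPerson VALUES (1, 'Doe', 'Jane', 'JD'), (2, 'Roe', 'Rick', 'RR');
INSERT INTO Products VALUES (1, 'Diesel', 1000);
""")

def record_sale(sale_id, sales_person_id, day, product_id, quantity):
    # Each sale names the attendant who is responsible for it.
    conn.execute("INSERT INTO Sales VALUES (?, ?, ?)", (sale_id, sales_person_id, day))
    conn.execute(
        "INSERT INTO Product_In_Sales (SaleID, ProductID, Quantity) VALUES (?, ?, ?)",
        (sale_id, product_id, quantity),
    )

def run_inventory(day, flag):
    # Sales of the day that no earlier inventory has counted yet.
    pending = ("FlagForInventory = 0 AND SaleID IN "
               "(SELECT SaleID FROM Sales WHERE Day_Date = ?)")
    sold = conn.execute(
        "SELECT ProductID, SUM(Quantity) FROM Product_In_Sales WHERE "
        + pending + " GROUP BY ProductID", (day,)
    ).fetchall()
    for product_id, quantity in sold:
        conn.execute("UPDATE Products SET Level = Level - ? WHERE ProductID = ?",
                     (quantity, product_id))
    # Snapshot the levels for this inventory (flag 1 or 2), then mark the sales.
    conn.execute(
        "INSERT INTO Daily_Inventory_Level SELECT ?, ProductID, ?, Level FROM Products",
        (day, flag),
    )
    conn.execute("UPDATE Product_In_Sales SET FlagForInventory = 1 WHERE " + pending,
                 (day,))

day = "2014-01-01"
record_sale(1, 1, day, 1, 100)
record_sale(2, 1, day, 1, 50)
run_inventory(day, 1)           # first inventory of the day
record_sale(3, 2, day, 1, 200)
run_inventory(day, 2)           # second inventory of the day

levels = conn.execute(
    "SELECT Inventory_Daily_Flag, Level FROM Daily_Inventory_Level "
    "ORDER BY Inventory_Daily_Flag"
).fetchall()
print(levels)
```

Because each sale is flagged once it has been counted, the second inventory only picks up sales made after the first one.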

Employee table (Master and detail table)

I am wondering if it is okay to have master and detail tables for employees.
As per the requirements, data can be filtered by department, by country and by employee code at report level.
If an employee's department or country code changes, the change goes into the detail table and the old record is set to IS_ACTIVE = 'T'.
---------------------Master Table--------------------------------------
**EMPLOYEE_CODE** VARCHAR2(20 BYTE) NOT NULL,
EMAIL VARCHAR2(100 BYTE)
FIRST_NAME VARCHAR2(50 BYTE)
LAST_NAME VARCHAR2(50 BYTE)
WORKING_HOURS NUMBER
---------------------Detail Table--------------------------------------
**PK_USER_DETAIL_ID** NUMBER,
FK_EMPLOYEE_CODE VARCHAR2(20 BYTE),
FK_GROUP NUMBER,
FK_DEPARTMENT_CODE NUMBER,
FK_EMPLOYER_COUNTRY_CODE VARCHAR2(5 BYTE),
FK_MANAGER_ID VARCHAR2(20 BYTE),
FK_ROLE_CODE VARCHAR2(6 BYTE),
START_DATE DATE,
END_DATE DATE,
IS_ACTIVE VARCHAR2(1 BYTE),
INACTIVE_DATE DATE
The Employee table will be linked with the Timesheet table, and for timesheet reports data can be filtered by department, by country and by employee code.
OPTION : I
Have one employee table with one primary key and create a new entry whenever the department or role is updated for an employee.
Add country and department code to the timesheet table.
--> This way I don't need to search the employee table.
OPTION : II
Have master and detail tables.
Add country and department code to the timesheet table.
--> This way I don't need to search the employee table, and I will also have the master and detail tables.
OPTION : NEW
Have master and detail tables.
The Timesheet table will have EmpCode.
If a user moves to a new location or changes department, insert a new row in the detail table with the new department code and the same employee code.
Update the old row and set its End Date field, so that whenever the location or department changes, the End Date of the previous row is filled in.
Which one is the best option, and is there any better option available?

This is one way of implementing this requirement, and it's an approach many people take. However, it has one major drawback: every time you query the current employee status you need to filter the details on start and end date. This may seem like a trivial thing, but you wouldn't believe how much confusion it can cause, and it has performance implications too.
These things matter, because most of the time you will want only the current details, with queries on history being a relatively rare occurrence. Consequently you are hampering the implementation of your most common use case to make it easier to implement a less-used one. (Of course I am making assumptions about your business requirements, and perhaps yours is not a run-of-the-mill employee application...)
The better solution would be to have two tables: an EMPLOYEES table that also holds all the detail columns, and an EMPLOYEES_HISTORY table with the same columns plus the start and end date. When you change an employee's record, insert a copy of the old record into the history table, probably with a trigger. Your standard processes have just the one table to query, and your history needs are met fully.
By the way, your proposed data model is wrong. Working_hours, email_address and last_name are definitely things which can change, and perhaps even first name (e.g. through changes in personal circumstances such as getting married). So all those columns should be held in your details table.
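A minimal sketch of the two-table approach with a trigger, written against SQLite through Python's sqlite3 module rather than Oracle. The reduced column list, the START_DATE column on EMPLOYEES (used here as "current since") and the trigger name are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEES (
    EMPLOYEE_CODE      TEXT PRIMARY KEY,
    LAST_NAME          TEXT,
    FK_DEPARTMENT_CODE INTEGER,
    START_DATE         TEXT NOT NULL   -- assumed: date the current values took effect
);
CREATE TABLE EMPLOYEES_HISTORY (
    EMPLOYEE_CODE      TEXT NOT NULL,
    LAST_NAME          TEXT,
    FK_DEPARTMENT_CODE INTEGER,
    START_DATE         TEXT NOT NULL,
    END_DATE           TEXT NOT NULL
);
-- Copy the old row into the history table whenever an employee changes.
CREATE TRIGGER trg_employees_history
BEFORE UPDATE ON EMPLOYEES
FOR EACH ROW
BEGIN
    INSERT INTO EMPLOYEES_HISTORY
    VALUES (OLD.EMPLOYEE_CODE, OLD.LAST_NAME, OLD.FK_DEPARTMENT_CODE,
            OLD.START_DATE, NEW.START_DATE);
END;
INSERT INTO EMPLOYEES VALUES ('E1', 'Smith', 10, '2013-01-01');
""")

# The employee moves to department 20 on 2014-03-01.
conn.execute(
    "UPDATE EMPLOYEES SET FK_DEPARTMENT_CODE = 20, START_DATE = '2014-03-01' "
    "WHERE EMPLOYEE_CODE = 'E1'"
)

# Current status needs no date filtering at all.
current = conn.execute(
    "SELECT FK_DEPARTMENT_CODE FROM EMPLOYEES WHERE EMPLOYEE_CODE = 'E1'"
).fetchone()
history = conn.execute("SELECT * FROM EMPLOYEES_HISTORY").fetchall()
print(current, history)
```

Everyday queries read only EMPLOYEES, while EMPLOYEES_HISTORY is consulted for the rarer historical reports.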

Option 3 - Please note that this option is useful only from a reporting point of view.
Whenever you insert data, create a denormalized entry in a new table.
Whenever an entry is updated, update the denormalized entry in the new table as well.
The new table will have all the denormalized columns of Employee.
Searches benefit because results are produced without joins, which reduces access time.
Records in the new table are created and updated by insert and update triggers.
Improvements to Option 1 and Option 2:
Don't create redundancy by adding duplicate columns.
