I have a database that has master tables prefixed with mt_ and transaction tables prefixed with tr_. But as I went through the database I started to wonder what the actual definition of a transaction table and a master table is. To my understanding, a transaction table should have a composite primary key (a primary key made from two or more PKs of other tables). But when I looked at the transaction tables in the database, some have a composite key as described, while others tagged as tr_ have only one column tagged as PK; they also contain PK columns that belong to other tables, but those weren't even tagged as FK...
So could anyone here explain the difference between a master table and a transaction table, and how to identify them in a DB?
Updated
Here is an example from my DB:
tr_orders
OrderId int PK
CustomerId int Fk
OrderDate datetime etc
tr_reciept
RecieptId int PK
OrderId int **(but not FK)**
PaindAmount money
recieptDate datetime
Here is the complete structure of the two tables: [images: tr_orders, tr_reciept]
I don't understand why these tables are tr_ tables.
Why you don't always put a Primary Key on all the Foreign Keys
When something 'happens' it goes into a transaction. Someone buys a toy at a shop. A row is created recording that it was a toy and the datetime it happened and how much it cost.
Then someone else buys a toy ten minutes later.
We have two records in our transaction table:
Date Time Product_Key Shop_Key Amount
--------------------------------------------------------------------------
18 Dec 2015 13:05 7 12 10
18 Dec 2015 13:15 7 12 10
Here we have two foreign keys: Product_Key and Shop_Key
We can't create a PK on just those two foreign keys because then one shop could only ever sell one toy.
So the PK does not automatically go on all the FK's
But really the thing to take away is that your data model (tables, fields, keys, datatypes) reflects what your business does. If a shop could truly only ever sell one toy, it would be a valid data model to have a PK on those two fields.
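As a rough sketch of the point above, the following uses Python's built-in sqlite3 module to show what happens when the PK is placed on just the two foreign keys. The table names tr_sale_bad and tr_sale, the column names, and the surrogate sale_id column are assumptions made for illustration; they are not from the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: the PK sits on the two foreign-key columns only.
conn.execute("""
    CREATE TABLE tr_sale_bad (
        sold_at     TEXT    NOT NULL,
        product_key INTEGER NOT NULL,
        shop_key    INTEGER NOT NULL,
        amount      INTEGER NOT NULL,
        PRIMARY KEY (product_key, shop_key)
    )
""")
conn.execute("INSERT INTO tr_sale_bad VALUES ('2015-12-18 13:05', 7, 12, 10)")

# The second sale of the same product in the same shop violates the PK.
try:
    conn.execute("INSERT INTO tr_sale_bad VALUES ('2015-12-18 13:15', 7, 12, 10)")
    second_sale_rejected = False
except sqlite3.IntegrityError:
    second_sale_rejected = True

# One possible alternative: a surrogate key identifies each sale,
# so the same product can be sold in the same shop many times.
conn.execute("""
    CREATE TABLE tr_sale (
        sale_id     INTEGER PRIMARY KEY,
        sold_at     TEXT    NOT NULL,
        product_key INTEGER NOT NULL,
        shop_key    INTEGER NOT NULL,
        amount      INTEGER NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO tr_sale (sold_at, product_key, shop_key, amount) VALUES (?, ?, ?, ?)",
    [("2015-12-18 13:05", 7, 12, 10), ("2015-12-18 13:15", 7, 12, 10)],
)
sale_count = conn.execute("SELECT COUNT(*) FROM tr_sale").fetchone()[0]
print(second_sale_rejected, sale_count)
```

Whether a surrogate key or a wider composite key (for example one that includes the timestamp) is the better fit depends on what the business actually does.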
Some characteristics of 'transactional' vs 'master' tables
"Transactional" and "master" tables generally have a many to one relationship, meaning many transactions match one master record. Many purchase records match the same single toy record. A FK is a dead giveaway to this kind of relationship although "master" tables also have FK's
"Transactional" tables usually have a date or some kind of event id and are often 'aggregated' when reporting. This could be a record count or a sum of an amount.
Some characteristics of real world systems
It's entirely likely that someone forgot to put on a FK or PK, or it could be that there is a unique key (not a PK) enforcing what you are expecting to see.
I've seen live systems where the keys were clearly incorrect, or there were no keys at all.
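To illustrate the many-to-one relationship and the aggregation mentioned earlier in this answer, here is a minimal sketch using Python's sqlite3 module. The tables mt_product and tr_sale, their columns, and the product name are hypothetical and only mirror the toy example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked to

conn.executescript("""
    -- "Master" side: one row per product.
    CREATE TABLE mt_product (
        product_key INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    -- "Transaction" side: many rows can point at the same product.
    CREATE TABLE tr_sale (
        sale_id     INTEGER PRIMARY KEY,
        sold_at     TEXT    NOT NULL,
        product_key INTEGER NOT NULL REFERENCES mt_product(product_key),
        shop_key    INTEGER NOT NULL,
        amount      INTEGER NOT NULL
    );
""")
conn.execute("INSERT INTO mt_product VALUES (7, 'Toy')")
conn.executemany(
    "INSERT INTO tr_sale (sold_at, product_key, shop_key, amount) VALUES (?, ?, ?, ?)",
    [("2015-12-18 13:05", 7, 12, 10), ("2015-12-18 13:15", 7, 12, 10)],
)

# Typical reporting query: transactions are counted and summed per master record.
report = conn.execute("""
    SELECT p.name, COUNT(*), SUM(s.amount)
    FROM tr_sale s
    JOIN mt_product p ON p.product_key = s.product_key
    GROUP BY p.product_key, p.name
""").fetchall()
print(report)
```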
Master - - - - - - - - - - - - - - - - - - - - - - Transaction
Country .... Employeee ...... Customer ........... Order
Master and Transaction tables exist on a range, not in a binary on/off state, and where a table sits reflects the amount of activity you expect it to experience.
Lookup/Reference tables like State, Currency, Country, etc. RARELY have new records or changes -- Very "Master"
Employees add or change records occasionally, but not often (hopefully) so more "Master" than "Transaction"
Customers add or change records more often than Employees (also hopefully) so still a measure of "Master" but also with "Transaction" qualities
Orders are added and changed ALL the time (hopefully) and are Very "Transaction".
Same with Receipts.
If a table has No FK fields, it's likely very "Master"
Tables with FKs have some amount of "Transaction" to them, the more you expect new records - the more "Transaction" the table could be.
Keys:
In my opinion, every table should have a surrogate PK, not related to FKs or any natural keys. There are lots of reasons for this opinion, but whatever works for you is cool, too.
Often, if there is an obvious Natural key, then a table needs a Unique constraint for that key, in addition to the PK
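A small sketch of that combination, using Python's sqlite3 module: a surrogate PK plus a UNIQUE constraint on the natural key. The table mt_country and its columns are assumed names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE mt_country (
        country_id INTEGER PRIMARY KEY,   -- surrogate key, generated by the database
        iso_code   TEXT NOT NULL UNIQUE,  -- natural key, still kept unique
        name       TEXT NOT NULL
    )
""")
cursor = conn.execute(
    "INSERT INTO mt_country (iso_code, name) VALUES ('NZ', 'New Zealand')"
)
generated_id = cursor.lastrowid

# A second row with the same natural key is refused,
# even though it would receive a fresh surrogate key.
try:
    conn.execute("INSERT INTO mt_country (iso_code, name) VALUES ('NZ', 'Aotearoa')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(generated_id, duplicate_rejected)
```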
I am trying to understand how fact tables are formed in relation to the dimension tables.
E.g. Sale Fact Table
If there is a query for sales of a product by year/month/week/day, do I create a dimension for each type of period (Dim_Year, Dim_Month, Dim_Week and Dim_Day), each with its own key?
Or is it possible to just use one dimension for all periods: Dim_Date and only have one date key?
Another area I am confused about is why some fact tables do not contain their own ID. E.g. the Sale fact table does not have a SaleID column.
Sale Fact Table Textbook Example
DATES
Your date dimension needs to correspond to the grain of your fact table. So if you had daily sales you would have a Dim_Day, weekly sales you would have a Dim_Week, etc.
You would normally have multiple date dimensions (at different grains) in your data warehouse as you would have facts at different date grains.
Each date dimension would hold attributes applicable to levels higher up in the date hierarchy. So a Dim_Day might hold day, week, month, year attributes; Dim_Month might hold month, quarter and year attributes, etc.
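As an illustration of a day-grain dimension that carries the higher-level attributes, here is a sketch in Python with sqlite3. The helper build_dim_day, the yyyymmdd-style date_key, the table fact_sale and the sample figures are all assumptions, not part of any textbook model.

```python
import sqlite3
from datetime import date, timedelta


def build_dim_day(start, end):
    """Return one tuple per calendar day from start to end, inclusive."""
    rows = []
    current = start
    while current <= end:
        rows.append((
            current.year * 10000 + current.month * 100 + current.day,  # date_key
            current.isoformat(),
            current.isocalendar()[1],      # ISO week number
            current.month,
            (current.month - 1) // 3 + 1,  # quarter
            current.year,
        ))
        current += timedelta(days=1)
    return rows


conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE dim_day (
        date_key  INTEGER PRIMARY KEY,
        full_date TEXT, iso_week INTEGER, month INTEGER,
        quarter   INTEGER, year INTEGER
    )
""")
dim_rows = build_dim_day(date(2015, 12, 1), date(2016, 1, 31))
conn.executemany("INSERT INTO dim_day VALUES (?, ?, ?, ?, ?, ?)", dim_rows)

# Hypothetical daily-grain fact table with made-up amounts.
conn.execute("CREATE TABLE fact_sale (date_key INTEGER, product_key INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO fact_sale VALUES (?, ?, ?)",
    [(20151218, 7, 10), (20151219, 7, 10), (20160102, 7, 5)],
)

# Rolling daily facts up to months only needs the attributes held on dim_day.
monthly = conn.execute("""
    SELECT d.year, d.month, SUM(f.amount)
    FROM fact_sale f
    JOIN dim_day d ON d.date_key = f.date_key
    GROUP BY d.year, d.month
    ORDER BY d.year, d.month
""").fetchall()
print(monthly)
```

A fact table stored at a coarser grain (say, monthly targets) would still want its own Dim_Month, as described above.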
PRIMARY KEYS
Primary keys are rarely (never?) a technical requirement when creating tables in a database i.e. you can create a table without defining a PK. So you need to consider why we normally (at least in OLTP DBs) include PKs. Common reasons include:
To easily identify an individual record
To ensure that duplicate records (those with the same PK value) are not created
So there are good reasons for creating PKs, however there are cost overheads e.g. the PK needs to be checked every time a new record is inserted into the table.
In a dimensional model where you are performing bulk inserts/updates, having PKs would cause a significant performance hit. Additionally, the insert logic/checks should always be implemented in your ETL processes so there is no need to include these types of checks/constraints in the DB itself.
Fact tables do have a primary key but it is often implicit rather than explicit - so a group of the FKs in the fact table uniquely identifies each record. This compound PK may be documented but it is never enabled/implemented.
Occasionally a fact table will have an explicit, single column, PK. This is normally used when the fact table needs to be updated and its implicit PK involves a large number of columns. There is normally logic required to identify the record to be updated using its FKs but this returns the PK; then the update statement just has a clause like this:
WHERE table_pk = 12345678
rather than having to include all the columns in the implicit PK:
WHERE table_sk1 = 1234
AND table_sk2 = 5678
AND table_sk3 = 9876
....
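The two-step pattern described above (find the row through its FKs, then update by the single-column PK) might look like this in Python with sqlite3. The table name fact_sale and the amount column are hypothetical; the key column names follow the snippets above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE fact_sale (
        table_pk  INTEGER PRIMARY KEY,  -- explicit single-column PK
        table_sk1 INTEGER NOT NULL,     -- the three columns below form the implicit PK
        table_sk2 INTEGER NOT NULL,
        table_sk3 INTEGER NOT NULL,
        amount    INTEGER NOT NULL
    )
""")
conn.execute("INSERT INTO fact_sale VALUES (12345678, 1234, 5678, 9876, 10)")

# Step 1: the lookup logic identifies the record through its FKs and returns the PK.
(pk,) = conn.execute(
    "SELECT table_pk FROM fact_sale"
    " WHERE table_sk1 = ? AND table_sk2 = ? AND table_sk3 = ?",
    (1234, 5678, 9876),
).fetchone()

# Step 2: the update itself only needs the single-column PK.
conn.execute("UPDATE fact_sale SET amount = ? WHERE table_pk = ?", (25, pk))
new_amount = conn.execute(
    "SELECT amount FROM fact_sale WHERE table_pk = ?", (pk,)
).fetchone()[0]
print(pk, new_amount)
```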
Hope this helps?
I've read a lot of tips and tutorials about normalization but I still find it hard to understand how and when we need normalization. So right now I need to know if this database design for an electricity monitoring system needs to be normalized or not.
So far I have one table with fields:
monitor_id
appliance_name
brand
ampere
uptime
power_kWh
price_kWh
status (ON/OFF)
This monitoring system monitors multiple appliances (TV, Fridge, washing machine) separately.
So does it need to be normalized further? If so, how?
Honestly, you can get away without normalizing every database. Normalization is good if the database is going to be a project that affects many people, or if there are performance issues and the database does OLTP. In many ways, normalization boils down to having a larger number of tables, each with fewer columns. Denormalization means having fewer tables, each with more columns.
I've never seen a real database with only one table, but that's ok. Some people denormalize their database for reporting purposes. So it isn't always necessary to normalize a database.
How do you normalize it? You need to have a primary key (on a column that is unique or a combination of two or more columns that are unique in their combined form). You would need to create another table and have a foreign key relationship. A foreign key relationship is a pair of columns that exist in two or more tables. These columns need to share the same data type. These act as a map from one table to another. The tables are usually separated by real-world purpose.
For example, you could have a table with status, uptime and monitor_id. This would have a foreign key relationship to the monitor_id between the two tables. Your original table could then drop the uptime and status columns. You could have a third table with Brands, Models and the things that all models have in common (e.g., power_kWh, ampere, etc.). There could be a foreign key relationship to the first table based on model. Then the brand column could be eliminated (via the DDL command DROP) from the first table as this third table will have it relating from the model name.
To create new tables, you'll need to invoke a DDL command CREATE TABLE newTable with a foreign key on the column that will in effect be shared by the new table and the original table. With foreign key constraints, the new tables will share a column. The tables will have less information in them (fewer columns) when they are highly normalized. But there will be more tables to accommodate and store all the data. This way you can update one table and not put a lock on all the other columns in a denormalized database with one big table.
Once the new tables hold the data from the column or columns of the original table, you can drop those columns from the original table (except for the foreign key column). To drop a column, you invoke a DDL command such as ALTER TABLE originalTable DROP COLUMN brand.
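Here is one way the split described above could look, sketched with Python's sqlite3 module. The table names, the model_id column, and the sample values are assumptions made for illustration; the question's original table has no model column, and the unit of uptime is left unspecified.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked to

conn.executescript("""
    -- Facts shared by every appliance of the same model (model_id is hypothetical).
    CREATE TABLE appliance_model (
        model_id       INTEGER PRIMARY KEY,
        appliance_name TEXT NOT NULL,
        brand          TEXT NOT NULL,
        ampere         REAL NOT NULL,
        power_kWh      REAL NOT NULL
    );
    -- What is left of the original table.
    CREATE TABLE monitor (
        monitor_id INTEGER PRIMARY KEY,
        model_id   INTEGER NOT NULL REFERENCES appliance_model(model_id),
        price_kWh  REAL NOT NULL
    );
    -- Frequently changing state, one row per monitor.
    CREATE TABLE monitor_status (
        monitor_id INTEGER PRIMARY KEY REFERENCES monitor(monitor_id),
        status     TEXT NOT NULL CHECK (status IN ('ON', 'OFF')),
        uptime     INTEGER NOT NULL
    );
""")
conn.execute("INSERT INTO appliance_model VALUES (1, 'TV', 'Acme', 0.5, 0.1)")
conn.executemany("INSERT INTO monitor VALUES (?, ?, ?)", [(10, 1, 0.25), (11, 1, 0.25)])
conn.executemany(
    "INSERT INTO monitor_status VALUES (?, ?, ?)", [(10, "ON", 3600), (11, "OFF", 0)]
)

# A join rebuilds the original wide row when a report needs it.
wide_rows = conn.execute("""
    SELECT m.monitor_id, a.appliance_name, a.brand, a.ampere,
           s.uptime, a.power_kWh, m.price_kWh, s.status
    FROM monitor m
    JOIN appliance_model a ON a.model_id = m.model_id
    JOIN monitor_status s ON s.monitor_id = m.monitor_id
    ORDER BY m.monitor_id
""").fetchall()

# The brand is stored once, however many monitors use that model.
model_count = conn.execute("SELECT COUNT(*) FROM appliance_model").fetchone()[0]
print(wide_rows, model_count)
```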
In many ways, performance will be improved if you do many reads and writes (commit many transactions) against the tables of a normalized database. If you use the table as a report and want to present all the data as it sits in the one table, normalizing the database will hurt the performance.
By the way, normalizing the database can prevent redundant data. This can make the database consume less storage space and use less memory.
It is good to have our database normalized. It helps keep the data efficient because we can prevent redundancy and also save memory usage. When normalizing tables, each table needs a primary key, which is used to connect it to another table; when a table's primary key (unique in that table) appears in another table, it is called a foreign key there (used to connect the two tables).
For example, suppose you already have this table:
Table name : appliances_tbl
-inside here you have
-appliance_id : as the primary key
-appliance_name
-brand
-model
and so on about this appliances...
Next you have another table :
Table name : appliance_info_tbl (anything for a table name and must be related to its fields)
-appliance_info_id : primary key
-appliance_price
-appliance_uptime
-appliance_description
-appliance_id : foreign key (so you can get the name of the appliance by using only its id)
and so on....
You can add more tables like that, but just make sure that you have a primary key in each table. You can also note the cardinality of each relationship to make your normalized design easier to understand.
I'm developing a system for a retailer and I've hit a bit of a conundrum when it comes to deciding how to represent the orders in the database. The schema for my Order table so far is as follows:
Id - PK
AccountId - FK (Nullable)
ShippingAddressId - FK (Nullable)
BillingAddressId - FK (Nullable)
ShippingMethod - (Nullable)
Type - (Nullable)
Status
Date
SubTotal
Tax
Total
My problem is I'm not sure whether I should represent online purchases and in-store purchases in separate tables or not. If I were to store them in the same table, the non-nullable fields would be the only ones applicable to in-store purchases.
Another design pattern that crossed my mind is something like this:
Online order table:
PurchaseId - PK, FK
AccountId - FK
ShippingAddressId - FK
BillingAddressId - FK
ShippingMethod
Type
Purchase table:
Id - PK
Status
Date
SubTotal
Tax
Total
And for in-store purchases, there would simply be no reference from the online orders table.
Thoughts?
I would make a second table for location, with a primary key and location information. "Online" could be one of the locations as well. Then use a foreign key in your main table. You would then fill only the fields required for the kind of purchase you are recording (in-store or online). This would also allow the business to grow to more locations simply by adding rows to the location table.
I'm going with the original design. Likely more maintainable and efficient as well.
Your second design is very close to an Entity Sub-typing pattern. If the primary key of your online order table was the foreign key to your purchase table then you would have entity sub-typing.
Your original design is a practical design for the physical implementation of your database because it is simple to use. Entity sub-typing would be the preferred design at the logical level because it clearly represents your rules about which predicates (columns) belong to which logical tables.
Some people would also use the entity sub-typing pattern for their physical model too because they have an aversion to nulls.
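A minimal sketch of the sub-typing pattern, using Python's sqlite3 module. Column names are adapted to snake_case, the address columns are left out for brevity, and the sample values are invented; treat it as one possible shape rather than the definitive one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked to

conn.executescript("""
    -- Supertype: columns that every purchase has.
    CREATE TABLE purchase (
        id            INTEGER PRIMARY KEY,
        status        TEXT NOT NULL,
        purchase_date TEXT NOT NULL,
        sub_total     REAL NOT NULL,
        tax           REAL NOT NULL,
        total         REAL NOT NULL
    );
    -- Subtype: its PK is also the FK to purchase, so at most one row per purchase
    -- and no nullable columns are needed.
    CREATE TABLE online_order (
        purchase_id     INTEGER PRIMARY KEY REFERENCES purchase(id),
        account_id      INTEGER NOT NULL,
        shipping_method TEXT NOT NULL
    );
""")
conn.executemany(
    "INSERT INTO purchase VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "paid", "2015-12-18", 10.0, 1.5, 11.5),  # in-store: no subtype row
        (2, "paid", "2015-12-19", 20.0, 3.0, 23.0),  # online
    ],
)
conn.execute("INSERT INTO online_order VALUES (2, 501, 'courier')")

# A LEFT JOIN returns every purchase; online-only columns are NULL for in-store ones.
rows = conn.execute("""
    SELECT p.id, p.total, o.shipping_method
    FROM purchase p
    LEFT JOIN online_order o ON o.purchase_id = p.id
    ORDER BY p.id
""").fetchall()
print(rows)
```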
I am designing an airline database (the outline of one anyway) for an assignment and seem to be running around in circles.
Three tables are concerned:
Customer        Booking_Reference     Flight
cust_id(pk)     reference_id(pk)      Flight_id(pk)
                cust_id(fk)
A booking reference can have many flights.
A flight will have many booking references.
I am trying to break up the many to many relationship. Is it possible to have a relational table with the flight_id as the attributes (columns) and the booking_reference as the rows (data)? If so there can be no primary key, which is a no-go as I understand.
Alternatively I could make the booking_reference/flight relational table with 2 attributes and a compound primary key of booking_reference/flight, which would result in both entities being duplicated but the primary key being unique (half of it anyway). Is this acceptable design practice?
I was going to just list a maximum of 8 flights as columns in the booking reference table (with NULL for the entries where there are fewer than 8 flights) and give customers with more than 8 flights a new reference_id, but this seems more ridiculous the more I learn about databases, as it results in more reference ids and more NULL data.
Any ideas on which route to take?
Rather than having eight (or any arbitrary number of) columns, create what's sometimes called a join table, with three columns:
Table: references_flights
id (Primary key)
reference_id (fk)
flight_id (fk)
You should then be able to query data across them with the right JOINs, but I'll leave that for someone with more database expertise.
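For what it's worth, the JOINs could look roughly like this, sketched with Python's sqlite3 module. The flight_code column, the sample rows, and the extra UNIQUE constraint on (reference_id, flight_id) are assumptions added for illustration; the constraint plays the role of the compound key the question asks about.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked to

conn.executescript("""
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY);
    CREATE TABLE booking_reference (
        reference_id INTEGER PRIMARY KEY,
        cust_id      INTEGER NOT NULL REFERENCES customer(cust_id)
    );
    CREATE TABLE flight (
        flight_id   INTEGER PRIMARY KEY,
        flight_code TEXT NOT NULL          -- hypothetical descriptive column
    );
    -- Join table: one row per (booking, flight) pair.
    CREATE TABLE references_flights (
        id           INTEGER PRIMARY KEY,
        reference_id INTEGER NOT NULL REFERENCES booking_reference(reference_id),
        flight_id    INTEGER NOT NULL REFERENCES flight(flight_id),
        UNIQUE (reference_id, flight_id)   -- assumption: no pair is listed twice
    );
""")
conn.execute("INSERT INTO customer VALUES (1)")
conn.executemany("INSERT INTO booking_reference VALUES (?, ?)", [(100, 1), (101, 1)])
conn.executemany("INSERT INTO flight VALUES (?, ?)", [(1, "AB100"), (2, "AB200")])
conn.executemany(
    "INSERT INTO references_flights (reference_id, flight_id) VALUES (?, ?)",
    [(100, 1), (100, 2), (101, 1)],
)

# All flights on one booking reference.
flights_for_booking = [row[0] for row in conn.execute("""
    SELECT f.flight_code
    FROM references_flights rf
    JOIN flight f ON f.flight_id = rf.flight_id
    WHERE rf.reference_id = ?
    ORDER BY f.flight_id
""", (100,))]

# All booking references on one flight.
bookings_for_flight = [row[0] for row in conn.execute("""
    SELECT rf.reference_id
    FROM references_flights rf
    WHERE rf.flight_id = ?
    ORDER BY rf.reference_id
""", (1,))]
print(flights_for_booking, bookings_for_flight)
```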
When developing a database, is it compulsory to define a primary key or foreign key in each table? If some tables do not contain any unique field, how can we connect them to the other tables?
Suppose I have three tables.
table1 Personel Detail
Emai_Address (PK)
Name
City
ContactNo
Land_Line_No
D_O_B
Gender
Marital_Status
Language_Known
Table2 Professiona_Detail
Total_Experiance
Annual_Salary
Functional_Area
Current_Industry
Key_Skill
Resume_HeadLine
Table3 WorkPreference
Specify Your Preference
Start Working
Prefered Location
Job Type
Table1 above contains a PK, but Table2 and Table3 do not contain any PK or FK, so how can I connect these three tables?
No. It's not compulsory. But HIGHLY recommended!
Some SQL guru once said:
If it doesn't have a primary key, it's not a table!
Live by that statement!
And foreign keys will make your database more secure, and avoid "zombie" rows. Again: it's not compulsory or technically necessary by all means, but you'll get yourself into trouble if you don't know it right from the start! Trust me.... been there, cleaned up that mess......
Table2 and Table3 should have a FK to Table1. Otherwise you will not know what person the records in those tables are for. Each table should also have a PK defined for it. This is so that you can uniquely identify a row when doing UPDATES or DELETES.
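A short sketch of what a declared FK buys you, using Python's sqlite3 module. The table and column names are tidied-up, hypothetical versions of the ones in the question, the professional_id surrogate key is an assumption, and is_rejected is just a helper for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked to

conn.execute("""
    CREATE TABLE personal_detail (
        email_address TEXT PRIMARY KEY,
        name          TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE professional_detail (
        professional_id INTEGER PRIMARY KEY,
        email_address   TEXT NOT NULL REFERENCES personal_detail(email_address),
        key_skill       TEXT NOT NULL
    )
""")
conn.execute("INSERT INTO personal_detail VALUES ('bob@example.com', 'Bob Smith')")
conn.execute(
    "INSERT INTO professional_detail (email_address, key_skill) VALUES (?, ?)",
    ("bob@example.com", "SQL"),
)


def is_rejected(sql, params=()):
    """Run a statement and report whether a constraint refused it."""
    try:
        conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True


# A child row for a person who does not exist is refused...
orphan_rejected = is_rejected(
    "INSERT INTO professional_detail (email_address, key_skill) VALUES (?, ?)",
    ("nobody@example.com", "COBOL"),
)
# ...and so is deleting a person who still has child rows.
delete_rejected = is_rejected(
    "DELETE FROM personal_detail WHERE email_address = ?", ("bob@example.com",)
)
print(orphan_rejected, delete_rejected)
```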
There is nothing that will enforce primary/foreign keys apart from you as a developer.
They are not compulsory, but are best practice and should be created.
When modelling the database, is it compulsory to define a primary key or foreign key in each table? If some tables do not contain any unique field, how can we connect them to the other tables?
(1) Yes.
You are experiencing complications and difficulties at step 6 because you have not completed step 5. The steps have to be followed in sequence.
A Relational Database requires that the rows (not the identity column) in each table are unique. It is compulsory. If the rows are not unique, it is not a Relational table, it is something else, a bucket of fish.
After that, FKs etc. will be easy. Before that, FKs etc. will be impossible.
(2) You already have a very good, stable unique Identifier for Person. The Professional and WorkPreference tables are missing a column or two. They do not sit out there on their own. Who or what do Professional and WorkPreference apply to?
They belong to a Person. The only Person Identifier you have so far is EmailAddress. So EmailAddress needs to be added to Professional and WorkPreference.
EmailAddress is the PK in Professional and WorkPreference.
EmailAddress is also the FK in Professional and WorkPreference to Person. (So far the cardinality is 1::1.)
(3) Now you may also need a Unique Constraint on Person.Name, but then you have to deal with two people named "Bob Smith", and with "Bob Smith" vs "Smith, Bob" vs "Robert Smith". So there is still some work to do there. If it is a simple database it may not matter; Person.Name as it is may be good enough.
That is it, the task is complete at the logical level.
(4) Now at the physical level (elements that the user does not see), you may decide that carrying the CHAR(30) EmailAddress in the child tables is not sensible for performance reasons, so you may add a narrow Surrogate Key to Person, such as PersonId INT. A Surrogate Key is always an additional column and index; it is not a substitution for the natural keys; you still need EmailAddress UNIQUE as the natural key that maintains uniqueness of rows.
Then you can use PersonId as the PK in Person.
Then you migrate PersonId as the FK and PK to Professional and WorkPreference; instead of EmailAddress.
But you cannot give up Person.EmailAddress UNIQUE, because that is the basis of maintaining unique rows in Person.
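The physical design from step (4) could be sketched as follows with Python's sqlite3 module. Table and column names are snake_case adaptations, only a couple of columns per table are shown, and the sample data and the is_rejected helper are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked to

conn.executescript("""
    CREATE TABLE person (
        person_id     INTEGER PRIMARY KEY,   -- narrow surrogate key
        email_address TEXT NOT NULL UNIQUE,  -- natural key, still enforced
        name          TEXT NOT NULL
    );
    -- person_id is both PK and FK in the child tables, giving a 1::1 relationship.
    CREATE TABLE professional (
        person_id        INTEGER PRIMARY KEY REFERENCES person(person_id),
        total_experience INTEGER,
        key_skill        TEXT
    );
    CREATE TABLE work_preference (
        person_id          INTEGER PRIMARY KEY REFERENCES person(person_id),
        preferred_location TEXT,
        job_type           TEXT
    );
""")
conn.execute(
    "INSERT INTO person (email_address, name) VALUES ('bob@example.com', 'Bob Smith')"
)
person_id = conn.execute(
    "SELECT person_id FROM person WHERE email_address = ?", ("bob@example.com",)
).fetchone()[0]
conn.execute("INSERT INTO professional VALUES (?, 5, 'SQL')", (person_id,))
conn.execute("INSERT INTO work_preference VALUES (?, 'Remote', 'Permanent')", (person_id,))


def is_rejected(sql, params=()):
    """Run a statement and report whether a constraint refused it."""
    try:
        conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True


# The natural key still keeps rows in person unique.
duplicate_email_rejected = is_rejected(
    "INSERT INTO person (email_address, name) VALUES ('bob@example.com', 'Robert Smith')"
)
# PK = FK means a person can have at most one professional row.
second_professional_rejected = is_rejected(
    "INSERT INTO professional VALUES (?, 9, 'Python')", (person_id,)
)

profile = conn.execute("""
    SELECT p.name, pr.key_skill, w.job_type
    FROM person p
    JOIN professional pr ON pr.person_id = p.person_id
    JOIN work_preference w ON w.person_id = p.person_id
""").fetchone()
print(duplicate_email_rejected, second_professional_rejected, profile)
```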