Database normalization for electricity monitoring system - database

I've read a lot of tips and tutorials about normalization but I still find it hard to understand how and when we need normalization. So right now I need to know if this database design for an electricity monitoring system needs to be normalized or not.
So far I have one table with fields:
monitor_id
appliance_name
brand
ampere
uptime
power_kWh
price_kWh
status (ON/OFF)
This monitoring system monitors multiple appliances (TV, Fridge, washing machine) separately.
So does it need to be normalized further? If so, how?

Honestly, you can get away without normalizing every database. Normalization matters most when the database backs a project that affects many people, or when it handles OLTP workloads and runs into performance issues. In many ways, normalization boils down to having more tables with fewer columns each, while denormalization means fewer tables with more columns each.
I've never seen a real database with only one table, but that's ok. Some people denormalize their database for reporting purposes. So it isn't always necessary to normalize a database.
How do you normalize it? Each table needs a primary key (a single column that is unique, or a combination of two or more columns that is unique as a whole). You would then create another table and relate the two with a foreign key. A foreign key is a column (or set of columns) in one table that references the primary key of another table. The paired columns need to share the same data type, and they act as a map from one table to the other. The tables are usually separated by real-world purpose.
For example, you could have a table with status, uptime and monitor_id, linked to the original table by a foreign key on monitor_id. Your original table could then drop the uptime and status columns. You could have a third table with brands, models and the things that all units of a model have in common (e.g., power_kWh, ampere, etc.). The first table would reference it through a foreign key on model. Then the brand column could be dropped from the first table, since the third table supplies the brand for each model.
To create the new tables, use the DDL command CREATE TABLE and declare a foreign key on the column that the new table shares with the original table. Highly normalized tables each hold less information (fewer columns), but you need more tables to store all the data. One benefit is that an update touches only the small table that owns the data, instead of locking rows in one big denormalized table that holds everything.
Once the new tables hold the data from the original table's columns, you can drop those columns from the original table (except for the foreign key column). To drop a column, use a DDL command such as ALTER TABLE originalTable DROP COLUMN brand.
In many ways, performance improves in a normalized database when you do many reads and writes (commit many transactions) against a table. If you use the table as a report and want to present all the data in one place, normalizing the database will hurt performance, because the data has to be joined back together.
By the way, normalizing the database can prevent redundant data. This can make the database consume less storage space and use less memory.
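The split described above could look roughly like the sketch below. It uses Python's built-in sqlite3 module as a stand-in for whatever DBMS you actually run, and the names appliance_model, model and monitor_status are assumptions that do not appear in the question. price_kWh is left out to keep the sketch short.

```python
import sqlite3

# SQLite stands in for the real DBMS. Names beyond the question's own fields
# (appliance_model, model, monitor_status) are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
    -- Facts shared by every unit of the same model
    CREATE TABLE appliance_model (
        model     TEXT PRIMARY KEY,
        brand     TEXT NOT NULL,
        ampere    REAL NOT NULL,
        power_kWh REAL NOT NULL
    );
    -- One row per monitored appliance
    CREATE TABLE monitor (
        monitor_id     INTEGER PRIMARY KEY,
        appliance_name TEXT NOT NULL,
        model          TEXT NOT NULL REFERENCES appliance_model(model)
    );
    -- Frequently changing state, keyed by the same monitor_id
    CREATE TABLE monitor_status (
        monitor_id INTEGER PRIMARY KEY REFERENCES monitor(monitor_id),
        status     TEXT NOT NULL CHECK (status IN ('ON', 'OFF')),
        uptime     REAL NOT NULL
    );
""")

conn.execute("INSERT INTO appliance_model VALUES ('TV-100', 'Acme', 0.8, 0.1)")
conn.executemany(
    "INSERT INTO monitor VALUES (?, ?, ?)",
    [(1, "Living room TV", "TV-100"), (2, "Bedroom TV", "TV-100")],
)
conn.executemany(
    "INSERT INTO monitor_status VALUES (?, ?, ?)",
    [(1, "ON", 5.5), (2, "OFF", 0.0)],
)

# Joining the three tables rebuilds the original wide row on demand.
rows = conn.execute("""
    SELECT m.appliance_name, a.brand, s.status
    FROM monitor AS m
    JOIN appliance_model AS a ON a.model = m.model
    JOIN monitor_status  AS s ON s.monitor_id = m.monitor_id
    ORDER BY m.monitor_id
""").fetchall()
print(rows)
```

Both TVs share one appliance_model row, so the brand is stored once instead of once per appliance.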

It is good to have a normalized database. It keeps the data efficient because it prevents redundancy and also saves memory. When normalizing, each table needs a primary key, which is used to connect it to other tables. When a table's primary key (unique in that table) appears in another table, it is called a foreign key there.
For example, suppose you already have this table:
Table name: appliances_tbl
-appliance_id: primary key
-appliance_name
-brand
-model
and so on for the other attributes of the appliance...
Next you have another table:
Table name: appliance_info_tbl (any table name works, as long as it relates to its fields)
-appliance_info_id: primary key
-appliance_price
-appliance_uptime
-appliance_description
-appliance_id: foreign key (so you can get the name of the appliance using only its id)
and so on...
You can add more tables like these; just make sure each one has a primary key. You can also note the cardinality of each relationship to make the normalized design easier to understand.
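As a rough sketch of the two tables above, here they are in Python's built-in sqlite3 (an assumption; the question does not name a DBMS). The column types and sample rows are made up for illustration. It shows the foreign key doing two jobs: looking up the appliance name from the info row, and rejecting an info row that points at no appliance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce FKs
conn.executescript("""
    CREATE TABLE appliances_tbl (
        appliance_id   INTEGER PRIMARY KEY,
        appliance_name TEXT NOT NULL,
        brand          TEXT,
        model          TEXT
    );
    CREATE TABLE appliance_info_tbl (
        appliance_info_id     INTEGER PRIMARY KEY,
        appliance_price       REAL,
        appliance_uptime      REAL,
        appliance_description TEXT,
        appliance_id          INTEGER NOT NULL
            REFERENCES appliances_tbl(appliance_id)
    );
""")
conn.execute("INSERT INTO appliances_tbl VALUES (1, 'Fridge', 'Acme', 'F-200')")
conn.execute(
    "INSERT INTO appliance_info_tbl VALUES (1, 499.0, 120.5, 'Kitchen fridge', 1)"
)


def appliance_name_for_info(info_id):
    """Follow the foreign key from an info row to the appliance name."""
    row = conn.execute(
        """
        SELECT a.appliance_name
        FROM appliance_info_tbl AS i
        JOIN appliances_tbl AS a ON a.appliance_id = i.appliance_id
        WHERE i.appliance_info_id = ?
        """,
        (info_id,),
    ).fetchone()
    return row[0] if row else None


def insert_orphan_info():
    """Try to insert an info row for a non-existent appliance."""
    try:
        conn.execute(
            "INSERT INTO appliance_info_tbl VALUES (2, 10.0, 0.0, 'orphan', 999)"
        )
    except sqlite3.IntegrityError:
        return False  # the foreign key rejected the row
    return True


print(appliance_name_for_info(1), insert_orphan_info())
```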

Related

One to one relationship. Is Id column needed for both table?

Let's consider the tables:
CREATE TABLE [dbo].[User] (
[Id] INT PRIMARY KEY,
...
);
CREATE TABLE [dbo].[UserInfo_1] (
[Id] INT PRIMARY KEY,
[UserId] INT,
...,
CONSTRAINT FK_UserId FOREIGN KEY ([UserId]) REFERENCES [dbo].[User] (Id)
);
CREATE TABLE [dbo].[UserInfo_2] (
[UserId] INT PRIMARY KEY,
...,
CONSTRAINT FK_UserInfo_2_UserId FOREIGN KEY ([UserId]) REFERENCES [dbo].[User] (Id)
);
What are the pros and cons of using FOREIGN KEY for the UserInfo_1 and UserInfo_2 tables? Also in terms of ORM.
I don't think there is a con to using foreign keys on any kind of table. In fact, I make sure to have a primary key on every table I use, especially temp tables and table variables, since I know I will be joining and filtering on them.
Now, User and UserInfo_1 form a one-to-many relationship, meaning a single User can have many different UserInfo_1 rows associated with it.
User and UserInfo_2 form a one-to-one relationship, in which a single User can only ever have one UserInfo_2 row associated with it.
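The difference between the two designs can be checked with a small sketch. It recreates the three tables in Python's built-in sqlite3 rather than SQL Server (an assumption made so the example runs anywhere), with the elided columns left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE "User" (Id INTEGER PRIMARY KEY);
    -- UserId is a plain foreign key: many rows may point at one user
    CREATE TABLE UserInfo_1 (
        Id     INTEGER PRIMARY KEY,
        UserId INTEGER REFERENCES "User"(Id)
    );
    -- UserId is both primary key and foreign key: at most one row per user
    CREATE TABLE UserInfo_2 (
        UserId INTEGER PRIMARY KEY REFERENCES "User"(Id)
    );
    INSERT INTO "User" (Id) VALUES (1);
""")


def try_insert(sql, params):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


results = {
    "info1_first": try_insert(
        "INSERT INTO UserInfo_1 (Id, UserId) VALUES (?, ?)", (1, 1)),
    "info1_second": try_insert(
        "INSERT INTO UserInfo_1 (Id, UserId) VALUES (?, ?)", (2, 1)),
    "info2_first": try_insert(
        "INSERT INTO UserInfo_2 (UserId) VALUES (?)", (1,)),
    "info2_second": try_insert(
        "INSERT INTO UserInfo_2 (UserId) VALUES (?)", (1,)),
}
print(results)
```

UserInfo_1 accepts a second row for the same user, while UserInfo_2 rejects it because UserId is its primary key.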
In terms of performance, as long as the key columns are indexed, the two designs should perform about the same, depending on your filtering and on which plan was cached. You may not run into cached-plan issues at all, since EF issues ad hoc statements, though EF does run up the plan cache memory, and that is typically recommended to be disabled when using EF.
One-to-One
I am a fan of one to one relationship, especially in a Domain Driven Design aspect, and when implemented correctly. If each of your rows for User is going to require information from UserInfo_2, then I would theoretically keep them on the User table. Now if you know you will not be querying that information much or not all Users will require the columns on that table, or if your main table is fairly large I would keep it as a one to one relationship.
I personally like to use System Versioning. I have tables with some columns that update often and others that almost never update. The columns I know update on a daily/weekly basis I group into a table with a one-to-one relationship to the main table, which should almost never update. But each business's needs and scenarios are different. No one design fits all situations.
Benefits of Indexing
When you create a primary key, SQL Server creates an index to back it. A foreign key constraint does not create an index by itself in SQL Server, so you normally add one on the referencing column. With the index in place, queries run faster because the optimizer can use it to find what you are looking for. Without an index, your query plan turns into a table scan, which is a row-by-row search and can seriously slow your system down as the table grows.
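To see the scan-versus-index difference for yourself, here is a small sketch using Python's built-in sqlite3 and its EXPLAIN QUERY PLAN statement. SQLite is an assumption here, not the engine discussed above, and the table and index names are made up. Whether declaring a foreign key also creates an index depends on the engine; in SQLite it does not, so the index is created explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY);
    CREATE TABLE child (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES parent(id)
    );
""")


def plan(sql):
    """Return the human-readable plan text (4th column of each plan row)."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


query = "SELECT * FROM child WHERE parent_id = 42"

# No index on child.parent_id yet: the engine has to scan the whole table.
plan_before = plan(query)

# Add an index on the foreign key column (hypothetical index name).
conn.execute("CREATE INDEX ix_child_parent_id ON child(parent_id)")
plan_after = plan(query)

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan wording varies between SQLite versions, but the first plan reports a scan of child and the second one names ix_child_parent_id.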
Should you choose to create your tables without a foreign key, you might run into issues on the ORM side. When you query your database from EF, you go through your DbSet. If the two tables are properly connected by a foreign key, EF can use .Include to join them and fetch what you need in one query. Otherwise you are forced to run two queries against the database, one per table.
In a project I worked on one time, a developer did that. He did not properly attach a Foreign Key connection between two objects and then didn't understand why EF would not properly return his values when he used the .Include and wasn't very fast. He thought it was EF's fault and had to do two queries to obtain the information he needed.
Well User > UserInfo_1 is a one-to-many relationship, as UserInfo_1.UserID is not a key. And EF 6 doesn't support alternate keys. EF Core does, so you could make it a key.
But the simplest design is always to collapse 1-1 relationships into a single table. In EF Core you can still have a main Entity Type and one or more separate Owned Entity Types. But on the database it's typically better to have them in a single table.
The second-simplest is to have both tables share the same key columns.

Define sets in Visio DB design

I have a Transaction table that contains cashiers' transactions. There are currently two transaction types, sale and refund, which are defined in the TransType table. We may have more in the future. I'm trying to show this in Visio 2016, but I'm not sure I'm doing it correctly.
How do I define the set of available record values in TransType and define its relation to the Transactions table? What type of relation would it be?
I would recommend creating a TransTypeId in your TransType table.
Then, in the Transactions table, create a column that serves as a foreign key, linking the two tables. TransTypeFk in your Transactions table should then be an integer with a foreign key constraint to TransTypeId.
As a note, your relationship path between the two entities should always connect two specific columns, not a column and an entity.
The relationship would be one TransType to many Transactions.
You can define the set of available values by adding rows to the TransType table. For Visio, using a note or text box would likely work fine.
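A minimal sketch of this lookup-table arrangement, written with Python's built-in sqlite3 as a stand-in for the real database. TransTypeId and TransTypeFk come from the answer above; the Name and Amount columns and the 'Void' type are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- The rows of this table ARE the set of allowed transaction types
    CREATE TABLE TransType (
        TransTypeId INTEGER PRIMARY KEY,
        Name        TEXT NOT NULL UNIQUE
    );
    -- One TransType to many Transactions
    CREATE TABLE Transactions (
        TransactionId INTEGER PRIMARY KEY,
        Amount        REAL NOT NULL,
        TransTypeFk   INTEGER NOT NULL REFERENCES TransType(TransTypeId)
    );
    INSERT INTO TransType (TransTypeId, Name) VALUES (1, 'Sale'), (2, 'Refund');
""")


def add_transaction(amount, trans_type_id):
    """Insert a transaction; return False if the type is not in TransType."""
    try:
        conn.execute(
            "INSERT INTO Transactions (Amount, TransTypeFk) VALUES (?, ?)",
            (amount, trans_type_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


sale_ok = add_transaction(9.99, 1)      # known type
unknown_ok = add_transaction(5.00, 3)   # type 3 does not exist yet

# Adding a new type later is just another row, no schema change needed.
conn.execute("INSERT INTO TransType (TransTypeId, Name) VALUES (3, 'Void')")
void_ok = add_transaction(5.00, 3)
print(sale_ok, unknown_ok, void_ok)
```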

Identifying where joins can be made between two sql server tables

Very new to sql server. I have a db with about 20 tables each with around 40 columns. How can I select two tables and see if they have any columns in common?
I basically want to see where I can make joins.. If there's a better way of quickly telling where I can combine info from two tables that could be helpful too.
First of all, relational databases have no concept of "joinable tables and/or columns". You can always combine two relations (= tables) by pairing every row of one with every row of the other (their cross, or Cartesian, product) and then filtering the pairs with some predicate (the result is called a "join" if the predicate involves columns of both relations).
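The "cross product, then filter" view of a join can be checked with a tiny sketch. It uses Python's built-in sqlite3 (an assumption, chosen only so the example is runnable) and two made-up tables, author and book.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO book VALUES (10, 1, 'A1'), (11, 1, 'A2'), (12, 2, 'B1');
""")

# Every author paired with every book: 2 x 3 rows, no predicate at all.
cross = conn.execute(
    "SELECT a.name, b.title FROM author AS a CROSS JOIN book AS b"
).fetchall()

# The same cross product, filtered by a predicate over both tables.
filtered = conn.execute("""
    SELECT a.name, b.title
    FROM author AS a CROSS JOIN book AS b
    WHERE a.id = b.author_id
    ORDER BY b.id
""").fetchall()

# The usual JOIN ... ON spelling of the same thing.
joined = conn.execute("""
    SELECT a.name, b.title
    FROM author AS a JOIN book AS b ON a.id = b.author_id
    ORDER BY b.id
""").fetchall()

print(len(cross), filtered == joined)
```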
The idea of "joinable" tables/columns comes into being only when thinking about the database schema. The schema's author can ask the database engine to enforce some referential integrity, by means of foreign keys.
Now if your database schema is well done (that is, its author was kind/clever enough to put referential integrity all over the schema) you can have a clue of which tables are joinable (by which columns).
To find those foreign keys, for each table you can run sp_help 'databasename.tablename' (you can omit the databasename. part, if it is the current database).
This command will output some facts about the given table, like its columns (along with their datatypes, requiredness, ...), its indexes and so on. Somewhere near the end it will list foreign keys along with where (if ever) its primary key is imported as foreign key on other tables.
For each key imported as foreign key on other table you have a candidate predicate for a join.
Please note that this procedure will only work if the foreign keys are set correctly. If they aren't, you can fix your database schema (but to do this you must know already which tables are joinable anyway). Also it won't show you joinable tables on other databases (in the same or linked server).
This also won't work for views.
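The same idea (read the declared foreign keys out of the catalog and treat each one as a join candidate) can be sketched in Python's built-in sqlite3, where PRAGMA foreign_key_list plays the role that sp_help output plays above. SQLite is an assumption here; on SQL Server the equivalent metadata lives in catalog views such as sys.foreign_keys and sys.foreign_key_columns. The customers/orders/order_items schema is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id)
    );
    CREATE TABLE order_items (
        id       INTEGER PRIMARY KEY,
        order_id INTEGER REFERENCES orders(id)
    );
""")


def join_candidates(connection):
    """List (child_table, child_column, parent_table, parent_column) tuples."""
    tables = [
        row[0]
        for row in connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    found = []
    for table in tables:
        # Each row: (id, seq, parent_table, child_column, parent_column, ...)
        for fk in connection.execute(f'PRAGMA foreign_key_list("{table}")'):
            found.append((table, fk[3], fk[2], fk[4]))
    return found


for child, child_col, parent, parent_col in join_candidates(conn):
    print(f"{child}.{child_col} = {parent}.{parent_col}")
```

As with sp_help, this only finds relationships that were declared as foreign keys.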
Try looking at the database diagram in SQL Server Management Studio; it shows the relations between tables.

Database normalization - How not OK is it to have a table with no relationships?

I'm really new to database design, as I will now demonstrate:
I have an MS Sql database that I need to add a table to. The table contains information that pertains to another table. However, there are no candidates for primary keys (all fields can be duplicates). The only thing the table will ever be used for is to keep records that may be required for a certain kind of query, and they can be retrieved super-easily using a field that my other tables also contain (but never uniquely).
Specifically, my main table has a bunch of chemistry records. Each chemistry record is associated with another set of records called quality-control records (in my second table). They are associated by a field called "BatchID". The super-easy part is that I can say, "get all records with this BatchID" and get exactly what I need. But there can be multiple instances of any BatchID in both tables (in fact, there usually are), so I'd need to jump through hoops to link them. In a more general sense, in theory, is it OK to have a table floating around not attached to anything?
The overwhelmingly simple solution is to just put the quality control in the db with no relationships to the chemistry table. I'd need to insert at least one other table to relate it to anything else, maybe more, and the only reason for complicating my life like that is that I don't want to violate some important precept of database design.
My question is, is it ever OK to just have a free-floating table in a database? Or is that right out?
Thanks for any help.
In theory, it's ok to have a table that doesn't have any foreign key constraints. But the table you describe (both tables you describe) should probably have a foreign key that references the table of batches. We'd expect the table of batches to have "BatchID" as its primary key.
The relational model requires tables to have at least one candidate key. It's almost always a bad idea to have a SQL table that doesn't have a candidate key.
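A sketch of that suggestion, using Python's built-in sqlite3 in place of MS SQL (an assumption made so it runs anywhere). Only BatchID comes from the question; the table names, the surrogate id columns and the other columns are hypothetical. The surrogate id gives each record table a candidate key even though every other field can repeat.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- One row per batch; BatchID is unique here
    CREATE TABLE batch (BatchID TEXT PRIMARY KEY);
    CREATE TABLE chemistry_record (
        id      INTEGER PRIMARY KEY,      -- surrogate candidate key
        BatchID TEXT NOT NULL REFERENCES batch(BatchID),
        analyte TEXT,
        result  REAL
    );
    CREATE TABLE qc_record (
        id      INTEGER PRIMARY KEY,      -- surrogate candidate key
        BatchID TEXT NOT NULL REFERENCES batch(BatchID),
        qc_type TEXT,
        result  REAL
    );
    INSERT INTO batch VALUES ('B-001'), ('B-002');
    INSERT INTO chemistry_record (BatchID, analyte, result)
        VALUES ('B-001', 'Pb', 0.02), ('B-001', 'Cd', 0.01);
    INSERT INTO qc_record (BatchID, qc_type, result)
        VALUES ('B-001', 'blank', 0.0), ('B-001', 'spike', 98.5),
               ('B-002', 'blank', 0.0);
""")


def qc_for_batch(batch_id):
    """The 'get all records with this BatchID' query from the question."""
    return conn.execute(
        "SELECT qc_type, result FROM qc_record WHERE BatchID = ? ORDER BY id",
        (batch_id,),
    ).fetchall()


def add_qc(batch_id, qc_type, result):
    """Insert a QC record; unknown batches are rejected by the foreign key."""
    try:
        conn.execute(
            "INSERT INTO qc_record (BatchID, qc_type, result) VALUES (?, ?, ?)",
            (batch_id, qc_type, result),
        )
        return True
    except sqlite3.IntegrityError:
        return False


print(qc_for_batch("B-001"), add_qc("B-999", "blank", 0.0))
```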

Database Mapping - Multiple Foreign Keys

I want to make sure this is the best way to handle a certain scenario.
Let's say I have three main tables; I will keep them generic. They all have primary keys, and they are all independent tables that reference nothing.
Table 1
PK
VarChar Data
Table 2
PK
VarChar Data
Table 3
PK
VarChar Data
Here is the scenario, I want a user to be able to comment on specific rows on each of the above tables. But I don't want to create a bunch of comment tables. So as of right now I handled it like so..
There is a comment table that has three foreign key columns each one references the main tables above. There is a constraint that only one of these columns can be valued.
CommentTable
PK
FK to Table1
FK to Table2
FK to Table3
VarChar Comment
FK to Users
My question: is this the best way to handle the situation? Does a generic foreign key exist? Or should I have a separate comments table for each main table.. even though the data structure would be exactly the same? Or would a mapping table for each one be a better solution?
My question: is this the best way to handle the situation?
Multiple FKs with a CHECK that allows only one of them to be non-NULL is a reasonable approach, especially for relatively few tables like in this case.
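A sketch of that design in Python's built-in sqlite3 (an assumption; the question does not name a DBMS). Table and column names are made up. The CHECK relies on SQLite evaluating IS NOT NULL to 0 or 1 so the three results can be added; other engines would typically need CASE expressions for the same test.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, data TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, data TEXT);
    CREATE TABLE table3 (id INTEGER PRIMARY KEY, data TEXT);
    CREATE TABLE users  (id INTEGER PRIMARY KEY);
    CREATE TABLE comments (
        id        INTEGER PRIMARY KEY,
        table1_id INTEGER REFERENCES table1(id),
        table2_id INTEGER REFERENCES table2(id),
        table3_id INTEGER REFERENCES table3(id),
        body      TEXT NOT NULL,
        user_id   INTEGER NOT NULL REFERENCES users(id),
        -- Exactly one of the three foreign keys must be set
        CHECK ((table1_id IS NOT NULL)
             + (table2_id IS NOT NULL)
             + (table3_id IS NOT NULL) = 1)
    );
    INSERT INTO table1 VALUES (1, 'a');
    INSERT INTO table2 VALUES (1, 'b');
    INSERT INTO table3 VALUES (1, 'c');
    INSERT INTO users VALUES (1);
""")


def add_comment(body, table1_id=None, table2_id=None, table3_id=None, user_id=1):
    """Insert a comment; return False if any constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO comments (table1_id, table2_id, table3_id, body, user_id)"
            " VALUES (?, ?, ?, ?, ?)",
            (table1_id, table2_id, table3_id, body, user_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


one_target = add_comment("fine", table2_id=1)
two_targets = add_comment("ambiguous", table1_id=1, table3_id=1)
no_target = add_comment("dangling")
print(one_target, two_targets, no_target)
```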
The alternate approach would be to "inherit" the Table 1, 2 and 3 from a common "parent" table, then connect the comments to the parent.
Look here and here for more info.
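For comparison, the "common parent" alternative might be sketched as follows, again in Python's built-in sqlite3 as a stand-in. The parent table name commentable and the helper functions are hypothetical. Note that this bare-bones version does not stop two child tables from reusing the same parent id; enforcing that takes extra machinery not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Parent: one row for anything that can receive comments
    CREATE TABLE commentable (id INTEGER PRIMARY KEY);
    -- Children share the parent's key
    CREATE TABLE table1 (id INTEGER PRIMARY KEY REFERENCES commentable(id), data TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY REFERENCES commentable(id), data TEXT);
    CREATE TABLE table3 (id INTEGER PRIMARY KEY REFERENCES commentable(id), data TEXT);
    -- Comments need only a single foreign key, to the parent
    CREATE TABLE comments (
        id             INTEGER PRIMARY KEY,
        commentable_id INTEGER NOT NULL REFERENCES commentable(id),
        body           TEXT NOT NULL
    );
""")

CHILD_TABLES = {"table1", "table2", "table3"}


def add_row(table, data):
    """Create the parent row first, then the child row with the same id."""
    if table not in CHILD_TABLES:
        raise ValueError(f"unknown table: {table}")
    new_id = conn.execute("INSERT INTO commentable DEFAULT VALUES").lastrowid
    conn.execute(f"INSERT INTO {table} (id, data) VALUES (?, ?)", (new_id, data))
    return new_id


def add_comment(commentable_id, body):
    """Attach a comment to any row, whichever child table it lives in."""
    try:
        conn.execute(
            "INSERT INTO comments (commentable_id, body) VALUES (?, ?)",
            (commentable_id, body),
        )
        return True
    except sqlite3.IntegrityError:
        return False


first = add_row("table1", "x")
second = add_row("table3", "y")
print(add_comment(first, "on table1"), add_comment(second, "on table3"))
```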
Does a generic foreign key exist?
If you mean a FK that can "jump" from table to table, then no.
Assuming all 3 FKs are of the same type¹, you could theoretically implement something similar by keeping both foreign key value and referenced table name² and then enforcing it through a trigger, but declarative constraints should be preferred over that, even at a price of slightly more storage space.
If your DBMS fully supports "virtual" or "calculated" columns, then you could do something similar to above, but instead of having a trigger, generate 3 calculated columns based on FK value and table name. Only one of these calculated columns would be non-NULL at any given time and you could use "normal" FKs for them as you would for the physical columns.
But, all that would make sense when there are many "connectable" tables and your DBMS is not thrifty in storing NULLs. There is very little to gain when there are just 3 of them or even when there are many more than that but your DBMS spends only one bit on each NULL field.
Or should I have a separate comments table for each main table, even though the data structure would be exactly the same?
The "data structure" is not the only thing that matters. If you happen to have different constraints (e.g. a FK that applies to one of them but not the other), that would warrant separate tables even though the columns are the same.
But, I'm guessing this is not the case here.
Or would a mapping table for each one be a better solution?
I'm not exactly sure what you mean by "mapping table", but you could do something like this:
Unfortunately, that would allow a single comment to be connected to more than one table (or no table at all), and is in itself a complication over what you already have.
All said and done, your original solution is probably fine.
1 Or you are willing to store it as string and live with conversions, which you should be reluctant to do.
2 In practice, this would not really be a name (as in string) - it would be an integer (or enum if DBMS supports it) with one of the well-known predefined values identifying the table.
Thanks for all the help, folks. I was able to formulate a solution with the help of a colleague of mine. Instead of multiple mapping tables I decided to just use one.
This mapping table holds a group of comments, so it has no primary key. Each group row links back to a comment, so the same group id can appear multiple times. The relationship would be one-many-one.

Resources