Should a large database table be split where it doesn't improve normalisation?

I am currently building a database system for a company that stores a lot of information about its employees. Performance isn't a massive issue, as there won't be a huge number of employees in the database (<1000).
The database design that I have come up with so far has been normalized to an extent, so employee data has been split into separate tables where multiple data items need to be stored for each employee. However, there are a lot of fields which are primary key dependent, so the main employee information table has ended up with around 50 columns.
Is this too many, and should I try to group similar items into their own tables, e.g. a table for contact information and a table for personal information, or is it better to leave it as is?

Related

More Rows vs More Tables

I have recently started designing a database for one of my projects. I am confused about one simple question: "More Rows vs More Tables". I am not experienced enough to answer this question. Any help on this will be appreciated. Here is the scenario:
Scenario
I have a Company. A Company will have many Users and Vehicles.
More Rows:
Should I have one table for users and one for vehicles, each with a reference to COMPANY_ID? Obviously, over time they will hold a lot of records. I have to use a GUID as the ID because of the requirements, so if there are too many records, I think it will affect search operations as well.
More Tables:
Should I create 2 tables with a company prefix every time I add a new company? E.g. if I add a new company "Tesla", the table names will be TESLA_USER and TESLA_VEHICLES. Obviously, over time the number of tables will increase a lot.
My concern is which way is more efficient: More Rows or More Tables?
Thank you
Cheers
D
You can create a table for the companies, a table for users and a table for vehicles, in which you put all your data. Then you add two join tables that store only the links between companies and users and between companies and vehicles.
Example
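A minimal sketch of the layout described above, using Python's built-in sqlite3 module purely for illustration. All table and column names, the sample rows, and the insert and users_of helpers are assumptions, not part of the original answer. GUIDs are stored as text, since the question requires GUID keys.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE companies (id TEXT PRIMARY KEY, name  TEXT NOT NULL);
    CREATE TABLE users     (id TEXT PRIMARY KEY, name  TEXT NOT NULL);
    CREATE TABLE vehicles  (id TEXT PRIMARY KEY, plate TEXT NOT NULL);

    -- Join tables: they store only the links, never the data itself.
    CREATE TABLE company_users (
        company_id TEXT NOT NULL REFERENCES companies(id),
        user_id    TEXT NOT NULL REFERENCES users(id),
        PRIMARY KEY (company_id, user_id)
    );
    CREATE TABLE company_vehicles (
        company_id TEXT NOT NULL REFERENCES companies(id),
        vehicle_id TEXT NOT NULL REFERENCES vehicles(id),
        PRIMARY KEY (company_id, vehicle_id)
    );
""")


def insert(table, column, value):
    """Insert one row with a fresh GUID and return that GUID (hypothetical helper)."""
    row_id = str(uuid.uuid4())
    conn.execute(f"INSERT INTO {table} (id, {column}) VALUES (?, ?)", (row_id, value))
    return row_id


tesla = insert("companies", "name", "Tesla")
acme = insert("companies", "name", "Acme")
alice, bob, carol = (insert("users", "name", n) for n in ("Alice", "Bob", "Carol"))
car = insert("vehicles", "plate", "ABC-123")

conn.executemany(
    "INSERT INTO company_users VALUES (?, ?)",
    [(tesla, alice), (tesla, bob), (acme, carol)],
)
conn.execute("INSERT INTO company_vehicles VALUES (?, ?)", (tesla, car))


def users_of(company_id):
    """Return the names of all users linked to one company, sorted by name."""
    rows = conn.execute(
        "SELECT u.name FROM users u "
        "JOIN company_users cu ON cu.user_id = u.id "
        "WHERE cu.company_id = ? ORDER BY u.name",
        (company_id,),
    )
    return [name for (name,) in rows]
```

Because company_id leads the composite primary key of each join table, a lookup by company can use that index rather than scanning every row, which is why "more rows" in shared tables is usually not a search problem.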

Oracle APEX - Data Modeling & Primary Keys

I'm creating a rather large APEX application which allows managers to go in and record statistics for associates in the company. Currently we have a database in Oracle with data from AD which holds all the associates' information: Name, Manager, Employee ID, etc.
Now I'm responsible for creating and modeling a table that will house all the stats for each employee. The table I have created has more than 90 columns in it. Some contain data such as:
Documents Processed
Calls Received
Amount of Doc 1 Processed
Amount of Doc 2 Processed
and the list goes on for well over 90 attributes. So here is my question:
When creating this table in my application with so many different columns, how would I go about choosing a primary key that's appropriate? Should I link it to our employee table using the employee's identification, which is unique (each has an associate number)?
Secondly, how can I create these tables (and possibly form) to allow me to associate the statistic I am entering for an individual to the actual individual?
I have ordered two books from Amazon on data modeling since I am new to APEX and DBA design. Not a fresh chicken, but new enough to need some guidance. An additional problem I am running into is that each form can have only 60 fields. So I had thought about creating tables for different functions out of the 90+ attributes I have.
Thanks
APEX 4.2 allows for 200 items per page (see the Oracle APEX component limits).
A couple of questions come to mind:
Are you sure that the employee IDs are not recyclable? If these IDs are unique and not recycled, you've found yourself a good primary key.
What do you plan on doing when you decide to add a new metric? Seems like you might have to add a new column to your rather large and likely not normalized table.
I'd recommend a vertical table for your metrics. You can use Oracle's PIVOT function to make your data appear more like a horizontal table.
If you went this route you would store your employee ID in one column, your metric key in another, and value...
I'd recommend that you create a metric table consisting of a primary key, a metric label, an active indicator, creation timestamp, creation user id, modified timestamp, modified user id.
This metric table will allow you to add new metrics, change the name of the metric, deactivate a metric, and determine who changed what and when.
This would be a much more flexible approach in my opinion. You may also want to think about audit logs.
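The vertical layout and metric table suggested above might look roughly like this. It is a sketch in Python's built-in sqlite3 module, not Oracle: SQLite has no PIVOT operator, so the hypothetical pivot helper builds the horizontal view with conditional aggregation instead. Table names, column names and sample values are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One row per metric definition, with the audit columns from the answer.
    CREATE TABLE metric (
        id          INTEGER PRIMARY KEY,
        label       TEXT NOT NULL UNIQUE,
        active      INTEGER NOT NULL DEFAULT 1,
        created_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        created_by  TEXT NOT NULL,
        modified_at TEXT,
        modified_by TEXT
    );
    -- Vertical table: one row per (employee, metric) instead of 90+ columns.
    CREATE TABLE employee_metric (
        employee_id INTEGER NOT NULL,
        metric_id   INTEGER NOT NULL REFERENCES metric(id),
        value       REAL NOT NULL,
        PRIMARY KEY (employee_id, metric_id)
    );
""")
conn.executemany(
    "INSERT INTO metric (id, label, created_by) VALUES (?, ?, ?)",
    [(1, "Documents Processed", "admin"), (2, "Calls Received", "admin")],
)
conn.executemany(
    "INSERT INTO employee_metric VALUES (?, ?, ?)",
    [(1001, 1, 40), (1001, 2, 12), (1002, 1, 55)],
)


def pivot(conn):
    """Return (labels, rows) with one output column per active metric."""
    labels = [r[0] for r in conn.execute(
        "SELECT label FROM metric WHERE active = 1 ORDER BY id")]
    # One MAX(CASE ...) column per label; the labels are bound as parameters.
    columns = ["em.employee_id"] + [
        "MAX(CASE WHEN m.label = ? THEN em.value END)" for _ in labels
    ]
    sql = (
        f"SELECT {', '.join(columns)} "
        "FROM employee_metric em JOIN metric m ON m.id = em.metric_id "
        "GROUP BY em.employee_id ORDER BY em.employee_id"
    )
    return labels, conn.execute(sql, labels).fetchall()


labels, rows = pivot(conn)
```

Adding a new metric is then a single INSERT into metric rather than an ALTER TABLE, and an employee with no value for a metric simply shows up as NULL in the pivoted row.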

Where should I store repetitive data in Access?

I'm creating this little Access DB, for the HR department to store all data related to all the training sessions that the company organizes for all the employees.
So, I have a Training Session table with information like date, subject, place, observations, trainer, etc, and the unique ID number.
Then there's the Personnel table, with employee ID (which is also the table's unique key), names and working department.
So, after that I need another table that keeps a record of all the attendees of each training session. And here's the question: should I use a table for that in the first place? Does it have to be one table for each training session to store the attendees?
I've used Excel for quite some time now, but I'm very new to Access and databases (even small ones like this). Any information will be highly appreciated.
Thanks in advance!
It should be one table for persons, one table for trainings, and one for participation/attendance, to minimize (or best: avoid) repetition. Your tables should use primary and foreign keys, so that there are one-to-many relationships between trainings and attendances as well as people and attendances (the attendances table would then have a column referring to the person who attended, and another column referring to the training session).
Google "database normalization" for more detail and variations of that principle (https://en.wikipedia.org/wiki/Database_normalization).
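As a rough illustration of the three-table structure described in the answer, here is a sketch using Python's built-in sqlite3 module rather than Access. The table names, column names, sample rows and the attendees helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE personnel (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        department  TEXT NOT NULL
    );
    CREATE TABLE training_session (
        session_id INTEGER PRIMARY KEY,
        held_on    TEXT NOT NULL,
        subject    TEXT NOT NULL
    );
    -- One attendance table for ALL sessions: one row per (session, person).
    CREATE TABLE attendance (
        session_id  INTEGER NOT NULL REFERENCES training_session(session_id),
        employee_id INTEGER NOT NULL REFERENCES personnel(employee_id),
        PRIMARY KEY (session_id, employee_id)
    );
    INSERT INTO personnel VALUES (1, 'Ana', 'Sales'), (2, 'Ben', 'IT'), (3, 'Cy', 'IT');
    INSERT INTO training_session VALUES (1, '2024-03-01', 'Fire safety'),
                                        (2, '2024-04-15', 'First aid');
    INSERT INTO attendance VALUES (1, 1), (1, 2), (2, 3);
""")


def attendees(conn, session_id):
    """Return the names of everyone who attended one training session."""
    rows = conn.execute(
        "SELECT p.name FROM attendance a "
        "JOIN personnel p ON p.employee_id = a.employee_id "
        "WHERE a.session_id = ? ORDER BY p.name",
        (session_id,),
    )
    return [name for (name,) in rows]


# The foreign key rejects attendance rows for people who do not exist.
try:
    conn.execute("INSERT INTO attendance VALUES (?, ?)", (1, 999))
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
```

The composite primary key on attendance also prevents recording the same person twice for the same session.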

My relational database structure is getting too large and I don't know how to shrink it

My new employer runs a standard ecommerce website with about 14k products. They create "microsites" that a person can log into and see custom pricing and limited products (hide some/show some products) and custom categories.
The current system just has a huge relational database. So there is a products table, a sites table, and then sites_products has the following columns:
site_id
product_id
product_price
If the microsite is supposed to show a product, it simply stores it in this table. This table is currently at about 2 million rows and growing. The custom categories use a similar relational table setup, but the numbers are much lower, so I am not as worried about that.
I would appreciate any help/ideas you could provide to decrease this table size. I am confident that in the next couple years it will be at 20 million at this rate.
-Justin

Best practice: database referencing tables

In database design, what are people's feelings on storing small pieces of data directly in the tuple vs. in a referenced table?
For instance, suppose you are designing a schema for office management. You want to record which department each employee belongs to, but are otherwise uninterested in any information relating to departments. So do you store the department as a string/char/varchar/etc. in your EMPLOYEE table, or make it a foreign key referencing a DEPARTMENT table?
If the DEPARTMENT table records nothing other than department names, one would normally want to fold it into the EMPLOYEE table. But if the name is stored in the EMPLOYEE table, nothing guarantees consistency: some users will call HR "HumanResourses", some may call it "H-R", some may call it "human resources", etc. Having it as a foreign key guarantees that it can be only one thing. Also, if other information about departments ever needs to be added, that is easy when they have a table of their own.
So what do people think about it? Naturally, more tables and referencing are also likely to have a negative impact on performance. My question is asked specifically with Oracle 11g in mind, but I doubt that the type of RDBMS involved has much bearing on this design consideration.
If you use the related table, then you don't have the performance problem of updating 1,000,000 records because the Personnel Department became the Human Resources department.
You have another option. Create the table and use it as a lookup for data entry. But store the information in the main table.
However, I prefer the option of using the related table for the departments and storing the IDs of the department and the employee in a join table that has the IDs plus start and end dates. Over time employees tend to move from one department to another. It is helpful for reporting to be able to tell what department they were in when. You need to consider how the data will be used over time and in reporting when designing this sort of thing. Short-sighted designs are hard to fix later.
Your concern about having too many tables is really unfounded. Databases are designed to have many tables and to use joins. If you index correctly, there will be no performance implications for most databases. In fact, I know of relational databases with many, many tables and terabytes of data that perform just fine.
You only have to worry about the performance impact of this sort of thing if you're dealing with truly massive datasets. For any regular office environment system like this, prefer the normalized schema.
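The related-table and department-history ideas from the answers above can be sketched as follows, again using Python's built-in sqlite3 module as a stand-in for Oracle. The schema, the sample data and the department_on helper are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    -- Lookup table: each department name is stored exactly once.
    CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE employee   (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Join table with start and end dates; a NULL end_date marks the current row.
    CREATE TABLE employee_department (
        employee_id   INTEGER NOT NULL REFERENCES employee(id),
        department_id INTEGER NOT NULL REFERENCES department(id),
        start_date    TEXT NOT NULL,
        end_date      TEXT,
        PRIMARY KEY (employee_id, start_date)
    );
    INSERT INTO department VALUES (1, 'Personnel'), (2, 'Engineering');
    INSERT INTO employee VALUES (1, 'Dana');
    INSERT INTO employee_department VALUES
        (1, 2, '2020-01-01', '2021-12-31'),
        (1, 1, '2022-01-01', NULL);
""")


def department_on(conn, employee_id, day):
    """Return the name of the department an employee belonged to on an ISO date."""
    row = conn.execute(
        "SELECT d.name FROM employee_department ed "
        "JOIN department d ON d.id = ed.department_id "
        "WHERE ed.employee_id = ? AND ed.start_date <= ? "
        "AND (ed.end_date IS NULL OR ed.end_date >= ?)",
        (employee_id, day, day),
    ).fetchone()
    return row[0] if row else None


before = department_on(conn, 1, "2022-06-01")
# Renaming the department touches one row, however many employees reference it.
conn.execute("UPDATE department SET name = 'Human Resources' WHERE id = 1")
after = department_on(conn, 1, "2022-06-01")
past = department_on(conn, 1, "2020-06-01")
```

Dates are stored as ISO 8601 text here so that plain string comparison orders them correctly; a real Oracle schema would more likely use DATE columns.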
