Closed. This question is opinion-based. It is not currently accepting answers.
Closed 2 years ago.
I am on a project to create a new Java EE application (JSF, Hibernate, Spring Security, Informix database). This application will automate the entry of notes for the annual interview of bank employees.
At the very beginning, everything was entered in an Excel file which then generated a report with the various performance graphs (according to the notes entered from 0 to 4).
Now I want to do a fairly optimized database design. I thought of creating the following tables:
Interview with columns (interview_id, interview_date),
Competency with columns (competency_id, competency_group, competency_name),
Interview_note with columns (interview_note_id, employee_id (FK), interview_id (FK), competency_id (FK))
However, I have some doubts about how to keep it compact and logical. Is this the right way of doing things? Are there any improvements to take into account for more optimization?
In your narrative and draft database schema, I find the following identified entities: Employee, Competency, Interview and Interview_note.
In this regard, only the Employee table is missing, but I'm sure you have it somewhere. Moreover, your design is very flexible, since it allows for several Interview_notes for the same interview, competency and employee. What is perhaps missing, therefore, is the id of whoever made the notes. Alternatively, if there's only one set of notes per interview, you could consider identifying the interviewer in Interview.
Apart from that, and maybe some missing data for the note (points, percentage of satisfaction, or some textual annotations?), your design seems to fulfil its purpose.
The database engine will optimize the joins you'll have to do very well. You can make its job easier by defining each _id column as a primary key, if you haven't done so already.
I can't see other optimizations: each table clearly represents a different relation (in the relational algebra sense of the term), and merging any of them would inevitably result in a suboptimal, redundant schema.
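To make the suggestions above concrete, here is a minimal sketch of the four tables, using Python's built-in sqlite3 module rather than Informix purely so it can be run as-is. The employee table, the interviewer_id column and the note column (restricted to 0..4, as in the question) are assumptions for illustration, not part of the asker's draft.

```python
import sqlite3

# Hypothetical additions: employee, interviewer_id and note are assumptions
# based on the suggestions above, not columns from the asker's draft schema.
SCHEMA = """
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    full_name   TEXT NOT NULL
);
CREATE TABLE interview (
    interview_id   INTEGER PRIMARY KEY,
    interview_date TEXT NOT NULL,
    interviewer_id INTEGER REFERENCES employee(employee_id)
);
CREATE TABLE competency (
    competency_id    INTEGER PRIMARY KEY,
    competency_group TEXT NOT NULL,
    competency_name  TEXT NOT NULL
);
CREATE TABLE interview_note (
    interview_note_id INTEGER PRIMARY KEY,
    employee_id   INTEGER NOT NULL REFERENCES employee(employee_id),
    interview_id  INTEGER NOT NULL REFERENCES interview(interview_id),
    competency_id INTEGER NOT NULL REFERENCES competency(competency_id),
    note          INTEGER NOT NULL CHECK (note BETWEEN 0 AND 4)
);
"""


def build_db():
    """Create an in-memory database containing the four tables."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def average_note_by_group(conn, employee_id, interview_id):
    """Average note per competency group for one employee and one interview."""
    return conn.execute(
        """
        SELECT c.competency_group, AVG(n.note)
        FROM interview_note AS n
        JOIN competency AS c ON c.competency_id = n.competency_id
        WHERE n.employee_id = ? AND n.interview_id = ?
        GROUP BY c.competency_group
        ORDER BY c.competency_group
        """,
        (employee_id, interview_id),
    ).fetchall()
```

A query like average_note_by_group is the kind of aggregate the former Excel graphs would have been built from; the joins stay cheap because every join column is a primary key on one side.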
Closed. This question is opinion-based. It is not currently accepting answers.
Closed last year.
I'm redesigning our service app, and getting rid of some really awful schema problems while I'm at it. Trying to build the replacement with best practices as much as possible.
I'm using a company table rather than just a customer table, as it's often useful to identify companies that are not customers (suppliers, contractors, etc.). I'm trying to decide whether it's better to simply include a boolean field, represented in the relevant part of the app by a checkbox, that identifies companies as customers (and which would become uneditable once the customer has services attached to them), or if I should, instead, have a separate table that's basically just a single field referencing the Company ID that is in turn referenced by any child records.
This similar question asks about records that can be one of several subtypes. While the question is materially different (every policy seems to be only one of the potential subtypes, whereas Companies can be any or all of Customer/Supplier/Contractor etc) its similarity combined with the fact that it has multiple conflicting answers raises the possibility that there is no industry-wide consensus, so:
Is there an established best practice here? I'm not immediately seeing any reasons that other fields should be included in the prospective Customer table, but I'm open to the idea that there might be. Is that a good enough reason to go with the separate table? Or is this a clear YMMV situation, where both options have benefits and are equally valid?
"I should, instead, have a separate table that's basically just a single field referencing the Company ID that is in turn referenced by any child records."
There are probably several attributes that apply to a customer that don't apply to a non-customer Company, so CompanyID probably won't end up being the only attribute of Customer.
So if that's the case, the clear choice is to have a separate Customer table.
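As an illustration of the separate-table option, here is a minimal sketch using Python's sqlite3 module. The service table and the credit_limit column are hypothetical stand-ins for "child records" and "attributes that apply only to customers"; they are not taken from the question.

```python
import sqlite3


def build_company_db():
    """Company as the supertype, customer as an optional role table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the references below
    conn.executescript(
        """
        CREATE TABLE company (
            company_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL
        );
        -- One row per company that is a customer; the key doubles as the FK.
        CREATE TABLE customer (
            company_id   INTEGER PRIMARY KEY REFERENCES company(company_id),
            credit_limit REAL  -- hypothetical customer-only attribute
        );
        -- Hypothetical child table: services can only hang off customers.
        CREATE TABLE service (
            service_id  INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customer(company_id),
            description TEXT NOT NULL
        );
        """
    )
    return conn


def add_service(conn, service_id, customer_id, description):
    """Attach a service; fails unless customer_id is a row in customer."""
    conn.execute(
        "INSERT INTO service VALUES (?, ?, ?)",
        (service_id, customer_id, description),
    )
```

With this layout the rule "a customer with services attached cannot stop being a customer" is enforced by the foreign key itself, since deleting the customer row is rejected while services reference it. With a boolean flag the application would have to enforce that rule.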
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 2 years ago.
I have to design and build a star / snowflake schema database that will keep data about employees in a company, especially the rates that are paid to the employees. This is the first time I am experimenting with this schema type, and I'm not sure which parts of the fact tables should be separate dimension tables.
I don't exactly understand the practical upsides of having this schema. Is it actually that much easier to perform queries on this type of database? Or is it only about the performance?
Below I am attaching the draft schema of my database. I would like to know what I should modify to make this the best possible version of this database. I also have a question about two things:
1. Should the rate column be just a value in the fact table? Or should it be a foreign key to a dim_rate table?
2. What about date dimensions? Should they just be values in specific tables? Or should they always be foreign keys? If they should be foreign keys, should there be one dim_date table or a table for each type of date?
As an example for question 2, let's take the dim_employee table and the employment_date and end_of_employment columns. I have these dates as values in the dim_employee table, but I can think of two other ways to handle this data: either foreign keys to a dim_date table, or separate fact tables fact_start_of_employment and fact_end_of_deployment. I know I will need different kinds of reports, for example reports showing how many people started work and left the company in different date intervals (e.g. in December 2020). Honestly, at this point I have no idea which option would be best and easiest to work with in the future.
Also as I said - I would love any constructive criticism of this schema, even if it means completely redesigning it.
I would merge both fact tables because I think there is a strong relation between rate and position. But that's how I look at this data without knowing all the details.
I would also create a date dimension and a form_of_employment dimension.
That would result in 4 dimensions:
dim_employee
dim_date
dim_position
dim_form_of_employment
And a single fact table, fact_assignment, with these columns:
employee_id
date_id
position_id
form_of_employment_id
rate
student
This setup results in a proper star and very simple SQL for your reports.
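A minimal sketch of that star, assuming SQLite via Python's sqlite3 module and some hypothetical descriptive columns (name, title, form, year/month/day), shows what "very simple SQL" can look like for one report: the average rate per position in a given month.

```python
import sqlite3


def build_star():
    """Four dimensions and one fact table, as listed above."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Descriptive columns in the dimensions are hypothetical.
        CREATE TABLE dim_employee (employee_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE dim_date (
            date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER, day INTEGER
        );
        CREATE TABLE dim_position (position_id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE dim_form_of_employment (
            form_of_employment_id INTEGER PRIMARY KEY, form TEXT
        );
        CREATE TABLE fact_assignment (
            employee_id INTEGER REFERENCES dim_employee(employee_id),
            date_id     INTEGER REFERENCES dim_date(date_id),
            position_id INTEGER REFERENCES dim_position(position_id),
            form_of_employment_id INTEGER
                REFERENCES dim_form_of_employment(form_of_employment_id),
            rate    REAL,
            student INTEGER  -- 0 or 1
        );
        """
    )
    return conn


def avg_rate_by_position(conn, year, month):
    """Average rate per position title for one calendar month."""
    return conn.execute(
        """
        SELECT p.title, AVG(f.rate)
        FROM fact_assignment AS f
        JOIN dim_date AS d ON d.date_id = f.date_id
        JOIN dim_position AS p ON p.position_id = f.position_id
        WHERE d.year = ? AND d.month = ?
        GROUP BY p.title
        ORDER BY p.title
        """,
        (year, month),
    ).fetchall()
```

Every report follows the same shape: start from the fact table, join only the dimensions you filter or group by, and aggregate the measure.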
Every BI or reporting system involves a process of designing your tables and building them based on that design. This process is called dimensional modeling. Some call it data warehouse design, which is the same thing. Dimensional modeling is the process of thinking through and designing the data model, including the tables and their relationships. As you can see, no technology is involved in dimensional modeling: it all happens in your head and ends with sketching diagrams on paper. Dimensional modeling is not the diagram in which tables are connected to each other; it is the process of producing it.
A star schema is the best way of designing a data model for reporting. You will get the best performance and also flexibility from such a model.
In this case the Employee dimension will be a Historical Dimension or Slowly Changing Dimension.
You can use a bridge table.
In a classic dimensional schema, each dimension attached to a fact table has a single value consistent with the fact table’s grain. But there are a number of situations in which a dimension is legitimately multivalued.
As in your example, an employee can have many positions.
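One possible shape for such a bridge is sketched below with Python's sqlite3 module. The table name bridge_employee_position and the descriptive columns are hypothetical, and this is a simplified variant: it links employees to positions directly and leaves out extras such as weighting factors or validity dates that a production bridge might carry.

```python
import sqlite3


def build_bridge_db():
    """Two dimensions joined through a many-to-many bridge table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE dim_employee (
            employee_id INTEGER PRIMARY KEY, name TEXT NOT NULL
        );
        CREATE TABLE dim_position (
            position_id INTEGER PRIMARY KEY, title TEXT NOT NULL
        );
        -- Bridge: one row per (employee, position) pair.
        CREATE TABLE bridge_employee_position (
            employee_id INTEGER NOT NULL REFERENCES dim_employee(employee_id),
            position_id INTEGER NOT NULL REFERENCES dim_position(position_id),
            PRIMARY KEY (employee_id, position_id)
        );
        """
    )
    return conn


def positions_of(conn, employee_id):
    """All position titles held by one employee, sorted alphabetically."""
    rows = conn.execute(
        """
        SELECT p.title
        FROM bridge_employee_position AS b
        JOIN dim_position AS p ON p.position_id = b.position_id
        WHERE b.employee_id = ?
        ORDER BY p.title
        """,
        (employee_id,),
    ).fetchall()
    return [title for (title,) in rows]
```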
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 2 years ago.
Is it a good thing to create a database without relationships between the tables?
Is there any problem doing this? I have to design a database with historical events, sports events, environment data, etc. but can I put them in only one database?
In your case (as you said in a comment, it's for a history table), having no explicit relation between the parent table and the child table isn't a problem, as:
you won't need the unique constraints
you don't need to delete the orphans (if it's a history table, you want to keep all the data, don't you?)
And if the requests to this history table are made independently of the parent (e.g. by whatever ORM is used), make sure to have an index on the parent id column so that all the data linked to a parent can be retrieved easily.
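A small sketch of that advice, assuming SQLite through Python's sqlite3 module and hypothetical table and column names: the history table carries a parent_id with no foreign key, plus an index on that column, and EXPLAIN QUERY PLAN is used to check that lookups by parent actually use the index.

```python
import sqlite3


def build_history_db():
    """History table with an indexed parent_id but no declared foreign key."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Hypothetical history table; parent_id deliberately has no
        -- FOREIGN KEY, so rows survive even if the parent disappears.
        CREATE TABLE event_history (
            history_id INTEGER PRIMARY KEY,
            parent_id  INTEGER NOT NULL,
            changed_at TEXT NOT NULL,
            payload    TEXT
        );
        CREATE INDEX idx_event_history_parent ON event_history(parent_id);
        """
    )
    return conn


def query_plan(conn, parent_id):
    """Return SQLite's plan for fetching all history rows of one parent."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM event_history WHERE parent_id = ?",
        (parent_id,),
    ).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " ".join(str(row[-1]) for row in rows)
```

If the plan text mentions idx_event_history_parent, the lookup is an index search rather than a full scan of the history table.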
"Is it a good thing to create a database whose tables have no relationships?"
Sure, if you don't need any relations (for example, a Users table and a StarsInTheSky table).
"I have to design a database with some historical events, sports events, environment data and other stuff, but can I put them in only one database?"
You are probably talking about putting the data in only one table. In my opinion, you should think about normalization:
Begin by writing your single table and its first few rows on paper (use your imagination).
Ask yourself: "Am I repeating some data in the rows I have written?"
Example:
Name - Surname - BirthDate - Address
Paul - Allen - 01/11/1957 - 21 Baker Street NY
Paul - Allen - 01/11/1957 - 66 Mullholland Drive LosAngeles
As you can see here, you can split personal data and addresses into two distinct tables and relate them.
Ask yourself: "Am I using irresponsible columns (fields)?"
Example:
Name - Surname - BirthDate - Phone1 - Phone2
Paul - Allen - 01/11/1957 - 25412255 - null
What if another user has 3 or 4 phone numbers?
Relate the user data to a separate Phone table.
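The two paper exercises above lead to a layout like the following sketch (Python's sqlite3 module; the table and column names are hypothetical). The person is stored once, and addresses and phone numbers each get their own table pointing back to the person, so a third or fourth phone number is just another row.

```python
import sqlite3


def build_people_db():
    """Person data stored once, with addresses and phones in child tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE person (
            person_id  INTEGER PRIMARY KEY,
            name       TEXT NOT NULL,
            surname    TEXT NOT NULL,
            birth_date TEXT NOT NULL
        );
        CREATE TABLE address (
            address_id INTEGER PRIMARY KEY,
            person_id  INTEGER NOT NULL REFERENCES person(person_id),
            address    TEXT NOT NULL
        );
        CREATE TABLE phone (
            phone_id  INTEGER PRIMARY KEY,
            person_id INTEGER NOT NULL REFERENCES person(person_id),
            number    TEXT NOT NULL
        );
        -- Sample rows taken from the examples above.
        INSERT INTO person VALUES (1, 'Paul', 'Allen', '01/11/1957');
        INSERT INTO address VALUES (1, 1, '21 Baker Street NY');
        INSERT INTO address VALUES (2, 1, '66 Mullholland Drive LosAngeles');
        INSERT INTO phone VALUES (1, 1, '25412255');
        """
    )
    return conn


def add_phone(conn, person_id, number):
    """Any number of phones per person: each one is simply a new row."""
    conn.execute(
        "INSERT INTO phone (person_id, number) VALUES (?, ?)",
        (person_id, number),
    )


def phones_of(conn, person_id):
    """All phone numbers of one person, in insertion order."""
    rows = conn.execute(
        "SELECT number FROM phone WHERE person_id = ? ORDER BY phone_id",
        (person_id,),
    ).fetchall()
    return [number for (number,) in rows]
```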
EDIT: Use a single database or not? AFAIK programs evolve and get extended over time, and maybe one day you will need to add some relation, so it's better to use a single database per program, no matter how many tables you have and whether they are related or not. Keep the future work as simple as you can :)
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 years ago.
For background, my situation is that I have a database that is missing a lot of foreign key relationships. One table in particular, let's call it Orders, represents orders and has a composite primary key of OrderID and LocationID. The other table, which we'll call OrderDetails, has an OrderID but no LocationID. In reality, it is impossible to have an order in two locations at once, so it was assumed that there was no need to have LocationID in the details table. I didn't design it, and I can't change that.
We also have to work under the assumption there will be no support to add location id to the details table for various reasons. We are also working with Oracle and a high volume database with many concurrent users in many locations. Finally, there will be minimal time to change any applications that use this table.
So my question is: is the following solution feasible, or is there anything else I should try?
Say I create an intersection table, called AllOrders for lack of a better name, with primary key OrderID. Now we link Orders.OrderID to AllOrders.OrderID and link OrderDetails.OrderID to AllOrders.OrderID. Would it then be reasonable to fill in AllOrders via a trigger on each insert to Orders to enforce the integrity? I am assuming that all applications insert details after orders, or that the changes needed to enforce this would be minimal and allowed.
Are there any better solutions? I understand we would do this differently if in charge of designing or given more leeway for fixing, but I'm trying to make the most given the constraints.
Edit --
To clarify what I am looking to accomplish: I want to treat all orders with the same ID as an equivalence class modulo location, and ensure that deleting any order requires deleting all orders with the same id and all child order details. The most important requirement is that there are no orphan details. This has to be done with minimal application changes and, if possible, no redesign of existing tables.
Create a new table to handle the mapping going forward.
Table: Tb_order_orderdetails
Columns: OrderID, LocationID, OrderDetailsID
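A rough sketch of that mapping table follows, written against SQLite through Python's sqlite3 module, so the DDL is not Oracle syntax. It assumes that OrderDetails has a single-column key named OrderDetailsID, which the question does not state. The composite foreign key ties each mapping row to an existing (OrderID, LocationID) pair in Orders.

```python
import sqlite3


def build_orders_db():
    """Existing tables plus the new mapping table between them."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE Orders (
            OrderID    INTEGER NOT NULL,
            LocationID INTEGER NOT NULL,
            PRIMARY KEY (OrderID, LocationID)
        );
        CREATE TABLE OrderDetails (
            OrderDetailsID INTEGER PRIMARY KEY,  -- assumed key, not in the question
            OrderID        INTEGER NOT NULL
        );
        CREATE TABLE Tb_order_orderdetails (
            OrderID        INTEGER NOT NULL,
            LocationID     INTEGER NOT NULL,
            OrderDetailsID INTEGER NOT NULL
                REFERENCES OrderDetails(OrderDetailsID),
            PRIMARY KEY (OrderID, LocationID, OrderDetailsID),
            FOREIGN KEY (OrderID, LocationID)
                REFERENCES Orders(OrderID, LocationID)
        );
        """
    )
    return conn


def link_detail(conn, order_id, location_id, order_details_id):
    """Record which order (at which location) a detail row belongs to."""
    conn.execute(
        "INSERT INTO Tb_order_orderdetails VALUES (?, ?, ?)",
        (order_id, location_id, order_details_id),
    )
```

Note that the mapping table on its own only guarantees that mapped details point at real orders. It does not stop an application from inserting an OrderDetails row that is never mapped, so the "no orphan details" requirement would still need a trigger or an application-side rule.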
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
Say there are two modules, users and status. I split these modules in the following two cases:
Here is case-1
Here is case-2
I am trying to understand which database design is loosely coupled and should be adopted according to software engineering design principles. I am particularly interested in comments on which approach is better with regard to re-usability, i.e. which approach can be re-used easily and effectively in other software designs in the future.
Both your cases have consistency issues rather than coupling/cohesion issues.
First, both your cases allow a department to have an unlimited number of statuses. This might not make sense if, for instance, the status represents whether the department is open or closed. If a department may have only one status at any given time, the primary key for a status must be dept_id (in which case the status should instead live in the departments table as a foreign key to the table of available statuses). Whether this is correct depends on what you are modelling. The second case, however, is worse for consistency because it allows an unlimited set of values for the status variable: there is no table defining the valid values, so this case even allows typos, for instance a department with status "opne" instead of "open".
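The "one status per department, chosen from a fixed list" variant can be sketched as follows with Python's sqlite3 module. The table and column names are hypothetical, since the original diagrams are not reproduced here. The status is a NOT NULL foreign key inside the department table, so a misspelled status name cannot be stored.

```python
import sqlite3


def build_department_db():
    """Lookup table of valid statuses plus departments that reference it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE status (
            status_id INTEGER PRIMARY KEY,
            name      TEXT NOT NULL UNIQUE
        );
        CREATE TABLE department (
            dept_id   INTEGER PRIMARY KEY,
            name      TEXT NOT NULL,
            status_id INTEGER NOT NULL REFERENCES status(status_id)
        );
        INSERT INTO status (status_id, name) VALUES (1, 'open'), (2, 'closed');
        INSERT INTO department (dept_id, name, status_id) VALUES (1, 'Sales', 1);
        """
    )
    return conn


def set_status(conn, dept_id, status_name):
    """Set a department's status by name.

    An unknown name makes the subquery yield NULL, which the NOT NULL
    constraint rejects with sqlite3.IntegrityError.
    """
    conn.execute(
        """
        UPDATE department
        SET status_id = (SELECT status_id FROM status WHERE name = ?)
        WHERE dept_id = ?
        """,
        (status_name, dept_id),
    )


def status_of(conn, dept_id):
    """Return the current status name of a department."""
    return conn.execute(
        """
        SELECT s.name
        FROM department AS d
        JOIN status AS s ON s.status_id = d.status_id
        WHERE d.dept_id = ?
        """,
        (dept_id,),
    ).fetchone()[0]
```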
Secondly, the users table has no relationship with the rest of the data, which again may not make sense (users can't be members of any department, etc.). In the first case users have no status, and in the second case the users table is related to a status table. As far as the users table is concerned, neither case has more or less coupling than the other (because it has no relationship with anything else in your model), but you need to check whether you want users to have a status (and what that status is, and whether it should be selected from a fixed list of values or not).
We don't have much to go on for analyzing coupling/cohesion in either of your cases. You need to better understand what you are trying to model, and should first worry about ensuring consistency.
Here's a short but interesting blog post about coupling/cohesion if you want to read more: https://thebojan.ninja/2015/04/08/high-cohesion-loose-coupling/
Hope it helps!