Database modelling remove bridge table - sql-server

I have designed a database with some outside help and I am thinking about a major change to the database model because of a problem creating reports in Power BI I have recently encountered here: SQL Power BI Report with Bridge Table
Disclaimer: if I could ask anyone within my firm, I would, but I can't.
We have a three-layer structure
A main table Firms, with information about the year and unique key/name for each firm
A bridge table Firm_Bridge with information about the type of the firm, linked to Firm_Types
Many end Tables with information about sales and stuff
For these end tables, there are two types
Single type tables that only matter for one Type (End_Table_Type_A or End_Table_Type_B) (often the case)
Multi type tables that matter for more than one type (End_Table_Type_all) (rarely the case)
This is a made-up example similar to the real model
I am wondering whether it would be better to simplify this structure and connect the single type end tables directly to the Firms table. I could still use the bridge table to document the type of firm and to connect End_Table_Type_All to the Firms table. I imagine it like this.
I hope this would avoid the problem that I have in my other question when summarizing stuff of end tables across years from Firms and reduce the complexity of our data model, as most tables are only single type end tables. I can exclude one join for most queries.
I am afraid I am missing something and that the current way is the proper way to model this.
What would happen in the changed model if I were to join 'Firm_Type' to 'Firms' and then join the single type end table 'End_Table_Type_A'? Would it fetch the rows of the single type end table once for each type?
If so, my idea would be stupid and I need to find another solution to my problem.

Just to give this closure: it does indeed seem to be a stupid idea, because as soon as I merge the type information into the Firms table, the rows connected to the end tables are duplicated.
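The duplication described above can be reproduced with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the column names (firm_id, type_code, sales) are assumptions for illustration, not the real model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Firms (firm_id INTEGER PRIMARY KEY, name TEXT, year INTEGER);
CREATE TABLE Firm_Bridge (firm_id INTEGER REFERENCES Firms(firm_id), type_code TEXT);
CREATE TABLE End_Table_Type_A (firm_id INTEGER REFERENCES Firms(firm_id), sales REAL);

INSERT INTO Firms VALUES (1, 'Acme', 2020);
-- Firm 1 has two types, so it appears twice in the bridge table.
INSERT INTO Firm_Bridge VALUES (1, 'A'), (1, 'B');
INSERT INTO End_Table_Type_A VALUES (1, 100.0);
""")

# Direct join: each sales row is counted once.
direct_total = conn.execute("""
    SELECT SUM(e.sales)
    FROM Firms f
    JOIN End_Table_Type_A e ON e.firm_id = f.firm_id
""").fetchone()[0]

# Joining the bridge as well repeats each sales row once per firm type.
fanned_total = conn.execute("""
    SELECT SUM(e.sales)
    FROM Firms f
    JOIN Firm_Bridge b ON b.firm_id = f.firm_id
    JOIN End_Table_Type_A e ON e.firm_id = f.firm_id
""").fetchone()[0]

print(direct_total, fanned_total)
```

With a firm that has two types, the second total is double the first, which is the row duplication in question.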

Related

Am I modelling my warehouse tables the right way?

I'm designing a website where users answer surveys. I need to design a data warehouse to aggregate their responses. So far in my model I have:
A dim table for Users.
A dim table for Questions.
A fact table for UserResponses. <= This is where I'm having the problem.
So the problem I have is that additional comments can be added to their responses. For example, somebody may come in and make 2 comments against a single response. How should I model this in the database?
I was thinking of creating another fact table for "Comments", and linking it to a record in UserResponses. Is this the right thing to do? This additional table would have something like the below columns:
CommentText
Foreign key relationship to fact.UserResponses.
Yes, your idea to create another table is correct. I would typically call it a "child" table rather than calling it another fact table.
The key thing that you didn't mention is that the comments table still needs an ID field. A table without an ID would be bad design (although it is indeed possible to create the table with no ID) since you would have no simple way to refer to individual comments.
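A minimal sketch of such a child table, using Python's sqlite3 module so it can be run as-is. Apart from CommentText and the link to UserResponses, the column names (CommentId, ResponseId, UserId, QuestionId, AnswerText) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
CREATE TABLE UserResponses (
    ResponseId INTEGER PRIMARY KEY,
    UserId     INTEGER NOT NULL,
    QuestionId INTEGER NOT NULL,
    AnswerText TEXT
);
-- Child table: its own ID plus a foreign key back to the response.
CREATE TABLE Comments (
    CommentId   INTEGER PRIMARY KEY,
    ResponseId  INTEGER NOT NULL REFERENCES UserResponses(ResponseId),
    CommentText TEXT NOT NULL
);
INSERT INTO UserResponses VALUES (1, 10, 20, 'Yes');
INSERT INTO Comments (ResponseId, CommentText)
VALUES (1, 'First comment'), (1, 'Second comment');
""")

# Two comments against a single response, each addressable by its own ID.
comment_ids = [row[0] for row in conn.execute(
    "SELECT CommentId FROM Comments WHERE ResponseId = 1 ORDER BY CommentId")]
print(comment_ids)
```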
In a dimensional model, fact tables are never linked to each other, as the grain of the data will be compromised.
The back-end database of a client application is not usually a data warehouse schema, but more of an online transactional processing (OLTP) schema. This is because transactional systems work better with third normal form. Analytical systems work better with dimensional models because the data can be aggregated (i.e., "sliced and diced") more easily.
I would recommend switching back to an OLTP database. It can still be aggregated when needed, but maintains third normal form for easier transactional processing.
Here is a good comparison between a dimensional model (OLAP) and a transactional system (OLTP):
https://www.guru99.com/oltp-vs-olap.html

Many tables to a single row in relational database

Consider we have a database that has a table, which is a record of a sale. You sell both products and services, so you also have a product and service table.
Each sale can either be a product or a service, which leaves the options for designing the database to be something like the following:
Add columns for each type, ie. add Service_id and Product_id to Invoice_Row, both columns of which are nullable. If they're both null, it's an ad-hoc charge not relating to anything, but if one of them is satisfied then it is a row relating to that type.
Add a weird string/id based system, for instance: Type_table, Type_id. This would be a string/varchar and integer respectively, the former would contain for example 'Service', and the latter the id within the Service table. This is obviously loose coupling and horrible, but is a way of solving it so long as you're only accessing the DB from code, as such.
Abstract out the concept of "something that is chargeable" into a new table, of which Product and Service are now specializations, and on the Invoice_Row table link to something like ChargeableEntity_id. However, the ChargeableEntity table here would essentially be redundant, as it too would need some way to link to an abstract "backend" table, which brings us all the way back around to the same problem.
Which way would you choose, or what are the other alternatives to solving this problem?
What you are essentially asking is how to achieve polymorphism in a relational database. There are many approaches (as you yourself demonstrate) to this problem. One solution is to use "table per class" inheritance. In this setup, there will be a parent table (akin to your "chargeable item") that contains a unique identifier and the fields that are common to both products and services. There will be two child tables, products and services. Each will contain the unique identifier for that entity and the fields specific to it.
One benefit to this approach over others is you don't end up with one table with many nullable columns that essentially becomes a dumping ground to describe anything ("schema-less").
One downside is as your inheritance hierarchy grows, the number of joins needed to grab all the data for an entity also grows.
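The "table per class" layout described in this answer might look roughly like the sketch below. It runs on Python's sqlite3 module rather than a specific server, and every table and column name apart from Invoice_Row (ChargeableItem, stock_count, is_hourly, and so on) is an assumption for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Parent table: the identifier plus the fields common to products and services.
CREATE TABLE ChargeableItem (
    item_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    price   REAL NOT NULL
);
-- Child tables share the parent's key and hold only type-specific fields.
CREATE TABLE Product (
    item_id     INTEGER PRIMARY KEY REFERENCES ChargeableItem(item_id),
    stock_count INTEGER
);
CREATE TABLE Service (
    item_id   INTEGER PRIMARY KEY REFERENCES ChargeableItem(item_id),
    is_hourly INTEGER
);
-- Invoice rows reference the parent, so one foreign key covers both types.
CREATE TABLE Invoice_Row (
    row_id   INTEGER PRIMARY KEY,
    item_id  INTEGER REFERENCES ChargeableItem(item_id),
    quantity INTEGER NOT NULL
);
INSERT INTO ChargeableItem VALUES (1, 'Widget', 5.0), (2, 'Installation', 40.0);
INSERT INTO Product VALUES (1, 12);
INSERT INTO Service VALUES (2, 1);
INSERT INTO Invoice_Row VALUES (1, 1, 3), (2, 2, 1);
""")

# Pricing an invoice needs only the parent table, whatever the item type.
rows = conn.execute("""
    SELECT r.row_id, c.name, r.quantity * c.price AS line_total
    FROM Invoice_Row r
    JOIN ChargeableItem c ON c.item_id = r.item_id
    ORDER BY r.row_id
""").fetchall()
print(rows)
```

The extra join to Product or Service is only needed when a query wants the type-specific columns, which is the join cost mentioned as the downside.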
I believe it depends on use case(s).
You could put the common columns in one table and put the product-specific and service-specific columns in their own tables. The catch here is that you need to join them.
Alternatively, you maintain two separate tables, one for Product and another for Service, and use application logic to determine which table to insert into. Getting all sales then essentially means taking the union of all products and all services.
I would go for approach 2 personally to avoid joins and inserting into two tables whenever a sale is made.

Database design query

I'm trying to work out a sensible approach for designing a database where I need to store a wide range of continuously changing information about pets. The categories of data can be broken down into, for example, behaviour, illness etc. Data will be submitted on a regular basis relating to these categories, so I need to find a good way to design the db to efficiently accommodate this. A simple approach would be just to store multiple records for each pet within each relevant table - e.g. the behaviour table would store the behaviour data and would simply have a timestamp for each record along with the identifier for that pet. When querying the db, it would be straightforward to query the one table with the pet id, using the timestamps to output the correct history of submissions. Is there a more sensible way around this or does that make sense?
I would use a combination of lookup tables with a strong use of foreign keys. I think what you are suggesting is very common. For example, "get me all the reported illnesses for a particular pet during this date range" would look something like:
-- @pet_id and @date are placeholders for the pet and the date of interest
SELECT *
FROM table_illness
WHERE table_illness.pet_id = @pet_id
  AND @date BETWEEN table_illness.start_date AND table_illness.finish_date;
You could do that for any of the tables. The lookup tables will be a link between, for example, table_illness.illness_type and illness_types.illness_type. The illness_types table is where you would store the details on the types of illnesses.
When designing a database you should build your tables to mimic real-life objects or concepts. So in that sense the design you suggest makes sense. Each pet should have its own record in a pet table which doesn't change. Changing information should then be placed into the appropriate table which has the pet's id. The time stamp method you suggest is probably what I would do -- unless of course this is for a vet or something. Then I'd create an appointment table with the date and connect the illness or behavior to the appointment as well.
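The timestamp-per-record approach combined with a lookup table can be sketched as follows, using Python's sqlite3 module so it runs without a server. The table names table_illness and illness_types come from the answer above; the pet table layout and the recorded_at and description columns are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One unchanging record per pet.
CREATE TABLE pet (pet_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Lookup table holding the details of each illness type.
CREATE TABLE illness_types (illness_type TEXT PRIMARY KEY, description TEXT);
-- Changing information: one timestamped row per submission.
CREATE TABLE table_illness (
    illness_id   INTEGER PRIMARY KEY,
    pet_id       INTEGER NOT NULL REFERENCES pet(pet_id),
    illness_type TEXT NOT NULL REFERENCES illness_types(illness_type),
    recorded_at  TEXT NOT NULL
);
INSERT INTO pet VALUES (1, 'Rex');
INSERT INTO illness_types VALUES
    ('flu', 'Canine influenza'), ('itch', 'Skin irritation');
INSERT INTO table_illness (pet_id, illness_type, recorded_at) VALUES
    (1, 'itch', '2024-03-01 09:00:00'),
    (1, 'flu',  '2024-01-15 14:30:00');
""")

# History for one pet, oldest first; ISO-formatted timestamps sort correctly as text.
history = conn.execute("""
    SELECT i.recorded_at, t.description
    FROM table_illness i
    JOIN illness_types t ON t.illness_type = i.illness_type
    WHERE i.pet_id = ?
    ORDER BY i.recorded_at
""", (1,)).fetchall()
print(history)
```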

How can I build a voting system to support multiple types of objects to vote on?

I'm really looking for something very similar to the way SO is setup where a few different kinds of things can be voted on (questions AND answers). What kind of DB schema, generally, could I use to support voting on many different kinds of objects?
Would I have a single Vote table that would have references to other objects in the database? Or do I have to have or should have a separate vote table for each of the objects I would like to vote on.
Kyle,
It's a bit hard to understand exactly what you need, but here's my 2 cents:
I assume that for each vote you want to store when it was cast and by whom (maybe an IP address or user name), so I would go for a single Voting table. I also assume that when you query an entity from the DB (question, answer, etc.) you will want to join this table, and that you are unlikely to query the voting table on its own.
In this case, you can use 2 columns on the voting table: one for the type of object that was voted on, and one to hold the id of that object. That way you can join the voting table to any other query by specifying the type of the object.
I think managing all your votes in a single table will make your domain simpler than spreading them out and creating a voting table for each entity. In this single voting table solution you only need to maintain the list of entity types (probably using a small dictionary table).
There is also a performance consideration. If you expect millions of votes, that single table will grow much more quickly than a set of separate tables, which may count against it. Also take care that it does not become a bottleneck if there are many concurrent reads and writes to it.
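A rough sketch of the single voting table with a type column and an id column, written against Python's sqlite3 module. All table and column names here are assumptions for illustration, and a CHECK constraint stands in for the small dictionary table of entity types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Question (question_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE Answer (answer_id INTEGER PRIMARY KEY, question_id INTEGER, body TEXT);
CREATE TABLE Vote (
    vote_id     INTEGER PRIMARY KEY,
    -- Type of the voted object; a dictionary table could replace this CHECK.
    object_type TEXT NOT NULL CHECK (object_type IN ('question', 'answer')),
    -- Id within the table named by object_type (no real foreign key is possible here).
    object_id   INTEGER NOT NULL,
    voter       TEXT NOT NULL,
    voted_at    TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    UNIQUE (object_type, object_id, voter)  -- one vote per voter per object
);
INSERT INTO Question VALUES (1, 'How do I model votes?');
INSERT INTO Answer VALUES (1, 1, 'Use one table.');
INSERT INTO Vote (object_type, object_id, voter) VALUES
    ('question', 1, 'alice'), ('question', 1, 'bob'), ('answer', 1, 'alice');
""")

# Join the vote table to questions by fixing the object type in the join condition.
question_votes = conn.execute("""
    SELECT q.title, COUNT(v.vote_id)
    FROM Question q
    LEFT JOIN Vote v
           ON v.object_type = 'question' AND v.object_id = q.question_id
    GROUP BY q.question_id, q.title
""").fetchall()
print(question_votes)
```

The trade-off is that the database cannot enforce that object_id points at an existing row, so that check falls to the application.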

Table "Inheritance" in SQL Server

I am currently in the process of restructuring our contact management database and I wanted to hear people's opinions on solving the problem of a number of contact types having shared attributes.
Basically we have 6 contact types which include Person, Company and Position # Company.
In the current structure all of these have an address however in the address table you must store their type in order to join to the contact.
This consistent requirement to join on contact type gets frustrating after a while.
Today I stumbled across a post discussing "Table Inheritance" (http://www.sqlteam.com/article/implementing-table-inheritance-in-sql-server).
Basically you have a parent table and a number of sub tables (in this case one for each contact type). From there you enforce integrity so that a sub table row must have a master equivalent where its type is defined.
The way I see it, by this method I would no longer need to store the type in tables like address, as the id is unique across all types.
I just wanted to know if anybody had any feelings on this method, whether it is a good way to go, or perhaps alternatives?
I'm using SQL Server 2005 and 2008, should that make any difference.
Thanks
Ed
I designed a database just like the link you provided suggests. The case was to store the data for many different technical reports. The number of report types is undefined and will probably grow to about 40 different types.
I created one master report table, that has an autoincrement primary key. That table contains all common information like customer, testsite, equipmentid, date etc.
Then I have one table for each report type that contains the specific information relating to that report type. That table has the same primary key as the master and references the master as well.
My idea for splitting this into different tables with a 1:1 relation (which normally would be a no-no) was to avoid getting one single table with a huge number of columns, which gets very difficult to maintain as you're constantly adding columns.
My design with table inheritance gave me segmented data and expandability without being difficult to maintain. The only thing I had to do was to write a special save method to handle writing to two tables automatically. So far I'm very happy with the design and haven't really found any drawbacks, except for a slightly more complicated save method.
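A sketch of what a master table, one per-type table, and a save method writing to both could look like. This is not the author's actual code: it uses Python's sqlite3 module, and the PressureReport type, its max_pressure column, and the function name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Master table: auto-increment key plus the information common to all reports.
CREATE TABLE Report (
    report_id   INTEGER PRIMARY KEY AUTOINCREMENT,
    customer    TEXT NOT NULL,
    report_date TEXT NOT NULL,
    report_type TEXT NOT NULL
);
-- One table per report type, sharing the master's primary key (1:1).
CREATE TABLE PressureReport (
    report_id    INTEGER PRIMARY KEY REFERENCES Report(report_id),
    max_pressure REAL NOT NULL
);
""")

def save_pressure_report(db, customer, report_date, max_pressure):
    """Write the master row and the type-specific row in one transaction."""
    with db:  # commits on success, rolls back if either insert fails
        cur = db.execute(
            "INSERT INTO Report (customer, report_date, report_type) "
            "VALUES (?, ?, 'pressure')",
            (customer, report_date))
        report_id = cur.lastrowid  # key generated for the master row
        db.execute(
            "INSERT INTO PressureReport (report_id, max_pressure) VALUES (?, ?)",
            (report_id, max_pressure))
    return report_id

new_id = save_pressure_report(conn, "Acme", "2024-05-01", 12.5)
row = conn.execute("""
    SELECT r.customer, p.max_pressure
    FROM Report r
    JOIN PressureReport p ON p.report_id = r.report_id
    WHERE r.report_id = ?
""", (new_id,)).fetchone()
print(new_id, row)
```

Wrapping both inserts in one transaction keeps the master and the type-specific row from getting out of step if the second insert fails.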
Google on "gen-spec relational modeling". You'll find a lot of articles discussing exactly this pattern. Some of them focus on table design, while others focus on an object oriented approach.
Table inheritance pops up in a few of them.
I know this won't help much now, but initially it may have been better to have an Entity table rather than 6 different contact types. Then each Entity could have as many addresses as necessary and there would be no need for type in the join.
You'll still have the problem that if you want the sub-type fields and you have only the master contact, you'll have to know what table to go looking at - or else join to all of them. But otherwise this is a workable solution to a common problem.
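The "join to all of them" case can be sketched like this, again with Python's sqlite3 module. Only two of the contact types are shown, and the table and column names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Contact (contact_id INTEGER PRIMARY KEY, contact_type TEXT NOT NULL);
CREATE TABLE Person (
    contact_id INTEGER PRIMARY KEY REFERENCES Contact(contact_id),
    full_name  TEXT
);
CREATE TABLE Company (
    contact_id   INTEGER PRIMARY KEY REFERENCES Contact(contact_id),
    company_name TEXT
);
-- Address needs no type column: contact_id is unique across all contact types.
CREATE TABLE Address (
    address_id INTEGER PRIMARY KEY,
    contact_id INTEGER NOT NULL REFERENCES Contact(contact_id),
    street     TEXT
);
INSERT INTO Contact VALUES (1, 'person'), (2, 'company');
INSERT INTO Person VALUES (1, 'Ed Example');
INSERT INTO Company VALUES (2, 'Example Ltd');
INSERT INTO Address (contact_id, street) VALUES (1, '1 Main St'), (2, '2 High St');
""")

# Starting from the master row, LEFT JOIN every sub table and take whichever matched.
rows = conn.execute("""
    SELECT c.contact_id,
           COALESCE(p.full_name, co.company_name) AS display_name,
           a.street
    FROM Contact c
    LEFT JOIN Person  p  ON p.contact_id  = c.contact_id
    LEFT JOIN Company co ON co.contact_id = c.contact_id
    JOIN Address a ON a.contact_id = c.contact_id
    ORDER BY c.contact_id
""").fetchall()
print(rows)
```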
Another possibility (fairly similar in structure, but different in how you think of it) is to simply put all your contacts into one table. Then for the more specific fields (birthday say for people and department for position#company) create separate tables that are associated with that contact.
Contact Table
--------------
Name
Phone Number
Address Table
-------------
Street / state, etc
ContactId
ContactBirthday Table
--------------
Birthday
ContactId
Departments Table
-----------------
Department
ContactId
It requires a different way of thinking of things though - instead of thinking of people vs. companies, you think of the various functional requirements for the task at hand - if you want to send out birthday cards, get all the contacts that have birthdays associated with them, etc..
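The birthday card example might be queried as in the sketch below, using Python's sqlite3 module. The ContactId key on the Contact table and the sample rows are assumptions added so the tables can be joined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- All contacts live in one table, whatever kind of contact they are.
CREATE TABLE Contact (
    ContactId   INTEGER PRIMARY KEY,
    Name        TEXT NOT NULL,
    PhoneNumber TEXT
);
-- Only contacts that actually have a birthday get a row here.
CREATE TABLE ContactBirthday (
    ContactId INTEGER PRIMARY KEY REFERENCES Contact(ContactId),
    Birthday  TEXT NOT NULL
);
INSERT INTO Contact VALUES (1, 'Alice', '555-0100'), (2, 'Example Ltd', '555-0101');
INSERT INTO ContactBirthday VALUES (1, '1990-04-12');
""")

# "Send out birthday cards": an inner join returns only contacts with a birthday.
birthday_contacts = conn.execute("""
    SELECT c.Name, b.Birthday
    FROM Contact c
    JOIN ContactBirthday b ON b.ContactId = c.ContactId
""").fetchall()
print(birthday_contacts)
```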
I'm going to go out on a limb here and suggest you should rethink your normalization strategy (as you seem to be lucky enough to be able to rethink your schema quite fundamentally). If you typically store an address for each contact, then your contact table should have the address fields in it. Alternatively if the address is stored per company then the address should be stored in the company table and your contacts linked to that company.
If your contacts only have one address, or one (or even 3, just not 'many') instance of the other fields, think about rationalizing them into a single table. In my experience having a few null fields is a far better alternative than needing left joins to data you aren't sure exists.
Fortunately for anyone who vehemently disagrees with me you did ask for opinions! :) IMHO you should only normalize when you really need to. Where you are rethinking schemas, denormalization should be considered at every opportunity.
When you have a 7th type, you'll have to create another table.
I'm going to try this approach. Yes, you have to create new tables when you have a new type, but since this table will probably have different columns, you'll end up doing this anyway if you don't use this scheme.
If the tables that inherit the master don't differentiate much from one another, I'd recommend you try another approach.
May I suggest that we just add a Type table. I.e. a person has an address, name etc.; then, as each use case (student, teacher) presents itself, we have a PersonType table that links an entry in the person table to n types, with subsequent new tables (teacher, alien, singer) added as the system evolves...

Resources