Best Relational Database Diagrams - database

I am looking for well-designed, real-life relational database diagrams. Could you recommend any books or other sources? I prefer books.
Thank you.

Not sure what you mean by "real-life relational database diagrams", but any data structure that you need to persist in a relational (SQL) schema usually starts out as a class or class-like diagram. To create a well-designed relational database structure from these classes, you first put them through a series of transformations known as normal forms (1NF, 2NF, 3NF and so on); the process is called normalization. This helps keep your database organized and free of redundancy (except where redundancy is a business requirement defined by you).
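As a rough illustration of what normalization does, here is a minimal sketch using Python's standard-library sqlite3 module. The table and column names (orders_flat, customer, customer_order) and the sample rows are invented for this example; it shows one possible decomposition, not a complete normalization procedure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical flat starting point: customer details repeat on every order.
    CREATE TABLE orders_flat (
        order_id INTEGER PRIMARY KEY,
        customer_email TEXT NOT NULL,
        customer_name TEXT NOT NULL,
        item TEXT NOT NULL
    );
    INSERT INTO orders_flat VALUES
        (1, 'ann@example.com', 'Ann', 'Keyboard'),
        (2, 'ann@example.com', 'Ann', 'Mouse'),
        (3, 'bob@example.com', 'Bob', 'Monitor');

    -- Normalised version: each customer is stored once and referenced by key.
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE,
        name TEXT NOT NULL
    );
    CREATE TABLE customer_order (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        item TEXT NOT NULL
    );

    -- Move the repeated customer details into their own table.
    INSERT INTO customer (email, name)
    SELECT DISTINCT customer_email, customer_name
    FROM orders_flat
    ORDER BY customer_email;

    -- Orders now carry only a reference to the customer.
    INSERT INTO customer_order (order_id, customer_id, item)
    SELECT f.order_id, c.customer_id, f.item
    FROM orders_flat AS f
    JOIN customer AS c ON c.email = f.customer_email;
""")

customer_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
order_count = conn.execute("SELECT COUNT(*) FROM customer_order").fetchone()[0]
print(customer_count, order_count)
```

After the split, a change to a customer's name touches one row in customer instead of every order that customer ever placed.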
PDF books:
http://cir.dcs.uni-pannon.hu/cikkek/Database_Normalization.pdf
http://www.cs.cityu.edu.hk/~helena/cs34622000B/Normalization.pdf
I hope this helps!

Related

Why is it said that dimensional models (DM/DW) are denormalized when most of them are in 1NF?

Currently I am working with dimensional modeling, data warehouses and data marts.
"Dimensional modeling" is the data model of the data warehouse. There are two basic models: "star schema" and "snowflake schema".
Dimensional modelling is used for OLAP (Online Analytical Processing).
I have been reading about dimensional modeling and OLAP, and this kind of database is described as "denormalized."
But in the ones I work with, all the data structures are always at least in 1NF. I have never worked with a completely denormalized database structure.
So here is the question: does 1NF mean the same thing as "denormalized"? If not, then why do people say it?
Because it is denormalised in comparison to more commonly used relational models, which are very often 3NF+. The assumption is that your source systems are using 3NF+ databases, and when you drop down to 2NF or 1NF, you are denormalising.
This is a big assumption, and not always correct. Plenty of systems are built on relational databases which don't really follow a 3NF model. And more recently, some systems are not using a relational model at all! (Think about all the NoSQL data stores now in use.)
Further to this, one fairly common data warehouse architecture involves creating a 3NF+ data warehouse which is loaded from the source, and then denormalising the data to create dimensional data marts which are loaded from the more normalised model. In this case saying you are "denormalising" makes sense.
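To make the "1NF but denormalised" point concrete, here is a small sketch using Python's standard-library sqlite3 module. All table names, columns and rows are hypothetical. It loads a flat dimension from two 3NF-style tables: the result still has atomic values in every column (1NF), but category_name now depends on category_id rather than directly on the key, so it is no longer in 3NF.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalised (3NF-style) source tables: each category name is stored once.
    CREATE TABLE category (
        category_id INTEGER PRIMARY KEY,
        category_name TEXT NOT NULL
    );
    CREATE TABLE product (
        product_id INTEGER PRIMARY KEY,
        product_name TEXT NOT NULL,
        category_id INTEGER NOT NULL REFERENCES category(category_id)
    );
    INSERT INTO category VALUES (1, 'Bikes'), (2, 'Helmets');
    INSERT INTO product VALUES
        (10, 'Road bike', 1), (11, 'Mountain bike', 1), (12, 'Kids helmet', 2);

    -- Denormalised dimension: the join is done once at load time,
    -- so the category name is repeated on every product row.
    CREATE TABLE dim_product AS
    SELECT p.product_id, p.product_name, c.category_id, c.category_name
    FROM product AS p
    JOIN category AS c ON c.category_id = p.category_id;
""")

dim_rows = conn.execute(
    "SELECT product_id, product_name, category_id, category_name "
    "FROM dim_product ORDER BY product_id"
).fetchall()
for row in dim_rows:
    print(row)
```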

Star schema vs Snowflake Schema

From a Business Intelligence perspective this is a common question, but I am looking for a statistical answer.
Can we decide on one of these designs based on the relational database itself? I mean, is there any mathematical ratio, for example in data volume, that suits one schema or the other?
Star schema stores de-normalised data while snowflake stores normalised data.
Usually, a snowflake schema retains referential integrity in the relational database, meaning you will have many dimension tables linked by primary/foreign keys. A star schema, on the other hand, has a flat structure that merges all of the linked tables into one dimension.
A star schema is less complex and generally performs better than a snowflake schema because queries need fewer joins. From a BI perspective, a star schema should be the way to go; a snowflake schema should only be used when necessary.
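The structural difference can be sketched with Python's standard-library sqlite3 module. Everything here (fact_sales, the star_ and snow_ tables, the sample rows) is made up for illustration. The same "sales per category" question needs one join against the flat star dimension and two joins against the snowflaked one; the sketch says nothing about actual timings, which depend on the engine and the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fact_sales (product_id INTEGER NOT NULL, amount INTEGER NOT NULL);
    INSERT INTO fact_sales VALUES (10, 500), (11, 300), (12, 20);

    -- Star: a single flat product dimension with the category name inline.
    CREATE TABLE star_dim_product (
        product_id INTEGER PRIMARY KEY,
        product_name TEXT NOT NULL,
        category_name TEXT NOT NULL
    );
    INSERT INTO star_dim_product VALUES
        (10, 'Road bike', 'Bikes'),
        (11, 'Mountain bike', 'Bikes'),
        (12, 'Kids helmet', 'Helmets');

    -- Snowflake: the category is split into its own table, linked by a foreign key.
    CREATE TABLE snow_dim_category (
        category_id INTEGER PRIMARY KEY,
        category_name TEXT NOT NULL
    );
    CREATE TABLE snow_dim_product (
        product_id INTEGER PRIMARY KEY,
        product_name TEXT NOT NULL,
        category_id INTEGER NOT NULL REFERENCES snow_dim_category(category_id)
    );
    INSERT INTO snow_dim_category VALUES (1, 'Bikes'), (2, 'Helmets');
    INSERT INTO snow_dim_product VALUES
        (10, 'Road bike', 1), (11, 'Mountain bike', 1), (12, 'Kids helmet', 2);
""")

# Star: one join from the fact table to the dimension.
star_totals = conn.execute("""
    SELECT d.category_name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN star_dim_product AS d ON d.product_id = f.product_id
    GROUP BY d.category_name
    ORDER BY d.category_name
""").fetchall()

# Snowflake: an extra join is needed to reach the category name.
snow_totals = conn.execute("""
    SELECT c.category_name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN snow_dim_product AS p ON p.product_id = f.product_id
    JOIN snow_dim_category AS c ON c.category_id = p.category_id
    GROUP BY c.category_name
    ORDER BY c.category_name
""").fetchall()

print(star_totals)
print(snow_totals)
```

Both queries return the same totals; the trade-off is the repeated category name in the star dimension versus the extra join in the snowflake.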
Star schema vs snowflake schema: what to choose?
Well, this depends entirely on the project requirements and scenarios.
If we want to dive deeper into dimensional analysis, then a snowflake schema will be a good choice because, as suggested in the answer above, it maintains referential integrity and avoids data redundancy thanks to its normalised structure. For example: finding out which customers are attracted to a particular scheme started by the bank.
If the purpose is more about metric analysis, then a star schema is the best option. For example: finding out how much a customer spent on a particular scheme on a weekly/monthly/quarterly/yearly basis, how much profit the company made, etc.
As suggested above, a star schema is less complex because it needs fewer joins, and queries generally run faster than on a snowflake schema.
But again, these are used according to the needs of the project.
I hope this answer is helpful.
Any suggestions or guidance are highly appreciated. :)
In relational databases there are fundamentally two types of schema (and I realise there are other edge cases): 3NF and star schemas.
3NF schemas are normally found in transactional systems and star schemas in analytical systems.
In a star schema it is possible to create snowflakes off a dimension, but this is normally bad practice and should be avoided. If you have a very specific use case and you have the knowledge and experience to know that the only way to solve it is with a snowflake, then that's fine. However, building snowflakes because you don't know how to design a star schema is not going to end well!
So a star schema with a limited number of snowflakes may be OK, but a design that has a large number of snowflakes is not a snowflake schema; it's just a badly designed star schema.

How to design a database that can handle unknown reports?

I am working on a project which stores large amounts of data on multiple industries.
I have been tasked with designing the database schema.
I need to make the database schema flexible so it can handle complex reporting on the data.
For example,
what products are trending in industry x
what other companies have a similar product to my company
how is my company website different to x company website
There could be all sorts of reports. Right now everything is vague. But I know for sure the reports need to be fast.
Am I right in thinking my best path is to try to make as many association tables as I can? The idea being (for example) if the product table is linked to the industry table, it'll be relatively easy to get all products for a certain industry without having to go through joins on other tables to try to make a connection to the data.
This seems insane though. The schema will be so large and complex.
Please tell me if what I'm doing is correct or if there is some other known solution for this problem. Perhaps the solution is to hire a data scientist or DBA whose job is to do this sort of thing, rather than getting the programmer to do it.
Thank you.
I think getting these kinds of answers from a relational/operational database will be very difficult and the queries will be really slow.
The best approach I think will be to create multidimensional data structures (in other words a data warehouse) where you will have flattened data which will be easier to query than a relational database. It will also have historical data for trend analysis.
If there is a need for complex statistical or predictive analysis, then the data scientists can use the data warehouse as their source.
Adding to Amit's answer above, the problem is that what you need from your transactional database is a heavily normalized association of facts for operational purposes. On the analytic side, what you want are effectively tagged facts.
In other words what you want is a series of star schemas where you can add whatever associations you want.

Before starting the Database model

What do you do before starting the database model diagram? I mean, how do you gather the requirements, specifications, etc.? Use cases are one thing, but is there anything else? Any best practices or rules of thumb? Being a self-learner, I want to see how it goes in the hands of professionals.
Make sure you have a complete list of requirements from your client. Do your best to completely understand these requirements; it will really help your design if you do. If YOU are defining the requirements it may be easier, since you will already have an idea of what you need to do. Having a thorough grasp of your goal is the most important part.
If there is an obvious part of your database that will be the most important (an application in an online application system, for instance), I will usually start from there and work outward one piece at a time.
Personally I like to draw rough pictures (whatever makes sense to you; it doesn't have to be an official ERD) of what I think the database will look like and revise them to finer levels of detail.
Don't rely only on written requirements. There is no such thing as a complete list of requirements. Talk to the stakeholders, ask questions and use the results of those interviews to determine what attributes need to be modelled, how they are used and to identify the business keys. Then some data analysis and investigation is usually needed to determine the right data types and other aspects.
It may be possible to get a good first cut of a data model up front but don't worry if you can't. Data modelling generally ought to be an iterative, agile process, done in sensible sized steps as a project evolves (although there are certainly cases like Data Warehouse design where the agile approach may be harder to apply).
Depending on your clientele, it can be a good idea to have two data models and two diagrams. One model and diagram is for data analysis. The other is for database design.
I have had good results by using an ER (Entity-Relationship) model and diagram for data analysis and an RDM (Relational Data Model) model and diagram to reflect database design.
The ER diagram is useful for communicating the requirements discovered so far back to the clients, and making sure they are complete and correct. ER diagrams are easy to understand even if the client has no background in database theory. As others have responded, this is an iterative process, not a once only waterfall.
The RDM model and diagram is useful for reflecting logical database design decisions such as the decision to normalize data or do something else. It's easy to derive an RDM model from an ER model, although you have to throw in some design decisions that are intentionally omitted from the ER diagram.
In turn, it's easy to build a table create script from an RDM diagram. You will have to add some physical features, like indexes, in order to obtain good performance without tearing your hair out.

DataWarehouse - What is a good definition?

Could someone give me a good, practical definition of what a data warehouse is?
I'm surprised no one has posted Inmon's definition:
"A warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision making process"
From the same page you can pick up Kimball's definition:
"A copy of transaction data specifically structured for query and analysis"
I think that, unfortunately, data warehousing is a wide-ranging field. There is a lot of variety with very few standard paradigms; the main one I'm thinking of is Kimball's dimensional modelling. Inmon does not have as specific a methodology as Kimball, and thus some 3NF models may or may not conform to his principles.
Because Inmon has broadened his scope for what warehousing is meant to accomplish, it can encompass unstructured data. However, analysis of unstructured data is very different than traditional analysis.
As applied to SQL Server, typically the largest Data Warehouses on SQL Server are modelled dimensionally, because this lends itself well to the non-distributed, non-massively parallel model. Massively parallel systems like Teradata generally perform a lot better with 3NF models. These are still table-based systems with the various tables connected with foreign key constraints (perhaps not enforced, but at least logical).
Of course, we are also seeing NoSQL data processing systems like Map/Reduce which are not really databases at all in the sense of normalized, denormalized or non/poorly-normalized relational databases which we have had for 40 years now.
I just started with data warehousing and Business Intelligence, and looking around the web you can find some interesting links:
Get Start With Datawarehousing
I think these two links could help you understand the concepts of data warehousing.
Sorry, I'm new, so I can post only one link ^^
A database optimized for retrieval, in general holding denormalized data, usually in a star schema (but it could be a snowflake), and built using dimensional modeling (fact and dimension tables).
While this is not an academic definition, it might serve as a practical one. A data warehouse is a collection of datamarts and will combine datasets across the breadth of an organization.
A datamart will contain datasets specific to certain portions of the business. In the datamart you will find fact tables, measurable pieces of information, along with dimensions, attributes of your measurable pieces.
A true data warehouse will have conformed dimension tables that can be shared across datamarts.
An example...
Your company may build a datamart around sales. And another datamart around human resources. If the customer dimension table is shared across both these datamarts, it would be considered a conformed dimension. All three of these entities together would make up a data warehouse.
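A conformed dimension like the one in this example could look roughly as follows, sketched with Python's standard-library sqlite3 module. The table names and rows are assumptions made for illustration; in particular, fact_hr_assignment (staff hours booked against a customer) is just one guess at what an HR fact keyed by customer might contain. Because both fact tables use the same dim_customer keys, their measures can be lined up per customer in a single report.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conformed dimension: one customer table shared by both datamarts.
    CREATE TABLE dim_customer (
        customer_key INTEGER PRIMARY KEY,
        customer_name TEXT NOT NULL
    );
    INSERT INTO dim_customer VALUES (1, 'Acme'), (2, 'Globex');

    -- Sales datamart fact table.
    CREATE TABLE fact_sales (
        customer_key INTEGER NOT NULL REFERENCES dim_customer(customer_key),
        amount INTEGER NOT NULL
    );
    INSERT INTO fact_sales VALUES (1, 100), (1, 250), (2, 75);

    -- Hypothetical HR datamart fact table: staff hours assigned per customer.
    CREATE TABLE fact_hr_assignment (
        customer_key INTEGER NOT NULL REFERENCES dim_customer(customer_key),
        staff_hours INTEGER NOT NULL
    );
    INSERT INTO fact_hr_assignment VALUES (1, 40), (2, 10), (2, 5);
""")

# Aggregate each fact table separately, then combine through the shared dimension.
report = conn.execute("""
    SELECT d.customer_name, s.total_sales, h.total_hours
    FROM dim_customer AS d
    JOIN (SELECT customer_key, SUM(amount) AS total_sales
          FROM fact_sales GROUP BY customer_key) AS s
      ON s.customer_key = d.customer_key
    JOIN (SELECT customer_key, SUM(staff_hours) AS total_hours
          FROM fact_hr_assignment GROUP BY customer_key) AS h
      ON h.customer_key = d.customer_key
    ORDER BY d.customer_name
""").fetchall()

for row in report:
    print(row)
```

Each fact table is summed on its own before joining, so rows from one fact are not multiplied by rows from the other.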
As someone else stated you can find more detailed information by searching for Ralph Kimball's Data Strategies.
Definition: A data warehouse is a database used for analysis rather than for transaction processing.
Check the link below for more information on data warehouses:
http://www.idatastage.com/datawarehouse/

Resources