Simple Database - Design issues

Just a homework question I am trying to figure out, I would appreciate some assistance.
Apparently, there are three problems with the design of this database:
Account = {AccNumber, Type, Balance}
Customer = {CustID, FirstName, LastName, Address, AccNumber}
The one that is pretty obvious is that 'CustID' is useless if 'AccNumber' exists.
I am not quite sure about the second and third problem.
Is there a problem with having separate attributes for 'FirstName' and 'LastName'? Can't we just use 'Name'?
Another option: if 'AccNumber' is the primary key (assuming CustID will be removed), it should probably be placed at the beginning:
Such as:
Customer = {AccNumber, Name, Address}
Any input would be appreciated!
Thanks

The customer-account relationship, at first glance, appears to be a many-many relationship, which necessitates the use of an intermediary relationship table. For instance, I have three accounts of my own at my bank. In addition, my wife has two of her own. Finally, we have a shared account. The schema above could not well handle such relationships.
You could, indeed, just use "Name", but you may need to know the first or last name on its own at some point in the future, and such a concatenation can be quite problematic to split.
Good luck with your homework...

The problem is that you haven't presented us with what the database should represent in words; as it is now, there's nothing "wrong" with the design, since we don't know what the design is supposed to model.
I certainly wouldn't say that CustID is useless, as it serves as the primary key of the table. What you need to determine is the relationship between customers and accounts. It should be one of the following:
1. A single customer can be tied to multiple accounts, but a single account can be tied to only one customer.
2. A single customer can be tied to only one account, but an account can be tied to multiple customers.
3. A single customer can be tied to multiple accounts, and a single account can be tied to multiple customers.
Right now, with AccNumber in the Customer table, your design models #2.

As it is designed right now, each customer can have only one bank account.
The many-to-many relationship will be a problem. Instead, you might create a third table that holds the relationships. For example:
Account = {AccNumber, Type, Balance}
Connection = {ConnID, AccNumber, CustID}
Customer = {CustID, FirstName, LastName, Address}
This way, Connection (for lack of a better name) references both Account and Customer. You could query all connections with a certain AccNumber to find all the customers using that account, and vice versa.
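As a rough illustration of the three-table layout above, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names come from the answer; the column types, the UNIQUE constraint, the sample rows, and the two helper functions are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE Account  (AccNumber INTEGER PRIMARY KEY, Type TEXT, Balance REAL);
CREATE TABLE Customer (CustID INTEGER PRIMARY KEY, FirstName TEXT,
                       LastName TEXT, Address TEXT);
CREATE TABLE Connection (
    ConnID    INTEGER PRIMARY KEY,
    AccNumber INTEGER NOT NULL REFERENCES Account(AccNumber),
    CustID    INTEGER NOT NULL REFERENCES Customer(CustID),
    UNIQUE (AccNumber, CustID)  -- assumption: link each pair at most once
);
-- Hypothetical sample data: account 100 is shared, account 101 is not.
INSERT INTO Account  VALUES (100, 'joint', 250.0), (101, 'savings', 900.0);
INSERT INTO Customer VALUES (1, 'Ann', 'Lee', '1 Main St'),
                            (2, 'Bob', 'Lee', '1 Main St');
INSERT INTO Connection (AccNumber, CustID) VALUES (100, 1), (100, 2), (101, 1);
""")

def customers_for_account(acc_number):
    """First names of every customer linked to the given account."""
    rows = conn.execute(
        "SELECT c.FirstName FROM Customer c "
        "JOIN Connection x ON x.CustID = c.CustID "
        "WHERE x.AccNumber = ? ORDER BY c.CustID", (acc_number,))
    return [r[0] for r in rows]

def accounts_for_customer(cust_id):
    """Account numbers linked to the given customer."""
    rows = conn.execute(
        "SELECT AccNumber FROM Connection WHERE CustID = ? ORDER BY AccNumber",
        (cust_id,))
    return [r[0] for r in rows]

print(customers_for_account(100))  # ['Ann', 'Bob']
print(accounts_for_customer(1))    # [100, 101]
```

With the UNIQUE pair in place, ConnID is optional; a composite primary key on (AccNumber, CustID) would work equally well.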

Related

Best way to create these tables

I have the following situation.
We want a reputation table to evaluate Users and Companies.
This reputation table would store the reputation given by a Company to a User and vice versa.
It was suggested that we should create two reputation tables, one for the Users and another for the Companies, both with the same columns.
I don't think that's the best way, but I can't find another solution.
Is there any other way we could do that?
thx
I don't think your approach is bad; another solution would be to have an abstract Entity table, with each User and Company having its own Entity record (and thus entity ID); then you only track reputation between two entities in a single Reputation table.
Another approach is to have a Reputation table with a user ID, a company ID, and a type (or direction, or whatever seems logical in your model) field which indicates whether it is reputation for the company given by the user, or the other way around. Seems less normalized though.
You would have a table for the company containing a unique key for that Co. Same for the user.
I assume the relationship between Co and User is many-to-many.
You need one more table containing both keys, for Co and User, and two fields, one for Co-rep and one for User-rep. The Co-key and User-key combination would be unique entries for this table.
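The last suggestion (one row per company/user pair, with one reputation column for each direction) might look roughly like this in Python's sqlite3. All table names, column names, and the set_rep helper are hypothetical; only the shape (two keys that are unique together, plus two reputation fields) comes from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Company (CompanyID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Users   (UserID    INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Reputation (
    CompanyID  INTEGER NOT NULL REFERENCES Company(CompanyID),
    UserID     INTEGER NOT NULL REFERENCES Users(UserID),
    CompanyRep INTEGER,  -- score the user gave the company
    UserRep    INTEGER,  -- score the company gave the user
    PRIMARY KEY (CompanyID, UserID)  -- one row per company/user pair
);
INSERT INTO Company VALUES (1, 'Acme');
INSERT INTO Users   VALUES (7, 'dana');
""")

def set_rep(company_id, user_id, column, score):
    """Create the pair's row if needed, then store one direction's score."""
    if column not in ("CompanyRep", "UserRep"):
        raise ValueError(column)  # guard: the column name is spliced into SQL
    with conn:  # both statements commit together
        conn.execute(
            "INSERT OR IGNORE INTO Reputation (CompanyID, UserID) VALUES (?, ?)",
            (company_id, user_id))
        conn.execute(
            f"UPDATE Reputation SET {column} = ? WHERE CompanyID = ? AND UserID = ?",
            (score, company_id, user_id))

set_rep(1, 7, "CompanyRep", 4)  # dana rates Acme
set_rep(1, 7, "UserRep", 5)     # Acme rates dana
print(conn.execute("SELECT * FROM Reputation").fetchall())  # [(1, 7, 4, 5)]
```

This keeps a single table, at the cost of a NULL in whichever direction has not been rated yet.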

Should User and Address be in separate tables?

Currently my users table has the below fields
Username
Password
Name
Surname
City
Address
Country
Region
TelNo
MobNo
Email
MembershipExpiry
NoOfMembers
DOB
Gender
Blocked
UserAttempts
BlockTime
Disabled
I'm not sure if I should put the address fields in another table. I have heard that I will be breaking 3NF if I don't although I can't understand why. Can someone please explain?
There are several points that are definitely not 3NF; and some questionable ones in addition:
Could there be multiple addresses per user?
Is an address optional or mandatory?
Does the information in City, Country, Region duplicate that in Address?
Could a user have multiple TelNos?
Is a TelNo optional or mandatory?
Could a user have multiple MobNos?
Is a MobNo optional or mandatory?
Could a user have multiple Emails?
Is an Email optional or mandatory?
Is NoOfMembers calculated from the count of users?
Can there be more than one UserAttempts?
Can there be more than one BlockTime per user?
If the answer to any of these questions is yes, then it indicates a problem with 3NF in that area. The reason for 3NF is to remove duplication of data; to ensure that updates, insertions and deletions leave the data in consistent form; and to minimise the storage of data - in particular there is no need to store data as "not yet known/unknown/null".
In addition to the questions asked here, there is also the question of what constitutes the primary key for your table - I would guess it is something to do with user, but name and the other information you give is unlikely to be unique, so will not suffice as a PK. (If you think name plus surname is unique are you suggesting that you will never have more than one John Smith?)
EDIT:
In the light of further information that some fields are optional, I would suggest that you separate out the optional fields into different tables, and establish 1-1 links between the new tables and the user table. This link would be established by creating a foreign key in the new table referring to the primary key of the user table. As you say none of the fields can have multiple values then they are unlikely to give you problems at present. If however any of these change, then not splitting them out will give you problems in upgrading the application and the data to support the application. You still need to address the primary key issue.
As long as every user has one address and every address belongs to one user, they should go in the same table (a 1-to-1 relationship). However, if users aren't required to enter addresses (an optional relationship), a separate table would be appropriate. Also, in the odd case that many users share the same address (e.g. they're convicts in the same prison), you have a 1-to-many relationship, in which case a separate table would be the way to go. EDIT: And yes, as someone pointed out in the comments, if users have multiple addresses (a 1-to-many the other way around), there should also be separate tables.
Just a point that I think might help someone with this question: I once had a situation where I put addresses right in the user/site/company/etc. tables because I thought, why would I ever need more than one address for them? Then, after we had completed everything, a different department brought to my attention that we needed to be able to record both a shipping address and a billing address.
The moral of the story is, this is a frequent requirement, so if you think you ever might want to record shipping and billing addresses, or can think of any other type of address you might want to record for a user, go ahead and put it in a separate table.
In today's age, I think phone numbers are a no brainer as well to be stored in a separate table. Everyone has mobile numbers, home numbers, work numbers, fax numbers, etc., and even if you only plan on asking for one, people will still put two in the field and separate them by a semi-colon (trust me). Just something else to consider in your database design.
The point is that if you can imagine having two addresses for the same user in the future, you should split now and have an address table with a FK pointing back to the users table.
P.S. Your table is missing an identity to be used as PK, something like Id or UserId or DataId, call it the way you want...
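A minimal sketch of that split, in Python's sqlite3: a Users table with a surrogate key and an Address table whose FK points back to it. The UserId, AddressId, and Kind columns, the UNIQUE constraints, and the sample rows are assumptions made for illustration; the address fields are taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Users (
    UserId   INTEGER PRIMARY KEY,     -- surrogate identity column
    Username TEXT NOT NULL UNIQUE
);
CREATE TABLE Address (
    AddressId INTEGER PRIMARY KEY,
    UserId    INTEGER NOT NULL REFERENCES Users(UserId),
    Kind      TEXT NOT NULL,          -- hypothetical: 'billing', 'shipping', ...
    Address   TEXT, City TEXT, Region TEXT, Country TEXT,
    UNIQUE (UserId, Kind)             -- assumption: one address of each kind
);
INSERT INTO Users VALUES (1, 'jsmith');
INSERT INTO Address (UserId, Kind, Address, City, Region, Country) VALUES
    (1, 'billing',  '2 High St', 'Leeds', 'Yorkshire', 'UK'),
    (1, 'shipping', '9 Mill Rd', 'York',  'Yorkshire', 'UK');
""")

def addresses_for(username):
    """Return (kind, city) pairs for every address the user has on file."""
    rows = conn.execute(
        "SELECT a.Kind, a.City FROM Address a "
        "JOIN Users u ON u.UserId = a.UserId "
        "WHERE u.Username = ? ORDER BY a.Kind", (username,))
    return rows.fetchall()

print(addresses_for("jsmith"))  # [('billing', 'Leeds'), ('shipping', 'York')]
```

A user with no address simply has no Address rows, so the optional case needs no NULL columns.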
By adding them to a separate table, you will have an easier time expanding your application if you decide to do so later. I generally have a simple user table with user_id or id, user_name, first_name, last_name, password, created_at & updated_at. I then have a profile table with the other info.
It's really all preference though.
You should never group two different types of data in a single table, period. The reason is that if your application is intended to be used in production, sooner or later different use cases will come up that require a more highly normalised table structure.
My recommendation - Adhere to SOLID principles even in DB design.

What is the convention for designating the primary relationship in a one-to-many relation between tables?

I understand how to design a database schema that has simple one-to-many relationships between its tables. I would like to know what the convention or best practice is for designating one particular relationship in that set as the primary one. For instance, one Person has many CreditCards. I know how to model that. How would I designate one of those cards as the primary for that person? Solutions I have come up with seem inelegant at best.
I'll try to clarify my actual situation. (Unfortunately, the actual domain would just confuse things.) I have 2 tables each with a lot of columns, let's say Person and Task. I also have Project which has only a couple of properties. One Person has many Projects, but has a primary Project. One Project has many Tasks, but sometimes has one primary Task with alternates, and other times has no primary task and instead a sequence of Tasks. There are no Tasks that are not part of a Project, but it isn't strictly forbidden.
PERSON (PERSON_ID, NAME, ...)
TASK (TASK_ID, NAME, DESC, EST, ...)
PROJECT (NAME, DESC)
I can't seem to figure a way to model the primary Project, primary Task, and the Task sequence all at the same time without introducing either overcomplexity or pure evil.
This is the best I've come up with so far:
PERSON (PERSON_ID, NAME, ...)
TASK (TASK_ID, NAME, DESC, EST, ...)
PROJECT (PROJECT_ID, PERSON_FK, TASK_FK, INDEX, NAME, DESC)
PERSON_PRIMARY_PROJECT (PERSON_FK, PROJECT_FK)
PROJECT_PRIMARY_TASK (PROJECT_FK, TASK_FK)
It just seems like too many tables for a simple concept.
Here's a question I've found that deals with a very similar situation: Database Design: Circular dependency.
Unfortunately, there didn't seem to be a consensus about how to handle the situation, and the "correct" answer was to disable the database consistency checking mechanism. Not cool.
Well, it seems to me that a Person has two relationships with a CreditCard. One is that the person owns it, and the other is that they consider it their primary CreditCard. That tells me you have a one-to-one and a one-to-many relationship. The return relationship for the one-to-one is already in the CreditCard because of the one-to-many relationship it's in.
This means I'd add primary_cc_id as a field in Person and leave CreditCard alone.
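Here is one way that idea could be sketched with Python's sqlite3. The primary_cc_id column is from the answer; every other name is hypothetical. The composite foreign key, which makes the database reject a primary card owned by someone else, is an addition of this sketch and not part of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Person (
    person_id     INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    primary_cc_id INTEGER,            -- NULL until a primary card is chosen
    -- Assumption: the primary card must be one of this person's own cards.
    FOREIGN KEY (person_id, primary_cc_id)
        REFERENCES CreditCard (person_id, cc_id)
);
CREATE TABLE CreditCard (
    cc_id     INTEGER PRIMARY KEY,
    person_id INTEGER NOT NULL REFERENCES Person(person_id),
    label     TEXT,
    UNIQUE (person_id, cc_id)         -- target of the composite FK above
);
INSERT INTO Person (person_id, name) VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO CreditCard VALUES (10, 1, 'visa'), (11, 1, 'amex'), (20, 2, 'visa');
UPDATE Person SET primary_cc_id = 11 WHERE person_id = 1;
""")

def primary_card(person_id):
    """Label of the person's primary card, or None if none is set."""
    row = conn.execute(
        "SELECT c.label FROM Person p "
        "JOIN CreditCard c ON c.cc_id = p.primary_cc_id "
        "WHERE p.person_id = ?", (person_id,)).fetchone()
    return row[0] if row else None

print(primary_card(1))  # amex
print(primary_card(2))  # None
```

Because primary_cc_id is nullable, a person can be inserted before any of their cards exist, which sidesteps the circular-dependency problem at insert time.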
Two strategies:
Use a bit column to indicate the preferred card.
Use a PreferredCardTable associating each Person with the ID of its preferred card.
One person can have many credit cards; Then you'd need an identifier on each credit card to actually link that specific credit card to one individual - which I assume you've already made in your model (some kind of ID that links the person to that credit card).
Primary credit card (I assume you mean e.g. as a default credit card?) That would have to be some sort of manual operation (e.g. that you have a third table, that links them together and a column that specifies which one would be the default).
Person (SSN, Name)
CreditCard (CCID, AccountNumber)
P_CC (SSN, CCID, NoID)
So that would mean that when you connect a person to a credit card, you'd have to specify the NoID, say as '1', and then design your query to find, by default, the credit card belonging to this individual with NoID '1'.
This is of course just one way of doing it, maybe you'd want to limit by 0, 1 - and then sort them by the date the credit card was added to that person.
Maybe if you'd elaborate and give more information about your columns and ideas it'd make it easier.
So here is what I tried out with Northwind and a C# Windows app, and I had just one query executed.
My Code:
DataClasses1DataContext context = new DataClasses1DataContext();
DataLoadOptions dlo = new DataLoadOptions();
dlo.LoadWith<Product>(b => b.Category);
context.LoadOptions = dlo;
context.DeferredLoadingEnabled = false;
context.Log = Console.Out;
List<Product> test = context.Products.ToList();
MessageBox.Show(test[0].Category.CategoryName);
Result:
SELECT [t0].[ProductID], [t0].[ProductName], [t0].[SupplierID], [t0].[CategoryID], [t0].[QuantityPerUnit], [t0].[UnitPrice], [t0].[UnitsInStock], [t0].[UnitsOnOrder], [t0].[ReorderLevel], [t0].[Discontinued], [t2].[test], [t2].[CategoryID] AS [CategoryID2], [t2].[CategoryName], [t2].[Description], [t2].[Picture]
FROM [dbo].[Products] AS [t0]
LEFT OUTER JOIN (
SELECT 1 AS [test], [t1].[CategoryID], [t1].[CategoryName], [t1].[Description], [t1].[Picture]
FROM [dbo].[Categories] AS [t1]
) AS [t2] ON [t2].[CategoryID] = [t0].[CategoryID]

Table "Inheritance" in SQL Server

I am currently looking at restructuring our contact management database, and I wanted to hear people's opinions on solving the problem of a number of contact types having shared attributes.
Basically we have 6 contact types which include Person, Company and Position # Company.
In the current structure all of these have an address however in the address table you must store their type in order to join to the contact.
This consistent requirement to join on contact type gets frustrating after a while.
Today I stumbled across a post discussing "Table Inheritance" (http://www.sqlteam.com/article/implementing-table-inheritance-in-sql-server).
Basically you have a parent table and a number of sub tables (in this case, one per contact type). From there you enforce integrity so that a sub table row must have a master equivalent where its type is defined.
The way I see it, by this method I would no longer need to store the type in tables like address, as the id is unique across all types.
I just wanted to know if anybody had any feelings on this method, whether it is a good way to go, or perhaps alternatives?
I'm using SQL Server 05 & 08 should that make any difference.
Thanks
Ed
I designed a database just like the link you provided suggests. The case was to store the data for many different technical reports. The number of report types is undefined and will probably grow to about 40 different types.
I created one master report table, that has an autoincrement primary key. That table contains all common information like customer, testsite, equipmentid, date etc.
Then I have one table for each report type that contains the specific information relating to that report type. That table has the same primary key as the master and references the master as well.
My idea for splitting this into different tables with a 1:1 relation (which normally would be a no-no) was to avoid getting one single table with a huge number of columns, which gets very difficult to maintain as you're constantly adding columns.
My design with table inheritance gave me segmented data and expandability without being difficult to maintain. The only thing I had to do was to write a special save method to handle writing to two tables automatically. So far I'm very happy with the design and haven't really found any drawbacks, except for a slightly more complicated save method.
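The master/sub-table layout and the two-table save method described above could be sketched like this with Python's sqlite3. This is not the answerer's actual code: the Report and VibrationReport tables, their columns, and save_vibration_report are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Report (                 -- master table: columns common to all types
    ReportID   INTEGER PRIMARY KEY AUTOINCREMENT,
    ReportType TEXT NOT NULL,
    Customer   TEXT,
    TestSite   TEXT,
    ReportDate TEXT
);
CREATE TABLE VibrationReport (        -- one sub table per report type
    ReportID INTEGER PRIMARY KEY REFERENCES Report(ReportID),
    PeakHz   REAL                     -- type-specific column (made up)
);
""")

def save_vibration_report(customer, test_site, report_date, peak_hz):
    """Write the master row and the sub-type row in a single transaction."""
    with conn:  # commits on success, rolls back both inserts on error
        cur = conn.execute(
            "INSERT INTO Report (ReportType, Customer, TestSite, ReportDate) "
            "VALUES ('vibration', ?, ?, ?)",
            (customer, test_site, report_date))
        report_id = cur.lastrowid  # key generated for the master row
        conn.execute(
            "INSERT INTO VibrationReport (ReportID, PeakHz) VALUES (?, ?)",
            (report_id, peak_hz))
    return report_id

rid = save_vibration_report("Acme", "Plant 3", "2009-01-15", 42.5)
row = conn.execute(
    "SELECT r.Customer, v.PeakHz FROM Report r "
    "JOIN VibrationReport v ON v.ReportID = r.ReportID "
    "WHERE r.ReportID = ?", (rid,)).fetchone()
print(row)  # ('Acme', 42.5)
```

Sharing the primary key between master and sub table is what makes the relation 1:1: a sub-type row cannot exist without its master row, and there can be at most one per master.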
Google on "gen-spec relational modeling". You'll find a lot of articles discussing exactly this pattern. Some of them focus on table design, while others focus on an object oriented approach.
Table inheritance pops up in a few of them.
I know this won't help much now, but initially it may have been better to have an Entity table rather than 6 different contact types. Then each Entity could have as many addresses as necessary and there would be no need for type in the join.
You'll still have the problem that if you want the sub-type fields and you have only the master contact, you'll have to know what table to go looking at - or else join to all of them. But otherwise this is a workable solution to a common problem.
Another possibility (fairly similar in structure, but different in how you think of it) is to simply put all your contacts into one table. Then for the more specific fields (birthday say for people and department for position#company) create separate tables that are associated with that contact.
Contact Table
--------------
Name
Phone Number
Address Table
-------------
Street / state, etc
ContactId
ContactBirthday Table
--------------
Birthday
ContactId
Departments Table
-----------------
Department
ContactId
It requires a different way of thinking of things though - instead of thinking of people vs. companies, you think of the various functional requirements for the task at hand - if you want to send out birthday cards, get all the contacts that have birthdays associated with them, etc..
I'm going to go out on a limb here and suggest you should rethink your normalization strategy (as you seem to be lucky enough to be able to rethink your schema quite fundamentally). If you typically store an address for each contact, then your contact table should have the address fields in it. Alternatively if the address is stored per company then the address should be stored in the company table and your contacts linked to that company.
If your contacts only have one address, or one (or even 3, just not 'many') instance of the other fields, think about rationalizing them into a single table. In my experience having a few null fields is a far better alternative than needing left joins to data you aren't sure exists.
Fortunately for anyone who vehemently disagrees with me you did ask for opinions! :) IMHO you should only normalize when you really need to. Where you are rethinking schemas, denormalization should be considered at every opportunity.
When you have a 7th type, you'll have to create another table.
I'm going to try this approach. Yes, you have to create new tables when you have a new type, but since this table will probably have different columns, you'll end up doing this anyway if you don't use this scheme.
If the tables that inherit the master don't differentiate much from one another, I'd recommend you try another approach.
May I suggest that we just add a Type table? I.e., a person has an address, name, etc.; then, as each use case (student, teacher) presents itself, a PersonType table links an entry in the person table to n types, with new tables (teacher, alien, singer) added as the system evolves...

Separating user table from people table in a relational database

I've done many web apps where the first thing you do is make a user table with usernames, passwords, names, e-mails and all of the other usual flotsam. My current project presents a situation where non-user records need to function similarly to users, but do not need the ability to be a first-order user.
Is it reasonable to create a second table, people_tb, that is the main relational table and data store, and only use user_tb for authentication? Does separating user_tb from people_tb present any problems? If this is commonly done, what are some strategies and solutions as well as drawbacks?
This is certainly a good idea, as you are normalizing the database. I have done a similar design in an app that I am writing, where I have an employee table and a user table. Users may be from an external company or may be employees, so I have separate tables because an employee is always a user, but a user may not be an employee.
The issue you'll run into is that whenever you use the user table, you'll nearly always want the person table as well, to get the name or other common attributes you want to show.
From a coding standpoint, if you're using straight SQL, it will take a little more effort to mentally parse the select statement. It may be a little more complicated if you're using an ORM library. I don't have enough experience with those.
In my application, I'm writing it in Ruby on Rails, so I'm constantly doing things like employee.user.name, where if I kept them together, it would be just employee.name or user.name.
From a performance standpoint, you are hitting two tables instead of one, but given proper indexes, it should be negligible. If you had an index that contained the primary key and the person name, for instance, the database would hit the user table, then the index for the person table (with a nearly direct hit), so the performance would be nearly the same as having one table.
You could also create a view in the database to keep both tables joined together to give you additional performance enhancements. I know in the later versions of Oracle you can even put an index on a view if needed to increase performance.
I routinely do that because for me the concept of "user" (username, password, create date, last login date) is different from "person" (name, address, phone, email). One of the drawbacks that you may find is that your queries will often require more joins to get the info you're looking for. If all you have is a login name, you'll need to join the "people" table to get the first and last name for example. If you base everything around the user id primary key, this is mitigated a bit, but still pops up.
If user_tb has auth info, I would very much keep it separate from people_tb. I would however keep a relationship between the two, and most of the users' info would be stored in people_tb, except for the info needed for auth (which I guess will not be used for much else). It's a nice tradeoff between design and efficiency, I think.
That is definitely what we do, as we have millions of people records and only thousands of users. We also separate addresses, phones and emails into relational tables, as many people have more than one of each of these things. It is critical not to rely on name as the identifier, as name is not unique. Make sure the tables are joined through some type of surrogate key (an integer or a GUID is preferable), not name.
I always try to avoid as much data repetition as possible. If not all people need to login, you can have a generic people table with the information that applies to both people and users (eg. firstname, lastname, etc).
Then for people that login, you can have a users table that has a 1~1 relationship with people. This table can store the username and password.
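A small sketch of that people/users split in Python's sqlite3. The table names people_tb and user_tb are borrowed from the question; the columns, sample rows, and helper functions are assumptions. Making person_id the primary key of user_tb is one way to enforce at most one login per person.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE people_tb (
    person_id  INTEGER PRIMARY KEY,   -- surrogate key, never the name
    first_name TEXT,
    last_name  TEXT
);
CREATE TABLE user_tb (                -- auth data only; one row per login
    person_id     INTEGER PRIMARY KEY REFERENCES people_tb(person_id),
    username      TEXT NOT NULL UNIQUE,
    password_hash TEXT NOT NULL
);
INSERT INTO people_tb VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
INSERT INTO user_tb   VALUES (1, 'ann', 'placeholder-hash');
""")

def display_name(username):
    """Look up a login's full name via the join to people_tb."""
    row = conn.execute(
        "SELECT p.first_name || ' ' || p.last_name "
        "FROM user_tb u JOIN people_tb p ON p.person_id = u.person_id "
        "WHERE u.username = ?", (username,)).fetchone()
    return row[0] if row else None

def people_without_login():
    """First names of people who exist as records but cannot log in."""
    rows = conn.execute(
        "SELECT p.first_name FROM people_tb p "
        "LEFT JOIN user_tb u ON u.person_id = p.person_id "
        "WHERE u.person_id IS NULL ORDER BY p.person_id")
    return [r[0] for r in rows]

print(display_name("ann"))     # Ann Lee
print(people_without_login())  # ['Bob']
```

The extra join in display_name is the cost the other answers mention; with the shared integer key it is a primary-key lookup on both sides.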
I'd say go for the normalized design (two tables) and only denormalize (go down to one user/person table) if it will really make your life easier down the line. If, however, practically all people are also users, it may be simpler to denormalize up front. It's up to you; I have used the normalized approach without problems.
Very reasonable.
As an example, take a look at the aspnet_* services tables here.
Their built-in schema has an aspnet_Users and an aspnet_Membership table, with the latter holding more extended information about a given user (hashed passwords, etc.), but aspnet_Users.UserID is used in the other portions of the schema for referential integrity etc.
Bottom line, it's very common, and good design, to have attributes in a separate table if they are different entities, as in your case.

Resources