Different Tables or different Databases for Business Data and Identity

I read through Apress - Pro ASP.NET MVC 5 and the free chapters on the Identity Framework, and now I want to create a small sample application with some data and Identity.
Later I want to make a test deployment to Windows Azure.
Now, should I create one single database for this application, containing all data (products and so on, plus the Identity data such as user accounts and OAuth links), or would it be better to create two databases?
I know that if I created two, I would be able to use the same Identity data for other MVC applications, but is there some kind of best practice for MVC?

There's no "best practice", per se, in this area. It depends on the needs of your individual application. What I can tell you is that if you choose to use multiple databases, you'll end up with a somewhat fractured application. That sounds like a bad thing, but remember this is a valid choice in some scenarios. What I mean by that is simply that if you were to separate Identity from the rest of your application, requiring two databases and two contexts, there's no way, then, to relate your ApplicationUser with any other object in your application.
For example, let's say you're creating a reviews site. Review would be a class in your application context, and ApplicationUser would of course be a class in your Identity context. You could never do something like:
public class Review
{
    ...
    public virtual ApplicationUser ReviewedBy { get; set; }
}
That would typically result in a foreign key being created on the reviews table, pointing to a row in your users table. However, since these two tables are in separate databases, that's not possible. In fact, if you were to do something like this, Entity Framework would realize this problem, and actually attach ApplicationUser to your application context and attempt to generate a table for it in your application's database.
What you could do, though, is simply store the id of the user:
public string ReviewedById { get; set; }
But, again, this wouldn't be a foreign key. If you needed the user instance, you'd have to perform a two-step process:
var review = appContext.Reviews.Find(reviewId);
var user = identityContext.Users.Find(review.ReviewedById);
Generally speaking, it's better to keep all your application data together, including things like Identity. However, if you can't, or have a business case that precludes that, you can still do pretty much anything you need to do; it just becomes a bit more arduous and results in more queries.
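To make the cost of the split concrete, here is a minimal sketch of the two-step lookup. It is written in Python with SQLite purely for illustration (not Entity Framework), and the table layouts and the find_review_with_user helper are assumptions, not anything Identity generates. Two separate in-memory databases stand in for the identity and application databases.

```python
import sqlite3

# Two independent databases: no foreign key can span them.
identity_db = sqlite3.connect(":memory:")
app_db = sqlite3.connect(":memory:")

identity_db.execute("CREATE TABLE Users (Id TEXT PRIMARY KEY, UserName TEXT NOT NULL)")
app_db.execute(
    "CREATE TABLE Reviews (Id INTEGER PRIMARY KEY, Body TEXT NOT NULL, ReviewedById TEXT)"
)

identity_db.execute("INSERT INTO Users VALUES ('u1', 'alice')")
app_db.execute("INSERT INTO Reviews VALUES (1, 'Great product', 'u1')")
# Nothing stops a dangling id, because the Users table lives in the other database.
app_db.execute("INSERT INTO Reviews VALUES (2, 'Orphaned review', 'missing')")


def find_review_with_user(review_id):
    """Two-step lookup: one query per database."""
    review = app_db.execute(
        "SELECT Body, ReviewedById FROM Reviews WHERE Id = ?", (review_id,)
    ).fetchone()
    if review is None:
        return None
    body, user_id = review
    user = identity_db.execute(
        "SELECT UserName FROM Users WHERE Id = ?", (user_id,)
    ).fetchone()
    return body, (user[0] if user else None)
```

The second review shows the trade-off: the application database accepts a user id that does not exist, so the calling code has to cope with a missing user itself.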

Related

Do classes depend on database tables?

I'm a newbie to designing class diagrams.
As my application works as a REST API, I would like to use the DTO and DAO design patterns. For the user registration module, the DB contains 3 tables for user signon, profile and address.
Do I need to create 3 DTOs and corresponding DAOs to insert/update user signon, profile and address?
If so, what happens if, in future, I keep only one table instead of three and drop the other two?

Whatever design pattern you follow, data modelling is entirely up to you. Your design pattern should follow from your data model and your needs. The data model depends on your needs, not on the design pattern.

You can create whatever DTO objects you like. However, both your database design and your DTO design are driven by the concepts in your system (user, company, address, etc.); this is often called the domain.
You'll often find that the two are very similar, after all they both represent the same domain!
As to whether you need different dtos for different calls that really depends on you. Do you need a different class to represent an insert/update call? What's the difference? Often the update has an id (whereas the insert hasn't had one assigned yet). So why not have two where the update inherits from the insert but adds the id property?
As for delete DTOs, you can do these either as an update or just as an id. After all, why bother to populate an entire object you're about to delete? Personally I'd just say
DeleteUser(int id);
Much easier!

Need advice on structure of my database, to create useful Entities

I need people's advice as to whether this is the best way to achieve what I want. Apologies in advance if this is a little too subjective.
I want to use Entity Framework V.1 to create something similar to the following C# classes:
abstract class User
{
    public int UserId;
    public string TelephoneNumber;
}
class Teacher : User
{
    public string FavorateNewspaper;
}
class Pupil : User
{
    public string FavorateCartoon;
}
I need people's advice as to how best to persist this information.
I plan to use SQL Server and the normal Membership Provider. It will create for me a table called aspnet_Users. There will be two roles: Teacher and Pupil.
I will add fields to the table aspnet_Users which are common to both roles. Then create tbl_Teachers and tbl_Pupils to hold information specific to one role.
So My database will look a bit like this:
aspnet_Users
int UserId
varchar TelephoneNumber
tbl_Teachers
int UserId
varchar FavorateNewspaper
tbl_Pupils
int UserId
varchar FavorateCartoon
The idea of course being that I can match up the data in aspnet_Users to that in either tbl_Teachers or tbl_Pupils by joining on UserId.
So to summarise, my questions are:
Is my database structure the best option to achieve these classes?
Should I try to wrap the Entities within my own POCO classes?
Should I change my database structure so that EF creates entities which are closer to the classes I want?
EDIT: I re-arranged my question to make it a bit clearer what I'm asking.

If you're using EF 1, then POCO can be a bit unpleasant. Unless there's a good reason not to, I'd just use normal EF entities. Your database model is fine, by the way, and is an example of TPT (Table Per Type) inheritance mapping. You could either use the wizard to create entities from the database, or create your entities and map them to the associated tables. If you do the former, you'd initially end up with three unrelated entities. You'd then use the designer to tell EF that Pupil and Teacher inherit from User, and that User is abstract.
In general, one of the strengths of EF is that the entities don't have to match that closely to the tables that persist them. In this case though there's a natural mapping.
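For anyone unsure what Table Per Type looks like at the SQL level, here is a small sketch using the tables from the question. It uses Python with SQLite for illustration only; the sample rows and the load_teacher helper are made up, and EF would generate the equivalent join for you.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE aspnet_Users (UserId INTEGER PRIMARY KEY, TelephoneNumber TEXT);
    CREATE TABLE tbl_Teachers (
        UserId INTEGER PRIMARY KEY REFERENCES aspnet_Users(UserId),
        FavorateNewspaper TEXT
    );
    CREATE TABLE tbl_Pupils (
        UserId INTEGER PRIMARY KEY REFERENCES aspnet_Users(UserId),
        FavorateCartoon TEXT
    );
    INSERT INTO aspnet_Users VALUES (1, '555-0100'), (2, '555-0101');
    INSERT INTO tbl_Teachers VALUES (1, 'The Times');
    INSERT INTO tbl_Pupils VALUES (2, 'Tom and Jerry');
    """
)


def load_teacher(user_id):
    """A Teacher is the base row joined to its subtype row on UserId."""
    return db.execute(
        "SELECT u.UserId, u.TelephoneNumber, t.FavorateNewspaper "
        "FROM aspnet_Users u JOIN tbl_Teachers t ON t.UserId = u.UserId "
        "WHERE u.UserId = ?",
        (user_id,),
    ).fetchone()
```

The shared columns live once in the base table, and each subtype table contributes only its own columns, keyed by the same UserId.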

What would you do to avoid conflicting data in this database schema?

I'm working on a multi-user internet database-driven website with SQL Server 2008 / LinqToSQL / custom-made repositories as the DAL. I have run across a normalization problem which can lead to an inconsistent database state if exploited correctly and I am wondering how to deal with the problem.
The problem: Several different companies have access to my website. They should be able to track their Projects and Clients at my website. Some (but not all) of the projects should be assignable to clients.
This results in the following database schema:
**Companies:**
ID
CompanyName
**Clients:**
ID
CompanyID (not nullable)
FirstName
LastName
**Projects:**
ID
CompanyID (not nullable)
ClientID (nullable)
ProjectName
This leads to the following relationships:
Companies-Clients (1:n)
Companies-Projects (1:n)
Clients-Projects(1:n)
Now, if a user is malicious, he might for example insert a Project with his own CompanyID, but with a ClientID belonging to another user, leaving the database in an inconsistent state.
The problem occurs in a similar fashion all over my database schema, so I'd like to solve this in a generic way if at all possible. I had the following two ideas:
Check for database writes that might lead to inconsistencies in the DAL. This would be generic, but requires some additional database queries before an update and create queries are performed, so it will result in less performance.
Create an additional table for the clients-Projects relationship and make sure the relationships created this way are consistent. This also requires some additional select queries, but far less than in the first case. On the other hand it is not generic, so it is easier to miss something in the long run, especially when adding more tables / dependencies to the database.
What would you do? Is there any better solution I missed?
Edit: You might wonder why the Projects table has a CompanyID. This is because I want users to be able to add projects with and without clients. I need to keep track of which company (and therefore which website user) a clientless project belongs to, which is why a project needs a CompanyID.

I'd go with the latter, having one or more tables that define the allowable relationships between entities.

Note, there's no circularity in the references you have, so the title is misleading.
What you have is the possibility of conflicting data, that's different.
Why do you have "CompanyID" in the project table? The ID of the company involved is implicitly given by the client you link to. You don't need it.
Remove that column and you've removed your problem.
Additionally, what is the purpose of the "name" column in the client table? Can you have a client with one name, differing from the name of the company?
Or is "client" the person at that company?
Edit: OK, with the clarification about projects without clients, I would separate out the references, but you're not going to get rid of the problem you're describing without constraints that prevent multiple references being made.
A simple constraint for your existing tables would be that the CompanyID and ClientID fields of a project row cannot both be non-null at the same time.
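As a sketch of that constraint, here is a CHECK on the Projects table. It uses Python with SQLite for illustration (the question is about SQL Server, where a CHECK constraint is written much the same way), and it assumes CompanyID is made nullable, which differs from the schema in the question. The try_insert helper is hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    """
    CREATE TABLE Projects (
        ID INTEGER PRIMARY KEY,
        CompanyID INTEGER,   -- assumed nullable here, unlike the original schema
        ClientID INTEGER,
        ProjectName TEXT NOT NULL,
        CHECK (NOT (CompanyID IS NOT NULL AND ClientID IS NOT NULL))
    )
    """
)


def try_insert(company_id, client_id, name):
    """Return True if the row was accepted, False if the database rejected it."""
    try:
        db.execute(
            "INSERT INTO Projects (CompanyID, ClientID, ProjectName) VALUES (?, ?, ?)",
            (company_id, client_id, name),
        )
        return True
    except sqlite3.DatabaseError:
        return False
```

With this in place a project points either at a company or at a client, so the company can only ever be reached through one path.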

If you want to use the tables like this and avoid all the new queries, just put triggers on the table; when a user tries to insert a row with wrong data, the trigger will stop him.
Best Regards,
Iordan
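A minimal sketch of such a trigger, using Python with SQLite as a stand-in for SQL Server (T-SQL trigger syntax differs). The trigger name and the insert_project helper are hypothetical. A matching trigger for UPDATE would also be needed and is left out for brevity.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE Clients (ID INTEGER PRIMARY KEY, CompanyID INTEGER NOT NULL);
    CREATE TABLE Projects (
        ID INTEGER PRIMARY KEY,
        CompanyID INTEGER NOT NULL,
        ClientID INTEGER,
        ProjectName TEXT NOT NULL
    );
    -- Reject a project whose client belongs to a different company.
    CREATE TRIGGER projects_client_company
    BEFORE INSERT ON Projects
    WHEN NEW.ClientID IS NOT NULL
     AND (SELECT CompanyID FROM Clients WHERE ID = NEW.ClientID) IS NOT NEW.CompanyID
    BEGIN
        SELECT RAISE(ABORT, 'client belongs to a different company');
    END;
    INSERT INTO Clients VALUES (10, 1), (20, 2);
    """
)


def insert_project(company_id, client_id, name):
    """Return True if the row was stored, False if the trigger aborted it."""
    try:
        db.execute(
            "INSERT INTO Projects (CompanyID, ClientID, ProjectName) VALUES (?, ?, ?)",
            (company_id, client_id, name),
        )
        return True
    except sqlite3.DatabaseError:
        return False
```

Here client 20 belongs to company 2, so inserting a project for company 1 that references it is refused, while clientless projects pass untouched.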

My first thought would be to create a special client record for each company with name "No client". Then eliminate the CompanyId from the Project table, and if a project has no client, use the "No client" record rather than a "normal" client record. If processing of such no-client's is special, add a flag to the no-client record to explicitly identify it. (I'd hate to rely on the name being "No Client" or something like that -- too fuzzy.)
Then there would be no way to store inconsistent data so the problem would go away.

In the end I implemented a completely generic solution which solves my problem without much runtime overhead and without requiring any changes to the database. I'll describe it here in case someone else has the same problem.
First off, the approach only works because the only table that other tables are referencing through multiple paths is the Companies table. Since this is the case in my database, I only have to check whether all n:1 referenced entities of each entity that is to be created / updated / deleted are referencing the same company (or no company at all).
I am enforcing this by deriving all of my Linq entities from one of the following types:
SingleReferenceEntityBase - The norm. Only checks (via reflection) if there really is only one reference (no matter if transitive or intransitive) to the Companies table. If this is the case, the references to the companies table cannot become inconsistent.
MultiReferenceEntityBase - For special cases such as the Projects table above. Asks all directly referenced entities what company ID they are referencing. Raises an exception if there is an inconsistency. This costs me a few select queries per CRUD operation, but since MultiReferenceEntities are much rarer than SingleReferenceEntities, this is negligible.
Both of these types implement a "CheckReferences" and I am calling it whenever the linq entity is written to the database by partially implementing the OnValidate(System.Data.Linq.ChangeAction action) method which is automatically generated for all Linq entities.
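The multi-reference check can be sketched roughly like this. This is a simplified Python illustration, not the LINQ to SQL code described above: the class names, the referenced_company_ids method and the is_consistent helper are all hypothetical, and the real version is called from OnValidate and discovers references via reflection.

```python
class InconsistentCompanyError(Exception):
    """Raised when an entity reaches more than one company."""


class Client:
    def __init__(self, company_id):
        self.company_id = company_id


class Project:
    """Multi-reference entity: names a company directly and, optionally, via its client."""

    def __init__(self, company_id, client=None):
        self.company_id = company_id
        self.client = client

    def referenced_company_ids(self):
        # Ask every directly referenced entity which company it points to.
        ids = {self.company_id}
        if self.client is not None:
            ids.add(self.client.company_id)
        ids.discard(None)  # "no company at all" is allowed
        return ids

    def check_references(self):
        ids = self.referenced_company_ids()
        if len(ids) > 1:
            raise InconsistentCompanyError(f"conflicting company ids: {sorted(ids)}")


def is_consistent(entity):
    """Convenience wrapper that turns the exception into a boolean."""
    try:
        entity.check_references()
        return True
    except InconsistentCompanyError:
        return False
```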

Modeling Classes Based on Table Designs

Is this how one would normally design classes?
One class = 1 Table.
How about tables that contain a foreign key to another table?
Suppose I have the following:
PersonTable
---------------
person_id
name
PersonMapTable
---------------
map_id
type_id (fk)
person_id
PersonTypeTable
-------------------
type_id
description
parent_type_id
AddressTable
-------------------
address_id
address1
address2
city
state
zip
AddressMapTable
-----------
address_map_id
address_id
person_id
Would good practice consist of creating a class for each table?
If so, what are the best practices for loading/saving such classes back to the database without an ORM? A simple code example would be really helpful.

I'd recommend reading Martin Fowler's Patterns of Enterprise Application Architecture, which has several patterns of mapping between classes and tables.

I don't think that one object per table is necessarily a good design. It's hard to give a one size fits all rule, but objects can be richer and more fine grained. A database can be denormalized for reasons that don't apply to objects. In that case, you'd have more objects than tables.
Your case would include 1:1 and 1:m relationships:
public class Person
{
    // 1:m
    private List<your.namespace.Map> maps;
}
public class Map
{
    // 1:1
    private your.namespace.Type type;
}

For the most part I tend to map tables to entities, but it's not a hard rule. Sometimes there are instances where the repository for a specific entity in question is better off dealing with the general concerns surrounding a specific entity, which means it will cross into dealing with other tables as a result, without those tables specifically needing to exist as entities.
What I never do (except in very specific planned cases where the dependent data ALWAYS needs to be retrieved with the entity), is set an entity or collection of entities as a property of another entity. Instead, that entity will either be discoverable via its ID, which will either be a property of the parent entity or discoverable via the associated repository in relation to the parent entity.
In cases where I need the child entity or entities of another entity to be bundled together, I'll make use of an "info" helper class to pull together all the required data. For example, if I have an entity class Widget and it has a collection of child Part objects, then I would create a WidgetInfo class which would contain the Widget instance as a property and a collection of Part objects as the other property.
This way all entity classes remain as lightweight as they can and never make the assumption that dependent data will need to be loaded. Also it keeps the repository model clean without forcing you into messy ORM territory which is generally the case if you create child object collections on an entity class. If you do that without ORM, then you end up with the messy problem of when to load the children or not, and when to assume that the children have or have not been loaded.
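Here is roughly what I mean by the "info" helper, sketched in Python for illustration. The fields on Widget and Part, the repository and its find_info method are hypothetical, and the repository is backed by in-memory lists rather than a database.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Widget:
    id: int
    name: str  # note: no parts collection on the entity itself


@dataclass
class Part:
    id: int
    widget_id: int  # the parent is discoverable via its id
    name: str


@dataclass
class WidgetInfo:
    """Bundles a widget with its parts only when a caller asks for both."""
    widget: Widget
    parts: List[Part] = field(default_factory=list)


class WidgetRepository:
    """In-memory stand-in for a repository that would query the database."""

    def __init__(self, widgets, parts):
        self._widgets = {w.id: w for w in widgets}
        self._parts = list(parts)

    def find(self, widget_id):
        # Lightweight load: the widget alone.
        return self._widgets.get(widget_id)

    def find_info(self, widget_id):
        # Explicit load of the widget together with its children.
        widget = self.find(widget_id)
        if widget is None:
            return None
        return WidgetInfo(widget, [p for p in self._parts if p.widget_id == widget_id])
```

Callers choose find or find_info, so there is never a question of whether an entity's children have been loaded.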

I wouldn't say that I always have a class per table, especially when you have many-to-many relationships. Based on your tables above I would have two classes. I am not sure why you have both an id and a person_type_id; to me they would be the same thing, but here are the classes.
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
    public List<PersonType> PersonTypes { get; set; }
}
public class PersonType
{
    // I would discourage using Type as a property name, as it is easily confused with System.Type...
    public string Type { get; set; }
}

Unless your users are data entry clerks, it's generally considered better to design classes from Use Cases/User Stories. Even if the database already exists.
Reason? It's too easy for users to end up assuming their job is to exercise the software, rather than expecting the software to help them do their jobs.
Clearly they need to intersect at some point. I concur that Fowler's book is a great place to start. But I think he'll reinforce this point of view.
If you want a modeling perspective that helps you get both the classes and the database right, consider Object Role Modeling.

If you plan on using Object-Relational Mapping (ORM), this may affect your table design. Hibernate, for instance, does not like mixed inheritance mapping strategies within the same tree.
Since you specifically indicated that you will not be using ORM, you can follow traditional database design principles. This typically means starting with one table per class, normalizing to third normal form, then denormalizing to meet performance constraints.
Regarding your question about how to load and save the objects without the use of ORM, a common strategy is to use Data Access Objects (DAOs). Here is a simple example:
public interface ICustomerDao
{
    public void insert(Customer customer) throws CustomerDaoException;
    public void update(long id, Customer customer) throws CustomerDaoException;
    public void delete(long id) throws CustomerDaoException;
    public Customer[] findAll() throws CustomerDaoException;
    public Customer findByPrimaryKey(long id) throws CustomerDaoException;
    public Customer[] findByCompany(int companyId) throws CustomerDaoException;
}
You didn't specify which language you are using, but regardless you may find this example using Java generics for DAO useful.
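To show what sits behind an interface like that, here is a minimal DAO sketch with hand-written SQL and no ORM. It is in Python with SQLite for illustration; the customers table, its columns and the Customer fields are assumptions made for the example.

```python
import sqlite3
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Customer:
    id: Optional[int]
    name: str
    company_id: int


class CustomerDao:
    """All SQL for the customers table lives in this one class."""

    def __init__(self, connection):
        self._db = connection
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS customers ("
            "id INTEGER PRIMARY KEY, name TEXT NOT NULL, company_id INTEGER NOT NULL)"
        )

    def insert(self, customer: Customer) -> int:
        cursor = self._db.execute(
            "INSERT INTO customers (name, company_id) VALUES (?, ?)",
            (customer.name, customer.company_id),
        )
        customer.id = cursor.lastrowid  # the database assigns the key
        return customer.id

    def update(self, customer: Customer) -> None:
        self._db.execute(
            "UPDATE customers SET name = ?, company_id = ? WHERE id = ?",
            (customer.name, customer.company_id, customer.id),
        )

    def delete(self, customer_id: int) -> None:
        self._db.execute("DELETE FROM customers WHERE id = ?", (customer_id,))

    def find_by_primary_key(self, customer_id: int) -> Optional[Customer]:
        row = self._db.execute(
            "SELECT id, name, company_id FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()
        return Customer(*row) if row else None

    def find_by_company(self, company_id: int) -> List[Customer]:
        rows = self._db.execute(
            "SELECT id, name, company_id FROM customers WHERE company_id = ? ORDER BY id",
            (company_id,),
        )
        return [Customer(*row) for row in rows]
```

The rest of the application only sees Customer objects and DAO methods, so the table layout can change without touching the callers.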

"Would good practice consist of creating a class for each table? If so, what are the best practices for loading/saving such classes back to the database without an orm?"
You are using ORM. You are mapping objects to relational tables. Whether you use a pre-built library to do so or not is your call. If you don't, you'll be essentially implementing one yourself, though probably without all the bells and whistles of existing ORMs.
The two most common ways of doing this are the ActiveRecord pattern and the Data Mapper pattern. Each has its advantages and disadvantages.
With the ActiveRecord pattern, you define classes whose attributes define the table columns for you. Each instance of this class corresponds to a row in the database, and by creating (and saving) a new instance, you create a new row in the database. More information on that is available here: http://en.wikipedia.org/wiki/Active_record_pattern
In the Data Mapper pattern, you define table objects for each table, and write mapper functions which assign columns of the table to existing classes. SQLAlchemy uses this pattern by default (though there are ActiveRecord type extension modules, which adapt SQLAlchemy's functionality to a different interface). A brief introduction to this pattern can be found in SQLAlchemy's documentation here: http://www.sqlalchemy.org/docs/05/ormtutorial.html (read from the beginning up to but not including the section entitled "Creating Table, Class and Mapper All at Once Declaratively"; that section explains ActiveRecord using SQLAlchemy).
The ActiveRecord pattern is easier to set up and get working with, and gives you classes which are clearly representative of your database, which has benefits in terms of readability. As a side benefit, the declarative nature of ActiveRecord classes effectively acts as clear and straightforward documentation for your database schema.
The Data Mapper pattern gives you far more flexibility in how your data maps to your classes, so you aren't tied to a more-or-less one-to-one relationship between tables and classes. It also separates your persistence layer from your business code, which means that you can swap out other persistence mechanisms later, if need be. It also means you can more easily test your classes without needing to have a database set up to back them.
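The difference is easier to see side by side. Below is a hand-rolled sketch of both patterns against the same table, using Python's built-in sqlite3 module rather than SQLAlchemy. The people table and every class name here are hypothetical, and a real library does far more than this.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")


class PersonRecord:
    """Active Record style: attributes mirror columns and the object saves itself."""

    def __init__(self, name, id=None):
        self.id = id
        self.name = name

    def save(self):
        if self.id is None:
            cursor = db.execute("INSERT INTO people (name) VALUES (?)", (self.name,))
            self.id = cursor.lastrowid
        else:
            db.execute("UPDATE people SET name = ? WHERE id = ?", (self.name, self.id))

    @classmethod
    def find(cls, id):
        row = db.execute("SELECT id, name FROM people WHERE id = ?", (id,)).fetchone()
        return cls(row[1], row[0]) if row else None


class Person:
    """Data Mapper style: a plain class with no persistence code at all."""

    def __init__(self, full_name):
        self.id = None
        self.full_name = full_name  # need not match the column name


class PersonMapper:
    """All knowledge of the table lives in the mapper, not in Person."""

    def __init__(self, connection):
        self._db = connection

    def insert(self, person):
        cursor = self._db.execute(
            "INSERT INTO people (name) VALUES (?)", (person.full_name,)
        )
        person.id = cursor.lastrowid

    def find(self, id):
        row = self._db.execute(
            "SELECT id, name FROM people WHERE id = ?", (id,)
        ).fetchone()
        if row is None:
            return None
        person = Person(row[1])
        person.id = row[0]
        return person
```

Person can be unit tested without any database, while PersonRecord cannot be saved without one; that is the flexibility versus convenience trade-off described above.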
For more in depth discussion of SQLAlchemy's mapper configuration, check out http://www.sqlalchemy.org/docs/05/mappers.html. Even if you aren't planning on using a library like SQLAlchemy, the documentation should help you see some of the options you may want to consider in mapping your classes to database tables.

Designing your classes from the database outwards is more than likely to have you cranking out lots of code quite quickly. However, a good chunk of that code will likely be of no use or will need severe rework. In other words, you'll likely build specific classes that don't match your displays and workflow.
First, consider your apps, your users' needs, the general workflow, etc. Actually come up with something that looks workable (i.e. mock up your displays).
Concentrate on the classes you need to drive the displays, and model your storage (db design) after those needs. Chances are that you will have only a few straight-table classes, as most of your classes will naturally tend to provide the solution for your displays.
Good luck.

Table Module vs. Domain Model

I asked about Choosing a method to store user profiles the other day and received an interesting response from David Thomas Garcia suggesting I use the Table Module design pattern. It looks like this is probably the direction I want to take. Everything I've turned up with Google seems to be fairly high level discussion, so if anyone could point me in the direction of some examples or give me a better idea of the nuts and bolts involved that would be awesome.

The best reference is "Patterns of Enterprise Application Architecture" by Martin Fowler.
Here's an excerpt from the section on Table Module:
"A Table Module organizes domain logic with one class per table in the database, and a single instance of a class contains the various procedures that will act on the data. The primary distinction with Domain Model is that, if you have many orders, a Domain Model will have one order object per order while a Table Module will have one object to handle all orders."
Table Module would be particularly useful in the flexible database architecture you have described for your user profile data, basically the Entity-Attribute-Value design.
Typically, if you use Domain Model, each row in the underlying table becomes one object instance. Since you are storing user profile information in multiple rows, then you end up having to create many Domain Model objects, whereas what you really want is one object that encapsulates all the user properties.
Instead, the Table Module makes it easier for you to code logic that applies to multiple rows in the underlying database table. If you create a profile for a given user, you'd specify all those properties, and the Table Module class would have the code to translate that into a series of INSERT statements, one row per property.
$table->setUserProfile( $userid, array('firstname'=>'Kevin', 'lastname'=>'Loney') );
Likewise, querying a given user's profile would use the Table Module to map the multiple rows of the query result set to object members.
$hashArray = $table->getUserProfile( $userid );
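For a sense of the nuts and bolts, here is a minimal sketch of what such a Table Module class might look like. It is written in Python with SQLite for illustration, not PHP; the user_profile table, its columns and the snake_case method names are assumptions modelled on the two calls above.

```python
import sqlite3


class UserProfileTable:
    """Table Module: one object handles every row of the profile table."""

    def __init__(self, connection):
        self._db = connection
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS user_profile ("
            "user_id INTEGER NOT NULL, attribute TEXT NOT NULL, value TEXT, "
            "PRIMARY KEY (user_id, attribute))"
        )

    def set_user_profile(self, user_id, properties):
        # One row per property: the Entity-Attribute-Value layout.
        self._db.executemany(
            "INSERT OR REPLACE INTO user_profile (user_id, attribute, value) "
            "VALUES (?, ?, ?)",
            [(user_id, name, value) for name, value in properties.items()],
        )

    def get_user_profile(self, user_id):
        # Fold the user's rows back into a single dictionary.
        rows = self._db.execute(
            "SELECT attribute, value FROM user_profile WHERE user_id = ?", (user_id,)
        )
        return dict(rows.fetchall())
```

The caller passes in and gets back one dictionary per user and never sees that it is stored as several rows.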