How to create a database design for access control and tracking?


I am planning to create an access control system. With an educational institution in mind, I created perimeter tables such as Main Gate, Zones, Building, Levels, Sections, and Rooms, each related to the next by a foreign key. Now I want to create department and employee tables, but how do I assign a department to a particular perimeter? Depending on the user, a department can be a single room or an entire building. Adding a foreign key to every perimeter table in the department table does not seem like the right way. I also want access to follow a path: a user who arrives at a checkpoint by a different route should not be authorized. I am a newbie in database design, so there might be a simple way I did not think of. Please help with an optimal design idea for the above.
Thanks in advance..

Since the point of creating a hierarchy of securable locations is to manage who should have access to which locations, you are better off managing all of these locations in a single table with an involuted foreign key (self-referencing relationship).
Consider the following ERD:
Here you have security zones that can contain other security zones. People are granted access to the appropriate zones. Access to a lower level zone implies access to all of the zones that contain the lower level zone.
Using an involuted foreign key implies having to deal with hierarchical data, which can be a nuisance in SQL. To ease hierarchy navigation, I suggest using visitation numbers as I described at length in my answer to this question.
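A minimal sketch of the single-table hierarchy, using Python's sqlite3 module and a recursive query rather than the visitation numbers mentioned above. The table names (zone, person, access), the sample zones, and the accessible_zones helper are assumptions made for illustration; they are not taken from the ERD.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE zone (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    parent_id INTEGER REFERENCES zone(id)   -- self-reference; NULL for the outermost zone
);
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE access (
    person_id INTEGER NOT NULL REFERENCES person(id),
    zone_id   INTEGER NOT NULL REFERENCES zone(id),
    PRIMARY KEY (person_id, zone_id)
);
INSERT INTO zone VALUES
    (1, 'Main Gate',  NULL),
    (2, 'Building A', 1),
    (3, 'Level 1',    2),
    (4, 'Room 101',   3),
    (5, 'Building B', 1);
INSERT INTO person VALUES (1, 'alice');
INSERT INTO access VALUES (1, 4);            -- alice is granted Room 101 only
""")

def accessible_zones(person_id):
    """Zones the person may enter: each granted zone plus every zone containing it."""
    rows = conn.execute("""
        WITH RECURSIVE reachable(id, parent_id) AS (
            SELECT z.id, z.parent_id
              FROM zone z JOIN access a ON a.zone_id = z.id
             WHERE a.person_id = ?
            UNION
            SELECT z.id, z.parent_id
              FROM zone z JOIN reachable r ON z.id = r.parent_id
        )
        SELECT name FROM zone WHERE id IN (SELECT id FROM reachable) ORDER BY id
    """, (person_id,)).fetchall()
    return [row[0] for row in rows]
```

Here a single grant on Room 101 yields the whole path from the Main Gate down to the room, while Building B stays off limits. Whether a recursive query or visitation numbers performs better depends on the database and on how often the hierarchy changes.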
One thing to consider in any kind of access control system is minimizing the amount of data to be maintained by your security administrator. Many people will have the same access rules. For this reason, you may want to expand on the above ERD to use role based security in which groups not individuals are granted access to zones and individuals are granted access to groups. See my answer to this question on dba.se for more about role based security. If it might help, you could use a mixture of role-based and individual access rules.
Another option to consider is that your ACCESS table could include both allow and disallow flags. These could be used to allow access to a larger area while specifically forbidding access to a smaller area that is contained within. This approach could reduce the amount of data that needs to be managed for each person.

Related

Practical Role Based Data Access Controls in ASP.NET MVC / SQL Server

I have an ASP.NET MVC + SQL Server application with 250 simultaneous users daily which uses AD/NTLM SSO to do all the authorization using a custom authorization security class that controls access to controllers & Actions based on users & groups.
A dilemma recently came up where the 50K+ account records of the database are going to be managed by different groups to varying degrees:
All users will be able to view most records.
Certain records can only be edited by certain users/groups of specific departments.
There will be admin & support groups that will be able to edit any group-owned records.
etc.
This is not a problem of who has access to what features/forms/etc. in the controllers, but instead a dilemma of data ownership restrictions that must be imposed. I am guessing this means I need some additional layer of security for row level security.
I am looking for a pragmatic & robust way to tackle data ownership within the current application framework with minimal performance hits since it is likely the same thing will need to be imposed on other much larger tables of data. Initially there will be about 5 ownership groups, but creeping up to 25 to 100 in the near future.
Sadly, there are no cut-and-dried, hard-and-fast business rules that can be implemented here. There is no rhyme or reason to who owns what except the record's primary key id.
To try to fix it I was thinking of creating a table of owner_roles and map it to the users table then create another table called accounts_ownership that looks something like:
tbl(PK),row(PK),owner(PK),view,create,modify,delete
accounts,1,hr,1,1,1,1
accounts,1,it,1,0,0,0
accounts,2,hr,1,1,1,1
accounts,2,it,1,1,1,1
accounts,3,it,1,0,0,0
But in doing so that would create a table that was 250K lines and could easily get some crappy performance. Looking at sites like Facebook and others this must be a common thing that has to be implemented, but I am hesitant to introduce a table like that since it could create serious performance issues.
The other way I thought this may be implemented is by adding an extra column to the accounts table that is a compound field that is comma separated that would contain the owner(s) with a coded set of rights ie.:
id owners
1 ,hr,
2 ,hr,
3 ,hr,it,
4 ,it,
And then add a custom class to search using the 'like' statement, provided the logged-in user's role was "it" and commas were reserved and not allowed in owner names:
SELECT * FROM accounts WHERE owners LIKE '%,it,%'
... however this really just feels wrong from a DBA perspective (ugly as hell) and a maintenance nightmare.
Any practical approaches on how I could implement this without destroying my site?

Start with Role-based access control, you can possibly skip the roles from the pure definition but should be able to implement it like this:
Every user can be in one or more groups like admin, support, it, hr
Every data row has an owner like it, hr
On Access, check the access: an admin can see and edit all rows. Support+it sees every row and can edit those from it etc. This way you need only (user-groups + row-access) new rows in your database, not (user-groups * row-access).
User groups in your scenario should be possible to hardcode in your application, in a CMS there is generally a table defining what rights to assign to each user group - complicating the coding but very flexible.
Roles in the original concept allow a user to select what rights he/she wants to use, there would be a "Unlock with admin rights" or the like in your interface.
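As a rough sketch of the "user groups + row owner" idea, here is a small sqlite3 example. The table names (accounts, user_group), the sample users, and the rule that only the admin group may edit everything are assumptions for illustration; the real rules would come from your application.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (
    id    INTEGER PRIMARY KEY,
    owner TEXT NOT NULL              -- owning group, e.g. 'hr' or 'it'
);
CREATE TABLE user_group (
    user_id    INTEGER NOT NULL,
    group_name TEXT NOT NULL,
    PRIMARY KEY (user_id, group_name)
);
INSERT INTO accounts VALUES (1, 'hr'), (2, 'hr'), (3, 'it');
INSERT INTO user_group VALUES
    (1, 'admin'),                    -- user 1: may edit everything
    (2, 'support'), (2, 'it'),       -- user 2: may edit rows owned by it
    (3, 'hr');                       -- user 3: may edit rows owned by hr
""")

def editable_account_ids(user_id):
    """Ids of the account rows the user may edit (every user may view every row)."""
    rows = conn.execute("""
        SELECT a.id
          FROM accounts a
         WHERE EXISTS (SELECT 1
                         FROM user_group ug
                        WHERE ug.user_id = ?
                          AND (ug.group_name = a.owner OR ug.group_name = 'admin'))
         ORDER BY a.id
    """, (user_id,)).fetchall()
    return [row[0] for row in rows]
```

The storage cost is one owner value per data row plus one row per user-group membership, which is the "plus, not times" point made above.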

Primarily for performance reasons, I went with the less elegant approach listed. It took some doing: careful application controls had to be created to enforce things like no commas in the ids.

Securely store data for multiple parties in a single table

I'm certainly no DBA and only a beginner when it comes to software development, so any help is appreciated. What is the most secure structure for storing the data from multiple parties in one database? For instance if three people have access to the same tables, I want to make sure that each person can only see their data. Is it best to create a unique ID for each person and store that along with the data then query based on that ID? Are there other considerations I should take into account as well?

You are on the right track, but mapping the USER ID into the table is probably not what you want, because in practice many users have access to the corporation's data. In those cases you would store "CorpID" as a column, or more generically "ContextID". But yes, to limit access to data, each row should be able to convey who the data is for, either directly (the row actually contains a reference to CorpID, UserID, ContextID or the like) or it can be inferred by joining to other tables that reference the qualifier.
In practice, these rules are enforced by a middle tier that queries the database, providing the user context in some way so that only the correct records are selected out of the database and ultimately presented to the user.
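A minimal sketch of that middle-tier filter, assuming a hypothetical record table with a context_id column and a records_for helper that the rest of the application would call instead of querying the table directly:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE record (
    id         INTEGER PRIMARY KEY,
    context_id INTEGER NOT NULL,     -- who the row is for, e.g. a CorpID
    payload    TEXT NOT NULL
);
CREATE INDEX record_context_idx ON record (context_id);
INSERT INTO record (context_id, payload) VALUES
    (1, 'corp 1 invoice'),
    (1, 'corp 1 contract'),
    (2, 'corp 2 invoice');
""")

def records_for(context_id):
    """Return only the rows belonging to the caller's context.

    The context id comes from the authenticated session, never from user
    input, and is bound as a parameter rather than concatenated into the SQL.
    """
    rows = conn.execute(
        "SELECT payload FROM record WHERE context_id = ? ORDER BY id",
        (context_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

Keeping every query behind helpers like this makes it harder to forget the filter, though it still relies on the middle tier being free of bugs.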

...three people have access to the same tables...
If these people can query the tables directly through some query tool like Toad, then we have a serious problem. If not, that is, if they access the data through some middle tier/service layer, then @wagregg's solution above holds.
Coming to the case where they have direct access rights, one approach is:
Create database-level user accounts for each of the users.
Have another table with row-level grant information. Say your_table has a primary key column MY_PK_COL; then the structure of GRANTS_TABLE would be {USER_ID; MY_PK_COL}, with MY_PK_COL a foreign key to your_table.
Remove all privileges of the concerned users from your_table.
Create a view that filters through GRANTS_TABLE: SELECT * FROM your_table WHERE MY_PK_COL IN (SELECT MY_PK_COL FROM GRANTS_TABLE WHERE USER_ID = getCurrentUserID());
Give your users SELECT/INSERT/UPDATE rights on this view.
Most database systems (MySQL, Oracle, SQL Server) provide a way to get the currently logged-in user (the one used in the connection string). They also provide ways to restrict access to certain tables. For your users, the view will then behave like a normal table; they will never know the difference.
A problem arises when there are too many users: provisioning a database-level user account for every one of them may become difficult. But a DBMS like SQL Server can use Windows authentication, thereby reducing the user-creation problem.
In most scenarios, filtering at the middle tier is the best approach. But there are times when security is paramount, and a bug in the middle tier (SQL injection, to name one) may allow malicious users to bypass the security. Then you have to do what you have to do.
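The view-plus-grants idea can be tried out in SQLite, with one big caveat: SQLite has no database-level user accounts or privileges, so the sketch below stands in for "the current database user" with an application-registered SQL function. The names current_user_id, your_view, and visible_rows are hypothetical; on a server DBMS you would use its built-in current-user function and real GRANT/REVOKE statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Stand-in for the DBMS's "current user" function (an assumption for this sketch).
session = {"user_id": None}
conn.create_function("current_user_id", 0, lambda: session["user_id"])

conn.executescript("""
CREATE TABLE your_table (
    my_pk_col INTEGER PRIMARY KEY,
    payload   TEXT NOT NULL
);
CREATE TABLE grants_table (
    user_id   INTEGER NOT NULL,
    my_pk_col INTEGER NOT NULL REFERENCES your_table(my_pk_col),
    PRIMARY KEY (user_id, my_pk_col)
);
-- Users would be given rights on this view only, never on your_table itself.
CREATE VIEW your_view AS
    SELECT t.*
      FROM your_table t
     WHERE t.my_pk_col IN (SELECT g.my_pk_col
                             FROM grants_table g
                            WHERE g.user_id = current_user_id());
INSERT INTO your_table VALUES (1, 'row for user 7'), (2, 'row for user 8'), (3, 'shared row');
INSERT INTO grants_table VALUES (7, 1), (7, 3), (8, 2), (8, 3);
""")

def visible_rows(user_id):
    """Query the view as the given user and return the payloads they can see."""
    session["user_id"] = user_id
    rows = conn.execute("SELECT payload FROM your_view ORDER BY my_pk_col").fetchall()
    return [row[0] for row in rows]
```

The query against your_view never mentions a user id; the filter lives entirely in the view definition, which is the point of the approach.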

It sounds like you're talking about a multi-tenant architecture, but I can't tell for sure.
This SO answer has a summary of the issues, and links to an online article containing details about the trade-offs.

How to handle different organization with single DB?

Background
I am building an online information system that users can access from any computer. I don't want to replicate the DB and code for every university or organization.
I just want a user to hit a domain like www.example.com, sign in, and use it.
A second user will hit the same domain www.example.com, sign in, and use it, but their data is different.
Scenario
Suppose one university has 200 employees, a second university has 150, and so on.
Question
Do I need a separate employee table for each university, or is it OK to have a single table with a University ID column?
I assume the second is best, but suppose I have 20 universities or organizations and a total of thousands of employees.
What is the best approach?
Does the same apply to every table? This is just an example.
Thanks

The approach will depend upon the data, usage, and client requirements/restrictions.
Use an integrated model, as suggested by duffymo. This may be appropriate if each organization is part of a larger whole (i.e. all colleges are part of a state college board) and security concerns about cross-query access are minimal [2]. This approach has a minimal amount of separation between organizations, as the same schema [1] and relations are "openly" shared. It leads to a very simple model initially, but it can become very complicated (with compound FKs and correct usage of such) if relations are needed for organization-specific values, because that adds another dimension of data.
Implement multi-tenancy. This can be achieved with implicit filters on the relations (perhaps hidden behind views and stored procedures), different schemas, or other database-specific support. Depending upon implementation this may or may not share schema or relations even though all data may reside in the same database. With implicit isolation, some complicated keys or relationships can be hidden/eliminated. Multi-tenancy isolation also generally makes it harder/impossible to cross-query.
Silo the databases entirely. Each customer or "organization" has a separate database. This implies separate relations and schema groups. I have found this approach to be relatively simple with automated tooling, but it does require managing multiple databases. Direct cross-querying is impossible, although "linked databases" can be used if there is a need.
Even though it's not "a single DB", in our case we had the following restrictions: 1) we were never allowed to share/expose data between organizations, and 2) each organization wanted its own local database. Thus, our product ended up using a silo approach. Make sure that the approach chosen meets customer requirements.
None of these approaches will have any issue with "thousands", "hundreds of thousands", or even "millions" of records as long as the indices and queries are correctly planned. However, switching from one to another can violate many assumed constraints, so the decision should be made early on.
[1] In this response I am using "schema" to refer to the security grouping of database objects (e.g. tables, views) and not the database model itself. The actual database model used can be common/shared, as we do even when using separate databases.
[2] An integrated approach is not necessarily insecure, but it doesn't inherently have some of the built-in isolation of other designs.
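To make the "compound FKs" remark about the integrated model concrete, here is a small sqlite3 sketch. The tables (university, department, employee) and the try_insert_employee helper are assumptions for illustration. Carrying university_id inside every key lets the database itself reject a row that points at another organization's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only when this is on
conn.executescript("""
CREATE TABLE university (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE department (
    university_id INTEGER NOT NULL REFERENCES university(id),
    id            INTEGER NOT NULL,
    name          TEXT NOT NULL,
    PRIMARY KEY (university_id, id)
);
CREATE TABLE employee (
    university_id INTEGER NOT NULL,
    id            INTEGER NOT NULL,
    name          TEXT NOT NULL,
    department_id INTEGER NOT NULL,
    PRIMARY KEY (university_id, id),
    -- Compound FK: the department must belong to the same university.
    FOREIGN KEY (university_id, department_id)
        REFERENCES department (university_id, id)
);
INSERT INTO university VALUES (1, 'First University'), (2, 'Second University');
INSERT INTO department VALUES (1, 10, 'Physics'), (2, 20, 'History');
""")

def try_insert_employee(university_id, emp_id, name, department_id):
    """Insert an employee; return False if the database rejects the row."""
    try:
        with conn:   # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO employee VALUES (?, ?, ?, ?)",
                (university_id, emp_id, name, department_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

An employee of university 1 can be placed in department 10, but an attempt to place them in department 20 (which belongs to university 2) fails the compound foreign key. This is the extra "dimension" the integrated model adds to every relationship.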

I would normalize it to have UNIVERSITY and EMPLOYEE tables, with a one-to-many relationship between them.
You'll have to take care to make sure that only people associated with a given university can see their data. Role based access will be important.

This is called a multi-tenant architecture. You should read this:
http://msdn.microsoft.com/en-us/library/aa479086.aspx
I would go with Tenant Per Schema, which means copying the structure across different schemas, however, as you should keep all your SQL DDL in source control, this is very easy to script.
It's easy to screw up and "leak" information between tenants if doing it all in the same table.

Multi Tenant Database with some Shared Data

I have a full multi-tenant database with TenantID's on all the tenanted databases. This all works well, except now we have a requirement to allow the tenanted databases to "link to" shared data. So, for example, the users can create their own "Bank" records and link accounts to them, but they could ALSO link accounts to "global" Bank records that are shared across all tenants.
I need an elegant solution which keeps referential integrity
The ways I have come up with so far:
Copy: all shared data is copied to each tenant, perhaps with a "System" flag. Changes to shared data involve huge updates across all tenants. Probably the simplest solution, but I don't like the data duplication
Special ID's: all links to shared data use special ID's (e.g. negative ID numbers). These indicate that the TenantID is not to be used in the relation. You can't use an FK to enforce this properly, and certainly cannot reuse ID's within tenants if you have ANY FK. Only triggers could be used for integrity.
Separate ID's: all tables which can link to shared data have TWO FK's; one uses the TenantID and links to local data, the other does not use TenantID and links to shared data. A constraint indicates that one or the other is to be used, not both. This is probably the most "pure" approach, but it just seems...ugly, but maybe not as ugly as the others.
So, my question is in two parts:
Are there any options I haven't considered?
Has anyone had experience with these options and has any feedback on advantages/disadvantages?

A colleague gave me an insight that worked well. Instead of thinking about the tenant access as per-tenant, think about it as group access. A tenant can belong to multiple groups, including its own specific group. Data then belongs to a group, possibly the tenant's specific group, or maybe a more general one.
So, "My Bank" would belong to the Tenant's group, "Local Bank" would belong to a regional grouping which the tenant has access to, and "Global Bank" would belong to the "Everyone" group.
This keeps integrity, FK's and also adds in the possibility of having hierarchies of tenants, not something I need at all in my scenario, but a nice little possibility.
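A small sketch of the group idea in sqlite3. The table names (access_group, tenant_group, bank), the sample tenants, and the visible_banks helper are assumptions for illustration. Every bank row belongs to exactly one group, so a plain foreign key still enforces integrity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE access_group (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE tenant_group (                  -- which groups each tenant belongs to
    tenant_id INTEGER NOT NULL,
    group_id  INTEGER NOT NULL REFERENCES access_group(id),
    PRIMARY KEY (tenant_id, group_id)
);
CREATE TABLE bank (
    id       INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    group_id INTEGER NOT NULL REFERENCES access_group(id)
);
INSERT INTO access_group VALUES
    (1, 'Everyone'), (2, 'Region North'), (10, 'Tenant A only'), (11, 'Tenant B only');
INSERT INTO tenant_group VALUES
    (100, 1), (100, 2), (100, 10),           -- tenant A
    (200, 1), (200, 11);                     -- tenant B
INSERT INTO bank (name, group_id) VALUES
    ('Global Bank', 1), ('Local Bank', 2), ('My Bank', 10), ('Tenant B Bank', 11);
""")

def visible_banks(tenant_id):
    """Banks owned by any group the tenant belongs to."""
    rows = conn.execute("""
        SELECT b.name
          FROM bank b
          JOIN tenant_group tg ON tg.group_id = b.group_id
         WHERE tg.tenant_id = ?
         ORDER BY b.name
    """, (tenant_id,)).fetchall()
    return [row[0] for row in rows]
```

Tenant A sees the global, regional, and private banks; tenant B sees only the global bank and its own. Accounts can reference bank(id) with a single ordinary FK regardless of which group owns the bank.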

At Citus, we're building a multi-tenant database using PostgreSQL. For shared information, we keep it in what we call "reference" tables, which are indeed copied across all the nodes. However, we keep this in-sync and consistent using 2PC, and can also create FK relationships between reference and non-reference data.
You can find more information here.

Users should not get their own set of tables. It will most likely not perform as well as one table (properly indexed), and schema changes will have to be deployed to all user tables.
You could have default values specified on the table for things that are optional.
With difficulty. With one set of tables it will be a lot easier, and probably faster.
That sort of data should be stored in a User Preferences table that stores all preferences for all users. Again, don't duplicate the schema for all users.

Generally the idea of creating separate tables for each entity (in this case users) is not a good idea. If each table is separate querying may be cumbersome.
If your table is large you should optimize the table with indexes. If it gets very large, you also may want to look into partitioning tables.
This allows you to see the table as 1 object, though it is logically split up - the DBMS handles most of the work and presents you with 1 object. This way you SELECT, INSERT, UPDATE, ALTER etc as normal, and the DB figures out which partition the SQL refers to and performs the command.
Not splitting up the tables by user, and instead using indexes and partitions, would deal with scalability while maintaining performance. If you don't split up the tables manually, this also makes points 2, 3, and 4 moot.
Here's a link to partitioning tables (SQL Server-specific):
http://databases.about.com/od/sqlserver/a/partitioning.htm

It doesn't make any kind of sense to me to create a set of tables for each user. If you have a common set of tables for all users then I think that avoids all the issues you are asking about.

It sounds like you need to locate a primer on relational database design basics. Regardless of the type of application you are designing, you should start there. Learn how joins work, indices, primary and foreign keys, and so on. Learn about basic database normalization.
It's not customary to create new tables on-the-fly in an application; it's usually unnecessary in a properly designed schema. Usually schema changes are done at deployment time. The only time "users" get their own tables is an artifact of a provisioning decision, wherein each "user" is effectively a tenant in a walled-off garden; this only makes sense if each "user" (more likely, a company or organization) never needs access to anything that other users in the system have stored.
There are mechanisms for dealing with loosely structured types of information in databases, but if you find yourself reaching for this often (the most common method is called Entity-Attribute-Value), your problem is either not quite correctly modeled, or you may not actually need a relational database, in which case you might be better off with a document-oriented database like CouchDB/MongoDB.
Adding, based on your updated comments/notes:
Your concerns about the number of records in a particular table are most likely premature. Get something working first. Most modern DBMSes, including newer versions of MySQL, support mechanisms beyond indices and clustered indices that can help deal with large numbers of records. To wit, in MS SQL Server you can create a partition function on fields of a table; MySQL 5.1+ has a few similar partitioning options based on hash functions, ranges, or other mechanisms. Follow well-established conventions for database design, modeling your domain as sensibly as possible, then adjust when you run into problems. First adjust using the tools available within your choice of database, then consider more drastic measures only when you can prove they are needed. There are other kinds of denormalization that are more likely to make sense before you would even want to consider having something as unidiomatic to database systems as a "table per user" model; even if I were to look at that route, I'd probably consider something like materialized views first.

I agree with the comments above that say that a table per user is a bad idea. Also, while it's a good idea to have strategies in mind now for how you can cope when things get really big, I'd concentrate on getting things right for a small number of users first - if no-one wants to / is able to use your service, then unfortunately you won't be faced with the problem of lots of users.
A common approach among very large sites is database sharding. The summary is: you have N instances of your database in parallel (on separate machines), and each holds 1/N of the total data. There's some shared way of knowing which instance holds a given bit of data. To access some data you have 2 steps, rather than the 1 you might expect:
Work out which shard holds the data
Go to that shard for the data
There are problems with this, such as: you set up e.g. 8 shards and they all fill up, so you want to share the data over e.g. 20 shards -> migrating data between shards.
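The "work out which shard holds the data" step can be sketched with a simple hash-modulo scheme. This is only one possible routing rule, chosen here for illustration; the function names and the user-id keys are hypothetical. It also shows why changing the shard count forces data migration.

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Step 1: map a key to a shard index with a stable hash (not Python's hash())."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

def moved_fraction(keys, old_shards: int, new_shards: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for key in keys
        if shard_for(key, old_shards) != shard_for(key, new_shards)
    )
    return moved / len(keys)

# Step 2 would be: connect to shard number shard_for(user_id, 8) and query there.
sample_keys = [f"user-{i}" for i in range(1000)]
fraction_moved = moved_fraction(sample_keys, 8, 20)
```

With plain modulo routing, going from 8 to 20 shards relocates most keys, which is the migration problem described above. Schemes such as consistent hashing or a lookup directory are commonly used to reduce that cost.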
