I'm looking at an implementation of multi-tenancy in SQL Server. I'm considering a shared database, shared schema, and the tenant view filter described here. The only drawback is a fragmented connection pool...
Per http://msdn.microsoft.com/en-au/architecture/aa479086, Tenant View Filter is described as follows:
"SQL views can be used to grant individual tenants access to some of the rows in a given table, while preventing them from accessing other rows.
In SQL, a view is a virtual table defined by the results of a SELECT query. The resulting view can then be queried and used in stored procedures as if it were an actual database table. For example, the following SQL statement creates a view of a table called Employees, which has been filtered so that only the rows belonging to a single tenant are visible:
CREATE VIEW TenantEmployees AS
SELECT * FROM Employees WHERE TenantID = SUSER_SID()
This statement obtains the security identifier (SID) of the user account accessing the database (which, you'll recall, is an account belonging to the tenant, not the end user) and uses it to determine which rows should be included in the view"
Thinking this through: if we have one database storing, say, 5,000 different tenants, then the connection pool is completely fragmented, and every time a request is sent to the database ADO.NET needs to establish a new connection and authenticate. (Remember that connection pooling works per unique connection string, and this approach means you have 5,000 connection strings.)
How worried should I be about this? Can someone give me some real world examples of how significant an impact the connection pool has on a busy multi-tenant database server (say servicing 100 requests per second)? Can I just throw more hardware at the problem and it goes away?
Thoughts ??
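To make the fragmentation concern concrete, here is a toy simulation in Python. It is not ADO.NET and does not measure real login cost; it only counts how often a pool keyed by connection string has to open a fresh connection. The `PoolSet` class, the `simulate` function, the connection strings, and the round-robin traffic pattern are all assumptions made for illustration, and idle connections are assumed never to expire.

```python
from collections import defaultdict


class PoolSet:
    """Toy model of a client-side pooler: one pool per distinct connection string."""

    def __init__(self):
        self.idle = defaultdict(int)  # connection string -> idle pooled connections
        self.physical_opens = 0       # connections that needed a fresh login

    def acquire(self, conn_str):
        if self.idle[conn_str] > 0:
            self.idle[conn_str] -= 1  # pool hit: reuse, no new login
        else:
            self.physical_opens += 1  # pool miss: open and authenticate

    def release(self, conn_str):
        self.idle[conn_str] += 1      # hand the connection back to its own pool


def simulate(requests, tenants, per_tenant_login):
    """Count physical opens for a run of serialized requests."""
    pools = PoolSet()
    for i in range(requests):
        tenant = i % tenants  # round-robin traffic (assumption; real traffic is skewed)
        if per_tenant_login:
            conn_str = f"Server=db;User Id=tenant{tenant}"  # hypothetical string
        else:
            conn_str = "Server=db;User Id=app"              # hypothetical string
        pools.acquire(conn_str)
        pools.release(conn_str)
    return pools.physical_opens


# Ten seconds of traffic at 100 requests per second, 5,000 tenants.
shared = simulate(requests=1000, tenants=5000, per_tenant_login=False)
fragmented = simulate(requests=1000, tenants=5000, per_tenant_login=True)
print(shared, fragmented)
```

Under these assumptions the shared login opens one connection and reuses it, while per-tenant logins open a new connection for every request until each tenant has been seen once. How much that costs on a real server depends on login time and on how skewed the tenant traffic is, which this sketch does not model.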
My suggestion would be to develop a solid API over your database. Scalability, modularity, extensibility, and accounting are the main reasons. A few years down the line you may find yourself cursing the decision to play with SUSER_SID(). For instance, consider multiple tenants managed by one account, or situations like white labels...
Have a data access API that takes care of authentication. You can still do authorisation at the DB level, but that's a whole different topic. Have users, and perhaps groups, and grant them permissions to tenants.
For huge projects nevertheless, you'll still find it better to have a single DB per big player.
I see I did not answer your main question about fragmented connection pool performance, but I'm convinced there are many valid arguments not to go that path nevertheless.
See http://msdn.microsoft.com/en-us/library/bb669058.aspx for a hybrid solution.
See Row level security in SQL Server 2012
I have developed an application in which I have created a different login for every client. Our application has many clients, like job portals or Facebook, and every client has a huge amount of data. If I use a single database, then one table accumulates a huge amount of data for all clients.
I found one solution, which is to create a separate database for every client. But since there are so many clients, we would need to create a great many databases, so that does not seem like the right solution.
Can you please tell me the right way to implement this using SQL Server 2008 R2?
Thanks
You could try having one schema per client, where that client's logon has that schema as its default and it is the only schema the logon has access to. However, you'll have a lot of schemas, so it may not be much help! (Also, if you're using something like EF to access the DB, it won't work.)
Single database good:
Easy management
Single database bad:
Possible performance problems (although not until you get into billions of rows; one DB I designed had a table with more than 21B rows after 3 months; lucky I made the IDENTITY column a BigInt!)
Security issues/complexity: how do you stop one client accessing another's data?
Single point of failure for all clients
Multiple database good:
Security is easier
Single point of failure per client (assuming multiple DB servers to spread that load also)
More flexibility in applying updates: some clients are OK with Wednesday, some with Thursday
Multiple database bad:
More management required
Given a DB has overhead, your overhead resource usage goes up
I'm sure that there are other issues as well. Really it's up to your requirements and how they can best be met.
I am working on a database auditing solution and was thinking of having SQL Server triggers take care of changes and inserting them into an auditing table. Since this is a SQL Azure Database and will be fairly large I am concerned about the cost of a growing database due to auditing.
In order to cut down on the costs needed for auditing purposes, I am considering storing the audit table (or tables) in Azure Tables instead of Azure SQL databases. So the question becomes, how to get the SQL Server trigger to get the changed data into Azure Tables?
The only thing I can come up with is to have an audit table (or tables) in SQL Databases so the trigger can insert the rows locally, and then have a Worker Role every X seconds pull any rows from that and move them to Azure Tables and delete from the SQL Database table so it doesn't grow large.
Is there a better way to do this integration? Can I somehow put a message in a queue from a trigger?
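For reference, the staging-and-move approach described above can be sketched roughly as follows. This is Python with an in-memory SQLite table standing in for the SQL Database staging table and a plain dict standing in for Azure Tables; the `audit_staging` table, the `archive` dict, and `move_batch` are hypothetical names, not part of any Azure SDK. The point it illustrates is the ordering: copy first, delete second, with keyed writes so a crash between the two steps only causes a harmless repeat.

```python
import sqlite3

archive = {}  # stand-in for Azure Table storage, keyed by audit row id


def move_batch(conn, batch_size=100):
    """Copy one batch of staged audit rows to the archive, then delete them."""
    rows = conn.execute(
        "SELECT id, detail FROM audit_staging ORDER BY id LIMIT ?", (batch_size,)
    ).fetchall()
    for row_id, detail in rows:
        archive[row_id] = detail  # keyed write: repeating it after a crash is safe
    with conn:  # commit the deletes as one transaction
        conn.executemany(
            "DELETE FROM audit_staging WHERE id = ?", [(row_id,) for row_id, _ in rows]
        )
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE audit_staging (id INTEGER PRIMARY KEY, detail TEXT)")
conn.executemany("INSERT INTO audit_staging (detail) VALUES (?)", [("a",), ("b",), ("c",)])
conn.commit()

moved = move_batch(conn, batch_size=2)
remaining = conn.execute("SELECT COUNT(*) FROM audit_staging").fetchone()[0]
```

Only the rows that were actually copied are deleted, so rows inserted by triggers while the batch is running stay in the staging table for the next pass.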
Azure SQL Database (formerly SQL Azure) doesn't support CLR (hence no EXTERNAL NAME trigger parameter) so there's no way for your triggers to do anything outside of T-SQL. If you want audit content to go to a table, you could take the approach you came up with (temporarily write to SQL table, then move content periodically to Table). There are other approaches you could take (and this would be opinion/subjective, frowned upon here), but going with the queue concept for a minute, since you asked about queues, and illustrating what you could do with Azure Queues:
You could use an Azure queue to specify an item to insert/update in your SQL database. The queue processing code could then be responsible for performing the update and writing to the Azure table. Since the queue messages must be explicitly deleted after processing, you could simply repeat the queue message processing if something failed during execution (e.g. you write to SQL but fail before writing to table storage). The message eventually becomes visible for reading again, if you don't delete it before its timeout value. As long as your operations are idempotent, you'd be ok with this pattern.
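A minimal sketch of that retry pattern, in Python with in-memory stand-ins rather than the real Azure Queue or Table APIs. The `sql_rows` and `audit_table` dicts, the message shape, and the `process` and `drain` functions are assumptions for illustration. A message that is not deleted is simply put back on the queue, which approximates the visibility timeout expiring.

```python
from collections import deque

sql_rows = {}     # stand-in for the SQL database, keyed by row id
audit_table = {}  # stand-in for Azure Table storage, keyed by message id


def process(msg, fail_before_audit=False):
    # Both writes are keyed upserts, so replaying the same message is harmless.
    sql_rows[msg["row_id"]] = msg["value"]
    if fail_before_audit:
        raise RuntimeError("crashed between the SQL write and the table write")
    audit_table[msg["id"]] = (msg["row_id"], msg["value"])


def drain(queue, fail_first_attempt_for=()):
    """Process until the queue is empty; return the attempt count per message."""
    attempts = {}
    while queue:
        msg = queue.popleft()  # "receive": the message is hidden from other readers
        attempts[msg["id"]] = attempts.get(msg["id"], 0) + 1
        fail = msg["id"] in fail_first_attempt_for and attempts[msg["id"]] == 1
        try:
            process(msg, fail_before_audit=fail)
        except RuntimeError:
            queue.append(msg)  # never deleted, so it becomes visible again
        # on success nothing is re-queued: this is the explicit delete
    return attempts


queue = deque([
    {"id": "m1", "row_id": 1, "value": "alice"},
    {"id": "m2", "row_id": 2, "value": "bob"},
])
attempts = drain(queue, fail_first_attempt_for={"m1"})
```

Message `m1` is processed twice here, but because both writes overwrite by key, the final state is the same as if it had succeeded the first time.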
A cheaper solution than using worker roles would be to use a combination of Azure Scheduled Tasks (you can enable them for free to run every 15 min within Mobile Apps) and Azure Web Sites. Basically the way it would work is to run this scheduled job every 15 min which would make an HTTP call to some code you have running within your Azure Web Site. This code would do the same work you had outlined for your worker role.
Alternatively, use SQL Server system-versioned temporal tables to automatically handle writing the audited records (i.e., changes) to corresponding history tables.
We are considering using a single SQL Server database to store data for multiple clients. We feel having all the data in one database could make things more manageable than a "separate db per client" setup.
The biggest concern we have is accidental access to the wrong client. It would be very, very bad if we were to ever accidentally show one client's data to another client. We perform lots of queries, and are afraid of a scenario where someone says "write me a query of this and this to go show the client for the meeting in 15 minutes." If someone is careless and omits the WHERE clause that filters for the correct client then we would be in serious trouble. Is there a robust setup or design pattern for SQL Server such that it makes it impossible (or at least very difficult) to accidentally pull the wrong client's data from a single "global" database?
To be clear, this is NOT a database that the clients use directly or via apps (yet). We are talking about a database accessed by several of our programmers and we are afraid of screwing up ourselves.
At the very minimum, you should put the client data in separate schemas. In SQL Server, schemas are the unit of authorization. Only people authorized for a given client should be able to see that client's data. In addition to other protections, you should be using the built-in authorization capabilities of the database.
Right now, it sounds like you are in a situation where a very small group of people are the ones accessing all the data for everyone. Well, if you are successful, then you will probably need more people in the future. In fact, you might be giving some clients direct access to the data. If it is their data, they will want apps running on it.
My best advice, if you are planning on growing, is to place each client's data in a separate database. I would architect the system so this database can be on a remote server. If it needs to synchronize with common data, then develop a replication strategy for moving that data around.
You may think it is bad to have one client see another client's data. From the business perspective, this is deadly -- like "company goes out of business, no job" deadly. Your clients are probably more concerned about such confidentiality than you are. And, an architecture that ensures protection will make them more comfortable.
Multi-Tenant Data Architecture
http://msdn.microsoft.com/en-us/library/aa479086.aspx
here's what we do (mysql unfortunately):
"tenant" column in each table
tables are in one schema [1]
views are in another schema (for easier security and naming). view must not include tenant column. view does a WHERE on the tenant based on current user
tenant value is set by trigger on insert, based on the user
Assuming that all your DDL is in .sql files under source control (which it should be), then having many databases or schemas is not so tough.
[1] a schema in mysql is called a 'database'
You could set up one inline table-valued function for each table that takes a required parameter @customerID and filters that particular table to the data of this customer. If the entire app were to use only these TVFs, the app would be safe by construction.
There might be some performance implications. The exact numbers depend on the schema and queries. They can be zero, however, as inline TVFs are inlined and optimized together with the rest of the query.
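The "safe by construction" idea can be illustrated with an application-side analogue. The sketch below is Python with SQLite standing in for SQL Server, so it is not an inline table-valued function; it only shows the same shape, where the only sanctioned way to read a table is an accessor whose customer ID parameter cannot be left out. The `orders` table and the `orders_for` function are hypothetical.

```python
import sqlite3


def orders_for(conn, customer_id):
    """Only sanctioned way to read the orders table; customer_id is mandatory."""
    if customer_id is None:
        raise ValueError("customer_id is required")
    cur = conn.execute(
        "SELECT id, item FROM orders WHERE customer_id = ? ORDER BY id",
        (customer_id,),  # bound parameter, never string concatenation
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER NOT NULL, item TEXT)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, item) VALUES (?, ?)",
    [(1, "widget"), (2, "gadget"), (1, "gizmo")],
)

rows_1 = orders_for(conn, 1)
rows_2 = orders_for(conn, 2)
```

The filter lives in exactly one place per table, so a hurried ad hoc query cannot forget it as long as direct access to the base table is not granted.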
You can limit access to the data so that it goes only through stored procedures with an obligatory customer ID parameter.
If you let your IT staff build views, sooner or later someone will forget this WHERE clause, as you said.
But a schema per client with already prefiltered views would enable self-service and bring extra value, I guess.
We have a requirement from a client to protect the database our application uses, even from their local administrators (Auditors just gave them that requirement).
In their requirement, protecting the data means that the Sql Server admin cannot read, nor modify sensitive data stored in tables.
We could do that with encryption in SQL Server 2005, but that would interfere with our third-party ORM, and it has other cons, such as its effect on indexing.
In SQL Server 2008 we could use TDE, but I understand that this solution doesn't stop a user with SQL Server admin rights from querying the database.
Is there any best practice or known solution to this problem?
This problem could be similar to the one of having an application hosted by a host provider, and you want to protect the data from the host admins.
We can use Sql Server 2005 or 2008.
This has been asked a lot in the last few weeks. The answers usually boil down to:
(a) If you don't control the application, you are doomed to trust the DBA,
or
(b) If you do control the application, you can encrypt everything with a key known only to the application, and decrypt on the way out. It'll hurt performance a bit (or a lot) though; that's why TDE exists. A variant of this to prevent tampering is to use a cryptographic hash of the values in the column, checking them upon application access.
and, in either case,
(c) Do extensive auditing, so you can monitor what your admins are doing.
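The tamper-detection variant can be sketched as below, with one deliberate substitution: the sketch uses a keyed hash (HMAC-SHA256) rather than a plain hash, on the assumption that an admin who can edit the value could also recompute an unkeyed hash. The key, the row layout, and the `seal` and `verify` helpers are hypothetical, and in practice the key would have to live somewhere the DBA cannot read.

```python
import hashlib
import hmac

APP_KEY = b"hypothetical-key-held-only-by-the-application"


def seal(employee_id, salary):
    """Return the tag to store next to the row; it binds the value to its row id."""
    message = f"{employee_id}|{salary}".encode("utf-8")
    return hmac.new(APP_KEY, message, hashlib.sha256).hexdigest()


def verify(employee_id, salary, stored_tag):
    """True only if the row still matches the tag written by the application."""
    return hmac.compare_digest(seal(employee_id, salary), stored_tag)


# The application writes the row together with its tag.
row = {"employee_id": 7, "salary": 50000}
row["tag"] = seal(row["employee_id"], row["salary"])
ok_before = verify(row["employee_id"], row["salary"], row["tag"])

# Someone later edits the value directly in the database.
row["salary"] = 90000
ok_after = verify(row["employee_id"], row["salary"], row["tag"])
```

This only detects modification when the application next reads the row. It does nothing to stop an admin from reading the value, which is what the encryption in (b) is for.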
I might have salary information in my tables that I don't want even my trusted DBAs to see.
Faced with the same problem, we have narrowed our options to:
1- Encrypt outside SQL Server, before inserts and updates, and decrypt after selects, e.g. using .NET encryption.
Downside: You lose some indexing and searching capabilities; you cannot use LIKE and BETWEEN.
2- Use third-party tools (at the I/O level) that block CRUD operations on the database unless a password is provided, e.g. www.Blockkk.com
Downside: You will need to trust a third-party tool installed on your server. It might not keep up with SQL Server patches, etc...
3- Use an auditing solution that keeps track of selects, inserts, deletes, etc. and notifies you (by email or event log) if violations occur. A sample violation could be a DBA running a select on your Salaries table. Then fire the DBA and change everyone's salaries.
Auditors always ask for this, like they ask for other things that can never be done.
What you need to do is put it into risk-mitigation terms and show what controls you do have (tracking when users are elevated to administrators, what they did and that they were de-elevated afterwards) instead of in absolutes.
I once had a boss ask for total system redundancy without defining what he meant or how much he was willing to pay and sacrifice.
I think the right solution would be to only allow trusted people to be DBAs.
It is implicit in being a DBA that you have full access, so in my opinion your auditor should demand that you have procedures for restricting who has DBA access.
That way you work with the system through processes instead of working against the system (i.e. SQL Server).
To have a person you don't trust be a DBA would be nuts...
If you don't want any people in the admin group on the server to be able to access the database, then remove the "BUILTIN\Administrators" user on the server.
However, make sure you have another user that is a sysadmin on the server!
Another way, which I heard a company has implemented but haven't seen myself:
There's a government body that issues a kind of timestamped certificate.
Each DB change is sent to an async queue, timestamped with this certificate, and stored off site. This way no one can delete anything without breaking the timestamp chain.
I don't know exactly how this works at a deeper level.
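One common way to get a "chain that breaks if anything is deleted" is a hash chain, where each log entry's hash covers the previous entry's hash. The Python sketch below shows only that generic idea. It is an assumption about the mechanism, not a description of what that company or the certificate authority actually does, and it leaves out the trusted timestamping entirely. The entry layout and function names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # fixed starting value for the first entry


def entry_hash(prev_hash, change):
    """Hash of one entry, covering both the change and the previous entry's hash."""
    payload = json.dumps({"prev": prev_hash, "change": change}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(log, change):
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"change": change, "hash": entry_hash(prev, change)})


def is_intact(log):
    """Recompute every link; any edited or removed middle entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["hash"] != entry_hash(prev, entry["change"]):
            return False
        prev = entry["hash"]
    return True


log = []
append(log, {"table": "Salaries", "op": "UPDATE", "id": 7})
append(log, {"table": "Salaries", "op": "DELETE", "id": 9})
append(log, {"table": "Employees", "op": "INSERT", "id": 12})
intact_before = is_intact(log)

del log[1]  # someone removes an entry from the middle
intact_after = is_intact(log)
```

Removing entries from the end of the log is not caught by this check alone. That is presumably where an external party comes in: if the latest hash is regularly signed or stored by someone outside the DBA's reach, truncation becomes detectable too.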
I'm working with an application that needs to provide row- and column-level security for user reports. The logic for the row filtering and column masking is in place, but there are still decisions to be made about identifying users at report execution time.
The application uses a single SQL Server login to authenticate, as all rights are data driven within the application itself. This mechanism does not carry well to reports, as clients like Crystal and MS Office do not authenticate through the application (web and WinForms).
The traditional approach of using SQL Server logins and database users will work well, but may have one issue. In some implementations of the application, the number of users who run reports and need to be uniquely identified may run into the hundreds.
Are there any practical limits to the number of logins or users on a SQL Server database (v 2005+) where this approach may cause problems? Administration of the users on the database server can be automated by the application, but the potential number of credentials may be a concern.
We have looked into user impersonation techniques, but they become difficult to implement when a report client such as Excel authenticates directly to the server.
Edit: The concern is not concurrency or workload, but rather administration issues on remote instances where a local DBA is not available, especially when the server is not dedicated to the application. Interested in scenarios where the numbers of logins were problematic.
I've used your described approach (SQL Server accounts managed automatically by our application) and we didn't have any trouble. However, we only ever had a maximum of perhaps 200 SQL accounts. But we didn't experience any kind of administrative overhead except when "power users" restored databases without telling us, causing the SQL login account to become out of synch with the database*.
I think your approach is sound.
EDIT: Our solution for this was a proc that simply ran through the user accounts and called our procs that deleted/created the user accounts. When the power users called this proc all was well, and it was reasonably fast.