I have a system with shared multitenancy, which means each table contains data for all tenants with a TenantId column to distinguish between them.
Provisioning a new tenant is quick and easy, however now I'm facing a challenge with deleting a single tenant.
Given that entities depend on each other for consistency, how do I delete a tenant easily from my database, while the system is in use by other tenants?
The system uses SQL Server 2008 R2, if that helps.
If I got you right - this is the classical case for use of FOREIGN KEYS with ON CASCADE option. You only delete one record from master tenants table and due to proper chain of FKeys the system deletes related records or updates the reference columns with NULL or DEFAULT value
Sometimes will not work in cases where table references itself with DELETE ON CASCADE
As Oleg has pointed out FK with ON CASCADE option should help. But since you haven't shown us the schema, I am not very sure whether there is a possibility of system throwing an error saying "Introducing FOREIGN KEY constraint causes cycles or Multiple cascade paths". If you see this error then may be instead of CASCADE DELETE add a INSTEAD OF DELETE trigger to do the job.
CREATE TRIGGER dbo.Tenants_Delete
ON dbo.Tenants
INSTEAD OF DELETE
AS
BEGIN;
--Delete from the Child and Master table as per your need here.
--Make use of the magic table DELETED
END;
Here's another approach: If deleting the tenant causes too much headache, you may be able to use a workaround.
Simply add a boolean column active to your tenant table. Then introduce a view that selects only the active tenants. Adjust your stored procedures to look up the data in this view rather than in the original tenant table.
Related
In SQL Server 2005 I just struck the infamous error message:
Introducing FOREIGN KEY constraint XXX on table YYY may cause cycles or multiple cascade paths. Specify ON DELETE NO ACTION or ON UPDATE NO ACTION, or modify other FOREIGN KEY constraints.
Now, StackOverflow has several topics about this error message, so I've already got the solution (in my case I'll have to use triggers), but I'm curious as to why there is such a problem at all.
As I understand it, there are basically two scenarios that they want to avoid - a cycle and multiple paths. A cycle would be where two tables have cascading foreign keys to each other. OK, a cycle can span several tables too, but this is the basic case and will be easier to analyze.
Multiple paths would be when TableA has foreign keys to TableB and TableC, and TableB also has a foreign key to TableC. Again - this is the minimum basic case.
I cannot see any problems that would arise when a record would get deleted or updated in any of those tables. Sure, you might need to query the same table multiple times to see which records need updating/deleting, but is that really a problem? Is this a performance issue?
In other SO topics people go as far as to label using cascades as "risky" and state that "resolving cascade paths is a complex problem". Why? Where is the risk? Where is the problem?
You have a child table with 2 cascade paths from the same parent: one "delete", one "null".
What takes precedence? What do you expect afterwards? etc
Note: A trigger is code and can add some intelligence or conditions to a cascade.
The reason we forbid using cascade delete has to do with performance and locking. Yes it's not so bad when you delete one record but sooner or later you will need to delete a large group of records and your database will comes to a standstill.
If you are deleting enough records, SQL Server might escalate to a table lock and no one can do anything with the table until it is finished.
We recently moved one of our clients to his own server. As part of the deal we also then had to delete all of that client's records form our original server. Deleting all his information in batches (so as not to cause problems with other users) took a couple of months. If we had cascade delete set up, the database would have been inaccessible to the other clients for a long time as millions of records were deleted in one transaction and hundreds of tables were locked until the transaction was done.
I could also see a scenario where a deadlock might have occured in using cascade delete because we have no control over the order the cascade path would have taken and our database is somewhat denormalized with clientid appearing in most tables. So if it locked the one table that had a foreign key also to a third table as well as the client table that was in a differnt path, it possibly couldn't check that table in order to delete from the third table because this is all one transaction and the locks wouldn't be released until it was done. So possibly it wouldn't have let us set up cascade deletes if it saw the possibility of creating deadlocks in the transaction.
Another reason to avoid cascading deletes is that sometimes the existence of a child record is reason enough not to delete the parent record. For instance, if you have a customer table and that customer has had orders in the past, you would not want to delete him and lose the information on the actual order.
Consider a table of employees:
CREATE TABLE Employee
(
EmpID INTEGER NOT NULL PRIMARY KEY,
Name VARCHAR(40) NOT NULL,
MgrID INTEGER NOT NULL REFERENCES Employee(EmpID) ON DELETE CASCADE
);
INSERT INTO Employees( 1, "Bill", 1);
INSERT INTO Employees( 23, "Steve", 1);
INSERT INTO Employees(234212, "Helen", 23);
Now suppose Bill retires:
DELETE FROM Employees WHERE Name = "Bill";
Ooooppps; everyone just got sacked!
[We can debate whether the details of the syntax are correct; the concept stands, I think.]
I think the problem is that when you make one path "ON DELETE CASCADE" and the other "ON DELETE RESTRICT", or "NO ACTION" the outcome (the result) is unpredicable. It depends on which delete-trigger (this is also a trigger, but one you don't have to build yourself) will be executed first.
I agree with that cascades being "risky" and should be avoided. (I personally prefer cascading the changes manually rather that having sql server automatically take care of them). This is also because even if sql server deleted millions of rows, the output would still show up as
(1 row(s) affected)
I think whether or not to use a ON DELETE CASCADE option is a question of the business model you are implementing. A relationship between two business objects could be either a simple "association", where both ends of the relationship are related, but otherwise independent objects the lifecycle of which are different and controlled by other logic. There are, however, also "aggregation" relationships, where one object might actually be seen as the "parent" or "owner" of a "child" or "detail" object. There is the even stronger notion of a "composition" relationship, where an object solely exists as a composition of a number of parts.
In the "association" case, you usually won't declare an ON DELETE CASCADE constraint. For aggregations or compositions, however, the ON DELETE CASCADE helps you mapping your business model to the database more accurately and in a declarative way.
This is why it annoys me that MS SQL Server restricts the use of this option to a single cascade path. If I'm not mistaken, many other widely used SQL database systems do not impose such restrictions.
Should I break my database up into two under this scenario?
Scenario
While customers are creating, editing, saving orders in the Order table, the website owner calls a stored procedure to alter table properties in the Email table.
Update
The only relation the Order table has with the Email table is the userId (FK on Email table) . So, what are the ramifications if WHILE customer is placing an order, I am simultaneous adding say a nullable "CcAddressId" column to the Email table. Would problems occur with this order being successful?
Questions:
Will either have a potential error if these events occurring simultaneously?
Would it be better to break up the database into groups?
Short answer is not to break the database in two because you will lose the referential integrity enforced by your foreign keys, not to mention other issues that others brought up above.
What you need to do is find out if SQL Server will issue any kind of lock for your EMails table as you are inserting data into Orders table.
You can experiment and find out for yourself using this as a starting point:
http://aboutsqlserver.com/2012/04/05/locking-in-microsoft-sql-server-part-13-schema-locks/
Altering a table will try to issue a schema modification lock (SCH-M) which is incompatible with any other kinds of locks. Therefore, if there is any activity going on over the table being altered (which I assume there will be because foreign key constraints are being validated), your schema modification statement will be blocked for a long time.
This is why it is better to run schema altering statements when your database is NOT under heavy load.
I am pretty new to Business Analysis. I have to write requirements that show both (for now) cascade delete (for two tables) and the rest of the tables will delete explicitly.
I need some guidance for how to write the requirements for cascade deletion.
Delete child entities on parent
deletion.
Delete collection members if collection entity is deleted.
Actually it is hard to understand the task without context and also it smells like university/colledge homework (we had one very similar to this).
Use the ON DELETE CASCADE option to specify whether you want rows deleted in a child table when corresponding rows are deleted in the parent table. If you do not specify cascading deletes, the default behavior of the database server prevents you from deleting data in a table if other tables reference it.
If you specify this option, later when you delete a row in the parent table, the database server also deletes any rows associated with that row (foreign keys) in a child table. The principal advantage to the cascading-deletes feature is that it allows you to reduce the quantity of SQL statements you need to perform delete actions.
For example, the all_candy table contains the candy_num column as a primary key. The hard_candy table refers to the candy_num column as a foreign key. The following CREATE TABLE statement creates the hard_candy table with the cascading-delete option on the foreign key:
CREATE TABLE all_candy
(candy_num SERIAL PRIMARY KEY,
candy_maker CHAR(25));
CREATE TABLE hard_candy
(candy_num INT,
candy_flavor CHAR(20),
FOREIGN KEY (candy_num) REFERENCES all_candy
ON DELETE CASCADE)
Because ON DELETE CASCADE is specified for the dependent table, when a row of the all_candy table is deleted, the corresponding rows of the hard_candy table are also deleted. For information about syntax restrictions and locking implications when you delete rows from tables that have cascading deletes, see Considerations When Tables Have Cascading Deletes.
Source: http://publib.boulder.ibm.com/infocenter/idshelp/v10/index.jsp?topic=/com.ibm.sqls.doc/sqls292.htm
You don't write use cases for functionality - that is the reason why it is hard to properly answer your question - we don't know the actor who interacts with the system and of course we know nothing about the system, so we cannot tell you how to write description of their interactions.
You should write your use cases first and from them derive the functionality.
I'm not 100% sure how cascading deletes work.
I have for simplicity tables that look like this
User
User_ID
ExtendedUser
User_ID
Comments
User_Id
Posts
User_ID
I basically have a ton of tables which reference the User_ID from User. I'd like to set a cascading delete on one table so that I can delete the User object and ensure that all tables that reference User are deleted.
However, my understanding is that I need to set the delete action on every table that references User. that is I need to set the "cascade delete" on every child table. Is my understanding correct?
SQL Server Cascading
Update:
It looks like I have to set it for every relationship. Where should I think of these relationships as being "stored"? Maybe my conception is not right.
It looks like I can set all the referential integrity rules for each relationship using the management studio from the parent table.
For each relationship, you can specify what action to take.
Easiest way to manage this likely would be to use SQL Server Management Studio. Design your parent table, and find all the PK-FK relationships.
For each, choose which path to take when a Delete event occurs:
No Action - this would cause a FK error when it occurs
Cascade - delete the child record
Set null - the FK column value would be null'd. This would throw an err obviously when nulls aren't allowed in the child table.
Set default - if the FK column on the child table has a default, it would then be the new value in the child column.
What you are describing seems like bad mojo (especially if you don't enforce referential integrity). Consider the following:
You have 3 tables that reference the user table: Client, Employee and Guest
If I understand you correctly you are saying that when you delete a Client record that references a User record then you want that User record deleted as well (correct?).
If you do this and there are Employee and Guest records that reference that same user record then they will suddenly be pointing to nothing (if you don't enforce referential integrity).
It seems to me that if you have a bunch of tables that reference the User table then you should not do cascading deletes...
How do I go about deleting a row that is referenced by many other tables, either as a primary key or as a foreign key?
Do I need to delete each reference in the appropriate order, or is there an 'auto' way to perform this in, for example, linq to sql?
If you're performing all of your data access through stored procedures then your delete stored procedure for the master should take care of this. You need to maintain it when you add a new related table, but IMO that requires you to think about what you're doing, which is a good thing.
Personally, I stay away from cascading deletes. It's too easy to accidentally delete a slew of records when the user should have been warned about existing children instead.
Many times the best way to delete something in a database is to just "virtually" delete it by setting an IsDeleted column, and then ignoring the row in all other queries.
Deletes can be very expensive for heavily linked tables, and the locks can cause other queries to fail while the delete is happening.
You can just leave the "IsDeleted" rows in the system forever (which might be helpful for auditing), or go back and delete them for real when the system is idle.
if you have the foreign keys set with ON DELETE CASCADE, it'll take care of pruning your database with just DELETE master WHERE id = :x