In SQL Server 2005 I just struck the infamous error message:
Introducing FOREIGN KEY constraint XXX on table YYY may cause cycles or multiple cascade paths. Specify ON DELETE NO ACTION or ON UPDATE NO ACTION, or modify other FOREIGN KEY constraints.
Now, StackOverflow has several topics about this error message, so I've already got the solution (in my case I'll have to use triggers), but I'm curious as to why there is such a problem at all.
As I understand it, there are basically two scenarios that they want to avoid - a cycle and multiple paths. A cycle would be where two tables have cascading foreign keys to each other. OK, a cycle can span several tables too, but this is the basic case and will be easier to analyze.
Multiple paths would be when TableA has foreign keys to TableB and TableC, and TableB also has a foreign key to TableC. Again - this is the minimum basic case.
I cannot see any problems that would arise when a record would get deleted or updated in any of those tables. Sure, you might need to query the same table multiple times to see which records need updating/deleting, but is that really a problem? Is this a performance issue?
In other SO topics people go as far as to label using cascades as "risky" and state that "resolving cascade paths is a complex problem". Why? Where is the risk? Where is the problem?
You have a child table with 2 cascade paths from the same parent: one "delete", one "null".
What takes precedence? What do you expect afterwards? etc
Note: A trigger is code and can add some intelligence or conditions to a cascade.
The reason we forbid using cascade delete has to do with performance and locking. Yes it's not so bad when you delete one record but sooner or later you will need to delete a large group of records and your database will comes to a standstill.
If you are deleting enough records, SQL Server might escalate to a table lock and no one can do anything with the table until it is finished.
We recently moved one of our clients to his own server. As part of the deal we also then had to delete all of that client's records form our original server. Deleting all his information in batches (so as not to cause problems with other users) took a couple of months. If we had cascade delete set up, the database would have been inaccessible to the other clients for a long time as millions of records were deleted in one transaction and hundreds of tables were locked until the transaction was done.
I could also see a scenario where a deadlock might have occured in using cascade delete because we have no control over the order the cascade path would have taken and our database is somewhat denormalized with clientid appearing in most tables. So if it locked the one table that had a foreign key also to a third table as well as the client table that was in a differnt path, it possibly couldn't check that table in order to delete from the third table because this is all one transaction and the locks wouldn't be released until it was done. So possibly it wouldn't have let us set up cascade deletes if it saw the possibility of creating deadlocks in the transaction.
Another reason to avoid cascading deletes is that sometimes the existence of a child record is reason enough not to delete the parent record. For instance, if you have a customer table and that customer has had orders in the past, you would not want to delete him and lose the information on the actual order.
Consider a table of employees:
CREATE TABLE Employee
(
EmpID INTEGER NOT NULL PRIMARY KEY,
Name VARCHAR(40) NOT NULL,
MgrID INTEGER NOT NULL REFERENCES Employee(EmpID) ON DELETE CASCADE
);
INSERT INTO Employees( 1, "Bill", 1);
INSERT INTO Employees( 23, "Steve", 1);
INSERT INTO Employees(234212, "Helen", 23);
Now suppose Bill retires:
DELETE FROM Employees WHERE Name = "Bill";
Ooooppps; everyone just got sacked!
[We can debate whether the details of the syntax are correct; the concept stands, I think.]
I think the problem is that when you make one path "ON DELETE CASCADE" and the other "ON DELETE RESTRICT", or "NO ACTION" the outcome (the result) is unpredicable. It depends on which delete-trigger (this is also a trigger, but one you don't have to build yourself) will be executed first.
I agree with that cascades being "risky" and should be avoided. (I personally prefer cascading the changes manually rather that having sql server automatically take care of them). This is also because even if sql server deleted millions of rows, the output would still show up as
(1 row(s) affected)
I think whether or not to use a ON DELETE CASCADE option is a question of the business model you are implementing. A relationship between two business objects could be either a simple "association", where both ends of the relationship are related, but otherwise independent objects the lifecycle of which are different and controlled by other logic. There are, however, also "aggregation" relationships, where one object might actually be seen as the "parent" or "owner" of a "child" or "detail" object. There is the even stronger notion of a "composition" relationship, where an object solely exists as a composition of a number of parts.
In the "association" case, you usually won't declare an ON DELETE CASCADE constraint. For aggregations or compositions, however, the ON DELETE CASCADE helps you mapping your business model to the database more accurately and in a declarative way.
This is why it annoys me that MS SQL Server restricts the use of this option to a single cascade path. If I'm not mistaken, many other widely used SQL database systems do not impose such restrictions.
I'm adding custom SQL to an EF6 code first migration. The SQL adds delete triggers to various tables. It's generating correct syntax and getting output as expected.
But the location of the output is giving me trouble. Code first migrations put all the foreign key constraints at the bottom of the generated SQL. When creating the constraints for a join table (used in a many-to-many association), the ON DELETE CASCADE clause is added. All fine and dandy, except when I've added a delete trigger to one of the tables participating in that relationship -- cascade delete isn't allowed coming from a table with a delete trigger on it.
Ideally, I'd like to do one of two things -- either move my SQL to the bottom of the script (where I can yank the cascade deletes off the constraints after they're added) or somehow tell EF not to cascade delete certain particular join tables, while leaving the default behavior in place for the more common case where the cascade deletes are the correct thing to do.
Anyone have a clue as to how this can be done?
Thanks.
I have a tree table that has foreign keys to self (depth-unlimited parents and children). Given that SQL Server lacks the CASCADE (within same table, loops) support MySQL offers, I need to first order the IDs to delete before deleting them. I order them with a CTE from most-deletable (no children) to least-deletable (top parents).
I want to throw this set of IDs to a DELETE but I need them deleted in the exact order I send them to SQLServer. Can this be done in a single query?
I could write a stored procedure to delete the records or just DELETE one by one with multiple queries from C#... but I'm wondering if this can be achieved in a simple DELETE FROM WHERE IN query? I have no clue if DELETE retains order of the IN (...) collection, because if it doesn't... the FKC will block it. -- Coming from MySQL, never cared as they can cascade within the same table.
My company has an application with a bunch of database tables that used to use a sequence table to determine the next value to use. Recently, we switched this to using an identity property. The problem is that in order to upgrade a client to the latest version of the software, we have to change about 150 tables to identity. To do this manually, you can right click on a table, choose design, change (Is Identity) to "Yes" and then save the table. From what I understand, in the background, SQL Server exports this to a temporary table, drops the table and then copies everything back into the new table. Clients may have their own unique indexes and possibly other things specific to the client, so making a generic script isn't really an option.
It would be really awesome if there was a stored procedure for scripting this task rather than doing it in the GUI (which takes FOREVER). We made a macro that can go through and do this, but even then, it takes a long time to run and is error prone. Something like: exec sp_change_to_identity 'table_name', 'column name'
Does something like this exist? If not, how would you handle this situation?
Update: This is SQL Server 2008 R2.
This is what SSMS seems to do:
Obtain and Drop all the foreign keys pointing to the original table.
Obtain the Indexes, Triggers, Foreign Keys and Statistics of the original table.
Create a temp_table with the same schema as the original table, with the Identity field.
Insert into temp_table all the rows from the original table (Identity_Insert On).
Drop the original table (this will drop its indexes, triggers, foreign keys and statistics)
Rename temp_table to the original table name
Recreate the foreign keys obtained in (1)
Recreate the objects obtained in (2)
I need to delete many rows from sql server 2008 database, it must be scalable so I was thinking about bulk delete, the problem is that the are not many references on this, at least in my case.
The first factor is that I will exactly know the ID of every row to delete, so any tips with TOP are not an option, also I will delete less rows that I want to retain so there will be no need for some "drop/temp table/re-create" methods.
So I was thinking to use WHERE IN , either suppling IDs or xml data with IDs, there is also an option to use MERGE to delete rows.
If I will have to delete 1000+ rows, sending all ids to WHERE IN could be a problem ? And what with MERGE - it is really a cure for all bulk insert/update/delete problems? What to choose?
One option would be to store the known ID's into a "controller" table, and then delete rows from your main data table that have their ID show up in the controller table.
That way, you could easily "batch" up your deletes, e.g.
DELETE FROM dbo.YourMainDataTable
WHERE ID IN (SELECT TOP (250) ID FROM dbo.DeleteControllerTable)
You could easily run this delete statement in e.g. a SQL Agent Job which comes around every 15 minutes to check if there's anything to delete. In the meantime, you could add more ID's to your DeleteController table, thus you could "decouple" the process of entering the ID's to be deleted from the actual deletion process.