I executed this query that deletes 16,000,000 rows:
delete from [table_name]
Three minutes after execution, with no results, I cancelled the query.
It took a while to cancel, but in the end it said "Query Cancelled".
Does that mean that any of the 16,000,000 rows are deleted? Or are they all still there?
This is not the real query, I just used it for the example.
The rows should be there. Delete is rolled back if you cancel the statement.
Once it completes, the changes are committed though.
So, first off: if you want to be sure you can cancel a query successfully, you need to use transactions. If you were using an explicit transaction and that transaction was rolled back, none of the rows would be deleted. Management Studio will only use implicit transactions if you have that flag set (see here).
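For illustration, a minimal sketch of wrapping the delete in an explicit transaction (the table name is just the placeholder from the question), so that nothing is permanent until you decide to commit:
BEGIN TRANSACTION

DELETE FROM [table_name]

-- check @@ROWCOUNT here, then either:
-- COMMIT TRANSACTION    -- make the delete permanent
ROLLBACK TRANSACTION     -- or undo it entirely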
That said, I was really intrigued, because I figured that a delete should be an "atomic" operation, so I wrote the following code to test it:
create database test_database
use test_database
go
create table sample (
ID BIGINT IDENTITY(1,1) PRIMARY KEY,
value VARCHAR(25)
)
declare @id int
select @id = 1
-- raise 90000 if this doesn't give you enough rows
while @id >= 1 and @id <= 90000
begin
insert into sample (value) values ('abcdefg');
select @id = @id + 1
end
-- run, but cancel in the middle
delete from sample
-- check, are there 90000 records now?
select count(id) from sample
-- clean up
drop table sample
Turns out, if you click cancel it does treat this as an "atomic" statement, which means all the rows are still there. Your environment might be different so test it out!
I am struggling to find a SQL Server replacement for SELECT ... FOR UPDATE that works.
I have a master table that contains a column which holds the next order number. The application does a SELECT ... FOR UPDATE on this row, reads the current value (while the row is locked), adds one to the value, updates the row, and then uses the number it received. This process works perfectly on every database I've tried, except SQL Server, which does not seem to have any mechanism for selecting data for exclusive use.
How do I do a locked read and update of something like a next order number from a sequence table in SQL Server?
BTW, I know I can use things like IDENTITY columns to do this, but in this case I must read from this existing column, get the value and increment it, and do it in a safely locked manner so that two users cannot get the same value.
UPDATE:
Thank you, that works for this case :)
DECLARE @Output char(30)
UPDATE scheme.sysdirm
SET @Output = key_value = cast(key_value as int)+1
WHERE system_key='OPLASTORD'
SELECT @Output
I have one other place I do something similar. I read and lock a stock record too.
SELECT STOCK
FROM PRODUCT
WHERE ID = ? FOR UPDATE.
I then do some validation and then do
UPDATE PRODUCT SET STOCK = ?
WHERE ID=?
I can't just use your method above here, as the value I update is based on processing I do with the stock I read. But I need to ensure no one else can mess with the stock while I do this. Again, this is easy on other DBs with SELECT ... FOR UPDATE... is there a SQL Server workaround? :)
You can simply do an UPDATE that also reads the new value out into a SQL Server variable:
DECLARE @Output INT
UPDATE dbo.YourTable
SET @Output = YourColumn = YourColumn + 1
WHERE ID = ????
SELECT @Output
Since it's an atomic UPDATE statement, it's safe against concurrency issues (only one connection can hold the update lock at any given time). A second session that wants to get the incremented value at the same time will have to wait until the first one completes, and will thus get the next value from the table.
As an alternative you can use the OUTPUT clause of the UPDATE statement, although this inserts the value into a table variable.
Create table YourTable
(
ID int,
YourColumn int
)
GO
INSERT INTO YourTable VALUES (1, 1)
GO
DECLARE @Output TABLE
(
YourColumn int
)
UPDATE YourTable
SET YourColumn = YourColumn + 1
OUTPUT inserted.YourColumn INTO @Output
WHERE ID = 1
SELECT TOP 1 YourColumn
FROM @Output
**** EDIT
If you want to ensure that no one can change the data after you have read it, you can use REPEATABLE READ. Be aware that any rows you read will then stay locked until the transaction ends (pessimistic locking), which may cause deadlocking. You can also use a SELECT ... FROM Table WITH (UPDLOCK) hint within a transaction.
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRANSACTION
SELECT STOCK
FROM PRODUCT
WHERE ID = ?
.....
...
UPDATE Product
SET Stock = nnn
WHERE ID = ?
COMMIT TRANSACTION
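As a rough sketch of the UPDLOCK variant mentioned above (reusing the PRODUCT table and parameter placeholders from the question), the hint takes update locks on the rows read, so a second session attempting the same read blocks until the first transaction commits:
BEGIN TRANSACTION

SELECT STOCK
FROM PRODUCT WITH (UPDLOCK, ROWLOCK)
WHERE ID = @Id            -- placeholder for the ? parameter

-- validation / business logic here

UPDATE PRODUCT
SET STOCK = @NewStock     -- value computed from the stock read above
WHERE ID = @Id

COMMIT TRANSACTION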
I'm trying to understand a problem I have run into that I don't believe should be possible when dealing with transactions using the READ COMMITTED isolation level. I have a table that is being used as a queue. In one thread (connection 1) I insert multiple batches of 20 records into the table; each batch of 20 records is performed inside a transaction. In a second thread (connection 2) I perform an update to change the status of the records that have been inserted into the queue, which also occurs inside a transaction. When they run concurrently, my expectation is that the number of rows affected by the update (connection 2) should be a multiple of 20, since connection 1 inserts rows into the table in batches of 20 within a transaction.
But my testing shows this is not always the case, and on occasion I'm able to update a subset of the records from one of connection 1's batches. Should this be possible, or am I missing something about transactions, concurrency, and isolation levels? Below is a set of test scripts I created to reproduce this issue in T-SQL.
This script inserts 20,000 records into the table in transaction batches of 20.
USE ReadTest
GO
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
GO
SET NOCOUNT ON
DECLARE @trans_id INTEGER
DECLARE @cmd_id INTEGER
DECLARE @text_str VARCHAR(4000)
SET @trans_id = 0
SET @text_str = 'Placeholder String Value'
-- First empty the table
DELETE FROM TABLE_A
WHILE @trans_id < 1000 BEGIN
SET @trans_id = @trans_id + 1
SET @cmd_id = 0
BEGIN TRANSACTION
-- Insert 20 records into the table per transaction
WHILE @cmd_id < 20 BEGIN
SET @cmd_id = @cmd_id + 1
INSERT INTO TABLE_A ( transaction_id, command_id, [type], status, text_field )
VALUES ( @trans_id, @cmd_id, 1, 1, @text_str )
END
COMMIT
END
PRINT 'DONE'
This script updates the records in the table, changing the status from 1 to 2, and then checks the rowcount from the update operation. When the rowcount is not a multiple of 20, a print statement reports this along with the number of rows affected.
USE ReadTest
GO
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
GO
SET NOCOUNT ON
DECLARE @loop_counter INTEGER
DECLARE @trans_id INTEGER
DECLARE @count INTEGER
SET @loop_counter = 0
WHILE @loop_counter < 100000 BEGIN
SET @loop_counter = @loop_counter + 1
BEGIN TRANSACTION
UPDATE TABLE_A SET status = 2
WHERE status = 1
and type = 1
SET @count = @@ROWCOUNT
COMMIT
IF ( @count % 20 <> 0 ) BEGIN
-- Records in concurrent transaction inserting in batches of 20 records before commit.
PRINT '*** Rowcount not a multiple of 20. Count = ' + CAST(@count AS VARCHAR) + ' ***'
END
IF @count > 0 BEGIN
-- Delete the records where the status was changed.
DELETE TABLE_A WHERE status = 2
END
END
PRINT 'DONE'
This script creates the test queue table in a new database called ReadTest.
USE master;
GO
IF EXISTS (SELECT * FROM sys.databases WHERE name = 'ReadTest')
BEGIN;
DROP DATABASE ReadTest;
END;
GO
CREATE DATABASE ReadTest;
GO
ALTER DATABASE ReadTest
SET ALLOW_SNAPSHOT_ISOLATION OFF
GO
ALTER DATABASE ReadTest
SET READ_COMMITTED_SNAPSHOT OFF
GO
USE ReadTest
GO
CREATE TABLE [dbo].[TABLE_A](
[ROWGUIDE] [uniqueidentifier] NOT NULL,
[TRANSACTION_ID] [int] NOT NULL,
[COMMAND_ID] [int] NOT NULL,
[TYPE] [int] NOT NULL,
[STATUS] [int] NOT NULL,
[TEXT_FIELD] [varchar](4000) NULL,
CONSTRAINT [PK_TABLE_A] PRIMARY KEY NONCLUSTERED
(
[ROWGUIDE] ASC
) ON [PRIMARY]
) ON [PRIMARY]
ALTER TABLE [dbo].[TABLE_A] ADD DEFAULT (newsequentialid()) FOR [ROWGUIDE]
GO
Your expectations are completely misplaced. You have never expressed in your query the requirement to 'dequeue' exactly 20 rows. The UPDATE can affect 0, 19, 20, 21 or 1000 rows and all of those results are correct, as long as the status is 1 and the type is 1. If you expect the 'dequeue' to occur in the order of the 'enqueue' (which is somewhat alluded to in your question, but never explicitly stated), then your 'dequeue' operation must contain an ORDER BY clause. Had you added such an explicitly stated requirement, your expectation that the 'dequeue' always returns an entire batch of 'enqueued' rows (i.e. a multiple of 20 rows) would be one step closer to being reasonable. As things stand right now, it is, as I said, completely misplaced.
For a lengthier discussion see Using Tables as Queues.
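For illustration only, an ordered 'dequeue' along the lines that article describes might look roughly like the sketch below; it assumes a clustered index on (transaction_id, command_id) rather than the nonclustered GUID key in your test schema, and even then a multiple-of-20 result is not guaranteed:
WITH next_batch AS (
    SELECT TOP (20) status
    FROM dbo.TABLE_A WITH (ROWLOCK, UPDLOCK, READPAST)
    WHERE status = 1 AND [type] = 1
    ORDER BY transaction_id, command_id
)
UPDATE next_batch
SET status = 2;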
I shouldn't be concerned that while one transaction is committing a batch of 20 inserted records, another concurrent transaction is only able to update a subset of those records and not all 20?
Basically the question boils down to: if I SELECT while I INSERT, how many inserted rows will I see? You only have a right to be concerned if the isolation level is declared as SERIALIZABLE. None of the other isolation levels make any prediction about how many rows inserted while the UPDATE was running will be visible. Only SERIALIZABLE states that the outcome has to be the same as running the two statements one after another (i.e. serialized, hence the name). While the technical details of how the UPDATE 'sees' only part of the INSERT batch are easy to understand once you consider physical order and the lack of an ORDER BY clause, the explanation is irrelevant. The fundamental issue is that the expectation is unwarranted. Even if the 'issue' is 'fixed' by adding a proper ORDER BY and the correct clustered index key (the article linked above explains the details), the expectation is still unwarranted: it will still be perfectly legal for the UPDATE to 'see' 1, 19 or 21 rows, although it will be unlikely to happen.
I guess I've always understood READ COMMITTED to only read committed data, and that a transaction commit is an atomic operation, making all the changes that occurred in the transaction available at once.
That is correct. What is incorrect is to expect a concurrent SELECT (or UPDATE) to see the entire change, regardless of where that statement happens to be in its own execution. Open an SSMS query window and run the following:
use tempdb;
go
create table test (a int not null primary key, b int);
go
insert into test (a, b) values (5,0)
go
begin transaction
insert into test (a, b) values (10,0)
Now open a new SSMS query and run the following:
update test
set b=1
output inserted.*
where b=0
This will block behind the uncommitted INSERT. Now go back to first query and run the following:
insert into test (a, b) values (1,0)
commit
When this commits, the second SSMS query will finish, and it will return two rows, not three. QED. This is READ COMMITTED. What you expect is SERIALIZABLE execution (in which case the example above will deadlock).
It could happen like this:
The writer/inserter writes 20 rows (does not commit)
The reader/updater reads one row (which is not committed, so it discards it)
The writer/inserter commits
The reader/updater reads 19 rows which are now committed thus visible
I believe that only an isolation level of SERIALIZABLE (or SNAPSHOT isolation, which is more concurrent) fixes this.
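As a rough illustration of the snapshot route (my own sketch against the ReadTest database above, not part of the original scripts), the updater could run under SNAPSHOT isolation so that it only ever sees batches that were fully committed when its snapshot was taken:
ALTER DATABASE ReadTest SET ALLOW_SNAPSHOT_ISOLATION ON;
GO
USE ReadTest
GO
SET TRANSACTION ISOLATION LEVEL SNAPSHOT
BEGIN TRANSACTION
-- reads a consistent, committed view taken at the start of the transaction
UPDATE TABLE_A SET status = 2
WHERE status = 1 and type = 1
COMMIT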
I am using SQL Server 2008. I have a table where orders with a SKU are recorded, a table for inventory that holds counts, and a table where the relationship between the SKU sold and the inventory items is recorded.
In the end, I got a report like this:
Inventory   CurrentQuantity   OpenedOrder
SKU1        300               50
SKU2        100               10
Each order will be processed individually. How can I have the database automatically update the inventory table after each order is processed?
For example:
If an order with 2 units of SKU1 in it is processed, the inventory table should automatically show 298.
Thanks
I would use a Stored Procedure, and perform the order insert and quantity update in one hit:
CREATE PROC dbo.ProcessOrder
@ItemID int,
@Quantity int
AS
BEGIN
--Update order table here
INSERT INTO dbo.Orders(ItemID, Quantity)
VALUES (@ItemID, @Quantity)
--Update Inventory here
UPDATE dbo.Inventory
SET CurrentQuantity = CurrentQuantity - @Quantity
WHERE ItemID = @ItemID
END
I think what you are looking for is a trigger.
Basically, set up a trigger that updates the appropriate columns using the inserted data. Without a full schema, that is the best answer I can give at this time.
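As a hedged sketch only (the Orders and Inventory table and column names below are assumptions borrowed from the other answer, not from your actual schema), such a trigger could look like this:
CREATE TRIGGER dbo.trg_Orders_ReduceInventory
ON dbo.Orders
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    -- subtract the ordered quantities from the matching inventory rows
    UPDATE inv
    SET inv.CurrentQuantity = inv.CurrentQuantity - i.Quantity
    FROM dbo.Inventory AS inv
    JOIN inserted AS i ON i.ItemID = inv.ItemID;
END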
I wouldn't be looking at a trigger myself for this.
My check-out process:
Start a transaction
Check stock level.
If OK, (optional validation / authorisation)
Add a check out record
Reduce the stock
Possibly add a record to invoice the recipient, etc.
Commit the transaction
While you could do it with triggers, I simply fail to see the point; a nice, simple, clear, all-in-one-place SP_CheckOut stored procedure is where I'd be going.
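A minimal sketch of such a procedure, with every table and column name invented purely for illustration:
CREATE PROC dbo.SP_CheckOut
    @ItemID int,
    @Quantity int
AS
BEGIN
    BEGIN TRANSACTION;

    -- check the stock level, holding an update lock so nobody else changes it
    DECLARE @Stock int;
    SELECT @Stock = CurrentQuantity
    FROM dbo.Inventory WITH (UPDLOCK, ROWLOCK)
    WHERE ItemID = @ItemID;

    IF (@Stock IS NULL OR @Stock < @Quantity)
    BEGIN
        ROLLBACK TRANSACTION;
        RAISERROR('Insufficient stock', 16, 1);
        RETURN;
    END

    -- add a check-out record
    INSERT INTO dbo.Orders (ItemID, Quantity) VALUES (@ItemID, @Quantity);

    -- reduce the stock
    UPDATE dbo.Inventory
    SET CurrentQuantity = CurrentQuantity - @Quantity
    WHERE ItemID = @ItemID;

    COMMIT TRANSACTION;
END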
I would normally advise using a trigger, but stock manipulation is the kind of operation that is done very frequently, sometimes in batches, and honestly that is not the best scenario for triggers.
I think PKG's idea is very good, but you should never forget to add transaction control to it, otherwise you can end up with mismatched stock figures:
CREATE PROC dbo.ProcessOrder
@ItemID int,
@Quantity int
AS
BEGIN
begin transaction my_tran
begin try
--Update order table here
INSERT INTO dbo.Orders(ItemID, Quantity)
VALUES (@ItemID, @Quantity)
--Update Inventory here
UPDATE dbo.Inventory
SET CurrentQuantity = CurrentQuantity - @Quantity
WHERE ItemID = @ItemID
commit transaction
end try
begin catch
rollback transaction
--raise error if necessary
end catch
END
You can use a trigger, or you can use a stored procedure with the specific steps shown above; if you use a procedure, you may need to enable the auto-exec feature in the master DB.
I am trying to add a check constraint which verifies that, after an update, the new value being inserted is greater than the old value already stored in the table.
For example, I have a "price" column which already stores the value 100. If an update comes with 101 that is OK; if 99 comes, my constraint should reject the update. Can this behaviour be achieved using check constraints, or should I try to use triggers or functions?
Please advise me regarding this...
Thanks,
Mircea
Check constraints can't access the previous value of the column. You would need to use a trigger for this.
An example of such a trigger would be
CREATE TRIGGER DisallowPriceDecrease
ON Products
AFTER UPDATE
AS
IF NOT UPDATE(price)
RETURN
IF EXISTS(SELECT * FROM inserted i
JOIN deleted d
ON i.primarykey = d.primarykey
AND i.price < d.price)
BEGIN
ROLLBACK TRANSACTION
RAISERROR('Prices may not be decreased', 16, 1)
END
Triggers start as a quick fix, and end with a maintenance nightmare. The two big problems with triggers are:
It's hard to see when a trigger is called. You can easily write an update statement without being aware that a trigger will run.
When triggers start triggering other triggers, it becomes hard to tell what will happen.
As an alternative, wrap access to the table in a stored procedure. For example:
create table TestTable (productId int, price numeric(6,2))
insert into TestTable (productId, price) values (1,5.0)
go
create procedure dbo.IncreasePrice(
@productId int,
@newPrice numeric(6,2))
with execute as owner
as
begin
update dbo.TestTable
set price = @newPrice
where productId = @productId
and price <= @newPrice
return @@ROWCOUNT
end
go
Now if you try to decrease the price, the procedure will not update anything and will return 0:
exec IncreasePrice 1, 4.0
select * from TestTable --> 1, 5.00
exec IncreasePrice 1, 6.0
select * from TestTable --> 1, 6.00
Stored procedures are pretty easy to read. Compared to triggers, they'll cause you a lot less headaches. You can enforce the use of stored procedures by not giving any user the right to UPDATE tables. That's a good practice anyway.
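For instance, a sketch of how that enforcement might look, assuming a hypothetical database role named app_users:
-- members of app_users can call the procedure but cannot update the table directly
DENY UPDATE ON dbo.TestTable TO app_users;
GRANT EXECUTE ON dbo.IncreasePrice TO app_users;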
Suppose that I have a table with 10,000,000 records. What is the difference between these two solutions?
Delete the data like this:
DELETE FROM MyTable
Or delete all of the data from an application, row by row:
DELETE FROM MyTable WHERE ID = @SelectedID
Does the first solution have the best performance?
What is the impact on the log and on performance?
If you need to restrict which rows are deleted rather than deleting everything, or if you can't use TRUNCATE TABLE (e.g. the table is referenced by a FK constraint, or included in an indexed view), then you can do the delete in chunks:
DECLARE @RowsDeleted INTEGER
SET @RowsDeleted = 1
WHILE (@RowsDeleted > 0)
BEGIN
-- delete 10,000 rows a time
DELETE TOP (10000) FROM MyTable [WHERE .....] -- WHERE is optional
SET @RowsDeleted = @@ROWCOUNT
END
Generally, TRUNCATE is the best way and I'd use that if possible. But it cannot be used in all scenarios. Also, note that TRUNCATE will reset the IDENTITY value for the table if there is one.
If you are using SQL Server 2000 or earlier, the TOP clause is not available in DELETE, so you can use SET ROWCOUNT instead.
DECLARE @RowsDeleted INTEGER
SET @RowsDeleted = 1
SET ROWCOUNT 10000 -- delete 10,000 rows a time
WHILE (@RowsDeleted > 0)
BEGIN
DELETE FROM MyTable [WHERE .....] -- WHERE is optional
SET @RowsDeleted = @@ROWCOUNT
END
If you have that many records in your table and you want to delete them all, you should consider TRUNCATE TABLE instead of DELETE FROM. It will be much faster, but be aware that it will not fire any triggers.
See for more details (this case sql server 2000):
http://msdn.microsoft.com/en-us/library/aa260621%28SQL.80%29.aspx
Deleting from the table within the application row by row will take a very long time, because the DBMS cannot optimize anything; it doesn't know in advance that you are going to delete everything.
The first has clearly better performance.
When you specify DELETE [MyTable] it simply erases everything without doing any checks on ID. The second one wastes time and disk operations locating each respective record before deleting it.
It also gets worse because every time a record disappears from the middle of the table, the engine may want to condense data on disk, wasting time and work again.
Maybe a better idea would be to delete data based on the clustered index columns in descending order; then the table is basically truncated from the end at every delete operation. A rough sketch of that follows.
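For illustration, assuming MyTable has a clustered index on an ID column (an assumption on my part, the question does not say), a chunked delete from the end might look like this:
DECLARE @RowsDeleted INT;
SET @RowsDeleted = 1;
WHILE @RowsDeleted > 0
BEGIN
    -- remove the 10,000 rows with the highest clustered key values
    ;WITH tail AS (
        SELECT TOP (10000) *
        FROM MyTable
        ORDER BY ID DESC
    )
    DELETE FROM tail;
    SET @RowsDeleted = @@ROWCOUNT;
END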
Option 1 will create a very large transaction and have a big impact on the log / performance, as well as escalating locks so that the table will be unavailable.
Option 2 will be slower, although it will generate less impact on the log (assuming the bulk-logged / full recovery model).
If you want to get rid of all the data, TRUNCATE TABLE MyTable would be faster than both; although it has no facility to filter rows, it makes a metadata-only change behind the scenes and basically drops the IAM on the floor for the table in question.
For clearing a table, the best performance comes from TRUNCATE TABLE MyTable. See http://msdn.microsoft.com/en-us/library/ms177570.aspx for a more verbose explanation.
Found this post on Microsoft TechNet.
Basically, it recommends:
Using SELECT INTO, copy the data that you want to KEEP to an intermediate table;
Truncate the source table;
Copy the data back from the intermediate table into the source table with INSERT INTO;
..
BEGIN TRANSACTION
SELECT *
INTO dbo.bigtable_intermediate
FROM dbo.bigtable
WHERE Id % 2 = 0;
TRUNCATE TABLE dbo.bigtable;
SET IDENTITY_INSERT dbo.bigTable ON;
INSERT INTO dbo.bigtable WITH (TABLOCK) (Id, c1, c2, c3)
SELECT Id, c1, c2, c3 FROM dbo.bigtable_intermediate ORDER BY Id;
SET IDENTITY_INSERT dbo.bigtable OFF;
ROLLBACK TRANSACTION
The first will delete all the data from the table and will have better performance than your second, which deletes only the data matching a specific key.
Now, if you have to delete all the data from the table and you don't rely on being able to roll back, consider using TRUNCATE TABLE.