Related
I have an insert that executes every 2 seconds (20 columns, 15,000 rows, fed from outside SQL Server [sensor data]); it runs in 200 ms. I would like to keep only 10 minutes of data (roughly 4,500,000 rows) in this table, then move (archive) the earliest 5 minutes to an archive table (which will store 50 days, billions of rows). The stored procedure for archiving:
begin tran

declare @UTC_min datetime2(2) = (select top 1 UTCLogDateTime from table1 order by UTCLogDateTime asc)
declare @UTC_copy datetime2(2) = dateadd(minute, 5, @UTC_min)

-- copy the earliest 5 minutes into the archive table
INSERT INTO archive_table
SELECT *
FROM table1
where UTCLogDateTime < @UTC_copy

-- then delete the copied rows in batches of 100,000
delete top (100000) from table1 where UTCLogDateTime < @UTC_copy
WHILE @@ROWCOUNT > 0
BEGIN
    delete top (100000) from table1 where UTCLogDateTime < @UTC_copy
END

commit
I would like to ensure that the insert runs flawlessly and as fast as possible, without being blocked by this archiving process. When archiving starts, the insert's runtime grows to 4-5 seconds. I will also have a live query (Power BI) reading this table every 2 seconds.
Currently I have a clustered index on the UTCLogDateTime column for both tables.
All of these processes have to run seamlessly, without blocking each other on this table.
Do you have any suggestions on how I can achieve this?
If you are using SQL Server 2016 or above, you may use TRUNCATE TABLE ... WITH (PARTITIONS (...)). This takes far fewer locks than DELETE. It is worth trying out partitioning of the table in a TEST environment first.
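A minimal sketch of what that could look like, assuming table1 has been partitioned on UTCLogDateTime into fixed time windows (the partition layout and the partition number below are hypothetical):
-- Truncating a whole partition is a metadata-only operation and takes far fewer
-- locks than a large batched DELETE. If the rows still need to be archived, the
-- partition can first be switched out with ALTER TABLE ... SWITCH PARTITION
-- (which requires a compatible target table) instead of being copied row by row.
TRUNCATE TABLE dbo.table1 WITH (PARTITIONS (1));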
I have this table:
TableAB
{
IDTablaA;
IDTablaB;
}
And I want to ensure that I always have both the pair (ID1, ID2) and the pair (ID2, ID1). So I am trying to use these two scripts:
To insert:
begin tran
insert into TableAB (IDTablaA, IDTablaB) VALUES(1,2);
insert into TableAB (IDTablaA, IDTablaB) VALUES(2,1);
commit
To delete:
begin tran
delete TableAB where IDTablaA = 1 and IDTablaB = 2;
delete TableAB where IDTablaA = 2 and IDTablaB = 1;
commit
I am using two instances of SQL Server Management Studio to run both queries, and in most cases it works: I get either both rows or neither of them. But sometimes I get only one of them.
The steps are:
run the query to delete (1,2).
run the query to add (1,2).
In most cases, the insert is blocked until the transaction that deletes both rows finishes, but in some cases it can pass to the next line and insert the second row. If this happens, I don't have coherent data.
But I don't know if it is because I made some mistake in the test, or whether in some rare cases the first query is not blocked as I expect.
Should the first insert really be blocked in all cases once the first delete has run?
The table is empty. So it seems that the row is locked when I try to delete it, which prevents the insert, but I don't know whether there can be some rare situations in which the row is not locked.
Thanks.
But I don't know if it is because I made some mistake in the test, or whether in some rare cases the first query is not blocked as I expect.
Should the first insert really be blocked in all cases once the first delete has run?
It seems you are running under the READ COMMITTED isolation level. In this case, no lock is held by the DELETE session when no rows qualify, so the INSERT session can proceed to insert rows. This becomes a race condition where you may end up with zero, one, or two rows. Consider this sequence, which results in one row:
--session 1:
begin tran;
delete TableAB where IDTablaA = 1 and IDTablaB = 2;
--no row deleted, no lock held
--session 2:
begin tran
insert into TableAB (IDTablaA, IDTablaB) VALUES(1,2);
--row inserted, lock held
insert into TableAB (IDTablaA, IDTablaB) VALUES(2,1);
--row inserted, lock held
commit;
-- inserts committed and locks released
--session 1:
delete TableAB where IDTablaA = 2 and IDTablaB = 1;
--row deleted, lock held
commit;
--deleted committed, lock released
If you instead use the SERIALIZABLE isolation level, the DELETE statement will hold a lock (a table lock in this case, since there are no indexes) and block the insert session. A less restrictive key-range lock will be held if there is an index on the columns used to locate the rows to be deleted.
Note that SERIALIZABLE is more prone to deadlocks than less restrictive isolation levels.
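For illustration, a minimal sketch of the delete script run under SERIALIZABLE, assuming an index on the pair columns (the index name is hypothetical):
-- With this index, SERIALIZABLE takes key-range locks covering only the searched
-- keys rather than a table lock:
create index IX_TableAB_Pair on TableAB (IDTablaA, IDTablaB);

set transaction isolation level serializable;
begin tran
delete TableAB where IDTablaA = 1 and IDTablaB = 2;
delete TableAB where IDTablaA = 2 and IDTablaB = 1;
commit
-- A concurrent insert of (1,2) or (2,1) now blocks until the commit, even when
-- the DELETE finds no rows, because the key range stays locked.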
Can anyone confirm this for me? I need to be able to write an 'ownership' value (who owns the record) to a field in a row, and it needs to go to the first person who selects the row for update, ignoring any further selects until the row is available to write to again.
My Transaction will be:
BEGIN TRANSACTION

DECLARE @OwnerField varchar(20)

SET @OwnerField = (SELECT OwnerField
                   FROM [Table]
                   WHERE RecordID = 2)

IF @OwnerField IS NULL -- Can own
BEGIN
    UPDATE [Table]
    SET OwnerField = 'John Smith'
    WHERE RecordID = 2
END

COMMIT TRANSACTION
As far as my knowledge goes (with Google's help), this will allow me to lock the row, check whether there is a value in it, write one if not, and exit if so.
Does this make sense?
Thank you in advance..
Derek.
Unless you want to handle the contention by producing deadlocks, don't use SERIALIZABLE for this. SERIALIZABLE will take and hold Shared (S) locks in the first query, so concurrent transactions will both read the row, and enter into a deadlock as they both try to update it. One will be killed; the other will succeed, and the SERIALIZABLE semantics are preserved.
Instead you should put a restrictive lock on the target row as you read it.
eg:
BEGIN TRANSACTION

DECLARE @OwnerField varchar(20)

SET @OwnerField = (SELECT OwnerField
                   FROM [Table] WITH (UPDLOCK, HOLDLOCK)
                   WHERE RecordID = 2)

IF @OwnerField IS NULL -- Can own
BEGIN
    UPDATE [Table]
    SET OwnerField = 'John Smith'
    WHERE RecordID = 2
END

COMMIT TRANSACTION
(UPDLOCK, HOLDLOCK) gives you the same range-locking protection as the SERIALIZABLE isolation level, but uses a more restrictive lock, so concurrent transactions will block on the SELECT. The second reader will block until the first has committed, and will then see the updated OwnerField column.
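A rough sketch of how the hinted SELECT serializes two contenders (the session labels and interleaving are illustrative):
--Session A: BEGIN TRANSACTION; the hinted SELECT finds OwnerField NULL and takes an update lock on the row
--Session B: BEGIN TRANSACTION; the hinted SELECT blocks on session A's update lock
--Session A: UPDATE sets OwnerField = 'John Smith'; COMMIT releases the locks
--Session B: unblocks, reads OwnerField = 'John Smith', skips the UPDATE, and commits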
I am just wondering about snapshot behavior at the READ COMMITTED isolation level. Let's assume that I have a table named "A". Here is the first transaction:
Select blabla
From A
Insert Into A blabla
and the second transaction does the same:
Select blabla
From A
Insert Into A blabla
and assume that the timeline below occurs:
Tran1: select
Tran1: insert (not yet committed)
Tran2: select (I don't know whether it is possible or not)
Tran2: insert
As far as I know, at the standard READ COMMITTED isolation level, Tran2's SELECT query would be blocked because Tran1's INSERT command has not yet been committed or rolled back. But when "is_read_committed_snapshot" is enabled, I expect that no lock will be acquired during an INSERT or UPDATE command.
So what will happen to Tran2?
I expect that Tran2's SELECT query won't see the data inserted by Tran1, because that would be a "dirty read". But it wouldn't get blocked either.
Since (as I assume) Tran1's INSERT query does not acquire any lock, wouldn't this situation be a problem for the concurrency of executing these two transactions?
I expect that no lock will be acquired during an INSERT or UPDATE command.
That is wrong. Even if you have enabled RCSI, writers still block writers, and X locks are still acquired.
What differs between RC and RCSI is the reading behaviour.
Under pessimistic RC, the SELECT from Tran2 will be blocked on the X lock held on A; under RCSI, Tran2's SELECT will not be blocked: it will be provided with the last committed version of A, i.e. with the state of A before Tran1 modified it.
What happens then depends on your table organisation and on what you INSERT.
Some examples.
1) Table A is a heap, and you are doing a single-row insert in both transactions.
In this case your INSERT in Tran2 will succeed either way, whether or not you try to insert the same value in both transactions, because what the server acquires is IX on the table (compatible with the IX held by Tran1), IX on a page (also compatible with the IX held by Tran1, even if it is the same page), and X on a RID (while Tran1 holds X on another RID), so there is no conflict.
2) Table A is a clustered table, and you are trying to insert the same new key in both transactions.
In this case Tran2's INSERT will be blocked because of the conflict between two X locks on the same key: the first is held by Tran1, the second is requested by Tran2 and is blocked.
3) Table A is a clustered table, and you are trying to insert different keys.
Tran2's INSERT will succeed because the X lock on the key requested by Tran2 will be granted, as Tran1 holds IX on the table, IX on the page, and X on another key.
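A minimal sketch of the reading behaviour under RCSI, assuming a database named TestDb and that A has a single int column col1 (both names are hypothetical):
ALTER DATABASE TestDb SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;
GO
--Tran1:
BEGIN TRAN;
INSERT INTO A (col1) VALUES (1);  -- X lock on the new row, held until COMMIT
--Tran2 (READ COMMITTED, with RCSI on):
SELECT col1 FROM A;               -- not blocked; returns the last committed version of A
INSERT INTO A (col1) VALUES (2);  -- blocked or not, depending on the cases above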
Let's say you're doing it this way in session 1:
SELECT id FROM customers
BEGIN TRAN new_tran
UPDATE customers
SET ID = '1'
WHERE ID = '01'
If your query in session 2 is something like this:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT
GO
BEGIN TRAN
SELECT *
FROM customers
WHERE id = '01'
Result: even though session 1 has changed the value from '01' to '1', session 2 will still see the old record, because it reads the last committed version of the row.
Now, let's commit the transaction in session 1.
If you then commit the transaction in session 2 as well and run the SELECT again, you'll get the new, updated value:
COMMIT
SELECT *
FROM customers
WHERE ID = '1'
You can read more about it on Pinal Dave's blog: blog.sqlauthority.com/2015/07/03/sql-server-difference-between-read-committed-snapshot-and-snapshot-isolation-level/
I'm trying to understand a problem I have run into that I don't believe should be possible when dealing with transactions at the READ COMMITTED isolation level. I have a table that is being used as a queue. In one thread (connection 1) I insert multiple batches of 20 records into the table, each batch of 20 inside its own transaction. In a second thread (connection 2) I perform an update to change the status of the records that have been inserted into the queue, which also occurs inside a transaction. When running concurrently, my expectation is that the number of rows affected by the update (connection 2) should be a multiple of 20, since connection 1 inserts rows into the table in batches of 20 within a transaction.
But my testing shows this is not always the case, and on occasion I'm able to update a subset of records from connection 1's batch. Should this be possible or am I missing something about transactions, concurrency, and isolation levels? Below is a set of test scripts I created to reproduce this issue in T-SQL.
This script inserts 20,000 records into the table in transaction batches of 20.
USE ReadTest
GO
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
GO
SET NOCOUNT ON

DECLARE @trans_id INTEGER
DECLARE @cmd_id INTEGER
DECLARE @text_str VARCHAR(4000)

SET @trans_id = 0
SET @text_str = 'Placeholder String Value'

-- First empty the table
DELETE FROM TABLE_A

WHILE @trans_id < 1000 BEGIN
    SET @trans_id = @trans_id + 1
    SET @cmd_id = 0
    BEGIN TRANSACTION
        -- Insert 20 records into the table per transaction
        WHILE @cmd_id < 20 BEGIN
            SET @cmd_id = @cmd_id + 1
            INSERT INTO TABLE_A ( transaction_id, command_id, [type], status, text_field )
            VALUES ( @trans_id, @cmd_id, 1, 1, @text_str )
        END
    COMMIT
END
PRINT 'DONE'
This script updates the records in the table, changing the status from 1 to 2, and then checks the rowcount from the update operation. When the rowcount is not a multiple of 20, a print statement reports this along with the number of rows affected.
USE ReadTest
GO
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
GO
SET NOCOUNT ON

DECLARE @loop_counter INTEGER
DECLARE @trans_id INTEGER
DECLARE @count INTEGER

SET @loop_counter = 0

WHILE @loop_counter < 100000 BEGIN
    SET @loop_counter = @loop_counter + 1
    BEGIN TRANSACTION
        UPDATE TABLE_A SET status = 2
        WHERE status = 1
          AND type = 1
        SET @count = @@ROWCOUNT
    COMMIT

    IF ( @count % 20 <> 0 ) BEGIN
        -- Records in concurrent transaction inserting in batches of 20 records before commit.
        PRINT '*** Rowcount not a multiple of 20. Count = ' + CAST(@count AS VARCHAR) + ' ***'
    END
    IF @count > 0 BEGIN
        -- Delete the records where the status was changed.
        DELETE TABLE_A WHERE status = 2
    END
END
PRINT 'DONE'
This script creates the test queue table in a new database called ReadTest.
USE master;
GO
IF EXISTS (SELECT * FROM sys.databases WHERE name = 'ReadTest')
BEGIN;
DROP DATABASE ReadTest;
END;
GO
CREATE DATABASE ReadTest;
GO
ALTER DATABASE ReadTest
SET ALLOW_SNAPSHOT_ISOLATION OFF
GO
ALTER DATABASE ReadTest
SET READ_COMMITTED_SNAPSHOT OFF
GO
USE ReadTest
GO
CREATE TABLE [dbo].[TABLE_A](
[ROWGUIDE] [uniqueidentifier] NOT NULL,
[TRANSACTION_ID] [int] NOT NULL,
[COMMAND_ID] [int] NOT NULL,
[TYPE] [int] NOT NULL,
[STATUS] [int] NOT NULL,
[TEXT_FIELD] [varchar](4000) NULL,
CONSTRAINT [PK_TABLE_A] PRIMARY KEY NONCLUSTERED
(
[ROWGUIDE] ASC
) ON [PRIMARY]
) ON [PRIMARY]
ALTER TABLE [dbo].[TABLE_A] ADD DEFAULT (newsequentialid()) FOR [ROWGUIDE]
GO
Your expectations are completely misplaced. You have never expressed in your query the requirement to 'dequeue' exactly 20 rows. The UPDATE can affect 0, 19, 20, 21 or 1000 rows and all results are correct, as long as the status is 1 and the type is 1. If you expect the 'dequeue' to occur in the order of the 'enqueue' (which is somewhat alluded to in your question, but never explicitly stated), then your 'dequeue' operation must contain an ORDER BY clause. Had you added such an explicitly stated requirement, your expectation that the 'dequeue' always returns an entire batch of 'enqueue' rows (i.e. a multiple of 20 rows) would be one step closer to being reasonable. As things stand right now, it is, as I said, completely misplaced.
For a lengthier discussion see Using Tables as Queues.
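For illustration, a hedged sketch of an ordered 'dequeue' applied to the question's table (the hints, the TOP size, and using (transaction_id, command_id) as the ordering key are assumptions, not the article's exact code):
-- Take the 20 oldest pending rows, in enqueue order, skipping rows locked by
-- other dequeuers (READPAST) and locking the chosen ones (UPDLOCK):
WITH q AS (
    SELECT TOP (20) status
    FROM TABLE_A WITH (ROWLOCK, UPDLOCK, READPAST)
    WHERE status = 1 AND [type] = 1
    ORDER BY transaction_id, command_id
)
UPDATE q SET status = 2;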
I shouldn't be concerned that while one transaction is committing a batch of 20 inserted records, another concurrent transaction is only able to update a subset of those records and not all 20?
Basically the question boils down to: if I SELECT while I INSERT, how many inserted rows will I see? You only have a right to be concerned if the isolation level is declared as SERIALIZABLE. None of the other isolation levels makes any prediction about how many of the rows inserted while the UPDATE was running will be visible. Only SERIALIZABLE states that the outcome has to be the same as running the two statements one after another (i.e. serialized, hence the name). While the technical details of how the UPDATE 'sees' only part of the INSERT batch are easy to understand once you consider physical order and the lack of an ORDER BY clause, the explanation is irrelevant. The fundamental issue is that the expectation is unwarranted. Even if the 'issue' is 'fixed' by adding a proper ORDER BY and the correct clustered index key (the article linked above explains the details), the expectation is still unwarranted. It will still be perfectly legal for the UPDATE to 'see' 1, 19 or 21 rows, although that will be unlikely to happen.
I guess I've always understood READ COMMITTED to only read committed data, and that a transaction commit is an atomic operation, making all the changes that occurred in the transaction available at once.
That is correct. What is incorrect is to expect a concurrent SELECT (or UPDATE) to see the entire change, regardless of where it happens to be in its execution. Open an SSMS query and run the following:
use tempdb;
go
create table test (a int not null primary key, b int);
go
insert into test (a, b) values (5,0)
go
begin transaction
insert into test (a, b) values (10,0)
Now open a new SSMS query and run the following:
update test
set b=1
output inserted.*
where b=0
This will block behind the uncommitted INSERT. Now go back to the first query window and run the following:
insert into test (a, b) values (1,0)
commit
When this commits, the second SSMS query will finish, and it will return two rows, not three. QED. This is READ COMMITTED. What you expect is SERIALIZABLE execution (in which case the example above will deadlock).
It could happen like this:
The writer/inserter writes 20 rows (does not commit)
The reader/updater reads one row (which is not committed - it discards it)
The writer/inserter commits
The reader/updater reads 19 rows which are now committed thus visible
I believe that only the SERIALIZABLE isolation level (or snapshot isolation, which allows more concurrency) fixes this.
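A minimal sketch of the snapshot option, reusing the ReadTest database from the question's setup script (whether SNAPSHOT suits the rest of the workload is an assumption to verify):
ALTER DATABASE ReadTest SET ALLOW_SNAPSHOT_ISOLATION ON;
GO
-- In the reader/updater connection:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION
    -- The UPDATE reads a transactionally consistent snapshot, so it sees either
    -- all 20 rows of a committed batch or none of them.
    UPDATE TABLE_A SET status = 2
    WHERE status = 1 AND [type] = 1;
COMMIT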