better than a cursor tsql - sql-server

The question here is simple (although I am prepared for the answer not to be): how can I make this query more efficient?
In a nutshell, it copies records. It selects X records from Days, then duplicates each one and captures the new identifier. Using the IDs of the original and new records, it then copies the original record's rows in another table (Periods) under the new identifier.
This takes a long time. Can you help shorten it?
DECLARE DaysToDuplicateCursor CURSOR FAST_FORWARD FOR
SELECT
DayId
FROM [Days]
WHERE AgentId IN ('XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX')
AND PersonAgentId IN (
'YYYYYYYY-YYYY-YYYY-YYYY-YYYYYYYYYYYY'
,'WWWWWWWW-WWWW-WWWW-WWWW-WWWWWWWWWWWW'
,'ZZZZZZZZ-ZZZZ-ZZZZ-ZZZZ-ZZZZZZZZZZZZ'
,'TTTTTTTT-TTTT-TTTT-TTTT-TTTTTTTTTTTT'
)
DECLARE @Id INT
OPEN DaysToDuplicateCursor
FETCH NEXT FROM DaysToDuplicateCursor INTO @Id
WHILE @@FETCH_STATUS = 0
BEGIN
--
-- Insert Days data.
--
INSERT INTO [Days] (
[DayTemplateId]
,[DayDate]
)
SELECT [DayTemplateId]
,DATEADD(YEAR,-1,[DayDate]) AS [DayDate]
FROM [Days] WHERE [DayId] = @Id
--
-- Insert Periods data.
--
INSERT INTO [Periods] (
[DayId]
,[PeriodTemplateId]
)
SELECT
SCOPE_IDENTITY()
,[PeriodTemplateId]
FROM [Periods] WHERE [DayId] = @Id
--
-- Fetch the next day to duplicate (without this the loop never ends).
--
FETCH NEXT FROM DaysToDuplicateCursor INTO @Id
END
CLOSE DaysToDuplicateCursor
DEALLOCATE DaysToDuplicateCursor

You do not need a cursor at all if you use the OUTPUT clause instead of asking for SCOPE_IDENTITY(). Send the OUTPUT rows into a table variable. You will also want to return, in the OUTPUT clause, any other columns that uniquely identify each record, so you can join on them to get the data you need in the subsequent inserts.
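A minimal sketch of that idea, under some assumptions: DayId is taken to be an INT IDENTITY column, the table variable @DayMap is a made-up name, and the server is SQL Server 2008 or later. Note that it swaps in MERGE for a plain INSERT. The OUTPUT clause of a plain INSERT can only see the inserted columns, whereas MERGE can also output source columns, which makes the old-to-new id mapping direct.

```sql
-- Hypothetical mapping table: original DayId -> newly generated DayId.
DECLARE @DayMap TABLE (OldDayId INT NOT NULL, NewDayId INT NOT NULL);

-- ON 1 = 0 never matches, so every source row goes down the INSERT branch.
MERGE INTO [Days] AS tgt
USING (
    SELECT DayId, DayTemplateId, DayDate
    FROM [Days]
    WHERE AgentId IN ('XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX')
      AND PersonAgentId IN (
           'YYYYYYYY-YYYY-YYYY-YYYY-YYYYYYYYYYYY'
          ,'WWWWWWWW-WWWW-WWWW-WWWW-WWWWWWWWWWWW'
          ,'ZZZZZZZZ-ZZZZ-ZZZZ-ZZZZ-ZZZZZZZZZZZZ'
          ,'TTTTTTTT-TTTT-TTTT-TTTT-TTTTTTTTTTTT'
      )
) AS src
ON 1 = 0
WHEN NOT MATCHED THEN
    INSERT (DayTemplateId, DayDate)
    VALUES (src.DayTemplateId, DATEADD(YEAR, -1, src.DayDate))
OUTPUT src.DayId, inserted.DayId INTO @DayMap (OldDayId, NewDayId);

-- Copy every period of every original day in one statement.
INSERT INTO [Periods] (DayId, PeriodTemplateId)
SELECT m.NewDayId, p.PeriodTemplateId
FROM [Periods] AS p
JOIN @DayMap AS m ON m.OldDayId = p.DayId;
```

Both statements are set-based, so the per-row cursor round trips disappear. Wrapping the two statements in one transaction would keep Days and Periods consistent if the second insert fails.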

Related

Trigger AFTER INSERT, UPDATE, DELETE to call stored procedure with table name and primary key

For a sync process, my SQL Server database should record a list of items that have changed: table name and primary key.
The DB already has a table and stored procedure to do this:
EXEC @ErrCode = dbo.SyncQueueItem "tableName", 1234;
I'd like to add triggers to a table to call this stored procedure on INSERT, UPDATE, DELETE. How do I get the key? What's the simplest thing that could possibly work?
CREATE TABLE new_employees
(
id_num INT IDENTITY(1,1),
fname VARCHAR(20),
minit CHAR(1),
lname VARCHAR(30)
);
GO
IF OBJECT_ID ('dbo.sync_new_employees','TR') IS NOT NULL
DROP TRIGGER sync_new_employees;
GO
CREATE TRIGGER sync_new_employees
ON new_employees
AFTER INSERT, UPDATE, DELETE
AS
DECLARE @Key Int;
DECLARE @ErrCode Int;
-- How to get the key???
SELECT @Key = 12345;
EXEC @ErrCode = dbo.SyncQueueItem "new_employees", @Key;
GO
The way to access the records changed by the operation is by using the Inserted and Deleted pseudo-tables that are provided to you by SQL Server.
Inserted contains any inserted records, or any updated records with their new values.
Deleted contains any deleted records, or any updated records with their old values.
When writing a trigger, to be safe, one should always code for the case when multiple records are acted upon. Unfortunately if you need to call a SP that means a loop - which isn't ideal.
The following code shows how this could be done for your example, and includes a method of detecting whether the operation is an Insert/Update/Delete.
declare @Key int, @ErrCode int, @Action varchar(6);
declare @Keys table (id int, [Action] varchar(6));
insert into @Keys (id, [Action])
select coalesce(I.id_num, D.id_num)
, case when I.id_num is not null and D.id_num is not null then 'Update' when I.id_num is not null then 'Insert' else 'Delete' end
from Inserted I
full join Deleted D on I.id_num = D.id_num;
while exists (select 1 from @Keys) begin
select top 1 @Key = id, @Action = [Action] from @Keys;
exec @ErrCode = dbo.SyncQueueItem 'new_employees', @Key;
delete from @Keys where id = @Key;
end
Further: in addition to solving your specified problem, it's worth noting a couple of points regarding the bigger picture.
As @Damien_The_Unbeliever points out, there are built-in mechanisms for change tracking which will perform much better.
If you still wish to handle your own change tracking, it would perform better if you could arrange to handle the entire recordset in one go, as opposed to carrying out a row-by-row operation. There are 2 ways to accomplish this: a) move your change tracking code inside the trigger and don't use a SP; b) use a "User Defined Table Type" to pass the record-set of changes to the SP.
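Option b) could look roughly like the following sketch. Everything named here except new_employees and its columns is hypothetical: the table type dbo.SyncKeyList, the procedure dbo.SyncQueueItems (a set-based counterpart of the existing dbo.SyncQueueItem), and the queue table dbo.SyncQueue that it writes to. Table-valued parameters need SQL Server 2008 or later.

```sql
-- Hypothetical table type carrying the changed keys and the action taken.
CREATE TYPE dbo.SyncKeyList AS TABLE (
    id       INT        NOT NULL,
    [Action] VARCHAR(6) NOT NULL
);
GO
-- Hypothetical set-based procedure; dbo.SyncQueue is an assumed queue table.
CREATE PROCEDURE dbo.SyncQueueItems
    @TableName SYSNAME,
    @Keys      dbo.SyncKeyList READONLY  -- table-valued parameters must be READONLY
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.SyncQueue (TableName, KeyValue, [Action])
    SELECT @TableName, k.id, k.[Action]
    FROM @Keys AS k;
END
GO
CREATE TRIGGER sync_new_employees
ON new_employees
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @Keys dbo.SyncKeyList;
    -- One row per affected key, whatever the size of the DML statement.
    INSERT INTO @Keys (id, [Action])
    SELECT COALESCE(i.id_num, d.id_num),
           CASE WHEN i.id_num IS NOT NULL AND d.id_num IS NOT NULL THEN 'Update'
                WHEN i.id_num IS NOT NULL THEN 'Insert'
                ELSE 'Delete' END
    FROM inserted AS i
    FULL JOIN deleted AS d ON i.id_num = d.id_num;
    -- A single call replaces the row-by-row loop.
    EXEC dbo.SyncQueueItems 'new_employees', @Keys;
END
GO
```

The trigger body stays the same size however many rows the statement touches, which is the main point of passing the whole set at once.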
You should use the Magic Table to get the data.
Usually, inserted and deleted tables are called Magic Tables in the context of a trigger. There are Inserted and Deleted magic tables in SQL Server. These tables are automatically created and managed by SQL Server internally to hold recently inserted, deleted and updated values during DML operations (Insert, Update and Delete) on a database table.
Inserted magic table
The Inserted table holds the recently inserted values, in other words, new data values. Hence recently added records are inserted into the Inserted table.
Deleted magic table
The Deleted table holds the recently deleted or updated values, in other words, old data values. Hence the old updated and deleted records are inserted into the Deleted table.
You can use the inserted and deleted magic tables to get the value of id_num:
SELECT top 1 @Key = id_num from inserted
Note: This code sample only works for a single-record insert. For bulk insert/update scenarios you need to copy the rows from the inserted and deleted tables into a temp table or table variable and then loop through it to call your procedure, or you can pass a table variable to your procedure and handle the multiple records there.
A DML trigger should operate on sets of rows; otherwise only one row will be processed. It can be something like this, and of course it uses the magic tables inserted and deleted.
CREATE TRIGGER dbo.tr_employees
ON dbo.employees --the table from Northwind database
AFTER INSERT,DELETE,UPDATE
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
declare @tbl table (id int identity(1,1),delId int,insId int)
--Use "magic tables" inserted and deleted
insert @tbl(delId, insId)
select d.EmployeeID, i.EmployeeID
from inserted i --empty when "delete"
full join deleted d --empty when "insert"
on i.EmployeeID=d.EmployeeID
declare @id int,@key int,@action char
select top 1 @id=id, @key=isnull(delId, insId),
@action=case
when delId is null then 'I'
when insId is null then 'D'
else 'U' end --just in case you need the operation executed
from @tbl
--do something for each row
while @id is not null --instead of cursor
begin
--do the main action
--exec dbo.sync 'employees', @key, @action
--remove processed row
delete @tbl where id=@id
--reset @id first: an assignment SELECT leaves it unchanged when no rows remain
set @id=null
--refill @variables
select top 1 @id=id, @key=isnull(delId, insId),
@action=case
when delId is null then 'I'
when insId is null then 'D'
else 'U' end --just in case you need the operation executed
from @tbl
end
END
Not the best solution, but just a direct answer on the question:
SELECT @Key = COALESCE(d.id_num, i.id_num)
FROM inserted i
FULL JOIN deleted d ON d.id_num = i.id_num;
Also not the best way (if not the worst) (do not try this at home), but at least it will help with multiple values:
DECLARE @Key INT;
DECLARE @ErrCode INT;
DECLARE triggerCursor CURSOR LOCAL FAST_FORWARD READ_ONLY
FOR SELECT COALESCE(i.id_num,d.id_num) AS [id_num]
FROM inserted i
FULL JOIN deleted d ON d.id_num = i.id_num
WHERE (
COALESCE(i.fname,'')<>COALESCE(d.fname,'')
OR COALESCE(i.minit,'')<>COALESCE(d.minit,'')
OR COALESCE(i.lname,'')<>COALESCE(d.lname,'')
)
;
OPEN triggerCursor;
FETCH NEXT FROM triggerCursor INTO @Key;
WHILE @@FETCH_STATUS = 0
BEGIN
EXEC @ErrCode = dbo.SyncQueueItem 'new_employees', @Key;
FETCH NEXT FROM triggerCursor INTO @Key;
END
CLOSE triggerCursor;
DEALLOCATE triggerCursor;
A better way to build a trigger-based "value-change-tracker":
INSERT INTO [YourTableHistoryName] (id_num, fname, minit, lname, WhenHappened)
SELECT COALESCE(i.id_num,d.id_num) AS [id_num]
,i.fname,i.minit,i.lname,CURRENT_TIMESTAMP AS [WhenHappened]
FROM inserted i
FULL JOIN deleted d ON d.id_num = i.id_num
WHERE ( COALESCE(i.fname,'')<>COALESCE(d.fname,'')
OR COALESCE(i.minit,'')<>COALESCE(d.minit,'')
OR COALESCE(i.lname,'')<>COALESCE(d.lname,'')
)
;
The best way (in my opinion) to track changes is to use temporal tables (SQL Server 2016+).
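For illustration, a system-versioned temporal table for the example schema might be declared as below. This is only a sketch: the table name dbo.new_employees_temporal, the history table name and the period column names ValidFrom/ValidTo are made up, and a primary key is added because temporal tables require one.

```sql
CREATE TABLE dbo.new_employees_temporal
(
    id_num    INT IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
    fname     VARCHAR(20),
    minit     CHAR(1),
    lname     VARCHAR(30),
    -- Period columns are maintained by SQL Server, not by the application.
    ValidFrom DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo   DATETIME2 GENERATED ALWAYS AS ROW END NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.new_employees_history));
GO
-- Every version of one employee row, current and historical.
SELECT id_num, fname, minit, lname, ValidFrom, ValidTo
FROM dbo.new_employees_temporal FOR SYSTEM_TIME ALL
WHERE id_num = 1
ORDER BY ValidFrom;
```

Old row versions land in the history table on every UPDATE and DELETE without any trigger code.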
inserted/deleted in triggers will contain as many rows as were touched, and calling a stored proc per key would require a cursor or a similar row-by-row approach.
You should look at timestamp/rowversion in SQL Server. You could add such a column to all the tables in question (not null, auto-incrementing, unique within the database for each table/row, etc.).
You could add a unique index on that column in every table where you added it.
@@DBTS is the current timestamp; you can store today's @@DBTS and tomorrow scan all tables from that value up to the current @@DBTS. timestamp/rowversion is incremented for all updates and inserts, but it won't track deletes; for those you can have a delete-only trigger that inserts the keys into a different table.
Change data capture or change tracking could do this more easily, but if there are heavy volumes on the server or a large number of data loads or partition switches, scanning the transaction log becomes a bottleneck, and in some cases you will have to remove change data capture to stop the transaction log from growing indefinitely.
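The rowversion polling described above could be sketched like this. The column name RowVer, the index name and the bookkeeping table dbo.SyncWatermark are all assumptions made for the example.

```sql
-- Add a rowversion column and index it (names are illustrative).
ALTER TABLE dbo.new_employees ADD RowVer ROWVERSION;
GO
CREATE UNIQUE INDEX IX_new_employees_RowVer ON dbo.new_employees (RowVer);
GO
-- Hypothetical table remembering how far the last sync got.
CREATE TABLE dbo.SyncWatermark
(
    TableName  SYSNAME   NOT NULL PRIMARY KEY,
    LastRowVer BINARY(8) NOT NULL
);
INSERT INTO dbo.SyncWatermark (TableName, LastRowVer)
VALUES ('new_employees', 0x0000000000000000);
GO
-- One sync pass: everything inserted or updated since the stored watermark.
DECLARE @From BINARY(8), @To BINARY(8);
SELECT @From = LastRowVer FROM dbo.SyncWatermark WHERE TableName = 'new_employees';
SET @To = @@DBTS;  -- last rowversion value used in this database

SELECT id_num
FROM dbo.new_employees
WHERE RowVer > @From AND RowVer <= @To;

UPDATE dbo.SyncWatermark SET LastRowVer = @To WHERE TableName = 'new_employees';
```

One caveat with this simple form: a transaction still open while the pass runs can commit rows whose rowversion is below @To after the watermark has moved on. MIN_ACTIVE_ROWVERSION() exists to help with that case.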

Trigger for updating total records on both insert and delete

I'm writing a trigger to store the record count of one table as a column in another to speed up some reporting queries on a large db.
Here's what I've got so far. It works fine on deletes, but I also need it to work on inserts. Do I need to use a separate trigger? Also, is the cursor necessary, or is there a more efficient way?
Thanks!
ALTER TRIGGER [dbo].[updateSourceTotals]
ON [dbo].imports
AFTER INSERT, DELETE
AS
BEGIN
SET NOCOUNT ON;
DECLARE @sourceId int;
DECLARE deleteCursor CURSOR FOR SELECT DISTINCT sourceId FROM deleted
OPEN deleteCursor
FETCH NEXT FROM deleteCursor INTO @sourceId
WHILE @@FETCH_STATUS = 0
BEGIN
UPDATE sources
SET totalImports = (
SELECT COUNT(*)
FROM imports
WHERE sourceId = @sourceId
)
WHERE id = @sourceId
FETCH NEXT FROM deleteCursor INTO @sourceId
END
CLOSE deleteCursor
DEALLOCATE deleteCursor
END
GO
If you are really set on the Trigger approach (and I do NOT recommend it) then this is a much simpler and probably faster version of your current code:
ALTER TRIGGER [dbo].[updateSourceTotals]
ON [dbo].imports
AFTER INSERT, DELETE
AS
BEGIN
UPDATE s
SET totalImports = (
SELECT COUNT(*)
FROM imports i
WHERE i.sourceId = s.Id
)
FROM sources s
WHERE s.id IN(SELECT sourceId FROM deleted)
END
If you want to cover INSERTs also, this should do it:
ALTER TRIGGER [dbo].[updateSourceTotals]
ON [dbo].imports
AFTER INSERT, DELETE
AS
BEGIN
UPDATE s
SET totalImports = (
SELECT COUNT(*)
FROM imports i
WHERE i.sourceId = s.id
)
FROM sources s
WHERE s.id IN(
SELECT sourceId FROM deleted
UNION
SELECT sourceId FROM inserted
)
END
As an added bonus, it should work for UPDATEs as well.
Just to clarify, the problem with doing pre-aggregation in a Trigger, even after you eliminate the Cursor, is that instead of re-calculating the query on each request, you are instead re-calculating them on each modification.
Even in the abstract, this is only a win if you do many such requests, but do not modify the table very much. However, in the real context of an active DBMS server, you lose most of even this small advantage too, because if you are making many such requests, then they are probably getting cached very effectively (in turn, because reads are much more cache-effective than writes).

Is there an efficient way of transferring all the rows of one table to another table, one record at a time?

The requirement is to transfer all records inserted into one table to another. I've used a trigger to do it. I'm looping through all the inserted records and inserting them into the new table one record at a time, as I have to increment a sequence number in the destination table. But this loop gets considerably slower as the number of inserted rows increases. Is there a better way of doing this?
Declare @maxpk int, @count int, @seq int
set @maxpk=(select max(refno) from inserted )
set @count=(select count(1) from inserted)
set @seq=((select max(seq_no) from dbase.dbo.destination))
while @count>0
begin
set @seq=(select @seq+1)
insert into dbase.dbo.destination(orderno,SEQ_NO,PRODUCT_ID,qty)
select ordernumber,@seq,productid ,quantity
from inserted where refno=@maxpk
set @count=(select @count-1)
set @maxpk=(select top 1 refno from inserted where refno<@maxpk)
end
refno is the primary key of the source table. Is there a way to check for the end of the inserted records so I don't have to initialize and maintain a loop counter?
And can the loop be executed for each record in the inserted table, so I don't have to find the next record to insert by comparing primary key values?
Using MSSQL 2005.
This should handle concurrency ok but I really think you need to revisit the design (e.g. make seq_no an IDENTITY column, then the system generates the unique values for you, and handles concurrency too).
CREATE TRIGGER dbo.SourceTrg ON dbo.Source
AFTER INSERT
AS
BEGIN
SET NOCOUNT ON;
DECLARE @seq INT;
SET @seq = COALESCE((SELECT MAX(seq_no)
FROM dbase.dbo.Destination WITH (TABLOCKX, HOLDLOCK)), 0);
INSERT dbase.dbo.Destination(SEQ_NO, orderno, PRODUCT_ID, qty)
SELECT @seq + ROW_NUMBER() OVER (ORDER BY refno),
ordernumber, productid, quantity
FROM inserted;
END
GO
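If the design can be revisited as suggested, the IDENTITY variant might look like the sketch below. The column data types are guesses, since the question does not show the table definitions.

```sql
-- Assumed shape of the destination table, with SEQ_NO generated by the server.
CREATE TABLE dbase.dbo.Destination
(
    SEQ_NO     INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    orderno    INT NOT NULL,
    PRODUCT_ID INT NOT NULL,
    qty        INT NOT NULL
);
GO
CREATE TRIGGER dbo.SourceTrg ON dbo.Source
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    -- No MAX() lookup and no table lock: the IDENTITY property hands out
    -- SEQ_NO values, and ORDER BY controls the order in which they are assigned.
    INSERT dbase.dbo.Destination (orderno, PRODUCT_ID, qty)
    SELECT ordernumber, productid, quantity
    FROM inserted
    ORDER BY refno;
END
GO
```

The trade-off is that IDENTITY values can have gaps, for example after a rolled-back insert, so this only fits if SEQ_NO needs to be unique and increasing rather than strictly consecutive.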

SQL Server Express 2005 - updating 2 tables and atomicity?

First off, I want to start by saying I am not an SQL programmer (I'm a C++/Delphi guy), so some of my questions might be really obvious. So pardon my ignorance :o)
I've been charged with writing a script that will update certain tables in a database based on the contents of a CSV file. I have it working, it would seem, but I am worried about atomicity for one of the steps:
One of the tables contains only one field - an int which must be incremented each time, but from what I can see is not defined as an identity for some reason. I must create a new row in this table, and insert that row's value into another newly-created row in another table.
This is how I did it (as part of a larger script):
DECLARE @uniqueID INT,
@counter INT,
@maxCount INT
SELECT @maxCount = COUNT(*) FROM tempTable
SET @counter = 1
WHILE (@counter <= @maxCount)
BEGIN
SELECT @uniqueID = MAX(id) FROM uniqueIDTable -- <---- Line 1
INSERT INTO uniqueIDTable VALUES (@uniqueID + 1) -- <---- Line 2
SELECT @uniqueID = @uniqueID + 1
UPDATE TOP(1) tempTable
SET userID = @uniqueID
WHERE userID IS NULL
SET @counter = @counter + 1
END
GO
First of all, am I correct using a "WHILE" construct? I couldn't find a way to achieve this with a simple UPDATE statement.
Second of all, how can I be sure that no other operation will be carried out on the database between Lines 1 and 2 that would insert a value into the uniqueIDTable before I do? Is there a way to "synchronize" operations in SQL Server Express?
Also, keep in mind that I have no control over the database design.
Thanks a lot!
You can do the whole 9 yards in one single statement:
WITH cteUsers AS (
SELECT t.*
, ROW_NUMBER() OVER (ORDER BY userID) as rn
, COALESCE(m.id,0) as max_id
FROM tempTable t WITH(UPDLOCK)
JOIN (
SELECT MAX(id) as id
FROM uniqueIDTable WITH (UPDLOCK)
) as m ON 1=1
WHERE userID IS NULL)
UPDATE cteUsers
SET userID = rn + max_id
OUTPUT INSERTED.userID
INTO uniqueIDTable (id);
You get the MAX(id), lock the uniqueIDTable, compute sequential userIDs for users with NULL userID by using ROW_NUMBER(), update the tempTable and insert the new ids into uniqueIDTable. All in one operation.
For performance you need an index on uniqueIDTable(id) and an index on tempTable(userID).
SQL is all about set-oriented operations; WHILE loops are the code smell of SQL.
You need a transaction to ensure atomicity, and you need to either move the select and insert into one statement or do the select with an UPDLOCK. Otherwise two people could run the select at the same time, get the same value, and then try to insert the same value into the table.
Basically
DECLARE @MaxValTable TABLE (MaxID int)
BEGIN TRANSACTION
BEGIN TRY
-- Select and insert in one statement; OUTPUT captures the new id
INSERT INTO uniqueIDTable (id)
OUTPUT inserted.id INTO @MaxValTable (MaxID)
SELECT MAX(id) + 1 FROM uniqueIDTable
UPDATE TOP(1) tempTable
SET userID = (SELECT MaxID FROM @MaxValTable)
WHERE userID IS NULL
COMMIT TRANSACTION
END TRY
BEGIN CATCH
ROLLBACK TRANSACTION
RAISERROR('Error occurred updating tempTable', 16, 1) -- more detail here is good
END CATCH
That said, using an identity would make things far simpler. This is a potential concurrency problem. Is there any way you can change the column to be identity?
Edit: Ensuring that only one connection at a time will be able to insert into the uniqueIDtable. Not going to scale well though.
Edit: Table variable's better than exclusive table lock. If need be, this can be used when inserting users as well.

Efficient transaction, record locking

I've got a stored procedure that selects one record back. The stored procedure could be called from several different applications on different PCs. The idea is that the stored procedure brings back the next record that needs to be processed, and if two applications call the stored proc at the same time, the same record should not be brought back. My query is below; I'm trying to write it as efficiently as possible (SQL 2008). Can it be done more efficiently than this?
CREATE PROCEDURE GetNextUnprocessedRecord
AS
BEGIN
SET NOCOUNT ON;
--ID of record we want to select back
DECLARE @iID BIGINT
-- Find the next processable record, and mark it as dispatched
-- Must be done in a transaction to ensure no other query can get
-- this record between the read and update
BEGIN TRAN
SELECT TOP 1
@iID = [ID]
FROM
--Don't read locked records, only lock the specific record
[MyRecords] WITH (READPAST, ROWLOCK)
WHERE
[Dispatched] is null
ORDER BY
[Received]
--Mark record as picked up for processing
UPDATE
[MyRecords]
SET
[Dispatched] = GETDATE()
WHERE
[ID] = @iID
COMMIT TRAN
--Select back the specific record
SELECT
[ID],
[Data]
FROM
[MyRecords] WITH (NOLOCK, READPAST)
WHERE
[ID] = @iID
END
Using the READPAST locking hint is correct and your SQL looks OK.
I'd add XLOCK though, which takes an exclusive lock and holds it until the transaction completes:
...
[MyRecords] WITH (READPAST, ROWLOCK, XLOCK)
...
This means you get the ID, and exclusively lock that row while you carry on and update it.
Edit: add an index on the Dispatched and Received columns to make it quicker. If [ID] (I assume it's the PK) is not clustered, INCLUDE [ID]. And filter the index too, because it's SQL 2008, which supports filtered indexes.
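As a rough sketch of that index, assuming the table lives in the dbo schema and using a made-up index name: because the filter already restricts the index to undispatched rows, Received alone can serve as the key here.

```sql
-- Filtered index: only rows still waiting to be dispatched are indexed,
-- so it stays small however large MyRecords grows.
CREATE NONCLUSTERED INDEX IX_MyRecords_Undispatched
ON dbo.MyRecords (Received)
INCLUDE (ID)               -- only needed if ID is not the clustered key
WHERE Dispatched IS NULL;
```

The ORDER BY Received lookup for the next undispatched row can then read the first entry of this index instead of scanning the table.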
You could also use this construct which does it all in one go without XLOCK or HOLDLOCK
UPDATE
MyRecords
SET
--record the row ID
@id = [ID],
--flag doing stuff
[Dispatched] = GETDATE()
WHERE
[ID] = (SELECT TOP 1 [ID] FROM MyRecords WITH (ROWLOCK, READPAST) WHERE Dispatched IS NULL ORDER BY Received)
UPDATE, assign, set in one
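A related variation, not taken from the answers above, is to let the UPDATE return the row itself through the OUTPUT clause, so the final SELECT in the procedure is not needed either. This is a sketch under the assumption that the table is dbo.MyRecords with the columns shown in the question; the CTE name nextRecord is made up.

```sql
-- Pick the oldest undispatched row, skipping rows locked by other callers.
WITH nextRecord AS (
    SELECT TOP (1) [ID], [Data], [Dispatched]
    FROM dbo.MyRecords WITH (ROWLOCK, READPAST, UPDLOCK)
    WHERE [Dispatched] IS NULL
    ORDER BY [Received]
)
UPDATE nextRecord
SET [Dispatched] = GETDATE()
OUTPUT inserted.[ID], inserted.[Data];  -- returned to the caller as a result set
```

If no undispatched row is available, the statement updates nothing and returns an empty result set, which the calling application can treat as "queue empty".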
You can assign each picker process a unique id, and add columns pickerproc and pickstate to your records. Then
UPDATE MyRecords
SET pickerproc = myproc,
pickstate = 'I' -- for 'I'n process
WHERE Id = (SELECT MAX(Id) FROM MyRecords WHERE pickstate = 'A') -- 'A'vailable
That gets you your record in one atomic step, and you can do the rest of your processing at your leisure. Then you can set pickstate to 'C'omplete, 'E'rror, or whatever when it's resolved.
I think Mitch is referring to another good technique where you create a message-queue table and insert the Ids there. There are several SO threads - search for 'message queue table'.
You can keep MyRecords on a "MEMORY" table for faster processing.
