Deleting records from SQL Server table without cursor - sql-server

I am trying to selectively delete records from a SQL Server 2005 table without looping through a cursor. The table can contain many records (sometimes > 500,000) so looping is too slow.
Data:
ID, UnitID, Day, Interval, Amount
1 100 10 21 9.345
2 100 10 22 9.367
3 200 11 21 4.150
4 300 11 21 4.350
5 300 11 22 4.734
6 300 11 23 5.106
7 400 13 21 10.257
8 400 13 22 10.428
Key is: ID, UnitID, Day, Interval.
In this example I wish to delete Records 2, 5 and 8 - they are adjacent to an existing record (based on the key).
Note: record 6 would not be deleted because once 5 is gone it is not adjacent any longer.
Am I asking too much?

See these articles in my blog for performance detail:
SQL Server: deleting adjacent values
SQL Server: deleting adjacent values (improved)
The main idea for the query below is that we should delete all even rows from continuous ranges of intervals.
That is, if for a given (UnitID, Day) we have the following intervals:
1, 2, 3, 4, 6, 7, 8, 9
we have two continuous ranges:
1, 2, 3, 4
and
6, 7, 8, 9
and we should delete every even row in each range:
1, 2 (delete), 3, 4 (delete)
and
6, 7 (delete), 8, 9 (delete)
so that we get:
1, 3, 6, 8
Note that "even rows" means "even per-range ROW_NUMBER()s" here, not "even values of interval".
Here's the query:
DECLARE @Table TABLE (ID INT, UnitID INT, [Day] INT, Interval INT, Amount FLOAT)
INSERT INTO @Table VALUES (1, 100, 10, 21, 9.345)
INSERT INTO @Table VALUES (2, 100, 10, 22, 9.345)
INSERT INTO @Table VALUES (3, 200, 11, 21, 9.345)
INSERT INTO @Table VALUES (4, 300, 11, 21, 9.345)
INSERT INTO @Table VALUES (5, 300, 11, 22, 9.345)
INSERT INTO @Table VALUES (6, 300, 11, 23, 9.345)
INSERT INTO @Table VALUES (7, 400, 13, 21, 9.345)
INSERT INTO @Table VALUES (8, 400, 13, 22, 9.345)
INSERT INTO @Table VALUES (9, 400, 13, 23, 9.345)
INSERT INTO @Table VALUES (10, 400, 13, 24, 9.345)
INSERT INTO @Table VALUES (11, 400, 13, 26, 9.345)
INSERT INTO @Table VALUES (12, 400, 13, 27, 9.345)
INSERT INTO @Table VALUES (13, 400, 13, 28, 9.345)
INSERT INTO @Table VALUES (14, 400, 13, 29, 9.345)
;WITH rows AS
(
    SELECT *,
           ROW_NUMBER() OVER
           (
               PARTITION BY
               (
                   SELECT TOP 1 qi.id AS mint
                   FROM @Table qi
                   WHERE qi.unitid = qo.unitid
                     AND qi.[day] = qo.[day]
                     AND qi.interval <= qo.interval
                     AND NOT EXISTS
                     (
                         SELECT NULL
                         FROM @Table t
                         WHERE t.unitid = qi.unitid
                           AND t.[day] = qi.day
                           AND t.interval = qi.interval - 1
                     )
                   ORDER BY
                       qi.interval DESC
               )
               ORDER BY interval
           ) AS rnm
    FROM @Table qo
)
DELETE
FROM rows
WHERE rnm % 2 = 0
SELECT *
FROM @Table
Update:
Here's a more efficient query:
DECLARE @Table TABLE (ID INT, UnitID INT, [Day] INT, Interval INT, Amount FLOAT)
INSERT INTO @Table VALUES (1, 100, 10, 21, 9.345)
INSERT INTO @Table VALUES (2, 100, 10, 22, 9.345)
INSERT INTO @Table VALUES (3, 200, 11, 21, 9.345)
INSERT INTO @Table VALUES (4, 300, 11, 21, 9.345)
INSERT INTO @Table VALUES (5, 300, 11, 22, 9.345)
INSERT INTO @Table VALUES (6, 300, 11, 23, 9.345)
INSERT INTO @Table VALUES (7, 400, 13, 21, 9.345)
INSERT INTO @Table VALUES (8, 400, 13, 22, 9.345)
INSERT INTO @Table VALUES (9, 400, 13, 23, 9.345)
INSERT INTO @Table VALUES (10, 400, 13, 24, 9.345)
INSERT INTO @Table VALUES (11, 400, 13, 26, 9.345)
INSERT INTO @Table VALUES (12, 400, 13, 27, 9.345)
INSERT INTO @Table VALUES (13, 400, 13, 28, 9.345)
INSERT INTO @Table VALUES (14, 400, 13, 29, 9.345)
;WITH source AS
(
    SELECT *, ROW_NUMBER() OVER (PARTITION BY unitid, day ORDER BY interval) rn
    FROM @Table
),
rows AS
(
    SELECT *, ROW_NUMBER() OVER (PARTITION BY unitid, day, interval - rn ORDER BY interval) AS rnm
    FROM source
)
DELETE
FROM rows
WHERE rnm % 2 = 0
SELECT *
FROM @Table
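To see why `interval - rn` identifies a continuous range, here is a small Python sketch of the same idea. It is only a model of the query's logic, not the T-SQL itself. It assumes intervals are unique within each (UnitID, Day), and the function name `ids_to_delete` is made up for illustration.

```python
from itertools import groupby

# (ID, UnitID, Day, Interval) rows from the answer's sample data; Amount is omitted.
ROWS = [
    (1, 100, 10, 21), (2, 100, 10, 22), (3, 200, 11, 21),
    (4, 300, 11, 21), (5, 300, 11, 22), (6, 300, 11, 23),
    (7, 400, 13, 21), (8, 400, 13, 22), (9, 400, 13, 23),
    (10, 400, 13, 24), (11, 400, 13, 26), (12, 400, 13, 27),
    (13, 400, 13, 28), (14, 400, 13, 29),
]

def ids_to_delete(rows):
    """Return the IDs at even positions within each continuous interval range."""
    doomed = []
    ordered = sorted(rows, key=lambda r: (r[1], r[2], r[3]))
    for _, group in groupby(ordered, key=lambda r: (r[1], r[2])):
        # rn mirrors ROW_NUMBER() OVER (PARTITION BY unitid, day ORDER BY interval)
        numbered = [(row, rn) for rn, row in enumerate(group, start=1)]
        # interval - rn stays constant inside a continuous range, so it labels the range
        for _, island in groupby(numbered, key=lambda pair: pair[0][3] - pair[1]):
            # rnm mirrors the second ROW_NUMBER(), restarted for every range
            for rnm, (row, _) in enumerate(island, start=1):
                if rnm % 2 == 0:
                    doomed.append(row[0])
    return sorted(doomed)

print(ids_to_delete(ROWS))  # [2, 5, 8, 10, 12, 14]
```

Within one (UnitID, Day), a gap in the intervals makes `interval - rn` jump to a new value, which is what starts a new range.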

I don't think what you're asking for is possible, but you may be able to get close. You can almost do it by finding records with a self-join like this:
SELECT t1.id
FROM
table t1 JOIN table t2 ON (
t1.unitid = t2.unitid AND
t1.day = t2.day AND
t1.interval = t2.interval + 1
)
but the problem is that this will find id=6 as well. However, if you store the result in a temporary table, it may be much smaller than your original data, and thus far faster to scan with a cursor (to fix the id=6 problem). You can then do a DELETE FROM table WHERE id IN (SELECT id FROM tmp_table) to remove the rows.
There may be a way to fix the id=6 problem without a cursor, but if so, I don't see it.

There is the WHILE statement, which is an alternative to the cursor. That combined with table variables might let you do the same thing within a performance bound you're OK with.

DECLARE @Table TABLE (ID INT, UnitID INT, [Day] INT, Interval INT, Amount FLOAT)
INSERT INTO @Table VALUES (1, 100, 10, 21, 9.345)
INSERT INTO @Table VALUES (2, 100, 10, 22, 9.367)
INSERT INTO @Table VALUES (3, 200, 11, 21, 4.150)
INSERT INTO @Table VALUES (4, 300, 11, 21, 4.350)
INSERT INTO @Table VALUES (5, 300, 11, 22, 4.734)
INSERT INTO @Table VALUES (6, 300, 11, 23, 5.106)
INSERT INTO @Table VALUES (7, 400, 13, 21, 10.257)
INSERT INTO @Table VALUES (8, 400, 13, 22, 10.428)
DELETE FROM @Table
WHERE ID IN (
    SELECT t1.ID
    FROM @Table t1
    INNER JOIN @Table t2
        ON t2.UnitID = t1.UnitID
        AND t2.Day = t1.Day
        AND t2.Interval = t1.Interval - 1
    LEFT OUTER JOIN @Table t3
        ON t3.UnitID = t2.UnitID
        AND t3.Day = t2.Day
        AND t3.Interval = t2.Interval - 1
    WHERE t3.ID IS NULL)
SELECT * FROM @Table

Lieven is so close - it worked for the test set, but if I add a few more records it starts to miss some.
We cannot use any odd/even criteria - we have no idea how the data falls.
Add this data and retry:
INSERT @Table VALUES (9, 100, 10, 23, 9.345)
INSERT @Table VALUES (10, 100, 10, 24, 9.367)
INSERT @Table VALUES (11, 100, 10, 25, 4.150)
INSERT @Table VALUES (12, 100, 10, 26, 4.350)
INSERT @Table VALUES (13, 300, 11, 25, 4.734)
INSERT @Table VALUES (14, 300, 11, 26, 5.106)
INSERT @Table VALUES (15, 300, 11, 27, 10.257)
INSERT @Table VALUES (16, 300, 11, 29, 10.428)
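To make the gap concrete, here is a rough Python model of both behaviours on the sixteen rows. It assumes the rule from the question means "delete a row only if the row just before it, for the same UnitID and Day, is still kept". The function names are invented for this sketch.

```python
# (ID, UnitID, Day, Interval): the eight original rows plus the eight added above.
ROWS = [
    (1, 100, 10, 21), (2, 100, 10, 22), (3, 200, 11, 21), (4, 300, 11, 21),
    (5, 300, 11, 22), (6, 300, 11, 23), (7, 400, 13, 21), (8, 400, 13, 22),
    (9, 100, 10, 23), (10, 100, 10, 24), (11, 100, 10, 25), (12, 100, 10, 26),
    (13, 300, 11, 25), (14, 300, 11, 26), (15, 300, 11, 27), (16, 300, 11, 29),
]

def join_based_delete(rows):
    """Model of the three-way join: a row is deleted when its predecessor exists
    but the predecessor's predecessor does not (the second row of each run)."""
    present = {(u, d, i) for _, u, d, i in rows}
    return sorted(
        rid for rid, u, d, i in rows
        if (u, d, i - 1) in present and (u, d, i - 2) not in present
    )

def sequential_delete(rows):
    """Walk each (UnitID, Day) in interval order and delete a row only when
    the row immediately before it was kept."""
    doomed = []
    last_kept = {}  # (UnitID, Day) -> interval of the most recently kept row
    for rid, u, d, i in sorted(rows, key=lambda r: (r[1], r[2], r[3])):
        if last_kept.get((u, d)) == i - 1:
            doomed.append(rid)
        else:
            last_kept[(u, d)] = i
    return doomed

print(join_based_delete(ROWS))          # [2, 5, 8, 14]
print(sorted(sequential_delete(ROWS)))  # [2, 5, 8, 10, 12, 14]
```

The join only ever removes the second row of a run, so in the run 21 to 26 for unit 100 it misses IDs 10 and 12.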

Related

How to get the root in a hierarchy query using SQL Server from any level of Hierarchy

I would like to get the Top most Ancestor (Root) of the hierarchy from any level of data.
The following is my table.
CREATE TABLE #SMGROUP (ID INT NOT NULL, GRP NVARCHAR(40), GRPCLASS INT, PARENTGRP NVARCHAR(40), PARENTGRPCLASS INT)
INSERT INTO #SMGROUP VALUES (1, 'A', 1, NULL,NULL)
INSERT INTO #SMGROUP VALUES (1, 'B', 1, NULL,NULL)
INSERT INTO #SMGROUP VALUES (1, 'C', 1, NULL,NULL)
INSERT INTO #SMGROUP VALUES (1, 'A.1', 2, 'A',1)
INSERT INTO #SMGROUP VALUES (1, 'A.2', 2, 'A',1)
INSERT INTO #SMGROUP VALUES (1, 'A.3', 2, 'A',1)
INSERT INTO #SMGROUP VALUES (1, 'B.1', 2, 'B',1)
INSERT INTO #SMGROUP VALUES (1, 'B.2', 2, 'B',1)
INSERT INTO #SMGROUP VALUES (1, 'A.3.3', 3, 'A.3',2)
INSERT INTO #SMGROUP VALUES (1, 'A.3.3.3', 4, 'A.3.3',3)
INSERT INTO #SMGROUP VALUES (1, 'A.3.3.3.1', 5, 'A.3.3.3',4)
INSERT INTO #SMGROUP VALUES (1, 'B.1.2', 3, 'B.1',2)
INSERT INTO #SMGROUP VALUES (1, 'B.2.1', 3, 'B.2', 2)
SELECT * FROM #SMGROUP
I would like to get 'A' when I provide 'A.1' as input. The return value should also be 'A' for 'A.3.3' and for 'A.3.3.3.1'.
I have written something like this, but I am not sure how to continue from here.
;WITH items AS (
SELECT G.GRP ,CAST('' AS NVARCHAR(30)) AS ParentGroup,
0 AS Level
FROM #SMGROUP G
WHERE G.PARENTGRP IS NULL
UNION ALL
SELECT G.GRP, CAST(G.PARENTGRP AS NVARCHAR(30)) AS ParentGroup
, Level + 1
FROM #SMGROUP G
INNER JOIN items itms ON itms.GRP = G.PARENTGRP
)
SELECT * FROM items
You are headed in the right direction; you just need one last push.
Instead of using a "standard" recursive CTE that traverses from the root to the leaf nodes, you "reverse" the process and traverse from the input node back to the root.
Then it's simply a TOP 1 with Level DESC in the ORDER BY clause:
DECLARE @GRP NVARCHAR(40) = 'A.3.3.3.1';
WITH items AS (
    SELECT G.GRP,
           ISNULL(G.PARENTGRP, '') AS ParentGroup,
           0 AS Level
    FROM #SMGROUP G
    WHERE G.GRP = @GRP
    UNION ALL
    SELECT G.GRP,
           G.PARENTGRP,
           Level + 1
    FROM #SMGROUP G
    INNER JOIN items itms
        ON itms.ParentGroup = G.GRP
)
SELECT TOP 1 Grp
FROM items
ORDER BY Level DESC
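The same leaf-to-root walk can be sketched outside SQL to check the expected answers. This is a minimal Python illustration, assuming each GRP has at most one parent. The `find_root` helper and its cycle check are additions for the sketch, not part of the answer.

```python
# GRP -> PARENTGRP pairs taken from the #SMGROUP sample rows (None marks a root).
PARENT = {
    'A': None, 'B': None, 'C': None,
    'A.1': 'A', 'A.2': 'A', 'A.3': 'A', 'B.1': 'B', 'B.2': 'B',
    'A.3.3': 'A.3', 'A.3.3.3': 'A.3.3', 'A.3.3.3.1': 'A.3.3.3',
    'B.1.2': 'B.1', 'B.2.1': 'B.2',
}

def find_root(grp, parent=PARENT):
    """Follow parent links from grp until a row with no parent is reached."""
    seen = set()
    while parent[grp] is not None:
        if grp in seen:
            # Guard against malformed data; the sample hierarchy has no cycles.
            raise ValueError("cycle detected at " + grp)
        seen.add(grp)
        grp = parent[grp]
    return grp

print(find_root('A.1'))        # A
print(find_root('A.3.3.3.1'))  # A
```

Each loop step corresponds to one recursion level of the CTE, and the last node reached is the row that `TOP 1 ... ORDER BY Level DESC` returns.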

How to remove redundant conditions in **IN** query Sql

I have this kind of query, and I need to optimize it. How can I omit the redundant conditions that call the same split function?
DECLARE @Filter nvarchar(20)
SELECT @Filter ='5,22,3'
SELECT * FROM Employee e
WHERE e.code IN
(
CASE WHEN((SELECT count(*) FROM dbo.FNSPLITSTRING(SUBSTRING(@Filter,1,LEN(@Filter)-1), ',') d
WHERE d.splitdata IN (5, 16, 20, 23, 33, 49, 62, 90, 91, 92, 93, 94))>0) THEN 5 ELSE 0 END
,CASE WHEN((SELECT count(*) FROM dbo.FNSPLITSTRING(SUBSTRING(@Filter,1,LEN(@Filter)-1), ',') d
WHERE d.splitdata IN (22, 18))>0) THEN 46 ELSE 0 END
,CASE WHEN((SELECT count(*) FROM dbo.FNSPLITSTRING(SUBSTRING(@Filter,1,LEN(@Filter)-1), ',') d
WHERE d.splitdata IN (3, 28))>0) THEN 3 ELSE 0 END
)
As @Damien_The_Unbeliever said, avoid using split strings as a table of values. You do not need to split the same string multiple times. Instead, you can split it once into a table variable.
DECLARE @SplitStrings TABLE
(
    splitdata int
)
INSERT @SplitStrings
SELECT splitdata FROM dbo.FNSPLITSTRING(SUBSTRING(@Filter,1,LEN(@Filter)-1), ',') d
DECLARE @EmployeeCodes TABLE
(
    splitdata INT,
    code int
)
INSERT @EmployeeCodes (splitdata, code)
VALUES (5, 5), (16, 5), (20, 5), (23, 5), (33, 5), (49, 5), (62, 5), (90, 5), (91, 5), (92, 5), (93, 5), (94, 5),
(22, 46), (18, 46),
(3, 3), (28, 3)
SELECT e.*
FROM Employee e
JOIN @EmployeeCodes ec
    ON e.code = ec.code
JOIN @SplitStrings ss
    ON ec.splitdata = ss.splitdata
I hope this is the direction you are looking for.
Note: this assumes that you do not need 0 as an employee code.
I found a new way that is less complex and omits the redundant code.
DECLARE @Filter nvarchar(20)
SELECT @Filter ='5,22,3'
SELECT Distinct e.EmployeeId FROM Employee e
CROSS JOIN dbo.fnSplitString(@Filter, ',') AS d
WHERE
(e.code = 5 AND d.splitdata IN ('5', '16', '20', '23', '33', '49', '62', '90', '91', '92', '93', '94'))
OR (e.code = 46 AND d.splitdata IN ('22', '18'))
OR (e.code = 3 AND d.splitdata IN ('3', '28'))
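Both answers come down to splitting the filter once and mapping each value to an employee code. A small Python sketch of that lookup follows, only to illustrate the idea. The `EMPLOYEES` rows are hypothetical (the post shows no Employee data), and the helper names are made up.

```python
# Filter value -> employee code, taken from the value lists in the queries above.
CODE_FOR = {
    **dict.fromkeys((5, 16, 20, 23, 33, 49, 62, 90, 91, 92, 93, 94), 5),
    **dict.fromkeys((22, 18), 46),
    **dict.fromkeys((3, 28), 3),
}

def wanted_codes(filter_text):
    """Split the filter string once and translate each value to its code."""
    values = (int(part) for part in filter_text.split(',') if part.strip())
    return {CODE_FOR[v] for v in values if v in CODE_FOR}

# Hypothetical Employee rows as (EmployeeId, code); not from the original post.
EMPLOYEES = [(1, 5), (2, 46), (3, 3), (4, 7)]

def matching_employees(filter_text, employees=EMPLOYEES):
    """Return the IDs of employees whose code is selected by the filter."""
    codes = wanted_codes(filter_text)
    return [emp_id for emp_id, code in employees if code in codes]

print(matching_employees('5,22,3'))  # [1, 2, 3]
```

Keeping the value-to-code pairs in one table (or dictionary) means a new mapping is a data change rather than another CASE branch.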

SQL Insert Into Select Statement With Condition [closed]

I have a table like this :
I want to insert into detail_salary_table with Insert into Select statement.
I can insert the salary_component item "Salary" with this code
INSERT INTO detail_salary_table (date_work, id_emp, salary_component, nominal)
SELECT
date_work, id_emp, 'Salary',
IIF(DATEDIFF(minute, start_work, finish_work) > 480, 10000, round(convert(float(53), datediff(minute, start_work, finish_work)) / 480, 1) * 10000)
FROM
attendance_table
How can I insert the salary_component item "OverTime" with T-SQL, as in the image?
In VB.NET, I could do it with an If statement and a loop.
Notes:
480 is fixed. 10000 is fixed.
overtime = finish_work - start_work - 480 (in minutes). The nominal is taken from overtime_rate_table, where time_in_minutes is nearest to the overtime.
The nominal in overtime_rate_table does not increase by a fixed step, so I cannot simply multiply by 1000 (the sample data just happens to be regular).
The SQL code to create the tables and sample data:
create table employee_table
(
id_emp int primary key,
name_emp varchar(200)
);
GO
create table attendance_table
(
id_data int primary key identity(1,1),
date_work date,
id_emp int,
start_work datetime,
finish_work datetime
);
GO
create table overtime_rate_table
(
id_data int,
time_in_minutes int,
nominal money
);
GO
create table detail_salary_table
(
id_data int primary key identity(1,1),
date_work date,
id_emp int,
salary_component varchar(100),
nominal money
);
GO
insert into employee_table
values (1, 'Emp A'), (2, 'Emp B'), (3, 'Emp C'), (4, 'Emp D'), (5, 'Emp E');
GO
insert into attendance_table (date_work, id_emp, start_work, finish_work)
values
('2017-02-01',1,'2017-02-01 08:00','2017-02-01 16:52'),
('2017-02-01',2,'2017-02-01 07:45','2017-02-01 16:48'),
('2017-02-01',3,'2017-02-01 08:02','2017-02-01 12:05'),
('2017-02-01',4,'2017-02-01 07:56','2017-02-01 16:49'),
('2017-02-01',5,'2017-02-01 07:30','2017-02-01 18:05'),
('2017-02-02',1,'2017-02-02 07:52','2017-02-02 16:23'),
('2017-02-02',2,'2017-02-02 07:19','2017-02-02 18:56'),
('2017-02-02',3,'2017-02-02 07:55','2017-02-02 18:23'),
('2017-02-02',4,'2017-02-02 08:01','2017-02-02 16:01'),
('2017-02-02',5,'2017-02-02 07:31','2017-02-02 16:49'),
('2017-02-03',1,'2017-02-03 07:52','2017-02-03 17:44'),
('2017-02-03',2,'2017-02-03 07:41','2017-02-03 17:23'),
('2017-02-03',3,'2017-02-03 07:06','2017-02-03 17:56'),
('2017-02-03',4,'2017-02-03 07:56','2017-02-03 19:00'),
('2017-02-03',5,'2017-02-03 07:45','2017-02-03 18:56');
GO
insert into overtime_rate_table
values (1, 15, 1000), (2, 30, 2000), (3, 45, 3000),
(4, 60, 4000), (5, 75, 5000), (6, 90, 6000),
(7, 105, 7000), (8, 120, 8000), (9, 135, 9000),
(10, 150, 10000), (11, 165, 11000), (12, 180, 12000),
(13, 195, 13000), (14, 210, 14000), (15, 225, 15000);
GO
INSERT INTO detail_salary_table
(date_work,
id_emp,
salary_component,
nominal
)
SELECT date_work,
id_emp,
'OverTime',
ISNULL(o.Nominal, 0)
FROM attendance_table a
LEFT JOIN overtime_rate_table o ON CONVERT( INT, DATEDIFF(minute, a.start_work, a.finish_work) - 480) / 15 * 15 = o.time_in_minutes;
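The `/ 15 * 15` join relies on the rate rows being exactly 15 minutes apart, which the question says is not guaranteed. As a hedged alternative, this Python sketch picks the largest time_in_minutes that does not exceed the overtime. It assumes that is what "near" means, and `overtime_nominal` is a hypothetical helper name.

```python
from bisect import bisect_right
from datetime import datetime

# (time_in_minutes, nominal) pairs reproducing the sample overtime_rate_table rows.
RATES = [(15 * n, 1000 * n) for n in range(1, 16)]
THRESHOLDS = [minutes for minutes, _ in RATES]  # must stay sorted ascending
REGULAR_MINUTES = 480  # fixed regular working time from the question

def overtime_nominal(start_work, finish_work):
    """Return the nominal of the highest rate row reached, or 0 if none is."""
    worked = int((finish_work - start_work).total_seconds() // 60)
    overtime = worked - REGULAR_MINUTES
    # Index of the last threshold <= overtime; -1 means no row qualifies.
    idx = bisect_right(THRESHOLDS, overtime) - 1
    return RATES[idx][1] if idx >= 0 else 0

# Emp A on 2017-02-01: 532 minutes worked, 52 minutes overtime.
print(overtime_nominal(datetime(2017, 2, 1, 8, 0), datetime(2017, 2, 1, 16, 52)))  # 3000
```

One difference from the join: overtime beyond the last row (more than 225 minutes in the sample) gets the top rate here, whereas the join finds no match and falls back to 0.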

SQL Server - statistical Mode of each column

For a given table, I want a SQL query which returns the statistical mode of each column in a single recordset. I see several ways to do this with aggregation, but they're all single column approaches. Can anyone think of a way to do this without taking the union of as many queries as there are columns? There's no mode() aggregate in SQL Server.
If table #x has 3 columns, I want a single row with 3 columns. Here's an example using SQL Server. It's a lot of heavy lifting, and very much tailored to the table definition. I'm looking for a cleaner, more generalized approach. I might want to do this on different tables at different times.
create table #x (name varchar(20), age int, city varchar(20))
insert into #x values ('Bill', 20, 'NYC')
insert into #x values ('Bill', 15, 'NYC')
insert into #x values ('Mary', 29, 'LA')
insert into #x values ('Bill', 30, 'NYC')
insert into #x values ('Bill', 30, 'NYC')
insert into #x values ('Bill', 20, 'LA')
insert into #x values ('Mary', 20, 'NYC')
insert into #x values ('Joe', 12, 'NYC')
insert into #x values ('Fred', 55, 'NYC')
insert into #x values ('Alex', 41, 'NYC')
insert into #x values ('Alex', 30, 'LA')
insert into #x values ('Alex', 10, 'Chicago')
insert into #x values ('Bill', 20, 'NYC')
insert into #x values ('Bill', 10, 'NYC')
create table #modes (_column varchar(20), _count int, _mode varchar(20))
insert into #modes select top 1 'name' _column, count(*) _count, name _mode from #x group by name order by 2 desc
insert into #modes select top 1 'age' _column, count(*) _count, age _mode from #x group by age order by 2 desc
insert into #modes select top 1 'city' _column, count(*) _count, city _mode from #x group by city order by 2 desc
select name, age, city from (select _mode, _column from #modes) m
pivot (max(_mode) for _column in (name, age, city)) p
This will dynamically generate Item, Value and Hits. You can pivot as you see fit.
Declare @YourTable table (name varchar(20), age int, city varchar(20))
Insert Into @YourTable values
('Bill', 20, 'NYC'),
('Bill', 15, 'NYC'),
('Mary', 29, 'LA'),
('Bill', 30, 'NYC'),
('Bill', 30, 'NYC'),
('Bill', 20, 'LA'),
('Mary', 20, 'NYC'),
('Joe', 12, 'NYC'),
('Fred', 55, 'NYC'),
('Alex', 41, 'NYC'),
('Alex', 30, 'LA'),
('Alex', 10, 'Chicago'),
('Bill', 20, 'NYC'),
('Bill', 10, 'NYC')
Declare @XML xml
Set @XML = (Select * from @YourTable for XML RAW)
Select Item,Value,Hits
From (
      Select Item,Value,Hits=count(*),RowNr = ROW_NUMBER() over (Partition By Item Order By Count(*) Desc)
      From (
            Select ID = r.value('@id','int') -- Usually Reserved
                  ,Item = Attr.value('local-name(.)','varchar(100)')
                  ,Value = Attr.value('.','varchar(max)')
            From @XML.nodes('/row') as A(r)
            Cross Apply A.r.nodes('./@*[local-name(.)!="id"]') as B(Attr)
           ) A
      Group By Item,Value
     ) A
Where RowNr=1
Returns
Item Value Hits
age 20 4
city NYC 10
name Bill 7
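As a cross-check of those counts, the per-column mode can be computed directly from the sample rows. This Python sketch is just a verification aid, not a replacement for the query, and `column_modes` is a name made up here.

```python
from collections import Counter

COLUMNS = ('name', 'age', 'city')
# The fourteen sample rows from the question.
ROWS = [
    ('Bill', 20, 'NYC'), ('Bill', 15, 'NYC'), ('Mary', 29, 'LA'),
    ('Bill', 30, 'NYC'), ('Bill', 30, 'NYC'), ('Bill', 20, 'LA'),
    ('Mary', 20, 'NYC'), ('Joe', 12, 'NYC'), ('Fred', 55, 'NYC'),
    ('Alex', 41, 'NYC'), ('Alex', 30, 'LA'), ('Alex', 10, 'Chicago'),
    ('Bill', 20, 'NYC'), ('Bill', 10, 'NYC'),
]

def column_modes(rows, columns=COLUMNS):
    """Return {column: (most frequent value, number of hits)} for every column."""
    modes = {}
    for idx, col in enumerate(columns):
        # most_common(1) yields the single highest-count (value, count) pair
        value, hits = Counter(row[idx] for row in rows).most_common(1)[0]
        modes[col] = (value, hits)
    return modes

print(column_modes(ROWS))
# {'name': ('Bill', 7), 'age': (20, 4), 'city': ('NYC', 10)}
```

When two values tie for the top count, only one of them is reported, much like the `RowNr=1` filter in the query.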

SQL Server : nested stored procedure to update table values

I have a table structure and its data as follows.
CREATE TABLE TestTable
(
id INT Identity(1,1) PRIMARY KEY,
creationTimestamp DATE,
indexOne INT,
indexTwo INT
);
INSERT INTO TestTable (creationTimestamp, indexOne, indexTwo)
VALUES
('2014-01-10', 100, 0),
('2014-01-11', 100, 0),
('2014-01-12', 100, 0),
('2014-01-13', 152, 2),
('2014-01-14', 152, 2),
('2014-01-15', 152, 2),
('2014-02-12', 152, 2),
('2014-02-13', 152, 2),
('2014-02-14', 333, 4),
('2014-02-15', 333, 4),
('2014-02-16', 333, 4),
('2014-03-10', 333, 4),
('2014-03-11', 333, 4),
('2014-03-12', 333, 4),
('2014-03-13', 333, 4),
('2014-03-14', 333, 4),
('2014-04-20', 500, 7),
('2014-04-21', 500, 7),
('2014-04-22', 500, 7),
('2014-04-23', 500, 7);
When you consider indexOne + indexTwo, there are duplicate rows. But I need them to be unique.
Therefore indexTwo must be properly indexed as follows
(2014-01-10, 100, 0),
(2014-01-11, 100, 1),
(2014-01-12, 100, 2),
(2014-01-13, 152, 0),
(2014-01-14, 152, 1),
(2014-01-15, 152, 2),
(2014-02-12, 152, 3),
(2014-02-13, 152, 4),
(2014-02-14, 333, 0),
(2014-02-15, 333, 1),
(2014-02-16, 333, 2),
(2014-03-10, 333, 3),
(2014-03-11, 333, 4),
(2014-03-12, 333, 5),
(2014-03-13, 333, 6),
(2014-03-14, 333, 7),
(2014-04-20, 500, 0),
(2014-04-21, 500, 1),
(2014-04-22, 500, 2),
(2014-04-23, 500, 3);
I have written the following stored procedure and it does not work properly
declare @indexOne int, @indexTwo int, @x int
declare c cursor for
    select indexOne, indexTwo
    from TestTable
    group by indexOne, indexTwo
open c
fetch next from c into @indexOne, @indexTwo
while @@FETCH_STATUS = 0
begin
    set @x = 0;
    declare @id int
    declare c1 cursor for
        select id
        from TestTable
        where indexOne = @indexOne and indexTwo = @indexTwo
        order by creationTimestamp asc
    open c1
    fetch next from c1 into @id
    while @@FETCH_STATUS = 0
    begin
        UPDATE TestTable SET indexTwo = @x WHERE id = @id
        set @x = @x + 1
        fetch next from c1 into @id
    end
    close c1
    deallocate c1
    fetch next from c into @indexOne, @indexTwo
end
close c
deallocate c
Help me to find why this is not working
You don't need a cursor to do this. Use a window function to generate the indexTwo values: number the rows within each indexOne in creationTimestamp order. I hope this will do the job.
Sql Server
UPDATE A
SET indexTwo = b.indexTwo
FROM testtable a
JOIN (SELECT creationTimestamp, indexOne,
Row_number()OVER(partition BY indexone
ORDER BY creationtimestamp)-1 indexTwo
FROM testtable) B
ON a.creationtimestamp = b.creationtimestamp
AND a.indexone = b.indexone
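The numbering that `ROW_NUMBER() ... - 1` produces can be checked with a short Python sketch. It uses only the first eight sample rows, and the id values are assumed to be 1 to 8 as the IDENTITY(1,1) column would assign them. `renumber` is a hypothetical helper name.

```python
from itertools import groupby

# (id, creationTimestamp, indexOne, indexTwo): first eight sample rows;
# the ids are assumed to follow IDENTITY(1,1) insertion order.
ROWS = [
    (1, '2014-01-10', 100, 0), (2, '2014-01-11', 100, 0),
    (3, '2014-01-12', 100, 0), (4, '2014-01-13', 152, 2),
    (5, '2014-01-14', 152, 2), (6, '2014-01-15', 152, 2),
    (7, '2014-02-12', 152, 2), (8, '2014-02-13', 152, 2),
]

def renumber(rows):
    """Assign indexTwo = 0, 1, 2, ... within each indexOne, ordered by date."""
    result = []
    # ISO-formatted date strings sort chronologically, so plain sorting is enough.
    ordered = sorted(rows, key=lambda r: (r[2], r[1], r[0]))
    for _, group in groupby(ordered, key=lambda r: r[2]):
        for position, (row_id, stamp, index_one, _) in enumerate(group):
            result.append((row_id, stamp, index_one, position))
    return sorted(result)

print([row[3] for row in renumber(ROWS)])  # [0, 1, 2, 0, 1, 2, 3, 4]
```

The old indexTwo value plays no part in the result, which is why a single pass per indexOne is enough and no nested loop is needed.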
