SQL Server - statistical mode of each column

For a given table, I want a SQL query which returns the statistical mode of each column in a single recordset. I see several ways to do this with aggregation, but they're all single column approaches. Can anyone think of a way to do this without taking the union of as many queries as there are columns? There's no mode() aggregate in SQL Server.
If table #x has 3 columns, I want a single row with 3 columns. Here's an example using SQL Server. It's a lot of heavy lifting, and very much tailored to the table definition. I'm looking for a cleaner, more generalized approach. I might want to do this on different tables at different times.
create table #x (name varchar(20), age int, city varchar(20))
insert into #x values ('Bill', 20, 'NYC')
insert into #x values ('Bill', 15, 'NYC')
insert into #x values ('Mary', 29, 'LA')
insert into #x values ('Bill', 30, 'NYC')
insert into #x values ('Bill', 30, 'NYC')
insert into #x values ('Bill', 20, 'LA')
insert into #x values ('Mary', 20, 'NYC')
insert into #x values ('Joe', 12, 'NYC')
insert into #x values ('Fred', 55, 'NYC')
insert into #x values ('Alex', 41, 'NYC')
insert into #x values ('Alex', 30, 'LA')
insert into #x values ('Alex', 10, 'Chicago')
insert into #x values ('Bill', 20, 'NYC')
insert into #x values ('Bill', 10, 'NYC')
create table #modes (_column varchar(20), _count int, _mode varchar(20))
insert into #modes select top 1 'name' _column, count(*) _count, name _mode from #x group by name order by 2 desc
insert into #modes select top 1 'age' _column, count(*) _count, age _mode from #x group by age order by 2 desc
insert into #modes select top 1 'city' _column, count(*) _count, city _mode from #x group by city order by 2 desc
select name, age, city from (select _mode, _column from #modes) m
pivot (max(_mode) for _column in (name, age, city)) p

This will dynamically generate Item, Value and Hits. You can pivot as you see fit.
Declare @YourTable table (name varchar(20), age int, city varchar(20))
Insert Into @YourTable values
('Bill', 20, 'NYC'),
('Bill', 15, 'NYC'),
('Mary', 29, 'LA'),
('Bill', 30, 'NYC'),
('Bill', 30, 'NYC'),
('Bill', 20, 'LA'),
('Mary', 20, 'NYC'),
('Joe', 12, 'NYC'),
('Fred', 55, 'NYC'),
('Alex', 41, 'NYC'),
('Alex', 30, 'LA'),
('Alex', 10, 'Chicago'),
('Bill', 20, 'NYC'),
('Bill', 10, 'NYC')
Declare @XML xml
Set @XML = (Select * from @YourTable for XML RAW)
Select Item,Value,Hits
From (
Select Item,Value,Hits=count(*),RowNr = ROW_NUMBER() over (Partition By Item Order By Count(*) Desc)
From (
Select ID = r.value('@id','int') -- Usually Reserved
,Item = Attr.value('local-name(.)','varchar(100)')
,Value = Attr.value('.','varchar(max)')
From @XML.nodes('/row') as A(r)
Cross Apply A.r.nodes('./@*[local-name(.)!="id"]') as B(Attr)
) A
Group By Item,Value
) A
Where RowNr=1
Returns
Item Value Hits
age 20 4
city NYC 10
name Bill 7
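For anyone who wants to try the unpivot-then-rank idea outside SQL Server, here is a minimal sketch in Python using the standard sqlite3 module. It is not the XML approach above: it assumes SQLite 3.25 or later (for window functions), discovers the columns with PRAGMA table_info, and stacks one SELECT per column with UNION ALL. The helper names build_db and column_modes are made up for this illustration, and ties are broken by value, which is my own choice.

```python
import sqlite3

# Sample rows from the question.
ROWS = [
    ("Bill", 20, "NYC"), ("Bill", 15, "NYC"), ("Mary", 29, "LA"),
    ("Bill", 30, "NYC"), ("Bill", 30, "NYC"), ("Bill", 20, "LA"),
    ("Mary", 20, "NYC"), ("Joe", 12, "NYC"), ("Fred", 55, "NYC"),
    ("Alex", 41, "NYC"), ("Alex", 30, "LA"), ("Alex", 10, "Chicago"),
    ("Bill", 20, "NYC"), ("Bill", 10, "NYC"),
]


def build_db():
    """Create an in-memory table x holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE x (name TEXT, age INTEGER, city TEXT)")
    conn.executemany("INSERT INTO x VALUES (?, ?, ?)", ROWS)
    return conn


def column_modes(conn, table):
    """Return {column: (mode_value_as_text, hits)} for every column.

    Table and column names are spliced into the SQL text, so they must
    come from trusted code, never from user input.
    """
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    # Unpivot: one SELECT per column, stacked with UNION ALL.
    unpivot = " UNION ALL ".join(
        f"SELECT '{c}' AS item, CAST({c} AS TEXT) AS value FROM {table}"
        for c in columns
    )
    sql = f"""
        SELECT item, value, hits FROM (
            SELECT item, value, hits,
                   ROW_NUMBER() OVER (PARTITION BY item
                                      ORDER BY hits DESC, value) AS rn
            FROM (
                SELECT item, value, COUNT(*) AS hits
                FROM ({unpivot}) AS u
                GROUP BY item, value
            ) AS counted
        ) AS ranked
        WHERE rn = 1
    """
    return {item: (value, hits) for item, value, hits in conn.execute(sql)}


if __name__ == "__main__":
    for item, (value, hits) in sorted(column_modes(build_db(), "x").items()):
        print(item, value, hits)
```

Because the column list is read from the table itself, the same function can be pointed at a different table without editing the query.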

Related

Union on two tables based on a specific column

I have two tables with many records. A few records have the same id in both tables. I want a combined result with one row per id; when an id exists in both tables, the row from one particular table should be used.
For example:
In the above snippet, empid 1 appears in both tables. I want all records from both tables, but for any empid that is common to both, the values should come from the Dummy_tab_2 table.
Desired O/P
Code to replicate:
CREATE TABLE Dummy_tab_1
(
empid int,
Month1 int,
Month2 int,
);
INSERT INTO Dummy_tab_1
VALUES (1, 100, 200), (5,15, 20), (6, 20, 30);
CREATE TABLE Dummy_tab_2
(
empid int,
Month1 int,
Month2 int,
);
INSERT INTO Dummy_tab_2
VALUES (1, 10, 20), (2,15, 20), (3, 20, 30);
I tried this but not sure how to remove the emp id which is not desired
SELECT *
FROM Dummy_tab_2
UNION
SELECT *
FROM Dummy_tab_1
and o/p which I got is this
Using:
SELECT * FROM Dummy_tab_1 WHERE empid NOT IN (SELECT empid FROM Dummy_tab_2)
UNION ALL
SELECT * FROM Dummy_tab_2;
db<>fiddle demo
Using your data examples and what looks like your desired results this should work:
CREATE TABLE Dummy_tab_1
(
empid int,
Month1 int,
Month2 int,
);
INSERT INTO Dummy_tab_1
VALUES (1, 100, 200), (5,15, 20), (6, 20, 30);
CREATE TABLE Dummy_tab_2
(
empid int,
Month1 int,
Month2 int,
);
INSERT INTO Dummy_tab_2
VALUES (1, 10, 20), (2,15, 20), (3, 20, 30);
SELECT COALESCE(DT1.empid, DT2.EmpID) AS EmpID, CASE WHEN DT2.empid IS NOT NULL THEN DT2.Month1 ELSE DT1.Month1 END AS Month1,
CASE WHEN DT2.empid IS NOT NULL THEN DT2.Month2 ELSE DT1.Month2 END AS Month2
FROM Dummy_tab_1 DT1
FULL JOIN Dummy_tab_2 DT2 ON DT1.empid = DT2.empid
ORDER BY COALESCE(DT1.empid, DT2.EmpID)
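One caveat with the NOT IN version: if Dummy_tab_2.empid can ever be NULL, NOT IN returns no rows from Dummy_tab_1 at all. A minimal sketch of the same "second table wins" rule written with NOT EXISTS, run through Python's sqlite3 module so it can be tried anywhere. The function names are hypothetical, and the syntax is SQLite's rather than SQL Server's, although the query itself is plain SQL.

```python
import sqlite3


def build_db():
    """Create both sample tables from the question in memory."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Dummy_tab_1 (empid INTEGER, Month1 INTEGER, Month2 INTEGER);
        CREATE TABLE Dummy_tab_2 (empid INTEGER, Month1 INTEGER, Month2 INTEGER);
        INSERT INTO Dummy_tab_1 VALUES (1, 100, 200), (5, 15, 20), (6, 20, 30);
        INSERT INTO Dummy_tab_2 VALUES (1, 10, 20), (2, 15, 20), (3, 20, 30);
    """)
    return conn


def merged_rows(conn):
    """All rows of Dummy_tab_2, plus Dummy_tab_1 rows whose empid is not in it."""
    sql = """
        SELECT empid, Month1, Month2 FROM Dummy_tab_2
        UNION ALL
        SELECT t1.empid, t1.Month1, t1.Month2
        FROM Dummy_tab_1 AS t1
        WHERE NOT EXISTS (
            SELECT 1 FROM Dummy_tab_2 AS t2 WHERE t2.empid = t1.empid
        )
        ORDER BY empid
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    for row in merged_rows(build_db()):
        print(row)
```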

Get max value from table and group by with column having comma separated value with different order or having more values

I am having a table like this along with data
CREATE TABLE temp (
`name` varchar(20),
`ids` varchar(20),
`value1` int,
`value2` int
);
INSERT INTO temp(`name`,`ids`, `value1`, `value2`) values
('A', '1,2', 10, 11),
('A', '2,1', 12, 100),
('A', '1,2,3', 20, 1),
('B', '6', 30, 10)
I need to get the max value by Name along with ids
I am using the following query to get the max value.
select name, ids, max(value1) as value1, max(value2) as value2
from temp
group by name,ids
The question has been tagged as Sybase ASE, but the 'create table' and 'insert' commands are invalid in ASE, so I'm not sure whether the tag is incorrect or the commands are wrong. Assuming this is for a Sybase ASE database:
I'm assuming the desired output is to display those rows where value = max(value).
First we'll set up our test case:
create table mytab
(name varchar(20)
,ids varchar(20)
,value int)
go
insert into mytab (name,ids,value) values ('A', '1,2' , 10)
insert into mytab (name,ids,value) values ('A', '2,1' , 12)
insert into mytab (name,ids,value) values ('A', '1,2,3', 20)
insert into mytab (name,ids,value) values ('B', '6' , 30)
go
Here's one possible solution:
select t.name, t.ids, t.value
from mytab t
join (select name,max(value) as maxvalue from mytab group by name) dt
on t.name = dt.name
and t.value = dt.maxvalue
order by t.name
go
name                 ids                  value
-------------------- -------------------- -----------
A                    1,2,3                         20
B                    6                             30
The subquery/derived-table gives us the max(value) for each unique name. The main query then joins these name/max(value) pairs back to the main table to give us the desired rows (ie, where value = max(value)).
Tested on ASE 15.7 SP138.
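One property of the join-back approach worth knowing: if two rows in the same name group share the maximum value, both are returned. A small sketch that shows this, using Python's sqlite3 module instead of ASE (so the syntax is SQLite's; the query shape is the same). The function names and the extra tied row are my own additions for illustration.

```python
import sqlite3


def build_db():
    """Create the mytab test case from the answer in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytab (name TEXT, ids TEXT, value INTEGER)")
    conn.executemany(
        "INSERT INTO mytab (name, ids, value) VALUES (?, ?, ?)",
        [("A", "1,2", 10), ("A", "2,1", 12), ("A", "1,2,3", 20), ("B", "6", 30)],
    )
    return conn


def rows_at_group_max(conn):
    """Rows whose value equals the maximum value within their name group."""
    sql = """
        SELECT t.name, t.ids, t.value
        FROM mytab AS t
        JOIN (SELECT name, MAX(value) AS maxvalue
              FROM mytab
              GROUP BY name) AS dt
          ON t.name = dt.name AND t.value = dt.maxvalue
        ORDER BY t.name, t.ids
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    conn = build_db()
    print(rows_at_group_max(conn))
    # Hypothetical extra row that ties with the current maximum for B.
    conn.execute("INSERT INTO mytab VALUES ('B', '7', 30)")
    print(rows_at_group_max(conn))
```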

SQL: How to fill empty cells with previous row value on basis of condition?

SQL: Fill empty cells with previous row value on basis of condition?
Please treat this as High Priority Request..help needed
Requesting a high rep user link it (http://i.imgur.com/P4UOiMz.jpg)
I need to produce the column "OXY_ID_NEW" in the following table using SQL. Is this possible in SQL 2008R2 or SQL 2012 or Amazon REDSHIFT?
SQL TABLE image(http://i.imgur.com/P4UOiMz.jpg)
Basically, I wanted to forward fill empty "OXY_ID" cells with last known Oxy_id for that ID, as shown in 'OXY_ID_NEW' column.
maybe something like
coalesce(oxy_id, lag(oxy_id) over (partition by id order by number))
..assuming the id, number cols actually increase .. in the screenshot it looks like they repeat in which case you'll need to provide the whole table definition.
The LAG example as given by gordy is the simplest as long as you have it. Note though that you cannot use it directly to update your table. Windowed functions can only appear in SELECT or ORDER BY clause, so you would need a temporary table.
For older versions you need a cursor. Something like:
declare @demo table
(
id varchar (10),
number int,
oxy_id varchar(2)
)
INSERT INTO @demo VALUES ('308_2123', 36, 'ZY')
INSERT INTO @demo VALUES ('308_2123', 36, NULL)
INSERT INTO @demo VALUES ('308_2123', 37, NULL)
INSERT INTO @demo VALUES ('308_2123', 37, NULL)
INSERT INTO @demo VALUES ('308_2123', 38, 'WY')
INSERT INTO @demo VALUES ('308_2123', 38, 'WY')
INSERT INTO @demo VALUES ('308_2123', 38, NULL)
INSERT INTO @demo VALUES ('308_2123', 39, NULL)
INSERT INTO @demo VALUES ('309_5647', 30, 'AB')
INSERT INTO @demo VALUES ('309_5647', 30, NULL)
INSERT INTO @demo VALUES ('309_5647', 31, NULL)
INSERT INTO @demo VALUES ('309_5647', 32, 'BC')
INSERT INTO @demo VALUES ('310_8897', 20, 'CD')
INSERT INTO @demo VALUES ('310_8897', 21, 'DC')
INSERT INTO @demo VALUES ('310_8897', 22, NULL)
INSERT INTO @demo VALUES ('310_8897', 23, NULL)
INSERT INTO @demo VALUES ('310_8897', 23, NULL)
INSERT INTO @demo VALUES ('311_6789', 1, NULL)
INSERT INTO @demo VALUES ('311_6789', 1, NULL)
INSERT INTO @demo VALUES ('311_6789', 2, 'EF')
INSERT INTO @demo VALUES ('311_6789', 3, 'GH')
INSERT INTO @demo VALUES ('311_6789', 3, NULL)
INSERT INTO @demo VALUES ('312_9874', 1, 'HK')
INSERT INTO @demo VALUES ('312_9874', 1, 'KY')
INSERT INTO @demo VALUES ('312_9874', 1, NULL)
INSERT INTO @demo VALUES ('312_9874', 1, 'YY')
DECLARE @id varchar(10)
DECLARE @oxy_ID varchar(2)
declare @prevOxyID varchar(10) = NULL
declare @number int
DECLARE @previd varchar(10) = NULL
DECLARE cur CURSOR FOR
(SELECT d.id, d.number, d.oxy_id FROM @demo d)
OPEN cur
FETCH NEXT FROM cur into
@id, @number, @oxy_id
WHILE @@FETCH_STATUS = 0
BEGIN
    IF @oxy_id IS NULL
    BEGIN
        if @prevOxyID IS NOT NULL
        BEGIN
            IF @id = @previd
            BEGIN
                UPDATE @demo SET oxy_id = @prevOxyID
                WHERE id = @id AND number = @number AND oxy_id IS NULL
            END
            ELSE
            BEGIN
                SET @prevOxyID = NULL
            END
        END
        SET @previd = @id
    END
    ELSE
    BEGIN
        SET @previd = @id
        SET @prevOxyID = @oxy_ID
    END
    FETCH NEXT FROM cur into
    @id, @number, @oxy_id
END
close cur
deallocate cur
SELECT * FROM @demo
Please note that you cannot use order by in a cursor. The data must already be in the order as shown on your image. If the data in the table is not in this order then again, you will need to use a temporary table with the records inserted in the right order, and then perform the cursor on the temporary table, and finally update the original table from the temporary one.
EDIT
OK So no cursor version. As mentioned by gordy, the problem with LAG is the repeated numbers. This same problem restricts the use of UPDATE, since there is no unique identifier for a row. Instead I have to insert the results into a temporary table, delete the originals and then re-insert from temp. If you do in fact have a unique key, then please replace this delete and insert with an UPDATE. The following solution, whilst a bit long-winded, does get around the problems, and according to my research should work on Amazon Redshift, but I do not have access to test. I will not repeat the inserts, please copy from above.
declare @demo table
(
id varchar (10),
number int,
oxy_id varchar(2)
)
create table allrownums
(
id varchar (10),
number int,
oxy_id varchar(2),
rownum int
)
INSERT INTO allrownums
SELECT id, number, oxy_id, ROW_NUMBER() OVER (ORDER BY id, number) AS rownum
FROM @demo;
create table allnotnullrows
(
id varchar (10),
number int,
oxy_id varchar(2),
rownum int
)
INSERT INTO allnotnullrows
SELECT * FROM allrownums
WHERE oxy_id IS NOT NULL
create table maxrownums
(
id varchar (10),
rownum int,
maxrownum int
)
INSERT INTO maxrownums
SELECT a.id, a.rownum, Max(n.rownum)
FROM allrownums a INNER JOIN allnotnullrows n
ON n.id = a.id WHERE a.rownum >= n.rownum
GROUP BY a.id, a.rownum
create table tempresults
(
id varchar (10),
number int,
oxy_id varchar(2)
)
INSERT INTO tempresults
SELECT a.id, a.number, coalesce(a.oxy_id, n.oxy_id) as oxy_id
FROM allrownums a
LEFT JOIN maxrownums m
ON m.rownum = a.rownum
LEFT JOIN allnotnullrows n
ON a.id = n.id
and n.rownum = m.maxrownum
DELETE FROM @demo;
INSERT INTO @demo SELECT * FROM tempresults;
DROP TABLE tempresults;
DROP TABLE allrownums;
DROP TABLE allnotnullrows;
DROP TABLE maxrownums;
SELECT * FROM @demo;
Should work in SQL 2008 (I have no way to test, but CROSS APPLY works in SQL 2008)
CREATE TABLE #table1( ID INT, Number INT, OXY_ID VARCHAR( 2 ))
INSERT INTO #table1
VALUES
( 1, 23, 'AD' ),
( 2, 23, 'XY' ),
( 3, 23, '' ),
( 4, 23, '' ),
( 5, 23, 'MY' ),
( 6, 23, '' ),
( 7, 23, 'ZY' )
CREATE INDEX IX_table1__ID ON #table1( ID, OXY_ID )
SELECT a.*, c.OXY_ID AS OXY_ID_New
FROM #table1 AS a
CROSS APPLY
( SELECT TOP 1 ID, OXY_ID
FROM #table1 AS b
WHERE OXY_ID <> '' AND a.ID >= b.ID
ORDER BY ID DESC ) AS c
Comments:
Should be a lot faster than a cursor.
LAG solution is more elegant compared to this.
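A note on why a plain LAG is not quite enough here: with its default offset, LAG looks back exactly one row, so the second and later empty rows in a run stay empty. A correlated subquery that picks the latest non-empty value at or before the current row fills the whole run. Below is a minimal sketch of that idea in Python with sqlite3. It assumes the table has a unique ordering column, here called seq, which is hypothetical and not in the original table; the data is a subset of the rows posted above.

```python
import sqlite3

# (id, number, oxy_id) rows taken from the sample data; None means empty.
SAMPLE = [
    ("308_2123", 36, "ZY"),
    ("308_2123", 36, None),
    ("308_2123", 37, None),
    ("308_2123", 38, "WY"),
    ("311_6789", 1, None),
    ("311_6789", 2, "EF"),
    ("311_6789", 3, None),
]


def build_db():
    """Create the demo table with an explicit, unique ordering column seq."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE demo (seq INTEGER PRIMARY KEY, id TEXT, "
        "number INTEGER, oxy_id TEXT)"
    )
    conn.executemany(
        "INSERT INTO demo (seq, id, number, oxy_id) VALUES (?, ?, ?, ?)",
        [(i, *row) for i, row in enumerate(SAMPLE, start=1)],
    )
    return conn


def forward_fill(conn):
    """Return (seq, id, oxy_id, oxy_id_new) with empty oxy_id values filled."""
    sql = """
        SELECT d.seq, d.id, d.oxy_id,
               (SELECT p.oxy_id
                FROM demo AS p
                WHERE p.id = d.id
                  AND p.seq <= d.seq
                  AND p.oxy_id IS NOT NULL
                ORDER BY p.seq DESC
                LIMIT 1) AS oxy_id_new
        FROM demo AS d
        ORDER BY d.seq
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    for row in forward_fill(build_db()):
        print(row)
```

Rows that come before the first known value for their id (the first 311_6789 row here) stay empty, which matches the behaviour of the cursor version.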

can we implement innerjoin in the following sql query

These are my tables:
CREATE TABLE forgerock (id INT, [date] DATETIME, empcode INT,[file] VARCHAR);
INSERT INTO forgerock
VALUES
(1, '2015-12-31 01:20:02', 56, 'abc1'),
(2, '2016-01-01 01:20:02', 58, 'abc2'),
(3, '2016-01-02 01:20:02', 46, 'abc3'),
(4, '2016-01-03 01:20:02', 16, 'abc4'),
(5, '2016-01-04 01:20:02', 36, 'abc5');
CREATE TABLE forge (empcode INT, [user_name] VARCHAR);
INSERT INTO forge
VALUES
(56, 'ram'),
(58, 'ram1'),
(46, 'ram2'),
(16, 'ram3'),
(36, 'ram4');
I am trying to print the file name and user_name from the tables with respect to current date and the day before the current date.
I tried the query:
ResultSet resultset = statement.executeQuery("select file from forgerock where '"+date+"' >= CURRENT_DATE('"+date+"', INTERVAL 1 DAY);") ;
but I got the exception:
Incorrect syntax near the keyword 'CURRENT_DATE'.
IF OBJECT_ID('dbo.forgerock', 'U') IS NOT NULL
DROP TABLE dbo.forgerock
CREATE TABLE dbo.forgerock (id INT PRIMARY KEY, [date] DATETIME, empcode INT,[file] VARCHAR(10));
INSERT INTO dbo.forgerock
VALUES
(1, '2015-12-31 01:20:02', 56, 'abc1'),
(2, '2016-01-01 01:20:02', 58, 'abc2'),
(3, '2016-01-02 01:20:02', 46, 'abc3'),
(4, '2016-01-03 01:20:02', 16, 'abc4'),
(5, '2016-01-04 01:20:02', 36, 'abc5');
IF OBJECT_ID('dbo.forge', 'U') IS NOT NULL
DROP TABLE dbo.forge
CREATE TABLE dbo.forge (empcode INT PRIMARY KEY, [user_name] VARCHAR(10));
INSERT INTO dbo.forge
VALUES (56, 'ram'),(58, 'ram1'),(46, 'ram2'),(16, 'ram3'),(36, 'ram4')
DECLARE @dt DATETIME = FLOOR(CAST(GETDATE() AS FLOAT))
SELECT *
FROM dbo.forge
WHERE empcode IN (
SELECT f.empcode
FROM dbo.forgerock f
WHERE f.[date] BETWEEN DATEADD(DAY, -1, @dt) AND @dt
)
output -
empcode user_name
----------- ----------
16 ram3
SELECT fr.file, f.user_name
FROM forgerock fr inner join forge f on fr.empcode = f.empcode
WHERE fr.date >= DATE_ADD(NOW(), INTERVAL -1 DAY)
You can use datediff to get the difference between two dates.
Try this :
ResultSet resultset = statement.executeQuery("select file from forgerock where DATEDIFF(day, GETDATE(), '" + date + "') >= 1") ;
To test the query use this one :
SELECT * FROM forgerock WHERE DATEDIFF(day, GETDATE(), @date) >= 1;
Just replace @date with the value you want, for example '2016-01-02'.
Use this filter:
SELECT [file]
FROM forgerock
WHERE [date] >= DATEADD(DAY, DATEDIFF(DAY,0,GETDATE()-1),0)
The DATEADD expression above always returns midnight at the start of yesterday, so the query only returns records from yesterday or today.
Bonus Tip: avoid using reserved keywords (such as file and date) as column or table names.
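Since the original code builds the SQL string by concatenating a date into it, it may help to see the same filter with a bound parameter instead. This is a minimal sketch in Python with sqlite3, not JDBC or SQL Server: the "midnight at the start of yesterday" cutoff is computed in the application and passed as a parameter. Following the tip above, the columns are renamed to file_name and created_at; those names, and the helper functions, are hypothetical. Dates are stored as ISO-8601 text, so string comparison matches chronological order.

```python
import sqlite3
from datetime import datetime, timedelta


def build_db():
    """Create both sample tables with reserved-word-free column names."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE forgerock (id INTEGER PRIMARY KEY, created_at TEXT,
                                empcode INTEGER, file_name TEXT);
        CREATE TABLE forge (empcode INTEGER PRIMARY KEY, user_name TEXT);
        INSERT INTO forgerock VALUES
            (1, '2015-12-31 01:20:02', 56, 'abc1'),
            (2, '2016-01-01 01:20:02', 58, 'abc2'),
            (3, '2016-01-02 01:20:02', 46, 'abc3'),
            (4, '2016-01-03 01:20:02', 16, 'abc4'),
            (5, '2016-01-04 01:20:02', 36, 'abc5');
        INSERT INTO forge VALUES
            (56, 'ram'), (58, 'ram1'), (46, 'ram2'), (16, 'ram3'), (36, 'ram4');
    """)
    return conn


def start_of_yesterday(now):
    """Midnight at the start of the day before `now`."""
    return datetime.combine(now.date() - timedelta(days=1), datetime.min.time())


def recent_files(conn, now):
    """File and user names for rows dated yesterday or today, relative to now."""
    cutoff = start_of_yesterday(now).strftime("%Y-%m-%d %H:%M:%S")
    sql = """
        SELECT fr.file_name, f.user_name
        FROM forgerock AS fr
        INNER JOIN forge AS f ON f.empcode = fr.empcode
        WHERE fr.created_at >= ?
        ORDER BY fr.created_at
    """
    # The cutoff is bound as a parameter, never spliced into the SQL text.
    return conn.execute(sql, (cutoff,)).fetchall()


if __name__ == "__main__":
    # A fixed "now" so the sample data from 2016 produces rows.
    print(recent_files(build_db(), datetime(2016, 1, 4, 9, 0, 0)))
```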
Since I am using MS SQL, the query should be written as follows:
SELECT fr.[file], f.[user_name] FROM forgerock fr INNER JOIN forge f ON fr.empcode = f.empcode WHERE fr.[date] >= DATEADD(DAY, DATEDIFF(DAY,0,GETDATE()-1),0)
This returns file from forgerock and user_name from forge for the matching rows.
Try the following query (MySQL syntax):
SELECT fr.`file`, f.user_name
FROM forgerock fr INNER JOIN forge f ON fr.empcode = f.empcode
AND fr.`date` >= DATE_SUB(NOW(), INTERVAL 1 DAY)

Deleting records from SQL Server table without cursor

I am trying to selectively delete records from a SQL Server 2005 table without looping through a cursor. The table can contain many records (sometimes > 500,000) so looping is too slow.
Data:
ID  UnitID  Day  Interval  Amount
1   100     10   21        9.345
2   100     10   22        9.367
3   200     11   21        4.150
4   300     11   21        4.350
5   300     11   22        4.734
6   300     11   23        5.106
7   400     13   21        10.257
8   400     13   22        10.428
Key is: ID, UnitID, Day, Interval.
In this example I wish to delete Records 2, 5 and 8 - they are adjacent to an existing record (based on the key).
Note: record 6 would not be deleted because once 5 is gone it is not adjacent any longer.
Am I asking too much?
See these articles in my blog for performance detail:
SQL Server: deleting adjacent values
SQL Server: deleting adjacent values (improved)
The main idea for the query below is that we should delete all even rows from continuous ranges of intervals.
That is, if for given (unitId, Day) we have the following intervals:
1
2
3
4
6
7
8
9
, we have two continuous ranges:
1
2
3
4
and
6
7
8
9
, and we should delete every even row:
1
2 -- delete
3
4 -- delete
and
6
7 -- delete
8
9 -- delete
, so that we get:
1
3
6
8
Note that "even rows" means "even per-range ROW_NUMBER()s" here, not "even values of interval".
Here's the query:
DECLARE @Table TABLE (ID INT, UnitID INT, [Day] INT, Interval INT, Amount FLOAT)
INSERT INTO @Table VALUES (1, 100, 10, 21, 9.345)
INSERT INTO @Table VALUES (2, 100, 10, 22, 9.345)
INSERT INTO @Table VALUES (3, 200, 11, 21, 9.345)
INSERT INTO @Table VALUES (4, 300, 11, 21, 9.345)
INSERT INTO @Table VALUES (5, 300, 11, 22, 9.345)
INSERT INTO @Table VALUES (6, 300, 11, 23, 9.345)
INSERT INTO @Table VALUES (7, 400, 13, 21, 9.345)
INSERT INTO @Table VALUES (8, 400, 13, 22, 9.345)
INSERT INTO @Table VALUES (9, 400, 13, 23, 9.345)
INSERT INTO @Table VALUES (10, 400, 13, 24, 9.345)
INSERT INTO @Table VALUES (11, 400, 13, 26, 9.345)
INSERT INTO @Table VALUES (12, 400, 13, 27, 9.345)
INSERT INTO @Table VALUES (13, 400, 13, 28, 9.345)
INSERT INTO @Table VALUES (14, 400, 13, 29, 9.345)
;WITH rows AS
(
SELECT *,
ROW_NUMBER() OVER
(
PARTITION BY
(
SELECT TOP 1 qi.id AS mint
FROM @Table qi
WHERE qi.unitid = qo.unitid
AND qi.[day] = qo.[day]
AND qi.interval <= qo.interval
AND NOT EXISTS
(
SELECT NULL
FROM @Table t
WHERE t.unitid = qi.unitid
AND t.[day] = qi.day
AND t.interval = qi.interval - 1
)
ORDER BY
qi.interval DESC
)
ORDER BY interval
) AS rnm
FROM @Table qo
)
DELETE
FROM rows
WHERE rnm % 2 = 0
SELECT *
FROM @Table
Update:
Here's a more efficient query:
DECLARE @Table TABLE (ID INT, UnitID INT, [Day] INT, Interval INT, Amount FLOAT)
INSERT INTO @Table VALUES (1, 100, 10, 21, 9.345)
INSERT INTO @Table VALUES (2, 100, 10, 22, 9.345)
INSERT INTO @Table VALUES (3, 200, 11, 21, 9.345)
INSERT INTO @Table VALUES (4, 300, 11, 21, 9.345)
INSERT INTO @Table VALUES (5, 300, 11, 22, 9.345)
INSERT INTO @Table VALUES (6, 300, 11, 23, 9.345)
INSERT INTO @Table VALUES (7, 400, 13, 21, 9.345)
INSERT INTO @Table VALUES (8, 400, 13, 22, 9.345)
INSERT INTO @Table VALUES (9, 400, 13, 23, 9.345)
INSERT INTO @Table VALUES (10, 400, 13, 24, 9.345)
INSERT INTO @Table VALUES (11, 400, 13, 26, 9.345)
INSERT INTO @Table VALUES (12, 400, 13, 27, 9.345)
INSERT INTO @Table VALUES (13, 400, 13, 28, 9.345)
INSERT INTO @Table VALUES (14, 400, 13, 29, 9.345)
;WITH source AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY unitid, day ORDER BY interval) rn
FROM @Table
),
rows AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY unitid, day, interval - rn ORDER BY interval) AS rnm
FROM source
)
DELETE
FROM rows
WHERE rnm % 2 = 0
SELECT *
FROM @Table
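The trick in the second query is that within one (unitid, day) group, interval minus its row number is constant across a run of consecutive intervals, so it can serve as a key for each continuous range. Here is a minimal sketch of the same idea in Python with sqlite3, assuming SQLite 3.25 or later for window functions. SQLite cannot delete through a CTE the way SQL Server can, so this version deletes by id instead; the table name readings and the function names are made up for the example.

```python
import sqlite3

# (id, unitid, day, interval) from the 14-row test set; amount is omitted.
SAMPLE = [
    (1, 100, 10, 21), (2, 100, 10, 22), (3, 200, 11, 21),
    (4, 300, 11, 21), (5, 300, 11, 22), (6, 300, 11, 23),
    (7, 400, 13, 21), (8, 400, 13, 22), (9, 400, 13, 23),
    (10, 400, 13, 24), (11, 400, 13, 26), (12, 400, 13, 27),
    (13, 400, 13, 28), (14, 400, 13, 29),
]


def build_db():
    """Create the readings table in memory and load the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE readings (id INTEGER PRIMARY KEY, unitid INTEGER, "
        "day INTEGER, interval INTEGER)"
    )
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?, ?)", SAMPLE)
    return conn


def delete_adjacent(conn):
    """Delete every second row of each run of consecutive intervals."""
    conn.execute("""
        WITH source AS (
            SELECT id, unitid, day, interval,
                   ROW_NUMBER() OVER (PARTITION BY unitid, day
                                      ORDER BY interval) AS rn
            FROM readings
        ),
        ranked AS (
            -- interval - rn is constant within one continuous range
            SELECT id,
                   ROW_NUMBER() OVER (PARTITION BY unitid, day, interval - rn
                                      ORDER BY interval) AS rnm
            FROM source
        )
        DELETE FROM readings
        WHERE id IN (SELECT id FROM ranked WHERE rnm % 2 = 0)
    """)
    return [row[0] for row in conn.execute("SELECT id FROM readings ORDER BY id")]


if __name__ == "__main__":
    print(delete_adjacent(build_db()))
```

With the sample rows, record 6 survives because it is the third row of its range, which is the behaviour the question asks for.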
I don't think what you're asking for is possible — but you may be able to get close. It appears you can almost do it by finding records with a self-join like this:
SELECT t1.id
FROM
table t1 JOIN table t2 ON (
t1.unitid = t2.unitid AND
t1.day = t2.day AND
t1.interval = t2.interval - 1
)
but the problem is, that'll find id=6 as well. However, if you create a temporary table from this data, it may be much smaller than your original data, and thus far faster to scan with a cursor (to fix the id=6 problem). You can then do a DELETE FROM table WHERE id IN (SELECT id FROM tmp_table) to kill the rows.
There may be a way to fix the ID=6 problem w/o a cursor, but if so, I don't see it.
There is the WHILE statement, which is an alternative to the cursor. That combined with table variables might let you do the same thing within a performance bound you're OK with.
DECLARE @Table TABLE (ID INT, UnitID INT, [Day] INT, Interval INT, Amount FLOAT)
INSERT INTO @Table VALUES (1, 100, 10, 21, 9.345)
INSERT INTO @Table VALUES (2, 100, 10, 22, 9.367)
INSERT INTO @Table VALUES (3, 200, 11, 21, 4.150)
INSERT INTO @Table VALUES (4, 300, 11, 21, 4.350)
INSERT INTO @Table VALUES (5, 300, 11, 22, 4.734)
INSERT INTO @Table VALUES (6, 300, 11, 23, 5.106)
INSERT INTO @Table VALUES (7, 400, 13, 21, 10.257)
INSERT INTO @Table VALUES (8, 400, 13, 22, 10.428)
DELETE FROM @Table
WHERE ID IN (
SELECT t1.ID
FROM @Table t1
INNER JOIN @Table t2
ON t2.UnitID = t1.UnitID
AND t2.Day = t1.Day
AND t2.Interval = t1.Interval - 1
LEFT OUTER JOIN @Table t3
ON t3.UnitID = t2.UnitID
AND t3.Day = t2.Day
AND t3.Interval = t2.Interval - 1
WHERE t3.ID IS NULL)
SELECT * FROM @Table
Lieven is so close - it worked for the test set, but if I add a few more records it starts to miss some.
We cannot use any odd/even criteria - we have no idea how the data falls.
Add this data and retry:
INSERT @Table VALUES (9, 100, 10, 23, 9.345)
INSERT @Table VALUES (10, 100, 10, 24, 9.367)
INSERT @Table VALUES (11, 100, 10, 25, 4.150)
INSERT @Table VALUES (12, 100, 10, 26, 4.350)
INSERT @Table VALUES (13, 300, 11, 25, 4.734)
INSERT @Table VALUES (14, 300, 11, 26, 5.106)
INSERT @Table VALUES (15, 300, 11, 27, 10.257)
INSERT @Table VALUES (16, 300, 11, 29, 10.428)

Resources